[ 
https://issues.apache.org/jira/browse/PDFBOX-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363568#comment-15363568
 ] 

John Hewson edited comment on PDFBOX-3403 at 7/6/16 1:08 AM:
-------------------------------------------------------------

Unfortunately this fix was not a good choice. The exception which was thrown is 
there to verify PDFBox's internal consistency (that's why it's an unchecked 
exception), and bypassing it now allows PDFBox to get into an inconsistent 
state with invariants such as "getBaseEncoding() never returns null" being 
violated.

As the exception says, symbolic fonts *must* have a built-in encoding. It's the 
job of the caller of this function to make sure the inputs are correct - that's 
what needs to happen. Otherwise it's garbage in, garbage out.
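
A minimal sketch of the kind of invariant described above, to make the
"fail fast" argument concrete. The class name, the string-typed encodings and
the fallback order are assumptions made for illustration, not PDFBox's actual
code. The point is only that the constructor either throws or guarantees a
non-null base encoding:

```java
/**
 * Sketch only: a simplified stand-in for the check discussed above.
 * The class name and string-typed encodings are hypothetical, not PDFBox's API.
 */
final class EncodingInvariantSketch {
    private final String baseEncoding; // invariant: never null after construction

    EncodingInvariantSketch(String declaredBase, boolean symbolic, String builtIn) {
        if (declaredBase != null) {
            // The font dictionary named a base encoding that was recognised.
            baseEncoding = declaredBase;
        } else if (!symbolic) {
            // Assumed default for non-symbolic fonts in this sketch.
            baseEncoding = "StandardEncoding";
        } else if (builtIn != null) {
            // Symbolic font: fall back to the encoding built into the font program.
            baseEncoding = builtIn;
        } else {
            // Unchecked exception: the caller passed inconsistent inputs.
            throw new IllegalArgumentException(
                    "Symbolic fonts must have a built-in encoding");
        }
    }

    String getBaseEncoding() {
        return baseEncoding;
    }

    public static void main(String[] args) {
        System.out.println(
                new EncodingInvariantSketch(null, true, "FontSpecific").getBaseEncoding());
        try {
            new EncodingInvariantSketch(null, true, null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Swallowing the exception instead would leave `getBaseEncoding()` free to return
null, which is exactly the inconsistent state to avoid.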

The `readEncoding()` method of `PDSimpleFont` is responsible for reading 
encodings and making fixes to them. This code should be making sure that the 
invariants set by DictionaryEncoding are satisfied, rather than forcing through 
invalid data. *MacExpertEncoding* is in fact a valid encoding and we need to 
support it.

The only change that appears to be necessary here is to add support for 
MacExpertEncoding.
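
As a rough illustration of what "adding support" could look like, here is a
sketch of a name-to-encoding lookup that recognises MacExpertEncoding. The
class, the method and the use of plain strings are hypothetical, and which
names PDFBox recognised before this change is not something this sketch
asserts. It only shows that an unrecognised name yields null, leaving the
fallback decision to the caller:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch only: a hypothetical lookup of predefined encoding names.
 * Not PDFBox's API; encodings are represented by their names for brevity.
 */
final class BaseEncodingLookupSketch {
    private static final Set<String> KNOWN = new HashSet<>(Arrays.asList(
            "StandardEncoding",
            "WinAnsiEncoding",
            "MacRomanEncoding",
            "MacExpertEncoding")); // the name this issue is about

    /** Returns the name if it is recognised, otherwise null. */
    static String lookup(String name) {
        return KNOWN.contains(name) ? name : null;
    }

    public static void main(String[] args) {
        System.out.println(lookup("MacExpertEncoding"));
        System.out.println(lookup("NoSuchEncoding"));
    }
}
```

With the name recognised, a font that declares MacExpertEncoding as its base
encoding no longer depends on the built-in-encoding fallback at all.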


> IllegalArgumentException: Symbolic fonts must have a built-in encoding
> ----------------------------------------------------------------------
>
>                 Key: PDFBOX-3403
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3403
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.2, 2.0.3, 2.1.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>             Fix For: 2.0.3, 2.1.0
>
>         Attachments: PDFBOX-3403.pdf
>
>
> Happens with text extraction and rendering:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Symbolic fonts must have a built-in encoding
>       at org.apache.pdfbox.pdmodel.font.encoding.DictionaryEncoding.<init>(DictionaryEncoding.java:113)
>       at org.apache.pdfbox.pdmodel.font.PDSimpleFont.readEncoding(PDSimpleFont.java:126)
>       at org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:131)
>       at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:60)
>       at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
>       at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:829)
> {code}


