[ 
https://issues.apache.org/jira/browse/PDFBOX-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363568#comment-15363568
 ] 

John Hewson edited comment on PDFBOX-3403 at 7/6/16 1:07 AM:
-------------------------------------------------------------

Unfortunately this fix was not a good choice. The exception which was thrown is 
there to verify PDFBox's internal consistency, and bypassing it now allows 
PDFBox to get into an inconsistent state where invariants such as 
"getBaseEncoding() never returns null" are violated.

As the exception says, symbolic fonts *must* have a built-in encoding. It's the 
job of the caller of this function to make sure the inputs are correct - that's 
what needs to happen. Otherwise it's garbage in, garbage out.
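
To illustrate the point about invariants, here is a minimal sketch of a constructor that refuses invalid input instead of letting a null base encoding through. It is a toy model, not PDFBox's actual `DictionaryEncoding`: the class `SketchEncoding`, its parameters, and the fallback to "StandardEncoding" for non-symbolic fonts are assumptions made for the example. Only the exception message is taken from the stack trace in this issue.

```java
/** Toy stand-in for an encoding object; hypothetical, not a PDFBox class. */
final class SketchEncoding {
    private final String baseEncodingName;

    /**
     * @param requestedBase base encoding named by the font dictionary, or null
     * @param builtIn       name of the font program's built-in encoding, or null
     * @param symbolic      whether the font is flagged as symbolic
     */
    SketchEncoding(String requestedBase, String builtIn, boolean symbolic) {
        if (requestedBase != null) {
            // An explicit base encoding always wins.
            baseEncodingName = requestedBase;
        } else if (symbolic) {
            // Symbolic fonts fall back to the built-in encoding, which the
            // caller is responsible for supplying.
            if (builtIn == null) {
                throw new IllegalArgumentException(
                        "Symbolic fonts must have a built-in encoding");
            }
            baseEncodingName = builtIn;
        } else {
            // Assumed default for non-symbolic fonts in this sketch.
            baseEncodingName = "StandardEncoding";
        }
    }

    /** Never returns null: every constructor path assigns a value or throws. */
    String getBaseEncodingName() {
        return baseEncodingName;
    }
}
```

Because every path either assigns a non-null name or throws, code that calls `getBaseEncodingName()` never needs a null check. Swallowing the exception would remove exactly that guarantee.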

The `readEncoding()` method of `PDSimpleFont` is responsible for reading 
encodings and repairing them where needed. This code should be making sure that 
the invariants set by `DictionaryEncoding` are satisfied, rather than forcing 
through invalid data. *MacExpertEncoding* is in fact a valid encoding and we 
need to support it.

The only change that appears to be necessary here is to add support for 
MacExpertEncoding.
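
As a rough sketch of what "adding support" could look like, the lookup below resolves a base-encoding name to a known encoding and includes MacExpertEncoding alongside the other predefined names. The class `BaseEncodingLookup`, the enum `Known`, and the method `forName` are hypothetical names for this example and do not describe PDFBox's real API.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical name-to-encoding lookup; not PDFBox's actual implementation. */
final class BaseEncodingLookup {
    enum Known { STANDARD, WIN_ANSI, MAC_ROMAN, MAC_EXPERT }

    // Predefined encoding names; MacExpertEncoding is the entry that the
    // comment says needs to be supported.
    private static final Map<String, Known> BY_NAME = Map.of(
            "StandardEncoding", Known.STANDARD,
            "WinAnsiEncoding", Known.WIN_ANSI,
            "MacRomanEncoding", Known.MAC_ROMAN,
            "MacExpertEncoding", Known.MAC_EXPERT);

    /** Returns the matching encoding, or empty for null or unknown names. */
    static Optional<Known> forName(String name) {
        return Optional.ofNullable(name).map(BY_NAME::get);
    }

    private BaseEncodingLookup() {
    }
}
```

With the name recognised, a font that declares MacExpertEncoding resolves to a real base encoding, so the consistency check never has to be bypassed for it.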



> IllegalArgumentException: Symbolic fonts must have a built-in encoding
> ----------------------------------------------------------------------
>
>                 Key: PDFBOX-3403
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3403
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.2, 2.0.3, 2.1.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>             Fix For: 2.0.3, 2.1.0
>
>         Attachments: PDFBOX-3403.pdf
>
>
> Happens with text extraction and rendering:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Symbolic fonts must have a built-in encoding
>       at org.apache.pdfbox.pdmodel.font.encoding.DictionaryEncoding.<init>(DictionaryEncoding.java:113)
>       at org.apache.pdfbox.pdmodel.font.PDSimpleFont.readEncoding(PDSimpleFont.java:126)
>       at org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:131)
>       at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:60)
>       at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
>       at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:829)
> {code}



