[ 
https://issues.apache.org/jira/browse/PDFBOX-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15367217#comment-15367217
 ] 

John Hewson edited comment on PDFBOX-3403 at 7/8/16 4:56 AM:
-------------------------------------------------------------

Just to add a little more background here: the DictionaryEncoding class in 
PDFBox is PDFBox's internal representation of what a dictionary encoding 
logically is, not a direct analog of the data in the PDF. To simplify what is 
an extremely complex process, we keep a distinct separation of concerns 
around encodings so that the consumers of these classes don't have to 
handle the many possible failure cases (pretty much any piece of data in a 
PDF file can be missing, incorrect, or in need of repair). This keeps the 
consumers from becoming error-handling spaghetti. That's why it's 
PDType1Font's job to construct a valid DictionaryEncoding, not 
DictionaryEncoding's job to handle bad data. We fix bad data at the point it is 
encountered; we don't propagate it.

Type3 fonts are handled by a separate mechanism: there is a single-argument 
DictionaryEncoding constructor specifically for these.
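
To make the division of responsibility concrete, here is a minimal sketch of 
the pattern. It is not PDFBox's actual code: StrictEncoding, 
FontEncodingLoader, the load() parameters and the choice of 
"StandardEncoding" as the fallback are all hypothetical names and assumptions 
picked for illustration. The value class refuses bad input outright, and the 
font-side loader is the single place where missing data is repaired.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these classes belong to PDFBox.
final class EncodingRepairSketch {
    public static void main(String[] args) {
        // Both sources of a base encoding are missing, as in a damaged file.
        StrictEncoding repaired = FontEncodingLoader.load(null, null, null);
        System.out.println(repaired.getBaseName());

        // Bypassing the loader hands bad data straight to the strict class.
        try {
            new StrictEncoding(null, Collections.<Integer, String>emptyMap());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

/** Strict value object: it can only exist in a valid state. */
final class StrictEncoding {
    private final String baseName;
    private final Map<Integer, String> codeToName;

    StrictEncoding(String baseName, Map<Integer, String> differences) {
        if (baseName == null) {
            // No repair is attempted here; the caller must supply valid data.
            throw new IllegalArgumentException("A base encoding is required");
        }
        this.baseName = baseName;
        this.codeToName = Collections.unmodifiableMap(new HashMap<>(differences));
    }

    String getBaseName() {
        return baseName;
    }

    /** Consumers never need a null check: unknown codes map to ".notdef". */
    String getName(int code) {
        String name = codeToName.get(code);
        return name != null ? name : ".notdef";
    }
}

/** Font-side loader: the one place where bad input is detected and fixed. */
final class FontEncodingLoader {
    // Assumed fallback for this sketch only.
    static final String FALLBACK_BASE = "StandardEncoding";

    static StrictEncoding load(String declaredBase, String builtInBase,
                               Map<Integer, String> differences) {
        String base = declaredBase;
        if (base == null) {
            base = builtInBase;      // prefer what the font program itself provides
        }
        if (base == null) {
            base = FALLBACK_BASE;    // repair at the point the bad data is encountered
        }
        Map<Integer, String> diffs = differences != null
                ? differences
                : Collections.<Integer, String>emptyMap();
        return new StrictEncoding(base, diffs);
    }
}
```

With this shape, code that consumes StrictEncoding needs no defensive 
branches, and any exception from its constructor points to a gap in the 
loader's repair logic rather than to a problem every caller has to handle.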


> IllegalArgumentException: Symbolic fonts must have a built-in encoding
> ----------------------------------------------------------------------
>
>                 Key: PDFBOX-3403
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3403
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.2, 2.0.3, 2.1.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>             Fix For: 2.0.3, 2.1.0
>
>         Attachments: PDFBOX-3403-XXX.pdf, PDFBOX-3403-YYY.pdf, PDFBOX-3403.pdf
>
>
> Happens with text extraction and rendering:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Symbolic fonts must have a built-in encoding
>       at org.apache.pdfbox.pdmodel.font.encoding.DictionaryEncoding.<init>(DictionaryEncoding.java:113)
>       at org.apache.pdfbox.pdmodel.font.PDSimpleFont.readEncoding(PDSimpleFont.java:126)
>       at org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:131)
>       at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:60)
>       at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
>       at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:829)
> {code}


