[ https://issues.apache.org/jira/browse/PDFBOX-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363568#comment-15363568 ]
John Hewson edited comment on PDFBOX-3403 at 7/6/16 1:10 AM:
-------------------------------------------------------------
Unfortunately this fix was not a good choice. The exception which was thrown is
there to verify PDFBox's internal consistency (that's why it's an unchecked
exception), and bypassing it now allows PDFBox to get into an inconsistent
state with invariants such as "getBaseEncoding() never returns null" being
violated.
As the exception says, symbolic fonts *must* have a built-in encoding. It's the
job of the caller of this function to make sure the inputs are correct - that's
what needs to happen. Otherwise it's garbage in, garbage out.
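To illustrate the kind of check I mean, here is a minimal sketch of a constructor that guards a "base encoding is never null" invariant with an unchecked exception. The class name, parameters, and string-based encoding names are hypothetical simplifications for illustration; this is not the actual `DictionaryEncoding` code.

```java
/** Hypothetical stand-in for an encoding class; names are illustrative, not PDFBox's API. */
final class EncodingInvariantSketch {
    // Invariant: never null once the constructor has returned.
    private final String baseEncoding;

    EncodingInvariantSketch(String baseEncodingName, boolean symbolic, String builtInEncoding) {
        if (baseEncodingName != null) {
            // An explicit, recognised base encoding always wins.
            baseEncoding = baseEncodingName;
        } else if (!symbolic) {
            // Assumed default for non-symbolic fonts in this sketch.
            baseEncoding = "StandardEncoding";
        } else if (builtInEncoding != null) {
            // Symbolic fonts fall back to the font's own built-in encoding.
            baseEncoding = builtInEncoding;
        } else {
            // The caller passed inconsistent inputs: fail fast instead of storing null.
            throw new IllegalArgumentException("Symbolic fonts must have a built-in encoding");
        }
    }

    String getBaseEncoding() {
        return baseEncoding;
    }
}
```

Removing the final `throw` would let a null base encoding through, which is exactly the inconsistent state described above.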
The `readEncoding()` method of `PDSimpleFont` is responsible for reading
encodings and fixing them up. This code should make sure that the invariants
set by `DictionaryEncoding` are satisfied, rather than forcing through invalid
data.
The only change that appears to be necessary here is to add support for
MacExpertEncoding.
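As a rough sketch of the caller-side idea (hypothetical helper names, not the actual PDFBox lookup code): the name lookup recognises all three predefined base encodings that the PDF specification allows, including MacExpertEncoding, and the caller checks up front whether a valid encoding can be built at all.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical caller-side helper; not PDFBox's actual API. */
final class BaseEncodingLookupSketch {
    // The predefined encodings the PDF specification permits for /BaseEncoding.
    private static final Set<String> NAMED = new HashSet<>(Arrays.asList(
            "MacRomanEncoding", "MacExpertEncoding", "WinAnsiEncoding"));

    /** Returns the name if it is a recognised base encoding, otherwise null. */
    static String lookup(String name) {
        return (name != null && NAMED.contains(name)) ? name : null;
    }

    /**
     * True if the inputs can produce a non-null base encoding: either the name
     * is recognised, the font is non-symbolic, or a built-in encoding exists.
     */
    static boolean canBuildEncoding(String name, boolean symbolic, String builtIn) {
        return lookup(name) != null || !symbolic || builtIn != null;
    }
}
```

If MacExpertEncoding were missing from the lookup, a symbolic font naming it and lacking a built-in encoding would fail this check, which matches the exception in this issue.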
was (Author: jahewson):
Unfortunately this fix was not a good choice. The exception which was thrown is
there to verify PDFBox's internal consistency (that's why it's an unchecked
exception), and bypassing it now allows PDFBox to get into an inconsistent
state with invariants such as "getBaseEncoding() never returns null" being
violated.
As the exception says, symbolic fonts *must* have a built-in encoding. It's the
job of the caller of this function to make sure the inputs are correct - that's
what needs to happen. Otherwise it's garbage in, garbage out.
The `readEncoding()` method of `PDSimpleFont` is responsible for reading
encodings and fixing them up. This code should make sure that the invariants
set by `DictionaryEncoding` are satisfied, rather than forcing through invalid
data. *MacExpertEncoding* is in fact a valid encoding and we need to support it.
The only change that appears to be necessary here is to add support for
MacExpertEncoding.
> IllegalArgumentException: Symbolic fonts must have a built-in encoding
> ----------------------------------------------------------------------
>
> Key: PDFBOX-3403
> URL: https://issues.apache.org/jira/browse/PDFBOX-3403
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 2.0.2, 2.0.3, 2.1.0
> Reporter: Tilman Hausherr
> Assignee: Tilman Hausherr
> Fix For: 2.0.3, 2.1.0
>
> Attachments: PDFBOX-3403.pdf
>
>
> Happens with text extraction and rendering:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Symbolic fonts must have a built-in encoding
>     at org.apache.pdfbox.pdmodel.font.encoding.DictionaryEncoding.<init>(DictionaryEncoding.java:113)
>     at org.apache.pdfbox.pdmodel.font.PDSimpleFont.readEncoding(PDSimpleFont.java:126)
>     at org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:131)
>     at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:60)
>     at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
>     at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:829)
> {code}