[ 
https://issues.apache.org/jira/browse/PDFBOX-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363895#comment-15363895
 ] 

Michael Klink edited comment on PDFBOX-3403 at 7/6/16 7:52 AM:
---------------------------------------------------------------

{quote}
The exception which was thrown is there to verify PDFBox's internal consistency 
(that's why it's an unchecked exception), and bypassing it now allows PDFBox to 
get into an inconsistent state with invariants such as "getBaseEncoding() never 
returns null" being violated.
{quote}

There is no need for PDFBox to become internally inconsistent just because a 
font has no base encoding. As the base encoding can be completely overridden by 
the *Differences*, code depending on the contents of the base encoding is likely 
broken anyway.

If it is merely a matter of preventing {{null}} base encodings, one could 
interpret the absence of both a built-in encoding in the font and a base 
encoding name in the encoding dictionary as an empty built-in encoding and 
create such an entry.
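
A minimal sketch of that idea for the symbolic-font case, in plain Java. The class and method names ({{EncodingSketch}}, {{resolveBase}}, {{effectiveEncoding}}) are hypothetical and are not PDFBox API; encodings are modelled simply as maps from character code to glyph name.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in PDFBox.
class EncodingSketch {

    /**
     * Chooses the base encoding for a symbolic font: the encoding named in the
     * encoding dictionary if there is one, else the font's built-in encoding,
     * else an empty encoding. The result is never null.
     */
    public static Map<Integer, String> resolveBase(Map<Integer, String> namedBase,
                                                   Map<Integer, String> builtIn) {
        if (namedBase != null) {
            return namedBase;
        }
        if (builtIn != null) {
            return builtIn;
        }
        return Collections.emptyMap(); // stands in for the "empty built-in encoding"
    }

    /** Overlays the Differences entries on top of the resolved base encoding. */
    public static Map<Integer, String> effectiveEncoding(Map<Integer, String> namedBase,
                                                         Map<Integer, String> builtIn,
                                                         Map<Integer, String> differences) {
        Map<Integer, String> result = new LinkedHashMap<>(resolveBase(namedBase, builtIn));
        result.putAll(differences); // Differences win over the base encoding
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> differences = new LinkedHashMap<>();
        differences.put(65, "bullet");
        differences.put(66, "checkmark");

        // No base encoding name and no built-in encoding: still a usable, non-null result.
        System.out.println(resolveBase(null, null).isEmpty());
        System.out.println(effectiveEncoding(null, null, differences));
    }
}
```

With this shape, callers can rely on a non-null base encoding while a font that defines everything through *Differences* still resolves correctly.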

Consider e.g. a symbolic Type 3 font. Type 3 fonts do not need a built-in 
encoding (cf. ISO 32000-1 9.6.6.1, _"Except for Type 3 fonts, every font 
program shall have a built-in encoding."_), and in their case we have an 
_"encoding dictionary whose Differences array shall specify the complete 
character encoding for this font"_ (ISO 32000-1 Table 112), so there is no need 
for the optional base encoding entry in the encoding dictionary.

Thus, such a font may have neither a built-in encoding nor a base encoding 
entry in the encoding dictionary. And for symbols not usually found in font 
programs, a symbolic Type 3 font is the obvious choice.
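
To illustrate how a *Differences* array alone can define the complete encoding, here is a small sketch in plain Java. It assumes the array has already been read into a list of {{Integer}} codes and {{String}} glyph names; the class {{DifferencesSketch}} and the glyph names are made up for illustration and are not PDFBox API. Each number sets the code for the name that follows it, and consecutive names take consecutive codes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not PDFBox code.
class DifferencesSketch {

    /** Turns a Differences array (codes and glyph names) into a code-to-name map. */
    public static Map<Integer, String> parse(List<Object> differences) {
        Map<Integer, String> codeToName = new TreeMap<>();
        int code = -1; // no starting code seen yet
        for (Object item : differences) {
            if (item instanceof Integer) {
                code = (Integer) item;            // a number restarts the code counter
            } else if (item instanceof String && code >= 0) {
                codeToName.put(code, (String) item);
                code++;                           // following names use consecutive codes
            }
        }
        return codeToName;
    }

    public static void main(String[] args) {
        // Roughly corresponds to: /Differences [ 32 /space /square 65 /circle ]
        List<Object> differences = Arrays.<Object>asList(32, "space", "square", 65, "circle");
        System.out.println(parse(differences));
    }
}
```

No base encoding is consulted anywhere in this sketch: every code the font uses comes from the array itself.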


> IllegalArgumentException: Symbolic fonts must have a built-in encoding
> ----------------------------------------------------------------------
>
>                 Key: PDFBOX-3403
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3403
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.2, 2.0.3, 2.1.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>             Fix For: 2.0.3, 2.1.0
>
>         Attachments: PDFBOX-3403.pdf
>
>
> Happens with text extraction and rendering:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Symbolic fonts must have a built-in encoding
>       at org.apache.pdfbox.pdmodel.font.encoding.DictionaryEncoding.<init>(DictionaryEncoding.java:113)
>       at org.apache.pdfbox.pdmodel.font.PDSimpleFont.readEncoding(PDSimpleFont.java:126)
>       at org.apache.pdfbox.pdmodel.font.PDType1CFont.<init>(PDType1CFont.java:131)
>       at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:60)
>       at org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:123)
>       at org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
>       at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:829)
> {code}
