[
https://issues.apache.org/jira/browse/PDFBOX-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145080#comment-14145080
]
Daniel Scheibe commented on PDFBOX-2350:
----------------------------------------
I'll give it a try tomorrow and see what I can find out. While debugging I
checked a couple of other Type 1 PDF parsers, and it seems (from what I
understood) that JPedal seeks the "exec" (yes, not "eexec") token in the
font descriptor, skips any CR+LF that follows, and treats everything beyond
that point as the binary data. While this approach might be "fault tolerant"
for my case, I agree that if the PDF contains an offset that is one byte
"off", then my file is broken.
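
To make the seek-for-"exec" idea concrete, here is a minimal sketch of how I
understand it. This is not JPedal's or PDFBox's actual code; the class and
method names (ExecSeek, findBinaryStart) are made up for illustration, and it
assumes the whole font program is available as one byte array.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of PDFBox or JPedal.
final class ExecSeek {
    /**
     * Returns the index of the first byte after the first "exec" token and
     * any CR/LF bytes directly following it, or -1 if "exec" never occurs.
     * Searching for "exec" also matches the tail of "eexec".
     */
    static int findBinaryStart(byte[] font) {
        byte[] token = "exec".getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i + token.length <= font.length; i++) {
            boolean match = true;
            for (int j = 0; j < token.length; j++) {
                if (font[i + j] != token[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                int pos = i + token.length;
                // Skip the optional CR and/or LF after the token.
                while (pos < font.length
                        && (font[pos] == '\r' || font[pos] == '\n')) {
                    pos++;
                }
                return pos;
            }
        }
        return -1;
    }
}
```

Everything from the returned index onward would then be handed to the binary
parser, regardless of what the length entries in the PDF say.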
> Type1 Parser hangs indefinitely
> -------------------------------
>
> Key: PDFBOX-2350
> URL: https://issues.apache.org/jira/browse/PDFBOX-2350
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox
> Affects Versions: 2.0.0
> Environment: Windows 7, JDK 1.7.0_51-b13
> Reporter: Daniel Scheibe
> Attachments: PDFBOX-2350-289451-endless.pdf
>
>
> When rendering the first page of my pdf document the Type1Parser
> (org.apache.fontbox.type1.Type1Parser) hangs in a loop in
> {{parseBinary(byte[] bytes) throws IOException}}
> and "kills" our rendering pipeline. Please find the loop that hangs below:
> // find /Private dict
> while (!lexer.peekToken().getText().equals("Private"))
> {
> lexer.nextToken();
> }
> There is no token named "Private" ever in the list of returned tokens
> (they're empty all the time).
> Furthermore, going deeper into the source code, it seems the class reading the
> tokens (Type1Lexer) never advances the buffer position and always
> returns an empty name token in the readToken(Token prevToken) method.
> Looking at the decrypted buffer, I cannot get anything useful out of it based
> on my current understanding.
> Unfortunately I cannot provide the PDF in question as it contains confidential
> data.
> Acrobat Reader XI Version 11.0.08 renders the document just fine.
> In addition, it seems the PDF was encrypted (40-bit RC4) with an empty
> password and says it's PDF version 1.5.
> Does this provide enough information, or can I do anything else to help
> nail this one down?
> I guess this might be a PDF document structure/feature that is not yet
> supported completely, but at least PDFBox should throw an exception instead of
> failing "silently"...
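
For the "throw an exception instead of hanging" point in the quoted
description, a bounded version of the /Private search could look roughly like
the sketch below. It does not use the real Type1Lexer; the token source is
modelled as a plain Iterator<String>, and the names PrivateDictSeek,
skipToPrivate and maxTokens are assumptions made up for this example.

```java
import java.io.IOException;
import java.util.Iterator;

// Hypothetical stand-in for the loop in parseBinary, not PDFBox code.
final class PrivateDictSeek {
    /**
     * Consumes tokens until "Private" has been read and returns how many
     * tokens were skipped before it. Throws IOException if the tokens run
     * out, or if more than maxTokens tokens pass without a match (which
     * covers a lexer that keeps returning empty tokens forever).
     */
    static int skipToPrivate(Iterator<String> tokens, int maxTokens)
            throws IOException {
        int skipped = 0;
        while (tokens.hasNext()) {
            String text = tokens.next();
            if ("Private".equals(text)) {
                return skipped;
            }
            if (++skipped > maxTokens) {
                throw new IOException("/Private dict not found after "
                        + maxTokens + " tokens");
            }
        }
        throw new IOException("/Private dict not found before end of data");
    }
}
```

With a guard like this, a font whose decrypted data never yields a "Private"
token would surface as an IOException to the caller rather than blocking the
rendering pipeline.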