[jira] [Commented] (PDFBOX-3645) Probably not handling header read on bad PDFs

Tilman Hausherr (JIRA) Tue, 17 Jan 2017 09:08:09 -0800

    [ 
https://issues.apache.org/jira/browse/PDFBOX-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826394#comment-15826394
 ]


Tilman Hausherr commented on PDFBOX-3645:
-----------------------------------------

This was introduced with PDFBOX-318 / 
https://sourceforge.net/p/pdfbox/bugs/471/ . I tried reading this file 
("exception_version1.pdf") and got a correct version 1.3.

I suspect that the intention wasn't to allow split headers, but to prevent the 
loop from taking too long. So my inclination is to leave this unchanged.

Do you have a PDF with a split header, or was this rather an observation you 
made while looking through the code?

> Probably not handling header read on bad PDFs
> ---------------------------------------------
>
>                 Key: PDFBOX-3645
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3645
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.4
>            Reporter: G. Ralph Kuntz
>            Priority: Minor
>
> In COSParser.parseHeader, the code on line 1893 tries to handle PDFs where 
> the version number is split on a separate line from the "%PDF-". It assigns 
> the newly read line to `header`, detects a line that starts with a digit, 
> then breaks.
> On line 1902, the code tries again to match the header "%PDF-xxx", but will 
> not succeed in the case where the version was on a separate line, because 
> `header` will contain only the version number and not the leading `%PDF-`.
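
To make the scenario under discussion concrete, here is a minimal sketch of the 
control flow as the report describes it. It is a simplified model, not the 
actual COSParser source: the class name `SplitHeaderSketch`, its methods, and 
the one-digit-dot-one-digit version pattern are all assumptions made for 
illustration.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified model of the header scan discussed above; NOT the PDFBox source.
final class SplitHeaderSketch {
    private static final String MARKER = "%PDF-";
    // Assumed version format: one digit, a dot, one digit (e.g. 1.3).
    private static final Pattern HEADER = Pattern.compile("%PDF-(\\d\\.\\d)");

    // Returns the line the scan stops on: the first line containing the marker,
    // or the first line starting with a digit, or null if neither occurs.
    static String findHeaderLine(List<String> lines) {
        for (String line : lines) {
            if (line.contains(MARKER)) {
                return line;
            }
            if (!line.isEmpty() && Character.isDigit(line.charAt(0))) {
                return line; // early exit, e.g. on a line such as "1 0 obj"
            }
        }
        return null;
    }

    // Returns the version from a "%PDF-x.y" line, or null if it does not match.
    static String parseVersion(String header) {
        if (header == null) {
            return null;
        }
        Matcher m = HEADER.matcher(header);
        return m.find() ? m.group(1) : null;
    }

    // Scans the lines, then tries to match the full header on the stop line.
    static String versionOf(List<String> lines) {
        return parseVersion(findHeaderLine(lines));
    }
}
```

In this model, a header on one line (even after a junk line) yields its 
version, while a split header ("%PDF-" on one line, "1.3" on the next) yields 
no version: the scan stops on a line that holds only half of the header, so 
the full pattern cannot match. That is consistent with reading the digit 
check as an early exit rather than as support for split headers.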



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

