Improved handling of erroneous data between endstream and endobj lines
----------------------------------------------------------------------

                 Key: PDFBOX-803
                 URL: https://issues.apache.org/jira/browse/PDFBOX-803
             Project: PDFBox
          Issue Type: Improvement
            Reporter: Adam Nichols
            Assignee: Adam Nichols
             Fix For: 1.3.0


I found that a PDF created by Exstream Dialogue Version 5.0.039 had a stray
">> " between the endstream and endobj keywords.  PDFBox threw an exception
when it encountered this.  This patch ignores junk characters between the two
keywords so that such files can be processed, and it logs a message warning
the user of the violation of the spec.  For reference, here's the object I
found in the file (excluding the stream data):
27     0 obj
<<
/Filter [/A85 /Fl]
/Length 322
>>
stream
(data from stream omitted)
endstream
>> endobj 
%PDF Font (F315)
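
To illustrate the idea, here is a minimal, self-contained sketch of the
lenient behaviour described above.  It is not the actual patch and does not
use PDFBox classes: the class name EndObjScanner and the method skipToEndObj
are hypothetical, and it works on a String for brevity, whereas a real parser
reads bytes and would normally use /Length to find the end of the stream data.

```java
import java.util.logging.Logger;

// Hypothetical helper, for illustration only; not part of PDFBox.
public class EndObjScanner {
    private static final Logger LOG =
            Logger.getLogger(EndObjScanner.class.getName());
    private static final String ENDSTREAM = "endstream";
    private static final String ENDOBJ = "endobj";

    /**
     * Finds the first "endstream" at or after 'from', then the next "endobj".
     * Anything other than whitespace between the two keywords is ignored,
     * and a warning is logged about it.
     *
     * @return the index just past "endobj", or -1 if either keyword is missing
     */
    public static int skipToEndObj(String data, int from) {
        int endStream = data.indexOf(ENDSTREAM, from);
        if (endStream < 0) {
            return -1;
        }
        int afterEndStream = endStream + ENDSTREAM.length();
        int endObj = data.indexOf(ENDOBJ, afterEndStream);
        if (endObj < 0) {
            return -1;
        }
        // Whitespace between the keywords is normal; anything else is junk.
        String junk = data.substring(afterEndStream, endObj).trim();
        if (!junk.isEmpty()) {
            LOG.warning("Ignoring unexpected data between endstream and endobj: '"
                    + junk + "'");
        }
        return endObj + ENDOBJ.length();
    }
}
```

Called on the object above, skipToEndObj would skip the stray ">>", log one
warning, and return the position right after "endobj", so parsing can
continue with the next object instead of failing.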

As a side note, Exstream seems to have sold their Dialogue software to HP, and
the current version is 7.  This means the bug is likely fixed in the latest
version, but there are still some older PDFs out there which PDFBox should be
able to handle without throwing an exception.
