Re: FYI: Workaround for incorrect XRef/XRefStm input

Tilman Hausherr Wed, 23 Nov 2016 11:13:51 -0800

Please try the current version, 1.8.12; maybe it is fixed there (I see no "streamOffset != prev" anywhere, maybe you mean something else?). If not, check whether it is fixed in the version on svn,

https://svn.apache.org/viewvc/pdfbox/branches/1.8/pdfbox/src/main/java/org/apache/pdfbox/pdfparser/NonSequentialPDFParser.java?view=markup
and if not, please open an issue in JIRA, preferably with a diff.


Tilman


On 21.11.2016 at 23:44, Brzrk One wrote:

Oops... I left out that this was with PDFBox 1.8.9...

On Mon, Nov 21, 2016 at 5:12 PM, Brzrk One <[email protected]> wrote:

I have a PDF file (which I cannot share) with the trailer:

trailer
<<
/Size 16922
/Root 1 0 R
/Info 9 0 R
/ID [<495BB8DD62106B9AB4E6E1C8B591C982> <91EB7F87537B4838AF45C0D28A988280>]
/XRefStm 5347791
startxref
5135270

But there is only a single xref table in this pdf file: there is no object
with /Type /XRef.
In this situation, NonSequentialPDFParser.parseXref() will enter the
XREF_STM paragraph, but, since there is no object with /Type /XRef at
offset 5347791 (a position that lands smack dab in the middle of the xref
table) it does a brute force search for some XRef entry, and returns offset
5135270, which is the location of the one and only xref table in the file.

I added this check to the XREF_STM paragraph, which seems to get around
the problem:


if (streamOffset != prev) {
    // if the positions are the same, this is a hybrid xref table / xrefstm
    // but there is no /XRef stream...
    parseXrefObjStream(prev, false);
}
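
For readers without the PDFBox source at hand, the effect of the check can be sketched in isolation. This is only an illustration under stated assumptions: XrefStmGuard, resolveStreamOffset and shouldParseXrefStream are hypothetical names, not PDFBox API, and the fallback search is reduced to a value passed in by the caller. The offsets are the ones from the trailer quoted above.

```java
// Standalone sketch; none of these names exist in PDFBox.
class XrefStmGuard {

    // Hypothetical stand-in for the parser's offset correction: keep the
    // offset declared by /XRefStm if an /XRef stream object really starts
    // there, otherwise use whatever offset the fallback search returned.
    static long resolveStreamOffset(long declaredOffset,
                                    boolean xrefStreamAtDeclared,
                                    long fallbackOffset) {
        return xrefStreamAtDeclared ? declaredOffset : fallbackOffset;
    }

    // The guard from the message: if the resolved stream offset equals the
    // offset of the classic xref table, there is no separate /XRef stream
    // to parse, so the stream parsing step is skipped.
    static boolean shouldParseXrefStream(long streamOffset, long xrefTableOffset) {
        return streamOffset != xrefTableOffset;
    }

    public static void main(String[] args) {
        long prev = 5135270L;      // startxref: the only xref table in the file
        long declared = 5347791L;  // /XRefStm: points into the middle of that table

        // Assumption: no /XRef stream at the declared offset, and the
        // fallback search lands on the xref table itself.
        long resolved = resolveStreamOffset(declared, false, prev);

        System.out.println("parse xref stream? " + shouldParseXrefStream(resolved, prev));
    }
}
```

With the values from this file, the resolved offset equals prev, so the guard skips the stream parsing step. A genuine hybrid file, where an /XRef stream really sits at the declared offset, would still be parsed.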


  I see similar code in 2.0.3 COSParser.parseXref().
  HtH, Pat


