[ 
https://issues.apache.org/jira/browse/PDFBOX-3506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15516786#comment-15516786
 ] 

Tilman Hausherr commented on PDFBOX-3506:
-----------------------------------------

We do process both, and with this file we have to, because the /XRefStm value 
is the same in every trailer: 81244.

A slightly more advanced strategy might be to add a check, i.e. to put the 
highest offset into the map, but only if the offsets are positive. (Negative 
offsets are the object numbers of object streams.)

In the attached file, offset 83043 would have priority over offset 1016.
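
To make the idea concrete, here is a minimal sketch of such a check. It is not 
the actual PDFBox parser code: the class name XrefOffsetPolicy, the method 
putOffset and the use of plain strings as object keys are made up for 
illustration. It also assumes that when either offset is negative, the entry 
seen first is kept, as in the log below.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: illustrates "highest positive offset wins".
class XrefOffsetPolicy {

    /**
     * Stores an xref entry in the map.
     * A new key is always stored. An existing key is overwritten only if both
     * the stored and the new offset are positive and the new one is higher.
     * Returns true if the entry was stored, false if it was ignored.
     */
    static boolean putOffset(Map<String, Long> map, String objKey, long offset) {
        Long existing = map.get(objKey);
        if (existing == null) {
            map.put(objKey, offset);
            return true;
        }
        // Negative values are object stream numbers, not file offsets,
        // so they are never compared (assumption: first seen wins).
        if (existing > 0 && offset > 0 && offset > existing) {
            map.put(objKey, offset);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Long> map = new LinkedHashMap<>();
        putOffset(map, "7 0 R", 1016L);
        putOffset(map, "7 0 R", 83043L); // higher positive offset replaces 1016
        putOffset(map, "8 0 R", -14L);
        putOffset(map, "8 0 R", 500L);   // ignored, stored value is negative
        System.out.println(map);         // prints {7 0 R=83043, 8 0 R=-14}
    }
}
```

With this rule the order in which the two entries for 7 0 R are seen would no 
longer matter; 83043 ends up in the map either way.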

Here's what's going on with the file. Note that the parsing goes backwards, 
i.e. the first table listed here is the one at the bottom of the file.
{code}
nextXrefObj() at offset 83707, type: TABLE
objKey: 7 0 R, offset: 83043 added to map
objKey: 1 0 R, offset: 17 added to map
objKey: 2 0 R, offset: 124 added to map
objKey: 3 0 R, offset: 180 added to map
objKey: 4 0 R, offset: 416 added to map
objKey: 5 0 R, offset: 609 added to map
objKey: 6 0 R, offset: 777 added to map
objKey: 7 0 R, offset: 1016 *** ignored ***
objKey: 8 0 R, offset: -14 added to map
objKey: 9 0 R, offset: -14 added to map
objKey: 10 0 R, offset: -14 added to map
objKey: 11 0 R, offset: -14 added to map
objKey: 12 0 R, offset: -14 added to map
objKey: 13 0 R, offset: -14 added to map
objKey: 14 0 R, offset: 1243 added to map
objKey: 15 0 R, offset: 1627 added to map
objKey: 16 0 R, offset: 1828 added to map
nextXrefObj() at offset 82841, type: TABLE
objKey: 7 0 R, offset: 82208 added to map
objKey: 1 0 R, offset: 17 added to map
objKey: 2 0 R, offset: 124 added to map
objKey: 3 0 R, offset: 180 added to map
objKey: 4 0 R, offset: 416 added to map
objKey: 5 0 R, offset: 609 added to map
objKey: 6 0 R, offset: 777 added to map
objKey: 7 0 R, offset: 1016 *** ignored ***
objKey: 8 0 R, offset: -14 added to map
objKey: 9 0 R, offset: -14 added to map
objKey: 10 0 R, offset: -14 added to map
objKey: 11 0 R, offset: -14 added to map
objKey: 12 0 R, offset: -14 added to map
objKey: 13 0 R, offset: -14 added to map
objKey: 14 0 R, offset: 1243 added to map
objKey: 15 0 R, offset: 1627 added to map
objKey: 16 0 R, offset: 1828 added to map
nextXrefObj() at offset 82030, type: TABLE
objKey: 1 0 R, offset: 17 added to map
objKey: 2 0 R, offset: 124 added to map
objKey: 3 0 R, offset: 180 added to map
objKey: 4 0 R, offset: 416 added to map
objKey: 5 0 R, offset: 609 added to map
objKey: 6 0 R, offset: 777 added to map
objKey: 7 0 R, offset: 1016 added to map
objKey: 8 0 R, offset: -14 added to map
objKey: 9 0 R, offset: -14 added to map
objKey: 10 0 R, offset: -14 added to map
objKey: 11 0 R, offset: -14 added to map
objKey: 12 0 R, offset: -14 added to map
objKey: 13 0 R, offset: -14 added to map
objKey: 14 0 R, offset: 1243 added to map
objKey: 15 0 R, offset: 1627 added to map
objKey: 16 0 R, offset: 1828 added to map
nextXrefObj() at offset 81514, type: TABLE
objKey: 1 0 R, offset: 17 added to map
objKey: 2 0 R, offset: 124 added to map
objKey: 3 0 R, offset: 180 added to map
objKey: 4 0 R, offset: 416 added to map
objKey: 5 0 R, offset: 609 added to map
objKey: 6 0 R, offset: 777 added to map
objKey: 7 0 R, offset: 1016 added to map
objKey: 15 0 R, offset: 1627 added to map
objKey: 16 0 R, offset: 1828 added to map
objKey: 17 0 R, offset: 81244 added to map
{code}

I also had a look at a few files from the digitalcorpora site with many XRefStm 
entries, and in those the /XRefStm values were always different.

> Not able to read the custom metadata in trailer section
> -------------------------------------------------------
>
>                 Key: PDFBOX-3506
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3506
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.3, 2.1.0
>         Environment: Windows 7, PDF version 1.5
>            Reporter: Kent Lee
>         Attachments: PDFBOX-3506.pdf, test.pdf
>
>
> When using the code below, I am not able to retrieve the custom metadata 
> stored in the trailer section of the PDF:
> PDDocumentInformation documentInformation = document.getDocumentInformation();
> Set<String> customMetadataKeys = documentInformation.getMetadataKeys();
> PDFBox 1.8.12 does not have this issue.


