[ 
https://issues.apache.org/jira/browse/PDFBOX-6203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18080953#comment-18080953
 ] 

Andreas Lehmkühler commented on PDFBOX-6203:
--------------------------------------------

The mentioned ticket didn't introduce the issue but revealed another issue with 
overlapping object keys.

Under the hood, the splitter uses the import page feature. That works fine as 
long as the pages don't share any resources. The splitter processes the given 
pdf as requested and returns a list containing the split pdfs, each of which 
was newly created using importPage. At the end, the code iterates over the list 
and saves one pdf after the other. In the given pdf, some of the pages share 
some of the colorspaces, which leads to pdfs using the same COS-level objects. 
When the first pdf is written, those objects get an object key. When the second 
pdf is written, the objects that already have an object key get mixed up with 
those that don't have one yet.
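
The mix-up can be illustrated with a toy model. This is only a sketch in plain 
Java and not the PDFBox writer code: the Obj class, writeNaively and 
hasCollision are hypothetical names, and the numbering rule (keep an existing 
key, number everything else from 1 upwards) is an assumption made for 
illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedKeyDemo {
    /** Stand-in for a COS-level object; key 0 means "no object key yet". */
    static final class Obj {
        final String name;
        int key;
        Obj(String name) { this.name = name; }
    }

    /**
     * Naive writer (assumed behaviour): keeps keys that are already set and
     * numbers the remaining objects from 1 upwards.
     * Returns a map from object key to the names of the objects using it.
     */
    static Map<Integer, List<String>> writeNaively(List<Obj> doc) {
        Map<Integer, List<String>> table = new HashMap<>();
        int next = 1;
        for (Obj o : doc) {
            if (o.key == 0) {
                o.key = next++;
            }
            table.computeIfAbsent(o.key, k -> new ArrayList<>()).add(o.name);
        }
        return table;
    }

    /** True if any object key is used by more than one object. */
    static boolean hasCollision(Map<Integer, List<String>> table) {
        return table.values().stream().anyMatch(names -> names.size() > 1);
    }

    public static void main(String[] args) {
        Obj shared = new Obj("colorspace"); // used by both split pdfs
        List<Obj> first = List.of(new Obj("page1"), shared);
        List<Obj> second = List.of(new Obj("page2"), new Obj("font2"), shared);

        // Writing the first pdf gives the shared object a key.
        System.out.println(hasCollision(writeNaively(first)));
        // Writing the second pdf reuses that key for a fresh object as well.
        System.out.println(hasCollision(writeNaively(second)));
    }
}
```

In this model the first pdf is written cleanly, while in the second pdf the 
shared colorspace keeps its key from the first write and a freshly numbered 
object ends up with the same key.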

I've finally found a fix, and it works now. We already had some code to detect 
overlapping object keys, but it didn't work in the given case.

I've fixed another issue as well: when splitting pdfs, PDFBox produces pdfs 
with gaps in the xref table because the splitter doesn't reset all of the 
imported objects.
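
What such gaps look like can be sketched as follows. This is again a toy model 
and not PDFBox code: it assumes that an imported object which isn't reset keeps 
its old object number, and the names gaps and renumber as well as the sample 
numbers are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class XrefGapDemo {
    /** Object numbers below the highest used one that no object uses. */
    static List<Integer> gaps(List<Integer> usedKeys) {
        TreeSet<Integer> used = new TreeSet<>(usedKeys);
        List<Integer> missing = new ArrayList<>();
        if (used.isEmpty()) {
            return missing;
        }
        for (int n = 1; n < used.last(); n++) {
            if (!used.contains(n)) {
                missing.add(n);
            }
        }
        return missing;
    }

    /** Models a full reset: the objects are numbered 1..n again. */
    static List<Integer> renumber(List<Integer> usedKeys) {
        List<Integer> fresh = new ArrayList<>();
        for (int i = 1; i <= usedKeys.size(); i++) {
            fresh.add(i);
        }
        return fresh;
    }

    public static void main(String[] args) {
        // Hypothetical object numbers carried over from the source pdf.
        List<Integer> kept = List.of(1, 2, 7, 12);
        System.out.println(gaps(kept));           // unused numbers in between
        System.out.println(gaps(renumber(kept))); // no gaps after a reset
    }
}
```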


> Splitter.split() corrupts result PDFs
> -------------------------------------
>
>                 Key: PDFBOX-6203
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-6203
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 3.0.7 PDFBox
>            Reporter: Tilman Hausherr
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>              Labels: regression
>         Attachments: PDFBOX-6203.pdf
>
>
> As reported by Andreas Cserinko in the users mailing list. It can be 
> reproduced from the command line:
> java -jar pdfbox-app-3.0.7.jar split -split 1 -endPage 2 -i PDFBOX-6203.pdf
> It worked with 3.0.6.


