[
https://issues.apache.org/jira/browse/PDFBOX-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810518#comment-16810518
]
Andreas Lehmkühler commented on PDFBOX-4502:
--------------------------------------------
I ran two tests, and the results are more or less equal:
* split pages 950 to 1000 from the PDF reference
* split the PDF reference into pieces of 50 pages each
As Tilman's optimization is the better approach for processing huge files, I've
followed his proposal.
> Performance issue with splitter and huge files
> ----------------------------------------------
>
> Key: PDFBOX-4502
> URL: https://issues.apache.org/jira/browse/PDFBOX-4502
> Project: PDFBox
> Issue Type: Improvement
> Components: Utilities
> Affects Versions: 2.0.14, 3.0.0 PDFBox
> Reporter: ccouturi
> Assignee: Andreas Lehmkühler
> Priority: Major
> Labels: patch, performance
> Fix For: 2.0.15, 3.0.0 PDFBox
>
> Attachments: fix_seek_splitter.patch
>
>
> Splitter.processPages currently has a performance issue when it is used in
> "seek" mode with startPage > 0 on huge files.
> sourceDocument.getPage(i);
> is called on each page of the document, and this method is time-consuming. It
> may be possible to make this call only for pages between the start and end
> page parameters.
>
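The idea in the description above can be sketched without PDFBox. This is a minimal illustration under stated assumptions, not the attached patch and not the committed fix: the class SeekSplitSketch, both collect methods, and the IntFunction standing in for sourceDocument.getPage(i) are hypothetical names, and the page count of 1300 is arbitrary. Page numbers are treated as 1-based and the lookup index as 0-based.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

public class SeekSplitSketch {

    // Looks up every page of the document and filters afterwards,
    // so the costly lookup also runs for pages outside the range.
    public static <P> List<P> collectNaive(IntFunction<P> getPage, int pageCount,
                                           int startPage, int endPage) {
        List<P> selected = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            P page = getPage.apply(i); // lookup happens even for skipped pages
            int pageNumber = i + 1;
            if (pageNumber >= startPage && pageNumber <= endPage) {
                selected.add(page);
            }
        }
        return selected;
    }

    // Looks up only the pages between startPage and endPage (inclusive),
    // clamped to the valid range of the document.
    public static <P> List<P> collectInRange(IntFunction<P> getPage, int pageCount,
                                             int startPage, int endPage) {
        List<P> selected = new ArrayList<>();
        int first = Math.max(startPage, 1);
        int last = Math.min(endPage, pageCount);
        for (int pageNumber = first; pageNumber <= last; pageNumber++) {
            selected.add(getPage.apply(pageNumber - 1));
        }
        return selected;
    }

    public static void main(String[] args) {
        int[] lookups = {0};
        // Hypothetical stand-in for an expensive page lookup; counts its calls.
        IntFunction<String> getPage = i -> {
            lookups[0]++;
            return "page " + (i + 1);
        };

        List<String> naive = collectNaive(getPage, 1300, 950, 1000);
        int naiveLookups = lookups[0];

        lookups[0] = 0;
        List<String> ranged = collectInRange(getPage, 1300, 950, 1000);
        int rangedLookups = lookups[0];

        System.out.println("same pages selected: " + naive.equals(ranged));
        System.out.println("lookups: " + naiveLookups + " vs " + rangedLookups);
    }
}
```

Both methods select the same pages; the second one only performs as many lookups as there are pages in the requested range, which is the saving the issue asks for when startPage is far into a large file.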
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)