[ https://issues.apache.org/jira/browse/PDFBOX-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025332#comment-17025332 ]
Andreas Lehmkühler commented on PDFBOX-4569:
--------------------------------------------
The hardest part was the reintegration of the branch in the trunk ;-)
I stumbled upon the overwrite issue when loading a malformed PDF. The parser
itself modifies some of the pages to repair the page count. I didn't
change anything at all. BTW, if we reread all objects instead of caching them,
the reread objects aren't the same instances as the ones read first, but they
should be equal. I stumbled upon some code using "==" instead of "equals", and
simply switching to "equals" didn't work because it broke some other cases.
This seems related to PDFBOX-4723.
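To illustrate the "==" vs. "equals" point, here is a minimal sketch in plain Java. ParsedObject and reread() are hypothetical stand-ins invented for this example; they are not PDFBox classes, and the sketch makes no claim about how PDFBox's own types define equality. It only shows why identity comparisons stop working once objects are reread instead of cached.

```java
import java.util.Objects;

public class RereadIdentityDemo {

    // Hypothetical stand-in for an object produced by a parser (not a PDFBox type).
    static final class ParsedObject {
        final long objectNumber;
        final int generation;
        final String content;

        ParsedObject(long objectNumber, int generation, String content) {
            this.objectNumber = objectNumber;
            this.generation = generation;
            this.content = content;
        }

        // Value-based equality: two instances are equal if all fields match.
        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof ParsedObject)) {
                return false;
            }
            ParsedObject that = (ParsedObject) other;
            return objectNumber == that.objectNumber
                    && generation == that.generation
                    && content.equals(that.content);
        }

        @Override
        public int hashCode() {
            return Objects.hash(objectNumber, generation, content);
        }
    }

    // Hypothetical: simulates parsing the same object from the source again.
    // Without a cache, every call returns a fresh instance.
    static ParsedObject reread(long objectNumber) {
        return new ParsedObject(objectNumber, 0, "<< /Type /Page >>");
    }

    public static void main(String[] args) {
        ParsedObject first = reread(12);
        ParsedObject second = reread(12);

        System.out.println(first == second);      // false: different instances
        System.out.println(first.equals(second)); // true: same field values
    }
}
```

Code that relies on "==" implicitly assumes a cache hands back the same instance every time. Whether "equals" is a safe replacement depends on how equality is defined for each type involved, which is presumably why a blanket switch broke other cases.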
IMHO we need to analyze the situation first to be able to come up with a
possible solution.
I'd like to follow up on the idea of using memory-mapped files, especially as
[~torakiki] posted a very promising hint on dev@.
However, we should discuss this on dev@ or create new tickets.
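For readers unfamiliar with memory-mapped files, a rough sketch using only the standard java.nio API follows. It is not the approach hinted at on dev@ (that hint isn't reproduced here), and MappedReadDemo and readAt() are hypothetical names. It only shows how a region of a file can be read at an arbitrary offset without loading the whole file, which is the kind of access an on-demand parser would need.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedReadDemo {

    // Hypothetical helper: maps only the requested region and returns it as text.
    // The region must lie within the file, since a read-only channel cannot extend it.
    static String readAt(Path file, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] bytes = new byte[length];
            buffer.get(bytes);
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a tiny sample file so the example is self-contained.
        Path tmp = Files.createTempFile("mapped-demo", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "%PDF-1.7\n1 0 obj\n".getBytes(StandardCharsets.ISO_8859_1));

        // Jump straight to byte offset 9 and read 7 bytes.
        System.out.println(readAt(tmp, 9, 7));
    }
}
```

A real parser would more likely keep one long-lived mapping (or a few, for large files) rather than map per read. Mapping per call just keeps the sketch short.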
> Implement an ondemand Parser
> ----------------------------
>
> Key: PDFBOX-4569
> URL: https://issues.apache.org/jira/browse/PDFBOX-4569
> Project: PDFBox
> Issue Type: Improvement
> Components: Parsing
> Affects Versions: 3.0.0 PDFBox
> Reporter: Andreas Lehmkühler
> Assignee: Andreas Lehmkühler
> Priority: Major
> Fix For: 3.0.0 PDFBox
>
> Attachments: PDFBOX-1084.pdf
>
>
> There is a need to replace the big bang parser with an ondemand parser