[ 
https://issues.apache.org/jira/browse/PDFBOX-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180386#comment-14180386
 ] 

John Hewson commented on PDFBOX-2445:
-------------------------------------

{quote}
John Hewson, couldn't we change PDFTextStripper so that it does not use 
document.getDocumentCatalog().getAllPages()? As I understand it, that call 
loads everything. Or has that already changed?
{quote}

Yes, but that API is not in 1.8, though it could be added. Unless the problem 
can be reproduced, I wouldn't bother.

> Out of Memory - Extract text for Apache_Solr_4.7_Ref_Guide.pdf
> --------------------------------------------------------------
>
>                 Key: PDFBOX-2445
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2445
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, PDModel
>    Affects Versions: 1.8.7, 2.0.0
>            Reporter: Maruan Sahyoun
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
