[ 
https://issues.apache.org/jira/browse/PDFBOX-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13608738#comment-13608738
 ] 

Pierre Huttin commented on PDFBOX-1544:
---------------------------------------

After a quick patch (changing the BaseParser.readInt method into
BaseParser.readLong and fixing the references), I am able to open my 21GB
file. However, it took 3h30 to open the document, and the scratch file was
around the same size as the document.
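
To illustrate why reading the offset as an int fails, here is a minimal,
standalone sketch. It is not the actual patch and does not use PDFBox code:
the class StartXrefOffset and the helpers parseOffset and fitsInInt are
hypothetical names chosen for this example. Only the offset value
22580639698 comes from the reported error.

```java
// Hypothetical illustration, not PDFBox's BaseParser.
class StartXrefOffset {

    // Parse a decimal byte offset as a long, so values above 2^31 - 1 are accepted.
    static long parseOffset(String digits) {
        return Long.parseLong(digits.trim());
    }

    // True when the value can be represented as a Java int.
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        String offset = "22580639698"; // value taken from the reported error

        try {
            // Parsing as an int rejects anything above Integer.MAX_VALUE.
            Integer.parseInt(offset);
        } catch (NumberFormatException e) {
            System.out.println("not an int: " + offset);
        }

        long parsed = parseOffset(offset);
        System.out.println("as long: " + parsed + ", fits in int: " + fitsInInt(parsed));
    }
}
```

Any offset past the 2GB mark is larger than Integer.MAX_VALUE (2147483647),
so every place that stores or passes the offset would need to use long as
well, which is presumably what "fixing the references" covers.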
                
> Not able to loadNonSeq document larger than 2GB
> -----------------------------------------------
>
>                 Key: PDFBOX-1544
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1544
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, PDModel
>    Affects Versions: 1.7.1
>            Reporter: Pierre Huttin
>
> When I try to open a document larger than 2GB (I tested with a 21GB
> document) using the method PDDocument.loadNonSeq(), the PDFParser throws
> the following error:
> Exception in thread "main" java.io.IOException: Error: Expected an integer type, actual='22580639698'
>       at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1608)
>       at org.apache.pdfbox.pdfparser.PDFParser.parseStartXref(PDFParser.java:677)
>       at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:237)
>       at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:574)
>       at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1124)
>       at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1107)
>
> The problem seems to come from BaseParser, which tries to return an int.

