[ 
https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660745#comment-14660745
 ] 

Brad Baker commented on PDFBOX-1498:
------------------------------------

I'm hitting this exception when loading large PDFs (~6-8 GB each) with PDFBox 
versions 1.8.8 through 1.8.10. I get the IndexOutOfBoundsException with both 
PDDocument.load() and PDDocument.loadNonSeq(). I can provide a PDF if you tell 
me where to upload it. Or should I create a new issue?

Thanks,
Brad

Aug 06, 2015 1:31:14 PM hp.pdfbox.test.Main main
INFO: Test Large PDF Load D:\workspace_trunk_luna\test_pdfbox\pdfs\ELOISA ARTOLA CD17433_Indigo.pdf
Aug 06, 2015 1:31:14 PM hp.pdfbox.test.Main main
INFO: Create Steam
Aug 06, 2015 1:32:44 PM hp.pdfbox.test.Main main
INFO: Start Load
org.apache.pdfbox.exceptions.WrappedIOException
        at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:278)
        at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1219)
        at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1186)
        at hp.pdfbox.test.Main.main(Main.java:22)
Caused by: java.lang.IndexOutOfBoundsException: Index: 1041, Size: 1041
        at java.util.ArrayList.rangeCheck(Unknown Source)
        at java.util.ArrayList.get(Unknown Source)
        at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:110)
        at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
        at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
        at java.io.BufferedOutputStream.flush(Unknown Source)
        at java.io.FilterOutputStream.close(Unknown Source)
        at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:616)
        at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:650)
        at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:203)
        ... 3 more



INFO: Create Steam
Aug 06, 2015 1:51:47 PM hp.pdfbox.test.Main main
INFO: Start Load
Aug 06, 2015 1:53:39 PM org.apache.pdfbox.pdfparser.XrefTrailerResolver setStartxref
WARNING: Did not found XRef object at specified startxref position 8552119825
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 509, Size: 509
        at java.util.ArrayList.rangeCheck(Unknown Source)
        at java.util.ArrayList.get(Unknown Source)
        at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:110)
        at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
        at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
        at java.io.BufferedOutputStream.flush(Unknown Source)
        at java.io.FilterOutputStream.close(Unknown Source)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseCOSStream(NonSequentialPDFParser.java:1847)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:1448)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:1374)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseDictObjects(NonSequentialPDFParser.java:1348)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:429)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:915)
        at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1305)
        at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1288)
        at hp.pdfbox.test.Main.main(Main.java:22)
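
Both traces above end in the same place: RandomAccessBuffer.seek() calls 
ArrayList.get() with an index equal to the list size (1041 of 1041, 509 of 
509), which is one slot past the last valid entry. The sketch below reproduces 
only that pattern with a made-up class. ChunkedBufferSketch, CHUNK_SIZE, 
seekUnchecked and seekGrowing are hypothetical names, and the idea that the 
list holds fixed-size chunks is an assumption. This is not PDFBox's 
implementation and not a proposed fix, just an illustration of how a seek to 
the very end of a chunked buffer can raise this kind of exception.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical chunked buffer for illustration; NOT PDFBox's RandomAccessBuffer. */
class ChunkedBufferSketch {
    // Assumed chunk size, chosen only for this example.
    static final int CHUNK_SIZE = 1024;

    private final List<byte[]> chunks = new ArrayList<>();
    private int chunkIndex;
    private int offsetInChunk;

    ChunkedBufferSketch(int initialChunks) {
        for (int i = 0; i < initialChunks; i++) {
            chunks.add(new byte[CHUNK_SIZE]);
        }
    }

    int chunkCount() {
        return chunks.size();
    }

    long position() {
        return (long) chunkIndex * CHUNK_SIZE + offsetInChunk;
    }

    /** Seeks without checking whether the target chunk has been allocated. */
    void seekUnchecked(long position) {
        int index = (int) (position / CHUNK_SIZE);
        // Throws IndexOutOfBoundsException when index >= chunks.size().
        chunks.get(index);
        chunkIndex = index;
        offsetInChunk = (int) (position % CHUNK_SIZE);
    }

    /** Seeks after allocating any missing chunks up to the target position. */
    void seekGrowing(long position) {
        int index = (int) (position / CHUNK_SIZE);
        while (chunks.size() <= index) {
            chunks.add(new byte[CHUNK_SIZE]);
        }
        chunkIndex = index;
        offsetInChunk = (int) (position % CHUNK_SIZE);
    }

    public static void main(String[] args) {
        ChunkedBufferSketch buf = new ChunkedBufferSketch(3);
        long endOfData = 3L * CHUNK_SIZE; // first byte after the last chunk

        try {
            buf.seekUnchecked(endOfData);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked seek failed: " + e.getMessage());
        }

        buf.seekGrowing(endOfData);
        System.out.println("chunks after growing seek: " + buf.chunkCount());
    }
}
```

In this sketch the failing position is exactly chunk count times chunk size, 
so the computed index equals the list size. Whether the same thing happens 
inside PDFBox for these files is something only the actual sources and a 
sample PDF can confirm.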


> Index Out Of Bounds Exception while reading large PDF Document 
> ---------------------------------------------------------------
>
>                 Key: PDFBOX-1498
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1498
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Manoj Patel
>            Assignee: Andreas Lehmkühler
>
> I am getting java.lang.IndexOutOfBoundsException while reading a large PDF 
> document (800 MB). 
> Below is the full stack trace:
> Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:243)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1071)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1038)
>       at imageData.AddFooter.main(AddFooter.java:26)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 3377, Size: 3377
>       at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>       at java.util.ArrayList.get(ArrayList.java:322)
>       at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
>       at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
>       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>       at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
>       at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:606)
>       at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:566)
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:187)
>       ... 3 more


