Could you try the non-sequential parser, PDDocument.loadNonSeq(...)?

Kind regards
Maruan Sahyoun

On 22.01.2013 at 08:06, Manoj Patel (JIRA) <[email protected]> wrote:

>
> [ https://issues.apache.org/jira/browse/PDFBOX-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559441#comment-13559441 ]
>
> Manoj Patel commented on PDFBOX-1498:
> -------------------------------------
>
> Its around 800 mb size document. You can try any pdf file with same size.
>
>> Index Out Of Bounds Exception while reading large PDF Document
>> ---------------------------------------------------------------
>>
>>                 Key: PDFBOX-1498
>>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1498
>>             Project: PDFBox
>>          Issue Type: Bug
>>            Reporter: Manoj Patel
>>            Assignee: Andreas Lehmkühler
>>
>> I am getting java.lang.IndexOutOfBoundsException while reading large PDF
>> document (800 mb).
>> Below is the full stack
>> Exception in thread "main" org.apache.pdfbox.exceptions.WrappedIOException
>>      at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:243)
>>      at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1071)
>>      at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1038)
>>      at imageData.AddFooter.main(AddFooter.java:26)
>> Caused by: java.lang.IndexOutOfBoundsException: Index: 3377, Size: 3377
>>      at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>>      at java.util.ArrayList.get(ArrayList.java:322)
>>      at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
>>      at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
>>      at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>>      at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
>>      at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
>>      at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:606)
>>      at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:566)
>>      at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:187)
>>      ... 3 more
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
