This smells of exception overwriting. BaseParser.java:610 is actually a clean-up step, and if that step crashes it is quite possible that the original error gets lost.
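
To illustrate what I mean by overwriting, here is a minimal stand-alone sketch. It is not PDFBox code: the class and method names are made up, and the OutOfMemoryError is thrown by hand purely for the demonstration. It only shows the general Java behaviour that an exception thrown during clean-up in a finally block replaces the one that was already in flight, whereas try-with-resources (Java 7 and later) keeps the original and attaches the clean-up failure as a suppressed exception.

```java
public class MaskingDemo {

    // Hypothetical resource whose clean-up fails, standing in for a
    // close()/flush() that blows up while another error is propagating.
    static class FailingCleanup implements AutoCloseable {
        @Override
        public void close() {
            throw new IndexOutOfBoundsException("clean-up failed");
        }
    }

    // Classic try/finally: the exception from the finally block
    // replaces the original error, which is silently dropped.
    static Throwable maskedByFinally() {
        try {
            FailingCleanup cleanup = new FailingCleanup();
            try {
                throw new OutOfMemoryError("simulated original error");
            } finally {
                cleanup.close();
            }
        } catch (Throwable t) {
            return t;
        }
    }

    // try-with-resources: the original error survives and the clean-up
    // failure is recorded via Throwable.getSuppressed().
    static Throwable keptByTryWithResources() {
        try (FailingCleanup cleanup = new FailingCleanup()) {
            throw new OutOfMemoryError("simulated original error");
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        Throwable masked = maskedByFinally();
        System.out.println("try/finally reports:        " + masked);

        Throwable kept = keptByTryWithResources();
        System.out.println("try-with-resources reports: " + kept);
        for (Throwable s : kept.getSuppressed()) {
            System.out.println("  suppressed: " + s);
        }
    }
}
```

In the first case the caller only ever sees the IndexOutOfBoundsException, which is the pattern I suspect in your trace.
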
My gut feeling is that there is an OutOfMemoryError somewhere earlier that gets wiped out by the crashed clean-up procedure.

That said: did you give your application at least 2 GB of heap memory? The amount is arbitrary, but I suspect you will need, as a bare minimum, the size of the file and then some, and possibly a good deal more (for pointers and other overhead). I would start with -Xmx2G.

2013/10/30 Brent Pathakis <[email protected]>:
> Hi,
>
> I'm trying to use PDFbox to load a large pdf document (>1gb):
> [
> File inputPdf = new File("c:\\some.pdf");
> PDFTextStripper stop = new PDFTextStripper ();
>
> FileInputStream fis=null;
> fis=new FileInputStream(inputPdf);
> pd = PDDocument.load(fis,true);[/CODE]
>
> This code works fine for smaller pdfs, but only larger ones I'm getting:
>
> org.apache.pdfbox.exceptions.WrappedIOException
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:245)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1192)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1159)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1130)
>     at PDFRedact.main(PDFRedact.java:19)
> Caused by: java.lang.IndexOutOfBoundsException: Index: 15625, Size: 15625
>     at java.util.ArrayList.RangeCheck(Unknown Source)
>     at java.util.ArrayList.get(Unknown Source)
>     at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
>     at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106)
>     at java.io.BufferedOutputStream.flushBuffer(Unknown Source)
>     at java.io.BufferedOutputStream.flush(Unknown Source)
>     at java.io.FilterOutputStream.close(Unknown Source)
>     at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:610)
>     at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:568)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:188)
>     ... 4 more
>
>
> Any ideas or help would be appreciated.
>
> *Brent Pathakis*
> 801 536 0041

