[ 
https://issues.apache.org/jira/browse/PDFBOX-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068558#comment-14068558
 ] 

Brandon Lyon commented on PDFBOX-2226:
--------------------------------------

I found a public PDF and tested it; the merge succeeded without the 
IndexOutOfBoundsException. I suspect the problem is related to the multi-line 
fields in the failing document, some of which are displayed in multiple places 
even though there is only one underlying field. In the documents that did 
merge successfully, you can see ") Tj 0 - 13 RTd (" where a new line is 
supposed to be. So you may need to merge documents with multi-line fields to 
reproduce the issue. I'll continue investigating, but I also need to move 
forward with my own project.

> IndexOutOfBoundsException when merging many PDFs in memory
> ----------------------------------------------------------
>
>                 Key: PDFBOX-2226
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2226
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.8.6
>         Environment: Windows 7 64-bit, JDK8
>            Reporter: Brandon Lyon
>
> An IndexOutOfBoundsException occurs when attempting to merge many (at least 
> 10) PDF documents together. All PDFs exist in byte arrays in memory, not as 
> files. The stack trace looks as follows (irrelevant details redacted):
> 2014-07-18 11:48:22,858 ERROR [io.undertow.servlet] (default task-5) ****: Uncaught exception: : ****
>       ****
> Caused by: org.apache.pdfbox.exceptions.WrappedIOException
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:267) [pdfbox-1.8.6.jar:]
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1216) [pdfbox-1.8.6.jar:]
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1183) [pdfbox-1.8.6.jar:]
>       at org.apache.pdfbox.util.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:236) [pdfbox-1.8.6.jar:]
>       at org.apache.pdfbox.util.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:185) [pdfbox-1.8.6.jar:]
>       at ****
>       ... 29 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 145, Size: 145
>       at java.util.ArrayList.rangeCheck(ArrayList.java:638) [rt.jar:1.8.0_05]
>       at java.util.ArrayList.get(ArrayList.java:414) [rt.jar:1.8.0_05]
>       at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:110) [pdfbox-1.8.6.jar:]
>       at org.apache.pdfbox.io.RandomAccessFileOutputStream.write(RandomAccessFileOutputStream.java:106) [pdfbox-1.8.6.jar:]
>       at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) [rt.jar:1.8.0_05]
>       at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) [rt.jar:1.8.0_05]
>       at java.io.FilterOutputStream.close(FilterOutputStream.java:158) [rt.jar:1.8.0_05]
>       at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:634) [pdfbox-1.8.6.jar:]
>       at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:609) [pdfbox-1.8.6.jar:]
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:194) [pdfbox-1.8.6.jar:]
>       ... 34 more
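
The innermost cause above, "Index: 145, Size: 145", shows ArrayList.get being 
called with an index exactly equal to the list size, from inside a seek on an 
in-memory buffer. The sketch below shows one way that pattern can arise in a 
chunked buffer: seeking to a position that falls exactly on the boundary after 
the last full chunk. This is only an illustration under assumptions. The class 
ChunkedBuffer and the methods seekNaive and seekGuarded are hypothetical names, 
not PDFBox's RandomAccessBuffer code, and nothing here confirms that this is 
the actual cause of the reported failure.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical chunked in-memory buffer; NOT PDFBox's RandomAccessBuffer. */
public class ChunkedBuffer {
    private final int chunkSize;
    private final List<byte[]> chunks = new ArrayList<>();
    private long size = 0;

    public ChunkedBuffer(int chunkSize) {
        this.chunkSize = chunkSize;
        chunks.add(new byte[chunkSize]);
    }

    /** Appends bytes, allocating a new chunk only when the next byte needs one. */
    public void append(byte[] data) {
        for (byte b : data) {
            int index = (int) (size / chunkSize);
            if (index == chunks.size()) {
                chunks.add(new byte[chunkSize]);
            }
            chunks.get(index)[(int) (size % chunkSize)] = b;
            size++;
        }
    }

    /** Naive seek: throws when position == size and size is a multiple of chunkSize. */
    public byte[] seekNaive(long position) {
        return chunks.get((int) (position / chunkSize));
    }

    /** Guarded seek: allocates the next chunk when seeking to the end boundary. */
    public byte[] seekGuarded(long position) {
        if (position < 0 || position > size) {
            throw new IllegalArgumentException("position out of range: " + position);
        }
        int index = (int) (position / chunkSize);
        if (index == chunks.size()) {
            chunks.add(new byte[chunkSize]);
        }
        return chunks.get(index);
    }

    public int chunkCount() {
        return chunks.size();
    }

    public static void main(String[] args) {
        ChunkedBuffer buf = new ChunkedBuffer(4);
        buf.append(new byte[8]); // fills exactly two chunks
        try {
            buf.seekNaive(8); // chunk index 2 requested, but only 2 chunks exist
        } catch (IndexOutOfBoundsException e) {
            System.out.println("naive seek failed: " + e.getMessage());
        }
        buf.seekGuarded(8);
        System.out.println("chunks after guarded seek: " + buf.chunkCount());
    }
}
```

If something like this were involved, the failure would depend on the exact 
byte length of the data written before the seek, which would fit with some 
documents merging cleanly while others fail.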



--
This message was sent by Atlassian JIRA
(v6.2#6252)
