Here's an additional error:

WARNING: java.lang.NullPointerException
    at org.apache.pdfbox.util.TextPosition.<init>(TextPosition.java:95)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:443)
    at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:50)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:493)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:214)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:173)
    at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:358)
    at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:282)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:238)

16 Apr 2009 11:16:28 PM org.apache.pdfbox.pdfparser.BaseParser parseCOSArray
WARNING: Corrupt object reference
Jamie Band wrote:
I am also getting the following:

java.lang.System.arraycopy(Object, int, Object, int, int) at org.apache.pdfbox.util.PDFTextStripper.writeText(PDDocument, Writer) [ARRAY INDEX OUT OF BOUNDS]


Jamie Band wrote:
Hi There

When calling PDFBox to extract text from PDF documents, I find it prudent to wrap the calls in a try/catch block that catches Throwable, since PDFBox frequently throws NullPointerException and ClassCastException.
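
The defensive wrapping described above might look roughly like the sketch below. It is only an illustration: `SafeExtraction`, `Extractor` and `Result` are hypothetical names, not part of PDFBox, and the actual PDFBox call (for example the `PDFTextStripper.writeText(PDDocument, Writer)` call seen in the traces in this thread) is assumed to be supplied by the caller.

```java
/** Hypothetical helper for illustration only; not part of PDFBox. */
final class SafeExtraction {

    /** Stands in for whatever PDFBox call performs the real extraction. */
    @FunctionalInterface
    interface Extractor {
        String extract() throws Exception;
    }

    /** Holds either the extracted text or the failure that prevented extraction. */
    static final class Result {
        final String text;     // null when extraction failed
        final Throwable error; // null when extraction succeeded

        Result(String text, Throwable error) {
            this.text = text;
            this.error = error;
        }

        boolean ok() {
            return error == null;
        }
    }

    /** Runs the extractor and turns any failure into a Result instead of propagating it. */
    static Result run(Extractor extractor) {
        try {
            return new Result(extractor.extract(), null);
        } catch (VirtualMachineError e) {
            // Design choice of this sketch: OutOfMemoryError and similar are not
            // recoverable, so they are rethrown rather than swallowed.
            throw e;
        } catch (Throwable t) {
            // Covers NullPointerException, ClassCastException and
            // ArrayIndexOutOfBoundsException, so one bad document does not
            // take down a long-running application.
            return new Result(null, t);
        }
    }
}
```

A caller would pass the extraction as a lambda, e.g. `SafeExtraction.run(() -> extractText(file))`, where `extractText` is again a hypothetical method wrapping the PDFBox calls, and then log `result.error` together with the file name so the failing document can be reported.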

Occasionally, I receive null pointer exceptions in the following:

org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(List, COSDictionary, boolean) (The method calls itself recursively) [NULL POINTER]
org.apache.pdfbox.encryption.DocumentEncryption.decryptDocument(String) [CLASSCAST EXCEPTION]

I am using the latest checkout from svn.

I am sorry I don't have more information than this, since I obtained the exceptions from a long-running application.

Regards,

Jamie




