Hi,
I am using PDFBox 1.8.9 to extract PDF contents (actually via Apache Tika,
which in turn uses PDFBox). I am encountering the exceptions below while
trying to parse Portuguese or Spanish PDF files. They are different exceptions
but seem to be related to the handling of Spanish or Portuguese characters. Has
anybody encountered these exceptions before? Any suggestions for fixing them?

I can attach the PDF files if that would be helpful.


Exception list:


1.)    java.lang.RuntimeException: java.io.IOException: Expected='null' actual='n' at offset 4306
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:198)
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:205)


2.)    java.lang.RuntimeException: java.io.IOException: Unknown dir object c=')' cInt=41 peek=')' peekInt=41 8544
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:198)


3.)    java.lang.RuntimeException: java.io.IOException: Error expected floating point number actual='--22.'
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:198)
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:205)


4.)    java.lang.RuntimeException: java.io.IOException: Error expected floating point number actual='173.0.2'
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:198)
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:205)


5.)    java.lang.RuntimeException: java.io.IOException: Value is not an integer: -1-15
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:198)
        at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:205)
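
For reference, here is a minimal sketch in plain Java (it does not use or
reproduce PDFBox's own parsing) that checks the tokens quoted in exceptions
3 to 5 against the standard Java number parsers. The class name TokenCheck and
its helper methods are made up for illustration. It only shows that these
tokens are not valid numbers by themselves; whether that comes from the
documents or from how the content stream is being read is something I have not
determined.

```java
class TokenCheck {
    // True if the standard Java parser accepts the token as a float.
    static boolean isFloat(String token) {
        try {
            Float.parseFloat(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if the standard Java parser accepts the token as an int.
    static boolean isInteger(String token) {
        try {
            Integer.parseInt(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Tokens copied from the exception messages above.
        String[] floatTokens = {"--22.", "173.0.2"};
        for (String token : floatTokens) {
            System.out.println(token + " parses as float: " + isFloat(token));
        }
        System.out.println("-1-15 parses as integer: " + isInteger("-1-15"));
    }
}
```

All three checks report false, while a well-formed token such as "-22." or
"173.0" is accepted by the same helpers.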


Thanks,
Mouthgalya Ganapathy
Product Development Team
