You are correct, Leonard. I ran my file through a PDF validator against all
five or six PDF variants (PDF/A-1, PDF/A-2, etc.) and it failed every one, but
so did the 30 or 40 other PDFs from all the other energy suppliers! In fact,
every PDF document I have on my laptop fails to validate, so on the basis of
my (very basic) validation exercise it would appear that nothing is ever 100%
PDF compliant. I have even downloaded Adobe Reader X (10.1.2) from your
company and looked at some of the PDFs supplied with it, and they fail to
validate too.

By the way, iText was happy to convert those other 30 to 40 PDFs to text, but
not the one I posted. I was just curious as to why.

Richard

From: [email protected]
To: [email protected]
Date: Sat, 4 Feb 2012 15:51:35 -0800
Subject: Re: [iText-questions] PDFTextExtractor returns an exception - 'Input 
string was not in a correct format" when parsing this file



Just because Adobe Reader processes your file does NOT make it valid PDF.
Reader is EXTREMELY lenient because the average user would have no way to fix
such crappy PDFs. If you ran this file through the PDF validator in Acrobat, I
would bet it would fail.

Leonard

From: Richard Hammond [mailto:[email protected]]
Sent: Saturday, February 04, 2012 6:45 PM
To: [email protected]
Subject: Re: [iText-questions] PDFTextExtractor returns an exception - 'Input
string was not in a correct format" when parsing this file

Kevin - the stack trace wasn't posted because the problem is easily
reproducible.
 
If you use Adobe Acrobat Reader to view the PDF, it looks fine; I tested this
before I posted the query. The original file was produced by a power company
(who should know what they are doing). Also, I have used another PDF-to-text
converter on the file, and that worked fine too.
 
Regards
 
Richard
> Date: Sat, 4 Feb 2012 14:23:13 -0800
> From: [email protected]
> To: [email protected]
> Subject: Re: [iText-questions] PDFTextExtractor returns an exception - 'Input 
> string was not in a correct format" when parsing this file
> 
> Next time, post the stack trace!
> 
> For everyone's reference, here is the stack trace:
> 
> java.lang.RuntimeException: - is not a valid number -
> java.lang.NumberFormatException: For input string: "-"
> at com.itextpdf.text.pdf.PdfNumber.<init>(PdfNumber.java:83)
> at
> com.itextpdf.text.pdf.PdfContentParser.readPRObject(PdfContentParser.java:180)
> at com.itextpdf.text.pdf.PdfContentParser.parse(PdfContentParser.java:89)
> at
> com.itextpdf.text.pdf.parser.PdfContentStreamProcessor.processContent(PdfContentStreamProcessor.java:358)
> at
> com.itextpdf.text.pdf.parser.PdfReaderContentParser.processContent(PdfReaderContentParser.java:79)
> at
> com.itextpdf.text.pdf.parser.PdfTextExtractor.getTextFromPage(PdfTextExtractor.java:73)
> at
> com.itextpdf.text.pdf.parser.PdfContentReaderTool.listContentStreamForPage(PdfContentReaderTool.java:181)
> at
> com.itextpdf.text.pdf.parser.PdfContentReaderTool.listContentStream(PdfContentReaderTool.java:204)
> at
> com.itextpdf.text.pdf.parser.PdfContentReaderTool.main(PdfContentReaderTool.java:248)
> 
> 
> After a little more digging, I isolated the problem to this chunk of content:
> 
> 10 w
> 1 J
> 0.0 G
> 7820 --240 m
> 7857 --233 l
> S
> Q
> 
> 
> That sure doesn't look like valid PDF to me. So who created this PDF, and
> why did they include two negative signs?
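
For anyone who wants to check their own files for the same problem, here is a minimal sketch in plain Java (no iText dependency) that flags number-like tokens with more than one sign, such as "--240". The class and method names are made up for illustration, and the whitespace split is a deliberate simplification: it assumes you already have the decompressed content stream as text, and it ignores string literals, comments and inline images, so treat it as a rough diagnostic rather than a real content-stream parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText.
public class ContentStreamNumberCheck {

    // A PDF number: at most one leading sign, then digits with an optional decimal point.
    private static final Pattern VALID_NUMBER =
            Pattern.compile("[+-]?(\\d+\\.?\\d*|\\.\\d+)");

    // Tokens that look like they were meant to be numbers:
    // only signs, dots and digits, with at least one digit.
    private static final Pattern NUMBER_LIKE =
            Pattern.compile("[+\\-.\\d]*\\d[+\\-.\\d]*");

    // Returns every number-like token that is not a well-formed PDF number.
    public static List<String> findMalformedNumbers(String content) {
        List<String> bad = new ArrayList<>();
        for (String token : content.trim().split("\\s+")) {
            if (NUMBER_LIKE.matcher(token).matches()
                    && !VALID_NUMBER.matcher(token).matches()) {
                bad.add(token);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // The chunk of content quoted above.
        String content = "10 w\n1 J\n0.0 G\n7820 --240 m\n7857 --233 l\nS\nQ\n";
        System.out.println(findMalformedNumbers(content)); // prints [--240, --233]
    }
}
```

A viewer that tolerates the doubled sign will still draw the line, which would explain why the file looks fine in Reader while a stricter parser rejects it.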
> 
> --
> View this message in context: 
> http://itext-general.2136553.n4.nabble.com/PDFTextExtractor-returns-an-exception-Input-string-was-not-in-a-correct-format-when-parsing-this-file-tp4357472p4358032.html
> Sent from the iText - General mailing list archive at Nabble.com.
> 
> ------------------------------------------------------------------------------
> Try before you buy = See our experts in action!
> The most comprehensive online learning library for Microsoft developers
> is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
> Metro Style Apps, more. Free future releases when you subscribe now!
> http://p.sf.net/sfu/learndevnow-dev2
> _______________________________________________
> iText-questions mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/itext-questions
> 
> iText(R) is a registered trademark of 1T3XT BVBA.
> Many questions posted to this list can (and will) be answered with a 
> reference to the iText book: http://www.itextpdf.com/book/
> Please check the keywords list before you ask for examples: 
> http://itextpdf.com/themes/keywords.php
