Jukka Zitting wrote:
> Hi,
>
> On Tue, Aug 24, 2010 at 2:07 PM, reinhard schwab <reinhard.sch...@aon.at> wrote:
>   
>> another exception i have encountered today
>>
>> parse
>> ftp://ftp.cordis.europa.eu/pub/fp7/ict/docs/content-knowledge/flyer-tim_en.pdf
>>
>> java.lang.RuntimeException: java.io.IOException: Not a number: -
>>     
>
> Looks like PDFBOX-592 that I just fixed in PDFBox trunk based on the
> suggestion in the issue.
>
> BR,
>
> Jukka Zitting
>
>   
now it's not a hyphen, it's a point.
same document.
i have synchronized to pdfbox trunk and rebuilt tika.

java.lang.RuntimeException: java.io.IOException: Not a number: .
    at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:149)
    at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:158)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:241)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:208)
    at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:441)
    at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:365)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:321)
    at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:241)
    at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:53)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:87)
Caused by: java.io.IOException: Not a number: .
    at org.apache.pdfbox.cos.COSNumber.get(COSNumber.java:84)
    at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:324)
    at org.apache.pdfbox.pdfparser.PDFStreamParser.access$000(PDFStreamParser.java:47)
    at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:146)
    ... 15 more
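
to illustrate the failure mode: the trace suggests that a token consisting only of "-" or "." reaches the number parser and is rejected. below is a minimal sketch of one possible lenient policy, which treats such digit-free tokens as 0. the class LenientNumberSketch and the method parseNumber are hypothetical names for illustration only; this is not the PDFBox code, and i am not claiming this is how PDFBOX-592 was or should be fixed.

```java
import java.io.IOException;

// Hypothetical illustration only; not part of PDFBox or Tika.
class LenientNumberSketch {

    // Parses a numeric token as it might appear in a PDF content stream.
    // Assumption: a lone sign or decimal point carries no digits, so this
    // sketch maps it to 0 instead of failing.
    public static double parseNumber(String token) throws IOException {
        if (token.equals("-") || token.equals("+") || token.equals(".")) {
            return 0.0;
        }
        try {
            // Note: Double.parseDouble accepts more forms than PDF syntax
            // allows (for example exponents); a real parser would be stricter.
            return Double.parseDouble(token);
        } catch (NumberFormatException e) {
            // Same message shape as in the trace above.
            throw new IOException("Not a number: " + token);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(parseNumber("."));
        System.out.println(parseNumber("-"));
        System.out.println(parseNumber("12.5"));
    }
}
```

whether mapping these tokens to 0 is the right behaviour for broken documents is a design choice; the alternative is to skip the token or to keep failing as it does now.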

best regards
reinhard
