[ https://issues.apache.org/jira/browse/PDFBOX-228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jukka Zitting updated PDFBOX-228:
---------------------------------
Reporter: Jukka Zitting
Fix Version/s: 0.8.0-incubator
> Error expected floating point number actual='91.-59'
> ----------------------------------------------------
>
> Key: PDFBOX-228
> URL: https://issues.apache.org/jira/browse/PDFBOX-228
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Reporter: Jukka Zitting
> Priority: Minor
> Fix For: 0.8.0-incubator
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1618712
> Originally submitted by kierantopping on 2006-12-19 03:03.
> Hello,
> When attempting to extract text from the following file
> (http://www.luckit.org/pdfbox/uk-case-of-william-goodwin.pdf), I observe the
> following exception (full stack trace at end of report):
> java.io.IOException: Error expected floating point number actual='91.-59'
> I'm using version 0.7.3 (on Linux, JRE 1.5.06). I'll happily supply any more
> information that you might require.
> Many thanks in advance,
> Kieran
> Full stack trace:
> java.io.IOException: Error expected floating point number actual='91.-59'
> at org.pdfbox.cos.COSFloat.<init>(COSFloat.java:77)
> at org.pdfbox.cos.COSNumber.get(COSNumber.java:106)
> at org.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:259)
> at org.pdfbox.pdfparser.PDFStreamParser.parse(PDFStreamParser.java:115)
> at org.pdfbox.cos.COSStream.getStreamTokens(COSStream.java:133)
> at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:202)
> at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:174)
> at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:336)
> at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:259)
> at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
> at org.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:149)
> [comment on SourceForge]
> Originally sent by tweakerbee.
> Logged In: YES
> user_id=1625706
> Originator: NO
> Correction to my other comment, which said "This should probably be 91.-59":
> I forgot to take out the dash there. It should read: "This should probably be 91.59"
> [comment on SourceForge]
> Originally sent by tweakerbee.
> Logged In: YES
> user_id=1625706
> Originator: NO
> Your PDF is flawed. On the first page there is an instruction
> 91.-59 Tz
> that is incorrect. Tz sets the horizontal text scaling (default = 100), but its
> operand has to be a valid number. This should probably be 91.-59. It appears
> just before the text "Emma", so maybe you can fix it with Adobe Acrobat or
> another editor.
> I doubt you'll get useful results though: the PDF is mostly images with OCR'ed
> text behind them, so the extracted text will contain many errors.
> HTH,
> Matthijs
> [comment on SourceForge]
> Originally sent by kierantopping.
> Logged In: YES
> user_id=1672087
> Originator: YES
> Small extra snippet: Unlike request 1610268, downgrading to 0.7.2 does NOT
> fix/workaround this issue.
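
The comments above trace the failure to a single malformed operand, `91.-59`, that reaches the number parser. One possible tolerant approach, sketched below in Java, is to retry the parse after dropping any minus sign that appears after the first character. `LenientFloatParser` and `parseLenient` are hypothetical names used for illustration and are not part of PDFBox. The repair rule itself (treating an interior dash as stray, so `91.-59` becomes `91.59`) is an assumption based on the commenter's guess, not a documented fix.

```java
import java.io.IOException;

// Hypothetical helper for illustration only; not part of PDFBox.
final class LenientFloatParser {

    private LenientFloatParser() {
    }

    // Parses a numeric token, tolerating a stray '-' after the first character.
    static float parseLenient(String token) throws IOException {
        try {
            // Well-formed input (including exponents such as "1e-5") succeeds here,
            // so the repair below never touches it.
            return Float.parseFloat(token);
        } catch (NumberFormatException first) {
            if (token.length() > 1) {
                // Assumption: any '-' beyond position 0 is an artifact of the
                // producing software. Keep a leading sign, drop the rest.
                String repaired = token.charAt(0) + token.substring(1).replace("-", "");
                try {
                    return Float.parseFloat(repaired);
                } catch (NumberFormatException second) {
                    // Still not a number; fall through to the error below.
                }
            }
            throw new IOException(
                    "Error expected floating point number actual='" + token + "'");
        }
    }
}
```

Under these assumptions, `LenientFloatParser.parseLenient("91.-59")` returns approximately `91.59f`, while a token such as `abc` still raises the `IOException`. Silently repairing numbers can hide genuinely corrupt content, so a real parser might prefer to log each repair it makes.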