[ 
https://issues.apache.org/jira/browse/PDFBOX-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696232#action_12696232
 ] 

Justin LeFebvre commented on PDFBOX-372:
----------------------------------------

I ran this through the trunk version of PDFBox and had no issues extracting the 
text. I believe the changes Brian and I made to the parser fixed this issue.
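
For context on the error itself: the trailing ":32" in the message looks like
the character code of the offending byte, and 32 is an ASCII space, so the CMap
parser apparently hit white space where it expected a hex digit. The PDF
specification says white space inside a hex string is to be ignored. The
following is only a minimal sketch of that lenient behaviour, not the actual
change made on trunk. The class and method names are hypothetical, and it
throws an unchecked exception where the real parser reports an IOException.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration only; this is NOT the PDFBox/FontBox code.
class LenientHexString {

    /**
     * Decodes the body of a PDF hex string (the text between '<' and '>'),
     * skipping white space between digits instead of rejecting it.
     */
    static byte[] decode(String body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int high = -1; // pending high nibble, or -1 if none is pending
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (isPdfWhitespace(c)) {
                continue; // a strict parser would fail here ("... and not  :32")
            }
            int digit = hexValue(c);
            if (digit < 0) {
                throw new IllegalArgumentException(
                    "expected hex character and not " + c + ":" + (int) c);
            }
            if (high < 0) {
                high = digit;
            } else {
                out.write((high << 4) | digit);
                high = -1;
            }
        }
        if (high >= 0) {
            out.write(high << 4); // odd digit count: last digit is padded with 0
        }
        return out.toByteArray();
    }

    // White-space characters as listed in the PDF specification.
    private static boolean isPdfWhitespace(char c) {
        return c == 0 || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
    }

    // Returns 0..15 for an ASCII hex digit, or -1 for anything else.
    private static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    public static void main(String[] args) {
        // A hex string with an embedded space decodes the same as "0041".
        System.out.println(Arrays.toString(decode("00 41")));
    }
}
```

Skipping white space rather than failing is the design choice that matters
here; any other non-hex character is still reported as an error.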

> java.io.IOException: Error: expected hex character and not  :32
> ---------------------------------------------------------------
>
>                 Key: PDFBOX-372
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-372
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 0.7.3
>         Environment: Solaris OS JDK 6
>            Reporter: DURGA DEEP
>         Attachments: Webmail02.pdf
>
>
> Unable to parse the following PDF Attachment. 
> java.io.IOException: Error: expected hex character and not  :32
>         at org.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:283)
>         at org.fontbox.cmap.CMapParser.parse(CMapParser.java:105)
>         at org.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:535)
>         at org.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:387)
>         at org.pdfbox.util.PDFStreamEngine.showString(PDFStreamEngine.java:325)
>         at org.pdfbox.util.operator.ShowText.process(ShowText.java:64)
>         at org.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:452)
>         at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:215)
>         at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:174)
>         at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:336)
>         at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:259)
>         at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
>         at org.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:149)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
