[ 
https://issues.apache.org/jira/browse/PDFBOX-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985341#action_12985341
 ] 

Hesham commented on PDFBOX-938:
-------------------------------

@Andreas, thanks for your reply.
I have attached "Sample.zip", which contains a sample executable jar for 
testing. Please download it, extract the zip, and double-click the jar file. 
The source code is also inside.

If you see any problems with it, please let me know. I am still getting the 
same problems when using it.

> Wrong extracted text using PDFBox 1.4
> -------------------------------------
>
>                 Key: PDFBOX-938
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-938
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.4.0
>            Reporter: Hesham
>         Attachments: Another book - Wrong extracted f char.pdf, 
> Another+book+-+Wrong+extracted+f+char.txt, Sample.zip, Wrong extracted f 
> char.pdf
>
>
> Hello,
>  
> I am using PDFBox v1.4 to extract text from a PDF, but some words are 
> not extracted correctly.
> For example:
> "Nefteiugansk" is read as "Nežeiugansk"
> "fiancee" is read as "Äancée"
> "first" is read as "Ärst"
>  
> Please check the attached file to test this.
> Best regards

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
