[ 
https://issues.apache.org/jira/browse/PDFBOX-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003459#comment-13003459
 ] 

Andreas Lehmkühler commented on PDFBOX-970:
-------------------------------------------

I can't confirm the umlaut issue. The latest snapshot works fine for me. Do you 
have the icu-jar on your classpath?
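
A quick way to check, as a rough sketch: try to load a class from the jar by 
name. The class name com.ibm.icu.text.Normalizer is assumed here as a 
representative ICU4J class; ClasspathCheck and isOnClasspath are hypothetical 
names used only for illustration.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isOnClasspath(String className) {
        try {
            // 'false' = do not run static initializers, we only want to locate the class
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // class file found but could not be linked; treat as not usable
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: this class ships in the ICU4J jar.
        System.out.println("ICU4J present: "
                + isOnClasspath("com.ibm.icu.text.Normalizer"));
    }
}
```

Run it with the same classpath you use for the text extraction, otherwise the 
result says nothing about your actual setup.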

The position of the German quote seems to be misinterpreted. Because it is 
placed very low on the line, the algorithm presumes it has to be on the next 
line. This was already an issue with 1.4.0.
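
To illustrate the effect, here is a minimal sketch of a line-grouping 
heuristic that compares vertical positions against a tolerance. This is not 
the actual PDFBox code; the Glyph class, the toLines method, the coordinates 
and the tolerance values are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LineGroupingSketch {

    /** Hypothetical glyph run: its text and the y coordinate it was drawn at. */
    static final class Glyph {
        final String text;
        final float y;
        Glyph(String text, float y) { this.text = text; this.y = y; }
    }

    /**
     * Starts a new line whenever a glyph's y differs from the y of the
     * first glyph of the current line by more than the tolerance.
     */
    static List<String> toLines(List<Glyph> glyphs, float tolerance) {
        List<String> lines = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        float lineY = 0f;
        for (Glyph g : glyphs) {
            if (current.length() > 0 && Math.abs(g.y - lineY) > tolerance) {
                lines.add(current.toString());
                current.setLength(0);
            }
            if (current.length() == 0) {
                lineY = g.y; // first glyph defines the line's reference position
            }
            current.append(g.text);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Invented positions: the low opening quote sits 4 units below the rest.
        List<Glyph> glyphs = Arrays.asList(
                new Glyph("sagte ", 100f),
                new Glyph("\u201E", 96f),
                new Glyph("Hallo\u201C", 100f));
        System.out.println(toLines(glyphs, 2f)); // tight tolerance splits the quote off
        System.out.println(toLines(glyphs, 5f)); // looser tolerance keeps one line
    }
}
```

With a tolerance that is too tight, the low quote ends up on a line of its 
own; a looser tolerance keeps it with its neighbours, at the risk of merging 
lines that really are separate.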

I guess the JIRA error occurred because of some maintenance (the infra guys 
just upgraded JIRA to 4.2.4).

> TeX-created ligatures and umlauts are not recognised
> ----------------------------------------------------
>
>                 Key: PDFBOX-970
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-970
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 1.5.0
>         Environment: Mac OS X 10.6.6, Java(TM) SE Runtime Environment (build 
> 1.6.0_22-b04-307-10M3261)
>            Reporter: Thomas Fischer
>              Labels: textExtraction
>         Attachments: A Python Library for Provenance Recording and 
> Querying.txt, A Python Library for Provenance Recording and Querying.txt, 
> Test.pdf, Test.pdf
>
>
> Ligatures in a TeX-created document are lost, although v. 1.4 recognised 
> them, e.g.
>   1.4          1.5
>   official     ocial
>   effort       e ort
>   fields       elds
>   first        rst
> In addition, German umlauts (ä, ö, ü) are represented as ( a,  o,  u), 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
