Some Ligatures in a PDF file are not recognised.
------------------------------------------------

                 Key: PDFBOX-1017
                 URL: https://issues.apache.org/jira/browse/PDFBOX-1017
             Project: PDFBox
          Issue Type: Improvement
          Components: Text extraction
    Affects Versions: 1.6.0
         Environment: Mac OS X 10.6.7, java version "1.6.0_24"
            Reporter: Thomas Fischer


In the attached file, some ligatures (Qu, Th, ch, ck, fft, ft, tt) are not 
decomposed into their component letters during text extraction. They remain in 
the extracted text as Unicode characters in the Private Use Area (U+E0xx), for 
example: "...im rabbinisen Sritum in untersiedlien Kontexten und dort,..."
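
To make the symptom easier to check, here is a minimal sketch (not PDFBox code) 
that scans already-extracted text for code points in the Private Use Area and 
optionally replaces them. The class name PuaScanner, its methods, and the 
replacement table are hypothetical. The real mapping from a private-use code 
point to a ligature depends on the font embedded in the PDF, so it has to be 
supplied by the caller.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for post-processing extracted text; not part of PDFBox.
class PuaScanner {

    // True if the code point lies in the BMP Private Use Area (U+E000..U+F8FF).
    static boolean isPrivateUse(int codePoint) {
        return codePoint >= 0xE000 && codePoint <= 0xF8FF;
    }

    // Collects every private-use code point found in the text, in order.
    static List<Integer> findPrivateUse(String text) {
        return text.codePoints()
                .filter(PuaScanner::isPrivateUse)
                .boxed()
                .collect(Collectors.toList());
    }

    // Replaces private-use code points that have an entry in the
    // caller-supplied mapping; all other characters are copied unchanged.
    static String replacePrivateUse(String text, Map<Integer, String> mapping) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            String replacement = isPrivateUse(cp) ? mapping.get(cp) : null;
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }
}
```

A non-empty result from findPrivateUse on the extracted text would confirm that 
unmapped glyphs are present. Any table passed to replacePrivateUse (for example 
U+E001 mapped to "ch") is an assumption that must be verified against the 
specific font.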

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira