[
https://issues.apache.org/jira/browse/PDFBOX-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989060#comment-13989060
]
Andreas Lehmkühler commented on PDFBOX-2058:
--------------------------------------------
It was a bad idea not to use the built-in mapping of a Type1C font, as special
characters can't be extracted anymore.
I reimplemented the feature in revision 1592393. I'm going to backport the
changes to the branch as well, but I have to run some tests first.
> The text of pdfs using Type1C can't be extracted correctly
> ----------------------------------------------------------
>
> Key: PDFBOX-2058
> URL: https://issues.apache.org/jira/browse/PDFBOX-2058
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.8.4, 1.8.5, 1.8.6, 2.0.0
> Reporter: Andreas Lehmkühler
> Assignee: Andreas Lehmkühler
> Labels: type1cfont
>
> PDFBOX-1770 introduced a regression with PDFs using a Type1C font. Special
> characters including ligatures can't be extracted anymore.