[jira] [Commented] (PDFBOX-2058) The text of pdfs using Type1C can't be extracted correct

JIRA Sun, 04 May 2014 10:41:33 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13989060#comment-13989060
 ]


Andreas Lehmkühler commented on PDFBOX-2058:
--------------------------------------------

It was a bad idea not to use the built-in mapping of a Type1C font, as special 
characters can't be extracted anymore.

I reimplemented the feature in revision 1592393. I'm going to backport the 
changes to the branch as well, but I have to run some tests first.

> The text of pdfs using Type1C can't be extracted correct
> --------------------------------------------------------
>
>                 Key: PDFBOX-2058
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2058
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.4, 1.8.5, 1.8.6, 2.0.0
>            Reporter: Andreas Lehmkühler
>            Assignee: Andreas Lehmkühler
>              Labels: type1cfont
>
> PDFBOX-1770 introduced a regression with PDFs using a Type1C font. Special 
> characters, including ligatures, can't be extracted anymore.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
