[ 
https://issues.apache.org/jira/browse/PDFBOX-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960297#comment-15960297
 ] 

Andreas Lehmkühler commented on PDFBOX-3747:
--------------------------------------------

The original issue is about creating PDFs with PDFBox. But yes, if you try to 
extract text from such PDFs, you get incorrect results. I simply used the 
HelloWorldTTF example with calibri.ttf on Windows 7. I'll attach a 
sample when I'm back in the office.

> CmapSubtable#getCharCodes provides values in random order
> ---------------------------------------------------------
>
>                 Key: PDFBOX-3747
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3747
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.5, 3.0.0
>            Reporter: Andreas Lehmkühler
>            Assignee: Andreas Lehmkühler
>             Fix For: 2.0.6, 3.0.0
>
>
> Some fonts may have an ambiguous glyphId to character code mapping. 
> CmapSubtable#getCharCodes provides all of them, but in a random order. We 
> should sort the list to provide a consistent order.
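
To illustrate the idea in the description, here is a minimal sketch of a
reverse lookup that sorts the character codes for each glyph ID. It is not the
FontBox implementation: the class GlyphCodeIndex, its constructor argument and
the sample values in the usage note are hypothetical, and only the method name
getCharCodes is borrowed from the issue title.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, NOT FontBox code: shows how an ambiguous
// glyph ID -> character code mapping can be returned in a stable order.
final class GlyphCodeIndex {
    private final Map<Integer, List<Integer>> glyphToCodes = new HashMap<>();

    // codeToGlyph: character code -> glyph ID (assumed input shape).
    GlyphCodeIndex(Map<Integer, Integer> codeToGlyph) {
        // Invert the mapping; several codes may point at the same glyph.
        for (Map.Entry<Integer, Integer> e : codeToGlyph.entrySet()) {
            glyphToCodes
                .computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                .add(e.getKey());
        }
        // HashMap iteration order is unspecified, so sort every list once
        // to make the result independent of insertion or hash order.
        for (List<Integer> codes : glyphToCodes.values()) {
            Collections.sort(codes);
        }
    }

    // Returns all character codes for the glyph in ascending order,
    // or an empty list if the glyph ID is unknown.
    List<Integer> getCharCodes(int gid) {
        List<Integer> codes = glyphToCodes.get(gid);
        return codes == null
            ? Collections.<Integer>emptyList()
            : Collections.unmodifiableList(codes);
    }
}
```

With made-up input where codes 0x00A0 and 0x0020 both map to glyph 3,
getCharCodes(3) returns the two codes in ascending order no matter which one
was inserted first, so a caller that picks the first entry always gets the
same code.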



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
