[
https://issues.apache.org/jira/browse/PDFBOX-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960666#comment-15960666
]
Andreas Lehmkühler commented on PDFBOX-3747:
--------------------------------------------
I've attached a sample PDF as requested. It was created on Windows 7 using
calibri.ttf as the font. The ToUnicode mapping contains a wrong entry for the
hyphen character:
{code}
<0372> <0372> <2010>
{code}
The correct mapping is:
{code}
<0372> <0372> <002D>
{code}
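A minimal sketch of why a sorted order would help here. It assumes the hyphen glyph is reachable from both U+002D and U+2010 in the font's cmap, which this comment does not state explicitly. The class and method names are hypothetical and are not the actual PDFBox change; the sketch only shows that sorting the candidate character codes makes the first entry predictable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of PDFBox or FontBox.
class CharCodeOrder {

    // Returns a sorted copy of the candidate character codes so that
    // callers always see the same order, whatever order the input had.
    static List<Integer> sortedCharCodes(List<Integer> charCodes) {
        List<Integer> sorted = new ArrayList<>(charCodes);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Assumed input: both code points map to the same glyph,
        // delivered in an arbitrary order.
        List<Integer> codes = new ArrayList<>();
        codes.add(0x2010); // HYPHEN
        codes.add(0x002D); // HYPHEN-MINUS

        List<Integer> sorted = sortedCharCodes(codes);

        // After sorting, the lowest code point comes first.
        System.out.printf("U+%04X%n", sorted.get(0)); // prints "U+002D"
    }
}
```

With the codes in ascending order, a caller that takes the first entry gets U+002D on every run instead of whichever value happened to come first.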
> CmapSubtable#getCharCodes provides values in random order
> ---------------------------------------------------------
>
> Key: PDFBOX-3747
> URL: https://issues.apache.org/jira/browse/PDFBOX-3747
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox
> Affects Versions: 2.0.5, 3.0.0
> Reporter: Andreas Lehmkühler
> Assignee: Andreas Lehmkühler
> Fix For: 2.0.6, 3.0.0
>
> Attachments: PDFBOX-3747.pdf
>
>
> Some fonts may have an ambiguous glyphId to character code mapping.
> CmapSubtable#getCharCodes returns all of the matching codes, but in random
> order. We should sort the list to provide a consistent order.