[ https://issues.apache.org/jira/browse/PDFBOX-5719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789781#comment-17789781 ]
Tilman Hausherr commented on PDFBOX-5719:
-----------------------------------------
This is a how-to question, isn't it? I'm not even sure what the question is:
is it how PDFDebugger shows the Unicode in that table? Questions like this
should be asked on the users mailing list:
https://pdfbox.apache.org/support.html
> PDFbox fix
> -----------
>
> Key: PDFBOX-5719
> URL: https://issues.apache.org/jira/browse/PDFBOX-5719
> Project: PDFBox
> Issue Type: Improvement
> Components: Text extraction
> Affects Versions: 2.0.26
> Environment: OS: Ubuntu
> Java: 16
> Reporter: MMG
> Priority: Major
> Labels: fontbox, unicode, unicodemapping
> Attachments: Kommunikationsbedingungen-Einlagen_FIDOR-Bank.pdf
>
>
> Hello,
> I am experiencing an issue related to the "No Unicode Mapping" warning in the
> PDFBox debugger. Similar to Apache DebugBar, I am saving font glyphs to disk
> and then using an AI to detect the characters. My objective is to update the
> font Unicode map based on the AI results and save the PDF.
> Here's my main idea: save an image of each glyph that has no Unicode mapping
> to disk, send each image to the AI for detection, and then update the font's
> Unicode mapping with the results. I found a helpful example on Stack Overflow
> (link:
> [https://stackoverflow.com/questions/39485920/how-to-add-unicode-in-truetype0font-on-pdfbox-2-0-0]),
> where the answer creates a COSStream to update the font's Unicode mapping.
> This approach seems suitable for my needs.
> I want to retrieve the ToUnicode text as shown in that question, modify it to
> fix the font's Unicode mapping, and then update the font. However, I am
> unsure how to obtain the ToUnicode text view (similar to the PDF debugger).
> Can anyone provide assistance on how to address this issue? Any help would be
> greatly appreciated.
> Sample pdf file attached
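
The "modify the ToUnicode text" step described in the issue can be sketched
roughly as follows. This is a minimal illustration, not PDFBox's own API: the
class ToUnicodeCMapSketch and its methods are hypothetical names, it assumes
two-byte character codes (as in an Identity-H Type 0 font), and it only reads
bfchar entries, so any bfrange entries in the original CMap would be lost. In
PDFBox 2.0.x the CMap text itself can typically be reached through
font.getCOSObject().getDictionaryObject(COSName.TO_UNICODE) and, if that
returns a COSStream, read with createInputStream(); those calls are not
exercised here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses and regenerates the bfchar part of a ToUnicode CMap. */
final class ToUnicodeCMapSketch {
    private static final Pattern BFCHAR_BLOCK =
            Pattern.compile("beginbfchar(.*?)endbfchar", Pattern.DOTALL);
    private static final Pattern PAIR =
            Pattern.compile("<([0-9A-Fa-f]+)>\\s*<([0-9A-Fa-f]+)>");

    /** Collects code -> text from every bfchar block; bfrange blocks are ignored. */
    static Map<Integer, String> parseBfChars(String cmapText) {
        Map<Integer, String> map = new TreeMap<>();
        Matcher block = BFCHAR_BLOCK.matcher(cmapText);
        while (block.find()) {
            Matcher pair = PAIR.matcher(block.group(1));
            while (pair.find()) {
                int code = Integer.parseInt(pair.group(1), 16);
                map.put(code, decodeUtf16Hex(pair.group(2)));
            }
        }
        return map;
    }

    /** Destination strings in a ToUnicode CMap are UTF-16BE, written as hex. */
    static String decodeUtf16Hex(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_16BE);
    }

    static String encodeUtf16Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_16BE)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Writes a fresh CMap; assumes two-byte codes and at most 100 entries per block. */
    static String render(Map<Integer, String> map) {
        StringBuilder sb = new StringBuilder();
        sb.append("/CIDInit /ProcSet findresource begin\n")
          .append("12 dict begin\nbegincmap\n")
          .append("/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def\n")
          .append("/CMapName /Adobe-Identity-UCS def\n/CMapType 2 def\n")
          .append("1 begincodespacerange\n<0000> <FFFF>\nendcodespacerange\n");
        List<Map.Entry<Integer, String>> entries = new ArrayList<>(map.entrySet());
        for (int start = 0; start < entries.size(); start += 100) {
            int end = Math.min(start + 100, entries.size());
            sb.append(end - start).append(" beginbfchar\n");
            for (Map.Entry<Integer, String> e : entries.subList(start, end)) {
                sb.append(String.format("<%04X> <%s>\n",
                        e.getKey(), encodeUtf16Hex(e.getValue())));
            }
            sb.append("endbfchar\n");
        }
        sb.append("endcmap\nCMapName currentdict /CMap defineresource pop\nend\nend\n");
        return sb.toString();
    }
}
```

A caller would parse the existing text, put() the character codes recognized
from the glyph images into the map, and write the render() result back into a
stream, for example the way the linked Stack Overflow answer creates its
COSStream.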