[
https://issues.apache.org/jira/browse/PDFBOX-5961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929167#comment-17929167
]
Michael Klink commented on PDFBOX-5961:
---------------------------------------
{quote}The problematic number is the number sign, but as UTF8. {quote}
Could you explain in detail? I had a quick look at the ToUnicode map, and there
"№" (U+2116) is the target, encoded as UTF-16, of a mapping in a bfrange entry:
{noformat}<E28480> <E284BF> <2100>{noformat}
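
To make the arithmetic behind that entry concrete, here is a minimal sketch. It assumes the usual bfrange semantics (destination = destination start + offset of the code within the range). The class and method names are hypothetical and are not PDFBox API. It also checks the observation that the source code 0xE28496 from the exception message, read as three bytes, is the UTF-8 encoding of the same character.

```java
import java.nio.charset.StandardCharsets;

class BfRangeSketch {
    // Hypothetical helper, not PDFBox API: maps a code inside a bfrange
    // <start> <end> <dstStart> to its destination by adding the offset.
    static int mapBfRange(int code, int start, int end, int dstStart) {
        if (code < start || code > end) {
            throw new IllegalArgumentException("code outside range");
        }
        return dstStart + (code - start);
    }

    // Interprets the low three bytes of the code as a UTF-8 sequence.
    static String decodeAsUtf8(int code) {
        byte[] bytes = { (byte) (code >> 16), (byte) (code >> 8), (byte) code };
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        int code = 0xE28496; // value from the exception message
        int mapped = mapBfRange(code, 0xE28480, 0xE284BF, 0x2100);
        System.out.printf("U+%04X%n", mapped); // U+2116
        // The UTF-8 reading of the source code yields the same code point.
        System.out.println(decodeAsUtf8(code).codePointAt(0) == mapped); // true
    }
}
```

So within this entry the source codes look like UTF-8 byte sequences and the destinations are the corresponding UTF-16 values. Whether that is what the original remark meant is the open question.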
> IllegalArgumentException: Not a valid Unicode code point: 0xE28496
> ------------------------------------------------------------------
>
> Key: PDFBOX-5961
> URL: https://issues.apache.org/jira/browse/PDFBOX-5961
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 4.0.0
> Reporter: Tilman Hausherr
> Priority: Major
> Attachments: PDFJS-19527.pdf
>
>
> {noformat}
> IllegalArgumentException: Not a valid Unicode code point: 0xE28496
> java.base/java.lang.String.valueOfCodePoint(String.java:3345)
> java.base/java.lang.Character.toString(Character.java:8053)
> org.apache.pdfbox.pdmodel.font.PDType0Font.toUnicode(PDType0Font.java:548)
> org.apache.pdfbox.pdmodel.font.PDFont.toUnicode(PDFont.java:450)
> org.apache.pdfbox.text.LegacyPDFStreamEngine.showGlyph(LegacyPDFStreamEngine.java:279)
> org.apache.pdfbox.debugger.pagepane.DebugTextOverlay$DebugTextStripper.showGlyph(DebugTextOverlay.java:209)
> org.apache.pdfbox.contentstream.PDFStreamEngine.showText(PDFStreamEngine.java:792)
> org.apache.pdfbox.contentstream.PDFStreamEngine.showTextString(PDFStreamEngine.java:651)
> {noformat}
> The problem is somehow related to the /ToUnicode stream at
> {{Root/Pages/Kids/[0]/Resources/Font/F3/ToUnicode}}. This is a different bug
> from PDFBOX-5960 and is not the problem reported in PDF.js 19527. I played
> around a bit with supporting 3-byte codes (memo for me: version before 21.2
> 12:20), but the exception is still the same.
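
The top two frames of the trace show that the value 0xE28496 reached Character.toString(int), which rejects anything above U+10FFFF. The following sketch reproduces just that JDK behaviour in isolation (Java 11 or later). It does not involve PDFBox, and the helper name is hypothetical. It illustrates the failure mode only; it is not a diagnosis of why the raw code ends up there.

```java
class CodePointCheck {
    // Hypothetical helper, not part of PDFBox: returns the string for a
    // code point, or null when the value is outside U+0000..U+10FFFF.
    static String toStringOrNull(int codePoint) {
        return Character.isValidCodePoint(codePoint)
                ? Character.toString(codePoint)
                : null;
    }

    public static void main(String[] args) {
        try {
            // A raw 3-byte code, not a Unicode code point.
            Character.toString(0xE28496);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(toStringOrNull(0xE28496)); // null
        System.out.println(toStringOrNull(0x2116));
    }
}
```

A guard like this only avoids the exception. The mapping would still have to produce U+2116 for the text to come out right.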
--
This message was sent by Atlassian Jira
(v8.20.10#820010)