Christian Appl created PDFBOX-4793:
--------------------------------------

             Summary: Questionable fallback font for some embedded chinese fonts
                 Key: PDFBOX-4793
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4793
             Project: PDFBox
          Issue Type: Bug
          Components: Rendering
    Affects Versions: 2.0.18
            Reporter: Christian Appl
         Attachments: image-2020-03-04-09-49-42-323.png, 
image-2020-03-04-09-58-01-055.png, image-2020-03-04-10-09-25-343.png, 
image-2020-03-04-10-31-03-065.png, pdf_font-zhcn.pdf

*Issue:*
I tried to render PDFs that contain embedded Chinese fonts. Neither the PDF 
Debugger, nor printouts of the document (PDFPrintable), nor the PDFRenderer 
renders the Chinese glyphs correctly; all of them draw placeholders instead.

*Assumptions:*
I assume that the embedded fonts are incomplete and do not contain all the 
glyphs required to render the text properly, so PDFBox attempts to use the 
previously determined fallback font (!?):
 !image-2020-03-04-09-49-42-323.png! 
 !image-2020-03-04-09-58-01-055.png! 
It then fails to find the glyphs in that fallback font.

This is not surprising, as the fallback font "MalgunGothic-Semilight" (a 
standard Windows font) does not contain Chinese characters.
 !image-2020-03-04-10-09-25-343.png! 

*Debugging:*
I tried to understand how the fallback font is determined and what I could do 
to solve this problem on my end, but I was unable to find a satisfying 
solution.
My best guess so far is that the CIDFontMapping (FontMapperImpl) is to blame 
for selecting an unfit fallback font.
Although it seems to check whether the required code pages are contained in a 
fallback font, it still ranks the Malgun font as the top scorer and best 
substitute font, even though that font clearly does not contain all required 
code pages.
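
To illustrate the behaviour I would expect, here is a minimal, self-contained 
sketch. It is NOT the FontMapperImpl code and does not use the PDFBox API; the 
class FallbackFontSketch, the Candidate type, the code page names and the 
score values are all hypothetical and only serve to show the idea: code page 
coverage is treated as a hard requirement, and the remaining score only ranks 
the fonts that pass it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of a fallback selection; not PDFBox's actual algorithm.
public class FallbackFontSketch {

    /** Stand-in for an installed font (assumed structure, for illustration only). */
    static final class Candidate {
        final String name;
        final int styleScore;        // assumed score from other criteria (weight, serif, ...)
        final Set<String> codePages; // code pages the font declares support for

        Candidate(String name, int styleScore, String... codePages) {
            this.name = name;
            this.styleScore = styleScore;
            this.codePages = new HashSet<>(Arrays.asList(codePages));
        }
    }

    /**
     * Returns the highest-scoring candidate that covers every required code page,
     * or an empty Optional if no candidate covers them all.
     */
    static Optional<Candidate> pickFallback(List<Candidate> candidates, Set<String> required) {
        Candidate best = null;
        for (Candidate c : candidates) {
            if (!c.codePages.containsAll(required)) {
                continue; // missing a required code page: never eligible, whatever its score
            }
            if (best == null || c.styleScore > best.styleScore) {
                best = c;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Illustrative data: the scores and code page labels are made up.
        List<Candidate> installed = Arrays.asList(
                new Candidate("MalgunGothic-Semilight", 10, "KOREAN"),
                new Candidate("AdobeSongStd-Light", 5, "CHINESE_SIMPLIFIED"));
        Set<String> required = new HashSet<>(Arrays.asList("CHINESE_SIMPLIFIED"));

        String chosen = pickFallback(installed, required).map(c -> c.name).orElse("none");
        System.out.println(chosen);
    }
}
```

With such a filter the Malgun font could not win for a simplified Chinese 
font, however high it scores otherwise. If no installed font passes the 
filter, the caller would still need some last-resort behaviour.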

*My opinion:*
This is troubling, as better-fitting fonts exist and could have been selected 
(e.g. Adobe Song Std). They are indeed included in the CIDFontMapping, but 
seemingly score lower for some reason.

*Further information:*
I cannot disclose the document in question. However, I found a document 
(pdf_font-zhcn.pdf) in another issue (PDFBOX-3132) that can be used to 
reproduce the issue (e.g. by dropping it into the PDF Debugger).
 !image-2020-03-04-10-31-03-065.png! 




