All,

I'm trying to convert a PDF to an image, and I'm running into font rendering problems on some Linux systems. If anyone could provide any ideas on how to fix this, I'd appreciate it.

The PDF is too large to attach, so it's available at this link:
https://drive.google.com/file/d/1dNXgHsfn0cy2Gx9HxhSTQdeWAAjaDplk/view?usp=sharing

As far as I can tell, the linked file comes from some sort of mail-merge-style application that injects text into a template. The injected text uses a different font than the rest of the document. On Windows systems this works fine, but on Linux systems PDFBox renders the injected text as gibberish glyphs in a way that I've never seen before.

When I reproduce the issue with logging increased to trace, I get the following line in the log:

15:55:15.622 [main] WARN org.apache.pdfbox.pdmodel.font.PDCIDFontType2 - Using non-embedded GIDs in font Calibri

When I list the fonts in the PDF, Calibri is listed as both an embedded *and* an Identity-H font. Given that we have to substitute Carlito for Calibri, this may be relevant.

In the source code
<https://github.com/apache/pdfbox/blob/d6ebddf07f99bcc04f5b106c84623048b697bee7/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/font/PDCIDFontType2.java#L241>,
a comment line suggests there's a mismatch that involves GIDs, CIDs, and embedded vs. non-embedded fonts.

Has anyone here seen behavior like this before? Is this a bug? If it is a bug, what is the procedure to report it? If it's not a bug, does anyone have suggestions on what I might need to fix in my environment?

Any input would be helpful.

Thank you,
Daniel