Robert Amidon created PDFBOX-5905:
-------------------------------------

             Summary: Many ZapfDingbats symbols do not appear when page is rendered.
                 Key: PDFBOX-5905
                 URL: https://issues.apache.org/jira/browse/PDFBOX-5905
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 3.0.3 PDFBox, 2.0.32
         Environment: Ubuntu 22.04
            Reporter: Robert Amidon
         Attachments: missing_symbols.png

We've encountered some PDFs that contain characters set in the ZapfDingbats 
symbol font. On Windows, PDFBox renders the symbols correctly, but when 
testing on Ubuntu 22.04 we noticed that many symbols, such as the heavy 
check mark (U+2714), are missing. An empty character is produced instead.

Debugging suggests that PDFBox substitutes another font for ZapfDingbats. 
In our observation, the code attempts to fall back to "Helvetica", but 
because this font is not present, the last-resort "LiberationSans" is 
selected instead. This font cannot display the heavy check mark symbol 
(a20/U+2714), so the symbol is missing when the PDF page is rendered to an 
image. The following appears in the debug console:

{quote}
WARNING: Using fallback font LiberationSans for ZapfDingbats
{quote}

While debugging, if we inject the font name "DejaVuSans" into the 
"fallbackName" variable in the 
{{FontMapperImpl::getFontBoxFont(String, PDFontDescriptor)}} method, 
DejaVuSans is resolved as the base font instead, and the heavy check mark 
symbol (and others) is drawn correctly.
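
The lookup order we observed can be sketched as the toy model below. It is only an illustration of the behaviour described in this report, not PDFBox's actual implementation: the class {{FallbackChainSketch}}, the method {{resolve}} and the installed-font set are hypothetical names and assumptions introduced for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the lookup order described in this report. This is NOT
// PDFBox's implementation; the class and method names are hypothetical.
final class FallbackChainSketch {

    /**
     * Returns the requested font if installed, otherwise the first installed
     * preferred fallback, otherwise the last-resort font unconditionally.
     */
    public static String resolve(String requested, List<String> fallbacks,
                                 String lastResort, Set<String> installed) {
        if (installed.contains(requested)) {
            return requested;
        }
        for (String candidate : fallbacks) {
            if (installed.contains(candidate)) {
                return candidate;
            }
        }
        // Nothing better is installed, so the last-resort font is used even
        // if it lacks the glyphs the document needs.
        return lastResort;
    }

    public static void main(String[] args) {
        // Assumed font set for the Ubuntu machine; adjust to your system.
        Set<String> installed =
                new HashSet<>(Arrays.asList("DejaVuSans", "LiberationSans"));

        // Observed behaviour: Helvetica is absent, so LiberationSans is used.
        System.out.println(resolve("ZapfDingbats",
                Collections.singletonList("Helvetica"),
                "LiberationSans", installed));

        // With DejaVuSans injected as the fallback name, it is chosen instead.
        System.out.println(resolve("ZapfDingbats",
                Collections.singletonList("DejaVuSans"),
                "LiberationSans", installed));
    }
}
```

In this model the outcome depends only on which names are tried before the last resort, which matches what we saw when changing "fallbackName" in the debugger.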

It's not clear to us why a more appropriate font is not chosen in this 
instance. As a result, many ZapfDingbats symbols are missing from the 
rendered page.

[This PDF file|https://www.w3.org/Style/XSL/TestSuite/contrib/XEP/Tests/zapf-dingbats.pdf] demonstrates the problem.

h2. Steps to Reproduce:
# Create a Java project and import PDFBox v2.0.32
# On Ubuntu 22.04, execute the code snippet below, substituting real file paths on your system for the placeholders
# Observe that the resulting image file is missing many symbols.

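{code}
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.tools.imageio.ImageIOUtil; // from the pdfbox-tools module

String inputPath = "zapf-dingbats.pdf";
File pdfFile = new File(inputPath);
// PDDocument.load is the 2.x API; on 3.x use Loader.loadPDF(pdfFile) instead.
try (PDDocument pdfDoc = PDDocument.load(pdfFile)) {
    PDFRenderer renderer = new PDFRenderer(pdfDoc);
    // Render the first page (index 0) at the default resolution.
    BufferedImage image = renderer.renderImage(0);

    String outputPath = "zapf-dingbats.ubuntu.png";
    try (OutputStream outFile = new FileOutputStream(outputPath)) {
        ImageIOUtil.writeImage(image, "PNG", outFile);
    }
}
catch (IOException e) {
    e.printStackTrace();
}
{code}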
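
To check the result without inspecting the PNG by eye, a small helper along the following lines could count the non-white pixels in a region of the rendered image. This is only a sketch: {{BlankRegionCheck}} and {{countNonWhitePixels}} are hypothetical names that are not part of PDFBox, and the region where a given symbol should appear depends on the page and is not specified in this report.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox: counts pixels in a rectangle
// that are not pure white.
final class BlankRegionCheck {

    public static long countNonWhitePixels(BufferedImage image,
                                           int x, int y, int width, int height) {
        long count = 0;
        for (int row = y; row < y + height; row++) {
            for (int col = x; col < x + width; col++) {
                // Mask off alpha and compare the RGB channels with white.
                if ((image.getRGB(col, row) & 0xFFFFFF) != 0xFFFFFF) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in for a rendered page: a white 4x4 image with one dark pixel.
        BufferedImage image = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        for (int row = 0; row < 4; row++) {
            for (int col = 0; col < 4; col++) {
                image.setRGB(col, row, 0xFFFFFF);
            }
        }
        image.setRGB(1, 2, 0x000000);

        // Whole image, then a region that does not contain the dark pixel.
        System.out.println(countNonWhitePixels(image, 0, 0, 4, 4));
        System.out.println(countNonWhitePixels(image, 2, 0, 2, 2));
    }
}
```

In the reproduction above, the {{image}} returned by {{renderer.renderImage(0)}} could be passed in together with the rectangle where a symbol is expected; a count of zero for that rectangle would suggest the glyph was not drawn.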



