[ https://issues.apache.org/jira/browse/PDFBOX-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17926338#comment-17926338 ]
Tilman Hausherr commented on PDFBOX-5953:
-----------------------------------------

Here's what I wrote in the mailing list (not a solution, sadly)

===

I tried displaying your PDF file in PDFDebugger... page 1 is ok, but page 2 has empty fields. This is because these fields have no appearance stream and NeedAppearances is set, so the viewer has to generate the appearances itself. I switched on "repair acroform" in PDFDebugger and things got terrible: now page 1 is wrong too.

!screenshot-1.png!

If I use the java calls like you did, all pages are wrong, including the first one. The SimSun font isn't embedded but is on my computer (as a .ttc file).

Several other viewers display the file properly. Adobe asks for a filename when closing, which indicates it had to repair something.

Here's some code that fixes the rendering of this file:
{code:java}
try (PDDocument doc = Loader.loadPDF(new File("20251103 mail test2.pdf")))
{
    PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm(null); // avoids any fixup
    PDResources dr = acroForm.getDefaultResources();
    PDFont font = PDType0Font.load(doc, new FileInputStream("SimSun-UNSECURE.ttf"), false); // source https://fontzone.net/font-details/simsun
    dr.put(COSName.getPDFName("SimSun"), font);
    acroForm.refreshAppearances();
    PDFRenderer r = new PDFRenderer(doc);
    ImageIO.write(r.renderImageWithDPI(0, 300), "png", new File("page1.png"));
    ImageIO.write(r.renderImageWithDPI(6, 300), "png", new File("page7.png"));
}
{code}
The font I have on windows is a ttc file so I can't embed it. I downloaded a simsun.ttf file from a dubious source and used that to replace the font in the default resources. However, saving the document produces a file that is now displayed incorrectly in Adobe Reader.
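Since the workaround depends on having a single-font .ttf rather than a .ttc collection, it can help to check which container a font file actually is before trying to load it. The following is only a sketch using the JDK, not part of PDFBox: the class and method names ({{FontContainerCheck}}, {{classifyHeader}}, {{classifyFile}}) are made up for illustration. It relies on the first four bytes of the file: {{ttcf}} for a TrueType collection, {{OTTO}} for OpenType with CFF outlines, and {{0x00010000}} or {{true}} for a plain TrueType font.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not a PDFBox class): classifies a font file by its 4-byte header.
final class FontContainerCheck {

    enum Kind { TRUETYPE, OPENTYPE_CFF, COLLECTION, UNKNOWN }

    // Classify from the first four bytes of a font file.
    static Kind classifyHeader(byte[] header) {
        if (header == null || header.length < 4) {
            return Kind.UNKNOWN;
        }
        String tag = new String(header, 0, 4, StandardCharsets.ISO_8859_1);
        if (tag.equals("ttcf")) {
            return Kind.COLLECTION;      // .ttc: several fonts in one file
        }
        if (tag.equals("OTTO")) {
            return Kind.OPENTYPE_CFF;    // OpenType with CFF outlines
        }
        if (tag.equals("true")) {
            return Kind.TRUETYPE;        // Apple-style TrueType tag
        }
        if (header[0] == 0 && header[1] == 1 && header[2] == 0 && header[3] == 0) {
            return Kind.TRUETYPE;        // sfnt version 1.0
        }
        return Kind.UNKNOWN;
    }

    // Read the header from disk and classify it (readNBytes needs Java 9 or later).
    static Kind classifyFile(Path fontFile) throws IOException {
        try (InputStream in = Files.newInputStream(fontFile)) {
            byte[] header = new byte[4];
            int read = in.readNBytes(header, 0, 4);
            return read == 4 ? classifyHeader(header) : Kind.UNKNOWN;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FontContainerCheck <font file>");
            return;
        }
        System.out.println(args[0] + ": " + classifyFile(Paths.get(args[0])));
    }
}
```

A result of {{COLLECTION}} would mean the file is a .ttc like the Windows SimSun mentioned above, so a separate .ttf is needed for the snippet that replaces the font in the default resources.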

> Missing Fields in Table During PDF to Image Conversion
> ------------------------------------------------------
>
>                 Key: PDFBOX-5953
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5953
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm, Rendering
>    Affects Versions: 2.0.33
>         Environment: Windows10, JDK17
>            Reporter: Vincent Lee
>            Priority: Blocker
>         Attachments: PdfFileTest.java, debugInfo.txt,
> image-2025-02-12-18-12-15-149.png, image-2025-02-12-18-12-24-291.png,
> screenshot-1.png, test.pdf, test_page1.png, test_page10.png, test_page2.png,
> test_page3.png, test_page4.png, test_page5.png, test_page6.png,
> test_page7.png, test_page8.png, test_page9.png
>
>
> The PDF displays correctly in the MS Edge browser. However, after converting
> it to an image using PDFBox, some fields begin to appear blank starting from
> the seventh image.
> Interestingly, the first six images are generated correctly. After comparing,
> I noticed that some fonts starting from the seventh page of the PDF differ
> from the ones used earlier.
> I suspect that missing fonts may be the cause of the issue, but since there
> are no errors or warnings in the debug information, I’m unsure which fonts
> are missing.
>
> !image-2025-02-12-18-12-15-149.png!
>
> !image-2025-02-12-18-12-24-291.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org