Hi,
I tried displaying your PDF file in PDFDebugger. Page 1 is ok, but
page 2 has empty fields. These fields have no appearance stream and
NeedAppearances is set, so the viewer has to generate the appearances
itself. I switched on "repair acroform" in PDFDebugger and things got
much worse: now page 1 is wrong too.
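To check on your side which fields are affected, a rough sketch like the
following might help. It assumes PDFBox 2.0.x (2.0.22 or later, where
getAcroForm accepts a fixup argument); the class name AppearanceCheck and
the method name fieldsWithoutAppearance are made up for illustration. It
prints the NeedAppearances flag and lists fields that have a widget
without a normal appearance stream.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationWidget;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;

// Hypothetical helper, not part of PDFBox.
public class AppearanceCheck {

    // Returns the fully qualified names of all fields that have at least
    // one widget for which no normal appearance stream can be resolved.
    static List<String> fieldsWithoutAppearance(PDDocument doc) {
        List<String> names = new ArrayList<>();
        // Passing null asks for the AcroForm without a fixup. Assumption:
        // the default fixup may rebuild appearances, which would hide the
        // state we want to inspect.
        PDAcroForm form = doc.getDocumentCatalog().getAcroForm(null);
        if (form == null) {
            return names;
        }
        for (PDField field : form.getFieldTree()) {
            // Non-terminal fields have no widgets, so they are skipped.
            for (PDAnnotationWidget widget : field.getWidgets()) {
                if (widget.getNormalAppearanceStream() == null) {
                    names.add(field.getFullyQualifiedName());
                    break;
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // PDDocument.load is the 2.0.x way of opening a file.
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            PDAcroForm form = doc.getDocumentCatalog().getAcroForm(null);
            boolean needAppearances = form != null && form.getNeedAppearances();
            System.out.println("NeedAppearances: " + needAppearances);
            for (String name : fieldsWithoutAppearance(doc)) {
                System.out.println("no appearance stream: " + name);
            }
        }
    }
}
```

Run it with the PDF path as the only argument. Button fields such as
checkboxes may also be listed if their current appearance state has no
matching entry, so the output is a hint and not a verdict.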
If I use the same Java calls as you did, all pages are wrong, including
the first one. The SimSun font isn't embedded, but it is installed on my
computer.
Several other viewers display the file properly. Adobe asks for a
filename when closing, which indicates that it had to repair something.
Has your file been created with PDFBox merge? Because field names like
"dummyFieldName1" sound like us 😂
Did you make a JIRA registration request that was refused? If you did,
please make it again and I'll approve it. Just mention something so I
know it's you.
I don't know yet whether this is a bug. I think it's likely, but sadly I
can't say what it is at this time. It may or may not be related to the
other mail.
Tilman
On 11.02.2025 11:55, 李一凡 wrote:
Hello,
I am currently using JDK17 and pdfbox-2.0.33.jar to convert PDFs into
images on a Windows 10 OS.
The PDF displays correctly in the MS Edge browser. However, after
converting it to an image using PDFBox, some fields begin to appear blank
starting from the seventh image.
Interestingly, the first six images are generated correctly. After
comparing, I noticed that some fonts starting from the seventh page of the
PDF differ from the ones used earlier.
I suspect that missing fonts may be the cause of the issue, but since there
are no errors or warnings in the debug information, I’m unsure which fonts
are missing.
I have uploaded attachments to Google Drive
<https://drive.google.com/drive/folders/1cxpVIJphGwQEQaqwtUlaVllXMSOL5yT0?usp=sharing>.
The folder contains the original PDF, a screenshot of the seventh page
opened in MS Edge, the converted images, the source code, and the
"debugInfo.txt" file.
I have removed some redundant logs and only included what I believe to be
important in this email. The full DEBUG information is included in the
attached "debugInfo.txt."
Here are some key DEBUG log entries:
18:13:33.188 [main] DEBUG org.example.PdfFileTest - Page 5 rendered
18:13:33.234 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.272 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.274 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.276 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.277 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.277 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.278 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.278 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
18:13:33.280 [main] DEBUG org.example.PdfFileTest - Page 6 rendered
18:13:33.310 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.358 [main] DEBUG org.example.PdfFileTest - Page 7 rendered
18:13:33.392 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.427 [main] DEBUG org.example.PdfFileTest - Page 8 rendered
18:13:33.461 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.494 [main] DEBUG org.example.PdfFileTest - Page 9 rendered
18:13:33.526 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
PostScript name information is provided for the font SimSun
18:13:33.541 [main] DEBUG org.example.PdfFileTest - Page 10 rendered
It is worth noting that the message "No PostScript name information is
provided for the font SimSun" appears on every page, but on the first five
pages, it is immediately followed by
"org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont(xxxx)". Starting
from the sixth page, only the "No PostScript name" message is output, but
the conversion for the sixth page still works fine. The issue only appears
starting from the seventh page.
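The per-page pattern described above can be checked against the full
"debugInfo.txt" with a small stdlib-only Java sketch like the one below.
It assumes that each "Page N rendered" line is logged after that page has
been rendered, so the FontMapperImpl lines before it are attributed to
page N. The class name FontLookupsPerPage and the method name countLookups
are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for summarizing the DEBUG log, not part of PDFBox.
public class FontLookupsPerPage {

    private static final Pattern PAGE_RENDERED = Pattern.compile("Page (\\d+) rendered");

    // Maps each page number to the number of FontMapperImpl log lines seen
    // since the previous "Page N rendered" marker (or since the log start).
    static Map<Integer, Integer> countLookups(List<String> lines) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        int pending = 0;
        for (String line : lines) {
            if (line.contains("FontMapperImpl")) {
                pending++;
                continue;
            }
            Matcher m = PAGE_RENDERED.matcher(line);
            if (m.find()) {
                counts.put(Integer.parseInt(m.group(1)), pending);
                pending = 0;
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the log file, e.g. debugInfo.txt
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        countLookups(lines).forEach((page, count) ->
                System.out.println("Page " + page + ": " + count + " font lookups"));
    }
}
```

Pages with zero lookups are not necessarily broken, since a font that was
already resolved may simply be reused; the summary only makes it easier to
see where the pattern changes.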
Best regards