Hi Tilman,

I’m not sure about the origin of my PDF, as it was uploaded by our
client, so I can’t confirm whether it was generated by PDFBox.

I’m not sure which font is causing the garbled text. I suspect it is
not SimSun but some other font, and that this font is missing on your
computer, which would explain the issue.
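
To narrow down which fonts were actually resolved, the FontMapperImpl
entries in the DEBUG log could be tallied. This is only a minimal sketch: it
assumes each log entry sits on a single line in the log file (the wrapping in
the email is from the mail client), and the class and method names are my own,
not part of PDFBox.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a PDFBox class.
final class FontLookupLogScan {
    // Matches DEBUG entries of the form seen in the log excerpt:
    //   getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, ...
    // group 2 = requested font name, group 3 = resolved font name.
    private static final Pattern LOOKUP =
            Pattern.compile("getFont\\('([^']*)','([^']*)'\\) returns (.*?) \\(");

    /** Counts how often each "requested -> resolved" pair appears. */
    static Map<String, Integer> countLookups(List<String> logLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : logLines) {
            Matcher m = LOOKUP.matcher(line);
            if (m.find()) {
                counts.merge(m.group(2) + " -> " + m.group(3), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: FontLookupLogScan <log file>");
            return;
        }
        // Assumes the log file is UTF-8 text with one entry per line.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        countLookups(lines).forEach((pair, n) -> System.out.println(n + "x " + pair));
    }
}
```

Any pair where the requested and resolved names differ, or a font that never
shows up at all, would be worth a closer look.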

To help, I’ve copied all my fonts from C:\Windows\Fonts (370 in total)
and uploaded them to Google Drive in a folder named "font". Feel free
to download them and give it a try if needed.
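
In case it is useful for checking which of these fonts are absent on another
machine, here is a rough sketch that compares two font folders by file name.
It is an illustration only: the class and method names are made up, it looks
only at .ttf, .ttc and .otf files, and matching by file name is an assumption
(the same font can be installed under a different file name).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

// Hypothetical helper for comparing two font folders by file name.
final class FontDirDiff {
    /** Returns the lower-cased names of .ttf/.ttc/.otf files directly inside dir. */
    static Set<String> fontFileNames(Path dir) throws IOException {
        Set<String> names = new TreeSet<>();
        try (Stream<Path> files = Files.list(dir)) {
            files.filter(Files::isRegularFile)
                 .map(p -> p.getFileName().toString().toLowerCase(Locale.ROOT))
                 .filter(n -> n.endsWith(".ttf") || n.endsWith(".ttc") || n.endsWith(".otf"))
                 .forEach(names::add);
        }
        return names;
    }

    /** Returns the names present in reference but absent from local. */
    static Set<String> missingFrom(Set<String> reference, Set<String> local) {
        Set<String> missing = new TreeSet<>(reference);
        missing.removeAll(local);
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: FontDirDiff <downloaded font folder> <local font folder>");
            return;
        }
        Set<String> reference = fontFileNames(Paths.get(args[0]));
        Set<String> local = fontFileNames(Paths.get(args[1]));
        missingFrom(reference, local).forEach(System.out::println);
    }
}
```

The first argument would be the downloaded "font" folder and the second the
local font directory; the output is the list of font files found only in the
first.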

I’ve just created a JIRA account using the email x279968...@gmail.com,
and my username is vincent lee, which is a temporary English name I
gave myself. My previous emails showed my Chinese name. Once you
approve my request, I will follow your instructions to submit a bug
report.


On Tue, Feb 11, 2025 at 8:16 PM Tilman Hausherr <thaush...@t-online.de> wrote:
>
> Hi,
>
> I tried displaying your PDF file in PDFDebugger... page 1 is ok, but page 2
> has empty fields. These fields have no appearance stream and
> NeedAppearances is set, so the viewer has to generate the appearances
> itself. I switched on "repair acroform" in PDFDebugger and things got
> terrible: now page 1 is wrong too.
>
> If I use the Java calls like you did, all pages are wrong, including the
> first one. The SimSun font isn't embedded but is installed on my computer.
>
> Several other viewers display the file properly. Adobe asks for a filename 
> when closing which indicates it had to repair something.
>
> Has your file been created with PDFBox merge? Because field names like 
> "dummyFieldName1" sound like us 😂
>
> Did you make a JIRA registration request that was refused? If you did, please 
> make it again and I'll approve it. Just mention something so I know it's you.
>
> I think it is likely a bug, but sadly I don't know what the cause is at
> this time. It may or may not be related to the other mail.
>
> Tilman
>
> On 11.02.2025 11:55, 李一凡 wrote:
>
> Hello,
>
> I am currently using JDK17 and pdfbox-2.0.33.jar to convert PDFs into
> images on a Windows 10 OS.
>
> The PDF displays correctly in the MS Edge browser. However, after
> converting it to an image using PDFBox, some fields begin to appear blank
> starting from the seventh image.
>
> Interestingly, the first six images are generated correctly. After
> comparing, I noticed that some fonts starting from the seventh page of the
> PDF differ from the ones used earlier.
>
> I suspect that missing fonts may be the cause of the issue, but since there
> are no errors or warnings in the debug information, I’m unsure which fonts
> are missing.
>
> I have uploaded attachments to Google Drive
> <https://drive.google.com/drive/folders/1cxpVIJphGwQEQaqwtUlaVllXMSOL5yT0?usp=sharing>.
> The folder contains the original PDF, a screenshot of the seventh page
> opened in MS Edge, the converted images, the source code, and the
> "debugInfo.txt" file.
> I have removed some redundant logs and only included what I believe to be
> important in this email. The full DEBUG information is included in the
> attached "debugInfo.txt."
>
> Here are some key DEBUG log entries:
>
> 18:13:33.188 [main] DEBUG org.example.PdfFileTest - Page 5 rendered
> 18:13:33.234 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
> PostScript name information is provided for the font SimSun
> 18:13:33.272 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
> getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
> null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
> 18:13:33.274 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
> getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
> null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
> 18:13:33.276 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
> getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
> null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
> 18:13:33.277 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
> getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
> null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
> 18:13:33.277 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
> getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
> null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
> 18:13:33.278 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
> getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
> null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
> 18:13:33.278 [main] DEBUG org.apache.pdfbox.pdmodel.font.FontMapperImpl -
> getFont('TTF','SimSun') returns SimSun (TTF, mac: 0x0, os/2: 0x0, cid:
> null) C:\Windows\FONTS\simsun.ttc 77be8045 1690865065721
> 18:13:33.280 [main] DEBUG org.example.PdfFileTest - Page 6 rendered
> 18:13:33.310 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
> PostScript name information is provided for the font SimSun
> 18:13:33.358 [main] DEBUG org.example.PdfFileTest - Page 7 rendered
> 18:13:33.392 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
> PostScript name information is provided for the font SimSun
> 18:13:33.427 [main] DEBUG org.example.PdfFileTest - Page 8 rendered
> 18:13:33.461 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
> PostScript name information is provided for the font SimSun
> 18:13:33.494 [main] DEBUG org.example.PdfFileTest - Page 9 rendered
> 18:13:33.526 [main] DEBUG org.apache.fontbox.ttf.PostScriptTable - No
> PostScript name information is provided for the font SimSun
> 18:13:33.541 [main] DEBUG org.example.PdfFileTest - Page 10 rendered
>
>
> It is worth noting that the message "No PostScript name information is
> provided for the font SimSun" appears on every page, but on the first five
> pages, it is immediately followed by
> "org.apache.pdfbox.pdmodel.font.FontMapperImpl - getFont(xxxx)". Starting
> from the sixth page, only the "No PostScript name" message is output, but
> the conversion for the sixth page still works fine. The issue only appears
> starting from the seventh page.
>
> Best regards
>
>
