Hi,
After upgrading from PDFBox 3.0.4 to 3.0.7, my PDFBox output seems
corrupt.
I tried to simplify the issue as much as possible; here is what I have:
I use TextEdit + macOS Print to PDF to create a minimal PDF (that only
contains the word « minimal »).
I run the following code on PDFBox 3.0.7, JDK 21 in case that matters.
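import java.nio.file.Paths;
import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdfwriter.compress.CompressParameters;
import org.apache.pdfbox.pdmodel.PDDocument;

public class Repro {
    public static void main(String[] args) throws Exception {
        try (PDDocument target = new PDDocument();
             PDDocument embeded = Loader.loadPDF(Paths.get("minimal.pdf").toFile())) {
            // Add the same source page twice to the new document.
            target.getPages().add(embeded.getPage(0));
            target.getPages().add(embeded.getPage(0));
            // Default save (compression enabled).
            target.save(Paths.get("out.pdf").toFile());
            // Same document, saved with compression turned off.
            target.save(Paths.get("out_nocomp.pdf").toFile(),
                    CompressParameters.NO_COMPRESSION);
        }
    }
}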
For both out.pdf and out_nocomp.pdf I get a two-page output document.
Both display fine in the Chrome and Firefox PDF viewers, but out.pdf does
not in macOS Preview: it looks empty, with no visible text.
Opening out.pdf with the PDFBox debugger jar prints warnings such as:
Warning [SetFontAndSize] font 'TT1' not found in resources
Warning [PDFStreamEngine] No current font, will use default
Warning [SetFontAndSize] font 'TT1' not found in resources
Warning [PDFStreamEngine] No current font, will use default
In this minimal case, it does not look that bad (although it's always fishy
when things are not rendering the same).
But in the real world, where my output is complex, with pages copied from
one document to the next using LayoutUtil, I get:
- invalid fonts on some pages (e.g. font F1, which is supposed to be
Helvetica, comes out empty-ish)
- if I copy pages from two different documents, it sometimes happens that
the output contains one document twice and not the other
- I have trouble reproducing in my IDE, but no trouble reproducing in
mvn tests
My LLM was the one to suggest turning compression off, pointing to
commit r1930285 / PDFBOX-5169 / PDFBOX-6142 as likely points of
difference between 3.0.4 and 3.0.7, and it does seem related: in my
minimal case as well as in my full-fledged app and unit tests,
compression off = it works, default output = strange stuff.
Happy to provide files (not sure they will get through the mailing list).
G