Hello,

First of all, thank you for PDFBox, this wonderful Swiss army knife for PDFs.

Java version: 21
PDFBox version: 3.0.4

Description:
When a PDF is saved with the option CompressParameters.NO_COMPRESSION, useless
free entries such as

nnnnnnnnnn 65535 f

are added to the xref section.

When splitting a PDF, this side effect seems to be cumulative across the saved
parts. It hardly matters when saving a single PDF, but when splitting a PDF
into 5000 parts, the overhead becomes huge.
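
To put a number on the effect, the free entries in each saved file can be
counted. The following is only a minimal sketch, not part of PDFBox: the class
name FreeXrefEntryCounter and its methods are hypothetical, and it assumes the
saved file contains a classic plain-text xref table (not an xref stream). It
simply scans the file for lines of the form "nnnnnnnnnn 65535 f".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a PDFBox class): counts xref entries that are
// marked free ("f") with generation number 65535.
public class FreeXrefEntryCounter {

    // One classic xref entry: 10 digits, space, "65535", space, "f",
    // optionally followed by blanks before the end of the line.
    private static final Pattern FREE_ENTRY =
            Pattern.compile("(?m)^\\d{10} 65535 f[ \\t]*$");

    // Counts matching lines in already decoded PDF content.
    public static long countFreeEntries(String pdfText) {
        Matcher matcher = FREE_ENTRY.matcher(pdfText);
        long count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Reads the file as ISO-8859-1 so that every byte maps to exactly one
    // char and binary stream data cannot cause decoding errors.
    public static long countFreeEntries(Path pdf) throws IOException {
        byte[] bytes = Files.readAllBytes(pdf);
        return countFreeEntries(new String(bytes, StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + " : " + countFreeEntries(Path.of(arg))
                    + " free entries with generation 65535");
        }
    }
}
```

Since object 0 is always the head of the free list, a count of 1 per xref table
is expected; anything above that corresponds to the extra lines described here.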

You can reproduce the issue with any PDF.

Current workaround: open and save the produced PDF(s) with itextpdf 5.5.13.4,
which removes the useless lines like nnnnnnnnnn 65535 f.

Regards,
Yannick

Test class :

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdfwriter.compress.CompressParameters;
import org.apache.pdfbox.pdmodel.PDDocument;

import java.io.File;
import java.util.List;

public class Test_NO_COMPRESSION {


    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("Test_NO_COMPRESSION usage : [absolute path of the pdf to save and split]");
            System.exit(-1);
        }

        try {
            File pdf = new File(args[0]);
            String targetPath = pdf.getAbsolutePath() + "." + System.currentTimeMillis() + ".pdf";
            try (PDDocument doc = Loader.loadPDF(pdf)) {
                System.out.println("Saving NO_COMPRESSION to file " + targetPath);
                doc.save(targetPath, CompressParameters.NO_COMPRESSION);
            }

            try (PDDocument doc = Loader.loadPDF(pdf)) {
                for (int i = 1; i < doc.getNumberOfPages(); i++) {
                    Splitter splitter = new Splitter();
                    splitter.setStartPage(i);
                    splitter.setEndPage(i + 1);
                    splitter.setSplitAtPage(i + 1);

                    List<PDDocument> documents = splitter.split(doc);
                    PDDocument tempDoc = documents.getFirst();
                    String splitFilePath = targetPath + ".part." + i + ".pdf";

                    System.out.println("Saving page #" + i + " NO_COMPRESSION to file " + splitFilePath);
                    tempDoc.save(splitFilePath, CompressParameters.NO_COMPRESSION);
                    tempDoc.close();

                }
            }


            System.out.println("Done");

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}


