[ 
https://issues.apache.org/jira/browse/PDFBOX-5978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938511#comment-17938511
 ] 

Michael Klink commented on PDFBOX-5978:
---------------------------------------

Indeed, leaving a gap in the cross reference table would be incorrect. Doing so 
would cause trouble later, in particular in signing use cases.

But there is no need to keep object numbers in split PDF files.

And it would be nice if _decode_ kept object numbers, but it doesn't.

So I would vote for re-assigning object numbers, but only by explicit request, 
e.g. using some {{write}} parameter or even some special {{write}} overload. 
When only slightly manipulating a PDF, I as a developer would like to easily 
recognize unchanged objects by their object number.
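
To illustrate the opt-in idea, here is a minimal sketch in plain Java. It is not PDFBox code: the class {{ObjectRenumbering}}, the method {{plan}} and the {{renumber}} flag are hypothetical names chosen for this example. By default the old object numbers are kept; only on explicit request are the used objects packed into consecutive numbers starting at 1 (object 0 stays reserved for the head of the free list).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of PDFBox.
final class ObjectRenumbering {

    /**
     * Builds a mapping from old object numbers to the numbers to write.
     * With renumber == false every object keeps its number, so unchanged
     * objects stay recognizable. With renumber == true the used numbers
     * are packed into 1, 2, 3, ... in ascending order of the old numbers.
     */
    static Map<Integer, Integer> plan(SortedSet<Integer> usedNumbers, boolean renumber) {
        Map<Integer, Integer> mapping = new LinkedHashMap<>();
        int next = 1; // object number 0 is reserved for the free list head
        for (int old : usedNumbers) {
            mapping.put(old, renumber ? next++ : old);
        }
        return mapping;
    }

    public static void main(String[] args) {
        SortedSet<Integer> used = new TreeSet<>();
        used.add(3);
        used.add(7);
        used.add(120);
        System.out.println(plan(used, false)); // {3=3, 7=7, 120=120}
        System.out.println(plan(used, true));  // {3=1, 7=2, 120=3}
    }
}
```

A real writer would additionally have to rewrite every indirect reference according to this mapping, which is one more reason to make renumbering an explicit choice.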

By the way, the example document [^PDFJS-17784.pdf] has a broken cross 
reference structure: its cross reference information has massive gaps (object 
numbers with no entry at all, neither free nor in use), which IMO is not 
allowed. Admittedly, the specification states this explicitly only in the 
section on cross reference _tables_, but the requirement is not table specific 
by nature and should therefore apply to cross reference _streams_, too.
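
What I mean by a gap can be sketched as follows. This is plain Java for illustration only and does not use any PDFBox API; the class {{XrefGapCheck}} and the method {{findGaps}} are made-up names, and the input is assumed to be the set of object numbers that have some entry (free or in use) together with the trailer {{Size}} value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of PDFBox.
final class XrefGapCheck {

    /**
     * Returns all object numbers in [0, size) that have no cross reference
     * entry at all. 'entries' must contain every object number that has an
     * entry, whether free or in use; 'size' is the trailer Size value.
     */
    static List<Integer> findGaps(Set<Integer> entries, int size) {
        List<Integer> gaps = new ArrayList<>();
        for (int number = 0; number < size; number++) {
            if (!entries.contains(number)) {
                gaps.add(number);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        Set<Integer> entries = new TreeSet<>(List.of(0, 1, 2, 5, 6));
        System.out.println(findGaps(entries, 8)); // [3, 4, 7]
    }
}
```

In a well-formed file this list would be empty: each number below {{Size}} has either an in-use entry or a free entry.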

> Issue when saving pdf with NO_COMPRESSION
> -----------------------------------------
>
>                 Key: PDFBOX-5978
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5978
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Writing
>    Affects Versions: 3.0.4 PDFBox
>            Reporter: Yannick Hanus
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>         Attachments: NO_COMPRESSION.65535f.png, PDFJS-17784.pdf, 
> PDFJS-17784_unc.pdf, Test_NO_COMPRESSION.java, output.part.1.pdf
>
>
> Java version : 21
> Pdfbox version : 3.0.4
>  
> When a pdf is saved with option CompressParameters.NO_COMPRESSION, useless 
> lines like
> _nnnnnnnnnn_ 65535 f
> are added to the xref section.
> When splitting a pdf, this side effect seems cumulative when saving each part.
> Not really relevant when saving only one pdf but when splitting a pdf to 5000 
> parts, it becomes huge.
> You can reproduce the issue with any pdf
> Current workaround for this issue: open and save the produced pdf(s) with 
> itextpdf 5.5.13.4 to remove the useless lines like _nnnnnnnnnn_ 65535 f: 
>  
> {code:java}
>         try (InputStream is = new FileInputStream(tempFile)) {
>             PdfReader pdfReader = new PdfReader(is);
>             PdfStamper pdfStamper = new PdfStamper(pdfReader, new FileOutputStream(targetSplitFile));
>             pdfStamper.close();
>             pdfReader.close();
>         } catch (Exception e) {
>             throw new RuntimeException("Unable to save with itext " + targetSplitFile, e);
>         }
> {code}
> You can use the attached class to reproduce the issue. 
> Just pass the absolute path to a pdf as argument to the class.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

