[ 
https://issues.apache.org/jira/browse/PDFBOX-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18067257#comment-18067257
 ] 

Daniel Persson commented on PDFBOX-6176:
----------------------------------------

Hi Team.


I've now attached a writer.patch file that solved the issue for me. The customer 
keeps asking us about this "issue". Their daily production, which uses 
"EidosMedia Methode 7.0 output driver 6.0.1.0, OM 1.6.22", creates these 
problems, so we needed to find a solution.

To be clear, this solution was created solely by Claude Code. I created a 
repository, gave it the requirements, told it that the problem exists only in 
3.0.x and not in 2.0.x, and asked it to build both and compare the results of 
the split operation by qpdf. It worked on the code for 20 minutes and created a 
patch. The patch seems reasonable, and its explanation is:

 
{noformat}
  One file changed: COSWriter.java — 12 lines added, 5 removed.

  Pre-assign the xref stream's object key and offset before building the xref
  stream content, then include it as an entry in its own cross-reference data:

  1. ++number to pre-assign the xref stream's object number
  2. Capture the write offset (same as startxref)
  3. Add a NormalXReference entry for the xref stream itself to pdfxRefStream
  4. Set Size = number + 1 (straightforward: one greater than the highest
     object number)
  5. Build the xref stream content (now includes its own entry)
  6. Write the object using the pre-assigned key via doWriteObject(key, obj)
{noformat}
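
To make the numbering concrete, here is a toy model of the invariant that steps 1 to 4 aim at. It is not the patch and not PDFBox code: the class XrefSizeSketch, its two methods, and the offsets are hypothetical names and made-up values, and the free entry for object 0 is left out. It only shows that Size equals one plus the highest listed object number once the xref stream has an entry for itself.

```java
import java.util.TreeMap;

// Toy model of the numbering in the steps above; hypothetical, not PDFBox code.
class XrefSizeSketch {

    // Steps 1-4: pre-assign the xref stream's number, record its offset as an
    // entry in the same table, and return Size = number + 1.
    static long sizeWithSelfEntry(TreeMap<Long, Long> entries, long highestNumber, long xrefOffset) {
        long number = highestNumber;
        long xrefNumber = ++number;           // step 1: pre-assign the object number
        entries.put(xrefNumber, xrefOffset);  // steps 2-3: offset plus self entry
        return number + 1;                    // step 4: Size
    }

    // The condition named in the qpdf warning: Size is one plus the highest
    // object number that appears in the table.
    static boolean isConsistent(TreeMap<Long, Long> entries, long size) {
        return !entries.isEmpty() && size == entries.lastKey() + 1;
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> entries = new TreeMap<>();
        for (long n = 1; n <= 3; n++) {
            entries.put(n, n * 100);          // made-up byte offsets
        }
        // Size already counts the xref stream, but the table does not list it.
        System.out.println(isConsistent(entries, 5));
        // With the self entry added, Size and the highest number agree.
        long size = sizeWithSelfEntry(entries, 3, 400);
        System.out.println(size + " " + isConsistent(entries, size));
    }
}
```

With the numbers from the warning, 7410 listed objects and a Size of 7412 fail the check, while adding an entry for object 7411 makes it pass.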
 

I understand if you don't want to apply AI-generated code, and this might not 
be the right fix for the problem.

I've run all our regression tests, covering 50k pages, and I can't see any 
regressions when rendering the PDFs on iOS, Chrome, Android, Poppler, and PDFBox.

Best regards

Daniel

> reported number of objects (7412) is not one plus the highest object number 
> (7410)
> ----------------------------------------------------------------------------------
>
>                 Key: PDFBOX-6176
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-6176
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Writing
>    Affects Versions: 3.0.0 PDFBox, 3.0.7 PDFBox
>            Reporter: Daniel Persson
>            Priority: Minor
>         Attachments: 180511_A-14.pdf, test.pdf, writer.patch
>
>
> A new customer reported that they got a bunch of errors during a split 
> operation in their flow. 
> {code:java}
> $ qpdf --split-pages=1 test.pdf page-%d.pdf
> WARNING: test.pdf: reported number of objects (7412) is not one plus the 
> highest object number (7410)
> qpdf: operation succeeded with warnings; resulting file may have some 
> problems {code}
> Seems I could recreate this issue with a lot of files just by loading and 
> saving a PDF. 
> {code:java}
> public static void main(String[] args) throws Exception {
>    //PDDocument doc = PDDocument.load(new File("180511_A-14.pdf"));
>    PDDocument doc = Loader.loadPDF(new File("180511_A-14.pdf"));
>    doc.save(new File("test.pdf"));
> } {code}
>  
> I've only been able to reproduce the error with 3.0.x not with 2.0.x.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

