[ https://issues.apache.org/jira/browse/PDFBOX-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17952130#comment-17952130 ]
ASF subversion and git services commented on PDFBOX-6009:
---------------------------------------------------------

Commit 1925586 from Tilman Hausherr in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1925586 ]

PDFBOX-6009: remove structure elements without /Pg entry if there is at least one MCID

> Splitter does not include structure tree in documents past the first split
> --------------------------------------------------------------------------
>
>                 Key: PDFBOX-6009
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-6009
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 2.0.34, 3.0.5 PDFBox
>            Reporter: Tilman Hausherr
>            Priority: Major
>              Labels: StructureTree
>         Attachments: pdfbox-split-missing-tags_mail 15.5.2025-p1.pdf,
>                      pdfbox-split-missing-tags_mail 15.5.2025-p2.pdf,
>                      pdfbox-split-missing-tags_mail 15.5.2025-p3.pdf,
>                      pdfbox-split-missing-tags_mail 15.5.2025.pdf
>
>
> As submitted by Alastair Porter in the users mailing list
> java -jar pdfbox/app/target/pdfbox-app-4.0.0-SNAPSHOT.jar split -i input.pdf -outputPrefix output-split
> Only first page has the appropriate structure tree (/K is missing)
>
> === from the post in the mailing list ===
> In the first file, I correctly see the /K element. What's more, this element
> has correctly been pruned and doesn't include any items from the input
> document which point to pages that are not in this split.
>
> In subsequent split files, I see no /K element in the StructTreeRoot at all.
> I attached a PDF which I've been using for simple testing, which exhibits
> this behaviour.
>
> I had a bit of a look through the existing code, and I see that in
> Splitter.java, in cloneStructureTree
> {code:java}
> COSBase k1 = srcStructureTreeRoot.getK();
> COSBase k2 = new KCloner(dstPageTree).createClone(k1, dstStructureTreeRoot.getCOSObject(), null);
> dstStructureTreeRoot.setK(k2);
> {code}
> k2 is always null after the first split, it seems like it may not be created
> correctly.
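
For readers unfamiliar with the pruning the report describes, here is a minimal sketch of the expected behaviour: each split should keep the structure elements that point to its own pages, so the cloned /K should be non-null whenever the split's pages carry tagged content. This is a toy model and not PDFBox code. The `Node` class, the `prune` method, and the integer page indices are hypothetical stand-ins for structure elements, `KCloner`, and /Pg references.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class StructPruneSketch {
    /** Hypothetical stand-in for a structure element with an optional /Pg and its /K kids. */
    static final class Node {
        final String type;
        final Integer page; // null models a missing /Pg entry
        final List<Node> kids = new ArrayList<>();

        Node(String type, Integer page, Node... children) {
            this.type = type;
            this.page = page;
            for (Node child : children) {
                kids.add(child);
            }
        }
    }

    /** Returns a pruned copy of src, or null if nothing under it belongs to keptPages. */
    static Node prune(Node src, Set<Integer> keptPages) {
        if (src.page != null && !keptPages.contains(src.page)) {
            return null; // element points to a page outside this split
        }
        Node copy = new Node(src.type, src.page);
        for (Node kid : src.kids) {
            Node prunedKid = prune(kid, keptPages);
            if (prunedKid != null) {
                copy.kids.add(prunedKid);
            }
        }
        // A page-less element is only worth keeping if some descendant survived.
        if (src.page == null && copy.kids.isEmpty()) {
            return null;
        }
        return copy;
    }

    public static void main(String[] args) {
        Node root = new Node("Document", null,
                new Node("P", 0),
                new Node("P", 1),
                new Node("Sect", null, new Node("Span", 1)));
        // Split containing only the second page (index 1): the result should not be null.
        Node secondSplit = prune(root, Set.of(1));
        System.out.println(secondSplit == null ? "no /K" : "kids kept: " + secondSplit.kids.size());
    }
}
```

In this model, a null result for a split whose pages do have tagged content corresponds to the symptom reported above, where k2 is null for every split after the first.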