Hi,

With some of our PDFs I get two different errors:

1:

java.lang.NullPointerException
    at org.apache.pdfbox.multipdf.Splitter.cloneStructureTree(Splitter.java:238)
    at org.apache.pdfbox.multipdf.Splitter.split(Splitter.java:145)
    at org.apache.pdfbox.tools.PDFSplit.call(PDFSplit.java:133)
    at org.apache.pdfbox.tools.PDFSplit.call(PDFSplit.java:41)
    at picocli.CommandLine.executeUserObject(CommandLine.java:2031)

This appears to be related to your change in rev 1925636 (PDFBOX-6009: get ParentTreeNextKey from tree). I note that with the commit before this change, the splitter runs and generates files, but I've not yet verified the accuracy of the structure tree. This appears to happen on files whose /K have no /Pg element.

2:

Exception in thread "main" java.lang.StackOverflowError
    at java.base/java.lang.StringCoding.encodeUTF8(StringCoding.java:909)
    at java.base/java.lang.StringCoding.encode(StringCoding.java:449)
    at java.base/java.lang.String.getBytes(String.java:964)
    at org.apache.pdfbox.cos.COSName.writePDF(COSName.java:778)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSName(COSWriterObjectStream.java:308)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:232)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:352)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:240)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:354)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:240)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSArray(COSWriterObjectStream.java:329)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:236)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:354)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:240)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:354)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:240)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSArray(COSWriterObjectStream.java:329)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:236)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:354)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:240)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSArray(COSWriterObjectStream.java:329)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:236)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:354)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeObject(COSWriterObjectStream.java:240)
    at org.apache.pdfbox.pdfwriter.compress.COSWriterObjectStream.writeCOSDictionary(COSWriterObjectStream.java:354)
    ... continues

I also get this stack overflow on one of the sample files that I successfully tested on Friday, so it's possible that a change since then has caused this. This appears to happen on files whose /K do have a /Pg (on files with no /Pg I get the NPE first).

I'm currently verifying if we can privately share these documents with you. Please let me know if it would be useful for debugging. I have an account on apache jira, please let me know if you'd prefer to continue there, or if it's OK to use the mailing list.

Thanks,
Alastair

On Sat, 17 May 2025 at 13:23, Tilman Hausherr <thaush...@t-online.de> wrote:

> Hi,
>
> Make sure to download the software again, I found another bug that I fixed.
>
> Tilman
>
> On 16.05.2025 21:36, Alastair Porter wrote:
> > Hi Tilman,
> >
> >> Please try with a snapshot:
> >>
> >> https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/4.0.0-SNAPSHOT/
> >>
> >> Now elements without /Pg entry are removed only if they have MCIDs. Note
> >> that the "new" second page doesn't pass the PAC test but this is because
> >> it starts with H2.
> >
> > It looks like this works! Thanks for the prompt response and fix. I've
> > checked a few test files which I have and their splits now include the
> > expected tags. I'll send this to the rest of our team next week for them to
> > review in more detail, but it looks like things are working here for us.
> >
> > Thanks again.
> > Alastair
> >
> > On Fri, 16 May 2025 at 16:17, Tilman Hausherr <thaush...@t-online.de> wrote:
> >
> >> Please try with a snapshot:
> >>
> >> https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/4.0.0-SNAPSHOT/
> >>
> >> Now elements without /Pg entry are removed only if they have MCIDs. Note
> >> that the "new" second page doesn't pass the PAC test but this is because
> >> it starts with H2.
> >>
> >> Please try the new version on other PDFs that had this problem.
> >>
> >> Tilman
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
> >> For additional commands, e-mail: users-h...@pdfbox.apache.org
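
For readers following the second trace: the repeating writeObject / writeCOSDictionary / writeCOSArray frames are what unbounded recursion over nested dictionaries and arrays looks like. The thread does not establish the cause. One possible reading, offered purely as an assumption, is that the written graph contains a cycle of directly nested objects (for example a child whose parent entry leads back to the parent's kids array). The sketch below is plain Java with no PDFBox dependency. Every name in it (RecursiveWriteSketch, countNaive, countGuarded, buildCyclicTree) is hypothetical and is not PDFBox API. It only shows how such a cycle overflows a naive recursive walk and how an identity-based visited set stops it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RecursiveWriteSketch {

    // Naive walk: descends into every nested map and list, mirroring the
    // dictionary/array mutual recursion visible in the trace. Never terminates
    // on a cyclic graph, so the JVM throws StackOverflowError.
    static int countNaive(Object node) {
        int count = 1;
        if (node instanceof Map) {
            for (Object value : ((Map<?, ?>) node).values()) {
                count += countNaive(value);
            }
        } else if (node instanceof List) {
            for (Object item : (List<?>) node) {
                count += countNaive(item);
            }
        }
        return count;
    }

    // Guarded walk: tracks containers already visited, by identity, and skips
    // them on a second encounter. A real writer would emit a reference there.
    static int countGuarded(Object node, Set<Object> seen) {
        boolean container = node instanceof Map || node instanceof List;
        if (container && !seen.add(node)) {
            return 0;
        }
        int count = 1;
        if (node instanceof Map) {
            for (Object value : ((Map<?, ?>) node).values()) {
                count += countGuarded(value, seen);
            }
        } else if (node instanceof List) {
            for (Object item : (List<?>) node) {
                count += countGuarded(item, seen);
            }
        }
        return count;
    }

    // Hypothetical stand-in for a structure element: the parent holds a "K"
    // array of kids, and the kid holds a "P" entry pointing back at the parent.
    static Map<String, Object> buildCyclicTree() {
        Map<String, Object> parent = new LinkedHashMap<>();
        Map<String, Object> child = new LinkedHashMap<>();
        List<Object> kids = new ArrayList<>();
        kids.add(child);
        parent.put("K", kids);
        child.put("P", parent);
        return parent;
    }

    public static void main(String[] args) {
        Map<String, Object> tree = buildCyclicTree();
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        System.out.println("guarded count: " + countGuarded(tree, seen));
        try {
            countNaive(tree);
        } catch (StackOverflowError e) {
            System.out.println("naive walk overflowed the stack");
        }
    }
}
```

The visited set uses identity rather than equals/hashCode on purpose: calling hashCode on a map that contains itself would recurse in the same way. Very deep but acyclic nesting would produce the same kind of trace, so this sketch does not distinguish between the two cases.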