On 23.12.2024 20:42, Tilman Hausherr wrote:
On 23.12.2024 20:00, Tilman Hausherr wrote:
Hi,

In the meantime I was able to reproduce it with pages 49 and 50, and I have a theory about what happened. The newer versions of PDFBox do a lot of cleanup in the structure tree when cleaning up annotation destinations that don't exist in the destination document. Because you deleted the annotations manually, these "orphan pages" are possibly still in the part of the structure tree that is kept.


1) Sorry, I see you mentioned 20 in the initial post. I looked too much into the code.

2) I inspected the result file, and it does indeed have orphan pages. I'll look at the code (which I wrote in January) to find out whether or not I intended to remove orphan pages when cloning the structure tree. In the worst case (for you), I never had that intention and can't do it, so you created the problem by deleting the annotations without cleaning up the structure tree. In the best case (for you), either there's a bug in the code, or there isn't but I can add some orphan cleaning.

Updates:

3) fixed a typo in your last name, sorry

4) created and fixed https://issues.apache.org/jira/browse/PDFBOX-5928 , so the orphan test works better now and does detect the orphan. I still need to find out what I mentioned in (2).

Tilman

PS: In case this wasn't clear, you don't have to write any further test code.
