Determine page of PDStructureNode

Pascal.Schumacher Wed, 07 May 2025 07:17:37 -0700

Hi,

I am trying to improve the accessibility of existing PDFs.


In a first step I collect the marked content ids of elements with no content 
from the content streams.

Then I remove from the structure tree those PDStructureNodes whose first kid 
is an Integer value contained in the collected marked content ids.

This works fine for simple documents but fails if there are multiple pages that 
contain elements with the same marked content ids.
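
To illustrate the collision, here is a minimal sketch in plain Java. It 
assumes the collected ids are keyed by page as well as by marked content id, 
since a marked content id is only unique within one content stream. 
EmptyMcidIndex, markEmpty and isEmpty are hypothetical names, not PDFBox API, 
and how to obtain the page index for a structure node is exactly the open 
question, so it is simply passed in here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of PDFBox: tracks empty marked-content
// sequences per page, because the same marked content id can occur on
// several pages.
final class EmptyMcidIndex {
    private final Map<Integer, Set<Integer>> emptyMcidsByPage = new HashMap<>();

    // Record that marked content id `mcid` on page `pageIndex` has no content.
    void markEmpty(int pageIndex, int mcid) {
        emptyMcidsByPage.computeIfAbsent(pageIndex, k -> new HashSet<>()).add(mcid);
    }

    // True only if this exact (page, mcid) pair was recorded as empty.
    boolean isEmpty(int pageIndex, int mcid) {
        Set<Integer> mcids = emptyMcidsByPage.get(pageIndex);
        return mcids != null && mcids.contains(mcid);
    }
}
```

With such an index, an id recorded as empty on one page would no longer 
cause a structure node that points at the same id on another page to be 
removed.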

Is there a way to determine which page a structure node references? Screen 
readers can process these documents correctly, so the information should be 
available somewhere.

Thanks and kind regards,
Pascal

