[ https://issues.apache.org/jira/browse/PDFBOX-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350234#comment-15350234 ]
Maruan Sahyoun commented on PDFBOX-3398:
----------------------------------------
The PDF document linked from the Stack Overflow article does not contain that
information. As far as I can see, the StructureTree is used to identify the
different building blocks of the document and points to the corresponding
marked content sequences in the page content stream. The only accessibility
feature used is the language specification via the {{Lang}} attribute. As far
as I can tell, there is no plain text definition or replacement text, which is
why you couldn't find it.
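
To illustrate what a text/XML dump of such a structure could look like, here is
a minimal sketch. It assumes a tiny in-memory model and does not use the PDFBox
API: the names StructNode, StructureXmlSketch, toXml and the StructElem element
are hypothetical and chosen only for this example. A document like the one
discussed here would yield elements carrying Lang but no replacement text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a structure element; this is NOT a PDFBox class.
final class StructNode {
    final String type;        // structure type, e.g. "Document" or "P"
    final String lang;        // value of the Lang attribute, or null if absent
    final String actualText;  // replacement text, or null if absent
    final List<StructNode> kids = new ArrayList<>();

    StructNode(String type, String lang, String actualText) {
        this.type = type;
        this.lang = lang;
        this.actualText = actualText;
    }

    // Appends a child and returns this node so calls can be chained.
    StructNode add(StructNode kid) {
        kids.add(kid);
        return this;
    }
}

final class StructureXmlSketch {
    // Escapes the characters that are unsafe inside an XML attribute value.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    private static void indent(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    // Writes one node and, recursively, its children with two-space indentation.
    private static void write(StructNode n, int depth, StringBuilder out) {
        indent(depth, out);
        out.append("<StructElem S=\"").append(escape(n.type)).append('"');
        if (n.lang != null) {
            out.append(" Lang=\"").append(escape(n.lang)).append('"');
        }
        if (n.actualText != null) {
            out.append(" ActualText=\"").append(escape(n.actualText)).append('"');
        }
        if (n.kids.isEmpty()) {
            out.append("/>\n");
            return;
        }
        out.append(">\n");
        for (StructNode kid : n.kids) {
            write(kid, depth + 1, out);
        }
        indent(depth, out);
        out.append("</StructElem>\n");
    }

    static String toXml(StructNode root) {
        StringBuilder out = new StringBuilder();
        write(root, 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Mirrors the situation described above: Lang is set, no replacement text.
        StructNode root = new StructNode("Document", "en-US", null)
                .add(new StructNode("H1", null, null))
                .add(new StructNode("P", null, null));
        System.out.print(toXml(root));
    }
}
```

In a real implementation the nodes would be filled from the document's
structure tree rather than built by hand; the serialization step would stay
roughly the same.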
> Text (XML) output of pdf structure
> ----------------------------------
>
> Key: PDFBOX-3398
> URL: https://issues.apache.org/jira/browse/PDFBOX-3398
> Project: PDFBox
> Issue Type: New Feature
> Components: Parsing, Utilities
> Reporter: Stefan Hegny
> Priority: Minor
>
> It would be nice to have a text/XML output of the entire document structure
> of a PDF file, as it can be browsed in the debugger window GUI. It would
> allow for easier searching and understanding of the structure. Not sure if it
> should be an option to PDFReader/PDFDebugger or a separate class that might
> also be bundled into an app jar. I would even start working on it, given the
> preferred base to start from.