[
https://issues.apache.org/jira/browse/PDFBOX-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713387#comment-17713387
]
Moritz Flöter edited comment on PDFBOX-5539 at 4/18/23 4:56 AM:
----------------------------------------------------------------
!pdfvoler-structure-view.png!
I was preparing an overview of tooling that can be used to debug PDFs, remove
sensitive information, create test data etc.
I came across PDF Vole
([https://github.com/Rossi1337/pdf_vole/releases/tag/v20210711]) and liked
how its view is structured. Adopting it would also simplify the view selection
in PDFBox Debugger, as there would only be a structure view and a page view
instead of three distinct views. Even if not all objects are pulled from a
single xref table, the view also makes structural sense, as the most basic
exploration of any PDF file starts with looking at the trailer and then at the
xref. And realistically, when looking at the xref, you want to see where the
PDF objects are in order to inspect them (for example through a text editor).
I do realize that the actual implementation would happen in a separate issue
(if at all). But as it is closely linked, I'd place the idea here first, before
creating an issue.
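To illustrate the exploration order described above (trailer, then xref, then the object offsets), here is a minimal sketch in plain Java. It is not PDFBox code: the class name XrefSketch and the method readXref are hypothetical, and it assumes a file with a single classic xref table (no cross-reference streams, no incremental updates).

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of PDFBox: reads one classic xref table. */
final class XrefSketch {

    /** Returns object number -> byte offset for all in-use ("n") entries. */
    static Map<Integer, Long> readXref(byte[] pdf) {
        // ISO-8859-1 maps every byte to exactly one char,
        // so string indices are identical to byte offsets.
        String s = new String(pdf, StandardCharsets.ISO_8859_1);

        // 1. The end of the file holds "startxref <offset> %%EOF".
        int marker = s.lastIndexOf("startxref");
        if (marker < 0) {
            throw new IllegalArgumentException("no startxref keyword found");
        }
        String[] tail = s.substring(marker + "startxref".length()).trim().split("\\s+");
        int xrefPos = Integer.parseInt(tail[0]);

        // 2. A classic table starts with "xref" and is followed by "trailer".
        //    Assumption: cross-reference streams are out of scope for this sketch.
        int trailerPos = s.indexOf("trailer", xrefPos);
        if (!s.startsWith("xref", xrefPos) || trailerPos < 0) {
            throw new IllegalArgumentException("no classic xref table at offset " + xrefPos);
        }
        String[] tokens = s.substring(xrefPos, trailerPos).trim().split("\\s+");

        // 3. Subsections: "<first> <count>", then <count> entries
        //    of the form "<offset> <generation> <n|f>".
        Map<Integer, Long> offsets = new LinkedHashMap<>();
        int i = 1; // tokens[0] is the "xref" keyword
        while (i + 1 < tokens.length) {
            int first = Integer.parseInt(tokens[i]);
            int count = Integer.parseInt(tokens[i + 1]);
            i += 2;
            for (int k = 0; k < count; k++, i += 3) {
                long offset = Long.parseLong(tokens[i]);
                boolean inUse = "n".equals(tokens[i + 2]);
                if (inUse) {
                    offsets.put(first + k, offset);
                }
            }
        }
        return offsets;
    }
}
```

Each returned offset is the position you would jump to in a text editor to find the matching "N G obj" header, which is the navigation step the structure view would offer.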
was (Author: moritzf):
!pdfvoler-structure-view.png!
I was preparing an overview of tooling that can be used to debug PDFs, remove
sensitive information, create test data etc.
I came across PDF Vole
([https://github.com/Rossi1337/pdf_vole/releases/tag/v20210711]) and liked
how its view is structured. Adopting it would also simplify the view selection
in PDFBox Debugger, as there would only be a structure view and a page view
instead of three distinct views. Even if not all objects are pulled from a
single xref table, the view also makes structural sense, as the most basic
exploration of any PDF file starts with looking at the trailer and then at the
xref. And realistically, when looking at the xref, you want to see the PDF
objects (for example through a text editor).
I do realize that the actual implementation would happen in a separate issue
(if at all). But as it is closely linked, I'd place the idea here first, before
creating an issue.
> Show CRT in PDFDebugger
> -----------------------
>
> Key: PDFBOX-5539
> URL: https://issues.apache.org/jira/browse/PDFBOX-5539
> Project: PDFBox
> Issue Type: New Feature
> Components: Utilities
> Affects Versions: 2.0.27, 3.0.0 PDFBox
> Reporter: Moritz Flöter
> Assignee: Andreas Lehmkühler
> Priority: Major
> Fix For: 3.0.0 PDFBox
>
> Attachments: Frühlingsangebot.pdf, pdfvoler-structure-view.png
>
>
> For analyzing potentially erroneous PDFs it would be quite helpful to be able
> to show the CRT (Cross Reference Table/xref) and navigate to its entries.
> Some software does provide rather technical (and therefore quite precise)
> information about errors in the PDF files mentioning the object number in the
> pdf file instead of page numbers. With PDF-Debugger, I currently have to
> navigate the Document Catalog Tree structure to find the object. Furthermore,
> navigating the tree structure does not enable one to view unreferenced
> objects.