[ 
https://issues.apache.org/jira/browse/PDFBOX-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713387#comment-17713387
 ] 

Moritz Flöter edited comment on PDFBOX-5539 at 4/18/23 4:56 AM:
----------------------------------------------------------------

!pdfvoler-structure-view.png!

I was preparing an overview of tooling that can be used to debug PDFs, remove 
sensitive information, create test data, etc.

I came across PDF Vole 
([https://github.com/Rossi1337/pdf_vole/releases/tag/v20210711]) and liked the 
idea of how the view is structured. This would also simplify the view selection 
in PDFBox Debugger as there would only be a structured and a page view instead 
of three distinct views. Even if not all objects are pulled from a single xref 
table, the view also makes structural sense, as the most basic exploration of 
any PDF file would start with looking at the trailer and then at the xref. And 
realistically, when looking at the xref, you want to see where the PDF objects 
are in order to inspect them (for example through a text editor).
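
To illustrate that trailer-then-xref-then-object walk, here is a minimal 
Python sketch using only the standard library. It is not PDFBox or PDF Vole 
code; `read_xref_offsets` is a hypothetical helper name. It assumes a classic 
plain-text xref table, and it ignores xref streams, object streams and 
older xref sections chained via /Prev.

```python
import re
from typing import Dict


def read_xref_offsets(pdf_bytes: bytes) -> Dict[int, int]:
    """Return {object number: byte offset} for in-use entries of the
    last classic xref table in the file (assumption: no xref stream)."""
    # The offset of the xref section follows the last 'startxref' keyword.
    pos = pdf_bytes.rfind(b"startxref")
    if pos < 0:
        raise ValueError("no startxref keyword found")
    match = re.match(rb"startxref\s+(\d+)", pdf_bytes[pos:])
    if match is None:
        raise ValueError("startxref is not followed by an offset")
    xref_pos = int(match.group(1))
    if not pdf_bytes.startswith(b"xref", xref_pos):
        raise ValueError("no classic xref table at the startxref offset")

    lines = pdf_bytes[xref_pos:].splitlines()
    offsets: Dict[int, int] = {}
    i = 1  # lines[0] is the 'xref' keyword itself
    while i < len(lines):
        header = lines[i].split()
        # A subsection header is '<first object number> <entry count>'.
        # Anything else (normally 'trailer') ends the table.
        if len(header) != 2 or not all(part.isdigit() for part in header):
            break
        first, count = int(header[0]), int(header[1])
        for n in range(count):
            offset, _generation, kind = lines[i + 1 + n].split()
            if kind == b"n":  # 'n' = in use, 'f' = free
                offsets[first + n] = int(offset)
        i += 1 + count
    return offsets
```

With the returned mapping, `pdf_bytes[offsets[12]:]` would start at the 
`12 0 obj` header, which is the position one would otherwise search for in a 
text editor.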

I do realize that the actual implementation would happen in a separate issue 
(if at all). But as it is closely linked, I'd place the idea here first, before 
creating an issue.


was (Author: moritzf):
!pdfvoler-structure-view.png!

I was preparing an overview of tooling that can be used to debug PDFs, remove 
sensitive information, create test data, etc.

I came across PDF Vole 
([https://github.com/Rossi1337/pdf_vole/releases/tag/v20210711]) and liked the 
idea of how the view is structured. This would also simplify the view selection 
in PDFBox Debugger as there would only be a structured and a page view instead 
of three distinct views. Even if not all objects are pulled from a single xref 
table, the view also makes structural sense, as the most basic exploration of 
any PDF file would start with looking at the trailer and then at the xref. And 
realistically, when looking at the xref, you want to see the PDF objects (for 
example through a text editor).

I do realize that the actual implementation would happen in a separate issue 
(if at all). But as it is closely linked, I'd place the idea here first, before 
creating an issue.

> Show CRT in PDFDebugger
> -----------------------
>
>                 Key: PDFBOX-5539
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5539
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: Utilities
>    Affects Versions: 2.0.27, 3.0.0 PDFBox
>            Reporter: Moritz Flöter
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>             Fix For: 3.0.0 PDFBox
>
>         Attachments: Frühlingsangebot.pdf, pdfvoler-structure-view.png
>
>
> For analyzing potentially erroneous PDFs it would be quite helpful to be able 
> to show the CRT (Cross Reference Table/xref) and navigate to its entries.
> Some software does provide rather technical (and therefore quite precise) 
> information about errors in the PDF files mentioning the object number in the 
> pdf file instead of page numbers. With PDF-Debugger, I currently have to 
> navigate the Document Catalog Tree structure to find the object. Furthermore, 
> navigating the tree structure does not enable one to view unreferenced 
> objects.
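
As a rough illustration of locating an object by its number without walking 
the Document Catalog, the following Python sketch scans the raw bytes for 
`N G obj` headers. It is not how PDFDebugger works; `scan_object_offsets` is 
a hypothetical name. The scan is naive: it can report false hits inside 
stream data or strings, and it cannot see objects stored in object streams.

```python
import re

# Matches an indirect object header such as '12 0 obj'.
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")


def scan_object_offsets(pdf_bytes):
    """Map (object number, generation) to the byte offset of the last
    matching 'N G obj' header found in the raw file."""
    found = {}
    for match in OBJ_HEADER.finditer(pdf_bytes):
        key = (int(match.group(1)), int(match.group(2)))
        # Later definitions overwrite earlier ones, as a later
        # incremental update would (assumption: simple files only).
        found[key] = match.start()
    return found
```

Because the scan does not depend on references from the catalog, it also 
turns up objects that nothing in the tree points to.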


