[ 
https://issues.apache.org/jira/browse/PDFBOX-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17470074#comment-17470074
 ] 

Tilman Hausherr commented on PDFBOX-5355:
-----------------------------------------

It's missing. Implementing this would need some understanding of the PDF 
Structure Tree. Use PDFDebugger to see what I mean (choose "Show Internal 
Structure" in the menu).

I had a look at the /K and /ParentTree trees and didn't find page 3, which 
has the image, only pages 1 and 2. (But you wanted to remove page 1 anyway.)

To learn a bit about the /K and /ParentTree trees, search for 
{{.getParentTree()}} and {{.getK()}} and their usages in the sources. They are 
mostly in the merge code, especially in the tests (look for {{checkElement}}).

I don't know if all there is to do is to remove all nodes that have a specific 
page. It seemed to me that sometimes one page leads to another page (as a 
child) so that one would be removed as well?!
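
To make that concern concrete, here is a minimal sketch on a toy tree. It does 
not use PDFBox: {{Node}}, {{pruneNaive}} and {{pruneKeepingOtherPages}} are 
hypothetical names invented for illustration, and whether re-attaching 
surviving children to the parent is valid for a real structure tree is an 
assumption, not something I have verified.

```java
import java.util.ArrayList;
import java.util.List;

public class StructTreePruneSketch {

    /** Hypothetical stand-in for a structure element; NOT a PDFBox class. */
    public static final class Node {
        public final String type;   // e.g. "Document", "Sect", "P", "Figure"
        public final Integer page;  // 0-based page index, or null if the node has no page entry
        public final List<Node> kids = new ArrayList<>();

        public Node(String type, Integer page) {
            this.type = type;
            this.page = page;
        }

        public Node add(Node kid) {
            kids.add(kid);
            return this;
        }
    }

    private static boolean onPage(Node n, int page) {
        return n.page != null && n.page == page;
    }

    /** Naive: drops every node on the page together with its whole subtree. */
    public static void pruneNaive(Node node, int page) {
        node.kids.removeIf(k -> onPage(k, page));
        for (Node k : node.kids) {
            pruneNaive(k, page);
        }
    }

    /**
     * Drops nodes on the page but keeps their descendants that belong to
     * other pages, re-attaching them to the nearest surviving ancestor.
     * (Assumption: re-parenting is acceptable; a real tree may need more care.)
     */
    public static void pruneKeepingOtherPages(Node node, int page) {
        List<Node> survivors = new ArrayList<>();
        for (Node k : node.kids) {
            pruneKeepingOtherPages(k, page);   // clean the subtree first
            if (onPage(k, page)) {
                survivors.addAll(k.kids);      // k goes, its cleaned kids stay
            } else {
                survivors.add(k);
            }
        }
        node.kids.clear();
        node.kids.addAll(survivors);
    }

    /** True if the node or any descendant refers to the given page. */
    public static boolean containsPage(Node node, int page) {
        if (onPage(node, page)) {
            return true;
        }
        for (Node k : node.kids) {
            if (containsPage(k, page)) {
                return true;
            }
        }
        return false;
    }

    /** Counts the node and all of its descendants. */
    public static int count(Node node) {
        int n = 1;
        for (Node k : node.kids) {
            n += count(k);
        }
        return n;
    }

    /** Toy tree: a section on page 0 has a child figure that lives on page 2. */
    public static Node sampleTree() {
        Node sect = new Node("Sect", 0)
                .add(new Node("P", 0))
                .add(new Node("Figure", 2));
        return new Node("Document", null)
                .add(sect)
                .add(new Node("P", 1));
    }

    public static void main(String[] args) {
        Node naive = sampleTree();
        pruneNaive(naive, 0);
        System.out.println("naive keeps page 2: " + containsPage(naive, 2));

        Node careful = sampleTree();
        pruneKeepingOtherPages(careful, 0);
        System.out.println("careful keeps page 2: " + containsPage(careful, 2));
    }
}
```

In the toy tree the naive variant loses the figure on page 2 because its 
parent sits on the removed page, which is exactly the case I am unsure about. 
A real implementation would also have to update the /ParentTree entries, which 
this sketch ignores.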

> remove page from pdf with image violate conformance level pdf1.7
> ----------------------------------------------------------------
>
>                 Key: PDFBOX-5355
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5355
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.20, 3.0.0 JBIG2
>            Reporter: lappa-lappa
>            Priority: Major
>         Attachments: pdf_result.pdf, with_image.pdf
>
>
> Open [https://www.pdf-online.com/osa/validate.aspx] and upload 
> "with_image.pdf"; validation passes.
> Execute the following code (update the absolute paths to the files):
>
>     byte[] withImage = readFile("C:/r/pdf/with_image.pdf");
>     try (PDDocument boxDocument = Loader.loadPDF(withImage)) {
>         boxDocument.removePage(0);
>         try (ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
>             boxDocument.save(bos);
>             byte[] pdfBytes = bos.toByteArray();
>             Files.write(Path.of("C:/r/pdf/pdf_result.pdf"), pdfBytes);
>         }
>     } catch (IOException e) {
>         e.printStackTrace();
>     }
>
> Upload pdf_result.pdf to [https://www.pdf-online.com/osa/validate.aspx]; 
> validation fails.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
