[ 
https://issues.apache.org/jira/browse/PDFBOX-5169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325528#comment-17325528
 ] 

Tilman Hausherr commented on PDFBOX-5169:
-----------------------------------------

This is structure information that allows the document to be read by screen 
readers, e.g. for blind people. I know that Word offers this as an option, but 
according to the metadata your file was produced by Adobe InDesign, not Word. 
InDesign probably has such an option too. The information can also be removed 
by calling {{document.getDocumentCatalog().setStructureTreeRoot(null);}} when 
using the API.
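
A minimal sketch of the API route, assuming PDFBox 2.0.x on the classpath and the two input files from the report in the working directory. The class name {{MergeWithoutStructureTree}} and the helper {{stripStructureTree}} are made up for illustration and are not part of PDFBox. Note that dropping the structure tree also drops the accessibility information that screen readers use.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.multipdf.PDFMergerUtility;
import org.apache.pdfbox.pdmodel.PDDocument;

public class MergeWithoutStructureTree {

    // Hypothetical helper: removes the structure tree root from the catalog,
    // as suggested in the comment above.
    static void stripStructureTree(PDDocument document) {
        document.getDocumentCatalog().setStructureTreeRoot(null);
    }

    public static void main(String[] args) throws IOException {
        PDFMergerUtility merger = new PDFMergerUtility();

        // Both documents must stay open until the destination has been saved.
        try (PDDocument first = PDDocument.load(new File("1.pdf"));
             PDDocument second = PDDocument.load(new File("2.pdf"))) {

            // Drop the structure tree from each input before merging.
            stripStructureTree(first);
            stripStructureTree(second);

            // Append the pages of the second document to the first one.
            merger.appendDocument(first, second);

            first.save("output.pdf");
        }
    }
}
```

Whether this brings the output down to a reasonable size for these particular files is untested here; it only shows where the call would go.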

Another possibility would be to postprocess with QPDF.

> PDFMerger produces overly large output PDF
> ------------------------------------------
>
>                 Key: PDFBOX-5169
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5169
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.22, 2.0.23
>         Environment: Debian 10
>            Reporter: Jakov Vežić
>            Priority: Minor
>
> Using PDFMerger to combine
> [https://www.dropbox.com/s/kprk7aeggni420c/1.pdf?dl=1]
> with
> [https://www.dropbox.com/s/0h8bced4tm3gppz/2.pdf?dl=1]
> results in an overly large file. The two input files are 1.25 MB and 16.3 MB, 
> while the output file is just over 400 MB. The merge also consumes about 1 GB 
> of memory. As far as I can tell, no errors are produced during the merge.
> The command is:
> {code:java}
> java -Xmx2500M -jar pdfbox-app-2.0.23.jar PDFMerger 1.pdf 2.pdf output.pdf
> {code}



