[ 
https://issues.apache.org/jira/browse/PDFBOX-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364639#comment-17364639
 ] 

yoonho edited comment on PDFBOX-5216 at 6/17/21, 2:27 AM:
----------------------------------------------------------

[~mkl]

Hello, I also tried the code you posted on Stack Overflow.

When I tested the same file with Adobe DC (19.8 MB) and with the code you 
uploaded (19.6 MB), there was not much difference in file size.

However, when I view the two files with Adobe DC's internal file viewer, the 
first page differs: in the Adobe DC output the number of image objects has 
been reduced from 2 to 1, whereas in the output of the code you posted it is 
still 2.

Could you please tell me the reason? You mentioned that the new version of 
PDFBox has not been tested yet. Can the code be used reliably with versions 
prior to the PDFBox 3.0 pre-releases?

!samepage.png!



> Is there a way to optimize by cleaning up duplicate objects?
> ------------------------------------------------------------
>
>                 Key: PDFBOX-5216
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5216
>             Project: PDFBox
>          Issue Type: Wish
>            Reporter: yoonho
>            Priority: Major
>         Attachments: samepage.png, 스크린샷 2021-06-15 오후 2.02.21.png
>
>
> Is there a way to clean up duplicate objects using PDFBox?
> [http://gofile.me/4hSqO/Cis33w0Sa] - Original
> [http://gofile.me/4hSqO/7XKmWqUBB]  - Clean version
> I applied Adobe DC's Optimize option (see the attached file). As 
> a result, a 48 MB PDF file was reduced to 19 MB. I think this is due to 
> cleaning up duplicate objects in the PDF.
> Am I right? I would like to implement this process with PDFBox. How should I 
> approach it?
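
To make the "duplicate objects" idea above concrete, here is a minimal sketch of content-based duplicate detection in plain Java. It deliberately uses no PDFBox calls: the class name DuplicateStreamSketch and the method canonicalIndices are hypothetical, and the byte arrays stand in for decoded stream contents that would have to be read from the PDF by other means. It is an illustration of the general technique under those assumptions, not a description of what Adobe DC's optimizer or the Stack Overflow code actually does.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: finds byte-identical streams.
public class DuplicateStreamSketch {

    /** Returns the hex-encoded SHA-256 digest of the given bytes. */
    public static String sha256(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * For each stream, returns the index of the first stream that has the
     * same digest. A stream that maps to its own index is a "canonical"
     * copy; any other stream is a duplicate that could reference it.
     */
    public static int[] canonicalIndices(List<byte[]> streams) {
        Map<String, Integer> firstSeen = new HashMap<>();
        int[] canonical = new int[streams.size()];
        for (int i = 0; i < streams.size(); i++) {
            String key = sha256(streams.get(i));
            // putIfAbsent returns the earlier index, or null if the key is new.
            Integer earlier = firstSeen.putIfAbsent(key, i);
            canonical[i] = (earlier == null) ? i : earlier;
        }
        return canonical;
    }

    public static void main(String[] args) {
        // Stand-ins for three image streams; the first and third are identical.
        List<byte[]> streams = List.of(
                new byte[] {1, 2, 3},
                new byte[] {9},
                new byte[] {1, 2, 3});
        System.out.println(Arrays.toString(canonicalIndices(streams)));
        // prints [0, 1, 0]
    }
}
```

In a real PDF, two streams should only be merged if their dictionaries (filters, color space, dimensions and so on) also match, so comparing stream bytes alone is an assumption made to keep the sketch short.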



