[Libreoffice-bugs] [Bug 157028] FILESAVE PDF Tagged PDF export makes file size grow significantly

bugzilla-daemon Tue, 05 Sep 2023 09:46:59 -0700

https://bugs.documentfoundation.org/show_bug.cgi?id=157028


--- Comment #3 from Tiago do Amaral Rodrigues <[email protected]> ---
I have the same issue. I downloaded the file attached to this bug for checking
and exported it with LO 7.6.0.3 (x86_64), using the same preferences each time
except for checking and unchecking the “tagged PDF” option. As Gabor Kelemen
reports above, the sizes are the following:

original:     1 219 784 bytes
without tags:   629 441 bytes
with tags:    3 679 547 bytes

Then I ran the generated PDFs through an optimiser
(https://github.com/pts/pdfsizeopt) and it reported the following:

> C:\pdfsizeopt>pdfsizeopt.exe 2013_Annual_Report-no-structure.pdf 
> 2013_Annual_Report-no-structure-optimised.pdf
> info: This is pdfsizeopt ZIP rUNKNOWN size=69856.
> info: prepending to PATH: C:\pdfsizeopt\pdfsizeopt_win32exec
> info: loading PDF from: 2013_Annual_Report-no-structure.pdf
> info: loaded PDF of 629441 bytes
> info: separated to 330 objs + xref + trailer
> info: parsed 330 objs
> info: eliminated 103 unused objs, depth=8
> info: found 0 Type1 fonts loaded
> info: found 0 Type1C fonts loaded
> info: optimized 108 streams, kept 22 #orig, 1 uncompressed, 85 zip
> info: compressed 1 streams, kept 0 of them uncompressed
> info: saving PDF with 227 objs to: 
> 2013_Annual_Report-no-structure-optimised.pdf
> info: generated object stream of 2362 bytes in 116 objects (8%)
> info: generated 606279 bytes (96%)

> C:\pdfsizeopt>pdfsizeopt.exe 2013_Annual_Report-with-structure.pdf 
> 2013_Annual_Report-with-structure-optimised.pdf
> info: This is pdfsizeopt ZIP rUNKNOWN size=69856.
> info: prepending to PATH: C:\pdfsizeopt\pdfsizeopt_win32exec
> info: loading PDF from: 2013_Annual_Report-with-structure.pdf
> info: loaded PDF of 3679547 bytes
> info: separated to 26308 objs + xref + trailer
> info: parsed 26308 objs
> info: eliminated 103 unused objs, depth=9
> info: found 0 Type1 fonts loaded
> info: found 0 Type1C fonts loaded
> info: optimized 108 streams, kept 15 #orig, 1 uncompressed, 92 zip
> info: eliminated 12804 duplicate objs
> info: compressed 1 streams, kept 0 of them uncompressed
> info: saving PDF with 13401 objs to: 
> 2013_Annual_Report-with-structure-optimised.pdf
> info: generated object stream of 190352 bytes in 13290 objects (7%)
> info: generated 835699 bytes (23%)

yielding:
without tags:   606 279 bytes (-3.68%)
with tags:      835 699 bytes (-77.29%)
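
For reference, the two percentages can be reproduced from the byte counts in
the pdfsizeopt logs. This is only a minimal Python sketch; the function name
percent_change is an illustrative choice and not part of pdfsizeopt or LO.

```python
def percent_change(before: int, after: int) -> float:
    """Relative size change in percent (negative means the file shrank)."""
    return (after - before) / before * 100


# Byte counts copied from the "loaded PDF of" and "generated ... bytes" log lines.
sizes = {
    "without tags": (629_441, 606_279),
    "with tags": (3_679_547, 835_699),
}

for label, (before, after) in sizes.items():
    print(f"{label}: {percent_change(before, after):+.2f}%")
```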


So the first PDF file had 330 PDF objects, of which 103 were considered unused
and discarded. The second PDF file had 26 308 objects, of which 103 were
considered unused and 12 804 were duplicates; both classes were discarded.
These may be the objects that were fused together by Michael Stahl's commit;
if not, it may be worth exploring what options the PDF-manipulating library
that LO uses offers for avoiding them.
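
The object counts in the two logs are consistent with each other: the parsed
count minus the unused and duplicate objects gives the count reported in the
"saving PDF with ... objs" line. A small Python sketch of that check follows;
the helper name remaining_objects is hypothetical and only used for
illustration here.

```python
def remaining_objects(parsed: int, unused: int, duplicates: int = 0) -> int:
    """Objects left after dropping unused and duplicate ones."""
    return parsed - unused - duplicates


# Numbers copied from the pdfsizeopt logs quoted in this comment.
no_tags = remaining_objects(parsed=330, unused=103)
with_tags = remaining_objects(parsed=26_308, unused=103, duplicates=12_804)

print("without tags:", no_tags)
print("with tags:", with_tags)
```

Both results match what pdfsizeopt reports saving (227 and 13401 objects), so
the unused and duplicate objects account for all of the objects it dropped.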

In any case, I will try to download and install the daily build overnight,
repeat the same exercise, and then report back with the results.
Thanks again.

