Hmm, yeah, RunLengthDecode doesn't seem to be the best filter for this kind of image. Well, from what I see, it's not really a good compression algorithm at all!

An interesting fact I found was that if I pass my 27 MB file to ps2ps (Ghostscript ps2write device), I end up with a 1.7 MB file that uses "/ASCII85Decode filter /LZWDecode filter". I don't know much about these decoding algorithms, but it would be really nice if that kind of post-compression happened directly in poppler's pdftops.

I'd be willing to help if someone helped me figure it out. I see poppler already has an LZWStream class; would it simply be a matter of plugging it in somewhere in PSOutputDev.cc, in place of or in addition to RunLengthDecode?

Pierre-Luc

On 01/27/2016 01:55 PM, William Bader wrote:
tux-yellow and tux-white both convert to a 2549x3299 RGB bitmap that is RunLength compressed and ASCII85 encoded.

The yellow file is larger than the white file because "255 194 14" does not compress as well as "255 255 255".
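
To illustrate the point about "255 194 14" versus "255 255 255", here is a minimal Python sketch of a byte-oriented run-length encoder that follows the record layout of the PostScript RunLengthEncode filter (length byte 0-127: copy the next n+1 bytes; 129-255: repeat the next byte 257-n times; 128: end of data). It is not poppler's encoder. The function name and the single-row test data are assumptions made for illustration; only the 2549-pixel row width comes from the message above.

```python
def run_length_encode(data: bytes) -> bytes:
    """Encode bytes using the PostScript RunLengthEncode record layout."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        # Measure the run of identical bytes starting at i (at most 128).
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            out.append(257 - run)   # repeat record
            out.append(data[i])
            i += run
        else:
            # Collect literal bytes until a run starts or 128 are gathered.
            start = i
            i += 1
            while (i < n and i - start < 128
                   and not (i + 1 < n and data[i] == data[i + 1])):
                i += 1
            out.append(i - start - 1)   # literal record
            out += data[start:i]
    out.append(128)                 # end-of-data marker
    return bytes(out)


# One row of 2549 RGB pixels, as interleaved bytes.
white_row = bytes([255, 255, 255]) * 2549
yellow_row = bytes([255, 194, 14]) * 2549

print(len(white_row), len(run_length_encode(white_row)))
print(len(yellow_row), len(run_length_encode(yellow_row)))
```

Because the filter works on bytes rather than pixels, a solid white row is one long run of 255, while a solid yellow row never has two equal adjacent bytes, so every byte is stored as a literal and the encoded row comes out slightly larger than the raw one.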

The original tux image was Flate encoded with /DecodeParms of <</Predictor 15/Columns 512>>

I am not a poppler maintainer, but I think that it should be possible to add an option to do Flate compression.
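
As a rough sketch of why Flate would help here, the snippet below compresses a synthetic solid-yellow bitmap with Python's zlib module, which produces the same zlib/deflate stream format that a /FlateDecode filter reads (a LanguageLevel 3 feature in PostScript). It does not use poppler. The 100-row buffer is an assumption chosen to keep the example small; only the row width and the pixel value come from the message above.

```python
import zlib

# 100 rows of 2549 solid-yellow RGB pixels, as interleaved bytes.
raw = bytes([255, 194, 14]) * 2549 * 100

# zlib.compress emits a zlib-wrapped deflate stream.
packed = zlib.compress(raw, 9)

print("raw bytes:   ", len(raw))
print("flate bytes: ", len(packed))

# Round trip to confirm nothing is lost.
assert zlib.decompress(packed) == raw
```

Deflate matches repeated byte sequences rather than only repeated single bytes, so the three-byte yellow pattern compresses about as well as solid white does.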

If you want to look at the code, open poppler/PSOutputDev.cc and search for occurrences of /RunLengthDecode

The "nothing" files are small because they paint the background by drawing a box instead of by copying a bitmapped image.

I think that when a PDF has several images on top of each other, pdftops needs to convert the entire area to a bitmap even if some of the parts were originally drawn with vector commands. The original images have a bitmapped tux over a vector background, but pdftops can't separate them and has to rasterize the entire page.

Regards,

William


To: [email protected]
From: [email protected]
Date: Tue, 26 Jan 2016 14:19:17 -0500
Subject: [poppler] pdftops creates huge file with simple color background (attached examples)

Hi poppler team,
I have an issue with pdftops version 0.39.0 with conversion of some
specific templates to postscript. I have created very simple use cases
so that you can understand the issue.

    pdftops tux-white.pdf
    pdftops tux-yellow.pdf
    ls -al *.ps
    -rw-r--r-- 1   2816703 Jan 26 11:53 tux-white.ps
    -rw-r--r-- 1  27576263 Jan 26 11:53 tux-yellow.ps

The size of the second PS is 27MB, but only the background color has
changed.  This seems related to the fact that there is an image on the
template, because if I remove the image, there is no significant size
difference:

    pdftops nothing-white.pdf
    pdftops nothing-yellow.pdf
    ls -al *.ps
    -rw-r--r-- 1     11129 Jan 26 10:34 nothing-white.ps
    -rw-r--r-- 1     11167 Jan 26 10:34 nothing-yellow.ps

Is this a known issue? Thanks!
Pierre-Luc

_______________________________________________ poppler mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/poppler
