Hmm, yes, RunLengthDecode doesn't seem to be the best filter for this
kind of image. From what I can see, it's not a very good compression
algorithm at all!
One interesting thing I found: if I pass my 27 MB file through ps2ps
(Ghostscript's ps2write device), I end up with a 1.7 MB file that uses
"/ASCII85Decode filter /LZWDecode filter". I don't know much about
these filters, but it would be really nice if that kind of
compression happened directly in poppler's pdftops.
I'd be willing to help if someone can help me figure it out. I see
poppler already has an LZWStream class. Would it simply be a matter of
plugging it in somewhere in PSOutputDev.cc, in place of or in addition
to RunLengthDecode?
Pierre-Luc
On 01/27/2016 01:55 PM, William Bader wrote:
tux-yellow and tux-white both convert to a 2549x3299 RGB bitmap that
is RunLength compressed and ASCII85 encoded.
The yellow file is larger than the white file because "255 194 14"
does not compress as well as "255 255 255".
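To illustrate the difference, here is a minimal Python sketch. It assumes a byte-oriented run-length scheme like the one defined for PostScript's RunLengthEncode filter: a length byte followed by either one byte to repeat or a run of literal bytes. The function name run_length_encode and the 1000-pixel samples are made up for illustration; this is not poppler's code.

```python
def run_length_encode(data: bytes) -> bytes:
    """Byte-wise run-length encoding in the style of RunLengthEncode."""
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        # Measure the run of identical bytes starting at i (at most 128).
        run = 1
        while i + run < n and run < 128 and data[i + run] == data[i]:
            run += 1
        if run >= 2:
            # Length byte 129..255: repeat the next byte (257 - length) times.
            out.append(257 - run)
            out.append(data[i])
            i += run
        else:
            # Gather literal bytes until a repeated pair starts or 128 are collected.
            start = i
            i += 1
            while (i < n and i - start < 128
                   and not (i + 1 < n and data[i + 1] == data[i])):
                i += 1
            # Length byte 0..127: copy the next (length + 1) bytes verbatim.
            out.append(i - start - 1)
            out += data[start:i]
    out.append(128)  # end-of-data marker
    return bytes(out)


# 1000 RGB pixels of each background color (sample size is arbitrary).
white = bytes([255, 255, 255]) * 1000
yellow = bytes([255, 194, 14]) * 1000

print(len(white), "->", len(run_length_encode(white)))
print(len(yellow), "->", len(run_length_encode(yellow)))
```

With this sketch, 3000 bytes of white shrink to 49 bytes, while 3000 bytes of yellow grow to 3025, because no two adjacent bytes in "255 194 14" are equal and every byte has to be stored as a literal.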
The original tux image was Flate encoded with /DecodeParms of
<</Predictor 15/Columns 512>>
I am not a poppler maintainer, but I think that it should be possible
to add an option to do Flate compression.
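As a rough illustration of what Flate could gain on this kind of data, the sketch below runs Python's zlib module (which produces the zlib/deflate format that /FlateDecode reads) over the repeating yellow pixel pattern and then ASCII85-encodes the result with base64.a85encode. The 1000-pixel sample is made up, and this says nothing about how such an option would actually be wired into PSOutputDev.cc. Note also that /FlateDecode requires PostScript LanguageLevel 3.

```python
import base64
import zlib

# 1000 RGB pixels of the yellow background (sample size is arbitrary).
yellow = bytes([255, 194, 14]) * 1000

# Deflate finds the repeating 3-byte pattern that byte-wise run-length
# encoding cannot exploit.
flate = zlib.compress(yellow, 9)

# ASCII85 turns every 4 binary bytes into 5 printable characters.
# The "~>" end marker used in PostScript streams is omitted here.
ascii85 = base64.a85encode(flate)

# Round-trip check: decoding must give back the original pixels.
assert zlib.decompress(base64.a85decode(ascii85)) == yellow

print("raw bytes:      ", len(yellow))
print("flate bytes:    ", len(flate))
print("flate + ASCII85:", len(ascii85))
```

A real page is not this uniform, so the gain on tux-yellow would be smaller than in this toy case, but it shows why a dictionary-based filter copes with a solid non-gray color where run-length encoding does not.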
If you want to look at the code, open poppler/PSOutputDev.cc and
search for occurrences of /RunLengthDecode.
The "nothing" files are small because they paint the background by
drawing a box instead of by copying a bitmapped image.
I think that when a PDF has several images on top of each other,
pdftops needs to convert the entire area to a bitmap even if some of
the parts were originally drawn with vector commands. The original
images have a bitmapped tux over a vector background, but pdftops
can't separate them and has to rasterize the entire page.
Regards,
William
To: [email protected]
From: [email protected]
Date: Tue, 26 Jan 2016 14:19:17 -0500
Subject: [poppler] pdftops creates huge file with simple color
background (attached examples)
Hi poppler team,
I have an issue with pdftops version 0.39.0 with conversion of some
specific templates to postscript. I have created very simple use cases
so that you can understand the issue.
pdftops tux-white.pdf
pdftops tux-yellow.pdf
ls -al *.ps
-rw-r--r-- 1 2816703 Jan 26 11:53 tux-white.ps
-rw-r--r-- 1 27576263 Jan 26 11:53 tux-yellow.ps
The size of the second PS is 27MB, but only the background color has
changed. This seems related to the fact that there is an image on the
template, because if I remove the image, there is no significant size
difference:
pdftops nothing-white.pdf
pdftops nothing-yellow.pdf
ls -al *.ps
-rw-r--r-- 1 11129 Jan 26 10:34 nothing-white.ps
-rw-r--r-- 1 11167 Jan 26 10:34 nothing-yellow.ps
Is this a known issue?
Thanks!
Pierre-Luc
_______________________________________________ poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler