OK, I've Googled this one till my brain hurts and got nothing... time to
seek the higher wisdom.
I get large PDF files from publishers to index, which I do by running
them through a few bash scripts and then working with the printed
output. I have found a way to do everything via bash, but lately the
file sizes are getting bigger and bigger (the latest was over 500 MB!)
and it takes forever to open and print these -- not to mention paging
through them if I need to find something.
The images are of no use to me, so the obvious way to shrink the files
would be to strip the images out entirely, but as far as I can tell
there is no simple way to remove all the images from a PDF at once
while keeping the text and page layout intact. Have I missed something obvious, or is
this really the case? If so, [insert profane expression of incredulity
here]!
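(For what it's worth, I've since read that newer Ghostscript releases
grew a -dFILTERIMAGE switch for exactly this -- dropping raster images
while rewriting the PDF with text and layout intact. I haven't been
able to confirm which version introduced it, so treat the version
requirement as an assumption on my part; the wrapper function is just
my own sketch:)

```shell
# Sketch: strip all raster images from a PDF using Ghostscript's
# pdfwrite device. -dFILTERIMAGE exists only in newer gs releases,
# so check your version first; the function name is my own.
strip_images() {
  local in="$1" out="$2"
  gs -q -dBATCH -dNOPAUSE \
     -sDEVICE=pdfwrite \
     -dFILTERIMAGE \
     -o "$out" "$in"
}

# e.g. strip_images big.pdf text-only.pdf
```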
The second-best option is to reduce the image quality to a bare
minimum, but so far the only way I've found to do this is on a
Windows system: open the file in Adobe Acrobat, go to the Print
dialog, turn the image quality right down, and print the whole thing
to a new PDF. It's a pain and it takes forever.
Any ideas? There are various suggestions on the web about using
ghostscript, imagemagick, ps2ps and so on, but everything I've tried
so far has made the resulting file larger instead of smaller.
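(For concreteness, this is roughly the invocation those suggestions
boil down to -- downsampling the images rather than removing them.
The flag names are from Ghostscript's pdfwrite documentation; the
36 dpi figure is an arbitrary "bare minimum" guess of mine, and the
function name is hypothetical:)

```shell
# Sketch: downsample every image in a PDF to ~36 dpi via Ghostscript.
# /screen is gs's lowest-quality preset; the explicit Downsample*
# flags override its defaults with an even cruder resolution.
shrink_images() {
  local in="$1" out="$2"
  gs -q -dBATCH -dNOPAUSE \
     -sDEVICE=pdfwrite \
     -dPDFSETTINGS=/screen \
     -dDownsampleColorImages=true -dColorImageResolution=36 \
     -dDownsampleGrayImages=true  -dGrayImageResolution=36 \
     -dDownsampleMonoImages=true  -dMonoImageResolution=36 \
     -o "$out" "$in"
}

# e.g. shrink_images big.pdf small.pdf
```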
I'm doing this quite often, so a bash script would be useful. I can also
probably make sense of Python, but anything beyond that might be a stretch.
Thanks in advance,
Jon.
--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html