I'm assuming you're using Perl here. There are a couple of reasonable-looking options on CPAN for reading and manipulating PDFs. I've done it myself before.
My recommendation is to write a program that reads the PDF in and dumps the contents out as text, or at least the basic structure. It's tedious, as some formatters place every single letter on the page individually. The point is to see what is on the blank pages that makes them look "not blank" to a computer. Probably the easiest way is to look at a sample PDF, note the page numbers of a few blank pages, and dump only those. If you can then easily identify whatever that is, you can drop those pages and write the PDF back out (or create a new PDF that includes only the non-blank pages, which is even better since it can work as a filter).

Michael

On Thu, Aug 28, 2014 at 2:14 AM, Paul Boniol <[email protected]> wrote:

> I've got a large PDF. The program that created it inserted a large number
> of blank pages (600 pages, best guess 1/4 are blank).
>
> Is there any way to print the pages with text and not print all the blank
> pages?
>
> Google turned up folks wanting to do it, and few answers. (Or folks
> complaining about printer driver/PPD issues adding blank pages. Or with a
> scanned PDF, which this isn't.)
>
> I tried Adobe Acrobat X preflight "remove empty pages" from the PDF (I
> found where and how with great difficulty). If I did everything correctly,
> and it appears I did, there must be something invisible on the pages, so
> Acrobat doesn't consider them blank and leaves them in...
>
> There was something loosely defined on making images of the pages and
> evaluating the contrast or something. But I'm not sure how to do that.
>
> Any ideas? My Google-fu has run out on this one.
>
> Paul
>
> --
> --
> You received this message because you are subscribed to the Google Groups
> "NLUG" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/nlug-talk?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "NLUG" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

--
Michael Darrin Chaney, Sr.
[email protected]
http://www.michaelchaney.com/
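
P.S. In case it helps, here is a rough sketch of the dump-and-inspect idea from the reply above. It is in Python rather than Perl purely for illustration, and it only covers the inspection step: it assumes you already have each page's decoded content stream as text, which a CPAN module (or any other PDF library) would have to supply. No particular library's API is shown. The names `tokenize`, `page_report`, and `looks_blank`, and the rule "blank means no painting operators and no non-whitespace text", are my own assumptions, not something taken from Acrobat or from the PDF in question.

```python
import re
from collections import Counter

# Content-stream operators that put marks on the page.
PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*", "sh", "Do", "BI"}
# Operators that show the string operands preceding them.
TEXT_OPS = {"Tj", "TJ", "'", '"'}
WHITESPACE = " \t\r\n\f\0"
DELIMS = "()<>[]{}/%"
NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def _end_of_word(stream, j):
    """Return the index just past the run of regular characters starting at j."""
    while j < len(stream) and stream[j] not in WHITESPACE + DELIMS:
        j += 1
    return j


def tokenize(stream):
    """Yield (kind, value) pairs; kind is 'str', 'op', or 'other'."""
    i, n = 0, len(stream)
    while i < n:
        c = stream[i]
        if c in WHITESPACE:
            i += 1
        elif c == "%":  # comment: skip to end of line
            while i < n and stream[i] not in "\r\n":
                i += 1
        elif c == "(":  # literal string; parentheses may nest
            depth, j, buf = 1, i + 1, []
            while j < n and depth:
                ch = stream[j]
                if ch == "\\":  # keep the escaped character as-is (not decoded)
                    buf.append(stream[j + 1:j + 2])
                    j += 2
                    continue
                depth += (ch == "(") - (ch == ")")
                if depth:
                    buf.append(ch)
                j += 1
            yield "str", "".join(buf)
            i = j
        elif stream.startswith(("<<", ">>"), i):  # dictionary delimiters
            yield "other", stream[i:i + 2]
            i += 2
        elif c == "<":  # hex string; raw digits are kept, not decoded
            j = stream.find(">", i)
            j = n if j < 0 else j
            yield "str", stream[i + 1:j]
            i = j + 1
        elif c in DELIMS:  # one of ) > [ ] { } or a /Name
            j = _end_of_word(stream, i + 1) if c == "/" else i + 1
            yield "other", stream[i:j]
            i = j
        else:  # number, keyword, or operator
            j = _end_of_word(stream, i)
            word = stream[i:j]
            is_operand = NUMBER.fullmatch(word) or word in ("true", "false", "null")
            yield ("other" if is_operand else "op"), word
            i = j


def page_report(stream):
    """Return (operator counts, text passed to text-showing operators)."""
    ops, shown, pending = Counter(), [], []
    for kind, value in tokenize(stream):
        if kind == "str":
            pending.append(value)
        elif kind == "op":
            ops[value] += 1
            if value in TEXT_OPS:
                shown.extend(pending)
            pending = []  # operands belong to the operator just seen
    return ops, "".join(shown)


def looks_blank(stream):
    """Assumed rule: nothing painted and only whitespace shown as text."""
    ops, text = page_report(stream)
    return not any(op in ops for op in PAINT_OPS) and not text.strip()


if __name__ == "__main__":
    # Hypothetical streams standing in for pages noted as blank by eye.
    samples = {12: "q BT /F1 10 Tf ( ) Tj ET Q", 13: "BT /F1 10 Tf (Total) Tj ET"}
    for number, stream in samples.items():
        ops, text = page_report(stream)
        print(number, looks_blank(stream), dict(ops), repr(text))
```

Run it over the handful of pages you flagged by eye first: the operator counts show what the "blank" ones actually contain. If the rule holds for the sample, the same test can decide which pages to leave out of the new PDF. The check deliberately errs on the side of keeping pages. A white filled rectangle, an image, or any hex string counts as content, and string escapes are not decoded.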
