Hi there,

I'm trying to validate arbitrary PDFs (potentially huge, 100s of MBs) according to the following rule set:

- Dimensions of all pages should be A4 (297 mm * 210 mm)
- There should be no content within a certain rectangular area of a page (left margin where the print shop inserts a bar code)
- Number of pages should be less than N
- PDF version used

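To make the first two rules concrete, here is a rough sketch of the per-page checks as plain arithmetic, with no PDF library involved. It assumes page sizes arrive in PDF default user space units (1/72 inch) and that rectangles are given as {x, y, width, height}. The class name `PageRules`, the tolerance parameter, and the decision to accept both portrait and landscape A4 are my own assumptions for illustration, not anything prescribed by a library.

```java
/** Hypothetical helper: plain-arithmetic sketch of two page rules. */
final class PageRules {
    static final double MM_PER_INCH = 25.4;
    static final double PT_PER_INCH = 72.0;   // PDF default user space unit
    static final double A4_SHORT_MM = 210.0;
    static final double A4_LONG_MM = 297.0;

    /** Converts points (1/72 inch) to millimetres. */
    static double ptToMm(double pt) {
        return pt * MM_PER_INCH / PT_PER_INCH;
    }

    /**
     * True if the page is A4 within the given tolerance.
     * Assumption: orientation is ignored, so landscape A4 also passes.
     */
    static boolean isA4(double widthPt, double heightPt, double toleranceMm) {
        double w = ptToMm(widthPt);
        double h = ptToMm(heightPt);
        double shortSide = Math.min(w, h);
        double longSide = Math.max(w, h);
        return Math.abs(shortSide - A4_SHORT_MM) <= toleranceMm
            && Math.abs(longSide - A4_LONG_MM) <= toleranceMm;
    }

    /**
     * True if two axis-aligned rectangles {x, y, width, height} overlap.
     * Rectangles that only touch at an edge do not count as overlapping.
     */
    static boolean overlaps(double[] a, double[] b) {
        return a[0] < b[0] + b[2] && b[0] < a[0] + a[2]
            && a[1] < b[1] + b[3] && b[1] < a[1] + a[3];
    }
}
```

The reserved bar code area would be one rectangle and the bounding box of each piece of page content the other; how to obtain those bounding boxes without loading the whole document is exactly the open part of the question.
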
So far we've been using PDDocument.load with a scratch file, but with huge documents (e.g. product catalogues), things explode. Is there a way to stream-parse a PDF, similar to stream-parsing an XML document (e.g. using StAX), and validate one page at a time?

Cheers,
Stefan

