The PDFBox version is the 2.0 trunk version.

For performance reasons, we render PDFs that consist of a single picture covering the whole page (scanned PDFs) ourselves, which is about 2 seconds faster. All other PDFs we render with PDFBox. To decide, we check several attributes of the page resources (ShadingNames, ExtGSNames, PatternNames, PropertiesNames, ColorSpaceNames) as well as the count and size of the pictures (larger than the MediaBox). We do not check the font names in the resources, because our pages carry invisible OCR text that is used for searching within the page.
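To make the check above concrete, here is a minimal sketch of the decision logic in plain Java. It is not our actual code and does not call PDFBox: `PageSummary`, `ScanHeuristic` and all field names are hypothetical, and a caller would have to fill the summary from the page resources. It assumes the listed resource name sets must be empty and that "larger than the MediaBox" means at least as large in both dimensions.

```java
import java.util.List;

// Hypothetical per-page summary; the names are illustrative and not part of PDFBox.
final class PageSummary {
    final double mediaBoxWidth;      // media box width in PDF units
    final double mediaBoxHeight;     // media box height in PDF units
    final List<double[]> imageSizes; // drawn size {width, height} of each picture, in PDF units
    final int otherResourceCount;    // shadings + ext. graphics states + patterns + properties + color spaces

    PageSummary(double mediaBoxWidth, double mediaBoxHeight,
                List<double[]> imageSizes, int otherResourceCount) {
        this.mediaBoxWidth = mediaBoxWidth;
        this.mediaBoxHeight = mediaBoxHeight;
        this.imageSizes = imageSizes;
        this.otherResourceCount = otherResourceCount;
    }
}

final class ScanHeuristic {
    // Returns true when the page holds exactly one picture that covers the media box
    // and none of the other checked resource types. Fonts are deliberately not considered.
    static boolean looksScanned(PageSummary page) {
        if (page.otherResourceCount > 0 || page.imageSizes.size() != 1) {
            return false;
        }
        double[] size = page.imageSizes.get(0);
        // Assumption: "larger than the MediaBox" is treated as >= in both dimensions.
        return size[0] >= page.mediaBoxWidth && size[1] >= page.mediaBoxHeight;
    }
}
```

For example, an A4 page (595 x 842 units) with one 595 x 842 picture and no other resources would be classified as scanned under these assumptions.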

Now we have a PDF with a page-filling background image and some text overlaid on it. We detect this page as a scanned page, so we render only the picture and the overlaid text is lost.

Is there a better way to detect that a PDF page is a scanned page with no additional text?

Regards, Manfred
