I remember that PDFBox had some way to read certain resources from a PDF and skip others. I don't remember how, but I think there is a way to skip parsing the images in a PDF.

Best regards,
Hesham
--------------------------------------------------
From: "Erik Scholtz, ArgonSoft GmbH" <[email protected]>
Sent: Wednesday, February 03, 2010 6:03 PM
To: <[email protected]>
Subject: Re: PDF contains any text?

Andreas,

Telling what a document contains without parsing its content sounds to me like you are looking for the PDDocument.oracle_of_delphi() method :)

But to answer your question: No. To find that out, you have to check the resources of each page for text resources. There is no central "resource_available" dictionary in PDF.
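
For illustration, the per-page check could be sketched roughly as follows. This is only a sketch under assumptions: it uses the PDFBox 2.x API (PDDocument.load, getPages, PDResources.getFontNames), which is newer than the release that was current when this thread was written, and the names FontResourceCheck and anyPageDeclaresFont are made up for the example. It still has to load the document, and it only looks at page-level resources, so fonts nested inside form XObjects are missed and a declared font does not prove that any text is actually drawn.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;

// Hypothetical helper, assuming PDFBox 2.x on the classpath.
class FontResourceCheck {

    /** Returns true if at least one page lists a font in its resource dictionary. */
    public static boolean anyPageDeclaresFont(PDDocument document) {
        for (PDPage page : document.getPages()) {
            // getResources() can return null when a page has no resource dictionary.
            PDResources resources = page.getResources();
            if (resources != null && resources.getFontNames().iterator().hasNext()) {
                return true; // stop at the first page that declares a font
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the PDF file to inspect.
        try (PDDocument document = PDDocument.load(new File(args[0]))) {
            System.out.println(anyPageDeclaresFont(document)
                    ? "at least one page declares a font"
                    : "no page declares a font");
        }
    }
}
```

A document for which this returns false is a good candidate for "images only"; one for which it returns true would still need its content streams inspected to be sure.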


Best regards,
Erik

Roeder, Andreas wrote:
Hi,

Is there a way to find out if a PDF contains any text without parsing the whole document?
Some PDFs contain just images.

Best Regards,

Andreas

