I remember that PDFBox had some way to read certain resources from a PDF
and skip others. I don't remember exactly how, but I think there is a way
to skip parsing the images in the PDF.
Best regards,
Hesham
--------------------------------------------------
From: "Erik Scholtz, ArgonSoft GmbH" <[email protected]>
Sent: Wednesday, February 03, 2010 6:03 PM
To: <[email protected]>
Subject: Re: PDF contains any text?
Andreas,
finding out what a document contains without parsing its content
sounds to me like you are looking for the PDDocument.oracle_of_delphi()
method :)
But to answer your question: no. To find out, you have to look at the
resources of each page and check whether they include text resources.
There is no "central resource_available dictionary" in PDF.
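As a rough sketch of that per-page check: the Java below is a simplified model and not the PDFBox API. It assumes each page is represented only by its resource dictionary, as a map from a category name ("Font", "XObject") to the named resources in that category. The class and method names are made up for illustration. With PDFBox you would get the resources from each page object instead. Note also that a font entry only says the page may draw text, and fonts can also be declared inside nested form XObjects, which this sketch does not follow.

```java
import java.util.List;
import java.util.Map;

// Simplified model, NOT the PDFBox API: a page is reduced to its resource
// dictionary, mapping a category ("Font", "XObject", ...) to the named
// resources in that category. All names here are hypothetical.
final class TextResourceCheck {

    /** True if the page's resource dictionary declares at least one font. */
    static boolean pageDeclaresFonts(Map<String, Map<String, String>> resources) {
        if (resources == null) {
            return false; // a page may have no resource dictionary at all
        }
        Map<String, String> fonts = resources.get("Font");
        return fonts != null && !fonts.isEmpty();
    }

    /** Walks the pages in order and stops at the first one declaring a font. */
    static boolean anyPageDeclaresFonts(List<Map<String, Map<String, String>>> pages) {
        for (Map<String, Map<String, String>> resources : pages) {
            if (pageDeclaresFonts(resources)) {
                return true; // no need to look at the remaining pages
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // A scanned page: only an image XObject, no fonts.
        Map<String, Map<String, String>> scanned =
                Map.of("XObject", Map.of("Im0", "Image"));
        // A page with a font and an image.
        Map<String, Map<String, String>> typed =
                Map.of("Font", Map.of("F1", "Helvetica"),
                       "XObject", Map.of("Im1", "Image"));

        System.out.println(anyPageDeclaresFonts(List.of(scanned)));        // false
        System.out.println(anyPageDeclaresFonts(List.of(scanned, typed))); // true
    }
}
```

The check only reads each page's resource dictionary and never touches the content streams, so it can stop at the first page that declares a font.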
Best regards,
Erik
Roeder, Andreas wrote:
Hi,
Is there a way to find out if a PDF contains any text without parsing the
whole document?
Some PDFs contain just images.
Best Regards,
Andreas