On Fri, 25 Jun 2004 19:36:11 +, [EMAIL PROTECTED]
<[EMAIL PROTECTED]> wrote:
>
> Is there any way to get PHP to simply read the PDF file for text only--just the
> surface of it, just the words, as if it were a human reading the PDF itself--and not
> for the internal code of the file?
I do t
Steve,
You have to convert the file to PostScript before you can read anything out of it,
and even then, words are often split across several separate "show"
statements.
Use pdf2ps (part of GNU Ghostscript) to convert the PDF to PS, and then search for
patterns like this:
(text) show
that is the most bas
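
Here is a rough sketch of that pattern search in PHP. It assumes the PDF has already been converted (for example with `pdf2ps input.pdf output.ps`) and that the resulting PostScript really does contain plain `(text) show` operators, which depends on the Ghostscript version and on the PDF itself. The function name `extract_show_strings` and the sample fragment are made up for illustration.

```php
<?php
// Hypothetical helper: collect the string operands of "show" operators
// from a chunk of PostScript source.
function extract_show_strings($ps)
{
    // Match "(...) show", allowing backslash escapes inside the parentheses.
    // Nested unescaped parentheses and octal escapes are NOT handled.
    $pattern = '/\(((?:\\\\.|[^\\\\()])*)\)\s*show\b/s';
    preg_match_all($pattern, $ps, $matches);

    // Undo the \( \) and \\ escapes in each captured string.
    return array_map(function ($s) {
        return preg_replace('/\\\\([()\\\\])/', '$1', $s);
    }, $matches[1]);
}

// Sample PostScript fragment (invented for this example).
// For a real file, something like: $ps = file_get_contents('output.ps');
$ps = "(Hel) show\n(lo, world) show\n(a \\(b\\)) show\n";
print_r(extract_show_strings($ps));
```

Because words can be split across several show statements, the returned pieces usually still need to be joined back together, and deciding where the spaces go is guesswork without looking at the positioning operators between them.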