[htdig] PDF indexing problem

Justin Hopkins Mon, 07 Aug 2000 14:22:49 -0700
Hello,

I'm trying to index a dozen or so PDF files on my intranet,
and both parse_doc.pl (with xpdf) and acroread (versions 3 and 4)
choke on the .pdf files.

When acroread chokes, it gives me several of these
sorts of errors: 

PDF::parse: cannot open acroread output from
http://omniweb/resmis/docs/PMSs/libica/userguide/LTCONFIG.pdf

When parse_doc.pl chokes, it gives several of these:
sh: /usr/local/bin/parse_doc.pl: No such file or directory

The URLs are valid and the files do exist. The PDFs open
fine in both IE and separately under acroread 3. I've
checked and rechecked the variables inside parse_doc.pl
to make sure they point to the correct translators.
All the files have appropriate execute and read permissions.

When I tell htdig to use acrobat as the parser, this
is what the relevant htdig.conf line looks like:

pdf_parser: /usr/local/Acrobat3/bin/acroread -toPostScript -pairs

When I tell htdig to use parse_doc.pl as the parser, this
is what the relevant htdig.conf line looks like:

external_parsers: "application/pdf" "/usr/local/bin/parse_doc.pl"

(Naturally, I comment out one or the other depending on
which parser I'm testing.)
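
As a rough sketch, with acroread as the active parser the pair of
lines would look something like this (the two directives are the ones
quoted above; the # comment lines are added here for illustration only
and are not copied from my actual htdig.conf):

# Active: hand PDF files to acroread
pdf_parser: /usr/local/Acrobat3/bin/acroread -toPostScript -pairs

# Inactive while acroread is being tested
#external_parsers: "application/pdf" "/usr/local/bin/parse_doc.pl"

To test parse_doc.pl instead, I swap which of the two directives
carries the leading #.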

Any thoughts on where I should look or what the problem could be?
Thanks,
Justin Hopkins
