Dick,

 

Have you ‘told’ htdig what to do with pdf files in your config file? Htdig does not automatically index files other than HTML. In our case, we have these lines in our config file:

 

external_parsers:    application/msword /usr/local/bin/parse_doc.pl \
                     application/pdf /usr/local/bin/parse_pdf.pl

 

Make sure to set the paths right.
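
If it helps, here is a minimal sketch of a path check, assuming the config uses the `attribute: value` form with backslash line continuations shown above. The function names `external_parser_paths` and `missing_parsers` are made up for illustration and are not part of htdig; the script only reads the config text and checks that each listed parser program exists and is executable.

```python
import os
import re


def external_parser_paths(config_text):
    """Return {content_type: program_path} from the external_parsers attribute."""
    # Join lines that end with a backslash continuation into one logical line.
    joined = re.sub(r"\\\s*\n", " ", config_text)
    for line in joined.splitlines():
        line = line.strip()
        if line.startswith("external_parsers:"):
            tokens = line.split(":", 1)[1].split()
            # Tokens alternate: content type, then parser program.
            return dict(zip(tokens[0::2], tokens[1::2]))
    return {}


def missing_parsers(config_text):
    """Return the parser paths that are not executable files on this machine."""
    return [
        path
        for path in external_parser_paths(config_text).values()
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    ]


if __name__ == "__main__":
    # Sample text copied from the lines above; replace with your own config.
    sample = (
        "external_parsers:    application/msword /usr/local/bin/parse_doc.pl \\\n"
        "                     application/pdf /usr/local/bin/parse_pdf.pl\n"
    )
    for path in missing_parsers(sample):
        print("not found or not executable:", path)
```

Any path it prints is one to correct in the config file before running htdig again.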

 

Hope this helps,

 

Marco Houtman

 


From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Richard Peskin
Sent: Wednesday, 9 February 2005 22:36
To: [email protected]
Subject: [htdig] indexing words in a pdf file

 

Is there any special setup needed to index words in a pdf file? I have pdfinfo and pdftotext in the Htdig/bin directory, and Htdig/bin/conv_doc.pl correctly points to pdfinfo and pdftotext. While htsearch will locate a pdf file that is a URL (i.e. an href pointing to that pdf file), I am not seeing words inside that pdf file indexed.
Any help is appreciated.
--dick peskin




____________________________________
Richard L. Peskin, RLP Consulting, Londonderry, VT
http://www.rlpcon.com
http://www.caip.rutgers.edu/~peskin
