Hi, I'm just starting to use htdig (3.1.6) to index our new company intranet and it works (it's brilliant, in fact, but enough crawling)!
At the moment I'm using conv_doc.pl with catdoc, pdftotext and pdfinfo as external parsers, but I would like to extend the number of document types I can handle. I downloaded doc2html and read the docs, and now I'm confused (too much choice). Can anyone recommend a parser set that works? My priorities are Word 2000, PDF, Excel, PowerPoint and Flash (with Flash very low on my list).

Thanks,
Steve.

