According to Pietro Palladino:
> I'm an Italian engineer (so sorry for my English) :-) and I'm evaluating
> htdig to use it on the website of University of Naples...It's a very good
> search engine, but I had some problems...I succeeded in indexing .doc, .rtf,
> .pdf, .ps and .ppt files, but I couldn't index .xls files.
> Actually I'm using RedHat 7.1 for the testing. These are the options that I
> inserted in my htdig.conf file:
>
> external_parsers: application/rtf->text/html /usr/local/scripts/doc2html.pl \
> text/rtf->text/html /usr/local/scripts/doc2html.pl \
> application/pdf->text/html /usr/local/scripts/doc2html.pl \
> application/postscript->text/html /usr/local/scripts/doc2html.pl \
> application/msword->text/html /usr/local/scripts/doc2html.pl \
> application/msexcel->text/html /usr/local/scripts/doc2html.pl \
> application/vnd.ms-excel->text/html /usr/local/scripts/doc2html.pl \
> application/vnd.ms-powerpoint->text/html /usr/local/scripts/doc2html.pl
>
> I installed xlHtml-0.2.6-2 as an excel parser. In this package there's a ppt
> parser too (pptHtml). The thing I can't understand is that the pptHtml works
> fine (when it's called from doc2html.pl) but xlHtml doesn't work :-(((.
> I tested it from the command line and it works great....but I don't know why
> it doesn't work when called from doc2html.pl. In this Perl script, the lines
> concerning the parsing are almost the same for both .ppt and .xls files....
Do you know for certain that your web server returns a Content-Type header
of application/vnd.ms-excel for a .xls file? htdig chooses the external
parser by that header, so if the server sends a type that is not listed in
external_parsers, doc2html.pl will never be called for the file. You can
also try running

    /usr/local/scripts/doc2html.pl /path/to/some/file.xls \
        application/vnd.ms-excel http://myhost/some/file.xls /path/to/htdig.conf

from the command line to see whether doc2html.pl properly spits out the text
from your spreadsheet, in the form of an HTML file.
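
To check the first point, here is a minimal sketch of a shell helper that
pulls the Content-Type out of a set of HTTP response headers. It assumes
curl and awk are installed; the function name content_type is made up for
this example, and myhost is the same placeholder host as above.

```shell
#!/bin/sh
# content_type (hypothetical helper): read HTTP response headers on stdin
# and print the Content-Type value, lowercased, with any parameters such as
# "; charset=..." stripped off.
content_type() {
    tr -d '\r' | awk -F': *' '
        tolower($1) == "content-type" {
            sub(/;.*/, "", $2)   # drop "; charset=..." and similar
            print tolower($2)
            exit
        }'
}

# Usage (needs network access, so it is left commented out here):
#   curl -sI http://myhost/some/file.xls | content_type
```

If the type printed is not one of those on the left-hand side of your
external_parsers entries, the fix is probably on the web server side (its
MIME type mapping for .xls) rather than in doc2html.pl.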
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930