According to Darren Zamrykut:
> Htdig 3.1.5 is installed on our SuSE 7.0 Server running as a guest under
> VM operating system.
> The indexing has been great up until a couple of weeks ago.
> Htdig follows the links to Word document files on file.html.
> It now randomly says that a certain Word document cannot be found, but
> when I click on the link to this document, it opens up. Usually
> every day, htdig reports that one or two documents are not found and
> consequently the word database shrinks.

I would guess that htdig is most likely placing too many demands on your
web server, which can't keep up and so fails on the occasional request.
Try setting server_wait_time in your config file so that htdig pauses
between requests. See http://www.htdig.org/attrs.html

> When I run htdig from commandline with same options that I have been
> using since I first put htdig into production, the robots text file
> prevents the index b/c the directories are disallowed.
>
> /opt/www/htdig/bin/htdig -u username:password -s -c /opt/www/htdig/conf/htdig.conf
>
> I understand the concept behind the robots.txt file, but why now does it
> not allow indexing when before it allowed access. Nothing has changed
> in the robots.txt file or with the web server.

That's a strange one. It almost seems like it was failing to fetch the
robots.txt file before.

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

