I am still trying to get htdig to index my documents, but without success.
I am fairly confident that at least the programs executed by rundig are using htdig.conf, because when I change start_url I see the changes in the verbose messages.
I tried the specific recommendation in FAQ 5.25, which explains how to create a shell script that generates a file of URLs. I did this and got a list of all the HTML documents in the directory I wanted to index, but when I executed the script from the start_url directive, the output reported every URL in the list as not found and no word list was built.
Sample output:
----------------------------------------------------------------------------------
New server: web2.forefrontnet.com, 80
New server: , 0
Unknown host: 0/robots.txt
0:0:0:http://web2.forefrontnet.com/alhambra.html: not found
1:1:0:http://web2.forefrontnet.com/comm/minaf2000-01-18.html: not found
2:2:0:http://web2.forefrontnet.com/comm/minaf2000-01-31.html: not found
3:3:0:http://web2.forefrontnet.com/comm/minaf2000-02-24.html: not found
4:4:0:http://web2.forefrontnet.com/comm/minaf2000-03-27.html: not found
5:5:0:http://web2.forefrontnet.com/comm/minaf2000-04-18.html: not found
6:6:0:http://web2.forefrontnet.com/comm/minaf2000-04-24.html: not found
7:7:0:http://web2.forefrontnet.com/comm/minaf2000-04-27.html: not found
8:8:0:http://web2.forefrontnet.com/comm/minaf2000-05-23.html: not found
9:9:0:http://web2.forefrontnet.com/comm/minaf2000-05-24.html: not found
----------------------------------------------------------------------------------
Sample of text that was acted on:
----------------------------------------------------------------------------------
http://web2.forefrontnet.com//alhambra.html
http://web2.forefrontnet.com//comm/minaf2000-01-18.html
http://web2.forefrontnet.com//comm/minaf2000-01-31.html
http://web2.forefrontnet.com//comm/minaf2000-02-24.html
http://web2.forefrontnet.com//comm/minaf2000-03-27.html
http://web2.forefrontnet.com//comm/minaf2000-04-18.html
http://web2.forefrontnet.com//comm/minaf2000-04-24.html
http://web2.forefrontnet.com//comm/minaf2000-04-27.html
http://web2.forefrontnet.com//comm/minaf2000-05-23.html
http://web2.forefrontnet.com//comm/minaf2000-05-24.html
----------------------------------------------------------------------------------
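For reference, here is a minimal sketch of the kind of URL-list script described above. It is not the script from FAQ 5.25 or the one actually used here; the function name make_url_list and the output file name in the usage note are hypothetical. It assumes the documents live under a local document root and strips trailing slashes from both arguments, so the generated URLs contain a single slash after the host rather than the double slash seen in the list above. Whether that double slash is related to the "not found" results is not established.

```shell
#!/bin/sh
# make_url_list DOCROOT BASEURL   (hypothetical helper, for illustration only)
# Prints one URL per *.html file found under DOCROOT, sorted by path.
make_url_list() {
    docroot=${1%/}     # drop one trailing slash, if present
    baseurl=${2%/}     # same for the base URL, to avoid "//" in the output
    find "$docroot" -type f -name '*.html' | sort |
    while IFS= read -r path; do
        rel=${path#"$docroot"/}             # path relative to the document root
        printf '%s/%s\n' "$baseurl" "$rel"  # one URL per line, no blank lines
    done
}
```

A possible invocation, with an assumed output path: make_url_list /var/www/html http://web2.forefrontnet.com > /tmp/urls.txt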
I even removed the start_url directive entirely. That generated a large word list, but from nothing in my site beyond the initial directory /var/www/html/. Also, the words it does find are not reachable from the search HTML page, although results can be seen by running htsearch from the command line.
Once again, any suggestions would be greatly appreciated.
Don Griffey

