Hello, all.

We have just started putting PHP pages on our site, but HtDig won't index them. In the config file, php is listed in valid_extensions and is not in bad_extensions. I ran htdig -vvv -c [the config file] and found this error for the only PHP file there:
---------------------------
href: http://[rejectedfileURL].php


   Rejected: forbidden by server robots.txt!
url rejected: (level 1)http://[rejectedfileURL].php
---------------------------
But our robots.txt looks like this:
---------------------------
User-agent: *
Disallow: /somedir
Disallow: /someotherdir
---------------------------

and so on, and the directory where that PHP file lives is NOT one of the disallowed directories. I'm running HtDig 3.2.0b4, Apache 1.3.X, RH 7.3. Apache happily serves the file to a web browser, so it should serve it to HtDig as well. Is there a patch for this erroneous robots.txt handling? Thanks for any help on this.
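
One way to sanity-check the rules independently of HtDig is to feed the same robots.txt to a different parser. The sketch below uses Python's standard urllib.robotparser. The host www.example.com, the sample paths, and the agent name "htdig" are placeholders for illustration, not values from our site, and the result only shows how that parser reads the rules, not how HtDig reads them.

```python
from urllib.robotparser import RobotFileParser

# The rules as quoted above (the real file has more Disallow lines).
ROBOTS_TXT = """\
User-agent: *
Disallow: /somedir
Disallow: /someotherdir
"""


def is_allowed(url, agent="htdig"):
    """Return True if `agent` may fetch `url` under ROBOTS_TXT.

    The agent name is an assumption; with "User-agent: *" any name
    is matched by the same rule set.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to show the matching behaviour.
    for path in ("/pages/info.php",
                 "/somedir/info.php",
                 "/somedirectory/info.php"):
        print(path, is_allowed("http://www.example.com" + path))
```

Disallow values are matched as path prefixes, so "Disallow: /somedir" also covers a path such as /somedirectory/info.php. If the rejected URL starts with the same characters as one of the disallowed entries, the rejection may be correct rather than a bug.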

J. Dudley



