We have just started putting PHP pages on our site, but HtDig won't index them. In the config file, I have php listed as one of the valid_extensions. It is not in the bad_extensions list. I ran htdig -vvv -c [the config file], and found this error for the only PHP file there:
---------------------------
href: http://[rejectedfileURL].php
Rejected: forbidden by server robots.txt!
url rejected: (level 1)http://[rejectedfileURL].php
---------------------------
But, our robots.txt looks like:
---------------------------
User-agent: *
Disallow: /somedir
Disallow: /someotherdir
---------------------------
etc., and the directory where that PHP file lives is NOT one of the disallowed directories. We are running HtDig 3.2.0b4, Apache 1.3.X, RH 7.3. Apache happily serves up the file to a web browser, so it should be serving it to HtDig as well. Is there a patch for this apparent misreading of robots.txt? Thanks for any help on this.
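One way to cross-check the rules independently of HtDig is to run them through a separate robots.txt parser. The following is only a minimal sketch using Python's standard urllib.robotparser. The host example.com, the paths, and the helper name is_allowed are placeholders for illustration, and the "htdig" agent name is an assumption about what the crawler identifies itself as. It does not reproduce HtDig's own parsing. It only shows how a standard prefix-matching parser reads the same rules.

```python
from urllib.robotparser import RobotFileParser

# Rules as quoted in the post (the real file has more Disallow lines).
ROBOTS_TXT = """\
User-agent: *
Disallow: /somedir
Disallow: /someotherdir
"""


def is_allowed(robots_txt: str, url: str, agent: str = "htdig") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    `agent` defaults to "htdig", which is an assumption for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Placeholder URLs; substitute the real rejected URL.
    for url in (
        "http://example.com/pages/test.php",
        "http://example.com/somedir/test.php",
        "http://example.com/somedirectory/test.php",
    ):
        print(url, "->", "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

Note that Disallow values are matched as path prefixes, so "Disallow: /somedir" also blocks a path such as /somedirectory/test.php. If the rejected file lives under a directory whose name merely begins with a disallowed string, the rejection would be expected behavior rather than a bug.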
J. Dudley

