Hi Chetan,
Depending on which plugin you use for fetching HTTP content, one of the
following classes contains the robots.txt parsing code:

src/plugin/protocol-http
org.apache.nutch.protocol.http.RobotRulesParser
or 
src/plugin/protocol-httpclient
org.apache.nutch.protocol.httpclient.RobotRulesParser

Just look for usages of the RobotRulesParser class in the plugin to see
how it affects fetching.
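
To give a rough idea of the kind of check such a parser performs, here is
a minimal sketch in Java. It is not the Nutch code: the class name
SimpleRobotRules and its methods are made up for this example, it only
understands User-agent and Disallow lines with plain prefix matching, and
it merges the rules of every group that matches the agent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Nutch.
class SimpleRobotRules {
    // Path prefixes the given agent must not fetch.
    private final List<String> disallowed = new ArrayList<>();

    // Collect Disallow prefixes from every group whose User-agent line
    // is "*" or is contained in agentName (case-insensitive).
    static SimpleRobotRules parse(String robotsTxt, String agentName) {
        SimpleRobotRules rules = new SimpleRobotRules();
        String agent = agentName.toLowerCase(Locale.ROOT);
        boolean applies = false;       // does the current group apply to us?
        boolean inAgentLines = false;  // are we in a run of User-agent lines?
        for (String raw : robotsTxt.split("\\r?\\n")) {
            int hash = raw.indexOf('#');  // strip trailing comments
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;  // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                String v = value.toLowerCase(Locale.ROOT);
                boolean match = v.equals("*") || (!v.isEmpty() && agent.contains(v));
                // Consecutive User-agent lines share one group.
                applies = inAgentLines ? (applies || match) : match;
                inAgentLines = true;
            } else {
                inAgentLines = false;
                // An empty Disallow value means "nothing is disallowed".
                if (field.equals("disallow") && applies && !value.isEmpty()) {
                    rules.disallowed.add(value);
                }
            }
        }
        return rules;
    }

    // A path is allowed unless it starts with a collected Disallow prefix.
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String robotsTxt = "User-agent: *\nDisallow: /private/\n";
        SimpleRobotRules rules = SimpleRobotRules.parse(robotsTxt, "Nutch");
        System.out.println(rules.isAllowed("/index.html"));
        System.out.println(rules.isAllowed("/private/page.html"));
    }
}
```

A fetcher would call something like isAllowed() on the URL path before
requesting a page, which is roughly the role RobotRulesParser plays in the
plugins above.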
Regards,
Piotr

On 7/25/05, Chetan Sahasrabudhe <[EMAIL PROTECTED]> wrote:
> Hello,
> 
>     I am trying to locate code that verifies the crawl process against 
> robots.txt file.
> Any pointers?
> 
> Regards
> Chetan


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
