Hi Chetan,

Depending on which plugin you use to fetch HTTP content, one of the following classes contains the robots.txt parsing code:

src/plugin/protocol-http
org.apache.nutch.protocol.http.RobotRulesParser

or

src/plugin/protocol-httpclient
org.apache.nutch.protocol.httpclient.RobotRulesParser

Just look for usages of the RobotRulesParser class in the plugin to see how it affects fetching.

Regards,
Piotr

On 7/25/05, Chetan Sahasrabudhe <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I am trying to locate code that verifies the crawl process against
> robots.txt file.
> Any pointers?
>
> Regards
> Chetan

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
