Hi,

robots.txt is periodically rechecked, and a previously denied URL should be 
retried when its refetch time comes around.  If the robots.txt rules no longer 
deny access to it, it will be fetched.
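To illustrate the general idea (just a sketch, not Nutch's actual code -- the
class name, the 1-day recheck interval, and fetchDisallowedPrefixes() are all
made up for illustration):

import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: keep the parsed robots.txt rules per host with a
// timestamp, and re-fetch them once they are older than a recheck interval.
// A URL denied under the old rules is then re-evaluated against the fresh
// rules and fetched if it is no longer disallowed.
public class RobotsRecheckSketch {

    private static final long RECHECK_INTERVAL_MS = 24L * 60 * 60 * 1000; // assumed: 1 day

    private static class Rules {
        final List<String> disallowedPrefixes; // path prefixes from "Disallow:" lines
        final long fetchedAt;
        Rules(List<String> disallowedPrefixes, long fetchedAt) {
            this.disallowedPrefixes = disallowedPrefixes;
            this.fetchedAt = fetchedAt;
        }
    }

    private final Map<String, Rules> rulesByHost = new HashMap<>();

    public boolean isAllowed(String url) {
        URI uri = URI.create(url);
        String host = uri.getHost();
        Rules rules = rulesByHost.get(host);
        long now = System.currentTimeMillis();
        if (rules == null || now - rules.fetchedAt > RECHECK_INTERVAL_MS) {
            // Rules are missing or stale: download and parse robots.txt again.
            rules = new Rules(fetchDisallowedPrefixes(host), now);
            rulesByHost.put(host, rules);
        }
        String path = uri.getPath() == null ? "/" : uri.getPath();
        for (String prefix : rules.disallowedPrefixes) {
            if (path.startsWith(prefix)) {
                return false; // still denied under the current rules
            }
        }
        return true; // no longer denied, so the URL can be fetched
    }

    // Placeholder for fetching http://<host>/robots.txt and collecting the
    // "Disallow:" path prefixes for our user agent.
    private List<String> fetchDisallowedPrefixes(String host) {
        return List.of();
    }
}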

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Saurabh Suman <[email protected]>
> To: [email protected]
> Sent: Thursday, July 30, 2009 11:29:28 PM
> Subject: denied by robots.txt rules
> 
> 
> Hi 
> if a URL is denied once by robots.txt rules, is it crawled again by Nutch?
> 
> -- 
> View this message in context: 
> http://www.nabble.com/denied-by-robots.txt-rules-tp24750517p24750517.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
