Hi,

robots.txt is periodically rechecked, and a previously denied URL should be 
retried when its refetch time comes around.  If the robots.txt rules no longer 
deny access to it, it will be fetched.
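To illustrate the general idea (just a sketch, not Nutch's actual code -- the
class name, the 1-day recheck interval, and fetchDisallowedPrefixes() are all
made up for illustration):

import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: keep the parsed robots.txt rules per host with a
// timestamp, and re-fetch them once they are older than a recheck interval.
// A URL denied under the old rules is then re-evaluated against the fresh
// rules and fetched if it is no longer disallowed.
public class RobotsRecheckSketch {

    private static final long RECHECK_INTERVAL_MS = 24L * 60 * 60 * 1000; // assumed: 1 day

    private static class Rules {
        final List<String> disallowedPrefixes; // path prefixes from "Disallow:" lines
        final long fetchedAt;
        Rules(List<String> disallowedPrefixes, long fetchedAt) {
            this.disallowedPrefixes = disallowedPrefixes;
            this.fetchedAt = fetchedAt;
        }
    }

    private final Map<String, Rules> rulesByHost = new HashMap<>();

    public boolean isAllowed(String url) {
        URI uri = URI.create(url);
        String host = uri.getHost();
        Rules rules = rulesByHost.get(host);
        long now = System.currentTimeMillis();
        if (rules == null || now - rules.fetchedAt > RECHECK_INTERVAL_MS) {
            // Rules are missing or stale: download and parse robots.txt again.
            rules = new Rules(fetchDisallowedPrefixes(host), now);
            rulesByHost.put(host, rules);
        }
        String path = uri.getPath() == null ? "/" : uri.getPath();
        for (String prefix : rules.disallowedPrefixes) {
            if (path.startsWith(prefix)) {
                return false; // still denied under the current rules
            }
        }
        return true; // no longer denied, so the URL can be fetched
    }

    // Placeholder for fetching http://<host>/robots.txt and collecting the
    // "Disallow:" path prefixes for our user agent.
    private List<String> fetchDisallowedPrefixes(String host) {
        return List.of();
    }
}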

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Saurabh Suman <[email protected]>
> To: [email protected]
> Sent: Thursday, July 30, 2009 11:29:28 PM
> Subject: denied by robots.txt rules
> 
> 
> Hi 
> if a URL is denied once by robots.txt rules, is it crawled again by Nutch?
> 
> -- 
> View this message in context: 
> http://www.nabble.com/denied-by-robots.txt-rules-tp24750517p24750517.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
