Hi everyone,

I would like to know how Nutch updates a page after a recrawl. For example,
suppose a page was fetched successfully and stored in the DB or file system
by the last crawl command, but it returns 404 when the same page is
recrawled. Will Nutch use the information from this 404 response to update
the information stored for the earlier successful fetch? What about other
cases, such as 301, 302, or 503?

Thanks in advance.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/What-is-the-Nutch-page-update-mechanism-after-recrawl-tp4002366.html
Sent from the Nutch - User mailing list archive at Nabble.com.
