In a previous discussion about failure handling in Nutch, it was
mentioned that a broken segment cannot be fixed and its URLs should be
re-crawled.
It therefore seems there should be a way to control segment size, so that
one can limit the risk of having to re-crawl a huge number of URLs if only
one of them fails.

Is there an existing way in Nutch to do this?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Is-it-possible-to-control-the-segment-size-tp3970452.html
Sent from the Nutch - User mailing list archive at Nabble.com.
