Hello All,

I've been looking around for a way to *safely* stop a crawl on Nutch 1.6.
So far the only suggestions I've seen are to kill the Hadoop job or just
hit Ctrl+C. However, when I do this I oftentimes end up with corrupt
segments that won't index to Solr, which is, of course, not ideal. Is
there a proper solution to this (besides updating to Nutch 2.x, which is
not an option here)?

If not, are there any known workarounds? Would it suffice to catch the
keyboard interrupt and delete the last segment - are there any issues
with this (besides losing that segment's data)? Something along the
lines of the sketch below is what I had in mind. Can anyone think of a
more elegant solution?

Thanks!

Alex
