Hi,

For reference, ideally you should fetch many smaller segments. That avoids a lot of problems: if you have to stop a crawl, you only throw away the small segment that is currently being fetched rather than a huge one. It sounds brutal, but I would just kill it. You lose one segment... hopefully. A rough sketch of the kind of loop I mean is below.
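Something like this, untested, assuming a local (non-distributed) crawl with the usual crawl/crawldb and crawl/segments layout and bin/nutch on your PATH; the -topN value and the 10-round loop are just placeholders:

#!/bin/bash
# Rough sketch, untested. Assumes a local (non-distributed) crawl with
# the usual layout: crawl/crawldb and crawl/segments, bin/nutch on PATH.
# The idea: generate small segments (-topN) so that killing the crawl
# only costs you the segment that is currently being fetched.

CRAWLDB=crawl/crawldb
SEGMENTS=crawl/segments

for round in $(seq 1 10); do     # number of rounds is just a placeholder
  # generate a small segment; tune -topN to whatever "small" means for you
  bin/nutch generate $CRAWLDB $SEGMENTS -topN 1000

  # segment dirs are timestamped, so the newest one is the one just generated
  SEGMENT=$(ls -d $SEGMENTS/2* | tail -1)

  # if we get ctrl+c'd or killed mid-fetch/parse, delete the half-finished
  # segment so it never reaches invertlinks/solrindex
  trap "rm -r $SEGMENT; exit 1" INT TERM

  bin/nutch fetch $SEGMENT
  bin/nutch parse $SEGMENT
  bin/nutch updatedb $CRAWLDB $SEGMENT

  # segment is complete, stop protecting it
  trap - INT TERM
done

If the fetch is running as a Hadoop job you would still have to kill the job itself as well, and remove the segment with the hadoop fs equivalent of rm -r rather than the local command.

Lewis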
On Tue, Apr 30, 2013 at 4:20 PM, AC Nutch <[email protected]> wrote:

> Hello All,
>
> I've been looking around for a way to *safely* stop a crawl on Nutch 1.6.
> So far the only suggestion I see is to kill the hadoop job or just ctrl+c.
> However, when I do this I oftentimes end up with corrupt segments that
> won't index to Solr, which is, of course, not ideal. Is there any kind of a
> proper solution to this (besides just updating to Nutch 2.x - not an option
> here)?
>
> If not, are there any known workarounds? Would it suffice to catch the
> keyboard interrupt and delete the last segment - are there any issues with
> this (besides losing that segment's data)? Can anyone think of a more
> elegant solution?
>
> Thanks!
>
> Alex

--
*Lewis*

