Paul Tomblin wrote:
My Nutch crawl just stopped.  The process is still there and doesn't
respond to a "kill -TERM" or a "kill -HUP", but it hasn't written
anything to the log file in the last 40 minutes.  The last thing it
logged was some calls to my custom URL filter.  Nothing has been
written to the hadoop directory, the crawldir/crawldb or the
segments dir in that time.

How can I tell what's going on and why it's stopped?

If you run in distributed / pseudo-distributed mode, you can check the job status in the JobTracker UI. If you are running in "local" mode, then it's likely that the process is in a (single) reduce phase, sorting the data - with larger jobs in "local" mode the sort phase can take a very long time because of heavy disk I/O, and a process stuck in disk-wait is uninterruptible, which is why your kill signals have no visible effect.
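One quick way to confirm the disk-wait theory (assuming a Linux box; <nutch-pid> below is just a placeholder for your process id) is to look at the process state:

  ps -o pid,stat,wchan,cmd -p <nutch-pid>

A STAT of "D" means uninterruptible sleep, usually waiting on disk I/O. In pseudo-distributed mode the JobTracker UI is typically at http://localhost:50030/ unless you changed the port.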

Try to generate a thread dump to see what code is being executed.
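For a JVM process the simplest ways to get one (again, <nutch-pid> is a placeholder) are:

  kill -QUIT <nutch-pid>   # JVM dumps all thread stacks to its stdout / console log
  jstack <nutch-pid>       # same information printed in your terminal, if your JDK ships it

If all the worker threads are sitting in sort/merge or in your URL filter code, that tells you where the time is going.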

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com