Launching a segread/readdb command kills any running nutch commands
-------------------------------------------------------------------
Key: NUTCH-252
URL: http://issues.apache.org/jira/browse/NUTCH-252
Project: Nutch
Type: Bug
Versions: 0.8-dev
Environment: multi-box installation using DFS (1 jobtracker/namenode master,
10 tasktracker/datanode slaves)
Reporter: Chris Schneider
Priority: Minor

I use a simple script to conduct a whole-web crawl (generate, fetch, updatedb,
repeated until the target depth is reached). While this is running, I monitor
progress via the jobtracker's browser-based UI. Sometimes there's a fairly long
pause between one mapreduce job completing and the next one launching, so I
mistakenly assume that the target depth has been reached. I then launch a
segread -list or readdb -stats command to summarize the results. Doing so
apparently kills any active jobs, with no warning in the logs, the console
output, or the jobtracker's UI. The jobs just stop writing to the logs, and any
child processes disappear. Usually, the jobtracker and tasktrackers remain up
and respond to subsequent commands.
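
For reference, the kind of crawl loop described above might look roughly like
the sketch below. This is not the actual script, which is not shown in this
report. The paths, the crawl_loop and latest_segment names, and the progress
messages are all hypothetical, and the segment lookup assumes a local
filesystem (on DFS the listing would have to go through the DFS client
instead). The explicit "crawl finished" line is only there so that a long
pause between jobs is not mistaken for completion.

```shell
#!/bin/sh
# Sketch only: names, paths and messages below are assumptions, not taken
# from the reporter's script.
NUTCH=${NUTCH:-bin/nutch}             # command used to invoke Nutch
CRAWLDB=${CRAWLDB:-crawl/crawldb}     # assumed crawl database location
SEGMENTS=${SEGMENTS:-crawl/segments}  # assumed segments directory

# Hypothetical helper: print the newest segment directory.
# Assumes segments are visible on the local filesystem and sort by name.
latest_segment() {
    ls -d "$SEGMENTS"/* | tail -1
}

# Run generate, fetch, updatedb once per level, up to the target depth.
crawl_loop() {
    depth=$1
    i=1
    while [ "$i" -le "$depth" ]; do
        $NUTCH generate "$CRAWLDB" "$SEGMENTS" || return 1
        segment=$(latest_segment)
        [ -n "$segment" ] || return 1
        $NUTCH fetch "$segment" || return 1
        $NUTCH updatedb "$CRAWLDB" "$segment" || return 1
        echo "depth $i of $depth complete"
        i=$((i + 1))
    done
    echo "crawl finished"
}
```

Calling crawl_loop 5 would then run five generate/fetch/updatedb rounds, and
waiting for the final "crawl finished" line before running segread or readdb
avoids launching them while a job is still pending.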