Did this happen just once, or does it happen every time? This error usually shows up when the Child processes are forcibly killed. If it was a one-off, it is possible that someone else working on the same machine killed the processes. If it happens every time, it could be due to a lack of system resources. Maybe Unix is killing these processes because they are eating too much RAM?
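One way to check the "killed for eating too much RAM" theory is to look through the kernel log on the affected nodes for lines reporting killed processes. The snippet below is only a rough sketch: it assumes a Linux node whose kernel log contains lines of the form "Killed process <pid> (<name>)", and that wording differs between kernel versions. The function name `find_killed_processes` is made up for this example.

```python
import re
import sys

# Assumption: the kernel log reports kills as "Killed process <pid> (<name>)".
# The exact wording depends on the kernel version, so adjust the pattern if needed.
KILLED_RE = re.compile(r"killed process (\d+)\s*\(([^)]+)\)", re.IGNORECASE)


def find_killed_processes(log_text):
    """Return (pid, process_name) pairs for every kill reported in log_text."""
    return [(int(m.group(1)), m.group(2)) for m in KILLED_RE.finditer(log_text)]


if __name__ == "__main__":
    # Reads kernel log text from standard input and prints any reported kills.
    kills = find_killed_processes(sys.stdin.read())
    if not kills:
        print("No killed processes found in the supplied log text.")
    for pid, name in kills:
        print("pid %d (%s) was killed" % (pid, name))
```

If the script is saved as, say, find_killed.py, it could be run as `dmesg | python find_killed.py` on each node where tasks fail. Finding java processes in the output would support the low-memory explanation; an empty result does not rule it out, since the log buffer may already have rotated.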
On Wed, Mar 2, 2011 at 3:45 PM, Marc Sturlese <[email protected]> wrote:

> Hey there,
> My cluster was working fine but suddenly lots and lots of tasks start
> failing like:
>
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:472)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:459)
>
> I restarted the whole cluster but since it happened once its getting broken
> every time I run a job.
> Any clue or advice?
> Thanks in advance.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Tasks-seem-to-fail-randomly-with-nonzero-status-of-1-tp2612433p2612433.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
>
