Nguyen Manh Tien wrote:
Owen, could you show me how to start additional datanodes or tasktrackers?


On a new node that you want to be part of the cluster, run:

$ $HADOOP_HOME/bin/hadoop-daemon.sh start datanode

or

$ $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker
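
One caveat: the new node needs a Hadoop installation whose configuration
points at your master, or the daemons won't know which cluster to join.
A minimal sketch of the relevant hadoop-site.xml entries (the host names
and ports are placeholders for your cluster, and you should check the
property names against your Hadoop version):

<!-- conf/hadoop-site.xml on the new node -->
<property>
  <name>fs.default.name</name>        <!-- where the namenode runs -->
  <value>hdfs://master.example.com:9000</value>
</property>
<property>
  <name>mapred.job.tracker</name>     <!-- where the jobtracker runs -->
  <value>master.example.com:9001</value>
</property>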

Arun


2007/10/19, Owen O'Malley <[EMAIL PROTECTED]>:

On Oct 18, 2007, at 8:05 PM, Ming Yang wrote:


In the original MapReduce paper from Google, it is mentioned
that healthy workers can take over failed tasks from other
workers. Does Hadoop have the same failure recovery strategy?

Yes. If a task fails on one node, it is assigned to another free node
automatically.
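
Each failed task is retried up to a configurable number of times before
the whole job is declared failed. A sketch of the knobs involved,
assuming the stock property names from mapred-default.xml (the defaults
shown are from memory; verify against your version):

<!-- in conf/hadoop-site.xml, to override the defaults -->
<property>
  <name>mapred.map.max.attempts</name>     <!-- retries per map task -->
  <value>4</value>
</property>
<property>
  <name>mapred.reduce.max.attempts</name>  <!-- retries per reduce task -->
  <value>4</value>
</property>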


Also, the other question is: in the paper, it seems nodes can
be added/removed while the cluster is running jobs. How does
Hadoop achieve this, given that the slave locations are saved in a
file and the master doesn't know about new nodes until it
restarts and reloads the slave list?

The slaves file is only used by the startup scripts when bringing up
the cluster. If additional data nodes or task trackers (i.e. slaves)
are started they automatically join the cluster and will be given
work. If the servers on one of the slaves are killed, the work will
be redone on other nodes.
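
In other words, conf/slaves is just an ssh convenience for the
start/stop scripts, not cluster membership state. A sketch of how the
pieces fit together (paths assume a standard install; adjust for yours):

# conf/slaves lists one host per line; only the startup scripts read it
$ cat $HADOOP_HOME/conf/slaves
node1.example.com
node2.example.com

# start-all.sh ssh'es to each host listed above and starts the daemons
$ $HADOOP_HOME/bin/start-all.sh

# a node started by hand joins the cluster even though it isn't in
# conf/slaves -- it just won't be touched by the start/stop scripts
$ $HADOOP_HOME/bin/hadoop-daemon.sh start datanode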

-- Owen
