Owen, could you show me how to start additional data nodes or task trackers?
2007/10/19, Owen O'Malley <[EMAIL PROTECTED]>:
>
> On Oct 18, 2007, at 8:05 PM, Ming Yang wrote:
>
> > In the original MapReduce paper from Google, it mentioned
> > that healthy workers can take over failed tasks from other
> > workers. Does Hadoop have the same failure recovery strategy?
>
> Yes. If a task fails on one node, it is assigned to another free node
> automatically.
>
> > Also, the other question is: in the paper, it seems the nodes can
> > be added/removed while the cluster is running jobs. How does
> > Hadoop achieve this? The slave locations are saved in a
> > file, and the master doesn't know about new nodes until it
> > restarts and reloads the slave list.
>
> The slaves file is only used by the startup scripts when bringing up
> the cluster. If additional data nodes or task trackers (i.e. slaves)
> are started, they automatically join the cluster and will be given
> work. If the servers on one of the slaves are killed, the work will
> be redone on other nodes.
>
> -- Owen
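To answer the question at the top: since the slaves file is only read by the startup scripts, you can start the daemons on the new machine by hand and they will register themselves with the running master. A minimal sketch, assuming a Hadoop 0.x-era install (the install path, hostnames, and ports below are assumptions, not taken from this thread) and that the new node's conf/hadoop-site.xml already points at the master:

```shell
# On the NEW slave machine. Assumed install location; adjust to yours.
cd /usr/local/hadoop

# conf/hadoop-site.xml on this node must name the running master, e.g.:
#   fs.default.name    -> hdfs://master:9000   (NameNode)
#   mapred.job.tracker -> master:9001          (JobTracker)

# Start a DataNode; it contacts the NameNode and joins HDFS automatically.
bin/hadoop-daemon.sh start datanode

# Start a TaskTracker; it contacts the JobTracker and starts receiving tasks.
bin/hadoop-daemon.sh start tasktracker

# Optionally add this host to conf/slaves on the master, so that future
# start-all.sh / stop-all.sh runs include it. No master restart is needed.
```

Once the daemons heartbeat in, the new node appears in the NameNode and JobTracker web UIs and will be given work, as Owen describes.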
