Also, a simple explanation of how speculative execution works and what the key settings are can be found here:
http://books.google.com/books?id=Wu_xeGdU4G8C&pg=PA216&dq=hadoop+the+definitive+guide+speculative+execution&hl=en&sa=X&ei=lyLoUd7bDojk9gTC1YDIBA&ved=0CDwQ6AEwAA
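
For reference, here is a minimal sketch of the two main toggles as they would appear in mapred-site.xml. It assumes the Hadoop 1.x property names from the mapred-default page. Both are set to false here purely for illustration, since each defaults to true:

```xml
<!-- mapred-site.xml fragment (assumes Hadoop 1.x property names) -->
<configuration>
  <!-- Speculative execution for map tasks; default is true -->
  <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>false</value>
  </property>
  <!-- Speculative execution for reduce tasks; default is true -->
  <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>false</value>
  </property>
</configuration>
```

The same settings can also be changed per job rather than cluster-wide, for example through JobConf.setMapSpeculativeExecution(boolean) and JobConf.setReduceSpeculativeExecution(boolean) in the old mapred API.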

In addition, there used to be other parameters (slownodethreshold, slowtaskthreshold & speculativecap), described here:
http://coffee2idea.blogspot.com/2011/11/hadoop-speculative-execution.html
but I believe they were deprecated...

Regards
German
./g

-----Original Message-----
From: Harsh J [mailto:[email protected]]
Sent: Thursday, July 18, 2013 12:11 PM
To: <[email protected]>
Subject: Re: Fault tolerance and Speculative Execution

What you describe in the first paragraph is not true.

Speculative execution API toggles are listed in the documentation:
http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Job+Configuration
and in the mapred-default page in property form:
http://hadoop.apache.org/docs/stable/mapred-default.html

Speculative execution is enabled by default.

On Thu, Jul 18, 2013 at 10:32 PM, Sundeep Kambhampati <[email protected]> wrote:
> Hi all,
>
> Is it true that Hadoop 'always' starts the same map tasks multiple times
> in order to be fault tolerant? I.e., the same task is launched on several
> machines so that even if a node fails, the same task would be
> available on another node. And in case no node fails, the redundant task
> that finishes late is killed.
> If it is true, how can I change that configuration for Hadoop to do it
> or not do it?
>
> Speculative execution, on the other hand, does what I explained above
> (redundant map tasks), but only after all the map tasks are scheduled;
> if some nodes are free, it starts redundant map tasks for those
> which are running slow. Is it always true? How do I change this
> configuration to enable/disable it?
>
> I am using Hadoop-1.1.2, in case the version matters.
>
> I would really appreciate it if someone could help me with this. Thank you.
>
> Regards
> Sundeep

--
Harsh J
