Hello,

On Sat, Feb 12, 2011 at 9:34 PM, Pedro Costa <psdc1...@gmail.com> wrote:
> Hi,
>
> 1 - When a Map task is taking too long to finish its process, the JT
> launches another Map task to process. This means that the task that
> was replaced is killed?

If a task times out (that is, it stops reporting progress for longer
than the task timeout), it is killed and rescheduled. If you're noticing
duplicate attempts in the final waves of a job, it is more likely the
speculative execution feature of Hadoop MapReduce, which is enabled by
default. In that case the slow attempt is not killed up front; both
attempts run, and the one that loses is killed once the other completes.

> 2 - Does Hadoop MR allows that the same input split be processed by 2
> different mappers at the same time?

In some ways, yes. The speculative execution feature does exactly this:
two attempts of the same task may be 'computing' in the same race, and
whichever reports completion first wins. See the 'Speculative execution'
sub-topic of this YDN Hadoop modules page for some details:
http://developer.yahoo.com/hadoop/tutorial/module4.html#tolerence

It should also be possible to get the same effect by supplying
duplicated input splits / paths to the job, although, again, there is no
guarantee that the two mappers will run at the same time.

-- 
Harsh J
www.harshj.com
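
To illustrate the "whichever reports completion first wins" race mentioned above, here is a minimal plain-Java sketch. It is not Hadoop code and does not use the Hadoop API: the class `SpeculativeRaceSketch`, the helpers `attempt` and `runRace`, the attempt names, and the sleep durations are all hypothetical stand-ins, and a thread pool plays the role of the cluster. It only shows the general idea of running two attempts over the same work and keeping the first result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SpeculativeRaceSketch {

    // Hypothetical stand-in for one attempt at processing an input split.
    // The sleep simulates map work; a straggler simply sleeps longer.
    static Callable<String> attempt(String attemptId, long workMillis) {
        return () -> {
            Thread.sleep(workMillis);
            return attemptId;
        };
    }

    // Runs all attempts concurrently and returns the id of an attempt that
    // completed successfully. invokeAny cancels the attempts that have not
    // finished, loosely mirroring how the losing attempt gets killed.
    static String runRace(List<Callable<String>> attempts) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(attempts.size());
        try {
            return pool.invokeAny(attempts);
        } finally {
            pool.shutdownNow(); // interrupt anything still running
        }
    }

    public static void main(String[] args) throws Exception {
        String winner = runRace(Arrays.asList(
                attempt("original attempt (straggler)", 2000),
                attempt("speculative attempt", 100)));
        System.out.println("Winning attempt: " + winner);
    }
}
```

In a real job this race is handled by the framework itself; the sketch only makes the point that both attempts do the same work and only one result is kept, so task output should not depend on which attempt wins.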