Did you try increasing the parallelism? Tuning mapred.task.timeout can also help at times. If you are doing it via Pig, some have reported good performance from enabling speculative execution. Cheers, /R
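For reference, the suggestions above can be sketched in a Pig script roughly like this (a sketch only — the property values and relation names are illustrative, not tested against your job, and the mapred.* property names assume the pre-0.21 Hadoop naming):

```pig
-- Raise the task timeout (milliseconds); illustrative value of 30 minutes.
set mapred.task.timeout '1800000';

-- Let Hadoop launch speculative duplicates of slow reduce tasks.
set mapred.reduce.tasks.speculative.execution 'true';

-- Hypothetical relations standing in for your inputs.
a = LOAD 'input_a' AS (key:chararray, val:int);
b = LOAD 'input_b' AS (key:chararray, val:int);

-- Increase reduce-side parallelism beyond 9 for a ~160 GB join.
joined = JOIN a BY key, b BY key PARALLEL 30;

STORE joined INTO 'join_output';
```

Whether speculative execution actually helps depends on why the reducers stall: it masks slow nodes, but if every attempt is slow for the same reason (skewed keys, heavy spilling), more PARALLEL and a longer timeout are the more direct levers.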
On 5/20/10 1:39 PM, "Alexander Schätzle" <[email protected]> wrote: Hi, I often get this error message when executing a JOIN over big data (~160 GB): "Task attempt failed to report status for 602 seconds. Killing!" The job finally finishes, but a lot of reduce tasks are killed with this error message. I execute the JOIN with a PARALLEL statement of 9. In the end all 9 reducers succeed, but there are also, for example, 13 failed task attempts. This also makes the execution time very slow! Does anybody have an idea what's happening, or have the same problem? Thx in advance, Alex
