I recently had these problems as well. I noticed that these errors were not such a big deal in map tasks, but if they occurred in a reduce, the job would eventually fail. In my case, the errors came predominantly from a single node, so I simply stopped using that node. I think that node had less memory than the others, or may have been otherwise deficient.
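To check whether the failures are landing on maps or reduces, something like the following could tally FAILED attempts from JobClient output. This is only a sketch: it assumes the attempt ID layout attempt_<timestamp>_<job>_<m|r>_<task>_<attempt> seen in the log line quoted in this thread, and the name count_failed_attempts is my own, not part of Hadoop.

```python
import re
from collections import Counter

# Matches JobClient lines such as:
#   "Task Id : attempt_200907160952_0004_m_000000_1, Status : FAILED"
# The m/r field is assumed to mark map vs. reduce attempts.
ATTEMPT_RE = re.compile(
    r"Task Id :\s*attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+),\s*Status :\s*FAILED"
)


def count_failed_attempts(log_text):
    """Return a Counter of FAILED attempts keyed by 'map' or 'reduce'."""
    # Quoted logs are often wrapped mid-line, so collapse all whitespace first.
    normalized = " ".join(log_text.split())
    counts = Counter()
    for match in ATTEMPT_RE.finditer(normalized):
        kind = "map" if match.group(3) == "m" else "reduce"
        counts[kind] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "09/07/16 13:30:06 INFO mapred.JobClient: Task Id :\n"
        "attempt_200907160952_0004_m_000000_1, Status : FAILED\n"
        "Too many fetch-failures\n"
    )
    print(dict(count_failed_attempts(sample)))
```

If the reduce count is non-zero, that would match the case I described where the job eventually fails.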
I can't find the discussion on the web that guided me last time, but I had the impression that this was related to the general 0.19 malaise. My problem has not recurred since I stopped running tasks on my weak node.

On Thu, Jul 16, 2009 at 1:04 PM, Grant Ingersoll <[email protected]> wrote:
> 09/07/16 13:30:06 INFO mapred.JobClient: Task Id :
> attempt_200907160952_0004_m_000000_1, Status : FAILED
> Too many fetch-failures
