On Jul 18, 2011, at 3:02 PM, Geoffry Roberts wrote:

> All,
>
> I am getting the following errors during my MR jobs (see below). Ultimately
> the jobs finish well enough, but these errors do slow things down. I've done
> some reading and I understand that this is all caused by failures in my
> network. Is there a way of determining which node(s) in my cluster are
> causing the problem?
>
The TT running on 'localhost' ran attempt_201107180916_0030_m_000003_0 whose output couldn't be fetched. Take a look at the TT logs and see what you find.

Arun

> Thanks
>
> 11/07/18 14:53:06 INFO mapreduce.Job: map 99% reduce 28%
> 11/07/18 14:53:10 INFO mapreduce.Job: map 100% reduce 28%
> 11/07/18 14:53:15 INFO mapreduce.Job: Task Id :
> attempt_201107180916_0030_m_000003_0, Status : FAILED
> Too many fetch-failures
> 11/07/18 14:53:15 WARN mapreduce.Job: Error reading task
> outputhttp://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201107180916_0030_m_000003_0&filter=stdout
> 11/07/18 14:53:15 WARN mapreduce.Job: Error reading task
> outputhttp://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201107180916_0030_m_000003_0&filter=stderr
> 11/07/18 14:53:17 INFO mapreduce.Job: map 100% reduce 29%
> 11/07/18 14:53:19 INFO mapreduce.Job: map 96% reduce 29%
> 11/07/18 14:53:25 INFO mapreduce.Job: map 98% reduce 29%
>
>
> --
> Geoffry Roberts
>
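
If the job produces many of these warnings, the TaskTracker behind each one can be read off the tasklog URL in the "Error reading task output" lines, which is how 'localhost' was identified above. A rough sketch of doing that over saved console output follows. The function name `failing_trackers` and the regular expression are illustrative assumptions, not part of Hadoop, and the pattern is based only on the log lines quoted in this thread.

```python
import re

# Pattern for the tasklog URL that follows "Error reading task output".
# In the log quoted in this thread "output" and "http" run together and the
# line may be wrapped, so whitespace between the pieces is optional.
TASKLOG_URL = re.compile(
    r"Error reading task\s+output\s*"
    r"http://([^:/\s]+):(\d+)/tasklog\?\S*?attemptid=(attempt_\w+)"
)


def failing_trackers(log_text):
    """Return {"host:port": number of distinct attempts with unreadable output}."""
    # Remove mail quote markers ("> ") so wrapped lines can be matched.
    text = re.sub(r"^[ \t]*>[ \t]?", "", log_text, flags=re.MULTILINE)
    attempts_by_tracker = {}
    for host, port, attempt in TASKLOG_URL.findall(text):
        # Each attempt is reported twice (stdout and stderr); a set dedupes it.
        attempts_by_tracker.setdefault(f"{host}:{port}", set()).add(attempt)
    return {tracker: len(a) for tracker, a in attempts_by_tracker.items()}
```

Run against the log quoted in this thread, this should give `{'localhost:50060': 1}`. It only narrows down which TaskTracker to look at; the TT logs on that node are still where the actual cause would show up.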