I'm running into a wall with one of my MapReduce jobs (actually it's 7 jobs chained together). I get to the 5th MR job, which takes as input the output from the 3rd MR job, and right off the bat I start getting "Lost task tracker" and "Could not obtain block..." errors. Eventually I get enough of these errors that Hadoop kills my tasks and fails the job altogether.
I'm running a 5-node Hadoop cluster on EC2. The input to the 5th MR job is ~400 MB (10 part-* files, each ~40 MB), so it's not really that big. And I seem to get this no matter how big an HDFS cluster I create (5 to 15 nodes). I'm not really sure how to proceed in troubleshooting the issue. Any help would be greatly appreciated.

-- Thanks, John C
