Protection against incorrectly configured reduces
-------------------------------------------------

                 Key: MAPREDUCE-1521
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1521
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: jobtracker
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
             Fix For: 0.22.0


We've seen a fair number of instances where naive users process huge data-sets (>10TB) with a badly misconfigured number of reduces, e.g. 1 reduce.

This is a significant problem on large clusters, since each attempt of the reduce takes a long time to shuffle and then runs into problems such as running out of local disk space. It then takes 4 such attempts before the job fails.

Proposal: Come up with heuristics/configs to fail such jobs early.

Thoughts?
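
To make the proposal concrete, here is one possible heuristic, sketched in plain Java with no Hadoop dependencies. It is only an illustration, not something the issue specifies: the class name, the method names, and the 10 GiB per-reduce limit are all hypothetical, and the job's total input size is used as a rough stand-in for the amount of data each reduce would have to shuffle.

```java
// Hypothetical sketch: reject a job at submission time when the estimated
// amount of data per reduce exceeds a configurable limit.
// None of these names exist in Hadoop; they are illustrative only.
final class ReduceSanityCheck {

    // Assumed default limit: 10 GiB per reduce (an arbitrary example value).
    static final long DEFAULT_MAX_BYTES_PER_REDUCE = 10L * 1024 * 1024 * 1024;

    private ReduceSanityCheck() {}

    // Estimated bytes each reduce must handle, rounded up.
    // A job with no reduces (map-only) shuffles nothing, so return 0.
    static long bytesPerReduce(long estimatedBytes, int numReduces) {
        if (numReduces <= 0) {
            return 0L;
        }
        return (estimatedBytes + numReduces - 1) / numReduces;
    }

    // True when the job should be failed early under this heuristic.
    static boolean shouldFailEarly(long estimatedBytes, int numReduces,
                                   long maxBytesPerReduce) {
        return bytesPerReduce(estimatedBytes, numReduces) > maxBytesPerReduce;
    }

    // Smallest number of reduces that would pass the check, which could be
    // included in the error message shown to the user.
    static long minReduces(long estimatedBytes, long maxBytesPerReduce) {
        return (estimatedBytes + maxBytesPerReduce - 1) / maxBytesPerReduce;
    }

    public static void main(String[] args) {
        long tenTiB = 10L * 1024 * 1024 * 1024 * 1024;
        // The case from the description: >10TB of data with 1 reduce.
        System.out.println(
            shouldFailEarly(tenTiB, 1, DEFAULT_MAX_BYTES_PER_REDUCE));
        // The same data spread over 2000 reduces.
        System.out.println(
            shouldFailEarly(tenTiB, 2000, DEFAULT_MAX_BYTES_PER_REDUCE));
        System.out.println(minReduces(tenTiB, DEFAULT_MAX_BYTES_PER_REDUCE));
    }
}
```

A real check would probably need a better estimate than raw input size, since map output can be much smaller or larger than the input, and the limit would have to be configurable per cluster so that legitimate single-reduce jobs on small inputs are not rejected.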