[ https://issues.apache.org/jira/browse/MAPREDUCE-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aaron Kimball updated MAPREDUCE-434:
------------------------------------

    Attachment: MAPREDUCE-434.4.patch

Attaching a new patch that fixes TestJobCounters.

TestJobCounters tracks the number of spilled records; for jobs "A", "B", and "C" the previous values differ from the current ones by 16K, 32K, and 24K respectively.

I believe the reason is that the spilled records counter is incremented whenever the reducer reads records from a disk file. Previously, the LocalJobRunner copied map output files to the reducer and then ran the merge, reading all those records in on the "reduce side." The new logic uses the LocalFetcher, which fetches all records from the "map side" into memory on the reduce side.

In jobs A and B, the difference in counter values is exactly the number of records emitted by the combiner, suggesting that those records were previously double-counted but are now counted only once (correctly).

Job C is harder for me to understand because it involves 5 map tasks and thus has a multi-level merge (io sort factor=2), but I think the difference is benign. If someone more familiar with the merge counters would take a look at this, I'd appreciate it.

> local map-reduce job limited to single reducer
> ----------------------------------------------
>
>                 Key: MAPREDUCE-434
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-434
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>         Environment: local job tracker
>            Reporter: Yoram Arnon
>            Assignee: Aaron Kimball
>            Priority: Minor
>         Attachments: MAPREDUCE-434.2.patch, MAPREDUCE-434.3.patch,
> MAPREDUCE-434.4.patch, MAPREDUCE-434.patch
>
>
> when mapred.job.tracker is set to 'local', my setNumReduceTasks call is
> ignored, and the number of reduce tasks is set at 1.
> This prevents me from locally debugging my partition function, which tries to
> partition based on the number of reduce tasks.

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
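
For illustration only (this is not code from the patch): a minimal, self-contained Java sketch of why a reducer count forced to 1 makes a partition function impossible to debug locally. The class `PartitionSketch` and the method `partitionFor` are hypothetical names; the formula is the usual hash-modulo scheme, assumed here as a stand-in for whatever partition function the reporter is debugging.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {
    // Hypothetical helper: maps a key to a partition index in
    // [0, numReduceTasks) by masking off the sign bit of the hash
    // and taking the remainder.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("apple", "banana", "cherry", "date");
        // With 1 reduce task every key lands in partition 0, so any bug in
        // the partitioning logic stays invisible. With 3 reduce tasks the
        // assignment of keys to partitions can actually be inspected.
        for (int numReduceTasks : new int[] {1, 3}) {
            for (String key : keys) {
                System.out.println(numReduceTasks + " reducer(s): " + key
                        + " -> partition " + partitionFor(key, numReduceTasks));
            }
        }
    }
}
```

With a single reduce task the remainder is always 0, which is why honoring the requested number of reduce tasks in local mode matters for this kind of debugging.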