[ https://issues.apache.org/jira/browse/HADOOP-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Douglas updated HADOOP-4302:
----------------------------------

    Attachment: 4302-2.patch

*sigh* There's still a race condition in the last patch. If the third output is being fetched (allocated) but not yet closed when the second closes, it's possible to merge the first two to disk before allocating the following three, which triggers a similar fault. The reduce will begin with all segments merged to disk. The solution sets {{mapred.job.shuffle.merge.percent}} high enough to avoid an intermediate merge in the test until the fetch thread is stalled on the final output.

> TestReduceFetch fails intermittently
> ------------------------------------
>
>                 Key: HADOOP-4302
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4302
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Devaraj Das
>            Assignee: Chris Douglas
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: 0J5g5b71.html.part, 4302-0.patch, 4302-1.patch, 4302-2.patch
>
>
> I see TestReduceFetch failing once in a while. Here is one such failure:
> http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3350/testReport/org.apache.hadoop.mapred/TestReduceFetch/testReduceFromPartialMem/
> Marking this as a blocker until we get to the root cause.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
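
For illustration only, the workaround described in the comment above (raising {{mapred.job.shuffle.merge.percent}} so that no intermediate in-memory merge fires before the fetch thread stalls on the final output) might look like the following Hadoop configuration fragment. The value 1.0 is an assumption, since the comment only says "high enough", and the attached patch may well set the property programmatically on the test's job configuration rather than in an XML file.

```xml
<!-- Sketch only: the value below is an assumed example, not taken from 4302-2.patch. -->
<configuration>
  <property>
    <name>mapred.job.shuffle.merge.percent</name>
    <!-- Assumed value: a threshold at the top of the range, so that the
         in-memory merge is not started while map outputs are still being
         fetched into the shuffle buffer. -->
    <value>1.0</value>
  </property>
</configuration>
```

A setting like this only makes sense for a test that needs deterministic merge behavior; a production job would normally leave the threshold lower so that merging can overlap with fetching.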