[ 
https://issues.apache.org/jira/browse/HADOOP-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604202#action_12604202
 ] 

Sharad Agarwal commented on HADOOP-3517:
----------------------------------------

{quote} 2. To get around the case where we never get to the point where 75% of 
shuffle-threads are blocked, we need to un-stall the merge when the no. of 
shuffle-threads blocked is greater than or equal to the number of required 
map-outputs. {quote}
This condition is redundant. The number of shuffle threads blocked (on 
RamManager.reserve) will *always* be less than or equal to the number of 
required map-outputs, so in the current implementation the merge would never 
wait.

In the patch, I noticed that the additional variable numMapsInJob is not 
required; the existing numMaps member variable can be used instead.

> The last InMemory merge may be missed
> -------------------------------------
>
>                 Key: HADOOP-3517
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3517
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.18.0
>            Reporter: Devaraj Das
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.18.0
>
>         Attachments: HADOOP-3517_1_20080610.patch
>
>
> This is post HADOOP-3366. The inmem merge thread has the loop:
> {code}
>         while (!exitInMemMerge) {
>           ramManager.waitForDataToMerge();
>           doInMemMerge();
>         }
> {code}
> The fetchOutputs, at the end of copying everything, does the following:
> {code}
>         exitInMemMerge = true; 
>         ramManager.close();
> {code}
> Now if the merge thread is doing a merge (inside the doInMemMerge method) 
> when exitInMemMerge is set to true, the loop will exit and the last 
> merge of the files that were shuffled most recently will be skipped. 
> ramManager.close(), which internally does a final notify to the merge 
> thread, also won't have any effect in this case.
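
The scenario described above can be reproduced deterministically with a small 
single-threaded model. This is only a sketch: the Shuffle class, the 
duringMerge hook, and the drainingLoop variant are hypothetical stand-ins 
invented for illustration, not the real ReduceTask code and not the change in 
the attached patch. The hook simulates fetchOutputs finishing while the merge 
thread is still inside doInMemMerge, and the wait on the RamManager is left 
out.

```java
import java.util.ArrayList;
import java.util.List;

public class LastMergeSketch {
    // Hypothetical stand-in for the reducer's shuffle state (not a Hadoop class).
    static class Shuffle {
        final List<String> pending = new ArrayList<>(); // shuffled, not yet merged
        final List<String> merged = new ArrayList<>();  // already merged
        boolean exitInMemMerge = false;
        Runnable duringMerge = () -> {};                // simulated copier activity

        void doInMemMerge() {
            // Snapshot the outputs available when the merge starts.
            List<String> batch = new ArrayList<>(pending);
            pending.clear();
            duringMerge.run(); // the copier may finish while this merge is running
            merged.addAll(batch);
        }

        // Loop shape from the issue description (waitForDataToMerge omitted).
        void buggyLoop() {
            while (!exitInMemMerge) {
                doInMemMerge();
            }
        }

        // One possible remedy (an assumption): drain leftovers after the flag is seen.
        void drainingLoop() {
            while (!exitInMemMerge) {
                doInMemMerge();
            }
            if (!pending.isEmpty()) {
                doInMemMerge();
            }
        }
    }

    // Builds the interleaving from the report: the last map output arrives and
    // the exit flag is set while the first merge is still in progress.
    static Shuffle scenario() {
        Shuffle s = new Shuffle();
        s.pending.add("map-0");
        s.duringMerge = () -> {
            if (!s.exitInMemMerge) {
                s.pending.add("map-1");
                s.exitInMemMerge = true; // what fetchOutputs does at the end
            }
        };
        return s;
    }

    public static void main(String[] args) {
        Shuffle buggy = scenario();
        buggy.buggyLoop();
        System.out.println("buggy: merged=" + buggy.merged + " left=" + buggy.pending);

        Shuffle fixed = scenario();
        fixed.drainingLoop();
        System.out.println("fixed: merged=" + fixed.merged + " left=" + fixed.pending);
    }
}
```

In this model buggyLoop exits with map-1 still pending, which corresponds to 
the missed last merge, while drainingLoop merges it before returning. In real 
multi-threaded code the flag and the pending list would also need proper 
synchronization, which this sketch does not attempt to show.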

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
