[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14037549#comment-14037549
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Committing this. 

 Reducer Preemption is too aggressive
 

 Key: MAPREDUCE-5844
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5844
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Maysam Yabandeh
Assignee: Maysam Yabandeh
 Attachments: MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, 
 MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, 
 MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, MAPREDUCE-5844.patch, 
 MAPREDUCE-5844.patch


 We observed cases where the reducer preemption makes the job finish much 
 later, and the preemption does not seem to be necessary since after 
 preemption both the preempted reducer and the mapper are assigned 
 immediately--meaning that there was already enough space for the mapper.
 The logic for triggering preemption is at 
 RMContainerAllocator::preemptReducesIfNeeded
 The preemption is triggered if the following is true:
 {code}
 headroom + am * |m| + pr * |r| < mapResourceRequest
 {code} 
 where am is the number of assigned mappers, |m| is the mapper size, pr is the 
 number of reducers being preempted, and |r| is the reducer size.
 The original idea apparently was that if headroom is not big enough for the 
 new mapper requests, reducers should be preempted. This would work if the job 
 is alone in the cluster. Once we have queues, the headroom calculation 
 becomes more complicated and it would require a separate headroom calculation 
 per queue/job.
 So, as a result, the headroom variable is essentially given up on currently: *headroom is 
 always set to 0*. What this implies for preemption is that preemption 
 becomes very aggressive, not considering whether there is enough space for 
 the mappers or not.
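
 For illustration only, the trigger condition can be read as the following Java sketch 
 (names are descriptive placeholders, not the actual fields of RMContainerAllocator):
 {code}
 // Hypothetical helper mirroring the condition quoted above; the real logic
 // lives in RMContainerAllocator::preemptReducesIfNeeded.
 class PreemptionCheck {
   static boolean shouldPreemptReducers(int headroom, int assignedMaps, int mapSize,
                                        int preemptingReducers, int reduceSize,
                                        int mapResourceRequest) {
     return headroom + assignedMaps * mapSize + preemptingReducers * reduceSize
         < mapResourceRequest;
   }
 }
 {code}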



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-18 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036015#comment-14036015
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Thanks [~maysamyabandeh] for the quick iterations and patience with reviews. 
Hopefully, this is the last set of comments.

bq. I took the liberty to apply the same pattern on also the already existing 
public methods (not previously touched by the patch) whose visibilities were 
relaxed for testing purposes.
While this should be the right thing to do, it might be a little too late. 
Unfortunately, this is backwards incompatible. 

bq. Sorry I did the moving but I forgot to update the visibilities.
Just realized the package name is changed. We should also move the file to a 
matching folder: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/rm/TestRMContainerAllocator.java

Took a closer look at how the variables mapResourceRequest and 
reduceResourceRequest are accessed and synchronized. What do you think 
of making those two {{AtomicInteger}}s and reverting any synchronization 
changes made in the patch to address findbugs? If any findbugs warnings still 
persist, we can add an exclusion in 
./hadoop-mapreduce-project/dev-support/findbugs-exclude.xml. Also, you could 
run {{mvn findbugs:findbugs}} locally for quicker turnaround.
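
A minimal sketch of the {{AtomicInteger}} idea (the class below is illustrative only, 
not the actual patch; only the field/setter names mirror RMContainerAllocator):
{code}
import java.util.concurrent.atomic.AtomicInteger;

// Atomic fields can be read and updated from code paths that are not
// synchronized, which sidesteps findbugs' IS2_INCONSISTENT_SYNC warnings
// without adding synchronized blocks. reduceResourceRequest would be
// handled the same way.
class ResourceRequests {
  private final AtomicInteger mapResourceRequest = new AtomicInteger(0);

  void setMapResourceRequest(int memoryMb) {
    mapResourceRequest.set(memoryMb);
  }

  int getMapResourceRequest() {
    return mapResourceRequest.get();
  }
}
{code}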



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036498#comment-14036498
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12651258/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4672//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4672//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4672//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036519#comment-14036519
 ] 

Sangjin Lee commented on MAPREDUCE-5844:


I agree regarding synchronization. The new member variables Maysam introduced 
are essentially renames of the existing ones, and the only new method that 
accesses these variables is setMapResourceRequest, which is synchronized. We 
looked at all code paths that access and/or mutate them, and they are all 
synchronized.

The ones that are mentioned in the findbugs warning as unsynchronized are in 
fact synchronized, explicitly or implicitly.
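
A generic illustration of what "explicitly or implicitly" synchronized means here 
(hypothetical code, not the actual RMContainerAllocator):
{code}
class AllocatorSketch {
  private int mapResourceRequest;

  synchronized void setMapResourceRequest(int mem) { // explicitly synchronized
    mapResourceRequest = mem;
  }

  synchronized void heartbeat() {
    preemptReducesIfNeeded();        // lock is already held by the caller
  }

  private void preemptReducesIfNeeded() {
    // Implicitly synchronized: only reachable from synchronized methods,
    // which is the kind of calling context findbugs cannot see statically.
    int needed = mapResourceRequest;
  }
}
{code}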



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-18 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036518#comment-14036518
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


Just noticed that the findbugs warnings are actually old and have been 
suppressed before. They came up again since we changed the variable names:
{code}
   <!--
     The below fields are accessed locally and only via methods that are 
     synchronized. 
   -->
   <Match>
     <Class name="org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator" />
     <Or>
      <Field name="mapResourceReqt" />
      <Field name="reduceResourceReqt" />
      <Field name="maxReduceRampupLimit" />
      <Field name="reduceSlowStart" />
     </Or>
     <Bug pattern="IS2_INCONSISTENT_SYNC" />
   </Match>
{code}



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036603#comment-14036603
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12651278/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4673//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4673//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-18 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14036711#comment-14036711
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Thanks Maysam and Sangjin for looking into the synchronization around the two 
variables. Agree with your assessment. 

+1 for the patch. I'll wait until tomorrow to commit, in case [~jlowe] or 
anyone else is interested in taking a look at the patch. 

Regarding moving the file, I don't know why applying the patch doesn't move it 
to the new location. I guess I'll just do the {{svn mv}} myself manually at the 
time of commit. 



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034374#comment-14034374
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Patch looks mostly good. Hopefully, one last comment (repeating from above): 
can we move TestRMContainerAllocator to the same package as 
RMContainerAllocator, and reduce the visibility of methods exposed to tests to 
package-private instead of public? I tried this locally and the refactor seemed 
straightforward. 
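
As a rough sketch of what that refactor could look like (hypothetical member; the 
actual fields/methods touched by the patch may differ):
{code}
import com.google.common.annotations.VisibleForTesting;
import org.apache.hadoop.classification.InterfaceAudience.Private;

public class RMContainerAllocatorSketch {
  private int mapResourceRequest;

  // Package-private (no "public") so only code in the same package,
  // e.g. TestRMContainerAllocator after the move, can call it.
  @Private
  @VisibleForTesting
  int getMapResourceRequest() {
    return mapResourceRequest;
  }
}
{code}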



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-17 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034455#comment-14034455
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


Sorry, I did the move but forgot to update the visibilities. Will do.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034549#comment-14034549
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650906/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4666//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4666//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4666//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034713#comment-14034713
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650936/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4667//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4667//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4667//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034756#comment-14034756
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650947/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4668//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4668//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-app.html
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4668//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-17 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034757#comment-14034757
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


The findbugs warnings seem like a false alarm to me.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033114#comment-14033114
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Thanks for updating the patch, Maysam.

A few comments:
# Unfortunately, RMContainerAllocator and RMContainerRequestor are not 
annotated as @Private classes. So, all the fields/methods that are made 
accessible should have a @Private annotation in addition to the 
@VisibleForTesting annotation.
# By moving TestRMContainerAllocator to be in the same package as the above two 
files, we can limit the visibility to package-private instead of public. Can 
you please check if that is straightforward? 
# Can we combine the following two statements into one (a possible combined 
form is sketched after this list)? 
{code}
allocationDelayThresholdMs = conf.getInt(
MRJobConfig.MR_JOB_REDUCER_PREEMPT_DELAY_SEC,
MRJobConfig.DEFAULT_MR_JOB_REDUCER_PREEMPT_DELAY_SEC);
allocationDelayThresholdMs *= 1000; //sec -> ms
{code}
# Nit: Rename setMapResourceReqt and setReduceResourceReqt to end in Request 
instead of Reqt?
# Nit: In the tests, can we use a smaller sleep time? Also, instead of sleeping 
for an extra second, can we sleep for the exact time and then check if the 
reducer gets preempted in a loop with much smaller sleep? YARN/MR should use a 
Clock so tests don't have to actually sleep for that long.
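
A possible combined form of the two statements from item 3 (a sketch only, 
assuming the existing MRJobConfig constants):
{code}
allocationDelayThresholdMs = 1000 * conf.getInt(
    MRJobConfig.MR_JOB_REDUCER_PREEMPT_DELAY_SEC,
    MRJobConfig.DEFAULT_MR_JOB_REDUCER_PREEMPT_DELAY_SEC); // sec -> ms
{code}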



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033288#comment-14033288
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Can we change the field names also to end in Request instead of Reqt? 



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033298#comment-14033298
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650692/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4663//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4663//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033350#comment-14033350
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12650707/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4664//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4664//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14028787#comment-14028787
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649990/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4651//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4651//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027096#comment-14027096
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

Current patch looks reasonable to me. Would be nice to add unit tests for the 
newly added config. 



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-10 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027123#comment-14027123
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


Thanks [~kasha] for reviewing it. About the unit test, I looked into it and it 
seems non-trivial to me: on one hand, preemptReducesIfNeeded uses local 
fields and is not feasible to test separately via mocking. The alternative 
would be to test the entire RMContainerAllocator object; however, to make sure 
that preemptReducesIfNeeded is exercised in the test, the RMContainerAllocator 
object should be fed with a complicated set of events: some mappers are not 
finished, but enough are finished to trigger the reducer start, and finally a 
mapper failure. The complexity of the unit test in this way would be much more 
than that of the minor change introduced by the patch. I guess it would be 
possible to come up with unit tests of reasonable complexity if we make changes 
to RMContainerAllocator to make it more testable, but I am not sure whether such 
changes are desirable as part of this jira.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027130#comment-14027130
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

This is new functionality and we should definitely try to add unit tests. To 
keep the test-initiated changes simple, I am open to exposing any of the 
required local fields annotated with @VisibleForTesting. 
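
For illustration only, here is a minimal sketch of the kind of visibility 
relaxation being discussed; the class shape, field name, and default value are 
hypothetical and not taken from the patch:

{code}
import com.google.common.annotations.VisibleForTesting;

// Illustrative only: a field and a method whose visibility is relaxed so a
// unit test in the same package can drive and inspect them directly.
public class Allocator {
  @VisibleForTesting
  long reducerPreemptionDelayMs = 0L;   // hypothetical field exposed for tests

  @VisibleForTesting
  void preemptReducesIfNeeded() {
    // preemption decision logic exercised by the test
  }
}
{code}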



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-06-10 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027166#comment-14027166
 ] 

Jason Lowe commented on MAPREDUCE-5844:
---

Agree with [~kasha] that we should try to add unit tests for this.

Couple of other minor comments on the patch:

- The new property name seems a bit too generic and could cause confusion, as 
it doesn't apply to all task allocations but specifically to map allocations 
and only when considering preempting reducers.  Would something like 
yarn.app.mapreduce.job.reducer.preempt.delay make more sense?  Also, 
mapreduce.job.* is a namespace that should be appropriate for this, so we 
could shorten it to something like mapreduce.job.reducer.preempt.delay.
- The new property should have an entry in mapred-default.xml to document it.
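
As a rough illustration of how such a property might be read in the AM, here is 
a hedged sketch; the key name follows the suggestion above and the default of 0 
is an assumption, not a committed value:

{code}
import org.apache.hadoop.conf.Configuration;

public class PreemptDelayExample {
  // Key name as suggested in this comment; the committed key may differ.
  static final String REDUCER_PREEMPT_DELAY =
      "mapreduce.job.reducer.preempt.delay";

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Seconds to wait for headroom before preempting reducers; 0 would keep
    // today's behavior of preempting immediately.
    long delaySec = conf.getLong(REDUCER_PREEMPT_DELAY, 0L);
    System.out.println("reducer preempt delay = " + delaySec + "s");
  }
}
{code}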



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-05-28 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14011249#comment-14011249
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


Thanks [~wangda]. I guess there is no unanimity about the default value of the 
threshold, as some also suggested to have its default equal to zero. We are 
planning to have it enabled in our config settings anyway, but I let the 
community decide on the default value in the source code.

I think using the timestamp of the last allocated mapper is interesting, since 
we would keep only one timestamp versus one per mapper request. The challenge, 
however, is that it does not capture how recent each map request is. We could 
of course keep another single timestamp for the earliest received map request, 
but maintaining it after one of the many in-flight map requests gets allocated 
would add a bit more complexity to the patch and its logic. I figured that 
keeping the logic in the patch as simple as possible would justify a new 
timestamp field per container request.
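
A minimal sketch of the per-request timestamp idea, assuming a hypothetical 
wrapper class; this shows the shape of the logic, not the actual patch:

{code}
// Illustrative: each map container request remembers when it was created, so
// the allocator can check how long the oldest unassigned map has been waiting.
class TimedMapRequest {
  final long requestTimeMs;

  TimedMapRequest() {
    this.requestTimeMs = System.currentTimeMillis();
  }

  boolean waitedLongerThan(long delayMs) {
    return System.currentTimeMillis() - requestTimeMs > delayMs;
  }
}
{code}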



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-05-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14009158#comment-14009158
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12646827/namenode-gc.2014-05-26-23-29.log.0
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4621//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14007184#comment-14007184
 ] 

Wangda Tan commented on MAPREDUCE-5844:
---

Hi [~maysamyabandeh], 
Thanks for your patch. I think the headroom calculation currently needs to be 
much improved in the Fair and Capacity Schedulers, so it would be better to 
make your method the default behavior (changing the time threshold from 0 to a 
reasonable number, in my opinion).
A suggestion: could we simply record the time at which we got the last mapper 
container, and use that time to check whether we have run into a 
hard-to-allocate-mapper situation? That would avoid modifying the 
ContainerRequest code.
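
A rough sketch of this alternative, assuming a hypothetical tracker object 
rather than actual RMContainerAllocator code:

{code}
// Illustrative: a single timestamp, updated on every successful map
// allocation; a long gap since the last allocation suggests maps are hard
// to allocate.
class MapAllocationTracker {
  private volatile long lastMapAllocatedMs = System.currentTimeMillis();

  void onMapContainerAllocated() {
    lastMapAllocatedMs = System.currentTimeMillis();
  }

  boolean mapsLookStarved(long delayMs) {
    return System.currentTimeMillis() - lastMapAllocatedMs > delayMs;
  }
}
{code}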



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-05-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14007913#comment-14007913
 ] 

Hadoop QA commented on MAPREDUCE-5844:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12646359/MAPREDUCE-5844.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4619//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4619//console

This message is automatically generated.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-05-20 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003454#comment-14003454
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


Thanks [~jlowe] and [~kasha]. Sounds great! I will submit a patch soon. The 
patch adds a timestamp to each scheduled mapper, and triggers preemption only 
after a configurable threshold has passed since that timestamp.
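
A hedged sketch of how the delayed check might combine with the existing 
condition from the description; all names and units here are illustrative, not 
the actual patch:

{code}
// Illustrative: preempt reducers only if there is not enough room for the map
// AND the oldest unassigned map request has already waited at least delayMs.
class PreemptionPolicy {
  static boolean shouldPreemptReducers(long headroom, int assignedMaps,
      long mapSize, int preemptingReducers, long reducerSize,
      long mapResourceRequest, long oldestMapRequestMs, long delayMs) {
    boolean notEnoughRoom = headroom + assignedMaps * mapSize
        + preemptingReducers * reducerSize < mapResourceRequest;
    boolean waitedLongEnough =
        System.currentTimeMillis() - oldestMapRequestMs >= delayMs;
    return notEnoughRoom && waitedLongEnough;
  }
}
{code}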



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-05-19 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001941#comment-14001941
 ] 

Jason Lowe commented on MAPREDUCE-5844:
---

Yes, the always-full-but-churning queue scenario can lead to an unfortunate 
preemption case, because the RM and AM can't know cluster resources will free 
up imminently.  I think it's reasonable to add a configurable delay before 
preempting a reducer when a map retroactively fails.  That way users can tune 
based on their cluster usage -- if they have a heavily used, high-churn cluster 
they can tune the parameter based on the average time for a container to free 
up in their queue.  Other users who rarely run in this mode or for which the 
delay would be intolerable can set this to zero to avoid unnecessary delays in 
processing fetch-failed maps when there is no current headroom.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-05-19 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14002007#comment-14002007
 ] 

Karthik Kambatla commented on MAPREDUCE-5844:
-

And, we can start with a default value of zero until we see clear evidence that 
having a higher number doesn't hurt.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-05-16 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000413#comment-14000413
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


[~jlowe], we observed more of these cases where the queue was actually full, 
and a fix for proper headroom calculation would not help either. The thing is 
that although the queue might be full at each point in time, there is a 
constant flow of containers completing in the queue and new containers being 
assigned. Therefore, even if the MRAppMaster does not aggressively preempt its 
reducer and simply asks for a container for its failed mapper, it is actually 
quite likely to get the mapper in a timely manner.

I was chatting offline with [~kkambatl] and it came up that perhaps delayed 
preemption could be a more reasonable reaction in such cases. What is your 
take on that?



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-04-18 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973866#comment-13973866
 ] 

Sandy Ryza commented on MAPREDUCE-5844:
---

Hi [~maysamyabandeh], you're not misreading the code - headroom calculation in 
the Fair Scheduler needs to be fixed.  I filed YARN-1959 for this. 



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-04-17 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973318#comment-13973318
 ] 

Jason Lowe commented on MAPREDUCE-5844:
---

Moved this to MAPREDUCE since the decision to preempt reducers for mappers is 
ultimately an MR AM decision and not a YARN decision.

A headroom of zero should mean there is literally no more room in the queue, 
and I would expect the job would need to take action in those cases to make 
progress in light of fetch failures.  (e.g.: think of a scenario where all the 
other jobs taking up resources are long-running and won't release resources 
anytime soon)

If you are seeing cases where reducers are shot then immediately relaunched 
along with the failed maps then that implies that either the headroom 
calculation is wrong or resources happened to be freed right at the time the 
new containers were requested.  Note that there are a number of issues with 
headroom calculations, see YARN-1198 and related JIRAs.

Assuming those are fixed, there might be some usefulness to a grace period 
where we wait for other apps to free up resources in the queue to avoid 
shooting reducers.  A proper value for that probably depends upon how much work 
would be lost by the reducers in question, how long we can tolerate waiting to 
try to preserve that work, and how likely it is that another app will free up 
resources anytime soon.  If we wait and still don't get our resources then 
that's purely worse than a job that took decisive action as soon as a map 
retroactively failed and there's no more space left in the queue.   Also if the 
headroom is zero because a single job has hit user limits within the queue then 
waiting serves no purpose -- it has to shoot a reducer in that case to make 
progress.  In that latter case we'd need additional information in the allocate 
response from the scheduler to know that waiting for resources to be released 
from other applications in the queue isn't going to work.

It would be good to verify from the RM logs what is happening in your case.  If 
the headroom calculation is wrong then we should fix that, otherwise if 
resources are churning quickly then a grace period before preempting reducers 
may make sense.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-04-17 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973430#comment-13973430
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


Thanks [~jlowe] for your detailed comment.

# As I explained in the description of the jira the printed headroom in the 
logs is always zero. e.g.,
{code}
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor: getResources() for 
application_x: ask=8 release= 0 newContainers=0 finishedContainers=0 
resourcelimit=<memory:0, vCores:0> knownNMs=x
{code}
And this is not because there is no headroom (I know that by checking the 
available resources while the job was running).
# I actually was not surprised by the headroom always being set to zero, since 
I found the headroom field effectively abandoned in the FairScheduler source: 
SchedulerApplicationAttempt#getHeadroom() is what supplies the headroom field 
in the response, while SchedulerApplicationAttempt#setHeadroom() is never 
invoked in FairScheduler (it is invoked in the Capacity and FIFO schedulers, 
though).
# I assumed that not invoking setHeadroom in the fair scheduler was 
intentional, perhaps due to the complications of computing the headroom when 
fair share is taken into account. But based on your comment, I understand that 
this could be a forgotten case rather than an abandoned one.
# At least in the observed case where we suffered from this problem, the 
headroom was available and both the preempted reducer and the mapper were 
assigned immediately (in less than a few seconds). So, delaying the preemption 
even for a period as short as 1 minute could prevent this problem, while not 
having a tangible negative impact in cases where the preemption was actually 
required. I agree that there are tradeoffs with this preemption delay 
(especially when it is high), but even a short value would suffice to cover 
this special case where the headroom is already available.
# Whether or not we get a fix for the headroom calculation in FairScheduler, it 
seems to me that allowing the user to configure the preemption to be postponed 
for a short delay would not be hurtful, if not beneficial.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-04-17 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973526#comment-13973526
 ] 

Jason Lowe commented on MAPREDUCE-5844:
---

Ah, I didn't realize this was the FairScheduler.  Yeah, if headroom is always 
zero that's going to wreak havoc as soon as any fetch failure occurs when 
reducers have been launched.  The priority should be to fix the headroom 
calculation in the scheduler first.  I suspect once that's done then for most 
cases there won't be such a need for grace period support in the AMs.



[jira] [Commented] (MAPREDUCE-5844) Reducer Preemption is too aggressive

2014-04-17 Thread Maysam Yabandeh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13973572#comment-13973572
 ] 

Maysam Yabandeh commented on MAPREDUCE-5844:


Thanks [~jlowe]

[~sandyr], [~tucu00], could you please comment on the plan for setting headroom 
in the fair scheduler's responses to apps? Or perhaps I am misreading the code 
and it is already there but not working! Should I open a jira for that?


--
This message was sent by Atlassian JIRA
(v6.2#6252)