[jira] [Commented] (MAPREDUCE-6726) YARN Registry based AM discovery with retry and in-flight task persistent via JHS

2016-09-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508444#comment-15508444
 ] 

Hadoop QA commented on MAPREDUCE-6726:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 2m 41s {color} | {color:red} Docker failed to build yetus/hadoop:2c91fd8. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12829485/MAPREDUCE-6726-MAPREDUCE-6608.001.patch |
| JIRA Issue | MAPREDUCE-6726 |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6729/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> YARN Registry based AM discovery with retry and in-flight task persistent via 
> JHS
> -
>
> Key: MAPREDUCE-6726
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6726
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: applicationmaster
>Reporter: Junping Du
>Assignee: Srikanth Sampath
> Attachments: MAPREDUCE-6726-MAPREDUCE-6608.001.patch, 
> MAPREDUCE-6726-MAPREDUCE-6608.001.patch, WorkPreservingMRAppMaster.pdf
>
>
> Several tasks will be completed in this JIRA, based on the demo patch in 
> MAPREDUCE-6608:
> 1. AM discovery based on the YARN registry service. This could be replaced by 
> YARN-4758 later due to scale-up issues.
> 2. Retry logic for the TaskUmbilicalProtocol RPC connection.
> 3. Recovery of in-flight tasks after AM restart via JHS.
> 4. Configuration to keep the behavior compatible with previous releases when 
> this feature is not enabled (the default).
> All security-related issues and other concerns discussed in MAPREDUCE-6608 
> will be addressed in follow-up JIRAs.
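
For illustration only, item 2 above (retry logic for the task-to-AM RPC connection) could look roughly like the sketch below. This is not taken from the attached patch. The class UmbilicalRetry, the method callWithRetry, and the choice of exponential backoff are all assumptions made for the example; the sketch uses only the JDK and does not touch Hadoop's real RPC or registry classes.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of Hadoop: retries a call that may fail while the AM restarts. */
public final class UmbilicalRetry {
    private UmbilicalRetry() {}

    /**
     * Runs the call, retrying on IOException with exponential backoff.
     * Assumption: in a real task the callable would first re-resolve the AM
     * address (for example from the registry) and then issue the RPC.
     */
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                // Only connection-style failures are retried; anything else propagates.
                last = e;
                if (attempt == maxAttempts) {
                    break;
                }
                Thread.sleep(delayMs);
                delayMs *= 2; // back off before the next attempt
            }
        }
        throw last;
    }
}
```

A caller would wrap each umbilical call in callWithRetry with an attempt limit and initial delay read from configuration, so that the old fail-fast behavior can be kept by setting the limit to 1.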



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6726) YARN Registry based AM discovery with retry and in-flight task persistent via JHS

2016-09-20 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6726:
--
Attachment: MAPREDUCE-6726-MAPREDUCE-6608.001.patch

Just created a precommit job for the MAPREDUCE-6608 branch. Verifying that it 
works by attaching the previous patch again.







[jira] [Commented] (MAPREDUCE-5507) MapReduce reducer ramp down is suboptimal with potential job-hanging issues

2016-09-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507409#comment-15507409
 ] 

Varun Saxena commented on MAPREDUCE-5507:
-

I was initially thinking of adding a configuration to ramp up reducers if maps 
are hanging for a while, but as per the discussion on MAPREDUCE-6689, this may 
lead to suboptimal job performance, as it will be very hard to decide on the 
right configuration value.

We haven't encountered any job hang issues in our deployments since 
MAPREDUCE-6513 and MAPREDUCE-6514 went into our branch.
So I am fine with closing it. Maybe we can check with the defect reporter too. 
cc [~ojoshi].



> MapReduce reducer ramp down is suboptimal with potential job-hanging issues
> ---
>
> Key: MAPREDUCE-5507
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5507
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster
>Reporter: Omkar Vinit Joshi
>Assignee: Omkar Vinit Joshi
>Priority: Critical
> Attachments: MAPREDUCE-5507.20130922.1.patch
>
>
> Today, if we set "yarn.app.mapreduce.am.job.reduce.rampup.limit" and 
> "mapreduce.job.reduce.slowstart.completedmaps", reducers are launched 
> more aggressively. However, the calculation to ramp reducers up or down 
> is not done in the most optimal way. 
> * If the MR AM at any point sees a situation like 
> ** scheduledMaps : 30
> ** scheduledReducers : 10
> ** assignedMaps : 0
> ** assignedReducers : 11
> ** finishedMaps : 120
> ** headroom : 756 (when a map/reduce task needs only 512 MB)
> * then today it simply hangs, because it thinks there is sufficient room 
> to launch one more mapper and therefore no need to ramp down. 
> However, if this continues forever, this is not the correct or optimal 
> behavior.
> * Ideally, when the MR AM sees that assignedMaps has dropped to 0 and there 
> are still running reducers, it should wait for a certain time (upper-bounded 
> by the average map task completion time, as a heuristic). If after that it 
> still doesn't get a new container for a map task, it should preempt 
> reducers one by one at some interval and ramp up slowly.
> ** Preemption of reducers can be done in a slightly smarter way:
> *** preempt a reducer on a node manager for which there is a pending map 
> request;
> *** otherwise preempt any other reducer. The MR AM will help itself get a new 
> mapper by releasing such a reducer/container, because this reduces its 
> cluster consumption and may thereby make it a candidate for an allocation.
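
The heuristic proposed in the description can be sketched as below. This is a minimal illustration under stated assumptions, not the code from the attached patch or from the MR AM's allocator. The class ReducerRampDownSketch, the Reducer type, and both method names are hypothetical; the counter names mirror the ones listed in the description.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of the proposed ramp-down heuristic; not the MR AM's actual code. */
public final class ReducerRampDownSketch {
    private ReducerRampDownSketch() {}

    /** A running reducer and the node it runs on (illustrative type). */
    public static final class Reducer {
        public final String id;
        public final String host;
        public Reducer(String id, String host) {
            this.id = id;
            this.host = host;
        }
    }

    /**
     * Preempt only when maps are waiting, none are assigned, reducers are
     * running, and the stall has lasted at least the average map completion
     * time (the upper bound on the wait suggested in the description).
     */
    public static boolean shouldPreemptReducer(int scheduledMaps, int assignedMaps,
            int assignedReducers, long stalledMs, long avgMapCompletionMs) {
        return scheduledMaps > 0
                && assignedMaps == 0
                && assignedReducers > 0
                && stalledMs >= avgMapCompletionMs;
    }

    /**
     * Picks one reducer to preempt: first choice is a reducer on a node with
     * a pending map request, otherwise any running reducer.
     */
    public static Optional<Reducer> pickVictim(List<Reducer> running,
            Set<String> hostsWithPendingMaps) {
        for (Reducer r : running) {
            if (hostsWithPendingMaps.contains(r.host)) {
                return Optional.of(r);
            }
        }
        return running.isEmpty() ? Optional.empty() : Optional.of(running.get(0));
    }
}
```

With the counters from the description (scheduledMaps 30, assignedMaps 0, assignedReducers 11), shouldPreemptReducer returns true once the stall reaches the average map completion time, regardless of the reported headroom of 756. Calling pickVictim once per interval gives the one-by-one preemption the description asks for.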


