[jira] [Commented] (MAPREDUCE-6726) YARN Registry based AM discovery with retry and in-flight task persistent via JHS
[ https://issues.apache.org/jira/browse/MAPREDUCE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508444#comment-15508444 ]

Hadoop QA commented on MAPREDUCE-6726:
--------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 2m 41s {color} | {color:red} Docker failed to build yetus/hadoop:2c91fd8. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12829485/MAPREDUCE-6726-MAPREDUCE-6608.001.patch |
| JIRA Issue | MAPREDUCE-6726 |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6729/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |

This message was automatically generated.

> YARN Registry based AM discovery with retry and in-flight task persistent via JHS
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6726
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6726
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: applicationmaster
>            Reporter: Junping Du
>            Assignee: Srikanth Sampath
>         Attachments: MAPREDUCE-6726-MAPREDUCE-6608.001.patch, MAPREDUCE-6726-MAPREDUCE-6608.001.patch, WorkPreservingMRAppMaster.pdf
>
>
> Several tasks will be achieved in this JIRA based on the demo patch in MAPREDUCE-6608:
> 1. AM discovery based on the YARN registry service. This could be replaced by YARN-4758 later due to scale-up issues.
> 2. Retry logic for the TaskUmbilicalProtocol RPC connection.
> 3. In-flight task recovery after AM restart via JHS.
> 4. Configuration to keep the behavior compatible with previous releases when this feature is not enabled (the default).
> All security-related issues and other concerns discussed in MAPREDUCE-6608 will be addressed in follow-up JIRAs.
--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6726) YARN Registry based AM discovery with retry and in-flight task persistent via JHS
[ https://issues.apache.org/jira/browse/MAPREDUCE-6726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated MAPREDUCE-6726:
---------------------------------
    Attachment: MAPREDUCE-6726-MAPREDUCE-6608.001.patch

Just created a precommit job for the MAPREDUCE-6608 branch. Verifying that it works by attaching the previous patch again.
[jira] [Commented] (MAPREDUCE-5507) MapReduce reducer ramp down is suboptimal with potential job-hanging issues
[ https://issues.apache.org/jira/browse/MAPREDUCE-5507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15507409#comment-15507409 ]

Varun Saxena commented on MAPREDUCE-5507:
-----------------------------------------

I was initially thinking of adding a configuration to ramp up reducers if maps have been hanging for a while. However, as per the discussion on MAPREDUCE-6689, this may lead to suboptimal job performance, as it would be very hard to decide on the right configuration value.
We haven't encountered any job hang issues in our deployments since MAPREDUCE-6513 and MAPREDUCE-6514 went into our branch.

So I am fine with closing it. Maybe we can check with the defect reporter too. cc [~ojoshi].

> MapReduce reducer ramp down is suboptimal with potential job-hanging issues
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5507
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5507
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>            Priority: Critical
>         Attachments: MAPREDUCE-5507.20130922.1.patch
>
>
> Today, if we set "yarn.app.mapreduce.am.job.reduce.rampup.limit" and "mapreduce.job.reduce.slowstart.completedmaps", reducers are launched more aggressively. However, the calculation to ramp reducers up or down is not done in the most optimal way.
> * If the MR AM at any point sees a situation like
> ** scheduledMaps : 30
> ** scheduledReducers : 10
> ** assignedMaps : 0
> ** assignedReducers : 11
> ** finishedMaps : 120
> ** headroom : 756 (when a map/reduce task needs only 512 MB)
> * then today it simply hangs, because it thinks there is sufficient room to launch one more mapper and therefore no need to ramp down. If this continues forever, the behavior is neither correct nor optimal.
> * Ideally, when the MR AM sees that assignedMaps has dropped to 0 and there are still running reducers, it should wait for a certain time (bounded above by the average map task completion time, as a heuristic). If after that it still doesn't get a new container for a map task, it should preempt reducers one by one at some interval and ramp up slowly.
> ** Preemption of reducers can be done in a slightly smarter way:
> *** preempt a reducer on a node manager for which there is a pending map request;
> *** otherwise preempt any other reducer. The MR AM helps itself get a new mapper by releasing such a reducer container, because that reduces its cluster consumption and may thereby make it a candidate for an allocation.
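
The ramp-down heuristic proposed in the MAPREDUCE-5507 description can be illustrated with a small model. This is only a sketch under stated assumptions, not the MR AM's actual allocator code: the Reducer type, the function names, and the max_wait_secs parameter are all hypothetical, and the "current logic" function is a simplified reading of the hang scenario (ramp down only when headroom cannot fit one more map).

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Reducer:
    """A running reduce attempt (hypothetical type, not a Hadoop class)."""
    attempt_id: str
    node: str


def current_logic_ramps_down(headroom: int, map_task_size: int) -> bool:
    """Simplified model of the behaviour described as hanging:
    ramp down only when the headroom cannot fit one more map task."""
    return headroom < map_task_size


def should_preempt(scheduled_maps: int, assigned_maps: int,
                   assigned_reducers: int, starved_secs: float,
                   avg_map_secs: float, max_wait_secs: float) -> bool:
    """Proposed trigger: maps are pending, none are running, reducers hold
    containers, and the AM has already waited long enough. The wait is
    bounded above by the average map completion time (max_wait_secs is an
    assumed extra knob, not something named in the issue)."""
    wait_limit = min(max_wait_secs, avg_map_secs)
    return (scheduled_maps > 0 and assigned_maps == 0
            and assigned_reducers > 0 and starved_secs >= wait_limit)


def pick_reducer_to_preempt(reducers: List[Reducer],
                            nodes_with_pending_maps: Set[str]) -> Optional[Reducer]:
    """Prefer a reducer on a node that has a pending map request;
    otherwise fall back to any other reducer."""
    for reducer in reducers:
        if reducer.node in nodes_with_pending_maps:
            return reducer
    return reducers[0] if reducers else None


if __name__ == "__main__":
    # Numbers from the issue description: headroom 756, tasks need 512.
    print(current_logic_ramps_down(756, 512))
    print(should_preempt(30, 0, 11, starved_secs=60.0,
                         avg_map_secs=45.0, max_wait_secs=120.0))
    running = [Reducer("r1", "nodeA"), Reducer("r2", "nodeB")]
    print(pick_reducer_to_preempt(running, {"nodeB"}))
```

With the numbers from the description, the simplified current check never ramps down because 756 is enough for one 512 MB task, while the proposed trigger fires once the starvation time exceeds the bounded wait. To preempt "one by one with some interval", a caller would invoke pick_reducer_to_preempt once per interval and re-check should_preempt before each call.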