[jira] [Updated] (YARN-914) Support graceful decommission of nodemanager

2015-02-18 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-914:

Attachment: GracefullyDecommissionofNodeManagerv3.pdf

Updated the proposal to incorporate most of the comments above, including the 
AM notification mechanism, naming, UI changes, etc. In addition, added some 
details on the core state transitions for the RMNode state machine. I will 
break the work down into sub-JIRAs and start on it if there are no further 
comments on significant issues. 

 Support graceful decommission of nodemanager
 

 Key: YARN-914
 URL: https://issues.apache.org/jira/browse/YARN-914
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Luke Lu
Assignee: Junping Du
 Attachments: Gracefully Decommission of NodeManager (v1).pdf, 
 Gracefully Decommission of NodeManager (v2).pdf, 
 GracefullyDecommissionofNodeManagerv3.pdf


 When NMs are decommissioned for non-fault reasons (capacity change etc.), 
 it's desirable to minimize the impact to running applications.
 Currently, if an NM is decommissioned, all running containers on the NM need 
 to be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
 map outputs have not been fetched by the job's reducers, these map tasks will 
 need to be rerun as well.
 We propose to introduce a mechanism to optionally gracefully decommission a 
 node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-914) Support graceful decommission of nodemanager

2015-02-11 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-914:

Attachment: Gracefully Decommission of NodeManager (v2).pdf

Updated the proposal to reflect what we discussed above. 
Some key updates:
- Change the overall architecture so that the Decommission_In_Progress state 
is hidden from the NM side and exists only on the RM side.
- Move timeout tracking out of the YARN core and into a new CLI
- Track persistence of the RMNode state (tracked in YARN-2567)
- Remove the new enable and timeout configurations, as both seem 
unnecessary for now
- Break down the work items
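
As a rough illustration of the first two bullets, here is a minimal sketch in 
Python. It is not taken from the attached design doc or from the YARN code 
base: the class RMNodeSketch, the function decommission_with_timeout, the 
RUNNING and DECOMMISSIONED state names, and the choice to force the 
decommission once the timeout expires are all assumptions made for 
illustration. Only the idea that Decommission_In_Progress exists on the RM 
side and that the timeout is tracked by a client outside the core comes from 
the list above.

```python
from enum import Enum


class NodeState(Enum):
    RUNNING = "RUNNING"  # assumed name
    DECOMMISSIONING = "DECOMMISSION_IN_PROGRESS"  # name from the proposal
    DECOMMISSIONED = "DECOMMISSIONED"  # assumed name


class RMNodeSketch:
    """Hypothetical RM-side record for one node; not the real RMNode class."""

    def __init__(self, host):
        self.host = host
        self.state = NodeState.RUNNING
        self.running_containers = set()

    def can_schedule(self):
        # Only RUNNING nodes accept new containers.
        return self.state is NodeState.RUNNING

    def start_graceful_decommission(self):
        # The state changes only in the RM's view; the NM is not told.
        if self.state is NodeState.RUNNING:
            self.state = NodeState.DECOMMISSIONING
            self._finish_if_drained()

    def container_finished(self, container_id):
        self.running_containers.discard(container_id)
        self._finish_if_drained()

    def force_decommission(self):
        # Assumed fallback: drop remaining containers and decommission now.
        self.running_containers.clear()
        self.state = NodeState.DECOMMISSIONED

    def _finish_if_drained(self):
        drained = not self.running_containers
        if self.state is NodeState.DECOMMISSIONING and drained:
            self.state = NodeState.DECOMMISSIONED


def decommission_with_timeout(node, timeout_ticks, tick):
    """Client-side (CLI-like) loop that owns the timeout, not the RM core.

    `tick` is a hypothetical callback standing in for the passage of time.
    """
    node.start_graceful_decommission()
    for _ in range(timeout_ticks):
        if node.state is NodeState.DECOMMISSIONED:
            return "graceful"
        tick(node)
    if node.state is NodeState.DECOMMISSIONED:
        return "graceful"
    node.force_decommission()
    return "forced"
```

In this sketch the node stops accepting new containers as soon as it enters 
DECOMMISSION_IN_PROGRESS, and the caller alone decides how long to wait 
before falling back to an immediate decommission.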

 Support graceful decommission of nodemanager
 

 Key: YARN-914
 URL: https://issues.apache.org/jira/browse/YARN-914
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Luke Lu
Assignee: Junping Du
 Attachments: Gracefully Decommission of NodeManager (v1).pdf, 
 Gracefully Decommission of NodeManager (v2).pdf


 When NMs are decommissioned for non-fault reasons (capacity change etc.), 
 it's desirable to minimize the impact to running applications.
 Currently, if an NM is decommissioned, all running containers on the NM need 
 to be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
 map outputs have not been fetched by the job's reducers, these map tasks will 
 need to be rerun as well.
 We propose to introduce a mechanism to optionally gracefully decommission a 
 node manager.





[jira] [Updated] (YARN-914) Support graceful decommission of nodemanager

2015-02-04 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-914:

Attachment: Gracefully Decommission of NodeManager (v1).pdf

Posted a design doc proposal for this feature after an offline discussion with 
[~mingma], addressing [~jlowe]'s comments above. Reviews and comments are 
appreciated here.

 Support graceful decommission of nodemanager
 

 Key: YARN-914
 URL: https://issues.apache.org/jira/browse/YARN-914
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Luke Lu
Assignee: Junping Du
 Attachments: Gracefully Decommission of NodeManager (v1).pdf


 When NMs are decommissioned for non-fault reasons (capacity change etc.), 
 it's desirable to minimize the impact to running applications.
 Currently, if an NM is decommissioned, all running containers on the NM need 
 to be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
 map outputs have not been fetched by the job's reducers, these map tasks will 
 need to be rerun as well.
 We propose to introduce a mechanism to optionally gracefully decommission a 
 node manager.





[jira] [Updated] (YARN-914) Support graceful decommission of nodemanager

2013-07-15 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated YARN-914:

Description: 
When NNs are decommissioned for non-fault reasons (capacity change etc.), it's 
desirable to minimize the impact to running applications.

Currently if a NM is decommissioned, all running containers on the NM need to 
be rescheduled on other NMs. Further more, for finished map tasks, if their map 
output are not fetched by the reducers of the job, these map tasks will need to 
be rerun as well.

We propose to introduce a mechanism to optionally gracefully decommission a 
node manager.

  was:
When NNs are decommissioned for non-fault reasons (capacity change etc.), it's 
desirable to minimize the impact to running applications.

Currently if a NN is decommissioned, all running containers on the NN need to 
be rescheduled on other NNs. Further more, for finished map tasks, if their map 
output are not fetched by the reducers of the job, these map tasks will need to 
be rerun as well.

We propose to introduce a mechanism to optionally gracefully decommission a 
node manager.


 Support graceful decommission of nodemanager
 

 Key: YARN-914
 URL: https://issues.apache.org/jira/browse/YARN-914
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Luke Lu
Assignee: Junping Du

 When NNs are decommissioned for non-fault reasons (capacity change etc.), 
 it's desirable to minimize the impact to running applications.
 Currently, if an NM is decommissioned, all running containers on the NM need 
 to be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
 map outputs have not been fetched by the job's reducers, these map tasks will 
 need to be rerun as well.
 We propose to introduce a mechanism to optionally gracefully decommission a 
 node manager.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-914) Support graceful decommission of nodemanager

2013-07-15 Thread Luke Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Lu updated YARN-914:

Description: 
When NMs are decommissioned for non-fault reasons (capacity change etc.), it's 
desirable to minimize the impact to running applications.

Currently if a NM is decommissioned, all running containers on the NM need to 
be rescheduled on other NMs. Further more, for finished map tasks, if their map 
output are not fetched by the reducers of the job, these map tasks will need to 
be rerun as well.

We propose to introduce a mechanism to optionally gracefully decommission a 
node manager.

  was:
When NNs are decommissioned for non-fault reasons (capacity change etc.), it's 
desirable to minimize the impact to running applications.

Currently if a NM is decommissioned, all running containers on the NM need to 
be rescheduled on other NMs. Further more, for finished map tasks, if their map 
output are not fetched by the reducers of the job, these map tasks will need to 
be rerun as well.

We propose to introduce a mechanism to optionally gracefully decommission a 
node manager.


 Support graceful decommission of nodemanager
 

 Key: YARN-914
 URL: https://issues.apache.org/jira/browse/YARN-914
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.4-alpha
Reporter: Luke Lu
Assignee: Junping Du

 When NMs are decommissioned for non-fault reasons (capacity change etc.), 
 it's desirable to minimize the impact to running applications.
 Currently, if an NM is decommissioned, all running containers on the NM need 
 to be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
 map outputs have not been fetched by the job's reducers, these map tasks will 
 need to be rerun as well.
 We propose to introduce a mechanism to optionally gracefully decommission a 
 node manager.
