[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers

2022-05-24 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541330#comment-17541330
 ] 

Masatake Iwasaki commented on YARN-8118:


Updated the target version in preparation for the 2.10.2 release.

> Better utilize gracefully decommissioning node managers
> ---
>
> Key: YARN-8118
> URL: https://issues.apache.org/jira/browse/YARN-8118
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Affects Versions: 2.8.2
> Environment: * Google Compute Engine (Dataproc)
>  * Java 8
>  * Hadoop 2.8.2 using client-mode graceful decommissioning
> Reporter: Karthik Palaniappan
> Assignee: Karthik Palaniappan
> Priority: Major
> Attachments: YARN-8118-branch-2.001.patch
>
>
> Proposal design doc with background + details (please comment directly on
> doc):
> [https://docs.google.com/document/d/1hF2Bod_m7rPgSXlunbWGn1cYi3-L61KvQhPlY9Jk9Hk/edit#heading=h.ab4ufqsj47b7]
>
> tl;dr Right now, DECOMMISSIONING nodes must wait for in-progress applications
> to complete before shutting down, but they cannot run new containers from
> those in-progress applications. This is wasteful, particularly in
> environments where you are billed by resource usage (e.g. EC2).
>
> Proposal: YARN should schedule containers from in-progress applications on
> DECOMMISSIONING nodes, but should still avoid scheduling containers from new
> applications. That will make in-progress applications complete faster and let
> nodes decommission faster. Overall, this should be cheaper.
>
> I have a working patch without unit tests that's surprisingly just a few real
> lines of code (patch 001). If folks are happy with the proposal, I'll write
> unit tests and also write a patch targeted at trunk.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers

2020-09-03 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17190558#comment-17190558
 ] 

Masatake Iwasaki commented on YARN-8118:


Updated the target version in preparation for the 2.10.1 release.




[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers

2018-04-09 Thread Karthik Palaniappan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431557#comment-16431557
 ] 

Karthik Palaniappan commented on YARN-8118:
---

Sure – I think I get the use case you guys are describing – I'm just trying to 
understand why that's different than option #2 (wait for running containers to 
finish, then decommission the node immediately after).

Is the idea that those 20-minute containers would drain shuffle from
decommissioning nodes faster than the 10-minute timeout? If so, Jason's
comment about gracefully decommissioning on a "sufficiently large cluster"
makes sense. As an admin, you just need to set the timeout long enough to
finish in-progress containers, finish the current stage (e.g. the map stage),
and at least start all tasks in the next stage (e.g. the reduce stage) to drain
shuffle. But you don't necessarily need to wait for the entire application to
finish.

I still think option #2 and option #3 are both valid secondary use cases, so 
I'm inclined to make an enum parameter for "graceful decommission strategy". In 
terms of plumbing the flag through, using XML config is by far the easiest. But 
I can see an argument that this should be a parameter on a per-decommission-rpc 
basis. Thoughts?
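
A minimal sketch of how such a strategy flag could be resolved, assuming a
cluster-wide default that a single decommission request may override. It is
written in Python for brevity and is not taken from the attached patch. The
enum member names and the property key are hypothetical; the thread only
refers to the options by number, and no such YARN property is confirmed here.
The fallback is today's behavior, since the reviewers asked that the default
not change.

```python
from enum import Enum
from typing import Mapping, Optional


class GracefulDecommissionStrategy(Enum):
    # Hypothetical names; the thread only refers to these by number.
    CURRENT = "current"                              # today's behavior
    WAIT_FOR_CONTAINERS = "wait-for-containers"      # option #2
    WAIT_FOR_APPLICATIONS = "wait-for-applications"  # option #3


# Hypothetical property key, NOT an existing YARN setting.
STRATEGY_KEY = "example.graceful-decommission.strategy"


def resolve_strategy(
    cluster_conf: Mapping[str, str],
    request_override: Optional[str] = None,
) -> GracefulDecommissionStrategy:
    """Prefer the per-request value, then the cluster config, then today's behavior."""
    raw = request_override or cluster_conf.get(
        STRATEGY_KEY, GracefulDecommissionStrategy.CURRENT.value
    )
    # Enum lookup by value raises ValueError for an unknown strategy name.
    return GracefulDecommissionStrategy(raw)
```

With this shape, the XML-config route and the per-RPC route are not mutually
exclusive: the config supplies the default and the request parameter, when
present, takes precedence.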




[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers

2018-04-09 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431524#comment-16431524
 ] 

Robert Kanter commented on YARN-8118:
-

Thanks for your ideas [~Karthik Palaniappan].

Consider this scenario: You want to gracefully decommission a node with a
timeout of 10 minutes. Suppose you have a job whose containers normally take
20 minutes to run. At this point, we wouldn't want to start any of those
containers on that node, because they won't finish before the decom timeout
ends. They'd just get killed halfway through instead of running on another
node, which would be faster overall.

I'm fine with adding an option for the behavior you're describing, but I don't 
think we can change the default behavior here (it's also not a "bugfix" like 
your design doc suggests; as [~jlowe], [~djp], and my above scenario show, 
there are valid use cases for the current behavior).  
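
The 10-minute scenario above can be written as a small check. This is only a
sketch in Python: it assumes the expected container runtime is known up front,
which a scheduler generally cannot assume, and both function names are
hypothetical.

```python
def fits_before_timeout(expected_runtime_s: float, remaining_timeout_s: float) -> bool:
    """True if a container started now would finish before the decom timeout."""
    return expected_runtime_s <= remaining_timeout_s


def wasted_seconds(expected_runtime_s: float, remaining_timeout_s: float) -> float:
    """Work lost if the container is started anyway and killed at the timeout."""
    if fits_before_timeout(expected_runtime_s, remaining_timeout_s):
        return 0.0
    # The container runs until the timeout, then is killed and must be rerun.
    return float(remaining_timeout_s)


# The scenario from the comment: 20-minute containers, 10-minute timeout.
print(fits_before_timeout(20 * 60, 10 * 60))  # False
print(wasted_seconds(20 * 60, 10 * 60))       # 600.0
```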




[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers

2018-04-09 Thread Karthik Palaniappan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431472#comment-16431472
 ] 

Karthik Palaniappan commented on YARN-8118:
---

Not sure I understand your use cases (@Jason/@Junping). For jobs that produce 
shuffle data (i.e. all Hadoop-ecosystem jobs?), killing a container is just as 
bad as removing the shuffle it produced. I can imagine a few reasonable 
scenarios around removing nodes:

1) immediately remove nodes (regular decommissioning)

2) wait for containers to finish, but don't wait until applications finish 
(scenarios where shuffle doesn't matter)

3) wait for apps to finish and let in-progress apps use decommissioning nodes

#1 is regular (forceful) decommissioning. #3 is my proposal, aimed at cloud
environments with potentially drastic scaling events. #2 makes sense for
non-cloud environments where few nodes are being removed at a time. It also
makes sense when running jobs that don't produce shuffle output.

So if you're willing to tolerate a behavioral change, maybe #2 should be the 
default, and #3 should be an additional flag (either an XML property or a flag 
on the graceful decommission request).

However, as currently implemented, it seems like graceful decommissioning is 
the worst of all worlds – wait for apps to finish, but don't let apps use 
decommissioning nodes. Am I missing something obvious here? I couldn't find 
anything in the original design docs discussing why it was implemented that way.
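
The three scenarios, plus the current behavior as described in this comment,
can be compared side by side with a toy model. This is an illustrative Python
sketch rather than YARN code: the mode names and the NodeView fields are
hypothetical, and the decommission timeout is left out.

```python
from dataclasses import dataclass
from enum import Enum


class RemovalMode(Enum):
    FORCEFUL = 1          # scenario 1: remove immediately
    WAIT_CONTAINERS = 2   # scenario 2: wait for containers only
    WAIT_APPS_SHARED = 3  # scenario 3: wait for apps, in-progress apps may use the node
    WAIT_APPS_IDLE = 4    # current behavior as described: wait for apps, node unused


@dataclass(frozen=True)
class NodeView:
    running_containers: int  # containers currently running on the node
    unfinished_apps: int     # apps that ran on the node and have not finished


def accepts_new_container(mode: RemovalMode, app_in_progress: bool) -> bool:
    """Only scenario 3 places new containers, and only for in-progress apps."""
    return mode is RemovalMode.WAIT_APPS_SHARED and app_in_progress


def ready_to_remove(mode: RemovalMode, node: NodeView) -> bool:
    """Whether the node may leave the cluster under the given mode."""
    if mode is RemovalMode.FORCEFUL:
        return True
    if mode is RemovalMode.WAIT_CONTAINERS:
        return node.running_containers == 0
    # Both app-waiting modes hold the node until its applications finish.
    return node.running_containers == 0 and node.unfinished_apps == 0
```

In this model the two app-waiting modes release a node under the same
condition; they differ only in whether the node does useful work while it
waits.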




[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers

2018-04-07 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429629#comment-16429629
 ] 

Junping Du commented on YARN-8118:
--

Thanks for contributing your idea and code, [~Karthik Palaniappan]!
As Jason mentioned above, our main goal here is to remove decommissioning nodes
from service ASAP while minimizing the interruption to progress that
applications have already made (existing running containers). In my opinion, in
most cases there is no significant difference between containers scheduled by
existing applications and those scheduled by new applications. If there is any,
the right solution should be a priority/preemption mechanism between
applications. In other words, we make no assumption about priority differences
between existing and new applications in our typical decommissioning cases.
However, in a pure cloud environment (like EMR, etc.), the scenario could be
different. What I can imagine (please correct me if I am wrong) is: a user (who
is also an admin from YARN's perspective) submits most workloads to a dedicated
YARN cluster and wants the cluster to shrink to some minimal size later, once
the applications finish. If this is the case that the current design and code
target, then we should take Jason's suggestion above and add a new cluster
configuration or a new parameter to the graceful decommission CLI.
We need to be careful here, because the existing decommissioning operation is
idempotent. We need to figure out what it means if new applications are
submitted between multiple operations and how to track them; I don't think the
current code provides a way.
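
One possible way to keep the operation idempotent, sketched in Python under
the assumption that the set of in-progress applications is snapshotted on the
first request for a node and never widened by later requests. This is not what
the attached patch does; the class and method names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable


class DecommissionTracker:
    """Remembers which applications were in progress when a node began draining."""

    def __init__(self) -> None:
        self._allowed: Dict[str, FrozenSet[str]] = {}

    def request_decommission(self, node: str, running_apps: Iterable[str]) -> FrozenSet[str]:
        # The first request fixes the snapshot; repeated requests return it unchanged.
        return self._allowed.setdefault(node, frozenset(running_apps))

    def may_schedule(self, node: str, app: str) -> bool:
        allowed = self._allowed.get(node)
        if allowed is None:
            return True  # the node is not decommissioning
        return app in allowed


tracker = DecommissionTracker()
tracker.request_decommission("node-1", {"app_1"})
# app_2 is submitted later, and the admin repeats the same request.
tracker.request_decommission("node-1", {"app_1", "app_2"})
print(tracker.may_schedule("node-1", "app_2"))  # False
```

Under this assumption, an application submitted between two identical requests
is treated as new and is kept off the draining node.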




[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers

2018-04-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16427591#comment-16427591
 ] 

Jason Lowe commented on YARN-8118:
--

This is definitely not desirable in all cases.  We often decommission a node in 
order to gracefully remove it from service as soon as possible, and allowing 
new containers to run on it will usually harm more than help that effort given 
a sufficiently large cluster to run those containers elsewhere.  If this is 
added it should not change the default behavior as it exists today.  It would 
need to be either a config option for the whole cluster or a parameter as part 
of the rmadmin command to gracefully decomm a specific node.





[jira] [Commented] (YARN-8118) Better utilize gracefully decommissioning node managers

2018-04-04 Thread Karthik Palaniappan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426372#comment-16426372
 ] 

Karthik Palaniappan commented on YARN-8118:
---

CC [~djp], [~danzhi], [~rkanter], [~sunilg], who I think were the main authors 
of graceful decommissioning in YARN. (Please add anybody I missed)
