[jira] [Commented] (YARN-7715) Support NM promotion/demotion of running containers.
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714443#comment-16714443 ]

Zhankun Tang commented on YARN-7715:

[~miklos.szeg...@cloudera.com], [~asuresh], does this JIRA depend on YARN-5085? Why was YARN-5085 merged into branches 2.9.0 and 3.0.0, while this JIRA was merged into branch 3.2.0?

> Support NM promotion/demotion of running containers.
>
> Key: YARN-7715
> URL: https://issues.apache.org/jira/browse/YARN-7715
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Arun Suresh
> Assignee: Miklos Szegedi
> Priority: Major
> Fix For: 3.2.0
> Attachments: YARN-7715.000.patch, YARN-7715.001.patch, YARN-7715.002.patch, YARN-7715.003.patch, YARN-7715.004.patch
>
> In YARN-6673 and YARN-6674, the cgroups resource handlers update the cgroups params for the containers, based on opportunistic or guaranteed, in the *preStart* method.
> Now that YARN-5085 is in, Container executionType (as well as the cpu, memory and any other resources) can be updated after the container has started. This means we need the ability to change cgroups params after container start.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
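The issue description quoted in this message contrasts setting cgroup parameters once in *preStart* with changing them after the container has started. The following is a minimal sketch of that distinction, not the NodeManager code: the class `CgroupLifecycleSketch`, its in-memory map standing in for the cgroup filesystem, and the `appliedType` helper are all hypothetical names introduced for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model; not the Hadoop ResourceHandler interface. */
final class CgroupLifecycleSketch {
    enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    /** Stand-in for the cgroup filesystem: container id -> execution type last applied. */
    private final Map<String, ExecutionType> applied = new HashMap<>();

    /** Parameters are written once, before the container starts. */
    void preStart(String containerId, ExecutionType type) {
        applied.put(containerId, type);
    }

    /** Re-applies parameters for a container that is already running. */
    void updateContainer(String containerId, ExecutionType type) {
        if (!applied.containsKey(containerId)) {
            throw new IllegalStateException("unknown container: " + containerId);
        }
        applied.put(containerId, type);
    }

    /** Returns the execution type most recently applied, or null if none. */
    ExecutionType appliedType(String containerId) {
        return applied.get(containerId);
    }
}
```

In this model, a promotion is a `preStart` with `OPPORTUNISTIC` followed later by an `updateContainer` with `GUARANTEED` on the same container id.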
[jira] [Commented] (YARN-7715) Support NM promotion/demotion of running containers.
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16477692#comment-16477692 ]

Miklos Szegedi commented on YARN-7715:

[~yangjiandan], none of these is in the scope of this JIRA. This JIRA is about setting the cgroup parameters based on the setting that has already been propagated to the node manager. I agree that the flag needs to be in the state store; however, this has nothing to do with cgroups. Also, I am not convinced that the AM has to be notified about cgroup errors. cgroups have to be as transparent and failsafe as possible. Any communication to the AM would just add unnecessary network overhead and probably would not solve the problem. The information that a cgroup update failed on some node might be interesting to the AM, but it is not actionable.
[jira] [Commented] (YARN-7715) Support NM promotion/demotion of running containers.
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476961#comment-16476961 ]

Jiandan Yang commented on YARN-7715:

Thanks, [~miklos.szeg...@cloudera.com]. Updating the execution type also requires updating the cgroup parameters (cfs_period_us, cfs_quota_us, shares), and the AM is not notified when the cgroup update fails. In addition, container recovery will fail when the NM restarts if the updated execution type is not stored.
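To make the three CPU parameters named in this comment concrete, here is a rough sketch of how they might be derived from the execution type and vcore count. Everything in it is an assumption for illustration: the class `CpuCgroupParamsSketch`, the constants (1024 shares per vcore, a weight of 2 for opportunistic containers, a 100000 µs period), and the simplification that one vcore maps to one full core. Check the NodeManager source for the actual rules.

```java
/** Hypothetical calculation of cpu.shares, cfs_period_us and cfs_quota_us. */
final class CpuCgroupParamsSketch {
    enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    // Assumed illustrative constants; not taken from this thread.
    static final int SHARES_PER_VCORE = 1024;
    static final int OPPORTUNISTIC_SHARES = 2;
    static final long PERIOD_US = 100_000L;

    final int shares;
    final long periodUs;
    final long quotaUs;

    private CpuCgroupParamsSketch(int shares, long periodUs, long quotaUs) {
        this.shares = shares;
        this.periodUs = periodUs;
        this.quotaUs = quotaUs;
    }

    /** Recomputes all three values; an update after start would call this again. */
    static CpuCgroupParamsSketch compute(ExecutionType type, int vcores) {
        if (vcores < 1) {
            throw new IllegalArgumentException("vcores must be >= 1");
        }
        // Guaranteed containers get a weight proportional to their vcores;
        // opportunistic containers get a small fixed weight (assumption).
        int shares = type == ExecutionType.GUARANTEED
            ? SHARES_PER_VCORE * vcores
            : OPPORTUNISTIC_SHARES;
        // Assumes one vcore corresponds to one full core per period.
        long quotaUs = PERIOD_US * vcores;
        return new CpuCgroupParamsSketch(shares, PERIOD_US, quotaUs);
    }
}
```

Under these assumptions, promoting a 2-vcore container changes only `shares` (from 2 to 2048), which is why an execution-type update has to touch the cgroup even when the requested resources stay the same.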
[jira] [Commented] (YARN-7715) Support NM promotion/demotion of running containers.
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476852#comment-16476852 ]

Miklos Szegedi commented on YARN-7715:

updateContainer only supports execution type updates so far. A vcore increase or decrease does not trigger it. I agree that it should. This JIRA was about promotion; that one is about resource changes. Would you like to file a JIRA?

Do you mean the state store in the second case? I think that is a legitimate concern; however, it is also out of the scope of this patch. [~haibochen], [~asuresh], what do you think? Do we need/have the opportunistic flag in the state store?
[jira] [Commented] (YARN-7715) Support NM promotion/demotion of running containers.
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476776#comment-16476776 ]

Jiandan Yang commented on YARN-7715:

Hi [~miklos.szeg...@cloudera.com], thanks for your reply. I mean that the AM does not know when a resource update fails on the NM. Consider the following case:
1. The AM increases vcores via updateContainer.
2. The NM fails to update the cgroup while executing CGroupsCpuResourceHandlerImpl#updateContainer.

Another question: updated containers need to be stored, but I did not find the related code in your patch.
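The two-step case in this comment can be modeled roughly as below. This is a sketch under stated assumptions, not the NodeManager implementation: `UpdateFailureSketch`, its nested `Handler` interface, and the `handler` factory are hypothetical. The sketch collects failures locally on the node and continues with the remaining handlers; whether such failures should also be reported to the AM is the open question in this thread, and the sketch takes no position on it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of applying an update across several resource handlers. */
final class UpdateFailureSketch {
    /** Simplified handler abstraction; the real NM handlers have a richer interface. */
    interface Handler {
        String name();
        void updateContainer(String containerId, int vcores) throws Exception;
    }

    /** Applies the update on every handler and collects failures instead of propagating them. */
    static List<String> updateAll(List<Handler> handlers, String containerId, int vcores) {
        List<String> failures = new ArrayList<>();
        for (Handler h : handlers) {
            try {
                h.updateContainer(containerId, vcores);
            } catch (Exception e) {
                // Recorded on the node only; nothing is sent to the AM here.
                failures.add(h.name() + ": " + e.getMessage());
            }
        }
        return failures;
    }

    /** Test helper: a handler that either succeeds silently or always fails. */
    static Handler handler(final String name, final boolean fail) {
        return new Handler() {
            @Override public String name() { return name; }
            @Override public void updateContainer(String containerId, int vcores) throws Exception {
                if (fail) {
                    throw new Exception("cgroup write failed for " + containerId);
                }
            }
        };
    }
}
```

Calling `updateAll` with a failing "cpu" handler and a working "memory" handler returns a single failure entry for "cpu", which matches step 2 of the case: the update request was accepted, but one cgroup write did not take effect.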
[jira] [Commented] (YARN-7715) Support NM promotion/demotion of running containers.
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16476409#comment-16476409 ]

Miklos Szegedi commented on YARN-7715:

I think cgroups should work independently of the AM. In fact, the AM does not even know whether a container is opportunistic or guaranteed at a given time, does it?
[jira] [Commented] (YARN-7715) Support NM promotion/demotion of running containers.
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475703#comment-16475703 ]

Jiandan Yang commented on YARN-7715:

[~miklos.szeg...@cloudera.com], how is the AM informed if a cgroup resource update fails?
[jira] [Commented] (YARN-7715) Support NM promotion/demotion of running containers.
[ https://issues.apache.org/jira/browse/YARN-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16470935#comment-16470935 ]

Hudson commented on YARN-7715:

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14161 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14161/])
YARN-7715. Support NM promotion/demotion of running containers. (Miklos (haibochen: rev 6341c3a437489737a9c4bf0911b218b0023d8dd9)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsMemoryResourceHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/TestContainerSchedulerQueuing.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/gpu/GpuResourceHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/fpga/FpgaResourceHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/ResourceHandlerChain.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/NetworkPacketTaggingHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsCpuResourceHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/ResourceHandler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsBlkioResourceHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TrafficControlBandwidthHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsMemoryResourceHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/TestResourcePluginManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/numa/NumaResourceHandlerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsCpuResourceHandlerImpl.java