[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.017.patch Attached ver.017 patch fixed unit tests, rest of unit tests should not related. I would leave white space warnings as-is because this is directly captured from nvidia-smi output. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch, YARN-6620.010.patch, YARN-6620.011.patch, > YARN-6620.012.patch, YARN-6620.013.patch, YARN-6620.014.patch, > YARN-6620.015.patch, YARN-6620.016.patch, YARN-6620.017.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.016.patch > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch, YARN-6620.010.patch, YARN-6620.011.patch, > YARN-6620.012.patch, YARN-6620.013.patch, YARN-6620.014.patch, > YARN-6620.015.patch, YARN-6620.016.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.015.patch Attached ver.015 patch. Major change is: - Fixed a bug in 014 which prevent GPU resources added to resource-types.xml - Update resource mapping of container once resources allocated, added unit test for this as well. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch, YARN-6620.010.patch, YARN-6620.011.patch, > YARN-6620.012.patch, YARN-6620.013.patch, YARN-6620.014.patch, > YARN-6620.015.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.014.patch Attached ver.14 patch. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch, YARN-6620.010.patch, YARN-6620.011.patch, > YARN-6620.012.patch, YARN-6620.013.patch, YARN-6620.014.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.013.patch Attached ver.13 patch, addressed javadocs warnings. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch, YARN-6620.010.patch, YARN-6620.011.patch, > YARN-6620.012.patch, YARN-6620.013.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.012.patch Attached ver.012 patch, [~sunilg] could you please help check the updated patch? > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch, YARN-6620.010.patch, YARN-6620.011.patch, > YARN-6620.012.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.011.patch Thanks [~zhengchenyu] for your review, I replied your comments here: bq. When storeAssignedResources failed, there will be a leak. Because postComplete will never be called. It means usedDevices won't be removed. So should add cleanupAssignGpus in catch section. Fixed and added UT. bq. I thinks NM recovery for gpu doesn't take effect. Becuase RecoveredContainerState's ResourceMappings didn't add to the recovered Container. It added to ContainerImpl in constructor: {code} // constructor for a recovered container public ContainerImpl(Configuration conf, Dispatcher dispatcher, ContainerLaunchContext launchContext, Credentials creds, NodeManagerMetrics metrics, ContainerTokenIdentifier containerTokenIdentifier, Context context, RecoveredContainerState rcs) { this(conf, dispatcher, launchContext, creds, metrics, containerTokenIdentifier, context, rcs.getStartTime()); // this.resourceMappings = rcs.getResourceMappings(); } {code} Please let me know if I missed anything. Attached ver.11 patch. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch, YARN-6620.010.patch, YARN-6620.011.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.010.patch Attached 010 patch, I tested in my local environment with 2 GPU devices, it can do proper GPU isolation on top of YARN-3926. (Will add draft of documentation soon). > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch, YARN-6620.010.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.009.patch Attached ver.9 patch, (hopefully) fixed Jenkins issues > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch, > YARN-6620.009.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.008.patch Rebased to trunk (008) > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch, YARN-6620.008.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.007.patch Attached ver.7 patch, added unit tests / javadocs. Also this patch is on top of YARN-3926 which can get GPU request from resource vector and set GPU resource value while registering to RM. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch, YARN-6620.007.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.006-WIP.patch Attached ver.6 WIP patch. Updates: 1) Switched JAXB to handle XML parsing instead of check tags. ([~devaraj.k] please review). 2) Discussed with [~tangzhankun] / [~zyluo] / [~shaneku...@gmail.com] offline, Added {{ResourcePlugin}} interface to make future adding new resource types support to NM could be easier. Minimum code changes of core NM are needed to support new resource types (like SSD/FPGA). Please refer to package: {{org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin}} TODO: 1) Integration to load/set resource value to {{Resource}} vector after YARN-3926 merge. 2) Integration to report resource value while registering/updating to NM. 3) Javadocs / unit test / compilation for new added classes / logics. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch, > YARN-6620.006-WIP.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.005.patch Attached completed GPU discovery logic (006). Will update the patch again once YARN-3926 get merged, some TODO items. 1) Report #GPUs when register to RM. 2) Use #GPUs specified in Resource object to enforce GPU isolation. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch, YARN-6620.005.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.004.patch Rebased on top of YARN-7033, and added WIP GPU detection logic. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch, YARN-6620.004.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.003.patch Attached ver.3 patch, fixed findbug warnings, etc. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch, > YARN-6620.003.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.002.patch Rebased to latest trunk (002) > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch, YARN-6620.002.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: YARN-6620.001.patch Attached ver.001 patch for review. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-6620.001.patch > > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Description: This JIRA plan to add support of: 1) GPU configuration for NodeManagers 2) Isolation in CGroups. (Java side). 3) NM restart and recovery allocated GPU devices was: This JIRA plan to add support of: 1) GPU configuration for NodeManagers 2) Isolation in CGroups. This is the minimal requirements to support GPU on YARN. > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Summary: [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups (was: [YARN-6223] Support GPU Configuration and Isolation in CGroups.) > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. > This is the minimal requirements to support GPU on YARN. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6620) [YARN-6223] NM Java side code changes to support isolate GPU devices by using CGroups
[ https://issues.apache.org/jira/browse/YARN-6620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6620: - Attachment: (was: YARN-6620.001.patch) > [YARN-6223] NM Java side code changes to support isolate GPU devices by using > CGroups > - > > Key: YARN-6620 > URL: https://issues.apache.org/jira/browse/YARN-6620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > > This JIRA plan to add support of: > 1) GPU configuration for NodeManagers > 2) Isolation in CGroups. (Java side). > 3) NM restart and recovery allocated GPU devices -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org