[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-18 Thread Bilwa S T (Jira)
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304629#comment-17304629 ] Bilwa S T commented on YARN-10697: -- Thanks [~Jim_Brennan] [~jhung] for your comments. I basically added

[jira] [Created] (YARN-10704) The CS effective capacity for absolute mode in UI should support GPU.

2021-03-18 Thread Qi Zhu (Jira)
Qi Zhu created YARN-10704: - Summary: The CS effective capacity for absolute mode in UI should support GPU. Key: YARN-10704 URL: https://issues.apache.org/jira/browse/YARN-10704 Project: Hadoop YARN

[jira] [Commented] (YARN-10616) Nodemanagers cannot detect GPU failures

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304596#comment-17304596 ] Qi Zhu commented on YARN-10616: --- Thanks [~ebadger] for clarifying. It makes sense to me now. If we can realize

[jira] [Updated] (YARN-10701) The yarn.resource-types should support multi types without trimmed.

2021-03-18 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-10701: --- Fix Version/s: 3.3.1 3.4.0 +1. Thanks for the patch, [~zhuqi]. I've committed

[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-18 Thread Jonathan Hung (Jira)
[ https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304467#comment-17304467 ] Jonathan Hung commented on YARN-10697: -- [~Jim_Brennan] [~BilwaST] I agree, I don't think we should

[jira] [Comment Edited] (YARN-10616) Nodemanagers cannot detect GPU failures

2021-03-18 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304456#comment-17304456 ] Eric Badger edited comment on YARN-10616 at 3/18/21, 9:22 PM: -- The issue

[jira] [Commented] (YARN-10616) Nodemanagers cannot detect GPU failures

2021-03-18 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304456#comment-17304456 ] Eric Badger commented on YARN-10616: The issue with graceful decommissioning is that you have to edit

[jira] [Commented] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups

2021-03-18 Thread Ahmed Hussein (Jira)
[ https://issues.apache.org/jira/browse/YARN-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304445#comment-17304445 ] Ahmed Hussein commented on YARN-10597: -- Thanks [~shuzirra] for the patch. It is fine to ignore the

[jira] [Commented] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-18 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304408#comment-17304408 ] Hadoop QA commented on YARN-10702: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem

[jira] [Commented] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-18 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304387#comment-17304387 ] Hadoop QA commented on YARN-10702: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem

[jira] [Commented] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304337#comment-17304337 ] Hadoop QA commented on YARN-10674: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem

[jira] [Commented] (YARN-10495) make the rpath of container-executor configurable

2021-03-18 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304333#comment-17304333 ] Eric Badger commented on YARN-10495: I would suggest using a dockerfile with the same OS version as

[jira] [Updated] (YARN-10703) Fix potential null pointer error of gpuNodeResourceUpdateHandler in NodeResourceMonitorImpl.

2021-03-18 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-10703: --- Fix Version/s: 3.3.1 I've also committed this to branch-3.3. This has now been committed to trunk

[jira] [Updated] (YARN-10692) Add Node GPU Utilization and apply to NodeMetrics.

2021-03-18 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-10692: --- Fix Version/s: 3.3.1 I cherry-picked this to branch-3.3 I would like all of the GPU stuff to go back

[jira] [Updated] (YARN-10641) Refactor the max app related update, and fix maxApplications update error when add new queues.

2021-03-18 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-10641: -- Summary: Refactor the max app related update, and fix maxApplications update error when add

[jira] [Commented] (YARN-10703) Fix potential null pointer error of gpuNodeResourceUpdateHandler in NodeResourceMonitorImpl.

2021-03-18 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304313#comment-17304313 ] Eric Badger commented on YARN-10703: +1 I've committed this to trunk (3.4) > Fix potential null

[jira] [Updated] (YARN-10703) Fix potential null pointer error of gpuNodeResourceUpdateHandler in NodeResourceMonitorImpl.

2021-03-18 Thread Eric Badger (Jira)
[ https://issues.apache.org/jira/browse/YARN-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-10703: --- Fix Version/s: 3.4.0 > Fix potential null pointer error of gpuNodeResourceUpdateHandler in >

[jira] [Commented] (YARN-10703) Fix potential null pointer error of gpuNodeResourceUpdateHandler in NodeResourceMonitorImpl.

2021-03-18 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304282#comment-17304282 ] Hadoop QA commented on YARN-10703: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Commented] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Hadoop QA (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304253#comment-17304253 ] Hadoop QA commented on YARN-10674: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem ||

[jira] [Updated] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-18 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-10702: --- Attachment: YARN-10702.004.patch > Add cluster metric for amount of CPU used by RM Event Processor >

[jira] [Commented] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-18 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304250#comment-17304250 ] Jim Brennan commented on YARN-10702: Jumped the gun. Patch 004 has fixes for the other checkstyle

[jira] [Commented] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-18 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304240#comment-17304240 ] Jim Brennan commented on YARN-10702: Thanks for the review [~zhuqi]! patch 003 fixes the method

[jira] [Updated] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-18 Thread Jim Brennan (Jira)
[ https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-10702: --- Attachment: YARN-10702.003.patch > Add cluster metric for amount of CPU used by RM Event Processor >

[jira] [Commented] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304228#comment-17304228 ] Qi Zhu commented on YARN-10674: --- [~gandras] Now I understand you; we can just use the code: {code:java}

[jira] [Updated] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10674: -- Attachment: YARN-10674.016.patch > fs2cs: should support auto created queue deletion. >

[jira] [Commented] (YARN-10703) Fix potential null pointer error of gpuNodeResourceUpdateHandler in NodeResourceMonitorImpl.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304214#comment-17304214 ] Qi Zhu commented on YARN-10703: --- [~pbacsko] [~gandras] [~ebadger] Sorry for the potential null pointer

[jira] [Created] (YARN-10703) Fix potential null pointer error of gpuNodeResourceUpdateHandler in NodeResourceMonitorImpl.

2021-03-18 Thread Qi Zhu (Jira)
Qi Zhu created YARN-10703: - Summary: Fix potential null pointer error of gpuNodeResourceUpdateHandler in NodeResourceMonitorImpl. Key: YARN-10703 URL: https://issues.apache.org/jira/browse/YARN-10703

[jira] [Comment Edited] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304142#comment-17304142 ] Qi Zhu edited comment on YARN-10674 at 3/18/21, 1:29 PM: - Thanks [~gandras] for

[jira] [Comment Edited] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304142#comment-17304142 ] Qi Zhu edited comment on YARN-10674 at 3/18/21, 1:26 PM: - Thanks [~gandras] for

[jira] [Comment Edited] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304142#comment-17304142 ] Qi Zhu edited comment on YARN-10674 at 3/18/21, 1:24 PM: - Thanks [~gandras] for

[jira] [Commented] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304142#comment-17304142 ] Qi Zhu commented on YARN-10674: --- Thanks [~gandras] for the reply. If we don't have PreemptionMode.ENABLED, we

[jira] [Commented] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Andras Gyori (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304130#comment-17304130 ] Andras Gyori commented on YARN-10674: - These are valid suggestions [~pbacsko] and my idea was this.

[jira] [Commented] (YARN-10701) The yarn.resource-types should support multi types without trimmed.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304124#comment-17304124 ] Qi Zhu commented on YARN-10701: --- Thanks [~gandras] for confirming. [~pbacsko] Could you help review

[jira] [Commented] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304122#comment-17304122 ] Qi Zhu commented on YARN-10674: --- Thanks a lot [~pbacsko] for the patient review. Very good suggestion, it makes

[jira] [Updated] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10674: -- Attachment: YARN-10674.015.patch > fs2cs: should support auto created queue deletion. >

[jira] [Commented] (YARN-10641) Refactor the max app related update, and fix maxApllications update error when add new queues.

2021-03-18 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304117#comment-17304117 ] Peter Bacsko commented on YARN-10641: - +1 Thanks for the patch [~zhuqi] and [~gandras] for the

[jira] [Commented] (YARN-10692) Add Node GPU Utilization and apply to NodeMetrics.

2021-03-18 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304089#comment-17304089 ] Peter Bacsko commented on YARN-10692: - Thanks [~zhuqi] for the patch, committed to trunk. > Add Node

[jira] [Commented] (YARN-10692) Add Node GPU Utilization and apply to NodeMetrics.

2021-03-18 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304078#comment-17304078 ] Peter Bacsko commented on YARN-10692: - +1 LGTM. Committing this soon. > Add Node GPU Utilization

[jira] [Commented] (YARN-10659) Improve CS MappingRule %secondary_group evaluation

2021-03-18 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304077#comment-17304077 ] Szilard Nemeth commented on YARN-10659: --- Thanks [~shuzirra] for working on this. Latest patch LGTM,

[jira] [Updated] (YARN-10659) Improve CS MappingRule %secondary_group evaluation

2021-03-18 Thread Szilard Nemeth (Jira)
[ https://issues.apache.org/jira/browse/YARN-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-10659: -- Fix Version/s: 3.4.0 > Improve CS MappingRule %secondary_group evaluation >

[jira] [Commented] (YARN-10685) Fix typos in AbstractCSQueue

2021-03-18 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304041#comment-17304041 ] Peter Bacsko commented on YARN-10685: - +1 thanks [~zhuqi] for the patch, committed to trunk. > Fix

[jira] [Updated] (YARN-10685) Fix typos in AbstractCSQueue

2021-03-18 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10685: Summary: Fix typos in AbstractCSQueue (was: Fixed some Typo in AbstractCSQueue.) > Fix typos in

[jira] [Commented] (YARN-10674) fs2cs: should support auto created queue deletion.

2021-03-18 Thread Peter Bacsko (Jira)
[ https://issues.apache.org/jira/browse/YARN-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304027#comment-17304027 ] Peter Bacsko commented on YARN-10674: - Thanks [~zhuqi] for the patch. I think we are very close. I

[jira] [Comment Edited] (YARN-10641) Refactor the max app related update, and fix maxApllications update error when add new queues.

2021-03-18 Thread Qi Zhu (Jira)
[ https://issues.apache.org/jira/browse/YARN-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17297081#comment-17297081 ] Qi Zhu edited comment on YARN-10641 at 3/18/21, 9:57 AM: - [~pbacsko] [~gandras]

[jira] [Commented] (YARN-10659) Improve CS MappingRule %secondary_group evaluation

2021-03-18 Thread Andras Gyori (Jira)
[ https://issues.apache.org/jira/browse/YARN-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303996#comment-17303996 ] Andras Gyori commented on YARN-10659: - Thanks [~shuzirra], the patch looks good to me now +1 non