[jira] [Updated] (YARN-10226) NPE in Capacity Scheduler while using %primary_group queue mapping

2020-04-14 Thread Peter Bacsko (Jira)


[ https://issues.apache.org/jira/browse/YARN-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10226:

Attachment: YARN-10234-002.patch

> NPE in Capacity Scheduler while using %primary_group queue mapping
> --
>
> Key: YARN-10226
> URL: https://issues.apache.org/jira/browse/YARN-10226
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Critical
> Fix For: 3.3.0, 3.4.0
>
> Attachments: YARN-10226-001.patch
>
>
> If we use the following queue mapping:
> {{u:%user:%primary_group}}
> then we get an NPE inside the ResourceManager:
> {noformat}
> 2020-04-06 11:59:13,883 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(881)) - Failed to load/recover state
> java.lang.NullPointerException
>     at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.getQueue(CapacitySchedulerQueueManager.java:138)
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.getContextForPrimaryGroup(UserGroupMappingPlacementRule.java:163)
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.getPlacementForUser(UserGroupMappingPlacementRule.java:118)
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.getPlacementForApp(UserGroupMappingPlacementRule.java:227)
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.PlacementManager.placeApplication(PlacementManager.java:67)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.placeApplication(RMAppManager.java:827)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:378)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:367)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:594)
> ...
> {noformat}
> We need to check whether the parent queue is null in
> {{UserGroupMappingPlacementRule.getContextForPrimaryGroup()}}.
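
The top frame of the trace is ConcurrentHashMap.get, which throws a NullPointerException when it is passed a null key. The sketch below reproduces only that behaviour and shows the kind of null guard the description asks for. It is not the YARN code or the attached patch: the class QueueLookupSketch, the QUEUES map, and both lookup methods are hypothetical stand-ins, and the queue names are made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not taken from CapacitySchedulerQueueManager.
class QueueLookupSketch {
    // Assumed stand-in for a name -> queue-path map backed by ConcurrentHashMap.
    private static final Map<String, String> QUEUES = new ConcurrentHashMap<>();

    static {
        QUEUES.put("root", "root");
        QUEUES.put("users", "root.users");
    }

    // Unguarded lookup: ConcurrentHashMap.get(null) throws NullPointerException.
    static String getQueueUnguarded(String name) {
        return QUEUES.get(name);
    }

    // Guarded lookup: a missing (null) queue name is treated as "no such queue".
    static String getQueueGuarded(String name) {
        if (name == null) {
            return null;
        }
        return QUEUES.get(name);
    }

    public static void main(String[] args) {
        try {
            getQueueUnguarded(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NullPointerException");
        }
        System.out.println("guarded lookup for null: " + getQueueGuarded(null));
        System.out.println("guarded lookup for users: " + getQueueGuarded("users"));
    }
}
```

Whether the guard belongs in the caller (as the description suggests) or in the lookup itself is a design choice this sketch does not settle; it only shows that the null has to be caught before it reaches the map.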



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10226) NPE in Capacity Scheduler while using %primary_group queue mapping

2020-04-14 Thread Peter Bacsko (Jira)


[ https://issues.apache.org/jira/browse/YARN-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10226:

Attachment: (was: YARN-10234-002.patch)







[jira] [Updated] (YARN-10226) NPE in Capacity Scheduler while using %primary_group queue mapping

2020-04-09 Thread Sunil G (Jira)


[ https://issues.apache.org/jira/browse/YARN-10226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-10226:
---
Summary: NPE in Capacity Scheduler while using %primary_group queue mapping (was: NPE when using %primary_group queue mapping)

> NPE in Capacity Scheduler while using %primary_group queue mapping
> --
>
> Key: YARN-10226
> URL: https://issues.apache.org/jira/browse/YARN-10226
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Critical
> Attachments: YARN-10226-001.patch


