[jira] [Updated] (YARN-10201) Make AMRMProxyPolicy aware of SC load
[ https://issues.apache.org/jira/browse/YARN-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Young Chen updated YARN-10201:
------------------------------
    Attachment: YARN-10201.v2.patch

> Make AMRMProxyPolicy aware of SC load
> -------------------------------------
>
> Key: YARN-10201
> URL: https://issues.apache.org/jira/browse/YARN-10201
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: amrmproxy
> Reporter: Young Chen
> Assignee: Young Chen
> Priority: Major
> Attachments: YARN-10201.v0.patch, YARN-10201.v1.patch, YARN-10201.v2.patch
>
> LocalityMulticastAMRMProxyPolicy is currently unaware of SC load when
> splitting resource requests. We propose changes to the policy so that it
> receives feedback from SCs and can load balance requests across the federated
> cluster.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
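To make the idea in YARN-10201 concrete, here is a rough sketch of what load-aware splitting could look like. It is not taken from any of the attached patches. The class LoadAwareSplitDemo, the method split, the use of a pending-request count as the load signal, and the weighting 1 / (1 + load) are all assumptions chosen for illustration; the real policy may use a different signal and formula.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LoadAwareSplitDemo {
  // Hypothetical helper: splits `containers` across sub-clusters so that a
  // sub-cluster with a higher (non-negative) pending load gets a smaller share.
  // Weight per sub-cluster is 1 / (1 + pendingLoad); this formula is an assumption.
  static Map<String, Integer> split(int containers, Map<String, Integer> pendingLoad) {
    Map<String, Double> weights = new LinkedHashMap<>();
    double total = 0;
    for (Map.Entry<String, Integer> e : pendingLoad.entrySet()) {
      double w = 1.0 / (1 + e.getValue());
      weights.put(e.getKey(), w);
      total += w;
    }

    Map<String, Integer> result = new LinkedHashMap<>();
    int assigned = 0;
    String leastLoaded = null;
    for (Map.Entry<String, Double> e : weights.entrySet()) {
      // Round down so the shares never exceed the requested total.
      int share = (int) Math.floor(containers * e.getValue() / total);
      result.put(e.getKey(), share);
      assigned += share;
      if (leastLoaded == null || e.getValue() > weights.get(leastLoaded)) {
        leastLoaded = e.getKey();
      }
    }
    // Whatever is left after rounding goes to the least-loaded sub-cluster.
    if (leastLoaded != null) {
      result.merge(leastLoaded, containers - assigned, Integer::sum);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Integer> load = new LinkedHashMap<>();
    load.put("sc1", 0);
    load.put("sc2", 3);
    System.out.println(split(10, load));
  }
}
```

The sketch keeps the total number of requested containers unchanged and only shifts how they are distributed, which is the minimum a load-balancing split has to guarantee.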
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063433#comment-17063433 ]

Peter Bacsko commented on YARN-10198:
-------------------------------------

[~maniraj...@gmail.com] I was thinking about adding a comment about null-check. In short: it's not necessary, because instanceof simply returns false if the object is null.

> [managedParent].%primary_group mapping rule doesn't work after YARN-9868
> ------------------------------------------------------------------------
>
> Key: YARN-10198
> URL: https://issues.apache.org/jira/browse/YARN-10198
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Attachments: YARN-10198-001.patch, YARN-10198-002.patch, YARN-10198-003.patch
>
> YARN-9868 introduced an unnecessary check if we have the following placement
> rule:
> [managedParentQueue].%primary_group
> Here, {{%primary_group}} is expected to be created if it doesn't exist.
> However, there is this validation code which is not necessary:
> {noformat}
> } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
>   if (this.queueManager
>       .getQueue(groups.getGroups(user).get(0)) != null) {
>     return getPlacementContext(mapping,
>         groups.getGroups(user).get(0));
>   } else {
>     return null;
>   }
> {noformat}
> We should revert this part to the original version:
> {noformat}
> } else if (mapping.queue.equals(PRIMARY_GROUP_MAPPING)) {
>   return getPlacementContext(mapping,
>       groups.getGroups(user).get(0));
> }
> {noformat}
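The remark about instanceof in the YARN-10198 comment above can be checked in isolation. The following is a minimal sketch, not code from any of the attached patches: the class InstanceofNullDemo, the nested Queue and ManagedParentQueue types, and the helper isManagedParent are hypothetical stand-ins, not the real YARN classes.

```java
public class InstanceofNullDemo {
  // Hypothetical stand-ins for queue types; not the real YARN classes.
  static class Queue {}
  static class ManagedParentQueue extends Queue {}

  // No explicit null check is needed: `null instanceof T` evaluates to false
  // for every reference type T, so a null parent simply yields false.
  static boolean isManagedParent(Queue parent) {
    return parent instanceof ManagedParentQueue;
  }

  public static void main(String[] args) {
    System.out.println(isManagedParent(null));
    System.out.println(isManagedParent(new Queue()));
    System.out.println(isManagedParent(new ManagedParentQueue()));
  }
}
```

This behaviour is guaranteed by the Java language itself, so an extra `parent != null &&` in front of an instanceof test is redundant, although harmless.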
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063380#comment-17063380 ]

Manikandan R commented on YARN-10198:
-------------------------------------

Should we add a null check for {{parent}} after line 163 to handle mappings like u:%user:%primary_group? Other than this, looks good to me. Thanks [~pbacsko] and [~prabhujoseph].
[jira] [Commented] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063301#comment-17063301 ]

Prabhu Joseph commented on YARN-10198:
--------------------------------------

Thanks [~pbacsko] for the patch. The patch looks good to me, +1. Will wait for [~maniraj...@gmail.com] to review the latest patch.
[jira] [Updated] (YARN-10198) [managedParent].%primary_group mapping rule doesn't work after YARN-9868
[ https://issues.apache.org/jira/browse/YARN-10198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10198:
--------------------------------
    Attachment: YARN-10198-003.patch
[jira] [Updated] (YARN-10194) YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
[ https://issues.apache.org/jira/browse/YARN-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prabhu Joseph updated YARN-10194:
---------------------------------
    Attachment: YARN-10194-002.patch

> YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
> ----------------------------------------------------------------
>
> Key: YARN-10194
> URL: https://issues.apache.org/jira/browse/YARN-10194
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 3.3.0
> Reporter: Akhil PB
> Assignee: Prabhu Joseph
> Priority: Critical
> Attachments: YARN-10194-001.patch, YARN-10194-002.patch
>
> YARN RMWebServices /scheduler-conf/validate leaks ZK connections. The validation
> API creates a new CapacityScheduler and fails to close it after the validation.
> Every CapacityScheduler#init opens a MutableCSConfigurationProvider, which opens
> a ZKConfigurationStore and creates a ZK connection.
> *ZK LOGS*
> {code}
> -03-12 16:45:51,881 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,449 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,710 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:52,876 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [4 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:53,068 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:53,391 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [2 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,008 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,287 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> 2020-03-12 16:45:54,483 WARN org.apache.zookeeper.server.NIOServerCnxnFactory: [4 times] Error accepting new connection: Too many connections from /172.27.99.64 - max is 60
> {code}
> There is also another bug in ZKConfigurationStore, which does not handle
> close() of its ZKCuratorManager.
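The leak described in YARN-10194 follows a common pattern: a temporary object that owns a connection is created per request and never released. The sketch below illustrates the general fix under stated assumptions. ValidateAndCloseDemo and TempScheduler are hypothetical stand-ins, the connection is modelled as a simple counter, and this message does not show how the attached patches actually release the scheduler.

```java
public class ValidateAndCloseDemo {
  // Hypothetical stand-in for a scheduler that opens one connection when created.
  static class TempScheduler implements AutoCloseable {
    static int openConnections = 0;

    TempScheduler() {
      openConnections++; // models opening a ZK connection during init
    }

    boolean validate(String conf) {
      if (conf == null || conf.isEmpty()) {
        throw new IllegalArgumentException("empty configuration");
      }
      return true;
    }

    @Override
    public void close() {
      openConnections--; // models closing the ZK connection
    }
  }

  // try-with-resources closes the temporary scheduler on both the success
  // path and the exception path, so repeated calls do not accumulate connections.
  static boolean validate(String conf) {
    try (TempScheduler scheduler = new TempScheduler()) {
      return scheduler.validate(conf);
    }
  }

  public static void main(String[] args) {
    for (int i = 0; i < 100; i++) {
      validate("root.queues=default");
    }
    System.out.println("open connections: " + TempScheduler.openConnections);
  }
}
```

Without the close step, each call in the loop would leave one connection open, which matches the "Too many connections" warnings quoted in the issue.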
[jira] [Comment Edited] (YARN-9995) Code cleanup in TestSchedConfCLI
[ https://issues.apache.org/jira/browse/YARN-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063148#comment-17063148 ]

Bilwa S T edited comment on YARN-9995 at 3/20/20, 7:51 AM:
-----------------------------------------------------------

Hi [~snemeth], do you mean something like below?
{code:java}
Object getSchedUpdateInfoValues(SchedConfUpdateInfo schedUpdateInfo, String methodname) {
  Method method = null;
  try {
    method = SchedConfUpdateInfo.class.getMethod(methodname);
    return method.invoke(schedUpdateInfo);
  } catch (Exception ex) {
    ex.printStackTrace();
  }
  return null;
}
{code}

was (Author: bilwast):
Hi [~snemeth], do you mean something like below?
{quote}
Object getValues(SchedConfUpdateInfo schedUpdateInfo, String methodname) {
  Method method = null;
  try {
    method = SchedConfUpdateInfo.class.getMethod(methodname);
    return method.invoke(schedUpdateInfo);
  } catch (Exception ex) {
    ex.printStackTrace();
  }
  return null;
}
{quote}

> Code cleanup in TestSchedConfCLI
> --------------------------------
>
> Key: YARN-9995
> URL: https://issues.apache.org/jira/browse/YARN-9995
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Bilwa S T
> Priority: Minor
>
> Some tests are too verbose:
> - add / delete / remove queues testcases: Creating SchedConfUpdateInfo
> instances could be simplified with a helper method or something like that.
> - Some fields can be converted to local variables: sysOutStream, sysOut,
> sysErr, csConf
> - Any additional cleanup
[jira] [Commented] (YARN-9995) Code cleanup in TestSchedConfCLI
[ https://issues.apache.org/jira/browse/YARN-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063148#comment-17063148 ]

Bilwa S T commented on YARN-9995:
---------------------------------

Hi [~snemeth], do you mean something like below?
{quote}
Object getValues(SchedConfUpdateInfo schedUpdateInfo, String methodname) {
  Method method = null;
  try {
    method = SchedConfUpdateInfo.class.getMethod(methodname);
    return method.invoke(schedUpdateInfo);
  } catch (Exception ex) {
    ex.printStackTrace();
  }
  return null;
}
{quote}
[jira] [Issue Comment Deleted] (YARN-9995) Code cleanup in TestSchedConfCLI
[ https://issues.apache.org/jira/browse/YARN-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-9995:
----------------------------
    Comment: was deleted

(was: Hi [~snemeth], do you mean something like below?
{quote}
Object getSchedUpdateInfoValues(SchedConfUpdateInfo schedUpdateInfo, String methodname) {
  Method method = null;
  try {
    method = SchedConfUpdateInfo.class.getMethod(methodname);
    return method.invoke(schedUpdateInfo);
  } catch (Exception ex) {
    ex.printStackTrace();
  }
  return null;
}
{quote})
[jira] [Comment Edited] (YARN-9995) Code cleanup in TestSchedConfCLI
[ https://issues.apache.org/jira/browse/YARN-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063146#comment-17063146 ]

Bilwa S T edited comment on YARN-9995 at 3/20/20, 7:44 AM:
-----------------------------------------------------------

Hi [~snemeth], do you mean something like below?
{quote}
Object getSchedUpdateInfoValues(SchedConfUpdateInfo schedUpdateInfo, String methodname) {
  Method method = null;
  try {
    method = SchedConfUpdateInfo.class.getMethod(methodname);
    return method.invoke(schedUpdateInfo);
  } catch (Exception ex) {
    ex.printStackTrace();
  }
  return null;
}
{quote}

was (Author: bilwast):
Hi [~snemeth], do you mean something like below?
Object getSchedUpdateInfoValues(SchedConfUpdateInfo schedUpdateInfo, String methodname) {
  Method method = null;
  try {
    method = SchedConfUpdateInfo.class.getMethod(methodname);
    return method.invoke(schedUpdateInfo);
  } catch (Exception ex) {
    ex.printStackTrace();
  }
  return null;
}
[jira] [Commented] (YARN-9995) Code cleanup in TestSchedConfCLI
[ https://issues.apache.org/jira/browse/YARN-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063146#comment-17063146 ]

Bilwa S T commented on YARN-9995:
---------------------------------

Hi [~snemeth], do you mean something like below?
Object getSchedUpdateInfoValues(SchedConfUpdateInfo schedUpdateInfo, String methodname) {
  Method method = null;
  try {
    method = SchedConfUpdateInfo.class.getMethod(methodname);
    return method.invoke(schedUpdateInfo);
  } catch (Exception ex) {
    ex.printStackTrace();
  }
  return null;
}
[jira] [Commented] (YARN-10194) YARN RMWebServices /scheduler-conf/validate leaks ZK Connections
[ https://issues.apache.org/jira/browse/YARN-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063144#comment-17063144 ]

Sunil G commented on YARN-10194:
--------------------------------

[~prabhujoseph] please rebase. This patch no longer applies.
[jira] [Created] (YARN-10203) Stuck in express_upgrading if there is any component which has no instance
kyungwan nam created YARN-10203:
--------------------------------

Summary: Stuck in express_upgrading if there is any component which has no instance
Key: YARN-10203
URL: https://issues.apache.org/jira/browse/YARN-10203
Project: Hadoop YARN
Issue Type: Bug
Reporter: kyungwan nam
Assignee: kyungwan nam

I was trying the "express upgrade" that was introduced in YARN-8298.
https://hadoop.apache.org/docs/r3.2.0/hadoop-yarn/hadoop-yarn-site/yarn-service/ServiceUpgrade.html

However, the service state got stuck in EXPRESS_UPGRADING.
This happens only if there is a component that has no instance ("number_of_containers" : 0).
A component that has no instance should be excluded from the upgrade targets.
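The fix proposed in YARN-10203 amounts to skipping components with zero containers when the upgrade targets are collected. A minimal sketch of that filtering step follows. It is not code from the YARN service framework: UpgradeTargetDemo, its nested Component class, and the upgradeTargets helper are hypothetical names used only to illustrate the rule.

```java
import java.util.ArrayList;
import java.util.List;

public class UpgradeTargetDemo {
  // Hypothetical stand-in for a component entry in a service spec.
  static class Component {
    final String name;
    final long numberOfContainers;

    Component(String name, long numberOfContainers) {
      this.name = name;
      this.numberOfContainers = numberOfContainers;
    }
  }

  // Returns the names of the components that should take part in the upgrade.
  // Components with no instances are skipped, because no instance will ever
  // report that it finished upgrading and the service would wait forever.
  static List<String> upgradeTargets(List<Component> components) {
    List<String> targets = new ArrayList<>();
    for (Component c : components) {
      if (c.numberOfContainers > 0) {
        targets.add(c.name);
      }
    }
    return targets;
  }

  public static void main(String[] args) {
    List<Component> components = new ArrayList<>();
    components.add(new Component("web", 2));
    components.add(new Component("worker", 0));
    System.out.println(upgradeTargets(components));
  }
}
```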