[jira] [Updated] (YARN-11103) SLS cleanup after previously merged SLS refactor jiras
[ https://issues.apache.org/jira/browse/YARN-11103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11103: -- Affects Version/s: 3.4.0 > SLS cleanup after previously merged SLS refactor jiras > -- > > Key: YARN-11103 > URL: https://issues.apache.org/jira/browse/YARN-11103 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler-load-simulator >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Several jiras moved SLS code around in order to have a more > readable SLSRunner. > Mostly, the code fragments were just moved to separate classes. > Most of the issues that came up were reported only because our build system flagged the moved code > as failures; they were part of the original code, so they were not newly > introduced issues. > There were some comments about fixing these. Here are all of the ones I found; > we need to fix them (if they are not yet fixed): > * > https://issues.apache.org/jira/browse/YARN-10548?focusedCommentId=17512336=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17512336 > https://issues.apache.org/jira/browse/YARN-10548?focusedCommentId=17513012=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17513012 > https://issues.apache.org/jira/browse/YARN-10552?focusedCommentId=17511762=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17511762 > https://issues.apache.org/jira/browse/YARN-10552?focusedCommentId=17390981=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17390981 > https://issues.apache.org/jira/browse/YARN-10547?focusedCommentId=17510839=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17510839 > 
https://issues.apache.org/jira/browse/YARN-11094?focusedCommentId=17512324=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17512324 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11103) SLS cleanup after previously merged SLS refactor jiras
[ https://issues.apache.org/jira/browse/YARN-11103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11103: -- Target Version/s: 3.4.0 > SLS cleanup after previously merged SLS refactor jiras > -- > > Key: YARN-11103 > URL: https://issues.apache.org/jira/browse/YARN-11103 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler-load-simulator >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Several jiras moved SLS code around in order to have a more > readable SLSRunner. > Mostly, the code fragments were just moved to separate classes. > Most of the issues that came up were reported only because our build system flagged the moved code > as failures; they were part of the original code, so they were not newly > introduced issues. > There were some comments about fixing these. Here are all of the ones I found; > we need to fix them (if they are not yet fixed): > * > https://issues.apache.org/jira/browse/YARN-10548?focusedCommentId=17512336=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17512336 > https://issues.apache.org/jira/browse/YARN-10548?focusedCommentId=17513012=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17513012 > https://issues.apache.org/jira/browse/YARN-10552?focusedCommentId=17511762=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17511762 > https://issues.apache.org/jira/browse/YARN-10552?focusedCommentId=17390981=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17390981 > https://issues.apache.org/jira/browse/YARN-10547?focusedCommentId=17510839=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17510839 > 
https://issues.apache.org/jira/browse/YARN-11094?focusedCommentId=17512324=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17512324
[jira] [Updated] (YARN-11106) Fix the test failure due to missing conf of yarn.resourcemanager.node-labels.am.default-node-label-expression
[ https://issues.apache.org/jira/browse/YARN-11106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11106: -- Component/s: test > Fix the test failure due to missing conf of > yarn.resourcemanager.node-labels.am.default-node-label-expression > - > > Key: YARN-11106 > URL: https://issues.apache.org/jira/browse/YARN-11106 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Affects Versions: 3.4.0 >Reporter: Junfan Zhang >Assignee: Junfan Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Updated] (YARN-11106) Fix the test failure due to missing conf of yarn.resourcemanager.node-labels.am.default-node-label-expression
[ https://issues.apache.org/jira/browse/YARN-11106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11106: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Fix the test failure due to missing conf of > yarn.resourcemanager.node-labels.am.default-node-label-expression > - > > Key: YARN-11106 > URL: https://issues.apache.org/jira/browse/YARN-11106 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Junfan Zhang >Assignee: Junfan Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Updated] (YARN-11107) When NodeLabel is enabled for a YARN cluster, AM blacklist program does not work properly
[ https://issues.apache.org/jira/browse/YARN-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11107: -- Hadoop Flags: Reviewed Target Version/s: 3.4.0 > When NodeLabel is enabled for a YARN cluster, AM blacklist program does not > work properly > - > > Key: YARN-11107 > URL: https://issues.apache.org/jira/browse/YARN-11107 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.2, 3.3.0 >Reporter: Xiping Zhang >Assignee: Xiping Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h > Remaining Estimate: 0h > > YARN NodeLabel is enabled in our production environment. We encountered an > application whose AM blacklisted all NMs corresponding to the label in the > queue, so other applications in the queue could not obtain computing > resources. We found that the RM printed many "Trying to fulfill > reservation for application..." log messages.
[jira] [Updated] (YARN-11111) Recovery failure when node-label configure-type transit from delegated-centralized to centralized
[ https://issues.apache.org/jira/browse/YARN-11111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11111: -- Hadoop Flags: Reviewed > Recovery failure when node-label configure-type transit from > delegated-centralized to centralized > - > > Key: YARN-11111 > URL: https://issues.apache.org/jira/browse/YARN-11111 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.4.0 >Reporter: Junfan Zhang >Assignee: Junfan Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h > Remaining Estimate: 0h > > When I change configure-type from delegated-centralized to centralized in > yarn-site.xml and restart the RM, it fails. > The error stack trace is as follows: > > {code:txt} > 2022-04-13 14:44:14,885 WARN org.apache.hadoop.ha.ActiveStandbyElector: > Exception handling the winning of election > org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146) > at > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:901) > at > org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:476) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:610) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508) > Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when > transitioning to Active mode > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:333) > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) > ... 
4 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.ReplaceLabelsOnNodeRequestPBImpl.initNodeToLabels(ReplaceLabelsOnNodeRequestPBImpl.java:61) > at > org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.ReplaceLabelsOnNodeRequestPBImpl.getNodeToLabels(ReplaceLabelsOnNodeRequestPBImpl.java:138) > at > org.apache.hadoop.yarn.nodelabels.store.op.NodeLabelMirrorOp.recover(NodeLabelMirrorOp.java:76) > at > org.apache.hadoop.yarn.nodelabels.store.op.NodeLabelMirrorOp.recover(NodeLabelMirrorOp.java:41) > at > org.apache.hadoop.yarn.nodelabels.store.AbstractFSNodeStore.loadFromMirror(AbstractFSNodeStore.java:120) > at > org.apache.hadoop.yarn.nodelabels.store.AbstractFSNodeStore.recoverFromStore(AbstractFSNodeStore.java:149) > at > org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.recover(FileSystemNodeLabelsStore.java:106) > at > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.initNodeLabelStore(CommonNodeLabelsManager.java:252) > at > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStart(CommonNodeLabelsManager.java:266) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:910) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1278) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1319) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1315) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1315) > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:328) > ... 5 more > 2022-04-13 14:44:14,886 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Trying to re-establish ZK session > {code} > When I dug into the codebase, I found that the node-to-labels mapping is > stored in the nodelabel.mirror file when the configure-type is centralized. > So the content of nodelabel.mirror
[jira] [Updated] (YARN-11111) Recovery failure when node-label configure-type transit from delegated-centralized to centralized
[ https://issues.apache.org/jira/browse/YARN-11111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11111: -- Component/s: yarn > Recovery failure when node-label configure-type transit from > delegated-centralized to centralized > - > > Key: YARN-11111 > URL: https://issues.apache.org/jira/browse/YARN-11111 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.4.0 >Reporter: Junfan Zhang >Assignee: Junfan Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h > Remaining Estimate: 0h > > When I change configure-type from delegated-centralized to centralized in > yarn-site.xml and restart the RM, it fails. > The error stack trace is as follows: > > {code:txt} > 2022-04-13 14:44:14,885 WARN org.apache.hadoop.ha.ActiveStandbyElector: > Exception handling the winning of election > org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146) > at > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:901) > at > org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:476) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:610) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508) > Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when > transitioning to Active mode > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:333) > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) > ... 
4 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.ReplaceLabelsOnNodeRequestPBImpl.initNodeToLabels(ReplaceLabelsOnNodeRequestPBImpl.java:61) > at > org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.ReplaceLabelsOnNodeRequestPBImpl.getNodeToLabels(ReplaceLabelsOnNodeRequestPBImpl.java:138) > at > org.apache.hadoop.yarn.nodelabels.store.op.NodeLabelMirrorOp.recover(NodeLabelMirrorOp.java:76) > at > org.apache.hadoop.yarn.nodelabels.store.op.NodeLabelMirrorOp.recover(NodeLabelMirrorOp.java:41) > at > org.apache.hadoop.yarn.nodelabels.store.AbstractFSNodeStore.loadFromMirror(AbstractFSNodeStore.java:120) > at > org.apache.hadoop.yarn.nodelabels.store.AbstractFSNodeStore.recoverFromStore(AbstractFSNodeStore.java:149) > at > org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.recover(FileSystemNodeLabelsStore.java:106) > at > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.initNodeLabelStore(CommonNodeLabelsManager.java:252) > at > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStart(CommonNodeLabelsManager.java:266) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:910) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1278) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1319) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1315) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1315) > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:328) > ... 5 more > 2022-04-13 14:44:14,886 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Trying to re-establish ZK session > {code} > When I dug into the codebase, I found that the node-to-labels mapping is > stored in the nodelabel.mirror file when the configure-type is centralized. > So the content of nodelabel.mirror file
[jira] [Updated] (YARN-11111) Recovery failure when node-label configure-type transit from delegated-centralized to centralized
[ https://issues.apache.org/jira/browse/YARN-11111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11111: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Recovery failure when node-label configure-type transit from > delegated-centralized to centralized > - > > Key: YARN-11111 > URL: https://issues.apache.org/jira/browse/YARN-11111 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Junfan Zhang >Assignee: Junfan Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h > Remaining Estimate: 0h > > When I change configure-type from delegated-centralized to centralized in > yarn-site.xml and restart the RM, it fails. > The error stack trace is as follows: > > {code:txt} > 2022-04-13 14:44:14,885 WARN org.apache.hadoop.ha.ActiveStandbyElector: > Exception handling the winning of election > org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146) > at > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:901) > at > org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:476) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:610) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508) > Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when > transitioning to Active mode > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:333) > at > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144) > ... 
4 more > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.ReplaceLabelsOnNodeRequestPBImpl.initNodeToLabels(ReplaceLabelsOnNodeRequestPBImpl.java:61) > at > org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.ReplaceLabelsOnNodeRequestPBImpl.getNodeToLabels(ReplaceLabelsOnNodeRequestPBImpl.java:138) > at > org.apache.hadoop.yarn.nodelabels.store.op.NodeLabelMirrorOp.recover(NodeLabelMirrorOp.java:76) > at > org.apache.hadoop.yarn.nodelabels.store.op.NodeLabelMirrorOp.recover(NodeLabelMirrorOp.java:41) > at > org.apache.hadoop.yarn.nodelabels.store.AbstractFSNodeStore.loadFromMirror(AbstractFSNodeStore.java:120) > at > org.apache.hadoop.yarn.nodelabels.store.AbstractFSNodeStore.recoverFromStore(AbstractFSNodeStore.java:149) > at > org.apache.hadoop.yarn.nodelabels.FileSystemNodeLabelsStore.recover(FileSystemNodeLabelsStore.java:106) > at > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.initNodeLabelStore(CommonNodeLabelsManager.java:252) > at > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager.serviceStart(CommonNodeLabelsManager.java:266) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:910) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1278) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1319) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1315) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1315) > at > org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:328) > ... 5 more > 2022-04-13 14:44:14,886 INFO org.apache.hadoop.ha.ActiveStandbyElector: > Trying to re-establish ZK session > {code} > When I dug into the codebase, I found that the node-to-labels mapping is > stored in the nodelabel.mirror file when the configure-type is centralized. > So the content of
[jira] [Updated] (YARN-11116) Migrate Times util from SimpleDateFormat to thread-safe DateTimeFormatter class
[ https://issues.apache.org/jira/browse/YARN-11116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11116: -- Hadoop Flags: Reviewed Target Version/s: 3.3.5, 3.4.0 Affects Version/s: 3.3.5 3.4.0 > Migrate Times util from SimpleDateFormat to thread-safe DateTimeFormatter > class > --- > > Key: YARN-11116 > URL: https://issues.apache.org/jira/browse/YARN-11116 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.4.0, 3.3.5 >Reporter: Jonathan Turner Eagles >Assignee: Jonathan Turner Eagles >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0, 3.2.4, 3.3.5 > > Attachments: YARN-11116.001.perftest.patch > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Came across a stack trace with SimpleDateFormat in it, which led me to > investigate current practices. > > {noformat} > "IPC Server handler 29 on 8032" #797 daemon prio=5 os_prio=0 > tid=0x7fb6527d nid=0x953b runnable [0x7fb5ba034000] > java.lang.Thread.State: RUNNABLE > at org.apache.hadoop.yarn.util.Times.formatISO8601(Times.java:95) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.createAndGetApplicationReport(RMAppImpl.java:810) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:396) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:224) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:529) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:500) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1069) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1003) > at 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:936) > at java.security.AccessController.doPrivileged(Native Method) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:2135) > at > org.apache.hadoop.security.UserGroupInformation.doAsPrivileged(UserGroupInformation.java:2123) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2875) > {noformat} > > DateTimeFormatter is thread-safe, so there is no need to wrap it in a > ThreadLocal; instances can be reused safely across threads. In addition, the new > classes are slightly more performant.
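The thread-safety point in YARN-11116 can be illustrated with a minimal, standalone sketch. It is not the actual Times implementation: the class name `SharedFormatterSketch` and the pattern string are assumptions chosen for illustration. It only shows that a single static DateTimeFormatter can be shared by several threads without a ThreadLocal wrapper, which SimpleDateFormat would need.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch class; not part of Hadoop.
class SharedFormatterSketch {
  // One immutable, thread-safe formatter shared by all callers.
  // The pattern and the UTC zone are assumptions made for this example.
  private static final DateTimeFormatter ISO8601 =
      DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSSZ")
          .withZone(ZoneOffset.UTC);

  // Formats an epoch-millisecond timestamp using the shared formatter.
  static String formatISO8601(long epochMillis) {
    return ISO8601.format(Instant.ofEpochMilli(epochMillis));
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    List<Callable<String>> tasks = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      final long ts = i * 1000L;
      // Every task uses the same formatter instance concurrently.
      tasks.add(() -> formatISO8601(ts));
    }
    for (Future<String> result : pool.invokeAll(tasks)) {
      System.out.println(result.get());
    }
    pool.shutdown();
  }
}
```

With SimpleDateFormat the same sharing would require a ThreadLocal or external synchronization, because its instances hold mutable parsing and formatting state.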
[jira] [Updated] (YARN-11114) RMWebServices returns only apps matching exactly the submitted queue name
[ https://issues.apache.org/jira/browse/YARN-11114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11114: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > RMWebServices returns only apps matching exactly the submitted queue name > - > > Key: YARN-11114 > URL: https://issues.apache.org/jira/browse/YARN-11114 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, webapp >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > I've added 2 testcases that demonstrate the issue with [this > commit|https://github.com/szilard-nemeth/hadoop/commit/88dcf40f4dab564477542b8efb82f4f20d132eee]. > 1. With 'testAppsQueryByQueueShortname', there's a finishedApp submitted to > "root.default" and there's a runningApp that is submitted to "default". > The testcase queries the apps by queue name "default" and the response only > contains the runningApp, which is submitted to "default", so the other app > that is submitted to "root.default" is not returned. > 2. With 'testAppsQueryByQueueFullname', there's a finishedApp submitted to > "root.default" and there's a runningApp that is submitted to "default" (same > setup as above). > The testcase queries the apps by queue name "root.default" (which is the full > queue path) and the response only contains the finishedApp, which is > submitted to "root.default", so the other app that is submitted to "default" > is not returned. > A trivial conclusion of this is that the response includes only those applications > whose submission queue name exactly matches the queried name, > whether that name was specified explicitly at submission or resolved by the > placement engine. 
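The exact-match behaviour described by the two testcases can be reproduced with a small standalone sketch. It uses no YARN classes: `QueueFilterSketch`, `App` and `appsByExactQueue` are hypothetical names introduced only for illustration, and the filter is a plain string comparison standing in for whatever RMWebServices does internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch class; not part of Hadoop.
class QueueFilterSketch {
  // Minimal stand-in for an application and the queue name it was submitted with.
  static final class App {
    final String name;
    final String submittedQueue;

    App(String name, String submittedQueue) {
      this.name = name;
      this.submittedQueue = submittedQueue;
    }
  }

  // Returns the names of apps whose submitted queue string equals the query exactly.
  static List<String> appsByExactQueue(List<App> apps, String queue) {
    return apps.stream()
        .filter(app -> app.submittedQueue.equals(queue))
        .map(app -> app.name)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Same setup as the two testcases: one app per spelling of the same queue.
    List<App> apps = Arrays.asList(
        new App("finishedApp", "root.default"),
        new App("runningApp", "default"));

    System.out.println(appsByExactQueue(apps, "default"));      // [runningApp]
    System.out.println(appsByExactQueue(apps, "root.default")); // [finishedApp]
  }
}
```

Neither query returns both apps, even though "default" and "root.default" name the same queue in this setup.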
> Before YARN-9879 was implemented, Capacity Scheduler was only capable of > defining a leaf queue with a specific name in the whole hierarchy once, > meaning that leaf queue names were unique. > For example root.a.testQueue and root.b.testQueue couldn't coexist, as the > leaf queue name is the same. > At this point, I suspected that YARN-9879 was causing this issue, but as the > behaviour of CS before YARN-9879 was merged didn't allow two leaf queues with > the same name, a query of "root.default" and "default" could easily work as > it was guaranteed that there's not another "default" leaf queue in the > hierarchy, just one. I dug a bit further. > I also noticed that YARN-8659 ([commit > link|https://github.com/apache/hadoop/commit/7c13872cbbb6f1b0b1c2dde894885b41186b3797]) > could have introduced this issue a long time ago, as it removed the iterator > logic that queried the applications with method YarnScheduler#getAppsInQueue > (see > [this|https://github.com/apache/hadoop/commit/7c13872cbbb6f1b0b1c2dde894885b41186b3797#diff-5b432bf3a8eb3e039878300ffb9db1f728226b9e3f63c4eb53be5ed5a833390aL843]). > Let's follow the implementation of YarnScheduler#getAppsInQueue for CS: > 1. First of all, > [here|https://github.com/apache/hadoop/blob/4c05d257ba3f3311b5bbc993f6e5e35637487d88/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L2501-L2509] > is the method definition. > [CapacityScheduler#getQueue|https://github.com/apache/hadoop/blob/4c05d257ba3f3311b5bbc993f6e5e35637487d88/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L824-L829] > is called from here. > 2. 
> [CapacityScheduler#getQueue|https://github.com/apache/hadoop/blob/4c05d257ba3f3311b5bbc993f6e5e35637487d88/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L824-L829] > is then calling > [QueueManager#getQueue|https://github.com/apache/hadoop/blob/da09d68056d4e6a9490ddc6d9ae816b65217e117/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerQueueManager.java#L136-L138]. > 3. > [QueueManager#getQueue|https://github.com/apache/hadoop/blob/da09d68056d4e6a9490ddc6d9ae816b65217e117/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerQueueManager.java#L136-L138] > is then calling
[jira] [Updated] (YARN-11121) Check GetClusterMetrics Request parameter is null
[ https://issues.apache.org/jira/browse/YARN-11121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11121: -- Fix Version/s: (was: 3.4.0) > Check GetClusterMetrics Request parameter is null > - > > Key: YARN-11121 > URL: https://issues.apache.org/jira/browse/YARN-11121 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > The original code does not check whether the request is null. Add a > null check so that a null request is handled > gracefully.
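The kind of guard YARN-11121 asks for might look like the following sketch. It is an assumption-laden illustration, not the federation code: the request and response classes are empty hypothetical stand-ins for the real YARN protocol records, and throwing IllegalArgumentException is a placeholder for whatever exception type the actual patch uses.

```java
// Hypothetical sketch class; not part of Hadoop.
class NullRequestCheckSketch {
  // Stand-ins for the real protocol records (assumptions for this example).
  static final class GetClusterMetricsRequest { }

  static final class GetClusterMetricsResponse {
    final int numNodeManagers;

    GetClusterMetricsResponse(int numNodeManagers) {
      this.numNodeManagers = numNodeManagers;
    }
  }

  // Rejects a null request up front instead of failing later with a
  // NullPointerException somewhere deeper in the call chain.
  static GetClusterMetricsResponse getClusterMetrics(GetClusterMetricsRequest request) {
    if (request == null) {
      // Exception type and message are assumptions made for illustration.
      throw new IllegalArgumentException("Missing getClusterMetrics request.");
    }
    // Placeholder result; a real implementation would aggregate sub-cluster metrics.
    return new GetClusterMetricsResponse(0);
  }

  public static void main(String[] args) {
    try {
      getClusterMetrics(null);
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
    System.out.println(getClusterMetrics(new GetClusterMetricsRequest()).numNodeManagers);
  }
}
```

Failing fast with a descriptive message makes the caller's mistake visible at the API boundary rather than in an unrelated stack frame.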
[jira] [Updated] (YARN-11123) ResourceManager webapps test failures due to org.apache.hadoop.metrics2.MetricsException and subsequent java.net.BindException: Address already in use
[ https://issues.apache.org/jira/browse/YARN-11123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11123: -- Component/s: resourcemanager Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > ResourceManager webapps test failures due to > org.apache.hadoop.metrics2.MetricsException and subsequent > java.net.BindException: Address already in use > -- > > Key: YARN-11123 > URL: https://issues.apache.org/jira/browse/YARN-11123 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Running all tests from: org/apache/hadoop/yarn/server/resourcemanager/webapp > produces the following test failures: > # First, > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication#testDelegationTokenAuth > fails with: > {code} > org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server > at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:479) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1443) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.startWepApp(MockRM.java:822) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1552) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:195) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.setupAndStartRM(TestRMWebServicesDelegationTokenAuthentication.java:190) > at > org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.before(TestRMWebServicesDelegationTokenAuthentication.java:133) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runners.Suite.runChild(Suite.java:128) > at org.junit.runners.Suite.runChild(Suite.java:27) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at
[jira] [Updated] (YARN-11128) Fix comments in TestProportionalCapacityPreemptionPolicy*
[ https://issues.apache.org/jira/browse/YARN-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11128: -- Hadoop Flags: Reviewed Target Version/s: 3.3.5, 3.4.0 (was: 3.4.0, 3.3.5) > Fix comments in TestProportionalCapacityPreemptionPolicy* > - > > Key: YARN-11128 > URL: https://issues.apache.org/jira/browse/YARN-11128 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, documentation >Affects Versions: 3.4.0, 3.3.5 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > At various places, comment for appsConfig is > {{// queueName\t(priority,resource,host,expression,#repeat,reserved,pending)}} > but should be > {{// > queueName\t(priority,resource,host,expression,#repeat,reserved,pending,user)}} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11126) ZKConfigurationStore Java deserialisation vulnerability
[ https://issues.apache.org/jira/browse/YARN-11126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11126: -- Target Version/s: 3.3.4, 2.10.2, 3.4.0 > ZKConfigurationStore Java deserialisation vulnerability > --- > > Key: YARN-11126 > URL: https://issues.apache.org/jira/browse/YARN-11126 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.3.2 >Reporter: Tamas Domok >Assignee: Tamas Domok >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 2.10.2, 3.2.4, 3.3.4 > > Attachments: TestZKConfigurationStoreCVE.java > > Time Spent: 2h 50m > Remaining Estimate: 0h > > ZKConfigurationStore uses ObjectInputStream to deserialise objects from > ZooKeeper. An attacker who *has access to ZK* can exploit this, e.g.: using > [gadget chain deserialisation > attacks|https://snyk.io/blog/serialization-and-deserialization-in-java/] the > attacker can run arbitrary commands, even create reverse shells. > A useful > [CheatSheet|https://github.com/GrrrDog/Java-Deserialization-Cheat-Sheet/blob/master/README.md] > for Java Deserialisation. > I managed to start the Calculator app on my Mac using the following payload: > {code} > //java -jar ./target/ysoserial-0.0.6-SNAPSHOT-all.jar CommonsBeanutils1 > 'open /System/Applications/Calculator.app' | base64 > @Test > public void testDeserializationCommonsBeanutils1() throws Exception { > >
[jira] [Updated] (YARN-11128) Fix comments in TestProportionalCapacityPreemptionPolicy*
[ https://issues.apache.org/jira/browse/YARN-11128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11128: -- Target Version/s: 3.3.5, 3.4.0 Affects Version/s: 3.3.5 3.4.0 > Fix comments in TestProportionalCapacityPreemptionPolicy* > - > > Key: YARN-11128 > URL: https://issues.apache.org/jira/browse/YARN-11128 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler, documentation >Affects Versions: 3.4.0, 3.3.5 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > At various places, comment for appsConfig is > {{// queueName\t(priority,resource,host,expression,#repeat,reserved,pending)}} > but should be > {{// > queueName\t(priority,resource,host,expression,#repeat,reserved,pending,user)}} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11133) YarnClient gets the wrong EffectiveMinCapacity value
[ https://issues.apache.org/jira/browse/YARN-11133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11133:
--
    Hadoop Flags: Reviewed
    Target Version/s: 3.3.5, 3.4.0

> YarnClient gets the wrong EffectiveMinCapacity value
> --
>
> Key: YARN-11133
> URL: https://issues.apache.org/jira/browse/YARN-11133
> Project: Hadoop YARN
> Issue Type: Bug
> Components: api
> Affects Versions: 3.2.3, 3.3.2
> Reporter: Zilong Zhu
> Assignee: Zilong Zhu
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.5
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> When I use the YarnClient, QueueConfigurations#getEffectiveMinCapacity
> returns the wrong value. I traced this to a bug in
> QueueConfigurationsPBImpl#mergeLocalToBuilder:
> {code:java}
> private void mergeLocalToBuilder() {
>   if (this.effMinResource != null) {
>     builder
>         .setEffectiveMinCapacity(convertToProtoFormat(this.effMinResource));
>   }
>   if (this.effMaxResource != null) {
>     builder
>         .setEffectiveMaxCapacity(convertToProtoFormat(this.effMaxResource));
>   }
>   if (this.configuredMinResource != null) {
>     builder.setEffectiveMinCapacity(
>         convertToProtoFormat(this.configuredMinResource));
>   }
>   if (this.configuredMaxResource != null) {
>     builder.setEffectiveMaxCapacity(
>         convertToProtoFormat(this.configuredMaxResource));
>   }
> }
> {code}
> configuredMinResource is incorrectly written to the effective min capacity
> field. This overwrites the real effMinResource and leaves the configured min
> capacity unset (null). The code above treats configuredMaxResource the same way.
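The overwrite reported in YARN-11133 can be reproduced in isolation. The following is only a minimal sketch: `Builder`, `mergeAsReported` and `mergeCorrected` are hypothetical stand-ins (plain strings instead of Resource objects), not the generated protobuf API, and the corrected variant is an assumption about the intended behaviour rather than the committed patch.

```java
/** Illustrative only: all names here are hypothetical, not the real protobuf classes. */
final class MergeLocalToBuilderSketch {

  /** Hypothetical stand-in for a builder that has four separate capacity fields. */
  static final class Builder {
    String effectiveMinCapacity;
    String effectiveMaxCapacity;
    String configuredMinCapacity;
    String configuredMaxCapacity;
  }

  /** Pattern from the report: configured values are written to the effective fields. */
  static Builder mergeAsReported(String effMin, String effMax,
                                 String confMin, String confMax) {
    Builder b = new Builder();
    if (effMin != null) { b.effectiveMinCapacity = effMin; }
    if (effMax != null) { b.effectiveMaxCapacity = effMax; }
    if (confMin != null) { b.effectiveMinCapacity = confMin; } // overwrites effMin
    if (confMax != null) { b.effectiveMaxCapacity = confMax; } // overwrites effMax
    return b;
  }

  /** Assumed correction: every local value goes to its own builder field. */
  static Builder mergeCorrected(String effMin, String effMax,
                                String confMin, String confMax) {
    Builder b = new Builder();
    if (effMin != null) { b.effectiveMinCapacity = effMin; }
    if (effMax != null) { b.effectiveMaxCapacity = effMax; }
    if (confMin != null) { b.configuredMinCapacity = confMin; }
    if (confMax != null) { b.configuredMaxCapacity = confMax; }
    return b;
  }

  public static void main(String[] args) {
    Builder bad = mergeAsReported("eff-min", "eff-max", "conf-min", "conf-max");
    Builder good = mergeCorrected("eff-min", "eff-max", "conf-min", "conf-max");
    System.out.println("reported:  effMin=" + bad.effectiveMinCapacity
        + ", confMin=" + bad.configuredMinCapacity);
    System.out.println("corrected: effMin=" + good.effectiveMinCapacity
        + ", confMin=" + good.configuredMinCapacity);
  }
}
```

In the reported pattern a client reading the effective minimum sees the configured value, and the configured field is never populated, which matches the symptom described in the issue.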
[jira] [Updated] (YARN-11134) Support getNodeToLabels API in FederationClientInterceptor
[ https://issues.apache.org/jira/browse/YARN-11134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11134:
--
    Fix Version/s: (was: 3.4.0)

> Support getNodeToLabels API in FederationClientInterceptor
> --
>
> Key: YARN-11134
> URL: https://issues.apache.org/jira/browse/YARN-11134
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 3.3.0, 3.3.1, 3.3.2
> Reporter: Shilun Fan
> Assignee: Shilun Fan
> Priority: Major
> Attachments: YARN-11134.01.patch
>
>
> Node labels are a very important capability for YARN, and equally for YARN
> Federation.
> The patch will implement the getNodeToLabels method.
> The work described in this JIRA will be continued in YARN-10465.
[jira] [Updated] (YARN-11137) Improve log message in FederationClientInterceptor
[ https://issues.apache.org/jira/browse/YARN-11137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11137:
--
    Target Version/s: 3.4.0

> Improve log message in FederationClientInterceptor
> --
>
> Key: YARN-11137
> URL: https://issues.apache.org/jira/browse/YARN-11137
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: federation
> Affects Versions: 3.4.0
> Reporter: Shilun Fan
> Assignee: Shilun Fan
> Priority: Minor
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: YARN-11137.01.patch, YARN-11137.02.patch
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> While reading the yarn-federation-router code, I found that the log calls in
> FederationClientInterceptor are inconsistent: some build the message with
> string concatenation and some use placeholders, as follows.
> org.apache.hadoop.yarn.server.router.clientrmsubmit.FederationClientInterceptor#getNewApplication
> {code:java}
> for (int i = 0; i < numSubmitRetries; ++i) {
>   SubClusterId subClusterId =
>       getRandomActiveSubCluster(subClustersActive);
>   LOG.debug(
>       "getNewApplication try #{} on SubCluster {}", i, subClusterId);
>   ApplicationClientProtocol clientRMProxy =
>       getClientRMProxyForSubCluster(subClusterId);
>   ...
> }
> {code}
> org.apache.hadoop.yarn.server.router.clientrmsubmit.FederationClientInterceptor#submitApplication
> {code:java}
> for (int i = 0; i < numSubmitRetries; ++i) {
>   SubClusterId subClusterId = policyFacade.getHomeSubcluster(
>       request.getApplicationSubmissionContext(), blacklist);
>   LOG.info("submitApplication appId" + applicationId + " try #" + i
>       + " on SubCluster " + subClusterId);
>   ...
> }
> {code}
> I think the first way (placeholders) is better.
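One common argument for the placeholder style preferred in YARN-11137 is that the message is only assembled when the log level is enabled. The sketch below illustrates that with a deliberately tiny, hypothetical `MiniLogger`; it is not SLF4J and only mimics the `{}` convention so the example stays self-contained. `CountingId` is likewise a made-up stand-in for an argument such as a sub-cluster id.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative only: MiniLogger is a hypothetical stand-in, not the SLF4J API. */
final class PlaceholderLoggingSketch {

  /** Replaces each "{}" in order with the matching argument. */
  static String render(String format, Object... args) {
    StringBuilder sb = new StringBuilder();
    int from = 0;
    for (Object arg : args) {
      int at = format.indexOf("{}", from);
      if (at < 0) {
        break; // more arguments than placeholders: ignore the rest
      }
      sb.append(format, from, at).append(arg); // arg.toString() runs here
      from = at + 2;
    }
    return sb.append(format.substring(from)).toString();
  }

  /** Minimal logger: formats the message only when debug is enabled. */
  static final class MiniLogger {
    private final boolean debugEnabled;

    MiniLogger(boolean debugEnabled) {
      this.debugEnabled = debugEnabled;
    }

    void debug(String format, Object... args) {
      if (!debugEnabled) {
        return; // arguments are never rendered
      }
      System.out.println(render(format, args));
    }
  }

  /** Argument that counts how often it is converted to a string. */
  static final class CountingId {
    final AtomicInteger toStringCalls = new AtomicInteger();

    @Override
    public String toString() {
      toStringCalls.incrementAndGet();
      return "SC-1";
    }
  }

  public static void main(String[] args) {
    MiniLogger log = new MiniLogger(false); // debug disabled
    CountingId id = new CountingId();
    log.debug("try #{} on SubCluster {}", 0, id);    // no toString() call
    log.debug("try #" + 0 + " on SubCluster " + id); // toString() runs anyway
    System.out.println("toString calls: " + id.toStringCalls.get());
  }
}
```

With concatenation the caller pays for building the string before the logger can decide to drop it; with placeholders that work is deferred to the logger.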
[jira] [Resolved] (YARN-11138) TestRouterWebServicesREST Junit Test Error Fix
[ https://issues.apache.org/jira/browse/YARN-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan resolved YARN-11138. --- Hadoop Flags: (was: Reviewed) Resolution: Duplicate > TestRouterWebServicesREST Junit Test Error Fix > -- > > Key: YARN-11138 > URL: https://issues.apache.org/jira/browse/YARN-11138 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, test >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 28.818 s <<< FAILURE! - in > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST > [ERROR] org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST > Time elapsed: 28.817 s <<< FAILURE! > java.lang.AssertionError: Web app not running > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.waitWebAppRunning(TestRouterWebServicesREST.java:199) > at > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.setUp(TestRouterWebServicesREST.java:217) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at 
org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11138) TestRouterWebServicesREST Junit Test Error Fix
[ https://issues.apache.org/jira/browse/YARN-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11138: -- Fix Version/s: (was: 3.4.0) > TestRouterWebServicesREST Junit Test Error Fix > -- > > Key: YARN-11138 > URL: https://issues.apache.org/jira/browse/YARN-11138 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, test >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 28.818 s <<< FAILURE! - in > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST > [ERROR] org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST > Time elapsed: 28.817 s <<< FAILURE! > java.lang.AssertionError: Web app not running > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.waitWebAppRunning(TestRouterWebServicesREST.java:199) > at > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.setUp(TestRouterWebServicesREST.java:217) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at 
org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Reopened] (YARN-11138) TestRouterWebServicesREST Junit Test Error Fix
[ https://issues.apache.org/jira/browse/YARN-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan reopened YARN-11138: --- > TestRouterWebServicesREST Junit Test Error Fix > -- > > Key: YARN-11138 > URL: https://issues.apache.org/jira/browse/YARN-11138 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, test >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 28.818 s <<< FAILURE! - in > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST > [ERROR] org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST > Time elapsed: 28.817 s <<< FAILURE! > java.lang.AssertionError: Web app not running > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.waitWebAppRunning(TestRouterWebServicesREST.java:199) > at > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.setUp(TestRouterWebServicesREST.java:217) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at 
org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11141) Capacity Scheduler does not support ambiguous queue names when moving application across queues
[ https://issues.apache.org/jira/browse/YARN-11141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11141: -- Target Version/s: 3.3.5, 3.4.0 Affects Version/s: 3.3.5 3.4.0 > Capacity Scheduler does not support ambiguous queue names when moving > application across queues > --- > > Key: YARN-11141 > URL: https://issues.apache.org/jira/browse/YARN-11141 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.4.0, 3.3.5 >Reporter: András Győri >Assignee: András Győri >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > Time Spent: 1.5h > Remaining Estimate: 0h > > CapacityScheduler#moveApplication can not resolve ambiguous queue names due > to using queue name instead of queue path. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
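The ambiguity behind YARN-11141 can be shown with a small lookup table. This is a sketch under stated assumptions: `QueuePathLookupSketch`, `addQueue` and `resolve` are hypothetical names, and the real scheduler's queue store may behave differently. It only illustrates why a short name stops being a usable key once two leaves share it, while the full path stays unique.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical queue index, not the CapacityScheduler code. */
final class QueuePathLookupSketch {
  private final Map<String, List<String>> pathsByShortName = new HashMap<>();

  private static String shortNameOf(String nameOrPath) {
    return nameOrPath.substring(nameOrPath.lastIndexOf('.') + 1);
  }

  void addQueue(String fullPath) {
    pathsByShortName
        .computeIfAbsent(shortNameOf(fullPath), k -> new ArrayList<>())
        .add(fullPath);
  }

  /** Returns the full path, or null when the input cannot be resolved unambiguously. */
  String resolve(String nameOrPath) {
    List<String> candidates =
        pathsByShortName.getOrDefault(shortNameOf(nameOrPath), new ArrayList<>());
    if (candidates.contains(nameOrPath)) {
      return nameOrPath; // a known full path is always unique
    }
    if (nameOrPath.indexOf('.') >= 0) {
      return null; // looks like a path, but no such queue exists
    }
    return candidates.size() == 1 ? candidates.get(0) : null;
  }

  public static void main(String[] args) {
    QueuePathLookupSketch index = new QueuePathLookupSketch();
    index.addQueue("root.a.test");
    index.addQueue("root.b.test");
    index.addQueue("root.a.unique");
    System.out.println(index.resolve("unique"));      // short name is enough
    System.out.println(index.resolve("test"));        // ambiguous short name
    System.out.println(index.resolve("root.b.test")); // full path resolves
  }
}
```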
[jira] [Updated] (YARN-11140) Support getClusterNodeLabels API in FederationClientInterceptor
[ https://issues.apache.org/jira/browse/YARN-11140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11140:
--
    Fix Version/s: (was: 3.4.0)

> Support getClusterNodeLabels API in FederationClientInterceptor
> --
>
> Key: YARN-11140
> URL: https://issues.apache.org/jira/browse/YARN-11140
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Shilun Fan
> Assignee: Shilun Fan
> Priority: Major
>
> *getClusterNodeLabels* is used by the client to get the labels of the nodes
> in the cluster. It is a basic and commonly used method, so it should be
> implemented in YARN Federation.
> This JIRA will be linked directly to YARN-10465, and a PR will be submitted in
> YARN-10465 to implement the feature.
[jira] [Updated] (YARN-11138) TestRouterWebServicesREST Junit Test Error Fix
[ https://issues.apache.org/jira/browse/YARN-11138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11138: -- Component/s: federation test Hadoop Flags: Reviewed Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > TestRouterWebServicesREST Junit Test Error Fix > -- > > Key: YARN-11138 > URL: https://issues.apache.org/jira/browse/YARN-11138 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, test >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Fix For: 3.4.0 > > > [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: > 28.818 s <<< FAILURE! - in > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST > [ERROR] org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST > Time elapsed: 28.817 s <<< FAILURE! > java.lang.AssertionError: Web app not running > at org.junit.Assert.fail(Assert.java:89) > at > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.waitWebAppRunning(TestRouterWebServicesREST.java:199) > at > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.setUp(TestRouterWebServicesREST.java:217) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.RunBefores.invokeMethod(RunBefores.java:33) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24) > at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11142) Remove unused Imports in Hadoop YARN project
[ https://issues.apache.org/jira/browse/YARN-11142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11142: -- Component/s: yarn Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Remove unused Imports in Hadoop YARN project > > > Key: YARN-11142 > URL: https://issues.apache.org/jira/browse/YARN-11142 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.4.0 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h > > h3. Optimize Imports to keep code clean > # Remove any unused imports -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11147) ResourceUsage and QueueCapacities classes provide node label iterators that are not thread safe
[ https://issues.apache.org/jira/browse/YARN-11147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11147:
--
    Target Version/s: 3.4.0
    Affects Version/s: 3.4.0

> ResourceUsage and QueueCapacities classes provide node label iterators that
> are not thread safe
> --
>
> Key: YARN-11147
> URL: https://issues.apache.org/jira/browse/YARN-11147
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 3.4.0
> Reporter: András Győri
> Assignee: András Győri
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> AbstractResourceUsage#getNodePartitionsSet and
> QueueCapacities#getNodePartitionsSet return keySet, a mutable view of the
> HashMap's keys that is subject to change. Iterating over this view while
> another thread modifies the map results in a
> ConcurrentModificationException, as the following stack trace shows:
> {code:java}
> 2022-04-28 13:21:53,692 FATAL org.apache.hadoop.yarn.event.EventDispatcher:
> Error in handling event type NODE_LABELS_UPDATE to the Event Dispatcher
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1445)
> at java.util.HashMap$KeyIterator.next(HashMap.java:1469)
> at com.google.common.collect.Sets$1$1.computeNext(Sets.java:758)
> at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:141)
> at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:136)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueueUtils.updateQueueStatistics(CSQueueUtils.java:236)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.updateClusterResource(ParentQueue.java:1281)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeLabelsAndQueueResource(CapacityScheduler.java:2115)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1900)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:169)
> at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
> at java.lang.Thread.run(Thread.java:748)
> {code}
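The failure mode in YARN-11147 does not need a second thread to reproduce: any structural change to a HashMap while its keySet view is being iterated trips the fail-fast iterator. The sketch below uses hypothetical names (`NodePartitionsSnapshotSketch`, `liveView`, `snapshot`) and shows one possible mitigation, returning a copy taken under the same lock that guards writes. This is an assumption for illustration, not necessarily the approach taken in the actual patch.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: hypothetical class, not AbstractResourceUsage or QueueCapacities. */
final class NodePartitionsSnapshotSketch {
  private final Map<String, Integer> usages = new HashMap<>();

  NodePartitionsSnapshotSketch() {
    usages.put("", 0);    // default partition
    usages.put("gpu", 0); // one labelled partition
  }

  /** Live view: later changes to the map show up in (and can break) iteration. */
  Set<String> liveView() {
    return usages.keySet();
  }

  /** Snapshot: an independent copy, taken under the lock that guards writes. */
  synchronized Set<String> snapshot() {
    return new HashSet<>(usages.keySet());
  }

  synchronized void addPartition(String label) {
    usages.put(label, 0);
  }

  /** Adds a partition while iterating; reports whether iteration failed. */
  static boolean failsWhileIterating(NodePartitionsSnapshotSketch usage,
                                     Set<String> partitions) {
    try {
      for (String partition : partitions) {
        usage.addPartition("added-while-iterating");
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    NodePartitionsSnapshotSketch usage = new NodePartitionsSnapshotSketch();
    System.out.println("live view fails: "
        + failsWhileIterating(usage, usage.liveView()));
    System.out.println("snapshot fails:  "
        + failsWhileIterating(usage, usage.snapshot()));
  }
}
```

The trade-off of a snapshot is that callers may see a slightly stale set of partitions; whether that is acceptable depends on how the callers use the result.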
[jira] [Updated] (YARN-11152) QueueMetrics is leaking memory when creating a new queue during reinitialisation
[ https://issues.apache.org/jira/browse/YARN-11152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11152:
--
    Target Version/s: 3.4.0
    Affects Version/s: 3.4.0

> QueueMetrics is leaking memory when creating a new queue during
> reinitialisation
> --
>
> Key: YARN-11152
> URL: https://issues.apache.org/jira/browse/YARN-11152
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Affects Versions: 3.4.0
> Reporter: András Győri
> Assignee: András Győri
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> Capacity Scheduler handles reinitialisation by reparsing the entire queue
> hierarchy, then reinitialising the old queue hierarchy by taking the newly
> parsed queues into account. After this, the newly parsed queues are discarded
> and garbage collected.
> However, since YARN-6492 we store a parent queue in QueueMetrics. This is
> problematic because at that point the stored parent can be a newly parsed
> parent queue, which should be discarded after the reinitialisation. As a
> result, QueueMetrics can contain parent members of an entirely different
> queue hierarchy than the one currently in use. This can lead to subtle
> problems as well as a memory leak, because a single parent reference keeps
> the whole queue hierarchy alive.
> The problem arose when we programmatically added one queue after another via
> the mutation API, keeping hundreds of queue hierarchies alive at the same
> time and crippling the GC and the whole RM.
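The retention pattern described in YARN-11152 can be modelled with a few plain objects. The sketch is a heavily simplified assumption about the mechanism: `Queue`, `LeakyMetrics`, `parse` and `addLeafViaReinit` are hypothetical and do not mirror the scheduler's classes. It shows only that metrics created against the temporary, freshly parsed root keep that whole temporary hierarchy reachable after the live hierarchy has taken over the new queue.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a toy model of the reinitialisation flow, not the YARN code. */
final class QueueMetricsLeakSketch {

  static final class Queue {
    final String path;
    final Queue parent;
    final List<Queue> children = new ArrayList<>();

    Queue(String path, Queue parent) {
      this.path = path;
      this.parent = parent;
      if (parent != null) {
        parent.children.add(this);
      }
    }
  }

  /** Metrics holding a strong reference to whichever parent object they were given. */
  static final class LeakyMetrics {
    final Queue parent;

    LeakyMetrics(Queue parent) {
      this.parent = parent;
    }
  }

  /** Parses a throwaway hierarchy: one root plus one leaf per name. */
  static Queue parse(List<String> leafNames) {
    Queue root = new Queue("root", null);
    for (String leaf : leafNames) {
      new Queue("root." + leaf, root);
    }
    return root;
  }

  /** Simulates a reinitialisation that adds one leaf; returns the new leaf's metrics. */
  static LeakyMetrics addLeafViaReinit(Queue liveRoot, String newLeaf) {
    // 1. Reparse the whole configuration into a temporary hierarchy.
    List<String> names = new ArrayList<>();
    for (Queue q : liveRoot.children) {
      names.add(q.path.substring("root.".length()));
    }
    names.add(newLeaf);
    Queue parsedRoot = parse(names);
    // 2. Metrics for the new leaf are created against the temporary root.
    LeakyMetrics metrics = new LeakyMetrics(parsedRoot);
    // 3. The live hierarchy takes over the new leaf; parsedRoot should now be garbage.
    new Queue("root." + newLeaf, liveRoot);
    return metrics;
  }

  public static void main(String[] args) {
    Queue live = parse(List.of("a"));
    LeakyMetrics metrics = addLeafViaReinit(live, "b");
    System.out.println("metrics parent is live root: " + (metrics.parent == live));
    System.out.println("temporary queues still reachable: "
        + metrics.parent.children.size());
  }
}
```

Each further reinitialisation in this model pins another complete temporary hierarchy. One possible way out, stated here only as an assumption, is for metrics to keep the parent's name or path instead of the queue object.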
[jira] [Updated] (YARN-11153) Make proxy server support YARN federation.
[ https://issues.apache.org/jira/browse/YARN-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11153:
--
    Target Version/s: 3.4.0

> Make proxy server support YARN federation.
> --
>
> Key: YARN-11153
> URL: https://issues.apache.org/jira/browse/YARN-11153
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Affects Versions: 3.2.1
> Reporter: Chenyu Zheng
> Assignee: Chenyu Zheng
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: YARN-10775-design-doc.001.pdf
>
> Time Spent: 8h 40m
> Remaining Estimate: 0h
>
> I set up a YARN federation cluster. I cannot connect to the web UI of a
> running app, but the web UI of completed and accepted apps works.
> I think two steps are needed:
> (a) YARN-11153: make the proxy server support federation.
> (b) YARN-11154: make the router support the proxy server.
> The problem is not difficult, but it is not easy to describe, so I attached
> the document YARN-10775-design-doc.001.pdf to explain it.
>
> If the standalone proxy server is enabled, step (a) solves the problem.
> If the standalone proxy server is disabled, steps (a) and (b) are both
> needed: the router is used as the web proxy server, which hides the cluster
> info from the client. I think that is reasonable.
[jira] [Updated] (YARN-11160) Support getResourceProfiles, getResourceProfile API's for Federation
[ https://issues.apache.org/jira/browse/YARN-11160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11160: -- Component/s: federation Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Support getResourceProfiles, getResourceProfile API's for Federation > > > Key: YARN-11160 > URL: https://issues.apache.org/jira/browse/YARN-11160 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > Support getResourceProfiles, getResourceProfile API's for Federation. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11158) Support getDelegationToken, renewDelegationToken, cancelDelegationToken API's for Federation
[ https://issues.apache.org/jira/browse/YARN-11158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11158: -- Component/s: federation Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Support getDelegationToken, renewDelegationToken, cancelDelegationToken API's > for Federation > > > Key: YARN-11158 > URL: https://issues.apache.org/jira/browse/YARN-11158 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11159) Support failApplicationAttempt, updateApplicationPriority, updateApplicationTimeouts API's for Federation
[ https://issues.apache.org/jira/browse/YARN-11159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11159: -- Component/s: federation Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Support failApplicationAttempt, updateApplicationPriority, > updateApplicationTimeouts API's for Federation > - > > Key: YARN-11159 > URL: https://issues.apache.org/jira/browse/YARN-11159 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Support failApplicationAttempt, updateApplicationPriority, > updateApplicationTimeouts API's for Federation -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11161) Support getAttributesToNodes, getClusterNodeAttributes, getNodesToAttributes API's for Federation
[ https://issues.apache.org/jira/browse/YARN-11161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11161: -- Component/s: federation Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Support getAttributesToNodes, getClusterNodeAttributes, getNodesToAttributes > API's for Federation > - > > Key: YARN-11161 > URL: https://issues.apache.org/jira/browse/YARN-11161 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > Support getAttributesToNodes, getClusterNodeAttributes, getNodesToAttributes > API's for Federation. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11162) Set the zk acl for nodes created by ZKConfigurationStore.
[ https://issues.apache.org/jira/browse/YARN-11162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11162: -- Target Version/s: 3.3.4, 2.10.2, 3.4.0 > Set the zk acl for nodes created by ZKConfigurationStore. > - > > Key: YARN-11162 > URL: https://issues.apache.org/jira/browse/YARN-11162 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.10.1 >Reporter: Owen O'Malley >Assignee: Owen O'Malley >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 2.10.2, 3.2.4, 3.3.4 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11167) impove import * In YARN Project
[ https://issues.apache.org/jira/browse/YARN-11167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11167: -- Fix Version/s: (was: 3.4.0) > impove import * In YARN Project > --- > > Key: YARN-11167 > URL: https://issues.apache.org/jira/browse/YARN-11167 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Wildcard imports (import *) do not conform to the code style; > replace them with explicit imports of the classes that are used. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
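As a generic illustration of the change YARN-11167 describes (not code taken from the YARN patch), a wildcard import is replaced with single-type imports. The class name ExplicitImportExample and the queue name are hypothetical and exist only to show the imports in context.

```java
// Before (discouraged by the code style): import java.util.*;
// After: import only the types that are actually used.
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, used only to show the explicit imports in context.
class ExplicitImportExample {
  // Returns a one-element list so that both imported types are exercised.
  static List<String> queueNames() {
    List<String> names = new ArrayList<>();
    names.add("root.default");
    return names;
  }
}
```

Explicit imports make it visible at the top of the file which classes a source file depends on, which is presumably the motivation behind the style rule.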
[jira] [Updated] (YARN-11169) Support moveApplicationAcrossQueues, getQueueInfo API's for Federation
[ https://issues.apache.org/jira/browse/YARN-11169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11169: -- Hadoop Flags: Reviewed Target Version/s: 3.4.0 > Support moveApplicationAcrossQueues, getQueueInfo API's for Federation > -- > > Key: YARN-11169 > URL: https://issues.apache.org/jira/browse/YARN-11169 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Support moveApplicationAcrossQueues, getQueueInfo API's for Federation. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11176) Refactor TestAggregatedLogDeletionService
[ https://issues.apache.org/jira/browse/YARN-11176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11176: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Refactor TestAggregatedLogDeletionService > - > > Key: YARN-11176 > URL: https://issues.apache.org/jira/browse/YARN-11176 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > The code of TestAggregatedLogDeletionService is quite messy. > Some refactor could be performed on this code to make it more readable and > easier to understand. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11172) Fix testDelegationToken
[ https://issues.apache.org/jira/browse/YARN-11172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11172: -- Hadoop Flags: Reviewed Target Version/s: 3.3.5, 3.4.0 > Fix testDelegationToken > --- > > Key: YARN-11172 > URL: https://issues.apache.org/jira/browse/YARN-11172 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 3.3.5 >Reporter: Chenyu Zheng >Assignee: Chenyu Zheng >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > Time Spent: 3h 40m > Remaining Estimate: 0h > > UT fail after HDFS-16563, other PR is blocked. > {code:java} > [ERROR] > testDelegationToken(org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens) > Time elapsed: 17.379 s <<< FAILURE! > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:87) > at org.junit.Assert.assertTrue(Assert.java:42) > at org.junit.Assert.assertTrue(Assert.java:53) > at > org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.testDelegationToken(TestClientRMTokens.java:207) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11169) Support moveApplicationAcrossQueues, getQueueInfo API's for Federation
[ https://issues.apache.org/jira/browse/YARN-11169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11169: -- Component/s: federation > Support moveApplicationAcrossQueues, getQueueInfo API's for Federation > -- > > Key: YARN-11169 > URL: https://issues.apache.org/jira/browse/YARN-11169 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Support moveApplicationAcrossQueues, getQueueInfo API's for Federation. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation
[ https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11177: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Support getNewReservation, submitReservation, updateReservation, > deleteReservation API's for Federation > --- > > Key: YARN-11177 > URL: https://issues.apache.org/jira/browse/YARN-11177 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11180) Refactor some code of getNewApplication, submitApplication, forceKillApplication, getApplicationReport
[ https://issues.apache.org/jira/browse/YARN-11180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11180: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Refactor some code of getNewApplication, submitApplication, > forceKillApplication, getApplicationReport > -- > > Key: YARN-11180 > URL: https://issues.apache.org/jira/browse/YARN-11180 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 5h 40m > Remaining Estimate: 0h > > *1) FederationClientInterceptor#getNewApplication* > 1. Add a check for an empty request. > 2. Use RouterServerUtil.logAndThrowException instead of throwing > YarnRuntimeException. > *2) > FederationClientInterceptor#submitApplication/forceKillApplication/getApplicationReport/getApplications* > 1. Use RouterServerUtil.logAndThrowException instead of throwing > YarnRuntimeException. > 2. Use String.format instead of string concatenation with +. > 3. Fix code style. > *3) FederationClientInterceptor#getClusterMetrics* > 1. Add a check for an empty request. > *4) > FederationClientInterceptor#getClusterNodes/getQueueUserAcls/listReservations/getNodeToLabels/getLabelsToNodes/getClusterNodeLabels* > 1. Use RouterServerUtil.logAndThrowException instead of throwing > YarnRuntimeException. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
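The refactoring items listed in YARN-11180 might look roughly like the following. This is a self-contained sketch under stated assumptions: logAndThrow is a hypothetical stand-in for a "log and throw" helper and does not reproduce the signature of RouterServerUtil.logAndThrowException, and describeKill, the application id and the sub-cluster id are invented for illustration.

```java
// Self-contained sketch; none of these types are the real Hadoop classes.
class RequestValidationSketch {

  // Hypothetical stand-in for a "log and throw" helper: log once, then fail.
  static void logAndThrow(String message, Throwable cause) {
    System.err.println(message);
    throw new IllegalStateException(message, cause);
  }

  // Rejects a missing request first, then builds the message with
  // String.format instead of concatenating with '+'.
  static String describeKill(Object request, String appId, String subCluster) {
    if (request == null) {
      logAndThrow("Missing forceKillApplication request.", null);
    }
    return String.format("Killing application %s in sub-cluster %s.",
        appId, subCluster);
  }
}
```

Routing every failure through one helper keeps the log line and the exception message identical, which is one plausible reason for preferring it over ad hoc throw statements.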
[jira] [Updated] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation
[ https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11177: -- Component/s: federation > Support getNewReservation, submitReservation, updateReservation, > deleteReservation API's for Federation > --- > > Key: YARN-11177 > URL: https://issues.apache.org/jira/browse/YARN-11177 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11182) Refactor TestAggregatedLogDeletionService: 2nd phase
[ https://issues.apache.org/jira/browse/YARN-11182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11182: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Refactor TestAggregatedLogDeletionService: 2nd phase > > > Key: YARN-11182 > URL: https://issues.apache.org/jira/browse/YARN-11182 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h > Remaining Estimate: 0h > > The code of TestAggregatedLogDeletionService is quite messy. > After YARN-11176, a significant refactor has been performed. > Some more refactor could be performed on this file in order to easily define > new tests without copying between ~100-200 lines of code for a testcase. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11185) Pending app metrics are increased doubly when a queue reaches its max-parallel-apps limit
[ https://issues.apache.org/jira/browse/YARN-11185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11185: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Pending app metrics are increased doubly when a queue reaches its > max-parallel-apps limit > - > > Key: YARN-11185 > URL: https://issues.apache.org/jira/browse/YARN-11185 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.4.0 >Reporter: András Győri >Assignee: András Győri >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h > > When an application is submitted to a queue, the queue's pending app metric is > increased, even if the application has reached the queue's max-parallel-apps > limit. If this application is allowed to run in the future because some other > application has finished, the application is submitted to the queue again, > increasing the pending app queue and user metrics again. Even if the > application finishes, it can only decrease the pending metric by one, which > makes the pending app metric monotonically increasing, whereas the ideal > state should eventually be 0. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
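The accounting problem described in YARN-11185 can be reproduced with a toy counter. This is not the CapacityScheduler code: the class PendingMetricSketch, its methods, and in particular the guardAgainstDoubleCount flag are assumptions made for illustration, and the guard is only one possible remedy, not necessarily the one used in the actual patch.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the double-counting problem; not the CapacityScheduler code.
class PendingMetricSketch {
  private int pendingApps = 0;
  private final Set<String> counted = new HashSet<>();
  private final boolean guardAgainstDoubleCount;

  PendingMetricSketch(boolean guardAgainstDoubleCount) {
    this.guardAgainstDoubleCount = guardAgainstDoubleCount;
  }

  // Called on the first submission and again when an application that was
  // held back by the max-parallel-apps limit is submitted a second time.
  void submit(String appId) {
    boolean firstTime = counted.add(appId);
    if (firstTime || !guardAgainstDoubleCount) {
      pendingApps++;
    }
  }

  // An application leaving the pending state decrements the metric once.
  void finish(String appId) {
    counted.remove(appId);
    pendingApps--;
  }

  int getPendingApps() {
    return pendingApps;
  }
}
```

Without the guard, two submissions followed by one finish leave the counter at 1 instead of 0, which matches the ever-growing metric the report describes.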
[jira] [Updated] (YARN-11187) Remove WhiteBox in yarn module.
[ https://issues.apache.org/jira/browse/YARN-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11187: -- Hadoop Flags: Reviewed Target Version/s: 3.3.5, 3.4.0 > Remove WhiteBox in yarn module. > --- > > Key: YARN-11187 > URL: https://issues.apache.org/jira/browse/YARN-11187 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 3.4.0, 3.3.5 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11188) Only files belong to the first file controller are removed even if multiple log aggregation file controllers are configured
[ https://issues.apache.org/jira/browse/YARN-11188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11188: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Only files belong to the first file controller are removed even if multiple > log aggregation file controllers are configured > --- > > Key: YARN-11188 > URL: https://issues.apache.org/jira/browse/YARN-11188 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Log aggregation can be configured to have a comma-separated list of file > controllers. > The current behaviour only removes files that belong to the first file > controller. > This can be problematic. > For example, if some user configures IFile as the file controller, and later > on changes the file controllers to specify multiple file controllers (e.g. > value = TFile,IFile) then only the first controller will be considered and > the files belong to that controller will be removed, in this case files > written by the TFile controller will be removed and the files created with > the IFile controller will be kept. > This behaviour should be changed so that all of the files should be removed > if multiple file controllers are enabled. > h2. CODE PATH > > 1. > [AggregatedLogDeletionService$LogDeletionTask#run|https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L82-L108]: > > Let's understand what does this method do. > 1.1 An important bit is to see how the value of the field called > 'retentionMillis' is set. 
In the constructor of LogDeletionTask, there's an > incoming parameter called 'retentionSecs' that is just multiplied by 1000 to > have a millisecond value. > Let's see where 'retentionSecs' is coming from. > 1.2 > [AggregatedLogDeletionService#scheduleLogDeletionTask|https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L258-L283] > that sets the value of retentionSecs. > The config key for this value is 'yarn.log-aggregation.retain-seconds'. > The javadoc says: "How long to wait before deleting aggregated logs, -1 > disables. Be careful set this too small and you will spam the name node." > 1.3 Going back to > [https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L82-L108], > the 'cutOffMillis' value is computed by getting the current time in millis > minus the retentionMillis. > 1.4 The main point of this method is to iterate over the files in the remote > root log dir (field called 'remoteRootLogDir') and to check if it is a > directory. If so, a new Path is created with that particular directory ([code > link|https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L90-L96]). > One more important thing to mention: There's a field called 'suffix' that is > added to the remote root log dir path. > Let's check how the 'remoteRootLogDir' and 'suffix' field get its value as > this is crucial to understand how the log dirs are deleted. 
> 1.5 remoteRootLogDir is set in the constructor of LogDeletionTask, > [here|https://github.com/apache/hadoop/blob/d336227e5c63a70db06ac26697994c96ed89d230/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L77]. > The value is returned by calling fileController.getRemoteRootLogDir(). > The LogAggregationFileControllerFactory creates the instance of > LogAggregationFileController. > > *The process of determining the log aggregation file controller is quite > messy, let me describe this in detail.* > *There are 2 types of file controllers: LogAggregationIndexedFileController > and LogAggregationTFileController* > *There's a testcase called > [TestLogAggregationFileControllerFactory#testLogAggregationFileControllerFactory|#testLogAggregationFileControllerFactory] > that shows how the LogAggregationFileControllerFactory is configured.* > 2.1
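The retention arithmetic walked through in steps 1.1 to 1.3 of YARN-11188 can be sketched as follows. The class and method names are hypothetical, and the isExpired comparison is an assumption about how the cutoff is applied; only the "seconds times 1000" conversion and the "current time minus retention" formula come from the description above.

```java
// Sketch of the retention arithmetic from steps 1.1 to 1.3; the class and
// method names are hypothetical and not part of AggregatedLogDeletionService.
class LogRetentionSketch {

  // 'retentionSecs' is multiplied by 1000 to get a millisecond value.
  static long toRetentionMillis(long retentionSecs) {
    return retentionSecs * 1000L;
  }

  // cutOffMillis = current time in millis minus the retention period.
  static long cutOffMillis(long nowMillis, long retentionSecs) {
    return nowMillis - toRetentionMillis(retentionSecs);
  }

  // Assumption: an entry is eligible for deletion when its modification
  // time is strictly older than the cutoff.
  static boolean isExpired(long modificationTimeMillis, long nowMillis,
      long retentionSecs) {
    return modificationTimeMillis < cutOffMillis(nowMillis, retentionSecs);
  }
}
```

For example, with a current time of 10000 ms and a retention of 3 seconds, the cutoff is 7000 ms, so an entry last modified at 6999 ms would be considered expired under this sketch.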
[jira] [Updated] (YARN-11187) Remove WhiteBox in yarn module.
[ https://issues.apache.org/jira/browse/YARN-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11187: -- Component/s: test > Remove WhiteBox in yarn module. > --- > > Key: YARN-11187 > URL: https://issues.apache.org/jira/browse/YARN-11187 > Project: Hadoop YARN > Issue Type: Improvement > Components: test >Affects Versions: 3.4.0, 3.3.5 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11192) TestRouterWebServicesREST failing after YARN-9827
[ https://issues.apache.org/jira/browse/YARN-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11192: -- Target Version/s: 3.4.0 > TestRouterWebServicesREST failing after YARN-9827 > - > > Key: YARN-11192 > URL: https://issues.apache.org/jira/browse/YARN-11192 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h > Remaining Estimate: 0h > > In YARN-9827, the following modifications: > {code:java} > GenericExceptionHandler should respond with SERVICE_UNAVAILABLE in case of > connection and service unavailable exception instead of > INTERNAL_SERVICE_ERROR. {code} > This modification caused all of YARN Federation's TestRouterWebServicesREST > unit tests to fail > {code:java} > [ERROR] Tests run: 201, Failures: 15, Errors: 0, Skipped: 0, Flakes: 2 > . > [ERROR] > org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST.testUpdateAppStateXML(org.apache.hadoop.yarn.server.router.webapp.TestRouterWebServicesREST) > [ERROR] Run 1: TestRouterWebServicesREST.testUpdateAppStateXML:774 > expected:<500> but was:<503> > [ERROR] Run 2: TestRouterWebServicesREST.testUpdateAppStateXML:774 > expected:<500> but was:<503> > [ERROR] Run 3: TestRouterWebServicesREST.testUpdateAppStateXML:774 > expected:<500> but was:<503> {code} > Report-URL: > https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4464/5/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-router.txt -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11196) NUMA Awareness support in DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11196: -- Hadoop Flags: Reviewed Target Version/s: 3.4.0 > NUMA Awareness support in DefaultContainerExecutor > -- > > Key: YARN-11196 > URL: https://issues.apache.org/jira/browse/YARN-11196 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 3.3.3 >Reporter: Prabhu Joseph >Assignee: Samrat Deb >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > [YARN-5764|https://issues.apache.org/jira/browse/YARN-5764] has added support > of NUMA Awareness for Containers launched through LinuxContainerExecutor. > This feature is useful to have in DefaultContainerExecutor as well. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11198) Deletion of assigned resources (e.g. GPU's, NUMA, FPGA's) from State Store
[ https://issues.apache.org/jira/browse/YARN-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11198: -- Hadoop Flags: Reviewed Target Version/s: 3.4.0 > Deletion of assigned resources (e.g. GPU's, NUMA, FPGA's) from State Store > -- > > Key: YARN-11198 > URL: https://issues.apache.org/jira/browse/YARN-11198 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.3.3 >Reporter: Prabhu Joseph >Assignee: Samrat Deb >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > > [YARN-7033|https://issues.apache.org/jira/browse/YARN-7033] provided support > for recovering the resources assigned to a container, but it did not delete them from the > State Store when the container is removed after the configured duration > yarn.nodemanager.duration-to-track-stopped-containers. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11190) CS Mapping rule bug: User matcher does not work correctly for usernames with dot
[ https://issues.apache.org/jira/browse/YARN-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11190: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > CS Mapping rule bug: User matcher does not work correctly for usernames with > dot > > > Key: YARN-11190 > URL: https://issues.apache.org/jira/browse/YARN-11190 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: testUserNameSetDefaultAndPlaceWith2Rules.log, > testUserNameSetDefaultAndPlaceWith2RulesUsernameReplacedWithDot.log, > testcases.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Given the following scenario, the placement engine does not work as expected. > A user with a '.' (dot) inside his/her username submits a job. > Let the username be "test.user" > There are 2 mapping rules: > 1. The matcher matches the user with name "test.user" and has an associated > mapping rule action that sets the default queue to "root.user". > 2. The second mapping rule matches the same user ("test.user") and places the > application to the default queue. > *Expectation:* > When the user with username "test.user" submits a job, the application will > be placed to queue "root.user". > *Observed behaviour:* > The application is placed to test_dot_user. > This means that the dot is replaced with "_dot_" too early so that the > default queue is set incorrectly. > > I have attached a patch file that demonstrates this behaviour with 2 new > testcases along with the logs of these testcases. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
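The "_dot_" substitution mentioned in YARN-11190 can be illustrated in isolation. This is a minimal sketch, not the placement engine: the class and method names are hypothetical, and where exactly the real code applies the substitution is not shown in the report, so the comment about ordering is an assumption.

```java
// Minimal illustration of the dot escaping; not the CapacityScheduler
// placement code, and the names here are hypothetical.
class DotReplacementSketch {

  // Dots separate the segments of a queue path, so a username that is used
  // as a queue name has its dots replaced with "_dot_".
  static String escapeDots(String userName) {
    return userName.replace(".", "_dot_");
  }

  // Assumption: a configured queue path such as "root.user" must be left
  // untouched; only a username substituted into a path should be escaped.
  static String configuredQueue(String queuePath) {
    return queuePath;
  }
}
```

Applying escapeDots to "test.user" yields "test_dot_user", the queue name seen in the observed behaviour, whereas the configured default queue "root.user" is expected to stay as written.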
[jira] [Updated] (YARN-11198) Deletion of assigned resources (e.g. GPU's, NUMA, FPGA's) from State Store
[ https://issues.apache.org/jira/browse/YARN-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11198: -- Component/s: nodemanager > Deletion of assigned resources (e.g. GPU's, NUMA, FPGA's) from State Store > -- > > Key: YARN-11198 > URL: https://issues.apache.org/jira/browse/YARN-11198 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.3.3 >Reporter: Prabhu Joseph >Assignee: Samrat Deb >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 4h 40m > Remaining Estimate: 0h > > [YARN-7033|https://issues.apache.org/jira/browse/YARN-7033] provided support > for recovering the resources assigned to a container, but it did not delete them from the > State Store when the container is removed after the configured duration > yarn.nodemanager.duration-to-track-stopped-containers. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11203) Fix typo in hadoop-yarn-server-router module
[ https://issues.apache.org/jira/browse/YARN-11203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11203: -- Target Version/s: 3.4.0 > Fix typo in hadoop-yarn-server-router module > > > Key: YARN-11203 > URL: https://issues.apache.org/jira/browse/YARN-11203 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h > Remaining Estimate: 0h > > Fix typo in hadoop-yarn-server-router module. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11210) Fix YARN RMAdminCLI retry logic for non-retryable kerberos configuration exception
[ https://issues.apache.org/jira/browse/YARN-11210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11210: -- Hadoop Flags: Reviewed Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Fix YARN RMAdminCLI retry logic for non-retryable kerberos configuration > exception > -- > > Key: YARN-11210 > URL: https://issues.apache.org/jira/browse/YARN-11210 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 3.4.0 >Reporter: Kevin Wikant >Assignee: Kevin Wikant >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > h2. Description of Problem > Applications which call YARN RMAdminCLI (i.e. YARN ResourceManager client) > synchronously can be blocked for up to 15 minutes with the default > configuration of "yarn.resourcemanager.connect.max-wait.ms"; this is not an > issue in of itself, but there is a non-retryable IllegalArgumentException > exception thrown within the YARN ResourceManager client that is getting > swallowed & treated as a retryable "connection exception" meaning that it > gets retried for 15 minutes. > The purpose of this JIRA (and PR) is to modify the YARN client so that it > does not retry on this non-retryable exception. > h2. Background Information > YARN ResourceManager client treats connection exceptions as retryable & with > the default value of "yarn.resourcemanager.connect.max-wait.ms" will attempt > to connect to the ResourceManager for up to 15 minutes when facing > "connection exceptions". This arguably makes sense because connection > exceptions are in some cases transient & can be recovered from without any > action needed from the client. See example below where YARN ResourceManager > client was able to recover from connection issues that resulted from the > ResourceManager process being down. 
> {quote}> yarn rmadmin -refreshNodes > 22/06/28 14:40:17 INFO client.RMProxy: Connecting to ResourceManager at > /0.0.0.0:8033 > 22/06/28 14:40:18 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 22/06/28 14:40:19 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 22/06/28 14:40:20 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 2 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > ... > 22/06/28 14:40:27 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 9 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 22/06/28 14:40:28 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 22/06/28 14:40:29 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > ... > 22/06/28 14:40:37 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 9 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 22/06/28 14:40:37 INFO retry.RetryInvocationHandler: > java.net.ConnectException: Your endpoint configuration is wrong; For more > details see: [http://wiki.apache.org/hadoop/UnsetHostnameOrPort], while > invoking ResourceManagerAdministrationProtocolPBClientImpl.refreshNodes over > null after 1 failover attempts. 
Trying to failover after sleeping for 41061ms. > 22/06/28 14:41:19 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 22/06/28 14:41:20 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > ... > 22/06/28 14:41:28 INFO ipc.Client: Retrying connect to server: > 0.0.0.0/0.0.0.0:8033. Already tried 9 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 22/06/28 14:41:28 INFO retry.RetryInvocationHandler: > java.net.ConnectException: Your endpoint configuration is wrong; For more > details
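The distinction YARN-11210 describes, retrying transient connection failures while failing fast on a configuration error, can be sketched roughly as below. This is only an illustration under stated assumptions: RetrySketch, isRetryable and callWithRetry are hypothetical names, not the RMProxy or RetryPolicy code, and treating only java.net.ConnectException as retryable is a simplification.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

/** Hypothetical sketch; not the actual Hadoop retry implementation. */
class RetrySketch {

  /** Assumption for this sketch: only connection failures are transient. */
  static boolean isRetryable(Exception e) {
    return e instanceof ConnectException;
  }

  /**
   * Runs the call, retrying retryable failures up to maxAttempts times.
   * Any other exception is rethrown on the first occurrence.
   */
  static <T> T callWithRetry(Callable<T> call, int maxAttempts, long sleepMillis)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.call();
      } catch (Exception e) {
        if (!isRetryable(e)) {
          // Fail fast, e.g. on an IllegalArgumentException caused by bad configuration.
          throw e;
        }
        last = e;
        if (attempt < maxAttempts) {
          Thread.sleep(sleepMillis);
        }
      }
    }
    // Every attempt failed with a retryable exception; surface the last one.
    throw last;
  }
}
```

With this shape, a call that throws IllegalArgumentException is attempted exactly once, while a call that throws ConnectException is repeated until it succeeds or the attempts run out.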
[jira] [Updated] (YARN-11204) Various MapReduce tests fail with NPE in AggregatedLogDeletionService.stopRMClient
[ https://issues.apache.org/jira/browse/YARN-11204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11204: -- Component/s: log-aggregation (was: test) > Various MapReduce tests fail with NPE in > AggregatedLogDeletionService.stopRMClient > -- > > Key: YARN-11204 > URL: https://issues.apache.org/jira/browse/YARN-11204 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: > hadoop-mapreduce-project_hadoop-mapreduce-client_testlogs.txt, > testAllOpportunisticMaps_logs.txt > > Time Spent: 0.5h > Remaining Estimate: 0h > > During testing of HADOOP-15327, I noticed that many unit tests are failing > in the module called 'hadoop-mapreduce-client-jobclient'. > See this link for details: > https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/9/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client.txt > In case the above Jenkins link expires later, I attached the same text > file to this jira. > Let's see one example: > org.apache.hadoop.mapred.TestMROpportunisticMaps#testAllOpportunisticMaps > Logs are also attached. 
> An example stacktrace, for reference: > {code} > 2022-06-29 11:24:13,510 INFO [Listener at 0.0.0.0/8049] > service.AbstractService (AbstractService.java:noteFailure(268)) - Service > TestMROpportunisticMaps failed in state STOPPED > java.lang.NullPointerException > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService.stopRMClient(AggregatedLogDeletionService.java:322) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService.serviceStop(AggregatedLogDeletionService.java:229) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:160) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:134) > at > org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceStop(JobHistoryServer.java:203) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.mapreduce.v2.MiniMRYarnCluster$JobHistoryServerWrapper.serviceStop(MiniMRYarnCluster.java:293) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:54) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:102) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:160) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:134) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.mapred.MiniMRYarnClusterAdapter.stop(MiniMRYarnClusterAdapter.java:56) > at > org.apache.hadoop.mapred.TestMROpportunisticMaps.doTest(TestMROpportunisticMaps.java:108) > at > 
org.apache.hadoop.mapred.TestMROpportunisticMaps.doTest(TestMROpportunisticMaps.java:74) > at > org.apache.hadoop.mapred.TestMROpportunisticMaps.testAllOpportunisticMaps(TestMROpportunisticMaps.java:60) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at
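The stack trace shows the NullPointerException being raised from stopRMClient while the service is being stopped. One common way to make such a stop path safe is a null guard, sketched below. DeletionServiceSketch and its members are hypothetical stand-ins, and this is not claimed to be the change that was merged for YARN-11204.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for a service that owns an optional client. */
class DeletionServiceSketch {

  // Stays null if start() never ran or failed before creating the client.
  private Closeable rmClient;

  void start(Closeable client) {
    this.rmClient = client;
  }

  /** Closes the client if one was created; safe to call more than once. */
  void stop() throws IOException {
    Closeable client = rmClient;
    rmClient = null;
    if (client != null) {
      client.close();
    }
  }
}
```

Clearing the field before closing means a second stop() call is a no-op, which matters when a composite service stops its children during both normal shutdown and failure handling.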
[jira] [Updated] (YARN-11204) Various MapReduce tests fail with NPE in AggregatedLogDeletionService.stopRMClient
[ https://issues.apache.org/jira/browse/YARN-11204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11204: -- Component/s: test Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Various MapReduce tests fail with NPE in > AggregatedLogDeletionService.stopRMClient > -- > > Key: YARN-11204 > URL: https://issues.apache.org/jira/browse/YARN-11204 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Affects Versions: 3.4.0 >Reporter: Szilard Nemeth >Assignee: Szilard Nemeth >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: > hadoop-mapreduce-project_hadoop-mapreduce-client_testlogs.txt, > testAllOpportunisticMaps_logs.txt > > Time Spent: 0.5h > Remaining Estimate: 0h > > During testing of HADOOP-15327, I noticed that many unit tests are failing > in the module called 'hadoop-mapreduce-client-jobclient'. > See this link for details: > https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3259/9/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client.txt > In case the above Jenkins link expires later, I attached the same text > file to this jira. > Let's see one example: > org.apache.hadoop.mapred.TestMROpportunisticMaps#testAllOpportunisticMaps > Logs are also attached. 
[jira] [Updated] (YARN-11212) [Federation] Add getNodeToLabels REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11212: -- Target Version/s: 3.4.0 > [Federation] Add getNodeToLabels REST APIs for Router > - > > Key: YARN-11212 > URL: https://issues.apache.org/jira/browse/YARN-11212 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > Add getNodeToLabels REST APIs for Router.
[jira] [Updated] (YARN-11211) QueueMetrics leaks Configuration objects when validation API is called multiple times
[ https://issues.apache.org/jira/browse/YARN-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11211: -- Target Version/s: 3.4.0 > QueueMetrics leaks Configuration objects when validation API is called > multiple times > - > > Key: YARN-11211 > URL: https://issues.apache.org/jira/browse/YARN-11211 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.4.0 >Reporter: András Győri >Assignee: András Győri >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > QueueMetrics#QUEUE_METRICS is a static map, which is a source of multiple > bugs, e.g. YARN-11152. > The current scenario can be reproduced by adding queues one at a time via > the mutation API. > # Validate adding queue1 via validation API > # Validation API instantiates a new CS, with a new Configuration, that > instantiates a ConfigurationProperties > # QueueMetrics shares the same QUEUE_METRICS cache with the original CS, > where there is now a Metrics object that belongs to the new CS
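The three steps above can be illustrated with a small stand-alone sketch. Only the QUEUE_METRICS name and the idea of a static map come from the issue; Conf, Metrics and forQueue are hypothetical stand-ins rather than the real QueueMetrics API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of a static metrics cache shared by all schedulers. */
class QueueMetricsLeakSketch {

  /** Stand-in for a large Configuration object. */
  static class Conf {
    final String owner;

    Conf(String owner) {
      this.owner = owner;
    }
  }

  /** Stand-in for a per-queue metrics object that references its Conf. */
  static class Metrics {
    final Conf conf;

    Metrics(Conf conf) {
      this.conf = conf;
    }
  }

  // Process-wide cache: every scheduler instance in the JVM writes into it.
  static final Map<String, Metrics> QUEUE_METRICS = new ConcurrentHashMap<>();

  /** Returns the cached metrics for a queue, creating them on first use. */
  static Metrics forQueue(String queue, Conf conf) {
    return QUEUE_METRICS.computeIfAbsent(queue, q -> new Metrics(conf));
  }
}
```

If a short-lived validation scheduler calls forQueue("root.queue1", temporaryConf), the entry stays in the static map after that scheduler is discarded, so temporaryConf remains reachable. Repeating the validation with new queue names adds one such entry per call.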
[jira] [Updated] (YARN-11211) QueueMetrics leaks Configuration objects when validation API is called multiple times
[ https://issues.apache.org/jira/browse/YARN-11211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11211: -- Affects Version/s: 3.4.0 > QueueMetrics leaks Configuration objects when validation API is called > multiple times > - > > Key: YARN-11211 > URL: https://issues.apache.org/jira/browse/YARN-11211 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.4.0 >Reporter: András Győri >Assignee: András Győri >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > QueueMetrics#QUEUE_METRICS is a static map, which is a source of multiple > bugs, e.g. YARN-11152. > The current scenario can be reproduced by adding queues one at a time via > the mutation API. > # Validate adding queue1 via validation API > # Validation API instantiates a new CS, with a new Configuration, that > instantiates a ConfigurationProperties > # QueueMetrics shares the same QUEUE_METRICS cache with the original CS, > where there is now a Metrics object that belongs to the new CS
[jira] [Updated] (YARN-11221) [Federation] Add replaceLabelsOnNodes, replaceLabelsOnNode REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11221: -- Target Version/s: 3.4.0 > [Federation] Add replaceLabelsOnNodes, replaceLabelsOnNode REST APIs for > Router > --- > > Key: YARN-11221 > URL: https://issues.apache.org/jira/browse/YARN-11221 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11220) [Federation] Add getLabelsToNodes, getClusterNodeLabels, getLabelsOnNode REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11220: -- Target Version/s: 3.4.0 > [Federation] Add getLabelsToNodes, getClusterNodeLabels, getLabelsOnNode REST > APIs for Router > - > > Key: YARN-11220 > URL: https://issues.apache.org/jira/browse/YARN-11220 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h
[jira] [Updated] (YARN-11219) [Federation] Add getAppActivities, getAppStatistics REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11219: -- Target Version/s: 3.4.0 > [Federation] Add getAppActivities, getAppStatistics REST APIs for Router > > > Key: YARN-11219 > URL: https://issues.apache.org/jira/browse/YARN-11219 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11217) [Federation] Add dumpSchedulerLogs REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11217: -- Target Version/s: 3.4.0 > [Federation] Add dumpSchedulerLogs REST APIs for Router > --- > > Key: YARN-11217 > URL: https://issues.apache.org/jira/browse/YARN-11217 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11218) [Federation] Add getActivities, getBulkActivities REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11218: -- Target Version/s: 3.4.0 > [Federation] Add getActivities, getBulkActivities REST APIs for Router > -- > > Key: YARN-11218 > URL: https://issues.apache.org/jira/browse/YARN-11218 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11223) [Federation] Add getAppPriority, updateApplicationPriority REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11223: -- Target Version/s: 3.4.0 > [Federation] Add getAppPriority, updateApplicationPriority REST APIs for > Router > --- > > Key: YARN-11223 > URL: https://issues.apache.org/jira/browse/YARN-11223 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11225) [Federation] Add postDelegationToken, postDelegationTokenExpiration, cancelDelegationToken REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11225: -- Target Version/s: 3.4.0 > [Federation] Add postDelegationToken, postDelegationTokenExpiration, > cancelDelegationToken REST APIs for Router > > > Key: YARN-11225 > URL: https://issues.apache.org/jira/browse/YARN-11225 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11224) [Federation] Add getAppQueue, updateAppQueue REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11224: -- Target Version/s: 3.4.0 > [Federation] Add getAppQueue, updateAppQueue REST APIs for Router > - > > Key: YARN-11224 > URL: https://issues.apache.org/jira/browse/YARN-11224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11222) [Federation] Add addToClusterNodeLabels, removeFromClusterNodeLabels REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11222: -- Target Version/s: 3.4.0 > [Federation] Add addToClusterNodeLabels, removeFromClusterNodeLabels REST > APIs for Router > - > > Key: YARN-11222 > URL: https://issues.apache.org/jira/browse/YARN-11222 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11226) [Federation] Add createNewReservation, submitReservation, updateReservation, deleteReservation REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11226: -- Target Version/s: 3.4.0 > [Federation] Add createNewReservation, submitReservation, updateReservation, > deleteReservation REST APIs for Router > --- > > Key: YARN-11226 > URL: https://issues.apache.org/jira/browse/YARN-11226 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11230) [Federation] Add getContainer, signalToContainer REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11230: -- Target Version/s: 3.4.0 > [Federation] Add getContainer, signalToContainer REST APIs for Router > -- > > Key: YARN-11230 > URL: https://issues.apache.org/jira/browse/YARN-11230 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1.5h > Remaining Estimate: 0h
[jira] [Updated] (YARN-11236) [RESERVATION] Implement FederationReservationHomeSubClusterStore With MemoryStore
[ https://issues.apache.org/jira/browse/YARN-11236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11236: -- Target Version/s: 3.4.0 > [RESERVATION] Implement FederationReservationHomeSubClusterStore With > MemoryStore > - > > Key: YARN-11236 > URL: https://issues.apache.org/jira/browse/YARN-11236 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11228) [Federation] Add getAppAttempts, getAppAttempt REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11228: -- Target Version/s: 3.4.0 > [Federation] Add getAppAttempts, getAppAttempt REST APIs for Router > --- > > Key: YARN-11228 > URL: https://issues.apache.org/jira/browse/YARN-11228 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 3h > Remaining Estimate: 0h
[jira] [Updated] (YARN-11229) [Federation] Add checkUserAccessToQueue REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11229: -- Target Version/s: 3.4.0 > [Federation] Add checkUserAccessToQueue REST APIs for Router > > > Key: YARN-11229 > URL: https://issues.apache.org/jira/browse/YARN-11229 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11227) [Federation] Add getAppTimeout, getAppTimeouts, updateApplicationTimeout REST APIs for Router
[ https://issues.apache.org/jira/browse/YARN-11227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11227: -- Target Version/s: 3.4.0 > [Federation] Add getAppTimeout, getAppTimeouts, updateApplicationTimeout REST > APIs for Router > - > > Key: YARN-11227 > URL: https://issues.apache.org/jira/browse/YARN-11227 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11235) [RESERVATION] Refactor Policy Code and Define getReservationHomeSubcluster
[ https://issues.apache.org/jira/browse/YARN-11235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11235: -- Target Version/s: 3.4.0 > [RESERVATION] Refactor Policy Code and Define getReservationHomeSubcluster > -- > > Key: YARN-11235 > URL: https://issues.apache.org/jira/browse/YARN-11235 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: [RESERVATION] Add support for reservation-based > routing.pdf > > Time Spent: 8h 20m > Remaining Estimate: 0h > > Refer to 2.1 Router Policy, which describes the changes to be made. The > documentation will continue to improve; the current version is V1.
[jira] [Updated] (YARN-11238) Optimizing FederationClientInterceptor Call with Parallelism
[ https://issues.apache.org/jira/browse/YARN-11238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11238: -- Target Version/s: 3.4.0 > Optimizing FederationClientInterceptor Call with Parallelism > > > Key: YARN-11238 > URL: https://issues.apache.org/jira/browse/YARN-11238 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11240) Fix incorrect placeholder in yarn-module
[ https://issues.apache.org/jira/browse/YARN-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11240: -- Target Version/s: 3.4.0 > Fix incorrect placeholder in yarn-module > > > Key: YARN-11240 > URL: https://issues.apache.org/jira/browse/YARN-11240 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Try to deal with the module problem at a time.
[jira] [Updated] (YARN-11239) Optimize FederationClientInterceptor audit log
[ https://issues.apache.org/jira/browse/YARN-11239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11239: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Optimize FederationClientInterceptor audit log > -- > > Key: YARN-11239 > URL: https://issues.apache.org/jira/browse/YARN-11239 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11237) Bug while disabling proxy failover with Federation
[ https://issues.apache.org/jira/browse/YARN-11237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11237: -- Target Version/s: 3.4.0 > Bug while disabling proxy failover with Federation > -- > > Key: YARN-11237 > URL: https://issues.apache.org/jira/browse/YARN-11237 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Affects Versions: 3.3.3 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > When the use of the RM failover proxy is disabled with federation, the code > checks the wrong (parent) flag `yarn.federation.enabled`, which indicates whether > federation is used, instead of the federation failover feature flag > `yarn.federation.failover.enabled`. Without this change, when > the failover feature is disabled, the NodeManager cannot be started.
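The flag mix-up described in YARN-11237 can be sketched as follows. The two property names are taken from the issue text; FailoverFlagSketch, the plain Map standing in for a Hadoop Configuration, and the default values are assumptions made for illustration only.

```java
import java.util.Map;

/** Hypothetical sketch of checking the parent flag versus the feature flag. */
class FailoverFlagSketch {
  static final String FEDERATION_ENABLED = "yarn.federation.enabled";
  static final String FEDERATION_FAILOVER_ENABLED = "yarn.federation.failover.enabled";

  /** Reads a boolean property, falling back to def when the key is absent. */
  static boolean getBoolean(Map<String, String> conf, String key, boolean def) {
    String value = conf.get(key);
    return value == null ? def : Boolean.parseBoolean(value);
  }

  /** Buggy variant: consults only the parent flag, so failover cannot be turned off. */
  static boolean useFailoverProxyBuggy(Map<String, String> conf) {
    return getBoolean(conf, FEDERATION_ENABLED, false);
  }

  /** Intended variant: federation must be on and failover must not be disabled. */
  static boolean useFailoverProxy(Map<String, String> conf) {
    // The default of true for the failover flag is an assumption of this sketch.
    return getBoolean(conf, FEDERATION_ENABLED, false)
        && getBoolean(conf, FEDERATION_FAILOVER_ENABLED, true);
  }
}
```

With federation enabled and `yarn.federation.failover.enabled` set to false, the buggy variant still selects the failover proxy, while the intended variant does not.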
[jira] [Updated] (YARN-11240) Fix incorrect placeholder in yarn-module
[ https://issues.apache.org/jira/browse/YARN-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11240: -- Component/s: yarn > Fix incorrect placeholder in yarn-module > > > Key: YARN-11240 > URL: https://issues.apache.org/jira/browse/YARN-11240 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Try to deal with the module problem at a time.
[jira] [Updated] (YARN-11241) Add uncleaning option for local app log file with log-aggregation enabled
[ https://issues.apache.org/jira/browse/YARN-11241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11241: -- Hadoop Flags: Reviewed Target Version/s: 3.3.5, 3.4.0 Affects Version/s: 3.3.5, 3.4.0 > Add uncleaning option for local app log file with log-aggregation enabled > - > > Key: YARN-11241 > URL: https://issues.apache.org/jira/browse/YARN-11241 > Project: Hadoop YARN > Issue Type: New Feature > Components: log-aggregation > Affects Versions: 3.4.0, 3.3.5 > Reporter: Ashutosh Gupta > Assignee: Ashutosh Gupta > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > Time Spent: 20m > Remaining Estimate: 0h > > Add an uncleaning option for local app log files when log-aggregation is enabled. This will be helpful for debugging purposes.
[jira] [Updated] (YARN-11245) Upgrade JUnit from 4 to 5 in hadoop-yarn-csi
[ https://issues.apache.org/jira/browse/YARN-11245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11245: -- Component/s: yarn-csi Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Upgrade JUnit from 4 to 5 in hadoop-yarn-csi > > > Key: YARN-11245 > URL: https://issues.apache.org/jira/browse/YARN-11245 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-csi >Affects Versions: 3.4.0 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Upgrade JUnit from 4 to 5 in hadoop-yarn-csi
[jira] [Updated] (YARN-11245) Upgrade JUnit from 4 to 5 in hadoop-yarn-csi
[ https://issues.apache.org/jira/browse/YARN-11245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11245: -- Hadoop Flags: Reviewed > Upgrade JUnit from 4 to 5 in hadoop-yarn-csi > > > Key: YARN-11245 > URL: https://issues.apache.org/jira/browse/YARN-11245 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-csi >Affects Versions: 3.4.0 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Upgrade JUnit from 4 to 5 in hadoop-yarn-csi
[jira] [Updated] (YARN-11253) Add Configuration to delegationToken RemoverScanInterval
[ https://issues.apache.org/jira/browse/YARN-11253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11253: -- Target Version/s: 3.4.0 > Add Configuration to delegationToken RemoverScanInterval > > > Key: YARN-11253 > URL: https://issues.apache.org/jira/browse/YARN-11253 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager > Affects Versions: 3.4.0, 3.3.4 > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > When reading the code, I found a hard-coded value. I think the parameter should be moved into the configuration. > org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService#createRMDelegationTokenSecretManager
> {code:java}
> protected RMDelegationTokenSecretManager createRMDelegationTokenSecretManager(
>     Configuration conf, RMContext rmContext) {
>   // . 360 This hard-coded value should be extracted
>   return new RMDelegationTokenSecretManager(secretKeyInterval,
>       tokenMaxLifetime, tokenRenewInterval, 360, rmContext);
> }
> {code}
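The extraction proposed in YARN-11253 could look roughly like the sketch below. It is not the actual patch: a plain map stands in for Hadoop's Configuration, the property name is a made-up placeholder, and the default simply repeats the literal shown in the quoted snippet (whose unit is not stated there).

```java
import java.util.HashMap;
import java.util.Map;

class RemoverScanIntervalSketch {
  // Hypothetical property name, chosen for illustration only.
  static final String SCAN_INTERVAL_KEY = "example.delegation-token.remover-scan-interval";
  // Fallback equal to the literal that appears in the quoted snippet.
  static final long DEFAULT_SCAN_INTERVAL = 360L;

  /** Returns the configured interval, or the default when the key is absent or blank. */
  static long resolveScanInterval(Map<String, String> conf) {
    String raw = conf.get(SCAN_INTERVAL_KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT_SCAN_INTERVAL;
    }
    long value = Long.parseLong(raw.trim());
    if (value <= 0) {
      throw new IllegalArgumentException(
          SCAN_INTERVAL_KEY + " must be positive, got " + value);
    }
    return value;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(resolveScanInterval(conf)); // no override: default is used
    conf.put(SCAN_INTERVAL_KEY, "1200");
    System.out.println(resolveScanInterval(conf)); // override is used
  }
}
```

The resolved value would then be passed to the constructor in place of the literal, so operators can tune the scan interval without a code change.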
[jira] [Updated] (YARN-11250) Capture the Performance Metrics of ZookeeperFederationStateStore
[ https://issues.apache.org/jira/browse/YARN-11250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11250: -- Target Version/s: 3.4.0 > Capture the Performance Metrics of ZookeeperFederationStateStore > > > Key: YARN-11250 > URL: https://issues.apache.org/jira/browse/YARN-11250 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Capture the Performance Metrics of ZookeeperFederationStateStore.
[jira] [Updated] (YARN-11248) Add unit test for FINISHED_CONTAINERS_PULLED_BY_AM event on DECOMMISSIONING
[ https://issues.apache.org/jira/browse/YARN-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11248: -- Hadoop Flags: Reviewed > Add unit test for FINISHED_CONTAINERS_PULLED_BY_AM event on DECOMMISSIONING > --- > > Key: YARN-11248 > URL: https://issues.apache.org/jira/browse/YARN-11248 > Project: Hadoop YARN > Issue Type: Test > Components: test >Affects Versions: 3.3.3 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > Add unit test for FINISHED_CONTAINERS_PULLED_BY_AM event on DECOMMISSIONING
[jira] [Updated] (YARN-11252) [RESERVATION] Yarn Federation Router Supports Update / Delete Reservation in MemoryStore
[ https://issues.apache.org/jira/browse/YARN-11252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11252: -- Target Version/s: 3.4.0 > [RESERVATION] Yarn Federation Router Supports Update / Delete Reservation in > MemoryStore > > > Key: YARN-11252 > URL: https://issues.apache.org/jira/browse/YARN-11252 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.4.0, 3.3.4 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11255) Support loading alternative docker client config from system environment
[ https://issues.apache.org/jira/browse/YARN-11255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11255: -- Labels: pull-request-available (was: ) > Support loading alternative docker client config from system environment > > > Key: YARN-11255 > URL: https://issues.apache.org/jira/browse/YARN-11255 > Project: Hadoop YARN > Issue Type: New Feature > Components: yarn > Affects Versions: 3.4.0 > Reporter: Ashutosh Gupta > Assignee: Ashutosh Gupta > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > With YARN Docker support, the hadoop shell supports {code:java}-docker_client_config{code} to pass a client config file containing a security token, from which the Docker config for each job is generated as a temporary file. > Other applications that submit jobs to YARN, e.g. Spark, load the Docker settings via the system environment, e.g. {code:java}spark.executorEnv.*{code}, and cannot add those authorization tokens because this system environment isn't considered in YARN. > Add a generic solution to handle these kinds of cases without making changes in Spark code or other applications. > Example: when using a remote container registry, {{YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG}} must reference the config.json file containing the credentials used to authenticate.
> {code:java}
> DOCKER_IMAGE_NAME=hadoop-docker
> DOCKER_CLIENT_CONFIG=hdfs:///user/hadoop/config.json
> spark-submit --master yarn \
> --deploy-mode cluster \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$DOCKER_IMAGE_NAME \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG=$DOCKER_CLIENT_CONFIG \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$DOCKER_IMAGE_NAME \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG=$DOCKER_CLIENT_CONFIG \
> sparkR.R
> {code}
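One way to picture the behaviour requested in YARN-11255 is a lookup that falls back to the container environment when no explicit client config was passed. The sketch below is only an illustration under that assumption: the class and method names are hypothetical, the precedence (explicit option first, then environment) is a guess, and only the variable name YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG comes from the issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class DockerClientConfigSketch {
  // Environment variable named in the issue description.
  static final String ENV_CLIENT_CONFIG = "YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG";

  /**
   * Picks the client config location.
   * Assumed precedence: explicit option first, then the container environment.
   */
  static Optional<String> resolveClientConfig(String explicitOption,
                                              Map<String, String> containerEnv) {
    if (explicitOption != null && !explicitOption.isEmpty()) {
      return Optional.of(explicitOption);
    }
    String fromEnv = containerEnv.get(ENV_CLIENT_CONFIG);
    if (fromEnv != null && !fromEnv.isEmpty()) {
      return Optional.of(fromEnv);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Map<String, String> env = new HashMap<>();
    env.put(ENV_CLIENT_CONFIG, "hdfs:///user/hadoop/config.json");
    // No explicit option, so the value set through the environment is used.
    System.out.println(resolveClientConfig(null, env).orElse("<none>"));
  }
}
```

Under this reading, a Spark job that sets the variable through spark.executorEnv.* or spark.yarn.appMasterEnv.* would supply its registry credentials without any change to Spark itself.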
[jira] [Updated] (YARN-11254) hadoop-minikdc dependency duplicated in hadoop-yarn-server-nodemanager
[ https://issues.apache.org/jira/browse/YARN-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11254: -- Hadoop Flags: Reviewed Target Version/s: 3.4.0 > hadoop-minikdc dependency duplicated in hadoop-yarn-server-nodemanager > -- > > Key: YARN-11254 > URL: https://issues.apache.org/jira/browse/YARN-11254 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager > Affects Versions: 3.4.0 > Reporter: Clara Fang > Assignee: Clara Fang > Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > The dependency hadoop-minikdc is defined twice in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/pom.xml
> {code:xml}
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-minikdc</artifactId>
>   <scope>test</scope>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-minikdc</artifactId>
>   <scope>test</scope>
> </dependency>
> {code}
[jira] [Updated] (YARN-11254) hadoop-minikdc dependency duplicated in hadoop-yarn-server-nodemanager
[ https://issues.apache.org/jira/browse/YARN-11254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11254: -- Affects Version/s: 3.4.0 > hadoop-minikdc dependency duplicated in hadoop-yarn-server-nodemanager > -- > > Key: YARN-11254 > URL: https://issues.apache.org/jira/browse/YARN-11254 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager > Affects Versions: 3.4.0 > Reporter: Clara Fang > Assignee: Clara Fang > Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > The dependency hadoop-minikdc is defined twice in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/pom.xml
> {code:xml}
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-minikdc</artifactId>
>   <scope>test</scope>
> </dependency>
> <dependency>
>   <groupId>org.apache.hadoop</groupId>
>   <artifactId>hadoop-minikdc</artifactId>
>   <scope>test</scope>
> </dependency>
> {code}
[jira] [Updated] (YARN-11255) Support loading alternative docker client config from system environment
[ https://issues.apache.org/jira/browse/YARN-11255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11255: -- Component/s: yarn Hadoop Flags: Reviewed Target Version/s: 3.4.0 > Support loading alternative docker client config from system environment > > > Key: YARN-11255 > URL: https://issues.apache.org/jira/browse/YARN-11255 > Project: Hadoop YARN > Issue Type: New Feature > Components: yarn > Affects Versions: 3.4.0 > Reporter: Ashutosh Gupta > Assignee: Ashutosh Gupta > Priority: Major > Fix For: 3.4.0 > > > With YARN Docker support, the hadoop shell supports {code:java}-docker_client_config{code} to pass a client config file containing a security token, from which the Docker config for each job is generated as a temporary file. > Other applications that submit jobs to YARN, e.g. Spark, load the Docker settings via the system environment, e.g. {code:java}spark.executorEnv.*{code}, and cannot add those authorization tokens because this system environment isn't considered in YARN. > Add a generic solution to handle these kinds of cases without making changes in Spark code or other applications. > Example: when using a remote container registry, {{YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG}} must reference the config.json file containing the credentials used to authenticate.
> {code:java}
> DOCKER_IMAGE_NAME=hadoop-docker
> DOCKER_CLIENT_CONFIG=hdfs:///user/hadoop/config.json
> spark-submit --master yarn \
> --deploy-mode cluster \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$DOCKER_IMAGE_NAME \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG=$DOCKER_CLIENT_CONFIG \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$DOCKER_IMAGE_NAME \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG=$DOCKER_CLIENT_CONFIG \
> sparkR.R
> {code}
[jira] [Updated] (YARN-11255) Support loading alternative docker client config from system environment
[ https://issues.apache.org/jira/browse/YARN-11255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11255: -- Affects Version/s: 3.4.0 > Support loading alternative docker client config from system environment > > > Key: YARN-11255 > URL: https://issues.apache.org/jira/browse/YARN-11255 > Project: Hadoop YARN > Issue Type: New Feature > Affects Versions: 3.4.0 > Reporter: Ashutosh Gupta > Assignee: Ashutosh Gupta > Priority: Major > Fix For: 3.4.0 > > > With YARN Docker support, the hadoop shell supports {code:java}-docker_client_config{code} to pass a client config file containing a security token, from which the Docker config for each job is generated as a temporary file. > Other applications that submit jobs to YARN, e.g. Spark, load the Docker settings via the system environment, e.g. {code:java}spark.executorEnv.*{code}, and cannot add those authorization tokens because this system environment isn't considered in YARN. > Add a generic solution to handle these kinds of cases without making changes in Spark code or other applications. > Example: when using a remote container registry, {{YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG}} must reference the config.json file containing the credentials used to authenticate.
> {code:java}
> DOCKER_IMAGE_NAME=hadoop-docker
> DOCKER_CLIENT_CONFIG=hdfs:///user/hadoop/config.json
> spark-submit --master yarn \
> --deploy-mode cluster \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$DOCKER_IMAGE_NAME \
> --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG=$DOCKER_CLIENT_CONFIG \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$DOCKER_IMAGE_NAME \
> --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG=$DOCKER_CLIENT_CONFIG \
> sparkR.R
> {code}
[jira] [Updated] (YARN-11271) Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timelineservice-hbase-common
[ https://issues.apache.org/jira/browse/YARN-11271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11271: -- Hadoop Flags: Reviewed Target Version/s: 3.4.0 > Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timelineservice-hbase-common > > > Key: YARN-11271 > URL: https://issues.apache.org/jira/browse/YARN-11271 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test, yarn >Affects Versions: 3.3.4 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11270) Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timelineservice-hbase-client
[ https://issues.apache.org/jira/browse/YARN-11270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11270: -- Target Version/s: 3.4.0 > Upgrade JUnit from 4 to 5 in hadoop-yarn-server-timelineservice-hbase-client > > > Key: YARN-11270 > URL: https://issues.apache.org/jira/browse/YARN-11270 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test, yarn >Affects Versions: 3.3.4 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11278) Ambiguous error message in mutation API
[ https://issues.apache.org/jira/browse/YARN-11278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11278: -- Target Version/s: 3.4.0 Affects Version/s: 3.4.0 > Ambiguous error message in mutation API > --- > > Key: YARN-11278 > URL: https://issues.apache.org/jira/browse/YARN-11278 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler > Affects Versions: 3.4.0 > Reporter: András Győri > Assignee: Ashutosh Gupta > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > In RMWebServices#updateSchedulerConfiguration, we check two prerequisites:
> {code:java}
> if (scheduler instanceof MutableConfScheduler
>     && ((MutableConfScheduler) scheduler).isConfigurationMutable()) {
> {code}
> However, the error message is misleading when the second check fails (namely, when the configuration is not mutable, e.g. with a FILE_CONFIGURATION_STORE):
> {code:java}
> } else {
>   return Response.status(Status.BAD_REQUEST)
>       .entity("Configuration change only supported by " +
>           "MutableConfScheduler.")
>       .build();
> {code}
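To make the ambiguity in YARN-11278 concrete, the sketch below separates the two prerequisites so that each failure gets its own message. It is an illustration only: the nested interfaces are minimal stand-ins for the scheduler types named in the issue, and the wording of the second message is invented here, not taken from any patch.

```java
class MutationPrecheckSketch {
  /** Minimal stand-ins for the scheduler types named in the issue. */
  interface Scheduler {}

  interface MutableConfScheduler extends Scheduler {
    boolean isConfigurationMutable();
  }

  /** Returns null when the mutation may proceed, otherwise a specific error message. */
  static String precheck(Scheduler scheduler) {
    if (!(scheduler instanceof MutableConfScheduler)) {
      // First prerequisite failed: the message quoted in the issue fits this case.
      return "Configuration change only supported by MutableConfScheduler.";
    }
    if (!((MutableConfScheduler) scheduler).isConfigurationMutable()) {
      // Second prerequisite failed: hypothetical wording for the non-mutable store case.
      return "Configuration change is not possible because the configuration is not mutable.";
    }
    return null;
  }

  public static void main(String[] args) {
    MutableConfScheduler readOnly = () -> false;
    System.out.println(precheck(readOnly));
  }
}
```

Splitting the combined condition is the whole point: a caller running a MutableConfScheduler on a non-mutable store is no longer told that the scheduler type is the problem.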
[jira] [Updated] (YARN-11297) Improve Yarn Router Reservation Submission Code
[ https://issues.apache.org/jira/browse/YARN-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11297: -- Hadoop Flags: Reviewed > Improve Yarn Router Reservation Submission Code > --- > > Key: YARN-11297 > URL: https://issues.apache.org/jira/browse/YARN-11297 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation > Affects Versions: 3.4.0 > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > The same reservation may be submitted repeatedly. In that case, we should use the existing reservation result first. If that result is not available, consider applying to other RMs.
[jira] [Updated] (YARN-11283) [Federation] Fix Typo of NodeManager AMRMProxy.
[ https://issues.apache.org/jira/browse/YARN-11283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11283: -- Target Version/s: 3.4.0 > [Federation] Fix Typo of NodeManager AMRMProxy. > --- > > Key: YARN-11283 > URL: https://issues.apache.org/jira/browse/YARN-11283 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, nodemanager >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Fix Typo of NodeManager amrmproxy
[jira] [Updated] (YARN-11287) Fix NoClassDefFoundError: org/junit/platform/launcher/core/LauncherFactory after YARN-10793
[ https://issues.apache.org/jira/browse/YARN-11287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11287: -- Hadoop Flags: Reviewed > Fix NoClassDefFoundError: org/junit/platform/launcher/core/LauncherFactory after YARN-10793 > --- > > Key: YARN-11287 > URL: https://issues.apache.org/jira/browse/YARN-11287 > Project: Hadoop YARN > Issue Type: Bug > Components: build, test > Affects Versions: 3.4.0 > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > After running the unit tests for the whole yarn-project, I found the following error:
> {code:java}
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M1:test (default-test) on project hadoop-yarn-server-applicationhistoryservice: Execution default-test of goal org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M1:test failed: java.lang.NoClassDefFoundError: org/junit/platform/launcher/core/LauncherFactory: org.junit.platform.launcher.core.LauncherFactory -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please read the following articles:
> [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginExecutionException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the command
> [ERROR] mvn -rf :hadoop-yarn-server-applicationhistoryservice
> {code}
[jira] [Updated] (YARN-11307) Fix Yarn Router Broken Link
[ https://issues.apache.org/jira/browse/YARN-11307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11307: -- Hadoop Flags: Reviewed > Fix Yarn Router Broken Link > --- > > Key: YARN-11307 > URL: https://issues.apache.org/jira/browse/YARN-11307 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, router >Affects Versions: 3.4.0 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0
[jira] [Updated] (YARN-11303) Upgrade jquery ui to 1.13.2
[ https://issues.apache.org/jira/browse/YARN-11303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11303: -- Target Version/s: 3.3.5, 3.4.0 Affects Version/s: 3.3.5, 3.4.0 > Upgrade jquery ui to 1.13.2 > --- > > Key: YARN-11303 > URL: https://issues.apache.org/jira/browse/YARN-11303 > Project: Hadoop YARN > Issue Type: Improvement > Components: security > Affects Versions: 3.4.0, 3.3.5 > Reporter: D M Murali Krishna Reddy > Assignee: Ashutosh Gupta > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.5 > > > The jquery-ui version currently used in trunk (1.13.1) has the vulnerability [CVE-2022-31160|https://nvd.nist.gov/vuln/detail/CVE-2022-31160], so we need to upgrade to at least 1.13.2.
[jira] [Updated] (YARN-11324) [Federation] Fix some PBImpl classes to avoid NPE.
[ https://issues.apache.org/jira/browse/YARN-11324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11324: -- Hadoop Flags: Reviewed > [Federation] Fix some PBImpl classes to avoid NPE. > -- > > Key: YARN-11324 > URL: https://issues.apache.org/jira/browse/YARN-11324 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation, router, yarn > Affects Versions: 3.4.0 > Reporter: Shilun Fan > Assignee: Shilun Fan > Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2022-09-30-16-52-25-031.png > > > While completing YARN-11323, I found a bug in ApplicationHomeSubClusterPBImpl that may cause a null pointer exception when calling getApplicationId:
> {code:java}
> @Test
> public void testGetApplicationIdNullException() throws YarnException {
>   ApplicationId appId = ApplicationId.newInstance(Time.now(), 1);
>   ApplicationHomeSubCluster appHomeSC = ApplicationHomeSubCluster.newInstance(
>       appId, subClusterId);
>   System.out.println(appHomeSC.getApplicationId());
> }
> {code}
> The test results are as follows:
> !image-2022-09-30-16-52-25-031.png|width=818,height=271!
>
> After we set the ApplicationId, an immediate get returns a null value.
> *Why does this problem occur?*
> We do not set a value on ApplicationHomeSubClusterProtoOrBuilder when setting the ApplicationId.
> *Improve the code:*
> 1. Set a value on ApplicationHomeSubClusterProtoOrBuilder when setting the ApplicationId.
> 2. To improve access efficiency, first check whether the internal property is empty when getting the ApplicationId. If it is not empty, return it directly. If it is empty, convert it from the proto object.
> While modifying ApplicationHomeSubClusterPBImpl, I will check the PBImpl classes of all router modules to make sure all of them are fixed.
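The two improvement steps listed in YARN-11324 can be sketched as follows. This is not the Hadoop class: a tiny hand-written Builder stands in for the generated protobuf builder, the id is a plain string rather than an ApplicationId, and all names are hypothetical. It only illustrates "write through to the builder on set" and "return the cached field first on get".

```java
class AppHomeRecordSketch {
  /** Stand-in for a protobuf builder; it only holds a serialized id. */
  static final class Builder {
    private String applicationId; // null means "not set"

    boolean hasApplicationId() { return applicationId != null; }
    String getApplicationId() { return applicationId; }
    void setApplicationId(String id) { applicationId = id; }
    void clearApplicationId() { applicationId = null; }
  }

  private final Builder builder = new Builder();
  private String cachedApplicationId; // locally cached copy of the id

  void setApplicationId(String appId) {
    if (appId == null) {
      builder.clearApplicationId();
      cachedApplicationId = null;
      return;
    }
    // Step 1: write the value into the builder as well, not only into the field.
    builder.setApplicationId(appId);
    cachedApplicationId = appId;
  }

  String getApplicationId() {
    // Step 2: return the cached value when present, otherwise convert from the builder.
    if (cachedApplicationId != null) {
      return cachedApplicationId;
    }
    if (!builder.hasApplicationId()) {
      return null;
    }
    cachedApplicationId = builder.getApplicationId();
    return cachedApplicationId;
  }

  public static void main(String[] args) {
    AppHomeRecordSketch record = new AppHomeRecordSketch();
    record.setApplicationId("application_1_0001");
    System.out.println(record.getApplicationId());
  }
}
```

Because the setter updates both the builder and the cached field, a get that follows a set no longer depends on a later merge step to see the value.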