[jira] [Commented] (YARN-9581) WebAppUtils#getRMWebAppURLWithScheme ignores rm2
[ https://issues.apache.org/jira/browse/YARN-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848628#comment-16848628 ]

Prabhu Joseph commented on YARN-9581:
-------------------------------------

[~pbacsko] [~adam.antal] Could you review this Jira when you get time? This fixes the LogsCli (-am 1) failure for a running job under RM HA when rm2 is active and rm1 is down.

> WebAppUtils#getRMWebAppURLWithScheme ignores rm2
> ------------------------------------------------
>
> Key: YARN-9581
> URL: https://issues.apache.org/jira/browse/YARN-9581
> Project: Hadoop YARN
> Issue Type: Bug
> Components: client
> Affects Versions: 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9581-001.patch, YARN-9581-002.patch
>
>
> Yarn Logs fails for a running job under RM HA when rm2 is active and rm1 is down.
> {code}
> hrt_qa@prabhuYarn:~> /usr/hdp/current/hadoop-yarn-client/bin/yarn logs -applicationId application_1558613472348_0004 -am 1
> 19/05/24 18:04:49 INFO client.AHSProxy: Connecting to Application History server at prabhuYarn/172.27.23.55:10200
> 19/05/24 18:04:50 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> Unable to get AM container informations for the application:application_1558613472348_0004
> java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Error while authenticating with endpoint: https://prabhuYarn:8090/ws/v1/cluster/apps/application_1558613472348_0004/appattempts
> Can not get AMContainers logs for the application:application_1558613472348_0004 with the appOwner:hrt_qa
> {code}
> LogsCli getRMWebAppURLWithoutScheme only checks the first entry in the RM list yarn.resourcemanager.ha.rm-ids.
> {code}
> yarnConfig.set(YarnConfiguration.RM_HA_ID, rmIds.get(0));
> {code}
> SchedConfCli also fails:
> {code}
> [ambari-qa@pjosephdocker-3 ~]$ yarn schedulerconf -update root.default:maximum-capacity=90
> Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused (Connection refused)
> at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
> at com.sun.jersey.api.client.Client.handle(Client.java:652)
> at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
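The report above says the client only ever looks at the first id in yarn.resourcemanager.ha.rm-ids. One way to picture the alternative is to walk every configured id and take the first RM whose web address responds. The following is only a minimal sketch under that assumption; it is not the code from the attached patches. The class `RmWebAddressPicker`, the `webAddresses` map, and the `isReachable` probe are all hypothetical stand-ins so that the example runs without Hadoop on the classpath.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper for illustration; not part of WebAppUtils or the YARN-9581 patch. */
final class RmWebAddressPicker {
  private RmWebAddressPicker() {}

  /**
   * Returns the web address of the first RM id whose address passes the probe.
   *
   * rmIds mirrors the ordered list from yarn.resourcemanager.ha.rm-ids,
   * webAddresses maps each id to its configured web address, and
   * isReachable is an assumed stand-in for a real HTTP connectivity check.
   */
  static Optional<String> firstReachable(List<String> rmIds,
      Map<String, String> webAddresses, Predicate<String> isReachable) {
    for (String rmId : rmIds) {
      String address = webAddresses.get(rmId);
      // Skip ids with no configured address or with an RM that does not answer.
      if (address != null && isReachable.test(address)) {
        return Optional.of(address);
      }
    }
    // No RM answered; the caller decides how to report the failure.
    return Optional.empty();
  }
}
```

With rm1 down and rm2 active, a probe that rejects rm1's address makes the loop fall through to rm2 instead of stopping at `rmIds.get(0)`.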
[jira] [Updated] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

KWON BYUNGCHANG updated YARN-9583:
----------------------------------
Attachment: YARN-9583.001.patch

> Failed job which is submitted unknown queue is showed all users
> ---------------------------------------------------------------
>
> Key: YARN-9583
> URL: https://issues.apache.org/jira/browse/YARN-9583
> Project: Hadoop YARN
> Issue Type: Bug
> Components: security
> Affects Versions: 3.1.2
> Reporter: KWON BYUNGCHANG
> Priority: Minor
> Attachments: YARN-9583-screenshot.png, YARN-9583.001.patch
>
>
> A failed job that was submitted to an unknown queue is shown to all users.
> I attached an RM UI screenshot.
> Reproduction scenario:
> 1. User foo submits a job to an unknown queue without view-acl, and the job fails.
> 2. User bar can access the job of user foo.
> According to the comments in QueueACLsManager.java, the code that caused the problem, this situation can happen when the RM is restarted after a queue is deleted.
> I think showing an app of a non-existent queue to all users after RM start is the problem.
> It is a potential security hole.
> I fixed it a little bit.
> After the fix, both the job owner and the YARN admin can access a job that was submitted to an unknown queue.
[jira] [Updated] (YARN-9583) Failed job which is submitted unknown queue is showed all users
[ https://issues.apache.org/jira/browse/YARN-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

KWON BYUNGCHANG updated YARN-9583:
----------------------------------
Attachment: YARN-9583-screenshot.png

> Failed job which is submitted unknown queue is showed all users
> ---------------------------------------------------------------
>
> Key: YARN-9583
> URL: https://issues.apache.org/jira/browse/YARN-9583
> Project: Hadoop YARN
> Issue Type: Bug
> Components: security
> Affects Versions: 3.1.2
> Reporter: KWON BYUNGCHANG
> Priority: Minor
> Attachments: YARN-9583-screenshot.png
>
>
> A failed job that was submitted to an unknown queue is shown to all users.
> I attached an RM UI screenshot.
> Reproduction scenario:
> 1. User foo submits a job to an unknown queue without view-acl, and the job fails.
> 2. User bar can access the job of user foo.
> According to the comments in QueueACLsManager.java, the code that caused the problem, this situation can happen when the RM is restarted after a queue is deleted.
> I think showing an app of a non-existent queue to all users after RM start is the problem.
> It is a potential security hole.
> I fixed it a little bit.
> After the fix, both the job owner and the YARN admin can access a job that was submitted to an unknown queue.
[jira] [Updated] (YARN-8693) Add signalToContainer REST API for RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8693:
---------------------------
Attachment: YARN-8693.002.patch

> Add signalToContainer REST API for RMWebServices
> ------------------------------------------------
>
> Key: YARN-8693
> URL: https://issues.apache.org/jira/browse/YARN-8693
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: restapi
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-8693.001.patch, YARN-8693.002.patch
>
>
> Currently YARN has an RPC command, "yarn container -signal <container ID [signal command]>", to send the OUTPUT_THREAD_DUMP/GRACEFUL_SHUTDOWN/FORCEFUL_SHUTDOWN signal commands to a container.
> That is not enough; we need to add a signalToContainer REST API for better management by cluster administrators or management systems.
[jira] [Commented] (YARN-9580) Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers
[ https://issues.apache.org/jira/browse/YARN-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848603#comment-16848603 ]

Tao Yang commented on YARN-9580:
--------------------------------

The UT failures in the fair scheduler and state store seem unrelated to this patch; I can't reproduce them in my local environment.
[~cheersyang], could you please help review this patch?

> Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers
> ---------------------------------------------------------------------------------------------------------
>
> Key: YARN-9580
> URL: https://issues.apache.org/jira/browse/YARN-9580
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-9580.001.patch
>
>
> When an assignment is transferred from a child queue to its parent queue, the fulfilled reservation information in the assignment, including fulfilledReservation and fulfilledReservedContainer, is lost.
> When multi-nodes is enabled, this loss causes a problem: an allocation proposal is generated but can't be accepted, because FiCaSchedulerApp#commonCheckContainerAllocation checks the fulfilled reservation information. The result is an endless loop, and the resource of the node can't be used anymore.
> In HB-driven scheduling mode, a fulfilled reservation can be allocated via another calling stack: CapacityScheduler#allocateContainersToNode --> CapacityScheduler#allocateContainerOnSingleNode --> CapacityScheduler#allocateFromReservedContainer. In this way the assignment is generated by the leaf queue and submitted directly, which I think is why we hardly noticed this problem before.
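The loss described in this issue can be pictured as a field-by-field copy that forgets two fields. The sketch below is a deliberately simplified model, not the CapacityScheduler code or the attached patch: the `Assignment` class here is a hypothetical stand-in, the `resource` field is invented for illustration, and only the names fulfilledReservation and fulfilledReservedContainer come from the issue text.

```java
/** Simplified, hypothetical model of passing an assignment from child to parent queue. */
final class AssignmentTransferSketch {
  private AssignmentTransferSketch() {}

  /** Stand-in for the scheduler's assignment object; field types are simplified. */
  static final class Assignment {
    String resource;                    // illustrative payload, not a real field name
    boolean fulfilledReservation;       // named in the issue description
    String fulfilledReservedContainer;  // named in the issue; a String stands in for the container
  }

  /** Copies the child's result into the parent's assignment, including reservation state. */
  static void transfer(Assignment child, Assignment parent) {
    parent.resource = child.resource;
    // These two copies are the step the issue reports as missing: without them
    // the parent's assignment no longer says that a reservation was fulfilled.
    parent.fulfilledReservation = child.fulfilledReservation;
    parent.fulfilledReservedContainer = child.fulfilledReservedContainer;
  }
}
```

If the last two assignments in `transfer` are dropped, any later check that relies on the fulfilled reservation fields sees default values, which matches the rejected-proposal symptom reported above.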
[jira] [Commented] (YARN-8693) Add signalToContainer REST API for RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848601#comment-16848601 ]

Hadoop QA commented on YARN-8693:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} YARN-8693 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8693 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969831/YARN-8693.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24151/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Add signalToContainer REST API for RMWebServices
> ------------------------------------------------
>
> Key: YARN-8693
> URL: https://issues.apache.org/jira/browse/YARN-8693
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: restapi
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-8693.001.patch
>
>
> Currently YARN has an RPC command, "yarn container -signal <container ID [signal command]>", to send the OUTPUT_THREAD_DUMP/GRACEFUL_SHUTDOWN/FORCEFUL_SHUTDOWN signal commands to a container.
> That is not enough; we need to add a signalToContainer REST API for better management by cluster administrators or management systems.
[jira] [Updated] (YARN-8693) Add signalToContainer REST API for RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8693:
---------------------------
Attachment: YARN-8693.001.patch

> Add signalToContainer REST API for RMWebServices
> ------------------------------------------------
>
> Key: YARN-8693
> URL: https://issues.apache.org/jira/browse/YARN-8693
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: restapi
> Affects Versions: 3.2.0
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-8693.001.patch
>
>
> Currently YARN has an RPC command, "yarn container -signal <container ID [signal command]>", to send the OUTPUT_THREAD_DUMP/GRACEFUL_SHUTDOWN/FORCEFUL_SHUTDOWN signal commands to a container.
> That is not enough; we need to add a signalToContainer REST API for better management by cluster administrators or management systems.
[jira] [Created] (YARN-9583) Failed job which is submitted unknown queue is showed all users
KWON BYUNGCHANG created YARN-9583:
-------------------------------------

Summary: Failed job which is submitted unknown queue is showed all users
Key: YARN-9583
URL: https://issues.apache.org/jira/browse/YARN-9583
Project: Hadoop YARN
Issue Type: Bug
Components: security
Affects Versions: 3.1.2
Reporter: KWON BYUNGCHANG

A failed job that was submitted to an unknown queue is shown to all users.
I attached an RM UI screenshot.

Reproduction scenario:
1. User foo submits a job to an unknown queue without view-acl, and the job fails.
2. User bar can access the job of user foo.

According to the comments in QueueACLsManager.java, the code that caused the problem, this situation can happen when the RM is restarted after a queue is deleted.
I think showing an app of a non-existent queue to all users after RM start is the problem.
It is a potential security hole.
I fixed it a little bit.
After the fix, both the job owner and the YARN admin can access a job that was submitted to an unknown queue.
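The behaviour the reporter describes after the fix (only the job owner and the YARN admin may see a job whose queue no longer exists) can be written as a small decision function. This is a sketch of that rule only, under stated assumptions: it is not QueueACLsManager and not the attached patch, and the class `UnknownQueueAccessSketch`, its parameters, and the precomputed `queueAclAllows` flag are all hypothetical.

```java
import java.util.Set;

/** Hypothetical illustration of the access rule described in YARN-9583. */
final class UnknownQueueAccessSketch {
  private UnknownQueueAccessSketch() {}

  /**
   * Decides whether caller may view an application.
   *
   * queueExists is false when the app's queue is unknown to the scheduler,
   * for example after the queue was deleted and the RM restarted.
   * queueAclAllows is the assumed result of the normal queue ACL check.
   */
  static boolean canView(String caller, String appOwner, Set<String> yarnAdmins,
      boolean queueExists, boolean queueAclAllows) {
    // The owner and YARN admins can always see the application.
    if (caller.equals(appOwner) || yarnAdmins.contains(caller)) {
      return true;
    }
    // Unknown queue: deny everyone else instead of exposing the app to all users.
    if (!queueExists) {
      return false;
    }
    // Known queue: fall back to the ordinary queue ACL decision.
    return queueAclAllows;
  }
}
```

In the reproduction scenario above, user bar is neither the owner nor an admin, so with `queueExists == false` the function returns false for bar and true for foo.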
[jira] [Commented] (YARN-9497) Support grouping by diagnostics for query results of scheduler and app activities
[ https://issues.apache.org/jira/browse/YARN-9497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848575#comment-16848575 ]

Tao Yang commented on YARN-9497:
--------------------------------

Thanks [~cheersyang] for the review and commit!

> Support grouping by diagnostics for query results of scheduler and app activities
> ---------------------------------------------------------------------------------
>
> Key: YARN-9497
> URL: https://issues.apache.org/jira/browse/YARN-9497
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9497.001.patch, YARN-9497.002.patch, YARN-9497.003.patch
>
>
> [Design Doc #4.3|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.6fbpge17dmmr]
[jira] [Commented] (YARN-9581) WebAppUtils#getRMWebAppURLWithScheme ignores rm2
[ https://issues.apache.org/jira/browse/YARN-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848462#comment-16848462 ]

Hadoop QA commented on YARN-9581:
---------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 33s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 0s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 1s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 26m 38s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 3s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9581 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969819/YARN-9581-002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d6ad96ffc6e1 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 37900c5 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24150/testReport/ |
| Max. process+thread count | 688 (vs. ulimit of 1) |
| modules | C:
[jira] [Commented] (YARN-9497) Support grouping by diagnostics for query results of scheduler and app activities
[ https://issues.apache.org/jira/browse/YARN-9497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848447#comment-16848447 ]

Hudson commented on YARN-9497:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16606 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16606/])
YARN-9497. Support grouping by diagnostics for query results of (wwei: rev 9f056d905f3d21faf0dc9bd42e14ea61313ee9e8)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/RouterWebServices.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/PassThroughRESTRequestInterceptor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/TestActivitiesManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/BaseRouterWebServicesTest.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivityNode.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWSConsts.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppActivitiesInfo.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivitiesUtils.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/NodeAllocationInfo.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/FederationInterceptorREST.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ActivityNodeInfo.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServiceProtocol.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppRequestAllocationInfo.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesSchedulerActivitiesWithMultiNodesEnabled.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/DefaultRequestInterceptorREST.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/MockRESTRequestInterceptor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivitiesManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppAllocationInfo.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/ActivitiesInfo.java

> Support grouping by diagnostics for query results of scheduler and app activities
> ---------------------------------------------------------------------------------
>
> Key: YARN-9497
> URL: https://issues.apache.org/jira/browse/YARN-9497
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9497.001.patch, YARN-9497.002.patch, YARN-9497.003.patch
>
>
> [Design Doc #4.3|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.6fbpge17dmmr]
[jira] [Commented] (YARN-9497) Support grouping by diagnostics for query results of scheduler and app activities
[ https://issues.apache.org/jira/browse/YARN-9497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848446#comment-16848446 ]

Weiwei Yang commented on YARN-9497:
-----------------------------------

Thanks [~Tao Yang], latest patch looks good to me. Committing now.

> Support grouping by diagnostics for query results of scheduler and app activities
> ---------------------------------------------------------------------------------
>
> Key: YARN-9497
> URL: https://issues.apache.org/jira/browse/YARN-9497
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-9497.001.patch, YARN-9497.002.patch, YARN-9497.003.patch
>
>
> [Design Doc #4.3|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.6fbpge17dmmr]
[jira] [Commented] (YARN-7494) Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848445#comment-16848445 ]

Weiwei Yang commented on YARN-7494:
-----------------------------------

Thanks [~jutia] for the comments.
I agree the current ResourceUsageMultiNodeLookupPolicy might be too naive. I had a similar discussion before with [~sunilg] and [~Tao Yang]. We should partition the sorted nodes by ranges of scores and, within each range, shuffle the nodes so that random nodes whose scores fall in the same range have a chance to be picked by the allocation thread. We need to continue improving these policies.

> Add muti-node lookup mechanism and pluggable nodes sorting policies to optimize placement decision
> --------------------------------------------------------------------------------------------------
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler
> Reporter: Sunil Govindan
> Assignee: Sunil Govindan
> Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch, YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, YARN-7494.18.patch, YARN-7494.19.patch, YARN-7494.20.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of a single node, for effectiveness we can consider a multi-node lookup based on partition to start with.
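The idea in the comment above (sort nodes, cut the sorted list into score ranges, shuffle inside each range) can be sketched in a few lines. This is an illustration of that idea only, not ResourceUsageMultiNodeLookupPolicy or any committed policy. The `Node` class, the integer `score`, the fixed `bucketWidth`, and the assumption that a lower score is preferred are all hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of "shuffle within score ranges" node ordering. */
final class BucketedShuffleSketch {
  private BucketedShuffleSketch() {}

  /** Minimal stand-in for a scheduler node with a precomputed score. */
  static final class Node {
    final String id;
    final int score;

    Node(String id, int score) {
      this.id = id;
      this.score = score;
    }
  }

  /**
   * Orders nodes by ascending score range (assuming lower is preferred) and
   * randomizes the order of nodes whose scores fall into the same range.
   */
  static List<Node> order(List<Node> nodes, int bucketWidth, Random rng) {
    if (bucketWidth <= 0) {
      throw new IllegalArgumentException("bucketWidth must be positive");
    }
    List<Node> sorted = new ArrayList<>(nodes);
    sorted.sort(Comparator.comparingInt(n -> n.score));

    List<Node> result = new ArrayList<>(sorted.size());
    int start = 0;
    while (start < sorted.size()) {
      // All nodes with the same floor(score / bucketWidth) share one range.
      int bucket = Math.floorDiv(sorted.get(start).score, bucketWidth);
      int end = start;
      while (end < sorted.size()
          && Math.floorDiv(sorted.get(end).score, bucketWidth) == bucket) {
        end++;
      }
      List<Node> group = new ArrayList<>(sorted.subList(start, end));
      Collections.shuffle(group, rng); // give every node in the range a chance to go first
      result.addAll(group);
      start = end;
    }
    return result;
  }
}
```

Ranges stay in score order, so a clearly better node is still tried before a clearly worse one; only near-equal nodes trade places, which avoids every allocation thread piling onto the single top-ranked node.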
[jira] [Updated] (YARN-9581) WebAppUtils#getRMWebAppURLWithScheme ignores rm2
[ https://issues.apache.org/jira/browse/YARN-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prabhu Joseph updated YARN-9581:
--------------------------------
Attachment: YARN-9581-002.patch

> WebAppUtils#getRMWebAppURLWithScheme ignores rm2
> ------------------------------------------------
>
> Key: YARN-9581
> URL: https://issues.apache.org/jira/browse/YARN-9581
> Project: Hadoop YARN
> Issue Type: Bug
> Components: client
> Affects Versions: 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Attachments: YARN-9581-001.patch, YARN-9581-002.patch
>
>
> Yarn Logs fails for a running job under RM HA when rm2 is active and rm1 is down.
> {code}
> hrt_qa@prabhuYarn:~> /usr/hdp/current/hadoop-yarn-client/bin/yarn logs -applicationId application_1558613472348_0004 -am 1
> 19/05/24 18:04:49 INFO client.AHSProxy: Connecting to Application History server at prabhuYarn/172.27.23.55:10200
> 19/05/24 18:04:50 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
> Unable to get AM container informations for the application:application_1558613472348_0004
> java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Error while authenticating with endpoint: https://prabhuYarn:8090/ws/v1/cluster/apps/application_1558613472348_0004/appattempts
> Can not get AMContainers logs for the application:application_1558613472348_0004 with the appOwner:hrt_qa
> {code}
> LogsCli getRMWebAppURLWithoutScheme only checks the first entry in the RM list yarn.resourcemanager.ha.rm-ids.
> {code}
> yarnConfig.set(YarnConfiguration.RM_HA_ID, rmIds.get(0));
> {code}
> SchedConfCli also fails:
> {code}
> [ambari-qa@pjosephdocker-3 ~]$ yarn schedulerconf -update root.default:maximum-capacity=90
> Exception in thread "main" com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused (Connection refused)
> at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:155)
> at com.sun.jersey.api.client.Client.handle(Client.java:652)
> at com.sun.jersey.api.client.WebResource.handle(WebResource.java:682)
> {code}