[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers
[ https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593965#comment-15593965 ]

Hudson commented on YARN-5047:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10652 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10652/])
YARN-5047. Refactor nodeUpdate across schedulers. (Ray Chiang via kasha) (kasha: rev 754cb4e30fac1c5fe8d44626968c0ddbfe459335)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java

> Refactor nodeUpdate across schedulers
> -------------------------------------
>
>                 Key: YARN-5047
>                 URL: https://issues.apache.org/jira/browse/YARN-5047
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler, fairscheduler, scheduler
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>             Fix For: 2.9.0
>
>         Attachments: YARN-5047.001.patch, YARN-5047.002.patch, YARN-5047.003.patch, YARN-5047.004.patch, YARN-5047.005.patch, YARN-5047.006.patch, YARN-5047.007.patch, YARN-5047.008.patch, YARN-5047.009.patch, YARN-5047.010.patch, YARN-5047.011.patch, YARN-5047.012.patch
>
> FairScheduler#nodeUpdate() and CapacityScheduler#nodeUpdate() have a lot of commonality in their code. See about refactoring the common parts into AbstractYarnScheduler.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
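The refactoring described in YARN-5047 above is essentially the template-method pattern: shared nodeUpdate() bookkeeping moves into the abstract base class, and each scheduler overrides only its scheduling hook. A minimal sketch of that idea, with hypothetical class and field names (not the actual Hadoop code):

```java
// Hypothetical sketch of the YARN-5047 refactoring idea: common
// nodeUpdate() bookkeeping in the base class, scheduler-specific
// work in an abstract hook. Names are illustrative, not Hadoop's.
abstract class BaseScheduler {
    protected int nodeUpdateCount = 0;

    // Shared path: every scheduler runs the same bookkeeping, then
    // delegates to its own allocation attempt.
    public final void nodeUpdate(String nodeId) {
        nodeUpdateCount++;          // common accounting
        attemptScheduling(nodeId);  // scheduler-specific hook
    }

    protected abstract void attemptScheduling(String nodeId);
}

class FairLikeScheduler extends BaseScheduler {
    String lastNode;

    @Override
    protected void attemptScheduling(String nodeId) {
        lastNode = nodeId;          // stand-in for fair-share logic
    }
}
```

With this shape, FairScheduler, CapacityScheduler, and FifoScheduler would share one nodeUpdate() implementation instead of three near-copies.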
[jira] [Comment Edited] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593945#comment-15593945 ]

Arun Suresh edited comment on YARN-4597 at 10/21/16 4:31 AM:
-------------------------------------------------------------

[~jianhe], thanks again for taking a look.

bq. I think there might be some behavior change or bug for scheduling guaranteed containers when the opportunistic-queue is enabled. Previously, when launching a container, the NM would not check current vmem and cpu usage. It assumed what the RM allocated could be launched. Now, the NM will check these limits and won't launch the container if it hits the limit.

Yup, we do a *hasResources* check only at the start of a container and when a container is killed. We assumed that the resources requested by a container are constant; essentially, we considered only the actual *allocated* resources, which we assume will not vary during the lifetime of the container. This implies there is no point in checking at any time other than the start and kill of containers. But like you stated, if we consider container resource *utilization*, based on the work [~kasha] is doing in YARN-1011, then yes, we should have a timer thread that periodically checks vmem and cpu usage and starts (and kills) containers based on that.

bq. the ResourceUtilizationManager looks like it only incorporated some utility methods; not sure how we will make this pluggable later.

Following on my point above, the idea was to have a {{ResourceUtilizationManager}} that can provide different values for {{getCurrentUtilization}}, {{addResource}} and {{subtractResource}}, which are used by the ContainerScheduler to calculate the resources to free up. For instance, the current default one only takes into account the actual resources *allocated* to containers; for YARN-1011, we might replace that with the resources *utilized* by running containers and provide a different value for {{getCurrentUtilization}}. The timer thread I mentioned in the previous point, which can be a part of this new ResourceUtilizationManager, can send events to the scheduler to re-process queued containers when utilization has changed.

bq. The logic to select opportunistic containers: we may kill more opportunistic containers than required. e.g...

Good catch. In {{resourcesToFreeUp}}, I needed to decrement any already-marked-for-kill opportunistic container. It was there earlier; I had removed it while testing something but forgot to put it back :)

bq. we don't need to synchronize on the currentUtilization object? I don't see any other place it's synchronized

Yup, it isn't required. Varun did point out the same; I thought I had fixed it, but I think I might have missed 'git add'ing the change. W.r.t. adding the new transitions, I was seeing some error messages in some test cases. I will rerun and see if they are required, but in any case, having them there should be harmless, right?

The rest of your comments make sense; I will address them shortly.
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593945#comment-15593945 ]

Arun Suresh commented on YARN-4597:
-----------------------------------

[~jianhe], thanks again for taking a look.

bq. I think there might be some behavior change or bug for scheduling guaranteed containers when the opportunistic-queue is enabled. Previously, when launching a container, the NM would not check current vmem and cpu usage. It assumed what the RM allocated could be launched. Now, the NM will check these limits and won't launch the container if it hits the limit.

Yup, we do a *hasResources* check only at the start of a container and when a container is killed. We assumed that the resources requested by a container are constant; essentially, we considered only the actual *allocated* resources, which we assume will not vary during the lifetime of the container. This implies there is no point in checking at any time other than the start and kill of containers. But like you stated, if we consider container resource *utilization*, based on the work [~kasha] is doing in YARN-1011, then yes, we should have a timer thread that periodically checks vmem and cpu usage and starts (and kills) containers based on that.

bq. the ResourceUtilizationManager looks like it only incorporated some utility methods; not sure how we will make this pluggable later.

Following on my point above, the idea was to have a {{ResourceUtilizationManager}} that can provide different values for {{getCurrentUtilization}}, {{addResource}} and {{subtractResource}}, which are used by the ContainerScheduler to calculate the resources to free up. For instance, the current default one only takes into account the actual resources *allocated* to containers; for YARN-1011, we might replace that with the resources *utilized* by running containers and provide a different value for {{getCurrentUtilization}}. The timer thread I mentioned in the previous point, which can be a part of this new ResourceUtilizationManager, can send events to the scheduler to re-process queued containers when utilization has changed.

bq. The logic to select opportunistic containers: we may kill more opportunistic containers than required. e.g...

Good catch. In {{resourcesToFreeUp}}, I needed to decrement any already-marked-for-kill opportunistic container. It was there earlier; I had removed it while testing something but forgot to put it back :)

bq. we don't need to synchronize on the currentUtilization object? I don't see any other place it's synchronized

Yup, it isn't required. Varun did point out the same; I thought I had fixed it, but I think I might have missed 'git add'ing the change. W.r.t. adding the new transitions, I was seeing some error messages in some test cases. I will rerun and see if they are required, but in any case, having them there should be harmless, right?

The rest of your comments make sense; I will address them shortly.

> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
>                 Key: YARN-4597
>                 URL: https://issues.apache.org/jira/browse/YARN-4597
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Chris Douglas
>            Assignee: Arun Suresh
>         Attachments: YARN-4597.001.patch, YARN-4597.002.patch, YARN-4597.003.patch
>
> Currently, the NM immediately launches containers after resource localization. Several features could be more cleanly implemented if the NM included a separate stage for reserving resources.
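The pluggability discussed in the comment above can be pictured as a small interface whose default implementation tracks *allocated* resources, with a utilization-based variant swappable in for YARN-1011. This is an illustrative sketch only; the interface and class names are hypothetical, not the actual patch:

```java
// Hypothetical sketch of a pluggable utilization source for the NM's
// ContainerScheduler. The default implementation tracks *allocated*
// memory; a YARN-1011-style variant could instead report memory
// actually *utilized* by running containers.
interface UtilizationSource {
    void addResource(long memMb);
    void subtractResource(long memMb);
    long getCurrentUtilization();
}

// Default: utilization == sum of allocations of running containers.
class AllocationBasedSource implements UtilizationSource {
    private long allocatedMb = 0;

    public void addResource(long memMb)      { allocatedMb += memMb; }
    public void subtractResource(long memMb) { allocatedMb -= memMb; }
    public long getCurrentUtilization()      { return allocatedMb; }
}
```

The scheduler would call only the interface, so swapping in a utilization-based source (plus the timer thread mentioned above) would not change the freeing-up logic itself.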
[jira] [Commented] (YARN-4911) Bad placement policy in FairScheduler causes the RM to crash
[ https://issues.apache.org/jira/browse/YARN-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593930#comment-15593930 ]

Hudson commented on YARN-4911:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10651 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10651/])
YARN-4911. Bad placement policy in FairScheduler causes the RM to crash (kasha: rev a064865abf7dceee46d3c42eca67a04a25af9d4e)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueuePlacementPolicy.java

> Bad placement policy in FairScheduler causes the RM to crash
> ------------------------------------------------------------
>
>                 Key: YARN-4911
>                 URL: https://issues.apache.org/jira/browse/YARN-4911
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>              Labels: supportability
>             Fix For: 2.9.0
>
>         Attachments: YARN-4911.001.patch, YARN-4911.002.patch, YARN-4911.003.patch, YARN-4911.004.patch
>
> When you have a fair-scheduler.xml with the rule:
>
>
>
> and the queue okay1 doesn't exist, the following exception occurs in the RM:
> 2016-04-01 16:56:33,383 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ADDED to the scheduler
> java.lang.IllegalStateException: Should have applied a rule before reaching here
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementPolicy.assignAppToQueue(QueuePlacementPolicy.java:173)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:728)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:634)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1224)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691)
>         at java.lang.Thread.run(Thread.java:745)
> which causes the RM to crash.
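The stack trace above shows the failure mode: the placement policy runs out of rules without making a decision, and the resulting IllegalStateException escapes on the scheduler's event-handling thread, taking down the RM. One direction for a fix is to reject such a policy when the configuration is loaded rather than when an application arrives. A hypothetical, simplified sketch of that validation (illustrative names, not the actual YARN-4911 patch):

```java
import java.util.List;

// Hypothetical sketch: validate at config-load time that a queue
// placement policy ends in a terminal rule (one that always produces
// a decision), so a bad policy fails the config refresh instead of
// crashing the RM's event dispatcher when an app is added.
class PlacementRule {
    final String name;
    final boolean terminal;  // e.g. a "default"-style catch-all rule

    PlacementRule(String name, boolean terminal) {
        this.name = name;
        this.terminal = terminal;
    }
}

class PlacementPolicyValidator {
    static void validate(List<PlacementRule> rules) {
        if (rules.isEmpty() || !rules.get(rules.size() - 1).terminal) {
            throw new IllegalArgumentException(
                "Placement policy must end with a terminal rule");
        }
    }
}
```

Failing fast at refresh time turns a fatal runtime crash into an admin-visible configuration error.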
[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers
[ https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593913#comment-15593913 ]

Karthik Kambatla commented on YARN-5047:
----------------------------------------

The reported checkstyle issues are benign, and the test failures are flaky: the last two builds have two different test failures. +1. Checking this in.

> Refactor nodeUpdate across schedulers
> -------------------------------------
>
>                 Key: YARN-5047
>                 URL: https://issues.apache.org/jira/browse/YARN-5047
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler, fairscheduler, scheduler
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>         Attachments: YARN-5047.001.patch, YARN-5047.002.patch, YARN-5047.003.patch, YARN-5047.004.patch, YARN-5047.005.patch, YARN-5047.006.patch, YARN-5047.007.patch, YARN-5047.008.patch, YARN-5047.009.patch, YARN-5047.010.patch, YARN-5047.011.patch, YARN-5047.012.patch
>
> FairScheduler#nodeUpdate() and CapacityScheduler#nodeUpdate() have a lot of commonality in their code. See about refactoring the common parts into AbstractYarnScheduler.
[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers
[ https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593886#comment-15593886 ]

Hadoop QA commented on YARN-5047:
---------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 7m 12s | trunk passed |
| +1 | compile | 0m 32s | trunk passed |
| +1 | checkstyle | 0m 28s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| +1 | findbugs | 1m 3s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed |
| +1 | mvninstall | 0m 32s | the patch passed |
| +1 | compile | 0m 30s | the patch passed |
| +1 | javac | 0m 30s | the patch passed |
| -1 | checkstyle | 0m 25s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 622 unchanged - 5 fixed = 624 total (was 627) |
| +1 | mvnsite | 0m 37s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 4s | the patch passed |
| +1 | javadoc | 0m 17s | the patch passed |
| -1 | unit | 35m 5s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 50m 25s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832210/YARN-5047.012.patch |
| JIRA Issue | YARN-5047 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux f119cf665123 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d7d87de |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13463/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/13463/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13463/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13463/testReport/ |
| modules | C:
[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers
[ https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593888#comment-15593888 ]

Hadoop QA commented on YARN-5047:
---------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 22s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 54s | trunk passed |
| +1 | compile | 0m 43s | trunk passed |
| +1 | checkstyle | 0m 34s | trunk passed |
| +1 | mvnsite | 0m 47s | trunk passed |
| +1 | mvneclipse | 0m 20s | trunk passed |
| +1 | findbugs | 1m 11s | trunk passed |
| +1 | javadoc | 0m 28s | trunk passed |
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 0m 39s | the patch passed |
| +1 | javac | 0m 39s | the patch passed |
| -1 | checkstyle | 0m 31s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 622 unchanged - 5 fixed = 624 total (was 627) |
| +1 | mvnsite | 0m 46s | the patch passed |
| +1 | mvneclipse | 0m 19s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 20s | the patch passed |
| +1 | javadoc | 0m 24s | the patch passed |
| -1 | unit | 39m 16s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 58m 21s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMAdminService |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832210/YARN-5047.012.patch |
| JIRA Issue | YARN-5047 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux e46ce9b94fe2 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d7d87de |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13462/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/13462/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13462/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13462/testReport/ |
| modules | C:
[jira] [Commented] (YARN-2009) Priority support for preemption in ProportionalCapacityPreemptionPolicy
[ https://issues.apache.org/jira/browse/YARN-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593887#comment-15593887 ]

Sunil G commented on YARN-2009:
-------------------------------

Thanks [~eepayne] for sharing the detailed test scenario. I also tested manually, but the test code I posted was from my unit test. I will try to mock the case you mentioned and will update with the cause. Thank you.

> Priority support for preemption in ProportionalCapacityPreemptionPolicy
> -----------------------------------------------------------------------
>
>                 Key: YARN-2009
>                 URL: https://issues.apache.org/jira/browse/YARN-2009
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Devaraj K
>            Assignee: Sunil G
>         Attachments: YARN-2009.0001.patch, YARN-2009.0002.patch, YARN-2009.0003.patch, YARN-2009.0004.patch, YARN-2009.0005.patch, YARN-2009.0006.patch, YARN-2009.0007.patch, YARN-2009.0008.patch, YARN-2009.0009.patch, YARN-2009.0010.patch, YARN-2009.0011.patch
>
> While preempting containers based on the queue ideal assignment, we may need to consider preempting the low-priority application containers first.
[jira] [Commented] (YARN-4911) Bad placement policy in FairScheduler causes the RM to crash
[ https://issues.apache.org/jira/browse/YARN-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593882#comment-15593882 ]

Karthik Kambatla commented on YARN-4911:
----------------------------------------

+1. Checking this in.

> Bad placement policy in FairScheduler causes the RM to crash
> ------------------------------------------------------------
>
>                 Key: YARN-4911
>                 URL: https://issues.apache.org/jira/browse/YARN-4911
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>              Labels: supportability
>         Attachments: YARN-4911.001.patch, YARN-4911.002.patch, YARN-4911.003.patch, YARN-4911.004.patch
>
> When you have a fair-scheduler.xml with the rule:
>
>
>
> and the queue okay1 doesn't exist, the following exception occurs in the RM:
> 2016-04-01 16:56:33,383 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ADDED to the scheduler
> java.lang.IllegalStateException: Should have applied a rule before reaching here
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementPolicy.assignAppToQueue(QueuePlacementPolicy.java:173)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:728)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:634)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1224)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691)
>         at java.lang.Thread.run(Thread.java:745)
> which causes the RM to crash.
[jira] [Commented] (YARN-5388) MAPREDUCE-6719 requires changes to DockerContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593878#comment-15593878 ] Karthik Kambatla commented on YARN-5388: +1 on the trunk patch. For the branch-2 patch, I am not sure we should make the code improvements in DockerContainerExecutor.java. > MAPREDUCE-6719 requires changes to DockerContainerExecutor > -- > > Key: YARN-5388 > URL: https://issues.apache.org/jira/browse/YARN-5388 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-5388.001.patch, YARN-5388.002.patch, > YARN-5388.003.patch, YARN-5388.branch-2.001.patch, > YARN-5388.branch-2.002.patch, YARN-5388.branch-2.003.patch > > > Because the {{DockerContainerExecutor}} overrides the {{writeLaunchEnv()}} > method, it must also have the wildcard processing logic from > YARN-4958/YARN-5373 added to it. Without it, the use of -libjars will fail > unless wildcarding is disabled.
[jira] [Commented] (YARN-5724) [Umbrella] Better Queue Management in YARN
[ https://issues.apache.org/jira/browse/YARN-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593753#comment-15593753 ] Karthik Kambatla commented on YARN-5724: [~xgong] - is this proposal specific to CapacityScheduler? Or, are you suggesting common changes that all schedulers could benefit from? > [Umbrella] Better Queue Management in YARN > -- > > Key: YARN-5724 > URL: https://issues.apache.org/jira/browse/YARN-5724 > Project: Hadoop YARN > Issue Type: Task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: Designdocv1-Configuration-basedQueueManagementinYARN.pdf > > > This serves as an umbrella ticket for tasks related to better queue > management in YARN. > Today, the only way to manage queues is for admins to edit > configuration files and then issue a refresh command. This brings many > inconveniences. For example, users cannot create/delete/modify their > own queues without talking to site-level admins. > Even in today's configuration-based approach, several places still > need improvement: > * It is possible today to add or modify queues without restarting the RM, > via a CS refresh. But to delete a queue, we have to restart the > ResourceManager. > * When a queue is STOPPED, resources allocated to the queue can be handled > better. Currently, they'll only be used if the other queues are set up to go > over their capacity.
[jira] [Commented] (YARN-5711) AM cannot reconnect to RM after failover when using RequestHedgingRMFailoverProxyProvider
[ https://issues.apache.org/jira/browse/YARN-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593657#comment-15593657 ] Hadoop QA commented on YARN-5711: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s {color} | {color:red} YARN-5711 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834588/YARN-5711-v1.patch | | JIRA Issue | YARN-5711 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13461/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > AM cannot reconnect to RM after failover when using > RequestHedgingRMFailoverProxyProvider > - > > Key: YARN-5711 > URL: https://issues.apache.org/jira/browse/YARN-5711 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, resourcemanager >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Subru Krishnan >Assignee: Subru Krishnan >Priority: Critical > Attachments: YARN-5711-v1.patch > > > When the RM fails over, it does _not_ automatically re-register running apps, so they > need to re-register when reconnecting to the new primary. This is done by > catching {{ApplicationMasterNotRegisteredException}} in *allocate* calls and > re-registering. But *RequestHedgingRMFailoverProxyProvider* does _not_ > propagate {{YarnException}}, as the actual invocation is done asynchronously > using separate threads, so AMs cannot reconnect to the RM after failover. > This JIRA proposes that the *RequestHedgingRMFailoverProxyProvider* propagate > any {{YarnException}} that it encounters.
[jira] [Comment Edited] (YARN-5711) AM cannot reconnect to RM after failover when using RequestHedgingRMFailoverProxyProvider
[ https://issues.apache.org/jira/browse/YARN-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593646#comment-15593646 ] Subru Krishnan edited comment on YARN-5711 at 10/21/16 1:42 AM: Attaching a patch that returns any exception encountered with the active RM, as discussed offline with [~jianhe]. Thanks to [~ellenfkh] for extensively testing this out in our cluster. FYI, there are some formatting fixes in *RequestHedgingRMFailoverProxyProvider*, as it seems to follow the IntelliJ formatter rather than the standard Hadoop one. was (Author: subru): Attaching a patch that returns any exception encountered with the active RM as discussed offline with [~jianhe]. Thanks to [~ellenfkh] for extensively testing this out in our cluster. > AM cannot reconnect to RM after failover when using > RequestHedgingRMFailoverProxyProvider > - > > Key: YARN-5711 > URL: https://issues.apache.org/jira/browse/YARN-5711 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, resourcemanager >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Subru Krishnan >Assignee: Subru Krishnan >Priority: Critical > Attachments: YARN-5711-v1.patch > > > When the RM fails over, it does _not_ automatically re-register running apps, so they > need to re-register when reconnecting to the new primary. This is done by > catching {{ApplicationMasterNotRegisteredException}} in *allocate* calls and > re-registering. But *RequestHedgingRMFailoverProxyProvider* does _not_ > propagate {{YarnException}}, as the actual invocation is done asynchronously > using separate threads, so AMs cannot reconnect to the RM after failover. > This JIRA proposes that the *RequestHedgingRMFailoverProxyProvider* propagate > any {{YarnException}} that it encounters.
[jira] [Updated] (YARN-5711) AM cannot reconnect to RM after failover when using RequestHedgingRMFailoverProxyProvider
[ https://issues.apache.org/jira/browse/YARN-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-5711: - Attachment: YARN-5711-v1.patch Attaching a patch that returns any exception encountered with the active RM, as discussed offline with [~jianhe]. Thanks to [~ellenfkh] for extensively testing this out in our cluster. > AM cannot reconnect to RM after failover when using > RequestHedgingRMFailoverProxyProvider > - > > Key: YARN-5711 > URL: https://issues.apache.org/jira/browse/YARN-5711 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, resourcemanager >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Subru Krishnan >Assignee: Subru Krishnan >Priority: Critical > Attachments: YARN-5711-v1.patch > > > When the RM fails over, it does _not_ automatically re-register running apps, so they > need to re-register when reconnecting to the new primary. This is done by > catching {{ApplicationMasterNotRegisteredException}} in *allocate* calls and > re-registering. But *RequestHedgingRMFailoverProxyProvider* does _not_ > propagate {{YarnException}}, as the actual invocation is done asynchronously > using separate threads, so AMs cannot reconnect to the RM after failover. > This JIRA proposes that the *RequestHedgingRMFailoverProxyProvider* propagate > any {{YarnException}} that it encounters.
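The propagation problem described in YARN-5711 comes from hedged calls being run on separate threads, so a checked `YarnException` arrives wrapped in an `ExecutionException` rather than reaching the caller. The idea of the fix can be sketched as follows; the `YarnException` class and `invokeHedged` method here are stand-ins, not the real Hadoop types:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class HedgingSketch {
  // Stand-in for org.apache.hadoop.yarn.exceptions.YarnException.
  static class YarnException extends Exception {
    YarnException(String m) { super(m); }
  }

  // Run the call on a worker thread, then unwrap and rethrow the original
  // checked exception so callers (e.g. an AM catching
  // ApplicationMasterNotRegisteredException) can react and re-register.
  static Object invokeHedged(Callable<Object> call, ExecutorService pool)
      throws YarnException {
    Future<Object> f = pool.submit(call);
    try {
      return f.get();
    } catch (ExecutionException e) {
      if (e.getCause() instanceof YarnException) {
        throw (YarnException) e.getCause(); // propagate, don't swallow
      }
      throw new RuntimeException(e.getCause());
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(ie);
    }
  }
}
```

Without the unwrapping step, the caller only ever sees a generic wrapper exception, which is exactly why the AM's re-registration logic never triggered after failover.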
[jira] [Updated] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5725: - Attachment: YARN-5725.002.patch [~templedf] I agree. I added the necessary mock containers, so that we have both solutions now. > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-5725.000.patch, YARN-5725.001.patch, > YARN-5725.002.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > The issue is a warning, but it prevents the container monitor from continuing: > 2016-10-12 14:38:23,280 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(594)) - > Uncaught exception in ContainersMonitorImpl while monitoring resource of > container_123456_0001_01_01 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:455) > 2016-10-12 14:38:23,281 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(613)) - > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl > is interrupted. Exiting.
[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.
[ https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593578#comment-15593578 ] Vinod Kumar Vavilapalli commented on YARN-5611: --- HADOOP-11552 may be an easy out as long as we agree it makes it into the next release, where the app-timeouts and update-priorities features exist. Otherwise, we risk regressing on high availability. Let's make sure this is a blocker. OTOH, even assuming HADOOP-11552, there is one API concern I have. If multiple users start trying to update the timeout for a single application, the behavior from an individual user's point of view is arbitrary. Both users' clients block, and after the call returns, the timeout may be set to the value they asked for or to what the other user asked for. To avoid this, the API should be something like {{targetTimeoutFromNow(long absoluteTimeoutValueAtRequest, targetTimeout)}} so that the server can reject requests if the current recorded value has changed by the time the request is accepted. Essentially this is {{Read -> CompareAndSetOrFail}}. > Provide an API to update lifetime of an application. > > > Key: YARN-5611 > URL: https://issues.apache.org/jira/browse/YARN-5611 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-5611.patch, 0002-YARN-5611.patch, > 0003-YARN-5611.patch, YARN-5611.v0.patch > > > YARN-4205 monitors the lifetime of an application if required. > Add a client API to update the lifetime of an application.
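The {{Read -> CompareAndSetOrFail}} semantics proposed in the comment above can be sketched with a compare-and-set on the recorded timeout. The class and method names here are illustrative, not the actual YARN API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: a timeout update that the server rejects if the recorded value
// changed between the client's read and its update request.
public class TimeoutCas {
  private final AtomicLong timeoutAtMillis = new AtomicLong();

  TimeoutCas(long initial) { timeoutAtMillis.set(initial); }

  long read() { return timeoutAtMillis.get(); }

  // Succeeds only if the caller's view of the current timeout is still
  // accurate; a stale caller gets 'false' and must re-read and retry.
  boolean updateTimeout(long expectedCurrent, long target) {
    return timeoutAtMillis.compareAndSet(expectedCurrent, target);
  }
}
```

With this shape, when two users race to update the same application's timeout, exactly one wins and the other is told its view was stale, instead of both calls silently succeeding in arbitrary order.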
[jira] [Commented] (YARN-5750) YARN-4126 broke Oozie on unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593549#comment-15593549 ] Robert Kanter commented on YARN-5750: - This code goes back to 2012/2013. Looking at OOZIE-1148 and OOZIE-1159, Oozie originally had this hardcoded to "oozie mr token" before these two JIRAs, and it worked fine because the JT couldn't renew tokens anyway. These two JIRAs changed it so that Oozie would use the correct value in a secure cluster and the dummy value in a non-secure cluster. https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java#L609 I can't speak as to why Oozie was always getting a delegation token. Perhaps it's needed for impersonation? Have we verified that impersonation works without delegation tokens in a non-secure cluster? > YARN-4126 broke Oozie on unsecure cluster > - > > Key: YARN-5750 > URL: https://issues.apache.org/jira/browse/YARN-5750 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Peter Cseh > > Oozie is using a DummyRenewer on unsecure clusters and can't submit workflows > on an unsecure cluster after YARN-4126. 
> {noformat}
> org.apache.oozie.action.ActionExecutorException: JA009: org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: Delegation Token can be issued only with kerberos authentication
> at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1092)
> at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getDelegationToken(ApplicationClientProtocolPBServiceImpl.java:335)
> at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:663)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2419)
> Caused by: java.io.IOException: Delegation Token can be issued only with kerberos authentication
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1065)
> ... 10 more
> at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:457)
> at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:437)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1128)
> at org.apache.oozie.action.hadoop.TestJavaActionExecutor.submitAction(TestJavaActionExecutor.java:343)
> at org.apache.oozie.action.hadoop.TestJavaActionExecutor.submitAction(TestJavaActionExecutor.java:363)
> at org.apache.oozie.action.hadoop.TestJavaActionExecutor.testKill(TestJavaActionExecutor.java:602)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at junit.framework.TestCase.runTest(TestCase.java:168)
> at junit.framework.TestCase.runBare(TestCase.java:134)
> at junit.framework.TestResult$1.protect(TestResult.java:110)
> at junit.framework.TestResult.runProtected(TestResult.java:128)
> at junit.framework.TestResult.run(TestResult.java:113)
> at junit.framework.TestCase.run(TestCase.java:124)
> at junit.framework.TestSuite.runTest(TestSuite.java:232)
> at junit.framework.TestSuite.run(TestSuite.java:227)
> at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at org.junit.runners.Suite.runChild(Suite.java:128)
> at org.junit.runners.Suite.runChild(Suite.java:24)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
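The failure above is the RM refusing to issue a delegation token on a non-Kerberos cluster. One direction the discussion points at is guarding the token fetch on the client side; in real Hadoop code that check is `UserGroupInformation.isSecurityEnabled()`, but the sketch below uses stand-in types so it is self-contained and illustrative only:

```java
public class TokenGuard {
  // Stand-in for the YARN client protocol's token call.
  interface RmClient {
    String getDelegationToken(String renewer);
  }

  // Only ask the RM for a delegation token when security (Kerberos) is on;
  // on a non-secure cluster the RM rejects the request with an IOException.
  static String maybeFetchToken(boolean securityEnabled, RmClient rm,
                                String renewer) {
    if (!securityEnabled) {
      return null; // non-secure cluster: skip the token request entirely
    }
    return rm.getDelegationToken(renewer);
  }
}
```

Whether Oozie actually needs the token on non-secure clusters (e.g. for impersonation) is the open question in the comment above; the sketch only shows the guard, not an answer to that question.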
[jira] [Commented] (YARN-5716) Add global scheduler interface definition and update CapacityScheduler to use it.
[ https://issues.apache.org/jira/browse/YARN-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593544#comment-15593544 ] Hadoop QA commented on YARN-5716: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 13 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 47s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 55s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 0s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 132 new + 1472 unchanged - 164 fixed = 1604 total (was 1636) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s {color} | {color:green} hadoop-yarn-project_hadoop-yarn generated 0 new + 6484 unchanged - 10 fixed = 6484 total (was 6494) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 928 unchanged - 10 fixed = 928 total (was 938) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 51s {color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 39m 41s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 95m 51s {color} | {color:black} {color} |
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager |
| | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices |
|| Subsystem || Report/Notes
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593535#comment-15593535 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:50 AM: --- Thanks all for the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default/fallback implementation, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks all for the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593535#comment-15593535 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:50 AM: --- Thanks all for the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks all or the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593535#comment-15593535 ] Botong Huang commented on YARN-5525: Thanks all or the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-5525: --- Comment: was deleted (was: Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! ) > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
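The "configurable class with a default, instantiated reflectively" pattern being discussed for YARN-5525 can be sketched as below. The config key, interface, and loader here are hypothetical stand-ins (real Hadoop code would use `Configuration.getClass` and `ReflectionUtils.newInstance`):

```java
import java.util.Properties;

public class AggregatorLoader {
  // Stand-in for the pluggable log-aggregator interface.
  public interface AppLogAggregator {
    String name();
  }

  // Default/fallback implementation used when no class is configured.
  public static class DefaultAggregator implements AppLogAggregator {
    public String name() { return "default"; }
  }

  // Read the implementation class name from configuration, falling back to
  // the default, and instantiate it reflectively.
  static AppLogAggregator load(Properties conf) throws Exception {
    String cls = conf.getProperty("yarn.nodemanager.log-aggregator.class",
        DefaultAggregator.class.getName());
    return (AppLogAggregator) Class.forName(cls)
        .getDeclaredConstructor().newInstance();
  }
}
```

The per-application override discussed in the thread (an optional entry in `LogAggregationContext`) would layer on top of this: check the app's submission context first, then fall back to the configured class.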
[jira] [Commented] (YARN-5686) DefaultContainerExecutor random working dir algorithm skews results
[ https://issues.apache.org/jira/browse/YARN-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593533#comment-15593533 ] Miklos Szegedi commented on YARN-5686: -- +1 (non-binding) It looks good to me. Thank you, [~vrushalic]! > DefaultContainerExecutor random working dir algorithm skews results > > > Key: YARN-5686 > URL: https://issues.apache.org/jira/browse/YARN-5686 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Vrushali C >Priority: Minor > Attachments: YARN-5686.001.patch, YARN-5686.002.patch >
> {code}
> long randomPosition = RandomUtils.nextLong() % totalAvailable;
> ...
> while (randomPosition > availableOnDisk[dir]) {
>   randomPosition -= availableOnDisk[dir++];
> }
> {code}
> The code above selects a disk based on a random number weighted by the free space on each disk. For example, if I have two disks with 100 bytes each, totalAvailable is 200. The value of randomPosition will be 0..199. 0..99 should select the first disk, and 100..199 should select the second disk, inclusively. Random number 100 should select the second disk to be fair, but this is not the case right now.
> We need to use
> {code}
> while (randomPosition >= availableOnDisk[dir])
> {code}
> instead of
> {code}
> while (randomPosition > availableOnDisk[dir])
> {code}
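The off-by-one in the weighted disk selection above is easy to demonstrate. This sketch applies the `>=` fix from the issue description (the surrounding class is illustrative; only the loop mirrors the code in the JIRA):

```java
public class WeightedPick {
  // Select a disk index proportionally to free space. With strict '>' the
  // boundary value (e.g. 100 for two 100-byte disks) would incorrectly land
  // on the earlier disk; '>=' keeps the selection exactly proportional.
  static int pickDisk(long randomPosition, long[] availableOnDisk) {
    int dir = 0;
    while (randomPosition >= availableOnDisk[dir]) { // '>=' is the fix
      randomPosition -= availableOnDisk[dir++];
    }
    return dir;
  }
}
```

For two disks with 100 bytes each, positions 0..99 map to disk 0 and 100..199 map to disk 1, giving each disk exactly 100 of the 200 outcomes, which is the fairness the issue asks for.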
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593384#comment-15593384 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:37 AM: --- Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add the optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593384#comment-15593384 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:36 AM: --- Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add the optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the default {{AppLogAggregator}} in Yarn conf, and then add the optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593384#comment-15593384 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:35 AM: --- Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the default {{AppLogAggregator}} in Yarn conf, and then add the optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? Which one do you prefer? > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593507#comment-15593507 ] Hadoop QA commented on YARN-5761: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 6 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | 
{color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 31s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 25 new + 913 unchanged - 14 fixed = 938 total (was 927) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 39m 3s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 56m 21s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834573/YARN-5761.1.rebase.patch | | JIRA Issue | YARN-5761 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 1dd8d8d30bfa 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 262827c | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13460/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13460/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13460/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL:
[jira] [Commented] (YARN-5280) Allow YARN containers to run with Java Security Manager
[ https://issues.apache.org/jira/browse/YARN-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593491#comment-15593491 ] Robert Kanter commented on YARN-5280: - Thanks for continuing your work on this [~gphillips]. Here's some more feedback on the latest patch. I haven't had the time to test it out, so this is all based on reading through the code changes: # Can you look into the test failures reported above? Also the checkstyle and other warnings. Unfortunately, it looks like the Jenkins job has been purged, so we don't have that info there anymore. # Why do we add the queue name to the env? It looks like you're only using the queue in the {{JavaSandboxLinuxContainerRuntime}}, so I think it could go in the {{ContainerRuntimeContext}} instead. #- Also, it's in MR code, so it's only going to be added for MR apps and not other JVM-based apps (e.g. Spark, Oozie-on-Yarn Launcher, etc). # The class Javadoc comment in {{DelegatingLinuxContainerRuntime}} should be updated now that it can also delegate to the {{JavaSandboxLinuxContainerRuntime}}. # The config properties added to {{JavaSandboxLinuxContainerRuntime}} (i.e. {{"yarn.nodemanager.linux-container-executor.sandbox-mode.*"}}) should be defined in {{YarnConfiguration}} along with a default value. See the other properties in {{YarnConfiguration}} for examples. # Instead of inlining {{PosixFilePermissions.fromString("rwxr-xr-x"))}} and similar in {{JavaSandboxLinuxContainerRuntime}}, they should be declared as private constants. # We could use some additional unit tests. There are some complicated regexes, different operating modes, etc. that we should make sure to cover more fully. 
> Allow YARN containers to run with Java Security Manager > --- > > Key: YARN-5280 > URL: https://issues.apache.org/jira/browse/YARN-5280 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, yarn >Affects Versions: 2.6.4 >Reporter: Greg Phillips >Assignee: Greg Phillips >Priority: Minor > Attachments: YARN-5280.001.patch, YARN-5280.002.patch, > YARN-5280.patch, YARNContainerSandbox.pdf > > > YARN applications have the ability to perform privileged actions which have > the potential to add instability into the cluster. The Java Security Manager > can be used to prevent users from running privileged actions while still > allowing their core data processing use cases. > Introduce a YARN flag which will allow a Hadoop administrator to enable the > Java Security Manager for user code, while still providing complete > permissions to core Hadoop libraries. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node REST API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Summary: RM Cluster Node REST API documentation is not up to date (was: RM Cluster Node API documentation is not up to date) > RM Cluster Node REST API documentation is not up to date > > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 2.7.3, 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Affects Version/s: 2.7.3 > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 2.7.3, 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593486#comment-15593486 ] Miklos Szegedi commented on YARN-5757: -- [~gsohn], I added the requested fields. Let me know, if you need anything else. > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Fix Version/s: 3.0.0-alpha2 > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Component/s: yarn resourcemanager > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Affects Version/s: 3.0.0-alpha1 > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4126) RM should not issue delegation tokens in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593445#comment-15593445 ] Jian He commented on YARN-4126: --- [~jlowe], I think this jira's intention is not wrong; a delegation token is not required for an unsecure cluster. Why does Oozie require the delegation token in an unsecure cluster? This jira was actually opened because of that; the previous [comment | https://issues.apache.org/jira/browse/YARN-4126?focusedCommentId=14735170=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14735170] said so. But I don't remember what the exact issue is... > RM should not issue delegation tokens in unsecure mode > -- > > Key: YARN-4126 > URL: https://issues.apache.org/jira/browse/YARN-4126 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Bibin A Chundatt > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: 0001-YARN-4126.patch, 0002-YARN-4126.patch, > 0003-YARN-4126.patch, 0004-YARN-4126.patch, 0005-YARN-4126.patch, > 0006-YARN-4126.patch > > > ClientRMService#getDelegationToken is currently returning a delegation token > in insecure mode. We should not return the token if it's in insecure mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5750) YARN-4126 broke Oozie on unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593427#comment-15593427 ] Jian He commented on YARN-5750: --- [~gezapeti], what is the DummyRenewer used for? I wonder why Oozie requires the delegation token in an unsecure cluster in the first place. > YARN-4126 broke Oozie on unsecure cluster > - > > Key: YARN-5750 > URL: https://issues.apache.org/jira/browse/YARN-5750 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Peter Cseh > > Oozie is using a DummyRenewer on unsecure clusters and can't submit workflows > on an unsecure cluster after YARN-4126. > {noformat} > org.apache.oozie.action.ActionExecutorException: JA009: > org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: > Delegation Token can be issued only with kerberos authentication > at > org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1092) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getDelegationToken(ApplicationClientProtocolPBServiceImpl.java:335) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:663) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2419) > Caused by: java.io.IOException: Delegation Token 
can be issued only with > kerberos authentication > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1065) > ... 10 more > at > org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:457) > at > org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:437) > at > org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1128) > at > org.apache.oozie.action.hadoop.TestJavaActionExecutor.submitAction(TestJavaActionExecutor.java:343) > at > org.apache.oozie.action.hadoop.TestJavaActionExecutor.submitAction(TestJavaActionExecutor.java:363) > at > org.apache.oozie.action.hadoop.TestJavaActionExecutor.testKill(TestJavaActionExecutor.java:602) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at junit.framework.TestCase.runTest(TestCase.java:168) > at junit.framework.TestCase.runBare(TestCase.java:134) > at junit.framework.TestResult$1.protect(TestResult.java:110) > at junit.framework.TestResult.runProtected(TestResult.java:128) > at junit.framework.TestResult.run(TestResult.java:113) > at junit.framework.TestCase.run(TestCase.java:124) > at junit.framework.TestSuite.runTest(TestSuite.java:232) > at junit.framework.TestSuite.run(TestSuite.java:227) > at > org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) > at org.junit.runners.Suite.runChild(Suite.java:128) > at org.junit.runners.Suite.runChild(Suite.java:24) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: > org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: > Delegation Token can be issued only with kerberos authentication > at > org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1092) > at >
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593384#comment-15593384 ] Botong Huang commented on YARN-5525: Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? Which one do you prefer? > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5761: Attachment: YARN-5761.1.rebase.patch > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5761.1.patch, YARN-5761.1.rebase.patch > > > Currently, in scheduler code, we are doing queue manager and scheduling work. > We'd better separate the queue manager out of scheduler logic. In that case, > it would be much easier and safer to extend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593349#comment-15593349 ] Xuan Gong commented on YARN-5761: - This patch applies for branch-2. > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5761.1.patch > > > Currently, in scheduler code, we are doing queue manager and scheduling work. > We'd better separate the queue manager out of scheduler logic. In that case, > it would be much easier and safer to extend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
[ https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash reassigned YARN-5762: -- Assignee: Ravi Prakash > Summarize ApplicationNotFoundException in the RM log > > > Key: YARN-5762 > URL: https://issues.apache.org/jira/browse/YARN-5762 > Project: Hadoop YARN > Issue Type: Task >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Assignee: Ravi Prakash >Priority: Minor > Attachments: YARN-5762.01.patch > > > We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were > most likely caused by the {{AggregatedLogDeletionService}} [which > checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] > that the application is not running anymore. e.g. > {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server > handler 20 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35401 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1451' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 47 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35404 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1452' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
[ https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-5762: --- Attachment: YARN-5762.01.patch Here's a simple 1 line patch > Summarize ApplicationNotFoundException in the RM log > > > Key: YARN-5762 > URL: https://issues.apache.org/jira/browse/YARN-5762 > Project: Hadoop YARN > Issue Type: Task >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Priority: Minor > Attachments: YARN-5762.01.patch > > > We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were > most likely caused by the {{AggregatedLogDeletionService}} [which > checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] > that the application is not running anymore. e.g. > {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server > handler 20 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35401 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1451' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 47 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35404 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1452' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
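The summarization being proposed can be sketched as follows: instead of logging the full stack trace for an expected condition, collapse the exception to a single class-plus-message line. The class and method names below are illustrative only, not taken from the YARN-5762 patch (Hadoop's IPC `Server` also exposes an `addTerseExceptions` facility for exactly this purpose).

```java
// Hypothetical helper: collapse an expected exception into one log line
// instead of printing the full stack trace. Names are illustrative,
// not from the actual YARN-5762 patch.
public class ExceptionSummarizer {

    /** Returns "ExceptionClass: message" with no stack frames. */
    public static String summarize(Throwable t) {
        return t.getClass().getSimpleName() + ": " + t.getMessage();
    }

    public static void main(String[] args) {
        Throwable t = new IllegalStateException(
            "Application with id 'application_1473396553140_1451' doesn't exist in RM.");
        // One line in the log instead of a dozen stack frames.
        System.out.println(summarize(t));
    }
}
```

The trade-off is losing the call site; that is acceptable here because the exception is expected and the caller (AggregatedLogDeletionService) is known.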
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593336#comment-15593336 ] Sangjin Lee commented on YARN-5760: --- A connection pool is an apt description. I do see value in minimizing the number of connections, but this should be done correctly or it could become a source of complexity and issues down the road. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of whether the NM is handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens a connection to Zookeeper as soon as the NM > starts up, instead of opening a connection when at least one app arrives for > publishing and closing it when no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
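The lazy, reference-counted connection idea discussed above can be sketched like this. `Connection` here is a stand-in interface, not HBase's; a real change would wrap `ConnectionFactory#createConnection` in the same way, and all names are hypothetical.

```java
import java.util.function.Supplier;

// Sketch: create the (expensive) connection only when the first app
// collector starts publishing, and close it once no apps remain.
public class LazyConnectionHolder {
    // Stand-in for an HBase Connection; only close() matters here.
    interface Connection { void close(); }

    private final Supplier<Connection> factory;
    private Connection conn;   // created on first acquire, not at NM startup
    private int activeApps;    // app collectors currently publishing

    LazyConnectionHolder(Supplier<Connection> factory) {
        this.factory = factory;
    }

    /** Called when an app collector starts publishing from this NM. */
    synchronized Connection acquire() {
        if (activeApps++ == 0) {
            conn = factory.get();  // connect to HBase/ZooKeeper only now
        }
        return conn;
    }

    /** Called when the app finishes; closes once no apps remain. */
    synchronized void release() {
        if (--activeApps == 0) {
            conn.close();
            conn = null;
        }
    }

    synchronized boolean isConnected() { return conn != null; }
}
```

Reference counting is what makes this "done correctly" in Sangjin's sense: without it, closing on any single app finish would break other apps still publishing.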
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593320#comment-15593320 ] Jian He commented on YARN-4597: --- [~asuresh], some more questions and comments on the patch: - why are these two transitions added? {code} .addTransition(ContainerState.DONE, ContainerState.DONE, ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP) .addTransition(ContainerState.DONE, ContainerState.DONE, ContainerEventType.CONTAINER_LAUNCHED) .addTransition(ContainerState.DONE, ContainerState.DONE, {code} - storeContainerKilled will be called in ContainerLaunch#cleanupContainer later, so we don't need to call it here? {code} public void sendKillEvent(int exitStatus, String description) { try { context.getNMStateStore().storeContainerKilled(containerId); } catch (IOException ioe) { LOG.error("Could not log container state change to state store..", ioe); } {code} - remove unused imports in ContainersMonitor.java - remove the unused ContainersMonitorImpl#allocatedCpuUsage method - why do you need to add the additional check for the SCHEDULED state? {code} // Process running containers if (remoteContainer.getState() == ContainerState.RUNNING || remoteContainer.getState() == ContainerState.SCHEDULED) { {code} - why does this test need to be changed? {code} testGetContainerStatus(container, i, EnumSet.of(ContainerState.RUNNING, ContainerState.SCHEDULED), "", {code} - similarly here in TestNodeManagerShutdown, do we still need to change the test to make sure the container reaches the running state? {code} Assert.assertTrue( EnumSet.of(ContainerState.RUNNING, ContainerState.SCHEDULED) .contains(containerStatus.getState())); {code} - why do we need to change the test to run for 10 minutes? {code} @Test(timeout = 60) public void testAMRMClient() throws YarnException, IOException { {code} - unrelated to this patch: should ResourceUtilization#pmem,vmem be changed to the long type?
we had specifically changed it for the Resource object - we don't need to synchronize on the currentUtilization object? I don't see any other place it's synchronized {code} synchronized (currentUtilization) { {code} - In case we exceed the max queue length, we kill the container directly instead of queueing it; in that case, we should not store the container as queued? {code} try { this.context.getNMStateStore().storeContainerQueued( container.getContainerId()); } catch (IOException e) { LOG.warn("Could not store container state into store..", e); } {code} - The ResourceUtilizationManager looks like it only incorporates some utility methods; I'm not sure how we will make this pluggable later.. - I think there might be a behavior change or bug for scheduling guaranteed containers when the opportunistic queue is enabled. -- Previously, when launching a container, the NM would not check current vmem and cpu usage. It assumed that what the RM allocated could be launched. -- Now, the NM checks these limits and won't launch the container if it hits a limit. -- Suppose a guaranteed container hits a limit: it will be queued into queuedGuaranteedContainers, and it will never be launched until some other container finishes and triggers the code path, even if the limits are no longer being hit. This is a problem especially when the other containers are long-running and never finish. - The logic to select opportunistic containers: we may kill more opportunistic containers than required, e.g. -- one guaranteed container comes, and we select one opportunistic container -- before the selected opportunistic container is killed, another guaranteed container comes, and we select two opportunistic containers to kill -- the process repeats, and we may end up killing more opportunistic containers than required. 
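The over-kill race described in the last bullet can be avoided by remembering which opportunistic containers have already been selected as victims. A minimal sketch of that idea, with all names hypothetical (not from the YARN-4597 patch):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: if containers already selected for preemption are not tracked,
// a second guaranteed request arriving before the first kill completes
// would select the same (or extra) victims again.
public class OpportunisticVictimSelector {
    private final Deque<String> runningOpportunistic = new ArrayDeque<>();
    private final Set<String> alreadySelected = new HashSet<>();

    void add(String containerId) {
        runningOpportunistic.add(containerId);
    }

    /**
     * Selects victims for one guaranteed container, skipping containers
     * whose kill has been requested but has not yet completed.
     */
    List<String> selectVictims(int needed) {
        List<String> victims = new ArrayList<>();
        for (String id : runningOpportunistic) {
            if (victims.size() == needed) {
                break;
            }
            if (alreadySelected.add(id)) {  // false if selected earlier
                victims.add(id);
            }
        }
        return victims;
    }
}
```

A real implementation would also remove entries from both structures once a kill actually completes; this sketch only shows the "pending kill" bookkeeping that prevents double selection.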
> Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5716) Add global scheduler interface definition and update CapacityScheduler to use it.
[ https://issues.apache.org/jira/browse/YARN-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593292#comment-15593292 ] Wangda Tan commented on YARN-5716: -- Thanks [~jianhe] for the thorough review! bq. how about check whether "getNode(node.getNodeID())" equals to null, I feel that's easier to reason for a removed node It is also possible that the ref to the node has changed, for example, when the node's resource is updated. In that case, we may need to skip such a node for safety. bq. This if condition can be merged into previous "if (reservedContainer != null) {" condition, as they are the same. No, we cannot do this merge, because it is possible that in the previous reservedContainer != null branch we reserve a new container, so the check is not valid. bq. Looks like one behavior change is that previously on node heartbeat, we always satisfy reservedContainer first, now in async scheduling, it's not the case any more ? It is still the same: if you look at {{allocateContainerOnSingleNode}}, we try to satisfy the reserved container first. bq. PlacementSet, placement is an abstract name, how about NodeSet to be more concrete? I would prefer to use "PlacementSet": since it is for "placement", we could add more information to it, for example, racks. bq. PlacementSetUtils.getSingleNode -> hasSingleNode But what we need to do is return the node if it is a single-node placement set, so I think this name is better. bq. nodePartition parameter is not needed, it can be inferred from 'node' parameter The original purpose of adding the partition is that the partition of the node could be updated between a proposal being proposed and applied; it will be used to check whether we should reject the proposal when the partition of the node has changed. I have a separate "TODO" in FiCaSchedulerApp: {code} // TODO, make sure all node labels are not changed {code} bq. 
In LeafQueue#updateCurrentResourceLimits, multiple threads will update cachedResourceLimitsForHeadroom without synchronization This is intentional: we want resourceLimitsForHeadroom to be up-to-date. It is possible that one thread sees some inconsistent data, but it will be corrected soon by other threads. Since resourceLimitsForHeadroom is only used to give hints to the application, this should be fine. And ResourceLimits is volatile, so it is safe as well. bq. SchedulerApplicationAttempt#incNumAllocatedContainers, all the locality statistics functionality are removed ? Oh, I missed that; I will update it in the next iteration. Addressed all the other comments. Uploaded patch ver.5 > Add global scheduler interface definition and update CapacityScheduler to use > it. > - > > Key: YARN-5716 > URL: https://issues.apache.org/jira/browse/YARN-5716 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-5716.001.patch, YARN-5716.002.patch, > YARN-5716.003.patch, YARN-5716.004.patch, YARN-5716.005.patch > > > Target of this JIRA: > - Definition of interfaces / objects which will be used by global scheduling; > this will be shared by different schedulers. > - Modify CapacityScheduler to use it.
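The "intentional lack of synchronization" argument above is the classic benign-race pattern: a volatile reference to an immutable object may be overwritten concurrently without locks, because readers only need *some* recent value, not necessarily the latest. A self-contained illustration (the class names are stand-ins, not the real ResourceLimits/LeafQueue code):

```java
// Sketch of the benign-race pattern: last-writer-wins on a volatile
// reference to an immutable hint object. Readers never see a torn or
// partially constructed value, and any stale value is soon replaced.
public class HeadroomHint {

    // Immutable snapshot, so publishing it via a volatile field is safe.
    static final class ResourceLimitsHint {
        final long memoryMb;
        ResourceLimitsHint(long memoryMb) { this.memoryMb = memoryMb; }
    }

    private volatile ResourceLimitsHint cached = new ResourceLimitsHint(0);

    // Many scheduler threads may call this concurrently without locking;
    // that is acceptable because the value is only an advisory hint.
    void update(long memoryMb) {
        cached = new ResourceLimitsHint(memoryMb);
    }

    long currentMemoryMb() {
        return cached.memoryMb;
    }
}
```

The key requirements for this pattern are that the published object is immutable and that occasional staleness is harmless, which is exactly Wangda's justification for the headroom hint.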
[jira] [Updated] (YARN-5716) Add global scheduler interface definition and update CapacityScheduler to use it.
[ https://issues.apache.org/jira/browse/YARN-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-5716: - Attachment: YARN-5716.005.patch > Add global scheduler interface definition and update CapacityScheduler to use > it. > - > > Key: YARN-5716 > URL: https://issues.apache.org/jira/browse/YARN-5716 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-5716.001.patch, YARN-5716.002.patch, > YARN-5716.003.patch, YARN-5716.004.patch, YARN-5716.005.patch > > > Target of this JIRA: > - Definition of interfaces / objects which will be used by global scheduling, > this will be shared by different schedulers. > - Modify CapacityScheduler to use it.
[jira] [Updated] (YARN-5764) NUMA awareness support for launching containers
[ https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-5764: Affects Version/s: (was: 2.6.0) > NUMA awareness support for launching containers > --- > > Key: YARN-5764 > URL: https://issues.apache.org/jira/browse/YARN-5764 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, yarn > Environment: SW: CentOS 6.7, Hadoop 2.6.0 > Processors: Intel Xeon CPU E5-2699 v4 @2.20GHz > Memory: 256GB 4 NUMA nodes >Reporter: Olasoji > > The purpose of this feature is to improve Hadoop performance by minimizing > costly remote memory accesses on non SMP systems. Yarn containers, on launch, > will be pinned to a specific NUMA node and all subsequent memory allocations > will be served by the same node, reducing remote memory accesses. The current > default behavior is to spread memory across all NUMA nodes.
[jira] [Updated] (YARN-5764) NUMA awareness support for launching containers
[ https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-5764: Fix Version/s: (was: 2.6.0) > NUMA awareness support for launching containers > --- > > Key: YARN-5764 > URL: https://issues.apache.org/jira/browse/YARN-5764 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, yarn >Affects Versions: 2.6.0 > Environment: SW: CentOS 6.7, Hadoop 2.6.0 > Processors: Intel Xeon CPU E5-2699 v4 @2.20GHz > Memory: 256GB 4 NUMA nodes >Reporter: Olasoji > > The purpose of this feature is to improve Hadoop performance by minimizing > costly remote memory accesses on non SMP systems. Yarn containers, on launch, > will be pinned to a specific NUMA node and all subsequent memory allocations > will be served by the same node, reducing remote memory accesses. The current > default behavior is to spread memory across all NUMA nodes.
[jira] [Created] (YARN-5764) NUMA awareness support for launching containers
Olasoji created YARN-5764: - Summary: NUMA awareness support for launching containers Key: YARN-5764 URL: https://issues.apache.org/jira/browse/YARN-5764 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, yarn Affects Versions: 2.6.0 Environment: SW: CentOS 6.7, Hadoop 2.6.0 Processors: Intel Xeon CPU E5-2699 v4 @2.20GHz Memory: 256GB 4 NUMA nodes Reporter: Olasoji Fix For: 2.6.0 The purpose of this feature is to improve Hadoop performance by minimizing costly remote memory accesses on non SMP systems. Yarn containers, on launch, will be pinned to a specific NUMA node and all subsequent memory allocations will be served by the same node, reducing remote memory accesses. The current default behavior is to spread memory across all NUMA nodes.
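The pinning described above is typically done by prefixing the container launch command with numactl. The `--cpunodebind`/`--membind` flags are real numactl options; wiring them into the NM's container executor, and the class below, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: prefix a container launch command so the process runs only on
// the CPUs of one NUMA node and allocates memory from that node.
public class NumaLauncher {

    static List<String> pinToNode(int numaNode, List<String> command) {
        List<String> pinned = new ArrayList<>();
        pinned.add("numactl");
        pinned.add("--cpunodebind=" + numaNode);  // CPUs of this node only
        pinned.add("--membind=" + numaNode);      // allocate memory locally
        pinned.addAll(command);
        return pinned;
    }
}
```

For example, `pinToNode(2, List.of("bash", "launch_container.sh"))` yields a command line starting with `numactl --cpunodebind=2 --membind=2`, which is what turns the default spread-across-all-nodes allocation into local allocation.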
[jira] [Commented] (YARN-5747) Application timeline metric aggregation in timeline v2 will lose last round aggregation when an application finishes
[ https://issues.apache.org/jira/browse/YARN-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593276#comment-15593276 ] Sangjin Lee commented on YARN-5747: --- +1. > Application timeline metric aggregation in timeline v2 will lose last round > aggregation when an application finishes > > > Key: YARN-5747 > URL: https://issues.apache.org/jira/browse/YARN-5747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-5747-trunk.001.patch > > > As discussed in YARN-3816, when an application finishes we should perform an > extra round of application level timeline aggregation. Otherwise data posted > after the last round of aggregation will get lost.
[jira] [Commented] (YARN-5715) introduce entity prefix for return and sort order
[ https://issues.apache.org/jira/browse/YARN-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593274#comment-15593274 ] Sangjin Lee commented on YARN-5715: --- The jenkins appears to be unstable right now, and that might be why the build hasn't kicked in. I think the latest patch is almost there. Should this be committed to trunk? We know that more parts to the reader code are needed. Should we wait until those parts are done before we commit this to trunk? Is this needed on the trunk now? (TimelineEntity.java) - l.597: nit: “Set” -> “Sets” - also, for “user”, let’s say either “users” or “the user” - Can we move the statement “User can use …” to the end of the javadoc (after “Entities will be stored…”)? IMO it is more important to state that the entities will be stored in the id prefix order than how to invert the prefix. (TimelineServiceHelper.java) - l.50: nit: “Invert” -> “Inverts” (EntityRowKey.java) - l.230: we should use “long” here (not “Long”) > introduce entity prefix for return and sort order > - > > Key: YARN-5715 > URL: https://issues.apache.org/jira/browse/YARN-5715 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Rohith Sharma K S >Priority: Critical > Attachments: YARN-5715-YARN-5355.01.patch, > YARN-5715-YARN-5355.02.patch, YARN-5715-YARN-5355.03.patch, > YARN-5715-YARN-5355.04.patch, YARN-5715-YARN-5355.05.patch > > > While looking into YARN-5585, we have come across the need to provide a sort > order different than the current entity id order. The current entity id order > returns entities strictly in the lexicographical order, and as such it > returns the earliest entities first. This may not be the most natural return > order. A more natural return/sort order would be from the most recent > entities. > To solve this, we would like to add what we call the "entity prefix" in the > row key for the entity table. 
It is a number (long) that can be easily > provided by the client on write. In the row key, it would be added before the > entity id itself. > The entity prefix would be considered mandatory. On all writes (including > updates) the correct entity prefix should be set by the client so that the > correct row key is used. The entity prefix needs to be unique only within the > scope of the application and the entity type. > For queries that return a list of entities, the prefix values will be > returned along with the entity id's. Queries that specify the prefix and the > id should be returned quickly using the row key. If the query omits the > prefix but specifies the id (query by id), the query may be less efficient. > This JIRA should add the entity prefix to the entity API and add its handling > to the schema and the write path. The read path will be addressed in > YARN-5585. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
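The "entity prefix" ordering trick discussed above relies on inverting the long before it goes into the row key: storing `Long.MAX_VALUE - value` makes larger (more recent) prefixes sort first under the ascending lexicographic order HBase row keys use. A minimal sketch of the inversion (mirroring what a helper such as TimelineServiceHelper#invertLong does, for non-negative values):

```java
// Sketch: invert a long so that descending order of the original values
// becomes ascending order of the stored values, giving "most recent
// first" scans over HBase row keys.
public class RowKeyInversion {

    static long invertLong(long value) {
        return Long.MAX_VALUE - value;
    }

    public static void main(String[] args) {
        long older = 1000L, newer = 2000L;
        // After inversion the newer entity's key is the smaller number,
        // so it sorts (and is scanned) first.
        System.out.println(invertLong(newer) < invertLong(older));
    }
}
```

The inversion is its own inverse, so the original prefix can be recovered from the row key on read; this is also why the nit at l.230 matters, since boxing the value in a `Long` adds nothing here.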
[jira] [Created] (YARN-5763) HttpListener takes upwards of 3 minutes to start on failover
Sean Po created YARN-5763: - Summary: HttpListener takes upwards of 3 minutes to start on failover Key: YARN-5763 URL: https://issues.apache.org/jira/browse/YARN-5763 Project: Hadoop YARN Issue Type: Bug Reporter: Sean Po Assignee: Sean Po When Yarn RM fails over to another instance, it takes multiple minutes before the new master Yarn RM can begin accepting requests.
[jira] [Commented] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
[ https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593147#comment-15593147 ] Ravi Prakash commented on YARN-5762: IMHO we should summarize these into 1 line instead of whole stack traces. > Summarize ApplicationNotFoundException in the RM log > > > Key: YARN-5762 > URL: https://issues.apache.org/jira/browse/YARN-5762 > Project: Hadoop YARN > Issue Type: Task >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Priority: Minor > > We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were > most likely caused by the {{AggregatedLogDeletionService}} [which > checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] > that the application is not running anymore. e.g. > {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server > handler 20 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35401 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1451' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 47 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35404 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1452' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
Ravi Prakash created YARN-5762: -- Summary: Summarize ApplicationNotFoundException in the RM log Key: YARN-5762 URL: https://issues.apache.org/jira/browse/YARN-5762 Project: Hadoop YARN Issue Type: Task Affects Versions: 2.7.2 Reporter: Ravi Prakash Priority: Minor We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were most likely caused by the {{AggregatedLogDeletionService}} [which checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] that the application is not running anymore. e.g. {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server handler 20 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from :12205 Call#35401 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1473396553140_1451' doesn't exist in RM. 
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from :12205 Call#35404 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1473396553140_1452' doesn't exist in RM. 
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5575) Many classes use bare yarn. properties instead of the defined constants
[ https://issues.apache.org/jira/browse/YARN-5575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593136#comment-15593136 ] Hadoop QA commented on YARN-5575: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 25 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 53s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 14s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 50s {color} | {color:red} root: The patch generated 2 new + 1765 unchanged - 57 fixed = 1767 total (was 1822) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 11s {color} | {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 38m 41s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m 15s {color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 25s {color} | {color:green} hadoop-yarn-applications-distributedshell in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 119m 30s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 11s {color} | {color:green} hadoop-gridmix in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 253m 39s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices | | | hadoop.yarn.client.cli.TestLogsCLI | | | hadoop.hdfs.TestNNBench | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834469/YARN-5575.003.patch | | JIRA Issue | YARN-5575 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit
[jira] [Commented] (YARN-5746) The state of the parentQueue and its childQueues should be synchronized.
[ https://issues.apache.org/jira/browse/YARN-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593129#comment-15593129 ] Hadoop QA commented on YARN-5746: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | 
{color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 68 unchanged - 0 fixed = 69 total (was 68) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 34m 55s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 18s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834546/YARN-5746.2.patch | | JIRA Issue | YARN-5746 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 969c8800fde6 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 262827c | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13457/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13457/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13457/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > The state of the parentQueue and its childQueues should be synchronized. > > > Key: YARN-5746 >
[jira] [Commented] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593098#comment-15593098 ] Hadoop QA commented on YARN-5757: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 8m 47s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834551/YARN-5757.000.patch | | JIRA Issue | YARN-5757 | | Optional Tests | asflicense mvnsite | | uname | Linux d6f706506cc6 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 262827c | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: . | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13458/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5759) Capability to register for a notification/callback on the expiry of timeouts
[ https://issues.apache.org/jira/browse/YARN-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593093#comment-15593093 ] Hitesh Shah commented on YARN-5759: --- Will this address support for a post-app action executed by YARN after the application reaches an end state? i.e. somewhat like a finally block for a yarn app? > Capability to register for a notification/callback on the expiry of timeouts > > > Key: YARN-5759 > URL: https://issues.apache.org/jira/browse/YARN-5759 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Gour Saha > > There is a need for the YARN native services REST-API service to take > certain actions once a timeout of an application expires. For example, an > immediate requirement is to destroy a Slider application once its lifetime > timeout expires and YARN has stopped the application. Destroying a Slider > application means cleanup of the Slider HDFS state store and ZK paths for that > application. > Potentially, there will be advanced requirements from the REST-API service > and other services in the future, which will make this feature very handy.
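No patch exists yet for this issue, so as a rough illustration of the "finally block" semantics being asked about, a registry along these lines could let a service (e.g. the Slider REST-API service) attach a cleanup action that YARN fires after the application reaches an end state. All class and method names here are hypothetical, not part of any YARN API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a registry of per-application cleanup callbacks,
// fired once after YARN stops an application whose lifetime timeout
// expired. None of these names exist in YARN.
class TimeoutCallbackRegistry {
  private final Map<String, Runnable> callbacks = new ConcurrentHashMap<>();

  void register(String appId, Runnable onExpiry) {
    callbacks.put(appId, onExpiry);
  }

  // Invoked after the application reaches an end state -- effectively the
  // "finally block" semantics. The callback is removed so that cleanup
  // (e.g. deleting HDFS state store and ZK paths) runs at most once.
  void fireExpiry(String appId) {
    Runnable cb = callbacks.remove(appId);
    if (cb != null) {
      cb.run();
    }
  }
}
```

A second call to `fireExpiry` for the same application is a no-op, which matches the at-most-once cleanup a destroy operation would need.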
[jira] [Commented] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593082#comment-15593082 ] Grant Sohn commented on YARN-5757: -- Can you add the version and components fields to the JIRA? Thanks. > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy
[jira] [Commented] (YARN-5755) Enhancements to STOP queue handling
[ https://issues.apache.org/jira/browse/YARN-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593076#comment-15593076 ] Vrushali C commented on YARN-5755: -- I would like to work on this if you can give me some more context. > Enhancements to STOP queue handling > --- > > Key: YARN-5755 > URL: https://issues.apache.org/jira/browse/YARN-5755 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong >
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Attachment: YARN-5757.000.patch Update the documentation to reflect the current state of the REST API > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy
[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593057#comment-15593057 ] Hadoop QA commented on YARN-5356: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 40s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 50s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 43s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 28s {color} | {color:red} root: The patch generated 4 new + 161 unchanged - 3 fixed = 165 total (was 164) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 56s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 35m 38s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s {color} | {color:green} hadoop-sls in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 93m 52s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834522/YARN-5356.006.patch | | JIRA Issue | YARN-5356 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux fd527e52320b 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6d2da38 | | Default
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593015#comment-15593015 ] Xuan Gong commented on YARN-5525: - bq. push it down completely to AppLogAggregatorImpl rather than make it completely pluggable. +1 for this approach. This is the per-app behavior (customize app specific log aggregation directory, customize log aggregation format). > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in.
[jira] [Commented] (YARN-5267) RM REST API doc for app lists "Application Type" instead of "applicationType"
[ https://issues.apache.org/jira/browse/YARN-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593014#comment-15593014 ] Hadoop QA commented on YARN-5267: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 8m 12s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12811999/YARN-5267.001.patch | | JIRA Issue | YARN-5267 | | Optional Tests | asflicense mvnsite | | uname | Linux a4105f2ba5a1 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 262827c | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13456/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > RM REST API doc for app lists "Application Type" instead of "applicationType" > -- > > Key: YARN-5267 > URL: https://issues.apache.org/jira/browse/YARN-5267 > Project: Hadoop YARN > Issue Type: Bug > Components: api, documentation >Affects Versions: 2.6.4 >Reporter: Grant Sohn >Priority: Trivial > Labels: documentation > Attachments: YARN-5267.001.patch > > > From the docs: > {noformat} > Note that depending on security settings a user might not be able to see all > the fields. > Item Data Type Description > id string The application id > user string The user who started the application > name string The application name > Application Type string The application type > > {noformat}
[jira] [Commented] (YARN-2009) Priority support for preemption in ProportionalCapacityPreemptionPolicy
[ https://issues.apache.org/jira/browse/YARN-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593001#comment-15593001 ] Eric Payne commented on YARN-2009: -- Hi [~sunilg]. I am confused by something you said in the [comment above|https://issues.apache.org/jira/browse/YARN-2009?focusedCommentId=15591597=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15591597]: {quote} I tested below case {code} ... "b\t" // app3 in b + "(4,1,n1,,40,false,20,_user1_);" + // app3 b "b\t" // app1 in a + "(6,1,n1,,5,false,30,_user2_)"; ... {code} {quote} I assumed that the above was from a unit test. As far as I can tell, nothing in the {{o.a.h.y.s.r.monitor.capacity}} framework supports testing with different users. Were you using the above code as pseudocode to document a manual test? > Priority support for preemption in ProportionalCapacityPreemptionPolicy > --- > > Key: YARN-2009 > URL: https://issues.apache.org/jira/browse/YARN-2009 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Devaraj K >Assignee: Sunil G > Attachments: YARN-2009.0001.patch, YARN-2009.0002.patch, > YARN-2009.0003.patch, YARN-2009.0004.patch, YARN-2009.0005.patch, > YARN-2009.0006.patch, YARN-2009.0007.patch, YARN-2009.0008.patch, > YARN-2009.0009.patch, YARN-2009.0010.patch, YARN-2009.0011.patch > > > While preempting containers based on the queue ideal assignment, we may need > to consider preempting the low priority application containers first.
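The selection order the YARN-2009 description asks for can be sketched in a few lines. This is illustrative Java, not ProportionalCapacityPreemptionPolicy code; the `Candidate` type and its fields are invented for the example.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: when reclaiming resources toward a queue's ideal
// assignment, order preemption candidates so that containers of
// lower-priority applications are taken first.
class PreemptionOrder {
  static class Candidate {
    final String container;
    final int appPriority;  // higher value = higher application priority

    Candidate(String container, int appPriority) {
      this.container = container;
      this.appPriority = appPriority;
    }
  }

  // Sorts in place: lowest application priority first, i.e. preempted first.
  static List<Candidate> order(List<Candidate> candidates) {
    candidates.sort(Comparator.comparingInt(c -> c.appPriority));
    return candidates;
  }
}
```

A real policy would break ties on further criteria (e.g. container launch time or size); the point here is only the priority-first ordering.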
[jira] [Updated] (YARN-5746) The state of the parentQueue and its childQueues should be synchronized.
[ https://issues.apache.org/jira/browse/YARN-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5746: Attachment: YARN-5746.2.patch > The state of the parentQueue and its childQueues should be synchronized. > > > Key: YARN-5746 > URL: https://issues.apache.org/jira/browse/YARN-5746 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5746.1.patch, YARN-5746.2.patch > > > The state of the parentQueue and its childQueues need to be synchronized. > * If the state of the parentQueue becomes STOPPED, the state of its > childQueues needs to become STOPPED as well. > * If we change the state of a queue to RUNNING, we should make sure that > all its ancestors are in the RUNNING state. Otherwise, we need to fail this > operation.
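The two invariants in the YARN-5746 description can be sketched as follows. This is illustrative Java, not the actual CapacityScheduler queue classes; all names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the two rules: stopping a parent queue stops its
// whole subtree, and a queue may only move to RUNNING when every ancestor
// is RUNNING.
class SchedQueue {
  enum State { RUNNING, STOPPED }

  final String name;
  final SchedQueue parent;
  final List<SchedQueue> children = new ArrayList<>();
  State state = State.RUNNING;

  SchedQueue(String name, SchedQueue parent) {
    this.name = name;
    this.parent = parent;
    if (parent != null) {
      parent.children.add(this);
    }
  }

  // Rule 1: STOPPED propagates down to every descendant queue.
  void stop() {
    state = State.STOPPED;
    for (SchedQueue child : children) {
      child.stop();
    }
  }

  // Rule 2: fail the RUNNING transition if any ancestor is not RUNNING.
  void activate() {
    for (SchedQueue q = parent; q != null; q = q.parent) {
      if (q.state != State.RUNNING) {
        throw new IllegalStateException(
            "Cannot run " + name + ": ancestor " + q.name + " is STOPPED");
      }
    }
    state = State.RUNNING;
  }
}
```

Stopping `root` therefore stops every queue beneath it, and re-activating a leaf fails until each of its ancestors is RUNNING again.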
[jira] [Assigned] (YARN-5756) Add state-machine implementation for queues
[ https://issues.apache.org/jira/browse/YARN-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong reassigned YARN-5756: --- Assignee: Xuan Gong > Add state-machine implementation for queues > --- > > Key: YARN-5756 > URL: https://issues.apache.org/jira/browse/YARN-5756 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong >
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592976#comment-15592976 ] Hadoop QA commented on YARN-5761: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} YARN-5761 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834541/YARN-5761.1.patch | | JIRA Issue | YARN-5761 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13455/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5761.1.patch > > > Currently, the scheduler code does both queue management and scheduling work. > We'd better separate the queue manager out of the scheduler logic. In that case, > it would be much easier and safer to extend.
[jira] [Assigned] (YARN-5755) Enhancements to STOP queue handling
[ https://issues.apache.org/jira/browse/YARN-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong reassigned YARN-5755: --- Assignee: Xuan Gong > Enhancements to STOP queue handling > --- > > Key: YARN-5755 > URL: https://issues.apache.org/jira/browse/YARN-5755 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong >
[jira] [Commented] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application
[ https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592927#comment-15592927 ] Li Lu commented on YARN-5739: - We may not want to introduce another table for storing entity types for each application, since this bothers the write path a lot. Considering this is a relatively rare use case in a cluster, we would rather put most of the burden on the reader side. I had an offline discussion with Enis from the HBase community, and it seems like we can provide a custom filter on the storage layer to "jump" on a scan. In this way, we can quickly jump over all entities within one application, collecting only the distinct entity types. In this JIRA we also need to add endpoints serving this data. > Provide timeline reader API to list available timeline entity types for one > application > --- > > Key: YARN-5739 > URL: https://issues.apache.org/jira/browse/YARN-5739 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Li Lu >Assignee: Li Lu > > Right now we only show a part of the available timeline entity data in the new > YARN UI. However, some data (especially library specific data) cannot > be queried via the web UI. It will be appealing for the UI to > provide an "entity browser" for each YARN application. Actually, simply > dumping out available timeline entities (with proper pagination, of course) > would be pretty helpful for UI users. > On the timeline side, we're not far away from this goal. Right now I believe the > only thing missing is to list all available entity types within one > application. The challenge here is that we're not storing this data for each > application, but given this kind of call is relatively rare (compared to > writes and updates) we can perform some scanning during the read time.
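The "jump" scan described in the comment above can be modeled without HBase at all: after seeing the first row of each entity type, seek straight to the next type prefix instead of visiting every entity. The sketch below uses an in-memory sorted key set in place of a real HBase filter, and the "type!entity" key layout is illustrative only, not the timeline schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Sketch of the distinct-entity-type scan: collect each type once, then
// seek past all remaining rows of that type.
class EntityTypeLister {
  static List<String> listTypes(TreeSet<String> rows) {
    List<String> types = new ArrayList<>();
    String row = rows.isEmpty() ? null : rows.first();
    while (row != null) {
      String type = row.substring(0, row.indexOf('!'));
      types.add(type);
      // '"' is the character right after '!', so every "type!..." key
      // sorts strictly below type + '"'; ceiling() jumps past them all.
      row = rows.ceiling(type + '"');
    }
    return types;
  }
}
```

In a real HBase implementation the same effect would come from a filter issuing seek hints to the scanner, so the cost scales with the number of distinct types rather than the number of entities.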
[jira] [Updated] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5761: Attachment: YARN-5761.1.patch > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5761.1.patch > > > Currently, the scheduler code does both queue management and scheduling work. > We'd better separate the queue manager out of the scheduler logic. In that case, > it would be much easier and safer to extend.
[jira] [Created] (YARN-5761) Separate QueueManager from Scheduler
Xuan Gong created YARN-5761: --- Summary: Separate QueueManager from Scheduler Key: YARN-5761 URL: https://issues.apache.org/jira/browse/YARN-5761 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Currently, the scheduler code does both queue management and scheduling work. We'd better separate the queue manager out of the scheduler logic. In that case, it would be much easier and safer to extend.
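The separation proposed in YARN-5761 can be sketched as composition behind an interface. Every name below is illustrative, not the eventual patch: queue bookkeeping lives behind its own interface, and the scheduler holds a queue manager instead of inlining queue-management logic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface owning queue lifecycle and lookup.
interface QueueManagerSketch {
  void addQueue(String name);
  boolean exists(String name);
}

// A trivial implementation; a real one would manage queue hierarchy,
// capacities, and state.
class InMemoryQueueManager implements QueueManagerSketch {
  private final Map<String, Object> queues = new HashMap<>();

  public void addQueue(String name) { queues.put(name, new Object()); }

  public boolean exists(String name) { return queues.containsKey(name); }
}

// The scheduler composes a QueueManagerSketch rather than embedding the
// queue bookkeeping, so either side can be extended independently.
class SchedulerSketch {
  private final QueueManagerSketch queueManager;

  SchedulerSketch(QueueManagerSketch qm) {
    this.queueManager = qm;
  }

  boolean canSubmitTo(String queue) {
    return queueManager.exists(queue);
  }
}
```

With this shape, queue-management changes (e.g. the state machine of YARN-5756) touch only the manager, not the scheduling paths.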
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592878#comment-15592878 ] Jian He commented on YARN-5525: --- bq. push it down completely to AppLogAggregatorImpl I vote for this. This way, we don't need an additional config to replace the LogAggregationService and restart the NM to apply the config. Are we going to change ContainerLaunchContext to convey the information about which LogAggregator the app should use? > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like an app-specific log aggregation > directory or log aggregation format can be implemented and plugged in.
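The per-app direction the commenters favor can be sketched as a resolver keyed by a value the application supplies (e.g. carried in its launch context), rather than an NM-wide config that would require an NM restart to change. None of the names below are real YARN APIs; they only illustrate the shape of the idea.

```java
// Hypothetical per-application log aggregation policy.
interface PerAppLogPolicy {
  String aggregationDir(String appId);
}

// Resolves the policy per application from a requested value, falling
// back to a default when the app asked for nothing special. In YARN this
// value would plausibly ride along in the app's launch context.
class LogPolicyResolver {
  static PerAppLogPolicy resolve(String requested) {
    if ("custom-dir".equals(requested)) {
      // App asked for an app-specific aggregation directory.
      return appId -> "/custom-logs/" + appId;
    }
    // Default behavior when the app specified nothing.
    return appId -> "/app-logs/" + appId;
  }
}
```

Because resolution happens per application at aggregation time, two apps on the same NodeManager can use different directories or formats without any daemon-level reconfiguration.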
[jira] [Commented] (YARN-5747) Application timeline metric aggregation in timeline v2 will lose last round aggregation when an application finishes
[ https://issues.apache.org/jira/browse/YARN-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592836#comment-15592836 ] Vrushali C commented on YARN-5747: -- LGTM too. > Application timeline metric aggregation in timeline v2 will lose last round > aggregation when an application finishes > > > Key: YARN-5747 > URL: https://issues.apache.org/jira/browse/YARN-5747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-5747-trunk.001.patch > > > As discussed in YARN-3816, when an application finishes we should perform an > extra round of application level timeline aggregation. Otherwise data posted > after the last round of aggregation will get lost.
[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592792#comment-15592792 ] Hadoop QA commented on YARN-5356: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 45s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 56s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 54s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 59s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 51s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 29s {color} | {color:red} root: The patch generated 4 new + 160 unchanged - 3 fixed = 164 total (was 163) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s {color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 1s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 2s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 37s {color} | {color:green} hadoop-sls in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 102m 50s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834471/YARN-5356.005.patch | | JIRA Issue | YARN-5356 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux cabfe3866fce 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6d2da38 | | Default Java |
[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated YARN-5356: -- Attachment: YARN-5356.006.patch Using the configured capacity when the physical capacity is not available. > NodeManager should communicate physical resource capability to ResourceManager > -- > > Key: YARN-5356 > URL: https://issues.apache.org/jira/browse/YARN-5356 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Nathan Roberts >Assignee: Inigo Goiri > Attachments: YARN-5356.000.patch, YARN-5356.001.patch, > YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, > YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch > > > Currently ResourceUtilization contains absolute quantities of resource used > (e.g. 4096MB memory used). It would be good if the NM also communicated the > actual physical resource capabilities of the node so that the RM can use this > data to schedule more effectively (overcommit, etc) > Currently the only available information is the Resource the node registered > with (or later updated using updateNodeResource). However, these aren't > really sufficient to get a good view of how utilized a resource is. For > example, if a node reports 400% CPU utilization, does that mean it's > completely full, or barely utilized? Today there is no reliable way to figure > this out. > [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you > have thoughts/opinions on this?
[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592776#comment-15592776 ] Inigo Goiri commented on YARN-5356: --- OK, that makes sense too. Let me switch to that. > NodeManager should communicate physical resource capability to ResourceManager > -- > > Key: YARN-5356 > URL: https://issues.apache.org/jira/browse/YARN-5356 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Nathan Roberts >Assignee: Inigo Goiri > Attachments: YARN-5356.000.patch, YARN-5356.001.patch, > YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, > YARN-5356.004.patch, YARN-5356.005.patch > > > Currently ResourceUtilization contains absolute quantities of resource used > (e.g. 4096MB memory used). It would be good if the NM also communicated the > actual physical resource capabilities of the node so that the RM can use this > data to schedule more effectively (overcommit, etc) > Currently the only available information is the Resource the node registered > with (or later updated using updateNodeResource). However, these aren't > really sufficient to get a good view of how utilized a resource is. For > example, if a node reports 400% CPU utilization, does that mean it's > completely full, or barely utilized? Today there is no reliable way to figure > this out. > [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you > have thoughts/opinions on this?
[jira] [Comment Edited] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592721#comment-15592721 ] Joep Rottinghuis edited comment on YARN-5760 at 10/20/16 7:20 PM: -- If we lazily initialize the connection then that code should be done thread-safe. If we don't want to close and discard the connection right after an application lifecycle ends, then we need to keep track of reference count and have a separate timer, as well as code to ensure that we don't introduce a race condition between the timer expiring and a new application starting and bumping up the reference count for a connection. Similar concerns for re-using and/or re-establishing a connection in the face of failures (YARN-4061) will have to be safe. I can see this code start simple in intention and get complicated quickly in implementation, or be a source of subtle bugs. In other words, it sounds like we're heading towards implementing our own ConnectionManager / connection pool (of size 1) here. was (Author: jrottinghuis): If we lazily initialize the connection then that code should be done thread-safe. If we don't want to close and discard the connection right after an application lifecycle ends, then we need to keep track of reference count and have a separate timer, as well as code to ensure that we don't introduce a race condition between the timer expiring and a new application starting and bumping up the reference count for a connection. Similar concerns for re-using and/or re-establishing a connection in the face of failures (YARN-4061) will have to be safe. I can see this code start simple in intention and get complicated quickly in implementation, or be a source of subtle bugs. 
> [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
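The ref-counted, lazily initialized connection Joep describes can be sketched minimally as below. The class and method names are invented; a real version would hold an HBase `Connection`, and would close it on a delayed inactivity timer (to avoid churn on busy clusters) rather than immediately when the count hits zero.

```java
// Minimal sketch of a ref-counted, lazily created shared connection.
// RefCountedConnection is an invented name; Object stands in for an
// org.apache.hadoop.hbase.client.Connection.
public class RefCountedConnection {
  private Object connection;
  private int refCount;

  // Synchronized so concurrent app starts cannot double-create the connection.
  public synchronized Object acquire() {
    if (connection == null) {
      connection = new Object();  // the expensive createConnection() goes here
    }
    refCount++;
    return connection;
  }

  // Synchronized so a release racing with an acquire sees a consistent count.
  public synchronized void release() {
    if (--refCount == 0) {
      // Real code would schedule the close after an idle timeout, and cancel
      // that close if a new app acquires the connection in the meantime.
      connection = null;
    }
  }

  public synchronized boolean isOpen() { return connection != null; }
}
```

Keeping acquire/release under one lock is what avoids the race Joep warns about between the idle timer expiring and a new application bumping the count.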
[jira] [Commented] (YARN-2009) Priority support for preemption in ProportionalCapacityPreemptionPolicy
[ https://issues.apache.org/jira/browse/YARN-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592740#comment-15592740 ] Eric Payne commented on YARN-2009: -- Hi [~sunilg]. Here is a description of my test environment, the steps I executed, and the results I am seeing. I don't know why the unit test you described above is not catching this, but I will continue to investigate. In the meantime, can you please try the following and let me know what you discover? - ||Property Name||Property Value|| |monitoring_interval (ms)|1000| |max_wait_before_kill (ms)|500| |total_preemption_per_round|1.0| |max_ignored_over_capacity|0.2| |select_based_on_reserved_containers|true| |natural_termination_factor|2.0| |intra-queue-preemption.enabled|true| |intra-queue-preemption.minimum-threshold|0.5| |intra-queue-preemption.max-allowable-limit|0.1| {noformat:title=Cluster} Nodes: 3 Mem per node: 4 GB Total Cluster Size: 12 GB Container size: 0.5 GB {noformat} ||Queue||Guarantee||Max||Minimum user limit percent||User Limit Factor|| |root|100% (12 GB)|100% (12 GB)|N/A|N/A| |default|50% (6 GB)|100% (12 GB)|50% (2 users can run in queue simultaneously)|2.0 (one user can consume twice the queue's Guarantee)| |eng|50% (6 GB)|100% (12 GB)|50% (2 users can run in queue simultaneously)|2.0 (one user can consume twice the queue's Guarantee)| - {{user1}} starts {{app1}} at priority 1 in the {{default}} queue, and requests 30 mappers which want to run for 10 minutes each: -- Sleep job: {{-m 30 -mt 60}} -- Total requested resources are 15.5 GB: ((30 map containers * 0.5 GB per container) + 0.5 GB AM container) ||App Name||User Name||Priority||Used||Pending|| |app1|user1|1|0|15.5 GB| - The RM assigns {{app1}} 24 containers, consuming 12 GB (all cluster resources): -- {{(23 mappers * 0.5 GB) + 0.5 GB AM = 12 GB}} ||App Name||User Name||Priority||Used||Pending|| |app1|user1|1|12 GB|3.5 GB| - {{user2}} starts {{app2}} at priority 2 in the {{default}} queue, and requests 30
mappers which want to run for 10 minutes each: ||App Name||User Name||Priority||Used||Pending|| |app1|user1|1|12 GB|3.5 GB| |app2|user2|2|0|15.5 GB| - The intra-queue preemption monitor iterates over the containers for several {{monitoring_interval}}s and preempts 12 containers (6 GB of resources) - The RM assigns the preempted containers to {{app2}} ||App Name||User Name||Priority||Used||Pending|| |app1|user1|1|6 GB|3.5 GB| |app2|user2|2|6 GB|3.5 GB| - The intra-queue preemption monitor continues to preempt containers from {{app1}}. -- However, since the MULP for the {{default}} queue should be 6 GB, the RM gives the preempted containers back to {{app1}} -- This repeats indefinitely. > Priority support for preemption in ProportionalCapacityPreemptionPolicy > --- > > Key: YARN-2009 > URL: https://issues.apache.org/jira/browse/YARN-2009 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Devaraj K >Assignee: Sunil G > Attachments: YARN-2009.0001.patch, YARN-2009.0002.patch, > YARN-2009.0003.patch, YARN-2009.0004.patch, YARN-2009.0005.patch, > YARN-2009.0006.patch, YARN-2009.0007.patch, YARN-2009.0008.patch, > YARN-2009.0009.patch, YARN-2009.0010.patch, YARN-2009.0011.patch > > > While preempting containers based on the queue ideal assignment, we may need > to consider preempting the low priority application containers first.
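The oscillation Eric observes follows from the user-limit arithmetic. A deliberately simplified calculation (ignoring many CapacityScheduler details, and assuming a user's floor is simply used capacity times minimum-user-limit-percent) shows why the scheduler keeps giving containers back:

```java
// Simplified sketch, not the actual CapacityScheduler user-limit code.
// With the queue fully used and a minimum-user-limit-percent (MULP) of 0.5,
// each of two active users has a floor of used * MULP.
public class UserLimitSketch {
  public static double userFloorGB(double queueUsedGB, double mulp) {
    return queueUsedGB * mulp;
  }
}
```

With 12 GB used and a MULP of 0.5, each user's floor is 6 GB; preempting app1 below 6 GB is therefore immediately undone when the RM reassigns containers to honor the floor, matching the repeated preempt/reassign loop in the test.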
[jira] [Updated] (YARN-5715) introduce entity prefix for return and sort order
[ https://issues.apache.org/jira/browse/YARN-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-5715: Attachment: YARN-5715-YARN-5355.05.patch Updated patch with following changes # Added DEFAULT_ENTITY_PREFIX constant in TimelineEntity. # Modified the java doc as per Varun's review comment. > introduce entity prefix for return and sort order > - > > Key: YARN-5715 > URL: https://issues.apache.org/jira/browse/YARN-5715 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Rohith Sharma K S >Priority: Critical > Attachments: YARN-5715-YARN-5355.01.patch, > YARN-5715-YARN-5355.02.patch, YARN-5715-YARN-5355.03.patch, > YARN-5715-YARN-5355.04.patch, YARN-5715-YARN-5355.05.patch > > > While looking into YARN-5585, we have come across the need to provide a sort > order different than the current entity id order. The current entity id order > returns entities strictly in the lexicographical order, and as such it > returns the earliest entities first. This may not be the most natural return > order. A more natural return/sort order would be from the most recent > entities. > To solve this, we would like to add what we call the "entity prefix" in the > row key for the entity table. It is a number (long) that can be easily > provided by the client on write. In the row key, it would be added before the > entity id itself. > The entity prefix would be considered mandatory. On all writes (including > updates) the correct entity prefix should be set by the client so that the > correct row key is used. The entity prefix needs to be unique only within the > scope of the application and the entity type. > For queries that return a list of entities, the prefix values will be > returned along with the entity id's. Queries that specify the prefix and the > id should be returned quickly using the row key. 
If the query omits the > prefix but specifies the id (query by id), the query may be less efficient. > This JIRA should add the entity prefix to the entity API and add its handling > to the schema and the write path. The read path will be addressed in > YARN-5585.
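The row-key idea in YARN-5715 can be illustrated with a toy encoding: a client-supplied long "entity prefix" placed before the entity id. The real ATSv2 row key is binary, not a string; the string form below (an invented `EntityRowKey` class) only makes the ordering visible. Using `Long.MAX_VALUE - timestamp` as the prefix is one common way to make newer entities sort first.

```java
// Illustration-only row-key sketch; not the actual ATSv2 encoding.
public class EntityRowKey {
  // Zero-pad to 19 digits so lexicographic order matches numeric order
  // (assumes a non-negative prefix).
  public static String rowKey(long prefix, String entityId) {
    return String.format("%019d!%s", prefix, entityId);
  }

  // An inverted timestamp makes more recent entities sort earlier in a scan.
  public static long invertedPrefix(long timestampMs) {
    return Long.MAX_VALUE - timestampMs;
  }
}
```

Because the prefix precedes the id, a query that supplies both can seek directly to the row, while a query-by-id alone has to scan across prefixes, which is exactly the efficiency trade-off the issue describes.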
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592721#comment-15592721 ] Joep Rottinghuis commented on YARN-5760: If we lazily initialize the connection then that code should be done thread-safe. If we don't want to close and discard the connection right after an application lifecycle ends, then we need to keep track of reference count and have a separate timer, as well as code to ensure that we don't introduce a race condition between the timer expiring and a new application starting and bumping up the reference count for a connection. Similar concerns for re-using and/or re-establishing a connection in the face of failures (YARN-4061) will have to be safe. I can see this code start simple in intention and get complicated quickly in implementation, or be a source of subtle bugs. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5686) DefaultContainerExecutor random working dir algorithm skews results
[ https://issues.apache.org/jira/browse/YARN-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-5686: - Attachment: YARN-5686.002.patch Uploading patch 002 that reorders the test case data and fixes the checkstyle comment. > DefaultContainerExecutor random working dir algorithm skews results > > > Key: YARN-5686 > URL: https://issues.apache.org/jira/browse/YARN-5686 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Vrushali C >Priority: Minor > Attachments: YARN-5686.001.patch, YARN-5686.002.patch > > > {code} > long randomPosition = RandomUtils.nextLong() % totalAvailable; > ... > while (randomPosition > availableOnDisk[dir]) { > randomPosition -= availableOnDisk[dir++]; > } > {code} > The code above selects a disk based on the random number weighted by the free > space on each disk respectively. For example, if I have two disks with 100 > bytes each, totalAvailable is 200. The value of randomPosition will be > 0..199. 0..99 should select the first disk, 100..199 should select the second > disk inclusively. Random number 100 should select the second disk to be fair > but this is not the case right now. > We need to use > {code} > while (randomPosition >= availableOnDisk[dir]) > {code} > instead of > {code} > while (randomPosition > availableOnDisk[dir]) > {code}
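The boundary fix in the description can be checked with a few concrete values. This is a standalone sketch of the corrected selection loop (not the actual DefaultContainerExecutor code): with two disks of 100 bytes each, the `>=` comparison maps positions 0..99 to disk 0 and 100..199 to disk 1, so each disk is chosen in proportion to its free space.

```java
// Standalone sketch of the corrected weighted disk selection from YARN-5686.
public class WeightedDiskSelector {
  // randomPosition is assumed uniform in [0, sum(availableOnDisk)).
  public static int pick(long[] availableOnDisk, long randomPosition) {
    int dir = 0;
    // The fix: >= instead of >, so the boundary position falls to the next
    // disk. With > the first disk would also absorb position 100, skewing
    // the distribution toward earlier disks.
    while (randomPosition >= availableOnDisk[dir]) {
      randomPosition -= availableOnDisk[dir++];
    }
    return dir;
  }
}
```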
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592693#comment-15592693 ] Vrushali C commented on YARN-5760: -- Also, a related thought: we could use node labels to restrict AMs to only certain nodes. That way we don't have too many nodes connecting to hbase/zk. That may be helpful if there are very frequently created, short-lived AMs. We could add it as a suggestion in the docs for people who might want to use it. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592688#comment-15592688 ] Varun Saxena commented on YARN-5760: Probably we can be a little conservative with teardowns. Tearing it down after an interval of inactivity sounds like a good idea. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592684#comment-15592684 ] Vrushali C commented on YARN-5760: -- Yes, opening a connection (checking for an open connection) the first time an app shows up for publishing may be a good idea. I am thinking about how/when to close the connection. Do we want to close it after an interval of inactivity on the NM, or right after an app on that NM ends? On a busy cluster, closing right after each app ends might mean too many build-ups and teardowns. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5760: --- Issue Type: Sub-task (was: Bug) Parent: YARN-5355 > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5760: --- Summary: [ATSv2] Create HBase connection only if an app collector is publishing from NM (was: [ATSv2] Create HBase connection only if an app is publishing from NM) > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5760) [ATSv2] Create HBase connection only if an app is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5760: --- Summary: [ATSv2] Create HBase connection only if an app is publishing from NM (was: [ATSv2] Create HBase connection only if an application is publishing from NM) > [ATSv2] Create HBase connection only if an app is publishing from NM > > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5760) [ATSv2] Create HBase connection only if an application is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5760: --- Description: Irrespective of NM handling an app or not, we initialize HBaseTimelineWriterImpl in TimelineCollectorManager. This in turn calls ConnectionFactory#createConnection to manage connections with HBase. But it seems this opens up a connection with Zookeeper (i.e. as soon as NM starts up) instead of opening a connection when at least one app arrives for publishing and closing it if no apps are being published from this NM. This leads to unnecessary connections to Zookeeper. > [ATSv2] Create HBase connection only if an application is publishing from NM > > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Created] (YARN-5760) [ATSv2] Create HBase connection only if an application is publishing from NM
Varun Saxena created YARN-5760: -- Summary: [ATSv2] Create HBase connection only if an application is publishing from NM Key: YARN-5760 URL: https://issues.apache.org/jira/browse/YARN-5760 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Varun Saxena Assignee: Varun Saxena
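The behavior the ticket asks for — open the HBase/Zookeeper connection only once at least one app is publishing, and close it when no apps remain — can be sketched as a small reference-counted manager. Everything below is illustrative: `LazyWriterManager` and its methods are hypothetical names, not the actual TimelineCollectorManager or HBaseTimelineWriterImpl API.

```java
// Hypothetical sketch of the proposed fix: reference-count publishing apps
// and open the backing connection lazily. A plain Object stands in for an
// HBase Connection obtained via ConnectionFactory#createConnection.
class LazyWriterManager {
    private Object connection; // stand-in for an HBase Connection
    private int activeApps;

    // Open the backing connection only when the first app starts publishing.
    synchronized void appStarted() {
        if (activeApps == 0 && connection == null) {
            connection = openConnection();
        }
        activeApps++;
    }

    // Close the connection once no apps are publishing from this NM.
    synchronized void appFinished() {
        if (activeApps > 0) {
            activeApps--;
        }
        if (activeApps == 0 && connection != null) {
            closeConnection();
            connection = null;
        }
    }

    synchronized boolean isConnected() {
        return connection != null;
    }

    private Object openConnection() { return new Object(); }
    private void closeConnection() { /* release ZK/HBase resources */ }
}
```

With this shape, an idle NM holds no Zookeeper connection at all; the first `appStarted()` pays the connection cost and the last `appFinished()` releases it.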
[jira] [Commented] (YARN-5742) Serve aggregated logs of historical apps from timeline service
[ https://issues.apache.org/jira/browse/YARN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592635#comment-15592635 ] Joep Rottinghuis commented on YARN-5742: We should consider really carefully whether serving YARN application logs is a timeline service concern. I'd argue that it belongs in a separate YARN service. It would be perfectly acceptable to have that separate service store metadata (about the current log location of tasks, whether that is on local disk, or aggregated to a location such as HDFS) in the timeline service, but I think the serving itself doesn't belong here. Providing an API that can read files from HDFS and stream them out would open up security concerns and would duplicate the efforts of services such as WebHDFS/HttpFS. The HttpFS approach of a central pool of nodes serving data from HDFS has been superseded by a distributed WebHDFS approach. Note, by the way, that WebHDFS as it stands today still has some compatibility challenges with HDFS federation. Both of these general approaches to serving HDFS data have to deal with proxying user requests and correctly limiting visibility of HDFS files to the users with the appropriate access. > Serve aggregated logs of historical apps from timeline service > -- > > Key: YARN-5742 > URL: https://issues.apache.org/jira/browse/YARN-5742 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Rohith Sharma K S > Attachments: YARN-5742-POC-v0.patch > >
[jira] [Commented] (YARN-5754) Variable earliest missing null check in computeShares() in FifoPolicy.java
[ https://issues.apache.org/jira/browse/YARN-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592617#comment-15592617 ] Yufei Gu commented on YARN-5754: Additional unit tests are not necessary. > Variable earliest missing null check in computeShares() in FifoPolicy.java > -- > > Key: YARN-5754 > URL: https://issues.apache.org/jira/browse/YARN-5754 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 3.0.0-alpha1 >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5754.001.patch > >
[jira] [Commented] (YARN-4061) [Fault tolerance] Fault tolerant writer for timeline v2
[ https://issues.apache.org/jira/browse/YARN-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592611#comment-15592611 ] Joep Rottinghuis commented on YARN-4061: You do bring up an interesting question [~gtCarrera9], and that is what happens if the timeline collector / writer is down. In the current implementation this would occur when the nodemanager is down (is restarted). Once collectors become dedicated / separate per-application containers, something similar can happen. The clients will time out and will have to do retries. I think the concern you indicated here is what happens to data buffered in memory in the collector before it is written to either HBase or even spooled to disk (or HDFS). Even in the HDFS case there will be buffering. The current TimelineWriter interface covers this by assuming that all writes are buffered and providing an explicit flush call to flush all previously buffered data to permanent storage. For the spooling case to HDFS that would mean we'd have to do an hsync/flush there as well. This jira is mainly focused on what happens if we cannot persist data to the distributed back-end system (HBase in the current implementation). > [Fault tolerance] Fault tolerant writer for timeline v2 > --- > > Key: YARN-4061 > URL: https://issues.apache.org/jira/browse/YARN-4061 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Joep Rottinghuis > Labels: YARN-5355 > Attachments: FaulttolerantwriterforTimelinev2.pdf > > > We need to build a timeline writer that can be resistant to backend storage > down time and timeline collector failures.
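The buffered-write contract described in the comment above — writes may sit in memory until an explicit flush pushes them to permanent storage — can be sketched as follows. The class and method names are hypothetical stand-ins, not the real TimelineWriter interface, and in-memory lists stand in for HBase/HDFS.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a buffered writer with an explicit flush():
// write() may only buffer, so durability is guaranteed only after flush().
class BufferedWriterSketch {
    private final List<String> buffer = new ArrayList<>();    // in-memory, lost on crash
    private final List<String> persisted = new ArrayList<>(); // stand-in for HBase/HDFS

    // Buffered write: nothing durable happens yet.
    void write(String entity) {
        buffer.add(entity);
    }

    // Explicit flush: push all previously buffered data to permanent storage.
    void flush() {
        persisted.addAll(buffer);
        buffer.clear();
    }

    int persistedCount() {
        return persisted.size();
    }
}
```

The fault-tolerance question the jira raises lives exactly in the window between `write()` and `flush()`: anything still in `buffer` when the collector dies is lost unless it is spooled somewhere durable first.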
[jira] [Commented] (YARN-5734) OrgQueue for easy CapacityScheduler queue configuration management
[ https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592596#comment-15592596 ] Jonathan Hung commented on YARN-5734: - [~rémy], glad to hear this is useful for your company. With this enabled, {{refreshQueue}} will no longer use the configuration from {{capacity-scheduler.xml}} as the latest conf, since calling capacity scheduler's reinitialize will load the capacity scheduler configuration from the backing store (e.g. derby database). The intent behind {{reset}} is to clear the configuration from the DB and load it from the xml file. > OrgQueue for easy CapacityScheduler queue configuration management > -- > > Key: YARN-5734 > URL: https://issues.apache.org/jira/browse/YARN-5734 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Min Shen >Assignee: Min Shen > Attachments: OrgQueue_Design_v0.pdf > > > The current xml based configuration mechanism in CapacityScheduler makes it > very inconvenient to apply any changes to the queue configurations. We saw 2 > main drawbacks in the file based configuration mechanism: > # This makes it very inconvenient to automate queue configuration updates. > For example, in our cluster setup, we leverage the queue mapping feature from > YARN-2411 to route users to their dedicated organization queues. It could be > extremely cumbersome to keep updating the config file to manage the very > dynamic mapping between users to organizations. > # Even a user has the admin permission on one specific queue, that user is > unable to make any queue configuration changes to resize the subqueues, > changing queue ACLs, or creating new queues. All these operations need to be > performed in a centralized manner by the cluster administrators. > With these current limitations, we realized the need of a more flexible > configuration mechanism that allows queue configurations to be stored and > managed more dynamically. 
We developed the feature internally at LinkedIn, > which introduces the concept of MutableConfigurationProvider. What it > essentially does is provide a set of configuration mutation APIs that > allows queue configurations to be updated externally via a set of REST APIs. > When performing queue configuration changes, the queue ACLs will be > honored, which means only queue administrators can make configuration changes > to a given queue. MutableConfigurationProvider is implemented as a pluggable > interface, and we have one implementation of this interface which is based on > the Derby embedded database. > This feature has been deployed on LinkedIn's Hadoop cluster for a year now, > and has gone through several iterations of gathering feedback from users > and improving accordingly. With this feature, cluster administrators are able > to automate many of the queue configuration management tasks, such as setting > the queue capacities to adjust cluster resources between queues based on > established resource consumption patterns, or managing updates to the > user-to-queue mappings. We have attached our design documentation to this ticket > and would like to receive feedback from the community regarding how to best > integrate it with the latest version of YARN.
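A rough sketch of what a pluggable mutable-configuration provider could look like, with an in-memory implementation standing in for the Derby-backed one. The interface name, method names, and property key below are all illustrative assumptions, not the actual LinkedIn code.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of a pluggable mutable-configuration provider; the real
// feature backs this with an embedded Derby database and enforces queue
// ACLs on every mutation before it is accepted.
interface MutableConfProvider {
    String get(String key);

    // Apply a batch of key/value updates (e.g. from a REST call).
    void mutate(Map<String, String> updates);
}

class InMemoryConfProvider implements MutableConfProvider {
    private final Map<String, String> store = new HashMap<>();

    @Override
    public String get(String key) {
        return store.get(key);
    }

    @Override
    public void mutate(Map<String, String> updates) {
        store.putAll(updates);
    }
}
```

Because the provider is an interface, the scheduler's reinitialize path can load its configuration from whichever backing store is plugged in, rather than re-reading capacity-scheduler.xml.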
[jira] [Created] (YARN-5759) Capability to register for a notification/callback on the expiry of timeouts for an application
Gour Saha created YARN-5759: --- Summary: Capability to register for a notification/callback on the expiry of timeouts for an application Key: YARN-5759 URL: https://issues.apache.org/jira/browse/YARN-5759 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Gour Saha There is a need for the YARN native services REST-API service, to take certain actions once a timeout of an application expires. For example, an immediate requirement is to destroy a Slider application, once its lifetime timeout expires and YARN has stopped the application. Destroying a Slider application means cleanup of Slider HDFS state store and ZK paths for that application. Potentially, there will be advanced requirements from the REST-API service and other services in the future, which will make this feature very handy.
[jira] [Updated] (YARN-5759) Capability to register for a notification/callback on the expiry of timeouts
[ https://issues.apache.org/jira/browse/YARN-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gour Saha updated YARN-5759: Summary: Capability to register for a notification/callback on the expiry of timeouts (was: Capability to register for a notification/callback on the expiry of timeouts for an application) > Capability to register for a notification/callback on the expiry of timeouts > > > Key: YARN-5759 > URL: https://issues.apache.org/jira/browse/YARN-5759 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Gour Saha > > There is a need for the YARN native services REST-API service, to take > certain actions once a timeout of an application expires. For example, an > immediate requirement is to destroy a Slider application, once its lifetime > timeout expires and YARN has stopped the application. Destroying a Slider > application means cleanup of Slider HDFS state store and ZK paths for that > application. > Potentially, there will be advanced requirements from the REST-API service > and other services in the future, which will make this feature very handy.
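A minimal sketch of the requested notification/callback registration. The listener interface and class names here are hypothetical — designing the actual RM-side API is precisely what this ticket asks for.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: callers register a callback that fires when an
// application's lifetime timeout expires and the app has been stopped.
class TimeoutNotifier {
    interface ExpiryListener {
        void onExpiry(String applicationId);
    }

    private final List<ExpiryListener> listeners = new ArrayList<>();

    void register(ExpiryListener listener) {
        listeners.add(listener);
    }

    // Invoked after the timeout has expired and YARN has stopped the app;
    // e.g. a Slider service could clean up its HDFS state store and ZK
    // paths for the application from its callback.
    void fireExpiry(String applicationId) {
        for (ExpiryListener listener : listeners) {
            listener.onExpiry(applicationId);
        }
    }
}
```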
[jira] [Commented] (YARN-3649) Allow configurable prefix for hbase table names (like prod, exp, test etc)
[ https://issues.apache.org/jira/browse/YARN-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592560#comment-15592560 ] Hadoop QA commented on YARN-3649: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 13s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 52s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 55s {color} | {color:green} YARN-5355 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s {color} | {color:green} YARN-5355 
passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s {color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 209 unchanged - 1 fixed = 209 total (was 210) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 32s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 7s {color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 37m 41s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834462/YARN-3649-YARN-5355.004.patch | | JIRA Issue | YARN-3649 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux
[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592554#comment-15592554 ] Kuhu Shukla commented on YARN-5356: --- I think that if we do not have the values from the plugin we could initialize the physicalResource to totalNodeResource (from conf values) instead of zero. Maybe something like:
{code}
int physicalMemoryMb;
int physicalCores;
if (rcp != null) {
  physicalMemoryMb = (int) rcp.getPhysicalMemorySize() / (1024 * 1024);
  physicalCores = rcp.getNumProcessors();
} else {
  physicalMemoryMb = conf.getInt(
      YarnConfiguration.NM_PMEM_MB,
      YarnConfiguration.DEFAULT_NM_PMEM_MB)
      + conf.getInt(
          YarnConfiguration.NM_SYSTEM_RESERVED_PMEM_MB, 0);
  ..
}
{code}
> NodeManager should communicate physical resource capability to ResourceManager > -- > > Key: YARN-5356 > URL: https://issues.apache.org/jira/browse/YARN-5356 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Nathan Roberts >Assignee: Inigo Goiri > Attachments: YARN-5356.000.patch, YARN-5356.001.patch, > YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, > YARN-5356.004.patch, YARN-5356.005.patch > > > Currently ResourceUtilization contains absolute quantities of resource used > (e.g. 4096MB memory used). It would be good if the NM also communicated the > actual physical resource capabilities of the node so that the RM can use this > data to schedule more effectively (overcommit, etc.) > Currently the only available information is the Resource the node registered > with (or later updated using updateNodeResource). However, these aren't > really sufficient to get a good view of how utilized a resource is. For > example, if a node reports 400% CPU utilization, does that mean it's > completely full, or barely utilized? Today there is no reliable way to figure > this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you > have thoughts/opinions on this?
[jira] [Commented] (YARN-5575) Many classes use bare yarn. properties instead of the defined constants
[ https://issues.apache.org/jira/browse/YARN-5575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592515#comment-15592515 ] Miklos Szegedi commented on YARN-5575: -- +1 (non-binding) The change looks good to me. Thank you, [~templedf]! > Many classes use bare yarn. properties instead of the defined constants > --- > > Key: YARN-5575 > URL: https://issues.apache.org/jira/browse/YARN-5575 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: YARN-5575.001.patch, YARN-5575.002.patch, > YARN-5575.003.patch > > > MAPREDUCE-5870 introduced the following line: > {code} > conf.setInt("yarn.cluster.max-application-priority", 10); > {code} > It should instead be: > {code} > conf.setInt(YarnConfiguration.MAX_CLUSTER_LEVEL_APPLICATION_PRIORITY, > 10); > {code}