[jira] [Commented] (YARN-8351) RM is flooded with node attributes manager logs
[ https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488448#comment-16488448 ]

genericqa commented on YARN-8351:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 26s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 44s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 43s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 68m 44s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}126m 34s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8351 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924872/YARN-8351-YARN-3409.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 1ce6bd5d826b 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | YARN-3409 / 4cf0d40 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20851/testReport/ |
| Max. process+thread count | 827 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20851/console |
|
[jira] [Commented] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"
[ https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488439#comment-16488439 ]

Rohith Sharma K S commented on YARN-8346:
-

Thanks [~jlowe] for the quick turnaround. I verified the patch on a cluster and it works as expected. I am +1 for the patch.

> Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"
>
>
> Key: YARN-8346
> URL: https://issues.apache.org/jira/browse/YARN-8346
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 3.1.0, 3.0.2
> Reporter: Rohith Sharma K S
> Assignee: Jason Lowe
> Priority: Blocker
> Attachments: YARN-8346.001.patch
>
>
> During a rolling upgrade from 2.8.4 to the 3.1 release, all running containers are killed and a second attempt is launched for the application. The diagnostics message "Opportunistic container queue is full" is given as the reason the containers were killed.
> In the NM log, I see the following after a container is recovered:
> {noformat}
> 2018-05-23 17:18:50,655 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: Opportunistic container [container_e06_1527075664705_0001_01_01] will not be queued at the NMsince max queue length [0] has been reached
> {noformat}
> The following steps were executed for the rolling upgrade:
> # Install a 2.8.4 cluster and launch an MR job with distributed cache enabled.
> # Stop the 2.8.4 RM. Start the 3.1.0 RM with the same configuration.
> # Stop the 2.8.4 NMs batch by batch. Start the 3.1.0 NMs batch by batch.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
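The NM log line quoted in this issue reports a max queue length of [0], which means any container routed to the opportunistic queue is rejected immediately. A minimal sketch of that admission check follows, as a toy model only: `OpportunisticQueueSketch` and `tryEnqueue` are hypothetical names, not the NodeManager's ContainerScheduler code, and the sketch does not explain why a recovered container reached the opportunistic path, which the thread does not state.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a bounded opportunistic-container queue.
// All names here are hypothetical and exist only for illustration.
final class OpportunisticQueueSketch {
    private final int maxQueueLength;
    private final Deque<String> queued = new ArrayDeque<>();

    OpportunisticQueueSketch(int maxQueueLength) {
        this.maxQueueLength = maxQueueLength;
    }

    // Returns true if the container was queued, false if the queue
    // already holds maxQueueLength containers.
    boolean tryEnqueue(String containerId) {
        if (queued.size() >= maxQueueLength) {
            return false;
        }
        queued.addLast(containerId);
        return true;
    }

    int size() {
        return queued.size();
    }

    public static void main(String[] args) {
        // With a max length of 0, the very first container is rejected.
        OpportunisticQueueSketch queue = new OpportunisticQueueSketch(0);
        System.out.println("queued: " + queue.tryEnqueue("container_1"));
    }
}
```

With a limit of 0 the size check succeeds on an empty queue, so the rejection does not depend on how many containers are actually running.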
[jira] [Commented] (YARN-8351) RM is flooded with node attributes manager logs
[ https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488371#comment-16488371 ]

Weiwei Yang commented on YARN-8351:
---

Thanks [~sunilg] for the quick response!

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8351-YARN-3409.001.patch, YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes on every NM heartbeat (HB) interval, and each time it writes a log line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.
[jira] [Commented] (YARN-8351) RM is flooded with node attributes manager logs
[ https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488368#comment-16488368 ]

Sunil Govindan commented on YARN-8351:
--

Patch looks straightforward. Committing shortly, pending Jenkins.

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8351-YARN-3409.001.patch, YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes on every NM heartbeat (HB) interval, and each time it writes a log line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.
[jira] [Updated] (YARN-8352) AM should retry on a different node after the previous application attempt fail
[ https://issues.apache.org/jira/browse/YARN-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhizhen Hou updated YARN-8352:
--
    Description: I submitted a job to YARN, and both times the AM was allocated on the same node. After the first allocate call to the scheduler, subsequent calls should include the node blacklist, but the blacklist is currently always null.  (was: I submitted a job to YARN, and both times the AM was allocated on the same node. After the first allocate call to the scheduler, subsequent calls should include the node blacklist, but the blacklist is currently a constant null.)

> AM should retry on a different node after the previous application attempt fail
> ---
>
> Key: YARN-8352
> URL: https://issues.apache.org/jira/browse/YARN-8352
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Affects Versions: 2.7.5
> Reporter: Zhizhen Hou
> Priority: Major
>
>
> I submitted a job to YARN, and both times the AM was allocated on the same node. After the first allocate call to the scheduler, subsequent calls should include the node blacklist, but the blacklist is currently always null.
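The behaviour requested in this issue, placing the next AM attempt on a different node, can be sketched as a blacklist that is filled when an attempt fails and consulted on the next placement. This is only an illustration under stated assumptions: `AmBlacklistSketch`, `attemptFailedOn` and `pickNodeForNextAttempt` are hypothetical names, and nothing here reflects the ResourceManager's actual allocate call or data structures.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Toy model: remember nodes where an AM attempt failed and skip them
// when choosing a node for the next attempt. Names are hypothetical.
final class AmBlacklistSketch {
    private final Set<String> blacklistedNodes = new HashSet<>();

    // Record the node on which an AM attempt failed.
    void attemptFailedOn(String node) {
        blacklistedNodes.add(node);
    }

    // Return the first candidate that is not blacklisted, if any.
    Optional<String> pickNodeForNextAttempt(List<String> candidates) {
        return candidates.stream()
                .filter(node -> !blacklistedNodes.contains(node))
                .findFirst();
    }

    public static void main(String[] args) {
        AmBlacklistSketch sketch = new AmBlacklistSketch();
        sketch.attemptFailedOn("node1");
        List<String> candidates = Arrays.asList("node1", "node2");
        System.out.println(sketch.pickNodeForNextAttempt(candidates));
    }
}
```

If the blacklist were always empty, as the report describes, the first candidate would be chosen again on every attempt.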
[jira] [Updated] (YARN-8351) RM is flooded with node attributes manager logs
[ https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8351:
--
    Attachment: YARN-8351-YARN-3409.001.patch

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8351-YARN-3409.001.patch, YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes on every NM heartbeat (HB) interval, and each time it writes a log line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.
[jira] [Commented] (YARN-8351) RM is flooded with node attributes manager logs
[ https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488362#comment-16488362 ]

genericqa commented on YARN-8351:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} YARN-8351 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8351 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924871/YARN-8351.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20850/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes on every NM heartbeat (HB) interval, and each time it writes a log line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.
[jira] [Updated] (YARN-8351) RM is flooded with node attributes manager logs
[ https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weiwei Yang updated YARN-8351:
--
    Attachment: YARN-8351.001.patch

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Weiwei Yang
> Assignee: Weiwei Yang
> Priority: Major
> Attachments: YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes on every NM heartbeat (HB) interval, and each time it writes a log line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.
[jira] [Created] (YARN-8352) AM should retry on a different node after the previous application attempt fail
Zhizhen Hou created YARN-8352:
-

Summary: AM should retry on a different node after the previous application attempt fail
Key: YARN-8352
URL: https://issues.apache.org/jira/browse/YARN-8352
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Affects Versions: 2.7.5
Reporter: Zhizhen Hou

I submitted a job to YARN, and both times the AM was allocated on the same node. After the first allocate call to the scheduler, subsequent calls should include the node blacklist, but the blacklist is currently a constant null.
[jira] [Created] (YARN-8351) RM is flooded with node attributes manager logs
Weiwei Yang created YARN-8351:
-

Summary: RM is flooded with node attributes manager logs
Key: YARN-8351
URL: https://issues.apache.org/jira/browse/YARN-8351
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Weiwei Yang
Assignee: Weiwei Yang

When distributed node attributes are enabled, the RM updates these attributes on every NM heartbeat (HB) interval, and each time it writes a log line like
{noformat}
REPLACE attributes on nodes: NM="xxx", attributes=""
{noformat}
This should be logged at DEBUG level.
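The request in this issue is to move a per-heartbeat message from INFO to DEBUG. A minimal sketch of that pattern follows, assuming java.util.logging so the example stays self-contained; Hadoop uses a different logging facade, and `Level.FINE` stands in for DEBUG here. `AttributeLogSketch` and its methods are hypothetical names, and this is not the attached patch.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of logging a high-frequency message at a debug-like level.
// java.util.logging is an assumption made to keep the example standalone.
final class AttributeLogSketch {
    private static final Logger LOG = Logger.getLogger("AttributeLogSketch");

    // Builds a message in the shape quoted in the issue description.
    static String message(String node, String attributes) {
        return "REPLACE attributes on nodes: NM=\"" + node
                + "\", attributes=\"" + attributes + "\"";
    }

    // Logs at FINE (debug-like) and reports whether that level is enabled.
    // The guard avoids building the string when the level is disabled.
    static boolean logReplace(String node, String attributes) {
        boolean enabled = LOG.isLoggable(Level.FINE);
        if (enabled) {
            LOG.fine(message(node, attributes));
        }
        return enabled;
    }

    public static void main(String[] args) {
        // Under the default configuration FINE is usually disabled,
        // so a heartbeat-driven call adds nothing to the log.
        System.out.println("debug enabled: " + logReplace("xxx", ""));
    }
}
```

The level guard matters on a heartbeat path: when debug logging is off, the call costs one level check and no string concatenation.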
[jira] [Commented] (YARN-5764) NUMA awareness support for launching containers
[ https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488321#comment-16488321 ]

Weiwei Yang commented on YARN-5764:
---

Hi [~devaraj.k], [~miklos.szeg...@cloudera.com], could you update the fix version for this JIRA?

> NUMA awareness support for launching containers
> ---
>
> Key: YARN-5764
> URL: https://issues.apache.org/jira/browse/YARN-5764
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: nodemanager, yarn
> Reporter: Olasoji
> Assignee: Devaraj K
> Priority: Major
> Attachments: NUMA Awareness for YARN Containers.pdf, NUMA Performance Results.pdf, YARN-5764-v0.patch, YARN-5764-v1.patch, YARN-5764-v10.patch, YARN-5764-v11.patch, YARN-5764-v2.patch, YARN-5764-v3.patch, YARN-5764-v4.patch, YARN-5764-v5.patch, YARN-5764-v6.patch, YARN-5764-v7.patch, YARN-5764-v8.patch, YARN-5764-v9.patch
>
>
> The purpose of this feature is to improve Hadoop performance by minimizing costly remote memory accesses on non-SMP systems. YARN containers, on launch, will be pinned to a specific NUMA node, and all subsequent memory allocations will be served by the same node, reducing remote memory accesses. The current default behavior is to spread memory across all NUMA nodes.
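One common way to pin a process and its memory to a single NUMA node on Linux is to launch it through the `numactl` utility with `--cpunodebind` and `--membind`. The sketch below builds such a command line; it assumes `numactl` is the mechanism, which the issue text quoted here does not state, and `NumaPinSketch.wrap` is a hypothetical helper rather than NodeManager code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Builds a launch command prefixed with numactl so that CPU and memory
// are both bound to one NUMA node. Helper name is hypothetical.
final class NumaPinSketch {
    static List<String> wrap(int numaNode, List<String> launchCommand) {
        List<String> command = new ArrayList<>();
        command.add("numactl");
        command.add("--cpunodebind=" + numaNode); // run only on this node's CPUs
        command.add("--membind=" + numaNode);     // allocate only from this node's memory
        command.addAll(launchCommand);
        return command;
    }

    public static void main(String[] args) {
        // Prints the argument list; it does not execute anything.
        System.out.println(wrap(1, Arrays.asList("java", "-version")));
    }
}
```

Binding both CPU and memory to the same node is what removes the remote accesses described in the issue; binding only one of the two would leave some traffic crossing nodes.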
[jira] [Commented] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services
[ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488320#comment-16488320 ]

Eric Yang commented on YARN-7530:
-

+1 for the branch-3.1 change.

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn-native-services
> Affects Versions: 3.1.0
> Reporter: Eric Yang
> Assignee: Chandni Singh
> Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530-branch-3.1.001.patch, YARN-7530.001.patch, YARN-7530.002.patch
>
>
> hadoop-yarn-services-api is currently a parallel project to the hadoop-yarn-services project. It would be better if hadoop-yarn-services-api were part of hadoop-yarn-services for correctness.
[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488291#comment-16488291 ]

genericqa commented on YARN-6677:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 43s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 23s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 17 new + 145 unchanged - 0 fixed = 162 total (was 145) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 36m 25s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 96m 25s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-6677 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924847/YARN-6677.00.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ff06c4489b33 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d996479 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/20848/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit |
[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records
[ https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488283#comment-16488283 ]

genericqa commented on YARN-8333:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 11s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry: The patch generated 7 new + 8 unchanged - 0 fixed = 15 total (was 8) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 9m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s{color} | {color:green} hadoop-yarn-registry in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 35s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8333 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924854/YARN-8333.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux a01231175909 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d996479 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/20849/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-registry.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20849/testReport/ |
| Max. process+thread count | 442 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry |
| Console output |
[jira] [Updated] (YARN-8350) NPE in service AM related to placement policy
[ https://issues.apache.org/jira/browse/YARN-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billie Rinaldi updated YARN-8350:
-
    Description:
It seems like this NPE is happening in a service with more than one component when one component has a placement policy and the other does not. It causes the AM to crash.
{noformat}
java.lang.NullPointerException
    at org.apache.hadoop.yarn.service.component.Component.requestContainers(Component.java:644)
    at org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:310)
    at org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:293)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
    at org.apache.hadoop.yarn.service.component.Component.handle(Component.java:919)
    at org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:344)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.yarn.service.ServiceMaster.lambda$serviceStart$0(ServiceMaster.java:253)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
    at org.apache.hadoop.yarn.service.ServiceMaster.serviceStart(ServiceMaster.java:251)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:317)
{noformat}

  was:
It seems like this NPE is happening in a service with more than one component when one component has a placement policy and the other does not. It causes the AM to crash. See https://github.com/hortonworks/hadoop/blob/3c66d40e26bc2d0e17a6e1869201021a8c2f6df1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/Component.java
{noformat}
java.lang.NullPointerException
    at org.apache.hadoop.yarn.service.component.Component.requestContainers(Component.java:644)
    at org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:310)
    at org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:293)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
    at org.apache.hadoop.yarn.service.component.Component.handle(Component.java:919)
    at org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:344)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.yarn.service.ServiceMaster.lambda$serviceStart$0(ServiceMaster.java:253)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
    at org.apache.hadoop.yarn.service.ServiceMaster.serviceStart(ServiceMaster.java:251)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:317)
{noformat}

> NPE in service AM related to placement policy
> -
>
> Key: YARN-8350
> URL: https://issues.apache.org/jira/browse/YARN-8350
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Billie Rinaldi
> Assignee: Gour Saha
> Priority: Critical
>
>
> It seems like this NPE is happening in a service with more than one component when one component has a placement policy and the other does not. It causes the AM to
[jira] [Commented] (YARN-4677) RMNodeResourceUpdateEvent update from scheduler can lead to race condition
[ https://issues.apache.org/jira/browse/YARN-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488271#comment-16488271 ] Robert Kanter commented on YARN-4677: - Thanks [~wilfreds] for the trunk patch and [~gphillips] for the branch-2 patch. The trunk patch looks fine, but a couple of things on the branch-2 patch: # Instead of calling {{getSchedulerNode}} and {{getNode}} again later on in {{nodeUpdate}}, we should simply use the {{schedulerNode}} we're now getting. # The comment about the TODO can be removed now. > RMNodeResourceUpdateEvent update from scheduler can lead to race condition > -- > > Key: YARN-4677 > URL: https://issues.apache.org/jira/browse/YARN-4677 > Project: Hadoop YARN > Issue Type: Sub-task > Components: graceful, resourcemanager, scheduler >Affects Versions: 2.7.1 >Reporter: Brook Zhou >Assignee: Wilfred Spiegelenburg >Priority: Major > Attachments: YARN-4677-branch-2.001.patch, > YARN-4677-branch-2.002.patch, YARN-4677.01.patch > > > When a node is in the decommissioning state, there is a time window between > completedContainer() and the RMNodeResourceUpdateEvent being handled in > scheduler.nodeUpdate (YARN-3223). > So if a scheduling attempt happens within this window, a new container could > still be allocated on this node. An even worse case is when a scheduling attempt > happens after the RMNodeResourceUpdateEvent is sent out but before it is propagated > to the SchedulerNode: the total resource is then lower than the used resource and > the available resource is negative. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
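The window described in the YARN-4677 report can be replayed with plain integers. This is only a minimal sketch: NodeView is a hypothetical stand-in and not the real SchedulerNode class, the memory sizes are made up, and it assumes the update event shrinks the node's total resource to whatever was in use when the event was created.

```java
final class NodeResourceRaceSketch {
    // Hypothetical, simplified view of a node as the scheduler sees it.
    static final class NodeView {
        int totalMb;
        int usedMb;

        NodeView(int totalMb, int usedMb) {
            this.totalMb = totalMb;
            this.usedMb = usedMb;
        }

        int availableMb() {
            return totalMb - usedMb;
        }

        // Allocation succeeds whenever the scheduler still sees enough headroom.
        boolean tryAllocate(int mb) {
            if (availableMb() < mb) {
                return false;
            }
            usedMb += mb;
            return true;
        }
    }

    // Replays the interleaving from the report and returns the final available memory.
    static int replayRace() {
        NodeView node = new NodeView(16384, 8192);

        // 1. A container completes on the decommissioning node.
        node.usedMb -= 2048;                 // used = 6144

        // 2. The update event is created (assumed: total := used) but not yet applied.
        int pendingTotalMb = node.usedMb;    // 6144

        // 3. A scheduling attempt lands inside the window; the node still shows 16384 total.
        node.tryAllocate(2048);              // used = 8192

        // 4. The event finally reaches the scheduler's view of the node.
        node.totalMb = pendingTotalMb;       // total = 6144

        return node.availableMb();           // 6144 - 8192
    }
}
```

With these made-up numbers, replayRace() ends with total below used, which is the negative available resource the description warns about.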
[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488266#comment-16488266 ] genericqa commented on YARN-8292: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 42s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 37s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 37s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 6 new + 97 unchanged - 0 fixed = 103 total (was 97) {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 39s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 41s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 17s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 36s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 90m 10s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8292 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924843/YARN-8292.008.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 21ea8b19a4a8 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6
[ https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488263#comment-16488263 ] Hsin-Liang Huang commented on YARN-8326: Here is more detail information from node manager log that compares between Hadoop 3.0 and 2.6. They are both running on 4 node cluster with 3 data nodes with same machine power/cpu/memory and same type of job. I picked only one node to compare the container cycle. *1. On 3.0.* when I request 8 containers to run on 3 data nodes, I picked the second node to examine the log: this job used 2 containers in this node: container *container_e04_1527109836290_0004_01_02* on application application_1527109836290_0004 (from container succeeded to Stopping container (from blue to red line) took about *4 seconds*) 152231 2018-05-23 15:04:45,541 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(1059)) - Start request for container_e04_1527109836290_0004_01_02 by user hlhuang 152232 2018-05-23 15:04:45,657 INFO containermanager.ContainerManagerImpl (ContainerManagerImpl.java:startContainerInternal(1127)) - Creating a new application reference for app application_1527109836290_0004 152233 2018-05-23 15:04:45,658 INFO application.ApplicationImpl (ApplicationImpl.java:handle(632)) - Application application_1527109836290_0004 transitioned from NEW to INITING 152234 2018-05-23 15:04:45,658 INFO application.ApplicationImpl (ApplicationImpl.java:transition(446)) - Adding container_e04_1527109836290_0004_01_02 to application application_1527109836290_0004 152235 2018-05-23 15:04:45,658 INFO application.ApplicationImpl (ApplicationImpl.java:handle(632)) - Application application_1527109836290_0004 transitioned from INITING to RUNNING 152236 2018-05-23 15:04:45,659 INFO container.ContainerImpl (ContainerImpl.java:handle(2108)) - Container container_e04_1527109836290_0004_01_02 transitioned from NEW to SCHEDULED 152237 2018-05-23 15:04:45,659 INFO 
containermanager.AuxServices (AuxServices.java:handle(220)) - Got event CONTAINER_INIT for appId application_1527109836290_0004 152238 2018-05-23 15:04:45,659 INFO yarn.YarnShuffleService (YarnShuffleService.java:initializeContainer(289)) - Initializing container container_e04_1527109836290_0004_01_02 152239 2018-05-23 15:04:45,660 INFO scheduler.ContainerScheduler (ContainerScheduler.java:startContainer(503)) - Starting container [container_e04_1527109836290_0004_01_02] 152246 2018-05-23 15:04:45,965 INFO container.ContainerImpl (ContainerImpl.java:handle(2108)) - Container container_e04_1527109836290_0004_01_02 transitioned from SCHEDULED to RUNNING 152247 2018-05-23 15:04:45,965 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:onStartMonitoringContainer(941)) - Starting resource-monitoring for container_e04_1527109836290_0004_01_02 {color:#205081}152250 2018-05-23 15:04:46,002 INFO launcher.ContainerLaunch (ContainerLaunch.java:handleContainerExitCode(512)) - Container container_e04_1527109836290_0004_01_02 succeeded{color} 152251 2018-05-23 15:04:46,003 INFO container.ContainerImpl (ContainerImpl.java:handle(2108)) - Container container_e04_1527109836290_0004_01_02 transitioned from RUNNING to EXITED_WITH_SUCCESS 152252 2018-05-23 15:04:46,003 INFO launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(668)) - Cleaning up container container_e04_1527109836290_0004_01_02 152254 2018-05-23 15:04:48,132 INFO nodemanager.LinuxContainerExecutor (LinuxContainerExecutor.java:deleteAsUser(794)) - Deleting absolute path : /hadoop/yarn/local/usercache/hlhuang/appcache/application_1527109836290_0004/container_e04_1527109836290_0004_01_02 152256 2018-05-23 15:04:48,133 INFO container.ContainerImpl (ContainerImpl.java:handle(2108)) - Container container_e04_1527109836290_0004_01_02 transitioned from EXITED_WITH_SUCCESS to DONE 152258 2018-05-23 15:04:49,171 INFO nodemanager.NodeStatusUpdaterImpl 
(NodeStatusUpdaterImpl.java:removeOrTrackCompletedContainersFromContext(682)) - Removed completed containers from NM context: [container_e04_1527109836290_0004_01_02] 152260 2018-05-23 15:04:50,289 INFO application.ApplicationImpl (ApplicationImpl.java:transition(489)) - Removing container_e04_1527109836290_0004_01_02 from application application_1527109836290_0004 {color:#d04437}152261 2018-05-23 15:04:50,290 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:onStopMonitoringContainer(932)) - Stopping resource-monitoring for container_e04_1527109836290_0004_01_02{color} 152263 2018-05-23 15:04:50,290 INFO yarn.YarnShuffleService (YarnShuffleService.java:stopContainer(295)) - Stopping container container_e04_1527109836290_0004_01_02 152262 2018-05-23 15:04:50,290 INFO containermanager.AuxServices
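The "about 4 seconds" figure can be checked directly from the two highlighted log timestamps (the "succeeded" line and the "Stopping resource-monitoring" line). The helper below is a small sketch using java.time; the class and method names are illustrative only.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class LogElapsedSketch {
    // Timestamp layout used by the NodeManager log lines quoted above.
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

    // Milliseconds between two log timestamps.
    static long elapsedMillis(String from, String to) {
        LocalDateTime start = LocalDateTime.parse(from, LOG_TS);
        LocalDateTime end = LocalDateTime.parse(to, LOG_TS);
        return Duration.between(start, end).toMillis();
    }
}
```

For example, elapsedMillis("2018-05-23 15:04:46,002", "2018-05-23 15:04:50,290") covers the span from container success to the end of resource monitoring for container_e04_1527109836290_0004_01_02.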
[jira] [Commented] (YARN-8350) NPE in service AM related to placement policy
[ https://issues.apache.org/jira/browse/YARN-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488242#comment-16488242 ] Billie Rinaldi commented on YARN-8350: -- I also tried adding a placement policy with an empty constraints array to the component that previously had no placement policy, and that resulted in a different NPE. > NPE in service AM related to placement policy > - > > Key: YARN-8350 > URL: https://issues.apache.org/jira/browse/YARN-8350 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Billie Rinaldi >Assignee: Gour Saha >Priority: Critical > > It seems like this NPE is happening in a service with more than one component > when one component has a placement policy and the other does not. It causes > the AM to crash. See > https://github.com/hortonworks/hadoop/blob/3c66d40e26bc2d0e17a6e1869201021a8c2f6df1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/Component.java > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.yarn.service.component.Component.requestContainers(Component.java:644) > at > org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:310) > at > org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:293) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487) > at > org.apache.hadoop.yarn.service.component.Component.handle(Component.java:919) > at > 
org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:344) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) > at > org.apache.hadoop.yarn.service.ServiceMaster.lambda$serviceStart$0(ServiceMaster.java:253) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) > at > org.apache.hadoop.yarn.service.ServiceMaster.serviceStart(ServiceMaster.java:251) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:317) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
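For readers unfamiliar with this failure mode, the sketch below shows the general defensive pattern for a component whose placement policy is missing or has no constraints. It is not the code at Component.java:644 and not the eventual fix: PlacementPolicy and ComponentSpec here are hypothetical, simplified stand-ins for the real service API classes.

```java
import java.util.Collections;
import java.util.List;

final class PlacementGuardSketch {
    // Hypothetical stand-in: a policy is just a list of constraint strings.
    static final class PlacementPolicy {
        final List<String> constraints;

        PlacementPolicy(List<String> constraints) {
            this.constraints = constraints;
        }
    }

    // Hypothetical stand-in: a component may or may not carry a policy.
    static final class ComponentSpec {
        final String name;
        final PlacementPolicy placementPolicy;

        ComponentSpec(String name, PlacementPolicy placementPolicy) {
            this.name = name;
            this.placementPolicy = placementPolicy;
        }
    }

    // Treats "no policy" and "policy without constraints" the same way: nothing to apply.
    static List<String> constraintsOf(ComponentSpec spec) {
        PlacementPolicy policy = spec.placementPolicy;
        if (policy == null || policy.constraints == null) {
            return Collections.emptyList();
        }
        return policy.constraints;
    }
}
```

Callers can then iterate over constraintsOf(spec) for every component without checking which ones declared a policy.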
[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records
[ https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488241#comment-16488241 ] Eric Yang commented on YARN-8333: - Patch 001 added multi-A record per component. > Load balance YARN services using RegistryDNS multiple A records > --- > > Key: YARN-8333 > URL: https://issues.apache.org/jira/browse/YARN-8333 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn-native-services >Affects Versions: 3.1.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8333.001.patch > > > For scaling stateless containers, it would be great to support DNS round > robin for fault tolerance and load balancing. The current DNS record format > for RegistryDNS is > [container-instance].[application-name].[username].[domain]. For example: > {code} > appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120 > appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121 > appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122 > appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123 > {code} > It would be nice to add multi-A record that contains all IP addresses of the > same component in addition to the instance based records. For example: > {code} > appcatalog.appname.hbase.ycluster. IN A 123.123.123.120 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.121 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.122 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.123 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
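The relationship between the per-instance records and the proposed component-level multi-A record can be sketched as a simple grouping step. This is an illustration only and not the RegistryDNS implementation: it assumes instance names end in "-<number>" as in the example above, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MultiARecordSketch {
    // Groups instance-level A records (name -> IP) into component-level records
    // (name without the "-N" instance suffix -> all IPs of that component).
    static Map<String, List<String>> componentRecords(Map<String, String> instanceRecords) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, String> record : instanceRecords.entrySet()) {
            String fqdn = record.getKey();
            int dot = fqdn.indexOf('.');
            String firstLabel = dot < 0 ? fqdn : fqdn.substring(0, dot);
            String rest = dot < 0 ? "" : fqdn.substring(dot);
            // "appcatalog-2" becomes "appcatalog"; labels without a suffix are kept as-is.
            String component = firstLabel.replaceFirst("-\\d+$", "");
            grouped.computeIfAbsent(component + rest, k -> new ArrayList<>())
                   .add(record.getValue());
        }
        return grouped;
    }
}
```

Feeding in the four appcatalog-N records from the example yields a single appcatalog.appname.hbase.ycluster. entry holding all four addresses, which a resolver can then hand out in round-robin order.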
[jira] [Updated] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records
[ https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8333: Attachment: YARN-8333.001.patch > Load balance YARN services using RegistryDNS multiple A records > --- > > Key: YARN-8333 > URL: https://issues.apache.org/jira/browse/YARN-8333 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn-native-services >Affects Versions: 3.1.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > Attachments: YARN-8333.001.patch > > > For scaling stateless containers, it would be great to support DNS round > robin for fault tolerance and load balancing. The current DNS record format > for RegistryDNS is > [container-instance].[application-name].[username].[domain]. For example: > {code} > appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120 > appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121 > appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122 > appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123 > {code} > It would be nice to add multi-A record that contains all IP addresses of the > same component in addition to the instance based records. For example: > {code} > appcatalog.appname.hbase.ycluster. IN A 123.123.123.120 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.121 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.122 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.123 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8350) NPE in service AM related to placement policy
Billie Rinaldi created YARN-8350: Summary: NPE in service AM related to placement policy Key: YARN-8350 URL: https://issues.apache.org/jira/browse/YARN-8350 Project: Hadoop YARN Issue Type: Bug Reporter: Billie Rinaldi Assignee: Gour Saha It seems like this NPE is happening in a service with more than one component when one component has a placement policy and the other does not. It causes the AM to crash. See https://github.com/hortonworks/hadoop/blob/3c66d40e26bc2d0e17a6e1869201021a8c2f6df1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/Component.java {noformat} java.lang.NullPointerException at org.apache.hadoop.yarn.service.component.Component.requestContainers(Component.java:644) at org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:310) at org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:293) at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487) at org.apache.hadoop.yarn.service.component.Component.handle(Component.java:919) at org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:344) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) at org.apache.hadoop.yarn.service.ServiceMaster.lambda$serviceStart$0(ServiceMaster.java:253) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) at org.apache.hadoop.yarn.service.ServiceMaster.serviceStart(ServiceMaster.java:251) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) at org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:317) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488224#comment-16488224 ] Hudson commented on YARN-4599: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14277 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14277/]) YARN-4599. Set OOM control for memory cgroups. (Miklos Szegedi via Haibo (haibochen: rev d9964799544eefcf424fcc178d987525f5356cdf) * (edit) .gitignore * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupElasticMemoryController.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandlerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/executor/ContainerSignalContext.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/DummyRunnableWithContext.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/impl/oom_listener.c * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/impl/oom_listener_main.c * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsMemoryResourceHandlerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitorResourceChange.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/impl/oom_listener.h * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsMemoryResourceHandlerImpl.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupElasticMemoryController.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestDefaultOOMHandler.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt * (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCGroupsMemory.md * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/DefaultOOMHandler.java > Set OOM control for memory cgroups > -- > > Key: YARN-4599 > URL: https://issues.apache.org/jira/browse/YARN-4599 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.9.0 >Reporter: Karthik Kambatla >Assignee: Miklos Szegedi >Priority: Major > Labels: oct16-medium > Fix For: 3.2.0 > > Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, > YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, > YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, > YARN-4599.007.patch, YARN-4599.008.patch, YARN-4599.009.patch, >
[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored
[ https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488222#comment-16488222 ] Eric Yang commented on YARN-8342: - We have the following options: 1. Allow an exemption to bind-mount launch-container.sh in untrusted YARN mode, and do not drop launch_command. 2. Change the name docker.privileged-containers.registries back to docker.trusted.registries. Images outside of trusted.registries are disallowed. 3. Add an error message to indicate that untrusted YARN mode without launch command is not supported. Option 1 requires RHEL 7.5+ to be completely immune to the security hole. Options 2 and 3 are safe, but it would be hard for users to understand that the problem comes from limitations of the Hadoop implementation. I am in favor of implementing option 1. Thoughts? > Using docker image from a non-privileged registry, the launch_command is not > honored > > > Key: YARN-8342 > URL: https://issues.apache.org/jira/browse/YARN-8342 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Critical > Labels: Docker > > While testing the Docker feature, I found that if a container comes from a > non-privileged docker registry, the specified launch command is ignored. > The container succeeds without any log, which is very confusing to end users. > This behavior is also inconsistent with containers from privileged docker > registries. > cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]
[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Summary: Preempt opportunistic containers when root container cgroup goes over memory limit (was: Preempt all opportunistic containers when root container cgroup goes over memory limit) > Preempt opportunistic containers when root container cgroup goes over memory > limit > -- > > Key: YARN-6677 > URL: https://issues.apache.org/jira/browse/YARN-6677 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Miklos Szegedi >Priority: Major > Attachments: YARN-6677.00.patch
[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit
[ https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6677: - Attachment: YARN-6677.00.patch > Preempt opportunistic containers when root container cgroup goes over memory > limit > -- > > Key: YARN-6677 > URL: https://issues.apache.org/jira/browse/YARN-6677 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Miklos Szegedi >Priority: Major > Attachments: YARN-6677.00.patch
[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488210#comment-16488210 ] Haibo Chen commented on YARN-4599: -- Thanks [~sandflee] for the initial proposal, [~miklos.szeg...@cloudera.com] for the patch and everyone else for the discussion! I have now committed the patch to trunk! > Set OOM control for memory cgroups > -- > > Key: YARN-4599 > URL: https://issues.apache.org/jira/browse/YARN-4599 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.9.0 >Reporter: Karthik Kambatla >Assignee: Miklos Szegedi >Priority: Major > Labels: oct16-medium > Fix For: 3.2.0 > > Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, > YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, > YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, > YARN-4599.007.patch, YARN-4599.008.patch, YARN-4599.009.patch, > YARN-4599.010.patch, YARN-4599.011.patch, YARN-4599.012.patch, > YARN-4599.013.patch, YARN-4599.014.patch, YARN-4599.015.patch, > YARN-4599.016.patch, YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch > > > YARN-1856 adds memory cgroups enforcing support. We should also explicitly > set OOM control so that containers are not killed as soon as they go over > their usage. Today, one could set the swappiness to control this, but > clusters with swap turned off exist.
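As background for "set OOM control": in the cgroup v1 memory controller, writing 1 to a group's memory.oom_control file disables the kernel OOM killer for that group, so tasks that hit the limit are paused rather than killed. The sketch below shows only that single write. Whether and how the committed patch performs it is not stated in this thread, so the class and method names here are hypothetical, and the demo writes into a temporary directory instead of a real cgroup mount.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class OomControlSketch {
    // Writes "1" to <cgroupDir>/memory.oom_control (cgroup v1: disable the OOM killer).
    static void disableOomKiller(Path cgroupDir) {
        try {
            Files.write(cgroupDir.resolve("memory.oom_control"),
                        "1".getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo against a temporary directory standing in for a cgroup directory;
    // returns what ended up in the control file.
    static String demoOnTempDir() {
        try {
            Path fakeCgroup = Files.createTempDirectory("fake-cgroup");
            disableOomKiller(fakeCgroup);
            byte[] written = Files.readAllBytes(fakeCgroup.resolve("memory.oom_control"));
            return new String(written, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On a real node the directory would be the container's memory cgroup, and something still has to react when the group is under OOM; the sketch leaves that part out.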
[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
[ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488201#comment-16488201 ] Wangda Tan commented on YARN-8292: -- Updated (008) patch. > Fix the dominant resource preemption cannot happen when some of the resource > vector becomes negative > > > Key: YARN-8292 > URL: https://issues.apache.org/jira/browse/YARN-8292 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sumana Sathish >Assignee: Wangda Tan >Priority: Critical > Attachments: YARN-8292.001.patch, YARN-8292.002.patch, > YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, > YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch > > > This is an example of the problem: > > {code} > // guaranteed, max, used, pending > "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root > "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a > "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b > "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c > {code} > There are 3 resource types. The total resource of the cluster is 30:18:6. > Queues a and b each have 3 containers running, and each container is 2:2:1. > Queue c uses no resources and has 1:1:1 pending. > Under the existing logic, preemption cannot happen.
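To see where a negative component of the resource vector can come from in this example, subtract each queue's usage from its guarantee per resource type. This is only arithmetic on the numbers quoted above, under the assumption that headroom is computed as guaranteed minus used; it is not the preemption policy's code, and the helper names are illustrative.

```java
final class ResourceVectorSketch {
    // Component-wise a - b for equally sized resource vectors.
    static int[] subtract(int[] a, int[] b) {
        int[] result = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            result[i] = a[i] - b[i];
        }
        return result;
    }

    // Component-wise multiplication of a vector by a container count.
    static int[] times(int[] v, int count) {
        int[] result = new int[v.length];
        for (int i = 0; i < v.length; i++) {
            result[i] = v[i] * count;
        }
        return result;
    }

    // Headroom of queue a (or b): guaranteed 10:6:2 minus three 2:2:1 containers.
    static int[] queueAHeadroom() {
        int[] guaranteed = {10, 6, 2};
        int[] used = times(new int[] {2, 2, 1}, 3);   // 6:6:3
        return subtract(guaranteed, used);            // 4:0:-1
    }
}
```

Queues a and b are each over their guarantee only in the third resource type, so their guaranteed-minus-used vector has one negative entry while the other entries are zero or positive.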
[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
    [ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488200#comment-16488200 ]

Wangda Tan commented on YARN-8292:
----------------------------------

Thanks [~jlowe], addressed all comments.

TestPreemptionForQueueWithPriorities is a flaky test which only fails in some cases (I tried 10+ times and it failed once). I updated the test case a bit to make it more stable and deterministic.

> Fix the dominant resource preemption cannot happen when some of the resource
> vector becomes negative
> ----------------------------------------------------------------------------
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Sumana Sathish
> Assignee: Wangda Tan
> Priority: Critical
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch,
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch,
> YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch
>
>
> This is an example of the problem:
>
> {code}
> // guaranteed, max,used, pending
> "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c
> {code}
> There're 3 resource types. Total resource of the cluster is 30:18:6
> For both of a/b, there're 3 containers running, each of container is 2:2:1.
> Queue c uses 0 resource, and have 1:1:1 pending resource.
> Under existing logic, preemption cannot happen.

[jira] [Updated] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
    [ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-8292:
-----------------------------
    Attachment: YARN-8292.008.patch

> Fix the dominant resource preemption cannot happen when some of the resource
> vector becomes negative
> ----------------------------------------------------------------------------
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Sumana Sathish
> Assignee: Wangda Tan
> Priority: Critical
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch,
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch,
> YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch
>
>
> This is an example of the problem:
>
> {code}
> // guaranteed, max,used, pending
> "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c
> {code}
> There're 3 resource types. Total resource of the cluster is 30:18:6
> For both of a/b, there're 3 containers running, each of container is 2:2:1.
> Queue c uses 0 resource, and have 1:1:1 pending resource.
> Under existing logic, preemption cannot happen.

[jira] [Commented] (YARN-8327) Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows
    [ https://issues.apache.org/jira/browse/YARN-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488198#comment-16488198 ]

Hudson commented on YARN-8327:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14276 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14276/])
YARN-8327. Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on (inigoiri: rev f09dc73001fd5f3319765fa997f4b0ca9e8f2aff)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestAggregatedLogFormat.java

> Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows
> --------------------------------------------------------------
>
> Key: YARN-8327
> URL: https://issues.apache.org/jira/browse/YARN-8327
> Project: Hadoop YARN
> Issue Type: Bug
> Components: log-aggregation
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3
>
> Attachments: YARN-8327.v1.patch, YARN-8327.v2.patch,
> image-2018-05-18-16-52-08-250.png, image-2018-05-21-09-05-49-550.png
>
>
> TestAggregatedLogFormat#testReadAcontainerLogs1 fails on Windows because of
> the line separator.

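The description only says that the test fails on Windows because of the line separator; the patch itself is not shown in this message. As a hedged illustration of that class of failure, the sketch below compares text written with the platform separator against an expected string with hard-coded `\n`. The class `LineSeparatorSketch` and its methods are hypothetical and are not taken from TestAggregatedLogFormat.

```java
// Illustrative only: shows why a hard-coded "\n" expectation is platform-dependent.
final class LineSeparatorSketch {

    /** Joins lines with the platform separator, as println-style writers do. */
    static String writeLines(String... lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append(System.lineSeparator());
        }
        return sb.toString();
    }

    /** Collapses Windows line endings so comparisons are platform-neutral. */
    static String normalize(String text) {
        return text.replace("\r\n", "\n");
    }

    public static void main(String[] args) {
        String actual = writeLines("first", "second");

        // Fragile: only true where System.lineSeparator() is "\n".
        boolean fragile = actual.equals("first\nsecond\n");

        // Portable: normalize the separators before comparing.
        boolean portable = normalize(actual).equals("first\nsecond\n");

        System.out.println("fragile=" + fragile + " portable=" + portable);
    }
}
```

An alternative with the same effect is to build the expected string with `System.lineSeparator()` instead of normalizing the actual output; which approach the committed patch takes is not stated here.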
[jira] [Commented] (YARN-7899) [AMRMProxy] Stateful FederationInterceptor for pending requests
    [ https://issues.apache.org/jira/browse/YARN-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488179#comment-16488179 ]

genericqa commented on YARN-7899:
---------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 39s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 14 new + 16 unchanged - 0 fixed = 30 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 9s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 14s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 13s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 59s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m 21s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-7899 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924828/YARN-7899.v2.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 789936ec75bf 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cddbbe5 |

[jira] [Commented] (YARN-8327) Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows
    [ https://issues.apache.org/jira/browse/YARN-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488159#comment-16488159 ]

Íñigo Goiri commented on YARN-8327:
-----------------------------------

+1 on [^YARN-8327.v2.patch].
Committing.

> Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows
> --------------------------------------------------------------
>
> Key: YARN-8327
> URL: https://issues.apache.org/jira/browse/YARN-8327
> Project: Hadoop YARN
> Issue Type: Bug
> Components: log-aggregation
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Attachments: YARN-8327.v1.patch, YARN-8327.v2.patch,
> image-2018-05-18-16-52-08-250.png, image-2018-05-21-09-05-49-550.png
>
>
> TestAggregatedLogFormat#testReadAcontainerLogs1 fails on Windows because of
> the line separator.

[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests to fix NPE failures
    [ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488111#comment-16488111 ]

Hudson commented on YARN-8348:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14275 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14275/])
YARN-8348. Incorrect and missing AfterClass in HBase-tests to fix NPE (inigoiri: rev d72615611cfa6bd82756270d4b10136ec1e56741)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorageEntities.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowRun.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorageApps.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorageDomain.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowActivity.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowRunCompaction.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorageSchema.java

> Incorrect and missing AfterClass in HBase-tests to fix NPE failures
> -------------------------------------------------------------------
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
> for 2 reasons:
> * incorrect afterClass;
> * not defined KeyProviderTokenIssuer.
> While in windows are failing for the previous 2 reasons plus * missing
> afterClass.
> This JIRA fixes the NPE failures in HBase-tests and reduces the failed tests
> in Linux.

[jira] [Commented] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services
    [ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488100#comment-16488100 ]

genericqa commented on YARN-7530:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 15s{color} | {color:red} root in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-services-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 8s{color} | {color:red} hadoop-yarn-services-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 0m 44s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 8s{color} | {color:red} hadoop-yarn-services-api in branch-3.1 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 40s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 15s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 13s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-7530 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924835/YARN-7530-branch-3.1.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml |
| uname | Linux 8bc14eca71b4 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 61b5b2f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| mvninstall | https://builds.apache.org/job/PreCommit-YARN-Build/20846/artifact/out/branch-mvninstall-root.txt |
| compile | https://builds.apache.org/job/PreCommit-YARN-Build/20846/artifact/out/branch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-api.txt |
| mvnsite | https://builds.apache.org/job/PreCommit-YARN-Build/20846/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-api.txt |
| javadoc |

[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative
    [ https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488092#comment-16488092 ]

Jason Lowe commented on YARN-8292:
----------------------------------

Thanks for updating the patch!

The TestPreemptionForQueueWithPriorities failure appears to be related.

Nit: Using the new isAnyMajorResourceAboveZero method will be a bit more readable and more efficient than the fitsIn check against none, since fitsIn does unnecessary unit conversion checks.

What is the point of the new static methods added to Resources? It's more succinct to call the ResourceCalculator method directly, e.g.: rc.isAnyMajorResourceZeroOrNegative(resource) instead of Resources.isAnyMajorResourceZeroOrNegative(rc, resource).

It would be good to clean up the whitespace nit. Speaking of whitespace, one of the checkstyle errors was caused by a whitespace-only formatting change in this patch (the for loop in computeFixpointAllocation).

> Fix the dominant resource preemption cannot happen when some of the resource
> vector becomes negative
> ----------------------------------------------------------------------------
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Sumana Sathish
> Assignee: Wangda Tan
> Priority: Critical
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch,
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch,
> YARN-8292.006.patch, YARN-8292.007.patch
>
>
> This is an example of the problem:
>
> {code}
> // guaranteed, max,used, pending
> "root(=[30:18:6 30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2 6:6:3 0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2 0:0:0 1:1:1])"; // c
> {code}
> There're 3 resource types. Total resource of the cluster is 30:18:6
> For both of a/b, there're 3 containers running, each of container is 2:2:1.
> Queue c uses 0 resource, and have 1:1:1 pending resource.
> Under existing logic, preemption cannot happen.

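The review point about the static pass-through methods can be pictured with a toy example. Everything below is a simplified stand-in: the `Calculator` interface, `AllComponentsCalculator`, and the `long[]` resource type are hypothetical, and only the method name `isAnyMajorResourceZeroOrNegative` is borrowed from the comment. The real Hadoop signatures are not reproduced here.

```java
// Illustrative only: contrasts a direct instance call with a static wrapper.
final class CalculatorCallSketch {

    /** Hypothetical, simplified stand-in for a resource calculator. */
    interface Calculator {
        boolean isAnyMajorResourceZeroOrNegative(long[] resource);
    }

    /** Treats every component as "major"; an assumption made for this sketch. */
    static final class AllComponentsCalculator implements Calculator {
        @Override
        public boolean isAnyMajorResourceZeroOrNegative(long[] resource) {
            for (long value : resource) {
                if (value <= 0) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Static pass-through of the kind the review questions: it adds no logic. */
    static boolean isAnyMajorResourceZeroOrNegative(Calculator rc, long[] resource) {
        return rc.isAnyMajorResourceZeroOrNegative(resource);
    }

    public static void main(String[] args) {
        Calculator rc = new AllComponentsCalculator();
        long[] resource = {-4, 0, 1};

        // Direct call: the calculator is the receiver.
        boolean direct = rc.isAnyMajorResourceZeroOrNegative(resource);

        // Wrapped call: same result, one more argument and one more hop.
        boolean wrapped = isAnyMajorResourceZeroOrNegative(rc, resource);

        System.out.println(direct + " " + wrapped); // true true
    }
}
```

Since the wrapper only forwards its arguments, both calls always agree; the difference is purely one of call-site readability, which is the substance of the review remark.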
[jira] [Commented] (YARN-8343) YARN should have ability to run images only from a whitelist docker registries
    [ https://issues.apache.org/jira/browse/YARN-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488084#comment-16488084 ]

Eric Badger commented on YARN-8343:
-----------------------------------

I think we need to rework how privileged/non-privileged containers are handled as it stands. I agree that there is value in a mechanism that only allows images from certain registries to run at all. This could manifest as a whitelist, as a flag to only accept images from the privileged registries list, or as something else that we design to make this all less confusing.

> YARN should have ability to run images only from a whitelist docker registries
> ------------------------------------------------------------------------------
>
> Key: YARN-8343
> URL: https://issues.apache.org/jira/browse/YARN-8343
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Priority: Critical
> Labels: Docker
>
> This is a superset of docker.privileged-containers.registries, admin can
> specify a whitelist and all images from non-privileged-container.registries
> will be rejected.

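To make the proposal concrete, here is a minimal sketch of what a registry whitelist check could look like. It is not YARN's implementation: the class `RegistryWhitelistSketch`, the registry name `registry.example.com`, and the rule "the registry is whatever precedes the first slash" are all assumptions made for illustration. Real Docker image references have more cases than this handles.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: a naive whitelist check on the registry prefix of an image name.
final class RegistryWhitelistSketch {

    /** Returns the text before the first '/', or "" when the image has no prefix. */
    static String registryOf(String image) {
        int slash = image.indexOf('/');
        return slash < 0 ? "" : image.substring(0, slash);
    }

    /** An image is accepted only if its registry prefix is on the whitelist. */
    static boolean isAllowed(String image, Set<String> whitelist) {
        return whitelist.contains(registryOf(image));
    }

    public static void main(String[] args) {
        // Hypothetical whitelist an admin might configure.
        Set<String> whitelist = new HashSet<>(Arrays.asList("registry.example.com"));

        System.out.println(isAllowed("registry.example.com/centos:7", whitelist)); // true
        System.out.println(isAllowed("other.example.org/centos:7", whitelist));    // false
        System.out.println(isAllowed("centos:7", whitelist));                      // false
    }
}
```

In this sketch an image with no registry prefix is rejected outright, which matches the "reject everything not on the list" reading of the issue description. Whether unprefixed images should instead map to a default registry is a design question the thread leaves open.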
[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests to fix NPE failures
    [ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Íñigo Goiri updated YARN-8348:
------------------------------
    Description:
HBase tests are failing in [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] for 2 reasons:
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

While in windows are failing for the previous 2 reasons plus
 * missing afterClass.

This JIRA fixes the NPE failures in HBase-tests and reduces the failed tests in Linux.

  was:
HBase tests are failing in [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] for 2 reasons:
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

While in windows are failing for the previous 2 reasons plus
 * missing afterClass.

This Jira tracks the effort to fix the NPE failures in HBase-tests and reduces the failed tests in Linux.

> Incorrect and missing AfterClass in HBase-tests to fix NPE failures
> -------------------------------------------------------------------
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
> for 2 reasons:
> * incorrect afterClass;
> * not defined KeyProviderTokenIssuer.
> While in windows are failing for the previous 2 reasons plus * missing
> afterClass.
> This JIRA fixes the NPE failures in HBase-tests and reduces the failed tests
> in Linux.

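The description attributes the NPE failures to incorrect or missing afterClass methods but does not show the patch. One common way such a failure arises, offered here only as an assumption, is a class-level teardown that dereferences a shared fixture which was never initialized because setup failed or did not run. The sketch below uses plain Java rather than JUnit so it runs on its own; `TearDownSketch` and `FakeCluster` are hypothetical names.

```java
// Illustrative only: a class-level teardown that tolerates a missing fixture.
final class TearDownSketch {

    /** Hypothetical stand-in for a shared test cluster or utility. */
    static final class FakeCluster {
        boolean running = true;

        void shutdown() {
            running = false;
        }
    }

    // Stays null if setup never ran or failed before assigning it.
    static FakeCluster cluster;

    /** Plays the role of a JUnit 4 @BeforeClass method. */
    static void setUpClass() {
        cluster = new FakeCluster();
    }

    /** Plays the role of a JUnit 4 @AfterClass method; safe when setup failed. */
    static void tearDownClass() {
        if (cluster != null) {
            cluster.shutdown();
            cluster = null;
        }
    }

    public static void main(String[] args) {
        tearDownClass(); // no NullPointerException even though setup never ran
        setUpClass();
        tearDownClass();
        System.out.println(cluster == null); // true
    }
}
```

Without the null check, the first `tearDownClass()` call would throw a NullPointerException and mask whatever went wrong during setup.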
[jira] [Commented] (YARN-8344) Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync
    [ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488079#comment-16488079 ]

Hudson commented on YARN-8344:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14274 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14274/])
YARN-8344. Missing nm.stop() in TestNodeManagerResync to fix (inigoiri: rev e99e5bf104e9664bc1b43a2639d87355d47a77e2)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java

> Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync
> ----------------------------------------------------------------------------
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3
>
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
> on Windows.

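The general pattern behind this kind of fix, stopping a service the test started even when the test body fails, can be sketched as follows. The sketch is an assumption about the shape of the fix, not the committed change: `FakeNodeManager` is a hypothetical stand-in and is unrelated to the real NodeManager class used in TestNodeManagerResync.

```java
// Illustrative only: always stop what the test started, using try/finally.
final class StopInFinallySketch {

    /** Hypothetical stand-in for a service with a start/stop lifecycle. */
    static final class FakeNodeManager {
        private boolean started;

        void start() {
            started = true;
        }

        void stop() {
            started = false;
        }

        boolean isStarted() {
            return started;
        }
    }

    /** Runs a pretend test body and returns the service so its state can be inspected. */
    static FakeNodeManager runTest(boolean failMidway) {
        FakeNodeManager nm = new FakeNodeManager();
        try {
            nm.start();
            if (failMidway) {
                throw new IllegalStateException("simulated test failure");
            }
        } catch (IllegalStateException e) {
            // Swallowed only so this sketch can return nm; a real test would let it propagate.
        } finally {
            nm.stop(); // runs on both the success and the failure path
        }
        return nm;
    }

    public static void main(String[] args) {
        System.out.println(runTest(false).isStarted()); // false
        System.out.println(runTest(true).isStarted());  // false
    }
}
```

A service left running after a test can hold files or ports open, which tends to be less forgiving on Windows; that is a plausible reason for the platform-specific failure, though the message does not say so explicitly.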
[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored
    [ https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488073#comment-16488073 ]

Eric Badger commented on YARN-8342:
-----------------------------------

{quote}Looks like the name {{docker.privileged-containers.registries}} is very misleading. It doesn't apply only for Docker Privileged Containers, right? If so, we should fix this name.
{quote}
I 100% agree with this.

bq. With YARN-7654 changes to use execvp, this concern has been nullified. It is safe to preserve launch command even for untrusted images.

If we're going to allow random (untrusted) images to execute, then the command with which they start doesn't really matter, user-specified or image-supplied. The image could start with any CMD, so we already have to assume that it's untrusted/possibly malicious code that is executing right off the bat. I don't see any added risk here by letting the user define what they want to run.

> Using docker image from a non-privileged registry, the launch_command is not
> honored
> ----------------------------------------------------------------------------
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Priority: Critical
> Labels: Docker
>
> During test of the Docker feature, I found that if a container comes from
> non-privileged docker registry, the specified launch command will be ignored.
> Container will success without any log, which is very confusing to end users.
> And this behavior is inconsistent to containers from privileged docker
> registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]

[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests to fix NPE failures
    [ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Matteo Fumarola updated YARN-8348:
-------------------------------------------
    Description:
HBase tests are failing in [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] for 2 reasons:
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

While in windows are failing for the previous 2 reasons plus
 * missing afterClass.

This Jira tracks the effort to fix the NPE failures in HBase-tests and reduces the failed tests in Linux.

  was:
HBase tests are failing in [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] for 2 reasons:
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

While in windows are failing for the previous 2 reasons plus
 * missing afterClass.

This Jira tracks the effort to fix part of HBase-tests and reduces the failed tests in Linux.

> Incorrect and missing AfterClass in HBase-tests to fix NPE failures
> -------------------------------------------------------------------
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
> for 2 reasons:
> * incorrect afterClass;
> * not defined KeyProviderTokenIssuer.
> While in windows are failing for the previous 2 reasons plus * missing
> afterClass.
> This Jira tracks the effort to fix the NPE failures in HBase-tests and
> reduces the failed tests in Linux.

[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests to fix NPE failures
    [ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Matteo Fumarola updated YARN-8348:
-------------------------------------------
    Summary: Incorrect and missing AfterClass in HBase-tests to fix NPE failures  (was: Incorrect and missing AfterClass in HBase-tests)

> Incorrect and missing AfterClass in HBase-tests to fix NPE failures
> -------------------------------------------------------------------
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
> for 2 reasons:
> * incorrect afterClass;
> * not defined KeyProviderTokenIssuer.
> While in windows are failing for the previous 2 reasons plus * missing
> afterClass.
> This Jira tracks the effort to fix part of HBase-tests and reduces the failed
> tests in Linux.

[jira] [Commented] (YARN-6919) Add default volume mount list
[ https://issues.apache.org/jira/browse/YARN-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488071#comment-16488071 ]

genericqa commented on YARN-6919:
---
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 11s{color} | {color:red} root in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn in branch-3.1 failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 7s{color} | {color:green} branch-3.1 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 11s{color} | {color:red} hadoop-yarn-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-common in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 0m 46s{color} | {color:red} branch has errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-common in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-common in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-3.1 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 8s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 10s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 9s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 0m 9s{color} | {color:red} patch has errors when building and testing our client artifacts.
[jira] [Updated] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services
[ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chandni Singh updated YARN-7530:
Attachment: YARN-7530-branch-3.1.001.patch

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn-native-services
> Affects Versions: 3.1.0
> Reporter: Eric Yang
> Assignee: Chandni Singh
> Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530-branch-3.1.001.patch, YARN-7530.001.patch,
> YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to
> hadoop-yarn-services project. It would be better if hadoop-yarn-services-api
> is part of hadoop-yarn-services for correctness.
[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488066#comment-16488066 ]

Íñigo Goiri commented on YARN-8348:
---
Do you mind updating the description to make clear we leave KeyProviderTokenIssuer open but we fix the NPEs?

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
> for 2 reasons:
> * incorrect afterClass;
> * KeyProviderTokenIssuer is not defined.
> On Windows they fail for the previous 2 reasons plus:
> * missing afterClass.
> This Jira tracks the effort to fix part of the HBase tests and reduce the
> failed tests in Linux.
[jira] [Comment Edited] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6
[ https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013 ]

Hsin-Liang Huang edited comment on YARN-8326 at 5/23/18 9:19 PM:
---
Hi [~eyang] I ran the sample job,

{color:#14892c}time hadoop jar /usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar Client -classpath simple-yarn-app-1.1.0.jar -cmd "java com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"{color}

with the changed settings, and it still took 15 seconds compared to 6 or 7 seconds in the 2.6 environment. So I am not sure that these two monitoring settings play a significant role in the performance. The major issue could still be in exiting the container, which is much slower in the 3.0 environment than in the 2.6 environment. Can someone from the yarn team look into this? This is a general yarn application performance issue in 3.0.

was (Author: hlhu...@us.ibm.com):
Hi [~eyang] I ran the sample job with the changed settings, and it still took 15 seconds compared to 6 or 7 seconds in the 2.6 environment. So I am not sure that these two monitoring settings play a significant role in the performance. The major issue could still be in exiting the container, which is much slower in the 3.0 environment than in the 2.6 environment. Can someone from the yarn team look into this? This is a general yarn application performance issue in 3.0.

> Yarn 3.0 seems runs slower than Yarn 2.6
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0.
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
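
For readers who want to compare such settings between clusters: a Hadoop `yarn-site.xml` stores each setting as a `<property>` element with `<name>` and `<value>` children inside `<configuration>`. The sketch below is an illustration in Python only. It parses a two-property fragment whose names and values are copied from the Environment field above; the helper name `load_properties` is hypothetical and not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Two properties copied from the Environment field of YARN-8326, written in
# the standard Hadoop <configuration>/<property> layout.
YARN_SITE_FRAGMENT = """
<configuration>
  <property>
    <name>hadoop.registry.dns.bind-port</name>
    <value>5353</value>
  </property>
  <property>
    <name>yarn.nodemanager.container-monitor.interval-ms</name>
    <value>3000</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return a dict mapping property names to their string values.

    Hypothetical helper for illustration; values are kept as strings
    because the XML itself carries no type information.
    """
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        # A missing or empty <value> becomes an empty string.
        value = prop.findtext("value", default="")
        if name is not None:
            props[name.strip()] = value.strip()
    return props


props = load_properties(YARN_SITE_FRAGMENT)
for name, value in sorted(props.items()):
    print(f"{name} = {value}")
```

Loading the files from a 2.6 and a 3.0 cluster this way and comparing the two dicts would be one way to spot settings that differ between the environments.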
[jira] [Issue Comment Deleted] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6
[ https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hsin-Liang Huang updated YARN-8326:
---
Comment: was deleted

(was: Hi Eric, I tried the suggestion and changed the setting. The result of running

{color:#14892c}time hadoop jar /usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar Client -classpath simple-yarn-app-1.1.0.jar -cmd "java com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"{color}

is 20s, 15s and 15s (I ran it 3 times). It didn't get better, if not worse. (It was 14 or 15 seconds before.)
)

> Yarn 3.0 seems runs slower than Yarn 2.6
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0.
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
> yarn.nodemanager.log-aggregation.debug-enabled = false
> yarn.nodemanager.log-aggregation.num-log-files-per-app = 30
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 3600
> yarn.nodemanager.log-dirs = /hadoop/yarn/log
> yarn.nodemanager.log.retain-seconds = 604800
> yarn.nodemanager.pmem-check-enabled = false
> yarn.nodemanager.recovery.dir = /var/log/hadoop-yarn/nodemanager/recovery-state
> yarn.nodemanager.recovery.enabled = true
> yarn.nodemanager.recovery.supervised = true
> yarn.nodemanager.remote-app-log-dir = /app-logs
> yarn.nodemanager.remote-app-log-dir-suffix = logs
> yarn.nodemanager.resource-plugins =
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices = auto
[jira] [Comment Edited] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6
[ https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013 ]

Hsin-Liang Huang edited comment on YARN-8326 at 5/23/18 9:15 PM:
---
Hi [~eyang] I ran the sample job with the changed settings, and it still took 15 seconds compared to 6 or 7 seconds in the 2.6 environment. So I am not sure that these two monitoring settings play a significant role in the performance. The major issue could still be in exiting the container, which is much slower in the 3.0 environment than in the 2.6 environment. Can someone from the yarn team look into this? This is a general yarn application performance issue in 3.0.

was (Author: hlhu...@us.ibm.com):
Hi [~eyang] Here is another update. For the simple job that I ran with the suggested settings changed, the performance improved. However, I ran our unit testcases, and they still took 14 hours compared to 7 hours in the 2.6 environment. I also ran another sample job with the changed settings, and it still took 15 seconds compared to 6 or 7 seconds in the 2.6 environment. So I think that although the monitoring settings might affect the performance, they only play a small part; the major issue could still be in exiting the container, which is much slower in the 3.0 environment than in the 2.6 environment. Is there anyone looking into this area? Thanks!

> Yarn 3.0 seems runs slower than Yarn 2.6
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0.
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
> yarn.nodemanager.log-aggregation.debug-enabled = false
[jira] [Updated] (YARN-8344) Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giovanni Matteo Fumarola updated YARN-8344:
---
Summary: Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync (was: Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync)

> Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Giovanni Matteo Fumarola
> Assignee: Giovanni Matteo Fumarola
> Priority: Major
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
> on Windows.
[jira] [Issue Comment Deleted] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6
[ https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hsin-Liang Huang updated YARN-8326:
---
Comment: was deleted

(was: [~eyang] This afternoon I tried the command and the performance was dramatically improved. It used to take 8 seconds; now it consistently takes 3 seconds. I then compared with the other 3.0 cluster, on which I did not make the property changes that you suggested, and it still consistently took 8 seconds. I am going to run our testcases to see if the performance is also improved there.
)

> Yarn 3.0 seems runs slower than Yarn 2.6
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0.
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
> yarn.nodemanager.log-aggregation.debug-enabled = false
> yarn.nodemanager.log-aggregation.num-log-files-per-app = 30
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 3600
> yarn.nodemanager.log-dirs = /hadoop/yarn/log
> yarn.nodemanager.log.retain-seconds = 604800
> yarn.nodemanager.pmem-check-enabled = false
> yarn.nodemanager.recovery.dir = /var/log/hadoop-yarn/nodemanager/recovery-state
> yarn.nodemanager.recovery.enabled = true
> yarn.nodemanager.recovery.supervised = true
> yarn.nodemanager.remote-app-log-dir = /app-logs
> yarn.nodemanager.remote-app-log-dir-suffix = logs
> yarn.nodemanager.resource-plugins =
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices = auto
> yarn.nodemanager.resource-plugins.gpu.docker-plugin = nvidia-docker-v1
[jira] [Commented] (YARN-6919) Add default volume mount list
[ https://issues.apache.org/jira/browse/YARN-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488049#comment-16488049 ]

Eric Badger commented on YARN-6919:
---
Hey [~shaneku...@gmail.com], I think this should go into 3.1 as well, so I just put up a patch. Note, however, that YARN-7530 has broken branch-3.1 compilation from a clean .m2 repo. I'm not sure what genericqa will do.

> Add default volume mount list
> ---
>
> Key: YARN-6919
> URL: https://issues.apache.org/jira/browse/YARN-6919
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Eric Badger
> Assignee: Eric Badger
> Priority: Major
> Labels: Docker
> Attachments: YARN-6919-branch-3.1.002.patch, YARN-6919.001.patch,
> YARN-6919.002.patch
>
>
> Piggybacking on YARN-5534, we should create a default list that bind mounts
> selected volumes into all docker containers. This list will be empty by
> default
[jira] [Updated] (YARN-6919) Add default volume mount list
[ https://issues.apache.org/jira/browse/YARN-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated YARN-6919:
---
Attachment: YARN-6919-branch-3.1.002.patch

> Add default volume mount list
> ---
>
> Key: YARN-6919
> URL: https://issues.apache.org/jira/browse/YARN-6919
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Eric Badger
> Assignee: Eric Badger
> Priority: Major
> Labels: Docker
> Attachments: YARN-6919-branch-3.1.002.patch, YARN-6919.001.patch,
> YARN-6919.002.patch
>
>
> Piggybacking on YARN-5534, we should create a default list that bind mounts
> selected volumes into all docker containers. This list will be empty by
> default
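
The behaviour described in the issue, a cluster-wide default list of bind mounts that is applied to every docker container and is empty unless configured, can be sketched as below. This is a Python illustration only, not the patch: the function `effective_mounts`, the `source:destination:mode` string format, the rule that a default wins over a requested mount for the same destination, and the sample paths are all assumptions made for the example.

```python
def effective_mounts(default_mounts, requested_mounts):
    """Combine default mounts with the mounts one container requests.

    Hypothetical helper. Each mount is a "source:destination:mode" string
    (assumed format). Defaults are applied first; a requested mount whose
    destination is already taken is skipped, so no destination is mounted
    twice. That conflict rule is an assumption, not taken from the patch.
    """
    result = []
    seen_destinations = set()
    for mount in list(default_mounts) + list(requested_mounts):
        _source, destination, _mode = mount.split(":")
        if destination in seen_destinations:
            continue
        seen_destinations.add(destination)
        result.append(mount)
    return result


# With the default list empty (the default in the issue), a container gets
# exactly the mounts it asked for.
print(effective_mounts([], ["/data:/data:rw"]))

# With a configured default, every container also gets the default mount.
print(effective_mounts(["/etc/passwd:/etc/passwd:ro"], ["/data:/data:rw"]))
```

Keeping the default list empty means clusters that never configure it see no change in which volumes their containers mount.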
[jira] [Updated] (YARN-7899) [AMRMProxy] Stateful FederationInterceptor for pending requests
[ https://issues.apache.org/jira/browse/YARN-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Botong Huang updated YARN-7899:
---
Attachment: YARN-7899.v2.patch

> [AMRMProxy] Stateful FederationInterceptor for pending requests
> ---
>
> Key: YARN-7899
> URL: https://issues.apache.org/jira/browse/YARN-7899
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Botong Huang
> Assignee: Botong Huang
> Priority: Major
> Attachments: YARN-7899.v1.patch, YARN-7899.v2.patch
>
>
> Today FederationInterceptor (in AMRMProxy for YARN Federation) is stateless
> in terms of pending (outstanding) requests. Whenever AM issues new requests,
> FI simply splits and sends them to sub-cluster YarnRMs and forgets about them.
> This JIRA attempts to make FI stateful so that it remembers the pending
> requests in all relevant sub-clusters. This has two major benefits:
> 1. It is a prerequisite for FI to be able to cancel a pending request in one
> sub-cluster and re-send it to other sub-clusters. This is needed for load
> balancing and to fully comply with the relax locality fallback to ANY
> semantic. When we send a request to one sub-cluster, we have effectively
> restrained the allocation for this request to be within this sub-cluster
> rather than everywhere. If the cluster capacity in this sub-cluster for this
> app is full or this YarnRM is overloaded and slow, the request will be stuck
> there for a long time even if there is free capacity in other sub-clusters.
> We need FI to remember and adjust the pending requests on the fly.
> 2. This makes pending request recovery easier when YarnRM fails over. Today
> whenever one sub-cluster RM fails over, in order to recover lost pending
> requests for this sub-cluster, we have to propagate the
> ApplicationMasterNotRegisteredException from the YarnRM back to AM,
> triggering a full pending resend from AM. This resend contains the pending
> requests not only for the failing-over sub-cluster, but for every
> sub-cluster. Since our split-merge (AMRMProxyPolicy) does not guarantee
> idempotency, the same request we sent to sub-cluster-1 earlier might be
> resent to sub-cluster-2. If both these YarnRMs have not failed over, they
> will both allocate for this request, leading to over-allocation. These full
> pending resends also put unnecessary load on every YarnRM in the cluster
> every time one YarnRM fails over. With stateful FederationInterceptor, since
> we remember pending requests we have sent out earlier, we can shield the AM
> from the ApplicationMasterNotRegisteredException and resend the pending
> requests only to the failed-over YarnRM. This eliminates over-allocation and
> minimizes the recovery overhead.
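
The idea in the description above, remembering which pending requests went to which sub-cluster so that a request can be moved elsewhere and so that only the failed-over YarnRM receives a resend, can be sketched roughly as follows. This is a toy Python model, not the FederationInterceptor code: the class `PendingRequestTracker`, its method names, and the request and sub-cluster identifiers are all hypothetical.

```python
from collections import defaultdict


class PendingRequestTracker:
    """Toy model of per-sub-cluster pending request state (hypothetical)."""

    def __init__(self):
        # sub-cluster id -> {request id: request payload}
        self._pending = defaultdict(dict)

    def record_sent(self, sub_cluster, request_id, request):
        """Remember a request at the moment it is sent to a sub-cluster."""
        self._pending[sub_cluster][request_id] = request

    def record_allocated(self, sub_cluster, request_id):
        """Forget a request once the sub-cluster has satisfied it."""
        self._pending[sub_cluster].pop(request_id, None)

    def move(self, request_id, source, target):
        """Cancel a pending request in one sub-cluster and re-send it to
        another (benefit 1: load balancing)."""
        request = self._pending[source].pop(request_id)
        self._pending[target][request_id] = request
        return request

    def resend_after_failover(self, sub_cluster):
        """Requests to resend to the one sub-cluster whose RM failed over
        (benefit 2); other sub-clusters are left untouched."""
        return dict(self._pending[sub_cluster])


tracker = PendingRequestTracker()
tracker.record_sent("sub-cluster-1", "req-1", {"memory_mb": 1024})
tracker.record_sent("sub-cluster-2", "req-2", {"memory_mb": 2048})

# Only req-1 is resent when the RM of sub-cluster-1 fails over.
print(tracker.resend_after_failover("sub-cluster-1"))
```

Because each request is recorded under exactly one sub-cluster at a time, a resend after failover cannot hand the same request to two YarnRMs, which is the over-allocation case the description warns about.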
[jira] [Updated] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services
[ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-7530:
---
Fix Version/s: 3.2.0

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn-native-services
> Affects Versions: 3.1.0
> Reporter: Eric Yang
> Assignee: Chandni Singh
> Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530.001.patch, YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to
> hadoop-yarn-services project. It would be better if hadoop-yarn-services-api
> is part of hadoop-yarn-services for correctness.
[jira] [Updated] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services
[ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-7530:
---
Priority: Blocker (was: Trivial)

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn-native-services
> Affects Versions: 3.1.0
> Reporter: Eric Yang
> Assignee: Chandni Singh
> Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530.001.patch, YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to
> hadoop-yarn-services project. It would be better if hadoop-yarn-services-api
> is part of hadoop-yarn-services for correctness.
[jira] [Updated] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services
[ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-7530: - Fix Version/s: (was: 3.2.0) > hadoop-yarn-services-api should be part of hadoop-yarn-services > --- > > Key: YARN-7530 > URL: https://issues.apache.org/jira/browse/YARN-7530 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Affects Versions: 3.1.0 >Reporter: Eric Yang >Assignee: Chandni Singh >Priority: Blocker > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-7530.001.patch, YARN-7530.002.patch > > > Hadoop-yarn-services-api is currently a parallel project to > hadoop-yarn-services project. It would be better if hadoop-yarn-services-api > is part of hadoop-yarn-services for correctness.
[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6
[ https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013 ] Hsin-Liang Huang commented on YARN-8326: Hi [~eyang], here is another update. The simple job that I ran with the suggested setting change did perform better. However, our unit test cases still took 14 hours, compared to 7 hours in the 2.6 environment. Another sample job, run with the changed settings, still took 15 seconds compared to 6 or 7 seconds in the 2.6 environment. So although the monitoring setting may affect performance, it seems to play only a small part; the major issue could still be container exit, which is much slower in the 3.0 environment than in the 2.6 environment. Is anyone looking into this area? Thanks! > Yarn 3.0 seems runs slower than Yarn 2.6 > > > Key: YARN-8326 > URL: https://issues.apache.org/jira/browse/YARN-8326 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0 > Environment: This is the yarn-site.xml for 3.0. 
> > > > hadoop.registry.dns.bind-port > 5353 > > > hadoop.registry.dns.domain-name > hwx.site > > > hadoop.registry.dns.enabled > true > > > hadoop.registry.dns.zone-mask > 255.255.255.0 > > > hadoop.registry.dns.zone-subnet > 172.17.0.0 > > > manage.include.files > false > > > yarn.acl.enable > false > > > yarn.admin.acl > yarn > > > yarn.client.nodemanager-connect.max-wait-ms > 6 > > > yarn.client.nodemanager-connect.retry-interval-ms > 1 > > > yarn.http.policy > HTTP_ONLY > > > yarn.log-aggregation-enable > false > > > yarn.log-aggregation.retain-seconds > 2592000 > > > yarn.log.server.url > > [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs] > > > yarn.log.server.web-service.url > > [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory] > > > yarn.node-labels.enabled > false > > > yarn.node-labels.fs-store.retry-policy-spec > 2000, 500 > > > yarn.node-labels.fs-store.root-dir > /system/yarn/node-labels > > > yarn.nodemanager.address > 0.0.0.0:45454 > > > yarn.nodemanager.admin-env > MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX > > > yarn.nodemanager.aux-services > mapreduce_shuffle,spark2_shuffle,timeline_collector > > > yarn.nodemanager.aux-services.mapreduce_shuffle.class > org.apache.hadoop.mapred.ShuffleHandler > > > yarn.nodemanager.aux-services.spark2_shuffle.class > org.apache.spark.network.yarn.YarnShuffleService > > > yarn.nodemanager.aux-services.spark2_shuffle.classpath > /usr/spark2/aux/* > > > yarn.nodemanager.aux-services.spark_shuffle.class > org.apache.spark.network.yarn.YarnShuffleService > > > yarn.nodemanager.aux-services.timeline_collector.class > > org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService > > > yarn.nodemanager.bind-host > 0.0.0.0 > > > yarn.nodemanager.container-executor.class > > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor > > > yarn.nodemanager.container-metrics.unregister-delay-ms > 6 > > > 
yarn.nodemanager.container-monitor.interval-ms > 3000 > > > yarn.nodemanager.delete.debug-delay-sec > 0 > > > > yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage > 90 > > > yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb > 1000 > > > yarn.nodemanager.disk-health-checker.min-healthy-disks > 0.25 > > > yarn.nodemanager.health-checker.interval-ms > 135000 > > > yarn.nodemanager.health-checker.script.timeout-ms > 6 > > > > yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage > false > > > yarn.nodemanager.linux-container-executor.group > hadoop > > > > yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users > false > > > yarn.nodemanager.local-dirs > /hadoop/yarn/local > > > yarn.nodemanager.log-aggregation.compression-type > gz > > > yarn.nodemanager.log-aggregation.debug-enabled > false > > > yarn.nodemanager.log-aggregation.num-log-files-per-app > 30 > > > > yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds > 3600 > > > yarn.nodemanager.log-dirs > /hadoop/yarn/log > > > yarn.nodemanager.log.retain-seconds > 604800 > > > yarn.nodemanager.pmem-check-enabled > false > > > yarn.nodemanager.recovery.dir > /var/log/hadoop-yarn/nodemanager/recovery-state > > > yarn.nodemanager.recovery.enabled > true > > > yarn.nodemanager.recovery.supervised > true > > > yarn.nodemanager.remote-app-log-dir > /app-logs > > >
[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488002#comment-16488002 ] Giovanni Matteo Fumarola commented on YARN-8348: Thanks [~elgoiri] for the review. I will open a follow-up Jira for KeyProviderTokenIssuer. As I said before, this patch will reduce the failed tests from 21 to 16 on Linux. As with HDFS-13558, which you and [~huanbang1993] fixed by closing the cluster, the patch will fix failures on Windows for TestHBaseTimelineStorageDomain and TestHBaseTimelineStorageSchema. > Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > > HBase tests are failing in > [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] > for 2 reasons: > * incorrect afterClass; > * not defined KeyProviderTokenIssuer. > While in windows are failing for the previous 2 reasons plus * missing > afterClass. > This Jira tracks the effort to fix part of HBase-tests and reduces the failed > tests in Linux.
[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored
[ https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487992#comment-16487992 ] Shane Kumpf commented on YARN-8342: --- This sounds like a reasonable proposal. In cases where the current behavior is desired, the user can set "launch_command" to an empty string I guess? To be clear, there is no replacement with an "empty bash". The current "untrusted" mode leaves it up to the Docker image to specify the ENTRYPOINT/CMD. Nothing is overwritten by YARN in this "untrusted" mode. It is very common for images to use "bash" as the CMD. When an image does this and YARN runs in this "untrusted" mode, a non-interactive "bash" shell starts in the container and immediately exits with success. YARN reports that the container ran successfully, but this is confusing to the user because the code they expected to run did not run. The launch script depends on mounts and "untrusted" mode strips all mounts, meaning we flat out can't use a launch_script in this mode as we would in "trusted" mode. Allowing the "launch_command" supplied by the user, without embedding that "launch_command" in the launch script seems like a viable way to support both. Confused yet? :) > Using docker image from a non-privileged registry, the launch_command is not > honored > > > Key: YARN-8342 > URL: https://issues.apache.org/jira/browse/YARN-8342 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Critical > Labels: Docker > > During test of the Docker feature, I found that if a container comes from > non-privileged docker registry, the specified launch command will be ignored. > Container will success without any log, which is very confusing to end users. > And this behavior is inconsistent to containers from privileged docker > registries. 
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]
[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487987#comment-16487987 ] Íñigo Goiri commented on YARN-8344: --- +1 on [^YARN-8344.v2.patch]. We still need to figure out the proper fix for the path length issue on Windows. [~giovanni.fumarola], please link this JIRA once you open the one for the Windows fix. > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > - > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch > > > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows.
[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487984#comment-16487984 ] Íñigo Goiri commented on YARN-8348: --- Good news is that the NPE is gone. However, the original NoClassDefFoundError now surfaces clearly. I'm fine committing this as is, but I'd like to have a follow-up JIRA on why KeyProviderTokenIssuer is not found. > Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > > HBase tests are failing in > [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] > for 2 reasons: > * incorrect afterClass; > * not defined KeyProviderTokenIssuer. > While in windows are failing for the previous 2 reasons plus * missing > afterClass. > This Jira tracks the effort to fix part of HBase-tests and reduces the failed > tests in Linux.
[jira] [Assigned] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records
[ https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang reassigned YARN-8333: --- Assignee: Eric Yang > Load balance YARN services using RegistryDNS multiple A records > --- > > Key: YARN-8333 > URL: https://issues.apache.org/jira/browse/YARN-8333 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn-native-services >Affects Versions: 3.1.0 >Reporter: Eric Yang >Assignee: Eric Yang >Priority: Major > > For scaling stateless containers, it would be great to support DNS round > robin for fault tolerance and load balancing. The current DNS record format > for RegistryDNS is > [container-instance].[application-name].[username].[domain]. For example: > {code} > appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120 > appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121 > appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122 > appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123 > {code} > It would be nice to add multi-A record that contains all IP addresses of the > same component in addition to the instance based records. For example: > {code} > appcatalog.appname.hbase.ycluster. IN A 123.123.123.120 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.121 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.122 > appcatalog.appname.hbase.ycluster. IN A 123.123.123.123 > {code}
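The record derivation proposed in YARN-8333 can be illustrated with a small, hedged Java sketch. It is not RegistryDNS code: the class `MultiARecordSketch` and both of its methods are hypothetical names, and it assumes the instance index is always a numeric `-N` suffix on the first DNS label, as in the examples quoted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical helper, not part of Hadoop's RegistryDNS.
public class MultiARecordSketch {

    // Strips a numeric "-N" instance suffix from the first DNS label,
    // e.g. "appcatalog-0.appname.hbase.ycluster." -> "appcatalog.appname.hbase.ycluster."
    public static String componentName(String instanceName) {
        int dot = instanceName.indexOf('.');
        String label = dot < 0 ? instanceName : instanceName.substring(0, dot);
        String rest = dot < 0 ? "" : instanceName.substring(dot);
        int dash = label.lastIndexOf('-');
        boolean numericSuffix = dash > 0 && dash < label.length() - 1
                && label.substring(dash + 1).chars().allMatch(Character::isDigit);
        return (numericSuffix ? label.substring(0, dash) : label) + rest;
    }

    // Groups per-instance A records (name -> IP) into one multi-A record per component.
    public static Map<String, List<String>> componentRecords(Map<String, String> instanceRecords) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Map.Entry<String, String> record : instanceRecords.entrySet()) {
            grouped.computeIfAbsent(componentName(record.getKey()), k -> new ArrayList<>())
                   .add(record.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, String> instances = new LinkedHashMap<>();
        instances.put("appcatalog-0.appname.hbase.ycluster.", "123.123.123.120");
        instances.put("appcatalog-1.appname.hbase.ycluster.", "123.123.123.121");
        // Emit the component-level records in zone-file style.
        componentRecords(instances).forEach((name, ips) ->
                ips.forEach(ip -> System.out.println(name + " IN A " + ip)));
    }
}
```

A resolver that rotates the order of the returned addresses would then give clients simple round-robin behaviour; whether RegistryDNS rotates answers is not stated in the issue.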
[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored
[ https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487952#comment-16487952 ] Eric Yang commented on YARN-8342: - [~vinodkv] The launch command was dropped in YARN-7516 due to concerns that shell expansion could cause commands to run as the root user via popen. With the YARN-7654 changes to use execvp, this concern has been nullified. It is safe to preserve the launch command even for untrusted images. [~shaneku...@gmail.com] [~ebadger] [~jlowe] Do you agree with this change? > Using docker image from a non-privileged registry, the launch_command is not > honored > > > Key: YARN-8342 > URL: https://issues.apache.org/jira/browse/YARN-8342 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Critical > Labels: Docker > > During test of the Docker feature, I found that if a container comes from > non-privileged docker registry, the specified launch command will be ignored. > Container will success without any log, which is very confusing to end users. > And this behavior is inconsistent to containers from privileged docker > registries. > cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]
[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487951#comment-16487951 ] genericqa commented on YARN-8348: -
| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 29s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 7 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 23m 13s | trunk passed |
| +1 | compile | 0m 21s | trunk passed |
| +1 | checkstyle | 0m 13s | trunk passed |
| +1 | mvnsite | 0m 22s | trunk passed |
| +1 | shadedclient | 9m 7s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests |
| +1 | findbugs | 0m 0s | trunk passed |
| +1 | javadoc | 0m 12s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 19s | the patch passed |
| +1 | compile | 0m 17s | the patch passed |
| +1 | javac | 0m 17s | the patch passed |
| +1 | checkstyle | 0m 11s | the patch passed |
| +1 | mvnsite | 0m 20s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 45s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests |
| +1 | findbugs | 0m 0s | the patch passed |
| +1 | javadoc | 0m 10s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 0m 27s | hadoop-yarn-server-timelineservice-hbase-tests in the patch failed. |
| +1 | asflicense | 0m 19s | The patch does not generate ASF License warnings. |
| | | 46m 1s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities |
| | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageSchema |
| | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps |
| | hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage |
| | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity |
| | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageDomain |
| | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction |
| | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce
[jira] [Assigned] (YARN-8349) Remove YARN registry entries when a service is killed by the RM
[ https://issues.apache.org/jira/browse/YARN-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf reassigned YARN-8349: - Assignee: Billie Rinaldi > Remove YARN registry entries when a service is killed by the RM > --- > > Key: YARN-8349 > URL: https://issues.apache.org/jira/browse/YARN-8349 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Affects Versions: 3.2.0, 3.1.1 >Reporter: Shane Kumpf >Assignee: Billie Rinaldi >Priority: Major > > As the title states, when a service is killed by the RM (for exceeding its > lifetime for example), the YARN registry entries should be cleaned up. > Without cleanup, DNS can contain multiple hostnames for a single IP address > in the case where IPs are reused. This impacts reverse lookups, which breaks > services, such as kerberos, that depend on those lookups.
[jira] [Created] (YARN-8349) Remove YARN registry entries when a service is killed by the RM
Shane Kumpf created YARN-8349: - Summary: Remove YARN registry entries when a service is killed by the RM Key: YARN-8349 URL: https://issues.apache.org/jira/browse/YARN-8349 Project: Hadoop YARN Issue Type: Sub-task Components: yarn-native-services Affects Versions: 3.2.0, 3.1.1 Reporter: Shane Kumpf As the title states, when a service is killed by the RM (for exceeding its lifetime for example), the YARN registry entries should be cleaned up. Without cleanup, DNS can contain multiple hostnames for a single IP address in the case where IPs are reused. This impacts reverse lookups, which breaks services, such as kerberos, that depend on those lookups.
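To make the reverse-lookup problem in YARN-8349 concrete, here is a minimal Java sketch under stated assumptions. It models the registry as an in-memory map from hostname to IP; `RegistryCleanupSketch`, `register`, `removeService` and `reverseLookup` are hypothetical names for illustration and do not correspond to the YARN registry API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: an in-memory stand-in for registry records, not YARN code.
public class RegistryCleanupSketch {
    private final Map<String, String> hostToIp = new LinkedHashMap<>();

    // Records a hostname -> IP mapping, as a running service instance would.
    public void register(String hostname, String ip) {
        hostToIp.put(hostname, ip);
    }

    // Cleanup step: drops every record belonging to the killed service.
    public void removeService(String serviceSuffix) {
        hostToIp.keySet().removeIf(host -> host.endsWith(serviceSuffix));
    }

    // Returns all hostnames currently mapped to the given IP.
    public List<String> reverseLookup(String ip) {
        List<String> hosts = new ArrayList<>();
        for (Map.Entry<String, String> entry : hostToIp.entrySet()) {
            if (entry.getValue().equals(ip)) {
                hosts.add(entry.getKey());
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        RegistryCleanupSketch registry = new RegistryCleanupSketch();
        registry.register("web-0.oldsvc.user.example.", "172.17.0.2");
        // The old service is killed without cleanup and its IP is reused.
        registry.register("db-0.newsvc.user.example.", "172.17.0.2");
        System.out.println("before cleanup: " + registry.reverseLookup("172.17.0.2"));
        registry.removeService(".oldsvc.user.example.");
        System.out.println("after cleanup: " + registry.reverseLookup("172.17.0.2"));
    }
}
```

Before cleanup the reverse lookup is ambiguous because two hostnames share one IP, which is the situation the issue says breaks services such as Kerberos.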
[jira] [Resolved] (YARN-8334) [GPG] Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang resolved YARN-8334. Resolution: Fixed > [GPG] Fix potential connection leak in GPGUtils > --- > > Key: YARN-8334 > URL: https://issues.apache.org/jira/browse/YARN-8334 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: YARN-8334-YARN-7402.v1.patch, > YARN-8334-YARN-7402.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak.
[jira] [Commented] (YARN-8334) [GPG] Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487945#comment-16487945 ] Botong Huang commented on YARN-8334: Committed to YARN-7402 as db183f2ea. Thanks [~giovanni.fumarola] for the patch and [~elgoiri] for the review! > [GPG] Fix potential connection leak in GPGUtils > --- > > Key: YARN-8334 > URL: https://issues.apache.org/jira/browse/YARN-8334 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: YARN-8334-YARN-7402.v1.patch, > YARN-8334-YARN-7402.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak.
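The kind of leak described in YARN-8334 is usually avoided with a try/finally block that releases both the response and the client. The sketch below shows that general pattern only; it is not the committed patch. To stay self-contained it uses hypothetical stand-ins (`FakeClient`, `FakeResponse`) whose `destroy()` and `close()` methods mirror the `Client.destroy` and `ClientResponse.close` calls named in the issue, and the URL is a placeholder.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: stand-in classes, not the Jersey client or GPGUtils.
public class ConnectionCleanupSketch {
    // Counts resources that have been opened but not yet released.
    public static final AtomicInteger OPEN = new AtomicInteger();

    static class FakeResponse {
        FakeResponse() { OPEN.incrementAndGet(); }
        String getEntity() { return "ok"; }
        void close() { OPEN.decrementAndGet(); }
    }

    static class FakeClient {
        FakeClient() { OPEN.incrementAndGet(); }
        FakeResponse get(String url) { return new FakeResponse(); }
        void destroy() { OPEN.decrementAndGet(); }
    }

    // Fetches an entity and always releases the response and the client,
    // even if reading the entity throws.
    public static String fetch(String url) {
        FakeClient client = new FakeClient();
        FakeResponse response = null;
        try {
            response = client.get(url);
            return response.getEntity();
        } finally {
            if (response != null) {
                response.close();
            }
            client.destroy();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch("http://example.invalid/ws") + ", open resources: " + OPEN.get());
    }
}
```

Putting both release calls in `finally` means an exception while reading the entity cannot skip them, which is the property the issue title asks for.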
[jira] [Reopened] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services
[ https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reopened YARN-7530: --- This change breaks branch-3.1 compilation if the .m2 directory is cleaned.
{noformat}
[ERROR]
[ERROR] Some problems were encountered while processing the POMs:
[WARNING] 'parent.relativePath' of POM org.apache.hadoop:hadoop-yarn-services-api:[unknown-version] (/Users/ebadger/apachehadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml) points at org.apache.hadoop:hadoop-yarn-services instead of org.apache.hadoop:hadoop-yarn-applications, please verify your project structure @ line 19, column 11
[FATAL] Non-resolvable parent POM for org.apache.hadoop:hadoop-yarn-services-api:[unknown-version]: Could not find artifact org.apache.hadoop:hadoop-yarn-applications:pom:3.1.1-SNAPSHOT and 'parent.relativePath' points at wrong local POM @ line 19, column 11
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-gpg-plugin is missing. @ line 133, column 15
 @
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR] The project org.apache.hadoop:hadoop-yarn-services-api:[unknown-version] (/Users/ebadger/apachehadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml) has 1 error
[ERROR] Non-resolvable parent POM for org.apache.hadoop:hadoop-yarn-services-api:[unknown-version]: Could not find artifact org.apache.hadoop:hadoop-yarn-applications:pom:3.1.1-SNAPSHOT and 'parent.relativePath' points at wrong local POM @ line 19, column 11 -> [Help 2]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException
{noformat}
Here's the difference between branch-3.1 and trunk. The artifactId was updated correctly in trunk, but not branch-3.1
{noformat}
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml
index 45168a9fbc4..d45da093102 100644
--- a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml
@@ -18,8 +18,8 @@
 4.0.0
 org.apache.hadoop
-hadoop-yarn-services
-3.2.0-SNAPSHOT
+hadoop-yarn-applications
+3.1.1-SNAPSHOT
 hadoop-yarn-services-api
 Apache Hadoop YARN Services API
{noformat}
> hadoop-yarn-services-api should be part of hadoop-yarn-services > --- > > Key: YARN-7530 > URL: https://issues.apache.org/jira/browse/YARN-7530 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Affects Versions: 3.1.0 >Reporter: Eric Yang >Assignee: Chandni Singh >Priority: Trivial > Fix For: 3.2.0, 3.1.1 > > Attachments: YARN-7530.001.patch, YARN-7530.002.patch > > > Hadoop-yarn-services-api is currently a parallel project to > hadoop-yarn-services project. It would be better if hadoop-yarn-services-api > is part of hadoop-yarn-services for correctness.
[jira] [Updated] (YARN-8334) [] Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8334: --- Summary: [] Fix potential connection leak in GPGUtils (was: Fix potential connection leak in GPGUtils) > [] Fix potential connection leak in GPGUtils > > > Key: YARN-8334 > URL: https://issues.apache.org/jira/browse/YARN-8334 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: YARN-8334-YARN-7402.v1.patch, > YARN-8334-YARN-7402.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak.
[jira] [Updated] (YARN-8334) [GPG] Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8334: --- Summary: [GPG] Fix potential connection leak in GPGUtils (was: [] Fix potential connection leak in GPGUtils) > [GPG] Fix potential connection leak in GPGUtils > --- > > Key: YARN-8334 > URL: https://issues.apache.org/jira/browse/YARN-8334 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: YARN-8334-YARN-7402.v1.patch, > YARN-8334-YARN-7402.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak.
[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.
[ https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487902#comment-16487902 ] Eric Payne commented on YARN-4781: -- bq. FairOrdering policy could be used with weights? [~sunilg], the fair ordering preemption will generally select the smaller-weighted users first even when those containers are older. It's a hierarchy of priority ordering, though, and it does still try to be "fair," so you could have a situation where the youngest containers are selected even though they are owned by a more heavily-weighted user. > Support intra-queue preemption for fairness ordering policy. > > > Key: YARN-4781 > URL: https://issues.apache.org/jira/browse/YARN-4781 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Wangda Tan >Assignee: Eric Payne >Priority: Major > Attachments: YARN-4781.001.patch, YARN-4781.002.patch, > YARN-4781.003.patch, YARN-4781.004.patch, YARN-4781.005.patch > > > We introduced fairness queue policy since YARN-3319, which will let large > applications make progresses and not starve small applications. However, if a > large application takes the queue’s resources, and containers of the large > app has long lifespan, small applications could still wait for resources for > long time and SLAs cannot be guaranteed. > Instead of wait for application release resources on their own, we need to > preempt resources of queue with fairness policy enabled.
[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487894#comment-16487894 ] genericqa commented on YARN-8344: -
| (/) *+1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 35s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 26m 16s | trunk passed |
| +1 | compile | 0m 59s | trunk passed |
| +1 | checkstyle | 0m 25s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | shadedclient | 11m 20s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 56s | trunk passed |
| +1 | javadoc | 0m 24s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 35s | the patch passed |
| +1 | compile | 0m 51s | the patch passed |
| +1 | javac | 0m 51s | the patch passed |
| +1 | checkstyle | 0m 21s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 29 unchanged - 2 fixed = 29 total (was 31) |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 45s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 59s | the patch passed |
| +1 | javadoc | 0m 22s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 18m 56s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 23s | The patch does not generate ASF License warnings. |
| | | 76m 25s | |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8344 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924796/YARN-8344.v2.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux db23130e69a9 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 51ce02b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20841/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20841/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT
[jira] [Commented] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
[ https://issues.apache.org/jira/browse/YARN-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487895#comment-16487895 ] Hudson commented on YARN-8336: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14272 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14272/]) YARN-8336. Fix potential connection leak in SchedConfCLI and (inigoiri: rev e30938af1270e079587e7bc06b755f9e93e660a5) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/SchedConfCLI.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/YarnWebServiceUtils.java > Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils > - > > Key: YARN-8336 > URL: https://issues.apache.org/jira/browse/YARN-8336 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Fix For: 3.2.0 > > Attachments: YARN-8336.v1.patch, YARN-8336.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
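The leak described in YARN-8336 (a response that is never closed and a client that is never destroyed) can be illustrated with a minimal sketch. FakeClient and FakeResponse below are hypothetical stand-ins written for this example, not Jersey's Client or ClientResponse, and the sketch does not claim to reproduce the committed patch. It only shows the general shape: release the response first, then the client, on every exit path.

```java
import java.util.ArrayList;
import java.util.List;

public class ConnectionLeakSketch {
    // Records the order of calls so the cleanup sequence can be inspected.
    static final List<String> events = new ArrayList<>();

    // Hypothetical stand-in for an HTTP response that holds a connection.
    static class FakeResponse implements AutoCloseable {
        String body() {
            return "{}";
        }

        @Override
        public void close() {
            events.add("response.close");
        }
    }

    // Hypothetical stand-in for an HTTP client that owns pooled resources.
    static class FakeClient {
        FakeResponse get(String url) {
            events.add("get " + url);
            return new FakeResponse();
        }

        void destroy() {
            events.add("client.destroy");
        }
    }

    // Reads the body and releases both the response and the client,
    // even if reading the body throws.
    static String fetch(String url) {
        FakeClient client = new FakeClient();
        try {
            try (FakeResponse response = client.get(url)) {
                return response.body();
            }
        } finally {
            client.destroy();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch("http://rm.example.invalid/ws/v1/cluster/info"));
        System.out.println(events);
    }
}
```

The inner try-with-resources closes the response before the outer finally destroys the client, so neither call is skipped when an exception is thrown in between.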
[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487886#comment-16487886 ] Íñigo Goiri commented on YARN-8348: --- Technically the null check in AfterClass shouldn't be needed, as a failure in BeforeClass should trigger the error everywhere else. In any case, it is good not to get an NPE if BeforeClass fails. So in the output we went from a double NoClassDefFound+NPE to just NoClassDefFound. I think this is an improvement, but we need to figure out the reason for the NoClassDefFound (probably a separate JIRA). The real fix here would be the one in TestHBaseTimelineStorageDomain, which leaves the mini cluster open. [^YARN-8348.v1.patch] LGTM. Let's wait for Yetus. > Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > > HBase tests are failing in > [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] > for 2 reasons: > * incorrect afterClass; > * not defined KeyProviderTokenIssuer. > While in windows are failing for the previous 2 reasons plus * missing > afterClass. > This Jira tracks the effort to fix part of HBase-tests and reduces the failed > tests in Linux. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
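The null check discussed in the comment above can be sketched without JUnit. MiniCluster is a hypothetical stand-in for the HBase testing utility, and the method names only mirror the ones that appear in the error output elsewhere in this thread. The sketch is not the content of YARN-8348.v1.patch. It only shows why a guarded teardown reports a single setup failure instead of a setup failure plus an NPE.

```java
public class TeardownGuardSketch {
    // Hypothetical stand-in for an HBase mini cluster test utility.
    static class MiniCluster {
        boolean running = true;

        void shutdown() {
            running = false;
        }
    }

    // Stays null when setup fails before the cluster is created.
    static MiniCluster util;

    // Simulates a BeforeClass method; failSetup mimics the NoClassDefFound case.
    static void setupBeforeClass(boolean failSetup) {
        if (failSetup) {
            throw new NoClassDefFoundError("simulated missing class");
        }
        util = new MiniCluster();
    }

    // Simulates an AfterClass method. Returns true if a cluster was shut down.
    // Without the null check, a failed setup would add an NPE to the report.
    static boolean tearDownAfterClass() {
        if (util != null) {
            util.shutdown();
            util = null;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        try {
            setupBeforeClass(true);
        } catch (NoClassDefFoundError e) {
            System.out.println("setup failed: " + e.getMessage());
        }
        System.out.println("cluster shut down: " + tearDownAfterClass());
    }
}
```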
[jira] [Commented] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"
[ https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487887#comment-16487887 ] genericqa commented on YARN-8346: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 30s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 6s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 61m 51s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | YARN-8346 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12924799/YARN-8346.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 2a58b0d4306d 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 51ce02b | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20842/testReport/ | | Max. process+thread count | 303 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/20842/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Upgrading to 3.1 kills running containers with error "Opportunistic container > queue is full" >
[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487861#comment-16487861 ] Giovanni Matteo Fumarola commented on YARN-8348: [^YARN-8348.v1.patch] will reduce the number of failed tests from 21 to 16. [Link to failed tests|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] > Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > > HBase tests are failing in > [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] > for 2 reasons: > * incorrect afterClass; > * not defined KeyProviderTokenIssuer. > While in windows are failing for the previous 2 reasons plus * missing > afterClass. > This Jira tracks the effort to fix part of HBase-tests and reduces the failed > tests in Linux. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8348: --- Description: HBase tests are failing in [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] for 2 reasons: * incorrect afterClass; * not defined KeyProviderTokenIssuer. While in windows are failing for the previous 2 reasons plus * missing afterClass. This Jira tracks the effort to fix part of HBase-tests and reduces the failed tests in Linux. was: HBase tests are failing in [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] for 2 reasons: * incorrect > Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > > HBase tests are failing in > [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] > for 2 reasons: > * incorrect afterClass; > * not defined KeyProviderTokenIssuer. > While in windows are failing for the previous 2 reasons plus * missing > afterClass. > This Jira tracks the effort to fix part of HBase-tests and reduces the failed > tests in Linux. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8348: --- Description: HBase tests are failing in [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] for 2 reasons: * incorrect > Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > > HBase tests are failing in > [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/] > for 2 reasons: > * incorrect -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487848#comment-16487848 ] Giovanni Matteo Fumarola commented on YARN-8348:

Before my patch:
[ERROR] Errors:
[ERROR] TestTimelineReaderWebServicesHBaseStorage.setupBeforeClass:79->AbstractTimelineReaderHBaseTestBase.setup:60 » NoClassDefFound
[ERROR] org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps.org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
[ERROR] Run 1: TestHBaseTimelineStorageApps.setupBeforeClass:97 » NoClassDefFound org/apache/...
[ERROR] Run 2: TestHBaseTimelineStorageApps.tearDownAfterClass:1939 NullPointer
[INFO]
[ERROR] TestHBaseTimelineStorageDomain.setupBeforeClass:51 » NoClassDefFound org/apach...
[ERROR] org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities.org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities
[ERROR] Run 1: TestHBaseTimelineStorageEntities.setupBeforeClass:110 » NoClassDefFound org/ap...
[ERROR] Run 2: TestHBaseTimelineStorageEntities.tearDownAfterClass:1882 NullPointer
[INFO]
[ERROR] TestHBaseTimelineStorageSchema.setupBeforeClass:49 » NoClassDefFound org/apach...
[ERROR] org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity.org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
[ERROR] Run 1: TestHBaseStorageFlowActivity.setupBeforeClass:71 » NoClassDefFound org/apache/...
[ERROR] Run 2: TestHBaseStorageFlowActivity.tearDownAfterClass:495 NullPointer
[INFO]
[ERROR] org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun.org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
[ERROR] Run 1: TestHBaseStorageFlowRun.setupBeforeClass:83 » NoClassDefFound org/apache/hadoo...
[ERROR] Run 2: TestHBaseStorageFlowRun.tearDownAfterClass:1078 NullPointer
[INFO]
[ERROR] org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction.org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
[ERROR] Run 1: TestHBaseStorageFlowRunCompaction.setupBeforeClass:82 » NoClassDefFound org/ap...
[ERROR] Run 2: TestHBaseStorageFlowRunCompaction.tearDownAfterClass:853 NullPointer

After my patch:
[ERROR] Errors:
[ERROR] TestTimelineReaderWebServicesHBaseStorage.setupBeforeClass:79->AbstractTimelineReaderHBaseTestBase.setup:60 » NoClassDefFound
[ERROR] TestHBaseTimelineStorageApps.setupBeforeClass:97 » NoClassDefFound org/apache/...
[ERROR] TestHBaseTimelineStorageDomain.setupBeforeClass:52 » NoClassDefFound org/apach...
[ERROR] TestHBaseTimelineStorageEntities.setupBeforeClass:110 » NoClassDefFound org/ap...
[ERROR] TestHBaseTimelineStorageSchema.setupBeforeClass:50 » NoClassDefFound org/apach...
[ERROR] TestHBaseStorageFlowActivity.setupBeforeClass:71 » NoClassDefFound org/apache/...
[ERROR] TestHBaseStorageFlowRun.setupBeforeClass:83 » NoClassDefFound org/apache/hadoo...
[ERROR] TestHBaseStorageFlowRunCompaction.setupBeforeClass:82 » NoClassDefFound org/ap...

> Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola reassigned YARN-8348: -- Assignee: Giovanni Matteo Fumarola > Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
[ https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8348: --- Attachment: YARN-8348.v1.patch > Incorrect and missing AfterClass in HBase-tests > --- > > Key: YARN-8348 > URL: https://issues.apache.org/jira/browse/YARN-8348 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8348.v1.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
[ https://issues.apache.org/jira/browse/YARN-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487775#comment-16487775 ] Íñigo Goiri edited comment on YARN-8336 at 5/23/18 6:54 PM: Both [TestLogsCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestLogsCLI/] and [TestSchedConfCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestSchedConfCLI/] pass. +1 Committing to trunk. was (Author: elgoiri): Both [TestLogsCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestLogsCLI/] and [TestSchedConfCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestSchedConfCLI/] pass. +1 Feel free to commit. > Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils > - > > Key: YARN-8336 > URL: https://issues.apache.org/jira/browse/YARN-8336 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8336.v1.patch, YARN-8336.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8348) Incorrect and missing AfterClass in HBase-tests
Giovanni Matteo Fumarola created YARN-8348: -- Summary: Incorrect and missing AfterClass in HBase-tests Key: YARN-8348 URL: https://issues.apache.org/jira/browse/YARN-8348 Project: Hadoop YARN Issue Type: Bug Reporter: Giovanni Matteo Fumarola -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8108) RM metrics rest API throws GSSException in kerberized environment
[ https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487825#comment-16487825 ] Eric Yang commented on YARN-8108: - [~yzhangal] My preference is to fix this in the 3.0.3 release. If consensus is not reached, the release manager can push this out of the 3.0.3 release and document it in the release notes as a known issue. I am fine with the plan. > RM metrics rest API throws GSSException in kerberized environment > - > > Key: YARN-8108 > URL: https://issues.apache.org/jira/browse/YARN-8108 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Kshitij Badani >Assignee: Eric Yang >Priority: Blocker > Attachments: YARN-8108.001.patch > > > Test is trying to pull up metrics data from SHS after kiniting as 'test_user' > It is throwing GSSException as follows > {code:java} > b2b460b80713|RUNNING: curl --silent -k -X GET -D > /hwqe/hadoopqe/artifacts/tmp-94845 --negotiate -u : > http://rm_host:8088/proxy/application_1518674952153_0070/metrics/json2018-02-15 > 07:15:48,757|INFO|MainThread|machine.py:194 - > run()||GUID=fc5a3266-28f8-4eed-bae2-b2b460b80713|Exit Code: 0 > 2018-02-15 07:15:48,758|INFO|MainThread|spark.py:1757 - > getMetricsJsonData()|metrics: > > > > Error 403 GSSException: Failure unspecified at GSS-API level > (Mechanism level: Request is a replay (34)) > > HTTP ERROR 403 > Problem accessing /proxy/application_1518674952153_0070/metrics/json. > Reason: > GSSException: Failure unspecified at GSS-API level (Mechanism level: > Request is a replay (34)) > > > {code} > Rootcausing : proxyserver on RM can't be supported for Kerberos enabled > cluster because AuthenticationFilter is applied twice in Hadoop code (once in > httpServer2 for RM, and another instance from AmFilterInitializer for proxy > server). 
This will require code changes to hadoop-yarn-server-web-proxy > project -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
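The root cause given in the YARN-8108 description (the same request passing through two authentication filter instances) can be pictured with a toy replay cache. This is a sketch under stated assumptions: ReplayCacheSketch and statusFor are hypothetical names, the cache is a plain in-memory set rather than any real Kerberos or Hadoop class, and the thread does not spell out the exact mechanism. The point is only that a token accepted once is rejected when validated a second time.

```java
import java.util.HashSet;
import java.util.Set;

public class ReplayCacheSketch {
    // Hypothetical replay cache: remembers every authenticator already accepted.
    private final Set<String> seen = new HashSet<>();

    // Accepts an authenticator the first time only; a repeat counts as a replay.
    boolean accept(String authenticator) {
        return seen.add(authenticator);
    }

    // Runs the same token through filterCount filters that share one cache
    // (an assumption of this sketch) and returns the resulting HTTP status.
    static int statusFor(ReplayCacheSketch cache, String token, int filterCount) {
        for (int i = 0; i < filterCount; i++) {
            if (!cache.accept(token)) {
                return 403; // second validation of the same token looks like a replay
            }
        }
        return 200;
    }

    public static void main(String[] args) {
        System.out.println(statusFor(new ReplayCacheSketch(), "token-1", 1));
        System.out.println(statusFor(new ReplayCacheSketch(), "token-1", 2));
    }
}
```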
[jira] [Commented] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"
[ https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487810#comment-16487810 ] Yongjun Zhang commented on YARN-8346: - Thanks a lot for the quick turnaround [~jlowe] and [~kkaranasos]. > Upgrading to 3.1 kills running containers with error "Opportunistic container > queue is full" > > > Key: YARN-8346 > URL: https://issues.apache.org/jira/browse/YARN-8346 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0, 3.0.2 >Reporter: Rohith Sharma K S >Assignee: Jason Lowe >Priority: Blocker > Attachments: YARN-8346.001.patch > > > It is observed while rolling upgrade from 2.8.4 to 3.1 release, all the > running containers are killed and second attempt is launched for that > application. The diagnostics message is "Opportunistic container queue is > full" which is the reason for container killed. > In NM log, I see below logs for after container is recovered. > {noformat} > 2018-05-23 17:18:50,655 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: > Opportunistic container [container_e06_1527075664705_0001_01_01] will > not be queued at the NMsince max queue length [0] has been reached > {noformat} > Following steps are executed for rolling upgrade > # Install 2.8.4 cluster and launch a MR job with distributed cache enabled. > # Stop 2.8.4 RM. Start 3.1.0 RM with same configuration. > # Stop 2.8.4 NM batch by batch. Start 3.1.0 NM batch by batch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8108) RM metrics rest API throws GSSException in kerberized environment
[ https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487806#comment-16487806 ] Yongjun Zhang commented on YARN-8108: - Hi [~eyang], It seems the issue also exists in 3.0.2 release. The above discussion indicates that it might take some time for the solution to converge, should we move 3.0.3 out of the target release and list this jira as a known issue for 3.0.3? or we should fix this issue in 3.0.3? Thanks. > RM metrics rest API throws GSSException in kerberized environment > - > > Key: YARN-8108 > URL: https://issues.apache.org/jira/browse/YARN-8108 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Kshitij Badani >Assignee: Eric Yang >Priority: Blocker > Attachments: YARN-8108.001.patch > > > Test is trying to pull up metrics data from SHS after kiniting as 'test_user' > It is throwing GSSException as follows > {code:java} > b2b460b80713|RUNNING: curl --silent -k -X GET -D > /hwqe/hadoopqe/artifacts/tmp-94845 --negotiate -u : > http://rm_host:8088/proxy/application_1518674952153_0070/metrics/json2018-02-15 > 07:15:48,757|INFO|MainThread|machine.py:194 - > run()||GUID=fc5a3266-28f8-4eed-bae2-b2b460b80713|Exit Code: 0 > 2018-02-15 07:15:48,758|INFO|MainThread|spark.py:1757 - > getMetricsJsonData()|metrics: > > > > Error 403 GSSException: Failure unspecified at GSS-API level > (Mechanism level: Request is a replay (34)) > > HTTP ERROR 403 > Problem accessing /proxy/application_1518674952153_0070/metrics/json. > Reason: > GSSException: Failure unspecified at GSS-API level (Mechanism level: > Request is a replay (34)) > > > {code} > Rootcausing : proxyserver on RM can't be supported for Kerberos enabled > cluster because AuthenticationFilter is applied twice in Hadoop code (once in > httpServer2 for RM, and another instance from AmFilterInitializer for proxy > server). 
This will require code changes to hadoop-yarn-server-web-proxy > project -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"
[ https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487803#comment-16487803 ] Konstantinos Karanasos commented on YARN-8346: -- Thanks for the patch, [~jlowe]. Indeed you are right – the problem is the lack of execution type. The queue size should remain 0 given that opportunistic containers are disabled in this case. +1 for the patch. > Upgrading to 3.1 kills running containers with error "Opportunistic container > queue is full" > > > Key: YARN-8346 > URL: https://issues.apache.org/jira/browse/YARN-8346 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0, 3.0.2 >Reporter: Rohith Sharma K S >Assignee: Jason Lowe >Priority: Blocker > Attachments: YARN-8346.001.patch > > > It is observed while rolling upgrade from 2.8.4 to 3.1 release, all the > running containers are killed and second attempt is launched for that > application. The diagnostics message is "Opportunistic container queue is > full" which is the reason for container killed. > In NM log, I see below logs for after container is recovered. > {noformat} > 2018-05-23 17:18:50,655 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: > Opportunistic container [container_e06_1527075664705_0001_01_01] will > not be queued at the NMsince max queue length [0] has been reached > {noformat} > Following steps are executed for rolling upgrade > # Install 2.8.4 cluster and launch a MR job with distributed cache enabled. > # Stop 2.8.4 RM. Start 3.1.0 RM with same configuration. > # Stop 2.8.4 NM batch by batch. Start 3.1.0 NM batch by batch. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
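The interaction described above (a recovered container with no execution type meeting an opportunistic queue of length 0) can be sketched as follows. The enum, resolve, and admit are hypothetical simplifications written for this example, not the NodeManager's ContainerScheduler, and the thread does not say whether YARN-8346.001.patch works this way. The sketch assumes that a missing execution type is treated as GUARANTEED and that guaranteed containers bypass the opportunistic queue limit.

```java
public class ExecutionTypeDefaultSketch {
    enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    // State written by an older release may carry no execution type at all.
    // Assumption of this sketch: such containers are treated as GUARANTEED.
    static ExecutionType resolve(ExecutionType recovered) {
        return recovered == null ? ExecutionType.GUARANTEED : recovered;
    }

    // Simplified admission check: only opportunistic containers are subject
    // to the queue limit. With maxQueueLength == 0 none of them is queued.
    static boolean admit(ExecutionType type, int queued, int maxQueueLength) {
        if (type == ExecutionType.GUARANTEED) {
            return true;
        }
        return queued < maxQueueLength;
    }

    public static void main(String[] args) {
        ExecutionType recovered = null; // container recovered from pre-upgrade state
        int maxQueueLength = 0;         // opportunistic containers disabled

        // Misreading the container as opportunistic rejects it.
        System.out.println(admit(ExecutionType.OPPORTUNISTIC, 0, maxQueueLength));
        // Defaulting the missing type keeps the container running.
        System.out.println(admit(resolve(recovered), 0, maxQueueLength));
    }
}
```

Under these assumptions the queue length can stay at 0, as the comment above says it should, because recovered containers never reach the opportunistic queue check.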
[jira] [Comment Edited] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487778#comment-16487778 ] Giovanni Matteo Fumarola edited comment on YARN-8344 at 5/23/18 6:22 PM: - Attached v2 with the fix for the Checkstyle warning. If any test in this class fails, all the other tests will fail (same behavior in Windows or Linux). testContainerResourceIncreaseIsSynchronizedWithRMResync fails in Windows due to the length of the log directory. This patch will fix testKillContainersOnResync. java.io.IOException: Cannot launch container using script at path F:/short/hadoop-trunk-win/s/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync/nm0/usercache/nobody/appcache/application_0_/container_0__01_00/default_container_executor.cmd, because it exceeds the maximum supported path length of 260 characters. Consider configuring shorter directories in yarn.nodemanager.local-dirs. I saw a bunch of tests failing in Windows for this reason. I will open a Jira to track this fix. was (Author: giovanni.fumarola): Attached v2 with the fix for Check style warning. If any test in this class fails all the other tests will fail (same behavior in Windows or Linux). testContainerResourceIncreaseIsSynchronizedWithRMResync fails in Windows - still figuring out the root cause. This patch will fix testKillContainersOnResync > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > - > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch > > > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
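The path-length error quoted in the YARN-8344 comment above can be illustrated with a small check. This is only a sketch under stated assumptions: the limit of 260 is taken from the quoted error message, while the class name WindowsPathLengthCheck, its methods, and the sample directory names are hypothetical and are not part of Hadoop.

```java
/** Hypothetical helper, not a Hadoop class: checks a path against the quoted limit. */
final class WindowsPathLengthCheck {
    // 260 is the limit named in the quoted java.io.IOException message.
    static final int MAX_PATH_LENGTH = 260;

    /** Returns true when the path is longer than the quoted limit (assumed strict comparison). */
    static boolean exceedsLimit(String path) {
        return path.length() > MAX_PATH_LENGTH;
    }

    /** Joins a base directory and a relative path with a single '/'. */
    static String join(String baseDir, String relative) {
        return baseDir.endsWith("/") ? baseDir + relative : baseDir + "/" + relative;
    }

    public static void main(String[] args) {
        // Hypothetical relative layout, loosely modelled on a container launch script path.
        String relative = "usercache/nobody/appcache/app/container/default_container_executor.cmd";

        // A short base directory leaves plenty of room under the limit.
        String shortBase = "F:/t";

        // A deeply nested base directory pushes the full path over the limit.
        StringBuilder longBase = new StringBuilder("F:/");
        while (longBase.length() < 250) {
            longBase.append("nested-directory/");
        }

        System.out.println(exceedsLimit(join(shortBase, relative)));
        System.out.println(exceedsLimit(join(longBase.toString(), relative)));
    }
}
```

The sketch shows why the error message suggests shorter directories in yarn.nodemanager.local-dirs: the relative part of the script path is fixed by the layout, so only the base directory can be shortened.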
[jira] [Updated] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8344: --- Summary: Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync (was: Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows) > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > - > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch > > > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows.
[jira] [Updated] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"
[ https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-8346: - Attachment: YARN-8346.001.patch > Upgrading to 3.1 kills running containers with error "Opportunistic container > queue is full" > > > Key: YARN-8346 > URL: https://issues.apache.org/jira/browse/YARN-8346 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0, 3.0.2 >Reporter: Rohith Sharma K S >Assignee: Jason Lowe >Priority: Blocker > Attachments: YARN-8346.001.patch > > > During a rolling upgrade from 2.8.4 to the 3.1 release, all the > running containers are killed and a second attempt is launched for that > application. The diagnostics message, "Opportunistic container queue is > full", gives the reason the containers were killed. > In the NM log, I see the log below after the container is recovered. > {noformat} > 2018-05-23 17:18:50,655 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: > Opportunistic container [container_e06_1527075664705_0001_01_01] will > not be queued at the NMsince max queue length [0] has been reached > {noformat} > The following steps were executed for the rolling upgrade: > # Install 2.8.4 cluster and launch a MR job with distributed cache enabled. > # Stop 2.8.4 RM. Start 3.1.0 RM with same configuration. > # Stop 2.8.4 NM batch by batch. Start 3.1.0 NM batch by batch.
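The NM log line quoted in the YARN-8346 description above reports a maximum queue length of 0. As a hedged illustration of why a limit of 0 admits nothing, here is a toy bounded admission queue. It is an assumption-laden sketch: BoundedAdmissionQueue and its methods are hypothetical and do not reproduce the NodeManager's ContainerScheduler, and the sketch says nothing about why recovered containers reach this check.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy queue, not Hadoop code: rejects entries once the limit is reached. */
final class BoundedAdmissionQueue {
    private final int maxLength;
    private final Deque<String> queued = new ArrayDeque<>();

    BoundedAdmissionQueue(int maxLength) {
        this.maxLength = maxLength;
    }

    /** Queues the id and returns true, or returns false when the queue is already at its limit. */
    boolean offer(String containerId) {
        if (queued.size() >= maxLength) {
            return false; // with maxLength == 0 this branch is always taken
        }
        queued.addLast(containerId);
        return true;
    }

    int size() {
        return queued.size();
    }

    public static void main(String[] args) {
        BoundedAdmissionQueue zero = new BoundedAdmissionQueue(0);
        System.out.println(zero.offer("container_a")); // rejected: limit is 0

        BoundedAdmissionQueue one = new BoundedAdmissionQueue(1);
        System.out.println(one.offer("container_a")); // accepted
        System.out.println(one.offer("container_b")); // rejected: queue is full
    }
}
```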
[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487778#comment-16487778 ] Giovanni Matteo Fumarola commented on YARN-8344: Attached v2 with the fix for Check style warning. If any test in this class fails all the other tests will fail (same behavior in Windows or Linux). testContainerResourceIncreaseIsSynchronizedWithRMResync fails in Windows - still figuring out the root cause. This patch will fix testKillContainersOnResync > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows > > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch > > > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows.
[jira] [Commented] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
[ https://issues.apache.org/jira/browse/YARN-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487775#comment-16487775 ] Íñigo Goiri commented on YARN-8336: --- Both [TestLogsCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestLogsCLI/] and [TestSchedConfCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestSchedConfCLI/] pass. +1 Feel free to commit. > Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils > - > > Key: YARN-8336 > URL: https://issues.apache.org/jira/browse/YARN-8336 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8336.v1.patch, YARN-8336.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak.
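The YARN-8336 description above says that a missing ClientResponse.close and Client.destroy can leak a connection. The general cleanup pattern can be sketched as follows. This is not the attached patch: FakeClient and FakeResponse are hypothetical stand-ins that only borrow the method names close() and destroy(), so that the example runs without any HTTP library, and the URL is a placeholder.

```java
/** Hypothetical sketch of releasing a response and its client on every code path. */
final class ConnectionCleanupSketch {

    /** Stand-in for a response object that holds an open connection. */
    static final class FakeResponse {
        boolean closed;

        void close() {
            closed = true;
        }
    }

    /** Stand-in for a client object that owns connection resources. */
    static final class FakeClient {
        boolean destroyed;
        final FakeResponse lastResponse = new FakeResponse();

        FakeResponse get(String url) {
            return lastResponse;
        }

        void destroy() {
            destroyed = true;
        }
    }

    /** Performs a request and releases both objects, even if reading the response fails. */
    static String fetch(FakeClient client, String url, boolean failWhileReading) {
        FakeResponse response = null;
        try {
            response = client.get(url);
            if (failWhileReading) {
                throw new IllegalStateException("simulated read failure");
            }
            return "ok";
        } finally {
            try {
                if (response != null) {
                    response.close(); // release the connection held by the response
                }
            } finally {
                client.destroy(); // runs even if close() itself throws
            }
        }
    }

    public static void main(String[] args) {
        FakeClient client = new FakeClient();
        System.out.println(fetch(client, "http://example.invalid/resource", false));
        System.out.println(client.lastResponse.closed && client.destroyed);
    }
}
```

Nesting the second try/finally is a design choice in this sketch: it keeps the client cleanup from being skipped if closing the response throws.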
[jira] [Comment Edited] (YARN-8334) Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487766#comment-16487766 ] Íñigo Goiri edited comment on YARN-8334 at 5/23/18 6:06 PM: The TestPolicyGenerator unit test runs [here|https://builds.apache.org/job/PreCommit-YARN-Build/20833/testReport/org.apache.hadoop.yarn.server.globalpolicygenerator.policygenerator/TestPolicyGenerator/]. +1 Feel free to commit to the branch. was (Author: elgoiri): The TestPolicyGenerator unit test runs [here|https://builds.apache.org/job/PreCommit-YARN-Build/20833/testReport/org.apache.hadoop.yarn.server.globalpolicygenerator.policygenerator/TestPolicyGenerator/]. +1 Committing. > Fix potential connection leak in GPGUtils > - > > Key: YARN-8334 > URL: https://issues.apache.org/jira/browse/YARN-8334 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: YARN-8334-YARN-7402.v1.patch, > YARN-8334-YARN-7402.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak.
[jira] [Commented] (YARN-8334) Fix potential connection leak in GPGUtils
[ https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487766#comment-16487766 ] Íñigo Goiri commented on YARN-8334: --- The TestPolicyGenerator unit test runs [here|https://builds.apache.org/job/PreCommit-YARN-Build/20833/testReport/org.apache.hadoop.yarn.server.globalpolicygenerator.policygenerator/TestPolicyGenerator/]. +1 Committing. > Fix potential connection leak in GPGUtils > - > > Key: YARN-8334 > URL: https://issues.apache.org/jira/browse/YARN-8334 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Minor > Attachments: YARN-8334-YARN-7402.v1.patch, > YARN-8334-YARN-7402.v2.patch > > > Missing ClientResponse.close and Client.destroy can lead to a connection leak.
[jira] [Updated] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-8344: --- Attachment: YARN-8344.v2.patch > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows > > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch > > > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows.
[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487762#comment-16487762 ] Íñigo Goiri commented on YARN-8344: --- Why does this fail on Windows specifically? > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows > > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch > > > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows.
[jira] [Updated] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows
[ https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated YARN-8344: -- Description: Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows. > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows > > > Key: YARN-8344 > URL: https://issues.apache.org/jira/browse/YARN-8344 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola >Priority: Major > Attachments: YARN-8344.v1.patch > > > Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync > on Windows.
[jira] [Assigned] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"
[ https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned YARN-8346: Assignee: Jason Lowe > Upgrading to 3.1 kills running containers with error "Opportunistic container > queue is full" > > > Key: YARN-8346 > URL: https://issues.apache.org/jira/browse/YARN-8346 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.0, 3.0.2 >Reporter: Rohith Sharma K S >Assignee: Jason Lowe >Priority: Blocker > > During a rolling upgrade from 2.8.4 to the 3.1 release, all the > running containers are killed and a second attempt is launched for that > application. The diagnostics message, "Opportunistic container queue is > full", gives the reason the containers were killed. > In the NM log, I see the log below after the container is recovered. > {noformat} > 2018-05-23 17:18:50,655 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler: > Opportunistic container [container_e06_1527075664705_0001_01_01] will > not be queued at the NMsince max queue length [0] has been reached > {noformat} > The following steps were executed for the rolling upgrade: > # Install 2.8.4 cluster and launch a MR job with distributed cache enabled. > # Stop 2.8.4 RM. Start 3.1.0 RM with same configuration. > # Stop 2.8.4 NM batch by batch. Start 3.1.0 NM batch by batch.
[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups
[ https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487749#comment-16487749 ] Haibo Chen commented on YARN-4599: -- +1 on the latest patch. Will check it in later today if no objections > Set OOM control for memory cgroups > -- > > Key: YARN-4599 > URL: https://issues.apache.org/jira/browse/YARN-4599 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.9.0 >Reporter: Karthik Kambatla >Assignee: Miklos Szegedi >Priority: Major > Labels: oct16-medium > Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, > YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, > YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, > YARN-4599.007.patch, YARN-4599.008.patch, YARN-4599.009.patch, > YARN-4599.010.patch, YARN-4599.011.patch, YARN-4599.012.patch, > YARN-4599.013.patch, YARN-4599.014.patch, YARN-4599.015.patch, > YARN-4599.016.patch, YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch > > > YARN-1856 adds memory cgroups enforcing support. We should also explicitly > set OOM control so that containers are not killed as soon as they go over > their usage. Today, one could set the swappiness to control this, but > clusters with swap turned off exist.