[jira] [Commented] (YARN-9581) LogsCli getAMContainerInfoForRMWebService ignores rm2
[ https://issues.apache.org/jira/browse/YARN-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847962#comment-16847962 ] Tan, Wangda commented on YARN-9581: --- Nice catch, thanks [~Prabhu Joseph]. > LogsCli getAMContainerInfoForRMWebService ignores rm2 > - > > Key: YARN-9581 > URL: https://issues.apache.org/jira/browse/YARN-9581 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > > Yarn Logs fails for a running job in case of RM HA with rm2 active. > {code} > hrt_qa@prabhuYarn:~> /usr/hdp/current/hadoop-yarn-client/bin/yarn logs > -applicationId application_1558613472348_0004 -am 1 > 19/05/24 18:04:49 INFO client.AHSProxy: Connecting to Application History > server at prabhuYarn/172.27.23.55:10200 > 19/05/24 18:04:50 INFO client.ConfiguredRMFailoverProxyProvider: Failing over > to rm2 > Unable to get AM container informations for the > application:application_1558613472348_0004 > java.io.IOException: > org.apache.hadoop.security.authentication.client.AuthenticationException: > Error while authenticating with endpoint: > https://prabhuYarn:8090/ws/v1/cluster/apps/application_1558613472348_0004/appattempts > Can not get AMContainers logs for the > application:application_1558613472348_0004 with the appOwner:hrt_qa > {code} > LogsCli getRMWebAppURLWithoutScheme only checks the first one from the RM > list yarn.resourcemanager.ha.rm-ids. > {code} > yarnConfig.set(YarnConfiguration.RM_HA_ID, rmIds.get(0)); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
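A minimal sketch of the idea behind this report, assuming the remedy is to try every ID listed in yarn.resourcemanager.ha.rm-ids rather than only rmIds.get(0). The class RmIdSelector, the method firstReachable, and the isReachable probe are hypothetical names for illustration only; they are not part of LogsCLI or any Hadoop API, and the actual patch may take a different approach.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical helper, not Hadoop code: choose an RM id by probing each one in order. */
final class RmIdSelector {
  private RmIdSelector() {}

  /**
   * Returns the first RM id for which the probe succeeds.
   * The probe is an assumption standing in for "the RM web service answered".
   */
  static Optional<String> firstReachable(List<String> rmIds, Predicate<String> isReachable) {
    for (String rmId : rmIds) {
      if (isReachable.test(rmId)) {
        return Optional.of(rmId);
      }
    }
    // No configured RM answered; the caller decides how to report this.
    return Optional.empty();
  }
}
```

With rm-ids of rm1,rm2 and only rm2 answering, the helper returns rm2, whereas always taking the first entry would pick rm1 and fail as in the log above.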
[jira] [Commented] (YARN-9563) Resource report REST API could return NaN or Inf
[ https://issues.apache.org/jira/browse/YARN-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847914#comment-16847914 ] Jonathan Eagles commented on YARN-9563: --- Both TestLeaderElectorService and TestCapacityOverTimePolicy are flaky tests, but can you address the small checkstyle issues mentioned in the report, [~ahussein]? > Resource report REST API could return NaN or Inf > > > Key: YARN-9563 > URL: https://issues.apache.org/jira/browse/YARN-9563 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Minor > Attachments: YARN-9563.001.patch, YARN-9563.002.patch, > YARN-9563.003.patch > > > The Resource Manager's Cluster Applications and Cluster Application REST APIs > are sometimes returning invalid JSON. This was addressed in YARN-6082. > However, the fix only fixes the calculation in one site and does not > guarantee to avoid the problem. Likewise, generating NaN/Inf can break the web > GUI if the columns cannot render non-numeric values. > The suggested fix is to check for NaN/Inf in the protob. The protob replaces > NaN/Inf by 0.0f. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
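The suggested fix in the description (replace NaN/Inf by 0.0f before the value is reported) can be sketched as follows. This is an illustration under assumptions, not the contents of the attached patches: FloatSanitizer and sanitize are hypothetical names, and where the check actually lives in the protobuf record classes is not shown here.

```java
/** Hypothetical helper, not Hadoop code: map non-finite floats to 0.0f. */
final class FloatSanitizer {
  private FloatSanitizer() {}

  /** Returns 0.0f for NaN and for positive or negative infinity; other values pass through. */
  static float sanitize(float value) {
    if (Float.isNaN(value) || Float.isInfinite(value)) {
      return 0.0f;
    }
    return value;
  }
}
```

Applying the check at the point where values are stored covers every calculation site at once, which is the gap the description attributes to the earlier YARN-6082 fix.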
[jira] [Commented] (YARN-9563) Resource report REST API could return NaN or Inf
[ https://issues.apache.org/jira/browse/YARN-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847885#comment-16847885 ] Hadoop QA commented on YARN-9563: -
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 30s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 28s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 439 unchanged - 0 fixed = 441 total (was 439) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 52s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}130m 46s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestLeaderElectorService |
| | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9563 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969676/YARN-9563.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 9cfda2aea48f 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6d0e79c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/24148/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit |
[jira] [Issue Comment Deleted] (YARN-9563) Resource report REST API could return NaN or Inf
[ https://issues.apache.org/jira/browse/YARN-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Hussein updated YARN-9563: Comment: was deleted (was: I checked that 2.8 has the same implementation and I could not spot differences in between the two versions.)
[jira] [Commented] (YARN-9563) Resource report REST API could return NaN or Inf
[ https://issues.apache.org/jira/browse/YARN-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847882#comment-16847882 ] Ahmed Hussein commented on YARN-9563: - I will create another patch for 2.8
[jira] [Commented] (YARN-9563) Resource report REST API could return NaN or Inf
[ https://issues.apache.org/jira/browse/YARN-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847875#comment-16847875 ] Ahmed Hussein commented on YARN-9563: - I checked that 2.8 has the same implementation and I could not spot differences in between the two versions.
[jira] [Commented] (YARN-9563) Resource report REST API could return NaN or Inf
[ https://issues.apache.org/jira/browse/YARN-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847840#comment-16847840 ] Jonathan Eagles commented on YARN-9563: --- I'm +1 on patch 003. I'll wait for Hadoop QA results as well as give some time for other reviewers before committing this. I'm guessing this patch is targeted for all lines back to 2.8?
[jira] [Commented] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847833#comment-16847833 ] Eric Badger commented on YARN-9560: --- I've opened YARN-9582 to port the yarn sysfs feature to the new runtime > Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime > --- > > Key: YARN-9560 > URL: https://issues.apache.org/jira/browse/YARN-9560 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9560.001.patch, YARN-9560.002.patch, > YARN-9560.003.patch, YARN-9560.004.patch, YARN-9560.005.patch, > YARN-9560.006.patch > > > Since the new OCI/squashFS/runc runtime will be using a lot of the same code > as DockerLinuxContainerRuntime, it would be good to move a bunch of the > DockerLinuxContainerRuntime code up a level to an abstract class that both of > the runtimes can extend. > The new structure will look like: > {noformat} > OCIContainerRuntime (abstract class) > - DockerLinuxContainerRuntime > - FSImageContainerRuntime (name negotiable) > {noformat} > This JIRA should only change the structure of the code, not the actual > semantics -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
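The proposed structure can be sketched in Java as below. Only the three class names come from the description; the methods describe() and runtimeName() are hypothetical placeholders for "code moved up a level" and do not correspond to the real methods of DockerLinuxContainerRuntime or to anything in the attached patches.

```java
/** Illustrative base class; the real shared code is whatever the patch hoists from the Docker runtime. */
abstract class OCIContainerRuntime {
  /** Hypothetical shared logic that both runtimes inherit unchanged. */
  final String describe() {
    return runtimeName() + " runtime (shares OCIContainerRuntime code)";
  }

  /** Hypothetical hook: each concrete runtime supplies only what differs. */
  abstract String runtimeName();
}

/** Existing Docker runtime, now a subclass in the proposed layout. */
final class DockerLinuxContainerRuntime extends OCIContainerRuntime {
  @Override
  String runtimeName() {
    return "docker";
  }
}

/** New runc/squashFS runtime (the description calls the name negotiable). */
final class FSImageContainerRuntime extends OCIContainerRuntime {
  @Override
  String runtimeName() {
    return "fsimage";
  }
}
```

Because the shared method is final in the base class, a restructuring of this shape moves code without changing what either runtime does, which matches the stated goal of changing structure but not semantics.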
[jira] [Created] (YARN-9582) Port YARN-8569 to FSImageContainerRuntime
Eric Badger created YARN-9582: - Summary: Port YARN-8569 to FSImageContainerRuntime Key: YARN-9582 URL: https://issues.apache.org/jira/browse/YARN-9582 Project: Hadoop YARN Issue Type: Sub-task Reporter: Eric Badger After YARN-9562 is merged, we should add in the yarn sysfs to the new runtime. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9581) LogsCli getAMContainerInfoForRMWebService ignores rm2
[ https://issues.apache.org/jira/browse/YARN-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9581: Affects Version/s: 3.2.0
[jira] [Updated] (YARN-9581) LogsCli getAMContainerInfoForRMWebService ignores rm2
[ https://issues.apache.org/jira/browse/YARN-9581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9581: Component/s: client
[jira] [Created] (YARN-9581) LogsCli getAMContainerInfoForRMWebService ignores rm2
Prabhu Joseph created YARN-9581: --- Summary: LogsCli getAMContainerInfoForRMWebService ignores rm2 Key: YARN-9581 URL: https://issues.apache.org/jira/browse/YARN-9581 Project: Hadoop YARN Issue Type: Bug Reporter: Prabhu Joseph Assignee: Prabhu Joseph Yarn Logs fails for a running job in case of RM HA with rm2 active. {code} hrt_qa@prabhuYarn:~> /usr/hdp/current/hadoop-yarn-client/bin/yarn logs -applicationId application_1558613472348_0004 -am 1 19/05/24 18:04:49 INFO client.AHSProxy: Connecting to Application History server at prabhuYarn/172.27.23.55:10200 19/05/24 18:04:50 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2 Unable to get AM container informations for the application:application_1558613472348_0004 java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: Error while authenticating with endpoint: https://prabhuYarn:8090/ws/v1/cluster/apps/application_1558613472348_0004/appattempts Can not get AMContainers logs for the application:application_1558613472348_0004 with the appOwner:hrt_qa {code} LogsCli getRMWebAppURLWithoutScheme only checks the first one from the RM list yarn.resourcemanager.ha.rm-ids. {code} yarnConfig.set(YarnConfiguration.RM_HA_ID, rmIds.get(0)); {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9563) Resource report REST API could return NaN or Inf
[ https://issues.apache.org/jira/browse/YARN-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847805#comment-16847805 ] Ahmed Hussein commented on YARN-9563: - [~jeagles] I uploaded another patch that modifies TestLeafQueue to check against NaN/Infinity. This test case will fail if someone modifies FiCaSchedulerApp's resource calculation without checking for a zero denominator.
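To illustrate the zero-denominator case mentioned in this comment, here is a small sketch of a guarded ratio calculation. UsageRatio and percentOf are hypothetical names; this is not the FiCaSchedulerApp code, and returning 0.0f for an empty total is an assumption chosen to match the 0.0f replacement value named in the issue description.

```java
/** Hypothetical helper, not Hadoop code: percentage with a guarded denominator. */
final class UsageRatio {
  private UsageRatio() {}

  /**
   * Returns used/total as a percentage.
   * Without the guard, 0 * 100.0f / 0 evaluates to NaN and a positive
   * numerator over 0 evaluates to Infinity in float arithmetic.
   */
  static float percentOf(long used, long total) {
    if (total == 0) {
      return 0.0f; // assumed fallback value for an empty total
    }
    return (used * 100.0f) / total;
  }
}
```

A test in the spirit of the one described would assert that the reported value is neither NaN nor infinite when the total is zero.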
[jira] [Commented] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847804#comment-16847804 ] Eric Yang commented on YARN-9560: - [~ebadger] Sounds reasonable to me.
[jira] [Updated] (YARN-9563) Resource report REST API could return NaN or Inf
[ https://issues.apache.org/jira/browse/YARN-9563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmed Hussein updated YARN-9563: Attachment: YARN-9563.003.patch
[jira] [Commented] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847782#comment-16847782 ] Eric Badger commented on YARN-9560: ---
bq. Are there two JSON formats used in OCIContainerRuntime? One for passing information between Java and C, and another passed to runc for execution?
In the patch for YARN-9562 there will be two formats: one for Java to C and one for C to runc.
bq. If there is already two types of JSON messages setup for communication between Java-container-executor and container-executor-runc, then it would be better to have sysfs included for communication between Java and container-executor. Container-executor binary needs to handle how to translate the flag into meaningful mount operations for runc.
Agreed that this is necessary for the yarn sysfs feature to work. However, we can make that change in a followup JIRA. I don't want to conflate this restructuring JIRA with features that will need extra code changes to support, such as changing the {{setYarnSysFS()}} method. The new runtime won't be using {{DockerRunCommand}}, since that is Docker specific. So to make way for the yarn sysfs feature in {{OCIContainerRuntime}}, I'd need to change {{setYarnSysFS}} to something more general. This is something I'd like to avoid so that I can keep the changes as minimal as possible here and then make any non-trivial changes in followup JIRAs. That way we can minimize the patch size and the number of things that we're changing.
[jira] [Commented] (YARN-9452) Fix failing testcases TestDistributedShell and TestTimelineAuthFilterForV2
[ https://issues.apache.org/jira/browse/YARN-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847778#comment-16847778 ] Prabhu Joseph commented on YARN-9452: - [~adam.antal] [~snemeth] Can you review this Jira when you get time? This fixes failing testcases from TestDistributedShell and TestTimelineAuthFilterForV2. The failed testcase TestContainerSchedulerQueuing will be handled by YARN-9427. TestDistributedShell testcases work fine with the patch. > Fix failing testcases TestDistributedShell and TestTimelineAuthFilterForV2 > -- > > Key: YARN-9452 > URL: https://issues.apache.org/jira/browse/YARN-9452 > Project: Hadoop YARN > Issue Type: Bug > Components: ATSv2, distributed-shell, test >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9452-001.patch, YARN-9452-002.patch, > YARN-9452-003.patch > > > *TestDistributedShell#testDSShellWithoutDomainV2CustomizedFlow* > {code} > [ERROR] > testDSShellWithoutDomainV2CustomizedFlow(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) > Time elapsed: 72.14 s <<< FAILURE! 
> java.lang.AssertionError: Entity ID prefix should be same across each publish > of same entity expected:<9223372036854775806> but was:<9223370482298585580> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.verifyEntityForTimelineV2(TestDistributedShell.java:695) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV2(TestDistributedShell.java:588) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:459) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow(TestDistributedShell.java:330) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > 
{code} > *TestTimelineAuthFilterForV2#testPutTimelineEntities* > {code} > [ERROR] > testPutTimelineEntities[3](org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2) > Time elapsed: 1.047 s <<< FAILURE! > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertNotNull(Assert.java:712) > at org.junit.Assert.assertNotNull(Assert.java:722) > at > org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.verifyEntity(TestTimelineAuthFilterForV2.java:282) > at > org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.testPutTimelineEntities(TestTimelineAuthFilterForV2.java:421) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at >
[jira] [Resolved] (YARN-9558) Log Aggregation testcases failing
[ https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang resolved YARN-9558. - Resolution: Fixed Thanks, [~Prabhu Joseph]. Keeping this patch in Hadoop 3.3.0+ and marking the issue as resolved again. > Log Aggregation testcases failing > - > > Key: YARN-9558 > URL: https://issues.apache.org/jira/browse/YARN-9558 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9558-001.patch, YARN-9558-002.patch, > YARN-9558-003.patch > > > Test cases related to Log Aggregation from below classes are failing > hadoop.yarn.server.nodemanager.webapp.TestNMWebServices > hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService > > hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices > hadoop.yarn.client.cli.TestLogsCLI -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9452) Fix failing testcases TestDistributedShell and TestTimelineAuthFilterForV2
[ https://issues.apache.org/jira/browse/YARN-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847770#comment-16847770 ] Hadoop QA commented on YARN-9452: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 38s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 56s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 28s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 22s{color} | {color:green} hadoop-yarn-server-tests in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 34s{color} | {color:green} hadoop-yarn-applications-distributedshell in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 45s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}122m 55s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.scheduler.TestContainerSchedulerQueuing |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce
[jira] [Commented] (YARN-9558) Log Aggregation testcases failing
[ https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847741#comment-16847741 ] Prabhu Joseph commented on YARN-9558: - Yes Sure [~eyang]. Thanks. > Log Aggregation testcases failing > - > > Key: YARN-9558 > URL: https://issues.apache.org/jira/browse/YARN-9558 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9558-001.patch, YARN-9558-002.patch, > YARN-9558-003.patch > > > Test cases related to Log Aggregation from below classes are failing > hadoop.yarn.server.nodemanager.webapp.TestNMWebServices > hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService > > hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices > hadoop.yarn.client.cli.TestLogsCLI -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847723#comment-16847723 ] Hadoop QA commented on YARN-9560: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 22s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 10 unchanged - 2 fixed = 11 total (was 12) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 59s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 22s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 18s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9560 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969658/YARN-9560.006.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 9abb7eaccc63 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 460ba7f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/24147/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24147/testReport/ |
| Max. process+thread count | 447 (vs.
[jira] [Comment Edited] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847712#comment-16847712 ] Eric Yang edited comment on YARN-9560 at 5/24/19 4:44 PM: -- [~ebadger] Are there two JSON formats used in OCIContainerRuntime: one for passing information between Java and C, and another passed to runc for execution? If there is only one format, so that what is passed from Java is consumed by runc, then I agree with you that it is not easy to pass this flag, and a follow-up JIRA makes sense to further develop communication filtering between Java, container-executor and runc. If there are already two types of JSON messages set up for communication between Java-container-executor and container-executor-runc, then it would be better to have sysfs included in the communication between Java and container-executor. The container-executor binary needs to handle translating the flag into meaningful mount operations for runc. was (Author: eyang): [~ebadger] Are there two JSON formats used in OCIContainerRuntime: one for passing information between Java and C, and another passed to runc for execution? If there is only one format, so that what is passed from Java is consumed by runc, then I agree with you that it is not easy to pass this flag, and a follow-up JIRA makes sense to further develop communication filtering between Java, container-executor and runc. If there are already two types of JSON messages set up for communication between Java <-> container-executor and container-executor <-> runc, then it would be better to have sysfs included in the communication between Java and container-executor. The container-executor binary needs to handle translating the flag into meaningful mount operations for runc. 
> Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime > --- > > Key: YARN-9560 > URL: https://issues.apache.org/jira/browse/YARN-9560 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9560.001.patch, YARN-9560.002.patch, > YARN-9560.003.patch, YARN-9560.004.patch, YARN-9560.005.patch, > YARN-9560.006.patch > > > Since the new OCI/squashFS/runc runtime will be using a lot of the same code > as DockerLinuxContainerRuntime, it would be good to move a bunch of the > DockerLinuxContainerRuntime code up a level to an abstract class that both of > the runtimes can extend. > The new structure will look like: > {noformat} > OCIContainerRuntime (abstract class) > - DockerLinuxContainerRuntime > - FSImageContainerRuntime (name negotiable) > {noformat} > This JIRA should only change the structure of the code, not the actual > semantics -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
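The structure proposed in the description above can be sketched in plain Java. This is only an illustration under stated assumptions: the three class names come from the issue text, but the method names (`runtimeName`, `describeLaunch`) and the wrapper class `RuntimeHierarchySketch` are hypothetical and do not reflect the actual Hadoop classes or the contents of any YARN-9560 patch.

```java
/**
 * Standalone sketch of the class layout described in YARN-9560.
 * Only the three runtime class names are taken from the issue text;
 * every method here is a hypothetical placeholder.
 */
public class RuntimeHierarchySketch {

    /** Shared code lives in the abstract parent. */
    abstract static class OCIContainerRuntime {
        /** Hypothetical hook: each concrete runtime names itself. */
        protected abstract String runtimeName();

        /** Hypothetical shared logic that both runtimes inherit unchanged. */
        public String describeLaunch(String containerId) {
            return runtimeName() + ": launching " + containerId;
        }
    }

    /** Existing Docker runtime, now only holding Docker-specific pieces. */
    static class DockerLinuxContainerRuntime extends OCIContainerRuntime {
        @Override
        protected String runtimeName() {
            return "docker";
        }
    }

    /** New runc/squashFS runtime (name negotiable, per the issue). */
    static class FSImageContainerRuntime extends OCIContainerRuntime {
        @Override
        protected String runtimeName() {
            return "fsimage";
        }
    }

    public static void main(String[] args) {
        OCIContainerRuntime[] runtimes = {
            new DockerLinuxContainerRuntime(),
            new FSImageContainerRuntime()
        };
        for (OCIContainerRuntime runtime : runtimes) {
            System.out.println(runtime.describeLaunch("container_01"));
        }
    }
}
```

Because the shared method stays in the parent, moving code up a level in this way changes the structure without changing what either runtime does, which matches the stated scope of the JIRA.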
[jira] [Commented] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847712#comment-16847712 ] Eric Yang commented on YARN-9560: - [~ebadger] Are there two JSON formats used in OCIContainerRuntime: one for passing information between Java and C, and another passed to runc for execution? If there is only one format, so that what is passed from Java is consumed by runc, then I agree with you that it is not easy to pass this flag, and a follow-up JIRA makes sense to further develop communication filtering between Java, container-executor and runc. If there are already two types of JSON messages set up for communication between Java <-> container-executor and container-executor <-> runc, then it would be better to have sysfs included in the communication between Java and container-executor. The container-executor binary needs to handle translating the flag into meaningful mount operations for runc. > Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime > --- > > Key: YARN-9560 > URL: https://issues.apache.org/jira/browse/YARN-9560 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9560.001.patch, YARN-9560.002.patch, > YARN-9560.003.patch, YARN-9560.004.patch, YARN-9560.005.patch, > YARN-9560.006.patch > > > Since the new OCI/squashFS/runc runtime will be using a lot of the same code > as DockerLinuxContainerRuntime, it would be good to move a bunch of the > DockerLinuxContainerRuntime code up a level to an abstract class that both of > the runtimes can extend. 
> The new structure will look like: > {noformat} > OCIContainerRuntime (abstract class) > - DockerLinuxContainerRuntime > - FSImageContainerRuntime (name negotiable) > {noformat} > This JIRA should only change the structure of the code, not the actual > semantics -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9525) IFile format is not working against s3a remote folder
[ https://issues.apache.org/jira/browse/YARN-9525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847698#comment-16847698 ] Adam Antal commented on YARN-9525: -- It looks like we still have some problems with IFile; I got the following error: Cannot seek to a negative offset -4 From LogAggregationIndexedFileController.java it looks like we are not writing out full stack traces, but it probably originates from {{loadIndexedLogsMeta}}, where we do a seek with a negative offset. It must be some byte-magic similar to the first issue; I will look into it more deeply tomorrow. > IFile format is not working against s3a remote folder > - > > Key: YARN-9525 > URL: https://issues.apache.org/jira/browse/YARN-9525 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 3.1.2 >Reporter: Adam Antal >Assignee: Peter Bacsko >Priority: Major > Attachments: IFile-S3A-POC01.patch, YARN-9525-001.patch > > > Using the IndexedFileFormat with {{yarn.nodemanager.remote-app-log-dir}} > configured to an s3a URI throws the following exception during log > aggregation: > {noformat} > Cannot create writer for app application_1556199768861_0001. Skip log upload > this time. 
> java.io.IOException: java.io.FileNotFoundException: No such file or > directory: > s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041 > at > org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:247) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:306) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:464) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:420) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.FileNotFoundException: No such file or directory: > s3a://adamantal-log-test/logs/systest/ifile/application_1556199768861_0001/adamantal-3.gce.cloudera.com_8041 > at > org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:2488) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2382) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2321) > at > org.apache.hadoop.fs.DelegateToFileSystem.getFileStatus(DelegateToFileSystem.java:128) > at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1244) > at org.apache.hadoop.fs.FileContext$15.next(FileContext.java:1240) > at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90) > at org.apache.hadoop.fs.FileContext.getFileStatus(FileContext.java:1246) > at > 
org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController$1.run(LogAggregationIndexedFileController.java:228) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at > org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:195) > ... 7 more > {noformat} > This stack trace points to > {{LogAggregationIndexedFileController$initializeWriter}}, where we do the > following steps (in a non-rolling log aggregation setup): > - create an FSDataOutputStream > - write out a UUID > - flush > - immediately after that, call GetFileStatus to get the length of the log > file (the bytes we just wrote out), and that's where the failure happens: > the file is not there yet due to eventual consistency. > Maybe we can get rid of that, so we can use the IFile format against an s3a target. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
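One way to read the "maybe we can get rid of that" suggestion in the description above is to take the length from the stream that did the writing instead of asking the file system for a file status. The following is a minimal stdlib-only sketch of that idea, not the actual YARN-9525 fix: the class `WrittenLengthSketch` and the method `writeHeader` are hypothetical, and `java.io.DataOutputStream#size()` stands in for the position tracking that Hadoop's `FSDataOutputStream#getPos()` offers. Whether that substitution is appropriate inside `initializeWriter` is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/**
 * Sketch: learn how many bytes were written from the stream itself,
 * avoiding a metadata lookup that an eventually consistent store
 * may not be able to answer yet.
 */
public class WrittenLengthSketch {

    /**
     * Writes the UUID as a header, flushes, and returns the byte count
     * tracked by the stream. Hypothetical helper, not a Hadoop method.
     */
    static long writeHeader(DataOutputStream out, UUID id) {
        try {
            byte[] header = id.toString().getBytes(StandardCharsets.UTF_8);
            out.write(header);
            out.flush();
            // No file-status call: the stream already counted its own bytes.
            return out.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory sink standing in for the remote log file.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            long length = writeHeader(out, UUID.randomUUID());
            System.out.println("bytes written so far: " + length);
        }
    }
}
```

The design point is that the writer never depends on the object being visible to a later read, so the steps listed in the description would no longer need the file to exist immediately after the flush.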
[jira] [Commented] (YARN-9512) [JDK11] TestAuxServices#testCustomizedAuxServiceClassPath ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLC
[ https://issues.apache.org/jira/browse/YARN-9512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847689#comment-16847689 ] Adam Antal commented on YARN-9512: -- I did some investigation, but don't have a proposed solution for this. Some related articles: - It is a known [migration problem|https://blog.codefx.org/java/java-9-migration-guide/#Casting-To-URL-Class-Loader] from Java 8 to 9. The article describes some cases where the migration is easy, but none of them is applicable here. - The [Oracle community|https://community.oracle.com/thread/4011800] also has a thread about this problem; it might be worth pursuing the option suggested there. Most likely it is just a testing issue, but I am still unsure about this. I'm also CCing [~billie.rinaldi], who has done some work in this area. If you have any thoughts on this, we'd be happy to hear them. Other components have also run into this; a similar issue is TEZ-3860. > [JDK11] TestAuxServices#testCustomizedAuxServiceClassPath ClassCastException: > class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class > java.net.URLClassLoader > --- > > Key: YARN-9512 > URL: https://issues.apache.org/jira/browse/YARN-9512 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Siyao Meng >Assignee: Adam Antal >Priority: Major > > Found in maven JDK 11 unit test run. 
Compiled on JDK 8: > {code} > [ERROR] > testCustomizedAuxServiceClassPath(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices) > Time elapsed: 0.019 s <<< ERROR!java.lang.ClassCastException: class > jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class > java.net.URLClassLoader (jdk.internal.loader.ClassLoaders$AppClassLoader and > java.net.URLClassLoader are in module java.base of loader 'bootstrap') > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices$ServiceC.getMetaData(TestAuxServices.java:197) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStart(AuxServices.java:315) > at > org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices.testCustomizedAuxServiceClassPath(TestAuxServices.java:344) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
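The exception above comes from casting the application class loader to `java.net.URLClassLoader`, which stopped working in Java 9 because the built-in loaders are no longer `URLClassLoader` instances. A commonly used workaround, sketched below, is to check the loader type first and otherwise fall back to the `java.class.path` system property. This is a generic illustration, not the fix chosen for YARN-9512: the class `ClassPathProbe` and the method `classPathUrls` are hypothetical names, and whether this approach fits `TestAuxServices` is an assumption.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch: obtain class path entries without assuming that the system
 * class loader is a URLClassLoader (true on Java 8, false on Java 9+).
 */
public class ClassPathProbe {

    /** Hypothetical helper returning the class path as a list of URLs. */
    static List<URL> classPathUrls() {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        if (loader instanceof URLClassLoader) {
            // Java 8 and earlier: the cast is safe after the instanceof check.
            return new ArrayList<>(Arrays.asList(((URLClassLoader) loader).getURLs()));
        }
        // Java 9+: parse the java.class.path system property instead.
        List<URL> urls = new ArrayList<>();
        String classPath = System.getProperty("java.class.path", "");
        for (String entry : classPath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            try {
                urls.add(new File(entry).toURI().toURL());
            } catch (MalformedURLException e) {
                throw new IllegalStateException("Bad class path entry: " + entry, e);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        for (URL url : classPathUrls()) {
            System.out.println(url);
        }
    }
}
```

The `instanceof` guard keeps the Java 8 behavior intact while avoiding the unconditional cast that fails on JDK 11.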
[jira] [Commented] (YARN-9558) Log Aggregation testcases failing
[ https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847685#comment-16847685 ] Eric Yang commented on YARN-9558: - [~Prabhu Joseph] I think it is probably better to keep these changes in 3.3.0 only, because they introduce a new configuration flag, NM_REMOTE_APP_LOG_DIR_INCLUDE_OLDER, and new behavior for locating log files. It is better not to introduce new behavior or flags in a patch-version backport, because the upstream configuration management utility does not know about the new flag and log structure. This reduces the probability of introducing incompatible changes that we might not otherwise notice. If you agree, I will reset the target version to 3.3.0 only. > Log Aggregation testcases failing > - > > Key: YARN-9558 > URL: https://issues.apache.org/jira/browse/YARN-9558 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9558-001.patch, YARN-9558-002.patch, > YARN-9558-003.patch > > > Test cases related to Log Aggregation from below classes are failing > hadoop.yarn.server.nodemanager.webapp.TestNMWebServices > hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService > > hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices > hadoop.yarn.client.cli.TestLogsCLI -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8625) Aggregate Resource Allocation for each job is not present in ATS
[ https://issues.apache.org/jira/browse/YARN-8625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847663#comment-16847663 ] Prabhu Joseph commented on YARN-8625: - [~eepayne] The branch-2.7 checkstyle issues can be ignored. The branch-2.8 asflicense issue looks unrelated. YARN-9558 is fixed in trunk, and I have validated TestAHSWebServices locally. > Aggregate Resource Allocation for each job is not present in ATS > > > Key: YARN-8625 > URL: https://issues.apache.org/jira/browse/YARN-8625 > Project: Hadoop YARN > Issue Type: Bug > Components: ATSv2 >Affects Versions: 2.7.4 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: 0001-YARN-8625.patch, 0002-YARN-8625.patch, > ApplicationHistoryServer_Rest_Api.png, ApplicationHistoryServer_UI.png, > YARN-8625-branch-2.7.001.patch, YARN-8625-branch-2.8.001.patch, yarn-site.xml > > > Aggregate Resource Allocation shown on RM UI for finished job is very useful > metric to understand how much resource a job has consumed. But this does not > get stored in ATS. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847658#comment-16847658 ] Eric Badger commented on YARN-9560: --- Patch 006 adds the Private and Unstable flags to {{OCIContainerRuntime}} bq. It is basically a forward and pass operation to make sure that downstream C side of code receives this flag, and perform the necessary operation to setup the sysfs directory in the container working directory. Sysfs directory will be populated through async rest api call with a json file that contains the application structure, i.e. ip address and host names of the containers. In this case, by passing the flag as part of json to container-executor is sufficient. I still think we should add this in a follow-up JIRA. The current code is docker specific since it is a method of {{DockerRunCommand}}. If you agree I can file a followup JIRA bq. I will do tests, and probably try with and without ENTRYPOINT to make sure it's well covered. Thanks! I appreciate it > Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime > --- > > Key: YARN-9560 > URL: https://issues.apache.org/jira/browse/YARN-9560 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9560.001.patch, YARN-9560.002.patch, > YARN-9560.003.patch, YARN-9560.004.patch, YARN-9560.005.patch, > YARN-9560.006.patch > > > Since the new OCI/squashFS/runc runtime will be using a lot of the same code > as DockerLinuxContainerRuntime, it would be good to move a bunch of the > DockerLinuxContainerRuntime code up a level to an abstract class that both of > the runtimes can extend. 
> The new structure will look like: > {noformat} > OCIContainerRuntime (abstract class) > - DockerLinuxContainerRuntime > - FSImageContainerRuntime (name negotiable) > {noformat} > This JIRA should only change the structure of the code, not the actual > semantics -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9560) Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-9560: -- Attachment: YARN-9560.006.patch > Restructure DockerLinuxContainerRuntime to extend a new OCIContainerRuntime > --- > > Key: YARN-9560 > URL: https://issues.apache.org/jira/browse/YARN-9560 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: YARN-9560.001.patch, YARN-9560.002.patch, > YARN-9560.003.patch, YARN-9560.004.patch, YARN-9560.005.patch, > YARN-9560.006.patch > > > Since the new OCI/squashFS/runc runtime will be using a lot of the same code > as DockerLinuxContainerRuntime, it would be good to move a bunch of the > DockerLinuxContainerRuntime code up a level to an abstract class that both of > the runtimes can extend. > The new structure will look like: > {noformat} > OCIContainerRuntime (abstract class) > - DockerLinuxContainerRuntime > - FSImageContainerRuntime (name negotiable) > {noformat} > This JIRA should only change the structure of the code, not the actual > semantics -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9452) Fix failing testcases TestDistributedShell and TestTimelineAuthFilterForV2
[ https://issues.apache.org/jira/browse/YARN-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9452: Summary: Fix failing testcases TestDistributedShell and TestTimelineAuthFilterForV2 (was: Timeline related testcases are failing) > Fix failing testcases TestDistributedShell and TestTimelineAuthFilterForV2 > -- > > Key: YARN-9452 > URL: https://issues.apache.org/jira/browse/YARN-9452 > Project: Hadoop YARN > Issue Type: Bug > Components: ATSv2, test >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9452-001.patch, YARN-9452-002.patch, > YARN-9452-003.patch > > > *TestDistributedShell#testDSShellWithoutDomainV2CustomizedFlow* > {code} > [ERROR] > testDSShellWithoutDomainV2CustomizedFlow(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) > Time elapsed: 72.14 s <<< FAILURE! > java.lang.AssertionError: Entity ID prefix should be same across each publish > of same entity expected:<9223372036854775806> but was:<9223370482298585580> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.verifyEntityForTimelineV2(TestDistributedShell.java:695) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV2(TestDistributedShell.java:588) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:459) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow(TestDistributedShell.java:330) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} > *TestTimelineAuthFilterForV2#testPutTimelineEntities* > {code} > [ERROR] > testPutTimelineEntities[3](org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2) > Time elapsed: 1.047 s <<< FAILURE! 
> java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertNotNull(Assert.java:712) > at org.junit.Assert.assertNotNull(Assert.java:722) > at > org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.verifyEntity(TestTimelineAuthFilterForV2.java:282) > at > org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.testPutTimelineEntities(TestTimelineAuthFilterForV2.java:421) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at
[jira] [Updated] (YARN-9452) Timeline related testcases are failing
[ https://issues.apache.org/jira/browse/YARN-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9452: Description: *TestDistributedShell#testDSShellWithoutDomainV2CustomizedFlow* {code} [ERROR] testDSShellWithoutDomainV2CustomizedFlow(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) Time elapsed: 72.14 s <<< FAILURE! java.lang.AssertionError: Entity ID prefix should be same across each publish of same entity expected:<9223372036854775806> but was:<9223370482298585580> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.verifyEntityForTimelineV2(TestDistributedShell.java:695) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV2(TestDistributedShell.java:588) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:459) at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow(TestDistributedShell.java:330) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) {code} *TestTimelineAuthFilterForV2#testPutTimelineEntities* {code} [ERROR] testPutTimelineEntities[3](org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2) Time elapsed: 1.047 s <<< FAILURE! java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertNotNull(Assert.java:712) at org.junit.Assert.assertNotNull(Assert.java:722) at org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.verifyEntity(TestTimelineAuthFilterForV2.java:282) at org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.testPutTimelineEntities(TestTimelineAuthFilterForV2.java:421) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runners.Suite.runChild(Suite.java:128) at
[jira] [Updated] (YARN-9452) Fix failing testcases TestDistributedShell and TestTimelineAuthFilterForV2
[ https://issues.apache.org/jira/browse/YARN-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9452: Component/s: distributed-shell > Fix failing testcases TestDistributedShell and TestTimelineAuthFilterForV2 > -- > > Key: YARN-9452 > URL: https://issues.apache.org/jira/browse/YARN-9452 > Project: Hadoop YARN > Issue Type: Bug > Components: ATSv2, distributed-shell, test >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9452-001.patch, YARN-9452-002.patch, > YARN-9452-003.patch > > > *TestDistributedShell#testDSShellWithoutDomainV2CustomizedFlow* > {code} > [ERROR] > testDSShellWithoutDomainV2CustomizedFlow(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) > Time elapsed: 72.14 s <<< FAILURE! > java.lang.AssertionError: Entity ID prefix should be same across each publish > of same entity expected:<9223372036854775806> but was:<9223370482298585580> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.verifyEntityForTimelineV2(TestDistributedShell.java:695) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV2(TestDistributedShell.java:588) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:459) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow(TestDistributedShell.java:330) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} > *TestTimelineAuthFilterForV2#testPutTimelineEntities* > {code} > [ERROR] > testPutTimelineEntities[3](org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2) > Time elapsed: 1.047 s <<< FAILURE! 
> java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertNotNull(Assert.java:712) > at org.junit.Assert.assertNotNull(Assert.java:722) > at > org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.verifyEntity(TestTimelineAuthFilterForV2.java:282) > at > org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.testPutTimelineEntities(TestTimelineAuthFilterForV2.java:421) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at >
[jira] [Commented] (YARN-9573) DistributedShell cannot specify LogAggregationContext
[ https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847644#comment-16847644 ] Adam Antal commented on YARN-9573: -- Thanks for the reviews. Just to make sure, let's wait until YARN-9425 is committed; I don't want to accidentally mess something up in the background. > DistributedShell cannot specify LogAggregationContext > - > > Key: YARN-9573 > URL: https://issues.apache.org/jira/browse/YARN-9573 > Project: Hadoop YARN > Issue Type: Improvement > Components: distributed-shell, log-aggregation, yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9573.001.patch > > > When DShell sends the application request object to the RM, it doesn't > specify the LogAggregationContext object, so it is not possible to run > DShell with various log-aggregation configurations, e.g. rolling > log aggregation.
[jira] [Updated] (YARN-9452) Timeline related testcases are failing
[ https://issues.apache.org/jira/browse/YARN-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9452: Attachment: YARN-9452-003.patch > Timeline related testcases are failing > -- > > Key: YARN-9452 > URL: https://issues.apache.org/jira/browse/YARN-9452 > Project: Hadoop YARN > Issue Type: Bug > Components: ATSv2, test >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9452-001.patch, YARN-9452-002.patch, > YARN-9452-003.patch > > > Timeline related testcases are failing. > TestDistributedShell#testDSShellWithoutDomainV2CustomizedFlow > {code} > [ERROR] > testDSShellWithoutDomainV2CustomizedFlow(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell) > Time elapsed: 72.14 s <<< FAILURE! > java.lang.AssertionError: Entity ID prefix should be same across each publish > of same entity expected:<9223372036854775806> but was:<9223370482298585580> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:834) > at org.junit.Assert.assertEquals(Assert.java:645) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.verifyEntityForTimelineV2(TestDistributedShell.java:695) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV2(TestDistributedShell.java:588) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:459) > at > org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithoutDomainV2CustomizedFlow(TestDistributedShell.java:330) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > {code} > TestTimelineAuthFilterForV2#testPutTimelineEntities > {code} > [ERROR] > testPutTimelineEntities[3](org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2) > Time elapsed: 1.047 s <<< FAILURE! 
> java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertNotNull(Assert.java:712) > at org.junit.Assert.assertNotNull(Assert.java:722) > at > org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.verifyEntity(TestTimelineAuthFilterForV2.java:282) > at > org.apache.hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2.testPutTimelineEntities(TestTimelineAuthFilterForV2.java:421) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at >
[jira] [Resolved] (YARN-9145) [Umbrella] Dynamically add or remove auxiliary services
[ https://issues.apache.org/jira/browse/YARN-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Billie Rinaldi resolved YARN-9145. -- Resolution: Fixed Fix Version/s: 3.3.0 Yes, thanks for reminding me I forgot to close the umbrella! > [Umbrella] Dynamically add or remove auxiliary services > --- > > Key: YARN-9145 > URL: https://issues.apache.org/jira/browse/YARN-9145 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Billie Rinaldi >Assignee: Billie Rinaldi >Priority: Major > Fix For: 3.3.0 > > > Umbrella to track tasks supporting adding, removing, or updating auxiliary > services without NM restart. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9145) [Umbrella] Dynamically add or remove auxiliary services
[ https://issues.apache.org/jira/browse/YARN-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847576#comment-16847576 ] Adam Antal commented on YARN-9145: -- Hi [~billie.rinaldi], is this considered feature-complete? > [Umbrella] Dynamically add or remove auxiliary services > --- > > Key: YARN-9145 > URL: https://issues.apache.org/jira/browse/YARN-9145 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Billie Rinaldi >Assignee: Billie Rinaldi >Priority: Major > > Umbrella to track tasks supporting adding, removing, or updating auxiliary > services without NM restart.
[jira] [Assigned] (YARN-9511) [JDK11] TestAuxServices#testRemoteAuxServiceClassPath YarnRuntimeException: The remote jarfile should not be writable by group or others. The current Permission is 436
[ https://issues.apache.org/jira/browse/YARN-9511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Antal reassigned YARN-9511: Assignee: Adam Antal > [JDK11] TestAuxServices#testRemoteAuxServiceClassPath YarnRuntimeException: > The remote jarfile should not be writable by group or others. The current > Permission is 436 > --- > > Key: YARN-9511 > URL: https://issues.apache.org/jira/browse/YARN-9511 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Siyao Meng >Assignee: Adam Antal >Priority: Major > > Found in maven JDK 11 unit test run. Compiled on JDK 8. > {code} > [ERROR] > testRemoteAuxServiceClassPath(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices) > Time elapsed: 0.551 s <<< > ERROR!org.apache.hadoop.yarn.exceptions.YarnRuntimeException: The remote > jarfile should not be writable by group or others. The current Permission is > 436 > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:202) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices.testRemoteAuxServiceClassPath(TestAuxServices.java:268) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
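A note on the number in the exception message: if the reported 436 is the permission bits printed in decimal (an assumption; the message itself does not say which base is used), it corresponds to octal 664, i.e. rw-rw-r--, which is group-writable and would explain the rejection. The sketch below only demonstrates that arithmetic. The class and method names are hypothetical and this is not the check performed in `AuxServices`.

```java
// Illustrative only: decodes a decimal permission value and tests the
// group-write and other-write bits. Names here are hypothetical.
final class JarPermissionCheck {
    // Octal literal: group-write (0020) plus other-write (0002).
    static final int GROUP_OR_OTHER_WRITE = 0022;

    static boolean writableByGroupOrOthers(int mode) {
        return (mode & GROUP_OR_OTHER_WRITE) != 0;
    }

    static String toOctal(int mode) {
        return Integer.toOctalString(mode);
    }

    public static void main(String[] args) {
        int reported = 436; // value quoted in the exception, read as decimal
        System.out.println(toOctal(reported));                 // 664
        System.out.println(writableByGroupOrOthers(reported)); // true
        System.out.println(writableByGroupOrOthers(0644));     // false
    }
}
```

Under this reading, a jar created with mode 644 (or a umask of 022) would pass such a check, while 664 would not.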
[jira] [Commented] (YARN-9558) Log Aggregation testcases failing
[ https://issues.apache.org/jira/browse/YARN-9558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847529#comment-16847529 ] Prabhu Joseph commented on YARN-9558: - Thanks [~eyang]. YARN-9558 also requires YARN-6929 and YARN-9524. Can we include all three in branch-3.2 and branch-3.1 as well? For branch-3.2, it works by applying YARN-6929-011.patch, YARN-9524-002.patch, and then YARN-9558-003.patch. For branch-3.1, it works by applying YARN-6929-branch-3.1.001.patch, YARN-9524-002.patch, and then YARN-9558-003.patch. I have submitted YARN-6929-branch-3.1.001.patch in the YARN-6929 JIRA. > Log Aggregation testcases failing > - > > Key: YARN-9558 > URL: https://issues.apache.org/jira/browse/YARN-9558 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, test >Affects Versions: 3.3.0, 3.2.1, 3.1.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9558-001.patch, YARN-9558-002.patch, > YARN-9558-003.patch > > > Test cases related to Log Aggregation from below classes are failing > hadoop.yarn.server.nodemanager.webapp.TestNMWebServices > hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService > > hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices > hadoop.yarn.client.cli.TestLogsCLI
[jira] [Assigned] (YARN-9512) [JDK11] TestAuxServices#testCustomizedAuxServiceClassPath ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLCl
[ https://issues.apache.org/jira/browse/YARN-9512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Antal reassigned YARN-9512:

    Assignee: Adam Antal

> [JDK11] TestAuxServices#testCustomizedAuxServiceClassPath ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLClassLoader
> ---
>
> Key: YARN-9512
> URL: https://issues.apache.org/jira/browse/YARN-9512
> Project: Hadoop YARN
> Issue Type: Bug
> Components: test
> Reporter: Siyao Meng
> Assignee: Adam Antal
> Priority: Major
>
> Found in maven JDK 11 unit test run. Compiled on JDK 8:
> {code}
> [ERROR] testCustomizedAuxServiceClassPath(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices) Time elapsed: 0.019 s <<< ERROR!
> java.lang.ClassCastException: class jdk.internal.loader.ClassLoaders$AppClassLoader cannot be cast to class java.net.URLClassLoader (jdk.internal.loader.ClassLoaders$AppClassLoader and java.net.URLClassLoader are in module java.base of loader 'bootstrap')
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices$ServiceC.getMetaData(TestAuxServices.java:197)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStart(AuxServices.java:315)
>   at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices.testCustomizedAuxServiceClassPath(TestAuxServices.java:344)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
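For context on the YARN-9512 trace: since JDK 9 the application class loader is no longer a java.net.URLClassLoader, so an unconditional cast of the system class loader fails as shown. The sketch below is not the YARN-9512 patch; it only illustrates one common way to read classpath entries without the cast, via the java.class.path system property. The class name ClasspathEntries and its methods are hypothetical names chosen for illustration.

```java
import java.io.File;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reads classpath entries without
// casting the system class loader to URLClassLoader.
final class ClasspathEntries {

    /** Splits a classpath string into its non-empty entries. */
    static List<String> split(String classPath) {
        List<String> entries = new ArrayList<>();
        for (String entry : classPath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    /** Reads the entries from the java.class.path system property. */
    static List<String> current() {
        return split(System.getProperty("java.class.path", ""));
    }

    /** True only when the loader really is a URLClassLoader, so a cast would be safe. */
    static boolean isUrlClassLoader(ClassLoader loader) {
        return loader instanceof URLClassLoader;
    }

    public static void main(String[] args) {
        ClassLoader system = ClassLoader.getSystemClassLoader();
        System.out.println("System loader is a URLClassLoader: " + isUrlClassLoader(system));
        System.out.println("Classpath entries: " + current());
    }
}
```

Guarding with instanceof keeps code working on both JDK 8 and JDK 11; whether the test in question should use the system property or another mechanism is a decision for the patch author.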
[jira] [Updated] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable
[ https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-6929: Attachment: (was: YARN-6929-branch-3.2.001.patch) > yarn.nodemanager.remote-app-log-dir structure is not scalable > - > > Key: YARN-6929 > URL: https://issues.apache.org/jira/browse/YARN-6929 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-6929-007.patch, YARN-6929-008.patch, > YARN-6929-009.patch, YARN-6929-010.patch, YARN-6929-011.patch, > YARN-6929-branch-3.1.001.patch, YARN-6929.1.patch, YARN-6929.2.patch, > YARN-6929.2.patch, YARN-6929.3.patch, YARN-6929.4.patch, YARN-6929.5.patch, > YARN-6929.6.patch, YARN-6929.patch > > > The current directory structure for yarn.nodemanager.remote-app-log-dir is > not scalable. The default maximum number of items per directory is 1048576 (HDFS-6102). > With a yarn.log-aggregation.retain-seconds retention of 7 days, it becomes more > likely that LogAggregationService fails to create a new directory with > FSLimitException$MaxDirectoryItemsExceededException. > The current structure is > //logs/.
This can be > improved with adding date as a subdirectory like > //logs// > {code:java} > WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService: > Application failed to init aggregation > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 > items=1048576 > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021) > > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072) > > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194) > > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813) > > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600) > > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) > > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67) > > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:745) > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 > items=1048576 > at >
[jira] [Updated] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable
[ https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-6929: Attachment: YARN-6929-branch-3.2.001.patch > yarn.nodemanager.remote-app-log-dir structure is not scalable > - > > Key: YARN-6929 > URL: https://issues.apache.org/jira/browse/YARN-6929 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-6929-007.patch, YARN-6929-008.patch, > YARN-6929-009.patch, YARN-6929-010.patch, YARN-6929-011.patch, > YARN-6929-branch-3.1.001.patch, YARN-6929-branch-3.2.001.patch, > YARN-6929.1.patch, YARN-6929.2.patch, YARN-6929.2.patch, YARN-6929.3.patch, > YARN-6929.4.patch, YARN-6929.5.patch, YARN-6929.6.patch, YARN-6929.patch > > > The current directory structure for yarn.nodemanager.remote-app-log-dir is > not scalable. The default maximum number of items per directory is 1048576 (HDFS-6102). > With a yarn.log-aggregation.retain-seconds retention of 7 days, it becomes more > likely that LogAggregationService fails to create a new directory with > FSLimitException$MaxDirectoryItemsExceededException. > The current structure is > //logs/.
This can be > improved with adding date as a subdirectory like > //logs// > {code:java} > WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService: > Application failed to init aggregation > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 > items=1048576 > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021) > > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072) > > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194) > > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813) > > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600) > > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) > > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67) > > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:745) > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 > items=1048576 > at >
[jira] [Updated] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable
[ https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-6929: Attachment: YARN-6929-branch-3.1.001.patch > yarn.nodemanager.remote-app-log-dir structure is not scalable > - > > Key: YARN-6929 > URL: https://issues.apache.org/jira/browse/YARN-6929 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation >Affects Versions: 2.7.3 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-6929-007.patch, YARN-6929-008.patch, > YARN-6929-009.patch, YARN-6929-010.patch, YARN-6929-011.patch, > YARN-6929-branch-3.1.001.patch, YARN-6929.1.patch, YARN-6929.2.patch, > YARN-6929.2.patch, YARN-6929.3.patch, YARN-6929.4.patch, YARN-6929.5.patch, > YARN-6929.6.patch, YARN-6929.patch > > > The current directory structure for yarn.nodemanager.remote-app-log-dir is > not scalable. The default maximum number of items per directory is 1048576 (HDFS-6102). > With a yarn.log-aggregation.retain-seconds retention of 7 days, it becomes more > likely that LogAggregationService fails to create a new directory with > FSLimitException$MaxDirectoryItemsExceededException. > The current structure is > //logs/.
This can be > improved with adding date as a subdirectory like > //logs// > {code:java} > WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService: > Application failed to init aggregation > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 > items=1048576 > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021) > > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072) > > at > org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221) > > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194) > > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813) > > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600) > > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) > > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443) > > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67) > > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173) > > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) > at java.lang.Thread.run(Thread.java:745) > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 > items=1048576 > at >
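The date-subdirectory idea in the YARN-6929 description can be sketched as follows. This is an illustration only, not the attached patch: the path placeholders in the description are elided in this archive, so the layout root/user/logs/yyyyMMdd/appId, the class name DatedLogDir, its method names, and the use of UTC are all assumptions. The only values taken from the thread are the 1048576 item limit, the 7-day retention, and the /app-logs/yarn/logs directory named in the exception.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of a date-bucketed aggregated-log layout.
final class DatedLogDir {

    // Default directory item limit quoted in the issue (HDFS-6102).
    static final long MAX_DIR_ITEMS = 1_048_576L;

    /**
     * Builds root/user/logs/yyyyMMdd/appId (assumed layout), so each day's
     * applications land in their own subdirectory instead of one flat one.
     */
    static String appLogDir(String root, String user, String appId, long submitMillis) {
        LocalDate day = Instant.ofEpochMilli(submitMillis)
                .atZone(ZoneOffset.UTC)   // assumption: bucket by UTC date
                .toLocalDate();
        return root + "/" + user + "/logs/"
                + DateTimeFormatter.BASIC_ISO_DATE.format(day) + "/" + appId;
    }

    /**
     * Average applications per day at which a flat directory reaches the
     * item limit within the retention window (integer division).
     */
    static long appsPerDayToFillFlatDir(long retentionDays) {
        return MAX_DIR_ITEMS / retentionDays;
    }

    public static void main(String[] args) {
        // "application_example_0001" is a made-up application id.
        System.out.println(appLogDir("/app-logs", "yarn", "application_example_0001",
                System.currentTimeMillis()));
        System.out.println("Apps/day that fill a flat dir in 7 days: "
                + appsPerDayToFillFlatDir(7));
    }
}
```

With such a layout the per-user logs directory would hold roughly one entry per retained day, and the item limit would apply to each day's subdirectory separately.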
[jira] [Updated] (YARN-9573) DistributedShell cannot specify LogAggregationContext
[ https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9573: - Summary: DistributedShell cannot specify LogAggregationContext (was: DistributedShell can't specify LogAggregationContext) > DistributedShell cannot specify LogAggregationContext > - > > Key: YARN-9573 > URL: https://issues.apache.org/jira/browse/YARN-9573 > Project: Hadoop YARN > Issue Type: Improvement > Components: distributed-shell, log-aggregation, yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9573.001.patch > > > When DShell sends the application request object to the RM, it doesn't > specify the LogAggregationContext object, so it is not possible to run > DShell with various log-aggregation configurations, e.g. rolling log > aggregation.
[jira] [Commented] (YARN-9573) DistributedShell can't specify LogAggregationContext
[ https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847472#comment-16847472 ] Szilard Nemeth commented on YARN-9573: -- Thanks [~Prabhu Joseph]! Then I'm giving +1 (non-binding) > DistributedShell can't specify LogAggregationContext > > > Key: YARN-9573 > URL: https://issues.apache.org/jira/browse/YARN-9573 > Project: Hadoop YARN > Issue Type: Improvement > Components: distributed-shell, log-aggregation, yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9573.001.patch > > > When DShell sends the application request object to the RM, it doesn't > specify the LogAggregationContext object, so it is not possible to run > DShell with various log-aggregation configurations, e.g. rolling log > aggregation.
[jira] [Commented] (YARN-9580) Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers
[ https://issues.apache.org/jira/browse/YARN-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847468#comment-16847468 ]

Hadoop QA commented on YARN-9580:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 79m 9s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m 46s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestLeaderElectorService |
| | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9580 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969609/YARN-9580.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 4005624868e2 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 460ba7f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/24145/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24145/testReport/ |
| Max. process+thread count | 919 (vs. ulimit of 1) |
| modules | C:
[jira] [Commented] (YARN-9573) DistributedShell can't specify LogAggregationContext
[ https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847467#comment-16847467 ] Prabhu Joseph commented on YARN-9573: - [~snemeth] The test case failures are not related and will be fixed by YARN-9452. > DistributedShell can't specify LogAggregationContext > > > Key: YARN-9573 > URL: https://issues.apache.org/jira/browse/YARN-9573 > Project: Hadoop YARN > Issue Type: Improvement > Components: distributed-shell, log-aggregation, yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9573.001.patch > > > When DShell sends the application request object to the RM, it doesn't > specify the LogAggregationContext object, so it is not possible to run > DShell with various log-aggregation configurations, e.g. rolling log > aggregation.
[jira] [Commented] (YARN-9573) DistributedShell can't specify LogAggregationContext
[ https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847449#comment-16847449 ] Szilard Nemeth commented on YARN-9573: -- Hi [~adam.antal]! Thanks for this patch! I know the checkstyle issue may not be strongly related, but if you can fix it easily, please do so. Is the unit test failure related to your patch? Otherwise, the patch looks good! > DistributedShell can't specify LogAggregationContext > > > Key: YARN-9573 > URL: https://issues.apache.org/jira/browse/YARN-9573 > Project: Hadoop YARN > Issue Type: Improvement > Components: distributed-shell, log-aggregation, yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9573.001.patch > > > When DShell sends the application request object to the RM, it doesn't specify the LogAggregationContext object, so it is not possible to run DShell with different log-aggregation configurations, e.g. rolling log aggregation.
[jira] [Commented] (YARN-9573) DistributedShell can't specify LogAggregationContext
[ https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847422#comment-16847422 ] Hadoop QA commented on YARN-9573: -
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 3s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 15s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell: The patch generated 1 new + 89 unchanged - 1 fixed = 90 total (was 90) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 23s{color} | {color:red} hadoop-yarn-applications-distributedshell in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 87m 9s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.applications.distributedshell.TestDistributedShell |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9573 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12969597/YARN-9573.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 3140f21bc6dc 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 460ba7f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle |
[jira] [Updated] (YARN-9580) Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers
[ https://issues.apache.org/jira/browse/YARN-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-9580: --- Attachment: YARN-9580.001.patch > Fulfilled reservation information in assignment is lost when transferring in > ParentQueue#assignContainers > - > > Key: YARN-9580 > URL: https://issues.apache.org/jira/browse/YARN-9580 > Project: Hadoop YARN > Issue Type: Bug > Components: capacityscheduler >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-9580.001.patch > > > When an assignment is transferred from a child queue to its parent queue, the fulfilled reservation information in the assignment (fulfilledReservation and fulfilledReservedContainer) is lost. > When multi-nodes is enabled, this loss causes a problem: an allocation proposal is generated but cannot be accepted, because FiCaSchedulerApp#commonCheckContainerAllocation checks for the fulfilled reservation information. The scheduler then loops endlessly and the node's resources can no longer be used. > In HB-driven scheduling mode, a fulfilled reservation can be allocated via a different call stack: CapacityScheduler#allocateContainersToNode --> CapacityScheduler#allocateContainerOnSingleNode --> CapacityScheduler#allocateFromReservedContainer. In that path the assignment is generated by the leaf queue and submitted directly, which I think is why we have rarely seen this problem before.
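The failure mode described in this issue can be illustrated with a small, self-contained Java sketch. It is not the CapacityScheduler code: SketchAssignment, transferLossy, transferPreserving and acceptProposal are hypothetical names, and the acceptance check is only a simplified stand-in for the check the description attributes to FiCaSchedulerApp#commonCheckContainerAllocation. Only the two field names, fulfilledReservation and fulfilledReservedContainer, are taken from the issue text.

```java
/** Simplified stand-in for a scheduler assignment (hypothetical, not the YARN class). */
final class SketchAssignment {
    boolean fulfilledReservation;        // field name taken from the issue text
    String fulfilledReservedContainer;   // container id modelled as a plain string (assumption)
    int assignedMemoryMb;                // illustrative payload that always gets copied
}

final class ReservationTransferSketch {

    /** Models the reported bug: the parent copies the resource but drops reservation info. */
    static SketchAssignment transferLossy(SketchAssignment child) {
        SketchAssignment parent = new SketchAssignment();
        parent.assignedMemoryMb = child.assignedMemoryMb;
        return parent;
    }

    /** Models one possible remedy: the parent also carries over the reservation fields. */
    static SketchAssignment transferPreserving(SketchAssignment child) {
        SketchAssignment parent = transferLossy(child);
        parent.fulfilledReservation = child.fulfilledReservation;
        parent.fulfilledReservedContainer = child.fulfilledReservedContainer;
        return parent;
    }

    /**
     * Hypothetical acceptance check: if the node already holds a reserved container,
     * the proposal is accepted only when the assignment says it fulfils that reservation.
     */
    static boolean acceptProposal(SketchAssignment a, String reservedContainerOnNode) {
        if (reservedContainerOnNode == null) {
            return true; // nothing reserved on the node, nothing to verify
        }
        return a.fulfilledReservation
                && reservedContainerOnNode.equals(a.fulfilledReservedContainer);
    }

    public static void main(String[] args) {
        SketchAssignment child = new SketchAssignment();
        child.fulfilledReservation = true;
        child.fulfilledReservedContainer = "container_1";
        child.assignedMemoryMb = 1024;

        System.out.println("lossy accepted: "
                + acceptProposal(transferLossy(child), "container_1"));
        System.out.println("preserving accepted: "
                + acceptProposal(transferPreserving(child), "container_1"));
    }
}
```

In this model a proposal built from the lossy transfer is rejected on every attempt, which corresponds to the endless loop the reporter describes. Whether the attached patch takes the field-copying approach shown in transferPreserving is not stated in this message.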
[jira] [Created] (YARN-9580) Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers
Tao Yang created YARN-9580: -- Summary: Fulfilled reservation information in assignment is lost when transferring in ParentQueue#assignContainers Key: YARN-9580 URL: https://issues.apache.org/jira/browse/YARN-9580 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Reporter: Tao Yang Assignee: Tao Yang When an assignment is transferred from a child queue to its parent queue, the fulfilled reservation information in the assignment (fulfilledReservation and fulfilledReservedContainer) is lost. When multi-nodes is enabled, this loss causes a problem: an allocation proposal is generated but cannot be accepted, because FiCaSchedulerApp#commonCheckContainerAllocation checks for the fulfilled reservation information. The scheduler then loops endlessly and the node's resources can no longer be used. In HB-driven scheduling mode, a fulfilled reservation can be allocated via a different call stack: CapacityScheduler#allocateContainersToNode --> CapacityScheduler#allocateContainerOnSingleNode --> CapacityScheduler#allocateFromReservedContainer. In that path the assignment is generated by the leaf queue and submitted directly, which I think is why we have rarely seen this problem before.
[jira] [Commented] (YARN-9573) DistributedShell can't specify LogAggregationContext
[ https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847316#comment-16847316 ] Adam Antal commented on YARN-9573: -- Uploaded patch v1. It supports adding a simple pattern to the include pattern of the log aggregation context and disables the exclusion pattern. This is enough to start DShell with rolling-mode log aggregation. No test is added, though. I'm unsure how one could be added; many of the other options are untested as well. (Also, 80% of the DShell tests time out in my local environment.) > DistributedShell can't specify LogAggregationContext > > > Key: YARN-9573 > URL: https://issues.apache.org/jira/browse/YARN-9573 > Project: Hadoop YARN > Issue Type: Improvement > Components: distributed-shell, log-aggregation, yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9573.001.patch > > > When DShell sends the application request object to the RM, it doesn't specify the LogAggregationContext object, so it is not possible to run DShell with different log-aggregation configurations, e.g. rolling log aggregation.
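As a rough illustration of what "an include pattern with the exclusion pattern disabled" means for file selection, here is a minimal sketch using only java.util.regex. It is a simplified model and not YARN's LogAggregationContext or the code in the patch: LogPatternSketch and select are hypothetical names, the file names are made up, and treating an empty exclude pattern as "disabled" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical model of include/exclude filtering of log file names. */
final class LogPatternSketch {

    /**
     * Returns the file names that fully match includePattern and do not match
     * excludePattern. A null or empty excludePattern disables exclusion
     * (an assumption made for this sketch).
     */
    static List<String> select(List<String> files, String includePattern, String excludePattern) {
        Pattern include = Pattern.compile(includePattern);
        Pattern exclude = (excludePattern == null || excludePattern.isEmpty())
                ? null
                : Pattern.compile(excludePattern);

        List<String> selected = new ArrayList<>();
        for (String file : files) {
            boolean included = include.matcher(file).matches();
            boolean excluded = exclude != null && exclude.matcher(file).matches();
            if (included && !excluded) {
                selected.add(file);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Illustrative file names only.
        List<String> files = Arrays.asList("stdout", "stderr", "syslog", "AppMaster.log");

        // Exclusion disabled: everything matching the include pattern is selected.
        System.out.println(select(files, "std.*|syslog", ""));

        // With an exclusion pattern, matching files are filtered out again.
        System.out.println(select(files, "std.*|syslog", "stderr"));
    }
}
```

With exclusion disabled, the include pattern alone decides which files are picked up, which matches the behaviour the comment describes for patch v1.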
[jira] [Updated] (YARN-9573) DistributedShell can't specify LogAggregationContext
[ https://issues.apache.org/jira/browse/YARN-9573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Antal updated YARN-9573: - Attachment: YARN-9573.001.patch > DistributedShell can't specify LogAggregationContext > > > Key: YARN-9573 > URL: https://issues.apache.org/jira/browse/YARN-9573 > Project: Hadoop YARN > Issue Type: Improvement > Components: distributed-shell, log-aggregation, yarn >Affects Versions: 3.2.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Attachments: YARN-9573.001.patch > > > When DShell sends the application request object to the RM, it doesn't specify the LogAggregationContext object, so it is not possible to run DShell with different log-aggregation configurations, e.g. rolling log aggregation.