[jira] [Created] (YARN-11669) cgroups v2 support for YARN
Ferenc Erdelyi created YARN-11669: - Summary: cgroups v2 support for YARN Key: YARN-11669 URL: https://issues.apache.org/jira/browse/YARN-11669 Project: Hadoop YARN Issue Type: New Feature Components: yarn Reporter: Ferenc Erdelyi cgroups v2 is becoming the default for operating systems such as RHEL 9. Support for it has to be implemented in YARN. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
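For background on what the feature has to detect: on the cgroup v2 unified hierarchy, the file cgroup.controllers exists directly under the cgroup mount point, while on v1 it does not. A minimal probe sketch (this is illustration only, not YARN code; the class name and method are made up here):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CgroupVersionProbe {
    // On the v2 unified hierarchy, <mount>/cgroup.controllers exists;
    // on a v1 layout it does not. Parameterised on the mount root so the
    // logic can be exercised against a test directory.
    static int detectVersion(Path cgroupRoot) {
        return Files.exists(cgroupRoot.resolve("cgroup.controllers")) ? 2 : 1;
    }

    public static void main(String[] args) {
        // On a live Linux box the mount point is conventionally /sys/fs/cgroup.
        System.out.println("cgroup v" + detectVersion(Paths.get("/sys/fs/cgroup")));
    }
}
```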
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831901#comment-17831901 ] ASF GitHub Bot commented on YARN-11664: --- steveloughran commented on PR #6631: URL: https://github.com/apache/hadoop/pull/6631#issuecomment-2025705847 waiting to see what hdfs people say; mentioned internally. now, there is a way to do this with a smaller diff, specifically, move the IOPair class into hadoop common *but keep with the same package name*. something to seriously consider. would reduce the risk of any code elsewhere making explicit use of the class then breaking. > Remove HDFS Binaries/Jars Dependency From YARN > -- > > Key: YARN-11664 > URL: https://issues.apache.org/jira/browse/YARN-11664 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn > Reporter: Syed Shameerur Rahman > Assignee: Syed Shameerur Rahman > Priority: Major > Labels: pull-request-available > > In principle, Hadoop Yarn is independent of HDFS; it can work with any > filesystem. Currently there are some code dependencies from Yarn on HDFS. > These dependencies require Yarn to bring some of the HDFS binaries/jars onto > its class path. The idea behind this jira is to remove this dependency so > that Yarn can run without HDFS binaries/jars. > *Scope* > 1. Non-test classes are considered > 2. Some test classes which come in as transitive dependencies are considered > *Out of scope* > 1. All test classes in the Yarn module are not considered > > > A quick search in the Yarn module revealed the following HDFS dependencies: > 1. Constants > {code:java} > import > org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier; > import org.apache.hadoop.hdfs.DFSConfigKeys;{code} > > > 2. Exception > {code:java} > import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;{code} > > 3. 
Utility > {code:java} > import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code} > > Both Yarn and HDFS depend on the *hadoop-common* module, so: > * Constant variables and utility classes can be moved to *hadoop-common* > * Instead of DSQuotaExceededException, use the parent exception > ClusterStorageCapacityExceeded
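The second bullet relies on a plain exception-hierarchy property: code written against the parent type still catches the HDFS subclass, so the HDFS import can be dropped. A standalone sketch of that idea (the class names mirror the Hadoop ones, but these are local stand-ins so the example compiles without any HDFS jars; the real Hadoop hierarchy includes intermediate classes omitted here):

```java
// Stand-in for the parent exception living in hadoop-common.
class ClusterStorageCapacityExceededException extends java.io.IOException {
    ClusterStorageCapacityExceededException(String msg) { super(msg); }
}

// Stand-in for the HDFS-specific subclass from hadoop-hdfs-client.
class DSQuotaExceededException extends ClusterStorageCapacityExceededException {
    DSQuotaExceededException(String msg) { super(msg); }
}

public class CatchParentException {
    // A YARN-side handler written against the parent type reacts the same
    // way whether HDFS throws its quota subclass or another filesystem
    // throws a different capacity-related subclass.
    static String handle(java.io.IOException e) {
        if (e instanceof ClusterStorageCapacityExceededException) {
            return "capacity-exceeded";
        }
        return "unhandled";
    }
}
```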
[jira] [Commented] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831854#comment-17831854 ] ASF GitHub Bot commented on YARN-11582: --- hiwangzhihui commented on code in PR #6139: URL: https://github.com/apache/hadoop/pull/6139#discussion_r1542224470

## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java:

@@ -1027,4 +1027,79 @@ public void testAMLimitByAllResources() throws Exception {
     rm.close();
   }
+
+  @Test(timeout = 12)
+  public void testDiagnosticWhenAMActivated() throws Exception {
+    /*
+     * Test Case:
+     * Verify AM resource limit per partition level and per queue level. So
+     * we use 2 queues to verify this case.
+     * Queue a1 supports labels (x,y). Configure am-resource-limit as 0.2 (x)
+     * Queue c1 supports default label. Configure am-resource-limit as 0.2
+     *
+     * Queue a1 for label X can only support 2Gb AM resource.
+     * Queue c1 (empty label) can support 2Gb AM resource.
+     *
+     * Verify at least one AM is launched, and AM resources should not go more
+     * than 2GB in each queue.
+     */
+
+    simpleNodeLabelMappingToManager();
+    CapacitySchedulerConfiguration config = (CapacitySchedulerConfiguration)
+        TestUtils.getConfigurationWithQueueLabels(conf);
+
+    // After getting queue conf, configure AM resource percent for Queue a1
+    // as 0.2 (Label X) and for Queue c1 as 0.2 (Empty Label).
+    config.setMaximumAMResourcePercentPerPartition(A1, "x", 0.2f);
+    config.setMaximumApplicationMasterResourcePerQueuePercent(C1, 0.2f);
+
+    // Now inject node label manager with this updated config.
+    MockRM rm = new MockRM(config) {
+      @Override
+      public RMNodeLabelsManager createNodeLabelManager() {
+        return mgr;
+      }
+    };
+
+    rm.getRMContext().setNodeLabelManager(mgr);
+    rm.start();
+    rm.registerNode("h1:1234", 10 * GB); // label = x
+    rm.registerNode("h2:1234", 10 * GB); // label = y
+    rm.registerNode("h3:1234", 10 * GB); // label =
+
+    // Submit app1 with 1Gb AM resource to Queue a1 for label X
+    MockRMAppSubmissionData data1 =
+        MockRMAppSubmissionData.Builder.createWithMemory(GB, rm)
+            .withAppName("app")
+            .withUser("user")
+            .withAcls(null)
+            .withQueue("a1")
+            .withAmLabel("x")
+            .build();
+    RMApp app1 = MockRMAppSubmitter.submit(rm, data1);
+
+    // Submit app2 with 1Gb AM resource to Queue a1 for label X
+    MockRMAppSubmissionData data2 =
+        MockRMAppSubmissionData.Builder.createWithMemory(GB, rm)
+            .withAppName("app")
+            .withUser("user")
+            .withAcls(null)
+            .withQueue("a1")
+            .withAmLabel("x")
+            .build();
+    RMApp app2 = MockRMAppSubmitter.submit(rm, data2);
+
+    CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
+    LeafQueue leafQueue = (LeafQueue) cs.getQueue("a1");
+    Assert.assertNotNull(leafQueue);
+
+    // Only one AM will be activated here, and second AM will be still pending.
+    Assert.assertEquals(2, leafQueue.getNumActiveApplications());
+    String activatedDiagnostics = "AM Resource Request = ";
+    Assert.assertTrue("still doesn't show AMResource When Activated", app1.getDiagnostics()
+        .toString().contains(activatedDiagnostics));

Review Comment: Adding a test that checks for the AM resource prompt would be better.

> Improve WebUI diagnosticMessage to show AM Container resource request size > -- > > Key: YARN-11582 > URL: https://issues.apache.org/jira/browse/YARN-11582 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications, resourcemanager > Affects Versions: 3.3.4 > Reporter: xiaojunxiang > Priority: Minor > Labels: pull-request-available > Attachments: image-2023-10-02-00-05-34-337.png, > image-2024-03-28-22-11-37-903.png, success_ShowAMInfo.jpg > > > When Yarn resources are insufficient, the newly submitted job AM may be in > the state of "Application is Activated, waiting for resources to be assigned > for AM". This is obviously because Yarn doesn't have enough resources to > allocate another AM Container, so we want to know how large the AM Container > is currently allocated. Unfortunately, the current diagnosticMessage on the > Web page does not show this data. Therefore, it is necessary to add the > resource size of the AM Container in the diagnosticMessage, which will be > very useful for us to troubleshoot production faults online.
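The 2 GB figure in the test comment above comes from the CapacityScheduler rule the test exercises: the AM resource limit of a queue on a partition is roughly the queue's effective resource on that partition times the configured maximum-am-resource-percent (0.2f in the test). A rough arithmetic sketch, assuming the queue's effective resource on label x is the full 10 GB node; the scheduler's exact accounting (normalization, minimum allocation, user limits) is more involved and omitted here:

```java
public class AmLimitSketch {
    // Approximate AM resource limit: queue resource on the partition
    // multiplied by maximum-am-resource-percent. This simplifies the real
    // CapacityScheduler computation for illustration only.
    static long amResourceLimitMb(long queueResourceMb, double maxAmPercent) {
        return (long) (queueResourceMb * maxAmPercent);
    }

    public static void main(String[] args) {
        // One 10 GB node labelled x, am-resource-limit 0.2 -> about 2 GB for AMs,
        // matching the "can only support 2Gb AM resource" comment in the test.
        System.out.println(amResourceLimitMb(10 * 1024, 0.2)); // prints 2048
    }
}
```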
[jira] [Comment Edited] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831827#comment-17831827 ] wangzhihui edited comment on YARN-11582 at 3/28/24 2:15 PM: hi, [~slfan1989] . This [PR|https://github.com/apache/hadoop/pull/6139] has added valid Test content and passed the latest Jenkins check; please help merge it. Thanks! was (Author: JIRAUSER302479): hi, [~slfan1989] This [PR|https://github.com/apache/hadoop/pull/6139] has added valid Test content and passed the latest Jenkins check; please help merge it. Thanks!
[jira] [Commented] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831827#comment-17831827 ] wangzhihui commented on YARN-11582: --- hi, [~slfan1989] This [PR|https://github.com/apache/hadoop/pull/6139] has added valid Test content and passed the latest Jenkins check; please help merge it. Thanks!
[jira] [Updated] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangzhihui updated YARN-11582: -- Attachment: image-2024-03-28-22-11-37-903.png
[jira] [Commented] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831676#comment-17831676 ] ASF GitHub Bot commented on YARN-11582: --- hadoop-yetus commented on PR #6139: URL: https://github.com/apache/hadoop/pull/6139#issuecomment-2024708052 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 21s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 24s | | trunk passed | | +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 33s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 37s | | trunk passed | | +1 :green_heart: | javadoc | 0m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 11s | | trunk passed | | +1 :green_heart: | shadedclient | 20m 19s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 26s | | the patch passed | | +1 :green_heart: | compile | 0m 26s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 26s | | the patch passed | | +1 :green_heart: | compile | 0m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 24s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 22s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 25s | | the patch passed | | +1 :green_heart: | javadoc | 0m 22s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 6s | | the patch passed | | +1 :green_heart: | shadedclient | 20m 3s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 89m 20s | | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 24s | | The patch does not generate ASF License warnings. 
| | | | 172m 59s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/13/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6139 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 64f3cc57e612 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / dca8ab0eade23a70756077c5e60ce865237cf340 | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/13/testReport/ | | Max. process+thread count | 948 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/13/console | | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 | | Powered by | Apache Yetus
[jira] [Commented] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831627#comment-17831627 ] ASF GitHub Bot commented on YARN-11582: --- hadoop-yetus commented on PR #6139: URL: https://github.com/apache/hadoop/pull/6139#issuecomment-2024431543 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 20s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 40s | | trunk passed | | +1 :green_heart: | compile | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 31s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 33s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 35s | | trunk passed | | +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 32s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 12s | | trunk passed | | +1 :green_heart: | shadedclient | 20m 21s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 27s | | the patch passed | | +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 28s | | the patch passed | | +1 :green_heart: | compile | 0m 25s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 25s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 22s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/12/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 29 unchanged - 0 fixed = 31 total (was 29) | | +1 :green_heart: | mvnsite | 0m 27s | | the patch passed | | +1 :green_heart: | javadoc | 0m 25s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 6s | | the patch passed | | +1 :green_heart: | shadedclient | 20m 9s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 89m 7s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/12/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 25s | | The patch does not generate ASF License warnings. 
| | | | 173m 25s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimitsByPartition | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/12/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6139 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 19ef12641e21 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
[jira] [Commented] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831593#comment-17831593 ] ASF GitHub Bot commented on YARN-11668: --- slfan1989 commented on PR #6681: URL: https://github.com/apache/hadoop/pull/6681#issuecomment-2024239085 LGTM. > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Junfan Zhang > Priority: Major > Labels: pull-request-available > Attachments: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > > > The RM crashes when encountering the stacktrace shown in the attachment.
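The failure mode behind this issue class is the standard fail-fast iterator behavior: iterating a plain collection while another code path mutates it throws ConcurrentModificationException. A minimal standalone reproduction and one common mitigation (the attribute names are made up, and whether the actual patch uses a defensive copy or a concurrent collection is not shown in this thread):

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

public class NodeAttributesCmeDemo {
    // Mutating the list while its own iterator is live trips the fail-fast
    // modCount check on the next iteration step.
    static boolean mutateDuringIteration() {
        List<String> attrs = new ArrayList<>(List.of("gpu", "ssd", "zone-a"));
        try {
            for (String a : attrs) {
                attrs.remove("ssd"); // structural modification mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // exception observed, as expected
        }
    }

    // Iterating over a defensive copy decouples the iterator from the
    // mutated collection, so no exception is thrown.
    static boolean mutateViaCopy() {
        List<String> attrs = new ArrayList<>(List.of("gpu", "ssd", "zone-a"));
        try {
            for (String a : new ArrayList<>(attrs)) {
                attrs.remove("ssd");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```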
[jira] [Commented] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831488#comment-17831488 ] ASF GitHub Bot commented on YARN-11582: --- hadoop-yetus commented on PR #6139: URL: https://github.com/apache/hadoop/pull/6139#issuecomment-2023438781 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 22s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 36m 16s | | trunk passed | | +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 37s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 40s | | trunk passed | | +1 :green_heart: | javadoc | 0m 42s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 33s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 22s | | trunk passed | | +1 :green_heart: | shadedclient | 24m 32s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 30s | | the patch passed | | +1 :green_heart: | compile | 0m 34s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 34s | | the patch passed | | +1 :green_heart: | compile | 0m 33s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 33s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 28s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 34s | | the patch passed | | +1 :green_heart: | javadoc | 0m 29s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 23s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 17s | | the patch passed | | +1 :green_heart: | shadedclient | 22m 23s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 89m 20s | | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 25s | | The patch does not generate ASF License warnings. 
| | | | 185m 14s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/11/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6139 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux d25913424258 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / d0015be4cf3cbd73f8c7a85ba67442a2b36e8bbc | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/11/testReport/ | | Max. process+thread count | 951 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/11/console | | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 | | Powered by | Apache Yetus
[jira] [Commented] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831424#comment-17831424 ] ASF GitHub Bot commented on YARN-11668: --- hadoop-yetus commented on PR #6681: URL: https://github.com/apache/hadoop/pull/6681#issuecomment-2023096174 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 50s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 49m 47s | | trunk passed | | +1 :green_heart: | compile | 1m 4s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 55s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 56s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 0s | | trunk passed | | +1 :green_heart: | javadoc | 0m 59s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 50s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 2m 0s | | trunk passed | | +1 :green_heart: | shadedclient | 39m 38s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 48s | | the patch passed | | +1 :green_heart: | compile | 0m 54s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 54s | | the patch passed | | +1 :green_heart: | compile | 0m 47s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 47s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 46s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 50s | | the patch passed | | +1 :green_heart: | javadoc | 0m 44s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 41s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 59s | | the patch passed | | +1 :green_heart: | shadedclient | 40m 18s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 107m 51s | | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. 
| | | | 257m 5s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6681/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6681 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 49f4d2c2b589 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / dd9c02bb4d6fc750136bd9be068fd4efe647c87c | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6681/2/testReport/ | | Max. process+thread count | 878 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://ci-hadoop.apache.org/job/hadoop
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831409#comment-17831409 ] ASF GitHub Bot commented on YARN-11664: --- hadoop-yetus commented on PR #6631: URL: https://github.com/apache/hadoop/pull/6631#issuecomment-2023044050 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 32s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 13m 56s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 32m 21s | | trunk passed | | +1 :green_heart: | compile | 17m 29s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 16m 7s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 4m 32s | | trunk passed | | +1 :green_heart: | mvnsite | 6m 51s | | trunk passed | | +1 :green_heart: | javadoc | 5m 47s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 6m 5s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | -1 :x: | spotbugs | 2m 47s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/7/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html) | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs warnings. 
| | -1 :x: | spotbugs | 1m 9s | [/branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/7/artifact/out/branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core-warnings.html) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core in trunk has 1 extant spotbugs warnings. | | +1 :green_heart: | shadedclient | 33m 58s | | branch has no errors when building and testing our client artifacts. | | -0 :warning: | patch | 34m 24s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 32s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 4m 32s | | the patch passed | | +1 :green_heart: | compile | 17m 37s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 17m 37s | | the patch passed | | +1 :green_heart: | compile | 16m 28s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 16m 28s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. 
| | -0 :warning: | checkstyle | 4m 15s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/7/artifact/out/results-checkstyle-root.txt) | root: The patch generated 4 new + 526 unchanged - 2 fixed = 530 total (was 528) | | +1 :green_heart: | mvnsite | 7m 2s | | the patch passed | | -1 :x: | javadoc | 1m 8s | [/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/7/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-common in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | +1 :green_heart: | javadoc | 6m 3s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 14m 15s | | the patch passed | | +1 :green_heart: | shadedclient | 33m 29s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 19m 15s | | hadoop-common in the patch passed. | | +1
[jira] [Commented] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831387#comment-17831387 ] ASF GitHub Bot commented on YARN-11582: --- hadoop-yetus commented on PR #6139: URL: https://github.com/apache/hadoop/pull/6139#issuecomment-2022950080 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 22s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 37m 4s | | trunk passed | | +1 :green_heart: | compile | 0m 37s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 34s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 35s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 41s | | trunk passed | | +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 9s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 46s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 27s | | the patch passed | | -1 :x: | compile | 0m 6s | [/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/10/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-yarn-server-resourcemanager in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | -1 :x: | javac | 0m 6s | [/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/10/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-yarn-server-resourcemanager in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | +1 :green_heart: | compile | 0m 25s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 25s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 26s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 28s | | the patch passed | | +1 :green_heart: | javadoc | 0m 29s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 15s | | the patch passed | | +1 :green_heart: | shadedclient | 25m 16s | | patch has no errors when building and testing our client artifacts. 
| _ Other Tests _ | | +1 :green_heart: | unit | 88m 53s | | hadoop-yarn-server-resourcemanager in the patch passed. | | -1 :x: | asflicense | 0m 26s | [/results-asflicense.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/10/artifact/out/results-asflicense.txt) | The patch generated 1 ASF License warnings. | | | | 188m 48s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/10/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6139 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux a1e82d469702 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC
[jira] [Commented] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831332#comment-17831332 ] ASF GitHub Bot commented on YARN-11668: --- hadoop-yetus commented on PR #6681: URL: https://github.com/apache/hadoop/pull/6681#issuecomment-2022686673 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 47s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 49m 46s | | trunk passed | | +1 :green_heart: | compile | 1m 1s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 54s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 56s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 58s | | trunk passed | | +1 :green_heart: | javadoc | 0m 56s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 47s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 57s | | trunk passed | | +1 :green_heart: | shadedclient | 40m 32s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 48s | | the patch passed | | +1 :green_heart: | compile | 0m 54s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 54s | | the patch passed | | +1 :green_heart: | compile | 0m 44s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 44s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 44s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 48s | | the patch passed | | +1 :green_heart: | javadoc | 0m 44s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 40s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 56s | | the patch passed | | +1 :green_heart: | shadedclient | 39m 22s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 107m 33s | | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 34s | | The patch does not generate ASF License warnings. 
| | | | 256m 1s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6681/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6681 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux d93b6f0e44fb 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / dd9c02bb4d6fc750136bd9be068fd4efe647c87c | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6681/1/testReport/ | | Max. process+thread count | 908 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://ci-hadoop.apache.org/job/hadoop
[jira] [Commented] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831304#comment-17831304 ] ASF GitHub Bot commented on YARN-11668: --- zuston opened a new pull request, #6681: URL: https://github.com/apache/hadoop/pull/6681 ### Description of PR ![img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg](https://github.com/apache/hadoop/assets/8609142/3bcb0f02-267b-4e28-b7a2-c732a37796cf) This PR fixes a concurrent modification exception that crashes the RM when node attributes for a single node manager are updated concurrently. ### How was this patch tested? Existing tests. ### For code changes: - [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')? - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files? > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Labels: pull-request-available > Attachments: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > > > The RM crashes when encountering the stacktrace in the attachment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
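The failure mode behind this PR can be reproduced in miniature: a fail-fast iterator over a node's attribute set is invalidated by a structural modification of the same set. The sketch below uses invented attribute strings and a single thread to trigger the exception deterministically (it is an illustration of the general pattern, not the RM's actual code), and then shows one common remedy: iterating over a defensive copy.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;

public class NodeAttributeCmeDemo {
    public static void main(String[] args) {
        // Hypothetical stand-in for a node's attribute set in the RM.
        Set<String> attributes = new HashSet<>();
        attributes.add("rack=r1");
        attributes.add("gpu=true");

        // Structurally modifying the set while iterating it (here simulated
        // in one thread) trips HashSet's fail-fast iterator.
        boolean threw = false;
        try {
            for (String attr : attributes) {
                attributes.remove("gpu=true"); // concurrent structural change
            }
        } catch (ConcurrentModificationException e) {
            threw = true;
        }
        System.out.println("CME observed: " + threw);

        // One common fix: iterate a defensive copy, so updates to the live
        // set cannot invalidate the iterator being walked.
        for (String attr : new HashSet<>(attributes)) {
            attributes.remove(attr);
        }
        System.out.println("attributes left: " + attributes.size());
    }
}
```

The same effect occurs across threads without any synchronization, which is why the bug is intermittent in production; alternatives to copying include a `ConcurrentHashMap`-backed set or external locking around both the update and the iteration.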
[jira] [Commented] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831269#comment-17831269 ] ASF GitHub Bot commented on YARN-11582: --- hadoop-yetus commented on PR #6139: URL: https://github.com/apache/hadoop/pull/6139#issuecomment-2022373151 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 23s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 38m 23s | | trunk passed | | +1 :green_heart: | compile | 0m 32s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 32s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 32s | | trunk passed | | +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 28s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 9s | | trunk passed | | +1 :green_heart: | shadedclient | 25m 1s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | -1 :x: | mvninstall | 0m 25s | [/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/9/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-server-resourcemanager in the patch failed. | | -1 :x: | compile | 0m 29s | [/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/9/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-yarn-server-resourcemanager in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | -1 :x: | javac | 0m 29s | [/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/9/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-yarn-server-resourcemanager in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. 
| | -1 :x: | compile | 0m 26s | [/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/9/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | hadoop-yarn-server-resourcemanager in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. | | -1 :x: | javac | 0m 26s | [/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6139/9/artifact/out/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | hadoop-yarn-server-resourcemanager in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 23s | | the patch passed | | -1 :x: | mvnsite | 0m 25s | [/patch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop
[jira] [Commented] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831234#comment-17831234 ] ASF GitHub Bot commented on YARN-11668: --- zuston closed pull request #6681: YARN-11668. Fix RM crash for potential concurrent modification exception when updating node attributes URL: https://github.com/apache/hadoop/pull/6681 > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Labels: pull-request-available > Attachments: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > > > The RM crashes when encountering the stacktrace in the attachment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated YARN-11668: -- Labels: pull-request-available (was: ) > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Labels: pull-request-available > Attachments: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > > > The RM crashes when encountering the stacktrace in the attachment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831214#comment-17831214 ] ASF GitHub Bot commented on YARN-11668: --- zuston opened a new pull request, #6681: URL: https://github.com/apache/hadoop/pull/6681 ### Description of PR ![img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg](https://github.com/apache/hadoop/assets/8609142/3bcb0f02-267b-4e28-b7a2-c732a37796cf) This PR fixes a concurrent modification exception that crashes the RM when node attributes for a single node manager are updated concurrently. ### How was this patch tested? Existing tests. ### For code changes: - [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')? - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files? > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Attachments: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > > > The RM crashes when encountering the stacktrace in the attachment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831215#comment-17831215 ] ASF GitHub Bot commented on YARN-11668: --- zuston commented on PR #6681: URL: https://github.com/apache/hadoop/pull/6681#issuecomment-2022203719 PTAL @slfan1989 > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Labels: pull-request-available > Attachments: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > > > The RM crashes when encountering the stacktrace in the attachment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junfan Zhang updated YARN-11668: Attachment: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Attachments: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > > > The RM crashes when encountering the stacktrace. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junfan Zhang updated YARN-11668: Description: The RM crashes when encountering the stacktrace in the attachment. (was: The RM crashes when encountering the stacktrace.) > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > Attachments: img_v3_029c_55ac6b50-64aa-4cbe-81a0-5f8d22c623fg.jpg > > > The RM crashes when encountering the stacktrace in the attachment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
[ https://issues.apache.org/jira/browse/YARN-11668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junfan Zhang updated YARN-11668: Description: The RM crashes when encountering the stacktrace. > Potential concurrent modification exception for node attributes of node > manager > --- > > Key: YARN-11668 > URL: https://issues.apache.org/jira/browse/YARN-11668 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Junfan Zhang >Priority: Major > > The RM crashes when encountering the stacktrace. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-11668) Potential concurrent modification exception for node attributes of node manager
Junfan Zhang created YARN-11668: --- Summary: Potential concurrent modification exception for node attributes of node manager Key: YARN-11668 URL: https://issues.apache.org/jira/browse/YARN-11668 Project: Hadoop YARN Issue Type: Bug Reporter: Junfan Zhang -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangzhihui updated YARN-11582: -- Priority: Minor (was: Major) > Improve WebUI diagnosticMessage to show AM Container resource request size > -- > > Key: YARN-11582 > URL: https://issues.apache.org/jira/browse/YARN-11582 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications, resourcemanager >Affects Versions: 3.3.4 >Reporter: xiaojunxiang >Priority: Minor > Labels: pull-request-available > Attachments: image-2023-10-02-00-05-34-337.png, success_ShowAMInfo.jpg > > > When Yarn resources are insufficient, the newly submitted job AM may be in > the state of "Application is Activated, waiting for resources to be assigned > for AM". This is obviously because Yarn doesn't have enough resources to > allocate another AM Container, so we want to know how large the requested AM > Container is. Unfortunately, the current diagnosticMessage on the Web page > does not show this data. Therefore, it is necessary to add the resource size > of the AM Container to the diagnosticMessage, which will be very useful for > us to troubleshoot production faults online. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-11582) Improve WebUI diagnosticMessage to show AM Container resource request size
[ https://issues.apache.org/jira/browse/YARN-11582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831205#comment-17831205 ] ASF GitHub Bot commented on YARN-11582: --- xiaojunxiang2023 commented on PR #6139: URL: https://github.com/apache/hadoop/pull/6139#issuecomment-2022168620 @hiwangzhihui Hi~, please help me review it. > Improve WebUI diagnosticMessage to show AM Container resource request size > -- > > Key: YARN-11582 > URL: https://issues.apache.org/jira/browse/YARN-11582 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications, resourcemanager >Affects Versions: 3.3.4 >Reporter: xiaojunxiang >Priority: Major > Labels: pull-request-available > Attachments: image-2023-10-02-00-05-34-337.png, success_ShowAMInfo.jpg > > > When Yarn resources are insufficient, the newly submitted job AM may be in > the state of "Application is Activated, waiting for resources to be assigned > for AM". This is obviously because Yarn doesn't have enough resources to > allocate another AM Container, so we want to know how large the requested AM > Container is. Unfortunately, the current diagnosticMessage on the Web page > does not show this data. Therefore, it is necessary to add the resource size > of the AM Container to the diagnosticMessage, which will be very useful for > us to troubleshoot production faults online. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
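The spirit of the improvement is small: append the AM container's resource request to the diagnostic text the RM already produces. The helper name and message format below are invented for illustration (the actual change is in PR #6139); this is only a sketch of the idea.

```java
public class DiagnosticMessageDemo {
    // Hypothetical helper: extend the existing diagnostic text with the AM
    // container's resource request so operators can see why it cannot be placed.
    static String withAmResource(String diagnostic, long memoryMb, int vcores) {
        return diagnostic + "; AM container resource request: <memory:"
            + memoryMb + " MB, vCores:" + vcores + ">";
    }

    public static void main(String[] args) {
        String base =
            "Application is Activated, waiting for resources to be assigned for AM";
        // Prints the diagnostic with the (hypothetical) resource suffix appended.
        System.out.println(withAmResource(base, 2048, 1));
    }
}
```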
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831179#comment-17831179 ] ASF GitHub Bot commented on YARN-11664: --- shameersss1 commented on PR #6631: URL: https://github.com/apache/hadoop/pull/6631#issuecomment-2022015512 > looks ok to me, but hdfs mail list should be invited to comment. > > made some minor suggestions I have sent an email to the hdfs mailing list asking for their opinion as well. > Remove HDFS Binaries/Jars Dependency From YARN > -- > > Key: YARN-11664 > URL: https://issues.apache.org/jira/browse/YARN-11664 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > > In principle Hadoop Yarn is independent of HDFS. It can work with any > filesystem. Currently there exist some code dependencies between Yarn and HDFS. > These dependencies require Yarn to bring some of the HDFS binaries/jars onto > its class path. The idea behind this jira is to remove this dependency so > that Yarn can run without HDFS binaries/jars. > *Scope* > 1. Non-test classes are considered > 2. Some test classes which come in as transitive dependencies are considered > *Out of scope* > 1. All test classes in the Yarn module are not considered > > > A quick search in the Yarn module revealed the following HDFS dependencies > 1. Constants > {code:java} > import > org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier; > import org.apache.hadoop.hdfs.DFSConfigKeys;{code} > > > 2. Exception > {code:java} > import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;{code} > > 3. Utility > {code:java} > import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code} > > Both Yarn and HDFS depend on the *hadoop-common* module, so > * Constants variables and Utility classes can be moved to *hadoop-common* > * Instead of DSQuotaExceededException, use the parent exception > ClusterStorageCapacityExceeded -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
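The exception swap in the last bullet works because the HDFS-specific subclass ultimately extends a type in hadoop-common, so YARN can catch the parent without referencing any HDFS class. The sketch below models that hierarchy with stand-in classes (the real `ClusterStorageCapacityExceededException` lives in hadoop-common and `DSQuotaExceededException` in the HDFS client jars; the method name is invented for illustration):

```java
import java.io.IOException;

// Stand-ins mirroring the real hierarchy described in the Jira.
class ClusterStorageCapacityExceededException extends IOException {
    ClusterStorageCapacityExceededException(String msg) { super(msg); }
}

class DSQuotaExceededException extends ClusterStorageCapacityExceededException {
    DSQuotaExceededException(String msg) { super(msg); }
}

public class QuotaCatchDemo {
    // Simulates an HDFS write path throwing the HDFS-specific subclass.
    static void writeToFileSystem() throws IOException {
        throw new DSQuotaExceededException("space quota exceeded");
    }

    public static void main(String[] args) {
        try {
            writeToFileSystem();
        } catch (ClusterStorageCapacityExceededException e) {
            // YARN-side code catches the hadoop-common parent type, so it
            // compiles with no HDFS jars on the classpath; the subclass thrown
            // by HDFS at runtime is still caught here via the hierarchy.
            System.out.println("capacity exceeded: " + e.getMessage());
        } catch (IOException e) {
            System.out.println("other IO failure");
        }
    }
}
```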
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831169#comment-17831169 ] ASF GitHub Bot commented on YARN-11664: --- shameersss1 commented on code in PR #6631: URL: https://github.com/apache/hadoop/pull/6631#discussion_r1540493331 ## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/IOStreamPair.java: ## @@ -15,15 +15,14 @@ * See the License for the specific language governing permissions and * limitations under the License. */ -package org.apache.hadoop.hdfs.protocol.datatransfer; +package org.apache.hadoop.io; import java.io.Closeable; import java.io.IOException; import java.io.InputStream; import java.io.OutputStream; import org.apache.hadoop.classification.InterfaceAudience; -import org.apache.hadoop.io.IOUtils; /** * A little struct class to wrap an InputStream and an OutputStream. Review Comment: ack
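The class being relocated in the diff above is a small struct wrapping an input and an output stream; the review thread asks that its javadoc note that close() closes both. A minimal self-contained sketch of that behaviour follows — the real class keeps its Hadoop InterfaceAudience annotation and closes via IOUtils, which this stand-in does not.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
 * Sketch of an InputStream/OutputStream pair struct.
 * Both wrapped streams get closed in {@link #close()}.
 */
public class IOStreamPairSketch implements Closeable {
  public final InputStream in;
  public final OutputStream out;

  public IOStreamPairSketch(InputStream in, OutputStream out) {
    this.in = in;
    this.out = out;
  }

  @Override
  public void close() throws IOException {
    // try-with-resources closes both streams (in reverse order), and still
    // closes the second one even if closing the first throws.
    try (InputStream i = in; OutputStream o = out) {
      // nothing to do; the resource block exists only to close both.
    }
  }
}
```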
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831159#comment-17831159 ] ASF GitHub Bot commented on YARN-11664: --- shameersss1 commented on code in PR #6631: URL: https://github.com/apache/hadoop/pull/6631#discussion_r1540456198 ## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/Constants.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop; + +import org.apache.hadoop.io.Text; + +/** + * This class contains constants for configuration keys and default values. + */ +public final class Constants { Review Comment: ack ## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/Constants.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop; + +import org.apache.hadoop.io.Text; + +/** + * This class contains constants for configuration keys and default values. + */ +public final class Constants { + + public static final Text HDFS_DELEGATION_KIND = Review Comment: ack
[jira] [Commented] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831102#comment-17831102 ] ASF GitHub Bot commented on YARN-11663: --- slfan1989 commented on code in PR #6662: URL: https://github.com/apache/hadoop/pull/6662#discussion_r1540208851 ## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java: ## @@ -4031,6 +4031,11 @@ public static boolean isAclEnabled(Configuration conf) { // 5 minutes public static final int DEFAULT_FEDERATION_CACHE_TIME_TO_LIVE_SECS = 5 * 60; + public static final String FEDERATION_CACHE_ENTITY_NUMS = + FEDERATION_PREFIX + "cache-entity.nums"; + // default 1000 Review Comment: Thanks for reviewing the code! I will improve the code. > [Federation] Add Cache Entity Nums Limit. > - > > Key: YARN-11663 > URL: https://issues.apache.org/jira/browse/YARN-11663 > Project: Hadoop YARN > Issue Type: Improvement > Components: federation, yarn >Affects Versions: 3.4.0 >Reporter: Yuan Luo >Priority: Major > Labels: pull-request-available > Attachments: image-2024-03-14-18-12-28-426.png, > image-2024-03-14-18-12-49-950.png, image-2024-03-15-10-50-32-860.png > > > !image-2024-03-14-18-12-28-426.png! > !image-2024-03-14-18-12-49-950.png! > hi [~slfan1989] After applying this feature to our prod env, I found the memory > of the router keeps growing over time. This is because after jobs finish, > we never access the expired keys, so the cleanup mechanism is not triggered. > Would it be better to add a maximum cache-entry limit?
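The problem described above — finished jobs whose cache keys are never read again, so TTL-based cleanup never fires — is exactly what an entry-count bound fixes. One minimal way to cap entry count (a sketch only, not the actual YARN-11663 implementation, whose config key `yarn.federation.cache-entity.nums` and default of 1000 appear in the diff above) is a LinkedHashMap in access order that evicts its eldest entry once the cap is exceeded:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A size-bounded LRU map: once more than maxEntries are present, the
// least-recently-accessed entry is evicted on the next insertion, so the
// cache cannot grow without bound even if old keys are never read again.
public class BoundedCacheSketch<K, V> extends LinkedHashMap<K, V> {
  private final int maxEntries;

  public BoundedCacheSketch(int maxEntries) {
    super(16, 0.75f, true);  // accessOrder=true gives LRU iteration order
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called by put(); returning true evicts the eldest entry.
    return size() > maxEntries;
  }
}
```

With a cap of 3, inserting a fourth application silently drops the oldest one instead of growing the Router's heap.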
[jira] [Assigned] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai reassigned YARN-11664: --- Assignee: Syed Shameerur Rahman
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17831047#comment-17831047 ] ASF GitHub Bot commented on YARN-11664: --- steveloughran commented on code in PR #6631: URL: https://github.com/apache/hadoop/pull/6631#discussion_r1539932538 ## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/IOStreamPair.java: ## @@ -15,15 +15,14 @@ * See the License for the specific language governing permissions and * limitations under the License. */ -package org.apache.hadoop.hdfs.protocol.datatransfer; +package org.apache.hadoop.io; import java.io.Closeable; import java.io.IOException; import java.io.InputStream; import java.io.OutputStream; import org.apache.hadoop.classification.InterfaceAudience; -import org.apache.hadoop.io.IOUtils; /** * A little struct class to wrap an InputStream and an OutputStream. Review Comment: add that they both get closed in the close ## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/Constants.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop; + +import org.apache.hadoop.io.Text; + +/** + * This class contains constants for configuration keys and default values. + */ +public final class Constants { + + public static final Text HDFS_DELEGATION_KIND = Review Comment: javadocs with {@value} entries here and below ## hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/Constants.java: ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop; + +import org.apache.hadoop.io.Text; + +/** + * This class contains constants for configuration keys and default values. + */ +public final class Constants { Review Comment: not in this package. should go somewhere under filesystem, maybe a class like `org.apache.hadoop.fs.HdfsCommonConstants` for all hdfs related constants which need to go into hadoop-common
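Putting the two review suggestions together — a class named along the lines of `org.apache.hadoop.fs.HdfsCommonConstants`, with each field carrying a `{@value}` javadoc — might look like the sketch below. This is a guess at what the revised patch looks like, not its actual content; note also that `{@value}` only works on compile-time constants (String/primitives), so the sketch uses a String where the real field is an `org.apache.hadoop.io.Text`, and the package declaration is omitted so the snippet compiles standalone.

```java
/**
 * Sketch of a hadoop-common holder for HDFS-related constants, following the
 * review suggestion above. The real class would sit under org.apache.hadoop.fs.
 */
public final class HdfsCommonConstants {

  /**
   * Kind name of the HDFS delegation token: {@value}.
   * (In the real patch this is wrapped in an org.apache.hadoop.io.Text,
   * which cannot carry a {@value} tag directly.)
   */
  public static final String HDFS_DELEGATION_KIND = "HDFS_DELEGATION_TOKEN";

  private HdfsCommonConstants() {
    // constants holder; never instantiated
  }
}
```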
[jira] [Commented] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830999#comment-17830999 ] ASF GitHub Bot commented on YARN-11663: --- dineshchitlangia commented on code in PR #6662: URL: https://github.com/apache/hadoop/pull/6662#discussion_r1539567214 ## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java: ## @@ -4031,6 +4031,11 @@ public static boolean isAclEnabled(Configuration conf) { // 5 minutes public static final int DEFAULT_FEDERATION_CACHE_TIME_TO_LIVE_SECS = 5 * 60; + public static final String FEDERATION_CACHE_ENTITY_NUMS = + FEDERATION_PREFIX + "cache-entity.nums"; + // default 1000 Review Comment: NIT - This comment can be removed as the variable name is appropriate.
[jira] [Commented] (YARN-11387) [GPG] YARN GPG mistakenly deleted applicationid
[ https://issues.apache.org/jira/browse/YARN-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830971#comment-17830971 ] ASF GitHub Bot commented on YARN-11387: --- slfan1989 commented on code in PR #6660: URL: https://github.com/apache/hadoop/pull/6660#discussion_r1539407981 ## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-globalpolicygenerator/src/main/java/org/apache/hadoop/yarn/server/globalpolicygenerator/applicationcleaner/DefaultApplicationCleaner.java: ## @@ -46,47 +45,38 @@ public void run() { LOG.info("Application cleaner run at time {}", now); FederationStateStoreFacade facade = getGPGContext().getStateStoreFacade(); Review Comment: Step 1: Retrieve all applications stored in the StateStore, which represents all applications submitted to the Router. Step 2: Use the Router's REST API to fetch all running tasks. This API will invoke applications from all active SubClusters. Step 3: Compare the results of Step1 and Step2 to identify applications that exist in Step1 but not in Step2. Delete these applications. There is a potential issue with this approach. If a particular SubCluster is undergoing maintenance, such as RM restart, Step2 will not be able to fetch the complete list of running applications. As a result, during the comparison in Step3, there is a risk of mistakenly deleting applications that are still running. We have three SubClusters: subClusterA, subClusterB, and subClusterC, with an equal allocation ratio of 1:1:1. We submit six applications through routerA. app1 and app2 are allocated to subClusterA app3 and app4 to subClusterB app5 and app6 to subClusterC. Among these, app1, app3, and app5 have completed their execution, and we expect to retain app2, app4, and app6 in the StateStore. In the normal scenario: Comparing the steps mentioned above: Step 1: We will retrieve six applications [app1, app2, app3, app4, app5, app6] from the StateStore. 
Step 2: We will fetch three applications [app2, app4, app6] from the Router's REST interface. Step 3: By comparing Step 1 and Step 2, we can identify that applications [app1, app3, app5] should be deleted. In the exceptional scenario: Comparing the steps mentioned above: Step 1: We will retrieve six applications [app1, app2, app3, app4, app5, app6] from the StateStore. Step 2: We will fetch the list of running applications from the Router's REST interface. However, due to maintenance in subClusterB and subClusterC, we can only obtain the applications running in subClusterA [app2]. Step 3: By comparing Step 1 and Step 2, we can identify that applications [app1, app3, app4, app5, app6] should be deleted. In this case, the still-running applications [app4, app6] are deleted by mistake. > [GPG] YARN GPG mistakenly deleted applicationid > --- > > Key: YARN-11387 > URL: https://issues.apache.org/jira/browse/YARN-11387 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.2.1, 3.4.0 >Reporter: zhangjunj >Assignee: Shilun Fan >Priority: Major > Labels: federation, gpg, pull-request-available > Attachments: YARN-11387-YARN-11387.v1.patch, > yarn-gpg-mistakenly-deleted-applicationid.png > > Original Estimate: 168h > Remaining Estimate: 168h > > In [YARN-7599|https://issues.apache.org/jira/browse/YARN-7599], the > Federation can delete expired applicationids, but YARN GPG uses the getRouter() > method to obtain application information for multiple clusters. If there are > too many applicationids (more than 200,000), it will not be possible to > pull all the applicationid information at one time, resulting in the > possibility of accidental deletion. The following error is reported for the spark > component.
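The three cleanup steps walked through above reduce to a set difference: delete = (apps in the StateStore) minus (apps the Router reports running). This sketch (an illustration of the logic, with hypothetical names, not the actual DefaultApplicationCleaner code) makes the failure mode concrete: when a SubCluster is unreachable, the "running" set is incomplete and the difference over-approximates, marking live applications for deletion.

```java
import java.util.HashSet;
import java.util.Set;

public class GpgCleanerSketch {
  // Apps present in the StateStore but absent from the Router's view of
  // running apps are considered finished and eligible for deletion.
  static Set<String> toDelete(Set<String> inStateStore, Set<String> running) {
    Set<String> result = new HashSet<>(inStateStore);
    result.removeAll(running);
    return result;
  }
}
```

With all SubClusters up, the difference is exactly the finished apps; with subClusterB and subClusterC down, apps [app4, app6] leak into the delete set even though they are still running.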
[jira] [Commented] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830970#comment-17830970 ] ASF GitHub Bot commented on YARN-11663: --- slfan1989 commented on PR #6662: URL: https://github.com/apache/hadoop/pull/6662#issuecomment-2020639828 @goiri Can you help review this PR? Thank you very much!
[jira] [Commented] (YARN-11261) Upgrade JUnit from 4 to 5 in hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-11261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830929#comment-17830929 ] ASF GitHub Bot commented on YARN-11261: --- hadoop-yetus commented on PR #6652: URL: https://github.com/apache/hadoop/pull/6652#issuecomment-2020426986 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 19s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 22s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 19m 52s | | trunk passed | | +1 :green_heart: | compile | 8m 39s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 7m 53s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 2m 3s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 31s | | trunk passed | | +1 :green_heart: | javadoc | 1m 16s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 1s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 2m 35s | | trunk passed | | +1 :green_heart: | shadedclient | 22m 43s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 21s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 0m 57s | | the patch passed | | +1 :green_heart: | compile | 8m 22s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 8m 22s | | the patch passed | | +1 :green_heart: | compile | 8m 4s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 8m 4s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 2m 3s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/3/artifact/out/results-checkstyle-root.txt) | root: The patch generated 7 new + 140 unchanged - 0 fixed = 147 total (was 140) | | +1 :green_heart: | mvnsite | 1m 33s | | the patch passed | | +1 :green_heart: | javadoc | 1m 7s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 10s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | -1 :x: | spotbugs | 1m 33s | [/new-spotbugs-hadoop-common-project_hadoop-common.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/3/artifact/out/new-spotbugs-hadoop-common-project_hadoop-common.html) | hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 :green_heart: | shadedclient | 20m 26s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 16m 42s | | hadoop-common in the patch passed. 
| | -1 :x: | unit | 88m 3s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/3/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 33s | | The patch does not generate ASF License warnings. | | | | 237m 43s | | | | Reason | Tests | |---:|:--| | SpotBugs | module:hadoop-common-project/hadoop-common | | | Inconsistent synchronization of org.apache.hadoop.conf.Configuration.properties; locked 87% of time Unsynchronized access at Configuration.java:87% of time Unsynchronized access at Configuration.java:[line 771] | | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNewQueueAutoCreation | | Subsystem | Report/Notes
[jira] [Commented] (YARN-11261) Upgrade JUnit from 4 to 5 in hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-11261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830616#comment-17830616 ] ASF GitHub Bot commented on YARN-11261: --- hadoop-yetus commented on PR #6652: URL: https://github.com/apache/hadoop/pull/6652#issuecomment-2018548183 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 20s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 16m 12s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 20m 14s | | trunk passed | | +1 :green_heart: | compile | 8m 50s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 8m 21s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 2m 3s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 40s | | trunk passed | | +1 :green_heart: | javadoc | 1m 25s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 17s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 2m 49s | | trunk passed | | +1 :green_heart: | shadedclient | 20m 45s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 22s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 0m 56s | | the patch passed | | +1 :green_heart: | compile | 8m 39s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 8m 39s | | the patch passed | | +1 :green_heart: | compile | 8m 10s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 8m 10s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 2m 4s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/2/artifact/out/results-checkstyle-root.txt) | root: The patch generated 7 new + 139 unchanged - 0 fixed = 146 total (was 139) | | +1 :green_heart: | mvnsite | 1m 40s | | the patch passed | | +1 :green_heart: | javadoc | 1m 18s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 18s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | -1 :x: | spotbugs | 1m 34s | [/new-spotbugs-hadoop-common-project_hadoop-common.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/2/artifact/out/new-spotbugs-hadoop-common-project_hadoop-common.html) | hadoop-common-project/hadoop-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) | | +1 :green_heart: | shadedclient | 21m 0s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 16m 19s | | hadoop-common in the patch passed. 
| | -1 :x: | unit | 90m 22s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/2/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 44s | | The patch does not generate ASF License warnings. | | | | 243m 18s | | | | Reason | Tests | |---:|:--| | SpotBugs | module:hadoop-common-project/hadoop-common | | | Inconsistent synchronization of org.apache.hadoop.conf.Configuration.properties; locked 93% of time Unsynchronized access at Configuration.java:93% of time Unsynchronized access at Configuration.java:[line 771] | | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNewQueueAutoCreation | | Subsystem | Report/Notes
[jira] [Commented] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830247#comment-17830247 ] ASF GitHub Bot commented on YARN-11663: --- hadoop-yetus commented on PR #6662: URL: https://github.com/apache/hadoop/pull/6662#issuecomment-2016767599 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 23s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | xmllint | 0m 0s | | xmllint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 13m 41s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 23m 4s | | trunk passed | | +1 :green_heart: | compile | 10m 4s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 9m 27s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 2m 23s | | trunk passed | | +1 :green_heart: | mvnsite | 2m 30s | | trunk passed | | +1 :green_heart: | javadoc | 2m 21s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 2m 20s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +0 :ok: | spotbugs | 0m 33s | | branch/hadoop-project no spotbugs output file (spotbugsXml.xml) | | +1 :green_heart: | shadedclient | 22m 30s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 36s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 22s | | the patch passed | | +1 :green_heart: | compile | 8m 57s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 8m 57s | | the patch passed | | +1 :green_heart: | compile | 9m 19s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 9m 19s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 2m 29s | | the patch passed | | +1 :green_heart: | mvnsite | 2m 5s | | the patch passed | | +1 :green_heart: | javadoc | 2m 18s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 2m 10s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +0 :ok: | spotbugs | 0m 27s | | hadoop-project has no data from spotbugs | | -1 :x: | shadedclient | 23m 44s | | patch has errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 0m 25s | | hadoop-project in the patch passed. | | +1 :green_heart: | unit | 0m 52s | | hadoop-yarn-api in the patch passed. | | +1 :green_heart: | unit | 4m 46s | | hadoop-yarn-common in the patch passed. | | +1 :green_heart: | unit | 2m 53s | | hadoop-yarn-server-common in the patch passed. | | +1 :green_heart: | asflicense | 0m 41s | | The patch does not generate ASF License warnings. 
| | | | 164m 39s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6662/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6662 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle | | uname | Linux 33879fd5031e 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 3bf1337d6649b40c92d891b2a8f8ff3d0da4df56 | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
[jira] [Commented] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830186#comment-17830186 ] ASF GitHub Bot commented on YARN-11663: --- slfan1989 commented on code in PR #6662: URL: https://github.com/apache/hadoop/pull/6662#discussion_r1536710928 ## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/cache/FederationGuavaCache.java: ## @@ -60,7 +63,7 @@ public void initCache(Configuration pConf, FederationStateStore pStateStore) { // Initialize Cache. cache = CacheBuilder.newBuilder().expireAfterWrite(cacheTimeToLive, -TimeUnit.MILLISECONDS).build(); +TimeUnit.MILLISECONDS).maximumSize(cacheEntityNums).build(); Review Comment: Thanks for your suggestion! I will fix it. > [Federation] Add Cache Entity Nums Limit. > - > > Key: YARN-11663 > URL: https://issues.apache.org/jira/browse/YARN-11663 > Project: Hadoop YARN > Issue Type: Improvement > Components: federation, yarn >Affects Versions: 3.4.0 >Reporter: Yuan Luo >Priority: Major > Labels: pull-request-available > Attachments: image-2024-03-14-18-12-28-426.png, > image-2024-03-14-18-12-49-950.png, image-2024-03-15-10-50-32-860.png > > > !image-2024-03-14-18-12-28-426.png! > !image-2024-03-14-18-12-49-950.png! > hi [~slfan1989] After applying this feature to our prod env, I found the memory > of the router keeps growing over time. This is because after jobs finish, > we won't access the expired keys to trigger the cleanup mechanism. Would it be > better to add a cache maximum number limit? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
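For context on why the maximumSize bound in the diff above fixes the leak: expireAfterWrite alone evicts expired entries lazily, during reads and writes, so entries for finished jobs that are never touched again can sit in memory indefinitely, while a size cap forces eviction as new entries arrive. The sketch below illustrates the size-capped behavior with plain JDK classes (no Guava dependency; the class and key names are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedCacheSketch {
    // Illustrative stand-in for a size-capped cache: once maxEntries is
    // exceeded, the least-recently-accessed entry is evicted on insert,
    // regardless of whether anyone ever reads the stale keys again.
    static <K, V> Map<K, V> newBoundedCache(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, String> cache = newBoundedCache(2);
        cache.put("app-1", "home-cluster-a");
        cache.put("app-2", "home-cluster-b");
        cache.put("app-3", "home-cluster-c"); // evicts app-1
        if (cache.size() != 2 || cache.containsKey("app-1")) {
            throw new AssertionError("expected app-1 to be evicted");
        }
        System.out.println("cache keys: " + cache.keySet());
    }
}
```

Guava's CacheBuilder implements the same idea internally; the point is only that a hard size bound, unlike TTL alone, guarantees bounded memory without relying on future accesses.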
[jira] [Commented] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830106#comment-17830106 ] ASF GitHub Bot commented on YARN-11663: --- hadoop-yetus commented on PR #6662: URL: https://github.com/apache/hadoop/pull/6662#issuecomment-2016545141 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 21s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | xmllint | 0m 0s | | xmllint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 13m 59s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 19m 50s | | trunk passed | | +1 :green_heart: | compile | 8m 57s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 8m 9s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 2m 7s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 44s | | trunk passed | | +1 :green_heart: | javadoc | 1m 38s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +0 :ok: | spotbugs | 0m 32s | | branch/hadoop-project no spotbugs output file (spotbugsXml.xml) | | +1 :green_heart: | shadedclient | 20m 32s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 35s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 0m 50s | | the patch passed | | +1 :green_heart: | compile | 8m 38s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 8m 38s | | the patch passed | | +1 :green_heart: | compile | 8m 8s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 8m 8s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 2m 3s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6662/1/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 164 unchanged - 0 fixed = 166 total (was 164) | | +1 :green_heart: | mvnsite | 1m 45s | | the patch passed | | +1 :green_heart: | javadoc | 1m 30s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 36s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +0 :ok: | spotbugs | 0m 26s | | hadoop-project has no data from spotbugs | | -1 :x: | shadedclient | 21m 4s | | patch has errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 0m 26s | | hadoop-project in the patch passed. | | -1 :x: | unit | 0m 49s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6662/1/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt) | hadoop-yarn-api in the patch passed. | | +1 :green_heart: | unit | 2m 52s | | hadoop-yarn-server-common in the patch passed. | | +1 :green_heart: | asflicense | 0m 42s | | The patch does not generate ASF License warnings. 
| | | | 140m 5s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6662/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6662 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle | | uname | Linux 475c51c6156e 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU
[jira] [Commented] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830088#comment-17830088 ] ASF GitHub Bot commented on YARN-11663: --- luoyuan3471 commented on code in PR #6662: URL: https://github.com/apache/hadoop/pull/6662#discussion_r1536644135 ## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/cache/FederationGuavaCache.java: ## @@ -60,7 +63,7 @@ public void initCache(Configuration pConf, FederationStateStore pStateStore) { // Initialize Cache. cache = CacheBuilder.newBuilder().expireAfterWrite(cacheTimeToLive, -TimeUnit.MILLISECONDS).build(); +TimeUnit.MILLISECONDS).maximumSize(cacheEntityNums).build(); Review Comment: cacheTimeToLive = pConf.getInt(YarnConfiguration.FEDERATION_CACHE_TIME_TO_LIVE_SECS, YarnConfiguration.DEFAULT_FEDERATION_CACHE_TIME_TO_LIVE_SECS); TimeUnit.MILLISECONDS -> TimeUnit.SECONDS > [Federation] Add Cache Entity Nums Limit. > - > > Key: YARN-11663 > URL: https://issues.apache.org/jira/browse/YARN-11663 > Project: Hadoop YARN > Issue Type: Improvement > Components: federation, yarn >Affects Versions: 3.4.0 >Reporter: Yuan Luo >Priority: Major > Labels: pull-request-available > Attachments: image-2024-03-14-18-12-28-426.png, > image-2024-03-14-18-12-49-950.png, image-2024-03-15-10-50-32-860.png > > > !image-2024-03-14-18-12-28-426.png! > !image-2024-03-14-18-12-49-950.png! > hi [~slfan1989] After applying this feature to our prod env, I found the memory > of the router keeps growing over time. This is because after jobs finish, > we won't access the expired keys to trigger the cleanup mechanism. Would it be > better to add a cache maximum number limit? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
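The review comment above flags a unit mismatch: the TTL is read from a *_SECS configuration key but passed to expireAfterWrite with TimeUnit.MILLISECONDS, which would shrink the intended lifetime by a factor of 1000. A quick, self-contained check of the two interpretations (the 300-second value is an arbitrary example, not the actual default):

```java
import java.util.concurrent.TimeUnit;

public class TtlUnitsCheck {
    public static void main(String[] args) {
        long ttlFromConfig = 300; // e.g. a *-secs config key: 5 minutes

        // Intended: the value is in seconds.
        long intendedMs = TimeUnit.SECONDS.toMillis(ttlFromConfig);
        // Buggy: the same number treated as milliseconds.
        long buggyMs = TimeUnit.MILLISECONDS.toMillis(ttlFromConfig);

        System.out.println("intended TTL: " + intendedMs + " ms"); // 300000
        System.out.println("buggy TTL:    " + buggyMs + " ms");    // 300

        if (intendedMs != 1000 * buggyMs) {
            throw new AssertionError("unit mismatch should be a 1000x factor");
        }
    }
}
```

With the buggy interpretation every entry expires in well under a second, so the cache is effectively disabled rather than leaking; either way the TimeUnit must match the config key's unit.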
[jira] [Updated] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11663: -- Issue Type: Improvement (was: Bug) > [Federation] Add Cache Entity Nums Limit. > - > > Key: YARN-11663 > URL: https://issues.apache.org/jira/browse/YARN-11663 > Project: Hadoop YARN > Issue Type: Improvement > Components: federation, yarn >Affects Versions: 3.4.0 >Reporter: Yuan Luo >Priority: Major > Labels: pull-request-available > Attachments: image-2024-03-14-18-12-28-426.png, > image-2024-03-14-18-12-49-950.png, image-2024-03-15-10-50-32-860.png > > > !image-2024-03-14-18-12-28-426.png! > !image-2024-03-14-18-12-49-950.png! > hi [~slfan1989] After applying this feature to our prod env, I found the memory > of the router keeps growing over time. This is because after jobs finish, > we won't access the expired keys to trigger the cleanup mechanism. Would it be > better to add a cache maximum number limit? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11663) [Federation] Add Cache Entity Nums Limit.
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan updated YARN-11663: -- Summary: [Federation] Add Cache Entity Nums Limit. (was: Router cache expansion issue) > [Federation] Add Cache Entity Nums Limit. > - > > Key: YARN-11663 > URL: https://issues.apache.org/jira/browse/YARN-11663 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, yarn >Affects Versions: 3.4.0 >Reporter: Yuan Luo >Priority: Major > Labels: pull-request-available > Attachments: image-2024-03-14-18-12-28-426.png, > image-2024-03-14-18-12-49-950.png, image-2024-03-15-10-50-32-860.png > > > !image-2024-03-14-18-12-28-426.png! > !image-2024-03-14-18-12-49-950.png! > hi [~slfan1989] After applying this feature to our prod env, I found the memory > of the router keeps growing over time. This is because after jobs finish, > we won't access the expired keys to trigger the cleanup mechanism. Would it be > better to add a cache maximum number limit? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-11663) Router cache expansion issue
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated YARN-11663: -- Labels: pull-request-available (was: ) > Router cache expansion issue > > > Key: YARN-11663 > URL: https://issues.apache.org/jira/browse/YARN-11663 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, yarn >Affects Versions: 3.4.0 >Reporter: Yuan Luo >Priority: Major > Labels: pull-request-available > Attachments: image-2024-03-14-18-12-28-426.png, > image-2024-03-14-18-12-49-950.png, image-2024-03-15-10-50-32-860.png > > > !image-2024-03-14-18-12-28-426.png! > !image-2024-03-14-18-12-49-950.png! > hi [~slfan1989] After applying this feature to our prod env, I found the memory > of the router keeps growing over time. This is because after jobs finish, > we won't access the expired keys to trigger the cleanup mechanism. Would it be > better to add a cache maximum number limit? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-11663) Router cache expansion issue
[ https://issues.apache.org/jira/browse/YARN-11663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830084#comment-17830084 ] ASF GitHub Bot commented on YARN-11663: --- slfan1989 opened a new pull request, #6662: URL: https://github.com/apache/hadoop/pull/6662 ### Description of PR JIRA: YARN-11663. [Federation] Add Cache Entity Nums Limit. ### How was this patch tested? ### For code changes: - [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')? - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files? > Router cache expansion issue > > > Key: YARN-11663 > URL: https://issues.apache.org/jira/browse/YARN-11663 > Project: Hadoop YARN > Issue Type: Bug > Components: federation, yarn >Affects Versions: 3.4.0 >Reporter: Yuan Luo >Priority: Major > Attachments: image-2024-03-14-18-12-28-426.png, > image-2024-03-14-18-12-49-950.png, image-2024-03-15-10-50-32-860.png > > > !image-2024-03-14-18-12-28-426.png! > !image-2024-03-14-18-12-49-950.png! > hi [~slfan1989] After applying this feature to our prod env, I found the memory > of the router keeps growing over time. This is because after jobs finish, > we won't access the expired keys to trigger the cleanup mechanism. Would it be > better to add a cache maximum number limit? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-11387) [GPG] YARN GPG mistakenly deleted applicationid
[ https://issues.apache.org/jira/browse/YARN-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830050#comment-17830050 ] ASF GitHub Bot commented on YARN-11387: --- hadoop-yetus commented on PR #6660: URL: https://github.com/apache/hadoop/pull/6660#issuecomment-2016433769 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 18m 5s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 43m 38s | | trunk passed | | +1 :green_heart: | compile | 0m 27s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 26s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 27s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 32s | | trunk passed | | +1 :green_heart: | javadoc | 0m 34s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 28s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 0m 47s | | trunk passed | | +1 :green_heart: | shadedclient | 33m 3s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 20s | | the patch passed | | +1 :green_heart: | compile | 0m 20s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 20s | | the patch passed | | +1 :green_heart: | compile | 0m 18s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 18s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 14s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 20s | | the patch passed | | +1 :green_heart: | javadoc | 0m 20s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 19s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 0m 46s | | the patch passed | | +1 :green_heart: | shadedclient | 32m 50s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 1m 1s | | hadoop-yarn-server-globalpolicygenerator in the patch passed. | | +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. 
| | | | 140m 49s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6660/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6660 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 45f6a5950e77 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 2b2084718031bda6966917176f9b171356cbf459 | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6660/1/testReport/ | | Max. process+thread count | 558 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-globalpolicygenerator U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-globalpolicygenerator | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6660/1/console | | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 | | Powered
[jira] [Commented] (YARN-2024) IOException in AppLogAggregatorImpl does not give stacktrace and leaves aggregated TFile in a bad state.
[ https://issues.apache.org/jira/browse/YARN-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830047#comment-17830047 ] zeekling commented on YARN-2024: I have the same problem in Hadoop 3.1.1 !image-2024-03-23-17-22-00-057.png! > IOException in AppLogAggregatorImpl does not give stacktrace and leaves > aggregated TFile in a bad state. > > > Key: YARN-2024 > URL: https://issues.apache.org/jira/browse/YARN-2024 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 0.23.10, 2.4.0 >Reporter: Eric Payne >Assignee: Xuan Gong >Priority: Major > > Multiple issues were encountered when AppLogAggregatorImpl encountered an > IOException in AppLogAggregatorImpl#uploadLogsForContainer while aggregating > yarn-logs for an application that had very large (>150G each) error logs. > - An IOException was encountered during the LogWriter#append call, and a > message was printed, but no stacktrace was provided. Message: "ERROR: > Couldn't upload logs for container_n_nnn_nn_nn. Skipping > this container." > - After the IOException, the TFile is in a bad state, so subsequent calls to > LogWriter#append fail with the following stacktrace: > 2014-04-16 13:29:09,772 [LogAggregationService #17907] ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[LogAggregationService #17907,5,main] threw an Exception. > java.lang.IllegalStateException: Incorrect state to start a new key: IN_VALUE > at > org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:528) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.append(AggregatedLogFormat.java:262) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:128) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:164) > ... 
> - At this point, the yarn-logs cleaner still thinks the thread is > aggregating, so the huge yarn-logs never get cleaned up for that application. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
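The failure mode described in the issue, where one failed append wedges the writer so every later append throws IllegalStateException, can be sketched as a toy state machine. The class and states below are illustrative only; TFile's real writer has more states (READY, IN_KEY, IN_VALUE, END_KEY) and the simulated failure stands in for a real IOException during upload:

```java
public class StatefulWriterSketch {
    enum State { READY, IN_VALUE }

    static class ToyWriter {
        private State state = State.READY;

        // Append one key/value record. If writing the value fails mid-way,
        // the writer is left in IN_VALUE and never returns to READY --
        // mirroring how the TFile writer gets stuck after a failed append.
        void append(String key, String value) {
            if (state != State.READY) {
                throw new IllegalStateException(
                    "Incorrect state to start a new key: " + state);
            }
            state = State.IN_VALUE;
            if (value == null) { // simulate an I/O failure mid-record
                throw new RuntimeException("simulated upload failure");
            }
            state = State.READY; // reached only when the record completes
        }
    }

    public static void main(String[] args) {
        ToyWriter w = new ToyWriter();
        w.append("container_1", "log data");  // succeeds
        try {
            w.append("container_2", null);    // fails mid-record
        } catch (RuntimeException expected) { }
        try {
            w.append("container_3", "more logs"); // writer is now stuck
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is why the issue asks for the writer (or the whole file) to be abandoned after the first failure instead of retried: the partial record cannot be completed, so the state machine can never recover.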
[jira] [Comment Edited] (YARN-2024) IOException in AppLogAggregatorImpl does not give stacktrace and leaves aggregated TFile in a bad state.
[ https://issues.apache.org/jira/browse/YARN-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830047#comment-17830047 ] zeekling edited comment on YARN-2024 at 3/23/24 9:23 AM: - I have the same problem in Hadoop 3.1.1 2024-02-17 01:09:21,112 | INFO | SchedulerEventDispatcher:Event Processor | container_e65_1707884856539_27553_01_66 Container Transitioned from NEW to COMPLETED | RMContainerImpl.java:480 2024-02-17 01:09:21,112 | FATAL | RM ApplicationHistory dispatcher | Error in dispatcher thread | AsyncDispatcher.java:233 java.lang.IllegalStateException: Incorrect state to start a new key: END_KEY at org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:530) at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.writeHistoryData(FileSystemApplicationHistoryStore.java:756) at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.containerStarted(FileSystemApplicationHistoryStore.java:523) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:198) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:304) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:299) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:227) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:144) at java.lang.Thread.run(Thread.java:748) was (Author: JIRAUSER299659): I have the same problem in Hadoop 3.1.1 2024-02-17 01:09:21,112 | INFO | SchedulerEventDispatcher:Event Processor | container_e65_1707884856539_27553_01_66 Container Transitioned from NEW to COMPLETED | RMContainerImpl.java:480 2024-02-17 01:09:21,112 | FATAL | RM 
ApplicationHistory dispatcher | Error in dispatcher thread | AsyncDispatcher.java:233 java.lang.IllegalStateException: Incorrect state to start a new key: END_KEY at org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:530) at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.writeHistoryData(FileSystemApplicationHistoryStore.java:756) at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.containerStarted(FileSystemApplicationHistoryStore.java:523) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:198) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:304) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:299) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:227) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:144) at java.lang.Thread.run(Thread.java:748) > IOException in AppLogAggregatorImpl does not give stacktrace and leaves > aggregated TFile in a bad state. > > > Key: YARN-2024 > URL: https://issues.apache.org/jira/browse/YARN-2024 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 0.23.10, 2.4.0 >Reporter: Eric Payne >Assignee: Xuan Gong >Priority: Major > > Multiple issues were encountered when AppLogAggregatorImpl encountered an > IOException in AppLogAggregatorImpl#uploadLogsForContainer while aggregating > yarn-logs for an application that had very large (>150G each) error logs. > - An IOException was encountered during the LogWriter#append call, and a > message was printed, but no stacktrace was provided. 
Message: "ERROR: > Couldn't upload logs for container_n_nnn_nn_nn. Skipping > this container." > - After the IOException, the TFile is in a bad state, so subsequent calls to > LogWriter#append fail with the following stacktrace: > 2014-04-16 13:29:09,772 [LogAggregationService #17907] ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[LogAggregationService #17907,5,main] threw an Exception. > java.lang.IllegalStateException: Incorrect state to start a new key: IN_VALUE
[jira] [Comment Edited] (YARN-2024) IOException in AppLogAggregatorImpl does not give stacktrace and leaves aggregated TFile in a bad state.
[ https://issues.apache.org/jira/browse/YARN-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830047#comment-17830047 ] zeekling edited comment on YARN-2024 at 3/23/24 9:23 AM: - I have the same problem in Hadoop 3.1.1 2024-02-17 01:09:21,112 | INFO | SchedulerEventDispatcher:Event Processor | container_e65_1707884856539_27553_01_66 Container Transitioned from NEW to COMPLETED | RMContainerImpl.java:480 2024-02-17 01:09:21,112 | FATAL | RM ApplicationHistory dispatcher | Error in dispatcher thread | AsyncDispatcher.java:233 java.lang.IllegalStateException: Incorrect state to start a new key: END_KEY at org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:530) at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore$HistoryFileWriter.writeHistoryData(FileSystemApplicationHistoryStore.java:756) at org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore.containerStarted(FileSystemApplicationHistoryStore.java:523) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter.handleWritingApplicationHistoryEvent(RMApplicationHistoryWriter.java:198) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:304) at org.apache.hadoop.yarn.server.resourcemanager.ahs.RMApplicationHistoryWriter$ForwardingEventHandler.handle(RMApplicationHistoryWriter.java:299) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:227) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:144) at java.lang.Thread.run(Thread.java:748) was (Author: JIRAUSER299659): I have the same problem in Hadoop 3.1.1 !image-2024-03-23-17-22-00-057.png! > IOException in AppLogAggregatorImpl does not give stacktrace and leaves > aggregated TFile in a bad state. 
> > > Key: YARN-2024 > URL: https://issues.apache.org/jira/browse/YARN-2024 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 0.23.10, 2.4.0 >Reporter: Eric Payne >Assignee: Xuan Gong >Priority: Major > > Multiple issues were encountered when AppLogAggregatorImpl encountered an > IOException in AppLogAggregatorImpl#uploadLogsForContainer while aggregating > yarn-logs for an application that had very large (>150G each) error logs. > - An IOException was encountered during the LogWriter#append call, and a > message was printed, but no stacktrace was provided. Message: "ERROR: > Couldn't upload logs for container_n_nnn_nn_nn. Skipping > this container." > - After the IOException, the TFile is in a bad state, so subsequent calls to > LogWriter#append fail with the following stacktrace: > 2014-04-16 13:29:09,772 [LogAggregationService #17907] ERROR > org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread > Thread[LogAggregationService #17907,5,main] threw an Exception. > java.lang.IllegalStateException: Incorrect state to start a new key: IN_VALUE > at > org.apache.hadoop.io.file.tfile.TFile$Writer.prepareAppendKey(TFile.java:528) > at > org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.append(AggregatedLogFormat.java:262) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:128) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:164) > ... > - At this point, the yarn-logs cleaner still thinks the thread is > aggregating, so the huge yarn-logs never get cleaned up for that application. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-11387) [GPG] YARN GPG mistakenly deleted applicationid
[ https://issues.apache.org/jira/browse/YARN-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830035#comment-17830035 ] ASF GitHub Bot commented on YARN-11387: --- slfan1989 opened a new pull request, #6660: URL: https://github.com/apache/hadoop/pull/6660 ### Description of PR JIRA: YARN-11387. [GPG] YARN GPG mistakenly deleted applicationid. ### How was this patch tested? ### For code changes: - [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')? - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files? > [GPG] YARN GPG mistakenly deleted applicationid > --- > > Key: YARN-11387 > URL: https://issues.apache.org/jira/browse/YARN-11387 > Project: Hadoop YARN > Issue Type: Sub-task > Components: federation >Affects Versions: 3.2.1, 3.4.0 >Reporter: zhangjunj >Assignee: Shilun Fan >Priority: Major > Labels: federation, gpg, pull-request-available > Attachments: YARN-11387-YARN-11387.v1.patch, > yarn-gpg-mistakenly-deleted-applicationid.png > > Original Estimate: 168h > Remaining Estimate: 168h > > In [YARN-7599|https://issues.apache.org/jira/browse/YARN-7599], the > Federation can delete expired applicationids, but YARN GPG uses the getRouter() > method to obtain application information for multiple clusters. If there are > more than 200,000 applicationids, it is not possible to > pull all the applicationid information at one time, resulting in the > possibility of accidental deletion. The following error is reported for the Spark > component. 
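The deletion hazard described in this issue can be sketched in a few lines. The names below (getRunningAppsFromRouter, ROUTER_PAGE_CAP, unsafeCleanup) are hypothetical stand-ins, not the real GPG code; the point is only that treating "absent from a capped listing" as "finished" deletes live applications once the cluster exceeds the cap:

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch of the GPG cleanup hazard described above (hypothetical names,
// not the real GPG code). If the Router can only return a capped page of
// running applications, deleting everything missing from that page
// removes live applications once the cluster exceeds the cap.
public class GpgCleanupSketch {

    static final int ROUTER_PAGE_CAP = 3; // stand-in for the real limit

    // Simulates the Router returning at most one page of running apps.
    static Set<String> getRunningAppsFromRouter(Set<String> actuallyRunning) {
        Set<String> page = new TreeSet<>();
        for (String app : actuallyRunning) {
            if (page.size() == ROUTER_PAGE_CAP) break;
            page.add(app);
        }
        return page;
    }

    // Unsafe: deletes every state-store entry missing from the (possibly
    // truncated) listing. Returns the set of deleted application ids.
    static Set<String> unsafeCleanup(Set<String> stateStore, Set<String> running) {
        Set<String> listed = getRunningAppsFromRouter(running);
        Set<String> deleted = new TreeSet<>(stateStore);
        deleted.removeAll(listed);
        return deleted;
    }

    public static void main(String[] args) {
        Set<String> running = new TreeSet<>(
            Set.of("app_1", "app_2", "app_3", "app_4", "app_5"));
        Set<String> stateStore = new TreeSet<>(running);
        // app_4 and app_5 are still running but fell outside the page,
        // so the unsafe cleanup deletes them anyway.
        System.out.println(unsafeCleanup(stateStore, running)); // [app_4, app_5]
    }
}
```

A safer cleanup would delete an application only when the listing is known to be complete, or confirm each candidate individually before removal.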
[jira] [Commented] (YARN-11387) [GPG] YARN GPG mistakenly deleted applicationid
[ https://issues.apache.org/jira/browse/YARN-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830020#comment-17830020 ] ASF GitHub Bot commented on YARN-11387: --- hadoop-yetus commented on PR #6473: URL: https://github.com/apache/hadoop/pull/6473#issuecomment-2016250984 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 46s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 49m 44s | | trunk passed | | +1 :green_heart: | compile | 0m 25s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 24s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 24s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 28s | | trunk passed | | +1 :green_heart: | javadoc | 0m 32s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 25s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 0m 45s | | trunk passed | | +1 :green_heart: | shadedclient | 37m 53s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 26s | | the patch passed | | +1 :green_heart: | compile | 0m 18s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 18s | | the patch passed | | +1 :green_heart: | compile | 0m 16s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 16s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 13s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-globalpolicygenerator.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6473/5/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-globalpolicygenerator.txt) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-globalpolicygenerator: The patch generated 10 new + 0 unchanged - 0 fixed = 10 total (was 0) | | +1 :green_heart: | mvnsite | 0m 17s | | the patch passed | | +1 :green_heart: | javadoc | 0m 18s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 17s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 0m 43s | | the patch passed | | +1 :green_heart: | shadedclient | 37m 57s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 0m 57s | | hadoop-yarn-server-globalpolicygenerator in the patch passed. | | +1 :green_heart: | asflicense | 0m 35s | | The patch does not generate ASF License warnings. 
| | | | 139m 4s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6473/5/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6473 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 3c39262bfb8b 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 0064efa587f65c2978d70c1b8a9ee0c6748b83aa | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6473/5
[jira] [Comment Edited] (YARN-11387) [GPG] YARN GPG mistakenly deleted applicationid
[ https://issues.apache.org/jira/browse/YARN-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830001#comment-17830001 ] Shilun Fan edited comment on YARN-11387 at 3/22/24 11:08 PM: - I will resubmit PR to follow up on this issue. was (Author: slfan1989): I will resubmit PR to follow up on this issue.I will resubmit PR to follow up on this issue.
[jira] [Commented] (YARN-11387) [GPG] YARN GPG mistakenly deleted applicationid
[ https://issues.apache.org/jira/browse/YARN-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830001#comment-17830001 ] Shilun Fan commented on YARN-11387: --- I will resubmit PR to follow up on this issue.I will resubmit PR to follow up on this issue.
[jira] [Commented] (YARN-11387) [GPG] YARN GPG mistakenly deleted applicationid
[ https://issues.apache.org/jira/browse/YARN-11387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1783#comment-1783 ] ASF GitHub Bot commented on YARN-11387: --- slfan1989 closed pull request #6473: YARN-11387. [GPG] YARN GPG mistakenly deleted applicationid. URL: https://github.com/apache/hadoop/pull/6473
[jira] [Updated] (YARN-11665) Hive jobs support aggregating logs according to real users
[ https://issues.apache.org/jira/browse/YARN-11665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zeekling updated YARN-11665: Issue Type: Wish (was: Improvement) > Hive jobs support aggregating logs according to real users > -- > > Key: YARN-11665 > URL: https://issues.apache.org/jira/browse/YARN-11665 > Project: Hadoop YARN > Issue Type: Wish > Components: log-aggregation >Reporter: zeekling >Priority: Major > > Currently, hive job logs are in /tmp/logs/hive/bucket/appId, can we aggregate > logs according to the real users running hive jobs, like /tmp/logs/hive/\{real > user}/bucket/appId
[jira] [Updated] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Syed Shameerur Rahman updated YARN-11664: - Description: In principle Hadoop Yarn is independent of HDFS. It can work with any filesystem. Currently there exists some code dependency between Yarn and HDFS. This dependency requires Yarn to bring in some of the HDFS binaries/jars to its class path. The idea behind this jira is to remove this dependency so that Yarn can run without HDFS binaries/jars. *Scope* 1. Non-test classes are considered 2. Some test classes which come in as transitive dependencies are considered *Out of scope* 1. All test classes in the Yarn module are not considered A quick search in the Yarn module revealed the following HDFS dependencies 1. Constants {code:java} import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier; import org.apache.hadoop.hdfs.DFSConfigKeys;{code} 2. Exception {code:java} import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;{code} 3. Utility {code:java} import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code} Both Yarn and HDFS depend on the *hadoop-common* module, so: * Constant variables and utility classes can be moved to *hadoop-common* * Instead of DSQuotaExceededException, use the parent exception ClusterStorageCapacityExceeded was: In principle Hadoop Yarn is independent of HDFS. It can work with any filesystem. Currently there exists some code dependency for Yarn with HDFS. This dependency requires Yarn to bring in some of the HDFS binaries/jars to its class path. The idea behind this jira is to remove this dependency so that Yarn can run without HDFS binaries/jars *Scope* 1. Non test classes are considered 2. Some test classes which comes as transitive dependency are considered *Out of scope* 1. All test classes in Yarn module is not considered A quick search in Yarn module revealed following HDFS dependencies 1. 
Constants {code:java} import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier; import org.apache.hadoop.hdfs.DFSConfigKeys;{code} 2. Exception {code:java} import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException; 3. Utility {code:java} import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code} Both Yarn and HDFS depends on *hadoop-common* module, * Constants variables and Utility classes can be moved to *hadoop-common* * Instead of DSQuotaExceededException, Use the parent exception ClusterStoragrCapacityExceeded
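The last bullet above, catching a parent exception type from hadoop-common instead of the HDFS-specific DSQuotaExceededException, can be sketched with a toy hierarchy. The class names below are hypothetical stand-ins, not the real Hadoop types; only the refactoring pattern is the point: code that catches the parent type no longer needs the subclass (or its jar) on the compile-time classpath.

```java
// Toy hierarchy illustrating the refactoring proposed above. These class
// names are stand-ins: QuotaExceededException plays the role of the
// HDFS-specific DSQuotaExceededException, CapacityExceededException the
// role of its parent in the shared module. Catching the parent type lets
// the caller compile without any import from the HDFS module.
public class ParentExceptionSketch {

    // Lives in the shared module (stands in for the hadoop-common parent).
    static class CapacityExceededException extends RuntimeException {
        CapacityExceededException(String msg) { super(msg); }
    }

    // HDFS-specific subclass (stands in for DSQuotaExceededException).
    static class QuotaExceededException extends CapacityExceededException {
        QuotaExceededException(String msg) { super(msg); }
    }

    static String handleUpload(Runnable upload) {
        try {
            upload.run();
            return "ok";
        } catch (CapacityExceededException e) { // parent type: no HDFS import
            return "capacity exceeded: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The HDFS-flavoured exception is still caught via its parent.
        System.out.println(handleUpload(() -> {
            throw new QuotaExceededException("disk quota");
        }));
        // → capacity exceeded: disk quota
    }
}
```

The call sites keep working because the subclass is still caught through its parent; only the compile-time dependency on the HDFS module disappears.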
[jira] [Commented] (YARN-11261) Upgrade JUnit from 4 to 5 in hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-11261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829637#comment-17829637 ] ASF GitHub Bot commented on YARN-11261: --- hadoop-yetus commented on PR #6652: URL: https://github.com/apache/hadoop/pull/6652#issuecomment-2013027918 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 44s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 0m 21s | | Maven dependency ordering for branch | | -1 :x: | mvninstall | 0m 23s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-mvninstall-root.txt) | root in trunk failed. | | -1 :x: | compile | 0m 23s | [/branch-compile-root-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-compile-root-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | root in trunk failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | -1 :x: | compile | 0m 23s | [/branch-compile-root-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | root in trunk failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. 
| | -0 :warning: | checkstyle | 0m 22s | [/buildtool-branch-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/buildtool-branch-checkstyle-root.txt) | The patch fails to run checkstyle in root | | -1 :x: | mvnsite | 0m 24s | [/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt) | hadoop-common in trunk failed. | | -1 :x: | mvnsite | 0m 23s | [/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-server-resourcemanager in trunk failed. | | -1 :x: | javadoc | 0m 23s | [/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-common in trunk failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | -1 :x: | javadoc | 0m 23s | [/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-yarn-server-resourcemanager in trunk failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. 
| | -1 :x: | javadoc | 0m 24s | [/branch-javadoc-hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-javadoc-hadoop-common-project_hadoop-common-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | hadoop-common in trunk failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. | | -1 :x: | javadoc | 0m 23s | [/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6652/1/artifact/out/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | hadoop-yarn-server-resourcemanager in trunk failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. | | -1 :x
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829627#comment-17829627 ] ASF GitHub Bot commented on YARN-11664: --- hadoop-yetus commented on PR #6631: URL: https://github.com/apache/hadoop/pull/6631#issuecomment-2012824210 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 20m 54s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 14s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 51m 14s | | trunk passed | | -1 :x: | compile | 17m 24s | [/branch-compile-root-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-compile-root-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | root in trunk failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | -1 :x: | compile | 1m 34s | [/branch-compile-root-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | root in trunk failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. 
| | -0 :warning: | checkstyle | 0m 28s | [/buildtool-branch-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/buildtool-branch-checkstyle-root.txt) | The patch fails to run checkstyle in root | | -1 :x: | mvnsite | 0m 30s | [/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt) | hadoop-common in trunk failed. | | -1 :x: | mvnsite | 0m 30s | [/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs-client.txt) | hadoop-hdfs-client in trunk failed. | | -1 :x: | mvnsite | 0m 31s | [/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-mvnsite-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in trunk failed. | | -1 :x: | mvnsite | 0m 31s | [/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt) | hadoop-yarn-common in trunk failed. | | -1 :x: | mvnsite | 0m 32s | [/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt) | hadoop-yarn-server-nodemanager in trunk failed. 
| | -1 :x: | mvnsite | 0m 30s | [/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core.txt) | hadoop-yarn-services-core in trunk failed. | | -1 :x: | javadoc | 0m 30s | [/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-common in trunk failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | -1 :x: | javadoc | 0m 31s | [/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/5/artifact/out/branch-javadoc-hadoop-hdfs-project_hadoop-hdfs-client-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | hadoop-hdfs-client in trunk failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. | | -1 :x
[jira] [Commented] (YARN-11261) Upgrade JUnit from 4 to 5 in hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-11261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829581#comment-17829581 ] ASF GitHub Bot commented on YARN-11261: --- K0K0V0K opened a new pull request, #6652: URL: https://github.com/apache/hadoop/pull/6652 ### Description of PR ### How was this patch tested? ### For code changes: - [ ] Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')? - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation? - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files? > Upgrade JUnit from 4 to 5 in hadoop-yarn-server-web-proxy > - > > Key: YARN-11261 > URL: https://issues.apache.org/jira/browse/YARN-11261 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test, yarn >Affects Versions: 3.3.4 >Reporter: Ashutosh Gupta >Assignee: Ashutosh Gupta >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > >
[jira] [Commented] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829566#comment-17829566 ] ASF GitHub Bot commented on YARN-11667: --- hadoop-yetus commented on PR #6648: URL: https://github.com/apache/hadoop/pull/6648#issuecomment-2012334945 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 17m 19s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 13m 51s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 36m 33s | | trunk passed | | +1 :green_heart: | compile | 8m 19s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 7m 31s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 2m 0s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 46s | | trunk passed | | +1 :green_heart: | javadoc | 1m 42s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 31s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 3m 19s | | trunk passed | | +1 :green_heart: | shadedclient | 39m 17s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 2s | | the patch passed | | +1 :green_heart: | compile | 7m 29s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 7m 29s | | the patch passed | | +1 :green_heart: | compile | 7m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 7m 24s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 1m 52s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 34s | | the patch passed | | +1 :green_heart: | javadoc | 1m 28s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 21s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 3m 31s | | the patch passed | | +1 :green_heart: | shadedclient | 39m 27s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 1m 9s | | hadoop-yarn-api in the patch passed. | | +1 :green_heart: | unit | 28m 39s | | hadoop-yarn-client in the patch passed. | | +1 :green_heart: | asflicense | 1m 1s | | The patch does not generate ASF License warnings. 
| | | | 236m 38s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/5/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6648 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 08f70c235076 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 4be56d2c6590fe652f15a0d896645d9f0ef62982 | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/5/testReport/ | | Max. process+thread count | 586 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829562#comment-17829562 ] ASF GitHub Bot commented on YARN-11664: --- hadoop-yetus commented on PR #6631: URL: https://github.com/apache/hadoop/pull/6631#issuecomment-2012303030 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 6m 54s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 13m 42s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 20m 18s | | trunk passed | | +1 :green_heart: | compile | 8m 57s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 8m 33s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 2m 10s | | trunk passed | | +1 :green_heart: | mvnsite | 4m 20s | | trunk passed | | +1 :green_heart: | javadoc | 3m 41s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 4m 13s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | -1 :x: | spotbugs | 1m 26s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/6/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html) | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs warnings. 
| | -1 :x: | spotbugs | 0m 45s | [/branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/6/artifact/out/branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core-warnings.html) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core in trunk has 1 extant spotbugs warnings. | | +1 :green_heart: | shadedclient | 21m 15s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 20s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 2m 42s | | the patch passed | | +1 :green_heart: | compile | 10m 38s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 10m 38s | | the patch passed | | +1 :green_heart: | compile | 8m 30s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 8m 30s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 2m 4s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/6/artifact/out/results-checkstyle-root.txt) | root: The patch generated 2 new + 525 unchanged - 2 fixed = 527 total (was 527) | | +1 :green_heart: | mvnsite | 3m 44s | | the patch passed | | +1 :green_heart: | javadoc | 3m 17s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 3m 56s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 8m 55s | | the patch passed | | +1 :green_heart: | shadedclient | 26m 38s | | patch has no errors when building and testing our client artifacts. 
| _ Other Tests _ | | +1 :green_heart: | unit | 16m 5s | | hadoop-common in the patch passed. | | +1 :green_heart: | unit | 1m 57s | | hadoop-hdfs-client in the patch passed. | | -1 :x: | unit | 202m 55s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch passed. | | +1 :green_heart: | unit | 4m 43s | | hadoop-yarn-common in the patch passed. | | +1 :green_heart: | unit | 22m 5s | | hadoop
[jira] [Resolved] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qiuliang resolved YARN-11667. - Resolution: Won't Do > Federation: ResourceRequestComparator occurs NPE when using low version of > hadoop submit application > > > Key: YARN-11667 > URL: https://issues.apache.org/jira/browse/YARN-11667 > Project: Hadoop YARN > Issue Type: Bug > Components: amrmproxy >Affects Versions: 3.4.0 >Reporter: qiuliang >Priority: Major > Labels: pull-request-available > > When an application is submitted using a lower version of Hadoop, the > ResourceRequest built by the AM has no ExecutionTypeRequest. After the > ResourceRequest is submitted to AMRMProxy, an NPE occurs when AMRMProxy > reconstructs the AllocateRequest to add the ResourceRequest to its ask. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
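For context, the NPE arises because requests built by older clients carry no ExecutionTypeRequest, and a comparator that dereferences that field unconditionally throws. The closed PR's actual diff is not reproduced here; the following is only a hypothetical sketch of a null-safe comparison, with minimal stand-in classes instead of the real org.apache.hadoop.yarn.api.records types:

```java
import java.util.Comparator;

// Stand-in types for illustration only; the real ResourceRequest and
// ExecutionTypeRequest live in org.apache.hadoop.yarn.api.records and
// carry many more fields.
class ExecutionTypeRequest {
    final String executionType;
    ExecutionTypeRequest(String executionType) { this.executionType = executionType; }
}

class ResourceRequest {
    private final ExecutionTypeRequest execTypeRequest; // null for older clients
    ResourceRequest(ExecutionTypeRequest e) { this.execTypeRequest = e; }
    ExecutionTypeRequest getExecutionTypeRequest() { return execTypeRequest; }
}

public class NullSafeComparator {
    // Null-safe key extraction: requests lacking an ExecutionTypeRequest
    // map to the empty string and sort first instead of throwing an NPE.
    static final Comparator<ResourceRequest> BY_EXEC_TYPE =
        Comparator.comparing((ResourceRequest r) ->
            r.getExecutionTypeRequest() == null
                ? "" : r.getExecutionTypeRequest().executionType);

    public static void main(String[] args) {
        ResourceRequest legacy = new ResourceRequest(null); // old-client request
        ResourceRequest modern =
            new ResourceRequest(new ExecutionTypeRequest("GUARANTEED"));
        // Comparing no longer dereferences a null ExecutionTypeRequest.
        System.out.println(BY_EXEC_TYPE.compare(legacy, modern) < 0); // prints "true"
    }
}
```

The trade-off debated on the PR stands: such a guard preserves compatibility with old clients, at the cost of YARN carrying defensive code for requests the current protocol always populates.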
[jira] [Commented] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829552#comment-17829552 ] ASF GitHub Bot commented on YARN-11667: --- qiuliang988 closed pull request #6648: YARN-11667. Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application URL: https://github.com/apache/hadoop/pull/6648
[jira] [Commented] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829550#comment-17829550 ] ASF GitHub Bot commented on YARN-11667: --- qiuliang988 commented on PR #6648: URL: https://github.com/apache/hadoop/pull/6648#issuecomment-2012249877 > @qiuliang988 Thank you for your contribution! But I'm sorry, I do not agree with this change. We cannot always maintain backward compatibility, and the best approach would be to find a way to upgrade the version. @slfan1989 Thanks for your advice! I will close this PR.
[jira] [Commented] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829548#comment-17829548 ] ASF GitHub Bot commented on YARN-11667: --- hadoop-yetus commented on PR #6648: URL: https://github.com/apache/hadoop/pull/6648#issuecomment-2012241859 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 30s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 0s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 31m 10s | | trunk passed | | +1 :green_heart: | compile | 7m 36s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 6m 57s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 1m 57s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 52s | | trunk passed | | +1 :green_heart: | javadoc | 1m 45s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 35s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 3m 24s | | trunk passed | | +1 :green_heart: | shadedclient | 33m 38s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 32s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 3s | | the patch passed | | +1 :green_heart: | compile | 6m 50s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 6m 50s | | the patch passed | | +1 :green_heart: | compile | 6m 58s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 6m 58s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 1m 47s | | the patch passed | | +1 :green_heart: | mvnsite | 1m 37s | | the patch passed | | +1 :green_heart: | javadoc | 1m 31s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 25s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 3m 33s | | the patch passed | | +1 :green_heart: | shadedclient | 33m 56s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 1m 11s | | hadoop-yarn-api in the patch passed. | | +1 :green_heart: | unit | 29m 7s | | hadoop-yarn-client in the patch passed. | | +1 :green_heart: | asflicense | 1m 4s | | The patch does not generate ASF License warnings. 
| | | | 201m 44s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6648 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 1de32bf9fbba 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 4be56d2c6590fe652f15a0d896645d9f0ef62982 | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/4/testReport/ | | Max. process+thread count | 661 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: hadoop-yarn-project/hadoop-yarn
[jira] [Commented] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829542#comment-17829542 ] ASF GitHub Bot commented on YARN-11667: --- slfan1989 commented on PR #6648: URL: https://github.com/apache/hadoop/pull/6648#issuecomment-2012203074 @qiuliang988 Thank you for your contribution! But I'm sorry, I do not agree with this change. We cannot always maintain backward compatibility, and the best approach would be to find a way to upgrade the version.
[jira] [Commented] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829493#comment-17829493 ] ASF GitHub Bot commented on YARN-11667: --- hadoop-yetus commented on PR #6648: URL: https://github.com/apache/hadoop/pull/6648#issuecomment-2011932156 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 36s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 1s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 36m 50s | | trunk passed | | +1 :green_heart: | compile | 8m 8s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 7m 15s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 1m 54s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 40s | | trunk passed | | +1 :green_heart: | javadoc | 1m 33s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 23s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 3m 23s | | trunk passed | | +1 :green_heart: | shadedclient | 37m 2s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 3s | | the patch passed | | +1 :green_heart: | compile | 7m 15s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 7m 15s | | the patch passed | | +1 :green_heart: | compile | 7m 6s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 7m 6s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 1m 49s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/3/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) | hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 14 unchanged - 0 fixed = 15 total (was 14) | | +1 :green_heart: | mvnsite | 1m 31s | | the patch passed | | +1 :green_heart: | javadoc | 1m 20s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 18s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 3m 38s | | the patch passed | | +1 :green_heart: | shadedclient | 35m 6s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 1m 7s | | hadoop-yarn-api in the patch passed. | | +1 :green_heart: | unit | 28m 53s | | hadoop-yarn-client in the patch passed. | | +1 :green_heart: | asflicense | 1m 4s | | The patch does not generate ASF License warnings. 
| | | | 212m 21s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6648 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux f5faf9fb4f4d 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / a85f9f54c1e6bed41227a2f48587e2b42d821c0a | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci
[jira] [Commented] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829456#comment-17829456 ] ASF GitHub Bot commented on YARN-11667: --- hadoop-yetus commented on PR #6648: URL: https://github.com/apache/hadoop/pull/6648#issuecomment-2011738526 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 21s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 13m 45s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 19m 52s | | trunk passed | | +1 :green_heart: | compile | 3m 40s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 3m 28s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 1m 2s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 12s | | trunk passed | | +1 :green_heart: | javadoc | 1m 7s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 1m 5s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 2m 2s | | trunk passed | | +1 :green_heart: | shadedclient | 20m 57s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 23s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 0m 37s | | the patch passed | | +1 :green_heart: | compile | 3m 35s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 3m 35s | | the patch passed | | +1 :green_heart: | compile | 3m 32s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 3m 32s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 57s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/2/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) | hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 14 unchanged - 0 fixed = 15 total (was 14) | | +1 :green_heart: | mvnsite | 1m 1s | | the patch passed | | +1 :green_heart: | javadoc | 0m 52s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 52s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 2m 0s | | the patch passed | | +1 :green_heart: | shadedclient | 21m 30s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 0m 42s | | hadoop-yarn-api in the patch passed. | | +1 :green_heart: | unit | 24m 50s | | hadoop-yarn-client in the patch passed. | | +1 :green_heart: | asflicense | 0m 32s | | The patch does not generate ASF License warnings. 
| | | | 134m 20s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6648 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux d6c3acd693a2 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 42e05bb0e8c7db7f62445feeb685f08e934d49ac | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci
[jira] [Resolved] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia resolved YARN-11626. -- Fix Version/s: 3.5.0 Resolution: Fixed > Optimization of the safeDelete operation in ZKRMStateStore > -- > > Key: YARN-11626 > URL: https://issues.apache.org/jira/browse/YARN-11626 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.0.0-alpha4, 3.1.1, 3.3.0 >Reporter: wangzhihui >Priority: Minor > Labels: pull-request-available > Fix For: 3.5.0 > > > h1. Description > * It can be observed that removing the app info started at 06:17:20, but the > NoNodeException was received at 06:17:35. > * During that 15s interval, Curator was retrying the metadata operation. Due > to the non-idempotent nature of the ZooKeeper delete operation, in one of > the retry attempts the operation succeeded but no response was received. The > next retry then resulted in a NoNodeException, triggering the > STATE_STORE_FENCED event and ultimately causing the current ResourceManager > to switch to standby. 
> {code:java} > 2023-10-28 06:17:20,359 INFO recovery.RMStateStore > (RMStateStore.java:transition(333)) - Removing info for app: > application_1697410508608_140368 > 2023-10-28 06:17:20,359 INFO resourcemanager.RMAppManager > (RMAppManager.java:checkAppNumCompletedLimit(303)) - Application should be > expired, max number of completed apps kept in memory met: > maxCompletedAppsInMemory = 1000, removing app > application_1697410508608_140368 from memory: > 2023-10-28 06:17:35,665 ERROR recovery.RMStateStore > (RMStateStore.java:transition(337)) - Error removing app: > application_1697410508608_140368 > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > 2023-10-28 06:17:35,666 INFO recovery.RMStateStore > (RMStateStore.java:handleStoreEvent(1147)) - RMStateStore state change from > ACTIVE to FENCED > 2023-10-28 06:17:35,666 ERROR resourcemanager.ResourceManager > (ResourceManager.java:handle(898)) - Received RMFatalEvent of type > STATE_STORE_FENCED, caused by > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode > 2023-10-28 06:17:35,666 INFO resourcemanager.ResourceManager > (ResourceManager.java:transitionToStandby(1309)) - Transitioning to standby > state > {code} > h1. Solution > The NoNodeException clearly indicates that the znode no longer exists, so we > can safely ignore this exception and avoid the larger cluster impact caused > by a ResourceManager failover. > h1. Other > The same issue in safeCreate also needs to be discussed and optimized.
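The "treat NoNodeException as success" idea described in the Solution section can be sketched as follows. This is only an illustration under stated assumptions, not the actual ZKRMStateStore code: the Zk interface and NoNodeException class below are minimal stand-ins for the Curator-backed delete call and org.apache.zookeeper.KeeperException.NoNodeException.

```java
// Stand-in for org.apache.zookeeper.KeeperException.NoNodeException;
// real code would catch the ZooKeeper class instead.
class NoNodeException extends Exception {}

public class SafeDelete {
    // Minimal stand-in for the delete call ZKRMStateStore issues via Curator.
    interface Zk { void delete(String path) throws NoNodeException; }

    // ZooKeeper deletes are not idempotent: if a first attempt succeeded but
    // its ack was lost, the retried delete gets NoNodeException. Since
    // "znode already gone" is the desired end state, treat it as success
    // rather than fencing the ResourceManager.
    static boolean safeDelete(Zk zk, String path) {
        try {
            zk.delete(path);
            return true;   // deleted on this attempt
        } catch (NoNodeException e) {
            return false;  // already gone; swallow instead of going FENCED
        }
    }

    public static void main(String[] args) {
        Zk alwaysGone = path -> { throw new NoNodeException(); };
        // No exception escapes even though the znode is already gone.
        System.out.println(safeDelete(alwaysGone, "/rmstore/app_1")); // prints "false"
    }
}
```

The analogous question for safeCreate (a retried create that hits NodeExistsException) is what the "Other" section flags for further discussion.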
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829405#comment-17829405 ] ASF GitHub Bot commented on YARN-11626: --- dineshchitlangia merged PR #6616: URL: https://github.com/apache/hadoop/pull/6616
[jira] [Updated] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated YARN-11667: -- Labels: pull-request-available (was: )
[jira] [Commented] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829392#comment-17829392 ] ASF GitHub Bot commented on YARN-11667: --- hadoop-yetus commented on PR #6648: URL: https://github.com/apache/hadoop/pull/6648#issuecomment-2011310490 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 7m 30s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 40s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 19m 57s | | trunk passed | | +1 :green_heart: | compile | 3m 44s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 3m 21s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 58s | | trunk passed | | +1 :green_heart: | mvnsite | 1m 10s | | trunk passed | | +1 :green_heart: | javadoc | 1m 10s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 59s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 2m 10s | | trunk passed | | +1 :green_heart: | shadedclient | 19m 57s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 20s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 0m 36s | | the patch passed | | +1 :green_heart: | compile | 3m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 3m 27s | | the patch passed | | +1 :green_heart: | compile | 3m 24s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 3m 24s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 55s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/1/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) | hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 14 unchanged - 0 fixed = 15 total (was 14) | | +1 :green_heart: | mvnsite | 1m 0s | | the patch passed | | +1 :green_heart: | javadoc | 0m 55s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 56s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 2m 10s | | the patch passed | | +1 :green_heart: | shadedclient | 19m 54s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 0m 47s | | hadoop-yarn-api in the patch passed. | | -1 :x: | unit | 25m 34s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/1/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt) | hadoop-yarn-client in the patch passed. | | +1 :green_heart: | asflicense | 0m 40s | | The patch does not generate ASF License warnings. 
| | | | 140m 38s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.yarn.client.TestYarnApiClasses | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6648/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6648 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets | | uname | Linux 56635d28f057 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk
[jira] [Updated] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qiuliang updated YARN-11667: External issue URL: (was: https://github.com/apache/hadoop/pull/6648) > Federation: ResourceRequestComparator occurs NPE when using low version of > hadoop submit application > > > Key: YARN-11667 > URL: https://issues.apache.org/jira/browse/YARN-11667 > Project: Hadoop YARN > Issue Type: Bug > Components: amrmproxy >Affects Versions: 3.4.0 >Reporter: qiuliang >Priority: Major > > When an application is submitted using a lower version of Hadoop, the > Resource Request built by the AM has no ExecutionTypeRequest. After the > Resource Request is submitted to AMRMProxy, an NPE occurs when AMRMProxy > reconstructs the Allocate Request to add the Resource Request to its ask
[jira] [Updated] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
[ https://issues.apache.org/jira/browse/YARN-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] qiuliang updated YARN-11667: External issue URL: https://github.com/apache/hadoop/pull/6648 > Federation: ResourceRequestComparator occurs NPE when using low version of > hadoop submit application > > > Key: YARN-11667 > URL: https://issues.apache.org/jira/browse/YARN-11667 > Project: Hadoop YARN > Issue Type: Bug > Components: amrmproxy >Affects Versions: 3.4.0 >Reporter: qiuliang >Priority: Major > > When an application is submitted using a lower version of Hadoop, the > Resource Request built by the AM has no ExecutionTypeRequest. After the > Resource Request is submitted to AMRMProxy, an NPE occurs when AMRMProxy > reconstructs the Allocate Request to add the Resource Request to its ask
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829387#comment-17829387 ] ASF GitHub Bot commented on YARN-11626: --- XbaoWu commented on PR #6616: URL: https://github.com/apache/hadoop/pull/6616#issuecomment-2011278871 > @XbaoWu - could you please address the 3 checkstyle violations generated by your patch? > > TestCheckRemoveZKNodeRMStateStore.java:95: TestZKRMStateStoreInternal store;:32: Variable 'store' must be private and have accessor methods. [VisibilityModifier] > > TestCheckRemoveZKNodeRMStateStore.java:96: String workingZnode;:12: Variable 'workingZnode' must be private and have accessor methods. [VisibilityModifier] > > TestCheckRemoveZKNodeRMStateStore.java:366: public void testTransitionedToStandbyAfterCheckNode(RMStateStoreHelper stateStoreHelper) throws Exception {: Line is longer than 100 characters (found 109). [LineLength] Okay, I've fixed these checkstyle violations. > Optimization of the safeDelete operation in ZKRMStateStore > -- > > Key: YARN-11626 > URL: https://issues.apache.org/jira/browse/YARN-11626 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.0.0-alpha4, 3.1.1, 3.3.0 >Reporter: wangzhihui >Priority: Minor > Labels: pull-request-available > > h1. Description > * It can be observed that removing app info started at 06:17:20, but the > NoNodeException was received at 06:17:35. > * During the 15s interval, Curator was retrying the metadata operation. Due > to the non-idempotent nature of the Zookeeper deletion operation, in one of > the retry attempts, the metadata operation was successful but no response was > received. The next retry then resulted in a NoNodeException, triggering the > STATE_STORE_FENCED event and ultimately causing the current ResourceManager > to switch to standby. 
> {code:java} > 2023-10-28 06:17:20,359 INFO recovery.RMStateStore > (RMStateStore.java:transition(333)) - Removing info for app: > application_1697410508608_140368 > 2023-10-28 06:17:20,359 INFO resourcemanager.RMAppManager > (RMAppManager.java:checkAppNumCompletedLimit(303)) - Application should be > expired, max number of completed apps kept in memory met: > maxCompletedAppsInMemory = 1000, removing app > application_1697410508608_140368 from memory: > 2023-10-28 06:17:35,665 ERROR recovery.RMStateStore > (RMStateStore.java:transition(337)) - Error removing app: > application_1697410508608_140368 > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > 2023-10-28 06:17:35,666 INFO recovery.RMStateStore > (RMStateStore.java:handleStoreEvent(1147)) - RMStateStore state change from > ACTIVE to FENCED > 2023-10-28 06:17:35,666 ERROR resourcemanager.ResourceManager > (ResourceManager.java:handle(898)) - Received RMFatalEvent of type > STATE_STORE_FENCED, caused by > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode > 2023-10-28 06:17:35,666 INFO resourcemanager.ResourceManager > (ResourceManager.java:transitionToStandby(1309)) - Transitioning to standby > state > {code} > h1. Solution > The NoNodeException clearly indicates that the Znode no longer exists, so we > can safely ignore this exception to avoid triggering a larger impact on the > cluster caused by ResourceManager failover. > h1. Other > We also need to discuss and optimize the same issues in safeCreate.
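The proposed safeDelete semantics can be sketched as follows. This is a hedged, self-contained simplification, not the merged patch: the real change lives in ZKRMStateStore and talks to ZooKeeper through Curator, whereas here a plain HashSet stands in for the ZK namespace, and SafeDeleteSketch/NoNodeException are illustrative stand-in names.

```java
import java.util.HashSet;
import java.util.Set;

// Hedged sketch of the proposed safeDelete behavior. A HashSet stands in
// for the ZooKeeper namespace; the real code uses Curator against ZK.
public class SafeDeleteSketch {
    static class NoNodeException extends Exception {}

    final Set<String> znodes = new HashSet<>();

    // Plain delete: throws when the node is absent, like ZooKeeper's delete.
    void delete(String path) throws NoNodeException {
        if (!znodes.remove(path)) {
            throw new NoNodeException();
        }
    }

    // safeDelete treats "node already gone" as success: if a retried delete
    // succeeded earlier but its response was lost, the desired end state is
    // already reached, so there is no reason to fence the store and fail
    // over the ResourceManager.
    void safeDelete(String path) {
        try {
            delete(path);
        } catch (NoNodeException e) {
            System.out.println("Node " + path + " already deleted, ignoring");
        }
    }

    public static void main(String[] args) {
        SafeDeleteSketch store = new SafeDeleteSketch();
        store.znodes.add("/rmstore/app_1");
        store.safeDelete("/rmstore/app_1"); // removes the node
        store.safeDelete("/rmstore/app_1"); // retry after a lost response: no-op
        System.out.println(store.znodes.isEmpty()); // prints true
    }
}
```

The design choice mirrors how idempotent operations are usually retried: the delete is made idempotent at the caller rather than at ZooKeeper, which cannot distinguish "already deleted by my earlier attempt" from "deleted by someone else".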
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829386#comment-17829386 ] ASF GitHub Bot commented on YARN-11626: --- hadoop-yetus commented on PR #6616: URL: https://github.com/apache/hadoop/pull/6616#issuecomment-2011277303 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 21s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | xmllint | 0m 0s | | xmllint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 42s | | trunk passed | | +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 30s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 36s | | trunk passed | | +1 :green_heart: | javadoc | 0m 36s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 13s | | trunk passed | | +1 :green_heart: | shadedclient | 19m 57s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 29s | | the patch passed | | +1 :green_heart: | compile | 0m 28s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 28s | | the patch passed | | +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 27s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 0m 21s | | the patch passed | | +1 :green_heart: | mvnsite | 0m 27s | | the patch passed | | +1 :green_heart: | javadoc | 0m 23s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 6s | | the patch passed | | +1 :green_heart: | shadedclient | 20m 8s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 89m 20s | | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 24s | | The patch does not generate ASF License warnings. 
| | | | 173m 27s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/5/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6616 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle | | uname | Linux 8d01ef473a19 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / c30bbd67867e4b445820620c4387bcd11cf8fba0 | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/5/testReport/ | | Max. process+thread count | 963 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/5/console | | versions
[jira] [Updated] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Syed Shameerur Rahman updated YARN-11664: - Description: In principle Hadoop Yarn is independent of HDFS. It can work with any filesystem. Currently there exists some code dependency for Yarn with HDFS. This dependency requires Yarn to bring in some of the HDFS binaries/jars to its class path. The idea behind this jira is to remove this dependency so that Yarn can run without HDFS binaries/jars *Scope* 1. Non-test classes are considered 2. Some test classes which come as a transitive dependency are considered *Out of scope* 1. All test classes in the Yarn module are not considered A quick search in the Yarn module revealed the following HDFS dependencies 1. Constants {code:java} import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier; import org.apache.hadoop.hdfs.DFSConfigKeys;{code} 2. Exception {code:java} import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;{code} 3. Utility {code:java} import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code} Both Yarn and HDFS depend on the *hadoop-common* module, * Constant variables and Utility classes can be moved to *hadoop-common* * Instead of DSQuotaExceededException, use the parent exception ClusterStorageCapacityExceededException was: In principle Hadoop Yarn is independent of HDFS. It can work with any filesystem. Currently there exists some code dependency for Yarn with HDFS. This dependency requires Yarn to bring in some of the HDFS binaries/jars to its class path. The idea behind this jira is to remove this dependency so that Yarn can run without HDFS binaries/jars *Scope* 1. Non test classes are considered 2. Some test classes which comes as transitive dependency are considered *Out of scope* 1. All test classes in Yarn module is not considered A quick search in Yarn module revealed following HDFS dependencies 1. 
Constants {code:java} import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier; import org.apache.hadoop.hdfs.DFSConfigKeys;{code} 2. Exception {code:java} import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException; import org.apache.hadoop.hdfs.protocol.QuotaExceededException; (Comes as a transitive dependency from DSQuotaExceededException){code} 3. Utility {code:java} import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code} Both Yarn and HDFS depends on *hadoop-common* module, One straight forward approach is to move all these dependencies to *hadoop-common* module and both HDFS and Yarn can pick these dependencies. > Remove HDFS Binaries/Jars Dependency From YARN > -- > > Key: YARN-11664 > URL: https://issues.apache.org/jira/browse/YARN-11664 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > > In principle Hadoop Yarn is independent of HDFS. It can work with any > filesystem. Currently there exists some code dependency for Yarn with HDFS. > This dependency requires Yarn to bring in some of the HDFS binaries/jars to > its class path. The idea behind this jira is to remove this dependency so > that Yarn can run without HDFS binaries/jars > *Scope* > 1. Non test classes are considered > 2. Some test classes which comes as transitive dependency are considered > *Out of scope* > 1. All test classes in Yarn module is not considered > > > A quick search in Yarn module revealed following HDFS dependencies > 1. Constants > {code:java} > import > org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier; > import org.apache.hadoop.hdfs.DFSConfigKeys;{code} > > > 2. Exception > {code:java} > import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;{code} > > 3. 
Utility > {code:java} > import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code} > > Both Yarn and HDFS depend on the *hadoop-common* module, > * Constant variables and Utility classes can be moved to *hadoop-common* > * Instead of DSQuotaExceededExc
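The exception part of the proposal can be illustrated with stand-in classes. The real types are org.apache.hadoop.fs.ClusterStorageCapacityExceededException (hadoop-common) and org.apache.hadoop.hdfs.protocol.DSQuotaExceededException (HDFS); the class bodies and the writeLog helper below are hypothetical simplifications that only show why catching the hadoop-common parent removes the compile-time dependency on the HDFS jars.

```java
import java.io.IOException;

// Stand-ins for the real hierarchy: the parent lives in hadoop-common,
// the child in the HDFS client jars. YARN code written against the
// parent never needs the HDFS classes on its class path.
public class QuotaCatchSketch {
    static class ClusterStorageCapacityExceededException extends IOException {
        ClusterStorageCapacityExceededException(String msg) { super(msg); }
    }

    // Only HDFS-side code would reference this subclass.
    static class DSQuotaExceededException extends ClusterStorageCapacityExceededException {
        DSQuotaExceededException(String msg) { super(msg); }
    }

    // YARN-side handler: names only the hadoop-common parent type, yet
    // still catches the HDFS-specific subclass at runtime.
    static String writeLog(boolean hdfsQuotaHit) {
        try {
            if (hdfsQuotaHit) {
                // In real life this throw happens inside the FileSystem impl.
                throw new DSQuotaExceededException("DS quota exceeded");
            }
            return "ok";
        } catch (ClusterStorageCapacityExceededException e) {
            return "capacity-exceeded: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(writeLog(true));  // capacity-exceeded: DS quota exceeded
        System.out.println(writeLog(false)); // ok
    }
}
```

Because exception dispatch is polymorphic, the YARN module keeps identical runtime behavior while its compile-time imports shrink to hadoop-common only.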
[jira] [Created] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
qiuliang created YARN-11667: --- Summary: Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application Key: YARN-11667 URL: https://issues.apache.org/jira/browse/YARN-11667 Project: Hadoop YARN Issue Type: Bug Components: amrmproxy Affects Versions: 3.4.0 Reporter: qiuliang When an application is submitted using a lower version of Hadoop, the Resource Request built by the AM has no ExecutionTypeRequest. After the Resource Request is submitted to AMRMProxy, an NPE occurs when AMRMProxy reconstructs the Allocate Request to add the Resource Request to its ask.
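The failure mode described above can be reproduced in miniature. The sketch below uses simplified stand-in types (Req and a lambda comparator) rather than the real YARN ResourceRequest/ResourceRequestComparator classes, and the null-safe ordering is an assumption about how such an NPE could be avoided, not the merged patch.

```java
import java.util.Comparator;

// Simplified stand-ins for ResourceRequest and its comparator. An older
// client never sets the execution type, so the field arrives as null.
public class NullSafeCompareSketch {
    static class Req {
        final int priority;
        final String executionType; // null when built by an older client

        Req(int priority, String executionType) {
            this.priority = priority;
            this.executionType = executionType;
        }
    }

    // A naive "a.executionType.compareTo(b.executionType)" throws NPE for
    // old-client requests; explicitly ordering nulls first avoids it while
    // keeping a total order, as Comparator's contract requires.
    static final Comparator<Req> CMP = (a, b) -> {
        int c = Integer.compare(a.priority, b.priority);
        if (c != 0) {
            return c;
        }
        if (a.executionType == null && b.executionType == null) {
            return 0;
        }
        if (a.executionType == null) {
            return -1;
        }
        if (b.executionType == null) {
            return 1;
        }
        return a.executionType.compareTo(b.executionType);
    };

    public static void main(String[] args) {
        Req oldClient = new Req(1, null);         // no ExecutionTypeRequest set
        Req newClient = new Req(1, "GUARANTEED");
        System.out.println(CMP.compare(oldClient, newClient)); // -1, no NPE
    }
}
```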
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829283#comment-17829283 ] ASF GitHub Bot commented on YARN-11664: --- hadoop-yetus commented on PR #6631: URL: https://github.com/apache/hadoop/pull/6631#issuecomment-2010451899 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 31s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 23s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 31m 49s | | trunk passed | | +1 :green_heart: | compile | 18m 14s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 16m 42s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | +1 :green_heart: | checkstyle | 4m 20s | | trunk passed | | +1 :green_heart: | mvnsite | 6m 59s | | trunk passed | | +1 :green_heart: | javadoc | 5m 47s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 6m 1s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | -1 :x: | spotbugs | 2m 54s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/4/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html) | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs warnings. 
| | -1 :x: | spotbugs | 1m 9s | [/branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/4/artifact/out/branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core-warnings.html) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core in trunk has 1 extant spotbugs warnings. | | +1 :green_heart: | shadedclient | 34m 29s | | branch has no errors when building and testing our client artifacts. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 4m 32s | | the patch passed | | +1 :green_heart: | compile | 16m 51s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 16m 51s | | the patch passed | | +1 :green_heart: | compile | 16m 55s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | +1 :green_heart: | javac | 16m 55s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 4m 29s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/4/artifact/out/results-checkstyle-root.txt) | root: The patch generated 5 new + 528 unchanged - 2 fixed = 533 total (was 530) | | +1 :green_heart: | mvnsite | 7m 22s | | the patch passed | | +1 :green_heart: | javadoc | 5m 52s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 6m 20s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 | | +1 :green_heart: | spotbugs | 14m 54s | | the patch passed | | +1 :green_heart: | shadedclient | 34m 50s | | patch has no errors when building and testing our client artifacts. 
| _ Other Tests _ | | +1 :green_heart: | unit | 19m 16s | | hadoop-common in the patch passed. | | +1 :green_heart: | unit | 2m 41s | | hadoop-hdfs-client in the patch passed. | | +1 :green_heart: | unit | 225m 0s | | hadoop-hdfs in the patch passed. | | +1 :green_heart: | unit | 6m 7s | | hadoop-yarn-common in the patch passed. | | +1 :green_heart: | unit | 24m 39s | | hadoop-yarn-server-nodemanager in the patch passed. | | +1 :green_heart: | unit | 21m 26s | | hadoop-yarn-services-core in the patch passed. | | +1 :green_heart: | asflicense | 1m 14s
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829262#comment-17829262 ] ASF GitHub Bot commented on YARN-11626: --- hadoop-yetus commented on PR #6616: URL: https://github.com/apache/hadoop/pull/6616#issuecomment-2010221243 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 6m 33s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | xmllint | 0m 0s | | xmllint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +1 :green_heart: | mvninstall | 32m 32s | | trunk passed | | +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | compile | 0m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | checkstyle | 0m 29s | | trunk passed | | +1 :green_heart: | mvnsite | 0m 33s | | trunk passed | | +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 7s | | trunk passed | | +1 :green_heart: | shadedclient | 20m 1s | | branch has no errors when building and testing our client artifacts. 
| _ Patch Compile Tests _ | | +1 :green_heart: | mvninstall | 0m 27s | | the patch passed | | +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javac | 0m 27s | | the patch passed | | +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | javac | 0m 27s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/4/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 5 unchanged - 0 fixed = 8 total (was 5) | | +1 :green_heart: | mvnsite | 0m 26s | | the patch passed | | +1 :green_heart: | javadoc | 0m 23s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 | | +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | +1 :green_heart: | spotbugs | 1m 6s | | the patch passed | | +1 :green_heart: | shadedclient | 20m 8s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 89m 34s | | hadoop-yarn-server-resourcemanager in the patch passed. | | +1 :green_heart: | asflicense | 0m 24s | | The patch does not generate ASF License warnings. 
| | | | 179m 40s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/6616 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle | | uname | Linux 00b3366602f7 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 725bb7fd54d8c2d821e7b38df2a3358678c71b9c | | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 | | Test Results | https://ci
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829195#comment-17829195 ] ASF GitHub Bot commented on YARN-11626: --- XbaoWu commented on code in PR #6616: URL: https://github.com/apache/hadoop/pull/6616#discussion_r1532220247 ## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java: ## @@ -1441,6 +1441,29 @@ void delete(final String path) throws Exception { zkManager.delete(path); } + /** + * Deletes the path more safe. + * When NNE is encountered, if the node does not exist, Review Comment: > Could you expand NNE in the javadoc for brevity? Okay, thank you for your reminder
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829191#comment-17829191 ] ASF GitHub Bot commented on YARN-11626: --- dineshchitlangia commented on code in PR #6616: URL: https://github.com/apache/hadoop/pull/6616#discussion_r1532190511 ## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java: ## @@ -1441,6 +1441,29 @@ void delete(final String path) throws Exception { zkManager.delete(path); } + /** + * Deletes the path more safe. + * When NNE is encountered, if the node does not exist, Review Comment: Could you expand NNE in the javadoc for brevity? > Optimization of the safeDelete operation in ZKRMStateStore > -- > > Key: YARN-11626 > URL: https://issues.apache.org/jira/browse/YARN-11626 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 3.0.0-alpha4, 3.1.1, 3.3.0 >Reporter: wangzhihui >Priority: Minor > Labels: pull-request-available > > h1. Description > * We can be observed that removing app info started at 06:17:20, but the > NoNodeException was received at 06:17:35. > * During the 15s interval, Curator was retrying the metadata operation. Due > to the non-idempotent nature of the Zookeeper deletion operation, in one of > the retry attempts, the metadata operation was successful but no response was > received. In the next retry it resulted in a NoNodeException, triggering the > STATE_STORE_FENCED event and ultimately causing the current ResourceManager > to switch to standby . 
> {code:java} > 2023-10-28 06:17:20,359 INFO recovery.RMStateStore > (RMStateStore.java:transition(333)) - Removing info for app: > application_1697410508608_140368 > 2023-10-28 06:17:20,359 INFO resourcemanager.RMAppManager > (RMAppManager.java:checkAppNumCompletedLimit(303)) - Application should be > expired, max number of completed apps kept in memory met: > maxCompletedAppsInMemory = 1000, removing app > application_1697410508608_140368 from memory: > 2023-10-28 06:17:35,665 ERROR recovery.RMStateStore > (RMStateStore.java:transition(337)) - Error removing app: > application_1697410508608_140368 > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > 2023-10-28 06:17:35,666 INFO recovery.RMStateStore > (RMStateStore.java:handleStoreEvent(1147)) - RMStateStore state change from > ACTIVE to FENCED > 2023-10-28 06:17:35,666 ERROR resourcemanager.ResourceManager > (ResourceManager.java:handle(898)) - Received RMFatalEvent of type > STATE_STORE_FENCED, caused by > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode > 2023-10-28 06:17:35,666 INFO resourcemanager.ResourceManager > (ResourceManager.java:transitionToStandby(1309)) - Transitioning to standby > state > {code} > h1. Solution > The NoNodeException clearly indicates that the Znode no longer exists, so we > can safely ignore this exception to avoid triggering a larger impact on the > cluster caused by ResourceManager failover. > h1. Other > We also need to discuss and optimize the same issues in safeCreate. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
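The safeDelete behavior proposed in YARN-11626 can be sketched with a toy in-memory store. All names here are hypothetical stand-ins (the real change lives in ZKRMStateStore and must still rethrow KeeperExceptions other than NoNode); the point is only that swallowing NoNode makes a retried delete idempotent instead of fencing the state store:

```java
import java.util.HashSet;
import java.util.Set;

// Toy stand-in for the ZooKeeper-backed state store; a real
// KeeperException.NoNodeException becomes an unchecked exception here.
class SafeDeleteSketch {
    static class NoNodeException extends RuntimeException {
        NoNodeException(String path) { super("KeeperErrorCode = NoNode for " + path); }
    }

    private final Set<String> znodes = new HashSet<>();

    void create(String path) { znodes.add(path); }

    boolean exists(String path) { return znodes.contains(path); }

    // Plain delete: throws when the node is already gone, which is what a
    // Curator retry sees after a delete that succeeded without a response.
    void delete(String path) {
        if (!znodes.remove(path)) {
            throw new NoNodeException(path);
        }
    }

    // Proposed behavior: a missing node already satisfies the delete.
    void safeDelete(String path) {
        try {
            delete(path);
        } catch (NoNodeException e) {
            // Znode no longer exists; treat the delete as successful.
        }
    }

    public static void main(String[] args) {
        SafeDeleteSketch store = new SafeDeleteSketch();
        store.create("/rmstore/app_1");
        store.safeDelete("/rmstore/app_1"); // first attempt removes the node
        store.safeDelete("/rmstore/app_1"); // "retry" finds no node, no fencing
        System.out.println(store.exists("/rmstore/app_1")); // false
    }
}
```

The second safeDelete call models the retry from the log above: instead of surfacing NoNodeException and triggering STATE_STORE_FENCED, the operation completes quietly because its goal state is already reached.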
[jira] [Resolved] (YARN-5305) Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
[ https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke resolved YARN-5305. - Fix Version/s: 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed > Yarn Application Log Aggregation fails due to NM can not get correct HDFS > delegation token III > -- > > Key: YARN-5305 > URL: https://issues.apache.org/jira/browse/YARN-5305 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Xianyin Xin >Assignee: Peter Szucs >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > Unlike YARN-5098 and YARN-5302, this problem happens when the AM submits > a startContainer request with a new HDFS token (say, tokenB) that is not > managed by YARN, so two tokens exist in the user's credentials on the NM: > one is tokenB, the other the one renewed on the RM (tokenA). If tokenB is > selected when connecting to HDFS and tokenB has expired, an exception occurs. > Supplementary: this problem happens because the AM didn't use the service > name as the token alias in the credentials, so two tokens for the same > service can co-exist in one Credentials object. The TokenSelector simply > selects the first matching token; it doesn't check whether the token is > valid. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5305) Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
[ https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829190#comment-17829190 ] ASF GitHub Bot commented on YARN-5305: -- brumi1024 merged PR #6625: URL: https://github.com/apache/hadoop/pull/6625 > Yarn Application Log Aggregation fails due to NM can not get correct HDFS > delegation token III > -- > > Key: YARN-5305 > URL: https://issues.apache.org/jira/browse/YARN-5305 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Xianyin Xin >Assignee: Peter Szucs >Priority: Major > Labels: pull-request-available > > Unlike YARN-5098 and YARN-5302, this problem happens when the AM submits > a startContainer request with a new HDFS token (say, tokenB) that is not > managed by YARN, so two tokens exist in the user's credentials on the NM: > one is tokenB, the other the one renewed on the RM (tokenA). If tokenB is > selected when connecting to HDFS and tokenB has expired, an exception occurs. > Supplementary: this problem happens because the AM didn't use the service > name as the token alias in the credentials, so two tokens for the same > service can co-exist in one Credentials object. The TokenSelector simply > selects the first matching token; it doesn't check whether the token is > valid. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5305) Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
[ https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829189#comment-17829189 ] ASF GitHub Bot commented on YARN-5305: -- brumi1024 commented on PR #6625: URL: https://github.com/apache/hadoop/pull/6625#issuecomment-2009719040 Thanks @p-szucs for the patch, @K0K0V0K for the review. The SpotBugs warning seems unrelated, merging to trunk. > Yarn Application Log Aggregation fails due to NM can not get correct HDFS > delegation token III > -- > > Key: YARN-5305 > URL: https://issues.apache.org/jira/browse/YARN-5305 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Xianyin Xin >Assignee: Peter Szucs >Priority: Major > Labels: pull-request-available > > Unlike YARN-5098 and YARN-5302, this problem happens when the AM submits > a startContainer request with a new HDFS token (say, tokenB) that is not > managed by YARN, so two tokens exist in the user's credentials on the NM: > one is tokenB, the other the one renewed on the RM (tokenA). If tokenB is > selected when connecting to HDFS and tokenB has expired, an exception occurs. > Supplementary: this problem happens because the AM didn't use the service > name as the token alias in the credentials, so two tokens for the same > service can co-exist in one Credentials object. The TokenSelector simply > selects the first matching token; it doesn't check whether the token is > valid. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
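The alias collision behind YARN-5305 can be illustrated with a plain map standing in for org.apache.hadoop.security.Credentials (which also keys tokens by alias). All names, token strings, and the first-match selector below are illustrative only, not the real Hadoop API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the credentials aliasing problem: two tokens for the same
// HDFS service co-exist when the AM picks an ad-hoc alias instead of the
// service name, and a first-match selector may return the stale one.
class TokenAliasSketch {

    // Mimics a TokenSelector: the first token targeting the service wins,
    // valid or not.
    static String firstTokenFor(String service, Map<String, String> creds) {
        for (String token : creds.values()) {
            if (token.contains(service)) {
                return token;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String service = "hdfs://nn:8020";
        Map<String, String> credentials = new LinkedHashMap<>();

        // What the report describes: RM's renewed token under the service
        // alias, the AM's fresh token under an ad-hoc alias -> both co-exist.
        credentials.put(service, "expired-tokenA@" + service);
        credentials.put("am-alias", "fresh-tokenB@" + service);
        System.out.println(credentials.size());                  // 2
        System.out.println(firstTokenFor(service, credentials)); // the expired one

        // Had the AM keyed its token by the service name, the put would
        // have replaced the stale entry instead of duplicating it.
        credentials.clear();
        credentials.put(service, "expired-tokenA@" + service);
        credentials.put(service, "fresh-tokenB@" + service);
        System.out.println(credentials.size());                  // 1
        System.out.println(firstTokenFor(service, credentials)); // the fresh one
    }
}
```

The second half of main shows why using the service name as the alias prevents the failure mode: map semantics make the newer token overwrite the stale one, so a first-match selector can no longer pick an expired duplicate.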
[jira] [Updated] (YARN-11666) NullPointerException in TestSLSRunner.testSimulatorRunning
[ https://issues.apache.org/jira/browse/YARN-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elen Chatikyan updated YARN-11666: -- Description: *What happened:* In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Scheduler Load Simulator) framework, a *NullPointerException* is thrown during the teardown process of parameterized tests. This exception is thrown when the stop method is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This issue occurs under test conditions that involve mismatches between trace types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios where the rm object may not be properly initialized before the stop method is invoked. *Buggy code:* The issue is located in the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} file, within the *{{stop}}* method: {code:java} public void stop() { rm.stop(); } {code} The root cause of the *{{NullPointerException}}* is the lack of a null check on the {{rm}} object before calling its {{stop}} method. Whenever the *{{ResourceManager}}* fails to initialize correctly, attempting to stop it leads to a null pointer dereference. After fixing {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}}, [TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169] should also be fixed. 
[TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169] {code:java} public void stop() throws InterruptedException { executor.shutdownNow(); executor.awaitTermination(20, TimeUnit.SECONDS); } {code} *How to trigger this bug:* * Change the parameterized unit test's (TestSLSRunner.java) data method to include one or both of the following test cases: * {capScheduler, "SYNTH", rumenTraceFile, nodeFile } * {capScheduler, "SYNTH", slsTraceFile, nodeFile } * Execute the *TestSLSRunner* test suite, particularly the *testSimulatorRunning* method. * Observe the resulting *NullPointerException* in the test output (triggered in RMRunner.java). {color:#505f79}_You can use the attachments ([^reproduce.sh], which uses the [^add_test_cases.patch] patch) to easily reproduce the bug._{color} {panel:title=Example stack trace from the test output:} [ERROR] testSimulatorRunning[Testing with: SYNTH, org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, (nodeFile null)](org.apache.hadoop.yarn.sls.TestSLSRunner) Time elapsed: 3.027 s <<< ERROR! java.lang.NullPointerException at org.apache.hadoop.yarn.sls.RMRunner.stop(RMRunner.java:127) at org.apache.hadoop.yarn.sls.SLSRunner.stop(SLSRunner.java:320) at org.apache.hadoop.yarn.sls.BaseSLSRunnerTest.tearDown(BaseSLSRunnerTest.java:68) ... 
{panel} *How To Fix* _{color:#172b4d}The bug can be fixed by implementing a null check for the {{rm}} object within the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} {{stop}} method before calling any methods on it.(same for executor object in [TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169]){color}_ was: *What happened:* In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Simulated Load Scheduler) framework, a *NullPointerException* is thrown during the teardown process of parameterized tests. This exception is thrown when the stop method is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This issue occurs under test conditions that involve mismatches between trace types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios where the rm object may not be properly initialized before the stop method is invoked. *Buggy code:* The issue is located in the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} file within the *{{stop}}* method: {code:java} public void stop() { rm.stop(); } {code} The root cause of the *{{NullPointerException}}* is the lack of a null check for the {{rm}} object before calling its {{stop}} method. Under any condition where th
[jira] [Updated] (YARN-11666) NullPointerException in TestSLSRunner.testSimulatorRunning
[ https://issues.apache.org/jira/browse/YARN-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elen Chatikyan updated YARN-11666: -- Attachment: add_test_cases.patch reproduce.sh > NullPointerException in TestSLSRunner.testSimulatorRunning > -- > > Key: YARN-11666 > URL: https://issues.apache.org/jira/browse/YARN-11666 > Project: Hadoop YARN > Issue Type: Bug > Environment: {*}Operating System{*}: macOS (Sonoma 14.2.1 (23C71)) > {*}Hardware{*}: MacBook Air 2023 > {*}IDE{*}: IntelliJ IDEA (2023.3.2 (Ultimate Edition)) > {*}Java Version{*}: OpenJDK version "1.8.0_292" >Reporter: Elen Chatikyan >Priority: Major > Attachments: add_test_cases.patch, reproduce.sh > > > *What happened:* > In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Scheduler Load > Simulator) framework, a *NullPointerException* is thrown during the teardown > process of parameterized tests. This exception is thrown when the stop method > is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This > issue occurs under test conditions that involve mismatches between trace > types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to > scenarios where the rm object may not be properly initialized before the stop > method is invoked. > > *Buggy code:* > The issue is located in the > {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} > file, within the *{{stop}}* method: > {code:java} > public void stop() { > rm.stop(); > } > {code} > The root cause of the *{{NullPointerException}}* is the lack of a null check > on the {{rm}} object before calling its {{stop}} method. Whenever the > *{{ResourceManager}}* fails to initialize correctly, attempting to > stop it leads to a null pointer dereference. 
> > After fixing in > {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} > , > [TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169] > should also be fixed. > [TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169] > {code:java} > public void stop() throws InterruptedException { > executor.shutdownNow(); > executor.awaitTermination(20, TimeUnit.SECONDS); > } > {code} > > *How to trigger this bug:* > {color:#00875a}*you can use the attachments(reproduce.sh and ) to easily > reproduce the bug{color} > * Change the parameterized unit test's(TestSLSRunner.java) data method to > include one/both of the following test cases: > * {capScheduler, "SYNTH", rumenTraceFile, nodeFile } > * {capScheduler, "SYNTH", slsTraceFile, nodeFile } > * Execute the *TestSLSRunner* test suite, particularly the > *testSimulatorRunning* method. > * Observe the resulting *NullPointerException* in the test output(triggered > in RMRunner.java). > > {panel:title=Example stack trace from the test output:} > [ERROR] testSimulatorRunning[Testing with: SYNTH, > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, > (nodeFile null)](org.apache.hadoop.yarn.sls.TestSLSRunner) Time elapsed: > 3.027 s <<< ERROR! > java.lang.NullPointerException > at org.apache.hadoop.yarn.sls.RMRunner.stop(RMRunner.java:127) > at org.apache.hadoop.yarn.sls.SLSRunner.stop(SLSRunner.java:320) > at > org.apache.hadoop.yarn.sls.BaseSLSRunnerTest.tearDown(BaseSLSRunnerTest.java:68) > ... 
> {panel} > > > *How To Fix* > _{color:#172b4d}The bug can be fixed by implementing a null check for the > {{rm}} object within the > {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} > {{stop}} method before calling any methods on it.(same for executor object > in TaskRunner.java){color}_ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
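The null-check fix suggested in the report can be sketched as follows. The class and field names are hypothetical stand-ins for RMRunner (its rm field) and TaskRunner (its executor field); the real patch would guard both the same way so that teardown after a failed or skipped start no longer dereferences null:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: stop() must tolerate a field that was never initialized because
// start() failed or never ran (the SYNTH/trace-file mismatch scenario).
class NullSafeStopSketch {
    private ExecutorService executor; // stays null if start() is skipped

    void start() {
        executor = Executors.newSingleThreadExecutor();
    }

    // Returns true when there was actually something to shut down.
    boolean stop() {
        if (executor == null) {
            return false; // nothing started; previously this path threw NPE
        }
        executor.shutdownNow();
        try {
            executor.awaitTermination(20, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        return true;
    }

    public static void main(String[] args) {
        NullSafeStopSketch neverStarted = new NullSafeStopSketch();
        System.out.println(neverStarted.stop()); // false, no NullPointerException

        NullSafeStopSketch started = new NullSafeStopSketch();
        started.start();
        System.out.println(started.stop()); // true
    }
}
```

Note the sketch swallows InterruptedException and restores the interrupt flag for simplicity; the actual TaskRunner.stop declares `throws InterruptedException`, and a patch there would keep that signature and only add the null guard.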
[jira] [Updated] (YARN-11666) NullPointerException in TestSLSRunner.testSimulatorRunning
[ https://issues.apache.org/jira/browse/YARN-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elen Chatikyan updated YARN-11666: -- Description: *What happened:* In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Simulated Load Scheduler) framework, a *NullPointerException* is thrown during the teardown process of parameterized tests. This exception is thrown when the stop method is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This issue occurs under test conditions that involve mismatches between trace types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios where the rm object may not be properly initialized before the stop method is invoked. *Buggy code:* The issue is located in the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} file within the *{{stop}}* method: {code:java} public void stop() { rm.stop(); } {code} The root cause of the *{{NullPointerException}}* is the lack of a null check for the {{rm}} object before calling its {{stop}} method. Under any condition where the *{{ResourceManager}}* fails to initialize correctly, attempting to stop the *{{ResourceManager}}* leads to a null pointer dereference. After fixing in {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} , [TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169] should also be fixed. 
[TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169] {code:java} public void stop() throws InterruptedException { executor.shutdownNow(); executor.awaitTermination(20, TimeUnit.SECONDS); } {code} *How to trigger this bug:* {color:#00875a}*you can use the attachments(reproduce.sh and ) to easily reproduce the bug{color} * Change the parameterized unit test's(TestSLSRunner.java) data method to include one/both of the following test cases: * {capScheduler, "SYNTH", rumenTraceFile, nodeFile } * {capScheduler, "SYNTH", slsTraceFile, nodeFile } * Execute the *TestSLSRunner* test suite, particularly the *testSimulatorRunning* method. * Observe the resulting *NullPointerException* in the test output(triggered in RMRunner.java). {panel:title=Example stack trace from the test output:} [ERROR] testSimulatorRunning[Testing with: SYNTH, org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, (nodeFile null)](org.apache.hadoop.yarn.sls.TestSLSRunner) Time elapsed: 3.027 s <<< ERROR! java.lang.NullPointerException at org.apache.hadoop.yarn.sls.RMRunner.stop(RMRunner.java:127) at org.apache.hadoop.yarn.sls.SLSRunner.stop(SLSRunner.java:320) at org.apache.hadoop.yarn.sls.BaseSLSRunnerTest.tearDown(BaseSLSRunnerTest.java:68) ... 
{panel} *How To Fix* _{color:#172b4d}The bug can be fixed by implementing a null check for the {{rm}} object within the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} {{stop}} method before calling any methods on it.(same for executor object in TaskRunner.java){color}_ was: *What happened:* In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Simulated Load Scheduler) framework, a *NullPointerException* is thrown during the teardown process of parameterized tests. This exception is thrown when the stop method is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This issue occurs under test conditions that involve mismatches between trace types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios where the rm object may not be properly initialized before the stop method is invoked. *Buggy code:* The issue is located in the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} file within the *{{stop}}* method: {code:java} public void stop() { rm.stop(); } {code} The root cause of the *{{NullPointerException}}* is the lack of a null check for the {{rm}} object before calling its {{stop}} method. Under any condition where the *{{ResourceManager}}* fails to initialize correctly, attempting to stop the *{{ResourceManager}}* leads to a null pointer dereference. After fixing in {{[RMRunner.java|https://
[jira] [Updated] (YARN-11666) NullPointerException in TestSLSRunner.testSimulatorRunning
[ https://issues.apache.org/jira/browse/YARN-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elen Chatikyan updated YARN-11666: -- Description: *What happened:* In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Simulated Load Scheduler) framework, a *NullPointerException* is thrown during the teardown process of parameterized tests. This exception is thrown when the stop method is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This issue occurs under test conditions that involve mismatches between trace types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios where the rm object may not be properly initialized before the stop method is invoked. *Buggy code:* The issue is located in the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} file within the *{{stop}}* method: {code:java} public void stop() { rm.stop(); } {code} The root cause of the *{{NullPointerException}}* is the lack of a null check for the {{rm}} object before calling its {{stop}} method. Under any condition where the *{{ResourceManager}}* fails to initialize correctly, attempting to stop the *{{ResourceManager}}* leads to a null pointer dereference. After fixing in {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} , [TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169] should also be fixed. 
[TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169] {code:java} public void stop() throws InterruptedException { executor.shutdownNow(); executor.awaitTermination(20, TimeUnit.SECONDS); } {code} *How to trigger this bug:* * Change the parameterized unit test's(TestSLSRunner.java) data method to include one/both of the following test cases: * {capScheduler, "SYNTH", rumenTraceFile, nodeFile } * {capScheduler, "SYNTH", slsTraceFile, nodeFile } * Execute the *TestSLSRunner* test suite, particularly the *testSimulatorRunning* method. * Observe the resulting *NullPointerException* in the test output(triggered in RMRunner.java). {panel:title=Example stack trace from the test output:} [ERROR] testSimulatorRunning[Testing with: SYNTH, org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, (nodeFile null)](org.apache.hadoop.yarn.sls.TestSLSRunner) Time elapsed: 3.027 s <<< ERROR! java.lang.NullPointerException at org.apache.hadoop.yarn.sls.RMRunner.stop(RMRunner.java:127) at org.apache.hadoop.yarn.sls.SLSRunner.stop(SLSRunner.java:320) at org.apache.hadoop.yarn.sls.BaseSLSRunnerTest.tearDown(BaseSLSRunnerTest.java:68) ... {panel} *How To Fix* _{color:#172b4d}The bug can be fixed by implementing a null check for the {{rm}} object within the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} {{stop}} method before calling any methods on it.(same for executor object in TaskRunner.java){color}_ was: *What happened:* In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Simulated Load Scheduler) framework, a *NullPointerException* is thrown during the teardown process of parameterized tests. 
This exception is thrown when the stop method is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This issue occurs under test conditions that involve mismatches between trace types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios where the rm object may not be properly initialized before the stop method is invoked. *Buggy code:* The issue is located in the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} file within the *{{stop}}* method: {code:java} public void stop() { rm.stop(); } {code} The root cause of the *{{NullPointerException}}* is the lack of a null check for the {{rm}} object before calling its {{stop}} method. Under any condition where the *{{ResourceManager}}* fails to initialize correctly, attempting to stop the *{{ResourceManager}}* leads to a null pointer dereference. After fixing in {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoo
[jira] [Updated] (YARN-11666) NullPointerException in TestSLSRunner.testSimulatorRunning
[ https://issues.apache.org/jira/browse/YARN-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elen Chatikyan updated YARN-11666: -- Description: *What happened:* In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Simulated Load Scheduler) framework, a *NullPointerException* is thrown during the teardown process of parameterized tests. This exception is thrown when the stop method is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. This issue occurs under test conditions that involve mismatches between trace types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios where the rm object may not be properly initialized before the stop method is invoked. *Buggy code:* The issue is located in the {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} file within the *{{stop}}* method: {code:java} public void stop() { rm.stop(); } {code} The root cause of the *{{NullPointerException}}* is the lack of a null check for the {{rm}} object before calling its {{stop}} method. Under any condition where the *{{ResourceManager}}* fails to initialize correctly, attempting to stop the *{{ResourceManager}}* leads to a null pointer dereference. After fixing in {{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}} , +[TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169]+ should also be fixed. 
+[TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169]+ {code:java} public void stop() throws InterruptedException { executor.shutdownNow(); executor.awaitTermination(20, TimeUnit.SECONDS); } {code} *How to trigger this bug:* * Change the parameterized unit test's(TestSLSRunner.java) data method to include one/both of the following test cases: * {capScheduler, "SYNTH", rumenTraceFile, nodeFile } * {capScheduler, "SYNTH", slsTraceFile, nodeFile } * Execute the *TestSLSRunner* test suite, particularly the *testSimulatorRunning* method. * Observe the resulting *NullPointerException* in the test output(triggered in RMRunner.java). {panel:title=Example stack trace from the test output:} [ERROR] testSimulatorRunning[Testing with: SYNTH, org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, (nodeFile null)](org.apache.hadoop.yarn.sls.TestSLSRunner) Time elapsed: 3.027 s <<< ERROR! java.lang.NullPointerException at org.apache.hadoop.yarn.sls.RMRunner.stop(RMRunner.java:127) at org.apache.hadoop.yarn.sls.SLSRunner.stop(SLSRunner.java:320) at org.apache.hadoop.yarn.sls.BaseSLSRunnerTest.tearDown(BaseSLSRunnerTest.java:68) ... {panel} *How To Fix* _{color:#172b4d}The bug can be fixed by implementing a null check for the {{rm}} object within the *{{RMRunner.java}}* {{stop}} method before calling any methods on it.(same for executor object in TaskRunner.java){color}_ was: *What happened:* In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Simulated Load Scheduler) framework, a *NullPointerException* is thrown during the teardown process of parameterized tests. This exception is thrown when the stop method is called on the ResourceManager (rm) object in {_}RMRunner.java{_}. 
This issue occurs under test conditions that involve mismatches between trace types (RUMEN, SLS, SYNTH) and their corresponding trace files, leading to scenarios where the rm object may not be properly initialized before the stop method is invoked. *Buggy code:* The issue is located in the *{{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}}* file within the *{{stop}}* method: {code:java} public void stop() { rm.stop(); } {code} The root cause of the *{{NullPointerException}}* is the lack of a null check for the {{rm}} object before calling its {{stop}} method. Under any condition where the *{{ResourceManager}}* fails to initialize correctly, attempting to stop the *{{ResourceManager}}* leads to a null pointer dereference. After fixing in *{{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}}* , +[TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/mai
[jira] [Updated] (YARN-11666) NullPointerException in TestSLSRunner.testSimulatorRunning
[ https://issues.apache.org/jira/browse/YARN-11666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elen Chatikyan updated YARN-11666: -- Description:

*What happened:*
In the *TestSLSRunner* class of the Apache Hadoop YARN SLS (Scheduler Load Simulator) framework, a *NullPointerException* is thrown during the teardown of parameterized tests. The exception is thrown when the stop method is called on the ResourceManager ({{rm}}) object in {_}RMRunner.java{_}. The issue occurs under test conditions that mismatch trace types (RUMEN, SLS, SYNTH) with their corresponding trace files, leaving the {{rm}} object uninitialized when the stop method is invoked.

*Buggy code:*
The issue is located in the *{{[RMRunner.java|https://github.com/apache/hadoop/blob/8b2058a4e755b8ebc081ac67b1b582dd2945e3c6/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/RMRunner.java#L126]}}* {{stop}} method:
{code:java}
public void stop() {
  rm.stop();
}
{code}
The root cause of the *{{NullPointerException}}* is the missing null check on the {{rm}} object before its {{stop}} method is called. Under any condition where the *{{ResourceManager}}* fails to initialize, attempting to stop it dereferences a null pointer. After fixing *{{RMRunner.java}}*, +[TaskRunner.java|https://github.com/apache/hadoop/blob/12a26d8b1987e883efab00c25a0594512527bd1f/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/TaskRunner.java#L169]+ should also be fixed, as its {{executor}} field has the same problem:
{code:java}
public void stop() throws InterruptedException {
  executor.shutdownNow();
  executor.awaitTermination(20, TimeUnit.SECONDS);
}
{code}

*How to trigger this bug:*
* Change the parameterized unit test's (TestSLSRunner.java) data method to include one or both of the following test cases:
** {capScheduler, "SYNTH", rumenTraceFile, nodeFile}
** {capScheduler, "SYNTH", slsTraceFile, nodeFile}
* Execute the *TestSLSRunner* test suite, particularly the *testSimulatorRunning* method.
* Observe the resulting *NullPointerException* in the test output (triggered in RMRunner.java).

{panel:title=Example stack trace from the test output:}
[ERROR] testSimulatorRunning[Testing with: SYNTH, org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler, (nodeFile null)](org.apache.hadoop.yarn.sls.TestSLSRunner) Time elapsed: 3.027 s <<< ERROR!
java.lang.NullPointerException
at org.apache.hadoop.yarn.sls.RMRunner.stop(RMRunner.java:127)
at org.apache.hadoop.yarn.sls.SLSRunner.stop(SLSRunner.java:320)
at org.apache.hadoop.yarn.sls.BaseSLSRunnerTest.tearDown(BaseSLSRunnerTest.java:68)
...
{panel}

*How To Fix*
The bug can be fixed by adding a null check for the {{rm}} object within the *{{RMRunner.java}}* {{stop}} method before calling any methods on it (the same applies to the executor object in TaskRunner.java).
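The suggested null-guard can be sketched as a minimal, self-contained example. Note that {{RMRunnerSketch}} and {{TaskRunnerSketch}} below are hypothetical stand-in classes written for illustration, not the actual Hadoop SLS sources; the real fix would apply the same guards inside RMRunner.java and TaskRunner.java:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for RMRunner: rm stays null when
// ResourceManager initialization fails (e.g. a trace-type mismatch).
class RMRunnerSketch {
    Object rm = null;

    public void stop() {
        if (rm != null) {   // guard prevents the NPE on failed init
            // rm.stop();   // the real call in RMRunner
        }
    }
}

// Hypothetical stand-in for TaskRunner: executor may be null
// if start() never ran before teardown.
class TaskRunnerSketch {
    ExecutorService executor;

    public void stop() throws InterruptedException {
        if (executor != null) {   // same guard for the executor
            executor.shutdownNow();
            executor.awaitTermination(20, TimeUnit.SECONDS);
        }
    }
}

public class NullGuardDemo {
    public static void main(String[] args) throws InterruptedException {
        new RMRunnerSketch().stop();   // no NPE even though rm is null

        TaskRunnerSketch t = new TaskRunnerSketch();
        t.stop();                      // no NPE with a null executor
        t.executor = Executors.newSingleThreadExecutor();
        t.stop();                      // normal shutdown path still works
        System.out.println("ok");
    }
}
```

The guard makes {{stop()}} idempotent and safe to call from test teardown regardless of how far initialization got, which is the usual convention for stop/close methods on Hadoop services.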
[jira] [Created] (YARN-11666) NullPointerException in TestSLSRunner.testSimulatorRunning
Elen Chatikyan created YARN-11666: - Summary: NullPointerException in TestSLSRunner.testSimulatorRunning Key: YARN-11666 URL: https://issues.apache.org/jira/browse/YARN-11666 Project: Hadoop YARN Issue Type: Bug Environment: {*}Operating System{*}: macOS (Sonoma 14.2.1 (23C71)) {*}Hardware{*}: MacBook Air 2023 {*}IDE{*}: IntelliJ IDEA (2023.3.2 (Ultimate Edition)) {*}Java Version{*}: OpenJDK version "1.8.0_292" Reporter: Elen Chatikyan -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5305) Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
[ https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828512#comment-17828512 ]

ASF GitHub Bot commented on YARN-5305:
--------------------------------------

hadoop-yetus commented on PR #6625:
URL: https://github.com/apache/hadoop/pull/6625#issuecomment-2008119232

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 31s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 44s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 32m 9s | | trunk passed |
| +1 :green_heart: | compile | 17m 41s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 16m 15s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 4m 26s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 42s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 14s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 50s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| -1 :x: | spotbugs | 2m 35s | [/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6625/4/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html) | hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings. |
| +1 :green_heart: | shadedclient | 34m 22s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 32s | | the patch passed |
| +1 :green_heart: | compile | 16m 48s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 16m 48s | | the patch passed |
| +1 :green_heart: | compile | 16m 13s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 16m 13s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 21s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6625/4/artifact/out/results-checkstyle-root.txt) | root: The patch generated 1 new + 197 unchanged - 0 fixed = 198 total (was 197) |
| +1 :green_heart: | mvnsite | 2m 42s | | the patch passed |
| +1 :green_heart: | javadoc | 2m 10s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 51s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 4m 32s | | the patch passed |
| +1 :green_heart: | shadedclient | 34m 19s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 19m 40s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 24m 43s | | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 :green_heart: | asflicense | 1m 5s | | The patch does not generate ASF License warnings. |
| | | | 268m 12s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6625/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6625 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux c8c47b10ec91 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 772720878905eb7caa1ca4ca2936d727d54ee7b9 |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828470#comment-17828470 ]

ASF GitHub Bot commented on YARN-11664:
---------------------------------------

shameersss1 commented on PR #6631:
URL: https://github.com/apache/hadoop/pull/6631#issuecomment-2007805123

> -1. Please do not change the following `@Public` and `@Evolving` classes:
>
> * QuotaExceededException.java
> * DSQuotaExceededException.java
>
> https://apache.github.io/hadoop/hadoop-project-dist/hadoop-common/Compatibility.html
>
> Evolving interfaces must not change between minor releases.
>
> Can we use ClusterStorageCapacityExceededException (hadoop-common) instead of DSQuotaExceededException/QuotaExceededException (hadoop-hdfs) in YARN source code?
>
> IOStreamPair.java is `@Private` and I think we can relocate to hadoop-common.

ClusterStorageCapacityExceededException is a parent exception of DSQuotaExceededException, so catching it will serve the purpose as well. I will raise a revision of this change.

> Remove HDFS Binaries/Jars Dependency From YARN
> ----------------------------------------------
>
> Key: YARN-11664
> URL: https://issues.apache.org/jira/browse/YARN-11664
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: yarn
> Reporter: Syed Shameerur Rahman
> Priority: Major
> Labels: pull-request-available
>
> In principle, Hadoop YARN is independent of HDFS: it can work with any filesystem. Currently, however, some YARN code depends on HDFS classes, which forces YARN to bring HDFS binaries/jars onto its classpath. The idea behind this jira is to remove this dependency so that YARN can run without HDFS binaries/jars.
>
> *Scope*
> 1. Non-test classes are considered
> 2. Some test classes that come in as transitive dependencies are considered
>
> *Out of scope*
> 1. Other test classes in the YARN module are not considered
>
> A quick search in the YARN module revealed the following HDFS dependencies:
> 1. Constants
> {code:java}
> import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier;
> import org.apache.hadoop.hdfs.DFSConfigKeys;{code}
> 2. Exception
> {code:java}
> import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;
> import org.apache.hadoop.hdfs.protocol.QuotaExceededException; (Comes as a transitive dependency from DSQuotaExceededException){code}
> 3. Utility
> {code:java}
> import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code}
>
> Both YARN and HDFS depend on the *hadoop-common* module, so one straightforward approach is to move all these dependencies into *hadoop-common*, from which both HDFS and YARN can pick them up.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
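[Editor's note] The substitution shameersss1 proposes works because of ordinary exception-hierarchy subtyping: a `catch` of the hadoop-common parent type also catches the HDFS-specific subclass, so YARN code no longer needs the hadoop-hdfs import at compile time. A minimal, self-contained sketch of the idea — the nested classes below are simplified stand-ins mirroring the real Hadoop class names, not the actual Hadoop implementations:

```java
public class CatchParentDemo {
    // Stand-in for org.apache.hadoop.fs.ClusterStorageCapacityExceededException (hadoop-common)
    static class ClusterStorageCapacityExceededException extends Exception {
        ClusterStorageCapacityExceededException(String msg) { super(msg); }
    }
    // Stand-in for org.apache.hadoop.hdfs.protocol.QuotaExceededException (hadoop-hdfs)
    static class QuotaExceededException extends ClusterStorageCapacityExceededException {
        QuotaExceededException(String msg) { super(msg); }
    }
    // Stand-in for org.apache.hadoop.hdfs.protocol.DSQuotaExceededException (hadoop-hdfs)
    static class DSQuotaExceededException extends QuotaExceededException {
        DSQuotaExceededException(String msg) { super(msg); }
    }

    // Hypothetical caller: an HDFS-backed filesystem throws the subclass,
    // but YARN-side code only ever names the hadoop-common parent type.
    static String handleWrite() {
        try {
            throw new DSQuotaExceededException("disk space quota exceeded");
        } catch (ClusterStorageCapacityExceededException e) {
            // The subclass is caught here via its parent, so no
            // hadoop-hdfs type needs to appear in this source file.
            return "caught: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleWrite());
    }
}
```

Under this scheme the hadoop-hdfs jar is still needed at runtime by whichever filesystem implementation actually throws the subclass, but the YARN sources compile against hadoop-common alone — which is exactly the classpath decoupling the jira is after.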
[jira] [Commented] (YARN-11216) Avoid unnecessary reconstruction of ConfigurationProperties
[ https://issues.apache.org/jira/browse/YARN-11216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17828448#comment-17828448 ]

ASF GitHub Bot commented on YARN-11216:
---------------------------------------

hadoop-yetus commented on PR #4655:
URL: https://github.com/apache/hadoop/pull/4655#issuecomment-2007703551

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 44s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 22s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 36m 21s | | trunk passed |
| +1 :green_heart: | compile | 19m 1s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 17m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 4m 42s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 53s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 17s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 1m 48s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| -1 :x: | spotbugs | 2m 34s | [/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4655/16/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html) | hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings. |
| +1 :green_heart: | shadedclient | 41m 12s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 30s | | Maven dependency ordering for patch |
| -1 :x: | mvninstall | 0m 32s | [/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4655/16/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 :x: | compile | 8m 13s | [/patch-compile-root-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4655/16/artifact/out/patch-compile-root-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | root in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. |
| -1 :x: | javac | 8m 13s | [/patch-compile-root-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4655/16/artifact/out/patch-compile-root-jdkUbuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1.txt) | root in the patch failed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1. |
| -1 :x: | compile | 7m 37s | [/patch-compile-root-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4655/16/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | root in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. |
| -1 :x: | javac | 7m 37s | [/patch-compile-root-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4655/16/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06.txt) | root in the patch failed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06. |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 20s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4655/16/artifact/out/results-checkstyle-root.txt) | root: The patch generated 6 new + 139 unchanged - 0 fixed = 145 total (was 139) |
| -1 :x: | mvnsite | 0m 37s | [/patch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server