[jira] [Updated] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Syed Shameerur Rahman updated YARN-11664:
-----------------------------------------
    Description: 
In principle, Hadoop Yarn is independent of HDFS and can work with any filesystem. Currently, however, some Yarn code depends on HDFS, and this dependency requires Yarn to bring some of the HDFS binaries/jars onto its class path. The idea behind this jira is to remove that dependency so that Yarn can run without the HDFS binaries/jars.

*Scope*
1. Non-test classes are considered.
2. Some test classes that come in as transitive dependencies are considered.

*Out of scope*
1. Test classes in the Yarn module are not considered.

A quick search in the Yarn module revealed the following HDFS dependencies:

1. Constants
{code:java}
import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier;
import org.apache.hadoop.hdfs.DFSConfigKeys;{code}
2. Exception
{code:java}
import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;{code}
3. Utility
{code:java}
import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code}

Both Yarn and HDFS depend on the *hadoop-common* module, so:
* Constants and utility classes can be moved to *hadoop-common*.
* Instead of DSQuotaExceededException, use the parent exception ClusterStorageCapacityExceededException.

  was:
In principle, Hadoop Yarn is independent of HDFS and can work with any filesystem. Currently, however, some Yarn code depends on HDFS, and this dependency requires Yarn to bring some of the HDFS binaries/jars onto its class path. The idea behind this jira is to remove that dependency so that Yarn can run without the HDFS binaries/jars.

*Scope*
1. Non-test classes are considered.
2. Some test classes that come in as transitive dependencies are considered.

*Out of scope*
1. Test classes in the Yarn module are not considered.

A quick search in the Yarn module revealed the following HDFS dependencies:

1. Constants
{code:java}
import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier;
import org.apache.hadoop.hdfs.DFSConfigKeys;{code}
2. Exception
{code:java}
import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;
import org.apache.hadoop.hdfs.protocol.QuotaExceededException; (Comes as a transitive dependency from DSQuotaExceededException){code}
3. Utility
{code:java}
import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code}

Both Yarn and HDFS depend on the *hadoop-common* module. One straightforward approach is to move all these dependencies to the *hadoop-common* module so that both HDFS and Yarn can pick them up.


> Remove HDFS Binaries/Jars Dependency From YARN
> ----------------------------------------------
>
>                 Key: YARN-11664
>                 URL: https://issues.apache.org/jira/browse/YARN-11664
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>            Reporter: Syed Shameerur Rahman
>            Priority: Major
>              Labels: pull-request-available
>
> In principle, Hadoop Yarn is independent of HDFS and can work with any filesystem. Currently, however, some Yarn code depends on HDFS, and this dependency requires Yarn to bring some of the HDFS binaries/jars onto its class path. The idea behind this jira is to remove that dependency so that Yarn can run without the HDFS binaries/jars.
> *Scope*
> 1. Non-test classes are considered.
> 2. Some test classes that come in as transitive dependencies are considered.
> *Out of scope*
> 1. Test classes in the Yarn module are not considered.
>
> A quick search in the Yarn module revealed the following HDFS dependencies:
> 1. Constants
> {code:java}
> import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier;
> import org.apache.hadoop.hdfs.DFSConfigKeys;{code}
> 2. Exception
> {code:java}
> import org.apache.hadoop.hdfs.protocol.DSQuotaExceededException;{code}
> 3. Utility
> {code:java}
> import org.apache.hadoop.hdfs.protocol.datatransfer.IOStreamPair;{code}
>
> Both Yarn and HDFS depend on the *hadoop-common* module, so:
> * Constants and utility classes can be moved to *hadoop-common*.
> * Instead of DSQuotaExceededException, use the parent exception ClusterStorageCapacityExceededException.
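To make the proposed exception swap concrete, here is a minimal, hypothetical sketch (the class QuotaAwareWriter and its write helper are illustrative, not from the patch). It assumes, as the description suggests, that ClusterStorageCapacityExceededException in hadoop-common is a superclass of the HDFS-specific DSQuotaExceededException, so YARN-side code can handle quota overruns without importing anything from the HDFS client jars:

{code:java}
import java.io.IOException;
import java.io.OutputStream;

import org.apache.hadoop.fs.ClusterStorageCapacityExceededException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class QuotaAwareWriter {
  /**
   * Writes data and maps any storage-capacity/quota overrun to a plain
   * IOException, without referencing HDFS-specific exception classes.
   */
  public void write(FileSystem fs, Path path, byte[] data) throws IOException {
    try (OutputStream out = fs.create(path)) {
      out.write(data);
    } catch (ClusterStorageCapacityExceededException e) {
      // Previously this would have been `catch (DSQuotaExceededException e)`,
      // which forces a compile-time dependency on hadoop-hdfs-client.
      throw new IOException("Storage capacity exceeded for " + path, e);
    }
  }
}
{code}

Because the catch is on the hadoop-common parent class, the same handler also covers any other filesystem that signals a capacity overrun through that exception hierarchy.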
[jira] [Created] (YARN-11667) Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
qiuliang created YARN-11667:
-------------------------------

             Summary: Federation: ResourceRequestComparator occurs NPE when using low version of hadoop submit application
                 Key: YARN-11667
                 URL: https://issues.apache.org/jira/browse/YARN-11667
             Project: Hadoop YARN
          Issue Type: Bug
          Components: amrmproxy
    Affects Versions: 3.4.0
            Reporter: qiuliang


When an application is submitted using a lower version of Hadoop, the Resource Request built by the AM has no ExecutionTypeRequest. After the Resource Request is submitted to AMRMProxy, an NPE occurs when AMRMProxy reconstructs the Allocate Request to add the Resource Request to its ask.
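For illustration, a sketch of the kind of null guard that avoids this failure mode. This is hypothetical: the real ResourceRequestComparator compares several fields (priority, resource name, capability, and so on), and the actual fix in the patch may differ. Only the ExecutionTypeRequest step is shown, treating an absent ExecutionTypeRequest from an older client as ordered before a present one:

{code:java}
import org.apache.hadoop.yarn.api.records.ExecutionTypeRequest;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class NullSafeExecutionTypeCompare {
  // Hypothetical helper: requests built by older Hadoop clients may carry
  // no ExecutionTypeRequest at all, so null must be handled before
  // dereferencing, or the comparator throws an NPE inside AMRMProxy.
  static int compareExecutionType(ResourceRequest a, ResourceRequest b) {
    ExecutionTypeRequest ea = a.getExecutionTypeRequest();
    ExecutionTypeRequest eb = b.getExecutionTypeRequest();
    if (ea == null && eb == null) {
      return 0;          // both absent: treat as equal
    }
    if (ea == null) {
      return -1;         // absent sorts before present
    }
    if (eb == null) {
      return 1;
    }
    return ea.getExecutionType().compareTo(eb.getExecutionType());
  }
}
{code}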
[jira] [Commented] (YARN-11664) Remove HDFS Binaries/Jars Dependency From YARN
[ https://issues.apache.org/jira/browse/YARN-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829283#comment-17829283 ]

ASF GitHub Bot commented on YARN-11664:
---------------------------------------

hadoop-yetus commented on PR #6631:
URL: https://github.com/apache/hadoop/pull/6631#issuecomment-2010451899

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 0m 31s | | Docker mode activated. |
_ Prechecks _
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. |
_ trunk Compile Tests _
| +0 :ok: | mvndep | 14m 23s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 31m 49s | | trunk passed |
| +1 :green_heart: | compile | 18m 14s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 16m 42s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | checkstyle | 4m 20s | | trunk passed |
| +1 :green_heart: | mvnsite | 6m 59s | | trunk passed |
| +1 :green_heart: | javadoc | 5m 47s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 6m 1s | | trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| -1 :x: | spotbugs | 2m 54s | [/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/4/artifact/out/branch-spotbugs-hadoop-hdfs-project_hadoop-hdfs-client-warnings.html) | hadoop-hdfs-project/hadoop-hdfs-client in trunk has 1 extant spotbugs warnings. |
| -1 :x: | spotbugs | 1m 9s | [/branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/4/artifact/out/branch-spotbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-core-warnings.html) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core in trunk has 1 extant spotbugs warnings. |
| +1 :green_heart: | shadedclient | 34m 29s | | branch has no errors when building and testing our client artifacts. |
_ Patch Compile Tests _
| +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 4m 32s | | the patch passed |
| +1 :green_heart: | compile | 16m 51s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 16m 51s | | the patch passed |
| +1 :green_heart: | compile | 16m 55s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | javac | 16m 55s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 29s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6631/4/artifact/out/results-checkstyle-root.txt) | root: The patch generated 5 new + 528 unchanged - 2 fixed = 533 total (was 530) |
| +1 :green_heart: | mvnsite | 7m 22s | | the patch passed |
| +1 :green_heart: | javadoc | 5m 52s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 6m 20s | | the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08 |
| +1 :green_heart: | spotbugs | 14m 54s | | the patch passed |
| +1 :green_heart: | shadedclient | 34m 50s | | patch has no errors when building and testing our client artifacts. |
_ Other Tests _
| +1 :green_heart: | unit | 19m 16s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 41s | | hadoop-hdfs-client in the patch passed. |
| +1 :green_heart: | unit | 225m 0s | | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | unit | 6m 7s | | hadoop-yarn-common in the patch passed. |
| +1 :green_heart: | unit | 24m 39s | | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 :green_heart: | unit | 21m 26s | | hadoop-yarn-services-core in the patch passed. |
| +1 :green_heart: | asflicense | 1m 14s |
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829262#comment-17829262 ]

ASF GitHub Bot commented on YARN-11626:
---------------------------------------

hadoop-yetus commented on PR #6616:
URL: https://github.com/apache/hadoop/pull/6616#issuecomment-2010221243

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 6m 33s | | Docker mode activated. |
_ Prechecks _
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
_ trunk Compile Tests _
| +1 :green_heart: | mvninstall | 32m 32s | | trunk passed |
| +1 :green_heart: | compile | 0m 33s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | compile | 0m 30s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | checkstyle | 0m 29s | | trunk passed |
| +1 :green_heart: | mvnsite | 0m 33s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 35s | | trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 29s | | trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 7s | | trunk passed |
| +1 :green_heart: | shadedclient | 20m 1s | | branch has no errors when building and testing our client artifacts. |
_ Patch Compile Tests _
| +1 :green_heart: | mvninstall | 0m 27s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | javac | 0m 27s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/4/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 5 unchanged - 0 fixed = 8 total (was 5) |
| +1 :green_heart: | mvnsite | 0m 26s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 23s | | the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 |
| +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| +1 :green_heart: | spotbugs | 1m 6s | | the patch passed |
| +1 :green_heart: | shadedclient | 20m 8s | | patch has no errors when building and testing our client artifacts. |
_ Other Tests _
| +1 :green_heart: | unit | 89m 34s | | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 :green_heart: | asflicense | 0m 24s | | The patch does not generate ASF License warnings. |
| | | | 179m 40s | | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6616/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/6616 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle |
| uname | Linux 00b3366602f7 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 725bb7fd54d8c2d821e7b38df2a3358678c71b9c |
| Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
| Test Results |
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829195#comment-17829195 ]

ASF GitHub Bot commented on YARN-11626:
---------------------------------------

XbaoWu commented on code in PR #6616:
URL: https://github.com/apache/hadoop/pull/6616#discussion_r1532220247

## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java: ##

@@ -1441,6 +1441,29 @@ void delete(final String path) throws Exception {
     zkManager.delete(path);
   }

+  /**
+   * Deletes the path more safe.
+   * When NNE is encountered, if the node does not exist,

Review Comment:
   > Could you expand NNE in the javadoc for brevity?

   Okay, thank you for your reminder

> Optimization of the safeDelete operation in ZKRMStateStore
> ----------------------------------------------------------
>
>                 Key: YARN-11626
>                 URL: https://issues.apache.org/jira/browse/YARN-11626
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 3.0.0-alpha4, 3.1.1, 3.3.0
>            Reporter: wangzhihui
>            Priority: Minor
>              Labels: pull-request-available
>
> h1. Description
> * It can be observed that removing the app info started at 06:17:20, but the NoNodeException was received at 06:17:35.
> * During the 15s interval, Curator was retrying the metadata operation. Due to the non-idempotent nature of the ZooKeeper deletion operation, in one of the retry attempts the operation succeeded on the server but no response was received. The next retry then resulted in a NoNodeException, triggering the STATE_STORE_FENCED event and ultimately causing the current ResourceManager to switch to standby.
> {code:java}
> 2023-10-28 06:17:20,359 INFO recovery.RMStateStore (RMStateStore.java:transition(333)) - Removing info for app: application_1697410508608_140368
> 2023-10-28 06:17:20,359 INFO resourcemanager.RMAppManager (RMAppManager.java:checkAppNumCompletedLimit(303)) - Application should be expired, max number of completed apps kept in memory met: maxCompletedAppsInMemory = 1000, removing app application_1697410508608_140368 from memory:
> 2023-10-28 06:17:35,665 ERROR recovery.RMStateStore (RMStateStore.java:transition(337)) - Error removing app: application_1697410508608_140368
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> 2023-10-28 06:17:35,666 INFO recovery.RMStateStore (RMStateStore.java:handleStoreEvent(1147)) - RMStateStore state change from ACTIVE to FENCED
> 2023-10-28 06:17:35,666 ERROR resourcemanager.ResourceManager (ResourceManager.java:handle(898)) - Received RMFatalEvent of type STATE_STORE_FENCED, caused by org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
> 2023-10-28 06:17:35,666 INFO resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1309)) - Transitioning to standby state
> {code}
> h1. Solution
> The NoNodeException clearly indicates that the znode no longer exists, so we can safely ignore this exception and avoid the larger cluster impact of a ResourceManager failover.
> h1. Other
> We also need to discuss and optimize the same issue in safeCreate.
[jira] [Commented] (YARN-11626) Optimization of the safeDelete operation in ZKRMStateStore
[ https://issues.apache.org/jira/browse/YARN-11626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829191#comment-17829191 ]

ASF GitHub Bot commented on YARN-11626:
---------------------------------------

dineshchitlangia commented on code in PR #6616:
URL: https://github.com/apache/hadoop/pull/6616#discussion_r1532190511

## hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java: ##

@@ -1441,6 +1441,29 @@ void delete(final String path) throws Exception {
     zkManager.delete(path);
   }

+  /**
+   * Deletes the path more safe.
+   * When NNE is encountered, if the node does not exist,

Review Comment:
   Could you expand NNE in the javadoc for brevity?

> Optimization of the safeDelete operation in ZKRMStateStore
> ----------------------------------------------------------
>
>                 Key: YARN-11626
>                 URL: https://issues.apache.org/jira/browse/YARN-11626
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 3.0.0-alpha4, 3.1.1, 3.3.0
>            Reporter: wangzhihui
>            Priority: Minor
>              Labels: pull-request-available
>
> h1. Description
> * It can be observed that removing the app info started at 06:17:20, but the NoNodeException was received at 06:17:35.
> * During the 15s interval, Curator was retrying the metadata operation. Due to the non-idempotent nature of the ZooKeeper deletion operation, in one of the retry attempts the operation succeeded on the server but no response was received. The next retry then resulted in a NoNodeException, triggering the STATE_STORE_FENCED event and ultimately causing the current ResourceManager to switch to standby.
> {code:java}
> 2023-10-28 06:17:20,359 INFO recovery.RMStateStore (RMStateStore.java:transition(333)) - Removing info for app: application_1697410508608_140368
> 2023-10-28 06:17:20,359 INFO resourcemanager.RMAppManager (RMAppManager.java:checkAppNumCompletedLimit(303)) - Application should be expired, max number of completed apps kept in memory met: maxCompletedAppsInMemory = 1000, removing app application_1697410508608_140368 from memory:
> 2023-10-28 06:17:35,665 ERROR recovery.RMStateStore (RMStateStore.java:transition(337)) - Error removing app: application_1697410508608_140368
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> 2023-10-28 06:17:35,666 INFO recovery.RMStateStore (RMStateStore.java:handleStoreEvent(1147)) - RMStateStore state change from ACTIVE to FENCED
> 2023-10-28 06:17:35,666 ERROR resourcemanager.ResourceManager (ResourceManager.java:handle(898)) - Received RMFatalEvent of type STATE_STORE_FENCED, caused by org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode
> 2023-10-28 06:17:35,666 INFO resourcemanager.ResourceManager (ResourceManager.java:transitionToStandby(1309)) - Transitioning to standby state
> {code}
> h1. Solution
> The NoNodeException clearly indicates that the znode no longer exists, so we can safely ignore this exception and avoid the larger cluster impact of a ResourceManager failover.
> h1. Other
> We also need to discuss and optimize the same issue in safeCreate.
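Per the diff context and the Solution section above, the change amounts to treating "node already gone" as success on delete. A sketch of the idea, written as a method fragment in the style of the quoted ZKRMStateStore code (zkManager and LOG are the existing helper and logger there; the exact method body in the patch may differ):

{code:java}
import org.apache.zookeeper.KeeperException;

// Because a ZooKeeper delete is not idempotent, a Curator retry can
// succeed server-side while the client times out; the follow-up retry
// then sees NoNodeException even though the delete actually worked.
// Swallowing that exception avoids fencing the RM state store over an
// already-deleted znode.
void safeDelete(final String path) throws Exception {
  try {
    zkManager.delete(path);
  } catch (KeeperException.NoNodeException nne) {
    LOG.info("Node " + path + " doesn't exist, skipping delete", nne);
  }
}
{code}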
[jira] [Resolved] (YARN-5305) Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
[ https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Teke resolved YARN-5305.
---------------------------------
    Fix Version/s: 3.5.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-5305
>                 URL: https://issues.apache.org/jira/browse/YARN-5305
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Xianyin Xin
>            Assignee: Peter Szucs
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.5.0
>
> Different from YARN-5098 and YARN-5302, this problem happens when the AM submits a startContainer request with a new HDFS token (say, tokenB) which is not managed by YARN, so two tokens exist in the user's credentials on the NM: one is tokenB, the other is the one renewed by the RM (tokenA). If tokenB is selected when connecting to HDFS and tokenB has expired, an exception occurs.
> Supplementary: this problem happens because the AM didn't use the service name as the token alias in the credentials, so two tokens for the same service can co-exist in one credentials object. A TokenSelector can only select the first matched token; it doesn't care whether the token is valid or not.
[jira] [Commented] (YARN-5305) Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
[ https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829190#comment-17829190 ]

ASF GitHub Bot commented on YARN-5305:
--------------------------------------

brumi1024 merged PR #6625:
URL: https://github.com/apache/hadoop/pull/6625

> Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-5305
>                 URL: https://issues.apache.org/jira/browse/YARN-5305
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Xianyin Xin
>            Assignee: Peter Szucs
>            Priority: Major
>              Labels: pull-request-available
>
> Different from YARN-5098 and YARN-5302, this problem happens when the AM submits a startContainer request with a new HDFS token (say, tokenB) which is not managed by YARN, so two tokens exist in the user's credentials on the NM: one is tokenB, the other is the one renewed by the RM (tokenA). If tokenB is selected when connecting to HDFS and tokenB has expired, an exception occurs.
> Supplementary: this problem happens because the AM didn't use the service name as the token alias in the credentials, so two tokens for the same service can co-exist in one credentials object. A TokenSelector can only select the first matched token; it doesn't care whether the token is valid or not.
[jira] [Commented] (YARN-5305) Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
[ https://issues.apache.org/jira/browse/YARN-5305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829189#comment-17829189 ]

ASF GitHub Bot commented on YARN-5305:
--------------------------------------

brumi1024 commented on PR #6625:
URL: https://github.com/apache/hadoop/pull/6625#issuecomment-2009719040

   Thanks @p-szucs for the patch, @K0K0V0K for the review. The spotbugs warning seems unrelated, merging to trunk.

> Yarn Application Log Aggregation fails due to NM can not get correct HDFS delegation token III
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-5305
>                 URL: https://issues.apache.org/jira/browse/YARN-5305
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>            Reporter: Xianyin Xin
>            Assignee: Peter Szucs
>            Priority: Major
>              Labels: pull-request-available
>
> Different from YARN-5098 and YARN-5302, this problem happens when the AM submits a startContainer request with a new HDFS token (say, tokenB) which is not managed by YARN, so two tokens exist in the user's credentials on the NM: one is tokenB, the other is the one renewed by the RM (tokenA). If tokenB is selected when connecting to HDFS and tokenB has expired, an exception occurs.
> Supplementary: this problem happens because the AM didn't use the service name as the token alias in the credentials, so two tokens for the same service can co-exist in one credentials object. A TokenSelector can only select the first matched token; it doesn't care whether the token is valid or not.
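The supplementary note in the description points at the root cause: tokens keyed by an arbitrary alias rather than by service name. A hypothetical illustration (the class and method names here are ours, not from the patch) of why the alias passed to Credentials.addToken matters:

{code:java}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

public class TokenAliasExample {
  /**
   * Adds a delegation token keyed by its service name. If a caller instead
   * uses an arbitrary alias, e.g. creds.addToken(new Text("my-hdfs-token"), t),
   * a second token for the same service can coexist with the RM-renewed one,
   * and a TokenSelector may pick whichever matches first, even if it has
   * expired.
   */
  public static void addByServiceName(Credentials creds,
      Token<? extends TokenIdentifier> token) {
    // Keying by service name makes a newer token for the same service
    // replace the older entry instead of duplicating it.
    creds.addToken(token.getService(), token);
  }
}
{code}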