[jira] [Commented] (YARN-8631) YARN RM fails to add the application to the delegation token renewer on recovery
[ https://issues.apache.org/jira/browse/YARN-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255168#comment-17255168 ]

Shen Yinjie commented on YARN-8631:
---
[~gandras] [~umittal] Is there any progress?

> YARN RM fails to add the application to the delegation token renewer on recovery
>
> Key: YARN-8631
> URL: https://issues.apache.org/jira/browse/YARN-8631
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.1.0
> Reporter: Sanjay Divgi
> Assignee: Umesh Mittal
> Priority: Blocker
> Attachments: YARN-8631.001.patch, hadoop-yarn-resourcemanager-ctr-e138-1518143905142-429059-01-04.log
>
> On an HA cluster we have observed that the YARN ResourceManager fails to add the application to the delegation token renewer on recovery.
> Below is the error:
> {code:java}
> 2018-08-07 08:41:23,850 INFO security.DelegationTokenRenewer (DelegationTokenRenewer.java:renewToken(635)) - Renewed delegation-token= [Kind: TIMELINE_DELEGATION_TOKEN, Service: 172.27.84.192:8188, Ident: (TIMELINE_DELEGATION_TOKEN owner=hrt_qa_hive_spark, renewer=yarn, realUser=, issueDate=1533624642302, maxDate=1534229442302, sequenceNumber=18, masterKeyId=4);exp=1533717683478; apps=[application_1533623972681_0001]]
> 2018-08-07 08:41:23,855 WARN security.DelegationTokenRenewer (DelegationTokenRenewer.java:handleDTRenewerAppRecoverEvent(955)) - Unable to add the application to the delegation token renewer on recovery.
> java.lang.NullPointerException
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:522)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleDTRenewerAppRecoverEvent(DelegationTokenRenewer.java:953)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:79)
>   at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:912)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10550) Decouple NM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10550:
--
Attachment: YARN-10550.001.patch

> Decouple NM runner logic from SLSRunner
> ---
>
> Key: YARN-10550
> URL: https://issues.apache.org/jira/browse/YARN-10550
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
> Attachments: YARN-10550.001.patch
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> The NM runner logic could be decoupled.
[jira] [Updated] (YARN-10549) Decouple RM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10549:
--
Attachment: YARN-10549.001.patch

> Decouple RM runner logic from SLSRunner
> ---
>
> Key: YARN-10549
> URL: https://issues.apache.org/jira/browse/YARN-10549
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
> Attachments: YARN-10549.001.patch
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> The RM runner logic could be decoupled.
[jira] [Commented] (YARN-10547) Decouple job parsing logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255061#comment-17255061 ]

Hadoop QA commented on YARN-10547:
--
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 48s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 22s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 54s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 49s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 18s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/417/artifact/out/diff-checkstyle-hadoop-tools_hadoop-sls.txt{color} | {color:orange} hadoop-tools/hadoop-sls: The patch generated 76 new + 47 unchanged - 3 fixed = 123 total (was 50) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/417/artifact/out/whitespace-eol.txt{color} | {color:red} The patch has 27 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} The patch has no ill-formed XML file. {color} |
[jira] [Commented] (YARN-10547) Decouple job parsing logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255055#comment-17255055 ]

Hadoop QA commented on YARN-10547:
--
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 32m 50s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 53s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 48s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 0m 49s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 22s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-mvninstall-hadoop-tools_hadoop-sls.txt{color} | {color:red} hadoop-sls in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 22s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt{color} | {color:red} hadoop-sls in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 22s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt{color} | {color:red} hadoop-sls in the patch failed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 19s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt{color} | {color:red} hadoop-sls in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 19s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt{color} | {color:red} hadoop-sls in the patch failed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 17s{color} |
[jira] [Created] (YARN-10550) Decouple NM runner logic from SLSRunner
Szilard Nemeth created YARN-10550:
-
Summary: Decouple NM runner logic from SLSRunner
Key: YARN-10550
URL: https://issues.apache.org/jira/browse/YARN-10550
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth

SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
The RM runner logic could be decoupled.
[jira] [Updated] (YARN-10550) Decouple NM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10550:
--
Description:
SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
The NM runner logic could be decoupled.

was:
SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
The RM runner logic could be decoupled.

> Decouple NM runner logic from SLSRunner
> ---
>
> Key: YARN-10550
> URL: https://issues.apache.org/jira/browse/YARN-10550
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> The NM runner logic could be decoupled.
[jira] [Created] (YARN-10549) Decouple RM runner logic from SLSRunner
Szilard Nemeth created YARN-10549:
-
Summary: Decouple RM runner logic from SLSRunner
Key: YARN-10549
URL: https://issues.apache.org/jira/browse/YARN-10549
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth

SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
The AM runner logic could be decoupled.
[jira] [Updated] (YARN-10549) Decouple RM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10549:
--
Description:
SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
The RM runner logic could be decoupled.

was:
SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
The AM runner logic could be decoupled.

> Decouple RM runner logic from SLSRunner
> ---
>
> Key: YARN-10549
> URL: https://issues.apache.org/jira/browse/YARN-10549
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> The RM runner logic could be decoupled.
[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10548:
--
Attachment: YARN-10548.001.patch

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
> Attachments: YARN-10548.001.patch
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> The AM runner logic could be decoupled.
[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10548:
--
Attachment: (was: YARN-10548.001.patch)

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> The AM runner logic could be decoupled.
[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10548:
--
Attachment: YARN-10548.001.patch

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> The AM runner logic could be decoupled.
[jira] [Updated] (YARN-10547) Decouple job parsing logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10547:
--
Attachment: YARN-10547.002.patch

> Decouple job parsing logic from SLSRunner
> -
>
> Key: YARN-10547
> URL: https://issues.apache.org/jira/browse/YARN-10547
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
> Attachments: YARN-10547.001.patch, YARN-10547.002.patch
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> As a first step, the job parser logic could be decoupled from this class.
> There are 3 types of inputs:
> - SLS trace
> - Synth
> - Rumen
> Their job parsing methods are:
> - SLS trace: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
> - Synth: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
> - Rumen: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716
[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10548:
--
Description:
SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
The AM runner logic could be decoupled.

was:
SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
As a first step, the job parser logic could be decoupled from this class.
There are 3 types of inputs:
- SLS trace
- Synth
- Rumen
Their job parsing methods are:
- SLS trace: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
- Synth: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
- Rumen: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> The AM runner logic could be decoupled.
[jira] [Created] (YARN-10548) CLONE - Decouple job parsing logic from SLSRunner
Szilard Nemeth created YARN-10548:
-
Summary: CLONE - Decouple job parsing logic from SLSRunner
Key: YARN-10548
URL: https://issues.apache.org/jira/browse/YARN-10548
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth

SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
As a first step, the job parser logic could be decoupled from this class.
There are 3 types of inputs:
- SLS trace
- Synth
- Rumen
Their job parsing methods are:
- SLS trace: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
- Synth: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
- Rumen: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716
[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10548:
--
Summary: Decouple AM runner logic from SLSRunner (was: CLONE - Decouple job parsing logic from SLSRunner)

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> As a first step, the job parser logic could be decoupled from this class.
> There are 3 types of inputs:
> - SLS trace
> - Synth
> - Rumen
> Their job parsing methods are:
> - SLS trace: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
> - Synth: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
> - Rumen: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716
[jira] [Updated] (YARN-10547) Decouple job parsing logic from SLSRunner
[ https://issues.apache.org/jira/browse/YARN-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szilard Nemeth updated YARN-10547:
--
Attachment: YARN-10547.001.patch

> Decouple job parsing logic from SLSRunner
> -
>
> Key: YARN-10547
> URL: https://issues.apache.org/jira/browse/YARN-10547
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Minor
> Attachments: YARN-10547.001.patch
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
> As a first step, the job parser logic could be decoupled from this class.
> There are 3 types of inputs:
> - SLS trace
> - Synth
> - Rumen
> Their job parsing methods are:
> - SLS trace: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
> - Synth: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
> - Rumen: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716
[jira] [Created] (YARN-10547) Decouple job parsing logic from SLSRunner
Szilard Nemeth created YARN-10547:
-
Summary: Decouple job parsing logic from SLSRunner
Key: YARN-10547
URL: https://issues.apache.org/jira/browse/YARN-10547
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth

SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch the AMs and task containers.
As a first step, the job parser logic could be decoupled from this class.
There are 3 types of inputs:
- SLS trace
- Synth
- Rumen
Their job parsing methods are:
- SLS trace: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
- Synth: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
- Rumen: https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716
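The first step proposed in YARN-10547, separating job parsing from launching, might look roughly like the sketch below. It is an illustration under assumptions and does not reflect the attached patches: `InputType`, `ParsedJob`, `JobParser`, and `JobParserSketch` are hypothetical names, and the parser bodies are stubs that only trim a record into a job id instead of reading real SLS trace, Synth, or Rumen input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

final class JobParserSketch {
    /** The three input kinds listed in the issue description (hypothetical enum). */
    enum InputType { SLS_TRACE, SYNTH, RUMEN }

    /** Hypothetical value object; a real simulated job carries far more detail. */
    static final class ParsedJob {
        final String id;
        final InputType source;

        ParsedJob(String id, InputType source) {
            this.id = id;
            this.source = source;
        }
    }

    /** Hypothetical parsing-only contract: no AM or container launching here. */
    interface JobParser {
        List<ParsedJob> parse(List<String> rawRecords);
    }

    /** Stub parser: treats each raw record as a job id. Real parsers would differ per format. */
    static JobParser stubParser(final InputType type) {
        return rawRecords -> {
            List<ParsedJob> jobs = new ArrayList<>();
            for (String record : rawRecords) {
                jobs.add(new ParsedJob(record.trim(), type));
            }
            return jobs;
        };
    }

    /** One parser per input type, so the runner only has to pick one. */
    private static final Map<InputType, JobParser> PARSERS = new EnumMap<>(InputType.class);

    static {
        for (InputType type : InputType.values()) {
            PARSERS.put(type, stubParser(type));
        }
    }

    /** What a slimmed-down runner would call instead of parsing inline. */
    static List<ParsedJob> parseJobs(InputType type, List<String> rawRecords) {
        return PARSERS.get(type).parse(rawRecords);
    }

    public static void main(String[] args) {
        for (ParsedJob job : parseJobs(InputType.RUMEN, Arrays.asList(" job_1 ", "job_2"))) {
            System.out.println(job.source + " " + job.id);
        }
    }
}
```

With this shape, the runner would depend only on `parseJobs`, and each input format could be tested without starting AMs or task containers.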
[jira] [Commented] (YARN-10501) Can't remove all node labels after add node label without nodemanager port
[ https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254983#comment-17254983 ]

caozhiqiang commented on YARN-10501:

[~ebadger], thank you. I'll try to answer your questions.

*First, we use the yarn rmadmin -replaceLabelsOnNode command both to add labels for the first time and to replace labels on a host/NM, and all of this processing happens in*
{code:java}
case REPLACE:{code}
1a. I am also confused about why a node without a port is added to the map. It may only be intended to let users use port 0 and show its information. The Hadoop documentation says the same: [nodelabels|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeLabel.html]
{code:java}
Executing yarn rmadmin -replaceLabelsOnNode “node1[:port]=label1 node2=label2” [-failOnUnknownNodes]. Added label1 to node1, label2 to node2. If user don’t specify port, it adds the label to all NodeManagers running on the node. If option -failOnUnknownNodes is set, this command will fail if specified nodes are unknown.
{code}
1b. If the user doesn't specify a port, the label is added to all NodeManagers running on the node, so the host's labels are the labels of each NM on that host. Each NM also needs its own labels. Another reason is that when a host has no NMs, we can fall back to the host's labels.
{code:java}
protected Set<String> getLabelsByNode(NodeId nodeId, Map<String, Host> map) {
  Host host = map.get(nodeId.getHost());
  if (null == host) {
    return EMPTY_STRING_SET;
  }
  Node nm = host.nms.get(nodeId);
  if (null != nm && null != nm.labels) {
    return nm.labels;
  } else {
    return host.labels;
  }
}
{code}
1c. Before labels are first added to a host/node, the node's labels are null. By the time "case ADD:" runs, the host should already have been initialized via "case REPLACE:", I think.

2a and 2b are the same as 1.

2c and 2d. If the port is 0, the host's labels are the same as the labels of the nodes on this host. *Then the host's labels are added to the nodes' labels.*

2e and 2f.
*Set the labels to null for each node, then set the new labels on the node asynchronously with the code below.*
{code:java}
newNMToLabels.put(nodeId, host.labels);
...
dispatcher.getEventHandler().handle(
    new UpdateNodeToLabelsMappingsEvent(newNMToLabels));
...
public Node copy() {
  Node c = new Node(nodeId);
  if (labels != null) {
    c.labels = Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    c.labels.addAll(labels);
  } else {
    c.labels = null;
  }
  c.resource = Resources.clone(resource);
  c.running = running;
  return c;
}
}{code}
Could [~leftnoteasy], [~varunsaxena] and [~sunilg] give more information about these processes?

> Can't remove all node labels after add node label without nodemanager port
> --
>
> Key: YARN-10501
> URL: https://issues.apache.org/jira/browse/YARN-10501
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.4.0
> Reporter: caozhiqiang
> Assignee: caozhiqiang
> Priority: Critical
> Attachments: YARN-10501.002.patch, YARN-10501.003.patch
>
> When a label is added to nodes without a nodemanager port, or with the WILDCARD_PORT (0) port, not all label info can be removed from these nodes.
> Steps to reproduce:
> {code:java}
> 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)"
> 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode"
> 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
> {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}}
> 4.yarn rmadmin -replaceLabelsOnNode "server001"
> 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
> {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}}
> {code}
> You can see that after step 4, which removes the nodemanager labels, the label info is still present in the node info.
> {code:java}
> case REPLACE: