[jira] [Commented] (YARN-8631) YARN RM fails to add the application to the delegation token renewer on recovery

2020-12-26 Thread Shen Yinjie (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255168#comment-17255168
 ] 

Shen Yinjie commented on YARN-8631:
---

[~gandras] [~umittal] Is there any progress?

> YARN RM fails to add the application to the delegation token renewer on 
> recovery
> 
>
> Key: YARN-8631
> URL: https://issues.apache.org/jira/browse/YARN-8631
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.1.0
>Reporter: Sanjay Divgi
>Assignee: Umesh Mittal
>Priority: Blocker
> Attachments: YARN-8631.001.patch, 
> hadoop-yarn-resourcemanager-ctr-e138-1518143905142-429059-01-04.log
>
>
> On an HA cluster, we have observed that the YARN ResourceManager fails to add the 
> application to the delegation token renewer on recovery.
> Below is the error:
> {code:java}
> 2018-08-07 08:41:23,850 INFO security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:renewToken(635)) - Renewed delegation-token= 
> [Kind: TIMELINE_DELEGATION_TOKEN, Service: 172.27.84.192:8188, Ident: 
> (TIMELINE_DELEGATION_TOKEN owner=hrt_qa_hive_spark, renewer=yarn, realUser=, 
> issueDate=1533624642302, maxDate=1534229442302, sequenceNumber=18, 
> masterKeyId=4);exp=1533717683478; apps=[application_1533623972681_0001]]
> 2018-08-07 08:41:23,855 WARN security.DelegationTokenRenewer 
> (DelegationTokenRenewer.java:handleDTRenewerAppRecoverEvent(955)) - Unable to 
> add the application to the delegation token renewer on recovery.
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:522)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleDTRenewerAppRecoverEvent(DelegationTokenRenewer.java:953)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$700(DelegationTokenRenewer.java:79)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:912)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10550) Decouple NM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10550:
--
Attachment: YARN-10550.001.patch

> Decouple NM runner logic from SLSRunner
> ---
>
> Key: YARN-10550
> URL: https://issues.apache.org/jira/browse/YARN-10550
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-10550.001.patch
>
>
> SLSRunner has too many responsibilities.
>  One of them is to parse the job details from the SLS input formats and 
> launch the AMs and task containers.
>  The NM runner logic could be decoupled.






[jira] [Updated] (YARN-10549) Decouple RM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10549:
--
Attachment: YARN-10549.001.patch

> Decouple RM runner logic from SLSRunner
> ---
>
> Key: YARN-10549
> URL: https://issues.apache.org/jira/browse/YARN-10549
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-10549.001.patch
>
>
> SLSRunner has too many responsibilities.
>  One of them is to parse the job details from the SLS input formats and 
> launch the AMs and task containers.
>  The RM runner logic could be decoupled.






[jira] [Commented] (YARN-10547) Decouple job parsing logic from SLSRunner

2020-12-26 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255061#comment-17255061
 ] 

Hadoop QA commented on YARN-10547:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
48s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 2 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
22s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 54s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
49s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
22s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 18s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/417/artifact/out/diff-checkstyle-hadoop-tools_hadoop-sls.txt{color}
 | {color:orange} hadoop-tools/hadoop-sls: The patch generated 76 new + 47 
unchanged - 3 fixed = 123 total (was 50) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/417/artifact/out/whitespace-eol.txt{color}
 | {color:red} The patch has 27 line(s) that end in whitespace. Use git apply 
--whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} The patch has no ill-formed 
XML file. {color} |
| 

[jira] [Commented] (YARN-10547) Decouple job parsing logic from SLSRunner

2020-12-26 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17255055#comment-17255055
 ] 

Hadoop QA commented on YARN-10547:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 32m 
50s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 2 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
53s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 48s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
49s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
22s{color} | 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-mvninstall-hadoop-tools_hadoop-sls.txt{color}
 | {color:red} hadoop-sls in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
22s{color} | 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt{color}
 | {color:red} hadoop-sls in the patch failed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 22s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt{color}
 | {color:red} hadoop-sls in the patch failed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
19s{color} | 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt{color}
 | {color:red} hadoop-sls in the patch failed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 19s{color} 
| 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/416/artifact/out/patch-compile-hadoop-tools_hadoop-sls-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt{color}
 | {color:red} hadoop-sls in the patch failed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 17s{color} | 

[jira] [Created] (YARN-10550) Decouple NM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)
Szilard Nemeth created YARN-10550:
-

 Summary: Decouple NM runner logic from SLSRunner
 Key: YARN-10550
 URL: https://issues.apache.org/jira/browse/YARN-10550
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth


SLSRunner has too many responsibilities.
 One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
 The RM runner logic could be decoupled.






[jira] [Updated] (YARN-10550) Decouple NM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10550:
--
Description: 
SLSRunner has too many responsibilities.
 One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
 The NM runner logic could be decoupled.

  was:
SLSRunner has too many responsibilities.
 One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
 The RM runner logic could be decoupled.


> Decouple NM runner logic from SLSRunner
> ---
>
> Key: YARN-10550
> URL: https://issues.apache.org/jira/browse/YARN-10550
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
>
> SLSRunner has too many responsibilities.
>  One of them is to parse the job details from the SLS input formats and 
> launch the AMs and task containers.
>  The NM runner logic could be decoupled.






[jira] [Created] (YARN-10549) Decouple RM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)
Szilard Nemeth created YARN-10549:
-

 Summary: Decouple RM runner logic from SLSRunner
 Key: YARN-10549
 URL: https://issues.apache.org/jira/browse/YARN-10549
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth


SLSRunner has too many responsibilities.
 One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
 The AM runner logic could be decoupled.






[jira] [Updated] (YARN-10549) Decouple RM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10549:
--
Description: 
SLSRunner has too many responsibilities.
 One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
 The RM runner logic could be decoupled.

  was:
SLSRunner has too many responsibilities.
 One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
 The AM runner logic could be decoupled.


> Decouple RM runner logic from SLSRunner
> ---
>
> Key: YARN-10549
> URL: https://issues.apache.org/jira/browse/YARN-10549
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
>
> SLSRunner has too many responsibilities.
>  One of them is to parse the job details from the SLS input formats and 
> launch the AMs and task containers.
>  The RM runner logic could be decoupled.






[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10548:
--
Attachment: YARN-10548.001.patch

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-10548.001.patch
>
>
> SLSRunner has too many responsibilities.
>  One of them is to parse the job details from the SLS input formats and 
> launch the AMs and task containers.
>  The AM runner logic could be decoupled.






[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10548:
--
Attachment: (was: YARN-10548.001.patch)

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
>
> SLSRunner has too many responsibilities.
>  One of them is to parse the job details from the SLS input formats and 
> launch the AMs and task containers.
>  The AM runner logic could be decoupled.






[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10548:
--
Attachment: YARN-10548.001.patch

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
>
> SLSRunner has too many responsibilities.
>  One of them is to parse the job details from the SLS input formats and 
> launch the AMs and task containers.
>  The AM runner logic could be decoupled.






[jira] [Updated] (YARN-10547) Decouple job parsing logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10547:
--
Attachment: YARN-10547.002.patch

> Decouple job parsing logic from SLSRunner
> -
>
> Key: YARN-10547
> URL: https://issues.apache.org/jira/browse/YARN-10547
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-10547.001.patch, YARN-10547.002.patch
>
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch 
> the AMs and task containers.
> As a first step, the job parser logic could be decoupled from this class.
> There are 3 types of inputs: 
> - SLS trace
> - Synth
> - Rumen
> Their job parsing methods are: 
> - SLS trace: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
> - Synth: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
> - Rumen: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716






[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10548:
--
Description: 
SLSRunner has too many responsibilities.
 One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
 The AM runner logic could be decoupled.

  was:
SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
As a first step, the job parser logic could be decoupled from this class.

There are 3 types of inputs: 
- SLS trace
- Synth
- Rumen

Their job parsing methods are: 
- SLS trace: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
- Synth: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
- Rumen: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716


> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
>
> SLSRunner has too many responsibilities.
>  One of them is to parse the job details from the SLS input formats and 
> launch the AMs and task containers.
>  The AM runner logic could be decoupled.






[jira] [Created] (YARN-10548) CLONE - Decouple job parsing logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)
Szilard Nemeth created YARN-10548:
-

 Summary: CLONE - Decouple job parsing logic from SLSRunner
 Key: YARN-10548
 URL: https://issues.apache.org/jira/browse/YARN-10548
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth


SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
As a first step, the job parser logic could be decoupled from this class.

There are 3 types of inputs: 
- SLS trace
- Synth
- Rumen

Their job parsing methods are: 
- SLS trace: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
- Synth: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
- Rumen: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716






[jira] [Updated] (YARN-10548) Decouple AM runner logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10548:
--
Summary: Decouple AM runner logic from SLSRunner  (was: CLONE - Decouple 
job parsing logic from SLSRunner)

> Decouple AM runner logic from SLSRunner
> ---
>
> Key: YARN-10548
> URL: https://issues.apache.org/jira/browse/YARN-10548
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch 
> the AMs and task containers.
> As a first step, the job parser logic could be decoupled from this class.
> There are 3 types of inputs: 
> - SLS trace
> - Synth
> - Rumen
> Their job parsing methods are: 
> - SLS trace: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
> - Synth: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
> - Rumen: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716






[jira] [Updated] (YARN-10547) Decouple job parsing logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-10547:
--
Attachment: YARN-10547.001.patch

> Decouple job parsing logic from SLSRunner
> -
>
> Key: YARN-10547
> URL: https://issues.apache.org/jira/browse/YARN-10547
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-10547.001.patch
>
>
> SLSRunner has too many responsibilities.
> One of them is to parse the job details from the SLS input formats and launch 
> the AMs and task containers.
> As a first step, the job parser logic could be decoupled from this class.
> There are 3 types of inputs: 
> - SLS trace
> - Synth
> - Rumen
> Their job parsing methods are: 
> - SLS trace: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
> - Synth: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
> - Rumen: 
> https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716






[jira] [Created] (YARN-10547) Decouple job parsing logic from SLSRunner

2020-12-26 Thread Szilard Nemeth (Jira)
Szilard Nemeth created YARN-10547:
-

 Summary: Decouple job parsing logic from SLSRunner
 Key: YARN-10547
 URL: https://issues.apache.org/jira/browse/YARN-10547
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth


SLSRunner has too many responsibilities.
One of them is to parse the job details from the SLS input formats and launch 
the AMs and task containers.
As a first step, the job parser logic could be decoupled from this class.

There are 3 types of inputs: 
- SLS trace
- Synth
- Rumen

Their job parsing methods are: 
- SLS trace: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L479-L526
- Synth: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L722-L790
- Rumen: 
https://github.com/apache/hadoop/blob/005b854f6bad66defafae0abf95dabc6c36ca8b1/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/SLSRunner.java#L651-L716
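
One possible shape for the decoupling described above is sketched below. It is only an illustration, not the content of any attached patch: the names JobParsingSketch, InputType, JobSpec, JobParser and loadJobs are hypothetical, and the parsing itself is a placeholder that treats each non-blank line as one job rather than reading real SLS, Synth or Rumen input.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one parser per input type, selected by the runner.
final class JobParsingSketch {
  // The three input types named in the issue description.
  enum InputType { SLS_TRACE, SYNTH, RUMEN }

  // Hypothetical, minimal job record; the real job details are much richer.
  static final class JobSpec {
    final String id;
    final InputType source;

    JobSpec(String id, InputType source) {
      this.id = id;
      this.source = source;
    }

    @Override
    public String toString() {
      return source + ":" + id;
    }
  }

  // The runner would depend only on this interface, not on a concrete format.
  interface JobParser {
    List<JobSpec> parse(List<String> inputLines);
  }

  private static final Map<InputType, JobParser> PARSERS = new EnumMap<>(InputType.class);

  static {
    // Placeholder parsers: a real implementation would register one
    // format-specific parser per input type here.
    for (InputType type : InputType.values()) {
      PARSERS.put(type, lines -> placeholderParse(type, lines));
    }
  }

  // Placeholder logic: every non-blank line becomes one job tagged with its source.
  private static List<JobSpec> placeholderParse(InputType type, List<String> lines) {
    List<JobSpec> jobs = new ArrayList<>();
    for (String line : lines) {
      if (!line.trim().isEmpty()) {
        jobs.add(new JobSpec(line.trim(), type));
      }
    }
    return jobs;
  }

  // What remains in the runner: pick a parser and delegate to it.
  static List<JobSpec> loadJobs(InputType type, List<String> lines) {
    JobParser parser = PARSERS.get(type);
    if (parser == null) {
      throw new IllegalArgumentException("No parser registered for " + type);
    }
    return parser.parse(lines);
  }

  public static void main(String[] args) {
    System.out.println(loadJobs(InputType.RUMEN, List.of("job_1", "", "job_2")));
  }
}
```

With a split like this, the runner would only need to know which input type it was given; adding or changing a format would touch a single parser.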






[jira] [Commented] (YARN-10501) Can't remove all node labels after add node label without nodemanager port

2020-12-26 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254983#comment-17254983
 ] 

caozhiqiang commented on YARN-10501:


[~ebadger], thank you. I'll try to answer your questions.

*First, we use the yarn rmadmin -replaceLabelsOnNode command both for initially adding 
labels and for replacing labels on a host/NM, so all processing happens in* 
{code:java}
case REPLACE:{code}
 

1a. I am also confused about why a node without a port is added to the map. It may only 
be intended to let users use port 0 and show its information. The Hadoop 
documentation also describes this.

[nodelabels|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeLabel.html]
{code:java}
Executing yarn rmadmin -replaceLabelsOnNode “node1[:port]=label1 node2=label2” 
[-failOnUnknownNodes]. Added label1 to node1, label2 to node2. If user don’t 
specify port, it adds the label to all NodeManagers running on the node. If 
option -failOnUnknownNodes is set, this command will fail if specified nodes 
are unknown.
{code}
1b. If the user doesn't specify a port, the label is added to all NodeManagers running 
on the node, so the host's labels are the labels of each NM on that host. Each 
NM also needs its own labels. Another reason is that when there are no NMs on a host, 
we can still use the host's labels.

 
{code:java}
protected Set<String> getLabelsByNode(NodeId nodeId, Map<String, Host> map) {
  Host host = map.get(nodeId.getHost());
  if (null == host) {
    return EMPTY_STRING_SET;
  }
  Node nm = host.nms.get(nodeId);
  if (null != nm && null != nm.labels) {
    return nm.labels;
  } else {
    return host.labels;
  }
}
{code}
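
The fallback order in the method above can be tried out with a small self-contained sketch. `LabelLookupSketch` and its nested `Host` and `Node` classes are simplified stand-ins written for illustration, not the Hadoop classes. It keys NodeManagers by a plain "host:port" string instead of `NodeId`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Simplified stand-ins for the label manager's Host/Node; not the Hadoop classes. */
final class LabelLookupSketch {
  static final class Node {
    Set<String> labels; // null means "no labels of its own"
  }

  static final class Host {
    final Set<String> labels = new HashSet<>();
    final Map<String, Node> nms = new HashMap<>(); // keyed by "host:port"
  }

  /** Same fallback order as the quoted method: NM labels first, then host labels. */
  static Set<String> getLabelsByNode(String hostName, int port, Map<String, Host> map) {
    Host host = map.get(hostName);
    if (host == null) {
      return new HashSet<>();
    }
    Node nm = host.nms.get(hostName + ":" + port);
    if (nm != null && nm.labels != null) {
      return nm.labels;
    }
    return host.labels;
  }
}
```

In this sketch, an NM whose `labels` field is null reports the host's labels, while an NM with its own set, even an empty one, reports that set instead.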
1c. Before labels are first added to a host/node, the node's labels are null. By the time 
"case ADD: " is reached, the host should already have been initialized by "case REPLACE:", I think.

2a and 2b are the same as 1.

2c and 2d. If the port is 0, the host's labels are the same as the labels of the nodes on 
this host. *Then the host's labels are added to the nodes' labels.*

2e and 2f. *The labels are set to null for each node, and the new labels are then set on 
the node asynchronously by the code below.*

 
{code:java}
newNMToLabels.put(nodeId, host.labels);
...
dispatcher.getEventHandler().handle(
    new UpdateNodeToLabelsMappingsEvent(newNMToLabels));
...
public Node copy() {
  Node c = new Node(nodeId);
  if (labels != null) {
    c.labels =
        Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    c.labels.addAll(labels);
  } else {
    c.labels = null;
  }
  c.resource = Resources.clone(resource);
  c.running = running;
  return c;
}
{code}
Could [~leftnoteasy], [~varunsaxena] and [~sunilg] give more information about 
these processes?

> Can't remove all node labels after add node label without nodemanager port
> --
>
> Key: YARN-10501
> URL: https://issues.apache.org/jira/browse/YARN-10501
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Critical
> Attachments: YARN-10501.002.patch, YARN-10501.003.patch
>
>
> When a label is added to nodes without a nodemanager port, or with the WILDCARD_PORT (0) 
> port, not all label info can be removed from these nodes.
> Steps to reproduce:
> {code:java}
> 1.yarn rmadmin -addToClusterNodeLabels "cpunode(exclusive=true)"
> 2.yarn rmadmin -replaceLabelsOnNode "server001=cpunode"
> 3.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
> {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":["server001:0","server001:45454"],"partitionInfo":{"resourceAvailable":{"memory":"510","vCores":"1","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"510"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"1"}]}}}
> 4.yarn rmadmin -replaceLabelsOnNode "server001"
> 5.curl http://RM_IP:8088/ws/v1/cluster/label-mappings
> {"labelsToNodes":{"entry":{"key":{"name":"cpunode","exclusivity":"true"},"value":{"nodes":"server001:45454","partitionInfo":{"resourceAvailable":{"memory":"0","vCores":"0","resourceInformations":{"resourceInformation":[{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"memory-mb","resourceType":"COUNTABLE","units":"Mi","value":"0"},{"attributes":null,"maximumAllocation":"9223372036854775807","minimumAllocation":"0","name":"vcores","resourceType":"COUNTABLE","units":"","value":"0"}]}}}
>  {code}
> You can see that after step 4, which removes the nodemanager labels, the label info 
> is still present in the node info.
> {code:java}
>  641 case REPLACE:
>  642