[jira] [Commented] (YARN-10352) Skip schedule on not heartbeated nodes in Multi Node Placement

2020-07-22 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17163030#comment-17163030
 ] 

Hadoop QA commented on YARN-10352:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
46s{color} | {color:blue} Used deprecated FindBugs config; consider 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 13s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}158m 39s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26303/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10352 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13008187/YARN-10352-006.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux b93823ddc7e7 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / d5b47661582 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| unit | 

[jira] [Issue Comment Deleted] (YARN-10363) TestRMAdminCLI.testHelp is failing in branch-2.10

2020-07-22 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10363:
-
Comment: was deleted

(was: Hi [~Jim_Brennan]

I think we can backport YARN-9985 to branch-2.10. )

> TestRMAdminCLI.testHelp is failing in branch-2.10
> -
>
> Key: YARN-10363
> URL: https://issues.apache.org/jira/browse/YARN-10363
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.10.1
>Reporter: Jim Brennan
>Assignee: Bilwa S T
>Priority: Major
>
> TestRMAdminCLI.testHelp is failing in branch-2.10.
> Example failure:
> {noformat}
> ---
> Test set: org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
> ---
> Tests run: 31, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 18.668 s <<< 
> FAILURE! - in org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
> testHelp(org.apache.hadoop.yarn.client.cli.TestRMAdminCLI)  Time elapsed: 
> 0.043 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expected error message: 
> Usage: yarn rmadmin [-failover [--forcefence] [--forceactive]  
> ] is not included in messages: 
> Usage: yarn rmadmin
>-refreshQueues 
>-refreshNodes [-g|graceful [timeout in seconds] -client|server]
>-refreshNodesResources 
>-refreshSuperUserGroupsConfiguration 
>-refreshUserToGroupsMappings 
>-refreshAdminAcls 
>-refreshServiceAcl 
>-getGroups [username]
>-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">
>-removeFromClusterNodeLabels  (label splitted by ",")
>-replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes] 
>-directlyAccessNodeLabelStore 
>-refreshClusterMaxPriority 
>-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
>-help [cmd]
> Generic options supported are:
> -conf specify an application configuration file
> -Ddefine a value for a given property
> -fs  specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt   specify a ResourceManager
> -files specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjarsspecify a comma-separated list of jar files 
> to be included in the classpath
> -archives   specify a comma-separated list of archives 
> to be unarchived on the compute machines
> The general command line syntax is:
> command [genericOptions] [commandOptions]
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testError(TestRMAdminCLI.java:859)
>   at 
> org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testHelp(TestRMAdminCLI.java:585)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> 

[jira] [Commented] (YARN-10363) TestRMAdminCLI.testHelp is failing in branch-2.10

2020-07-22 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162947#comment-17162947
 ] 

Bilwa S T commented on YARN-10363:
--

Hi [~Jim_Brennan]

I think we can backport YARN-9985 to branch-2.10. 

> TestRMAdminCLI.testHelp is failing in branch-2.10
> -
>
> Key: YARN-10363
> URL: https://issues.apache.org/jira/browse/YARN-10363
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.10.1
>Reporter: Jim Brennan
>Assignee: Bilwa S T
>Priority: Major
>
> TestRMAdminCLI.testHelp is failing in branch-2.10.
> Example failure:
> {noformat}
> ---
> Test set: org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
> ---
> Tests run: 31, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 18.668 s <<< 
> FAILURE! - in org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
> testHelp(org.apache.hadoop.yarn.client.cli.TestRMAdminCLI)  Time elapsed: 
> 0.043 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expected error message: 
> Usage: yarn rmadmin [-failover [--forcefence] [--forceactive]  
> ] is not included in messages: 
> Usage: yarn rmadmin
>-refreshQueues 
>-refreshNodes [-g|graceful [timeout in seconds] -client|server]
>-refreshNodesResources 
>-refreshSuperUserGroupsConfiguration 
>-refreshUserToGroupsMappings 
>-refreshAdminAcls 
>-refreshServiceAcl 
>-getGroups [username]
>-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">
>-removeFromClusterNodeLabels  (label splitted by ",")
>-replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes] 
>-directlyAccessNodeLabelStore 
>-refreshClusterMaxPriority 
>-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
>-help [cmd]
> Generic options supported are:
> -conf specify an application configuration file
> -Ddefine a value for a given property
> -fs  specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt   specify a ResourceManager
> -files specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjarsspecify a comma-separated list of jar files 
> to be included in the classpath
> -archives   specify a comma-separated list of archives 
> to be unarchived on the compute machines
> The general command line syntax is:
> command [genericOptions] [commandOptions]
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testError(TestRMAdminCLI.java:859)
>   at 
> org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testHelp(TestRMAdminCLI.java:585)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> 

[jira] [Commented] (YARN-10362) Javadoc for TimelineReaderAuthenticationFilterInitializer is broken

2020-07-22 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162941#comment-17162941
 ] 

Hadoop QA commented on YARN-10362:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} Used deprecated FindBugs config; consider 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice
 generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
28s{color} | {color:green} hadoop-yarn-server-timelineservice in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 
https://builds.apache.org/job/PreCommit-YARN-Build/26302/artifact/out/Dockerfile
 |
| JIRA Issue | YARN-10362 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13008164/HADOOP-17148.000.patch
 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient findbugs checkstyle |
| uname | Linux 3d51f6d26f2c 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 
10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / d5b47661582 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
|  Test Results 

[jira] [Updated] (YARN-10352) Skip schedule on not heartbeated nodes in Multi Node Placement

2020-07-22 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10352:
-
Attachment: YARN-10352-006.patch

> Skip schedule on not heartbeated nodes in Multi Node Placement
> --
>
> Key: YARN-10352
> URL: https://issues.apache.org/jira/browse/YARN-10352
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: capacityscheduler, multi-node-placement
> Attachments: YARN-10352-001.patch, YARN-10352-002.patch, 
> YARN-10352-003.patch, YARN-10352-004.patch, YARN-10352-005.patch, 
> YARN-10352-006.patch
>
>
> When Node Recovery is enabled, stopping a NM does not unregister it from the RM, 
> so the RM's active node list still contains the stopped nodes until the NM 
> Liveliness Monitor expires them after the configured timeout 
> (yarn.nm.liveness-monitor.expiry-interval-ms = 10 mins). During these 10 minutes, 
> Multi Node Placement assigns containers to those nodes. It needs to exclude 
> nodes that have not heartbeated within the configured heartbeat interval 
> (yarn.resourcemanager.nodemanagers.heartbeat-interval-ms = 1000 ms), similar to 
> the asynchronous Capacity Scheduler threads 
> (CapacityScheduler#shouldSkipNodeSchedule); a sketch of such a check follows 
> the repro steps below.
> *Repro:*
> 1. Enable Multi Node Placement 
> (yarn.scheduler.capacity.multi-node-placement-enabled) + Node Recovery 
> (yarn.node.recovery.enabled)
> 2. Have only one NM running, say worker0
> 3. Stop worker0 and start any other NM, say worker1
> 4. Submit a sleep job. The containers will time out because they are assigned 
> to the stopped NM worker0.
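For illustration, a minimal sketch of such a staleness check, in the spirit of 
CapacityScheduler#shouldSkipNodeSchedule (the parameter names here are 
assumptions, not the actual patch):

{code:java}
// Sketch only: exclude a node from multi-node placement when its last
// heartbeat is stale. Both timestamps are assumed to come from the scheduler.
static boolean shouldSkipNodeSchedule(long lastHeartbeatMonotonicMillis,
    long nowMonotonicMillis, long heartbeatIntervalMillis) {
  long sinceLastHeartbeat = nowMonotonicMillis - lastHeartbeatMonotonicMillis;
  // Tolerate up to two missed heartbeats before excluding the node.
  return sinceLastHeartbeat > 2 * heartbeatIntervalMillis;
}
{code}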



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10363) TestRMAdminCLI.testHelp is failing in branch-2.10

2020-07-22 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10363:


Assignee: Bilwa S T

> TestRMAdminCLI.testHelp is failing in branch-2.10
> -
>
> Key: YARN-10363
> URL: https://issues.apache.org/jira/browse/YARN-10363
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.10.1
>Reporter: Jim Brennan
>Assignee: Bilwa S T
>Priority: Major
>
> TestRMAdminCLI.testHelp is failing in branch-2.10.
> Example failure:
> {noformat}
> ---
> Test set: org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
> ---
> Tests run: 31, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 18.668 s <<< 
> FAILURE! - in org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
> testHelp(org.apache.hadoop.yarn.client.cli.TestRMAdminCLI)  Time elapsed: 
> 0.043 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expected error message: 
> Usage: yarn rmadmin [-failover [--forcefence] [--forceactive]  
> ] is not included in messages: 
> Usage: yarn rmadmin
>-refreshQueues 
>-refreshNodes [-g|graceful [timeout in seconds] -client|server]
>-refreshNodesResources 
>-refreshSuperUserGroupsConfiguration 
>-refreshUserToGroupsMappings 
>-refreshAdminAcls 
>-refreshServiceAcl 
>-getGroups [username]
>-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">
>-removeFromClusterNodeLabels  (label splitted by ",")
>-replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes] 
>-directlyAccessNodeLabelStore 
>-refreshClusterMaxPriority 
>-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
>-help [cmd]
> Generic options supported are:
> -conf specify an application configuration file
> -Ddefine a value for a given property
> -fs  specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt   specify a ResourceManager
> -files specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjarsspecify a comma-separated list of jar files 
> to be included in the classpath
> -archives   specify a comma-separated list of archives 
> to be unarchived on the compute machines
> The general command line syntax is:
> command [genericOptions] [commandOptions]
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testError(TestRMAdminCLI.java:859)
>   at 
> org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testHelp(TestRMAdminCLI.java:585)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> 

[jira] [Created] (YARN-10363) TestRMAdminCLI.testHelp is failing in branch-2.10

2020-07-22 Thread Jim Brennan (Jira)
Jim Brennan created YARN-10363:
--

 Summary: TestRMAdminCLI.testHelp is failing in branch-2.10
 Key: YARN-10363
 URL: https://issues.apache.org/jira/browse/YARN-10363
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.10.1
Reporter: Jim Brennan


TestRMAdminCLI.testHelp is failing in branch-2.10.

Example failure:
{noformat}
---
Test set: org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
---
Tests run: 31, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 18.668 s <<< 
FAILURE! - in org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
testHelp(org.apache.hadoop.yarn.client.cli.TestRMAdminCLI)  Time elapsed: 0.043 
s  <<< FAILURE!
java.lang.AssertionError: 
Expected error message: 
Usage: yarn rmadmin [-failover [--forcefence] [--forceactive]  
] is not included in messages: 
Usage: yarn rmadmin
   -refreshQueues 
   -refreshNodes [-g|graceful [timeout in seconds] -client|server]
   -refreshNodesResources 
   -refreshSuperUserGroupsConfiguration 
   -refreshUserToGroupsMappings 
   -refreshAdminAcls 
   -refreshServiceAcl 
   -getGroups [username]
   -addToClusterNodeLabels 
<"label1(exclusive=true),label2(exclusive=false),label3">
   -removeFromClusterNodeLabels  (label splitted by ",")
   -replaceLabelsOnNode <"node1[:port]=label1,label2 
node2[:port]=label1,label2"> [-failOnUnknownNodes] 
   -directlyAccessNodeLabelStore 
   -refreshClusterMaxPriority 
   -updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
   -help [cmd]

Generic options supported are:
-conf specify an application configuration file
-Ddefine a value for a given property
-fs  specify default filesystem URL to use, 
overrides 'fs.defaultFS' property from configurations.
-jt   specify a ResourceManager
-files specify a comma-separated list of files to be 
copied to the map reduce cluster
-libjarsspecify a comma-separated list of jar files 
to be included in the classpath
-archives   specify a comma-separated list of archives to 
be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]


at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testError(TestRMAdminCLI.java:859)
at 
org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testHelp(TestRMAdminCLI.java:585)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at 
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at 
org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at 
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)


[jira] [Commented] (YARN-10363) TestRMAdminCLI.testHelp is failing in branch-2.10

2020-07-22 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162926#comment-17162926
 ] 

Jim Brennan commented on YARN-10363:


cc: [~aajisaka], [~risyomei]

> TestRMAdminCLI.testHelp is failing in branch-2.10
> -
>
> Key: YARN-10363
> URL: https://issues.apache.org/jira/browse/YARN-10363
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.10.1
>Reporter: Jim Brennan
>Priority: Major
>
> TestRMAdminCLI.testHelp is failing in branch-2.10.
> Example failure:
> {noformat}
> ---
> Test set: org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
> ---
> Tests run: 31, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 18.668 s <<< 
> FAILURE! - in org.apache.hadoop.yarn.client.cli.TestRMAdminCLI
> testHelp(org.apache.hadoop.yarn.client.cli.TestRMAdminCLI)  Time elapsed: 
> 0.043 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expected error message: 
> Usage: yarn rmadmin [-failover [--forcefence] [--forceactive]  
> ] is not included in messages: 
> Usage: yarn rmadmin
>-refreshQueues 
>-refreshNodes [-g|graceful [timeout in seconds] -client|server]
>-refreshNodesResources 
>-refreshSuperUserGroupsConfiguration 
>-refreshUserToGroupsMappings 
>-refreshAdminAcls 
>-refreshServiceAcl 
>-getGroups [username]
>-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">
>-removeFromClusterNodeLabels  (label splitted by ",")
>-replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes] 
>-directlyAccessNodeLabelStore 
>-refreshClusterMaxPriority 
>-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
>-help [cmd]
> Generic options supported are:
> -conf specify an application configuration file
> -Ddefine a value for a given property
> -fs  specify default filesystem URL to use, 
> overrides 'fs.defaultFS' property from configurations.
> -jt   specify a ResourceManager
> -files specify a comma-separated list of files to 
> be copied to the map reduce cluster
> -libjarsspecify a comma-separated list of jar files 
> to be included in the classpath
> -archives   specify a comma-separated list of archives 
> to be unarchived on the compute machines
> The general command line syntax is:
> command [genericOptions] [commandOptions]
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testError(TestRMAdminCLI.java:859)
>   at 
> org.apache.hadoop.yarn.client.cli.TestRMAdminCLI.testHelp(TestRMAdminCLI.java:585)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> 

[jira] [Moved] (YARN-10362) Javadoc for TimelineReaderAuthenticationFilterInitializer is broken

2020-07-22 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka moved HADOOP-17148 to YARN-10362:
---

Component/s: (was: documentation)
 documentation
Key: YARN-10362  (was: HADOOP-17148)
Project: Hadoop YARN  (was: Hadoop Common)

> Javadoc for TimelineReaderAuthenticationFilterInitializer is broken
> ---
>
> Key: YARN-10362
> URL: https://issues.apache.org/jira/browse/YARN-10362
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Xieming Li
>Assignee: Xieming Li
>Priority: Minor
> Attachments: HADOOP-17148.000.patch
>
>
> mvn javadoc:javadoc fails for 
> TimelineReaderAuthenticationFilterInitializer.java
> {code:java}
> [ERROR] 
> /Users/sri/projects/hadoop-mirror/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/reader/security/TimelineReaderAuthenticationFilterInitializer.java:39:
>  error: value does not refer to a constant
> [ERROR]* {@value TimelineAuthenticationFilterInitializer#PREFIX}.
> [ERROR]  ^
> [ERROR] 
> /Users/sri/projects/hadoop-mirror/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice/src/main/java/org/apache/hadoop/yarn/server/timelineservice/reader/security/TimelineReaderAuthenticationFilterInitializer.java:39:
>  error: reference not found
> [ERROR]* {@value TimelineAuthenticationFilterInitializer#PREFIX}.
> {code}
> This issue seems to be caused by changes in YARN-10339
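For context, javadoc's {@value} tag resolves only against static final fields 
with compile-time constant initializers, which matches the quoted error. A 
hedged sketch of the failure mode (class and field names are illustrative, not 
the actual Hadoop classes):

{code:java}
class BrokenHolder {
  // Not a compile-time constant: javadoc reports
  // "error: value does not refer to a constant" for {@value BrokenHolder#PREFIX}.
  static String PREFIX = "timeline.reader.";
}

class WorkingHolder {
  // static final with a constant initializer satisfies {@value}.
  static final String PREFIX = "timeline.reader.";
}

/**
 * Resolves fine: {@value WorkingHolder#PREFIX}
 */
class Documented { }
{code}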



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10360) Support Multi Node Placement in SingleConstraintAppPlacementAllocator

2020-07-22 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162838#comment-17162838
 ] 

Hadoop QA commented on YARN-10360:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 
49s{color} | {color:blue} Used deprecated FindBugs config; consider 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
57s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
59s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 36s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 15 new + 85 unchanged - 0 fixed = 100 total (was 85) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  8s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 33s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 24m 53s{color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
48s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}224m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
|   | hadoop.yarn.applications.distributedshell.TestDistributedShell |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: 

[jira] [Created] (YARN-10361) Make custom DAO classes configurable into RMWebApp#JAXBContextResolver

2020-07-22 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-10361:


 Summary: Make custom DAO classes configurable into 
RMWebApp#JAXBContextResolver
 Key: YARN-10361
 URL: https://issues.apache.org/jira/browse/YARN-10361
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 3.4.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


YARN-8047 provides support for adding custom WebServices as part of RMWebApp, 
but the custom DAO classes need to be added into JAXBContextResolver. This Jira 
is to make those DAO classes configurable, as sketched below.
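A minimal sketch of what a configurable resolver could look like, assuming a 
hypothetical config key and the standard JAX-RS/JAXB APIs (this is an 
illustration, not the actual implementation):

{code:java}
import java.util.HashSet;
import java.util.Set;
import javax.ws.rs.ext.ContextResolver;
import javax.ws.rs.ext.Provider;
import javax.xml.bind.JAXBContext;
import org.apache.hadoop.conf.Configuration;

// Sketch only: register DAO classes listed in configuration when building
// the JAXB context. The config key below is hypothetical.
@Provider
public class ConfigurableJAXBContextResolver
    implements ContextResolver<JAXBContext> {

  private final JAXBContext context;

  public ConfigurableJAXBContextResolver(Configuration conf) throws Exception {
    Set<Class<?>> types = new HashSet<>();
    for (String name : conf.getTrimmedStrings(
        "yarn.resourcemanager.webapp.custom-dao-classes")) { // hypothetical key
      types.add(Class.forName(name)); // user-supplied DAO classes
    }
    context = JAXBContext.newInstance(types.toArray(new Class[0]));
  }

  @Override
  public JAXBContext getContext(Class<?> type) {
    return context;
  }
}
{code}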



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4771) Some containers can be skipped during log aggregation after NM restart

2020-07-22 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162728#comment-17162728
 ] 

Jim Brennan commented on YARN-4771:
---

I don't think the TestFederationInterceptor unit test failure is related to 
this change.

> Some containers can be skipped during log aggregation after NM restart
> --
>
> Key: YARN-4771
> URL: https://issues.apache.org/jira/browse/YARN-4771
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.10.0, 3.2.1, 3.1.3
>Reporter: Jason Darrell Lowe
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-4771.001.patch, YARN-4771.002.patch, 
> YARN-4771.003.patch
>
>
> A container can be skipped during log aggregation after a work-preserving 
> nodemanager restart if the following events occur (a sketch of the tracking 
> window follows the list):
> # Container completes more than 
> yarn.nodemanager.duration-to-track-stopped-containers milliseconds before the 
> restart
> # At least one other container completes after the above container and before 
> the restart
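For illustration, a minimal sketch of the expiry-based tracking that creates 
this window (the names are illustrative, not the actual NodeStatusUpdaterImpl 
code):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Sketch only: each completed container is remembered until
// now + trackWindowMillis
// (cf. yarn.nodemanager.duration-to-track-stopped-containers).
class StoppedContainerTracker {
  private final Map<String, Long> recentlyStopped = new HashMap<>();

  void addCompletedContainer(String containerId, long nowMillis,
      long trackWindowMillis) {
    recentlyStopped.put(containerId, nowMillis + trackWindowMillis);
  }

  // A container that completed more than trackWindowMillis ago has an expiry
  // in the past, so this returns false and the container can be skipped.
  boolean isContainerRecentlyStopped(String containerId, long nowMillis) {
    Long expiry = recentlyStopped.get(containerId);
    return expiry != null && nowMillis < expiry;
  }
}
{code}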



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10360) Support Multi Node Placement in SingleConstraintAppPlacementAllocator

2020-07-22 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10360:
-
Attachment: YARN-10360-001.patch

> Support Multi Node Placement in SingleConstraintAppPlacementAllocator
> -
>
> Key: YARN-10360
> URL: https://issues.apache.org/jira/browse/YARN-10360
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, multi-node-placement
>Affects Versions: 3.4.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-10360-001.patch
>
>
> Currently, placement constraints are not supported when Multi Node Placement 
> is enabled. This Jira is to add support for Multi Node Placement in 
> SingleConstraintAppPlacementAllocator.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10359) Log container report only if list is not empty

2020-07-22 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10359:


Assignee: Bilwa S T

> Log container report only if list is not empty
> --
>
> Key: YARN-10359
> URL: https://issues.apache.org/jira/browse/YARN-10359
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Minor
>
> In NodeStatusUpdaterImpl, print the log only if the containerReports list is 
> not empty (a guarded version is sketched after the snippet):
> {code:java}
> if (containerReports != null) {
>   LOG.info("Registering with RM using containers :" + containerReports);
> }
> {code}
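A minimal sketch of the proposed guard, assuming containerReports has 
java.util.List semantics (an illustration, not the committed patch):

{code:java}
// Skip the log line entirely when there is nothing to report.
if (containerReports != null && !containerReports.isEmpty()) {
  LOG.info("Registering with RM using containers: " + containerReports);
}
{code}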



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-10360) Support Multi Node Placement in SingleConstraintAppPlacementAllocator

2020-07-22 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-10360:


 Summary: Support Multi Node Placement in 
SingleConstraintAppPlacementAllocator
 Key: YARN-10360
 URL: https://issues.apache.org/jira/browse/YARN-10360
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, multi-node-placement
Affects Versions: 3.4.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


Currently, placement constraints are not supported when Multi Node Placement is 
enabled. This Jira is to add support for Multi Node Placement in 
SingleConstraintAppPlacementAllocator.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-07-22 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10293:
-
Parent: YARN-5139
Issue Type: Sub-task  (was: Bug)

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement (YARN-10259)
> 
>
> Key: YARN-10293
> URL: https://issues.apache.org/jira/browse/YARN-10293
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10293-001.patch, YARN-10293-002.patch, 
> YARN-10293-003-WIP.patch, YARN-10293-004.patch, YARN-10293-005.patch
>
>
> Reserved Containers are not allocated from the available space of other nodes 
> in CandidateNodeSet in MultiNodePlacement. YARN-10259 fixed two issues 
> related to this: 
> https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987
> We have found one more bug in the CapacityScheduler.java code which causes the 
> same issue, with a slight difference in the repro.
> *Repro:*
> *Nodes : Available : Used*
> Node1 - 8GB, 8vcores - 8GB, 8vcores
> Node2 - 8GB, 8vcores - 8GB, 8vcores
> Node3 - 8GB, 8vcores - 8GB, 8vcores
> Queues -> A and B both 50% capacity, 100% max capacity
> MultiNode enabled + Preemption enabled
> 1. JobA submitted to A queue and which used full cluster 24GB and 24 vcores
> 2. JobB Submitted to B queue with AM size of 1GB
> {code}
> 2020-05-21 12:12:27,313 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
> IP=172.27.160.139   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005  
>   CALLERCONTEXT=CLI   QUEUENAME=dummy
> {code}
> 3. Preemption happens and used capacity is lesser than 1.0f
> {code}
> 2020-05-21 12:12:48,222 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
>  Non-AM container preempted, current 
> appAttemptId=appattempt_1590046667304_0004_01, 
> containerId=container_e09_1590046667304_0004_01_24, 
> resource=
> {code}
> 4. JobB gets a Reserved Container as part of 
> CapacityScheduler#allocateOrReserveNewContainer
> {code}
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
> RESERVED
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> {code}
> *Why did RegularContainerAllocator reserve the container when the used 
> capacity is <= 1.0f?*
> {code}
> The reason is that even though the container is preempted, the nodemanager 
> still has to stop the container, heartbeat, and update the available and 
> unallocated resources to the ResourceManager.
> {code}
> 5. Now no new allocation happens and the reserved container stays reserved.
> After the reservation the used capacity becomes 1.0f, the cycle below repeats, 
> and no new allocate or reserve happens. The reserved container cannot be 
> allocated because the reserved node has no space; node2 has space for 1GB, 
> 1vcore, but CapacityScheduler#allocateOrReserveNewContainers is not getting 
> called, causing the hang.
> *[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
> CapacityScheduler#allocateFromReservedContainer -> Re-reserve the container 
> on node*
> {code}
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1590046667304_0005 
> on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignContainers: partition= #applications=1
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> 2020-05-21 12:13:33,243 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Allocation proposal accepted
> {code}
> CapacityScheduler#allocateOrReserveNewContainers won't be 

[jira] [Updated] (YARN-10259) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement

2020-07-22 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10259:
-
Parent: YARN-5139
Issue Type: Sub-task  (was: Bug)

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement
> ---
>
> Key: YARN-10259
> URL: https://issues.apache.org/jira/browse/YARN-10259
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler
>Affects Versions: 3.2.0, 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10259-001.patch, YARN-10259-002.patch, 
> YARN-10259-003.patch
>
>
> Reserved Containers are not allocated from the available space of other nodes 
> in CandidateNodeSet in MultiNodePlacement. 
> *Repro:*
> 1. MultiNode Placement Enabled.
> 2. Two nodes h1 and h2 with 8GB
> 3. Submit app1 AM (5GB) which gets placed in h1 and app2 AM (5GB) which gets 
> placed in h2.
> 4. Submit app3 AM which is reserved in h1
> 5. Kill app2 which frees space in h2.
> 6. app3 AM never gets ALLOCATED
> RM logs show the YARN-8127 fix rejecting the allocation proposal for the app3 
> AM on h2, as it expects the assignment to be on the same node where the 
> reservation happened.
> {code}
> 2020-05-05 18:49:37,264 DEBUG [AsyncDispatcher event handler] 
> scheduler.SchedulerApplicationAttempt 
> (SchedulerApplicationAttempt.java:commonReserve(573)) - Application attempt 
> appattempt_1588684773609_0003_01 reserved container 
> container_1588684773609_0003_01_01 on node host: h1:1234 #containers=1 
> available= used=. This attempt 
> currently has 1 reserved containers at priority 0; currentReservation 
> 
> 2020-05-05 18:49:37,264 INFO  [AsyncDispatcher event handler] 
> fica.FiCaSchedulerApp (FiCaSchedulerApp.java:apply(670)) - Reserved 
> container=container_1588684773609_0003_01_01, on node=host: h1:1234 
> #containers=1 available= used= 
> with resource=
>RESERVED=[(Application=appattempt_1588684773609_0003_01; 
> Node=h1:1234; Resource=)]
>
> 2020-05-05 18:49:38,283 DEBUG [Time-limited test] 
> allocator.RegularContainerAllocator 
> (RegularContainerAllocator.java:assignContainer(514)) - assignContainers: 
> node=h2 application=application_1588684773609_0003 priority=0 
> pendingAsk=,repeat=1> 
> type=OFF_SWITCH
> 2020-05-05 18:49:38,285 DEBUG [Time-limited test] fica.FiCaSchedulerApp 
> (FiCaSchedulerApp.java:commonCheckContainerAllocation(371)) - Try to allocate 
> from reserved container container_1588684773609_0003_01_01, but node is 
> not reserved
>ALLOCATED=[(Application=appattempt_1588684773609_0003_01; 
> Node=h2:1234; Resource=)]
> {code}
> Attached testcase which reproduces the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10357) Proactively relocate allocated containers from a stopped node

2020-07-22 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10357:
-
Parent: YARN-5139
Issue Type: Sub-task  (was: Improvement)

> Proactively relocate allocated containers from a stopped node
> -
>
> Key: YARN-10357
> URL: https://issues.apache.org/jira/browse/YARN-10357
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, multi-node-placement
>Affects Versions: 3.4.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> In a cloud environment, nodes can be commissioned and decommissioned 
> frequently, so always waiting for the 10-minute timeout may not be good. It is 
> better to improve the logic by preempting containers that are newly allocated 
> (but not yet acquired) on a NM which has stopped heartbeating. With this, we 
> can proactively relocate containers to different nodes before the 10-minute 
> timeout.
> cc [~leftnoteasy]
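A hedged sketch of that idea, assuming simple accessors for heartbeat age and 
a helper to release a container (getLastHeartbeatMillis and releaseContainer 
are assumptions, not actual scheduler APIs):

{code:java}
// Sketch only: recall containers that were allocated but never acquired by
// the AM on a node that has stopped heartbeating, so they can be re-placed.
void relocateFromStoppedNode(SchedulerNode node, long nowMillis,
    long heartbeatIntervalMillis) {
  long sinceLastHeartbeat =
      nowMillis - node.getLastHeartbeatMillis(); // assumed accessor
  if (sinceLastHeartbeat <= 2 * heartbeatIntervalMillis) {
    return; // node still looks alive, nothing to do
  }
  for (RMContainer c : node.getCopiedListOfRunningContainers()) {
    // Only containers not yet acquired are safe to recall proactively.
    if (c.getState() == RMContainerState.ALLOCATED) {
      releaseContainer(c); // assumed helper: frees it for re-scheduling
    }
  }
}
{code}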



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10352) Skip schedule on not heartbeated nodes in Multi Node Placement

2020-07-22 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10352:
-
Parent: YARN-5139
Issue Type: Sub-task  (was: Bug)

> Skip schedule on not heartbeated nodes in Multi Node Placement
> --
>
> Key: YARN-10352
> URL: https://issues.apache.org/jira/browse/YARN-10352
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0, 3.4.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>  Labels: capacityscheduler, multi-node-placement
> Attachments: YARN-10352-001.patch, YARN-10352-002.patch, 
> YARN-10352-003.patch, YARN-10352-004.patch, YARN-10352-005.patch
>
>
> When Node Recovery is enabled, stopping a NM does not unregister it from the RM, 
> so the RM's active node list still contains the stopped nodes until the NM 
> Liveliness Monitor expires them after the configured timeout 
> (yarn.nm.liveness-monitor.expiry-interval-ms = 10 mins). During these 10 minutes, 
> Multi Node Placement assigns containers to those nodes. It needs to exclude 
> nodes that have not heartbeated within the configured heartbeat interval 
> (yarn.resourcemanager.nodemanagers.heartbeat-interval-ms = 1000 ms), similar to 
> the asynchronous Capacity Scheduler threads 
> (CapacityScheduler#shouldSkipNodeSchedule).
> *Repro:*
> 1. Enable Multi Node Placement 
> (yarn.scheduler.capacity.multi-node-placement-enabled) + Node Recovery 
> (yarn.node.recovery.enabled)
> 2. Have only one NM running, say worker0
> 3. Stop worker0 and start any other NM, say worker1
> 4. Submit a sleep job. The containers will time out because they are assigned 
> to the stopped NM worker0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org