[jira] [Comment Edited] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission

2019-11-29 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985279#comment-16985279
 ] 

lindongdong edited comment on YARN-9738 at 11/30/19 7:26 AM:
-

Hi [~BilwaST], in the last patch I think it is better to move the "null check" 
out of the read lock.


was (Author: lindongdong):
[~BilwaST] In the last patch, I think it is better to move the "null check" out 
of the read lock.

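For illustration, a minimal sketch of the suggestion above, with placeholder 
String types standing in for the real ClusterNodeTracker internals: only the 
map lookup needs the read lock; the null check and report construction can run 
after the lock is released, shortening the critical section.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class NodeTrackerSketch {
  private final Map<String, String> nodes = new HashMap<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

  // Before: the null check and report construction run under the lock.
  public String getReportUnderLock(String nodeId) {
    lock.readLock().lock();
    try {
      String node = nodes.get(nodeId);
      return node == null ? null : "report:" + node;
    } finally {
      lock.readLock().unlock();
    }
  }

  // After: only the map lookup is guarded; the null check happens outside.
  public String getReport(String nodeId) {
    String node;
    lock.readLock().lock();
    try {
      node = nodes.get(nodeId);
    } finally {
      lock.readLock().unlock();
    }
    return node == null ? null : "report:" + node;
  }
}
{code}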
> Remove lock on ClusterNodeTracker#getNodeReport as it blocks application 
> submission
> ---
>
> Key: YARN-9738
> URL: https://issues.apache.org/jira/browse/YARN-9738
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9738-001.patch, YARN-9738-002.patch, 
> YARN-9738-003.patch
>
>
> *Env:*
> Server OS: Ubuntu
> No. of cluster nodes: 9120 NMs
> Env mode: Secure
> *Preconditions:*
> ~9120 NMs were running
> ~1250 applications were in the running state
> 35K applications were in the pending state
> *Test Steps:*
> 1. Submit applications from 5 clients, each client running 2 threads, across 
> 10 queues in total
> 2. As application submission increases (each DistributedShell application 
> calls getClusterNodes)
> *ClientRMService#getClusterNodes tries to get 
> ClusterNodeTracker#getNodeReport, where the nodes map is locked.*
> {quote}
> "IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 
> tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x7f759f6d8858> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:313)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:589)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2792)
> {quote}
> *Instead, we can make nodes a ConcurrentHashMap and remove the read lock.*
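For illustration, a minimal sketch of the proposal, with placeholder String 
types standing in for NodeId and the scheduler node type: ConcurrentHashMap 
reads never block behind a writer, unlike a read lock on a fair 
ReentrantReadWriteLock. The trade-off is that readers see each entry 
consistently but no longer get a consistent snapshot of the whole map.

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentNodeTrackerSketch {
  // Lock-free reads: get() does not wait for writers.
  private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

  public void addNode(String nodeId, String node) {
    nodes.put(nodeId, node);
  }

  public String getNodeReport(String nodeId) {
    String node = nodes.get(nodeId); // no read lock needed
    return node == null ? null : "report:" + node;
  }
}
{code}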






[jira] [Commented] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission

2019-11-29 Thread lindongdong (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985279#comment-16985279
 ] 

lindongdong commented on YARN-9738:
---

[~BilwaST] In the last patch, I think it is better to move the "null check" out 
of the read lock.

> Remove lock on ClusterNodeTracker#getNodeReport as it blocks application 
> submission
> ---
>
> Key: YARN-9738
> URL: https://issues.apache.org/jira/browse/YARN-9738
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9738-001.patch, YARN-9738-002.patch, 
> YARN-9738-003.patch
>
>
> *Env:*
> Server OS: Ubuntu
> No. of cluster nodes: 9120 NMs
> Env mode: Secure
> *Preconditions:*
> ~9120 NMs were running
> ~1250 applications were in the running state
> 35K applications were in the pending state
> *Test Steps:*
> 1. Submit applications from 5 clients, each client running 2 threads, across 
> 10 queues in total
> 2. As application submission increases (each DistributedShell application 
> calls getClusterNodes)
> *ClientRMService#getClusterNodes tries to get 
> ClusterNodeTracker#getNodeReport, where the nodes map is locked.*
> {quote}
> "IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 
> tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x7f759f6d8858> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:313)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:589)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2792)
> {quote}
> *Instead, we can make nodes a ConcurrentHashMap and remove the read lock.*






[jira] [Updated] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission

2019-11-29 Thread lindongdong (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lindongdong updated YARN-9738:
--
Description: 
*Env:*
 Server OS: Ubuntu
 No. of cluster nodes: 9120 NMs
 Env mode: Secure

*Preconditions:*
 ~9120 NMs were running
 ~1250 applications were in the running state
 35K applications were in the pending state

*Test Steps:*
 1. Submit applications from 5 clients, each client running 2 threads, across 
10 queues in total
 2. As application submission increases (each DistributedShell application 
calls getClusterNodes)

*ClientRMService#getClusterNodes tries to get ClusterNodeTracker#getNodeReport, 
where the nodes map is locked.*
{quote}"IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 
tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000]
 java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for <0x7f759f6d8858> (a 
java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
 at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
 at 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123)
 at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067)
 at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992)
 at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:313)
 at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:589)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2792){quote}
*Instead, we can make nodes a ConcurrentHashMap and remove the read lock.*

  was:
*Env:*
Server OS: Ubuntu
No. of cluster nodes: 9120 NMs
Env mode: Secure

*Preconditions:*
~9120 NMs were running
~1250 applications were in the running state
35K applications were in the pending state

*Test Steps:*
1. Submit applications from 5 clients, each client running 2 threads, across 
10 queues in total
2. As application submission increases (each DistributedShell application 
calls getClusterNodes)

*ClientRMService#getClusterNodes tries to get ClusterNodeTracker#getNodeReport, 
where the nodes map is locked.*

{quote}
"IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 
tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x7f759f6d8858> (a 
java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
at 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992)
at 

[jira] [Commented] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985179#comment-16985179
 ] 

Hadoop QA commented on YARN-5106:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 29 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 278 unchanged - 52 fixed = 279 total (was 330) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 29s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m 
59s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}211m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-5106 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987181/YARN-5106.016.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 42aee859937e 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a2dadac |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Commented] (YARN-9925) CapacitySchedulerQueueManager allows unsupported Queue hierarchy

2019-11-29 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985120#comment-16985120
 ] 

Prabhu Joseph commented on YARN-9925:
-

[~pbacsko] I have reported YARN-10006 to handle this. Thanks.

> CapacitySchedulerQueueManager allows unsupported Queue hierarchy
> 
>
> Key: YARN-9925
> URL: https://issues.apache.org/jira/browse/YARN-9925
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9925-001.patch, YARN-9925-002.patch, 
> YARN-9925-003.patch, YARN-9925-004.patch, YARN-9925-005.patch
>
>
> CapacitySchedulerQueueManager allows an unsupported queue hierarchy. Creating 
> a queue with the same name as an existing parent queue has to fail with the 
> error below.
> {code:java}
> Caused by: java.io.IOException: A is moved from:root.A to:root.B.A after 
> refresh, which is not allowed. at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.validateQueueHierarchy(CapacitySchedulerQueueManager.java:335)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.reinitializeQueues(CapacitySchedulerQueueManager.java:180)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:762)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:473)
>  ... 70 more 
> {code}
> In some cases the error is not thrown while creating the queue but at job 
> submission: "Failed to submit application_1571677375269_0002 to YARN : 
> Application application_1571677375269_0002 submitted by user : systest to 
> non-leaf queue : B"
> The scenarios below are allowed but should not be:
> {code:java}
> It allows root.A.A1.B when root.B.B1 already exists.
>
> 1. Add root.A
> 2. Add root.A.A1
> 3. Add root.B
> 4. Add root.B.B1
> 5. Allows Add of root.A.A1.B 
> It allows two root queues:
>
> 1. Add root.A
> 2. Add root.B
> 3. Add root.A.A1
> 4. Allows Add of root.A.A1.root
>
> {code}
> The scenario below is handled properly:
> {code:java}
> It does not allow root.B.A when root.A.A1 already exists.
>  
> 1. Add root.A
> 2. Add root.B
> 3. Add root.A.A1
> 4. Does not Allow Add of root.B.A
> {code}
> This error handling has to be consistent in all scenarios.
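For illustration, a minimal sketch of the uniqueness check the report asks for 
(a hypothetical class, not the actual CapacitySchedulerQueueManager logic): 
every queue short name, including "root", may be claimed only once across the 
hierarchy, which rejects both root.A.A1.B (B is taken by root.B) and 
root.A.A1.root.

{code:java}
import java.util.HashMap;
import java.util.Map;

public class QueueNameValidatorSketch {
  // short name -> full path of the queue that first claimed it
  private final Map<String, String> shortNames = new HashMap<>();

  public QueueNameValidatorSketch() {
    shortNames.put("root", "root");
  }

  public void addQueue(String fullPath) {
    String shortName = fullPath.substring(fullPath.lastIndexOf('.') + 1);
    String existing = shortNames.putIfAbsent(shortName, fullPath);
    if (existing != null && !existing.equals(fullPath)) {
      throw new IllegalArgumentException("Queue name '" + shortName
          + "' is already used by " + existing + "; cannot add " + fullPath);
    }
  }

  public static void main(String[] args) {
    QueueNameValidatorSketch v = new QueueNameValidatorSketch();
    v.addQueue("root.A");
    v.addQueue("root.A.A1");
    v.addQueue("root.B");
    v.addQueue("root.B.B1");
    v.addQueue("root.A.A1.B"); // throws: "B" is already used by root.B
  }
}
{code}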






[jira] [Created] (YARN-10006) IOException used in place of YarnException in CapacityScheduler

2019-11-29 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-10006:


 Summary: IOException used in place of YarnException in CapacityScheduler
 Key: YARN-10006
 URL: https://issues.apache.org/jira/browse/YARN-10006
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler
Affects Versions: 3.3.0
Reporter: Prabhu Joseph


IOException is used in place of YarnException in CapacityScheduler. As per the 
YarnException doc:
{code:java}
/**
 * YarnException indicates exceptions from yarn servers. On the other hand,
 * IOExceptions indicates exceptions from RPC layer.
 */
{code}
The methods below throw IOException but are supposed to throw YarnException:

CapacitySchedulerQueueManager#parseQueue <- initializeQueues <- 
CapacityScheduler#initializeQueues <- initScheduler <- serviceInit
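For illustration, a hedged sketch of the proposed change (a stand-in exception 
class is declared here; the real one is 
org.apache.hadoop.yarn.exceptions.YarnException):

{code:java}
import java.io.IOException;

// Stand-in for org.apache.hadoop.yarn.exceptions.YarnException.
class YarnException extends Exception {
  YarnException(String message) { super(message); }
}

public class ParseQueueSketch {
  // Before (sketch): a configuration problem surfaces as IOException.
  static void parseQueueBefore(String queueConf) throws IOException {
    if (queueConf == null) {
      throw new IOException("Illegal queue configuration");
    }
  }

  // After (sketch): the same server-side validation failure becomes a
  // YarnException, leaving IOException to the RPC layer.
  static void parseQueueAfter(String queueConf) throws YarnException {
    if (queueConf == null) {
      throw new YarnException("Illegal queue configuration");
    }
  }
}
{code}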






[jira] [Updated] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-5106:
-
Attachment: YARN-5106.016.patch

> Provide a builder interface for FairScheduler allocations for use in tests
> --
>
> Key: YARN-5106
> URL: https://issues.apache.org/jira/browse/YARN-5106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie++
> Attachments: YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.001.patch, YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.002.patch, YARN-5106-branch-3.2.001.patch, 
> YARN-5106-branch-3.2.001.patch, YARN-5106-branch-3.2.002.patch, 
> YARN-5106.001.patch, YARN-5106.002.patch, YARN-5106.003.patch, 
> YARN-5106.004.patch, YARN-5106.005.patch, YARN-5106.006.patch, 
> YARN-5106.007.patch, YARN-5106.008.patch, YARN-5106.008.patch, 
> YARN-5106.008.patch, YARN-5106.009.patch, YARN-5106.010.patch, 
> YARN-5106.011.patch, YARN-5106.012.patch, YARN-5106.013.patch, 
> YARN-5106.014.patch, YARN-5106.015.patch, YARN-5106.016.patch
>
>
> Most, if not all, fair scheduler tests create an allocations XML file. Having 
> a helper class that potentially uses a builder would make the tests cleaner. 






[jira] [Commented] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985085#comment-16985085
 ] 

Adam Antal commented on YARN-5106:
--

Thanks for the review [~pbacsko]!

Done with all of them. I added an enum for the placement rules; I hope it 
aligns with your intention.
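As a rough illustration of what such an enum could look like (the names are 
guesses based on FairScheduler placement rule names, not the actual patch):

{code:java}
// Hypothetical enum: removes repeated "specified"/"reject"/... string
// literals from the allocation-file test builder.
enum PlacementRule {
  SPECIFIED("specified"),
  DEFAULT_QUEUE("default"),
  NESTED_USER_QUEUE("nestedUserQueue"),
  REJECT("reject");

  private final String xmlName;

  PlacementRule(String xmlName) {
    this.xmlName = xmlName;
  }

  /** Name as it appears in the generated allocations XML. */
  String xmlName() {
    return xmlName;
  }
}
{code}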

> Provide a builder interface for FairScheduler allocations for use in tests
> --
>
> Key: YARN-5106
> URL: https://issues.apache.org/jira/browse/YARN-5106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie++
> Attachments: YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.001.patch, YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.002.patch, YARN-5106-branch-3.2.001.patch, 
> YARN-5106-branch-3.2.001.patch, YARN-5106-branch-3.2.002.patch, 
> YARN-5106.001.patch, YARN-5106.002.patch, YARN-5106.003.patch, 
> YARN-5106.004.patch, YARN-5106.005.patch, YARN-5106.006.patch, 
> YARN-5106.007.patch, YARN-5106.008.patch, YARN-5106.008.patch, 
> YARN-5106.008.patch, YARN-5106.009.patch, YARN-5106.010.patch, 
> YARN-5106.011.patch, YARN-5106.012.patch, YARN-5106.013.patch, 
> YARN-5106.014.patch, YARN-5106.015.patch
>
>
> Most, if not all, fair scheduler tests create an allocations XML file. Having 
> a helper class that potentially uses a builder would make the tests cleaner. 






[jira] [Commented] (YARN-9985) Unsupported "transitionToObserver" option displaying for rmadmin command

2019-11-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985060#comment-16985060
 ] 

Hadoop QA commented on YARN-9985:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 90 unchanged - 2 fixed = 90 total (was 92) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m 
56s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 99m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9985 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987158/YARN-9985-02.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 

[jira] [Commented] (YARN-9877) Intermittent TIME_OUT of LogAggregationReport

2019-11-29 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985055#comment-16985055
 ] 

Peter Bacsko commented on YARN-9877:


[~adam.antal] is there a way to test this? {{TestRMAppTransitions}} is the 
place where the validation could take place.

> Intermittent TIME_OUT of LogAggregationReport
> -
>
> Key: YARN-9877
> URL: https://issues.apache.org/jira/browse/YARN-9877
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation, resourcemanager, yarn
>Affects Versions: 3.0.3, 3.3.0, 3.2.1, 3.1.3
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9877.001.patch
>
>
> I noticed some intermittent TIME_OUT in some downstream log-aggregation based 
> tests.
> Steps to reproduce:
> - Let's run a MR job
> {code}
> hadoop jar hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar sleep 
> -Dmapreduce.job.queuename=root.default -m 10 -r 10 -mt 5000 -rt 5000
> {code}
> - Suppose the AM is requesting more containers, but as soon as they're 
> allocated, the AM realizes it doesn't need them. The containers' state 
> changes are: ALLOCATED -> ACQUIRED -> RELEASED. 
> Let's suppose these extra containers are allocated in a different node from 
> the other 21 (AM + 10 mapper + 10 reducer) containers' node.
> - All the containers finish successfully and the app finishes successfully as 
> well, but the log aggregation status for the whole app seemingly gets stuck 
> in the RUNNING state.
> - After a while the final log aggregation status for the app changes to 
> TIME_OUT.
> Root cause:
> - As the unused containers go through the state transitions in the RM's 
> internal representation, {{RMAppImpl$AppRunningOnNodeTransition}}'s 
> transition function is called. This calls 
> {{RMAppLogAggregation$addReportIfNecessary}}, which forcefully adds the 
> "NOT_START" LogAggregationStatus for this NodeId to the app, even though the 
> node does not have any running container on it.
> - The node's LogAggregationStatus is never updated to "SUCCEEDED" by the 
> NodeManager because it does not have any running container on it (note that 
> the AM immediately released them after acquisition). The LogAggregationStatus 
> remains NOT_START until the timeout is reached. After that point the RM 
> aggregates the LogAggregationReports for all the nodes, and though all the 
> containers are in the SUCCEEDED state, one particular node has NOT_START, so 
> the final log aggregation status will be TIME_OUT.
> (I crawled the RM UI for the log aggregation statuses, and it was always 
> NOT_START for this particular node).
> This situation is highly unlikely, but it has an estimated ~0.8% failure rate 
> based on roughly 1500 runs over a year on an unstressed cluster.
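For illustration, a simplified model of the failure mode described above 
(hypothetical code; the enum mirrors a subset of Hadoop's LogAggregationStatus 
values, but this is not the RM's implementation):

{code:java}
import java.util.Arrays;
import java.util.List;

public class LogAggregationStatusSketch {
  enum Status { NOT_START, RUNNING, SUCCEEDED, FAILED, TIME_OUT }

  // One node that never starts aggregation keeps the app-level status
  // incomplete; once the timeout is reached, the final status is TIME_OUT.
  static Status finalStatus(List<Status> perNodeReports, boolean timedOut) {
    boolean anyIncomplete = perNodeReports.stream()
        .anyMatch(s -> s == Status.NOT_START || s == Status.RUNNING);
    if (anyIncomplete) {
      return timedOut ? Status.TIME_OUT : Status.RUNNING;
    }
    return perNodeReports.contains(Status.FAILED)
        ? Status.FAILED : Status.SUCCEEDED;
  }

  public static void main(String[] args) {
    // e.g. 21 nodes succeeded, one node stuck in NOT_START.
    List<Status> reports = Arrays.asList(Status.SUCCEEDED, Status.NOT_START);
    System.out.println(finalStatus(reports, true)); // TIME_OUT
  }
}
{code}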






[jira] [Comment Edited] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985033#comment-16985033
 ] 

Peter Bacsko edited comment on YARN-5106 at 11/29/19 1:51 PM:
--

Thanks for picking up the patch [~adam.antal].

 * Just a minor thing: strings like "drf" and "fair" are repeated quite often. 
Either:
 ## extract them to constants somewhere, or
 ## instead of using {{.defaultQueueSchedulingPolicy("drf")}} or 
{{.schedulingPolicy("drf")}}, introduce methods like 
{{.drfDefaultSchedulingPolicy()}} and {{.drfSchedulingPolicy()}}, 
{{.fairSchedulingPolicy()}}, etc.
 * {{.queueMaxAMShareDefault(-1.0f)}}: here the value -1.0f has a special 
meaning (the feature is disabled). Again, extract {{-1.0f}} or add a method 
{{.disableQueueMaxAmShareDefault()}}.
 * Placement rules like "specified", "reject", etc.: similarly to the 
scheduling policy, the set of accepted values is fixed, so do something like 
what I recommended above regarding "drf" and "fair".




was (Author: pbacsko):
Thanks for picking up the patch [~adam.antal].

 * Just a minor thing: strings like "drf" and "fair" are repeated quite often. 
Either:
 ## extract them to constants somewhere, or
 ## instead of using {{.defaultQueueSchedulingPolicy("drf")}} or 
{{.schedulingPolicy("drf")}}, introduce methods like 
{{.drfDefaultSchedulingPolicy()}} and {{.drfSchedulingPolicy()}}, 
{{.fairSchedulingPolicy()}}, etc.
 * {{.queueMaxAMShareDefault(-1.0f)}}: here the value -1.0f has a special 
meaning (the feature is disabled). Again, extract {{-1.0f}} or add a method 
{{.disableQueueMaxAmShareDefault()}}.
 * Placement rules like "specified", "reject", etc.: similarly to the 
scheduling policy, the set of accepted values is fixed, so do something like 
what I recommended in #2.
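For illustration, a sketch of the suggested builder methods (illustrative 
names and fields, not the builder class from the patch): the "drf"/"fair" 
literals and the magic -1.0f become named constants behind dedicated methods.

{code:java}
public class AllocationBuilderSketch {
  static final String DRF_POLICY = "drf";
  static final String FAIR_POLICY = "fair";
  static final float QUEUE_MAX_AM_SHARE_DISABLED = -1.0f;

  private String schedulingPolicy = FAIR_POLICY;
  private float queueMaxAMShareDefault = 0.5f;

  AllocationBuilderSketch drfSchedulingPolicy() {
    this.schedulingPolicy = DRF_POLICY;
    return this;
  }

  AllocationBuilderSketch fairSchedulingPolicy() {
    this.schedulingPolicy = FAIR_POLICY;
    return this;
  }

  // Replaces .queueMaxAMShareDefault(-1.0f); the magic value gets a name.
  AllocationBuilderSketch disableQueueMaxAMShareDefault() {
    this.queueMaxAMShareDefault = QUEUE_MAX_AM_SHARE_DISABLED;
    return this;
  }

  public static void main(String[] args) {
    AllocationBuilderSketch b = new AllocationBuilderSketch()
        .drfSchedulingPolicy()
        .disableQueueMaxAMShareDefault();
    System.out.println(b.schedulingPolicy + " " + b.queueMaxAMShareDefault);
  }
}
{code}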



> Provide a builder interface for FairScheduler allocations for use in tests
> --
>
> Key: YARN-5106
> URL: https://issues.apache.org/jira/browse/YARN-5106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie++
> Attachments: YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.001.patch, YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.002.patch, YARN-5106-branch-3.2.001.patch, 
> YARN-5106-branch-3.2.001.patch, YARN-5106-branch-3.2.002.patch, 
> YARN-5106.001.patch, YARN-5106.002.patch, YARN-5106.003.patch, 
> YARN-5106.004.patch, YARN-5106.005.patch, YARN-5106.006.patch, 
> YARN-5106.007.patch, YARN-5106.008.patch, YARN-5106.008.patch, 
> YARN-5106.008.patch, YARN-5106.009.patch, YARN-5106.010.patch, 
> YARN-5106.011.patch, YARN-5106.012.patch, YARN-5106.013.patch, 
> YARN-5106.014.patch, YARN-5106.015.patch
>
>
> Most, if not all, fair scheduler tests create an allocations XML file. Having 
> a helper class that potentially uses a builder would make the tests cleaner. 






[jira] [Comment Edited] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985033#comment-16985033
 ] 

Peter Bacsko edited comment on YARN-5106 at 11/29/19 1:50 PM:
--

Thanks for picking up the patch [~adam.antal].

 
 * Just a minor thing: strings like "drf" and "fair" are repeated quite often. 
Either:
 ## extract them to constants somewhere, or
 ## instead of using {{.defaultQueueSchedulingPolicy("drf")}} or 
{{.schedulingPolicy("drf")}}, introduce methods like 
{{.drfDefaultSchedulingPolicy()}} and {{.drfSchedulingPolicy()}}, 
{{.fairSchedulingPolicy()}}, etc.
 * {{.queueMaxAMShareDefault(-1.0f)}}: here the value -1.0f has a special 
meaning (the feature is disabled). Again, extract {{-1.0f}} or add a method 
{{.disableQueueMaxAmShareDefault()}}.
 * Placement rules like "specified", "reject", etc.: similarly to the 
scheduling policy, the set of accepted values is fixed, so do something like 
what I recommended in #2.




was (Author: pbacsko):
Thanks for picking up the patch [~adam.antal].

Just a minor thing: strings like "drf" and "fair" are repeated quite often. 
Either
 # extract them to constants somewhere, or
 # instead of using {{.defaultQueueSchedulingPolicy("drf")}} or 
{{.schedulingPolicy("drf")}}, introduce methods like 
{{.drfDefaultSchedulingPolicy()}} and {{.drfSchedulingPolicy()}}, 
{{.fairSchedulingPolicy()}}, etc.
 # {{.queueMaxAMShareDefault(-1.0f)}}: here the value -1.0f has a special 
meaning (the feature is disabled). Again, extract {{-1.0f}} or add a method 
{{.disableQueueMaxAmShareDefault()}}.
 # Placement rules like "specified", "reject", etc.: similarly to the 
scheduling policy, the set of accepted values is fixed, so do something like 
what I recommended in #2.

> Provide a builder interface for FairScheduler allocations for use in tests
> --
>
> Key: YARN-5106
> URL: https://issues.apache.org/jira/browse/YARN-5106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie++
> Attachments: YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.001.patch, YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.002.patch, YARN-5106-branch-3.2.001.patch, 
> YARN-5106-branch-3.2.001.patch, YARN-5106-branch-3.2.002.patch, 
> YARN-5106.001.patch, YARN-5106.002.patch, YARN-5106.003.patch, 
> YARN-5106.004.patch, YARN-5106.005.patch, YARN-5106.006.patch, 
> YARN-5106.007.patch, YARN-5106.008.patch, YARN-5106.008.patch, 
> YARN-5106.008.patch, YARN-5106.009.patch, YARN-5106.010.patch, 
> YARN-5106.011.patch, YARN-5106.012.patch, YARN-5106.013.patch, 
> YARN-5106.014.patch, YARN-5106.015.patch
>
>
> Most, if not all, fair scheduler tests create an allocations XML file. Having 
> a helper class that potentially uses a builder would make the tests cleaner. 






[jira] [Comment Edited] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985033#comment-16985033
 ] 

Peter Bacsko edited comment on YARN-5106 at 11/29/19 1:50 PM:
--

Thanks for picking up the patch [~adam.antal].

 * Just a minor thing: strings like "drf" and "fair" are repeated quite often. 
Either:
 ## extract them to constants somewhere, or
 ## instead of using {{.defaultQueueSchedulingPolicy("drf")}} or 
{{.schedulingPolicy("drf")}}, introduce methods like 
{{.drfDefaultSchedulingPolicy()}} and {{.drfSchedulingPolicy()}}, 
{{.fairSchedulingPolicy()}}, etc.
 * {{.queueMaxAMShareDefault(-1.0f)}}: here the value -1.0f has a special 
meaning (the feature is disabled). Again, extract {{-1.0f}} or add a method 
{{.disableQueueMaxAmShareDefault()}}.
 * Placement rules like "specified", "reject", etc.: similarly to the 
scheduling policy, the set of accepted values is fixed, so do something like 
what I recommended in #2.




was (Author: pbacsko):
Thanks for picking up the patch [~adam.antal].

 
 * Just a minor thing: strings like "drf" and "fair" are repeated quite often. 
Either:
 ## extract them to constants somewhere, or
 ## instead of using {{.defaultQueueSchedulingPolicy("drf")}} or 
{{.schedulingPolicy("drf")}}, introduce methods like 
{{.drfDefaultSchedulingPolicy()}} and {{.drfSchedulingPolicy()}}, 
{{.fairSchedulingPolicy()}}, etc.
 * {{.queueMaxAMShareDefault(-1.0f)}}: here the value -1.0f has a special 
meaning (the feature is disabled). Again, extract {{-1.0f}} or add a method 
{{.disableQueueMaxAmShareDefault()}}.
 * Placement rules like "specified", "reject", etc.: similarly to the 
scheduling policy, the set of accepted values is fixed, so do something like 
what I recommended in #2.



> Provide a builder interface for FairScheduler allocations for use in tests
> --
>
> Key: YARN-5106
> URL: https://issues.apache.org/jira/browse/YARN-5106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie++
> Attachments: YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.001.patch, YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.002.patch, YARN-5106-branch-3.2.001.patch, 
> YARN-5106-branch-3.2.001.patch, YARN-5106-branch-3.2.002.patch, 
> YARN-5106.001.patch, YARN-5106.002.patch, YARN-5106.003.patch, 
> YARN-5106.004.patch, YARN-5106.005.patch, YARN-5106.006.patch, 
> YARN-5106.007.patch, YARN-5106.008.patch, YARN-5106.008.patch, 
> YARN-5106.008.patch, YARN-5106.009.patch, YARN-5106.010.patch, 
> YARN-5106.011.patch, YARN-5106.012.patch, YARN-5106.013.patch, 
> YARN-5106.014.patch, YARN-5106.015.patch
>
>
> Most, if not all, fair scheduler tests create an allocations XML file. Having 
> a helper class that potentially uses a builder would make the tests cleaner. 






[jira] [Commented] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985033#comment-16985033
 ] 

Peter Bacsko commented on YARN-5106:


Thanks for picking up the patch [~adam.antal].

Just a minor thing: strings like "drf" and "fair" are repeated quite often. 
Either
 # extract them to constants somewhere, or
 # instead of using {{.defaultQueueSchedulingPolicy("drf")}} or 
{{.schedulingPolicy("drf")}}, introduce methods like 
{{.drfDefaultSchedulingPolicy()}} and {{.drfSchedulingPolicy()}}, 
{{.fairSchedulingPolicy()}}, etc.
 # {{.queueMaxAMShareDefault(-1.0f)}}: here the value -1.0f has a special 
meaning (the feature is disabled). Again, extract {{-1.0f}} or add a method 
{{.disableQueueMaxAmShareDefault()}}.
 # Placement rules like "specified", "reject", etc.: similarly to the 
scheduling policy, the set of accepted values is fixed, so do something like 
what I recommended in #2.

> Provide a builder interface for FairScheduler allocations for use in tests
> --
>
> Key: YARN-5106
> URL: https://issues.apache.org/jira/browse/YARN-5106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie++
> Attachments: YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.001.patch, YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.002.patch, YARN-5106-branch-3.2.001.patch, 
> YARN-5106-branch-3.2.001.patch, YARN-5106-branch-3.2.002.patch, 
> YARN-5106.001.patch, YARN-5106.002.patch, YARN-5106.003.patch, 
> YARN-5106.004.patch, YARN-5106.005.patch, YARN-5106.006.patch, 
> YARN-5106.007.patch, YARN-5106.008.patch, YARN-5106.008.patch, 
> YARN-5106.008.patch, YARN-5106.009.patch, YARN-5106.010.patch, 
> YARN-5106.011.patch, YARN-5106.012.patch, YARN-5106.013.patch, 
> YARN-5106.014.patch, YARN-5106.015.patch
>
>
> Most, if not all, fair scheduler tests create an allocations XML file. Having 
> a helper class that potentially uses a builder would make the tests cleaner. 






[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-29 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16985027#comment-16985027
 ] 

Szilard Nemeth commented on YARN-9052:
--

OK, the unit test failure was intermittent. 
[~sunilg] Please review the latest patch!

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: 
> YARN-9052-004withlogs-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs003-justfailed.txt, 
> YARN-9052-testlogs003-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs004-justfailed.txt, YARN-9052.001.patch, 
> YARN-9052.002.patch, YARN-9052.003.patch, YARN-9052.004.patch, 
> YARN-9052.004.withlogs.patch, YARN-9052.005.patch, YARN-9052.006.patch, 
> YARN-9052.007.patch, YARN-9052.008.patch, YARN-9052.009.patch, 
> YARN-9052.009.patch, YARN-9052.testlogs.002.patch, 
> YARN-9052.testlogs.002.patch, YARN-9052.testlogs.003.patch, 
> YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which makes 
> the code completely unreadable.
> On top of the unreadability, it's very hard to follow which RMApp will be 
> produced for a test, as tests often pass a lot of empty / null values as 
> parameters.
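For illustration, a small sketch of the builder pattern the issue proposes 
(hypothetical names; MockRM's real submission helper carries many more 
fields): optional parameters get defaults, so a test spells out only the 
values it cares about instead of a 22-argument call full of nulls.

{code:java}
public class AppSubmissionSketch {
  private final String name;
  private final String queue;
  private final int memoryMb;

  private AppSubmissionSketch(Builder b) {
    this.name = b.name;
    this.queue = b.queue;
    this.memoryMb = b.memoryMb;
  }

  static class Builder {
    private String name = "app";        // sensible defaults replace
    private String queue = "default";   // long runs of null arguments
    private int memoryMb = 1024;

    Builder name(String name) { this.name = name; return this; }
    Builder queue(String queue) { this.queue = queue; return this; }
    Builder memoryMb(int memoryMb) { this.memoryMb = memoryMb; return this; }

    AppSubmissionSketch build() { return new AppSubmissionSketch(this); }
  }

  public static void main(String[] args) {
    // Instead of submitApp(null, null, ..., 22 arguments ...):
    AppSubmissionSketch app = new Builder()
        .queue("root.default")
        .memoryMb(2048)
        .build();
    System.out.println(app.name + " -> " + app.queue + " @" + app.memoryMb);
  }
}
{code}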






[jira] [Commented] (YARN-9985) Unsupported "transitionToObserver" option displaying for rmadmin command

2019-11-29 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984999#comment-16984999
 ] 

Ayush Saxena commented on YARN-9985:


Found that failover was also mentioned in two places along with 
transitionToObserver; removed it as well, since it wasn't supposed to be there 
as per YARN-3397.
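For illustration, a toy sketch of the idea (not the actual RMAdminCLI code): 
build the printed usage only from the commands rmadmin actually supports, so 
HA operations inherited from the common HA admin layer, such as 
-transitionToObserver and -failover, never show up.

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

public class RmAdminUsageSketch {
  public static void main(String[] args) {
    Map<String, String> haUsage = new LinkedHashMap<>();
    haUsage.put("-transitionToActive", "[--forceactive] <serviceId>");
    haUsage.put("-transitionToStandby", "<serviceId>");
    haUsage.put("-transitionToObserver", "<serviceId>"); // not supported by RM
    haUsage.put("-failover", "<serviceId> <serviceId>"); // not supported by RM

    // Drop the operations rmadmin does not support before printing usage.
    haUsage.keySet().removeIf(cmd ->
        cmd.equals("-transitionToObserver") || cmd.equals("-failover"));

    haUsage.forEach((cmd, arg) -> System.out.println(cmd + " " + arg));
  }
}
{code}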

> Unsupported "transitionToObserver" option displaying for rmadmin command
> 
>
> Key: YARN-9985
> URL: https://issues.apache.org/jira/browse/YARN-9985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: RM, yarn
>Affects Versions: 3.2.1
>Reporter: Souryakanta Dwivedy
>Assignee: Ayush Saxena
>Priority: Minor
> Attachments: YARN-9985-01.patch, YARN-9985-02.patch, 
> image-2019-11-18-18-31-17-755.png, image-2019-11-18-18-35-54-688.png
>
>
> The unsupported "-transitionToObserver" option is displayed for the rmadmin 
> command. Check the options of the yarn rmadmin command: it displays the 
> "-transitionToObserver <serviceId>" option, which is not supported by yarn 
> rmadmin; this is wrong behavior. But if you check yarn rmadmin -help, it does 
> not display any "-transitionToObserver <serviceId>" option.
>  
> !image-2019-11-18-18-31-17-755.png!
>  
> ==
> install/hadoop/resourcemanager/bin> ./yarn rmadmin -help
> rmadmin is the command to execute YARN administrative commands.
> The full syntax is:
> yarn rmadmin [-refreshQueues] [-refreshNodes [-g|graceful [timeout in 
> seconds] -client|server]] [-refreshNodesResources] 
> [-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] 
> [-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] 
> [-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">] 
> [-removeFromClusterNodeLabels <label1,label2,label3>] [-replaceLabelsOnNode 
> <"node1[:port]=label1,label2 node2[:port]=label1"> [-failOnUnknownNodes]] 
> [-directlyAccessNodeLabelStore] [-refreshClusterMaxPriority] 
> [-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout]) or 
> -updateNodeResource [NodeID] [ResourceTypes] ([OvercommitTimeout])] 
> *{color:#FF0000}[-transitionToActive [--forceactive] <serviceId>]{color} 
> {color:#FF0000}[-transitionToStandby <serviceId>]{color}* [-getServiceState 
> <serviceId>] [-getAllServiceState] [-checkHealth <serviceId>] [-help [cmd]]
> -refreshQueues: Reload the queues' acls, states and scheduler specific 
> properties.
>  ResourceManager will reload the mapred-queues configuration file.
>  -refreshNodes [-g|graceful [timeout in seconds] -client|server]: Refresh the 
> hosts information at the ResourceManager. Here [-g|graceful [timeout in 
> seconds] -client|server] is optional, if we specify the timeout then 
> ResourceManager will wait for timeout before marking the NodeManager as 
> decommissioned. The -client|server indicates if the timeout tracking should 
> be handled by the client or the ResourceManager. The client-side tracking is 
> blocking, while the server-side tracking is not. Omitting the timeout, or a 
> timeout of -1, indicates an infinite timeout. Known Issue: the server-side 
> tracking will immediately decommission if an RM HA failover occurs.
>  -refreshNodesResources: Refresh resources of NodeManagers at the 
> ResourceManager.
>  -refreshSuperUserGroupsConfiguration: Refresh superuser proxy groups mappings
>  -refreshUserToGroupsMappings: Refresh user-to-groups mappings
>  -refreshAdminAcls: Refresh acls for administration of ResourceManager
>  -refreshServiceAcl: Reload the service-level authorization policy file.
>  ResourceManager will reload the authorization policy file.
>  -getGroups [username]: Get the groups which given user belongs to.
>  -addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">: add to cluster 
> node labels. Default exclusivity is true
>  -removeFromClusterNodeLabels <label1,label2,label3> (label splitted by ","): 
> remove from cluster node labels
>  -replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes] : replace labels on nodes 
> (please note that we do not support specifying multiple labels on a single 
> host for now.)
>  [-failOnUnknownNodes] is optional, when we set this option, it will fail if 
> specified nodes are unknown.
>  -directlyAccessNodeLabelStore: This is DEPRECATED, will be removed in future 
> releases. Directly access node label store, with this option, all node label 
> related operations will not connect RM. Instead, they will access/modify 
> stored node labels directly. By default, it is false (access via RM). AND 
> PLEASE NOTE: if you configured yarn.node-labels.fs-store.root-dir to a local 
> directory (instead of NFS or HDFS), this option will only work when the 
> command run on the machine where RM is running.
>  -refreshClusterMaxPriority: Refresh cluster max priority
> 

[jira] [Updated] (YARN-9985) Unsupported "transitionToObserver" option displaying for rmadmin command

2019-11-29 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated YARN-9985:
---
Attachment: YARN-9985-02.patch

> Unsupported "transitionToObserver" option displaying for rmadmin command
> 
>
> Key: YARN-9985
> URL: https://issues.apache.org/jira/browse/YARN-9985
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: RM, yarn
>Affects Versions: 3.2.1
>Reporter: Souryakanta Dwivedy
>Assignee: Ayush Saxena
>Priority: Minor
> Attachments: YARN-9985-01.patch, YARN-9985-02.patch, 
> image-2019-11-18-18-31-17-755.png, image-2019-11-18-18-35-54-688.png
>
>
> The unsupported "-transitionToObserver" option is displayed for the rmadmin 
> command. Check the options of the yarn rmadmin command: it displays the 
> "-transitionToObserver <serviceId>" option, which is not supported by yarn 
> rmadmin; this is wrong behavior. But if you check yarn rmadmin -help, it does 
> not display any "-transitionToObserver <serviceId>" option.
>  
> !image-2019-11-18-18-31-17-755.png!
>  
> ==
> install/hadoop/resourcemanager/bin> ./yarn rmadmin -help
> rmadmin is the command to execute YARN administrative commands.
> The full syntax is:
> yarn rmadmin [-refreshQueues] [-refreshNodes [-g|graceful [timeout in 
> seconds] -client|server]] [-refreshNodesResources] 
> [-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] 
> [-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] 
> [-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">] 
> [-removeFromClusterNodeLabels <label1,label2,label3>] [-replaceLabelsOnNode 
> <"node1[:port]=label1,label2 node2[:port]=label1"> [-failOnUnknownNodes]] 
> [-directlyAccessNodeLabelStore] [-refreshClusterMaxPriority] 
> [-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout]) or 
> -updateNodeResource [NodeID] [ResourceTypes] ([OvercommitTimeout])] 
> *{color:#FF0000}[-transitionToActive [--forceactive] <serviceId>]{color} 
> {color:#FF0000}[-transitionToStandby <serviceId>]{color}* [-getServiceState 
> <serviceId>] [-getAllServiceState] [-checkHealth <serviceId>] [-help [cmd]]
> -refreshQueues: Reload the queues' acls, states and scheduler specific 
> properties.
>  ResourceManager will reload the mapred-queues configuration file.
>  -refreshNodes [-g|graceful [timeout in seconds] -client|server]: Refresh the 
> hosts information at the ResourceManager. Here [-g|graceful [timeout in 
> seconds] -client|server] is optional, if we specify the timeout then 
> ResourceManager will wait for timeout before marking the NodeManager as 
> decommissioned. The -client|server indicates if the timeout tracking should 
> be handled by the client or the ResourceManager. The client-side tracking is 
> blocking, while the server-side tracking is not. Omitting the timeout, or a 
> timeout of -1, indicates an infinite timeout. Known Issue: the server-side 
> tracking will immediately decommission if an RM HA failover occurs.
>  -refreshNodesResources: Refresh resources of NodeManagers at the 
> ResourceManager.
>  -refreshSuperUserGroupsConfiguration: Refresh superuser proxy groups mappings
>  -refreshUserToGroupsMappings: Refresh user-to-groups mappings
>  -refreshAdminAcls: Refresh acls for administration of ResourceManager
>  -refreshServiceAcl: Reload the service-level authorization policy file.
>  ResourceManager will reload the authorization policy file.
>  -getGroups [username]: Get the groups which given user belongs to.
>  -addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">: add to cluster 
> node labels. Default exclusivity is true
>  -removeFromClusterNodeLabels <label1,label2,label3> (label splitted by ","): 
> remove from cluster node labels
>  -replaceLabelsOnNode <"node1[:port]=label1,label2 
> node2[:port]=label1,label2"> [-failOnUnknownNodes] : replace labels on nodes 
> (please note that we do not support specifying multiple labels on a single 
> host for now.)
>  [-failOnUnknownNodes] is optional, when we set this option, it will fail if 
> specified nodes are unknown.
>  -directlyAccessNodeLabelStore: This is DEPRECATED, will be removed in future 
> releases. Directly access node label store, with this option, all node label 
> related operations will not connect RM. Instead, they will access/modify 
> stored node labels directly. By default, it is false (access via RM). AND 
> PLEASE NOTE: if you configured yarn.node-labels.fs-store.root-dir to a local 
> directory (instead of NFS or HDFS), this option will only work when the 
> command run on the machine where RM is running.
>  -refreshClusterMaxPriority: Refresh cluster max priority
>  -updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
>  or
>  [NodeID] [resourcetypes] ([OvercommitTimeout]). : Update resource on 
> 

[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984991#comment-16984991
 ] 

Hadoop QA commented on YARN-9052:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 88 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
6s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 19m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 19m 
29s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
3m 36s{color} | {color:orange} root: The patch generated 42 new + 1829 
unchanged - 58 fixed = 1871 total (was 1887) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 84m 
47s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 26m  
9s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m  
1s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
48s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}237m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9052 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987130/YARN-9052.009.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 213e8e3019a1 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984948#comment-16984948
 ] 

Hadoop QA commented on YARN-5106:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
1s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 29 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
25s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 278 unchanged - 52 fixed = 278 total (was 330) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 87m 
25s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 26m 
23s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}197m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-5106 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12987127/YARN-5106.015.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c30178192070 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / a2dadac |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 

[jira] [Commented] (YARN-9923) Introduce HealthReporter interface to support multiple health checker files

2019-11-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984905#comment-16984905
 ] 

Hadoop QA commented on YARN-9923:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
51s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 26 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
21m 22s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m  
6s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 
38s{color} | {color:green} root generated 0 new + 1868 unchanged - 2 fixed = 
1868 total (was 1870) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 55s{color} | {color:orange} root: The patch generated 2 new + 596 unchanged 
- 52 fixed = 598 total (was 648) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  3s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
18s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
52s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
36s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| 

[jira] [Commented] (YARN-9925) CapacitySchedulerQueueManager allows unsupported Queue hierarchy

2019-11-29 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984890#comment-16984890
 ] 

Peter Bacsko commented on YARN-9925:


[~prabhujoseph] could you create a follow-up Jira about changing 
{{IOException}} to {{YarnException}}?

> CapacitySchedulerQueueManager allows unsupported Queue hierarchy
> 
>
> Key: YARN-9925
> URL: https://issues.apache.org/jira/browse/YARN-9925
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9925-001.patch, YARN-9925-002.patch, 
> YARN-9925-003.patch, YARN-9925-004.patch, YARN-9925-005.patch
>
>
> CapacitySchedulerQueueManager allows an unsupported queue hierarchy. When 
> creating a queue with the same name as an existing parent queue, it has to 
> fail with the error below.
> {code:java}
> Caused by: java.io.IOException: A is moved from:root.A to:root.B.A after 
> refresh, which is not allowed.
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.validateQueueHierarchy(CapacitySchedulerQueueManager.java:335)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.reinitializeQueues(CapacitySchedulerQueueManager.java:180)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:762)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:473)
>  ... 70 more 
> {code}
> In some cases, the error is not thrown while creating the queue but at job 
> submission: "Failed to submit application_1571677375269_0002 to YARN : 
> Application application_1571677375269_0002 submitted by user : systest to 
> non-leaf queue : B"
> The scenarios below are allowed but should not be:
> {code:java}
> It allows root.A.A1.B when root.B.B1 already exists.
>
> 1. Add root.A
> 2. Add root.A.A1
> 3. Add root.B
> 4. Add root.B.B1
> 5. Allows Add of root.A.A1.B 
> It allows two root queues:
>
> 1. Add root.A
> 2. Add root.B
> 3. Add root.A.A1
> 4. Allows Add of root.A.A1.root
>
> {code}
> The scenario below is handled properly:
> {code:java}
> It does not allow root.B.A when root.A.A1 already exists.
>  
> 1. Add root.A
> 2. Add root.B
> 3. Add root.A.A1
> 4. Does not Allow Add of root.B.A
> {code}
> This error handling has to be consistent in all scenarios.
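>
> For illustration, a minimal sketch of the kind of check that could make it 
> consistent (editor's sketch; names such as {{queues}} and 
> {{getQueueShortName}} are assumptions, not the actual patch):
> {code:java}
> // Reject a new queue whose short name already exists anywhere else in
> // the hierarchy, regardless of the order the queues are defined in.
> CSQueue existing = queues.get(newQueue.getQueueShortName());
> if (existing != null
>     && !existing.getQueuePath().equals(newQueue.getQueuePath())) {
>   throw new IOException("Queue " + newQueue.getQueueShortName()
>       + " already exists at " + existing.getQueuePath()
>       + ", so adding " + newQueue.getQueuePath() + " is not allowed.");
> }
> {code}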






[jira] [Commented] (YARN-9938) Validate Parent Queue for QueueMapping contains dynamic group as parent queue

2019-11-29 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984888#comment-16984888
 ] 

Peter Bacsko commented on YARN-9938:


+1 (non-binding)

[~snemeth] please review

> Validate Parent Queue for QueueMapping contains dynamic group as parent queue
> -
>
> Key: YARN-9938
> URL: https://issues.apache.org/jira/browse/YARN-9938
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9938.001.patch, YARN-9938.002.patch, 
> YARN-9938.003.patch, YARN-9938.004.patch, YARN-9938.005.patch, 
> YARN-9938.006.patch
>
>
> Currently {{UserGroupMappingPlacementRule#validateParentQueue}} validates 
> the parent queue using the queue path. With dynamic groups via 
> %primary_group and %secondary_group in place (refer to YARN-9841 and 
> YARN-9865), parent queue validation should also happen for these two queue 
> mappings, after resolving the wildcard pattern to the corresponding groups 
> at runtime.
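>
> For illustration, a hedged sketch of that resolution step (editor's sketch; 
> {{groups}}, {{mapping}} and the helper names are assumptions, not the 
> actual patch):
> {code:java}
> // Resolve %primary_group / %secondary_group to a concrete group name
> // first, so the existing path-based parent validation has a real queue
> // path to check.
> String parent = mapping.getParentQueue();
> if ("%primary_group".equals(parent)) {
>   parent = groups.getGroups(user).get(0);   // primary group comes first
> } else if ("%secondary_group".equals(parent)) {
>   parent = resolveSecondaryGroup(user);     // assumed helper
> }
> validateParentQueue(queueManager.getQueue(parent), parent, mapping.getQueue());
> {code}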






[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-29 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984816#comment-16984816
 ] 

Szilard Nemeth commented on YARN-9052:
--

Re-attaching the patch to see if the test failure is intermittent or not.

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: 
> YARN-9052-004withlogs-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs003-justfailed.txt, 
> YARN-9052-testlogs003-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs004-justfailed.txt, YARN-9052.001.patch, 
> YARN-9052.002.patch, YARN-9052.003.patch, YARN-9052.004.patch, 
> YARN-9052.004.withlogs.patch, YARN-9052.005.patch, YARN-9052.006.patch, 
> YARN-9052.007.patch, YARN-9052.008.patch, YARN-9052.009.patch, 
> YARN-9052.009.patch, YARN-9052.testlogs.002.patch, 
> YARN-9052.testlogs.002.patch, YARN-9052.testlogs.003.patch, 
> YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which 
> makes the code completely unreadable.
> On top of the unreadability, it's very hard to follow what RMApp will be 
> produced for the tests, as they often pass a lot of empty / null values as 
> parameters.
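>
> For illustration, a hedged sketch of the builder-style replacement (editor's 
> sketch; all names here are assumptions, not the actual patch):
> {code:java}
> // One fluent builder instead of 31 positional overloads; unset fields
> // get sensible defaults instead of the null / empty placeholders the
> // old signatures forced callers to pass.
> RMApp app = new MockAppSubmissionBuilder(rm)
>     .withMemory(1024)
>     .withAppName("test-app")
>     .withUser("user1")
>     .withQueue("default")
>     .withMaxAppAttempts(2)
>     .submit();
> {code}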






[jira] [Updated] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-29 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9052:
-
Attachment: YARN-9052.009.patch

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: 
> YARN-9052-004withlogs-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs003-justfailed.txt, 
> YARN-9052-testlogs003-patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt,
>  YARN-9052-testlogs004-justfailed.txt, YARN-9052.001.patch, 
> YARN-9052.002.patch, YARN-9052.003.patch, YARN-9052.004.patch, 
> YARN-9052.004.withlogs.patch, YARN-9052.005.patch, YARN-9052.006.patch, 
> YARN-9052.007.patch, YARN-9052.008.patch, YARN-9052.009.patch, 
> YARN-9052.009.patch, YARN-9052.testlogs.002.patch, 
> YARN-9052.testlogs.002.patch, YARN-9052.testlogs.003.patch, 
> YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which 
> makes the code completely unreadable.
> On top of the unreadability, it's very hard to follow what RMApp will be 
> produced for the tests, as they often pass a lot of empty / null values as 
> parameters.






[jira] [Updated] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-5106:
-
Attachment: YARN-5106.015.patch

> Provide a builder interface for FairScheduler allocations for use in tests
> --
>
> Key: YARN-5106
> URL: https://issues.apache.org/jira/browse/YARN-5106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie++
> Attachments: YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.001.patch, YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.002.patch, YARN-5106-branch-3.2.001.patch, 
> YARN-5106-branch-3.2.001.patch, YARN-5106-branch-3.2.002.patch, 
> YARN-5106.001.patch, YARN-5106.002.patch, YARN-5106.003.patch, 
> YARN-5106.004.patch, YARN-5106.005.patch, YARN-5106.006.patch, 
> YARN-5106.007.patch, YARN-5106.008.patch, YARN-5106.008.patch, 
> YARN-5106.008.patch, YARN-5106.009.patch, YARN-5106.010.patch, 
> YARN-5106.011.patch, YARN-5106.012.patch, YARN-5106.013.patch, 
> YARN-5106.014.patch, YARN-5106.015.patch
>
>
> Most, if not all, fair scheduler tests create an allocations XML file. Having 
> a helper class that potentially uses a builder would make the tests cleaner. 
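>
> For illustration, a hedged sketch of what such a helper could look like in a 
> test (editor's sketch; {{AllocationFileWriter}} and its methods are assumed 
> names, not the actual patch):
> {code:java}
> // Build the allocations XML programmatically instead of concatenating
> // XML strings by hand in every test.
> AllocationFileWriter.create()
>     .addQueue(new AllocationFileQueue.Builder("queueA")
>         .minResources("1024mb,1vcores")
>         .weight(2.0f)
>         .build())
>     .fairDefaultQueueSchedulingPolicy()
>     .writeToFile(ALLOC_FILE);
> {code}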






[jira] [Commented] (YARN-9990) Testcase fails with "Insufficient configured threads: required=16 < max=10"

2019-11-29 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984810#comment-16984810
 ] 

Prabhu Joseph commented on YARN-9990:
-

Thanks [~abmodi].

> Testcase fails with "Insufficient configured threads: required=16 < max=10"
> ---
>
> Key: YARN-9990
> URL: https://issues.apache.org/jira/browse/YARN-9990
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9990-001.patch
>
>
> Testcases fail with "Insufficient configured threads: required=16 < max=10". 
> The testcases below are failing:
> 1. TestWebAppProxyServlet
> 2. TestAmFilter
> 3. TestApiServiceClient
> 4. TestSecureApiServiceClient
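>
> A hedged sketch of the usual remedy (editor's illustration, assuming the 
> tests stand up their own embedded Jetty server; the class name is 
> illustrative): size the server's QueuedThreadPool to at least the 16 threads 
> the budget check demands. The failing stack trace follows below.
> {code:java}
> import org.eclipse.jetty.server.Server;
> import org.eclipse.jetty.server.ServerConnector;
> import org.eclipse.jetty.util.thread.QueuedThreadPool;
>
> public class ProxyTestServer {
>   // A pool capped at 10 threads trips Jetty's ThreadPoolBudget
>   // ("required=16 < max=10"); 16 or more threads pass the check.
>   static Server start() throws Exception {
>     Server server = new Server(new QueuedThreadPool(16));
>     server.addConnector(new ServerConnector(server));
>     server.start();
>     return server;
>   }
> }
> {code}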
> {code}
> [ERROR] org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet  Time 
> elapsed: 0.396 s  <<< ERROR!
> java.lang.IllegalStateException: Insufficient configured threads: required=16 
> < max=10 for 
> QueuedThreadPool[qtp1597249648]@5f341870{STARTED,8<=8<=10,i=8,r=1,q=0}[ReservedThreadExecutor@4c762604{s=0/1,p=0}]
>   at 
> org.eclipse.jetty.util.thread.ThreadPoolBudget.check(ThreadPoolBudget.java:156)
>   at 
> org.eclipse.jetty.util.thread.ThreadPoolBudget.leaseTo(ThreadPoolBudget.java:130)
>   at 
> org.eclipse.jetty.util.thread.ThreadPoolBudget.leaseFrom(ThreadPoolBudget.java:182)
>   at 
> org.eclipse.jetty.io.SelectorManager.doStart(SelectorManager.java:255)
>   at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
>   at 
> org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
>   at 
> org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
>   at 
> org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:283)
>   at 
> org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:81)
>   at 
> org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:231)
>   at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
>   at org.eclipse.jetty.server.Server.doStart(Server.java:385)
>   at 
> org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
>   at 
> org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet.start(TestWebAppProxyServlet.java:102)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:24)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> [INFO] Running org.apache.hadoop.yarn.server.webproxy.amfilter.TestAmFilter
> [ERROR] Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 2.326 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.webproxy.amfilter.TestAmFilter
> [ERROR] 
> testFindRedirectUrl(org.apache.hadoop.yarn.server.webproxy.amfilter.TestAmFilter)
>   Time elapsed: 0.306 s  <<< ERROR!
> java.lang.IllegalStateException: Insufficient configured threads: required=16 
> < max=10 for 
> QueuedThreadPool[qtp485041780]@1ce92674{STARTED,8<=8<=10,i=8,r=1,q=0}[ReservedThreadExecutor@31f924f5{s=0/1,p=0}]
>   at 
> 

[jira] [Commented] (YARN-5106) Provide a builder interface for FairScheduler allocations for use in tests

2019-11-29 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984811#comment-16984811
 ] 

Adam Antal commented on YARN-5106:
--

Fixed the typo causing the test failure and the last checkstyle issues in v15. Please review.

> Provide a builder interface for FairScheduler allocations for use in tests
> --
>
> Key: YARN-5106
> URL: https://issues.apache.org/jira/browse/YARN-5106
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Adam Antal
>Priority: Major
>  Labels: newbie++
> Attachments: YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.001.patch, YARN-5106-branch-3.1.001.patch, 
> YARN-5106-branch-3.1.002.patch, YARN-5106-branch-3.2.001.patch, 
> YARN-5106-branch-3.2.001.patch, YARN-5106-branch-3.2.002.patch, 
> YARN-5106.001.patch, YARN-5106.002.patch, YARN-5106.003.patch, 
> YARN-5106.004.patch, YARN-5106.005.patch, YARN-5106.006.patch, 
> YARN-5106.007.patch, YARN-5106.008.patch, YARN-5106.008.patch, 
> YARN-5106.008.patch, YARN-5106.009.patch, YARN-5106.010.patch, 
> YARN-5106.011.patch, YARN-5106.012.patch, YARN-5106.013.patch, 
> YARN-5106.014.patch
>
>
> Most, if not all, fair scheduler tests create an allocations XML file. Having 
> a helper class that potentially uses a builder would make the tests cleaner. 






[jira] [Updated] (YARN-9923) Introduce HealthReporter interface to support multiple health checker files

2019-11-29 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-9923:
-
Attachment: YARN-9923.008.patch

> Introduce HealthReporter interface to support multiple health checker files
> ---
>
> Key: YARN-9923
> URL: https://issues.apache.org/jira/browse/YARN-9923
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9923.001.patch, YARN-9923.002.patch, 
> YARN-9923.003.patch, YARN-9923.004.patch, YARN-9923.005.patch, 
> YARN-9923.006.patch, YARN-9923.007.patch, YARN-9923.008.patch
>
>
> Currently, if a NodeManager is enabled to allocate Docker containers but the 
> specified binary (docker.binary in the container-executor.cfg) is missing, 
> the container allocation fails with the following error message:
> {noformat}
> Container launch fails
> Exit code: 29
> Exception message: Launch container failed
> Shell error output: sh: : No 
> such file or directory
> Could not inspect docker network to get type /usr/bin/docker network inspect 
> host --format='{{.Driver}}'.
> Error constructing docker command, docker error code=-1, error 
> message='Unknown error'
> {noformat}
> I suggest adding a property, say 
> "yarn.nodemanager.runtime.linux.docker.check", with the following options:
> - STARTUP: with this option set, the NodeManager would not start if the 
> Docker binaries are missing or the Docker daemon is not running (the 
> exception is considered FATAL during startup)
> - RUNTIME: would give a more detailed/user-friendly exception on the 
> NodeManager's side (NM logs) if the Docker binaries are missing or the 
> daemon is not working. This would also prevent further Docker container 
> allocation as long as the binaries do not exist or the Docker daemon is not 
> running.
> - NONE (default): preserves the current behaviour, throwing an exception 
> during container allocation and carrying on with the default retry procedure.
> 
> A new interface called {{HealthChecker}} is introduced, which is used in the 
> {{NodeHealthCheckerService}}. Existing implementations like 
> {{LocalDirsHandlerService}} are modified to implement it, giving a clear 
> abstraction of the node's health. The {{DockerHealthChecker}} also 
> implements this new interface.
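>
> For illustration, a hedged sketch of the shape such an interface could take 
> (editor's sketch; the method names are assumptions, not the actual patch):
> {code:java}
> // A common contract so the dirs handler, the health script runner and
> // a Docker checker can all report into NodeHealthCheckerService the
> // same way.
> public interface HealthReporter {
>   boolean isHealthy();              // current verdict of this checker
>   String getHealthReport();         // human-readable reason if unhealthy
>   long getLastHealthReportTime();   // when the verdict was produced
> }
> {code}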






[jira] [Updated] (YARN-9923) Introduce HealthReporter interface to support multiple health checker files

2019-11-29 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-9923:
-
Summary: Introduce HealthReporter interface to support multiple health 
checker files  (was: Introduce HealthReporter interface and implement running 
Docker daemon checker)

> Introduce HealthReporter interface to support multiple health checker files
> ---
>
> Key: YARN-9923
> URL: https://issues.apache.org/jira/browse/YARN-9923
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9923.001.patch, YARN-9923.002.patch, 
> YARN-9923.003.patch, YARN-9923.004.patch, YARN-9923.005.patch, 
> YARN-9923.006.patch, YARN-9923.007.patch
>
>
> Currently, if a NodeManager is enabled to allocate Docker containers but the 
> specified binary (docker.binary in the container-executor.cfg) is missing, 
> the container allocation fails with the following error message:
> {noformat}
> Container launch fails
> Exit code: 29
> Exception message: Launch container failed
> Shell error output: sh: : No 
> such file or directory
> Could not inspect docker network to get type /usr/bin/docker network inspect 
> host --format='{{.Driver}}'.
> Error constructing docker command, docker error code=-1, error 
> message='Unknown error'
> {noformat}
> I suggest adding a property, say 
> "yarn.nodemanager.runtime.linux.docker.check", with the following options:
> - STARTUP: with this option set, the NodeManager would not start if the 
> Docker binaries are missing or the Docker daemon is not running (the 
> exception is considered FATAL during startup)
> - RUNTIME: would give a more detailed/user-friendly exception on the 
> NodeManager's side (NM logs) if the Docker binaries are missing or the 
> daemon is not working. This would also prevent further Docker container 
> allocation as long as the binaries do not exist or the Docker daemon is not 
> running.
> - NONE (default): preserves the current behaviour, throwing an exception 
> during container allocation and carrying on with the default retry procedure.
> 
> A new interface called {{HealthChecker}} is introduced, which is used in the 
> {{NodeHealthCheckerService}}. Existing implementations like 
> {{LocalDirsHandlerService}} are modified to implement it, giving a clear 
> abstraction of the node's health. The {{DockerHealthChecker}} also 
> implements this new interface.






[jira] [Commented] (YARN-9923) Introduce HealthReporter interface and implement running Docker daemon checker

2019-11-29 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984802#comment-16984802
 ] 

Adam Antal commented on YARN-9923:
--

Resolved the last test failure and the remaining checkstyle issue. I will not 
make a final class out of {{NodeHealthScriptRunner}}, because it is mocked in 
{{TestNodeHealthCheckerService#testNodeHealthService()}}.
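
For context, a hedged illustration of the constraint (editor's sketch, assuming Mockito's default mock maker, which subclasses the mocked type and therefore cannot handle final classes; {{isHealthy()}} is used only as an example method):
{code:java}
// Mockito generates a runtime subclass of the mocked type, which is
// impossible for a final class, so NodeHealthScriptRunner must stay
// non-final for this mock to keep working.
NodeHealthScriptRunner runner = Mockito.mock(NodeHealthScriptRunner.class);
Mockito.when(runner.isHealthy()).thenReturn(false);
{code}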

Please review.

> Introduce HealthReporter interface and implement running Docker daemon checker
> --
>
> Key: YARN-9923
> URL: https://issues.apache.org/jira/browse/YARN-9923
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9923.001.patch, YARN-9923.002.patch, 
> YARN-9923.003.patch, YARN-9923.004.patch, YARN-9923.005.patch, 
> YARN-9923.006.patch, YARN-9923.007.patch
>
>
> Currently, if a NodeManager is enabled to allocate Docker containers but the 
> specified binary (docker.binary in the container-executor.cfg) is missing, 
> the container allocation fails with the following error message:
> {noformat}
> Container launch fails
> Exit code: 29
> Exception message: Launch container failed
> Shell error output: sh: : No 
> such file or directory
> Could not inspect docker network to get type /usr/bin/docker network inspect 
> host --format='{{.Driver}}'.
> Error constructing docker command, docker error code=-1, error 
> message='Unknown error'
> {noformat}
> I suggest adding a property, say 
> "yarn.nodemanager.runtime.linux.docker.check", with the following options:
> - STARTUP: with this option set, the NodeManager would not start if the 
> Docker binaries are missing or the Docker daemon is not running (the 
> exception is considered FATAL during startup)
> - RUNTIME: would give a more detailed/user-friendly exception on the 
> NodeManager's side (NM logs) if the Docker binaries are missing or the 
> daemon is not working. This would also prevent further Docker container 
> allocation as long as the binaries do not exist or the Docker daemon is not 
> running.
> - NONE (default): preserves the current behaviour, throwing an exception 
> during container allocation and carrying on with the default retry procedure.
> 
> A new interface called {{HealthChecker}} is introduced, which is used in the 
> {{NodeHealthCheckerService}}. Existing implementations like 
> {{LocalDirsHandlerService}} are modified to implement it, giving a clear 
> abstraction of the node's health. The {{DockerHealthChecker}} also 
> implements this new interface.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org