[jira] [Commented] (YARN-4948) Support node labels store in zookeeper

2016-04-13 Thread jialei weng (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238763#comment-15238763
 ] 

jialei weng commented on YARN-4948:
---

Ok, I will.

> Support node labels store in zookeeper
> --
>
> Key: YARN-4948
> URL: https://issues.apache.org/jira/browse/YARN-4948
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: jialei weng
> Attachments: YARN-4948-branch-2.7.0.001.patch, YARN-4948.001.patch
>
>
> Support node labels store in zookeeper



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4954) TestYarnClient.testReservationAPIs fails on machines with less than 4 GB available memory

2016-04-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Novák updated YARN-4954:

Description: 
TestYarnClient.testReservationAPIs sometimes fails with this error:
{noformat}
java.lang.AssertionError: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1254)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitReservation(ApplicationClientProtocolPBServiceImpl.java:457)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
Caused by: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.IterativePlanner.computeJobAllocation(IterativePlanner.java:151)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.allocateUser(PlanningAlgorithm.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.createReservation(PlanningAlgorithm.java:140)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.TryManyReservationAgents.createReservation(TryManyReservationAgents.java:55)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.AlignedPlannerWithGreedy.createReservation(AlignedPlannerWithGreedy.java:84)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1237)
... 10 more
{noformat}

This is caused by genuinely not having enough available memory to complete the 
reservation (4 * 1024 MB). In my opinion, lowering the required memory (either 
by reducing the number of containers to 2, or the per-container memory to 512 
MB) would make the test more stable. 
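
A minimal sketch of the proposed change (hypothetical shape, not the actual 
patch): the test currently asks for 4 containers of 1024 MB each, so either 
lowered variant keeps the ask well inside a 4 GB host.

{code}
import org.apache.hadoop.yarn.api.records.ReservationRequest;
import org.apache.hadoop.yarn.api.records.Resource;

long duration = 60000; // arbitrary reservation duration for the sketch

// Current ask: 4 containers x 1024 MB = 4 GB, too big for small test hosts.
ReservationRequest current =
    ReservationRequest.newInstance(Resource.newInstance(1024, 1), 4, 1, duration);

// Option A: 2 containers x 1024 MB = 2 GB.
ReservationRequest fewerContainers =
    ReservationRequest.newInstance(Resource.newInstance(1024, 1), 2, 1, duration);

// Option B: 4 containers x 512 MB = 2 GB.
ReservationRequest smallerContainers =
    ReservationRequest.newInstance(Resource.newInstance(512, 1), 4, 1, duration);
{code}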

  was:
TestYarnClient.testReservationAPIs sometimes fails with this error:
```
java.lang.AssertionError: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1254)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitReservation(ApplicationClientProtocolPBServiceImpl.java:457)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
Caused by: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.IterativePlanner.computeJobAllocation(IterativePlanner.java:151)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.allocateUser(PlanningAlgorithm.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.createReservation(PlanningAlgorithm.java:140)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.TryManyReservationAgents.createReservation(TryManyReservationAgents.java:55)
at 

[jira] [Updated] (YARN-4954) TestYarnClient.testReservationAPIs fails on machines with less than 4 GB available memory

2016-04-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Novák updated YARN-4954:

Description: 
TestYarnClient.testReservationAPIs sometimes fails with this error:
{noformat}
java.lang.AssertionError: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1254)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitReservation(ApplicationClientProtocolPBServiceImpl.java:457)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
Caused by: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.IterativePlanner.computeJobAllocation(IterativePlanner.java:151)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.allocateUser(PlanningAlgorithm.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.createReservation(PlanningAlgorithm.java:140)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.TryManyReservationAgents.createReservation(TryManyReservationAgents.java:55)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.AlignedPlannerWithGreedy.createReservation(AlignedPlannerWithGreedy.java:84)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1237)
... 10 more
at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.hadoop.yarn.client.api.impl.TestYarnClient.testReservationAPIs(TestYarnClient.java:1227)
{noformat}

This is caused by genuinely not having enough available memory to complete the 
reservation (4 * 1024 MB). In my opinion, lowering the required memory (either 
by reducing the number of containers to 2, or the per-container memory to 512 
MB) would make the test more stable. 

  was:
TestYarnClient.testReservationAPIs sometimes fails with this error:
{noformat}
java.lang.AssertionError: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1254)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitReservation(ApplicationClientProtocolPBServiceImpl.java:457)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
Caused by: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.IterativePlanner.computeJobAllocation(IterativePlanner.java:151)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.allocateUser(PlanningAlgorithm.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.createReservation(PlanningAlgorithm.java:140)
at 

[jira] [Updated] (YARN-4954) TestYarnClient.testReservationAPIs fails on machines with less than 4 GB available memory

2016-04-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Novák updated YARN-4954:

Attachment: YARN-4954.001.patch

> TestYarnClient.testReservationAPIs fails on machines with less than 4 GB 
> available memory
> -
>
> Key: YARN-4954
> URL: https://issues.apache.org/jira/browse/YARN-4954
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: test
>Affects Versions: 3.0.0
>Reporter: Gergely Novák
>Assignee: Gergely Novák
>Priority: Minor
> Attachments: YARN-4954.001.patch
>
>
> TestYarnClient.testReservationAPIs sometimes fails with this error:
> {noformat}
> java.lang.AssertionError: 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
>  The request cannot be satisfied
>   at 
> org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1254)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitReservation(ApplicationClientProtocolPBServiceImpl.java:457)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
> Caused by: 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
>  The request cannot be satisfied
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.IterativePlanner.computeJobAllocation(IterativePlanner.java:151)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.allocateUser(PlanningAlgorithm.java:64)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.createReservation(PlanningAlgorithm.java:140)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.TryManyReservationAgents.createReservation(TryManyReservationAgents.java:55)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.AlignedPlannerWithGreedy.createReservation(AlignedPlannerWithGreedy.java:84)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1237)
>   ... 10 more
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TestYarnClient.testReservationAPIs(TestYarnClient.java:1227)
> {noformat}
> This is caused by genuinely not having enough available memory to complete 
> the reservation (4 * 1024 MB). In my opinion, lowering the required memory 
> (either by reducing the number of containers to 2, or the per-container 
> memory to 512 MB) would make the test more stable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes

2016-04-13 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238884#comment-15238884
 ] 

Bibin A Chundatt commented on YARN-4947:


[~rohithsharma]
For the current test case the RM is not required to be started; I have updated 
the test code to always return true for {{isDrained}}.
Could you please review the attached patch?
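
A minimal sketch of the described change, assuming the test can swap in a 
dispatcher whose {{isDrained()}} always reports true (the override itself is 
the hypothetical part, not the attached patch):

{code}
import org.apache.hadoop.yarn.event.DrainDispatcher;

// The web-services test does not exercise event draining, so report the
// dispatcher as drained immediately instead of waiting on a started RM.
DrainDispatcher dispatcher = new DrainDispatcher() {
  @Override
  public boolean isDrained() {
    return true;
  }
};
{code}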

> Test timeout is happening for TestRMWebServicesNodes
> 
>
> Key: YARN-4947
> URL: https://issues.apache.org/jira/browse/YARN-4947
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4947.patch
>
>
> Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 
> [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4954) TestYarnClient.testReservationAPIs fails on machines with less than 4 GB available memory

2016-04-13 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Novák updated YARN-4954:

Description: 
TestYarnClient.testReservationAPIs sometimes fails with this error:
```
java.lang.AssertionError: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1254)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitReservation(ApplicationClientProtocolPBServiceImpl.java:457)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
Caused by: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.IterativePlanner.computeJobAllocation(IterativePlanner.java:151)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.allocateUser(PlanningAlgorithm.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.createReservation(PlanningAlgorithm.java:140)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.TryManyReservationAgents.createReservation(TryManyReservationAgents.java:55)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.AlignedPlannerWithGreedy.createReservation(AlignedPlannerWithGreedy.java:84)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1237)
... 10 more
```

This is caused by genuinely not having enough available memory to complete the 
reservation (4 * 1024 MB). In my opinion, lowering the required memory (either 
by reducing the number of containers to 2, or the per-container memory to 512 
MB) would make the test more stable. 

  was:
TestYarnClient.testReservationAPIs sometimes fails with this error:
{{java.lang.AssertionError: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1254)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitReservation(ApplicationClientProtocolPBServiceImpl.java:457)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
Caused by: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.IterativePlanner.computeJobAllocation(IterativePlanner.java:151)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.allocateUser(PlanningAlgorithm.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.createReservation(PlanningAlgorithm.java:140)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.TryManyReservationAgents.createReservation(TryManyReservationAgents.java:55)
at 

[jira] [Created] (YARN-4954) TestYarnClient.testReservationAPIs fails on machines with less than 4 GB available memory

2016-04-13 Thread JIRA
Gergely Novák created YARN-4954:
---

 Summary: TestYarnClient.testReservationAPIs fails on machines with 
less than 4 GB available memory
 Key: YARN-4954
 URL: https://issues.apache.org/jira/browse/YARN-4954
 Project: Hadoop YARN
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Gergely Novák
Assignee: Gergely Novák
Priority: Minor


TestYarnClient.testReservationAPIs sometimes fails with this error:
{{java.lang.AssertionError: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1254)
at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitReservation(ApplicationClientProtocolPBServiceImpl.java:457)
at 
org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2422)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2418)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2416)
Caused by: 
org.apache.hadoop.yarn.server.resourcemanager.reservation.exceptions.PlanningException:
 The request cannot be satisfied
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.IterativePlanner.computeJobAllocation(IterativePlanner.java:151)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.allocateUser(PlanningAlgorithm.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.PlanningAlgorithm.createReservation(PlanningAlgorithm.java:140)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.TryManyReservationAgents.createReservation(TryManyReservationAgents.java:55)
at 
org.apache.hadoop.yarn.server.resourcemanager.reservation.planning.AlignedPlannerWithGreedy.createReservation(AlignedPlannerWithGreedy.java:84)
at 
org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitReservation(ClientRMService.java:1237)
... 10 more


at org.junit.Assert.fail(Assert.java:88)
at 
org.apache.hadoop.yarn.client.api.impl.TestYarnClient.testReservationAPIs(TestYarnClient.java:1227)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:119)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:42)
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234)
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 

[jira] [Commented] (YARN-4862) Handle duplicate completed containers in RMNodeImpl

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238959#comment-15238959
 ] 

Hadoop QA commented on YARN-4862:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
32s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 0 new + 46 unchanged - 1 fixed = 46 total (was 47) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 28s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 43s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 144m 32s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.8.0_77 Timed out junit tests | 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
| JDK v1.7.0_95 Failed junit tests | 

[jira] [Commented] (YARN-4859) [Bug] Unable to submit a job to a reservation when using FairScheduler

2016-04-13 Thread Kai Sasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15238987#comment-15238987
 ] 

Kai Sasaki commented on YARN-4859:
--

We are facing the same issue when using FairScheduler. Restarting the 
ResourceManager seems to be a workaround for us, but we are not sure of the 
cause of the issue yet.

> [Bug] Unable to submit a job to a reservation when using FairScheduler
> --
>
> Key: YARN-4859
> URL: https://issues.apache.org/jira/browse/YARN-4859
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Subru Krishnan
>Assignee: Arun Suresh
>
> Jobs submitted to a reservation get stuck at scheduled stage when using 
> FairScheduler. I came across this when working on YARN-4827 (documentation 
> for configuring ReservationSystem for FairScheduler)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3863) Support complex filters in TimelineReader

2016-04-13 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239001#comment-15239001
 ] 

Varun Saxena commented on YARN-3863:


Thanks [~sjlee0] for the commit and thanks [~djp] and [~jrottinghuis] for 
reviews.

> Support complex filters in TimelineReader
> -
>
> Key: YARN-3863
> URL: https://issues.apache.org/jira/browse/YARN-3863
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: YARN-2928
>Reporter: Varun Saxena
>Assignee: Varun Saxena
>  Labels: yarn-2928-1st-milestone
> Fix For: YARN-2928
>
> Attachments: YARN-3863-YARN-2928.v2.01.patch, 
> YARN-3863-YARN-2928.v2.02.patch, YARN-3863-YARN-2928.v2.03.patch, 
> YARN-3863-YARN-2928.v2.04.patch, YARN-3863-YARN-2928.v2.05.patch, 
> YARN-3863-feature-YARN-2928.wip.003.patch, 
> YARN-3863-feature-YARN-2928.wip.01.patch, 
> YARN-3863-feature-YARN-2928.wip.02.patch, 
> YARN-3863-feature-YARN-2928.wip.04.patch, 
> YARN-3863-feature-YARN-2928.wip.05.patch
>
>
> Currently, filters in the timeline reader return an entity only if all the 
> filter conditions hold true, i.e. only the AND operation is supported. We can 
> support the OR operation for the filters as well. Additionally, as the 
> primary backend implementation is HBase, we can design our filters in a 
> manner where they closely resemble HBase filters.
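
A rough sketch of the filter-list idea (the names below are illustrative only, 
not the patch's actual API); it mirrors HBase's FilterList semantics, where 
AND short-circuits on the first miss and OR on the first match:

{code}
import java.util.List;

interface TimelineFilter<E> {
  boolean matches(E entity);
}

class TimelineFilterList<E> implements TimelineFilter<E> {
  enum Operator { AND, OR }

  private final Operator op;
  private final List<TimelineFilter<E>> filters;

  TimelineFilterList(Operator op, List<TimelineFilter<E>> filters) {
    this.op = op;
    this.filters = filters;
  }

  @Override
  public boolean matches(E entity) {
    for (TimelineFilter<E> f : filters) {
      boolean matched = f.matches(entity);
      if (op == Operator.AND && !matched) {
        return false; // one failing condition rejects the entity
      }
      if (op == Operator.OR && matched) {
        return true;  // one passing condition accepts the entity
      }
    }
    return op == Operator.AND; // AND: all passed; OR: nothing matched
  }
}
{code}

Since a filter list is itself a filter, lists can nest, so expressions like 
(A AND (B OR C)) fall out of the same structure.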



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4849) [YARN-3368] cleanup code base, integrate web UI related build to mvn, and fix licenses.

2016-04-13 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-4849:
---
Fix Version/s: (was: YANR-3368)
   YARN-3368

> [YARN-3368] cleanup code base, integrate web UI related build to mvn, and fix 
> licenses.
> ---
>
> Key: YARN-4849
> URL: https://issues.apache.org/jira/browse/YARN-4849
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Fix For: YARN-3368
>
> Attachments: YARN-4849-YARN-3368.1.patch, 
> YARN-4849-YARN-3368.2.patch, YARN-4849-YARN-3368.3.patch, 
> YARN-4849-YARN-3368.4.patch, YARN-4849-YARN-3368.5.patch, 
> YARN-4849-YARN-3368.6.patch, YARN-4849-YARN-3368.7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4862) Handle duplicate completed containers in RMNodeImpl

2016-04-13 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4862:

Attachment: 0002-YARN-4862.patch

Updated the same patch, fixing the checkstyle errors.

> Handle duplicate completed containers in RMNodeImpl
> ---
>
> Key: YARN-4862
> URL: https://issues.apache.org/jira/browse/YARN-4862
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4862.patch, 0002-YARN-4862.patch
>
>
> As per this 
> [comment|https://issues.apache.org/jira/browse/YARN-4852?focusedCommentId=15209689&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15209689]
>  from [~sharadag], there should be a safeguard against duplicated container 
> statuses in RMNodeImpl before creating UpdatedContainerInfo. 
> Otherwise, in a heavily loaded cluster where event processing is gradually 
> slowed, if any duplicated containers are sent to the RM (possibly a bug in 
> the NM as well), RMNodeImpl always creates an UpdatedContainerInfo for the 
> duplicated containers. This results in increased heap memory and causes 
> problems like YARN-4852.
> This is an optimization for issues like YARN-4852
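
A minimal sketch of the kind of safeguard described (hypothetical shape, not 
the attached patch): remember which completed containers were already reported 
and drop repeats before they become new UpdatedContainerInfo objects.

{code}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.ContainerStatus;

// Sketch, as if inside RMNodeImpl: completions already handed to the scheduler.
private final Set<ContainerId> reportedCompletions = new HashSet<ContainerId>();

private List<ContainerStatus> dropDuplicates(List<ContainerStatus> completed) {
  List<ContainerStatus> fresh = new ArrayList<ContainerStatus>();
  for (ContainerStatus status : completed) {
    // Set.add returns false for a container that was already reported.
    if (reportedCompletions.add(status.getContainerId())) {
      fresh.add(status);
    }
  }
  return fresh;
}
{code}

The set would still need pruning once the scheduler acknowledges the 
completions, otherwise this trades one slow leak for another.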



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4862) Handle duplicate completed containers in RMNodeImpl

2016-04-13 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4862:

Target Version/s: 2.9.0

> Handle duplicate completed containers in RMNodeImpl
> ---
>
> Key: YARN-4862
> URL: https://issues.apache.org/jira/browse/YARN-4862
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4862.patch, 0002-YARN-4862.patch
>
>
> As per this 
> [comment|https://issues.apache.org/jira/browse/YARN-4852?focusedCommentId=15209689&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15209689]
>  from [~sharadag], there should be a safeguard against duplicated container 
> statuses in RMNodeImpl before creating UpdatedContainerInfo. 
> Otherwise, in a heavily loaded cluster where event processing is gradually 
> slowed, if any duplicated containers are sent to the RM (possibly a bug in 
> the NM as well), RMNodeImpl always creates an UpdatedContainerInfo for the 
> duplicated containers. This results in increased heap memory and causes 
> problems like YARN-4852.
> This is an optimization for issues like YARN-4852



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4879) Proposal for a simple (delta) allocate protocol

2016-04-13 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239057#comment-15239057
 ] 

Steve Loughran commented on YARN-4879:
--

I like the idea of having request IDs on requests; it helps us map from 
request to response. As you say, today things are an ugly hack with priority.

I would also like to see if the allocated containers could support a role ID 
field too... nothing much, but enough that on an AM restart their role can be 
determined. That one I'd keep separate from the request ID; they serve 
slightly different purposes. (I could have 5 requests outstanding for 
containers of role 4; I'd want to track those requests.)
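
A purely illustrative shape for the two IDs (none of these fields exist in the 
current ResourceRequest API):

{code}
// Sketch only: a delta ask tagged with a request ID the RM echoes back on the
// allocated container, plus an AM-defined role ID that survives AM restart.
class TaggedRequest {
  long requestId;    // ties each allocation back to the originating ask
  int roleId;        // app-level meaning (e.g. "role 4"), opaque to the RM
  int numContainers; // delta, not cumulative
}
{code}

Keeping the two IDs separate matches the comment above: several outstanding 
requests can share one role.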

> Proposal for a simple (delta) allocate protocol
> ---
>
> Key: YARN-4879
> URL: https://issues.apache.org/jira/browse/YARN-4879
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: SimpleAllocateProtocolProposal-v1.pdf
>
>
> For legacy reasons, the current allocate protocol expects expanded requests 
> which represent the cumulative request for any change in resource 
> constraints. This is not only very difficult to comprehend but makes it 
> impossible for the scheduler to associate container allocations to the 
> original requests. This problem is amplified by the fact that the expansion 
> is managed by the AMRMClient which makes it cumbersome for non-Java clients 
> as they all have to replicate the non-trivial logic. In this JIRA, we are 
> proposing a delta allocate protocol where the AM will need to only specify 
> changes in resource constraints.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4954) TestYarnClient.testReservationAPIs fails on machines with less than 4 GB available memory

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239085#comment-15239085
 ] 

Hadoop QA commented on YARN-4954:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
47s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 24s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 23s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
16s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 148m 11s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
|   | hadoop.yarn.client.TestGetGroups |
| JDK v1.8.0_77 Timed out junit tests | 
org.apache.hadoop.yarn.client.cli.TestYarnCLI |
|   | org.apache.hadoop.yarn.client.api.impl.TestYarnClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestNMClient |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
|   | hadoop.yarn.client.TestGetGroups |
| JDK v1.7.0_95 Timed out junit tests | 
org.apache.hadoop.yarn.client.cli.TestYarnCLI |
|   | 

[jira] [Updated] (YARN-3692) Allow REST API to set a user generated message when killing an application

2016-04-13 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-3692:

Attachment: 0001-YARN-3692.patch

> Allow REST API to set a user generated message when killing an application
> --
>
> Key: YARN-3692
> URL: https://issues.apache.org/jira/browse/YARN-3692
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Rajat Jain
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-3692.patch
>
>
> Currently YARN's REST API supports killing an application without setting a 
> diagnostic message. It would be good to provide that support.
> *Use Case*
> Usually this helps in workflow management in a multi-tenant environment when 
> the workflow scheduler (or the hadoop admin) wants to kill a job and let the 
> user know the reason why the job was killed. Killing the job with a 
> diagnostic message is a very good solution for that. Ideally, we can set the 
> diagnostic message on all such interfaces:
> yarn kill -applicationId ... -diagnosticMessage "some message added by 
> admin/workflow"
> REST API: { 'state': 'KILLED', 'diagnosticMessage': 'some message added by 
> admin/workflow'}
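
A sketch of what the proposed REST call might look like (the host and 
application ID are placeholders, and the {{diagnosticMessage}} field is the 
proposal; the {{/state}} endpoint with a KILLED body already exists):

{code}
import java.net.HttpURLConnection;
import java.net.URL;

public class KillWithMessage {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://rm-host:8088/ws/v1/cluster/apps/"
        + "application_1234567890123_0001/state");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    // 'diagnosticMessage' is the field this JIRA proposes to add.
    String body = "{\"state\":\"KILLED\","
        + "\"diagnosticMessage\":\"killed by admin: quota exceeded\"}";
    conn.getOutputStream().write(body.getBytes("UTF-8"));
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}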



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4931) Preempted resources go back to the same application

2016-04-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239235#comment-15239235
 ] 

Karthik Kambatla commented on YARN-4931:


Thanks for reporting this. The issue here seems to be that when the scheduler 
preempts resources for an application A, it doesn't reserve these resources for 
A. Before it considers A again for allocation, other applications can take 
these resources. We are planning to address this as part of YARN-4752. 
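
An illustrative sketch of the missing piece (YARN-4752 is the actual design; 
the names here are made up): earmark capacity freed by preemption for the 
starved application, so an intervening allocation pass cannot hand it straight 
back to the app it was just taken from.

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.NodeId;

// Sketch: remember which app each preemption was performed for.
Map<NodeId, ApplicationId> earmarked = new HashMap<NodeId, ApplicationId>();

void onContainerPreempted(NodeId node, ApplicationId starvedApp) {
  earmarked.put(node, starvedApp);
}

// Consulted before assigning freed capacity on a node to an application.
boolean mayAllocate(NodeId node, ApplicationId app) {
  ApplicationId holder = earmarked.remove(node);
  if (holder == null || holder.equals(app)) {
    return true;
  }
  earmarked.put(node, holder); // still reserved for the starved app
  return false;
}
{code}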

> Preempted resources go back to the same application
> ---
>
> Key: YARN-4931
> URL: https://issues.apache.org/jira/browse/YARN-4931
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: resourcemanager.log
>
>
> Sometimes a queue that needs resources causes preemption - but the preempted 
> containers are just allocated right back to the application that just 
> released them!
> Here is a tiny application (0007) that wants resources, and a container is 
> preempted from application 0002 to satisfy it:
> {code}
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Should preempt <memory:..., vCores:...> res for 
> queue root.default: resDueToMinShare = <memory:..., vCores:...>, 
> resDueToFairShare = <memory:..., vCores:...>
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler 
> (FairSchedulerUpdateThread): Preempting container (prio=1, res=<memory:..., vCores:1>) from queue root.milesc
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics
>  (FairSchedulerUpdateThread): Non-AM container preempted, current 
> appAttemptId=appattempt_1460047303577_0002_01, 
> containerId=container_1460047303577_0002_01_001038, resource=<memory:..., vCores:1>
> 2016-04-07 21:08:13,463 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (FairSchedulerUpdateThread): container_1460047303577_0002_01_001038 Container 
> Transitioned from RUNNING to KILLED
> {code}
> But then a moment later, application 2 gets the container right back:
> {code}
> 2016-04-07 21:08:13,844 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode 
> (ResourceManager Event Processor): Assigned container 
> container_1460047303577_0002_01_001039 of capacity <memory:..., vCores:...> 
> on host ip-10-12-40-63.us-west-2.compute.internal:8041, which has 13 
> containers, <memory:..., vCores:...> used and <memory:..., vCores:...> 
> available after allocation
> 2016-04-07 21:08:14,555 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (IPC Server handler 59 on 8030): container_1460047303577_0002_01_001039 
> Container Transitioned from ALLOCATED to ACQUIRED
> 2016-04-07 21:08:14,845 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl 
> (ResourceManager Event Processor): container_1460047303577_0002_01_001039 
> Container Transitioned from ACQUIRED to RUNNING
> {code}
> This results in new applications being unable to even get an AM, and never 
> starting at all.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239301#comment-15239301
 ] 

Hadoop QA commented on YARN-3692:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 9s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 43s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 4s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 31s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 50s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
51s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_77. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 50s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 19s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 23s 
{color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_95. 
{color} |
| 

[jira] [Commented] (YARN-2888) Corrective mechanisms for rebalancing NM container queues

2016-04-13 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239364#comment-15239364
 ] 

Carlo Curino commented on YARN-2888:


Hi [~asuresh]

Sorry for the long delay reviewing this. The patch looks generally good; I 
don't have complete context on how the thresholds computed here are used, but 
from what I can see things look reasonable.

General:
 # First, a general question. I see you add a few extra conf parameters. I was 
wondering whether we can come up with a better mechanism to configure 
policies than global conf parameters. For now you have a fixed policy and 4 
params, but as we develop other policies we will accumulate more and more 
params. This is a general issue; you are following the standard way of doing 
this, I am just wondering whether we could do better than what has been done 
so far.
 # In general, I would prefer the policy to be configurable, unless you guys 
think this is the one and only policy we would want for queuing (at least for 
a long while). 
 # Would it make sense to make the parameters you pass down in the 
{{NodeHeartBeatResponse}} more general than "queueLimit"? In the future you 
might want the central component to send down other information, combined 
with up-to-date local information, to make decisions. What I am suggesting is 
to make the "data-bus" you are establishing between central policy components 
and local enforcers/policies more generic, so that you can add/change things 
inside it later. Maybe this just means renaming the {{ContainerQueuingLimt}} 
to {{ContainerQueueingCommand}} or something like that, which stays 
consistent if later on you don't send down just "limits".
 # In the .proto it would likely help other devs if you say max_wait_time_in_ms 
or something like that, which indicates the time granularity. Also, is int32 
always enough? (likely a silly question)
 # Spurious import in {{ClusterMonitor.java}}?
 # As above, maybe some of the classes/methods (e.g., 
{{getQueueLimitCalculator()}}) use the term "limit" where "policy" would 
allow you to be more general and fit more stuff under the same name later on.
  
 In {{QueueLimitCalculator}}:
 # Is it reasonable to assume the caller of {{QueueLimitCalculator.update()}} 
will synchronize on topKNodes? This seems likely to create issues later on. 
Maybe making the topKNodes data structure a ConcurrentHashMap is enough; even 
if the calculation is done on a slightly stale view of the world, it should 
still be statistically meaningful, correct?
 # If topKNodes is much smaller than the total number of nodes, you could 
create a local list of {{ClusterNode}}(s) and scan over it instead of 
invoking {{nodeSelector.getClusterNodes().get(nId)}} over and over again. 
 # Do you have a guarantee that the topKNodes list of ids is already sorted by 
"val"? If not, the median calculation might be off. 
 # The median calculation would be busted if you have K=1 (line 567 of the 
patch would look up index -1); you likely want to protect against that (a 
sketch of what I mean follows below).
 # stdevMedian is never updated (line 580 is a wrong copy-and-paste of 579).
 # stdevMedian is typically referred to as the Median Absolute Deviation; if I 
am not mistaken, stdev is used only for the deviation based on the mean.
 # Can you guys comment on when using the MAD vs. STDEV makes sense? Do you 
have experiments showing the impact of one or the other?
 # At lines 630 and 632 you simply cast down float to int (which trims the 
decimal values). Is this the right way of rounding? Wouldn't Math.round() be 
better?

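A minimal sketch of points 3-5 above (illustrative only, not code from the 
patch): sort locally before taking the median, guard the K=1 case, compute 
the deviation as a true Median Absolute Deviation, and round with 
Math.round() instead of an int cast.

{code}
// Hypothetical helper; names are made up for illustration.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class QueueStatsSketch {
  /** Median of the values; sorts a local copy so callers need not. */
  static float median(List<Float> values) {
    if (values.isEmpty()) {
      return 0f;
    }
    List<Float> sorted = new ArrayList<>(values);
    Collections.sort(sorted);
    int n = sorted.size();
    if (n == 1) {
      return sorted.get(0);       // K=1 guard: never index at -1
    }
    return (n % 2 == 1)
        ? sorted.get(n / 2)
        : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2f;
  }

  /** Median Absolute Deviation: median of |x - median(xs)|. */
  static float medianAbsoluteDeviation(List<Float> values) {
    float med = median(values);
    List<Float> deviations = new ArrayList<>(values.size());
    for (float v : values) {
      deviations.add(Math.abs(v - med));  // deviations, not the mean path
    }
    return median(deviations);
  }

  public static void main(String[] args) {
    List<Float> waits = Arrays.asList(3f, 1f, 7f, 2f, 9f);
    float limit = median(waits) + 2 * medianAbsoluteDeviation(waits);
    System.out.println("limit = " + Math.round(limit)); // round, don't trim
  }
}
{code}
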
> Corrective mechanisms for rebalancing NM container queues
> -
>
> Key: YARN-2888
> URL: https://issues.apache.org/jira/browse/YARN-2888
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
> Attachments: YARN-2888-yarn-2877.001.patch, 
> YARN-2888-yarn-2877.002.patch
>
>
> Bad queuing decisions by the LocalRMs (e.g., due to the distributed nature of 
> the scheduling decisions or due to having a stale image of the system) may 
> lead to an imbalance in the waiting times of the NM container queues. This 
> can in turn have an impact on job execution times and cluster utilization.
> To this end, we introduce corrective mechanisms that may remove (whenever 
> needed) container requests from overloaded queues, adding them to less-loaded 
> ones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2883) Queuing of container requests in the NM

2016-04-13 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2883:
-
Attachment: YARN-2883-trunk.012.patch

I am attaching a new version of the patch, in which I got rid of the 
{{QueuingContainersMonitorImpl}}. The {{containersAllocation}} field is now 
moved to the {{ContainerManagerImpl}}.

Give it a look, [~kasha] and [~asuresh].
What I don't like is that I had to expose a lot of new methods in the 
{{ContainerManager}} interface, and also that I had to add a field (i.e., the 
{{containersAllocation}}) to the {{ContainerManagerImpl}} that will not be 
updated if queuing is not enabled.

That said, unless I am missing something (such as being able to access the 
methods of the {{ContainerManagerImpl}} without exposing them in the 
interface, which in any case does not solve the second problem I mention 
above), I prefer the previous version of the patch that had the 
{{QueuingContainersMonitorImpl}}. 
I agree, though, with moving {{pickOpportunisticContainersToKill}} to the 
{{QueuingContainerManagerImpl}}.

Let me know what you think.

> Queuing of container requests in the NM
> ---
>
> Key: YARN-2883
> URL: https://issues.apache.org/jira/browse/YARN-2883
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch, 
> YARN-2883-trunk.006.patch, YARN-2883-trunk.007.patch, 
> YARN-2883-trunk.008.patch, YARN-2883-trunk.009.patch, 
> YARN-2883-trunk.010.patch, YARN-2883-trunk.011.patch, 
> YARN-2883-trunk.012.patch, YARN-2883-yarn-2877.001.patch, 
> YARN-2883-yarn-2877.002.patch, YARN-2883-yarn-2877.003.patch, 
> YARN-2883-yarn-2877.004.patch
>
>
> We propose to add a queue in each NM, where queueable container requests can 
> be held.
> Based on the available resources in the node and the containers in the queue, 
> the NM will decide when to allow the execution of a queued container.
> In order to ensure the instantaneous start of a guaranteed-start container, 
> the NM may decide to pre-empt/kill running queueable containers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4948) Support node labels store in zookeeper

2016-04-13 Thread jialei weng (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jialei weng updated YARN-4948:
--
Attachment: (was: YARN-4948.001.patch)

> Support node labels store in zookeeper
> --
>
> Key: YARN-4948
> URL: https://issues.apache.org/jira/browse/YARN-4948
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Reporter: jialei weng
>
> Support node labels store in zookeeper



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4948) Support node labels store in zookeeper

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240583#comment-15240583
 ] 

Hadoop QA commented on YARN-4948:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 41s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
38s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 23s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 20s 
{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 27s 
{color} | {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 27s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 33s 
{color} | {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 33s {color} 
| {color:red} hadoop-yarn in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 14 new + 
212 unchanged - 0 fixed = 226 total (was 212) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 22s 
{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 14 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 11s 
{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 21s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 21s {color} 
| {color:red} hadoop-yarn-api in the patch failed with JDK v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 16s {color} 
| {color:red} hadoop-yarn-common in the patch failed with JDK v1.8.0_77. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 22s {color} 
| {color:red} hadoop-yarn-api in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 21s {color} 
| {color:red} hadoop-yarn-common in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does 

[jira] [Updated] (YARN-4948) Support node labels store in zookeeper

2016-04-13 Thread jialei weng (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jialei weng updated YARN-4948:
--
Attachment: (was: YARN-4948-branch-2.7.0.001.patch)

> Support node labels store in zookeeper
> --
>
> Key: YARN-4948
> URL: https://issues.apache.org/jira/browse/YARN-4948
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: jialei weng
> Attachments: YARN-4948.001.patch
>
>
> Support node labels store in zookeeper



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4425) Pluggable sharing policy for Partition Node Label resources

2016-04-13 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240533#comment-15240533
 ] 

Naganarasimha G R commented on YARN-4425:
-

Hi [~leftnoteasy],
Yes, some of the interface methods can be further debated, but I tried to 
keep them focused on specific needs.
bq. The new added policy seems like a backdoor of scheduler's capacity 
management: when under "NON_EXCLUSIVE" mode, scheduler completely depends on 
configured policy to decide who can get next resources.
Yes, I agree it is a backdoor, but I still feel it is necessary to support 
features like hierarchical labels. Consider a tenant in a multi-tenant 
environment with batch and query loads that require different kinds of nodes 
(hence the labels). The query load will be totally nil during off-peak hours, 
and only during those hours do they want to shift capacity to batch loading 
(under the same tenant). Currently this cannot be achieved through the 
existing label functionality, and it is a very practical scenario which we 
are facing. Secondly, for some of our scenarios, query/batch-processing 
preemption is less desirable, so we require control to decide when to use the 
shared resources from other partitions (e.g., only when there is a resource 
crunch in the existing partition). Hence I exposed those interfaces.

bq. And we need to make API of policy with more clear semantic: In existing 
scheduler, resource will be allocated on requested partition, and only 
request.partition = "" can get chance to allocate on other partitions when 
non-exclusive criteria is met.
In the new API, resource could be allocated on any partition regardless of 
requested partition. (depends on different policy implementation). Which will 
be conflict to our existing APIs for node partitions.
I have taken care in the current patch that the default behaviors of 
exclusive and non-exclusive partitions are not broken; only when a policy is 
enabled for a specific partition does it start behaving as per the policy. I 
did this based on your earlier 
[comment|https://issues.apache.org/jira/browse/YARN-4425?focusedCommentId=15053213=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15053213].

bq. To me, sharing resource between partitions itself is not a clear API:
Hmm, but in a way we are already doing this by sharing resources of 
non-exclusive partitions with the default partition. The scope there is very 
limited, though, and the current sharing doesn't solve the scenarios I have 
captured in the doc.

bq. Queue's shares of partitions could be dynamically adjusted, OR
Yes, these are good to have, but if there are thousands of queues, then when 
we adjust one queue how do we adjust the others? IMHO it would bring a lot 
more complexity.

bq. Node's partition could be dynamically update
Well, I see some complexities here too. Suppose a high-end node is marked 
with the Query label and there is currently no query load; then someone has 
to manually (or externally) change the label to BatchLoad, and after a 
particular time redo the operation. So practically this needs to be handled 
externally, whereas with a policy I try to achieve the same thing 
transparently from the inside. Thoughts?

I agree I have included a lot of APIs in the interface, but I am open to 
scaling it down; we could also make this a private interface and expose one 
configurable interface like the one given in the patch. Thoughts? 

I would also like to know the views of [~jianhe], [~vinodkv], and any other 
folks already using labels.
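
For concreteness, a purely hypothetical sketch of the kind of pluggable 
interface being discussed; the name and methods below are illustrative, not 
from the attached patch.

{code}
// Hypothetical partition-sharing policy; all names are made up.
public interface PartitionSharingPolicySketch {

  /** May this partition lend idle resources to apps of another partition? */
  boolean canShareTo(String lenderPartition, String borrowerPartition);

  /**
   * May an app of appPartition borrow from lenderPartition right now?
   * A policy could allow this only under a resource crunch in the app's own
   * partition, instead of relying on manual relabeling of nodes.
   */
  boolean canBorrowFrom(String appPartition, String lenderPartition);
}
{code}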


> Pluggable sharing policy for Partition Node Label resources
> ---
>
> Key: YARN-4425
> URL: https://issues.apache.org/jira/browse/YARN-4425
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
> Attachments: ResourceSharingPolicyForNodeLabelsPartitions-V1.pdf, 
> ResourceSharingPolicyForNodeLabelsPartitions-V2.pdf, 
> YARN-4425.20160105-1.patch
>
>
> As part of the support for sharing NonExclusive Node Label partitions in 
> YARN-3214, NonExclusive partitions are shared only with the Default 
> partition, and there is a fixed rule for when apps in the Default partition 
> make use of the resources of any NonExclusive partition.
> There are many scenarios where we require a pluggable policy (MultiTenant, 
> Hierarchical, etc.), in which each partition can determine when it wants to 
> share its resources with other partitions and when other partitions may use 
> its resources.
> More details in the attached document.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4882) Change the log level to DEBUG for recovering completed applications

2016-04-13 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240489#comment-15240489
 ] 

Rohith Sharma K S commented on YARN-4882:
-

Apologies for missing your comment. Yes, your understanding is right. 
Initially, when I raised this JIRA, I had in mind that we don't want to flood 
the logs. But considering that issues can happen during recovery (possibly a 
bug in the RM as well), the logging helps to know which application's 
recovery caused the failure. This is the reason I voted for keeping the logs 
in a separate file. 
bq. Are the recovery logs used for anything other than diagnosing a failed 
recovery?
AFAIK, completed-application recovery logs are used only for diagnosis when 
recovery fails. They also help in debugging running applications when an 
application hangs or is not being allocated any containers. 
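
A minimal sketch of the separate-log-file idea, assuming a dedicated logger 
category; the category name and the guarded-DEBUG pattern are illustrative, 
not from any patch (the category would still need to be routed to its own 
appender in log4j.properties).

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class RecoveryLoggingSketch {
  // Hypothetical dedicated category for recovery events.
  private static final Log RECOVERY_LOG =
      LogFactory.getLog("yarn.rm.recovery");

  static void logRecoveredApp(String appId, int attempts, String finalState) {
    // Guarded DEBUG keeps the main RM log clean while preserving the
    // per-application trail for diagnosing a failed recovery.
    if (RECOVERY_LOG.isDebugEnabled()) {
      RECOVERY_LOG.debug("Recovering app: " + appId + " with " + attempts
          + " attempts and final state = " + finalState);
    }
  }
}
{code}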

> Change the log level to DEBUG for recovering completed applications
> ---
>
> Key: YARN-4882
> URL: https://issues.apache.org/jira/browse/YARN-4882
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Daniel Templeton
>
> I think recovering completed applications need not be logged at INFO; it can 
> be made DEBUG. The problem seen on large clusters is that if any issue 
> happens during RM startup and it keeps switching, the RM logs are filled 
> mostly with recovering applications. 
> Six lines are logged per application, as shown in the logs below; consider 
> that the RM default for max-completed applications is 10K, so each switch 
> adds 10K*6=60K lines, which I feel is not useful.
> {noformat}
> 2016-03-01 10:20:59,077 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Default priority 
> level is set to application:application_1456298208485_21507
> 2016-03-01 10:20:59,094 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Recovering 
> app: application_1456298208485_21507 with 1 attempts and final state = 
> FINISHED
> 2016-03-01 10:20:59,100 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Recovering attempt: appattempt_1456298208485_21507_01 with final state: 
> FINISHED
> 2016-03-01 10:20:59,107 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1456298208485_21507_01 State change from NEW to FINISHED
> 2016-03-01 10:20:59,111 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> application_1456298208485_21507 State change from NEW to FINISHED
> 2016-03-01 10:20:59,112 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=rohith   
> OPERATION=Application Finished - Succeeded  TARGET=RMAppManager 
> RESULT=SUCCESS  APPID=application_1456298208485_21507
> {noformat}
> The main problem is that important information from before the RM became 
> unstable goes missing. Even if the log rollback keeps 50 or 100 files, in a 
> short period all of them are rolled out, and what remains contains only RM 
> switching information, mostly recovering applications. 
> I suggest that at least completed-application recovery be logged as DEBUG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4948) Support node labels store in zookeeper

2016-04-13 Thread jialei weng (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jialei weng updated YARN-4948:
--
Attachment: (was: YARN-4948.001.patch)

> Support node labels store in zookeeper
> --
>
> Key: YARN-4948
> URL: https://issues.apache.org/jira/browse/YARN-4948
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Reporter: jialei weng
>
> Support node labels store in zookeeper



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4577) Enable aux services to have their own custom classpath/jar file

2016-04-13 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240658#comment-15240658
 ] 

Vinod Kumar Vavilapalli commented on YARN-4577:
---

Just caught up with the wall of comments.

+1 in general for the ApplicationClassloader-based solution. Aux-services 
have always been a hack since Chris Douglas and I originally introduced them; 
the better solution is to move these services to first-class apps on top of 
YARN, but we are where we are.

[~sjlee0],
bq. For example, if the aux service code depends on another class property 
(owned by the aux service) in the configuration, that will be invoked via 
Configuration.getClass(), and it will still use the system classloader to load 
that class. Then it's very likely that you'll get a ClassNotFoundException.
Sangjin, you may be missing one important thing here - unlike in the 
MapReduce case, there is no shared Configuration object between the 
NodeManager and the specific aux-service implementation. We simply do not 
pass any configuration anywhere as part of the AuxService APIs - so this 
entire thread of reasoning about getClass() is no longer a problem? If 
needed, we can document advice against adding a Conf as part of future API 
changes.

bq. The thread context classloader represents another similar problem. The 
moment the aux service code hits a code path that does Class.forName() that 
loads classes via the thread context classloader, and it needs to load an aux 
service-related class (that is not present in the main NM classpath), you will 
get a ClassNotFoundException.
In addition to wrapping aux-service API calls under a classloader, wouldn't 
it suffice to simply have the NM make all aux-service API calls in a separate 
thread whose ContextClassLoader is changed to a custom one that resolves both 
system classes and aux-service classes? (A sketch of this follows below.)
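
A minimal sketch of that separate-thread idea, assuming a URLClassLoader 
whose parent is the system classloader; all names here are illustrative, not 
from the patch.

{code}
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AuxCallRunnerSketch {
  private final ClassLoader auxLoader;
  private final ExecutorService executor;

  public AuxCallRunnerSketch(URL[] auxServiceJars) {
    // Parent is the system classloader, so system classes still resolve.
    this.auxLoader = new URLClassLoader(auxServiceJars,
        ClassLoader.getSystemClassLoader());
    this.executor = Executors.newSingleThreadExecutor(r -> {
      Thread t = new Thread(r, "aux-service-caller");
      // Class.forName() via the thread context classloader now resolves
      // both system and aux-service classes.
      t.setContextClassLoader(auxLoader);
      return t;
    });
  }

  public <T> Future<T> call(Callable<T> auxServiceCall) {
    return executor.submit(auxServiceCall);
  }
}
{code}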

> Enable aux services to have their own custom classpath/jar file
> ---
>
> Key: YARN-4577
> URL: https://issues.apache.org/jira/browse/YARN-4577
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4577.1.patch, YARN-4577.2.patch, 
> YARN-4577.20160119.1.patch, YARN-4577.20160204.patch, YARN-4577.3.patch, 
> YARN-4577.3.rebase.patch, YARN-4577.4.patch
>
>
> Right now, users have to add their jars to the NM classpath directly, thus 
> put them on the system classloader. But if multiple versions of the plugin 
> are present on the classpath, there is no control over which version actually 
> gets loaded. Or if there are any conflicts between the dependencies 
> introduced by the auxiliary service and the NM itself, they can break the NM, 
> the auxiliary service, or both.
> The solution could be: to instantiate aux services using a classloader that 
> is different from the system classloader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4953) Delete completed container log folder when rolling log aggregation is enabled

2016-04-13 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4953:

Target Version/s: 2.9.0

> Delete completed container log folder when rolling log aggregation is enabled
> -
>
> Key: YARN-4953
> URL: https://issues.apache.org/jira/browse/YARN-4953
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>
> There is a potential bottleneck when a cluster runs a very large number of 
> containers on the same NodeManager for a single application: Linux limits 
> the subfolder count to 32K, so if the number of containers for an 
> application exceeds 32K, container launches fail, and at that point no more 
> containers can be launched on that node.
> Currently log folders are deleted only after the app is finished, while 
> rolling log aggregation aggregates logs to HDFS periodically. 
> I think that once aggregation is completed for finished containers, clean-up 
> can be done, i.e., the log folders of finished containers can be deleted. 
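
A minimal sketch of the clean-up described above, assuming the NM already 
knows which finished containers have had their logs aggregated; the method 
and path layout are illustrative, not from a patch.

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ContainerLogCleanerSketch {
  /**
   * Recursively deletes a finished container's local log dir once its logs
   * have been aggregated, freeing a subfolder slot under the app log dir.
   */
  static void deleteAggregatedContainerLogs(Path appLogDir, String containerId)
      throws IOException {
    Path containerDir = appLogDir.resolve(containerId);
    if (!Files.exists(containerDir)) {
      return;
    }
    try (Stream<Path> walk = Files.walk(containerDir)) {
      walk.sorted(Comparator.reverseOrder())   // children before parents
          .forEach(p -> p.toFile().delete());
    }
  }
}
{code}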



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4948) Support node labels store in zookeeper

2016-04-13 Thread jialei weng (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jialei weng updated YARN-4948:
--
Attachment: YARN-4948.001.patch

> Support node labels store in zookeeper
> --
>
> Key: YARN-4948
> URL: https://issues.apache.org/jira/browse/YARN-4948
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Reporter: jialei weng
> Attachments: YARN-4948.001.patch
>
>
> Support node labels store in zookeeper



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4882) Change the log level to DEBUG for recovering completed applications

2016-04-13 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240495#comment-15240495
 ] 

Rohith Sharma K S commented on YARN-4882:
-

One of the reasons for redirecting recovered applications to a separate log 
file is that, by the time you get to know there is a problem in the cluster, 
the log files are already flooded, especially when RM HA is configured. I 
agree that changing the log level would help, but it won't solve our 
log-flooding problem.

> Change the log level to DEBUG for recovering completed applications
> ---
>
> Key: YARN-4882
> URL: https://issues.apache.org/jira/browse/YARN-4882
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Daniel Templeton
>
> I think recovering completed applications need not be logged at INFO; it can 
> be made DEBUG. The problem seen on large clusters is that if any issue 
> happens during RM startup and it keeps switching, the RM logs are filled 
> mostly with recovering applications. 
> Six lines are logged per application, as shown in the logs below; consider 
> that the RM default for max-completed applications is 10K, so each switch 
> adds 10K*6=60K lines, which I feel is not useful.
> {noformat}
> 2016-03-01 10:20:59,077 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Default priority 
> level is set to application:application_1456298208485_21507
> 2016-03-01 10:20:59,094 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Recovering 
> app: application_1456298208485_21507 with 1 attempts and final state = 
> FINISHED
> 2016-03-01 10:20:59,100 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Recovering attempt: appattempt_1456298208485_21507_01 with final state: 
> FINISHED
> 2016-03-01 10:20:59,107 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1456298208485_21507_01 State change from NEW to FINISHED
> 2016-03-01 10:20:59,111 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> application_1456298208485_21507 State change from NEW to FINISHED
> 2016-03-01 10:20:59,112 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=rohith   
> OPERATION=Application Finished - Succeeded  TARGET=RMAppManager 
> RESULT=SUCCESS  APPID=application_1456298208485_21507
> {noformat}
> The main problem is that important information from before the RM became 
> unstable goes missing. Even if the log rollback keeps 50 or 100 files, in a 
> short period all of them are rolled out, and what remains contains only RM 
> switching information, mostly recovering applications. 
> I suggest that at least completed-application recovery be logged as DEBUG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4909) Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240638#comment-15240638
 ] 

Hadoop QA commented on YARN-4909:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 7s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 27s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 44s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 26s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
36s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 51s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 51s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 22s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 44s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 13s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 52s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
39s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 239m 54s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||

[jira] [Updated] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient

2016-04-13 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4955:

Attachment: YARN-4955.1.patch

> Add retry for SocketTimeoutException in TimelineClient
> --
>
> Key: YARN-4955
> URL: https://issues.apache.org/jira/browse/YARN-4955
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4955.1.patch
>
>
> We saw this exception several times when we tried to getDelegationToken from 
> ATS.
> java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>   at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
>   at java.lang.Thread.run(Thread.java:745)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> Caused by: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
>   at 
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:567)
>   ... 24 more
> Caused by: java.net.SocketTimeoutException: Read timed out
>   at 

[jira] [Updated] (YARN-4928) Some yarn.server.timeline.* tests fail on Windows attempting to use a test root path containing a colon

2016-04-13 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4928:
-
Fix Version/s: (was: 2.9.0)
   2.8.0

I have merged the patch to 2.8. 
branch-2.7 doesn't have the ATS 1.5 code, so the test failure fixed here is 
not related to it.

> Some yarn.server.timeline.* tests fail on Windows attempting to use a test 
> root path containing a colon
> ---
>
> Key: YARN-4928
> URL: https://issues.apache.org/jira/browse/YARN-4928
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 2.8.0
> Environment: OS: Windows Server 2012
> JDK: 1.7.0_79
>Reporter: Gergely Novák
>Assignee: Gergely Novák
>Priority: Minor
> Fix For: 2.8.0
>
> Attachments: YARN-4928.001.patch, YARN-4928.002.patch, 
> YARN-4928.003.patch, YARN-4928.004.patch, YARN-4928.005.patch, 
> YARN-4928.006.patch
>
>
> yarn.server.timeline.TestEntityGroupFSTimelineStore.* and 
> yarn.server.timeline.TestLogInfo.* fail on Windows because they attempt to 
> use test root paths like 
> "/C:/hdp/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/target/test-dir/TestLogInfo",
>  which contain a ":" (after the Windows drive letter), and 
> DFSUtil.isValidName() does not accept paths containing ":".
> This problem is identical to HDFS-6189, so I suggest the same approach: 
> using "/tmp/..." as the test root dir instead of 
> System.getProperty("test.build.data", System.getProperty("java.io.tmpdir")).
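
A minimal sketch of the suggested approach; the class name is illustrative.

{code}
public class TestRootSketch {
  // System.getProperty("test.build.data", System.getProperty("java.io.tmpdir"))
  // yields paths like /C:/... on Windows, which DFSUtil.isValidName() rejects
  // because of the drive-letter colon; anchoring under /tmp avoids it.
  static final String TEST_ROOT = "/tmp/TestLogInfo";
}
{code}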



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2567) Add a percentage-node threshold for RM to wait for new allocations after restart/failover

2016-04-13 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239425#comment-15239425
 ] 

sandflee commented on YARN-2567:


Thanks [~jlowe], I agree that an asynchronous state store will always lead to 
inconsistencies that are hard to fix; making the state-store write happen 
first seems a reasonable approach. 

> Add a percentage-node threshold for RM to wait for new allocations after 
> restart/failover
> -
>
> Key: YARN-2567
> URL: https://issues.apache.org/jira/browse/YARN-2567
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>
> This is the remaining part of YARN-2001 - to halt allocations after restart 
> till x% of nodes sync back with the RM. This is useful for avoiding bad 
> scheduling during the time the nodes are still joining back after a 
> restart/failover.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3998) Add retry-times to let NM re-launch container when it fails to run

2016-04-13 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239448#comment-15239448
 ] 

Varun Vasudev commented on YARN-3998:
-

The latest patch looks good to me. The points that still need to be addressed:
# Unify restart policies with the AM restart work
# Support a minimum retry interval
# Fix containerLaunchStartTime and track the individual start time for each 
retry attempt.

In my opinion, all of these can be done as follow-up JIRAs. [~vinodkv] - can 
you take a look at the latest patch?
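
A rough sketch of the relaunch loop under discussion, including the 
minimum-retry-interval follow-up; the shape and names are assumptions, not 
the patch.

{code}
public class RelaunchSketch {
  /** Re-run the launch up to retryTimes times on a failed attempt. */
  static void runWithRetries(Runnable launchContainer, int retryTimes,
      long minRetryIntervalMs) throws InterruptedException {
    for (int attempt = 0; attempt <= retryTimes; attempt++) {
      try {
        launchContainer.run();   // localized files are reused across attempts
        return;                  // success: done
      } catch (RuntimeException failed) {
        if (attempt == retryTimes) {
          throw failed;          // retries exhausted, report the failure
        }
        Thread.sleep(minRetryIntervalMs);  // back off before relaunching
      }
    }
  }
}
{code}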

> Add retry-times to let NM re-launch container when it fails to run
> --
>
> Key: YARN-3998
> URL: https://issues.apache.org/jira/browse/YARN-3998
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jun Gong
>Assignee: Jun Gong
> Attachments: YARN-3998.01.patch, YARN-3998.02.patch, 
> YARN-3998.03.patch, YARN-3998.04.patch, YARN-3998.05.patch, 
> YARN-3998.06.patch, YARN-3998.07.patch, YARN-3998.08.patch, YARN-3998.09.patch
>
>
> I'd like to add a field (retry-times) in ContainerLaunchContext. When the AM 
> launches containers, it can specify the value, and the NM will then 
> re-launch the container up to 'retry-times' times when it fails to run 
> (e.g., the exit code is not 0). 
> This saves a lot of time: it avoids container localization, the RM does not 
> need to re-schedule the container, and local files in the container's 
> working directory are left for re-use (if the container has downloaded some 
> big files, it does not need to re-download them when running again). 
> We find this useful in systems like Storm.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient

2016-04-13 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239541#comment-15239541
 ] 

Junping Du commented on YARN-4955:
--

Thanks [~xgong] for reporting the issue and delivering the patch. The issue 
reported here is valid, and I have also seen the same problem in some logs. 
As for the fix, I think we are missing a case in the catch part:

{code}
public Object retryOn(TimelineClientRetryOp op)
throws RuntimeException, IOException {
  int leftRetries = maxRetries;
  retried = false;

  // keep trying
  while (true) {
try {
  // try perform the op, if fail, keep retrying
  return op.run();
} catch (IOException | RuntimeException e) {
  // break if there's no retries left
{code}
As AuthenticationException is neither an IOException nor a RuntimeException, 
we will fail to catch it here.
Can you add AuthenticationException to the catch clause? Also, it would be 
nice to have a unit test verifying that it works.
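
A sketch of the suggested widening of the catch clause (my reading, not the 
actual patch; the retryOn shape is simplified to just the relevant part):

{code}
import java.io.IOException;
import org.apache.hadoop.security.authentication.client.AuthenticationException;

public class RetrySketch {
  interface Op {
    Object run() throws IOException, AuthenticationException;
  }

  static Object retryOn(Op op, int maxRetries) throws Exception {
    int leftRetries = maxRetries;
    while (true) {
      try {
        return op.run();
      } catch (IOException | RuntimeException | AuthenticationException e) {
        // AuthenticationException is neither an IOException nor a
        // RuntimeException, so it must be caught explicitly to be retried.
        if (--leftRetries <= 0) {
          throw e;   // no retries left: surface the last failure
        }
      }
    }
  }
}
{code}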

> Add retry for SocketTimeoutException in TimelineClient
> --
>
> Key: YARN-4955
> URL: https://issues.apache.org/jira/browse/YARN-4955
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4955.1.patch
>
>
> We saw this exception several times when we tried to getDelegationToken from 
> ATS.
> java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>   at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
>   at java.lang.Thread.run(Thread.java:745)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> Caused by: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
>   at 
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
> 

[jira] [Created] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient

2016-04-13 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-4955:
---

 Summary: Add retry for SocketTimeoutException in TimelineClient
 Key: YARN-4955
 URL: https://issues.apache.org/jira/browse/YARN-4955
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Xuan Gong
Assignee: Xuan Gong


We saw this exception several times when we tried to getDelegationToken from 
ATS.

java.io.IOException: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
java.net.SocketTimeoutException: Read timed out
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
at 
org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at 
org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
at 
org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
at java.lang.Thread.run(Thread.java:745)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
Caused by: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
java.net.SocketTimeoutException: Read timed out
at 
org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
at 
org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
at 
org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
at 
org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:567)
... 24 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at 

[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239547#comment-15239547
 ] 

Hadoop QA commented on YARN-4955:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 54s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_77. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 12s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 42s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:fbe3e86 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798534/YARN-4955.1.patch |
| JIRA Issue | YARN-4955 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 23ddfb1806d8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 

[jira] [Commented] (YARN-2883) Queuing of container requests in the NM

2016-04-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239499#comment-15239499
 ] 

Karthik Kambatla commented on YARN-2883:


Discussed this offline with [~asuresh], [~kkaranasos] and [~chris.douglas]. 
Regarding queuing vs. immediately starting guaranteed containers, it is 
reasonable to queue them as part of YARN-2883. In YARN-1011, we could add the 
option of starting them directly or using the actual utilization for 
determining resource availability. 

The logic sounds good to me. Most of my comments are cosmetic 
(readability/maintainability) in nature. Since this change will only be in 
trunk (and not branch-2), I am comfortable with checking this in to unblock 
other efforts. I am open to addressing some of my comments in a subsequent 
JIRA, and happy to contribute those changes too. 

Comments:
# QueuingContainerManagerImpl
## I see that additions to opportContainersToKill are in the monitor. My later 
comments recommend moving the helper methods in the monitor to the manager itself. If 
we don't do that, it would help to add a comment mentioning where this set is 
populated. 
## Nit: Rename opportContainersToKill to opportunisticContainersToKill? 
## Nit: In the constructor, we don't need to specify the type parameter of 
ConcurrentLinkedQueue
## Nit: The following code could be merged into a single line:
{code}
  if (allocatedContInfo
  .getExecutionType() == ExecutionType.GUARANTEED) {
{code}
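For instance, the merged form could read:
{code}
  if (allocatedContInfo.getExecutionType() == ExecutionType.GUARANTEED) {
{code}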
## Nit: Rename killOpportContainers to killOpportunisticContainers
## Would it make sense to have a field for queuingContainerMonitor since it is 
used in several places? I am open to calling this containersMonitor depending 
on whether we choose to relax the visibility of the same-name field in the 
parent class.
# QueuingContainersMonitorImpl
## Several methods and fields seem to use utilization/usage in their names; 
this is misleading as it gives the impression we are looking at the utilization 
instead of allocation/limit.
## Would it make sense to track the aggregateContainerAllocation using 
ProcessTreeInfo instead of ResourceUtilization? The latter likely works better 
for YARN-1011, but I am fine with either. 
## Would it make sense to track the aggregateContainerAllocation in 
ContainersMonitorImpl itself? This can be updated when we add/remove items from 
trackingContainers? That way, most of the helper methods in 
QueuingContainersMonitorImpl can just move to QueuingContainerManager, and the 
helper methods themselves will not need all the parameters they are passed, 
making the code more readable.
## Nit: Rename opportContainersToKill to opportunisticContainersToKill and even 
more preferably to identifyOpportunisticContainersToKill? 
# ContainerImpl: The second change appears spurious. 
# ContainersMonitorImpl
## Visibility of some of the fields has been relaxed. Not all of them are 
required. Some of them are for tests; can we add @VisibleForTesting along with 
a comment about what the visibility could be if it weren't for the tests?
## Observation: The addition of onStart etc. methods is not necessary, but 
makes the code easier to understand. 
# Should the config being added be a queue length with a default of 0 instead 
of the boolean that we have now? I am fine with filing a follow-up to fix this 
up. My intent here is to limit the new configs we add and avoid redundancy.
# TestQueuingContainerManager
## createContainerManager defines getRemoteUgi the exact same way as 
TestContainerManager. Any chance we can avoid the duplication? 
## Would it make sense to define the right hasResources in the setup method 
itself? 
## When creating a new ArrayList, we don't need to specify the type. 

> Queuing of container requests in the NM
> ---
>
> Key: YARN-2883
> URL: https://issues.apache.org/jira/browse/YARN-2883
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
> Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch, 
> YARN-2883-trunk.006.patch, YARN-2883-trunk.007.patch, 
> YARN-2883-trunk.008.patch, YARN-2883-trunk.009.patch, 
> YARN-2883-trunk.010.patch, YARN-2883-trunk.011.patch, 
> YARN-2883-yarn-2877.001.patch, YARN-2883-yarn-2877.002.patch, 
> YARN-2883-yarn-2877.003.patch, YARN-2883-yarn-2877.004.patch
>
>
> We propose to add a queue in each NM, where queueable container requests can 
> be held.
> Based on the available resources in the node and the containers in the queue, 
> the NM will decide when to allow the execution of a queued container.
> In order to ensure the instantaneous start of a guaranteed-start container, 
> the NM may decide to pre-empt/kill running queueable containers.
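
As an illustration of the queuing decision described above, here is a minimal 
sketch (every name below is hypothetical, not the patch code):
{code}
// Hypothetical sketch of the NM-side queuing decision; not the actual patch.
class QueuingSketch {
  static class Container { boolean queueable; }

  private final java.util.Queue<Container> queued =
      new java.util.concurrent.ConcurrentLinkedQueue<>();

  void onStartRequest(Container c) {
    if (hasAvailableResources(c)) {
      start(c);             // enough free resources: run immediately
    } else if (c.queueable) {
      queued.add(c);        // queueable: hold until resources free up
    } else {
      killQueueable(c);     // guaranteed-start: kill queueable containers to make room
      start(c);
    }
  }

  // Stubs standing in for the real NM logic.
  boolean hasAvailableResources(Container c) { return false; }
  void start(Container c) { }
  void killQueueable(Container c) { }
}
{code}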



--
This message was sent by Atlassian JIRA

[jira] [Commented] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes

2016-04-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239707#comment-15239707
 ] 

Sunil G commented on YARN-4947:
---

Hi [~bibinchundatt],
I am not very sure whether this is correct, as we always return TRUE from 
isDrained.

I think it's better if we remove {{drainEvents}} from {{MockRM}}. We can call 
it from those test cases (like the ones from YARN-4893) which have to ensure 
events are processed before doing the next iteration of the test case. 
Otherwise we will run into similar issues, and it may be more error prone.
[~djp], [~brahma] and [~bibinchundatt], thoughts?

> Test timeout is happening for TestRMWebServicesNodes
> 
>
> Key: YARN-4947
> URL: https://issues.apache.org/jira/browse/YARN-4947
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4947.patch
>
>
> Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 
> [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4948) Support node labels store in zookeeper

2016-04-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239721#comment-15239721
 ] 

Sunil G commented on YARN-4948:
---

I think it's based on branch-2.7; it can be prepared from trunk.

> Support node labels store in zookeeper
> --
>
> Key: YARN-4948
> URL: https://issues.apache.org/jira/browse/YARN-4948
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: resourcemanager
>Affects Versions: 2.7.0
>Reporter: jialei weng
> Attachments: YARN-4948-branch-2.7.0.001.patch, YARN-4948.001.patch
>
>
> Support node labels store in zookeeper



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4751) In 2.7, Labeled queue usage not shown properly in capacity scheduler UI

2016-04-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239743#comment-15239743
 ] 

Sunil G commented on YARN-4751:
---

Hi [~eepayne],
I have a suggestion here. Would it be better to add the 2.7 version of the 
YARN-3362 patch to YARN-3362 itself and commit it there, and then make the 
remaining changes here? I also realize this may not be easy to do; if it is 
problematic, I think we can do it all here instead. What do you think?



> In 2.7, Labeled queue usage not shown properly in capacity scheduler UI
> ---
>
> Key: YARN-4751
> URL: https://issues.apache.org/jira/browse/YARN-4751
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.3
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: 2.7 CS UI No BarGraph.jpg, 
> YARH-4752-branch-2.7.001.patch, YARH-4752-branch-2.7.002.patch, 
> YARN-4751-branch-2.7.003.patch
>
>
> In 2.6 and 2.7, the capacity scheduler UI does not have the queue graphs 
> separated by partition. When applications are running on a labeled queue, no 
> color is shown in the bar graph, and several of the "Used" metrics are zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4902) [Umbrella] Generalized and unified scheduling-strategies in YARN

2016-04-13 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239662#comment-15239662
 ] 

Arun Suresh commented on YARN-4902:
---

Does it make sense to introduce the concept of *AffinityGroups* / 
*Anti-AffinityGroups* for the purpose of simplifying affinity/anti-affinity?
Essentially, just like we use the Reservation API to request a reservation Id 
and then include the Id in subsequent requests,
we could expose an API that allows one to request an *(Anti)AffinityGroup* 
(consisting of N machines / X total resources, etc.) which in turn returns a 
*groupId*. (Internally, this could dynamically tag/label a group of 
machines/nodes with the id, but I guess that is an implementation detail.)

Subsequent allocation requests can include this *groupId*, and based on the 
policy, the scheduler will try to schedule on the same group of machines or on 
different machines.

Another thought: an affinity group may be an Allocation that can be 
used/shared by multiple applications. Once we de-link Allocations from 
containers, it may be possible to have an uber-application (or maybe a 
component in the RM itself) lease only Allocations and grant the right to start 
containers against these allocations to other applications. A group of 
allocations across multiple machines can constitute an anti-affinity group, and 
we can introduce policies that allow subsequent apps to, say, start only one 
container of each role/type on one allocation within a group.

I feel that allowing users to organize deployments along the lines of 
affinity/anti-affinity groups (failure domains might also be something similar) 
is more manageable than allowing users to specify affinity and anti-affinity 
constraints with respect to an already deployed application.
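
As a very rough sketch, the API surface could look something like this (all 
names below are hypothetical illustrations; nothing here exists in YARN today):
{code}
// Hypothetical client-side API for (anti-)affinity groups; illustrative only.
interface AffinityGroupClient {
  // Request a group spanning up to numNodes machines with totalMemoryMb in
  // aggregate; returns an opaque group id for use in later requests.
  String requestGroup(int numNodes, long totalMemoryMb, boolean antiAffinity);

  // Subsequent allocation requests carry the group id so the scheduler can
  // place containers on the same (affinity) or disjoint (anti-affinity) nodes.
  void allocateInGroup(String groupId, long memoryMb, int vcores);
}
{code}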

> [Umbrella] Generalized and unified scheduling-strategies in YARN
> 
>
> Key: YARN-4902
> URL: https://issues.apache.org/jira/browse/YARN-4902
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Wangda Tan
> Attachments: Generalized and unified scheduling-strategies in YARN 
> -v0.pdf
>
>
> Apache Hadoop YARN's ResourceRequest mechanism is the core part of the YARN's 
> scheduling API for applications to use. The ResourceRequest mechanism is a 
> powerful API for applications (specifically ApplicationMasters) to indicate 
> to YARN what size of containers are needed, and where in the cluster etc.
> However a host of new feature requirements are making the API increasingly 
> more and more complex and difficult to understand by users and making it very 
> complicated to implement within the code-base.
> This JIRA aims to generalize and unify all such scheduling-strategies in YARN.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4886) Add HDFS caller context for EntityGroupFSTimelineStore

2016-04-13 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239663#comment-15239663
 ] 

Xuan Gong commented on YARN-4886:
-

Looks like the findbugs issue is not introduced by this patch. I will create a 
separate ticket to track it. Committing this patch.

> Add HDFS caller context for EntityGroupFSTimelineStore
> --
>
> Key: YARN-4886
> URL: https://issues.apache.org/jira/browse/YARN-4886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-4886-trunk.001.patch
>
>
> We need to add a HDFS caller context for the entity group FS storage for 
> better audit log debugging. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4886) Add HDFS caller context for EntityGroupFSTimelineStore

2016-04-13 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239669#comment-15239669
 ] 

Xuan Gong commented on YARN-4886:
-

Committed into trunk/branch-2/branch-2.8. Thanks, Li

> Add HDFS caller context for EntityGroupFSTimelineStore
> --
>
> Key: YARN-4886
> URL: https://issues.apache.org/jira/browse/YARN-4886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Fix For: 2.8.0
>
> Attachments: YARN-4886-trunk.001.patch
>
>
> We need to add a HDFS caller context for the entity group FS storage for 
> better audit log debugging. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4886) Add HDFS caller context for EntityGroupFSTimelineStore

2016-04-13 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4886:

Fix Version/s: 2.8.0

> Add HDFS caller context for EntityGroupFSTimelineStore
> --
>
> Key: YARN-4886
> URL: https://issues.apache.org/jira/browse/YARN-4886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Fix For: 2.8.0
>
> Attachments: YARN-4886-trunk.001.patch
>
>
> We need to add a HDFS caller context for the entity group FS storage for 
> better audit log debugging. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4886) Add HDFS caller context for EntityGroupFSTimelineStore

2016-04-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239809#comment-15239809
 ] 

Hudson commented on YARN-4886:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9604 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9604/])
YARN-4886. Add HDFS caller context for EntityGroupFSTimelineStore. (xgong: rev 
e0cb426758b3d716ff143f723fc16ef2f1e4971b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage/src/main/java/org/apache/hadoop/yarn/server/timeline/EntityGroupFSTimelineStore.java


> Add HDFS caller context for EntityGroupFSTimelineStore
> --
>
> Key: YARN-4886
> URL: https://issues.apache.org/jira/browse/YARN-4886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Fix For: 2.8.0
>
> Attachments: YARN-4886-trunk.001.patch
>
>
> We need to add a HDFS caller context for the entity group FS storage for 
> better audit log debugging. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient

2016-04-13 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4955:

Attachment: YARN-4955.2.patch

> Add retry for SocketTimeoutException in TimelineClient
> --
>
> Key: YARN-4955
> URL: https://issues.apache.org/jira/browse/YARN-4955
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4955.1.patch, YARN-4955.2.patch
>
>
> We saw this exception several times when we tried to getDelegationToken from 
> ATS.
> java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>   at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
>   at java.lang.Thread.run(Thread.java:745)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> Caused by: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
>   at 
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:567)
>   ... 24 more
> Caused by: java.net.SocketTimeoutException: Read timed out
>   at 

[jira] [Commented] (YARN-4514) [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses

2016-04-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239661#comment-15239661
 ] 

Sunil G commented on YARN-4514:
---

It seems Jenkins was not triggered. [~leftnoteasy], could you please help kick 
off Jenkins manually? 

> [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses
> --
>
> Key: YARN-4514
> URL: https://issues.apache.org/jira/browse/YARN-4514
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Attachments: YARN-4514-YARN-3368.1.patch, 
> YARN-4514-YARN-3368.2.patch, YARN-4514-YARN-3368.3.patch, 
> YARN-4514-YARN-3368.4.patch, YARN-4514-YARN-3368.5.patch, 
> YARN-4514-YARN-3368.6.patch, YARN-4514-YARN-3368.7.patch
>
>
> We have several configurations that are hard-coded, for example RM/ATS 
> addresses; we should make them configurable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4878) Expose scheduling policy and max running apps over JMX for Yarn queues

2016-04-13 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239800#comment-15239800
 ] 

Yufei Gu commented on YARN-4878:


I've tested all the failed test cases; all of them are unrelated to the patch.
As for the style error, I mimicked the existing code; I think it's better to 
follow the convention of the existing code in the class.

> Expose scheduling policy and max running apps over JMX for Yarn queues
> --
>
> Key: YARN-4878
> URL: https://issues.apache.org/jira/browse/YARN-4878
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 2.9.0
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-4878.001.patch, YARN-4878.002.patch
>
>
> There are two things that are not currently visible over JMX: the current 
> scheduling policy for a queue, and the number of max running apps. It would 
> be great if these could be exposed over JMX as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4514) [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239803#comment-15239803
 ] 

Hadoop QA commented on YARN-4514:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 13s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 3m 20s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:e35bf0f |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12798311/YARN-4514-YARN-3368.7.patch
 |
| JIRA Issue | YARN-4514 |
| Optional Tests |  asflicense  |
| uname | Linux 6ad19ad54a1d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-3368 / e35bf0f |
| modules | C:  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui   .  U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/11063/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> [YARN-3368] Cleanup hardcoded configurations, such as RM/ATS addresses
> --
>
> Key: YARN-4514
> URL: https://issues.apache.org/jira/browse/YARN-4514
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Attachments: YARN-4514-YARN-3368.1.patch, 
> YARN-4514-YARN-3368.2.patch, YARN-4514-YARN-3368.3.patch, 
> YARN-4514-YARN-3368.4.patch, YARN-4514-YARN-3368.5.patch, 
> YARN-4514-YARN-3368.6.patch, YARN-4514-YARN-3368.7.patch
>
>
> We have several configurations that are hard-coded, for example RM/ATS 
> addresses; we should make them configurable. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient

2016-04-13 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239658#comment-15239658
 ] 

Xuan Gong commented on YARN-4955:
-

Thanks for the review. Looks like in 
{code}
  public Object run() throws IOException {
// Try pass the request, if fail, keep retrying
authUgi.checkTGTAndReloginFromKeytab();
try {
  return authUgi.doAs(action);
} catch (UndeclaredThrowableException e) {
  throw new IOException(e.getCause());
} catch (InterruptedException e) {
  throw new IOException(e);
}
  }
{code}
all the exceptions would be wrapped as IOException, so catching IOException 
should be enough for now. 
I have modified the patch to get the cause from the IOException, check whether 
it is an AuthenticationException, and then further check the cause of that 
AuthenticationException.
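As a sketch, the check looks roughly like this (simplified; not the exact patch 
code):
{code}
// Sketch only: walk the cause chain of the wrapped IOException to decide
// whether the failure is a retriable read timeout.
import java.io.IOException;
import java.net.SocketTimeoutException;
import org.apache.hadoop.security.authentication.client.AuthenticationException;

class RetryCheckSketch {
  static boolean isRetriableTimeout(IOException e) {
    Throwable cause = e.getCause();
    if (cause instanceof AuthenticationException) {
      // The AuthenticationException may itself wrap the socket read timeout.
      return cause.getCause() instanceof SocketTimeoutException;
    }
    return false;
  }
}
{code}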

> Add retry for SocketTimeoutException in TimelineClient
> --
>
> Key: YARN-4955
> URL: https://issues.apache.org/jira/browse/YARN-4955
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4955.1.patch, YARN-4955.2.patch
>
>
> We saw this exception several times when we tried to getDelegationToken from 
> ATS.
> java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>   at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
>   at java.lang.Thread.run(Thread.java:745)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> Caused by: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
>   at 
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
>   at 
> 

[jira] [Commented] (YARN-4909) Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter

2016-04-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239671#comment-15239671
 ] 

Sunil G commented on YARN-4909:
---

[~bibinchundatt]
{{int jerseyPort = port + rand.nextInt(1000);}}
I think this will also be fine.
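Another option, as a sketch, is to let the OS pick a free port instead of 
adding a random offset (assuming the test can be pointed at an arbitrary port):
{code}
// Sketch: bind to port 0 so the OS assigns a free ephemeral port, avoiding
// "Address already in use" races. A small race remains between closing the
// probe socket and re-binding the chosen port.
int jerseyPort;
try (java.net.ServerSocket probe = new java.net.ServerSocket(0)) {
  jerseyPort = probe.getLocalPort();
}
{code}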

> Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter
> ---
>
> Key: YARN-4909
> URL: https://issues.apache.org/jira/browse/YARN-4909
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Brahma Reddy Battula
>Assignee: Bibin A Chundatt
>Priority: Blocker
> Attachments: 0001-YARN-4909.patch, 0002-YARN-4909.patch, 
> 0003-YARN-4909.patch, 0004-YARN-4909.patch, 0005-YARN-4909.patch
>
>
>  *Precommit link* 
> https://builds.apache.org/job/PreCommit-YARN-Build/10908/testReport/
> *Trace* 
> {noformat}
> com.sun.jersey.test.framework.spi.container.TestContainerException: 
> java.net.BindException: Address already in use
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:463)
>   at sun.nio.ch.Net.bind(Net.java:455)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:413)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:384)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:375)
>   at 
> org.glassfish.grizzly.http.server.NetworkListener.start(NetworkListener.java:549)
>   at 
> org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:255)
>   at 
> com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:326)
>   at 
> com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:343)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.instantiateGrizzlyWebServer(GrizzlyWebTestContainerFactory.java:219)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:129)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:86)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory.create(GrizzlyWebTestContainerFactory.java:79)
>   at 
> com.sun.jersey.test.framework.JerseyTest.getContainer(JerseyTest.java:342)
>   at com.sun.jersey.test.framework.JerseyTest.(JerseyTest.java:217)
>   at 
> org.apache.hadoop.yarn.webapp.JerseyTestBase.(JerseyTestBase.java:30)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices.(TestRMWebServices.java:125)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239905#comment-15239905
 ] 

Daniel Templeton commented on YARN-4807:


Looks like the same tests are still failing.  Did you add in the sleeps per 
[~kasha]'s suggestion?

bq. (1) manually add sleeps so the tests pass, (2) file a follow-up JIRA to fix 
the test the right way and (3) add a TODO in the code annotated with the JIRA. 
(e.g. // TODO (YARN-wxyz))
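
i.e., something along these lines at each flaky wait (illustrative only; the 
JIRA number is a placeholder):
{code}
// TODO (YARN-wxyz): replace this sleep with a proper wait-for-state.
Thread.sleep(500);
{code}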

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
>  Labels: newbie
> Attachments: YARN-4807.001.patch, YARN-4807.002.patch, 
> YARN-4807.003.patch, YARN-4807.004.patch, YARN-4807.005.patch, 
> YARN-4807.006.patch, YARN-4807.007.patch, YARN-4807.008.patch
>
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4956) findbug issue on LevelDBCacheTimelineStore

2016-04-13 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-4956:

Affects Version/s: 2.8.0

> findbug issue on LevelDBCacheTimelineStore
> --
>
> Key: YARN-4956
> URL: https://issues.apache.org/jira/browse/YARN-4956
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Xuan Gong
>
> {code}
> Multithreaded correctness Warnings
> Code Warning IS Inconsistent synchronization of 
> org.apache.hadoop.yarn.server.timeline.LevelDBCacheTimelineStore.configuration;
>  locked 66% of time
> Bug type IS2_INCONSISTENT_SYNC (click for details) 
> In class org.apache.hadoop.yarn.server.timeline.LevelDBCacheTimelineStore
> Field 
> org.apache.hadoop.yarn.server.timeline.LevelDBCacheTimelineStore.configuration
> Synchronized 66% of the time
> Unsynchronized access at LevelDBCacheTimelineStore.java:[line 82]
> Synchronized access at LevelDBCacheTimelineStore.java:[line 117]
> Synchronized access at LevelDBCacheTimelineStore.java:[line 122]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4956) findbug issue on LevelDBCacheTimelineStore

2016-04-13 Thread Xuan Gong (JIRA)
Xuan Gong created YARN-4956:
---

 Summary: findbug issue on LevelDBCacheTimelineStore
 Key: YARN-4956
 URL: https://issues.apache.org/jira/browse/YARN-4956
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong


{code}
Multithreaded correctness Warnings
Code Warning IS Inconsistent synchronization of 
org.apache.hadoop.yarn.server.timeline.LevelDBCacheTimelineStore.configuration; 
locked 66% of time
Bug type IS2_INCONSISTENT_SYNC (click for details) 
In class org.apache.hadoop.yarn.server.timeline.LevelDBCacheTimelineStore
Field 
org.apache.hadoop.yarn.server.timeline.LevelDBCacheTimelineStore.configuration
Synchronized 66% of the time
Unsynchronized access at LevelDBCacheTimelineStore.java:[line 82]
Synchronized access at LevelDBCacheTimelineStore.java:[line 117]
Synchronized access at LevelDBCacheTimelineStore.java:[line 122]
{code}
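
For reference, the warning flags a pattern like the following generic sketch 
(not the actual class):
{code}
// Generic sketch of IS2_INCONSISTENT_SYNC: a field written without the lock
// in one place but accessed under the lock elsewhere, so findbugs reports it
// as inconsistently synchronized.
class SyncSketch {
  private Object configuration;

  void serviceInit(Object conf) {
    configuration = conf;                 // unsynchronized write
  }

  synchronized Object getConfiguration() {
    return configuration;                 // synchronized read
  }
}
{code}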



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4468) Document the general ReservationSystem functionality, and the REST API

2016-04-13 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239842#comment-15239842
 ] 

Arun Suresh commented on YARN-4468:
---

[~curino]/[~subru], the patch itself looks good, but it does not apply 
cleanly against trunk.
Can you rebase and re-post the patch?

> Document the general ReservationSystem functionality, and the REST API
> --
>
> Key: YARN-4468
> URL: https://issues.apache.org/jira/browse/YARN-4468
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, fairscheduler, resourcemanager
>Reporter: Carlo Curino
>Assignee: Carlo Curino
> Attachments: YARN-4468.1.patch, YARN-4468.rest-only.patch
>
>
> This JIRA tracks effort to document the ReservationSystem functionality, and 
> the REST API access to it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240324#comment-15240324
 ] 

Hadoop QA commented on YARN-3816:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 12m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 12s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
47s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 27s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 47s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
45s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 7s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
1s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
19s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 40s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 1s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 57s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 57s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 20s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 7 new + 
1 unchanged - 0 fixed = 8 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
50s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 57s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-jdk1.8.0_77 with JDK v1.8.0_77 
generated 8 new + 92 unchanged - 8 fixed = 100 total (was 100) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 57s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.8.0_77
 with JDK v1.8.0_77 generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 7m 6s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-jdk1.7.0_95 with JDK v1.7.0_95 
generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 53s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s 
{color} | {color:green} 

[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-04-13 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240325#comment-15240325
 ] 

Robert Kanter commented on YARN-4676:
-

Thanks for updating the patch.  I'll take a look either late today or tomorrow. 
 The docs are simply markdown files in the repo.  For instance, here are the 
docs for the {{rmadmin}} command \[1\].  Looks like someone has already updated 
the usage to mention the {{-g}} argument for {{refreshNodes}}, but they forgot 
to update the "COMMAND_OPTIONS" and "Description" after it.  We should fix 
that.  As for more detailed documentation on how to do a graceful decommission, 
I wonder if that should go on its own page like these \[2\].

\[1\] 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnCommands.md#rmadmin
\[2\] 
https://github.com/apache/hadoop/tree/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown



> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> 
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Daniel Zhi
>Assignee: Daniel Zhi
>  Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, 
> YARN-4676.005.patch, YARN-4676.006.patch, YARN-4676.007.patch, 
> YARN-4676.008.patch, YARN-4676.009.patch, YARN-4676.010.patch
>
>
> DecommissioningNodeWatcher inside ResourceTrackingService tracks the status 
> of DECOMMISSIONING nodes automatically and asynchronously after the 
> client/admin has made the graceful decommission request. It tracks 
> DECOMMISSIONING node status to decide when, after all running containers on 
> the node have completed, the node will be transitioned into the 
> DECOMMISSIONED state. NodesListManager detects and handles include and 
> exclude list changes to kick off decommission or recommission as necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4541) Change log message in LocalizedResource#handle() to DEBUG

2016-04-13 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240327#comment-15240327
 ] 

Robert Kanter commented on YARN-4541:
-

+1

> Change log message in LocalizedResource#handle() to DEBUG
> -
>
> Key: YARN-4541
> URL: https://issues.apache.org/jira/browse/YARN-4541
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Minor
>  Labels: supportability
> Attachments: YARN-4541.001.patch, YARN-4541.002.patch
>
>
> This section of code can fill up a log fairly quickly.
> {code}
>if (oldState != newState) {
> LOG.info("Resource " + resourcePath + (localPath != null ?
>   "(->" + localPath + ")": "") + " transitioned from " + oldState
> + " to " + newState);
>}
> {code}
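
A debug-guarded version along these lines would avoid the flood (a sketch of 
the intent, not necessarily the committed patch):
{code}
if (oldState != newState && LOG.isDebugEnabled()) {
  LOG.debug("Resource " + resourcePath + (localPath != null
      ? "(->" + localPath + ")" : "") + " transitioned from " + oldState
      + " to " + newState);
}
{code}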



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4958) The file localization process should allow for wildcards to reduce the application footprint in the state store

2016-04-13 Thread Daniel Templeton (JIRA)
Daniel Templeton created YARN-4958:
--

 Summary: The file localization process should allow for wildcards 
to reduce the application footprint in the state store
 Key: YARN-4958
 URL: https://issues.apache.org/jira/browse/YARN-4958
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.8.0
Reporter: Daniel Templeton
Assignee: Daniel Templeton
Priority: Critical


When using the -libjars option to add classes to the classpath, every library 
so added is explicitly listed in the {{ContainerLaunchContext}}'s local 
resources even though they're all uploaded to the same directory in HDFS.  When 
using tools like Crunch without an uber JAR or when trying to take advantage of 
the shared cache, the number of libraries can be quite large.  We've seen many 
cases where we had to turn down the max number of applications to prevent ZK 
from running out of heap because of the size of the state store entries.

Rather than listing all files independently, this JIRA proposes to have the NM 
allow wildcards in the resource localization paths.  Specifically, we propose 
to allow a path to have a final component (name) set to "*", which is 
interpreted by the NM as "download the full directory and link to every file in 
it from the job's working directory."  This behavior is the same as the current 
behavior when using -libjars, but avoids explicitly listing every file.

This JIRA does not attempt to provide more general purpose wildcards, such as 
"*.jar" or "file*", as having multiple entries for a single directory presents 
numerous logistical issues.

This JIRA also does not attempt to integrate with the shared cache.  That work 
will be left to a future JIRA.

This JIRA proposes to allow for wildcards both in the internal processing of 
the -libjars switch and in paths added through the {{Job}} and 
{{DistributedCache}} classes.

The proposed approach is to treat a path, "dir/*", as "dir" for purposes of all 
file verification.  In the final step, the NM will query the localized 
directory to get a list of the files in "dir" such that each can be linked from 
the job's working directory.
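
A simplified sketch of that final step (all names here are illustrative, not 
the actual patch code):
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

// Sketch: after localizing "dir" for a "dir/*" resource, link every file in
// the localized directory from the container's working directory.
class WildcardLinkSketch {
  static void linkAll(FileSystem fs, Path localizedDir, Path workDir)
      throws IOException {
    for (FileStatus stat : fs.listStatus(localizedDir)) {
      Path file = stat.getPath();
      // symlink <workDir>/<name> -> <localizedDir>/<name>
      FileUtil.symLink(file.toString(),
          new Path(workDir, file.getName()).toString());
    }
  }
}
{code}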



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4734) Merge branch:YARN-3368 to trunk

2016-04-13 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240333#comment-15240333
 ] 

Wangda Tan commented on YARN-4734:
--

Thanks for comments, [~aw].

For your comments: the attached patches are actually still WIP, and I used 
them to figure out issues like ASF licensing warnings. I will send a mail to 
the yarn-dev mailing list when the patch is ready to be reviewed from my POV, 
and will add "wip" to the patch name later to avoid confusion.

For your concerns:
bq. Definitely need some clarification from ASF legal whether we can merge 
licenses like that. My hunch is no, but IANAL.
I can see that some projects like Spark are using merged licenses; see 
LEGAL-226/SPARK-10833. I will send a query about the standard format of 
licenses as well.

bq. The dist and tmp directories should be inside target and not in the root of 
the module. This makes a ton of other problems go away.
Will do

bq. Why is there a separate profile for this? What UI do I get if I don't build 
with this profile? This also means the precommit hooks won't work until the 
hadoop personality is modified (which means the above precommit testing is 
mostly useless)
Since it requires additional tools to build (npm & bower), we cannot ask 
developers to install them until it is *officially* supported by YARN.
To make sure it can be run by Jenkins, can we modify Yetus (or the Hadoop dev 
support script) so that Jenkins can build/test it by adding the additional 
profile?

bq. Double check the license headers. At least one of 'em was using the old 
text.
It seems all existing YARN docs (*.md) are using the old header (if that's the 
older header you mentioned above). I will fix yarnui2.md, and the rest of the 
YARN docs can be fixed separately.

bq. Why isn't YarnUI2.md's content in BUILDING.txt? Why does an end user care 
about this information?
My understanding is that BUILDING.txt should only cover how to build components. 
YarnUI2.md is mainly about how to deploy and start the new UI server; 
contributors/volunteers can follow its steps to try it.

bq. The Apache RAT issues
Will fix

bq. Why does "hadoop-yarn-ui/src/main/resources/META-INF/NOTICE.txt" mention 
Tez?
Will fix.

bq. hadoop-yarn-ui/src/main/webapp/package.json should have it's version pulled 
from maven.
From my investigation, we cannot pass the version down to ember to build the 
package from the CLI; it has to be picked up from package.json.
We basically have two choices:
1) Modify package.json automatically in the maven build script and pass down the 
version from maven.
2) Give a separate version to the yarn-ui module. 0.0.0 in the patch doesn't make 
sense at all; how about calling it 0.1?
I'm not sure how we dealt with the version of libhadoop.so. I would like to hear 
your thoughts.



> Merge branch:YARN-3368 to trunk
> ---
>
> Key: YARN-4734
> URL: https://issues.apache.org/jira/browse/YARN-4734
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4734.1.patch, YARN-4734.2.patch, YARN-4734.3.patch, 
> YARN-4734.4.patch, YARN-4734.5.patch
>
>
> YARN-2928 branch is planned to merge back to trunk shortly, it depends on 
> changes of YARN-3368. This JIRA is to track the merging task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4958) The file localization process should allow for wildcards to reduce the application footprint in the state store

2016-04-13 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-4958:
---
Attachment: YARN-4958.001.patch

Here's an initial cut at the patch.  There are still a few issues to resolve, 
but some general testing shows that the patch behaves as expected.

> The file localization process should allow for wildcards to reduce the 
> application footprint in the state store
> ---
>
> Key: YARN-4958
> URL: https://issues.apache.org/jira/browse/YARN-4958
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.8.0
>Reporter: Daniel Templeton
>Assignee: Daniel Templeton
>Priority: Critical
> Attachments: YARN-4958.001.patch
>
>
> When using the -libjars option to add classes to the classpath, every library 
> so added is explicitly listed in the {{ContainerLaunchContext}}'s local 
> resources even though they're all uploaded to the same directory in HDFS.  
> When using tools like Crunch without an uber JAR or when trying to take 
> advantage of the shared cache, the number of libraries can be quite large.  
> We've seen many cases where we had to turn down the max number of 
> applications to prevent ZK from running out of heap because of the size of 
> the state store entries.
> Rather than listing all files independently, this JIRA proposes to have the 
> NM allow wildcards in the resource localization paths.  Specifically, we 
> propose to allow a path to have a final component (name) set to "*", which is 
> interpreted by the NM as "download the full directory and link to every file 
> in it from the job's working directory."  This behavior is the same as the 
> current behavior when using -libjars, but avoids explicitly listing every 
> file.
> This JIRA does not attempt to provide more general purpose wildcards, such as 
> "*.jar" or "file*", as having multiple entries for a single directory 
> presents numerous logistical issues.
> This JIRA also does not attempt to integrate with the shared cache.  That 
> work will be left to a future JIRA.
> This JIRA proposes to allow for wildcards both in the internal processing of 
> the -libjars switch and in paths added through the {{Job}} and 
> {{DistributedCache}} classes.
> The proposed approach is to treat a path, "dir/*", as "dir" for purposes of 
> all file verification.  In the final step, the NM will query the localized 
> directory to get a list of the files in "dir" such that each can be linked 
> from the job's working directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes

2016-04-13 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240340#comment-15240340
 ] 

Bibin A Chundatt commented on YARN-4947:


[~sunilg]
{quote}
I am not very sure whether this is correct as we always return TRUE from 
isDrained.
{quote}
I disagree with this. The existing test case implementation 
{{TestRMWebServicesNodes}} never drained events anyway, so it should be fine.

{quote}
 We can call this from those test cases (like those cases from YARN-4893) which 
has to ensure events are processed and do next iteration of test case 
execution. Else we will run in to similar issues, and may be more error prone.
{quote}
The only case where we will hit this is when not all MockRM services are started. 
What other cases are you expecting?


> Test timeout is happening for TestRMWebServicesNodes
> 
>
> Key: YARN-4947
> URL: https://issues.apache.org/jira/browse/YARN-4947
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4947.patch
>
>
> Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 
> [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4909) Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter

2016-04-13 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240344#comment-15240344
 ] 

Bibin A Chundatt commented on YARN-4909:


[~sunilg]
{quote}
9998 + rnd. nextInt()%500, something like this.. So it can be random when 
called in parallel
{quote}
I was thinking of the same, and wondered why your previous comment specifically 
mentioned it like this. :)
Attaching the latest patch after the update.
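
For reference, a minimal sketch of that suggestion; {{nextInt(int)}} is used 
here because {{nextInt() % 500}} can yield negative offsets:

{code}
// Hedged sketch: pick a test port in [9998, 10497] so parallel runs rarely collide.
int port = 9998 + java.util.concurrent.ThreadLocalRandom.current().nextInt(500);
{code}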


> Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter
> ---
>
> Key: YARN-4909
> URL: https://issues.apache.org/jira/browse/YARN-4909
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Brahma Reddy Battula
>Assignee: Bibin A Chundatt
>Priority: Blocker
> Attachments: 0001-YARN-4909.patch, 0002-YARN-4909.patch, 
> 0003-YARN-4909.patch, 0004-YARN-4909.patch, 0005-YARN-4909.patch, 
> 0006-YARN-4909.patch
>
>
>  *Precommit link* 
> https://builds.apache.org/job/PreCommit-YARN-Build/10908/testReport/
> *Trace* 
> {noformat}
> com.sun.jersey.test.framework.spi.container.TestContainerException: 
> java.net.BindException: Address already in use
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:463)
>   at sun.nio.ch.Net.bind(Net.java:455)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:413)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:384)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:375)
>   at 
> org.glassfish.grizzly.http.server.NetworkListener.start(NetworkListener.java:549)
>   at 
> org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:255)
>   at 
> com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:326)
>   at 
> com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:343)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.instantiateGrizzlyWebServer(GrizzlyWebTestContainerFactory.java:219)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:129)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:86)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory.create(GrizzlyWebTestContainerFactory.java:79)
>   at 
> com.sun.jersey.test.framework.JerseyTest.getContainer(JerseyTest.java:342)
>   at com.sun.jersey.test.framework.JerseyTest.(JerseyTest.java:217)
>   at 
> org.apache.hadoop.yarn.webapp.JerseyTestBase.(JerseyTestBase.java:30)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices.(TestRMWebServices.java:125)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4909) Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter

2016-04-13 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-4909:
---
Attachment: 0006-YARN-4909.patch

> Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter
> ---
>
> Key: YARN-4909
> URL: https://issues.apache.org/jira/browse/YARN-4909
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Brahma Reddy Battula
>Assignee: Bibin A Chundatt
>Priority: Blocker
> Attachments: 0001-YARN-4909.patch, 0002-YARN-4909.patch, 
> 0003-YARN-4909.patch, 0004-YARN-4909.patch, 0005-YARN-4909.patch, 
> 0006-YARN-4909.patch
>
>
>  *Precommit link* 
> https://builds.apache.org/job/PreCommit-YARN-Build/10908/testReport/
> *Trace* 
> {noformat}
> com.sun.jersey.test.framework.spi.container.TestContainerException: 
> java.net.BindException: Address already in use
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:463)
>   at sun.nio.ch.Net.bind(Net.java:455)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:413)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:384)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:375)
>   at 
> org.glassfish.grizzly.http.server.NetworkListener.start(NetworkListener.java:549)
>   at 
> org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:255)
>   at 
> com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:326)
>   at 
> com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:343)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.instantiateGrizzlyWebServer(GrizzlyWebTestContainerFactory.java:219)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:129)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:86)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory.create(GrizzlyWebTestContainerFactory.java:79)
>   at 
> com.sun.jersey.test.framework.JerseyTest.getContainer(JerseyTest.java:342)
>   at com.sun.jersey.test.framework.JerseyTest.(JerseyTest.java:217)
>   at 
> org.apache.hadoop.yarn.webapp.JerseyTestBase.(JerseyTestBase.java:30)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices.(TestRMWebServices.java:125)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4751) In 2.7, Labeled queue usage not shown properly in capacity scheduler UI

2016-04-13 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240016#comment-15240016
 ] 

Eric Payne commented on YARN-4751:
--

Thanks for the suggestion [~sunilg]:
bq. Is it better to have YARN-3362 2.7 version patch to be added in YARN-3362 
and commit that.
I'm okay with submitting the YARN-4751-branch-2.7.003.patch to YARN-3362.
bq. Then other changes to do here.
What other changes would be done here?

> In 2.7, Labeled queue usage not shown properly in capacity scheduler UI
> ---
>
> Key: YARN-4751
> URL: https://issues.apache.org/jira/browse/YARN-4751
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.3
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: 2.7 CS UI No BarGraph.jpg, 
> YARH-4752-branch-2.7.001.patch, YARH-4752-branch-2.7.002.patch, 
> YARN-4751-branch-2.7.003.patch
>
>
> In 2.6 and 2.7, the capacity scheduler UI does not have the queue graphs 
> separated by partition. When applications are running on a labeled queue, no 
> color is shown in the bar graph, and several of the "Used" metrics are zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4751) In 2.7, Labeled queue usage not shown properly in capacity scheduler UI

2016-04-13 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240046#comment-15240046
 ] 

Eric Payne commented on YARN-4751:
--

bq. Is it better to have YARN-3362 2.7 version patch to be added in YARN-3362 
and commit that.
[~sunilg]: Sorry, I think I understand now what you are saying. You mean: just 
backport the changes from YARN-3362 and submit that as a patch there. Then, in 
this JIRA (YARN-4751), make additional changes to present the correct values 
for all of the labeled metrics. Is that correct?

I guess I'm okay with that, but I don't really like the idea of submitting a 
patch to YARN-3362 that we know gives wrong metrics in the UI.

> In 2.7, Labeled queue usage not shown properly in capacity scheduler UI
> ---
>
> Key: YARN-4751
> URL: https://issues.apache.org/jira/browse/YARN-4751
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 2.7.3
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: 2.7 CS UI No BarGraph.jpg, 
> YARH-4752-branch-2.7.001.patch, YARH-4752-branch-2.7.002.patch, 
> YARN-4751-branch-2.7.003.patch
>
>
> In 2.6 and 2.7, the capacity scheduler UI does not have the queue graphs 
> separated by partition. When applications are running on a labeled queue, no 
> color is shown in the bar graph, and several of the "Used" metrics are zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4929) Fix test failures because of removing the minimum wait time for attempts.

2016-04-13 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-4929:
---
Description: 
The following unit test cases failed because we removed the minimum wait time 
for attempt in YARN-4807. I manually add sleeps so the tests pass and add a 
TODO in the code.

- TestAMRestart.testRMAppAttemptFailuresValidityInterval 
- TestApplicationMasterService.testResourceTypes
- TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche

  was:
The following unit test cases failed because we removed the minimum wait time 
for attempt in YARN-4807. I manually add sleeps so the tests pass

- TestAMRestart.testRMAppAttemptFailuresValidityInterval 
- TestApplicationMasterService.testResourceTypes
- TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche


> Fix test failures because of removing the minimum wait time for attempts.
> -
>
> Key: YARN-4929
> URL: https://issues.apache.org/jira/browse/YARN-4929
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-4929.001.patch
>
>
> The following unit test cases failed because we removed the minimum wait time 
> for attempt in YARN-4807. I manually add sleeps so the tests pass and add a 
> TODO in the code.
> - TestAMRestart.testRMAppAttemptFailuresValidityInterval 
> - TestApplicationMasterService.testResourceTypes
> - TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
> - TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
> - TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4957) Add getNewReservation in ApplicationClientProtocol

2016-04-13 Thread Subru Krishnan (JIRA)
Subru Krishnan created YARN-4957:


 Summary: Add getNewReservation in ApplicationClientProtocol
 Key: YARN-4957
 URL: https://issues.apache.org/jira/browse/YARN-4957
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: applications, client, resourcemanager
Reporter: Subru Krishnan
Assignee: Sean Po


Currently submitReservation returns a ReservationId if successful. This JIRA 
proposes adding a getNewReservation in ApplicationClientProtocol for the 
following reasons (see the sketch after this list):
  * Prevent zombie reservations in the face of client and/or network failures 
post submitReservation
  * Align reservation submission with application submission
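
A minimal sketch, assuming hypothetical client method names that mirror 
application submission (yarnClient and reservationRequest are placeholders), of 
the two-step flow this would enable:

{code}
// Hedged sketch: method names are assumptions, not the committed API.
ReservationId reservationId = yarnClient.getNewReservation(); // allocate the id first
reservationRequest.setReservationId(reservationId);
// A retry after a lost response reuses the same id, so the RM can deduplicate
// and no zombie reservation is created.
yarnClient.submitReservation(reservationRequest);
{code}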



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-13 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-4807:
---
Attachment: YARN-4807.009.patch

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
>  Labels: newbie
> Attachments: YARN-4807.001.patch, YARN-4807.002.patch, 
> YARN-4807.003.patch, YARN-4807.004.patch, YARN-4807.005.patch, 
> YARN-4807.006.patch, YARN-4807.007.patch, YARN-4807.008.patch, 
> YARN-4807.009.patch
>
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-13 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-4807:
---
Attachment: (was: YARN-4807.009.patch)

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
>  Labels: newbie
> Attachments: YARN-4807.001.patch, YARN-4807.002.patch, 
> YARN-4807.003.patch, YARN-4807.004.patch, YARN-4807.005.patch, 
> YARN-4807.006.patch, YARN-4807.007.patch, YARN-4807.008.patch
>
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-13 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-4807:
---
Attachment: YARN-4807.009.patch

Thanks [~templedf] for the review. I've done all three things in patch 009. 
Please take a look.
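
For context, a minimal sketch, not the actual MockAM/MockRM code, of the kind 
of shared shorter-interval polling wait such a fix converges on:

{code}
// Hedged sketch only. A shared helper that polls every 50 ms instead of sleeping
// 500 ms per iteration also removes the duplication between MockAM#waitForState
// and MockRM#waitForState.
static <T> void waitForState(java.util.function.Supplier<T> current, T expected,
    long timeoutMs) throws InterruptedException {
  long deadline = System.currentTimeMillis() + timeoutMs;
  while (!expected.equals(current.get())
      && System.currentTimeMillis() < deadline) {
    Thread.sleep(50);  // short interval keeps test latency low
  }
}
{code}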

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
>  Labels: newbie
> Attachments: YARN-4807.001.patch, YARN-4807.002.patch, 
> YARN-4807.003.patch, YARN-4807.004.patch, YARN-4807.005.patch, 
> YARN-4807.006.patch, YARN-4807.007.patch, YARN-4807.008.patch, 
> YARN-4807.009.patch
>
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4929) Fix test failures because of removing the minimum wait time for attempts.

2016-04-13 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-4929:
---
Description: 
The following unit test cases failed because we removed the minimum wait time 
for attempt in YARN-4807. I manually add sleeps so the tests pass

- TestAMRestart.testRMAppAttemptFailuresValidityInterval 
- TestApplicationMasterService.testResourceTypes
- TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche

  was:
The following unit test cases failed because we removed the minimum wait time 
for attempt in YARN-4807

- TestAMRestart.testRMAppAttemptFailuresValidityInterval 
- TestApplicationMasterService.testResourceTypes
- TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche


> Fix test failures because of removing the minimum wait time for attempts.
> -
>
> Key: YARN-4929
> URL: https://issues.apache.org/jira/browse/YARN-4929
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-4929.001.patch
>
>
> The following unit test cases failed because we removed the minimum wait time 
> for attempt in YARN-4807. I manually add sleeps so the tests pass
> - TestAMRestart.testRMAppAttemptFailuresValidityInterval 
> - TestApplicationMasterService.testResourceTypes
> - TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
> - TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
> - TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4929) Fix test failures because of removing the minimum wait time for attempts.

2016-04-13 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-4929:
---
Description: 
The following unit test cases failed because we removed the minimum wait time 
for attempt in YARN-4807. I manually added sleeps so the tests pass and added a 
TODO in the code.

- TestAMRestart.testRMAppAttemptFailuresValidityInterval 
- TestApplicationMasterService.testResourceTypes
- TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche

  was:
The following unit test cases failed because we removed the minimum wait time 
for attempt in YARN-4807. I manually add sleeps so the tests pass and add a 
TODO in the code.

- TestAMRestart.testRMAppAttemptFailuresValidityInterval 
- TestApplicationMasterService.testResourceTypes
- TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
- TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche


> Fix test failures because of removing the minimum wait time for attempts.
> -
>
> Key: YARN-4929
> URL: https://issues.apache.org/jira/browse/YARN-4929
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-4929.001.patch
>
>
> The following unit test cases failed because we removed the minimum wait time 
> for attempt in YARN-4807. I manually added sleeps so the tests pass and added 
> a TODO in the code.
> - TestAMRestart.testRMAppAttemptFailuresValidityInterval 
> - TestApplicationMasterService.testResourceTypes
> - TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers
> - TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForFairSche
> - TestRMApplicationHistoryWriter.testRMWritingMassiveHistoryForCapacitySche



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-04-13 Thread Daniel Zhi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Zhi updated YARN-4676:
-
Attachment: YARN-4676.010.patch

Patch 010 contains updates for review comments 2~7. The main change is for No. 5 
--- merging refreshNodes(long timeout) with YARN-4676:
  1. Provide the timeout through RefreshNodeRequest to NodesListManager;
  2. NodesListManager uses that timeout if no per-node timeout is specified;
  3. The final FORCEFUL decommission is still there, but with an extra 20-second 
delay after the timeout. I have verified that the RM tracks and handles the 
timeout as expected, so normally RMAdminCLI won't forcefully decommission. I also 
verified that even if it does, DecommissioningNodesWatcher deals with it fine. So 
overall, we can simply preserve the FORCEFUL decommission near the end of 
refreshNodes(long timeout).

For 8, I need help on how to add "docs".
For 1, I will consider a pseudo patch with a one-line comment to see whether QA 
still complains about unit test errors.
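
A minimal sketch, with hypothetical names, of the timeout resolution in points 1 
and 2 (a per-node timeout wins; otherwise the request-level timeout applies):

{code}
// Hedged sketch: names are assumptions, not the actual NodesListManager code.
static int resolveDecommissionTimeout(Integer perNodeTimeoutSecs,
    int requestTimeoutSecs) {
  return (perNodeTimeoutSecs != null) ? perNodeTimeoutSecs : requestTimeoutSecs;
}
{code}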

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> 
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Daniel Zhi
>Assignee: Daniel Zhi
>  Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, 
> YARN-4676.005.patch, YARN-4676.006.patch, YARN-4676.007.patch, 
> YARN-4676.008.patch, YARN-4676.009.patch, YARN-4676.010.patch
>
>
> DecommissioningNodeWatcher inside ResourceTrackingService tracks 
> DECOMMISSIONING nodes status automatically and asynchronously after 
> client/admin made the graceful decommission request. It tracks 
> DECOMMISSIONING nodes status to decide when, after all running containers on 
> the node have completed, will be transitioned into DECOMMISSIONED state. 
> NodesListManager detect and handle include and exclude list changes to kick 
> out decommission or recommission as necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4882) Change the log level to DEBUG for recovering completed applications

2016-04-13 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240244#comment-15240244
 ] 

Daniel Templeton commented on YARN-4882:


I'm going to assume the silence means I captured it pretty well.

bq. We don't want to flood the logs with an intractable number of log messages 
during recovery

This one is clearly solved by both options: having an extra log file, or just 
dialing down the log level.

bq. We need to be able to identify bad applications in the case that recovery 
fails

As long as we don't dial down the log level for recovery failures, both 
solutions seem to address this objective as well.  On the point that sometimes 
knowing what didn't fail is useful in a failed recovery, let me ask a question. 
 If the recovery fails, the RM fails to start, right?  If the RM fails to 
start, it's possible to change the log level before starting it again, if 
getting a list of the successful recoveries is helpful, right?  And since that 
recovery will also fail, it's possible to reset the log level before the final 
restart after resolving the issue.  Is there a scenario where the RM starts 
successfully but the list of recovered apps is still useful?

I dislike the idea of adding an extra log file, plus a property to enable it, to 
the admin's plate for the sole purpose of logging successful recoveries, when 
that information is not commonly useful and when the same information could be 
retrieved through a well-known existing mechanism (changing the log level).

I propose that we streamline the log messages to be useful and succinct.  The 
RM should detail the recovery statistics at the info level, recovery failures 
at the warn or error level, and recovery successes at the debug level.  The log 
messages should also be reworked to include as much information as possible to 
assist in debugging failures while being less chatty.
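
A minimal sketch, not a committed patch, of the proposed leveling (variable 
names are assumptions for illustration):

{code}
// Hedged sketch: statistics at INFO, failures at ERROR, per-app successes at DEBUG.
LOG.info("Recovered " + numRecovered + " applications in " + elapsedMs
    + " ms (" + numFailed + " failed)");          // recovery statistics stay at INFO
if (recoveryFailed) {
  LOG.error("Failed to recover " + appId, e);     // failures stay loud
} else if (LOG.isDebugEnabled()) {
  LOG.debug("Recovered " + appId + " with final state " + finalState);
}
{code}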

Any other thoughts?

> Change the log level to DEBUG for recovering completed applications
> ---
>
> Key: YARN-4882
> URL: https://issues.apache.org/jira/browse/YARN-4882
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Daniel Templeton
>
> I think for recovering completed applications no need to log as INFO, rather 
> it can be made it as DEBUG.  The problem seen from large cluster is if any 
> issue happens during RM start up and continuously switching , then  RM logs 
> are filled with most with recovering applications only. 
> There are 6 lines are logged for 1 applications as I shown in below logs, 
> then consider RM default value for max-completed applications is 10K. So for 
> each switch 10K*6=60K lines will be added which is not useful I feel.
> {noformat}
> 2016-03-01 10:20:59,077 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAppManager: Default priority 
> level is set to application:application_1456298208485_21507
> 2016-03-01 10:20:59,094 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Recovering 
> app: application_1456298208485_21507 with 1 attempts and final state = 
> FINISHED
> 2016-03-01 10:20:59,100 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Recovering attempt: appattempt_1456298208485_21507_01 with final state: 
> FINISHED
> 2016-03-01 10:20:59,107 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> appattempt_1456298208485_21507_01 State change from NEW to FINISHED
> 2016-03-01 10:20:59,111 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: 
> application_1456298208485_21507 State change from NEW to FINISHED
> 2016-03-01 10:20:59,112 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=rohith   
> OPERATION=Application Finished - Succeeded  TARGET=RMAppManager 
> RESULT=SUCCESS  APPID=application_1456298208485_21507
> {noformat}
> The main problem is missing important information's from the logs before RM 
> unstable. Even though log roll back is 50 or 100, in a short period all these 
> logs will be rolled out and all the logs contains only RM switching 
> information that too recovering applications!!. 
> I suggest at least completed applications recovery should be logged as DEBUG.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4951) large IP ranges require a different naming strategy

2016-04-13 Thread Jonathan Maron (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Maron updated YARN-4951:
-
Description: Large subnet definitions (e.g. specifying a mask value of 
255.255.224.0) yield a large number of potential network addresses.  Therefore, 
the standard naming convention of xx.xx.xx.in-addr.arpa needs to be modified to 
be more general:  xx.xx.in-addr.arpa.  (was: Large subnet definitions (e.g. 
specifying a mask value of 255.255.224.0) yield a large number of potential 
network addresses, each requiring a separate reverse zone definition (given 
that reverse zones include the first 3 IP bytes in reverse order).)
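
A minimal sketch (hypothetical helper, not the attached patch) of the 
generalized naming this implies: with a /19-style mask such as 255.255.224.0, 
only the first two octets are fully fixed, so only they appear, reversed, in 
the zone name.

{code}
// Hedged sketch: e.g. network "10.172.0.0" -> "172.10.in-addr.arpa"
static String generalizedReverseZone(String networkAddress) {
  String[] octet = networkAddress.split("\\.");
  return octet[1] + "." + octet[0] + ".in-addr.arpa";
}
{code}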

> large IP ranges require a different naming strategy
> ---
>
> Key: YARN-4951
> URL: https://issues.apache.org/jira/browse/YARN-4951
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Maron
>Assignee: Jonathan Maron
>
> Large subnet definitions (e.g. specifying a mask value of 255.255.224.0) 
> yield a large number of potential network addresses.  Therefore, the standard 
> naming convention of xx.xx.xx.in-addr.arpa needs to be modified to be more 
> general:  xx.xx.in-addr.arpa.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4951) large IP ranges require a different naming strategy

2016-04-13 Thread Jonathan Maron (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Maron updated YARN-4951:
-
Attachment: (was: 
0001-YARN-4757-address-multiple-reverse-lookup-zones-and-.patch)

> large IP ranges require a different naming strategy
> ---
>
> Key: YARN-4951
> URL: https://issues.apache.org/jira/browse/YARN-4951
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Maron
>Assignee: Jonathan Maron
>
> Large subnet definitions (e.g. specifying a mask value of 255.255.224.0) 
> yield a large number of potential network addresses.  Therefore, the standard 
> naming convention of xx.xx.xx.in-addr.arpa needs to be modified to be more 
> general:  xx.xx.in-addr.arpa.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4951) large IP ranges require a different naming strategy

2016-04-13 Thread Jonathan Maron (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240129#comment-15240129
 ] 

Jonathan Maron commented on YARN-4951:
--

The original approach was unnecessary.  Uploaded a new patch that addresses the 
need to simply generalize the reverse lookup zone name.

> large IP ranges require a different naming strategy
> ---
>
> Key: YARN-4951
> URL: https://issues.apache.org/jira/browse/YARN-4951
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Maron
>Assignee: Jonathan Maron
>
> Large subnet definitions (e.g. specifying a mask value of 255.255.224.0) 
> yield a large number of potential network addresses.  Therefore, the standard 
> naming convention of xx.xx.xx.in-addr.arpa needs to be modified to be more 
> general:  xx.xx.in-addr.arpa.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3846) RM Web UI queue filter is not working

2016-04-13 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-3846:
-
Target Version/s: 3.0.0, 2.8.0, 2.7.3, 2.6.5  (was: 3.0.0, 2.8.0)

> RM Web UI queue filter is not working
> -
>
> Key: YARN-3846
> URL: https://issues.apache.org/jira/browse/YARN-3846
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>  Labels: PatchAvailable
> Fix For: 2.8.0
>
> Attachments: YARN-3846.patch, scheduler queue issue.png, scheduler 
> queue positive behavior.png
>
>
> Click on root queue will show the complete applications
> But click on the leaf queue is not filtering the application related to the 
> the clicked queue.
> The regular expression seems to be wrong 
> {code}
> q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';",
> {code}
> For example
> 1. Suppose  queue name is  b
> them the above expression will try to substr at index 1 
> q.lastIndexOf(':')  = -1
> -1+2= 1
> which is wrong. its should look at the 0 index.
> 2. if queue name is ab.x
> then it will parse it to .x 
> but it should be x
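
For reference, a hedged sketch of extraction logic that would handle both 
reported cases, written in Java for consistency with this thread (the actual 
filter is client-side JavaScript in the RM web UI, and the helper name is an 
assumption):

{code}
// Hedged sketch, not the committed fix: fall back to the whole string when
// ':' is absent, then keep only the leaf queue name.
static String leafQueuePattern(String q) {
  int colon = q.lastIndexOf(':');
  String name = (colon >= 0) ? q.substring(colon + 2) : q; // skip ": " only if present
  int dot = name.lastIndexOf('.');
  if (dot >= 0) {
    name = name.substring(dot + 1);                        // "ab.x" -> "x"
  }
  return "^" + name + "$";
}
{code}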



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240145#comment-15240145
 ] 

Hadoop QA commented on YARN-4807:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 22 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 35m 42s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 23m 0s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 75m 17s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
|   | hadoop.yarn.server.resourcemanager.TestApplicationMasterService |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler |
|   | 

[jira] [Updated] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics

2016-04-13 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-3816:

Attachment: YARN-3816-YARN-2928-v7.patch

More cleanup, and addressed [~sjlee0]'s comments. 

> [Aggregation] App-level aggregation and accumulation for YARN system metrics
> 
>
> Key: YARN-3816
> URL: https://issues.apache.org/jira/browse/YARN-3816
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Junping Du
>Assignee: Li Lu
>  Labels: yarn-2928-1st-milestone
> Attachments: Application Level Aggregation of Timeline Data.pdf, 
> YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch, 
> YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch, 
> YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, 
> YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch, 
> YARN-3816-YARN-2928-v5.patch, YARN-3816-YARN-2928-v6.patch, 
> YARN-3816-YARN-2928-v7.patch, YARN-3816-feature-YARN-2928.v4.1.patch, 
> YARN-3816-poc-v1.patch, YARN-3816-poc-v2.patch
>
>
> We need application level aggregation of Timeline data:
> - To present end user aggregated states for each application, include: 
> resource (CPU, Memory) consumption across all containers, number of 
> containers launched/completed/failed, etc. We need this for apps while they 
> are running as well as when they are done.
> - Also, framework specific metrics, e.g. HDFS_BYTES_READ, should be 
> aggregated to show details of states in framework level.
> - Other level (Flow/User/Queue) aggregation can be more efficient to be based 
> on Application-level aggregations rather than raw entity-level data as much 
> less raws need to scan (with filter out non-aggregated entities, like: 
> events, configurations, etc.).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4951) large IP ranges require a different naming strategy

2016-04-13 Thread Jonathan Maron (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Maron updated YARN-4951:
-
Summary: large IP ranges require a different naming strategy  (was: large 
IP ranges require the creation of multiple reverse lookup zones)

> large IP ranges require a different naming strategy
> ---
>
> Key: YARN-4951
> URL: https://issues.apache.org/jira/browse/YARN-4951
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Maron
>Assignee: Jonathan Maron
> Attachments: 
> 0001-YARN-4757-address-multiple-reverse-lookup-zones-and-.patch
>
>
> Large subnet definitions (e.g. specifying a mask value of 255.255.224.0) 
> yield a large number of potential network addresses, each requiring a 
> separate reverse zone definition (given that reverse zones include the first 
> 3 IP bytes in reverse order).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4951) large IP ranges require a different naming strategy

2016-04-13 Thread Jonathan Maron (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Maron updated YARN-4951:
-
Attachment: 0001-YARN-4757-simplified-reverse-lookup-zone-approach-fo.patch

> large IP ranges require a different naming strategy
> ---
>
> Key: YARN-4951
> URL: https://issues.apache.org/jira/browse/YARN-4951
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Maron
>Assignee: Jonathan Maron
> Attachments: 
> 0001-YARN-4757-simplified-reverse-lookup-zone-approach-fo.patch
>
>
> Large subnet definitions (e.g. specifying a mask value of 255.255.224.0) 
> yield a large number of potential network addresses.  Therefore, the standard 
> naming convention of xx.xx.xx.in-addr.arpa needs to be modified to be more 
> general:  xx.xx.in-addr.arpa.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3846) RM Web UI queue filter is not working

2016-04-13 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240151#comment-15240151
 ] 

Wangda Tan commented on YARN-3846:
--

We should backport this issue to branch-2.6/branch-2.7; listing apps for nested 
queues is broken there.

> RM Web UI queue filter is not working
> -
>
> Key: YARN-3846
> URL: https://issues.apache.org/jira/browse/YARN-3846
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0, 2.8.0
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>  Labels: PatchAvailable
> Fix For: 2.8.0
>
> Attachments: YARN-3846.patch, scheduler queue issue.png, scheduler 
> queue positive behavior.png
>
>
> Click on root queue will show the complete applications
> But click on the leaf queue is not filtering the application related to the 
> the clicked queue.
> The regular expression seems to be wrong 
> {code}
> q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$';",
> {code}
> For example
> 1. Suppose  queue name is  b
> them the above expression will try to substr at index 1 
> q.lastIndexOf(':')  = -1
> -1+2= 1
> which is wrong. its should look at the 0 index.
> 2. if queue name is ab.x
> then it will parse it to .x 
> but it should be x



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240164#comment-15240164
 ] 

Hadoop QA commented on YARN-4807:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
1s {color} | {color:green} The patch appears to include 22 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
48s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 13s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_77. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 16s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 57s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_77 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.8.0_77 Timed out junit tests | 
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels |
|   | hadoop.yarn.webapp.TestRMWithCSRFFilter |
|   | 

[jira] [Commented] (YARN-4955) Add retry for SocketTimeoutException in TimelineClient

2016-04-13 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240180#comment-15240180
 ] 

Li Lu commented on YARN-4955:
-

Thanks for the work [~xgong]! Yes, from the exception stack the 
AuthenticationException was wrapped into an IOException, so fixing the 
shouldRetryOn() method of the delegation token retry should be fine. However, I 
share the same concern as [~djp] that we may want to build a UT to mock this 
case in TestTimelineClient#testDelegationTokenOperationsRetry. 
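
A minimal sketch, not the actual TimelineClientImpl change, of making 
shouldRetryOn() also look at the wrapped cause:

{code}
// Hedged sketch: retry when the SocketTimeoutException is wrapped inside an
// IOException by the authentication layer, not only when it is the top-level
// exception. The ConnectException check mirrors the existing retry intent.
private boolean shouldRetryOn(Exception e) {
  return e instanceof java.net.ConnectException
      || e instanceof java.net.SocketTimeoutException
      || e.getCause() instanceof java.net.SocketTimeoutException;
}
{code}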

> Add retry for SocketTimeoutException in TimelineClient
> --
>
> Key: YARN-4955
> URL: https://issues.apache.org/jira/browse/YARN-4955
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
> Attachments: YARN-4955.1.patch, YARN-4955.2.patch
>
>
> We saw this exception several times when we tried to getDelegationToken from 
> ATS.
> java.io.IOException: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$5.run(TimelineClientImpl.java:569)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$TimelineClientConnectionRetry.retryOn(TimelineClientImpl.java:234)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.operateDelegationToken(TimelineClientImpl.java:582)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.getDelegationToken(TimelineClientImpl.java:479)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken(YarnClientImpl.java:349)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(YarnClientImpl.java:330)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:250)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:291)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:290)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:240)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
>   at 
> org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
>   at 
> org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:194)
>   at java.lang.Thread.run(Thread.java:745)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> Caused by: 
> org.apache.hadoop.security.authentication.client.AuthenticationException: 
> java.net.SocketTimeoutException: Read timed out
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
>   at 
> org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:128)
>   at 
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:215)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:285)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:166)
>   at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:475)
>   at 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl$2.run(TimelineClientImpl.java:467)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at 
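
For reference, the call path in the trace above corresponds to fetching a timeline delegation token through the {{TimelineClient}} API. A minimal sketch of that usage (the retry keys shown are the standard timeline-client settings; the values are illustrative, not a recommended fix):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.security.client.TimelineDelegationTokenIdentifier;

public class TimelineTokenExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new YarnConfiguration();
    // Retry knobs consulted by TimelineClientConnectionRetry (values illustrative).
    conf.setInt("yarn.timeline-service.client.max-retries", 30);
    conf.setLong("yarn.timeline-service.client.retry-interval-ms", 1000);

    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(conf);
    client.start();
    try {
      // This is the call that times out in the stack trace above.
      Token<TimelineDelegationTokenIdentifier> token = client.getDelegationToken(
          UserGroupInformation.getCurrentUser().getShortUserName());
      System.out.println("Got timeline delegation token: " + token);
    } finally {
      client.stop();
    }
  }
}
{code}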

[jira] [Commented] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes

2016-04-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240348#comment-15240348
 ] 

Sunil G commented on YARN-4947:
---

[~bibinchundatt]

I think I was not very clear in explaining my point. Let me try again.

Test cases like {{TestRMWebServicesNodes}} have to override {{isDrained}} to return a hardcoded boolean in order to continue. I agree with the reason here, because events are never drained in these cases. So for any new test case similar to this, we are always forced to override {{isDrained}}, which is not documented and easy to miss. Please correct me if I am wrong.

So I was thinking about the real fix that went in for YARN-4893: do we need to call {{drainEvents}} from MockRM, or can we handle it from each test case instead? It's a choice, and it's fine either way. :)
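
To make that pattern concrete, a minimal sketch of such an override (the wiring is hypothetical; the actual {{MockRM}}/dispatcher setup in these tests may differ):
{code}
// Hypothetical sketch: the test hardcodes isDrained() so that
// MockRM#drainEvents() does not block on events that are never
// delivered in this test setup.
MockRM rm = new MockRM(conf) {
  @Override
  protected Dispatcher createDispatcher() {
    return new DrainDispatcher() {
      @Override
      public boolean isDrained() {
        return true; // hardcoded: events are never drained in these tests
      }
    };
  }
};
{code}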

> Test timeout is happening for TestRMWebServicesNodes
> 
>
> Key: YARN-4947
> URL: https://issues.apache.org/jira/browse/YARN-4947
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-4947.patch
>
>
> Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 
> [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4909) Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter

2016-04-13 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240356#comment-15240356
 ] 

Sunil G commented on YARN-4909:
---

Yes. Sorry for that; later I thought this approach is better. :)

> Fix intermittent failures of TestRMWebServices And TestRMWithCSRFFilter
> ---
>
> Key: YARN-4909
> URL: https://issues.apache.org/jira/browse/YARN-4909
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Brahma Reddy Battula
>Assignee: Bibin A Chundatt
>Priority: Blocker
> Attachments: 0001-YARN-4909.patch, 0002-YARN-4909.patch, 
> 0003-YARN-4909.patch, 0004-YARN-4909.patch, 0005-YARN-4909.patch, 
> 0006-YARN-4909.patch
>
>
>  *Precommit link* 
> https://builds.apache.org/job/PreCommit-YARN-Build/10908/testReport/
> *Trace* 
> {noformat}
> com.sun.jersey.test.framework.spi.container.TestContainerException: 
> java.net.BindException: Address already in use
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:463)
>   at sun.nio.ch.Net.bind(Net.java:455)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:413)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:384)
>   at 
> org.glassfish.grizzly.nio.transport.TCPNIOTransport.bind(TCPNIOTransport.java:375)
>   at 
> org.glassfish.grizzly.http.server.NetworkListener.start(NetworkListener.java:549)
>   at 
> org.glassfish.grizzly.http.server.HttpServer.start(HttpServer.java:255)
>   at 
> com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:326)
>   at 
> com.sun.jersey.api.container.grizzly2.GrizzlyServerFactory.createHttpServer(GrizzlyServerFactory.java:343)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.instantiateGrizzlyWebServer(GrizzlyWebTestContainerFactory.java:219)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:129)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory$GrizzlyWebTestContainer.(GrizzlyWebTestContainerFactory.java:86)
>   at 
> com.sun.jersey.test.framework.spi.container.grizzly2.web.GrizzlyWebTestContainerFactory.create(GrizzlyWebTestContainerFactory.java:79)
>   at 
> com.sun.jersey.test.framework.JerseyTest.getContainer(JerseyTest.java:342)
>   at com.sun.jersey.test.framework.JerseyTest.(JerseyTest.java:217)
>   at 
> org.apache.hadoop.yarn.webapp.JerseyTestBase.(JerseyTestBase.java:30)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServices.(TestRMWebServices.java:125)
> {noformat}
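
A common way to sidestep this class of failure (a sketch only, not necessarily what the final patch does) is to bind to port 0 so the OS assigns a free ephemeral port, and then hand that port to the test container:
{code}
import java.io.IOException;
import java.net.ServerSocket;

public final class FreePortFinder {
  private FreePortFinder() {
  }

  /** Asks the OS for a free ephemeral port by binding to port 0. */
  public static int findFreePort() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      socket.setReuseAddress(true);
      return socket.getLocalPort(); // released on close; normally still available
    }
  }
}
{code}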



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4541) Change log message in LocalizedResource#handle() to DEBUG

2016-04-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240384#comment-15240384
 ] 

Hudson commented on YARN-4541:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9607 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9607/])
YARN-4541. Change log message in LocalizedResource#handle() to DEBUG (rkanter: 
rev 0d9194df00fd68bfb7a8ba504b0cddd7d7c69b8a)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java


> Change log message in LocalizedResource#handle() to DEBUG
> -
>
> Key: YARN-4541
> URL: https://issues.apache.org/jira/browse/YARN-4541
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.1
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>Priority: Minor
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: YARN-4541.001.patch, YARN-4541.002.patch
>
>
> This section of code can fill up a log fairly quickly.
> {code}
>if (oldState != newState) {
> LOG.info("Resource " + resourcePath + (localPath != null ?
>   "(->" + localPath + ")": "") + " transitioned from " + oldState
> + " to " + newState);
>}
> {code}
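
The issue title calls for moving this message to DEBUG; a sketch of that shape (assuming the usual {{isDebugEnabled()}} guard so the message string is not built when DEBUG logging is off):
{code}
if (oldState != newState) {
  // Guard avoids the string concatenation on this hot path when DEBUG is off.
  if (LOG.isDebugEnabled()) {
    LOG.debug("Resource " + resourcePath + (localPath != null ?
        "(->" + localPath + ")" : "") + " transitioned from " + oldState
        + " to " + newState);
  }
}
{code}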



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4879) Enhance Allocate Protocol to Identify Requests Explicitly

2016-04-13 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-4879:
-
Summary: Enhance Allocate Protocol to Identify Requests Explicitly  (was: 
Proposal for a simple (delta) allocate protocol)

> Enhance Allocate Protocol to Identify Requests Explicitly
> -
>
> Key: YARN-4879
> URL: https://issues.apache.org/jira/browse/YARN-4879
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: SimpleAllocateProtocolProposal-v1.pdf
>
>
> For legacy reasons, the current allocate protocol expects expanded requests 
> which represent the cumulative request for any change in resource 
> constraints. This is not only very difficult to comprehend but makes it 
> impossible for the scheduler to associate container allocations to the 
> original requests. This problem is amplified by the fact that the expansion 
> is managed by the AMRMClient which makes it cumbersome for non-Java clients 
> as they all have to replicate the non-trivial logic. In this JIRA, we are 
> proposing a delta allocate protocol where the AM will need to only specify 
> changes in resource constraints.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4879) Enhance Allocate Protocol to Identify Requests Explicitly

2016-04-13 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-4879:
-
Description: For legacy reasons, the current allocate protocol expects 
expanded requests which represent the cumulative request for any change in 
resource constraints. This is not only very difficult to comprehend but makes 
it impossible for the scheduler to associate container allocations to the 
original requests. This problem is amplified by the fact that the expansion is 
managed by the AMRMClient which makes it cumbersome for non-Java clients as 
they all have to replicate the non-trivial logic. In this JIRA, we are 
proposing enhancement to the Allocate Protocol to allow AMs to identify 
requests explicitly.(was: For legacy reasons, the current allocate protocol 
expects expanded requests which represent the cumulative request for any change 
in resource constraints. This is not only very difficult to comprehend but 
makes it impossible for the scheduler to associate container allocations to the 
original requests. This problem is amplified by the fact that the expansion is 
managed by the AMRMClient which makes it cumbersome for non-Java clients as 
they all have to replicate the non-trivial logic. In this JIRA, we are 
proposing a delta allocate protocol where the AM will need to only specify 
changes in resource constraints.  )

> Enhance Allocate Protocol to Identify Requests Explicitly
> -
>
> Key: YARN-4879
> URL: https://issues.apache.org/jira/browse/YARN-4879
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: SimpleAllocateProtocolProposal-v1.pdf
>
>
> For legacy reasons, the current allocate protocol expects expanded requests 
> which represent the cumulative request for any change in resource 
> constraints. This is not only very difficult to comprehend but makes it 
> impossible for the scheduler to associate container allocations to the 
> original requests. This problem is amplified by the fact that the expansion 
> is managed by the AMRMClient which makes it cumbersome for non-Java clients 
> as they all have to replicate the non-trivial logic. In this JIRA, we are 
> proposing enhancement to the Allocate Protocol to allow AMs to identify 
> requests explicitly.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4879) Enhance Allocate Protocol to Identify Requests Explicitly

2016-04-13 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-4879:
-
Attachment: SimpleAllocateProtocolProposal-v2.pdf

Thanks [~kasha], [~leftnoteasy], [~vinodkv] and [~ste...@apache.org] for taking 
a look at our proposal. Please find attached an updated doc (v2) that 
incorporates your feedback.

A few additional clarifications (I have addressed the rest of the comments 
directly in the updated doc):

bq. While making these changes, would it possible to address YARN-314 too? 
bq. I'm okay if we can get two in a shot, but I'd caution against risking this 
effort by blowing up the size.

We will address YARN-314 as long as applications specify the Request-ID, since 
they can then request multiple containers at the same priority through 
independent requests.

bq. Why are we putting priority semantics onto the ID? We should just follow 
the existing priority ordering.

We will continue to follow the existing priority ordering. But as explained 
above, with the proposed enhancement a user can potentially make multiple 
requests at the same priority (YARN-314). In such a scenario, we will simply 
allocate containers in FIFO order.

bq. BTW, for the federation related issue, does the client-library need to 
always generate these IDs? How does that interact with application generated 
IDs?

In Federation too, we expect applications to generate the IDs. For example, we 
will work with the REEF team (and the long-running service AM proposed as part 
of YARN-4692) to start specifying IDs for their allocation requests.

bq. I would also like to see if the allocated containers could support a role 
ID field too...nothing much, but enough that on an AM restart their role can be 
determined. That one, I'd keep separate from the request ID; they serve 
slightly different purposes. (I could have 5 requests outstanding for 
containers of role 4; I'd want to track those requests)

I agree that having an explicit role ID is useful, but I feel it's outside the 
scope of this JIRA, which, IIUC, is what you are also observing. I think adding 
a role ID should be part of YARN-4692/YARN-4902.
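
To illustrate the proposal (field and method names here are illustrative only; the actual API is defined in the attached doc and eventual patches): the AM tags each outstanding request with an explicit ID, and ties at the same priority are served FIFO.
{code}
// Illustrative only: a delta-style request where the AM attaches an
// explicit ID so the RM can associate the resulting containers with it.
ResourceRequest req = ResourceRequest.newInstance(
    Priority.newInstance(4),        // existing priority ordering still applies
    ResourceRequest.ANY,            // no locality constraint
    Resource.newInstance(1024, 1),  // 1024 MB, 1 vcore per container
    5);                             // five containers for this request
req.setAllocationRequestId(42L);    // proposed explicit request ID (illustrative)
{code}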

> Enhance Allocate Protocol to Identify Requests Explicitly
> -
>
> Key: YARN-4879
> URL: https://issues.apache.org/jira/browse/YARN-4879
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: applications, resourcemanager
>Reporter: Subru Krishnan
>Assignee: Subru Krishnan
> Attachments: SimpleAllocateProtocolProposal-v1.pdf, 
> SimpleAllocateProtocolProposal-v2.pdf
>
>
> For legacy reasons, the current allocate protocol expects expanded requests 
> which represent the cumulative request for any change in resource 
> constraints. This is not only very difficult to comprehend but makes it 
> impossible for the scheduler to associate container allocations to the 
> original requests. This problem is amplified by the fact that the expansion 
> is managed by the AMRMClient which makes it cumbersome for non-Java clients 
> as they all have to replicate the non-trivial logic. In this JIRA, we are 
> proposing enhancement to the Allocate Protocol to allow AMs to identify 
> requests explicitly.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-04-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240423#comment-15240423
 ] 

Hadoop QA commented on YARN-4676:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 6 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
2s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 53s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 49s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
55s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 12s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 
27s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 37s 
{color} | {color:green} trunk passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 14s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 26s 
{color} | {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 24s 
{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 17s 
{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 16s 
{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 15s 
{color} | {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-sls in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 0s 
{color} | {color:green} the patch passed with JDK v1.8.0_77 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 0s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 9m 58s {color} 
| {color:red} root-jdk1.8.0_77 with JDK v1.8.0_77 generated 1 new + 739 
unchanged - 0 fixed = 740 total (was 739) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 39s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 39s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 17m 37s 
{color} | {color:red} root-jdk1.7.0_95 with JDK v1.7.0_95 generated 1 new + 736 
unchanged - 0 fixed = 737 total (was 736) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 39s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 22s 
{color} | {color:red} root: patch generated 9 new + 519 unchanged - 6 fixed = 
528 total (was 525) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 31s 
{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 23s 
{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} mvnsite