[jira] [Commented] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.

2013-09-10 Thread Chuan Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762770#comment-13762770
 ] 

Chuan Liu commented on YARN-1025:
-

+1
Thanks for the patch, Chris!
Just to add some of my observations: we don't need to set this for the mapred, 
hdfs, and hadoop cmd script files because they all use the HADOOP_OPTS 
environment variable, which is already set to include JAVA_LIBRARY_PATH in 
hadoop-config.cmd. This patch also matches the Linux behavior -- we also 
explicitly set JAVA_LIBRARY_PATH in the YARN Linux shell script.
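To make the failure mode concrete, here is a minimal, hypothetical standalone 
check (not part of the patch; the class name is illustrative) showing why the 
RM/NM JVMs need java.library.path set: System.loadLibrary("hadoop") resolves 
hadoop.dll on Windows through java.library.path and throws UnsatisfiedLinkError 
otherwise.

{code:java}
// Hypothetical check, not part of YARN-1025.1.patch.
public class NativeLoadCheck {
    public static void main(String[] args) {
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
        try {
            // Resolves hadoop.dll on Windows (libhadoop.so on Linux) via java.library.path.
            System.loadLibrary("hadoop");
            System.out.println("native hadoop library loaded");
        } catch (UnsatisfiedLinkError e) {
            // This is the failure the RM and NM hit before this patch.
            System.out.println("native hadoop library failed to load: " + e.getMessage());
        }
    }
}
{code}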

 ResourceManager and NodeManager do not load native libraries on Windows.
 

 Key: YARN-1025
 URL: https://issues.apache.org/jira/browse/YARN-1025
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Chris Nauroth
 Attachments: YARN-1025.1.patch


 ResourceManager and NodeManager do not have the correct setting for 
 java.library.path when launched on Windows.  This prevents the processes from 
 loading native code from hadoop.dll.  The native code is required for correct 
 functioning on Windows (not optional), so this ultimately can cause failures.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests

2013-09-10 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762779#comment-13762779
 ] 

Junping Du commented on YARN-1042:
--

Attaching a patch to showcase the above proposal. This is only a demo patch and 
doesn't include any unit tests so far. 
I think there are several open questions here before we move on to the next step:

1. Is this affinity/anti-affinity rule bi-directional or not? If task 
A.affinity(B) is true, is B.affinity(A) always true? I guess it is not, as A may 
prefer a list of nodes, which makes the relationship non-symmetric. Also, that 
is how we can distinguish "A prefers to live with B and C" from "A prefers to 
live with B or C", isn't it?

2. Which rule's priority is higher in case an affinity rule conflicts with an 
anti-affinity rule? In the demo patch, the affinity rule takes higher priority, 
but I am not sure this is true in real cases. Do we want to make it 
configurable? Or do we just make sure rules updated later can override previous 
ones on conflict?

3. Currently, affinity/anti-affinity is only considered at the node level; do 
we want to expand it to other levels, e.g. the rack level, in the future?

4. The API now adds a list of taskIds as affinity/anti-affinity tasks (see the 
sketch below). Is that easy to consume from an application perspective?

5. The affinity/anti-affinity rules are *must*-conform rules in the current 
implementation, which may cause tasks to starve for a long time. Do we want to 
think about more relaxed (soft) rules?

Welcome comments. Thx!
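To make question 4 concrete, here is a rough sketch of one possible request 
shape. All names below (AffinityRequestSketch, Scope, Strength) are 
illustrative assumptions, not the API in YARN-1042-demo.patch:

{code:java}
// Illustrative only -- not the demo patch. One possible shape for attaching
// affinity/anti-affinity constraints to a container request.
import java.util.List;

class AffinityRequestSketch {
    enum Scope { NODE, RACK }          // question 3: node level today, rack later?
    enum Strength { HARD, SOFT }       // question 5: must-conform vs. relaxed

    List<String> affinityTaskIds;      // question 4: task IDs to co-locate with
    List<String> antiAffinityTaskIds;  // task IDs to keep away from
    Scope scope = Scope.NODE;
    Strength strength = Strength.HARD;
}
{code}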

 add ability to specify affinity/anti-affinity in container requests
 ---

 Key: YARN-1042
 URL: https://issues.apache.org/jira/browse/YARN-1042
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Steve Loughran
Assignee: Junping Du
 Attachments: YARN-1042-demo.patch


 container requests to the AM should be able to request anti-affinity to 
 ensure that things like Region Servers don't come up on the same failure 
 zones. 
 Similarly, you may want to be able to specify affinity to the same host or rack 
 without specifying which specific host/rack. Example: bringing up a small 
 Giraph cluster in a large YARN cluster would benefit from having the 
 processes in the same rack purely for bandwidth reasons.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests

2013-09-10 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762928#comment-13762928
 ] 

Junping Du commented on YARN-1042:
--

BTW, it seems the effort is more on the application side. Do we think it is 
better to move this to the MAPREDUCE project?

 add ability to specify affinity/anti-affinity in container requests
 ---

 Key: YARN-1042
 URL: https://issues.apache.org/jira/browse/YARN-1042
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Steve Loughran
Assignee: Junping Du
 Attachments: YARN-1042-demo.patch


 container requests to the AM should be able to request anti-affinity to 
 ensure that things like Region Servers don't come up on the same failure 
 zones. 
 Similarly, you may want to be able to specify affinity to the same host or rack 
 without specifying which specific host/rack. Example: bringing up a small 
 Giraph cluster in a large YARN cluster would benefit from having the 
 processes in the same rack purely for bandwidth reasons.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762944#comment-13762944
 ] 

Hudson commented on YARN-292:
-

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/328/])
YARN-292. Fixed FifoScheduler and FairScheduler to make their applications data 
structures thread safe to avoid RM crashing with 
ArrayIndexOutOfBoundsException. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521328)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
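For readers following along, the pattern of the fix looks roughly like the 
hedged sketch below (class and member names are illustrative, not the actual 
scheduler code): the schedulers' per-attempt application maps become concurrent 
collections so the event-handling thread and the allocate() RPC threads cannot 
race.

{code:java}
// Hedged sketch of the thread-safety pattern, not the actual YARN-292 patch.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SchedulerAppsSketch {
    // A plain HashMap here can be corrupted by concurrent add/remove/get.
    private final Map<String, Object> applications = new ConcurrentHashMap<String, Object>();

    void addApplicationAttempt(String attemptId, Object app) {
        applications.put(attemptId, app);
    }

    Object getApplicationAttempt(String attemptId) {
        return applications.get(attemptId); // safe without external locking
    }
}
{code}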


 ResourceManager throws ArrayIndexOutOfBoundsException while handling 
 CONTAINER_ALLOCATED for application attempt
 

 Key: YARN-292
 URL: https://issues.apache.org/jira/browse/YARN-292
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Zhijie Shen
 Fix For: 2.1.1-beta

 Attachments: ArrayIndexOutOfBoundsException.log, YARN-292.1.patch, 
 YARN-292.2.patch, YARN-292.3.patch, YARN-292.4.patch


 {code:xml}
 2012-12-26 08:41:15,030 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: 
 Calling allocate on removed or non existant application 
 appattempt_1356385141279_49525_01
 2012-12-26 08:41:15,031 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type CONTAINER_ALLOCATED for applicationAttempt 
 application_1356385141279_49525
 java.lang.ArrayIndexOutOfBoundsException: 0
   at java.util.Arrays$ArrayList.get(Arrays.java:3381)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
  {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762941#comment-13762941
 ] 

Hudson commented on YARN-1152:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/328/])
YARN-1152. Fixed a bug in ResourceManager that was causing clients to get 
invalid client token key errors when an application is about to finish. 
Contributed by Jason Lowe. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521292)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java
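For context, the error class involved looks like the hedged sketch below; the 
per-attempt master-key handling is an assumption about the mechanism, not the 
patch itself. Verifying a client token recomputes an HMAC keyed by the 
attempt's master key, so if that key is discarded once the attempt unregisters, 
a later getApplicationReport() fails with an invalid-key error.

{code:java}
// Illustrative of the failure mode only, not the YARN-1152 patch.
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class HmacSketch {
    public static void main(String[] args) throws Exception {
        byte[] masterKey = new byte[] {1, 2, 3, 4, 5, 6, 7, 8}; // per-attempt key
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(masterKey, "HmacSHA1")); // invalid-key errors surface here
        byte[] sig = mac.doFinal("token-identifier-bytes".getBytes("UTF-8"));
        System.out.println("signature length = " + sig.length); // 20 bytes for SHA-1
    }
}
{code}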


 Invalid key to HMAC computation error when getting application report for 
 completed app attempt
 ---

 Key: YARN-1152
 URL: https://issues.apache.org/jira/browse/YARN-1152
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Fix For: 2.1.1-beta

 Attachments: YARN-1152-2.txt, YARN-1152.txt


 On a secure cluster, an invalid key to HMAC error is thrown when trying to 
 get an application report for an application with an attempt that has 
 unregistered.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1144) Unmanaged AMs registering a tracking URI should not be proxy-fied

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762943#comment-13762943
 ] 

Hudson commented on YARN-1144:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/328/])
YARN-1144. Unmanaged AMs registering a tracking URI should not be proxy-fied. 
(tucu) (tucu: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521039)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java


 Unmanaged AMs registering a tracking URI should not be proxy-fied
 -

 Key: YARN-1144
 URL: https://issues.apache.org/jira/browse/YARN-1144
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
 Fix For: 2.1.1-beta

 Attachments: YARN-1144.patch, YARN-1144.patch, YARN-1144.patch


 Unmanaged AMs do not run in the cluster, their tracking URL should not be 
 proxy-fied.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1049) ContainerExitStatus should define a status for preempted containers

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762940#comment-13762940
 ] 

Hudson commented on YARN-1049:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/328/])
YARN-1049. ContainerExitStatus should define a status for preempted 
containers. (tucu) (tucu: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521036)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java


 ContainerExitStatus should define a status for preempted containers
 

 Key: YARN-1049
 URL: https://issues.apache.org/jira/browse/YARN-1049
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Blocker
 Fix For: 2.1.1-beta

 Attachments: YARN-1049.patch


 With the current behavior it is impossible to determine whether a container has 
 been preempted or lost due to a NM crash.
 Adding a PREEMPTED exit status (-102) will help an AM determine that a 
 container has been preempted.
 Note the change of scope from the original summary/description. The original 
 scope proposed API/behavior changes. Because we are past 2.1.0-beta, I'm 
 reducing the scope of this JIRA.
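A short sketch of how an AM might consume the new status; 
ContainerExitStatus.PREEMPTED is what this patch adds, while the handler shape 
and policy below are illustrative:

{code:java}
import java.util.List;
import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
import org.apache.hadoop.yarn.api.records.ContainerStatus;

class PreemptionAwareHandler {
    void onContainersCompleted(List<ContainerStatus> statuses) {
        for (ContainerStatus status : statuses) {
            if (status.getExitStatus() == ContainerExitStatus.PREEMPTED) {
                // Preempted, not failed: re-request the capacity instead of
                // counting this completion against the AM's failure limits.
            }
        }
    }
}
{code}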

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762942#comment-13762942
 ] 

Hudson commented on YARN-910:
-

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/328/])
YARN-910. Augmented auxiliary services to listen for container starts and 
completions in addition to application events. Contributed by Alejandro 
Abdelnur. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521298)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/AuxiliaryService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerInitializationContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerTerminationContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java


 Allow auxiliary services to listen for container starts and completions
 ---

 Key: YARN-910
 URL: https://issues.apache.org/jira/browse/YARN-910
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Alejandro Abdelnur
 Fix For: 2.3.0

 Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, 
 YARN-910.patch


 Making container start and completion events available to auxiliary services 
 would allow them to be resource-aware. An auxiliary service would, for example, 
 be able to notify a co-located service that opportunistically uses free 
 capacity about allocation changes.
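A minimal sketch of an auxiliary service using the new hooks (the 
container-level methods match the files in this commit; the service name and 
empty bodies are illustrative):

{code:java}
import java.nio.ByteBuffer;
import org.apache.hadoop.yarn.server.api.*;

public class ContainerWatcherAuxService extends AuxiliaryService {
    public ContainerWatcherAuxService() { super("container-watcher"); }

    @Override public void initializeApplication(ApplicationInitializationContext ctx) { }
    @Override public void stopApplication(ApplicationTerminationContext ctx) { }
    @Override public ByteBuffer getMetaData() { return ByteBuffer.allocate(0); }

    @Override
    public void initializeContainer(ContainerInitializationContext ctx) {
        // New in YARN-910: called when a container starts on this NodeManager.
    }

    @Override
    public void stopContainer(ContainerTerminationContext ctx) {
        // New in YARN-910: called when a container completes.
    }
}
{code}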

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-910) Allow auxiliary services to listen for container starts and completions

2013-09-10 Thread Alejandro Abdelnur (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Abdelnur updated YARN-910:


Fix Version/s: (was: 2.3.0)
   2.1.1-beta

Committed to branch-2.1-beta and changed fix-version.

 Allow auxiliary services to listen for container starts and completions
 ---

 Key: YARN-910
 URL: https://issues.apache.org/jira/browse/YARN-910
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Alejandro Abdelnur
 Fix For: 2.1.1-beta

 Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, 
 YARN-910.patch


 Making container start and completion events available to auxiliary services 
 would allow them to be resource-aware. An auxiliary service would, for example, 
 be able to notify a co-located service that opportunistically uses free 
 capacity about allocation changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763040#comment-13763040
 ] 

Hudson commented on YARN-910:
-

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1518 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1518/])
YARN-910. Augmented auxiliary services to listen for container starts and 
completions in addition to application events. Contributed by Alejandro 
Abdelnur. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521298)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/AuxiliaryService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerInitializationContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerTerminationContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java


 Allow auxiliary services to listen for container starts and completions
 ---

 Key: YARN-910
 URL: https://issues.apache.org/jira/browse/YARN-910
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Alejandro Abdelnur
 Fix For: 2.1.1-beta

 Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, 
 YARN-910.patch


 Making container start and completion events available to auxiliary services 
 would allow them to be resource-aware. An auxiliary service would, for example, 
 be able to notify a co-located service that opportunistically uses free 
 capacity about allocation changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763041#comment-13763041
 ] 

Hudson commented on YARN-292:
-

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1518 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1518/])
YARN-292. Fixed FifoScheduler and FairScheduler to make their applications data 
structures thread safe to avoid RM crashing with 
ArrayIndexOutOfBoundsException. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521328)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java


 ResourceManager throws ArrayIndexOutOfBoundsException while handling 
 CONTAINER_ALLOCATED for application attempt
 

 Key: YARN-292
 URL: https://issues.apache.org/jira/browse/YARN-292
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Zhijie Shen
 Fix For: 2.1.1-beta

 Attachments: ArrayIndexOutOfBoundsException.log, YARN-292.1.patch, 
 YARN-292.2.patch, YARN-292.3.patch, YARN-292.4.patch


 {code:xml}
 2012-12-26 08:41:15,030 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: 
 Calling allocate on removed or non existant application 
 appattempt_1356385141279_49525_01
 2012-12-26 08:41:15,031 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type CONTAINER_ALLOCATED for applicationAttempt 
 application_1356385141279_49525
 java.lang.ArrayIndexOutOfBoundsException: 0
   at java.util.Arrays$ArrayList.get(Arrays.java:3381)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
  {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763039#comment-13763039
 ] 

Hudson commented on YARN-1152:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1518 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1518/])
YARN-1152. Fixed a bug in ResourceManager that was causing clients to get 
invalid client token key errors when an application is about to finish. 
Contributed by Jason Lowe. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521292)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java


 Invalid key to HMAC computation error when getting application report for 
 completed app attempt
 ---

 Key: YARN-1152
 URL: https://issues.apache.org/jira/browse/YARN-1152
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Fix For: 2.1.1-beta

 Attachments: YARN-1152-2.txt, YARN-1152.txt


 On a secure cluster, an invalid key to HMAC error is thrown when trying to 
 get an application report for an application with an attempt that has 
 unregistered.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763068#comment-13763068
 ] 

Hudson commented on YARN-1152:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1544 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1544/])
YARN-1152. Fixed a bug in ResourceManager that was causing clients to get 
invalid client token key errors when an application is about to finish. 
Contributed by Jason Lowe. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521292)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java


 Invalid key to HMAC computation error when getting application report for 
 completed app attempt
 ---

 Key: YARN-1152
 URL: https://issues.apache.org/jira/browse/YARN-1152
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
 Fix For: 2.1.1-beta

 Attachments: YARN-1152-2.txt, YARN-1152.txt


 On a secure cluster, an invalid key to HMAC error is thrown when trying to 
 get an application report for an application with an attempt that has 
 unregistered.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763070#comment-13763070
 ] 

Hudson commented on YARN-292:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1544 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1544/])
YARN-292. Fixed FifoScheduler and FairScheduler to make their applications data 
structures thread safe to avoid RM crashing with 
ArrayIndexOutOfBoundsException. Contributed by Zhijie Shen. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521328)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java


 ResourceManager throws ArrayIndexOutOfBoundsException while handling 
 CONTAINER_ALLOCATED for application attempt
 

 Key: YARN-292
 URL: https://issues.apache.org/jira/browse/YARN-292
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Zhijie Shen
 Fix For: 2.1.1-beta

 Attachments: ArrayIndexOutOfBoundsException.log, YARN-292.1.patch, 
 YARN-292.2.patch, YARN-292.3.patch, YARN-292.4.patch


 {code:xml}
 2012-12-26 08:41:15,030 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: 
 Calling allocate on removed or non existant application 
 appattempt_1356385141279_49525_01
 2012-12-26 08:41:15,031 ERROR 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
 handling event type CONTAINER_ALLOCATED for applicationAttempt 
 application_1356385141279_49525
 java.lang.ArrayIndexOutOfBoundsException: 0
   at java.util.Arrays$ArrayList.get(Arrays.java:3381)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
   at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
   at 
 org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
   at java.lang.Thread.run(Thread.java:662)
  {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763069#comment-13763069
 ] 

Hudson commented on YARN-910:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1544 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1544/])
YARN-910. Augmented auxiliary services to listen for container starts and 
completions in addition to application events. Contributed by Alejandro 
Abdelnur. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521298)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/AuxiliaryService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerInitializationContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerTerminationContext.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java


 Allow auxiliary services to listen for container starts and completions
 ---

 Key: YARN-910
 URL: https://issues.apache.org/jira/browse/YARN-910
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Alejandro Abdelnur
 Fix For: 2.1.1-beta

 Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, 
 YARN-910.patch


 Making container start and completion events available to auxiliary services 
 would allow them to be resource-aware. An auxiliary service would, for example, 
 be able to notify a co-located service that opportunistically uses free 
 capacity about allocation changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests

2013-09-10 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763080#comment-13763080
 ] 

Junping Du commented on YARN-1042:
--

Hi [~ste...@apache.org], as you are the creator of this JIRA and will probably 
consume this API in the HOYA project, it would be great if you could provide 
some input here. Thx!

 add ability to specify affinity/anti-affinity in container requests
 ---

 Key: YARN-1042
 URL: https://issues.apache.org/jira/browse/YARN-1042
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Steve Loughran
Assignee: Junping Du
 Attachments: YARN-1042-demo.patch


 container requests to the AM should be able to request anti-affinity to 
 ensure that things like Region Servers don't come up on the same failure 
 zones. 
 Similarly, you may want to be able to specify affinity to the same host or rack 
 without specifying which specific host/rack. Example: bringing up a small 
 Giraph cluster in a large YARN cluster would benefit from having the 
 processes in the same rack purely for bandwidth reasons.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists

2013-09-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763263#comment-13763263
 ] 

Zhijie Shen commented on YARN-609:
--

Checked the three methods below. Though they're called addAll*, they seem to 
be used just as setters in this context. Would you please check the 
references to them as well? If they're supposed to be setters, I think it's 
good to modify the implementation as you did for the other setters.

* NodeHeartbeatResponsePBImpl#addAllContainersToCleanup
* NodeHeartbeatResponsePBImpl#addAllApplicationsToCleanup
* LocalizerStatusPBImpl#addAllResources
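If they are indeed setters, the synchronized, defensive-copy pattern being 
converged on might look like this hedged sketch (class and field names are 
illustrative, not the real PBImpl classes):

{code:java}
// Illustrative sketch of the synchronized, defensive-copy setter pattern.
import java.util.ArrayList;
import java.util.List;

class ResponsePBImplSketch {
    private List<String> containersToCleanup;

    public synchronized void addAllContainersToCleanup(List<String> containers) {
        if (containers == null) {
            return;
        }
        // Behaves as a setter: replace the field with a copy so later mutation
        // of the caller's list cannot race with proto serialization.
        this.containersToCleanup = new ArrayList<String>(containers);
    }
}
{code}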

 Fix synchronization issues in APIs which take in lists
 --

 Key: YARN-609
 URL: https://issues.apache.org/jira/browse/YARN-609
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-609.1.patch, YARN-609.2.patch, YARN-609.3.patch, 
 YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, YARN-609.7.patch, 
 YARN-609.8.patch, YARN-609.9.patch


 Some of the APIs take in lists and the setter-APIs don't always do proper 
 synchronization. We need to fix these.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation

2013-09-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763315#comment-13763315
 ] 

Zhijie Shen commented on YARN-978:
--

The patch looks good, but it's better to add some javadoc for 
YarnApplicationAttemptState and ApplicationAttemptReport, because they are 
user-oriented.

Another question is whether all RMAppAttemptState states are meaningful enough 
to users to warrant the 1-to-1 mapping. I've noticed that YarnApplicationState 
combined FINISHING and FINISHED. Thoughts?

If we decide not to expose host, RPC port, and tracking URL via the RPC 
protocol, we should be consistent on the web side as well (YARN-954 and 
YARN-1023).
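On the javadoc point, a small illustrative sketch of the kind of user-facing 
documentation being requested (the interface name and wording are hypothetical):

{code:java}
// Hypothetical javadoc sketch; the real text belongs in the YARN-978 patch.
public interface ApplicationAttemptReportSketch {
    /**
     * The current state of this application attempt as exposed to users,
     * mapped from the internal RMAppAttemptState.
     *
     * @return the attempt's user-facing state
     */
    Object getYarnApplicationAttemptState();
}
{code}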


 [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
 --

 Key: YARN-978
 URL: https://issues.apache.org/jira/browse/YARN-978
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Mayank Bansal
Assignee: Xuan Gong
 Fix For: YARN-321

 Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch, 
 YARN-978.4.patch, YARN-978.5.patch, YARN-978.6.patch


 We don't have an ApplicationAttemptReport and its Protobuf implementation.
 Adding that.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1119) Add ClusterMetrics checks to the TestRMNodeTransitions tests

2013-09-10 Thread Mit Desai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mit Desai updated YARN-1119:


Attachment: YARN-1119.patch

Patch posted for trunk

 Add ClusterMetrics checks to the TestRMNodeTransitions tests
 

 Key: YARN-1119
 URL: https://issues.apache.org/jira/browse/YARN-1119
 Project: Hadoop YARN
  Issue Type: Test
  Components: resourcemanager
Affects Versions: 3.0.0, 0.23.9, 2.0.6-alpha
Reporter: Robert Parker
Assignee: Mit Desai
 Attachments: YARN-1119.patch, YARN-1119-v1-b23.patch


 YARN-1101 identified an issue where UNHEALTHY nodes could double decrement 
 the active nodes. We should add checks for RUNNING node transitions.
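A hedged sketch of the kind of check being added (the real test drives actual 
RMNode transitions; the helper shape here is an assumption):

{code:java}
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.yarn.server.resourcemanager.ClusterMetrics;

public class ClusterMetricsCheckSketch {
    // Assert the active-NM counter returns to its baseline after a node goes
    // RUNNING -> UNHEALTHY -> RUNNING, i.e. no double decrement.
    public void verifyNoDoubleDecrement(Runnable driveUnhealthyAndBack) {
        ClusterMetrics metrics = ClusterMetrics.getMetrics();
        int baseline = metrics.getNumActiveNMs();
        driveUnhealthyAndBack.run();
        assertEquals(baseline, metrics.getNumActiveNMs());
    }
}
{code}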

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1098) Separate out RM services into Always On and Active

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763297#comment-13763297
 ] 

Hudson commented on YARN-1098:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4394 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4394/])
YARN-1098. Separate out RM services into Always On and Active (Karthik Kambatla 
via bikas) (bikas: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521560)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


 Separate out RM services into Always On and Active
 --

 Key: YARN-1098
 URL: https://issues.apache.org/jira/browse/YARN-1098
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Fix For: 2.3.0

 Attachments: yarn-1098-1.patch, yarn-1098-2.patch, yarn-1098-3.patch, 
 yarn-1098-4.patch, yarn-1098-5.patch, yarn-1098-approach.patch, 
 yarn-1098-approach.patch


 From discussion on YARN-1027, it makes sense to separate out services that 
 are stateful and stateless. The stateless services can run perennially 
 irrespective of whether the RM is in the Active/Standby state, while the 
 stateful services need to be started on transitionToActive() and completely 
 shut down on transitionToStandby().
 The external-facing stateless services should respond to the client/AM/NM 
 requests depending on whether the RM is Active/Standby.
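A hedged sketch of the split (in the patch the active bundle lives inside 
ResourceManager.java; the class name and wiring below are illustrative):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.CompositeService;

class HighlyAvailableRMSketch {
    private final Configuration conf = new Configuration();
    private CompositeService activeServices; // stateful: scheduler, AM service, ...

    void transitionToActive() {
        activeServices = new CompositeService("ActiveServices");
        // addService(...) calls for everything holding per-application state
        activeServices.init(conf);
        activeServices.start();
    }

    void transitionToStandby() {
        if (activeServices != null) {
            activeServices.stop();   // completely shut down stateful services
            activeServices = null;
        }
        // Always On services (RPC endpoints, web UI) keep running and answer
        // client/AM/NM requests according to the current HA state.
    }
}
{code}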

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1119) Add ClusterMetrics checks to the TestRMNodeTransitions tests

2013-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763356#comment-13763356
 ] 

Hadoop QA commented on YARN-1119:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12602372/YARN-1119.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1887//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1887//console

This message is automatically generated.

 Add ClusterMetrics checks to the TestRMNodeTransitions tests
 

 Key: YARN-1119
 URL: https://issues.apache.org/jira/browse/YARN-1119
 Project: Hadoop YARN
  Issue Type: Test
  Components: resourcemanager
Affects Versions: 3.0.0, 0.23.9, 2.0.6-alpha
Reporter: Robert Parker
Assignee: Mit Desai
 Attachments: YARN-1119.patch, YARN-1119-v1-b23.patch


 YARN-1101 identified an issue where UNHEALTHY nodes could double decrement 
 the active nodes. We should add checks for RUNNING node transitions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1119) Add ClusterMetrics checks to tho TestRMNodeTransitions tests

2013-09-10 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763404#comment-13763404
 ] 

Jonathan Eagles commented on YARN-1119:
---

+1 lgtm. Thanks for the patches, Mit.

 Add ClusterMetrics checks to the TestRMNodeTransitions tests
 

 Key: YARN-1119
 URL: https://issues.apache.org/jira/browse/YARN-1119
 Project: Hadoop YARN
  Issue Type: Test
  Components: resourcemanager
Affects Versions: 3.0.0, 0.23.9, 2.0.6-alpha
Reporter: Robert Parker
Assignee: Mit Desai
 Attachments: YARN-1119.patch, YARN-1119-v1-b23.patch


 YARN-1101 identified an issue where UNHEALTHY nodes could double decrement 
 the active nodes. We should add checks for RUNNING node transitions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable

2013-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763431#comment-13763431
 ] 

Hadoop QA commented on YARN-713:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12602394/YARN-713.20130910.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1888//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1888//console

This message is automatically generated.

 ResourceManager can exit unexpectedly if DNS is unavailable
 ---

 Key: YARN-713
 URL: https://issues.apache.org/jira/browse/YARN-713
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
Priority: Critical
 Fix For: 2.3.0

 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, 
 YARN-713.20130910.1.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, 
 YARN-713.patch


 As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could 
 lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and 
 that ultimately would cause the RM to exit.  The RM should not exit during 
 DNS hiccups.
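An illustrative guard (not the YARN-713 patch): resolve hostnames defensively 
so a DNS outage surfaces as a handled error instead of an exception escaping 
the AsyncDispatcher's event loop and taking down the RM.

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class SafeResolve {
    static InetAddress resolveOrNull(String host) {
        try {
            return InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            // Log and retry later instead of letting this kill the dispatcher thread.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveOrNull("no-such-host.invalid"));
    }
}
{code}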

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1119) Add ClusterMetrics checks to the TestRMNodeTransitions tests

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763432#comment-13763432
 ] 

Hudson commented on YARN-1119:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4397 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4397/])
YARN-1119. Add ClusterMetrics checks to the TestRMNodeTransitions tests (Mit 
Desai via jeagles) (jeagles: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521611)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java


 Add ClusterMetrics checks to the TestRMNodeTransitions tests
 

 Key: YARN-1119
 URL: https://issues.apache.org/jira/browse/YARN-1119
 Project: Hadoop YARN
  Issue Type: Test
  Components: resourcemanager
Affects Versions: 3.0.0, 0.23.9, 2.0.6-alpha
Reporter: Robert Parker
Assignee: Mit Desai
 Attachments: YARN-1119.patch, YARN-1119-v1-b23.patch


 YARN-1101 identified an issue where UNHEALTHY nodes could double decrement 
 the active nodes. We should add checks for RUNNING node transitions.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists

2013-09-10 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763439#comment-13763439
 ] 

Xuan Gong commented on YARN-609:


Verified: they are used just as setters.
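
For context, a minimal sketch of the copy-under-lock setter pattern at issue
here (class and method names are hypothetical, not the actual PBImpl code):
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SafeListHolder<T> {
  private List<T> items;

  // Copy the caller's list while holding the lock, so later mutation of the
  // caller's reference cannot race with readers of this object.
  public synchronized void setItems(List<T> items) {
    this.items = (items == null) ? null : new ArrayList<T>(items);
  }

  public synchronized List<T> getItems() {
    return (items == null) ? null : Collections.unmodifiableList(items);
  }
}
{code}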

 Fix synchronization issues in APIs which take in lists
 --

 Key: YARN-609
 URL: https://issues.apache.org/jira/browse/YARN-609
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-609.10.patch, YARN-609.1.patch, YARN-609.2.patch, 
 YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, 
 YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch


 Some of the APIs take in lists and the setter-APIs don't always do proper 
 synchronization. We need to fix these.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-609) Fix synchronization issues in APIs which take in lists

2013-09-10 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-609:
---

Attachment: YARN-609.10.patch

 Fix synchronization issues in APIs which take in lists
 --

 Key: YARN-609
 URL: https://issues.apache.org/jira/browse/YARN-609
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-609.10.patch, YARN-609.1.patch, YARN-609.2.patch, 
 YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, 
 YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch


 Some of the APIs take in lists and the setter-APIs don't always do proper 
 synchronization. We need to fix these.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-867) Isolation of failures in aux services

2013-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763452#comment-13763452
 ] 

Hadoop QA commented on YARN-867:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12602396/YARN-867.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1889//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1889//console

This message is automatically generated.

 Isolation of failures in aux services 
 --

 Key: YARN-867
 URL: https://issues.apache.org/jira/browse/YARN-867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, 
 YARN-867.sampleCode.2.patch


 Today, a malicious application can bring down the NM by sending bad data to a 
 service. For example, sending data to the ShuffleService such that it results 
 in any non-IOException will cause the NM's async dispatcher to exit, as the 
 service's INIT APP event is not handled properly. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-867) Isolation of failures in aux services

2013-09-10 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-867:
---

Attachment: YARN-867.3.patch

 Isolation of failures in aux services 
 --

 Key: YARN-867
 URL: https://issues.apache.org/jira/browse/YARN-867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, 
 YARN-867.sampleCode.2.patch


 Today, a malicious application can bring down the NM by sending bad data to a 
 service. For example, sending data to the ShuffleService such that it results 
 in any non-IOException will cause the NM's async dispatcher to exit, as the 
 service's INIT APP event is not handled properly. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol

2013-09-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763470#comment-13763470
 ] 

Karthik Kambatla commented on YARN-1027:


Did some testing with several transitions to Standby and Active back and forth, 
and ran MR jobs when in Active mode.
# The Standby mode (389719 objects worth 46661952 bytes) indeed has fewer 
objects and uses less memory compared to the Active mode (399819 objects worth 
50104584 bytes).
# The applicationId keeps the same timestamp from when the RM started, and ids 
are issued starting from 1 again. This leads to issues ranging from 
client-side failures due to entries in .staging/ to jobs hanging. Once enough 
jobs are killed, subsequent jobs can be run as usual. To address this, I think 
it is safe to reset the timestamp to when the RM becomes Active.
# The WebUI behaves as expected.

Regarding more involved tests, I was thinking of writing a 
MiniYARNCluster-based one that checks whether the RPC servers are shut down in 
Standby mode. We can check if a client can request an applicationId, etc. Is 
it okay for these tests to live in hadoop-yarn-client? Or would it make sense 
to create a separate module for such end-to-end tests, including future HA 
tests, stress tests, etc.?

 Implement RMHAServiceProtocol
 -

 Key: YARN-1027
 URL: https://issues.apache.org/jira/browse/YARN-1027
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: test-yarn-1027.patch, yarn-1027-1.patch, 
 yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, 
 yarn-1027-including-yarn-1098-3.patch, yarn-1027-in-rm-poc.patch


 Implement existing HAServiceProtocol from Hadoop common. This protocol is the 
 single point of interaction between the RM and HA clients/services.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists

2013-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763473#comment-13763473
 ] 

Hadoop QA commented on YARN-609:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12602401/YARN-609.10.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1890//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1890//console

This message is automatically generated.

 Fix synchronization issues in APIs which take in lists
 --

 Key: YARN-609
 URL: https://issues.apache.org/jira/browse/YARN-609
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-609.10.patch, YARN-609.1.patch, YARN-609.2.patch, 
 YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, 
 YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch


 Some of the APIs take in lists and the setter-APIs don't always do proper 
 synchronization. We need to fix these.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-867) Isolation of failures in aux services

2013-09-10 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763423#comment-13763423
 ] 

Xuan Gong commented on YARN-867:


Recreated the patch based on the latest trunk, and added a new test case to 
test the logic.
Removed the API onAuxServiceFailure; we already have onContainersCompleted() 
to take care of it.

 Isolation of failures in aux services 
 --

 Key: YARN-867
 URL: https://issues.apache.org/jira/browse/YARN-867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, 
 YARN-867.sampleCode.2.patch


 Today, a malicious application can bring down the NM by sending bad data to a 
 service. For example, sending data to the ShuffleService such that it results 
 in any non-IOException will cause the NM's async dispatcher to exit, as the 
 service's INIT APP event is not handled properly. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable

2013-09-10 Thread Omkar Vinit Joshi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Omkar Vinit Joshi updated YARN-713:
---

Attachment: YARN-713.20130910.1.patch

 ResourceManager can exit unexpectedly if DNS is unavailable
 ---

 Key: YARN-713
 URL: https://issues.apache.org/jira/browse/YARN-713
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
Priority: Critical
 Fix For: 2.3.0

 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, 
 YARN-713.20130910.1.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, 
 YARN-713.patch


 As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could 
 lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and 
 that ultimately would cause the RM to exit.  The RM should not exit during 
 DNS hiccups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable

2013-09-10 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763387#comment-13763387
 ] 

Omkar Vinit Joshi commented on YARN-713:


Fixing the test case and the findbugs warning.

 ResourceManager can exit unexpectedly if DNS is unavailable
 ---

 Key: YARN-713
 URL: https://issues.apache.org/jira/browse/YARN-713
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Jason Lowe
Assignee: Omkar Vinit Joshi
Priority: Critical
 Fix For: 2.3.0

 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, 
 YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch


 As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could 
 lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and 
 that ultimately would cause the RM to exit.  The RM should not exit during 
 DNS hiccups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-890) The roundup for memory values on resource manager UI is misleading

2013-09-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763533#comment-13763533
 ] 

Zhijie Shen commented on YARN-890:
--

The patch can ensure the UI shows the configured resource.

Just thinking out loud: the problem happens because totalMB = allocatedMB + 
availableMB, and availableMB is rounded up, which only happens with 
CapacityScheduler. [~tdhavle], would you please confirm the problem only 
happens with CapacityScheduler?

While it makes sense to round up a resource request, why do we need to round 
up available memory? Let's say we have 100MB available; the number will be 
rounded up to 1024MB. Should we then allow another 1024MB container to be 
allocated?

In addition, availableMB seems to be used only by the web UI now.
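
To make the arithmetic concrete, a pure-JDK sketch of the roundup in question 
(the step is yarn.scheduler.minimum-allocation-mb, 1024 here):
{code}
public class RoundUpDemo {
  static long roundUp(long value, long step) {
    return ((value + step - 1) / step) * step;
  }

  public static void main(String[] args) {
    // 100 MB available gets reported as a full 1024 MB once rounded up,
    // inflating totalMB = allocatedMB + availableMB on the UI.
    System.out.println(roundUp(100, 1024));  // 1024
    // And a 4192 MB node rounds up to 5120 MB, i.e. the 5GB on the UI.
    System.out.println(roundUp(4192, 1024)); // 5120
  }
}
{code}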

 The roundup for memory values on resource manager UI is misleading
 --

 Key: YARN-890
 URL: https://issues.apache.org/jira/browse/YARN-890
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Trupti Dhavle
Assignee: Xuan Gong
 Attachments: Screen Shot 2013-07-10 at 10.43.34 AM.png, 
 YARN-890.1.patch


 From the yarn-site.xml, I see the following values:
 <property>
 <name>yarn.nodemanager.resource.memory-mb</name>
 <value>4192</value>
 </property>
 <property>
 <name>yarn.scheduler.maximum-allocation-mb</name>
 <value>4192</value>
 </property>
 <property>
 <name>yarn.scheduler.minimum-allocation-mb</name>
 <value>1024</value>
 </property>
 However the resourcemanager UI shows total memory as 5GB 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1078) TestNodeManagerResync, TestNodeManagerShutdown, and TestNodeStatusUpdater fail on Windows

2013-09-10 Thread Chuan Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chuan Liu updated YARN-1078:


Attachment: YARN-1078.2.patch

I looked into the failure. It turns out we use 
InetAddress.getCanonicalHostName() to construct the nodeId in 
ContainerManagerImpl. The tests assume this will always be "localhost" for a 
local loopback address, i.e. 127.0.0.1. However, this is not the case on 
Windows, where the method can return "127.0.0.1" instead of "localhost". In 
the old patch, I switched from "localhost" to "127.0.0.1" and regressed 
Linux. Attaching a new patch that uses getCanonicalHostName() to obtain the 
name for the nodeId constructed in the tests.
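
As a self-contained illustration of the resolution difference (plain JDK, 
nothing YARN-specific; the ":12345" port is just the value the tests use):
{code}
import java.net.InetAddress;

public class CanonicalHostNameDemo {
  public static void main(String[] args) throws Exception {
    InetAddress loopback = InetAddress.getByName("127.0.0.1");
    // On Linux this usually prints "localhost"; on Windows it can print
    // "127.0.0.1", which is why the tests must not hard-code either form.
    String host = loopback.getCanonicalHostName();
    System.out.println("expected nodeId = " + host + ":12345");
  }
}
{code}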

 TestNodeManagerResync, TestNodeManagerShutdown, and TestNodeStatusUpdater 
 fail on Windows
 -

 Key: YARN-1078
 URL: https://issues.apache.org/jira/browse/YARN-1078
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 2.3.0
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: YARN-1078.2.patch, YARN-1078.patch


 The three unit tests fail on Windows due to host name resolution differences 
 on Windows, i.e. 127.0.0.1 does not resolve to host name localhost.
 {noformat}
 org.apache.hadoop.security.token.SecretManager$InvalidToken: Given Container 
 container_0__01_00 identifier is not valid for current Node manager. 
 Expected : 127.0.0.1:12345 Found : localhost:12345
 {noformat}
 {noformat}
 testNMConnectionToRM(org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater)
   Time elapsed: 8343 sec   FAILURE!
 org.junit.ComparisonFailure: expected:<[localhost]:12345> but 
 was:<[127.0.0.1]:12345>
   at org.junit.Assert.assertEquals(Assert.java:125)
   at org.junit.Assert.assertEquals(Assert.java:147)
   at 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater$MyResourceTracker6.registerNodeManager(TestNodeStatusUpdater.java:712)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
   at $Proxy26.registerNodeManager(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:212)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:149)
   at 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater$MyNodeStatusUpdater4.serviceStart(TestNodeStatusUpdater.java:369)
   at 
 org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
   at 
 org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:101)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:213)
   at 
 org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
   at 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater.testNMConnectionToRM(TestNodeStatusUpdater.java:985)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.

2013-09-10 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763614#comment-13763614
 ] 

Arpit Agarwal commented on YARN-1025:
-

+1 for the change.

 ResourceManager and NodeManager do not load native libraries on Windows.
 

 Key: YARN-1025
 URL: https://issues.apache.org/jira/browse/YARN-1025
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Chris Nauroth
 Attachments: YARN-1025.1.patch


 ResourceManager and NodeManager do not have the correct setting for 
 java.library.path when launched on Windows.  This prevents the processes from 
 loading native code from hadoop.dll.  The native code is required for correct 
 functioning on Windows (not optional), so this ultimately can cause failures.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Mayank Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763572#comment-13763572
 ] 

Mayank Bansal commented on YARN-938:


I ran these benchmarks in collaboration with Vinod [~vinodkv].

Thanks Vinod for all your help.

Attaching the results.

Thanks,
Mayank 

 Hadoop 2 benchmarking 
 --

 Key: YARN-938
 URL: https://issues.apache.org/jira/browse/YARN-938
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls


 I am running the benchmarks on Hadoop 2 and will update the results soon.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1001) YARN should provide per application-type and state statistics

2013-09-10 Thread Srimanth Gunturi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srimanth Gunturi updated YARN-1001:
---

Priority: Critical  (was: Major)

Ambari needs at least a way to get MapReduce app state counts. This is necessary 
for the upcoming Ambari release.

 YARN should provide per application-type and state statistics
 -

 Key: YARN-1001
 URL: https://issues.apache.org/jira/browse/YARN-1001
 Project: Hadoop YARN
  Issue Type: Task
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Zhijie Shen
Priority: Critical
 Attachments: YARN-1001.1.patch, YARN-1001.2.patch


 In Ambari we plan to show for MR2 the number of applications finished, 
 running, waiting, etc. It would be efficient if YARN could provide per 
 application-type and state aggregated counts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-1171) Add defaultQueueSchedulingPolicy to Fair Scheduler documentation

2013-09-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla reassigned YARN-1171:
--

Assignee: Karthik Kambatla

 Add defaultQueueSchedulingPolicy to Fair Scheduler documentation 
 -

 Key: YARN-1171
 URL: https://issues.apache.org/jira/browse/YARN-1171
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: documentation, scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Karthik Kambatla

 The Fair Scheduler doc is missing the defaultQueueSchedulingPolicy property.  
 I suspect there are a few other ones too that provide defaults for all queues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated YARN-938:
---

Attachment: Hadoop-benchmarking-2.x-vs-1.x.xls

 Hadoop 2 benchmarking 
 --

 Key: YARN-938
 URL: https://issues.apache.org/jira/browse/YARN-938
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls


 I am running the benchmarks on Hadoop 2 and will update the results soon.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'

2013-09-10 Thread Srimanth Gunturi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srimanth Gunturi updated YARN-1166:
---

Priority: Critical  (was: Major)

This JIRA is necessary for the upcoming Ambari release. 

 YARN 'appsFailed' metric should be of type 'counter'
 

 Key: YARN-1166
 URL: https://issues.apache.org/jira/browse/YARN-1166
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Akira AJISAKA
Priority: Critical
 Attachments: YARN-1166.patch


 Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of 
 type 'gauge' - which means the exact value will be reported. 
 All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) 
 are all of type 'counter' - meaning Ganglia will use slope to provide deltas 
 between time-points.
 To be consistent, AppsFailed metric should also be of type 'counter'. 
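
 To illustrate the distinction, a minimal sketch using the metrics2 library 
 directly (an illustration only, not the QueueMetrics code):
 {code}
 import org.apache.hadoop.metrics2.lib.MetricsRegistry;
 import org.apache.hadoop.metrics2.lib.MutableCounterInt;
 import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

 public class CounterVsGauge {
   public static void main(String[] args) {
     MetricsRegistry registry = new MetricsRegistry("demo");
     // A counter is monotonically increasing, so Ganglia can derive deltas
     // from its slope; a gauge just reports whatever value it holds.
     MutableCounterInt appsFailed =
         registry.newCounter("appsFailed", "Cumulative failed apps", 0);
     MutableGaugeInt appsPending =
         registry.newGauge("appsPending", "Currently pending apps", 0);

     appsFailed.incr();  // counters only go up
     appsPending.incr();
     appsPending.decr(); // gauges can go down again

     System.out.println("failed=" + appsFailed.value()
         + " pending=" + appsPending.value());
   }
 }
 {code}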

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1078) TestNodeManagerResync, TestNodeManagerShutdown, and TestNodeStatusUpdater fail on Windows

2013-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763627#comment-13763627
 ] 

Hadoop QA commented on YARN-1078:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12602435/YARN-1078.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1891//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1891//console

This message is automatically generated.

 TestNodeManagerResync, TestNodeManagerShutdown, and TestNodeStatusUpdater 
 fail on Windows
 -

 Key: YARN-1078
 URL: https://issues.apache.org/jira/browse/YARN-1078
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0, 2.3.0
Reporter: Chuan Liu
Assignee: Chuan Liu
Priority: Minor
 Attachments: YARN-1078.2.patch, YARN-1078.patch


 The three unit tests fail on Windows due to host name resolution differences 
 on Windows, i.e. 127.0.0.1 does not resolve to host name localhost.
 {noformat}
 org.apache.hadoop.security.token.SecretManager$InvalidToken: Given Container 
 container_0__01_00 identifier is not valid for current Node manager. 
 Expected : 127.0.0.1:12345 Found : localhost:12345
 {noformat}
 {noformat}
 testNMConnectionToRM(org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater)
   Time elapsed: 8343 sec   FAILURE!
 org.junit.ComparisonFailure: expected:<[localhost]:12345> but 
 was:<[127.0.0.1]:12345>
   at org.junit.Assert.assertEquals(Assert.java:125)
   at org.junit.Assert.assertEquals(Assert.java:147)
   at 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater$MyResourceTracker6.registerNodeManager(TestNodeStatusUpdater.java:712)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
   at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
   at $Proxy26.registerNodeManager(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:212)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:149)
   at 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater$MyNodeStatusUpdater4.serviceStart(TestNodeStatusUpdater.java:369)
   at 
 org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
   at 
 org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:101)
   at 
 org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:213)
   at 
 org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
   at 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater.testNMConnectionToRM(TestNodeStatusUpdater.java:985)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1001) YARN should provide per application-type and state statistics

2013-09-10 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763634#comment-13763634
 ] 

Xuan Gong commented on YARN-1001:
-

+1 Looks good

 YARN should provide per application-type and state statistics
 -

 Key: YARN-1001
 URL: https://issues.apache.org/jira/browse/YARN-1001
 Project: Hadoop YARN
  Issue Type: Task
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Zhijie Shen
Priority: Critical
 Attachments: YARN-1001.1.patch, YARN-1001.2.patch


 In Ambari we plan to show for MR2 the number of applications finished, 
 running, waiting, etc. It would be efficient if YARN could provide per 
 application-type and state aggregated counts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.

2013-09-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763673#comment-13763673
 ] 

Hudson commented on YARN-1025:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4398 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4398/])
YARN-1025. ResourceManager and NodeManager do not load native libraries on 
Windows. Contributed by Chris Nauroth. (cnauroth: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521670)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd


 ResourceManager and NodeManager do not load native libraries on Windows.
 

 Key: YARN-1025
 URL: https://issues.apache.org/jira/browse/YARN-1025
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: 3.0.0, 2.1.1-beta

 Attachments: YARN-1025.1.patch


 ResourceManager and NodeManager do not have the correct setting for 
 java.library.path when launched on Windows.  This prevents the processes from 
 loading native code from hadoop.dll.  The native code is required for correct 
 functioning on Windows (not optional), so this ultimately can cause failures.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1001) YARN should provide per application-type and state statistics

2013-09-10 Thread Srimanth Gunturi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srimanth Gunturi updated YARN-1001:
---

Priority: Blocker  (was: Critical)

 YARN should provide per application-type and state statistics
 -

 Key: YARN-1001
 URL: https://issues.apache.org/jira/browse/YARN-1001
 Project: Hadoop YARN
  Issue Type: Task
  Components: api
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Zhijie Shen
Priority: Blocker
 Attachments: YARN-1001.1.patch, YARN-1001.2.patch


 In Ambari we plan to show for MR2 the number of applications finished, 
 running, waiting, etc. It would be efficient if YARN could provide per 
 application-type and state aggregated counts.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1149) NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING

2013-09-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763670#comment-13763670
 ] 

Zhijie Shen commented on YARN-1149:
---

Conducted some investigation on the problem:

1. The following transition seems to be unnecessary, because 
APPLICATION_LOG_HANDLING_FINISHED can be emitted as early as after 
APPLICATION_STARTED is handled, when Application is already at INITING.
{code}
+  .addTransition(ApplicationState.NEW, ApplicationState.FINISHED,
+  ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED,
+  new AppShutDownTransition())
{code}

2. The following message seems not to cover all the cases:
{code}
+  LOG.info("Application " + app.getAppId() +
+  " is shutted down since NodeManager has been killed.");
{code}
In the normal case, APPLICATION_LOG_HANDLING_FINISHED is emitted after 
APPLICATION_FINISHED is handled, when Application is already at FINISHED. There 
are two exceptions: 1. The NM is stopping, and the running log aggregation job 
is signaled to stop early; in this case, the log info makes sense. 2. The 
running log aggregation job is interrupted. See the following code:
{code}
while (!this.appFinishing.get()) {
  synchronized(this) {
try {
  wait(THREAD_SLEEP_TIME);
} catch (InterruptedException e) {
  LOG.warn("PendingContainers queue is interrupted");
  this.appFinishing.set(true);
}
  }
}
{code}
In this case, the message seems not to be correct.

3. Should we do the following in AppShutDownTransition as well? Once 
APPLICATION_LOG_HANDLING_FINISHED is consumed, there will be no 
FINISHED->FINISHED transition on APPLICATION_LOG_HANDLING_FINISHED, and the 
app will remain in the context forever.
{code}
  app.context.getApplications().remove(appId);
  app.aclsManager.removeApplication(appId);
{code}
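
To make point 3 concrete, a hedged in-context sketch (it assumes the 
AppShutDownTransition class from the patch and ApplicationImpl's existing 
fields; it is not compilable on its own):
{code}
// Do the same cleanup the FINISHED -> FINISHED transition would have done,
// since that transition can no longer fire once the
// APPLICATION_LOG_HANDLING_FINISHED event is consumed earlier.
static class AppShutDownTransition implements
    SingleArcTransition<ApplicationImpl, ApplicationEvent> {
  @Override
  public void transition(ApplicationImpl app, ApplicationEvent event) {
    ApplicationId appId = app.getAppId();
    app.context.getApplications().remove(appId);
    app.aclsManager.removeApplication(appId);
  }
}
{code}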

 NM throws InvalidStateTransitonException: Invalid event: 
 APPLICATION_LOG_HANDLING_FINISHED at RUNNING
 -

 Key: YARN-1149
 URL: https://issues.apache.org/jira/browse/YARN-1149
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ramya Sunil
Assignee: Xuan Gong
 Fix For: 2.1.1-beta

 Attachments: YARN-1149.1.patch


 When nodemanager receives a kill signal when an application has finished 
 execution but log aggregation has not kicked in, 
 InvalidStateTransitonException: Invalid event: 
 APPLICATION_LOG_HANDLING_FINISHED at RUNNING is thrown
 {noformat}
 2013-08-25 20:45:00,875 INFO  logaggregation.AppLogAggregatorImpl 
 (AppLogAggregatorImpl.java:finishLogAggregation(254)) - Application just 
 finished : application_1377459190746_0118
 2013-08-25 20:45:00,876 INFO  logaggregation.AppLogAggregatorImpl 
 (AppLogAggregatorImpl.java:uploadLogsForContainer(105)) - Starting aggregate 
 log-file for app application_1377459190746_0118 at 
 /app-logs/foo/logs/application_1377459190746_0118/host_45454.tmp
 2013-08-25 20:45:00,876 INFO  logaggregation.LogAggregationService 
 (LogAggregationService.java:stopAggregators(151)) - Waiting for aggregation 
 to complete for application_1377459190746_0118
 2013-08-25 20:45:00,891 INFO  logaggregation.AppLogAggregatorImpl 
 (AppLogAggregatorImpl.java:uploadLogsForContainer(122)) - Uploading logs for 
 container container_1377459190746_0118_01_04. Current good log dirs are 
 /tmp/yarn/local
 2013-08-25 20:45:00,915 INFO  logaggregation.AppLogAggregatorImpl 
 (AppLogAggregatorImpl.java:doAppLogAggregation(182)) - Finished aggregate 
 log-file for app application_1377459190746_0118
 2013-08-25 20:45:00,925 WARN  application.Application 
 (ApplicationImpl.java:handle(427)) - Can't handle this event at current state
 org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
 APPLICATION_LOG_HANDLING_FINISHED at RUNNING
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
  
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
 at 
 org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:425)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:59)
 at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:697)
 at 
 

[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'

2013-09-10 Thread Srimanth Gunturi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Srimanth Gunturi updated YARN-1166:
---

Priority: Blocker  (was: Critical)

 YARN 'appsFailed' metric should be of type 'counter'
 

 Key: YARN-1166
 URL: https://issues.apache.org/jira/browse/YARN-1166
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Akira AJISAKA
Priority: Blocker
 Attachments: YARN-1166.2.patch, YARN-1166.patch


 Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of 
 type 'gauge' - which means the exact value will be reported. 
 All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) 
 are all of type 'counter' - meaning Ganglia will use slope to provide deltas 
 between time-points.
 To be consistent, AppsFailed metric should also be of type 'counter'. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1098) Separate out RM services into Always On and Active

2013-09-10 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763683#comment-13763683
 ] 

Karthik Kambatla commented on YARN-1098:


[~bikassaha] and [~tucu00], thanks for the reviews.

 Separate out RM services into Always On and Active
 --

 Key: YARN-1098
 URL: https://issues.apache.org/jira/browse/YARN-1098
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
  Labels: ha
 Fix For: 2.3.0

 Attachments: yarn-1098-1.patch, yarn-1098-2.patch, yarn-1098-3.patch, 
 yarn-1098-4.patch, yarn-1098-5.patch, yarn-1098-approach.patch, 
 yarn-1098-approach.patch


 From discussion on YARN-1027, it makes sense to separate out services that 
 are stateful and stateless. The stateless services can run perennially 
 irrespective of whether the RM is in Active/Standby state, while the stateful 
 services need to be started on transitionToActive() and completely shut down 
 on transitionToStandby().
 The external-facing stateless services should respond to the client/AM/NM 
 requests depending on whether the RM is Active/Standby.
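
 A minimal sketch of that split (class and field names are hypothetical, not 
 the committed patch):
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.service.CompositeService;

 public class HASplitSketch {
   // Stateful services are grouped so they can be created when the RM
   // becomes Active and torn down completely when it goes to Standby.
   static class ActiveServices extends CompositeService {
     ActiveServices() {
       super("ActiveServices");
       // addService(scheduler), addService(ApplicationMasterService), ...
     }
   }

   private final Configuration conf = new Configuration();
   private ActiveServices activeServices;

   void transitionToActive() {
     activeServices = new ActiveServices();
     activeServices.init(conf);
     activeServices.start();   // stateful services run only while Active
   }

   void transitionToStandby() {
     if (activeServices != null) {
       activeServices.stop();  // completely shut down on Standby
       activeServices = null;
     }
   }
 }
 {code}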

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'

2013-09-10 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-1166:


Attachment: YARN-1166.2.patch

Attached a patch to pass TestLeafQueue.

 YARN 'appsFailed' metric should be of type 'counter'
 

 Key: YARN-1166
 URL: https://issues.apache.org/jira/browse/YARN-1166
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Akira AJISAKA
Priority: Critical
 Attachments: YARN-1166.2.patch, YARN-1166.patch


 Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of 
 type 'gauge' - which means the exact value will be reported. 
 All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) 
 are all of type 'counter' - meaning Ganglia will use slope to provide deltas 
 between time-points.
 To be consistent, AppsFailed metric should also be of type 'counter'. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions

2013-09-10 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763653#comment-13763653
 ] 

Vinod Kumar Vavilapalli commented on YARN-910:
--

bq. Vinod Kumar Vavilapalli, thanks. Any reason not have this in for 2.1.1-beta?
It's just that I didn't see a target version. And this is new functionality, so 
committed to 2.3 by default. I already see you merged it into 2.1.

 Allow auxiliary services to listen for container starts and completions
 ---

 Key: YARN-910
 URL: https://issues.apache.org/jira/browse/YARN-910
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Alejandro Abdelnur
 Fix For: 2.1.1-beta

 Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, 
 YARN-910.patch


 Making container start and completion events available to auxiliary services 
 would allow them to be resource-aware.  The auxiliary service would be able 
 to notify a co-located service, one that is opportunistically using free 
 capacity, of allocation changes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'

2013-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763694#comment-13763694
 ] 

Hadoop QA commented on YARN-1166:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12602448/YARN-1166.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1892//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1892//console

This message is automatically generated.

 YARN 'appsFailed' metric should be of type 'counter'
 

 Key: YARN-1166
 URL: https://issues.apache.org/jira/browse/YARN-1166
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Akira AJISAKA
Priority: Blocker
 Attachments: YARN-1166.2.patch, YARN-1166.patch


 Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of 
 type 'gauge' - which means the exact value will be reported. 
 All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) 
 are all of type 'counter' - meaning Ganglia will use slope to provide deltas 
 between time-points.
 To be consistent, AppsFailed metric should also be of type 'counter'. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists

2013-09-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763557#comment-13763557
 ] 

Zhijie Shen commented on YARN-609:
--

+1 LGTM

 Fix synchronization issues in APIs which take in lists
 --

 Key: YARN-609
 URL: https://issues.apache.org/jira/browse/YARN-609
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-609.10.patch, YARN-609.1.patch, YARN-609.2.patch, 
 YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, 
 YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch


 Some of the APIs take in lists and the setter-APIs don't always do proper 
 synchronization. We need to fix these.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1027) Implement RMHAProtocolService

2013-09-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1027:
---

Summary: Implement RMHAProtocolService  (was: Implement RMHAServiceProtocol)

 Implement RMHAProtocolService
 -

 Key: YARN-1027
 URL: https://issues.apache.org/jira/browse/YARN-1027
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: test-yarn-1027.patch, yarn-1027-1.patch, 
 yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, 
 yarn-1027-including-yarn-1098-3.patch, yarn-1027-in-rm-poc.patch


 Implement existing HAServiceProtocol from Hadoop common. This protocol is the 
 single point of interaction between the RM and HA clients/services.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-1027) Implement RMHAProtocolService

2013-09-10 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-1027:
---

Attachment: yarn-1027-6.patch

Updated patch to add ha config to yarn-default.xml and have the 
RM#clusterTimeStamp reflect when the RM became Active.

Submitting patch to check what Jenkins has to say.
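
A minimal sketch of the clusterTimeStamp change (field and method shapes are 
assumptions for illustration): applicationIds embed this timestamp, so 
refreshing it on every transition to Active keeps post-failover ids distinct 
from ids issued before.
{code}
public class ClusterTimeStampSketch {
  private static long clusterTimeStamp = System.currentTimeMillis();

  static synchronized void transitionToActive() {
    // Reset the timestamp so ids issued from 1 again do not collide with
    // ids the previous Active incarnation already handed out.
    clusterTimeStamp = System.currentTimeMillis();
  }

  public static void main(String[] args) {
    transitionToActive();
    System.out.println("application_" + clusterTimeStamp + "_0001");
  }
}
{code}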

 Implement RMHAProtocolService
 -

 Key: YARN-1027
 URL: https://issues.apache.org/jira/browse/YARN-1027
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: test-yarn-1027.patch, yarn-1027-1.patch, 
 yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, 
 yarn-1027-6.patch, yarn-1027-including-yarn-1098-3.patch, 
 yarn-1027-in-rm-poc.patch


 Implement existing HAServiceProtocol from Hadoop common. This protocol is the 
 single point of interaction between the RM and HA clients/services.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-867) Isolation of failures in aux services

2013-09-10 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763749#comment-13763749
 ] 

Zhijie Shen commented on YARN-867:
--

How about issuing a KILL_CONTAINER event instead of 
CONTAINER_EXITED_WITH_FAILURE? KILL_CONTAINER is already handled in all 
container states. Otherwise, we need to add transitions from a number of 
states to EXITED_WITH_FAILURE, and I'm not sure it is easy to ensure those 
transitions are correct.
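
For concreteness, a hedged fragment of what that alternative could look like 
at the aux-service error path (the dispatch site and the event constructor 
are assumptions, not the attached patch):
{code}
// Reuse the existing kill path, which every container state already
// handles, instead of adding new EXITED_WITH_FAILURE transitions.
dispatcher.getEventHandler().handle(
    new ContainerKillEvent(containerId,
        "Container killed after auxiliary service failure"));
{code}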

 Isolation of failures in aux services 
 --

 Key: YARN-867
 URL: https://issues.apache.org/jira/browse/YARN-867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, 
 YARN-867.sampleCode.2.patch


 Today, a malicious application can bring down the NM by sending bad data to a 
 service. For example, sending data to the ShuffleService such that it results 
 any non-IOException will cause the NM's async dispatcher to exit as the 
 service's INIT APP event is not handled properly. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763798#comment-13763798
 ] 

Sandy Ryza commented on YARN-938:
-

Thanks for working on these, [~mayank_bansal].  The results are pretty 
consistent with some internal benchmarking we've done at Cloudera.

A few questions:
* In MR1 was io.sort.record.percent tuned to spill the same number of times as 
MR2 does?
* What was slowstart completed maps set to?
* How many slots and MB were the TTs and NMs configured with?
* Any idea what caused the improvement between RC1 and the final release?  I'm 
guessing MAPREDUCE-5399 helped.


 Hadoop 2 benchmarking 
 --

 Key: YARN-938
 URL: https://issues.apache.org/jira/browse/YARN-938
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls


 I am running the benchmarks on Hadoop 2 and will update the results soon.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763817#comment-13763817
 ] 

Vinod Kumar Vavilapalli commented on YARN-938:
--

bq. The results are pretty consistent with some internal benchmarking we've 
done at Cloudera.
Interesting, do you mind sharing those results?

 Hadoop 2 benchmarking 
 --

 Key: YARN-938
 URL: https://issues.apache.org/jira/browse/YARN-938
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls


 I am running the benchmarks on Hadoop 2 and will update the results soon.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763829#comment-13763829
 ] 

Sandy Ryza commented on YARN-938:
-

On vacation now, but I'll try to assemble them into a presentable form when I 
get back.

 Hadoop 2 benchmarking 
 --

 Key: YARN-938
 URL: https://issues.apache.org/jira/browse/YARN-938
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls


 I am running the benchmarks on Hadoop 2 and will update the results soon.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1027) Implement RMHAProtocolService

2013-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763832#comment-13763832
 ] 

Hadoop QA commented on YARN-1027:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12602463/yarn-1027-6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 5 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1893//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1893//console

This message is automatically generated.

 Implement RMHAProtocolService
 -

 Key: YARN-1027
 URL: https://issues.apache.org/jira/browse/YARN-1027
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla
 Attachments: test-yarn-1027.patch, yarn-1027-1.patch, 
 yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, 
 yarn-1027-6.patch, yarn-1027-including-yarn-1098-3.patch, 
 yarn-1027-in-rm-poc.patch


 Implement the existing HAServiceProtocol from Hadoop common. This protocol is 
 the single point of interaction between the RM and HA clients/services.
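
 A minimal sketch of the idea, not the attached patches: the methods below are 
 the HAServiceProtocol surface from Hadoop common, while the actual RM wiring 
 (starting/stopping active services, security checks, etc.) is elided.

 {code:java}
 import java.io.IOException;
 import org.apache.hadoop.ha.HAServiceProtocol;
 import org.apache.hadoop.ha.HAServiceStatus;

 // Sketch only; real RM wiring is elided.
 public class RMHAProtocolServiceSketch implements HAServiceProtocol {
   private HAServiceState state = HAServiceState.INITIALIZING;

   @Override
   public synchronized void monitorHealth() throws IOException {
     // Throw HealthCheckFailedException here if the RM is unhealthy.
   }

   @Override
   public synchronized void transitionToActive(StateChangeRequestInfo req)
       throws IOException {
     // Start the active-only RM services, then flip the state.
     state = HAServiceState.ACTIVE;
   }

   @Override
   public synchronized void transitionToStandby(StateChangeRequestInfo req)
       throws IOException {
     // Stop active-only services but keep the RM process alive.
     state = HAServiceState.STANDBY;
   }

   @Override
   public synchronized HAServiceStatus getServiceStatus() throws IOException {
     return new HAServiceStatus(state);
   }
 }
 {code}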

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-938) Hadoop 2 benchmarking

2013-09-10 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763838#comment-13763838
 ] 

Nemon Lou commented on YARN-938:


Thanks, Mayank Bansal, for your work. Do you mind sharing how much input data 
you ran for TeraSort?

 Hadoop 2 benchmarking 
 --

 Key: YARN-938
 URL: https://issues.apache.org/jira/browse/YARN-938
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
 Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls


 I am running the benchmarks on Hadoop 2 and will update the results soon.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests

2013-09-10 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763848#comment-13763848
 ] 

Junping Du commented on YARN-1042:
--

Thanks for the comments, Luke!
bq. Although you can do a lot at the app side with container filtering, 
protocol and scheduler support will make it more efficient. I guess the 
intention of the jira is more for the latter, that affinity support should be 
app independent.
That reminds me that the intention of this JIRA may be on the RM side (so the 
title may need to change from container requests to resource requests), since 
long-lived services may use a different AppMaster from the default one I 
changed here. I also agree that doing it on the RM side may be more efficient, 
since there is no need to return containers on the app side that violate 
affinity rules.
However, my concern is that it may add extra complexity to the RM, since it 
makes the RM aware of the affinity/anti-affinity groups of tasks (or resource 
requests). IMO, one piece of YARN's simplicity and beauty is that the RM only 
takes care of abstract resource requests and allocates containers accordingly. 
I am not sure whether putting resource requests into affinity/anti-affinity 
groups and tracking the relationships between requests hurts this beauty. 
Thoughts?
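
To make the trade-off concrete, here is a purely hypothetical API sketch; none 
of these types or fields exist in YARN today. Grouping requests by opaque 
labels would let the RM keep treating requests abstractly, which speaks to the 
complexity concern above:

{code:java}
// Purely hypothetical sketch; nothing here exists in the YARN API today.
// Requests carry opaque group labels, so the RM would track only group
// membership per node/rack, not per-task relationships.
public abstract class AffinityResourceRequest {
  // Containers for requests sharing this label should be co-located.
  public abstract String getAffinityGroup();
  // Containers for requests sharing this label should be spread apart.
  public abstract String getAntiAffinityGroup();
  // Whether the rule applies at node or rack granularity.
  public abstract Scope getScope();
  // A soft rule lets the scheduler fall back rather than starve the task.
  public abstract boolean isStrict();

  public enum Scope { NODE, RACK }
}
{code}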

 add ability to specify affinity/anti-affinity in container requests
 ---

 Key: YARN-1042
 URL: https://issues.apache.org/jira/browse/YARN-1042
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Steve Loughran
Assignee: Junping Du
 Attachments: YARN-1042-demo.patch


 container requests to the AM should be able to request anti-affinity to 
 ensure that things like Region Servers don't come up in the same failure 
 zones. 
 Similarly, you may want to be able to specify affinity to the same host or 
 rack without specifying which specific host/rack. Example: bringing up a 
 small Giraph cluster in a large YARN cluster would benefit from having the 
 processes in the same rack purely for bandwidth reasons.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation

2013-09-10 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763898#comment-13763898
 ] 

Xuan Gong commented on YARN-978:


bq. FINISHING to FINISHED is a bad merge, and indeed is a bug, will file a 
ticket. But agree with the general sentiment. LAUNCHED_UNMANAGED_SAVING can be 
mapped to ALLOCATED_SAVING, I actually think we don't need a separate 
LAUNCHED_UNMANAGED_SAVING, Unmanaged AM should directly go to ALLOCATED state 
on app-submission, will file a bug. The rest seem fine enough for me.

Fixed. YarnApplicationAttemptState now has FINISHED and FINISHING, and 
RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING is mapped to 
YarnApplicationAttemptState.ALLOCATED_SAVING.
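
A sketch of what that mapping might look like, assuming a simple converter 
helper (names illustrative, not the patch itself):

{code:java}
// Illustrative converter, not the committed code.
public static YarnApplicationAttemptState convert(RMAppAttemptState rmState) {
  switch (rmState) {
    case LAUNCHED_UNMANAGED_SAVING:
      // No separate public state for unmanaged AMs; report it as
      // ALLOCATED_SAVING, as described above.
      return YarnApplicationAttemptState.ALLOCATED_SAVING;
    default:
      // The remaining states map one-to-one by name.
      return YarnApplicationAttemptState.valueOf(rmState.name());
  }
}
{code}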

bq. I think we should add the host and port information for information 
purposes so that users can reason where their previous AMs ran and on what 
ports. The tracking url can be removed, instead a logs-url can be added like we 
have on the UI.

Added host, rpc_port, and logsUrl to ApplicationAttemptReport.
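
For context, a sketch of the report surface this describes; the getter names 
are illustrative guesses based on the fields listed above, not necessarily the 
committed ones:

{code:java}
// Illustrative only; getter names are guesses, not the committed API.
public abstract class ApplicationAttemptReport {
  // Where the attempt's AM ran, so users can reason about past attempts.
  public abstract String getHost();
  public abstract int getRpcPort();
  // Added in place of the tracking URL: a link to the attempt's logs.
  public abstract String getLogsUrl();
}
{code}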

bq. No need to add more stuff to BuilderUtils. It was supposed to be dismantled.

Removed

bq. Do we really need the prefix APP_ATTEMPT_ in 
YarnApplicationAttemptStateProto?

Yes, we have to, just like we added the APP prefix in 
FinalApplicationStatusProto. The reason is that in Protocol Buffers, enum 
values use C++ scoping rules, meaning that enum values are siblings of their 
type, not children of it. As a result, all enum values must be unique within 
the enclosing scope, not just within YarnApplicationAttemptStateProto.
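
A hypothetical .proto fragment (value sets abbreviated) showing the collision 
and the fix:

{code}
// Hypothetical fragment. Because enum values are siblings of their enclosing
// type, these two definitions would collide on NEW:
//
//   enum YarnApplicationStateProto        { NEW = 1; ... }
//   enum YarnApplicationAttemptStateProto { NEW = 1; ... }  // error: "NEW" is already defined
//
// Prefixing keeps every value unique within the shared scope:
enum YarnApplicationAttemptStateProto {
  APP_ATTEMPT_NEW = 1;
  APP_ATTEMPT_ALLOCATED_SAVING = 2;
  APP_ATTEMPT_FINISHING = 3;
  APP_ATTEMPT_FINISHED = 4;
}
{code}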

 [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
 --

 Key: YARN-978
 URL: https://issues.apache.org/jira/browse/YARN-978
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Mayank Bansal
Assignee: Xuan Gong
 Fix For: YARN-321

 Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch, 
 YARN-978.4.patch, YARN-978.5.patch, YARN-978.6.patch


 We don't have ApplicationAttemptReport and its Protobuf implementation.
 Adding that.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation

2013-09-10 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-978:
---

Attachment: YARN-978.7.patch

 [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
 --

 Key: YARN-978
 URL: https://issues.apache.org/jira/browse/YARN-978
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Mayank Bansal
Assignee: Xuan Gong
 Fix For: YARN-321

 Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch, 
 YARN-978.4.patch, YARN-978.5.patch, YARN-978.6.patch, YARN-978.7.patch


 We don't have ApplicationAttemptReport and its Protobuf implementation.
 Adding that.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation

2013-09-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763946#comment-13763946
 ] 

Hadoop QA commented on YARN-978:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12602505/YARN-978.7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1894//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1894//console

This message is automatically generated.

 [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
 --

 Key: YARN-978
 URL: https://issues.apache.org/jira/browse/YARN-978
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Mayank Bansal
Assignee: Xuan Gong
 Fix For: YARN-321

 Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch, 
 YARN-978.4.patch, YARN-978.5.patch, YARN-978.6.patch, YARN-978.7.patch


 We don't have ApplicationAttemptReport and its Protobuf implementation.
 Adding that.
 Thanks,
 Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-867) Isolation of failures in aux services

2013-09-10 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764007#comment-13764007
 ] 

Xuan Gong commented on YARN-867:


bq. I think we should handle AuxServicesEventType.APPLICATION_INIT and the stop 
event in Application and not container. That should simplify THIS patch a lot.

I do not see the benefit. 
When an aux service fails for a container, we need to fail that specific 
container. If we handle the AuxServicesEventType in Application, then from 
Application we still need to tell that particular container (not all the 
containers) to exit with failure. That follows the same process as handling it 
from the container directly. If there is no difference, why should we add 
traffic (more events) for the application? 
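
In other words, the container-direct path is a single event. A minimal sketch, 
reusing the same hypothetical AUX_SERVICE_FAILED event type as in the earlier 
state-machine sketch:

{code:java}
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.event.Dispatcher;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEvent;
import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerEventType;

// Sketch only; AUX_SERVICE_FAILED is a hypothetical event type.
class AuxFailureRouting {
  static void failContainer(Dispatcher dispatcher, ContainerId containerId) {
    // One event, sent straight to the affected container; its state machine
    // would map this to EXITED_WITH_FAILURE, with no Application-level hop.
    dispatcher.getEventHandler().handle(
        new ContainerEvent(containerId, ContainerEventType.AUX_SERVICE_FAILED));
  }
}
{code}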

 Isolation of failures in aux services 
 --

 Key: YARN-867
 URL: https://issues.apache.org/jira/browse/YARN-867
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, 
 YARN-867.sampleCode.2.patch


 Today, a malicious application can bring down the NM by sending bad data to a 
 service. For example, sending data to the ShuffleService such that it results 
 in any non-IOException will cause the NM's async dispatcher to exit, because 
 the service's INIT APP event is not handled properly. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira