[jira] [Commented] (YARN-170) NodeManager stop() gets called twice on shutdown

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546776#comment-13546776
 ] 

Hudson commented on YARN-170:
-

Integrated in Hadoop-Yarn-trunk #90 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/90/])
YARN-170. Change NodeManager stop to be reentrant. Contributed by Sandy 
Ryza. (Revision 1429796)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1429796
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManagerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManagerEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java


 NodeManager stop() gets called twice on shutdown
 

 Key: YARN-170
 URL: https://issues.apache.org/jira/browse/YARN-170
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.0.3-alpha

 Attachments: YARN-170-1.patch, YARN-170-20130107.txt, 
 YARN-170-2.patch, YARN-170-3.patch, YARN-170.patch


 The stop method in the NodeManager gets called twice when the NodeManager is 
 shut down via the shutdown hook.
 The first call comes directly from the shutdown hook.  The second occurs when 
 the NodeStatusUpdaterImpl is stopped: the NodeManager responds to the 
 NodeStatusUpdaterImpl's stop stateChanged event by stopping itself.  This is 
 so that the NodeStatusUpdaterImpl can notify the NodeManager to stop by 
 stopping itself in response to a request from the ResourceManager.
 This could be avoided if the NodeStatusUpdaterImpl stopped the NodeManager by 
 calling its stop method directly.
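
 A minimal sketch of what a reentrant stop() can look like (illustrative only, 
 assuming an AtomicBoolean guard; this is not the actual YARN-170 patch): the 
 second entry into stop() becomes a no-op, whether it comes from the shutdown 
 hook or from the NodeStatusUpdater-triggered path.

{code:java}
// Illustrative reentrant stop() guard (assumption: not the actual YARN-170 fix).
import java.util.concurrent.atomic.AtomicBoolean;

public class ReentrantStopExample {
  private final AtomicBoolean stopped = new AtomicBoolean(false);

  public void stop() {
    // Only the first caller wins; any later stop() call returns immediately,
    // so stopping child services happens exactly once.
    if (!stopped.compareAndSet(false, true)) {
      return;
    }
    // ... stop child services here ...
  }
}
{code}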

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Moved] (YARN-321) Generic application history service

2013-01-08 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli moved MAPREDUCE-3061 to YARN-321:
-

 Tags:   (was: mrv2, history server)
  Component/s: (was: mrv2)
Fix Version/s: (was: 0.24.0)
 Target Version/s:   (was: 0.24.0)
Affects Version/s: (was: 0.23.0)
  Key: YARN-321  (was: MAPREDUCE-3061)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

 Generic application history service
 ---

 Key: YARN-321
 URL: https://issues.apache.org/jira/browse/YARN-321
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Luke Lu
Assignee: Vinod Kumar Vavilapalli

 The mapreduce job history server currently needs to be deployed as a trusted 
 server in sync with the mapreduce runtime. Every new application would need a 
 similar application history server. Having to deploy O(T*V) trusted servers 
 (where T is the number of application types and V is the number of 
 application versions) is clearly not scalable.
 Job history storage handling itself is pretty generic: move the logs and 
 history data into a particular directory for later serving. Job history data 
 is already stored as json (or binary avro). I propose that we create only one 
 trusted application history server, which can also have a generic UI (display 
 json as a tree of strings). A specific application/version can deploy 
 untrusted webapps (a la AMs) to query the application history server and 
 interpret the json for its specific UI and/or analytics.
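
 A minimal sketch of the "display json as a tree of strings" idea (illustrative 
 only; it assumes a Jackson dependency, which is not part of the proposal):

{code:java}
// Walk arbitrary JSON and print it as an indented tree of strings
// (assumption: Jackson databind is on the classpath; illustrative only).
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Iterator;
import java.util.Map;

public class JsonTreePrinter {

  static void print(JsonNode node, String indent) {
    if (node.isObject()) {
      Iterator<Map.Entry<String, JsonNode>> fields = node.fields();
      while (fields.hasNext()) {
        Map.Entry<String, JsonNode> field = fields.next();
        System.out.println(indent + field.getKey());
        print(field.getValue(), indent + "  ");
      }
    } else if (node.isArray()) {
      for (JsonNode child : node) {
        print(child, indent + "- ");
      }
    } else {
      System.out.println(indent + node.asText());
    }
  }

  public static void main(String[] args) throws Exception {
    JsonNode history = new ObjectMapper()
        .readTree("{\"job\":{\"id\":\"job_1\",\"state\":\"SUCCEEDED\"}}");
    print(history, "");
  }
}
{code}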

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-2) Enhance CS to schedule accounting for both memory and cpu cores

2013-01-08 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-2:
-

Attachment: YARN-2.patch

 Enhance CS to schedule accounting for both memory and cpu cores
 ---

 Key: YARN-2
 URL: https://issues.apache.org/jira/browse/YARN-2
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: capacityscheduler, scheduler
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4327.patch, MAPREDUCE-4327.patch, 
 MAPREDUCE-4327.patch, MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch, 
 MAPREDUCE-4327-v4.patch, MAPREDUCE-4327-v5.patch, YARN-2-help.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch


 With YARN being a general purpose system, it would be useful for several 
 applications (MPI et al) to specify not just memory but also CPU (cores) for 
 their resource requirements. Thus, it would be useful to the 
 CapacityScheduler to account for both.
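
 A minimal sketch of what such a two-dimensional ask looks like from an 
 application's point of view (illustrative only; it assumes the 
 Resource/ResourceRequest factory methods in org.apache.hadoop.yarn.api.records, 
 whose exact signatures may differ between releases):

{code:java}
// Build a container request that specifies both memory and virtual cores
// (assumption: illustrative sketch, not code from the YARN-2 patch).
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class CpuAwareRequest {
  public static ResourceRequest build() {
    Resource capability = Resource.newInstance(2048, 4); // 2 GB and 4 vcores
    return ResourceRequest.newInstance(
        Priority.newInstance(1),  // request priority
        ResourceRequest.ANY,      // no locality constraint
        capability,
        1);                       // number of containers
  }
}
{code}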

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-2) Enhance CS to schedule accounting for both memory and cpu cores

2013-01-08 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546824#comment-13546824
 ] 

Arun C Murthy commented on YARN-2:
--

Fixed TestRMWebServicesCapacitySched (had to fix the test) - any final 
comments?

I think it's good to go. For now I'll commit after Jenkins okays it, since 
it's getting harder to maintain this largish patch. We can fix nits etc. 
post-commit. Thanks.

 Enhance CS to schedule accounting for both memory and cpu cores
 ---

 Key: YARN-2
 URL: https://issues.apache.org/jira/browse/YARN-2
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: capacityscheduler, scheduler
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4327.patch, MAPREDUCE-4327.patch, 
 MAPREDUCE-4327.patch, MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch, 
 MAPREDUCE-4327-v4.patch, MAPREDUCE-4327-v5.patch, YARN-2-help.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch


 With YARN being a general purpose system, it would be useful for several 
 applications (MPI et al) to specify not just memory but also CPU (cores) for 
 their resource requirements. Thus, it would be useful to the 
 CapacityScheduler to account for both.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-2) Enhance CS to schedule accounting for both memory and cpu cores

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546840#comment-13546840
 ] 

Hadoop QA commented on YARN-2:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563741/YARN-2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 22 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/323//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/323//console

This message is automatically generated.

 Enhance CS to schedule accounting for both memory and cpu cores
 ---

 Key: YARN-2
 URL: https://issues.apache.org/jira/browse/YARN-2
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: capacityscheduler, scheduler
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4327.patch, MAPREDUCE-4327.patch, 
 MAPREDUCE-4327.patch, MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch, 
 MAPREDUCE-4327-v4.patch, MAPREDUCE-4327-v5.patch, YARN-2-help.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch


 With YARN being a general purpose system, it would be useful for several 
 applications (MPI et al) to specify not just memory but also CPU (cores) for 
 their resource requirements. Thus, it would be useful to the 
 CapacityScheduler to account for both.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-253) Container launch may fail if no files were localized

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546845#comment-13546845
 ] 

Hadoop QA commented on YARN-253:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563743/YARN-253-20130108.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/324//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/324//console

This message is automatically generated.

 Container launch may fail if no files were localized
 

 Key: YARN-253
 URL: https://issues.apache.org/jira/browse/YARN-253
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.2-alpha
Reporter: Tom White
Assignee: Tom White
Priority: Critical
 Attachments: YARN-253-20130108.txt, YARN-253.patch, YARN-253.patch, 
 YARN-253-test.patch


 This can be demonstrated with DistributedShell. The containers running the 
 shell do not have any files to localize (if there is no shell script to copy) 
 so if they run on a different NM to the AM (which does localize files), then 
 they will fail since the appcache directory does not exist.
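
 A minimal sketch of the kind of guard that avoids this failure (illustrative 
 only; the helper below is hypothetical and is not the actual 
 DefaultContainerExecutor change): make sure the per-application directory 
 exists before launching, even when nothing was localized.

{code:java}
// Hypothetical helper: create <localDir>/usercache/<user>/appcache/<appId>
// if it is missing, so the container launch has a working directory even
// when no resources were localized (illustrative only).
import java.io.File;
import java.io.IOException;

public class EnsureAppCacheDir {
  static File ensure(File localDir, String user, String appId) throws IOException {
    File appCacheDir = new File(localDir,
        "usercache/" + user + "/appcache/" + appId);
    if (!appCacheDir.isDirectory() && !appCacheDir.mkdirs()) {
      throw new IOException("Could not create " + appCacheDir);
    }
    return appCacheDir;
  }
}
{code}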

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-170) NodeManager stop() gets called twice on shutdown

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546866#comment-13546866
 ] 

Hudson commented on YARN-170:
-

Integrated in Hadoop-Hdfs-trunk #1279 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1279/])
YARN-170. Change NodeManager stop to be reentrant. Contributed by Sandy 
Ryza. (Revision 1429796)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1429796
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManagerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManagerEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java


 NodeManager stop() gets called twice on shutdown
 

 Key: YARN-170
 URL: https://issues.apache.org/jira/browse/YARN-170
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Fix For: 2.0.3-alpha

 Attachments: YARN-170-1.patch, YARN-170-20130107.txt, 
 YARN-170-2.patch, YARN-170-3.patch, YARN-170.patch


 The stop method in the NodeManager gets called twice when the NodeManager is 
 shut down via the shutdown hook.
 The first call comes directly from the shutdown hook.  The second occurs when 
 the NodeStatusUpdaterImpl is stopped: the NodeManager responds to the 
 NodeStatusUpdaterImpl's stop stateChanged event by stopping itself.  This is 
 so that the NodeStatusUpdaterImpl can notify the NodeManager to stop by 
 stopping itself in response to a request from the ResourceManager.
 This could be avoided if the NodeStatusUpdaterImpl stopped the NodeManager by 
 calling its stop method directly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-253) Container launch may fail if no files were localized

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546875#comment-13546875
 ] 

Hudson commented on YARN-253:
-

Integrated in Hadoop-trunk-Commit #3190 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3190/])
YARN-253. Fixed container-launch to not fail when there are no local 
resources to localize. Contributed by Tom White. (Revision 1430269)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1430269
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/TestContainerManagerSecurity.java


 Container launch may fail if no files were localized
 

 Key: YARN-253
 URL: https://issues.apache.org/jira/browse/YARN-253
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.2-alpha
Reporter: Tom White
Assignee: Tom White
Priority: Critical
 Fix For: 2.0.3-alpha

 Attachments: YARN-253-20130108.txt, YARN-253.patch, YARN-253.patch, 
 YARN-253-test.patch


 This can be demonstrated with DistributedShell. The containers running the 
 shell do not have any files to localize (if there is no shell script to copy) 
 so if they run on a different NM to the AM (which does localize files), then 
 they will fail since the appcache directory does not exist.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-323) Yarn CLI commands prints classpath

2013-01-08 Thread Nishan Shetty (JIRA)
Nishan Shetty created YARN-323:
--

 Summary: Yarn CLI commands prints classpath
 Key: YARN-323
 URL: https://issues.apache.org/jira/browse/YARN-323
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.1-alpha
Reporter: Nishan Shetty
Priority: Minor


Executing ./yarn commands prints the classpath to the console.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (YARN-318) sendSignal in DefaultContainerExecutor causes invalid options error

2013-01-08 Thread Hyunsik Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyunsik Choi resolved YARN-318.
---

Resolution: Not A Problem

I didn't suspect a bug in kill because it is a very common utility, and the 
command 'kill -0 -12127' worked when I ran it in a shell. However, according 
to this bug report (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=688731), 
it was a bug in procps-ng. If we don't give the path of the kill command, the 
shell invokes bash's built-in 'kill', which is why it appeared to work.


 sendSignal in DefaultContainerExecutor causes invalid options error
 -

 Key: YARN-318
 URL: https://issues.apache.org/jira/browse/YARN-318
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.2-alpha
 Environment: * OS: MintOS 14
 ** MintOS 14 is based on Ubuntu 12.10, so this problem may also occur on 
 Ubuntu 12.10.
 * procps version: procps-ng 3.3.3
 * OpenJDK version: 7u9-2.3.3
 * Hadoop version: 2.0.2-alpha
Reporter: Hyunsik Choi

 At line 238 of DefaultContainerExecutor, the sendSignal method causes an 
 error when ContainerManagerImpl tries to kill a container. The command passed 
 to ShellCommandExecutor in sendSignal() was kill -0 -12127.
 The following message is copied from the detailMessage of the exception.
 {noformat}
 kill: invalid option -- '1'
 Usage:
  kill [options] pid [...]
 Options:
  pid [...]send signal to every pid listed
  -signal, -s, --signal signal
 specify the signal to be sent
  -l, --list=[signal]  list all signal names, or convert one to a name
  -L, --tablelist all signal names in a nice table
  -h, --help display this help and exit
  -V, --version  output version information and exit
 For more details see kill(1).
 {noformat}
 I investigated this problem a little. I found that sendSignal works well with 
 the traditional procps (http://procps.sourceforge.net/), whereas it causes 
 such an error with the procps-ng 
 (https://fedoraproject.org/wiki/Features/procps-ng) used in MintOS 14. As you 
 know, the 'kill' command is included in the procps package in most Linux 
 distributions. When I change only the 'kill' binary to the traditional one, 
 stopContainer works well.
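
 A small, self-contained demonstration of the distinction explained in the 
 resolution comment above (illustrative only; the process-group id is the 
 example value from this report): running kill through bash uses the shell 
 built-in, which accepts a negative process-group id after the signal, while 
 exec'ing the kill binary directly hands the argument to the procps-ng parser, 
 which needs an explicit "-s SIGNAL --" before it.

{code:java}
// Compare bash's built-in kill with the external /bin/kill from procps-ng
// (illustrative sketch; -12127 is the process group id quoted in this issue).
import java.io.IOException;

public class KillBuiltinVsBinary {
  public static void main(String[] args) throws IOException, InterruptedException {
    String pgid = "-12127"; // a negative pid signals the whole process group

    // bash built-in: tolerant of "kill -0 -12127"
    new ProcessBuilder("bash", "-c", "kill -0 " + pgid)
        .inheritIO().start().waitFor();

    // procps-ng /bin/kill: pass the signal with -s and end options with --
    new ProcessBuilder("/bin/kill", "-s", "0", "--", pgid)
        .inheritIO().start().waitFor();
  }
}
{code}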

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-320) RM should always be able to renew its own tokens

2013-01-08 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546988#comment-13546988
 ] 

Daryn Sharp commented on YARN-320:
--

Admittedly, extracting the renewer is a bit dubious.  It was the simplest way 
to satisfy the ADTSM checks w/o making changes to the core that would have a 
larger impact.

In 1.x I tried to make the JT avoid a loopback RPC to itself, but I don't 
recall why it was dinged.  With the current design I'm not sure there's a good 
way for the token to get access to the RM's secret manager, but yes that would 
be ideal.

Thanks, I'll wrap up the patch.

 RM should always be able to renew its own tokens
 

 Key: YARN-320
 URL: https://issues.apache.org/jira/browse/YARN-320
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-320.branch-23.patch


 YARN-280 introduced fast-fail for job submissions with bad tokens.  
 Unfortunately, other stack components like oozie and customers are acquiring 
 RM tokens with a hardcoded dummy renewer value.  These jobs would fail after 
 24 hours because the RM token couldn't be renewed, but fast-fail is failing 
 them immediately.  The RM should always be able to renew its own tokens 
 submitted with a job.  The renewer field may continue to specify an external 
 user who can renew.
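
 A minimal sketch of the policy the summary asks for (illustrative only; every 
 class and method name below is hypothetical and this is not the actual 
 YARN-320 patch): if a token was issued by this RM, renew it locally instead 
 of fast-failing because the renewer field names someone else.

{code:java}
// Hypothetical renewal policy sketch: the RM always trusts itself to renew
// tokens of its own kind, regardless of the renewer field (illustrative only).
import java.io.IOException;

public class SelfRenewSketch {

  /** Hypothetical stand-in for the RM's delegation token secret manager. */
  interface RMDelegationSecretManager {
    long renewLocally(byte[] tokenIdentifier) throws IOException;
  }

  private final RMDelegationSecretManager secretManager;
  private final String rmTokenKind;

  SelfRenewSketch(RMDelegationSecretManager secretManager, String rmTokenKind) {
    this.secretManager = secretManager;
    this.rmTokenKind = rmTokenKind;
  }

  long renew(String tokenKind, byte[] identifier, String renewer) throws IOException {
    if (rmTokenKind.equals(tokenKind)) {
      // Own token: renew via the local secret manager; the renewer field may
      // still name an external user who can also renew.
      return secretManager.renewLocally(identifier);
    }
    throw new IOException("Cannot renew token of kind " + tokenKind
        + " for renewer " + renewer);
  }
}
{code}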

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-309) Make RM provide heartbeat interval to NM

2013-01-08 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-309:
---

Attachment: YARN-309.2.patch

 Make RM provide heartbeat interval to NM
 

 Key: YARN-309
 URL: https://issues.apache.org/jira/browse/YARN-309
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-309.1.patch, YARN-309.2.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-320) RM should always be able to renew its own tokens

2013-01-08 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated YARN-320:
-

Attachment: YARN-320.branch-23.patch

Add unit tests, trunk patch forthcoming.

 RM should always be able to renew its own tokens
 

 Key: YARN-320
 URL: https://issues.apache.org/jira/browse/YARN-320
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-320.branch-23.patch, YARN-320.branch-23.patch


 YARN-280 introduced fast-fail for job submissions with bad tokens.  
 Unfortunately, other stack components like oozie and customers are acquiring 
 RM tokens with a hardcoded dummy renewer value.  These jobs would fail after 
 24 hours because the RM token couldn't be renewed, but fast-fail is failing 
 them immediately.  The RM should always be able to renew its own tokens 
 submitted with a job.  The renewer field may continue to specify an external 
 user who can renew.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-324) Provide way to preserve container directories

2013-01-08 Thread Lohit Vijayarenu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lohit Vijayarenu updated YARN-324:
--

Summary: Provide way to preserve container directories  (was: Provide way 
to preserve )

 Provide way to preserve container directories
 -

 Key: YARN-324
 URL: https://issues.apache.org/jira/browse/YARN-324
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu

 There should be a way to preserve container directories (along with 
 filecache/appcache) for offline debugging. As of today, when a container 
 completes (either success or failure) its directories get cleaned up. In case 
 of failure it becomes very hard to find out what the cause of the failure is. 
 Having the ability to preserve container directories would let one log into 
 the machine and debug the failure further.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-324) Provide way to preserve container directories

2013-01-08 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547191#comment-13547191
 ] 

Jason Lowe commented on YARN-324:
-

The nodemanager currently supports this via the 
yarn.nodemanager.delete.debug-delay-sec property.  Is that sufficient to meet 
your needs or were you thinking of something different?
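
A minimal yarn-site.xml example for that property (the 600-second value here 
is just an illustrative choice):

{code:xml}
<!-- Keep finished container directories on disk for 10 minutes before the
     NodeManager's deletion service removes them, so they can be inspected. -->
<property>
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <value>600</value>
</property>
{code}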

 Provide way to preserve container directories
 -

 Key: YARN-324
 URL: https://issues.apache.org/jira/browse/YARN-324
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager, resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Lohit Vijayarenu

 There should be a way to preserve container directories (along with 
 filecache/appcache) for offline debugging. As of today, when a container 
 completes (either success or failure) its directories get cleaned up. In case 
 of failure it becomes very hard to find out what the cause of the failure is. 
 Having the ability to preserve container directories would let one log into 
 the machine and debug the failure further.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-320) RM should always be able to renew its own tokens

2013-01-08 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated YARN-320:
-

Attachment: YARN-320.patch

Patch for trunk and branch-2.

 RM should always be able to renew its own tokens
 

 Key: YARN-320
 URL: https://issues.apache.org/jira/browse/YARN-320
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-320.branch-23.patch, YARN-320.branch-23.patch, 
 YARN-320.patch


 YARN-280 introduced fast-fail for job submissions with bad tokens.  
 Unfortunately, other stack components like oozie and customers are acquiring 
 RM tokens with a hardcoded dummy renewer value.  These jobs would fail after 
 24 hours because the RM token couldn't be renewed, but fast-fail is failing 
 them immediately.  The RM should always be able to renew its own tokens 
 submitted with a job.  The renewer field may continue to specify an external 
 user who can renew.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-320) RM should always be able to renew its own tokens

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547246#comment-13547246
 ] 

Hadoop QA commented on YARN-320:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563814/YARN-320.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/325//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/325//console

This message is automatically generated.

 RM should always be able to renew its own tokens
 

 Key: YARN-320
 URL: https://issues.apache.org/jira/browse/YARN-320
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-320.branch-23.patch, YARN-320.branch-23.patch, 
 YARN-320.patch


 YARN-280 introduced fast-fail for job submissions with bad tokens.  
 Unfortunately, other stack components like oozie and customers are acquiring 
 RM tokens with a hardcoded dummy renewer value.  These jobs would fail after 
 24 hours because the RM token couldn't be renewed, but fast-fail is failing 
 them immediately.  The RM should always be able to renew its own tokens 
 submitted with a job.  The renewer field may continue to specify an external 
 user who can renew.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-193) Scheduler.normalizeRequest does not account for allocation requests that exceed maximumAllocation limits

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547248#comment-13547248
 ] 

Hadoop QA commented on YARN-193:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563808/YARN-193.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/326//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/326//console

This message is automatically generated.

 Scheduler.normalizeRequest does not account for allocation requests that 
 exceed maximumAllocation limits 
 -

 Key: YARN-193
 URL: https://issues.apache.org/jira/browse/YARN-193
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.0.0, 2.0.3-alpha
Reporter: Hitesh Shah
Assignee: Hitesh Shah
Priority: Critical
 Attachments: MR-3796.1.patch, MR-3796.2.patch, MR-3796.3.patch, 
 MR-3796.wip.patch, YARN-193.4.patch
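
 A minimal sketch of request normalization that also applies the 
 maximumAllocation limit the summary refers to (a hypothetical helper, not the 
 actual scheduler code; the real fix may instead reject oversized requests):

{code:java}
// Round a memory ask up to the minimum increment and cap it at the maximum
// allocation (hypothetical sketch, not Scheduler.normalizeRequest itself).
public class NormalizeSketch {
  static int normalizeMemory(int requestedMB, int minimumMB, int maximumMB) {
    int roundedUp = ((requestedMB + minimumMB - 1) / minimumMB) * minimumMB;
    return Math.min(roundedUp, maximumMB); // the missing maximumAllocation check
  }

  public static void main(String[] args) {
    // e.g. a 20 GB ask against a 1 GB increment and an 8 GB maximum
    System.out.println(normalizeMemory(20480, 1024, 8192)); // 8192
  }
}
{code}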




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-142) Change YARN APIs to throw IOException

2013-01-08 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-142:
---

Attachment: YARN-142.3.patch

 Change YARN APIs to throw IOException
 -

 Key: YARN-142
 URL: https://issues.apache.org/jira/browse/YARN-142
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 0.23.3, 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-142.1.patch, YARN-142.2.patch, YARN-142.3.patch


 Ref: MAPREDUCE-4067
 All YARN APIs currently throw YarnRemoteException.
 1) This cannot be extended in its current form.
 2) The RPC layer can throw IOExceptions. These end up showing up as 
 UndeclaredThrowableExceptions.
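
 A small, self-contained JDK example of the second point (not YARN code): a 
 checked exception thrown inside a dynamic proxy whose interface method does 
 not declare it is wrapped in UndeclaredThrowableException, which is how 
 RPC-layer IOExceptions end up surfacing to callers.

{code:java}
// Demonstrate why an undeclared IOException surfaces as
// UndeclaredThrowableException when thrown through a dynamic proxy.
import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.lang.reflect.UndeclaredThrowableException;

public class UndeclaredThrowableDemo {

  interface Protocol {
    String call(); // declares no checked exceptions, like many YARN API methods
  }

  public static void main(String[] args) {
    InvocationHandler handler = new InvocationHandler() {
      @Override
      public Object invoke(Object proxy, Method method, Object[] methodArgs)
          throws Throwable {
        throw new IOException("connection reset"); // what the RPC layer may throw
      }
    };
    Protocol p = (Protocol) Proxy.newProxyInstance(
        Protocol.class.getClassLoader(), new Class<?>[] {Protocol.class}, handler);
    try {
      p.call();
    } catch (UndeclaredThrowableException e) {
      System.out.println("Wrapped cause: " + e.getCause()); // the IOException
    }
  }
}
{code}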

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-2) Enhance CS to schedule accounting for both memory and cpu cores

2013-01-08 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547662#comment-13547662
 ] 

Hudson commented on YARN-2:
---

Integrated in Hadoop-trunk-Commit #3200 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3200/])
MAPREDUCE-4520. Added support for MapReduce applications to request for CPU 
cores along-with memory post YARN-2. Contributed by Arun C. Murthy. (Revision 
1430688)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1430688
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/YARNRunner.java


 Enhance CS to schedule accounting for both memory and cpu cores
 ---

 Key: YARN-2
 URL: https://issues.apache.org/jira/browse/YARN-2
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: capacityscheduler, scheduler
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4327.patch, MAPREDUCE-4327.patch, 
 MAPREDUCE-4327.patch, MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch, 
 MAPREDUCE-4327-v4.patch, MAPREDUCE-4327-v5.patch, YARN-2-help.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch


 With YARN being a general purpose system, it would be useful for several 
 applications (MPI et al) to specify not just memory but also CPU (cores) for 
 their resource requirements. Thus, it would be useful to the 
 CapacityScheduler to account for both.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-2) Enhance CS to schedule accounting for both memory and cpu cores

2013-01-08 Thread caolong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547664#comment-13547664
 ] 

caolong commented on YARN-2:


Great patch. All right, what are we planning to do about the FairScheduler?

 Enhance CS to schedule accounting for both memory and cpu cores
 ---

 Key: YARN-2
 URL: https://issues.apache.org/jira/browse/YARN-2
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: capacityscheduler, scheduler
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Fix For: 2.0.3-alpha

 Attachments: MAPREDUCE-4327.patch, MAPREDUCE-4327.patch, 
 MAPREDUCE-4327.patch, MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch, 
 MAPREDUCE-4327-v4.patch, MAPREDUCE-4327-v5.patch, YARN-2-help.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, YARN-2.patch, 
 YARN-2.patch, YARN-2.patch, YARN-2.patch


 With YARN being a general purpose system, it would be useful for several 
 applications (MPI et al) to specify not just memory but also CPU (cores) for 
 their resource requirements. Thus, it would be useful to the 
 CapacityScheduler to account for both.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-325) RM CapacityScheduler can deadlock when getQueueInfo() is called and a container is completing

2013-01-08 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reassigned YARN-325:
--

Assignee: Arun C Murthy

 RM CapacityScheduler can deadlock when getQueueInfo() is called and a 
 container is completing
 -

 Key: YARN-325
 URL: https://issues.apache.org/jira/browse/YARN-325
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Arun C Murthy
Priority: Critical

 If a client calls getQueueInfo on a parent queue (e.g.: the root queue) and 
 containers are completing then the RM can deadlock.  getQueueInfo() locks the 
 ParentQueue and then calls the child queues' getQueueInfo() methods in turn.  
 However when a container completes, it locks the LeafQueue then calls back 
 into the ParentQueue.  When the two mix, it's a recipe for deadlock.
 Stacktrace to follow.
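
 A minimal, self-contained illustration of the lock ordering described above 
 (not the actual CapacityScheduler code): one thread takes the parent lock and 
 then the child lock, the other takes them in the opposite order, and the two 
 can block each other forever.

{code:java}
// Classic AB-BA deadlock shaped like the getQueueInfo()/completedContainer()
// interaction described in this issue (illustrative only).
public class QueueDeadlockSketch {
  static final Object parentQueueLock = new Object();
  static final Object leafQueueLock = new Object();

  public static void main(String[] args) {
    Thread getQueueInfo = new Thread(() -> {
      synchronized (parentQueueLock) {      // like ParentQueue.getQueueInfo()
        pause();
        synchronized (leafQueueLock) { }    // ... then asks each child queue
      }
    });
    Thread containerCompleted = new Thread(() -> {
      synchronized (leafQueueLock) {        // like LeafQueue on completion
        pause();
        synchronized (parentQueueLock) { }  // ... then calls back into the parent
      }
    });
    getQueueInfo.start();
    containerCompleted.start();             // the pair will usually hang here
  }

  static void pause() {
    try { Thread.sleep(100); } catch (InterruptedException ignored) { }
  }
}
{code}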

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-325) RM CapacityScheduler can deadlock when getQueueInfo() is called and a container is completing

2013-01-08 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-325:
---

Priority: Blocker  (was: Critical)

 RM CapacityScheduler can deadlock when getQueueInfo() is called and a 
 container is completing
 -

 Key: YARN-325
 URL: https://issues.apache.org/jira/browse/YARN-325
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Arun C Murthy
Priority: Blocker

 If a client calls getQueueInfo on a parent queue (e.g.: the root queue) and 
 containers are completing then the RM can deadlock.  getQueueInfo() locks the 
 ParentQueue and then calls the child queues' getQueueInfo() methods in turn.  
 However when a container completes, it locks the LeafQueue then calls back 
 into the ParentQueue.  When the two mix, it's a recipe for deadlock.
 Stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-325) RM CapacityScheduler can deadlock when getQueueInfo() is called and a container is completing

2013-01-08 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-325:
---

Attachment: YARN-325.patch

Illustrative patch, need to fix unit-tests yet.

 RM CapacityScheduler can deadlock when getQueueInfo() is called and a 
 container is completing
 -

 Key: YARN-325
 URL: https://issues.apache.org/jira/browse/YARN-325
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: YARN-325.patch


 If a client calls getQueueInfo on a parent queue (e.g.: the root queue) and 
 containers are completing then the RM can deadlock.  getQueueInfo() locks the 
 ParentQueue and then calls the child queues' getQueueInfo() methods in turn.  
 However when a container completes, it locks the LeafQueue then calls back 
 into the ParentQueue.  When the two mix, it's a recipe for deadlock.
 Stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-325) RM CapacityScheduler can deadlock when getQueueInfo() is called and a container is completing

2013-01-08 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-325:
---

Attachment: YARN-325.patch

Added unit-tests.

 RM CapacityScheduler can deadlock when getQueueInfo() is called and a 
 container is completing
 -

 Key: YARN-325
 URL: https://issues.apache.org/jira/browse/YARN-325
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: YARN-325.patch, YARN-325.patch


 If a client calls getQueueInfo on a parent queue (e.g.: the root queue) and 
 containers are completing then the RM can deadlock.  getQueueInfo() locks the 
 ParentQueue and then calls the child queues' getQueueInfo() methods in turn.  
 However when a container completes, it locks the LeafQueue then calls back 
 into the ParentQueue.  When the two mix, it's a recipe for deadlock.
 Stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-325) RM CapacityScheduler can deadlock when getQueueInfo() is called and a container is completing

2013-01-08 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547734#comment-13547734
 ] 

Hadoop QA commented on YARN-325:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12563898/YARN-325.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/328//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/328//console

This message is automatically generated.

 RM CapacityScheduler can deadlock when getQueueInfo() is called and a 
 container is completing
 -

 Key: YARN-325
 URL: https://issues.apache.org/jira/browse/YARN-325
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Arun C Murthy
Priority: Blocker
 Attachments: YARN-325.patch, YARN-325.patch


 If a client calls getQueueInfo on a parent queue (e.g.: the root queue) and 
 containers are completing then the RM can deadlock.  getQueueInfo() locks the 
 ParentQueue and then calls the child queues' getQueueInfo() methods in turn.  
 However when a container completes, it locks the LeafQueue then calls back 
 into the ParentQueue.  When the two mix, it's a recipe for deadlock.
 Stacktrace to follow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira