[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908180#comment-13908180
 ] 

Hudson commented on YARN-1398:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #488 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/488/])
YARN-1398. Fixed a deadlock in ResourceManager between users requesting 
queue-acls and completing containers. Contributed by Vinod Kumar Vavilapalli. 
(vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570415)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java


 Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo 
 and completedConatiner call
 ---

 Key: YARN-1398
 URL: https://issues.apache.org/jira/browse/YARN-1398
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sunil G
Assignee: Vinod Kumar Vavilapalli
Priority: Blocker
 Fix For: 2.4.0

 Attachments: YARN-1398-20140220.txt


 getQueueInfo in ParentQueue calls child.getQueueInfo().
 This tries to acquire the leaf queue lock while holding the parent queue lock.
 If a completedContainer call arrives at the same time, it acquires the LeafQueue 
 lock and then waits for the parent queue lock in ParentQueue's completedContainer call.
 The two paths take the locks in opposite order, which can lead to a deadlock.
 JCarder reports this as a potential deadlock scenario.
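
 The lock-order inversion described above can be reproduced in isolation. The
 following is only a minimal sketch, not code from the ResourceManager: the two
 ReentrantLock fields are hypothetical stand-ins for the ParentQueue and
 LeafQueue monitors, and tryLock() is used instead of a blocking acquire so the
 demo reports the conflict rather than hanging.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderInversionDemo {
    // Hypothetical stand-ins for the ParentQueue and LeafQueue monitors.
    static final ReentrantLock parentLock = new ReentrantLock();
    static final ReentrantLock leafLock = new ReentrantLock();

    // Takes `first`, waits until the peer thread holds its own first lock,
    // then makes one non-blocking attempt at `second`.
    static boolean holdFirstThenTrySecond(ReentrantLock first,
                                          ReentrantLock second,
                                          CyclicBarrier barrier) {
        first.lock();
        try {
            barrier.await();                 // both threads now hold their first lock
            boolean acquired = second.tryLock();
            if (acquired) {
                second.unlock();
            }
            barrier.await();                 // keep holding until the peer has tried too
            return acquired;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            first.unlock();
        }
    }

    // Returns true when neither path could obtain its second lock. With
    // blocking acquires, that situation would be a deadlock.
    public static boolean inversionBlocksBothPaths() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        boolean[] acquired = new boolean[2];
        // Path 1 mimics getQueueInfo: parent first, then leaf.
        Thread getQueueInfo = new Thread(
            () -> acquired[0] = holdFirstThenTrySecond(parentLock, leafLock, barrier));
        // Path 2 mimics completedContainer: leaf first, then parent.
        Thread completedContainer = new Thread(
            () -> acquired[1] = holdFirstThenTrySecond(leafLock, parentLock, barrier));
        getQueueInfo.start();
        completedContainer.start();
        try {
            getQueueInfo.join();
            completedContainer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return !acquired[0] && !acquired[1];
    }

    public static void main(String[] args) {
        System.out.println("both paths blocked: " + inversionBlocksBothPaths());
    }
}
```

 The barrier forces the worst-case interleaving on every run. In a live
 ResourceManager the same interleaving would presumably depend on timing, which
 may be why a lock-graph tool such as JCarder flags it only as a potential
 deadlock.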



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908319#comment-13908319
 ] 

Hudson commented on YARN-1398:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1680 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1680/])
YARN-1398. Fixed a deadlock in ResourceManager between users requesting 
queue-acls and completing containers. Contributed by Vinod Kumar Vavilapalli. 
(vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570415)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java




[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13908397#comment-13908397
 ] 

Hudson commented on YARN-1398:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1705 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1705/])
YARN-1398. Fixed a deadlock in ResourceManager between users requesting 
queue-acls and completing containers. Contributed by Vinod Kumar Vavilapalli. 
(vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570415)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java




[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-20 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907709#comment-13907709
 ] 

Arun C Murthy commented on YARN-1398:
-

+1 lgtm.

I think this was an oversight in YARN-569. Let's get this in asap.



[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-20 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907713#comment-13907713
 ] 

Jian He commented on YARN-1398:
---

took a look also,  lgtm, + 1



[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907773#comment-13907773
 ] 

Hadoop QA commented on YARN-1398:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
http://issues.apache.org/jira/secure/attachment/12630180/YARN-1398-20140220.txt
against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3138//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3138//console

This message is automatically generated.



[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907810#comment-13907810
 ] 

Hadoop QA commented on YARN-1398:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
http://issues.apache.org/jira/secure/attachment/12630180/YARN-1398-20140220.txt
against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3140//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3140//console

This message is automatically generated.



[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-20 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907874#comment-13907874
 ] 

Vinod Kumar Vavilapalli commented on YARN-1398:
---

Tx for the quick reviews, [~jianhe] and [~acmurthy]. I am checking this in now.



[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-02-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907911#comment-13907911
 ] 

Hudson commented on YARN-1398:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5201 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5201/])
YARN-1398. Fixed a deadlock in ResourceManager between users requesting 
queue-acls and completing containers. Contributed by Vinod Kumar Vavilapalli. 
(vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570415)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java




[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-01-02 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861194#comment-13861194
 ] 

Sunil G commented on YARN-1398:
---

As per YARN-325, this issue was fixed before 2.1.0. But in 2.1.0 we can see 
that ParentQueue.completedContainer is called while holding a lock on the 
LeafQueue.

This can cause the same issue that is described in YARN-325.

Is there any reason why the ParentQueue.completedContainer call was added back 
while holding the lock on the leaf queue? The YARN-325 fix was to remove exactly 
that, and this is mentioned in the comments too.

 Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo 
 and completedConatiner call
 ---

 Key: YARN-1398
 URL: https://issues.apache.org/jira/browse/YARN-1398
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sunil G
Priority: Critical



[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-01-02 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861236#comment-13861236
 ] 

Sunil G commented on YARN-1398:
---

As part of the YARN-569 change for adding a scheduling policy, the code 
segment below was added back inside the LeafQueue lock section:

  // Inform the parent queue
  getParent().completedContainer(clusterResource, application,
  node, rmContainer, null, event, this);

Please let me know whether this call is really required inside the synchronized 
block of the LeafQueue completedContainer call.
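
One generic way to avoid calling the parent under the leaf lock is to finish
the leaf-local bookkeeping inside the synchronized block and inform the parent
only after the leaf monitor has been released. The sketch below illustrates
that pattern under stated assumptions: the Parent and Leaf classes, their
fields, and the simplified completedContainer signatures are hypothetical and
are not the real CapacityScheduler classes, and nothing in this thread shows
whether the attached patch takes this exact approach.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class NotifyParentOutsideLock {
    // Hypothetical, heavily simplified stand-in for ParentQueue.
    static class Parent {
        private int completed;
        // Set if a caller ever arrives while still holding the leaf monitor.
        final AtomicBoolean calledWithLeafLocked = new AtomicBoolean(false);

        synchronized void completedContainer(Leaf leaf) {
            if (Thread.holdsLock(leaf)) {
                calledWithLeafLocked.set(true);
            }
            completed++;
        }

        synchronized int getCompleted() {
            return completed;
        }
    }

    // Hypothetical, heavily simplified stand-in for LeafQueue.
    static class Leaf {
        private final Parent parent;
        private int running;

        Leaf(Parent parent, int running) {
            this.parent = parent;
            this.running = running;
        }

        void completedContainer() {
            synchronized (this) {
                running--;                   // leaf-local bookkeeping only
            }
            // The leaf monitor is released here, so this thread never waits
            // for the parent monitor while holding the leaf monitor.
            parent.completedContainer(this);
        }

        synchronized int getRunning() {
            return running;
        }
    }

    public static void main(String[] args) {
        Parent parent = new Parent();
        Leaf leaf = new Leaf(parent, 2);
        leaf.completedContainer();
        System.out.println("running=" + leaf.getRunning()
            + " completed=" + parent.getCompleted()
            + " leafLockHeld=" + parent.calledWithLeafLocked.get());
    }
}
```

The trade-off is that the leaf and parent updates are no longer atomic with
respect to each other, so a reader may briefly observe the leaf updated and the
parent not yet updated. Whether that is acceptable depends on what the callers
expect.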



[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2013-11-19 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826421#comment-13826421
 ] 

Rohith Sharma K S commented on YARN-1398:
-

Hi Sunil, 
I think this is the same as https://issues.apache.org/jira/i#browse/YARN-325. 

--
This message was sent by Atlassian JIRA
(v6.1#6144)