
[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-11-20 Thread Hudson (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827538#comment-13827538
]

Hudson commented on YARN-744:
-

SUCCESS: Integrated in Hadoop-Yarn-trunk #397 (See
[https://builds.apache.org/job/Hadoop-Yarn-trunk/397/])
YARN-744. Race condition in ApplicationMasterService.allocate .. It might
process same allocate request twice resulting in additional containers getting
allocated. (Omkar Vinit Joshi via bikas) (bikas:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543707)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
*
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java

Race condition in ApplicationMasterService.allocate .. It might process same
allocate request twice resulting in additional containers getting allocated.
-

Key: YARN-744
URL: https://issues.apache.org/jira/browse/YARN-744
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Bikas Saha
Assignee: Omkar Vinit Joshi
Priority: Minor
Attachments: MAPREDUCE-3899-branch-0.23.patch,
YARN-744-20130711.1.patch, YARN-744-20130715.1.patch,
YARN-744-20130726.1.patch, YARN-744.1.patch, YARN-744.2.patch, YARN-744.patch

The lock taken in ApplicationMasterService.allocate looks broken. It takes a
lock on the lastResponse object and then puts a new lastResponse object into
the map. At this point a new thread entering this function will get the new
lastResponse object, will be able to take its lock, and will enter the critical
section while the first thread is still inside it. Presumably we want to limit
processing to one response per app attempt at a time. So the lock could be
taken on the ApplicationAttemptId key of the response map object.
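
To illustrate the description above, here is a minimal Java sketch of locking on a holder object that stays in the map for the lifetime of an attempt, so that every thread serving the same attempt contends for the same monitor. The names ResponseHolder and AllocateSketch, the String attempt key, the integer response id, and the -1 return value are hypothetical stand-ins chosen for illustration; this is not the actual ApplicationMasterService code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical holder: the object itself is never replaced in the map,
// only its content changes, so its monitor is stable per attempt.
final class ResponseHolder {
  private int lastResponseId; // stands in for the last AllocateResponse

  int getLastResponseId() { return lastResponseId; }
  void setLastResponseId(int id) { lastResponseId = id; }
}

final class AllocateSketch {
  private final ConcurrentMap<String, ResponseHolder> responseMap =
      new ConcurrentHashMap<>();

  void register(String attemptId) {
    responseMap.putIfAbsent(attemptId, new ResponseHolder());
  }

  // Returns the new response id, or -1 (assumed convention) when the
  // attempt is unknown or the request repeats an already processed id.
  int allocate(String attemptId, int requestResponseId) {
    ResponseHolder holder = responseMap.get(attemptId);
    if (holder == null) {
      return -1;
    }
    synchronized (holder) {
      if (requestResponseId != holder.getLastResponseId()) {
        return -1; // duplicate or stale request, not processed again
      }
      int next = requestResponseId + 1;
      holder.setLastResponseId(next); // update in place, keep the same lock object
      return next;
    }
  }
}
```

Because the holder is updated in place rather than swapped for a new map value, a second thread carrying the same request blocks on the same monitor and then sees the advanced response id.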

--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-11-20 Thread Hudson (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827627#comment-13827627
]

Hudson commented on YARN-744:
-

FAILURE: Integrated in Hadoop-Hdfs-trunk #1588 (See
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1588/])
YARN-744. Race condition in ApplicationMasterService.allocate .. It might
process same allocate request twice resulting in additional containers getting
allocated. (Omkar Vinit Joshi via bikas) (bikas:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543707)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
*
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-11-20 Thread Hudson (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827639#comment-13827639
]

Hudson commented on YARN-744:
-

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1614 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1614/])
YARN-744. Race condition in ApplicationMasterService.allocate .. It might
process same allocate request twice resulting in additional containers getting
allocated. (Omkar Vinit Joshi via bikas) (bikas:
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543707)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
*
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-11-19 Thread Omkar Vinit Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826974#comment-13826974
 ] 

Omkar Vinit Joshi commented on YARN-744:


Thanks [~bikassaha], I have addressed your comments. Attaching a new patch.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-11-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826997#comment-13826997
 ] 

Hadoop QA commented on YARN-744:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12614693/YARN-744.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2486//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2486//console

This message is automatically generated.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-11-19 Thread Omkar Vinit Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827011#comment-13827011
 ] 

Omkar Vinit Joshi commented on YARN-744:


Test failure is not related to this. Opened ticket YARN-1425 to track this.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-11-18 Thread Bikas Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825989#comment-13825989
 ] 

Bikas Saha commented on YARN-744:
-

Better name?
{code}
+AllocateResponseLock res = responseMap.get(applicationAttemptId);
{code}

reuse throwApplicationAttemptDoesNotExistInCacheException() in 
registerApplicationMaster()?

use InvalidApplicationMasterRequestException or a new specific exception 
instead of generic RPCUtil.throwRemoteException()?
{code}
+  private void throwApplicationAttemptDoesNotExistInCacheException(
+      ApplicationAttemptId appAttemptId) throws YarnException {
+    String message = "Application doesn't exist in cache "
+        + appAttemptId;
+    LOG.error(message);
+    throw RPCUtil.getRemoteException(message);
+  }
{code}

The new logic is not the same as the old one. Previously, if the app attempt
was no longer in the cache, allocate would send a resync response. Now it will
send a regular response instead of a resync response.
{code}
-      // before returning response, verify in sync
-      AllocateResponse oldResponse =
-          responseMap.put(appAttemptId, allocateResponse);
-      if (oldResponse == null) {
-        // appAttempt got unregistered, remove it back out
-        responseMap.remove(appAttemptId);
-        String message = "App Attempt removed from the cache during allocate"
-            + appAttemptId;
-        LOG.error(message);
-        return resync;
-      }
-
+      res.setAllocateResponse(allocateResponse);
       return allocateResponse;
{code}
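
One way the earlier resync behaviour could be retained together with a per-attempt lock holder is to re-check, just before returning, that the attempt is still registered under the same holder. The following Java sketch is hypothetical: ResyncSketch, Holder, the String responses, and the Runnable hook that simulates work done during allocate are assumptions for illustration and do not come from the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class ResyncSketch {
  static final String RESYNC = "RESYNC"; // stands in for the resync response

  // Hypothetical lock holder; the String field stands in for AllocateResponse.
  static final class Holder {
    String lastResponse = "initial";
  }

  private final ConcurrentMap<String, Holder> responseMap =
      new ConcurrentHashMap<>();

  void register(String attemptId) {
    responseMap.putIfAbsent(attemptId, new Holder());
  }

  void unregister(String attemptId) {
    responseMap.remove(attemptId);
  }

  // duringAllocate simulates whatever happens while the request is processed,
  // for example a concurrent unregistration of the attempt.
  String allocate(String attemptId, String newResponse, Runnable duringAllocate) {
    Holder holder = responseMap.get(attemptId);
    if (holder == null) {
      return RESYNC; // attempt not in the cache at all
    }
    synchronized (holder) {
      duringAllocate.run();
      // Before returning, verify the attempt is still registered with this holder.
      if (responseMap.get(attemptId) != holder) {
        return RESYNC; // attempt was removed while the request was in flight
      }
      holder.lastResponse = newResponse;
      return newResponse;
    }
  }
}
```

The identity comparison on the holder also covers the case where the attempt was removed and registered again while the request was in flight.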


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-11-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820921#comment-13820921
 ] 

Hadoop QA commented on YARN-744:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12613497/YARN-744.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2433//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2433//console

This message is automatically generated.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-26 Thread Omkar Vinit Joshi (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721282#comment-13721282
]

Omkar Vinit Joshi commented on YARN-744:

Thanks [~bikassaha] ...

bq. AllocateResponseWrapper res
how about AllocateResponseLock??

bq. If the wrapper exists then how can the lastResponse be null?
You are right; we no longer need this, so I am removing it.

yeah the test won't actually be able to simulate the race condition mentioned
above. Can't think of any other test. Attaching it without a test.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13721300#comment-13721300
 ] 

Hadoop QA commented on YARN-744:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12594461/YARN-744-20130726.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/1591//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1591//console

This message is automatically generated.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-24 Thread Bikas Saha (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13719003#comment-13719003
 ] 

Bikas Saha commented on YARN-744:
-

Patch needs rebase. Probably reuse the recently added UnknownApplication 
exception.
Can we pick a better name?
{code}AllocateResponseWrapper res{code}

If the wrapper exists then how can the lastResponse be null?
{code}
+    synchronized (res) {
+      AllocateResponse lastResponse = res.getAllocateResponse();
+      if (lastResponse == null) {
+        LOG.error("AppAttemptId doesnt exist in cache " + appAttemptId);
+        return resync;
+      }
{code}

I don't fully follow what the test is doing. Does it fail without the
change?

On the test code itself:
Is there another way to make the test work other than making a method
protected only for test override purposes?
Can you make it package private instead of protected? protected carries
connotations for derived classes.
Can we avoid the long sleep() calls and use wait-notify or countdown latches so
that we don't waste time in the test? They also make the test less flaky,
because the outcome no longer depends on timing.
{code}
+      @Override
+      protected void authorizeRequest(ApplicationAttemptId appAttemptID)
+          throws YarnException {
+        int interval = 10;
+        count.incrementAndGet();
+        while (count.get() == 1 && interval-- > 0) {
+          try {
+            Thread.sleep(1000);
+          } catch (InterruptedException e) {}
+        }
+        Assert.assertTrue(count.get() > 1);
+      }
+    };
{code}
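
The latch suggestion above can be sketched roughly as follows. This is a generic Java illustration, not the test from the patch: LatchCoordination and runConcurrently are hypothetical names, and the point where each worker waits stands in for the overridden authorizeRequest hook.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class LatchCoordination {
  // Starts the given number of workers. Each worker announces its arrival and
  // then waits until all workers have arrived, instead of polling with sleep().
  // Returns how many workers got past the rendezvous point.
  static int runConcurrently(int threads, long timeoutSeconds) {
    CountDownLatch allArrived = new CountDownLatch(threads);
    CountDownLatch allDone = new CountDownLatch(threads);
    AtomicInteger passed = new AtomicInteger();

    for (int i = 0; i < threads; i++) {
      Thread t = new Thread(() -> {
        try {
          allArrived.countDown();
          // Released as soon as the last worker arrives; the timeout only
          // bounds the wait if something goes wrong.
          if (allArrived.await(timeoutSeconds, TimeUnit.SECONDS)) {
            passed.incrementAndGet();
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          allDone.countDown();
        }
      });
      t.setDaemon(true);
      t.start();
    }

    try {
      allDone.await(timeoutSeconds * 2, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return passed.get();
  }
}
```

With a latch the workers proceed the moment the last one arrives, so the test neither waits out a fixed sleep interval nor depends on the interval being long enough.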



[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-18 Thread Omkar Vinit Joshi (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713206#comment-13713206
]

Omkar Vinit Joshi commented on YARN-744:

[~bikassaha] yes... there is a similar but different bug though, so
[~mayank_bansal] is fixing it. There we are computing the response and then
updating RMNodeImpl asynchronously. If this approach is correct then we can do
a similar thing after YARN-245 is in.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-15 Thread Zhijie Shen (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13708761#comment-13708761
]

Zhijie Shen commented on YARN-744:
--

The passed in appAttemptId for an app currently seems to be the same object,
such that it can be used for synchronized blocks, but I agree with the idea
of a wrapper, because it is more predictable and stand-alone in
ApplicationMasterService.

BTW, is it convenient to write a test case for concurrent allocation? Like
TestClientRMService#testConcurrentAppSubmit.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-15 Thread Omkar Vinit Joshi (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709064#comment-13709064
]

Omkar Vinit Joshi commented on YARN-744:

bq. BTW, is it convenient to write a test case for concurrent allocation? Like
TestClientRMService#testConcurrentAppSubmit.
yeah wrote one...

bq. The passed in appAttemptId for an app currently seems to be the same
object, such that it can be used for synchronized blocks, but I agree with
the idea of a wrapper, because it is more predictable and stand-alone in
ApplicationMasterService.
locking on appAttemptId in case of allocate / RegisterApplicationMaster call
won't work. They are coming from client...can't guarantee that they are
identical in terms of grabbing a lock.. thoughts?
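
The concern about locking on ids that arrive from the client can be shown with a small Java sketch, assuming each call carries its own id instance. AttemptKey is a hypothetical value type standing in for ApplicationAttemptId, and MonitorIdentityDemo is an illustrative helper; neither is part of YARN.

```java
import java.util.Objects;

// Hypothetical value type standing in for ApplicationAttemptId.
final class AttemptKey {
  private final long clusterTimestamp;
  private final int attemptId;

  AttemptKey(long clusterTimestamp, int attemptId) {
    this.clusterTimestamp = clusterTimestamp;
    this.attemptId = attemptId;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof AttemptKey)) {
      return false;
    }
    AttemptKey other = (AttemptKey) o;
    return clusterTimestamp == other.clusterTimestamp
        && attemptId == other.attemptId;
  }

  @Override
  public int hashCode() {
    return Objects.hash(clusterTimestamp, attemptId);
  }
}

final class MonitorIdentityDemo {
  // Returns true if holding the monitor of 'locked' also means holding the
  // monitor of 'other'. Monitors belong to object identity, not to equals().
  static boolean sharesMonitor(Object locked, Object other) {
    synchronized (locked) {
      return Thread.holdsLock(other);
    }
  }
}
```

Two keys that are equal but are distinct instances do not share a monitor, so two threads synchronizing on them would not exclude each other. A server-side object looked up by key, such as a map value, gives every caller the same instance to lock on.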


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-15 Thread Zhijie Shen (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709072#comment-13709072
]

Zhijie Shen commented on YARN-744:
--

bq. locking on appAttemptId in case of allocate / RegisterApplicationMaster
call won't work. They are coming from client...can't guarantee that they are
identical in terms of grabbing a lock.. thoughts?

I meant that AMRMClient uses the same appAttemptId, but the uniqueness is not
guaranteed, so I agree with the self-contained lock wrapper.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-15 Thread Bikas Saha (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709091#comment-13709091
]

Bikas Saha commented on YARN-744:
-

Why do we need a wrapper?
We should not be locking on the app attempt id. We should try to find some
internal RM object that's unique for the app attempt and lock on that. Also
avoid locking the RMAppAttemptImpl object itself since it will block the
internal async dispatcher.


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-15 Thread Bikas Saha (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709096#comment-13709096
]

Bikas Saha commented on YARN-744:
-

btw. it does not look like this is a practical problem. Until we start seeing a
few instances of this happening we should probably lower the priority of this
jira. I will do that now. Please change it if you think otherwise. A bug that
does not manifest itself is not a bug :P


[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-15 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13709147#comment-13709147
 ] 

Hadoop QA commented on YARN-744:


{color:green}+1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12592426/YARN-744-20130715.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1485//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1485//console

This message is automatically generated.

 Race condition in ApplicationMasterService.allocate .. It might process same 
 allocate request twice resulting in additional containers getting allocated.
 -

 Key: YARN-744
 URL: https://issues.apache.org/jira/browse/YARN-744
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Bikas Saha
Assignee: Omkar Vinit Joshi
Priority: Minor
 Attachments: MAPREDUCE-3899-branch-0.23.patch, 
 YARN-744-20130711.1.patch, YARN-744-20130715.1.patch, YARN-744.patch


 Looks like the lock taken in this is broken. It takes a lock on lastResponse 
 object and then puts a new lastResponse object into the map. At this point a 
 new thread entering this function will get a new lastResponse object and will 
 be able to take its lock and enter the critical section. Presumably we want 
 to limit one response per app attempt. So the lock could be taken on the 
 ApplicationAttemptId key of the response map object.
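
A minimal Java sketch of the broken pattern described above. It is not the actual ApplicationMasterService code: the class name `BrokenLockSketch`, the `Response` type, and the `String` attempt key are hypothetical stand-ins. It only illustrates that once the map entry is replaced inside the critical section, the monitor the current thread holds is no longer the monitor of the object a newly arriving thread would fetch and lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class BrokenLockSketch {
  // Hypothetical stand-in for the last response kept per application attempt.
  static final class Response {
    final int responseId;
    Response(int responseId) { this.responseId = responseId; }
  }

  // Hypothetical response map, keyed by a String instead of ApplicationAttemptId.
  static final ConcurrentMap<String, Response> responseMap = new ConcurrentHashMap<>();

  /**
   * Mimics the pattern from the description (assumes the attempt is registered).
   * Returns whether the calling thread holds the monitor of the object that is
   * in the map after the update.
   */
  static boolean allocateHoldsCurrentLock(String attemptId) {
    Response lastResponse = responseMap.get(attemptId);
    synchronized (lastResponse) {
      // Replacing the map value changes which object later callers will lock on.
      responseMap.put(attemptId, new Response(lastResponse.responseId + 1));
      // We hold lastResponse's monitor, but not the monitor of the new map value,
      // so another thread can enter its own synchronized block right now.
      return Thread.holdsLock(responseMap.get(attemptId));
    }
  }
}
```

In this sketch the method returns false, which is the window the description points at: a second thread that reads the map after the `put` locks a different object and is not excluded.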

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-15 Thread Omkar Vinit Joshi (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13709304#comment-13709304
]

Omkar Vinit Joshi commented on YARN-744:

bq. We should not be locking on the app attempt id.
I am not locking on appAttemptId or AppAttemptImpl, so I didn't understand your
question.

bq. Why do we need a wrapper?
We don't have an explicit lock for an application attempt. I am creating a
wrapper object to avoid maintaining a separate per-application-attempt lock.
That way, across the responses for an application attempt, we can lock on that
specific attempt.

I think this is important, as we may lose containers compared to what was requested...
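
A minimal Java sketch of the wrapper idea mentioned above, under stated assumptions: `ResponseLockSketch`, `ResponseLock`, the `String` attempt key, and the simplified response-id check are all hypothetical and are not taken from the patch. The point is that the wrapper instance stays in the map for the lifetime of the attempt, so every allocate call for that attempt synchronizes on the same object while only the state inside it changes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ResponseLockSketch {
  /** Hypothetical wrapper: a stable per-attempt lock that also holds the last response id. */
  static final class ResponseLock {
    private int lastResponseId;
    ResponseLock(int lastResponseId) { this.lastResponseId = lastResponseId; }
  }

  // Hypothetical response map, keyed by a String instead of ApplicationAttemptId.
  static final ConcurrentMap<String, ResponseLock> responseMap = new ConcurrentHashMap<>();

  /** Creates the wrapper once; it is never replaced afterwards. */
  static void registerAttempt(String attemptId) {
    responseMap.putIfAbsent(attemptId, new ResponseLock(0));
  }

  /**
   * Processes one allocate call. Returns the new response id, or -1 when the
   * request does not carry the latest response id (simplified duplicate handling).
   */
  static int allocate(String attemptId, int requestResponseId) {
    ResponseLock lock = responseMap.get(attemptId);
    if (lock == null) {
      throw new IllegalStateException("unknown attempt " + attemptId);
    }
    synchronized (lock) {
      if (requestResponseId != lock.lastResponseId) {
        return -1; // stale or duplicate request: do not allocate again
      }
      // ... the actual allocation work would happen here ...
      lock.lastResponseId++;
      return lock.lastResponseId;
    }
  }
}
```

Because the map value is never swapped, a duplicate request that arrives while the first one is in progress blocks on the same monitor and is then rejected by the id check.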

[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-11 Thread Omkar Vinit Joshi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13706197#comment-13706197
 ] 

Omkar Vinit Joshi commented on YARN-744:


[~bikassaha] sounds reasonable ..will take a look at it again.

[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-07-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13706554#comment-13706554
 ] 

Hadoop QA commented on YARN-744:


{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12591936/YARN-744-20130711.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1465//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/1465//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1465//console

This message is automatically generated.

[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-06-28 Thread Omkar Vinit Joshi (JIRA)

[
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13695706#comment-13695706
]

Omkar Vinit Joshi commented on YARN-744:

The problem here is that we retrieve the last response from the response map and
then try to grab a lock on it. However, after grabbing the lock we don't check
whether the last response in the response map has itself been updated. That
results in a race condition, which I am trying to solve here. After grabbing the
lock, an additional check has to be made to ensure that lastResponse was not
changed in between, i.e. that no other AM requests were processed.
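
A minimal Java sketch of the check described above, assuming hypothetical names (`RecheckAfterLockSketch`, `Response`, a `String` attempt key) and a simplified response-id comparison; it is not the code from the attached patch. After acquiring the lock, the method confirms that the locked object is still the one in the map and retries with the fresh value if it is not.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class RecheckAfterLockSketch {
  // Hypothetical stand-in for the last response kept per application attempt.
  static final class Response {
    final int responseId;
    Response(int responseId) { this.responseId = responseId; }
  }

  // Hypothetical response map, keyed by a String instead of ApplicationAttemptId.
  static final ConcurrentMap<String, Response> responseMap = new ConcurrentHashMap<>();

  /**
   * Processes one allocate call. Returns the new response id, or -1 when the
   * request does not carry the latest response id (simplified duplicate handling).
   */
  static int allocate(String attemptId, int requestResponseId) {
    while (true) {
      Response lastResponse = responseMap.get(attemptId);
      if (lastResponse == null) {
        throw new IllegalStateException("unknown attempt " + attemptId);
      }
      synchronized (lastResponse) {
        // Additional check: was the map entry replaced while we waited for the lock?
        if (lastResponse != responseMap.get(attemptId)) {
          continue; // another AM request was processed; retry with the fresh object
        }
        if (requestResponseId != lastResponse.responseId) {
          return -1; // stale or duplicate request: do not allocate again
        }
        // ... the actual allocation work would happen here ...
        Response next = new Response(lastResponse.responseId + 1);
        // The put is deliberately the last step: after it, other threads lock `next`.
        responseMap.put(attemptId, next);
        return next.responseId;
      }
    }
  }
}
```

Note that this sketch relies on the `put` being the final action inside the synchronized block; any work done after it would no longer be protected against threads that lock the new map value.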

[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.

2013-06-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13695757#comment-13695757
 ] 

Hadoop QA commented on YARN-744:


{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12590077/YARN-744.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1404//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1404//console

This message is automatically generated.
