[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2015-08-30 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-2910:
--
Fix Version/s: 2.6.1

Pulled this into 2.6.1. Ran compilation and TestFSLeafQueue before the push. 
Patch applied cleanly.

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
  Labels: 2.6.1-candidate
 Fix For: 2.7.0, 2.6.1

 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.4.patch, YARN-2910.5.patch, YARN-2910.6.patch, YARN-2910.7.patch, 
 YARN-2910.8.patch, YARN-2910.patch


 The lists that maintain the runnable and the non-runnable apps are standard 
 ArrayLists, but there is no guarantee that they will only be manipulated by 
 one thread in the system. This can lead to the following exception:
 {noformat}
 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
 java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
     at java.util.ArrayList$Itr.next(ArrayList.java:831)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
     at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
 {noformat}
 Full stack trace in the attached file.
 We should guard against that by using a thread-safe list such as 
 java.util.concurrent.CopyOnWriteArrayList.
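
For illustration only, here is a minimal sketch of the failure mode described 
above. It is not the FSLeafQueue code: the class and method names are 
hypothetical, and it modifies the list from the iterating thread itself so the 
outcome is deterministic. The fail-fast check is the same 
ArrayList$Itr.checkForComodification seen in the stack trace, while the 
iterator of a CopyOnWriteArrayList works on a snapshot and does not throw.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of Hadoop.
class ComodificationSketch {

    /**
     * Iterates over the list and appends to it during the iteration.
     * Returns true if the iterator failed fast with a
     * ConcurrentModificationException, false if the loop completed.
     */
    static boolean failsWhenModifiedDuringIteration(List<Integer> usage) {
        try {
            for (Integer value : usage) {
                usage.add(value); // structural modification while iterating
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Plain ArrayList: the next call to Itr.next() detects the change.
        System.out.println(failsWhenModifiedDuringIteration(
            new ArrayList<>(Arrays.asList(1, 2, 3)))); // prints "true"
        // CopyOnWriteArrayList: the iterator reads an unchanging snapshot.
        System.out.println(failsWhenModifiedDuringIteration(
            new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3)))); // prints "false"
    }
}
```

In the scheduler the modification comes from a different thread, so the 
exception only shows up when the timing is unlucky.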



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2015-07-22 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-2910:
--
Labels: 2.6.1-candidate  (was: )

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
  Labels: 2.6.1-candidate
 Fix For: 2.7.0

 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.4.patch, YARN-2910.5.patch, YARN-2910.6.patch, YARN-2910.7.patch, 
 YARN-2910.8.patch, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-08 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-2910:

Attachment: YARN-2910.4.patch

I did not change the assignment :-(

Yes, the {{when(schedulable.getResourceUsage()).thenReturn(smallResource);}} 
should not have been in the patch; my mistake. I am not sure how it ended up 
there: I used it during development but not in the last tests.

On my machine the test failed when just adding applications. The issue seems to 
be in the initialisation of the application attempt. When I added debug output 
to the test run I could see the initialisation of the app attempt in the mock 
taking a lot of time, which meant that {{getResourceUsage}} almost always ran 
over an empty list unless the number of iterations was raised above 1000. As 
soon as I moved the creation out of the thread, the failure occurred within 5 
iterations of the {{getResourceUsage}} call in the second thread, after adding 
fewer than 15 or so app instances.

I have attached an updated patch which passes with the new code and has a 100% 
failure rate with the old code. This version of the test runs faster and is 
more reliable than the previous ones.
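
The test idea described above (prepare the objects first, then let one thread 
add them while a second thread iterates) could look roughly like the sketch 
below. This is not the test from the patch: the harness class, its method names 
and the use of plain Integers instead of mocked app attempts are all 
assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical harness, not the test from the YARN-2910 patches.
class ConcurrentAccessHarness {

    /** Returns true if the reader thread hit a RuntimeException while the writer was adding. */
    static boolean readerFailed(List<Integer> shared, int items) {
        // Create everything up front so the writer thread only calls add().
        List<Integer> prepared = new ArrayList<>();
        for (int i = 0; i < items; i++) {
            prepared.add(i);
        }

        AtomicBoolean failed = new AtomicBoolean(false);
        AtomicBoolean done = new AtomicBoolean(false);
        CountDownLatch start = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            awaitQuietly(start);
            for (Integer item : prepared) {
                shared.add(item);
            }
            done.set(true);
        });
        Thread reader = new Thread(() -> {
            awaitQuietly(start);
            try {
                while (!done.get()) {
                    long sum = 0;
                    for (Integer value : shared) { // stands in for getResourceUsage()
                        sum += value;
                    }
                }
            } catch (RuntimeException e) {
                failed.set(true);
            }
        });

        writer.start();
        reader.start();
        start.countDown(); // release both threads at the same moment
        joinQuietly(writer);
        joinQuietly(reader);
        return failed.get();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void joinQuietly(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a plain ArrayList the result depends on timing, so a real test would 
probably repeat the run. With a thread-safe list such as 
CopyOnWriteArrayList the reader should never fail.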

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Ray Chiang
 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.4.patch, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-08 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-2910:
-
Assignee: Wilfred Spiegelenburg  (was: Ray Chiang)

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.4.patch, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-08 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-2910:

Attachment: YARN-2910.5.patch

OK, a completely new approach. The other approaches either did not work or did 
not fix the problem, so this goes back to a simple lock and unlock around the 
read and write actions.

The lock is set up with a fair ordering policy, which is almost a FIFO setup. 
This is not the default option; it was chosen to make sure that no thread is 
starved of the lock.
Multiple readers are allowed at the same time, but a writer holds the lock 
exclusively, with no readers active at the same time.

All junit tests pass in my local environment also other failures. 
As an extra change the {{synchronized}} keyword has been removed from 
FSAppAttempt#getHeadroom, as discussed with [~kasha].
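
A minimal sketch of the locking scheme described above, assuming 
java.util.concurrent.locks.ReentrantReadWriteLock in fair mode. The class, 
field and method names are hypothetical and the list holds plain integers; this 
is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a queue that tracks per-app usage.
class GuardedAppList {
    // Passing 'true' selects the fair ordering policy; the no-arg
    // constructor creates a non-fair lock.
    private final ReadWriteLock rwLock = new ReentrantReadWriteLock(true);
    private final List<Integer> appUsage = new ArrayList<>();

    /** Writers take the exclusive write lock. */
    void addApp(int usage) {
        rwLock.writeLock().lock();
        try {
            appUsage.add(usage);
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    /** Readers share the read lock, so several can iterate at once. */
    int getResourceUsage() {
        rwLock.readLock().lock();
        try {
            int total = 0;
            for (int usage : appUsage) {
                total += usage;
            }
            return total;
        } finally {
            rwLock.readLock().unlock();
        }
    }
}
```

The try/finally blocks make sure the lock is released even if the guarded code 
throws.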

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.4.patch, YARN-2910.5.patch, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-08 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-2910:

Attachment: YARN-2910.6.patch

Updated patch with try/finally clauses.

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.4.patch, YARN-2910.5.patch, YARN-2910.6.patch, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-08 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-2910:

Attachment: YARN-2910.7.patch

One final update to shorten the time we keep the lock and to make sure we do 
the least amount of work while holding a write lock.
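
As a rough illustration of keeping the write-locked section short (again with 
hypothetical names, not the patch itself): any preparation that does not touch 
the shared list is done before the lock is taken, and only the list mutation 
happens while the write lock is held.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; the name normalisation is a placeholder for any
// work that does not need the shared list.
class ShortWriteLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private final List<String> runnableApps = new ArrayList<>();

    void addApp(String rawName) {
        // Done outside the lock: touches no shared state.
        String normalized = rawName.trim().toLowerCase(Locale.ROOT);

        lock.writeLock().lock();
        try {
            runnableApps.add(normalized); // the only work under the write lock
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Returns a copy taken under the read lock, so callers can iterate without locking. */
    List<String> snapshot() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(runnableApps);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```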

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.4.patch, YARN-2910.5.patch, YARN-2910.6.patch, YARN-2910.7.patch, 
 YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-08 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-2910:

Attachment: YARN-2910.8.patch

Cleanup of spurious includes that should not be there.

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.4.patch, YARN-2910.5.patch, YARN-2910.6.patch, YARN-2910.7.patch, 
 YARN-2910.8.patch, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-07 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-2910:

Attachment: YARN-2910.3.patch

Updated the test to use the actual method, and fixed some indents. I reproduced 
ConcurrentModificationException without the fix.

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Ray Chiang
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.1.patch, 
 YARN-2910.2.patch, YARN-2910.3.patch, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-07 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-2910:
-
Attachment: YARN-2910.004.patch

I see Wilfred assigned this to me now.

I took Akira's changes and updated with Tsuyoshi's suggestion.  The new unit 
test fails 10 out of 10 with the old code and passes 10 out of 10 with the new 
code.

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
Reporter: Wilfred Spiegelenburg
Assignee: Ray Chiang
 Attachments: FSLeafQueue_concurrent_exception.txt, 
 YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch, 
 YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-05 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-2910:

Attachment: YARN-2910.1.patch

Updated patch with the changes as discussed and a junit test

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.1.patch, 
 YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-05 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2910:
---
Attachment: YARN-2910.2.patch

I had to modify the test slightly.

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.1.patch, 
 YARN-2910.2.patch, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-02 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2910:
---
Description: 
The list that maintains the runnable and the non runnable apps are a standard 
ArrayList but there is no guarantee that it will only be manipulated by one 
thread in the system. This can lead to the following exception:
{noformat}
2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
    at java.util.ArrayList$Itr.next(ArrayList.java:831)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
    at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
{noformat}

Full stack trace in the attached file.

We should guard against that by using a thread safe version from 
java.util.concurrent.CopyOnWriteArrayList


  was:


The list that maintains the runnable and the non runnable apps are a standard 
ArrayList but there is no guarantee that it will only be manipulated by one 
thread in the system. This can lead to the following exception:

2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
    at java.util.ArrayList$Itr.next(ArrayList.java:831)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
    at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)

Full stack trace in the attached file.

We should guard against that by using a thread safe version from 
java.util.concurrent.CopyOnWriteArrayList



 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch




[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-12-02 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-2910:
---
Target Version/s: 2.7.0, 2.6.1

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch


 The lists that maintain the runnable and non-runnable apps are standard 
 ArrayLists, but there is no guarantee that they will only be manipulated by 
 one thread in the system. This can lead to the following exception:
 {noformat}
 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
 java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
 at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
 at java.util.ArrayList$Itr.next(ArrayList.java:831)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
 at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
 {noformat}
 Full stack trace in the attached file.
 We should guard against that by using a thread-safe list such as 
 java.util.concurrent.CopyOnWriteArrayList.





[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-11-30 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-2910:
-
Assignee: Wilfred Spiegelenburg

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Wilfred Spiegelenburg
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch


 The lists that maintain the runnable and non-runnable apps are standard 
 ArrayLists, but there is no guarantee that they will only be manipulated by 
 one thread in the system. This can lead to the following exception:
 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
 java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
 at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
 at java.util.ArrayList$Itr.next(ArrayList.java:831)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
 at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
 Full stack trace in the attached file.
 We should guard against that by using a thread-safe list such as 
 java.util.concurrent.CopyOnWriteArrayList.





[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException

2014-11-26 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-2910:

Attachment: FSLeafQueue_concurrent_exception.txt
YARN-2910.patch

Attached the full exception stack trace and a patch.

 FSLeafQueue can throw ConcurrentModificationException
 -

 Key: YARN-2910
 URL: https://issues.apache.org/jira/browse/YARN-2910
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Wilfred Spiegelenburg
Assignee: Rohith
 Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch


 The lists that maintain the runnable and non-runnable apps are standard 
 ArrayLists, but there is no guarantee that they will only be manipulated by 
 one thread in the system. This can lead to the following exception:
 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
 java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
 at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
 at java.util.ArrayList$Itr.next(ArrayList.java:831)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
 at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
 Full stack trace in the attached file.
 We should guard against that by using a thread-safe list such as 
 java.util.concurrent.CopyOnWriteArrayList.


