[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-2910:
------------------------------------------
    Fix Version/s: 2.6.1

Pulled this into 2.6.1. Ran compilation and TestFSLeafQueue before the push. Patch applied cleanly.

> FSLeafQueue can throw ConcurrentModificationException
> -----------------------------------------------------
>
>                 Key: YARN-2910
>                 URL: https://issues.apache.org/jira/browse/YARN-2910
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>              Labels: 2.6.1-candidate
>             Fix For: 2.7.0, 2.6.1
>
>         Attachments: FSLeafQueue_concurrent_exception.txt,
> YARN-2910.004.patch, YARN-2910.1.patch, YARN-2910.2.patch, YARN-2910.3.patch,
> YARN-2910.4.patch, YARN-2910.5.patch, YARN-2910.6.patch, YARN-2910.7.patch,
> YARN-2910.8.patch, YARN-2910.patch
>
>
> The list that maintains the runnable and the non runnable apps are a standard
> ArrayList but there is no guarantee that it will only be manipulated by one
> thread in the system. This can lead to the following exception:
> {noformat}
> 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
> java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
>     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>     at java.util.ArrayList$Itr.next(ArrayList.java:831)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
>     at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
> {noformat}
> Full stack trace in the attached file.
> We should guard against that by using a thread safe version from
> java.util.concurrent.CopyOnWriteArrayList

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
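The failure mode in the stack trace above (ArrayList$Itr.checkForComodification) can be shown without any threads: ArrayList iterators are fail-fast, so any structural change made while an iteration is in progress triggers the exception. The sketch below is a single-threaded stand-in for the race, not code from FSLeafQueue or from any of the attached patches; the class FailFastDemo, the method sumWhileAdding and the Integer "usages" are made up for illustration. It also shows why CopyOnWriteArrayList, as suggested in the description, does not throw: its iterators walk a snapshot of the array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical demo class; the name and the Integer payload are assumptions. */
class FailFastDemo {

  /**
   * Sums the list while appending one element during the first loop pass.
   * Returns the sum, or -1 if the iterator detected the modification.
   */
  static int sumWhileAdding(List<Integer> usages) {
    int sum = 0;
    try {
      for (Integer usage : usages) {
        if (sum == 0) {
          usages.add(100); // structural change while an iterator is open
        }
        sum += usage;
      }
    } catch (ConcurrentModificationException e) {
      return -1; // fail-fast iterator noticed the change
    }
    return sum;
  }

  public static void main(String[] args) {
    List<Integer> plain = new ArrayList<>(Arrays.asList(1, 2, 3));
    List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
    System.out.println(sumWhileAdding(plain)); // prints -1
    System.out.println(sumWhileAdding(cow));   // prints 6 (snapshot of 1, 2, 3)
  }
}
```

The trade-off with CopyOnWriteArrayList is that every add or remove copies the backing array, and a reader may sum a slightly stale snapshot.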
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-2910:
------------------------------------------
    Labels: 2.6.1-candidate  (was: )
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated YARN-2910:
----------------------------------------
    Attachment: YARN-2910.8.patch

Cleanup of spurious includes that should not be there.
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated YARN-2910:
----------------------------------------
    Attachment: YARN-2910.7.patch

One final update to shorten the time we hold the lock and to make sure we do the least amount of work while holding a write lock.
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated YARN-2910:
----------------------------------------
    Attachment: YARN-2910.6.patch

Updated patch with try/finally clauses.
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated YARN-2910:
----------------------------------------
    Attachment: YARN-2910.5.patch

OK, a completely new approach. The other approaches either did not work or did not fix the problem, so this goes back to a simple lock and unlock around the read and write actions. The locking is set up with a fair distribution, which is almost a FIFO setup. This is not the default option; it was chosen to make sure we do not cause a thread to be starved of the lock. Multiple readers are allowed at the same time, but only one writer, and no readers while it is writing.
All JUnit tests pass in my local environment, also other failures.

As an extra change, the {{synchronized}} has been removed from FSAppAttempt#getHeadroom as discussed with [~kasha].
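The locking described in this update (fair ordering, many readers or one writer, unlock in a finally clause as in the .6 patch) can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: it assumes java.util.concurrent.locks.ReentrantReadWriteLock constructed with fair = true, and the class GuardedUsageList, its methods and the Integer payload are hypothetical stand-ins for the queue's app lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a queue's app list; not the FSLeafQueue code. */
class GuardedUsageList {
  private final List<Integer> usages = new ArrayList<>();
  // 'true' selects the fair ordering policy; the no-arg constructor is non-fair.
  private final ReadWriteLock rwl = new ReentrantReadWriteLock(true);
  private final Lock readLock = rwl.readLock();
  private final Lock writeLock = rwl.writeLock();

  /** Writers take the exclusive lock and do nothing but the list update. */
  void add(int usage) {
    writeLock.lock();
    try {
      usages.add(usage);
    } finally {
      writeLock.unlock(); // released even if add() throws
    }
  }

  /** Readers share the read lock, so several totals can run in parallel. */
  int total() {
    readLock.lock();
    try {
      int sum = 0;
      for (int u : usages) {
        sum += u;
      }
      return sum;
    } finally {
      readLock.unlock();
    }
  }

  /** Runs one writer against a polling reader and returns the final total. */
  static int runDemo(int adds) {
    GuardedUsageList list = new GuardedUsageList();
    Thread writer = new Thread(() -> {
      for (int i = 0; i < adds; i++) {
        list.add(1);
      }
    });
    writer.start();
    while (writer.isAlive()) {
      list.total(); // concurrent read; safe because of the read lock
    }
    try {
      writer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return list.total();
  }

  public static void main(String[] args) {
    System.out.println(runDemo(1000)); // prints 1000
  }
}
```

A fair lock hands out access roughly in arrival order, which costs some throughput compared with the default policy but keeps a steady stream of readers from starving a waiting writer.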
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ray Chiang updated YARN-2910:
-----------------------------
    Assignee: Wilfred Spiegelenburg  (was: Ray Chiang)
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated YARN-2910:
----------------------------------------
    Attachment: YARN-2910.4.patch

I did not change the assignment :-(

Yes, the {{when(schedulable.getResourceUsage()).thenReturn(smallResource);}} should not have been in the patch, my mistake. I am not sure how it ended up there; I used it during development but not in the last tests.

On my machine the test failed with just adding applications. The issue seems to be in the initialisation of the application attempt. When I added debug output to the test run, I could see that the initialisation of the app attempt in the mock took up a lot of time, which meant that {{getResourceUsage}} almost always ran over an empty list unless the number of iterations was raised above 1000. As soon as I moved the creation out of the thread, the failure occurred within 5 iterations of the {{getResourceUsage}} call in the second thread, after adding fewer than 15 or so app instances.

I have attached an updated patch which passes with the new code and has a 100% failure rate with the old code. This version of the test runs faster and is more reliable than the previous ones.
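The point about moving the creation out of the thread generalises: if the writer thread spends most of its time building objects, the reader rarely overlaps with an actual list update and the race stays hidden. A rough sketch of a harness built that way is below. It is not the test from the patch; the class ConcurrentAccessHarness, the method race and the Integer items are hypothetical, and plain Integers stand in for app attempts. With an unsynchronized ArrayList the reader is likely, but not guaranteed, to see an exception; with a thread-safe list it should return null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical race harness; names and payload type are assumptions. */
class ConcurrentAccessHarness {

  /**
   * One thread appends pre-built items while a second thread repeatedly
   * iterates. Returns the first exception the reader saw, or null.
   */
  static Throwable race(List<Integer> shared, int items) {
    // Build everything up front so the writer loop does nothing but add().
    List<Integer> prepared = new ArrayList<>();
    for (int i = 0; i < items; i++) {
      prepared.add(i);
    }

    AtomicBoolean done = new AtomicBoolean(false);
    AtomicReference<Throwable> failure = new AtomicReference<>();

    Thread writer = new Thread(() -> {
      for (Integer item : prepared) {
        shared.add(item);
      }
    });
    Thread reader = new Thread(() -> {
      try {
        while (!done.get()) {
          long sum = 0;
          for (Integer v : shared) {
            sum += v; // iteration races with add() on unsafe lists
          }
        }
      } catch (RuntimeException e) {
        failure.compareAndSet(null, e); // keep only the first failure
      }
    });

    reader.start();
    writer.start();
    try {
      writer.join();
      done.set(true);
      reader.join();
    } catch (InterruptedException e) {
      done.set(true);
      Thread.currentThread().interrupt();
    }
    return failure.get();
  }

  public static void main(String[] args) {
    // Outcome for ArrayList depends on thread timing, so it is only printed.
    System.out.println("ArrayList: " + race(new ArrayList<>(), 2000));
    System.out.println("CopyOnWriteArrayList: "
        + race(new CopyOnWriteArrayList<>(), 2000));
  }
}
```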
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ray Chiang updated YARN-2910:
-----------------------------
    Attachment: YARN-2910.004.patch

I see Wilfred assigned this to me now. I took Akira's changes and updated with Tsuyoshi's suggestion. The new unit test fails 10 out of 10 with the old code and passes 10 out of 10 with the new code.
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira AJISAKA updated YARN-2910:
--------------------------------
    Attachment: YARN-2910.3.patch

Updated the test to use the actual method, and fixed some indents. I reproduced ConcurrentModificationException without the fix.
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-2910:
-----------------------------------
    Attachment: YARN-2910.2.patch

I had to modify the test slightly.
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated YARN-2910:
----------------------------------------
    Attachment: YARN-2910.1.patch

Updated patch with the changes as discussed and a JUnit test.
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-2910:
-----------------------------------
    Target Version/s: 2.7.0, 2.6.1
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-2910:
---
    Description:
The list that maintains the runnable and the non runnable apps are a standard ArrayList but there is no guarantee that it will only be manipulated by one thread in the system. This can lead to the following exception:
{noformat}
2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
    at java.util.ArrayList$Itr.next(ArrayList.java:831)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
    at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
{noformat}
Full stack trace in the attached file.
We should guard against that by using a thread safe version from java.util.concurrent.CopyOnWriteArrayList

  was:
The list that maintains the runnable and the non runnable apps are a standard ArrayList but there is no guarantee that it will only be manipulated by one thread in the system. This can lead to the following exception:
2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
    at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
    at java.util.ArrayList$Itr.next(ArrayList.java:831)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
    at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
Full stack trace in the attached file.
We should guard against that by using a thread safe version from java.util.concurrent.CopyOnWriteArrayList

> FSLeafQueue can throw ConcurrentModificationException
> -
>
> Key: YARN-2910
> URL: https://issues.apache.org/jira/browse/YARN-2910
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
>Affects Versions: 2.5.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch
>
>
> The list that maintains the runnable and the non runnable apps are a standard
> ArrayList but there is no guarantee that it will only be manipulated by one
> thread in the system. This can lead to the following exception:
> {noformat}
> 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN
> CONTACTING RM.
> java.util.ConcurrentModificationException:
> java.util.ConcurrentModificationException
>     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>     at java.util.ArrayList$Itr.next(ArrayList.java:831)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
> {noformat}
> Full stack trace in the attached file.
> We should guard against that by using a thread safe version from
> java.util.concurrent.CopyOnWriteArrayList

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
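For readers of this thread, the failure mode in the description can be reproduced in a few lines. The sketch below is not the attached patch and does not use any YARN classes: the class name AppListIterationSketch and its methods are hypothetical, and the list of integers merely stands in for the per-app values a queue would sum. To keep the result deterministic, the list is modified from the same thread in the middle of the iteration; the fail-fast ArrayList iterator reacts the same way when the modification comes from a second thread, only less predictably.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class AppListIterationSketch {

    // Hypothetical stand-in for summing per-app usage over a queue's app list.
    // The list is structurally modified part-way through the loop, which is
    // what another thread could do to a shared list at an arbitrary point.
    static int sumWhileModifying(List<Integer> usages) {
        int total = 0;
        for (Integer usage : usages) {
            total += usage;
            if (usage == 1) {
                usages.add(100); // structural modification during iteration
            }
        }
        return total;
    }

    // Returns true if iterating the given list fails with a
    // ConcurrentModificationException.
    static boolean throwsCme(List<Integer> usages) {
        try {
            sumWhileModifying(usages);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // ArrayList iterators are fail-fast: the next call to next() after
        // the add() detects the modification and throws.
        List<Integer> plain = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println("ArrayList throws CME: " + throwsCme(plain));

        // CopyOnWriteArrayList iterators walk the snapshot taken when the
        // iterator was created, so the add() is not seen by the loop and
        // no exception is thrown.
        List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println("CopyOnWriteArrayList sum: " + sumWhileModifying(cow));
        System.out.println("CopyOnWriteArrayList size afterwards: " + cow.size());
    }
}
```

The trade-off is that CopyOnWriteArrayList copies its backing array on every write, so it suits lists that are read far more often than they are changed. Its iterators also see a snapshot, so a reader may compute a total that is already slightly stale.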
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-2910:
-
    Assignee: Wilfred Spiegelenburg

> FSLeafQueue can throw ConcurrentModificationException
> -
>
> Key: YARN-2910
> URL: https://issues.apache.org/jira/browse/YARN-2910
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
>Affects Versions: 2.5.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch
>
>
> The list that maintains the runnable and the non runnable apps are a standard
> ArrayList but there is no guarantee that it will only be manipulated by one
> thread in the system. This can lead to the following exception:
> 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN
> CONTACTING RM.
> java.util.ConcurrentModificationException:
> java.util.ConcurrentModificationException
>     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>     at java.util.ArrayList$Itr.next(ArrayList.java:831)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
> Full stack trace in the attached file.
> We should guard against that by using a thread safe version from
> java.util.concurrent.CopyOnWriteArrayList

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2910) FSLeafQueue can throw ConcurrentModificationException
[ https://issues.apache.org/jira/browse/YARN-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wilfred Spiegelenburg updated YARN-2910:

    Attachment: FSLeafQueue_concurrent_exception.txt
                YARN-2910.patch

Full exception stack trace and patch

> FSLeafQueue can throw ConcurrentModificationException
> -
>
> Key: YARN-2910
> URL: https://issues.apache.org/jira/browse/YARN-2910
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
>Affects Versions: 2.5.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Rohith
> Attachments: FSLeafQueue_concurrent_exception.txt, YARN-2910.patch
>
>
> The list that maintains the runnable and the non runnable apps are a standard
> ArrayList but there is no guarantee that it will only be manipulated by one
> thread in the system. This can lead to the following exception:
> 2014-11-12 02:29:01,169 ERROR [RMCommunicator Allocator]
> org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN
> CONTACTING RM.
> java.util.ConcurrentModificationException:
> java.util.ConcurrentModificationException
>     at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>     at java.util.ArrayList$Itr.next(ArrayList.java:831)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
>     at
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:516)
> Full stack trace in the attached file.
> We should guard against that by using a thread safe version from
> java.util.concurrent.CopyOnWriteArrayList

--
This message was sent by Atlassian JIRA (v6.3.4#6332)