[ https://issues.apache.org/jira/browse/MAPREDUCE-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated MAPREDUCE-1372:
----------------------------------

    Attachment: ReadOnly.java

I tested the use of Collections.newSetFromMap with the attached test case. Here are the results:

Pre patch
||num items||avg time to scan in ms||
|1000|0|
|10000|9|
|100000|13|
|1000000|56|
|10000000|424|

Post patch
||num items||avg time to scan in ms||
|1000|0|
|10000|7|
|100000|21|
|1000000|124|
|10000000|1175|

The bug can be verified using the command
{code}
java ReadOnly -correctness pre|post
{code}
The _pre_ switch reproduces the bug and the _post_ switch verifies the fix. With _pre_, a HashMap is used, while with _post_ a set created by Collections.newSetFromMap is used. The test is simple: it starts 2 threads, one adding to the set while the other scans it.

The proposed change can be benchmarked using the command
{code}
java ReadOnly -performance pre|post X Y
{code}
The benchmark adds X elements to a set (the implementation is chosen by the pre/post switch) and scans it Y times, averaging the scan time over the runs.
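For readers without the attachment, the correctness check can be approximated with the sketch below. It is not ReadOnly.java: the class and method names are made up for illustration, the add is interleaved with the scan on a single thread so that the outcome is deterministic (the attached test uses 2 threads), the _pre_ variant is stood in for by a HashSet, and the _post_ variant is assumed to be backed by a ConcurrentHashMap, which the comment above does not state.

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; not the ReadOnly.java attached to this issue.
public class ScanWhileAdding {

    /**
     * Fills the set, then adds one more element in the middle of a scan.
     * Returns true if the scan fails with ConcurrentModificationException.
     */
    public static boolean failsWhenModifiedDuringScan(Set<Integer> set) {
        for (int i = 0; i < 100; i++) {
            set.add(i);
        }
        boolean added = false;
        try {
            for (Integer ignored : set) {
                if (!added) {
                    // Stands in for the writer thread touching the set mid-scan.
                    set.add(-1);
                    added = true;
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "pre": fail-fast iterator of a HashMap-backed set.
        Set<Integer> pre = new HashSet<Integer>();
        // "post": assumed here to wrap a ConcurrentHashMap (weakly consistent iterator).
        Set<Integer> post =
            Collections.newSetFromMap(new ConcurrentHashMap<Integer, Boolean>());

        System.out.println("pre fails:  " + failsWhenModifiedDuringScan(pre));
        System.out.println("post fails: " + failsWhenModifiedDuringScan(post));
    }
}
```

With a HashMap-backed set the iterator is fail-fast, so the scan aborts on the first step after the add. A set obtained from Collections.newSetFromMap takes its iteration behavior from the backing map, so with a ConcurrentHashMap the scan completes, though it may or may not see the element added during the scan.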
> ConcurrentModificationException in JobInProgress
> ------------------------------------------------
>
>                 Key: MAPREDUCE-1372
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1372
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions: 0.20.1
>            Reporter: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.21.0
>
>         Attachments: M1372-0.patch, M1372-1.patch, M1372-2.patch, ReadOnly.java
>
>
> We have seen the following ConcurrentModificationException in one of our clusters
> {noformat}
> java.io.IOException: java.util.ConcurrentModificationException
>         at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
>         at java.util.HashMap$KeyIterator.next(HashMap.java:828)
>         at org.apache.hadoop.mapred.JobInProgress.findNewMapTask(JobInProgress.java:2018)
>         at org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask(JobInProgress.java:1077)
>         at org.apache.hadoop.mapred.CapacityTaskScheduler$MapSchedulingMgr.obtainNewTask(CapacityTaskScheduler.java:796)
>         at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.getTaskFromQueue(CapacityTaskScheduler.java:589)
>         at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.assignTasks(CapacityTaskScheduler.java:677)
>         at org.apache.hadoop.mapred.CapacityTaskScheduler$TaskSchedulingMgr.access$500(CapacityTaskScheduler.java:348)
>         at org.apache.hadoop.mapred.CapacityTaskScheduler.addMapTask(CapacityTaskScheduler.java:1397)
>         at org.apache.hadoop.mapred.CapacityTaskScheduler.assignTasks(CapacityTaskScheduler.java:1349)
>         at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:2976)
>         at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
> {noformat}