[ https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy V. Ryaboy updated PIG-3241:
-----------------------------------
    Attachment: PIG-3241.patch

Attaching patch.

Rather than synchronize all memory access, I decided to simply avoid concurrent access altogether.

spill(), called by the Spillable Memory Manager (SMM), used to set up the iterator used for spilling. That involved looking at the primary and secondary maps, applying the combiner to them, and doing all kinds of other things, all in the SMM thread. Instead, we now only set the doSpill flag in spill() and do the work in the main thread, which is now the only thread that can modify the iterators and hashmaps.

Most of this patch is just whitespace changes :).

> ConcurrentModificationException in POPartialAgg
> -----------------------------------------------
>
>                 Key: PIG-3241
>                 URL: https://issues.apache.org/jira/browse/PIG-3241
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11
>            Reporter: Lohit Vijayarenu
>            Assignee: Dmitriy V. Ryaboy
>            Priority: Blocker
>             Fix For: 0.12, 0.11.1
>
>         Attachments: PIG-3241.patch
>
>
> While running a few Pig scripts against Hadoop 2.0, I consistently see a
> ConcurrentModificationException:
> {noformat}
>       at java.util.HashMap$HashIterator.remove(HashMap.java:811)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregate(POPartialAgg.java:365)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregateSecondLevel(POPartialAgg.java:379)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.getNext(POPartialAgg.java:203)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:729)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
> {noformat}
> It looks like rawInputMap is being modified while elements are being
> removed from it.
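
For readers following the comment on this issue, here is a minimal sketch of the approach it describes: spill() only raises a flag, and the main thread does all map and iterator work. This is not the code from PIG-3241.patch. The class DeferredSpillSketch, its add(), drain(), buffered() and spilled() members, the String/Integer map types, and the void return of spill() are assumptions made for illustration. Only spill(), the doSpill flag, and the rawInputMap name come from the text above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the operator; not the real POPartialAgg.
class DeferredSpillSketch {
    // Written by the memory-manager thread, read by the main thread.
    // volatile makes the write visible across threads.
    private volatile boolean doSpill = false;

    // Touched only by the main thread, so a plain HashMap is enough.
    private final Map<String, Integer> rawInputMap = new HashMap<>();
    private final List<String> spilled = new ArrayList<>();

    // Called from the memory-manager thread: record the request and return.
    // It deliberately does not touch the map or any iterator.
    public void spill() {
        doSpill = true;
    }

    // Called from the main thread for every input record.
    public void add(String key, int value) {
        rawInputMap.merge(key, value, Integer::sum);
        if (doSpill) {
            drain();
        }
    }

    // Runs on the main thread only, so Iterator.remove() cannot race
    // with another thread modifying the same map.
    private void drain() {
        // Clear the flag first: a request arriving during the drain
        // stays set and triggers another drain on the next add().
        doSpill = false;
        Iterator<Map.Entry<String, Integer>> it = rawInputMap.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> e = it.next();
            spilled.add(e.getKey() + "=" + e.getValue());
            it.remove();
        }
    }

    public int buffered() {
        return rawInputMap.size();
    }

    public List<String> spilled() {
        return spilled;
    }
}
```

The trade-off in this sketch is that memory is not released at the moment spill() is called, only the next time the main thread processes a record. In exchange, no locking is needed around the map.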