[ 
https://issues.apache.org/jira/browse/PIG-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-3241:
-----------------------------------

    Attachment: PIG-3241.patch

Attaching patch.

Rather than synchronize all memory access, I decided to simply avoid concurrent 
access altogether. spill(), called by the Spillable Memory Manager (SMM), used 
to set up the iterator used for spilling. That involved looking at the primary 
and secondary maps, applying the combiner to them, and doing other work on the 
shared state, all in the SMM thread.

Instead, spill() now only sets the doSpill flag, and the actual work is done in 
the main thread, which is now the only thread that can modify the iterators and 
hashmaps.
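
The flag-based handoff described above can be sketched roughly as follows. This 
is a minimal illustration, not the code in the patch: only spill(), doSpill and 
rawInputMap are names from this issue, while the class name, add(), drain() and 
the String/Integer map types are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for an operator like POPartialAgg (assumption: the
// real class holds more state and spills to a combiner, not just a counter).
class DeferredSpillSketch {
    // Written by the memory-manager thread, read by the main thread.
    private volatile boolean doSpill = false;

    // Touched only by the main (processing) thread.
    private final Map<String, Integer> rawInputMap = new HashMap<>();
    private int spillCount = 0;

    // Called from the memory-manager thread: only requests a spill,
    // never touches the map or its iterators.
    public void spill() {
        doSpill = true;
    }

    // Called from the main thread for each input record.
    public void add(String key, int value) {
        rawInputMap.merge(key, value, Integer::sum);
        if (doSpill) {
            doSpill = false;
            drain(); // the real work happens here, on the main thread
        }
    }

    // Empties the map through its own iterator; safe because no other
    // thread modifies the map.
    private void drain() {
        Iterator<Map.Entry<String, Integer>> it = rawInputMap.entrySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
        spillCount++;
    }

    public int size() { return rawInputMap.size(); }

    public int spillCount() { return spillCount; }

    public static void main(String[] args) throws InterruptedException {
        DeferredSpillSketch op = new DeferredSpillSketch();
        op.add("a", 1);
        Thread smm = new Thread(op::spill); // simulated memory-manager thread
        smm.start();
        smm.join();
        op.add("b", 2); // main thread notices the flag and drains
        System.out.println(op.size() + " entries left, " + op.spillCount() + " spill(s)");
    }
}
```

If spill() fires again while drain() is running, the flag is simply set again 
and picked up on the next add(), so a request is never lost and the map is 
never touched from two threads.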

Most of this patch is just whitespace changes :).
                
> ConcurrentModificationException in POPartialAgg
> -----------------------------------------------
>
>                 Key: PIG-3241
>                 URL: https://issues.apache.org/jira/browse/PIG-3241
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.11
>            Reporter: Lohit Vijayarenu
>            Assignee: Dmitriy V. Ryaboy
>            Priority: Blocker
>             Fix For: 0.12, 0.11.1
>
>         Attachments: PIG-3241.patch
>
>
> While running a few Pig scripts against Hadoop 2.0, I consistently see a 
> ConcurrentModificationException:
> {noformat}
> at java.util.HashMap$HashIterator.remove(HashMap.java:811)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregate(POPartialAgg.java:365)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.aggregateSecondLevel(POPartialAgg.java:379)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPartialAgg.getNext(POPartialAgg.java:203)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:308)
>       at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:263)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
>       at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:729)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:158)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:153)
> {noformat}
> It looks like rawInputMap is being modified while elements are being 
> removed from it through an iterator.
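
For illustration, the failure at the top of the trace (HashIterator.remove) can 
be reproduced in a few lines. This is a single-threaded sketch, not code from 
POPartialAgg: the put() between next() and remove() stands in for a change made 
by another thread, and the class and method names are hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class IteratorRemoveCmeSketch {
    // Returns true if Iterator.remove() fails after the map was structurally
    // modified behind the iterator's back.
    static boolean removeAfterForeignPut() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        it.next();

        // Structural modification that does not go through the iterator;
        // in the reported bug this would come from a different thread.
        map.put("c", 3);

        try {
            it.remove(); // HashMap's iterator detects the foreign modification
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + removeAfterForeignPut());
    }
}
```

With real threads the check is best-effort, so the exception may show up only 
intermittently, and unsynchronized access can also corrupt the map silently.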

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
