[ 
https://issues.apache.org/jira/browse/GORA-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883181#comment-13883181
 ] 

Henry Saputra commented on GORA-228:
------------------------------------

I think the proposed static solution should solve the concurrent modification 
issue in the unit test, but in a distributed environment such as Apache 
Hadoop MapReduce it will still cause a similar issue, because each task runs 
in a separate JVM.
As [~renato2099] mentioned, the tasks will not share a single cache.
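
To make the JVM boundary concrete, here is a minimal sketch. It assumes the 
proposal amounts to making the backing map a static field; the class 
StaticBackedStoreSketch and its methods are hypothetical and are not taken 
from the attached patch or from MemStore itself.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for a store backed by a static map (not Gora code).
public class StaticBackedStoreSketch {

    // One map per class loader, so effectively one per JVM. Every instance
    // created in the same JVM reads and writes this same map.
    private static final ConcurrentNavigableMap<String, String> SHARED =
            new ConcurrentSkipListMap<>();

    public void put(String key, String value) {
        SHARED.put(key, value);
    }

    public String get(String key) {
        return SHARED.get(key);
    }

    public static void main(String[] args) {
        StaticBackedStoreSketch writer = new StaticBackedStoreSketch();
        StaticBackedStoreSketch reader = new StaticBackedStoreSketch();
        writer.put("k", "v");
        // Same JVM: the second instance sees the first instance's write.
        System.out.println(reader.get("k"));
        // A task running in another JVM gets its own, initially empty SHARED
        // map, so it would not see this entry.
    }
}
```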

Maybe we could provide an option for MemStore to either use a distributed 
cache such as Apache DirectMemory [1], Hazelcast, or Ehcache, or simply 
discard the data after use, for scenarios where you manage one store instance 
per usage.

[1] http://directmemory.apache.org

> java.util.ConcurrentModificationException when using MemStore for concurrent 
> tests
> ----------------------------------------------------------------------------------
>
>                 Key: GORA-228
>                 URL: https://issues.apache.org/jira/browse/GORA-228
>             Project: Apache Gora
>          Issue Type: Sub-task
>          Components: gora-core
>    Affects Versions: 0.3
>            Reporter: Lewis John McGibbney
>             Fix For: 0.4
>
>         Attachments: GORA-228.patch
>
>
> Finally, a multithreaded test in [3] fails with the following
> {code}
> java.util.ConcurrentModificationException
>       at java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(TreeMap.java:1594)
>       at java.util.TreeMap$NavigableSubMap$SubMapKeyIterator.next(TreeMap.java:1655)
>       at org.apache.gora.memory.store.MemStore$MemResult.nextInner(MemStore.java:81)
>       at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
>       at org.apache.nutch.storage.TestGoraStorage.readWrite(TestGoraStorage.java:74)
>       at org.apache.nutch.storage.TestGoraStorage.access$100(TestGoraStorage.java:41)
>       at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:107)
>       at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:102)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>       at java.lang.Thread.run(Thread.java:722)
> {code}
> I believe that the final failure is due to the use of TreeMap [5] as a 
> private object in MemStore. TreeMap implementations are not synchronized. If 
> multiple threads access a map concurrently, and at least one of the threads 
> modifies the map structurally, it must be synchronized externally. (A 
> structural modification is any operation that adds or deletes one or more 
> mappings; merely changing the value associated with an existing key is not a 
> structural modification.) This is typically accomplished by synchronizing on 
> some object that naturally encapsulates the map. If no such object exists, 
> the map should be "wrapped" using the Collections.synchronizedSortedMap 
> method. This is best done at creation time, to prevent accidental 
> unsynchronized access to the map e.g.
>    SortedMap m = Collections.synchronizedSortedMap(new TreeMap(...));
> N.B. The note on TreeMap comes straight from the Oracle Javadoc.
> [3] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/storage/TestGoraStorage.java?view=markup
> [4] http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/util/AbstractNutchTest.java?view=markup
> [5] http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html
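
The failure mode in the quoted stack trace can be reproduced deterministically 
in a single thread, which may help when reasoning about a fix. The sketch 
below is not Gora code: the class and method names are hypothetical, and 
ConcurrentSkipListMap is shown only as one possible alternative, not as the 
approach taken in the attached patch.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SubMapIterationSketch {

    /**
     * Iterates the keys of a sub-map view and structurally modifies the
     * backing map between two next() calls. Returns true if the iterator
     * throws ConcurrentModificationException.
     */
    public static boolean failsOnModification(NavigableMap<String, String> map) {
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");
        Iterator<String> keys =
                map.subMap("a", true, "c", true).keySet().iterator();
        try {
            keys.next();
            map.put("d", "4"); // structural modification during iteration
            keys.next();       // a fail-fast iterator detects the change here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // TreeMap iterators are fail-fast.
        System.out.println("TreeMap: "
                + failsOnModification(new TreeMap<>()));
        // ConcurrentSkipListMap iterators are weakly consistent and do not
        // throw ConcurrentModificationException.
        System.out.println("ConcurrentSkipListMap: "
                + failsOnModification(new ConcurrentSkipListMap<>()));
    }
}
```

Note that wrapping the TreeMap with Collections.synchronizedSortedMap only 
synchronizes individual calls. Per its Javadoc, callers must still 
synchronize manually on the returned map while iterating over any of its 
collection views, so the wrapper alone would not protect the iteration in 
MemResult.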



