[ https://issues.apache.org/jira/browse/GORA-225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lewis John McGibbney updated GORA-225:
--------------------------------------

    Description: 
In Nutch we have numerous testing scenarios which simulate persistence of data 
to Gora in some form or other. This has worked well until now.
Now that the gora-sql-0.1.1-incubating artifact is incompatible with gora-core 
0.3, this situation needs to be addressed in order to keep some 
degree of integrity within the Nutch codebase.
Specifically, a number of tests [0][1][2][3] all extend a Util testing class 
which utilizes functionality from the gora-sql artifact.

My initial solution was to switch to using MemStore... which brought me to 
logging this issue!

I've logged sub-issues here to clearly distinguish my observations.

  was:
In Nutch we have numerous testing scenarios which simulate persistence of data 
to Gora in some form or other. This has worked well until now.
Now that the gora-sql-0.1.1-incubating artifact is incompatible with gora-core 
0.3, this situation needs to be addressed in order to keep some 
degree of integrity within the Nutch codebase.
Specifically, a number of tests [0][1][2][3] all extend a Util testing class 
which utilizes functionality from the gora-sql artifact.

My initial solution was to switch to using MemStore... which brought me to 
logging this issue!

Test [0] fails with the following unhelpful output. I need to debug this much 
more thoroughly.

{code}
Testcase: testGenerateHighest took 1.845 sec
        FAILED
expected:<2> but was:<0>
junit.framework.AssertionFailedError: expected:<2> but was:<0>
        at org.apache.nutch.crawl.TestGenerator.testGenerateHighest(TestGenerator.java:78)

Testcase: testGenerateHostLimit took 1.207 sec
        FAILED
expected:<1> but was:<0>
junit.framework.AssertionFailedError: expected:<1> but was:<0>
        at org.apache.nutch.crawl.TestGenerator.testGenerateHostLimit(TestGenerator.java:134)

Testcase: testGenerateDomainLimit took 1.175 sec
        FAILED
expected:<1> but was:<0>
junit.framework.AssertionFailedError: expected:<1> but was:<0>
        at org.apache.nutch.crawl.TestGenerator.testGenerateDomainLimit(TestGenerator.java:185)

Testcase: testFilter took 2.31 sec
        FAILED
expected:<3> but was:<0>
junit.framework.AssertionFailedError: expected:<3> but was:<0>
        at org.apache.nutch.crawl.TestGenerator.testFilter(TestGenerator.java:239)
{code}

Tests [1][2] fail identically with the following stack trace:

{code}   
Testcase: testInject took 1.931 sec
        Caused an ERROR
null
java.util.NoSuchElementException
        at java.util.TreeMap.key(TreeMap.java:1221)
        at java.util.TreeMap.firstKey(TreeMap.java:285)
        at org.apache.gora.memory.store.MemStore.execute(MemStore.java:122)
        at org.apache.nutch.util.CrawlTestUtil.readContents(CrawlTestUtil.java:112)
        at org.apache.nutch.crawl.TestInjector.readDb(TestInjector.java:104)
        at org.apache.nutch.crawl.TestInjector.testInject(TestInjector.java:62)
{code}
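
A note on the trace above: TreeMap.firstKey() is documented to throw 
NoSuchElementException when the map is empty, so the trace suggests that the 
map inside MemStore held no entries when the query was executed. That is an 
assumption on my part; I have not checked MemStore.java:122. A minimal sketch 
of the behaviour and of one possible guard follows. The class and method names 
are hypothetical and are not Gora code.

```java
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical helper, not part of Gora: illustrates TreeMap.firstKey() on an empty map.
class EmptyTreeMapSketch {

    // Returns true if firstKey() throws NoSuchElementException for the given map.
    static boolean firstKeyThrows(TreeMap<String, String> map) {
        try {
            map.firstKey();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    // One possible guard (an assumption, not the Gora fix): return null for an empty map.
    static String firstKeyOrNull(TreeMap<String, String> map) {
        return map.isEmpty() ? null : map.firstKey();
    }
}
```

Whether an empty store should produce an empty result or an error is a design 
decision for MemStore; the guard above only shows that the exception is 
avoidable.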

Finally, a multithreaded test in [3] fails with the following

{code}
java.util.ConcurrentModificationException
        at java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(TreeMap.java:1594)
        at java.util.TreeMap$NavigableSubMap$SubMapKeyIterator.next(TreeMap.java:1655)
        at org.apache.gora.memory.store.MemStore$MemResult.nextInner(MemStore.java:81)
        at org.apache.gora.query.impl.ResultBase.next(ResultBase.java:112)
        at org.apache.nutch.storage.TestGoraStorage.readWrite(TestGoraStorage.java:74)
        at org.apache.nutch.storage.TestGoraStorage.access$100(TestGoraStorage.java:41)
        at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:107)
        at org.apache.nutch.storage.TestGoraStorage$1.call(TestGoraStorage.java:102)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
{code}

I believe that the final failure is due to the use of TreeMap [5] as a 
private object in MemStore. TreeMap implementations are not synchronized. If 
multiple threads access a map concurrently, and at least one of the threads 
modifies the map structurally, it must be synchronized externally. (A 
structural modification is any operation that adds or deletes one or more 
mappings; merely changing the value associated with an existing key is not a 
structural modification.) This is typically accomplished by synchronizing on 
some object that naturally encapsulates the map. If no such object exists, the 
map should be "wrapped" using the Collections.synchronizedSortedMap method. 
This is best done at creation time, to prevent accidental unsynchronized access 
to the map, e.g.

   SortedMap m = Collections.synchronizedSortedMap(new TreeMap(...));

N.B. The note on TreeMap comes straight from the Oracle JavaDoc [5].
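
One caveat worth sketching: the ConcurrentModificationException above is thrown 
from an iterator, and the JavaDoc for Collections.synchronizedSortedMap states 
that the caller must still synchronize manually on the wrapped map while 
iterating over its views. The following is a minimal sketch under that 
assumption. The class, field and method names are hypothetical, and this is not 
the MemStore implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for a store that keeps its rows in a sorted map.
class SynchronizedSortedMapSketch {

    // Wrapped at creation time so that no unsynchronized reference escapes.
    private final SortedMap<String, Integer> map =
        Collections.synchronizedSortedMap(new TreeMap<String, Integer>());

    // Single calls such as put() are synchronized by the wrapper itself.
    void write(String key, int value) {
        map.put(key, value);
    }

    // Iteration is NOT covered by the wrapper: hold the map's lock while copying the keys.
    List<String> snapshotKeys() {
        synchronized (map) {
            return new ArrayList<String>(map.keySet());
        }
    }

    // Starts several threads that write and read concurrently, then returns the final key count.
    static int runConcurrently(int writers, int perWriter) {
        final SynchronizedSortedMapSketch store = new SynchronizedSortedMapSketch();
        List<Thread> threads = new ArrayList<Thread>();
        for (int w = 0; w < writers; w++) {
            final int id = w;
            Thread t = new Thread(() -> {
                for (int i = 0; i < perWriter; i++) {
                    store.write("w" + id + "-" + i, i); // keys are unique per writer
                    store.snapshotKeys();               // read while other threads write
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return store.snapshotKeys().size();
    }
}
```

java.util.concurrent.ConcurrentSkipListMap would be an alternative sorted map 
whose iterators do not throw ConcurrentModificationException; whether it suits 
MemStore is not something I have checked.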

[0] 
http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup
[1] 
http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestInjector.java?view=markup
[2] 
http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/fetcher/TestFetcher.java?view=markup
[3] 
http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/storage/TestGoraStorage.java?view=markup
[4] 
http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/util/AbstractNutchTest.java?view=markup
[5] http://docs.oracle.com/javase/6/docs/api/java/util/TreeMap.html

    
> Various Issues with MemStore 
> -----------------------------
>
>                 Key: GORA-225
>                 URL: https://issues.apache.org/jira/browse/GORA-225
>             Project: Apache Gora
>          Issue Type: Bug
>          Components: gora-core, testing
>    Affects Versions: 0.3
>         Environment: Nutch 2.x HEAD, gora-core 0.3
>            Reporter: Lewis John McGibbney
>             Fix For: 0.4
>
>
> In Nutch we have numerous testing scenarios which simulate persistence of 
> data to Gora in some form or other. This has worked well until now.
> Now that the gora-sql-0.1.1-incubating artifact is incompatible with gora-core 
> 0.3, this situation needs to be addressed in order to keep some 
> degree of integrity within the Nutch codebase.
> Specifically, a number of tests [0][1][2][3] all extend a Util testing class 
> which utilizes functionality from the gora-sql artifact.
> My initial solution was to switch to using MemStore... which brought me to 
> logging this issue!
> I've logged sub-issues here to clearly distinguish my observations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
