DataStore.put() silently loses records when executed from multiple processes
----------------------------------------------------------------------------

                 Key: NUTCH-893
                 URL: https://issues.apache.org/jira/browse/NUTCH-893
             Project: Nutch
          Issue Type: Bug
         Environment: Gora HEAD, SqlStore, MySQL 5.1, Ubuntu 10.4 x64, Sun JDK 1.6
            Reporter: Andrzej Bialecki 


To debug the issue described in NUTCH-879, I created a test that simulates 
multiple clients appending to webtable (please see the patch), which is the 
situation we have in distributed map-reduce jobs.

There are two tests there: one that uses multiple threads within the same JVM, 
and another that uses a single thread in each of multiple JVMs. Each test first 
clears webtable (be careful!), then puts a bunch of pages, and finally verifies 
that all of them are present and that their values correspond to their keys. To 
make things more interesting, each execution context (thread or process) closes 
and reopens its instance of DataStore a few times.
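
For readers without the patch at hand, the shape of the multithreaded variant 
can be sketched roughly as below. This is only an illustration, not the code 
from the patch: the Store interface, open(), run() and the key format are 
hypothetical names, and an in-memory map stands in for webtable instead of the 
real Gora DataStore. Because the stand-in cannot lose writes, the sketch shows 
the clear / put / reopen / verify sequence only and does not reproduce the bug.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AppendTestSketch {

    /** Hypothetical stand-in for a store handle; NOT the Gora DataStore API. */
    interface Store extends AutoCloseable {
        void put(String key, String value);

        @Override
        void close();
    }

    /** Opens a handle over the shared in-memory map (assumption: replaces webtable). */
    static Store open(Map<String, String> backing) {
        return new Store() {
            @Override
            public void put(String key, String value) {
                backing.put(key, value);
            }

            @Override
            public void close() {
                // A real store would flush buffered writes here.
            }
        };
    }

    static String keyFor(int worker, int session, int i) {
        return "w" + worker + "-s" + session + "-p" + i;
    }

    /** Runs the test and returns the number of missing or mismatched keys. */
    static int run(Map<String, String> table, int workers, int sessions, int putsPerSession)
            throws InterruptedException {
        table.clear(); // the test starts from an empty table

        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                // Each worker closes and reopens its handle between sessions.
                for (int s = 0; s < sessions; s++) {
                    try (Store store = open(table)) {
                        for (int i = 0; i < putsPerSession; i++) {
                            String key = keyFor(id, s, i);
                            store.put(key, "value-of-" + key);
                        }
                    }
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            t.join();
        }

        // Verify that every key is present and its value corresponds to the key.
        int missing = 0;
        for (int w = 0; w < workers; w++) {
            for (int s = 0; s < sessions; s++) {
                for (int i = 0; i < putsPerSession; i++) {
                    String key = keyFor(w, s, i);
                    if (!("value-of-" + key).equals(table.get(key))) {
                        missing++;
                    }
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        int missing = run(new ConcurrentHashMap<>(), 4, 3, 50);
        System.out.println("missing keys: " + missing);
    }
}
```

The multi-process variant would follow the same sequence, with each worker 
running in its own JVM against the shared database rather than as a thread.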

The multithreaded test passes just fine. However, the multi-process test fails: 
as many as 30% of the keys are missing.

