[ https://issues.apache.org/jira/browse/NUTCH-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated NUTCH-893:
------------------------------------

    Attachment: NUTCH-893.patch

Unit test to illustrate the issue.

> DataStore.put() silently loses records when executed from multiple processes
> ----------------------------------------------------------------------------
>
>                 Key: NUTCH-893
>                 URL: https://issues.apache.org/jira/browse/NUTCH-893
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 2.0
>         Environment: Gora HEAD, SqlStore, MySQL 5.1, Ubuntu 10.04 x64, Sun JDK 1.6
>            Reporter: Andrzej Bialecki 
>         Attachments: NUTCH-893.patch
>
>
> In order to debug the issue described in NUTCH-879, I created a test that
> simulates multiple clients appending to webtable (please see the patch), which
> is the situation we have in distributed map-reduce jobs.
> There are two tests: one uses multiple threads within the same JVM, and the
> other uses a single thread in each of multiple JVMs. Each test first clears
> webtable (be careful!), then puts a batch of pages, and finally verifies that
> all keys are present and that their values correspond to the keys. To make
> things more interesting, each execution context (thread or process) closes
> and reopens its instance of DataStore a few times.
> The multithreaded test passes just fine. However, the multi-process test
> fails with missing keys, as many as 30% of them.
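
For readers without the patch at hand, the shape of the multithreaded check
described above can be sketched roughly as follows. This is only an
illustration under stated assumptions: BufferedStore and MultiWriterCheck are
hypothetical names, the "table" is an in-memory map rather than webtable, and
nothing here is the Gora DataStore API or the code in NUTCH-893.patch. It does
not reproduce the multi-process failure; that variant would need separate JVMs
writing to the real backend.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a buffered store client (NOT the Gora DataStore API).
class BufferedStore implements AutoCloseable {
    private final Map<String, String> backend;                   // shared in-memory "table"
    private final Map<String, String> buffer = new HashMap<>();  // per-client write buffer

    BufferedStore(Map<String, String> backend) {
        this.backend = backend;
    }

    void put(String key, String value) {
        buffer.put(key, value);
    }

    // Buffered writes reach the shared table only when flushed.
    void flush() {
        backend.putAll(buffer);
        buffer.clear();
    }

    @Override
    public void close() {
        flush();
    }
}

class MultiWriterCheck {
    // Runs `writers` threads, each putting `keysPerWriter` pages and reopening
    // its store every `reopenEvery` puts; returns how many keys are missing
    // or carry a value that does not correspond to the key.
    static int countMissing(int writers, int keysPerWriter, int reopenEvery) {
        Map<String, String> table = new ConcurrentHashMap<>();
        table.clear(); // mirrors "clear the table first"; a no-op on a fresh map

        Thread[] threads = new Thread[writers];
        for (int w = 0; w < writers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                BufferedStore store = new BufferedStore(table);
                for (int i = 0; i < keysPerWriter; i++) {
                    String key = "writer" + id + "-page" + i;
                    store.put(key, "value-of-" + key);
                    if ((i + 1) % reopenEvery == 0) {
                        store.close();                    // flushes pending writes
                        store = new BufferedStore(table); // reopen a fresh client
                    }
                }
                store.close();
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }

        // Verify that every key is present and its value matches the key.
        int missing = 0;
        for (int w = 0; w < writers; w++) {
            for (int i = 0; i < keysPerWriter; i++) {
                String key = "writer" + w + "-page" + i;
                if (!("value-of-" + key).equals(table.get(key))) {
                    missing++;
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println("missing keys: " + countMissing(4, 100, 25));
    }
}
```

With the in-memory stand-in no keys should go missing, which matches the
multithreaded result reported above. The interesting case is the same loop run
from several JVMs against SqlStore, where the final verification comes up short.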

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
