[ https://issues.apache.org/jira/browse/NUTCH-893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrzej Bialecki updated NUTCH-893:
------------------------------------

    Attachment: NUTCH-893.patch

Unit test to illustrate the issue.

> DataStore.put() silently loses records when executed from multiple processes
> ----------------------------------------------------------------------------
>
>                 Key: NUTCH-893
>                 URL: https://issues.apache.org/jira/browse/NUTCH-893
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 2.0
>         Environment: Gora HEAD, SqlStore, MySQL 5.1, Ubuntu 10.4 x64, Sun JDK 1.6
>            Reporter: Andrzej Bialecki
>         Attachments: NUTCH-893.patch
>
>
> In order to debug the issue described in NUTCH-879 I created a test to
> simulate multiple clients appending to webtable (please see the patch), which
> is the situation that we have in distributed map-reduce jobs.
> There are two tests there: one that uses multiple threads within the same
> JVM, and another that uses a single thread in each of multiple JVMs. Each test
> first clears webtable (be careful!), then puts a bunch of pages, and finally
> checks that all are present and that their values correspond to their keys.
> To make things more interesting, each execution context (thread or process)
> closes and reopens its instance of DataStore a few times.
> The multithreaded test passes just fine. However, the multi-process test
> fails with missing keys, as many as 30%.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
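
The procedure described in the issue (clear the table, let several writers put pages while reopening their store handle, then verify every key and value) might be sketched roughly as follows. This is only an illustration of the multithreaded variant and is not the code in NUTCH-893.patch. The class `MultiWriterCheck`, the `StoreHandle` stand-in, the key scheme and the in-memory map are all assumptions made for the sketch; none of them is Gora or Nutch API. Because every writer shares one in-memory map, the sketch cannot reproduce the record loss reported for multiple processes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class MultiWriterCheck {

    /** Hypothetical stand-in for a store handle; NOT the Gora DataStore API. */
    static final class StoreHandle implements AutoCloseable {
        private final Map<String, String> backend;

        StoreHandle(Map<String, String> backend) {
            this.backend = backend;
        }

        void put(String key, String value) {
            backend.put(key, value);
        }

        @Override
        public void close() {
            // A real store would flush buffered writes here.
        }
    }

    /** Hypothetical key scheme: one distinct URL per (writer, page) pair. */
    static String keyFor(int writer, int page) {
        return "http://example.com/" + writer + "/" + page;
    }

    /** Runs the writers, then returns the number of missing or wrong records. */
    static int run(int writers, int pagesPerWriter, int reopens) {
        Map<String, String> webtable = new ConcurrentHashMap<>();
        webtable.clear(); // mirrors "first clears webtable"

        // Each writer works through its pages in chunks, one handle per chunk.
        final int chunk = Math.max(1, pagesPerWriter / Math.max(1, reopens));
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        for (int w = 0; w < writers; w++) {
            final int id = w;
            pool.submit(() -> {
                for (int start = 0; start < pagesPerWriter; start += chunk) {
                    // Open a fresh handle for every chunk, close it afterwards.
                    try (StoreHandle store = new StoreHandle(webtable)) {
                        int end = Math.min(pagesPerWriter, start + chunk);
                        for (int i = start; i < end; i++) {
                            String key = keyFor(id, i);
                            store.put(key, "value-of-" + key);
                        }
                    }
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }

        // Verify that every key is present and its value corresponds to it.
        int missing = 0;
        for (int w = 0; w < writers; w++) {
            for (int i = 0; i < pagesPerWriter; i++) {
                String key = keyFor(w, i);
                if (!("value-of-" + key).equals(webtable.get(key))) {
                    missing++;
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println("missing or wrong records: " + run(4, 250, 3));
    }
}
```

A multi-process variant would presumably launch one JVM per writer against the same SQL-backed table and run the verification step from a separate process once all writers have exited.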