[jira] Commented: (ZOOKEEPER-333) helgrind thread issues identified in mt c client code

2009-03-06 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679682#action_12679682
 ] 

Mahadev konar commented on ZOOKEEPER-333:
-

Helgrind: Fatal internal error -- cannot continue.
Helgrind: mk_SHVAL_ShR(tset=8192,lset=1): FAILED
Helgrind: max allowed tset=8191, lset=131071
Helgrind: program has too many thread sets or lock sets to track.


This is the error helgrind generates when it hangs during the tests and is
unable to run all of them.

> helgrind thread issues identified in mt c client code
> -
>
> Key: ZOOKEEPER-333
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-333
> Project: Zookeeper
> Issue Type: Bug
> Components: c client
> Reporter: Patrick Hunt
> Assignee: Mahadev konar
> Priority: Critical
> Fix For: 3.1.1, 3.2.0
>
> Attachments: helgrind_mt.out, helgrind_mt.out.gz
>
>
> helgrind reported a number of issues; I pulled out a bunch of them. Most are
> related to the tests, but some are real issues in the mt zk client code:
> valgrind --tool=helgrind --log-file=helgrind_mt.out ./zktest-mt
> ==31294== Thread #2: pthread_cond_{timed}wait called with un-held mutex
> ==31294==    at 0x4027F8F: pthread_cond_w...@* (hg_intercepts.c:560)
> ==31294==    by 0x404D881: pthread_cond_w...@glibc_2.0 (in /lib/tls/i686/cmov/libpthread-2.8.90.so)
> ==31294==    by 0x4028037: pthread_cond_w...@* (hg_intercepts.c:574)
> ==31294==    by 0x809EBB7: pthread_cond_wait (PthreadMocks.cc:54)
> ==31294==    by 0x80ABCF6: notify_thread_ready (mt_adaptor.c:136)
> ==31294==    by 0x80ABE90: do_io (mt_adaptor.c:277)
> ==31294== Possible data race during write of size 4 at 0x42E9A58
> ==31294==    at 0x8050D83: terminateZookeeperThreads(_zhandle*) (ZKMocks.cc:518)
> ==31294==    by 0x805543B: DeliverWatchersWrapper::call(_zhandle*, int, int, char const*, watcher_object_list**) (ZKMocks.cc:261)
> ==31294==    by 0x80520F7: __wrap_deliverWatchers (ZKMocks.cc:220)
> ==31294==    by 0x80A287B: process_completions (zookeeper.c:1393)
> ==31294==    by 0x80ABDAA: do_completion (mt_adaptor.c:332)
> ==31294== Possible data race during write of size 4 at 0xBEFF5F30
> ==31294==    at 0x80589AF: Zookeeper_watchers::ConnectionWatcher::~ConnectionWatcher() (TestWatchers.cc:54)
> ==31294==    by 0x805D062: Zookeeper_watchers::testDefaultSessionWatcher1() (TestWatchers.cc:438)
> ==31294==    by 0x805608C: CppUnit::TestCaller::runTest() (TestCaller.h:166)
> ==31294== Possible data race during write of size 4 at 0x42EB104
> ==31294==    at 0x80A03EE: queue_completion (zookeeper.c:1776)
> ==31294==    by 0x80A3A44: zookeeper_process (zookeeper.c:1598)
> ==31294==    by 0x80AC00B: do_io (mt_adaptor.c:309)
> ==31294== Thread #29: pthread_cond_{timed}wait called with un-held mutex
> ==31294==    at 0x4027F8F: pthread_cond_w...@* (hg_intercepts.c:560)
> ==31294==    by 0x404D881: pthread_cond_w...@glibc_2.0 (in /lib/tls/i686/cmov/libpthread-2.8.90.so)
> ==31294==    by 0x4028037: pthread_cond_w...@* (hg_intercepts.c:574)
> ==31294==    by 0x809EBB7: pthread_cond_wait (PthreadMocks.cc:54)
> ==31294==    by 0x80AB9B3: wait_sync_completion (mt_adaptor.c:82)
> ==31294==    by 0x80A1E82: zoo_wget (zookeeper.c:2517)
> ==31294==    by 0x80A1F13: zoo_get (zookeeper.c:2497)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (ZOOKEEPER-333) helgrind thread issues identified in mt c client code

2009-03-06 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679677#action_12679677
 ] 

Patrick Hunt commented on ZOOKEEPER-333:


P.S. I'm on the latest Ubuntu Intrepid Ibex, with a single-core CPU.





[jira] Updated: (ZOOKEEPER-333) helgrind thread issues identified in mt c client code

2009-03-06 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-333:
---

Attachment: helgrind_mt.out.gz

This latest gzip'd helgrind output file is from svn trunk:

At revision 751008.

Note: the tests hang at the following step:

Zookeeper_operations::testOperationsAndDisconnectConcurrently1^CKilled

I had to ctrl-c the test to stop it. Perhaps this means helgrind is tickling a
real problem (or problems)? We should rerun helgrind after resolving the basic
issues currently in the log, and see if this hang still occurs. Perhaps you
guys could also look at the test in question to see if anything catches your
eye? The test should not be failing.





[jira] Issue Comment Edited: (ZOOKEEPER-333) helgrind thread issues identified in mt c client code

2009-03-06 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679667#action_12679667
 ] 

mahadev edited comment on ZOOKEEPER-333 at 3/6/09 9:50 AM:
-

Chris,
the run was from the trunk. I don't have a revision number, but I think there
hasn't been any code change to the source files since I uploaded the traces, so
the line numbers should match.

  was (Author: mahadev):
Chris,
the run was from the trunk. I don't have a revision number, but I think there
hasn't been any code change to the files mentioned since I uploaded the file,
so the line numbers should match.
  



[jira] Commented: (ZOOKEEPER-333) helgrind thread issues identified in mt c client code

2009-03-06 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679667#action_12679667
 ] 

Mahadev konar commented on ZOOKEEPER-333:
-

Chris,
the run was from the trunk. I don't have a revision number, but I think there
hasn't been any code change to the files mentioned since I uploaded the file,
so the line numbers should match.




[jira] Commented: (ZOOKEEPER-318) remove locking in zk_hashtable.c or add locking in collect_keys()

2009-03-06 Thread Chris Darroch (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679633#action_12679633
 ] 

Chris Darroch commented on ZOOKEEPER-318:
-

Cool, thanks.  Much appreciated.

> remove locking in zk_hashtable.c or add locking in collect_keys()
> -
>
> Key: ZOOKEEPER-318
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-318
> Project: Zookeeper
> Issue Type: Bug
> Components: c client
> Affects Versions: 3.0.0, 3.0.1, 3.1.0
> Reporter: Chris Darroch
> Assignee: Chris Darroch
> Fix For: 3.2.0
>
> Attachments: ZOOKEEPER-318.patch
>
>
> From a review of zk_hashtable.c it appears to me that all functions which 
> manipulate the hashtables are called from the IO thread, and therefore any 
> need for locking is obviated.
> If I'm wrong about that, then I think at a minimum collect_keys() should 
> acquire a lock in the same manner as collect_session_watchers().  Both 
> iterate over hashtable contents (in the latter case using copy_table()).
> However, from what I can see, the only function (besides the init/destroy 
> functions used when creating a zhandle_t) called from the completion thread 
> is deliverWatchers(), which simply iterates over a "delivery" list created 
> from the hashtables by collectWatchers().  The activateWatcher() function 
> contains comments which describe it being called by the completion thread, 
> but in fact it is called by the IO thread in zookeeper_process().
> I believe all calls to collectWatchers(), activateWatcher(), and 
> collect_keys() are made by the IO thread in zookeeper_interest(), 
> zookeeper_process(), check_events(), send_set_watches(), and handle_error().  
> Note that queue_session_event() is aliased as PROCESS_SESSION_EVENT, but 
> appears only in handle_error() and check_events().
> Also note that handle_error() is called only in zookeeper_process() and 
> handle_socket_error_msg(), which is used only by the IO thread, so far as I 
> can see.
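
As an illustration of the second option in the description (taking a lock
while iterating), here is a minimal sketch. It is not the code in
zk_hashtable.c: struct sketch_table, sketch_table_add() and
sketch_collect_keys() are hypothetical names, and a fixed array stands in for
the real hashtable. It only shows a reader copying keys out under the same
mutex that writers hold.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define SKETCH_MAX_KEYS 8

/* Hypothetical stand-in for a watcher table; NOT the zk_hashtable.c type. */
struct sketch_table {
    pthread_mutex_t lock;                  /* guards keys[] and count */
    const char     *keys[SKETCH_MAX_KEYS];
    int             count;
};

/* Returns 0 on success or a pthread error code. */
int sketch_table_init(struct sketch_table *t)
{
    assert(t != NULL);
    t->count = 0;
    return pthread_mutex_init(&t->lock, NULL);
}

/* Writer: appends a key while holding the lock. */
int sketch_table_add(struct sketch_table *t, const char *key)
{
    int rc = pthread_mutex_lock(&t->lock);
    if (rc != 0)
        return rc;
    if (t->count < SKETCH_MAX_KEYS)
        t->keys[t->count++] = key;
    else
        rc = ENOSPC;                       /* table full */
    pthread_mutex_unlock(&t->lock);
    return rc;
}

/*
 * Reader: copies up to max keys into out[] while holding the same lock,
 * so a concurrent sketch_table_add() cannot change the table mid-iteration.
 * Returns the number of keys copied, or -1 if the lock could not be taken.
 */
int sketch_collect_keys(struct sketch_table *t, const char **out, int max)
{
    int n = 0;
    int i;

    if (pthread_mutex_lock(&t->lock) != 0)
        return -1;
    for (i = 0; i < t->count && n < max; i++)
        out[n++] = t->keys[i];
    pthread_mutex_unlock(&t->lock);
    return n;
}
```

If, as the description argues, every caller really runs on the IO thread, the
lock in a sketch like this would be unnecessary; that is the first option.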




[jira] Commented: (ZOOKEEPER-333) helgrind thread issues identified in mt c client code

2009-03-06 Thread Chris Darroch (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679629#action_12679629
 ] 

Chris Darroch commented on ZOOKEEPER-333:
-

Some of these look to be innocuous at first glance, such as "wait called with 
un-held mutex" in wait_sync_completion().  I suppose the issue here might be 
that wait_sync_completion() doesn't check the return value from 
pthread_mutex_lock() before proceeding, but in practice I believe one can 
reasonably assume it always succeeds.
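
To make the point about the unchecked return value concrete, here is a
minimal sketch of a wait that bails out if the lock cannot be taken. It is not
the actual mt_adaptor.c code: struct sync_completion_sketch and the sketch_*
functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a sync completion; NOT the client's real type. */
struct sync_completion_sketch {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             complete;   /* predicate, guarded by lock */
};

/* Returns 0 on success or a pthread error code. */
int sketch_init(struct sync_completion_sketch *sc)
{
    int rc;

    assert(sc != NULL);
    rc = pthread_mutex_init(&sc->lock, NULL);
    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&sc->cond, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&sc->lock);
        return rc;
    }
    sc->complete = 0;
    return 0;
}

/* Blocks until complete is set. Returns 0 or a pthread error code. */
int sketch_wait(struct sync_completion_sketch *sc)
{
    int rc = pthread_mutex_lock(&sc->lock);
    if (rc != 0)
        return rc;              /* never call cond_wait with an un-held mutex */
    while (rc == 0 && !sc->complete)
        rc = pthread_cond_wait(&sc->cond, &sc->lock);
    pthread_mutex_unlock(&sc->lock);
    return rc;
}

/* Sets complete and wakes all waiters. Returns 0 or a pthread error code. */
int sketch_notify(struct sync_completion_sketch *sc)
{
    int rc = pthread_mutex_lock(&sc->lock);
    if (rc != 0)
        return rc;
    sc->complete = 1;
    rc = pthread_cond_broadcast(&sc->cond);
    pthread_mutex_unlock(&sc->lock);
    return rc;
}
```

The predicate loop also guards against spurious wakeups. Whether the real
wait_sync_completion() needs this check is a judgment call, as noted above.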

One thing that might help a bit would be to have a reference to the C code 
which produced this output because it refers to line numbers in the code.  Was 
the run against 3.1.0 or against a snapshot from trunk?  If the latter, do you 
know the SVN revision number?
