While running the Mozilla Thunderbird mail client under helgrind, I got the following message:

==13832== Thread #1: pthread_cond_destroy: destruction of condition variable being waited upon

(See Mozilla Bugzilla entry https://bugzilla.mozilla.org/show_bug.cgi?id=819445, "~nsHTTPListener() destroys condition variable on which other threads are blocked.")

I wonder if we can learn WHICH THREADS (maybe their thread IDs) were waiting on that condition variable when this message is printed. It would be great to at least learn which thread (by tid, its starting address, or whatever) is waiting on the condition variable being destroyed. It would be insanely great if we could also learn exactly WHICH condition variable is meant (maybe by its address).

I think it is definitely worthwhile to print the address of the condition variable being destroyed even when symbolic information is not available, because in the same log I also often see "destruction of unknown cond var". With the address printed in both warnings, we could correlate them to see whether one thread is prematurely destroying a cond variable that other threads still assume exists, etc.

Looking at helgrind/hg_main.c starting at line 2153 (I quote the function map_cond_to_CVInfo_delete ( ThreadId tid, void* cond ) below), I think printing the value of 'cond' in hexadecimal would be enough to show the address of the condition variable. (I am not familiar with how to print the symbolic information.)

Or, since CVInfo is defined as follows early in hg_main.c,

    typedef
       struct {
          SO*   so;       /* libhb-allocated SO */
          void* mx_ga;    /* addr of associated mutex, if any */
          UWord nWaiters; /* # threads waiting on the CV */
       }
       CVInfo;

maybe we should print the value of mx_ga. I am not sure; I trust the valgrind developers to know which is correct.

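To make the idea of printing 'cond' (and mx_ga) concrete, here is a minimal standalone sketch of the message text I have in mind. It is not Valgrind code: format_cond_destroy_msg is a hypothetical helper, CVInfo is a cut-down copy with SO* replaced by an opaque pointer and UWord by unsigned long, and the exact wording of the messages is only a suggestion. Inside helgrind one would presumably use Valgrind's own formatting routines rather than snprintf.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Cut-down stand-in for helgrind's CVInfo (assumption: the real SO* and
   UWord types are replaced so this compiles outside Valgrind). */
typedef struct {
    void*         so;        /* libhb-allocated SO (opaque here) */
    void*         mx_ga;     /* addr of associated mutex, if any */
    unsigned long nWaiters;  /* # threads waiting on the CV */
} CVInfo;

/* Hypothetical helper, not part of Valgrind: builds the warning text for a
   cond var that is being destroyed.
   - cvi == NULL models the "unknown cond var" branch.
   - Otherwise the caller is expected to have checked cvi->nWaiters > 0,
     as map_cond_to_CVInfo_delete does before reporting.
   Returns snprintf's result (length the full message needs). */
static int format_cond_destroy_msg(char* buf, size_t len,
                                   const void* cond, const CVInfo* cvi)
{
    assert(buf != NULL && len > 0);

    if (cvi == NULL) {
        return snprintf(buf, len,
            "pthread_cond_destroy: destruction of unknown cond var 0x%" PRIxPTR,
            (uintptr_t)cond);
    }
    return snprintf(buf, len,
        "pthread_cond_destroy: destruction of condition variable 0x%" PRIxPTR
        " being waited upon (nWaiters=%lu, mutex=0x%" PRIxPTR ")",
        (uintptr_t)cond, cvi->nWaiters, (uintptr_t)cvi->mx_ga);
}
```

The resulting string would be what gets handed to HG_(record_error_Misc) in place of the fixed literals. With the address present in both messages, a later "unknown cond var" warning can be matched by eye against an earlier "being waited upon" warning.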
Also, for printing the thread ID or something similar, cvi->mx_ga can be used to call

    Lock *lk = map_locks_maybe_lookup( (Addr)cvi->mx_ga );

and then lk->heldBy seems to contain the list of thread information. If so, we can iterate through it to print the thread IDs, etc. (I am not sure this gives the waiters, though: heldBy records the threads holding the mutex, and a thread blocked in pthread_cond_wait has released it.)

=== QUOTE of map_cond_to_CVInfo_delete():

static void map_cond_to_CVInfo_delete ( ThreadId tid, void* cond ) {
   Thread* thr;
   UWord keyW, valW;

   thr = map_threads_maybe_lookup( tid );
   tl_assert(thr); /* cannot fail - Thread* must already exist */

   map_cond_to_CVInfo_INIT();
   if (VG_(delFromFM)( map_cond_to_CVInfo, &keyW, &valW, (UWord)cond )) {
      CVInfo* cvi = (CVInfo*)valW;
      tl_assert(keyW == (UWord)cond);
      tl_assert(cvi);
      tl_assert(cvi->so);
      if (cvi->nWaiters > 0) {
         HG_(record_error_Misc)(thr,
            "pthread_cond_destroy:"
            " destruction of condition variable being waited upon");
      }
      libhb_so_dealloc(cvi->so);
      cvi->mx_ga = 0;
      HG_(free)(cvi);
   } else {
      HG_(record_error_Misc)(thr,
         "pthread_cond_destroy: destruction of unknown cond var");
   }
}

=== END QUOTE

Comment: In both places where a warning is printed, I would like to see the address of the condition variable, and ideally symbolic information if it is available. Also, when a cond variable is destroyed while some threads are waiting on it, I would like to know the ID(s) of the waiting threads.

I run helgrind with the following parameters; adding a few other options such as --fair-sched=yes does not change the situation much.

env GTK_IM_MODULE=xim valgrind --tool=helgrind ~/TB-NEW/TB-3HG/objdir-tb3/mozilla/dist/bin/thunderbird-bin -profile /TB-NEW/TB-3HG/objdir-tb3/mozilla/_tests/mozmill/mozmillprofile -jsbridge 24242 -foreground

The message I got in one run (the Mozilla Bugzilla entry points to an uploaded full log of a different run):

==13832== ----------------------------------------------------------------
==13832==
==13832== Thread #1: pthread_cond_destroy: destruction of condition variable being waited upon
==13832==    at 0x4027A7F: pthread_cond_destroy_WRK (hg_intercepts.c:940)
==13832==    by 0x4029781: pthread_cond_destroy@* (hg_intercepts.c:958)
==13832==    by 0x47191BA: PR_DestroyCondVar (ptsynch.c:340)
==13832==    by 0x58C1A60: nsHTTPListener::~nsHTTPListener() (CondVar.h:56)
==13832==    by 0x58C1AF7: nsHTTPListener::Release() (nsNSSCallbacks.cpp:536)
==13832==    by 0x5F9C160: nsCOMPtr_base::assign_with_AddRef(nsISupports*) (nsCOMPtr.h:440)
==13832==    by 0x4C84784: nsStreamLoader::OnStopRequest(nsIRequest*, nsISupports*, tag_nsresult) (nsCOMPtr.h:622)
==13832==    by 0x4CFED70: mozilla::net::HttpBaseChannel::DoNotifyListener() (HttpBaseChannel.cpp:1463)
==13832==    by 0x4D0159C: mozilla::net::nsHttpChannel::HandleAsyncAbort() (HttpBaseChannel.h:347)
==13832==    by 0x4CFFFBA: nsRunnableMethodImpl<void (mozilla::net::nsHttpChannel::*)(), true>::Run() (nsThreadUtils.h:349)
==13832==    by 0x5FDE0EB: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:612)
==13832==    by 0x5FF556F: NS_InvokeByIndex_P (in /TB-NEW/TB-3HG/objdir-tb3/mozilla/toolkit/library/libxul.so)
==13832==

PS: Unfortunately, due to the sheer size of Thunderbird and its libraries, --read-var-info blows up my small PC's memory. I can't run the program with --read-var-info under 32-bit Linux at all. Under 64-bit Linux it runs, but paging is so heavy (I have about 6 GB dedicated to the VMplayer instance in which this is done) that the test harness for Thunderbird times out and stops the testing.
I wait 20 minutes for the initial network connection, which lets the test harness manipulate TB remotely through TB's own RPC, but it fails with a timeout. There is simply too much paging with 4-6 GB of memory available under 64-bit Linux when --read-var-info is specified. So obtaining symbolic information about the condition variable may be difficult on my PC: unless I have 8 GB or more, it may not be possible to run Thunderbird under helgrind with --read-var-info. But in other cases, where the memory demand is not that high, or on a powerful PC with, say, 16 GB of memory, learning the whereabouts and/or symbolic information of the condition variable being destroyed would be very useful for debugging.

PPS: An excerpt of the "destruction of unknown cond var" log follows. It would also be interesting to see the address of the "unknown cond var" printed. Coupled with the proposed printing of the address of a cond variable that is destroyed while still being waited upon (the address at least, even if symbolic information is not available), we could compare such addresses to see whether one routine is destroying a cond variable that another thread is about to destroy (again? maybe a race or improper locking of a critical region?).

The following warning is the first "destruction of unknown cond var" warning after the "destruction of condition variable being waited upon" warning discussed above, and I wonder whether the unknown cond var is the one that was destroyed above. (The address [which may be bogus by now] may help us find out.)

==13832== ----------------------------------------------------------------
==13832==
==13832== Thread #28: pthread_cond_destroy: destruction of unknown cond var
==13832==    at 0x4027A7F: pthread_cond_destroy_WRK (hg_intercepts.c:940)
==13832==    by 0x4029781: pthread_cond_destroy@* (hg_intercepts.c:958)
==13832==    by 0x47191BA: PR_DestroyCondVar (ptsynch.c:340)
==13832==    by 0x47ABE12: nssCertificate_Destroy (certificate.c:128)
==13832==    by 0x47ABE6E: NSSCertificate_Destroy (certificate.c:150)
==13832==    by 0x47A900B: CERT_DestroyCertificate (stanpcertdb.c:795)
==13832==    by 0x47EB163: pkix_pl_Cert_Destroy (pkix_pl_cert.c:1167)
==13832==    by 0x4803FFA: PKIX_PL_Object_DecRef (pkix_pl_object.c:891)
==13832==    by 0x47E2FD7: pkix_List_Destroy (pkix_list.c:89)
==13832==    by 0x4803FFA: PKIX_PL_Object_DecRef (pkix_pl_object.c:891)
==13832==    by 0x47E301F: pkix_List_Destroy (pkix_list.c:93)
==13832==    by 0x4803FFA: PKIX_PL_Object_DecRef (pkix_pl_object.c:891)
==13832==
==13832== ----------------------------------------------------------------
==13832==

_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users