While running the Mozilla Thunderbird mail client under helgrind,
I got the following message:

==13832== Thread #1: pthread_cond_destroy: destruction of condition variable being waited upon


(See the Mozilla Bugzilla entry
https://bugzilla.mozilla.org/show_bug.cgi?id=819445
~nsHTTPListener() destroys condition variable on which other threads
are blocked.)


Is there a way to learn WHICH THREADS (their thread ids, say) were
waiting on that condition variable when this message is printed?

It would be great to learn which thread (its tid, its starting
address, or whatever else identifies it) is waiting on the condition
variable that is being destroyed. It would be insanely great if we
could also learn exactly WHICH condition variable the message is about
(maybe its address?).

I think printing the address of the condition variable being destroyed
is definitely worthwhile even when symbolic information is not
available, because in the same log I also often see "destruction of
unknown cond var". We could then correlate the addresses printed by
these two warnings to see whether one thread is prematurely destroying
a cond variable that other threads still assume to exist.

Looking at helgrind/hg_main.c starting at line 2153 (I quote the
function map_cond_to_CVInfo_delete ( ThreadId tid, void* cond )
below), I think printing the value of 'cond' in hexadecimal would be
enough to show the address of the condition variable. (I am not
familiar with how to print the symbolic information.) Alternatively,
since CVInfo is defined as follows early in hg_main.c,

  typedef
     struct {
        SO*   so;       /* libhb-allocated SO */
        void* mx_ga;    /* addr of associated mutex, if any */
        UWord nWaiters; /* # threads waiting on the CV */
     }
     CVInfo;


maybe we should print the value of mx_ga instead; I am not sure. I trust
the Valgrind developers to know which is correct.
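
A minimal sketch of what the extended message text might look like, written as standalone C rather than as a Valgrind patch. MockCVInfo and format_cond_destroy_msg are made-up names, plain snprintf stands in for whatever formatting routine helgrind uses internally, and the "(cond=..., mutex=..., waiters=...)" suffix is only a suggested format, not something the quoted code prints.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for helgrind's CVInfo; only the fields used here (assumption). */
typedef struct {
    void*         mx_ga;    /* addr of associated mutex, if any */
    unsigned long nWaiters; /* # threads waiting on the CV */
} MockCVInfo;

/* Build the warning text with the cond var address appended.
 * cvi == NULL models the "unknown cond var" branch.
 * Returns what snprintf returns (length the full text needs). */
static int format_cond_destroy_msg(char* buf, size_t len,
                                   const void* cond, const MockCVInfo* cvi)
{
    assert(buf != NULL && len > 0);
    if (cvi == NULL) {
        return snprintf(buf, len,
                        "pthread_cond_destroy: destruction of unknown cond var"
                        " (cond=0x%" PRIxPTR ")", (uintptr_t)cond);
    }
    return snprintf(buf, len,
                    "pthread_cond_destroy: destruction of condition variable"
                    " being waited upon (cond=0x%" PRIxPTR
                    ", mutex=0x%" PRIxPTR ", waiters=%lu)",
                    (uintptr_t)cond, (uintptr_t)cvi->mx_ga, cvi->nWaiters);
}
```

Printing both the cond address and mx_ga sidesteps the question of which one is more useful: the cond address identifies the variable, and the mutex address helps find the code that waits on it.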

Also, to print the thread id or something similar,
cvi->mx_ga can be used to call
 Lock *lk =   map_locks_maybe_lookup( (Addr)cvi->mx_ga );

and then lk->heldBy seems to contain the list of thread information.
If so, we can iterate through it to print the thread ID, etc.
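
One caveat: a thread blocked in pthread_cond_wait has released the associated mutex, so a list of lock holders would not necessarily name the waiters. An alternative, sketched below as standalone C, is to record the waiting thread ids next to the waiter count. Everything here is hypothetical (WaiterCVInfo, the cv_* helpers, the fixed-size table); it is not how helgrind stores this information, only an illustration of the bookkeeping that would let the warning list the waiters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_WAITERS 8 /* arbitrary cap for the sketch */

/* Hypothetical variant of CVInfo that remembers who is waiting. */
typedef struct {
    unsigned int  waiters[MAX_WAITERS]; /* ids of waiting threads */
    unsigned long nWaiters;             /* # threads waiting on the CV */
} WaiterCVInfo;

/* Record that thread `tid` started waiting; returns 0 if the table is full. */
static int cv_add_waiter(WaiterCVInfo* cvi, unsigned int tid)
{
    assert(cvi != NULL);
    if (cvi->nWaiters >= MAX_WAITERS)
        return 0;
    cvi->waiters[cvi->nWaiters++] = tid;
    return 1;
}

/* Remove `tid` once its wait returns; returns 0 if it was not recorded.
 * The last entry is moved into the freed slot, so order is not preserved. */
static int cv_remove_waiter(WaiterCVInfo* cvi, unsigned int tid)
{
    assert(cvi != NULL);
    for (unsigned long i = 0; i < cvi->nWaiters; i++) {
        if (cvi->waiters[i] == tid) {
            cvi->waiters[i] = cvi->waiters[--cvi->nWaiters];
            return 1;
        }
    }
    return 0;
}

/* Write text such as "waiters: #3 #9" into buf, truncating if needed. */
static void cv_format_waiters(const WaiterCVInfo* cvi, char* buf, size_t len)
{
    assert(cvi != NULL && buf != NULL && len > 0);
    size_t used = (size_t)snprintf(buf, len, "waiters:");
    for (unsigned long i = 0; i < cvi->nWaiters && used < len; i++) {
        used += (size_t)snprintf(buf + used, len - used, " #%u",
                                 cvi->waiters[i]);
    }
}
```

In this scheme the add call would sit where a wait begins and the remove call where it returns, and the destroy path would append the formatted list to the warning whenever nWaiters is non-zero.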

=== QUOTE of map_cond_to_CVInfo_delete():

static void map_cond_to_CVInfo_delete ( ThreadId tid, void* cond ) {
   Thread*   thr;
   UWord keyW, valW;

   thr = map_threads_maybe_lookup( tid );
   tl_assert(thr); /* cannot fail - Thread* must already exist */

   map_cond_to_CVInfo_INIT();
   if (VG_(delFromFM)( map_cond_to_CVInfo, &keyW, &valW, (UWord)cond )) {
      CVInfo* cvi = (CVInfo*)valW;
      tl_assert(keyW == (UWord)cond);
      tl_assert(cvi);
      tl_assert(cvi->so);
      if (cvi->nWaiters > 0) {
         HG_(record_error_Misc)(thr,
                                "pthread_cond_destroy:"
                                " destruction of condition variable being waited upon");
      }
      libhb_so_dealloc(cvi->so);
      cvi->mx_ga = 0;
      HG_(free)(cvi);
   } else {
      HG_(record_error_Misc)(thr,
                             "pthread_cond_destroy: destruction of unknown cond var");
   }
}

=== END QUOTE
Comment:
In both places where a warning is printed, I would like to see
the address of the condition variable, and
ideally symbolic information if it is available.
Also, when a cond variable is destroyed while some threads are waiting
on it, I would like to know the ID(s) of the waiting threads.

I run helgrind with the following parameters. Adding a few other
options, such as --fair-sched=yes, does not change the situation much.
        
env GTK_IM_MODULE=xim valgrind --tool=helgrind \
  ~/TB-NEW/TB-3HG/objdir-tb3/mozilla/dist/bin/thunderbird-bin -profile \
  /TB-NEW/TB-3HG/objdir-tb3/mozilla/_tests/mozmill/mozmillprofile \
  -jsbridge 24242 -foreground


The message I got in one run (the Mozilla Bugzilla entry points to an
uploaded full log of a different run):

==13832== ----------------------------------------------------------------
==13832==
==13832== Thread #1: pthread_cond_destroy: destruction of condition variable being waited upon
==13832==    at 0x4027A7F: pthread_cond_destroy_WRK (hg_intercepts.c:940)
==13832==    by 0x4029781: pthread_cond_destroy@* (hg_intercepts.c:958)
==13832==    by 0x47191BA: PR_DestroyCondVar (ptsynch.c:340)
==13832==    by 0x58C1A60: nsHTTPListener::~nsHTTPListener() (CondVar.h:56)
==13832==    by 0x58C1AF7: nsHTTPListener::Release() (nsNSSCallbacks.cpp:536)
==13832==    by 0x5F9C160: nsCOMPtr_base::assign_with_AddRef(nsISupports*) (nsCOMPtr.h:440)
==13832==    by 0x4C84784: nsStreamLoader::OnStopRequest(nsIRequest*, nsISupports*, tag_nsresult) (nsCOMPtr.h:622)
==13832==    by 0x4CFED70: mozilla::net::HttpBaseChannel::DoNotifyListener() (HttpBaseChannel.cpp:1463)
==13832==    by 0x4D0159C: mozilla::net::nsHttpChannel::HandleAsyncAbort() (HttpBaseChannel.h:347)
==13832==    by 0x4CFFFBA: nsRunnableMethodImpl<void (mozilla::net::nsHttpChannel::*)(), true>::Run() (nsThreadUtils.h:349)
==13832==    by 0x5FDE0EB: nsThread::ProcessNextEvent(bool, bool*) (nsThread.cpp:612)
==13832==    by 0x5FF556F: NS_InvokeByIndex_P (in /TB-NEW/TB-3HG/objdir-tb3/mozilla/toolkit/library/libxul.so)
==13832==

PS: Unfortunately, due to the sheer size of Thunderbird and its
libraries, --read-var-info blows up my small PC's memory.

I can't run the program with --read-var-info under 32-bit Linux at
all. Under 64-bit Linux it runs, but paging is so heavy (I have about
6 GB dedicated to the VMplayer in which this is done) that the test
harness for Thunderbird times out and stops the testing. I wait 20
minutes for the initial network connection that lets the test harness
manipulate TB remotely through its own RPC, but it fails with a
timeout. There is simply too much paging with 4-6 GB of memory
available under 64-bit Linux when --read-var-info is specified.

So obtaining the symbolic information for the condition variable on
my PC may be difficult: unless I have 8 GB or more, it may not be
possible to run Thunderbird under helgrind with --read-var-info. But
in other cases, where the memory demand is not that high, or on a
powerful PC with, say, 16 GB of memory, learning the whereabouts and/or
symbolic information of the condition variable that is being
destroyed would be very useful for debugging.

PPS:
An excerpt of the "destruction of unknown cond var" log follows.
It would also be interesting to see the address of the
"unknown cond var" printed.
Combined with the proposed printing of the address (even when
symbolic information is not available) of a cond variable that is
destroyed while still being waited upon, we could compare such
addresses to see whether one routine is destroying a variable that
another thread is about to destroy (again? maybe a race or improper
locking of a critical region?).
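
The comparison could be done by eye once the addresses are in the log, but the idea can also be sketched in code: remember the last few cond var addresses that were destroyed, and check each "unknown cond var" address against that history. This is a standalone illustration; DestroyedLog, log_destroyed and was_recently_destroyed are invented names, and nothing like this is claimed to exist in helgrind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HISTORY 16 /* arbitrary history depth for the sketch */

/* Ring buffer of recently destroyed cond var addresses. */
typedef struct {
    uintptr_t addr[HISTORY];
    size_t    next;  /* slot to overwrite next */
    size_t    count; /* valid entries, at most HISTORY */
} DestroyedLog;

/* Remember that the cond var at `cond` has just been destroyed. */
static void log_destroyed(DestroyedLog* hist, const void* cond)
{
    assert(hist != NULL);
    hist->addr[hist->next] = (uintptr_t)cond;
    hist->next = (hist->next + 1) % HISTORY;
    if (hist->count < HISTORY)
        hist->count++;
}

/* Return 1 if `cond` matches one of the remembered addresses. */
static int was_recently_destroyed(const DestroyedLog* hist, const void* cond)
{
    assert(hist != NULL);
    for (size_t i = 0; i < hist->count; i++) {
        if (hist->addr[i] == (uintptr_t)cond)
            return 1;
    }
    return 0;
}
```

A match is only a hint, since the same address can be reused for a new object after the old one is freed, but it would be enough to tell a likely double destroy apart from a cond var the tool never saw being initialised.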

The following warning is the first "destruction of unknown cond var"
warning after the "destruction of condition variable being waited
upon" warning discussed above, and I wonder whether the unknown cond
var is the one that was destroyed above. (The address [which may be
bogus by now] would help us find out.)


==13832== ----------------------------------------------------------------
==13832==
==13832== Thread #28: pthread_cond_destroy: destruction of unknown cond var
==13832==    at 0x4027A7F: pthread_cond_destroy_WRK (hg_intercepts.c:940)
==13832==    by 0x4029781: pthread_cond_destroy@* (hg_intercepts.c:958)
==13832==    by 0x47191BA: PR_DestroyCondVar (ptsynch.c:340)
==13832==    by 0x47ABE12: nssCertificate_Destroy (certificate.c:128)
==13832==    by 0x47ABE6E: NSSCertificate_Destroy (certificate.c:150)
==13832==    by 0x47A900B: CERT_DestroyCertificate (stanpcertdb.c:795)
==13832==    by 0x47EB163: pkix_pl_Cert_Destroy (pkix_pl_cert.c:1167)
==13832==    by 0x4803FFA: PKIX_PL_Object_DecRef (pkix_pl_object.c:891)
==13832==    by 0x47E2FD7: pkix_List_Destroy (pkix_list.c:89)
==13832==    by 0x4803FFA: PKIX_PL_Object_DecRef (pkix_pl_object.c:891)
==13832==    by 0x47E301F: pkix_List_Destroy (pkix_list.c:93)
==13832==    by 0x4803FFA: PKIX_PL_Object_DecRef (pkix_pl_object.c:891)
==13832==
==13832== ----------------------------------------------------------------
==13832==




_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
