On Wed, 2012-11-07 at 10:51 +0100, David Faure wrote:

> > The idea of helgrind is that it detects lock order problems and/or
> > race condition problems *even* if no deadlock happens and/or if no
> > race condition really happened.
> > Maybe it is very unlikely that the trylock fails. Still would be nice
> > to discover the bug. And if the trylock does not fail, then the race
> > condition will then not be detected by helgrind.
> 
> The simpler construct that can lead to this problem is
> *  trylock mutex1
> *  access shared data
"discover the bug" is related to the doubful construct, not
to a race condition (as said above, if the trylock does not fail,
no way to detect the race condition that would happen if
the application does not properly handle the trylock failure).
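To make this concrete, here is a minimal sketch of the doubtful
construct (names are mine, just for illustration). When the result of
the trylock is ignored, the increment can be an unprotected access,
but helgrind can only report the race in the runs where the trylock
actually fails:

  #include <pthread.h>

  static pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
  static int shared;                 /* supposed to be protected by mutex1 */

  /* Doubtful construct: the result of the trylock is ignored. */
  static void *worker(void *arg)
  {
      pthread_mutex_trylock(&mutex1); /* failure is not handled ...        */
      shared++;                       /* ... so this can be an unprotected,
                                         racy access to 'shared'           */
      pthread_mutex_unlock(&mutex1);  /* and this can unlock a mutex the
                                         thread does not own               */
      return arg;
  }

  int main(void)
  {
      pthread_t t1, t2;
      pthread_create(&t1, 0, worker, 0);
      pthread_create(&t2, 0, worker, 0);
      pthread_join(t1, 0);
      pthread_join(t2, 0);
      return 0;
  }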

Note that currently, laog produces messages which should
be read as "lock order warning", not as
"for sure there is a deadlock-causing lock order problem".
The trylock is one case of "warning, not an error" that could/should
be improved.
But there are others, e.g. laog does not understand the concept of
"guard locks", which is (IIUC): each thread may acquire a single lock
from a set of locks directly. If a thread wants to acquire more than
one lock from the set (in any order), it first has to acquire the
"guard lock", and can then lock any number of locks from the set in
any order. With this guard lock, no deadlock is possible,
but for sure this is not understood by the current helgrind
laog algorithm.
Properly handling guard locks implies a more sophisticated algorithm
than the current laog (search for "good lock algorithm").
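For illustration, here is a minimal pthread sketch of such a guard
lock pattern (the lock names and functions are mine, not an existing
API). The two "multiple" threads take the inner locks in opposite
orders but cannot deadlock, because they first serialise on the
guard; laog nevertheless sees both orders and warns:

  #include <pthread.h>

  static pthread_mutex_t guard = PTHREAD_MUTEX_INITIALIZER;
  static pthread_mutex_t lockA = PTHREAD_MUTEX_INITIALIZER;
  static pthread_mutex_t lockB = PTHREAD_MUTEX_INITIALIZER;

  /* A thread needing a single lock from the set takes it directly. */
  static void *single(void *arg)
  {
      pthread_mutex_lock(&lockA);
      /* ... work ... */
      pthread_mutex_unlock(&lockA);
      return arg;
  }

  /* A thread needing several locks from the set first takes the guard,
     then may take the inner locks in any order.  Only one thread at a
     time holds the guard, so no circular wait on the inner locks can
     form; laog nevertheless records both "lockA before lockB" and
     "lockB before lockA" and warns about the lock order. */
  static void *multiple(void *arg)
  {
      int a_first = (arg != 0);
      pthread_mutex_lock(&guard);
      if (a_first) {
          pthread_mutex_lock(&lockA);
          pthread_mutex_lock(&lockB);
      } else {
          pthread_mutex_lock(&lockB);
          pthread_mutex_lock(&lockA);
      }
      /* ... work ... */
      pthread_mutex_unlock(&lockB);
      pthread_mutex_unlock(&lockA);
      pthread_mutex_unlock(&guard);
      return 0;
  }

  int main(void)
  {
      pthread_t t1, t2, t3;
      pthread_create(&t1, 0, multiple, (void *)1);
      pthread_create(&t2, 0, multiple, (void *)0);
      pthread_create(&t3, 0, single, 0);
      pthread_join(t1, 0);
      pthread_join(t2, 0);
      pthread_join(t3, 0);
      return 0;
  }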


> This is too monolithic thinking :)
> An application that uses a given framework (e.g. Qt) could very well be doing 
> things differently than the framework does internally.
> As a developer debugging a large application written by other people, using a 
> framework written by other people, how can I guarantee helgrind that all 
> trylock uses follow a single design pattern?
Of course, this cannot be guaranteed. But the idea is not that
the command line option(s) cover the full range of possible
"design patterns". The idea is that they should cover 'out of the box'
a reasonable range of such patterns. This is true e.g. for memcheck
(out of the box, it supports malloc-compatible libraries, but otherwise
you need annotations).
Helgrind should likewise support some locking schemes 'out of the box'.
The difficult question is what should be covered by the command line
options 'out of the box' (the nirvana being an algorithm that would
work for everything without annotations).


> At the time of the trylock, it is the last one -> no warning at that precise 
> moment. This sounds like a simple enough change in the current algorithm?
> Basically adding one if() ... if only I knew where ;)
Avoiding a "lock order warning" when doing the trylock is easy,
I believe:
in hg_main.c:3697, put a condition
   if (!is_a_trylock)
before:
   other = laog__do_dfs_from_to(lk, thr->locksetA);
   if (other) {
   ....
(where is_a_trylock has to be given by the caller).
I think it is almost mechanical work to add arguments to the *POST event
handlers and the corresponding requests to transfer the "is a trylock"
information from the helgrind interception down to line 3697.

But I suspect that the insertion of a trylock in the graph
might later on cause a 'wrong' cycle to be detected.
E.g. (L = lock, T = trylock, L and T followed by lock nr)
  threadA L1 T2
  threadB L2 L3
  threadC L3 L1
cannot deadlock (I think :) if threadA releases lock 1
when T2 fails.
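Here is a minimal pthread sketch of this scenario (the thread bodies
are mine, only to make the locking pattern concrete). threadA backs
off by releasing lock 1 whenever the trylock of lock 2 fails, so no
circular wait can form:

  #include <pthread.h>

  static pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
  static pthread_mutex_t m2 = PTHREAD_MUTEX_INITIALIZER;
  static pthread_mutex_t m3 = PTHREAD_MUTEX_INITIALIZER;

  static void *threadA(void *arg)        /* L1 T2 */
  {
      for (;;) {
          pthread_mutex_lock(&m1);
          if (pthread_mutex_trylock(&m2) == 0)
              break;                     /* got both locks */
          pthread_mutex_unlock(&m1);     /* back off: release 1 on failure */
      }
      /* ... work with m1 and m2 held ... */
      pthread_mutex_unlock(&m2);
      pthread_mutex_unlock(&m1);
      return arg;
  }

  static void *threadB(void *arg)        /* L2 L3 */
  {
      pthread_mutex_lock(&m2);
      pthread_mutex_lock(&m3);
      pthread_mutex_unlock(&m3);
      pthread_mutex_unlock(&m2);
      return arg;
  }

  static void *threadC(void *arg)        /* L3 L1 */
  {
      pthread_mutex_lock(&m3);
      pthread_mutex_lock(&m1);
      pthread_mutex_unlock(&m1);
      pthread_mutex_unlock(&m3);
      return arg;
  }

  int main(void)
  {
      pthread_t a, b, c;
      pthread_create(&a, 0, threadA, 0);
      pthread_create(&b, 0, threadB, 0);
      pthread_create(&c, 0, threadC, 0);
      pthread_join(a, 0);
      pthread_join(b, 0);
      pthread_join(c, 0);
      return 0;
  }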

But when the L3 L1 edge is inserted, a cycle will be found
via T2 (if the graph has not remembered that this is a trylock).
So I am still (somewhat intuitively) thinking that we need
to have nodes and/or edges marked with "this is a trylock"
and have the graph exploration take these markings into account
so as not to generate such "false warnings".
Far from a mathematical proof, I know :).
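To illustrate the intuition, here is a tiny self-contained model (not
helgrind code, just a toy) of a lock order graph in which edges
created by a trylock are marked. For the example above, a naive
search that follows the trylock edge reports a cycle when the L3 L1
edge is about to be added, while a search that skips trylock-marked
edges does not:

  #include <stdio.h>
  #include <stdbool.h>

  /* Toy lock order graph: edge[a][b] != NONE means "a was held while
     acquiring b"; TRY marks edges created by a trylock.  This is only
     a model of the idea, not how helgrind's laog is implemented. */
  #define NLOCKS 4
  enum { NONE = 0, BLOCKING = 1, TRY = 2 };
  static int edge[NLOCKS][NLOCKS];

  static void acquired(int held, int acq, int kind)
  {
      edge[held][acq] = kind;
  }

  /* DFS: can 'from' reach 'to'?  If ignore_try, trylock edges are
     not followed. */
  static bool reaches(int from, int to, bool ignore_try, bool *seen)
  {
      if (from == to) return true;
      seen[from] = true;
      for (int next = 0; next < NLOCKS; next++) {
          if (edge[from][next] == NONE) continue;
          if (ignore_try && edge[from][next] == TRY) continue;
          if (!seen[next] && reaches(next, to, ignore_try, seen))
              return true;
      }
      return false;
  }

  /* Would adding the edge held->acq close a cycle? */
  static bool closes_cycle(int held, int acq, bool ignore_try)
  {
      bool seen[NLOCKS] = { false };
      return reaches(acq, held, ignore_try, seen);
  }

  int main(void)
  {
      /* The example above: threadA L1 T2, threadB L2 L3, threadC L3 L1 */
      acquired(1, 2, TRY);       /* threadA holds 1, trylocks 2 */
      acquired(2, 3, BLOCKING);  /* threadB holds 2, locks 3    */

      /* threadC is about to lock 1 while holding 3. */
      printf("naive search:         %s\n",
             closes_cycle(3, 1, false) ? "cycle -> warning" : "no cycle");
      printf("trylock-aware search: %s\n",
             closes_cycle(3, 1, true)  ? "cycle -> warning" : "no cycle");
      return 0;
  }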

Philippe


