[Valgrind-developers] helgrind: some more ideas

Konstantin Serebryany Tue, 19 Feb 2008 06:57:30 -0800

Hello all,

I've been doing some more experiments with helgrind and I would like
to share some ideas...


The 'test' programs I've run have a few things in common:
- Hundreds of threads.
- Heavy use of message queues, condvars, and other 'happens-before'
stuff. Mutexes are used as well.
- Strict timeouts (i.e. if some action is delayed too much, the
program starts behaving differently).
- Strict limits on memory usage.


Accuracy:

The current default memory state machine (MSMHelgrind) gives too many
false reports in the presence of message queues, condvars, etc.
So far, I find MSMProp1 to be more accurate, but noticeably slower
(mostly because it has to check happens-before in all states).

Speed:

MSMHelgrind leads to a 40x-150x slowdown (mostly depending on the
number of active threads, since valgrind runs on one core).
MSMProp1 is 20%-100% slower than MSMHelgrind (depending on the size of
the HB graph?).
Even though it is possible to speed up both machines further, I don't
think we can get down to a 20x slowdown by doing just this.

If we can ignore the majority of all memory accesses, we will speed up
the whole thing.
I have implemented the following:
   if ((address % X) != Y) we ignore this memory access.
   X and Y are command line parameters (the % operation is optimized
   for some values).
   0 <= Y < X
   A typical value of X is between 3 and 30.

This hack brings helgrind's slowdown to roughly 10x-15x and makes my
strict timeouts happy.
We will of course miss some races, but it is better than nothing
(again, this is needed only in the presence of strict timeouts).
We can also run helgrind several times with different values of Y, so
that each run checks a different subset of addresses.
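
The filter described above can be sketched in a few lines of C. This
is only an illustration, not the code from the patch: the function
names should_check_access and count_checked are made up here, and the
real implementation lives inside helgrind's memory-access handlers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a helgrind function): true if the access at
   `addr` should be checked, i.e. addr falls into residue class `y`
   modulo `x`. Every other access is ignored. Requires 0 <= y < x. */
bool should_check_access(uintptr_t addr, unsigned x, unsigned y)
{
    assert(x != 0 && y < x);
    return (addr % x) == y;
}

/* Counts how many of the `n` consecutive addresses starting at `base`
   would be checked in a run with parameters (x, y). */
unsigned count_checked(uintptr_t base, unsigned n, unsigned x, unsigned y)
{
    unsigned checked = 0;
    for (unsigned i = 0; i < n; i++)
        if (should_check_access(base + i, x, y))
            checked++;
    return checked;
}
```

Each address belongs to exactly one residue class, so X runs with
Y = 0 .. X-1 together check every address once, while a single run
checks about 1/X of them.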

Most other ways to speed up helgrind are complementary to this hack.


Memory:

Helgrind's memory consumption on my tests is about 2x-2.5x.
Some tests have limits on memory usage and I have to suppress those
limits (and use a machine with lots of RAM).
Also, if I modify the hack above to
    if (((address >> N_SECMAP_BITS) % X) != Y)
I reduce the memory usage as well, since whole secondary-map-sized
ranges are ignored and no shadow values are created for them.
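
A sketch of this coarser filter, for illustration only. The value 13
for N_SECMAP_BITS is an assumption made for the example (the real
value is whatever helgrind's shadow-memory code defines), and the
helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only. */
#define N_SECMAP_BITS 13

/* Index of the secondary-map-sized block that contains `addr`. */
uintptr_t secmap_index(uintptr_t addr)
{
    return addr >> N_SECMAP_BITS;
}

/* Hypothetical helper: true if the access should be checked. The
   decision depends only on the block index, so all addresses in one
   block are either all checked or all ignored. Requires 0 <= y < x. */
bool should_check_block(uintptr_t addr, unsigned x, unsigned y)
{
    assert(x != 0 && y < x);
    return (secmap_index(addr) % x) == y;
}
```

Because an ignored block is never touched by the checker, its shadow
storage never has to be allocated, which is where the memory saving
would come from.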


Segments:

If we have a lot of happens-before synchronization we create many
segments.
We need a very fast SegmentID->Segment mapping (especially for
MSMProp1).
I suggest a resizable array of segments instead of a linked list.
The SegmentID->Segment mapping then becomes a plain array index. It
also reduces memory usage a bit.
We can't have more than N segments at a time (limited by RAM), so
ideally we have to recycle old segments (I have not done this yet).
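
A minimal sketch of such a table, assuming a SegmentID is simply the
index of the segment in the array. The struct fields and function
names are placeholders invented for this example, and recycling of old
segments is not shown.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned SegmentID;   /* assumed: ID == index into the array */

typedef struct {
    int       thread_id;      /* placeholder payload (hypothetical) */
    SegmentID prev;           /* placeholder link to an earlier segment */
} Segment;

typedef struct {
    Segment *items;
    size_t   used;
    size_t   capacity;
} SegmentTable;

/* Appends a zeroed segment and returns its ID. Capacity doubles when
   the array is full; aborts on allocation failure for brevity. */
SegmentID segment_new(SegmentTable *t)
{
    if (t->used == t->capacity) {
        size_t new_cap = t->capacity ? t->capacity * 2 : 16;
        Segment *p = realloc(t->items, new_cap * sizeof *p);
        if (p == NULL)
            abort();
        t->items = p;
        t->capacity = new_cap;
    }
    t->items[t->used] = (Segment){0};
    return (SegmentID)t->used++;
}

/* SegmentID -> Segment is one array index. The returned pointer is
   valid only until the next segment_new(), which may move the array. */
Segment *segment_get(SegmentTable *t, SegmentID id)
{
    assert(id < t->used);
    return &t->items[id];
}

void segment_table_free(SegmentTable *t)
{
    free(t->items);
    t->items = NULL;
    t->used = t->capacity = 0;
}
```

Storing IDs rather than pointers in the shadow state is what makes the
reallocation safe: the array may move, but the indices stay valid.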

Annotations:

Life is more complex than just mutex and condvar. :)
We have all the varieties of message queues and custom synchronization
mechanisms implemented via atomics.
Most of them can be explained to helgrind via annotations. Annotations
are essentially valgrind's 'client requests'.
This is what I have used so far:

ANNOTATE_CONDVAR_*
Create a happens-before relation between two thread segments at
arbitrary points in the program.

ANNOTATE_PCQ_*
A variation of ANNOTATE_CONDVAR_* specifically for FIFO message queues.

ANNOTATE_MUTEX_IS_USED_AS_CONDVAR(mutex)
Signal on all Unlocks and Wait on all Locks of this mutex.
On such a mutex helgrind will behave as a pure happens-before
detector. See test61.

ANNOTATE_TRACE_MEMORY
--trace-addr is very good for testing helgrind on unit tests.
However, it is useless on large programs where memory addresses are
non-deterministic (due to the scheduler).
This annotation tells helgrind to trace accesses to a particular
memory location and to report races only on that address.
Multiple addresses can be traced. Useful for debugging a race.

ANNOTATE_EXPECT_RACE
Useful for regression testing of helgrind itself (if an expected race
is not detected, helgrind will complain).

ANNOTATE_BENIGN_RACE
Alternative to a suppression file.
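
To make the Signal/Wait semantics behind the CONDVAR-style annotations
concrete, here is a toy model in C. It uses vector clocks as a
stand-in for helgrind's segment graph, so it is not how helgrind
implements this. All names in it (VClock, hb_signal, hb_wait, ...) are
invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_THREADS 4

typedef struct { unsigned c[MAX_THREADS]; } VClock;

/* Initial clock of thread `tid`: its own component is 1, the rest 0. */
VClock vc_init(int tid)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    VClock v = {{0}};
    v.c[tid] = 1;
    return v;
}

/* a happens-before-or-equals b iff every component of a is <= b's. */
bool vc_leq(const VClock *a, const VClock *b)
{
    for (int i = 0; i < MAX_THREADS; i++)
        if (a->c[i] > b->c[i])
            return false;
    return true;
}

/* Two events are concurrent (a potential race) if neither is ordered
   before the other. */
bool vc_concurrent(const VClock *a, const VClock *b)
{
    return !vc_leq(a, b) && !vc_leq(b, a);
}

/* Signal: merge the signaller's clock into the sync object, then
   advance the signaller's own component (it starts a new segment). */
void hb_signal(VClock *thr, int tid, VClock *obj)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    for (int i = 0; i < MAX_THREADS; i++)
        if (thr->c[i] > obj->c[i])
            obj->c[i] = thr->c[i];
    thr->c[tid]++;
}

/* Wait: merge the sync object's clock into the waiter, then advance
   the waiter's own component. */
void hb_wait(VClock *thr, int tid, const VClock *obj)
{
    assert(tid >= 0 && tid < MAX_THREADS);
    for (int i = 0; i < MAX_THREADS; i++)
        if (obj->c[i] > thr->c[i])
            thr->c[i] = obj->c[i];
    thr->c[tid]++;
}
```

In this model, a write made by thread 0 before hb_signal is ordered
before a read made by thread 1 after the matching hb_wait, so no race
is reported. A write made by thread 0 after the signal is still
concurrent with that read. Treating every Unlock as a signal and every
Lock as a wait on the same object gives the pure happens-before
behaviour described for ANNOTATE_MUTEX_IS_USED_AS_CONDVAR.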



Most of the things mentioned above are implemented at
http://code.google.com/p/data-race-test (not all of them are
'production quality' yet).
Your feedback is more than welcome.


Thanks,

--kcc

