> I've been doing some more experiments with helgrind and I would like
> to share some ideas...
Good.  Some comments, in no particular order:

> Accuracy:
>
> The current default memory state machine (MSMHelgrind) gives too many
> false reports in presence of message queues, condvars, etc.
> So far, I find MSMProp1 to be more accurate,

Improved accuracy can only be a good thing.  Is MSMProp1 accurate
enough that you can find bugs in your real applications, without
getting too many false errors now?

> but noticeably slower
> (mostly because it has to check happens-before in all states).

Maybe some better caching of the happens-before queries would help?
The current implementation (hbefore__cache et al) uses, in effect, a
64-entry fully associative cache.  Maybe it would be better to have a
larger, set-associative cache, e.g. 256 sets of 4 entries each.  For
sure a lower miss rate on the cache is possible.

Also, the miss path -- the function cmpGEQ_VTS -- is naively coded; we
could do a lot better there.  Maybe it is possible to change the
representation of VTSs so that comparison using vector operations (SSE
insns, etc) is possible, or at least so that the complex alignment
logic can be avoided.

> Even though it is possible to further speed up both machines, I don't
> think we can get to 20x slowdown by doing just this.

I agree.  I think we need all the tricks we can get.

> I have implemented the following:
> if ((address % X) != Y) we ignore this memory access.
> [...]
> This hack brings helgrind's speed to something about 10x-15x and makes
> my strict timeouts happy.

Good!  Then you can make a better evaluation of MSMProp1.

> Most other ways to speed up helgrind are complementary to this hack.

Yes.  As I mentioned a couple of weeks back, I think we can use the
fact that the FSMs are mostly idempotent under certain circumstances,
together with Helgrind's single-threadedness, to filter out
potentially many memory references.
> We can't have more than N segments at a time (limited by RAM) so
> ideally we have to recycle old segments (I did not do this yet).

Segment recycling is important; otherwise large programs cannot run
for long without eating all memory.  Progress on this would be good.

> Most of them can be explained to helgrind via annotations.
> Annotations are essentially valgrind's 'client requests'.
> This is what I used so far:
>
> ANNOTATE_CONDVAR_*
> Create a happens-before relation between two thread segments at
> arbitrary points of the program.
>
> ANNOTATE_PCQ_*
> A variation of ANNOTATE_CONDVAR_* specifically for FIFO message
> queues.
>
> ANNOTATE_MUTEX_IS_USED_AS_CONDVAR(mutex)
> Signal on all Unlocks and Wait on all Locks of this mutex.
> On such a mutex helgrind will behave as a pure happens-before
> detector.  See test61.
>
> ANNOTATE_TRACE_MEMORY
> --trace-addr is very good for testing helgrind on unit tests.
> However it is useless on large programs where memory addresses are
> non-deterministic (due to the scheduler).
> This annotation tells helgrind to trace accesses to some particular
> memory location and to report races only on this address.
> Multiple addresses can be traced.  Useful for debugging a race.
>
> ANNOTATE_EXPECT_RACE
> Useful for regression testing of helgrind itself (if an expected race
> is not detected, helgrind will complain).
>
> ANNOTATE_BENIGN_RACE
> An alternative to a suppression file.

These all sound good to me.

J

_______________________________________________
Valgrind-developers mailing list
Valgrind-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-developers