> I've been doing some more experiments with helgrind and I would like
> to share some ideas...
Good.  Some comments, in no particular order:

> Accuracy:
>
> The current default memory state machine (MSMHelgrind) gives too many
> false reports in presence of message queues, condvars, etc.
> So far, I find MSMProp1 to be more accurate,

Improved accuracy can only be a good thing.  Is MSMProp1 accurate
enough that you can find bugs in your real applications, without
getting too many false errors now?

> but noticeably slower
> (mostly because it has to check happens-before in all states).

Maybe some better caching of the happens-before queries would help?
The current implementation (hbefore__cache et al) uses, in effect, a
64-entry fully associative cache.  Maybe it would be better to have a
larger, set-associative cache, e.g. 256 sets of 4 entries each.  For
sure a lower miss rate on the cache is possible.

Also, the miss path -- the function cmpGEQ_VTS -- is naively coded; we
could do a lot better there.  Maybe it is possible to change the
representation of VTSs so that comparison using vector operations (SSE
insns, etc) is possible, or at least so that the complex alignment
logic can be avoided.

> Even though it is possible to further speed up both machines, I don't
> think we can get to 20x slowdown by doing just this.

I agree.  I think we need all the tricks we can get.

> I have implemented the following:
> if ((address % X) != Y) we ignore this memory access.
> [...]
> This hack brings helgrind's speed to something about 10x-15x and makes
> my strict timeouts happy.

Good!  Then you can make a better evaluation of MSMProp1.

> Most other ways to speed up helgrind are complementary to this hack.

Yes.  As I mentioned a couple of weeks back, I think we can use the
fact that the FSMs are mostly idempotent under certain circumstances,
together with Helgrind's single-threadedness, to filter out
potentially many memory references.
> We can't have more than N segments at a time (limited by RAM) so
> ideally we have to recycle old segments (I did not do this yet).

Segment recycling is important; otherwise large programs cannot run
for long without eating all memory.  Progress on this would be good.

> Most of them can be explained to helgrind via annotations.
> Annotations are essentially valgrind's 'client requests'.
> This is what I used so far:
>
> ANNOTATE_CONDVAR_*
> Create a happens-before relation between two thread segments at
> arbitrary points of the program.
>
> ANNOTATE_PCQ_*
> A variation of ANNOTATE_CONDVAR_* specifically for FIFO message
> queues.
>
> ANNOTATE_MUTEX_IS_USED_AS_CONDVAR(mutex)
> Signal on all Unlocks and Wait on all Locks of this mutex.
> On such a mutex helgrind will behave as a pure happens-before
> detector.  See test61.
>
> ANNOTATE_TRACE_MEMORY
> --trace-addr is very good for testing helgrind on unit tests.
> However it is useless on large programs where memory addresses are
> non-deterministic (due to the scheduler).
> This annotation tells helgrind to trace accesses to some particular
> memory location and to report races only on this address.
> Multiple addresses can be traced.  Useful for debugging a race.
>
> ANNOTATE_EXPECT_RACE
> Useful for regression testing of helgrind itself (if an expected race
> is not detected, helgrind will complain).
>
> ANNOTATE_BENIGN_RACE
> An alternative to a suppression file.

These all sound good to me.

J

_______________________________________________
Valgrind-developers mailing list
Valgrind-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-developers