Hi folks,

At SC10 this year an interesting tool was presented in a student paper,
"FlowChecker: Detecting Bugs in MPI Libraries via Message Flow Checking":

http://sc10.supercomputing.org/schedule/event_detail.php?evid=pap352

Basically, they instrument a program, derive "intentions" from its MPI
calls and the MPI standard, and trace the data flow (including things
like memcpy) and the messages. Then, offline, you run a correlator that
compares what was meant to happen with what actually did and tries to
root-cause the fault.

They claim to have taken 5 random closed bugs from 3 different MPI
implementations (including 3 from Open-MPI) and been able to detect all
5 and root-cause 4 of them (the one they missed was a data type issue).

The PDF of their paper is here:

http://www.cse.ohio-state.edu/~chenzhe/sc10-flowchecker.pdf

I've emailed them to see if the code is going to be available, as it
could be quite a handy tool to have when trying to track down issues
like the one Sébastien posted about.

cheers,
Chris

-- 
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/