Folks,

I came up with a feature that does not seem quite appropriate for the
MTT trunk, but may still be useful to someone other than me. I have
posted a note about it on the MTT wiki:

  http://svn.open-mpi.org/trac/mtt/wiki/EmailTimeoutNotification

Here's the text of the Wiki page:

We (Sun) were trying to track down a hang in an MPI test that showed up
in our MTT runs but was difficult to reproduce manually. The problem is
that MTT kills the hanging process before a developer has a chance to
investigate. To address this, I patched an MTT client (see attached
patch file) to send out a notification email containing the mpirun
command line and a GDB backtrace for the hanging test. The patched
client also touches a predefined sentinel file; removing that file
later forces MTT to move on and continue testing. These INI parameters
activate the timeout email notification:

 * {{{docommand_timeout_sentinel_file}}}
 * {{{docommand_timeout_email_recipient}}}

Example usage:

{{{
$ client/mtt --scratch /foo/bar --file foo.ini \
    docommand_timeout_sentinel_file=/tmp/mtt-timeout-sentinel-file-\&random_string\(10\) \
    docommand_timeout_email_recipient=fred.flints...@sun.com,barney.rub...@sun.com
}}}
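
For anyone who wants the sentinel-file handshake spelled out, here is a rough shell sketch of the idea. It is not the code from the patch (the MTT client is Perl); the function name `wait_for_sentinel` and the polling interval are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of the sentinel-file handshake; NOT the patch's code.

# Create the sentinel file, then block until someone removes it.
#   $1: sentinel path
#   $2: poll interval in seconds (assumed default: 5)
wait_for_sentinel() {
    sentinel=$1
    interval=${2:-5}

    # Mark that a timed-out test is being held for inspection.
    touch "$sentinel" || return 1

    # Keep the hung process alive until a developer deletes the file.
    while [ -e "$sentinel" ]; do
        sleep "$interval"
    done
}
```

On the developer side, the workflow would presumably be to attach to the hung process (for example with `gdb -p <pid>`), look around, and then `rm` the sentinel file named in the notification so that testing continues.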

-Ethan
