This still has a race condition, since the flag is tested and set in two separate steps. It can be dealt with using the opal_atomic primitives; see below.
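For anyone without the OPAL headers at hand, the same idea can be sketched with the C11 atomic_flag type, whose test-and-set performs the check and the assignment as one atomic step. This is only a standalone illustration under my own assumptions, not the OPAL API and not the orterun code: terminate_once, cleanup_runs, and cleanup_count are hypothetical names, and the counter merely stands in for the real shutdown work.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; none of these names come from orterun.c. */
static atomic_flag terminating = ATOMIC_FLAG_INIT;
static atomic_int cleanup_runs;   /* how many times the cleanup body ran */

/* Returns 1 if this call performed the cleanup, 0 if another caller
 * had already claimed it. */
int terminate_once(void)
{
    /* test-and-set returns the previous value and sets the flag in one
     * atomic step, so only the first caller ever sees "false" */
    if (atomic_flag_test_and_set(&terminating)) {
        return 0;   /* termination already in progress */
    }

    /* the real shutdown work would go here; we only count the entry */
    int prev = atomic_fetch_add(&cleanup_runs, 1);
    assert(prev == 0);   /* the guard above lets us in exactly once */
    (void)prev;          /* silence unused warning when NDEBUG is set */
    return 1;
}

/* Number of times the cleanup body has executed so far. */
int cleanup_count(void)
{
    return atomic_load(&cleanup_runs);
}
```

Calling terminate_once() a second time, for instance from a timeout handler that fires during shutdown, returns immediately without repeating the cleanup.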
On Thu, Mar 6, 2008 at 2:35 PM, <r...@osl.iu.edu> wrote:
> Author: rhc
> Date: 2008-03-06 14:35:57 EST (Thu, 06 Mar 2008)
> New Revision: 17766
> URL: https://svn.open-mpi.org/trac/ompi/changeset/17766
>
> Log:
> Fix a race condition - ensure we don't call terminate in orterun more than
> once, even if the timeout fires while we are doing so
[snip]
> Modified: trunk/orte/tools/orterun/orterun.c
>
> ==============================================================================
> --- trunk/orte/tools/orterun/orterun.c (original)
> +++ trunk/orte/tools/orterun/orterun.c 2008-03-06 14:35:57 EST (Thu, 06 Mar 2008)
> @@ -112,14 +112,15 @@
>  static bool want_prefix_by_default = (bool) ORTE_WANT_ORTERUN_PREFIX_BY_DEFAULT;
>  static opal_event_t *orterun_event, *orteds_exit_event;
>  static char *ompi_server=NULL;
> +static bool terminating=false;
>
[snip]
> @@ -644,6 +638,12 @@
>      orte_proc_t **procs;
>      orte_vpid_t i;
>
> +    /* flag that we are here to avoid doing it twice */
> +    if (terminating) {
> +        return;
> +    }
> +    terminating = true;
> +
[snip]

I think this race condition should be dealt with like this:

#include "opal/sys/atomic.h"
static opal_atomic_lock_t terminating = OPAL_ATOMIC_UNLOCKED;
...
    if (opal_atomic_trylock(&terminating)) { /* returns 1 if already locked */
        return;
    }

-- 
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
 tmat...@gmail.com || timat...@open-mpi.org
 I'm a bright... http://www.the-brights.net/