Wow, that's ancient - can you update to the 1.6 series?
On Jun 20, 2013, at 3:05 PM, Claire Williams <clairewilliams1...@yahoo.com> wrote:

> Hi Ralph,
>
> I'm using 1.4.3. Thanks
>
> - Claire
>
> From: Ralph Castain <rhc.open...@gmail.com>
> To: Claire Williams <clairewilliams1...@yahoo.com>; Open MPI Users <us...@open-mpi.org>
> Sent: Thursday, June 20, 2013 1:59 PM
> Subject: Re: [OMPI users] Detecting Node Failure
>
> It should detect and abort - what version are you using?
>
> On Jun 20, 2013, at 2:02 PM, Claire Williams <clairewilliams1...@yahoo.com> wrote:
>
>> Hi all,
>>
>> I was wondering if Open-MPI had any way to detect that a node has crashed,
>> rebooted, etc. I am currently trying to integrate my MPI application with
>> Amazon EC2 spot instances, and since spot instances can be terminated at any
>> time, I would like to try to make it so that my application can detect this
>> node failure, maybe remove the node from the machine file, and restart the
>> application automatically. Right now, when one of the worker nodes is
>> rebooted or terminated, the master that is waiting on the results of that
>> node will just hang, waiting for results that will never come.
>>
>> Thanks,
>>
>> Claire
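
For reference, the "remove the node from the machine file and restart" idea from the original question could be sketched as a small wrapper around mpirun. This is only a rough illustration, not something Open MPI provides: the names live_hosts and run_until_success, the is_alive callback, and the retry limit are all hypothetical, and it assumes mpirun really does abort with a non-zero exit code when a node disappears (which is the expected behavior described above, but not what was observed on 1.4.3).

```python
import subprocess


def live_hosts(hostfile_lines, is_alive):
    """Return the hostfile entries whose host passes the liveness check."""
    kept = []
    for line in hostfile_lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comments
        host = entry.split()[0]  # first token is the hostname, e.g. "node1 slots=4"
        if is_alive(host):
            kept.append(entry)
    return kept


def run_until_success(hostfile_path, app_cmd, is_alive,
                      max_restarts=3, run=subprocess.call):
    """Prune dead hosts, launch mpirun, and retry if the job fails.

    is_alive is a caller-supplied check (hypothetical; for example a ping
    or an EC2 status query). run is injectable so the loop can be tested
    without a real MPI installation.
    """
    for _ in range(max_restarts + 1):
        with open(hostfile_path) as f:
            hosts = live_hosts(f.readlines(), is_alive)
        if not hosts:
            return False  # nothing left to run on
        # Rewrite the hostfile so the next launch only sees live nodes.
        with open(hostfile_path, "w") as f:
            f.write("\n".join(hosts) + "\n")
        rc = run(["mpirun", "--hostfile", hostfile_path] + list(app_cmd))
        if rc == 0:
            return True  # job finished normally
    return False
```

A call might look like run_until_success("hosts.txt", ["./my_app"], is_alive=my_check), where my_check is whatever reachability test fits the cluster. Note that this restarts the whole job from scratch; any checkpointing of partial results would have to live in the application itself.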