> On 2012-04-20 17:58:45, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 159
> > <https://reviews.apache.org/r/4768/diff/2/?file=103455#file103455line159>
> >
> > Account for MPICH2PATH here.

Done.

> On 2012-04-20 17:58:45, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 194
> > <https://reviews.apache.org/r/4768/diff/2/?file=103455#file103455line194>
> >
> > Try to keep us below 80 chars (or at least below 100); split into
> > multiple lines.

Done.

> On 2012-04-20 17:58:45, Charles Reiss wrote:
> > frameworks/mpi/startmpd.py, line 85
> > <https://reviews.apache.org/r/4768/diff/2/?file=103456#file103456line85>
> >
> > Assuming mpd tries to do something graceful on SIGTERM, try SIGTERM,
> > wait a bit, then try SIGKILL (and below).

Gave it a 5 second interval.

> On 2012-04-20 17:58:45, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 32
> > <https://reviews.apache.org/r/4768/diff/2/?file=103455#file103455line32>
> >
> > MPI_TASK should be an array (and below).

OK. The executable should be the first argument in the array, renamed MPI_PROGRAM.

> On 2012-04-20 17:58:45, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 210
> > <https://reviews.apache.org/r/4768/diff/2/?file=103455#file103455line210>
> >
> > Account for MPICH2PATH here.

Done. Made it a global variable in both files too.

> On 2012-04-20 17:58:45, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 215
> > <https://reviews.apache.org/r/4768/diff/2/?file=103455#file103455line215>
> >
> > close() this; does this (and existing code calling mpdtrace) leave
> > mpdtrace as a zombie process?

The mpdtrace processes were being left as zombies. It now uses communicate() to get the stdout.


- Harvey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4768/#review7086
-----------------------------------------------------------


On 2012-04-21 05:08:47, Harvey Feng wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail.
> To reply, visit:
> https://reviews.apache.org/r/4768/
> -----------------------------------------------------------
>
> (Updated 2012-04-21 05:08:47)
>
>
> Review request for mesos, Benjamin Hindman, Charles Reiss, and Jessica.
>
>
> Summary
> -------
>
> Some updates to point out:
>
> -nmpiexec.py
>   -> 'mpdallexit' should terminate all slaves' mpds in the ring. I moved
> 'driver.stop()' to statusUpdate() so that it stops when all tasks have been
> finished, which occurs when the executor's launched mpd processes have all
> exited.
> -startmpd.py
>   -> Didn't remove cleanup(), and added code in shutdown() that manually
> kills mpd processes. They might be useful during abnormal (cleanup) and
> normal (shutdown) framework/executor termination...I think. cleanup() still
> terminates all mpd's in the slave, but shutdown doesn't.
>   -> killtask() stops the mpd associated with the given tid.
>   -> Task states update nicely now. They correspond to the state of a task's
> associated mpd process.
> -Readme
>   -> Included additional info on how to set up and run MPICH2 1.2 and nmpiexec
> on OS X and Ubuntu/Linux
>
>
> This addresses bug MESOS-183.
> https://issues.apache.org/jira/browse/MESOS-183
>
>
> Diffs
> -----
>
>   frameworks/mpi/README.txt cdb4553
>   frameworks/mpi/nmpiexec.py a5db9c0
>   frameworks/mpi/startmpd.py 8eeba5e
>
> Diff: https://reviews.apache.org/r/4768/diff
>
>
> Testing
> -------
>
>
> Thanks,
>
> Harvey
>
>
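
For reference, the two process-handling changes discussed in this thread (SIGTERM followed by SIGKILL after a 5 second interval, and reading output with communicate() so the child is reaped) can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: the helper names stop_process and run_and_capture are hypothetical, the child commands are stand-ins for mpd and mpdtrace, and it is written against the Python 3 standard library, which may differ from the Python version the framework targets.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Ask proc to exit with SIGTERM; escalate to SIGKILL after grace_seconds.

    Hypothetical helper: the name and the default value are illustrative.
    Returns the exit code of the process.
    """
    if proc.poll() is not None:
        return proc.returncode  # already exited and reaped
    proc.terminate()  # sends SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        return proc.wait()  # reap the child so no zombie is left behind


def run_and_capture(argv):
    """Run argv (a list, executable first) and return (stdout_text, exit_code).

    communicate() reads stdout to EOF and then waits for the child, so the
    child is reaped instead of lingering as a zombie.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out.decode(), proc.returncode


if __name__ == "__main__":
    # Stand-in commands; a real caller would pass an mpdtrace or mpd command line.
    text, code = run_and_capture([sys.executable, "-c", "print('node1')"])
    print(text.strip(), code)

    sleeper = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print(stop_process(sleeper))
```

Passing the command as a list with the executable first mirrors the request that MPI_TASK be an array, and it avoids going through a shell.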
