> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/README.txt, line 11
> > <https://reviews.apache.org/r/4768/diff/1/?file=102473#file102473line11>
> >
> >     mpd was deprecated? What's the current alternative?

I think the new versions use the Hydra process manager, so 'mpiexec' would be 
the only command needed to launch an MPI program.  


> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 22
> > <https://reviews.apache.org/r/4768/diff/1/?file=102474#file102474line22>
> >
> >     Remove or comment this debugging.

done.


> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/startmpd.py, line 83
> > <https://reviews.apache.org/r/4768/diff/1/?file=102475#file102475line83>
> >
> >     Use os.kill instead (and above).

done.


> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/startmpd.py, line 56
> > <https://reviews.apache.org/r/4768/diff/1/?file=102475#file102475line56>
> >
> >     Can we use MPD's exit status to determine when to send TASK_FAILED or 
> > TASK_KILLED?

ok, fixed that.


> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/startmpd.py, line 15
> > <https://reviews.apache.org/r/4768/diff/1/?file=102475#file102475line15>
> >
> >     I think we can get rid of this entirely; it's clearly wrong in the case 
> > where multiple MPIs are running, and we should be tracking stray processes 
> > so we eventually kill them if MPD doesn't do something funny. (And if it 
> > does, we should figure out how to disable that.)

ok - shutdown() should remove any stray processes left over.


> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 210
> > <https://reviews.apache.org/r/4768/diff/1/?file=102474#file102474line210>
> >
> >     Let's try a name that doesn't contain test or Python and will give a 
> > hint when multiple instances are running, like something using MPI_TASK.

changed to 'MPI: ' + MPI_TASK, and added a --name option


> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 95
> > <https://reviews.apache.org/r/4768/diff/1/?file=102474#file102474line95>
> >
> >     Remove trailing whitespace.

done


> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/nmpiexec.py, line 31
> > <https://reviews.apache.org/r/4768/diff/1/?file=102474#file102474line31>
> >
> >     Can we avoid using the shell here (and having MPI_TASK be interpreted 
> > by the shell twice)?

ok


> On 2012-04-18 05:41:37, Charles Reiss wrote:
> > frameworks/mpi/README.txt, line 37
> > <https://reviews.apache.org/r/4768/diff/1/?file=102473#file102473line37>
> >
> >     We should probably support taking the path to these binaries an option 
> > passed automatically to the executor (e.g. through an environment variable 
> > option) to avoid PATH issues.

ok. Passes the directory to mpi binaries using the executor's CommandInfo


- Harvey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4768/#review6999
-----------------------------------------------------------


On 2012-04-20 08:17:57, Harvey Feng wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4768/
> -----------------------------------------------------------
> 
> (Updated 2012-04-20 08:17:57)
> 
> 
> Review request for mesos, Benjamin Hindman and Charles Reiss.
> 
> 
> Summary
> -------
> 
> Some updates to point out:
> 
> -nmpiexec.py
>   -> 'mpdallexit' should terminate all slaves' mpds in the ring. I moved 
> 'driver.stop()' to statusUpdate() so that it stops when all tasks have been 
> finished, which occurs when the executor's launched mpd processes have all 
> exited. 
> -startmpd.py
>   -> Didn't remove cleanup(), and added code in shutdown() that manually 
> kills mpd processes. They might be useful during abnormal (cleanup) and 
> normal (shutdown) framework/executor termination...I think. cleanup() still 
> terminates all mpd's in the slave, but shutdown doesn't. 
>   -> killtask() stops the mpd associated with the given tid. 
>   -> Task states update nicely now. They correspond to the state of a task's 
> associated mpd process.
> -Readme
>   -> Included additional info on how to setup and run MPICH2 1.2 and nmpiexec 
> on OS X and Ubuntu/Linux
> 
> 
> This addresses bug MESOS-183.
>     https://issues.apache.org/jira/browse/MESOS-183
> 
> 
> Diffs
> -----
> 
>   frameworks/mpi/README.txt cdb4553 
>   frameworks/mpi/nmpiexec.py a5db9c0 
>   frameworks/mpi/startmpd.py 8eeba5e 
> 
> Diff: https://reviews.apache.org/r/4768/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Harvey
> 
>

Reply via email to