-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4768/#review6999
-----------------------------------------------------------



frameworks/mpi/README.txt
<https://reviews.apache.org/r/4768/#comment15565>

    mpd was deprecated? What's the current alternative?



frameworks/mpi/README.txt
<https://reviews.apache.org/r/4768/#comment15566>

    We should probably support taking the path to these binaries as an option 
passed automatically to the executor (e.g. through an environment variable 
option) to avoid PATH issues.
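
    A minimal sketch of the idea, not a definitive implementation. The 
variable name MPI_BIN_DIR and the helper resolve_binary are hypothetical; 
nothing with those names exists in Mesos or in this patch. When the variable 
is unset, the bare name is returned so the normal PATH lookup still applies.

```python
import os


def resolve_binary(name, env=None, var="MPI_BIN_DIR"):
    """Return the path to use for an MPI helper binary such as mpd.

    MPI_BIN_DIR is a hypothetical variable name chosen for this sketch.
    If it is set, the binary is taken from that directory; otherwise the
    bare name is returned and the usual PATH lookup applies.
    """
    env = os.environ if env is None else env
    bin_dir = env.get(var)
    if bin_dir:
        return os.path.join(bin_dir, name)
    return name
```

    The framework would set the variable in the executor's environment, and 
the executor would call resolve_binary("mpd") instead of hard-coding "mpd".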



frameworks/mpi/nmpiexec.py
<https://reviews.apache.org/r/4768/#comment15555>

    Remove or comment out this debugging code.



frameworks/mpi/nmpiexec.py
<https://reviews.apache.org/r/4768/#comment15563>

    Can we avoid using the shell here (and having MPI_TASK be interpreted by 
the shell twice)?
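
    One way this could look, as a sketch only. It assumes MPI_TASK holds the 
user's command line as a single string; build_argv and run_task are 
hypothetical helper names. The string is split once with shlex, and the 
resulting list is passed to subprocess without shell=True, so no shell 
interprets it again.

```python
import shlex
import subprocess
import sys


def build_argv(mpi_task):
    """Split the task string into an argument list exactly once."""
    return shlex.split(mpi_task)


def run_task(mpi_task):
    """Run the task directly and return its exit status.

    Passing a list without shell=True means no shell is involved, so
    characters such as ';' or '$' in the arguments are passed literally.
    """
    return subprocess.call(build_argv(mpi_task))
```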



frameworks/mpi/nmpiexec.py
<https://reviews.apache.org/r/4768/#comment15561>

    Remove trailing whitespace.



frameworks/mpi/nmpiexec.py
<https://reviews.apache.org/r/4768/#comment15557>

    Let's try a name that doesn't contain test or Python and will give a hint 
when multiple instances are running, like something using MPI_TASK.



frameworks/mpi/startmpd.py
<https://reviews.apache.org/r/4768/#comment15562>

    I think we can get rid of this entirely; it's clearly wrong in the case 
where multiple MPIs are running, and we should be tracking stray processes so 
we eventually kill them if MPD doesn't do something funny. (And if it does, we 
should figure out how to disable that.)



frameworks/mpi/startmpd.py
<https://reviews.apache.org/r/4768/#comment15559>

    Can we use MPD's exit status to determine when to send TASK_FAILED or 
TASK_KILLED?
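
    Roughly what I have in mind, assuming mpd is started with 
subprocess.Popen. On POSIX, Popen reports a child that died from a signal as 
a negative return code. The mapping below is my assumption about what each 
outcome should mean, and the strings are placeholders for the corresponding 
Mesos task state values.

```python
def state_for_exit(returncode):
    """Map a subprocess return code to a task state name.

    Assumed mapping for this sketch:
      0        -> the process exited cleanly
      negative -> the process was terminated by a signal (POSIX)
      positive -> the process exited with an error status
    """
    if returncode == 0:
        return "TASK_FINISHED"
    if returncode < 0:
        return "TASK_KILLED"
    return "TASK_FAILED"
```

    The executor would call state_for_exit(proc.wait()) once mpd exits and 
send the matching status update.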



frameworks/mpi/startmpd.py
<https://reviews.apache.org/r/4768/#comment15558>

    Use os.kill instead (and above).
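
    For example, something along these lines, assuming the mpd process was 
started with subprocess.Popen so its pid is known. stop_process is a 
hypothetical helper name. This sends the signal directly rather than 
spawning an external command.

```python
import os
import signal
import subprocess
import sys


def stop_process(proc, sig=signal.SIGTERM):
    """Signal a child started with subprocess.Popen and wait for it to exit.

    Returns the child's return code; on POSIX this is the negated signal
    number when the child was terminated by that signal.
    """
    if proc.poll() is None:  # only signal a process that is still running
        os.kill(proc.pid, sig)
    return proc.wait()
```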


- Charles


On 2012-04-18 04:27:25, Harvey Feng wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4768/
> -----------------------------------------------------------
> 
> (Updated 2012-04-18 04:27:25)
> 
> 
> Review request for mesos, Benjamin Hindman and Charles Reiss.
> 
> 
> Summary
> -------
> 
> Some updates to point out:
> 
> -nmpiexec.py
>   -> 'mpdallexit' should terminate all slaves' mpds in the ring. I moved 
> 'driver.stop()' to statusUpdate() so that it stops when all tasks have been 
> finished, which occurs when the executor's launched mpd processes have all 
> exited. 
> -startmpd.py
>   -> Didn't remove cleanup(), and added code in shutdown() that manually 
> kills mpd processes. They might be useful during abnormal (cleanup) and 
> normal (shutdown) framework/executor termination...I think. cleanup() still 
> terminates all mpd's in the slave, but shutdown doesn't. 
>   -> killtask() stops the mpd associated with the given tid. 
>   -> Task states update nicely now. They correspond to the state of a task's 
> associated mpd process.
> -Readme
>   -> Included additional info on how to set up and run MPICH2 1.2 and nmpiexec 
> on OS X and Ubuntu/Linux
> 
> 
> This addresses bug MESOS-183.
>     https://issues.apache.org/jira/browse/MESOS-183
> 
> 
> Diffs
> -----
> 
>   frameworks/mpi/README.txt cdb4553 
>   frameworks/mpi/nmpiexec.py a5db9c0 
>   frameworks/mpi/startmpd.py 8eeba5e 
> 
> Diff: https://reviews.apache.org/r/4768/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Harvey
> 
>
