> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/nmpiexec.py, line 270
> > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line270>
> >
> >     Did someone actually ask for this feature?

No, but I thought it wouldn't hurt...


> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/nmpiexec.py, line 253
> > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line253>
> >
> >     I'd like to make this whole thing simpler. In particular, I don't see 
> > any need for an executor here (i.e., the startmpd.py script). We actually 
> > have a 1-1 mapping from tasks to mpd's, so let's just have our TaskInfo's 
> > have a CommandInfo which just launches the mpd. Something like this for 
> > that CommandInfo's value:
> >     
> >     ...command.value = "${MPICH2PATH}mpd -n --host=${MPD_MASTER_IP} 
> > --port=${MPD_MASTER_PORT}"
> >     
> >     (Note your code, as does what I have above, assumes that MPICH2PATH 
> > includes the trailing '/'. Also, you might need to change the string based 
> > on if you're passing --ifhn. Finally, note I used the long options --host 
> > and --port. This is for readability for other people that might not know 
> > mpd very well. Likewise, if there is a long option for -n it would be great 
> > to use that instead.)
> >     
> >     Of course, this will require setting the command's environment 
> > variables appropriately. But, this should let us completely eliminate the 
> > startmpd.py script!!!!! Yeah! Less things to maintain!

Ok. Switched to no-executor. I directly specified the variables during 
string/command construction, so there isn't a need to set environment 
variables. Got rid of startmpd.py/startmpd.h =)


> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/nmpiexec.py, line 199
> > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line199>
> >
> >     s/slot/mpd
> >     
> >     Same as above, is there something we can/should set when we launch the 
> > mpd?

There doesn't seem to be anything to set for memory when launching the mpd...


> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/nmpiexec.py, line 28
> > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line28>
> >
> >     I know this wasn't the style of the codebase you've inherited, but I'd 
> > like spaces around all operators please. For example, this line should read:
> >     
> >     print "Got " + str(TOTAL_TASKS) + " mpd slots, running mpiexec"
> >     
> >     It looks like you've already done this with some of the code you've 
> > added (which is awesome!), but please clean up all the code. Thanks!

Done. I converted most string operations to use %, roughly following the Google 
style guide for Python.


> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/README.txt, line 76
> > <https://reviews.apache.org/r/4768/diff/3/?file=103692#file103692line76>
> >
> >     Does nmpiexec work out of the box without changes (not included in this 
> > review).
> >     
> >     Also, let's change the names from "nmpiexec*" to "mpiexec-mesos*"! ;)
> >     
> >     Finally, just pass host:port, no need for the 'master@' prefix.

Yup, nmpiexec works. 


> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/nmpiexec.py, line 196
> > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line196>
> >
> >     s/slot/mpd
> >     
> >     Also, do we want to set the '--ncpus' option on the actual mpd that we 
> > launch on a Mesos slave?

Added --npus to slave mpd calls.


> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/nmpiexec.py, lines 91-92
> > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line91>
> >
> >     Use driver.declineOffer please.

Done for this and others below. Issue#188 (small bug fix for Python/C++ 
binding) would have to be committed for driver.declineOffer to work though.


- Harvey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4768/#review7179
-----------------------------------------------------------


On 2012-05-02 13:00:12, Harvey Feng wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4768/
> -----------------------------------------------------------
> 
> (Updated 2012-05-02 13:00:12)
> 
> 
> Review request for mesos, Benjamin Hindman, Charles Reiss, and Jessica.
> 
> 
> Summary
> -------
> 
> Some updates to point out:
> 
> -nmpiexec.py
>   -> 'mpdallexit' should terminate all slaves' mpds in the ring. I moved 
> 'driver.stop()' to statusUpdate() so that it stops when all tasks have been 
> finished, which occurs when the executor's launched mpd processes have all 
> exited. 
> -startmpd.py
>   -> Didn't remove cleanup(), and added code in shutdown() that manually 
> kills mpd processes. They might be useful during abnormal (cleanup) and 
> normal (shutdown) framework/executor termination...I think. cleanup() still 
> terminates all mpd's in the slave, but shutdown doesn't. 
>   -> killtask() stops the mpd associated with the given tid. 
>   -> Task states update nicely now. They correspond to the state of a task's 
> associated mpd process.
> -Readme
>   -> Included additional info on how to setup and run MPICH2 1.2 and nmpiexec 
> on OS X and Ubuntu/Linux
> 
> 
> This addresses bug MESOS-183.
>     https://issues.apache.org/jira/browse/MESOS-183
> 
> 
> Diffs
> -----
> 
>   frameworks/mpi/nmpiexec.py a5db9c0 
>   frameworks/mpi/README.txt cdb4553 
>   frameworks/mpi/startmpd.py 8eeba5e 
>   frameworks/mpi/startmpd.sh 44faa05 
> 
> Diff: https://reviews.apache.org/r/4768/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Harvey
> 
>

Reply via email to