> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/nmpiexec.py, line 209
> > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line209>
> >
> >     I'm not really sure how this can be used: the user running this script 
> > will not know what machines they might run on, so they can't possibly know 
> > which IP addresses they want to use on those machines. Maybe Jessica J. had 
> > something else in mind here?
> >     
> >     It definitely makes sense to keep --ifhn for the master.

Hmmm... Looks like my comment here disappeared somehow. Anyway, I agree that 
the --ifhn-slave option doesn't make sense, since there's no way to specify an 
IP address for each slave. I guess what I had in mind was a more general Mesos 
configuration option rather than one specific to the MPI framework. 

From a selfish standpoint, I'm not terribly concerned, since the master was the 
option I cared about. However, I've been thinking that, assuming you're using 
the deploy scripts to start your cluster, it may be worth considering modifying 
the format of the slaves configuration file (which currently lists only 
hostnames) to let the user also specify an IP address for each host. The MPI 
framework could then grab the IP address from the Mesos configuration. This 
would be useful for deploying Mesos as well, since some users (such as myself) 
keep their Mesos config files in an NFS directory. (With that setup I can't 
start the entire cluster in one go if I need to give any of my nodes a specific 
IP address, since all nodes would try to use the same ip option in mesos.conf.) 
Just a thought... I'll open a general Mesos "Improvement" ticket if there's any 
chance of it happening.
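For illustration, parsing such an extended slaves file might look something 
like the sketch below. The two-column "hostname ip" layout (with the IP 
optional) is purely a proposal on my part, not the current Mesos format:

```python
# Hypothetical parser for an extended slaves file where each line is
# either "hostname" or "hostname ip". This two-column format is a
# proposal, not what the Mesos deploy scripts read today.
def parse_slaves_file(path):
    slaves = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue  # skip blank lines and comments
            parts = line.split()
            hostname = parts[0]
            ip = parts[1] if len(parts) > 1 else None  # IP is optional
            slaves.append((hostname, ip))
    return slaves
```

The framework could then pass the per-host IP (when present) along instead of 
relying on a single ip option in mesos.conf.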


> On 2012-04-24 21:45:19, Benjamin Hindman wrote:
> > frameworks/mpi/nmpiexec.py, line 223
> > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line223>
> >
> >     It looks like you assume that path ends in a '/'. You should probably 
> > check this here.

Why not use os.path.join?
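That is, something along these lines, which sidesteps the trailing-'/' 
question entirely (the "mpd" component and install prefix here are just 
examples, not the actual code at line 223):

```python
import os.path

# os.path.join inserts the separator only when needed, so the caller
# doesn't have to guarantee that the prefix ends in '/'.
path = "/opt/mpich2"  # hypothetical MPICH2 install prefix
mpd_cmd = os.path.join(path, "mpd")
# os.path.join("/opt/mpich2/", "mpd") yields the same result
```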


- Jessica


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4768/#review7179
-----------------------------------------------------------


On 2012-05-02 13:29:50, Harvey Feng wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/4768/
> -----------------------------------------------------------
> 
> (Updated 2012-05-02 13:29:50)
> 
> 
> Review request for mesos, Benjamin Hindman, Charles Reiss, and Jessica.
> 
> 
> Summary
> -------
> 
> Some updates to point out:
> 
> -nmpiexec.py
>   -> 'mpdallexit' should terminate all of the slaves' mpds in the ring. I 
> moved 'driver.stop()' into statusUpdate() so that the driver stops once all 
> tasks have finished, which happens when the executor's launched mpd processes 
> have all exited. 
> -startmpd.py
>   -> Kept cleanup(), and added code in shutdown() that manually kills mpd 
> processes. They might be useful during abnormal (cleanup) and normal 
> (shutdown) framework/executor termination... I think. cleanup() still 
> terminates all mpds on the slave, but shutdown() doesn't. 
>   -> killtask() stops the mpd associated with the given tid. 
>   -> Task states update nicely now. They correspond to the state of a task's 
> associated mpd process.
> -Readme
>   -> Included additional info on how to set up and run MPICH2 1.2 and 
> nmpiexec on OS X and Ubuntu/Linux.
> 
> 
> This addresses bug MESOS-183.
>     https://issues.apache.org/jira/browse/MESOS-183
> 
> 
> Diffs
> -----
> 
>   frameworks/mpi/README.txt cdb4553 
>   frameworks/mpi/nmpiexec 517bdbc 
>   frameworks/mpi/nmpiexec.py a5db9c0 
>   frameworks/mpi/startmpd.py 8eeba5e 
>   frameworks/mpi/startmpd.sh 44faa05 
> 
> Diff: https://reviews.apache.org/r/4768/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Harvey
> 
>
