[ 
https://issues.apache.org/jira/browse/MESOS-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266780#comment-13266780
 ] 

[email protected] commented on MESOS-183:
-----------------------------------------------------



bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 278
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line278>
bq.  >
bq.  >     Kill extra space after 'args[0]'.

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 258
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line258>
bq.  >
bq.  >     Why the indentation? And add a period at the end of the sentence please.

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 257
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line257>
bq.  >
bq.  >     s/executor/executor.

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 225
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line225>
bq.  >
bq.  >     s/mesos/Mesos

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 219
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line219>
bq.  >
bq.  >     What about s/TOTAL_TASKS/TOTAL_MPDS

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 193
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line193>
bq.  >
bq.  >     s-slots/mpd:s-mpd's

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 177
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line177>
bq.  >
bq.  >     No need for the intermediate 'count'.

Done - removed 'count'


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 176
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line176>
bq.  >
bq.  >     Since mpderr is unused, how about instead:
bq.  >     
bq.  >     mpdtraceout, _ = mpdtraceproc.communicate()
bq.  >     
bq.  >     or
bq.  >     
bq.  >     mpdtraceout = mpdtraceproc.communicate()[0]

Went with mpdtraceout = mpdtraceproc.communicate()[0].
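
For reference, a minimal sketch of the form chosen above. The child command is a stand-in (the Python interpreter printing two made-up host names), not the real mpdtrace invocation, and `cmd` and `hosts` are hypothetical names; only `mpdtraceproc` and `mpdtraceout` come from the review.

```python
import subprocess
import sys

# Stand-in for the real mpdtrace command (assumption): a child process
# that prints one made-up host name per line.
cmd = [sys.executable, "-c", "print('host1'); print('host2')"]

mpdtraceproc = subprocess.Popen(cmd, stdout=subprocess.PIPE)

# communicate() waits for the child and returns a (stdout, stderr) tuple.
# stderr was not redirected to a pipe, so it is None and only index 0 is kept.
mpdtraceout = mpdtraceproc.communicate()[0]

# The output is bytes on Python 3; decode before splitting into host names.
hosts = mpdtraceout.decode().split()
print(hosts)
```

Indexing with [0] avoids binding a throwaway name for the unused stderr value; the `mpdtraceout, _ = ...` form is equivalent.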


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 146
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line146>
bq.  >
bq.  >     Please use driver.declineOffer.

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 145
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line145>
bq.  >
bq.  >     s/Rejecting slot/Declining offer

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 142
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line142>
bq.  >
bq.  >     How about:
bq.  >     
bq.  >     print "Launching mpd " + tid + " on host " + offer.hostname

Changed to: print "Replying to offer: launching mpd %d on host %s" % (tid, offer.hostname)
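
A small sketch of why the %-formatted form is convenient here, assuming tid is an integer (which the %d suggests). The `Offer` class is a hypothetical stand-in that models only the `hostname` field, and the host name is made up.

```python
class Offer(object):
    """Hypothetical stand-in for a Mesos offer; only `hostname` is modeled."""

    def __init__(self, hostname):
        self.hostname = hostname


offer = Offer("node1.example.com")
tid = 3  # assumed to be an int, as the %d in the reply implies

# Concatenating an int to a str raises TypeError, so the '+' form
# would need an explicit str(tid).
try:
    message = "Launching mpd " + tid + " on host " + offer.hostname
except TypeError:
    message = None

# %-formatting converts the int itself.
formatted = "Replying to offer: launching mpd %d on host %s" % (tid, offer.hostname)
print(formatted)
```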


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 114
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line114>
bq.  >
bq.  >     s/slot/offer

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 109
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line109>
bq.  >
bq.  >     s/Rejecting slot/Declining offer
bq.  >     
bq.  >     Also, why not do driver.declineOffer right here?

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 102
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line102>
bq.  >
bq.  >     s/r/resource

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 100
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line100>
bq.  >
bq.  >     Kill this line (or alternatively add the offer.id.value up on line 87).

Done, merged with line 87.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 96
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line96>
bq.  >
bq.  >     s/slot/resources

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 89
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line89>
bq.  >
bq.  >     s/Rejecting/Declining

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 69
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line69>
bq.  >
bq.  >     No longer used, kill please.

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 66
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line66>
bq.  >
bq.  >     No longer used, kill please.

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 59
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line59>
bq.  >
bq.  >     How about s/tasksLaunched/mpdsLaunched

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 55
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line55>
bq.  >
bq.  >     It would be great to give this a real name, e.g., MPIScheduler.

Done.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 22
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line22>
bq.  >
bq.  >     No need to take driver as an argument anymore.

Deleted parameter.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 18
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line18>
bq.  >
bq.  >     Optional path.

Changed.


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/nmpiexec.py, line 17
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103693#file103693line17>
bq.  >
bq.  >     This default isn't the same as what gets printed out from --help. Probably makes sense to kill these here and just put the value down in the add_option call (like you do for --num and TOTAL_TASKS).

Done - default value set at add_option.
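
A minimal sketch of the pattern, using optparse's `%default` placeholder so the --help text cannot drift from the real default. The `--mem` option, the default values, and the help strings are placeholders, not the actual nmpiexec options.

```python
from optparse import OptionParser

parser = OptionParser(usage="Usage: %prog [options] mpi_program")

# Defaults live in the add_option call only; "%default" in the help
# string is expanded by optparse to the same value.
parser.add_option("-n", "--num", type="int", dest="num", default=1,
                  help="number of mpd's to allocate (default %default)")
parser.add_option("-m", "--mem", type="int", dest="mem", default=512,
                  help="MB of memory per mpd (default %default)")

# Parse an empty argument list so only the defaults apply.
options, args = parser.parse_args([])
help_text = parser.format_help()
print(help_text)
```

With no module-level constants duplicating the defaults, there is a single place to change each value.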


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/README.txt, line 62
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103692#file103692line62>
bq.  >
bq.  >     I'd prefer if we just had everyone do 'make', since that should build the Python dependencies (including protobuf).

Ok. I probably didn't configure properly when installing mine...


bq.  On 2012-04-24 21:45:19, Benjamin Hindman wrote:
bq.  > frameworks/mpi/README.txt, line 23
bq.  > <https://reviews.apache.org/r/4768/diff/3/?file=103692#file103692line23>
bq.  >
bq.  >     Please kill all whitespace in this review.

Done.


- Harvey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4768/#review7179
-----------------------------------------------------------


On 2012-05-02 13:29:50, Harvey Feng wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4768/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2012-05-02 13:29:50)
bq.  
bq.  
bq.  Review request for mesos, Benjamin Hindman, Charles Reiss, and Jessica.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Some updates to point out:
bq.  
bq.  -nmpiexec.py
bq.    -> 'mpdallexit' should terminate all slaves' mpds in the ring. I moved 'driver.stop()' to statusUpdate() so that it stops when all tasks have finished, which occurs when the executor's launched mpd processes have all exited.
bq.  -startmpd.py
bq.    -> Didn't remove cleanup(), and added code in shutdown() that manually kills mpd processes. They might be useful during abnormal (cleanup) and normal (shutdown) framework/executor termination, I think. cleanup() still terminates all mpd's in the slave, but shutdown() doesn't.
bq.    -> killtask() stops the mpd associated with the given tid. 
bq.    -> Task states update nicely now. They correspond to the state of a task's associated mpd process.
bq.  -Readme
bq.    -> Included additional info on how to set up and run MPICH2 1.2 and nmpiexec on OS X and Ubuntu/Linux
bq.  
bq.  
bq.  This addresses bug MESOS-183.
bq.      https://issues.apache.org/jira/browse/MESOS-183
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    frameworks/mpi/README.txt cdb4553 
bq.    frameworks/mpi/nmpiexec 517bdbc 
bq.    frameworks/mpi/nmpiexec.py a5db9c0 
bq.    frameworks/mpi/startmpd.py 8eeba5e 
bq.    frameworks/mpi/startmpd.sh 44faa05 
bq.  
bq.  Diff: https://reviews.apache.org/r/4768/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Harvey
bq.  
bq.
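
The driver.stop() change described in the summary could look roughly like the sketch below. Everything here is a stand-in so the snippet runs without Mesos: the state constants, `FakeDriver`, `FakeUpdate`, and the `mpdsFinished` counter are hypothetical, and the real scheduler would subclass the Mesos scheduler class and compare against the task states from mesos_pb2.

```python
# Hypothetical task-state constants; the real ones come from mesos_pb2.
TASK_RUNNING = "TASK_RUNNING"
TASK_FINISHED = "TASK_FINISHED"


class FakeDriver(object):
    """Stand-in for the scheduler driver; records whether stop() was called."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


class FakeUpdate(object):
    """Stand-in for a task status update carrying only a state."""

    def __init__(self, state):
        self.state = state


class MPIScheduler(object):
    def __init__(self, total_mpds):
        self.total_mpds = total_mpds
        self.mpdsFinished = 0  # hypothetical counter name

    def statusUpdate(self, driver, update):
        # Count finished tasks and stop the driver once every mpd has exited.
        if update.state == TASK_FINISHED:
            self.mpdsFinished += 1
            if self.mpdsFinished == self.total_mpds:
                driver.stop()


driver = FakeDriver()
scheduler = MPIScheduler(total_mpds=2)
scheduler.statusUpdate(driver, FakeUpdate(TASK_RUNNING))
scheduler.statusUpdate(driver, FakeUpdate(TASK_FINISHED))
stopped_after_first = driver.stopped
scheduler.statusUpdate(driver, FakeUpdate(TASK_FINISHED))
stopped_after_last = driver.stopped
print(stopped_after_first, stopped_after_last)
```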


                
> Included MPI Framework Fails to Start
> -------------------------------------
>
>                 Key: MESOS-183
>                 URL: https://issues.apache.org/jira/browse/MESOS-183
>             Project: Mesos
>          Issue Type: Bug
>          Components: documentation, framework
>         Environment: Scientific Linux Cluster
>            Reporter: Jessica J
>            Assignee: Harvey Feng 
>            Priority: Blocker
>              Labels: documentation, mpi, setup
>
> There are really two facets to this issue. The first is that no good 
> documentation exists for setting up and using the included MPI framework. The 
> second, and more important issue, is that the framework will not run. The 
> second issue is possibly related to the first in that I may not be setting it 
> up properly. 
> To test the MPI framework, by trial and error I determined I needed to run 
> python setup.py build and python setup.py install in the 
> MESOS-HOME/src/python directory. Now when I try to run nmpiexec -h, I get an 
> AttributeError, below: 
> Traceback (most recent call last):
>   File "./nmpiexec.py", line 2, in <module>
>     import mesos
>   File "/usr/lib64/python2.6/site-packages/mesos-0.9.0-py2.6-linux-x86_64.egg/mesos.py", line 22, in <module>
>     import _mesos
>   File "/usr/lib64/python2.6/site-packages/mesos-0.9.0-py2.6-linux-x86_64.egg/mesos_pb2.py", line 1286, in <module>
>     DESCRIPTOR.message_types_by_name['FrameworkID'] = _FRAMEWORKID
> AttributeError: 'FileDescriptor' object has no attribute 'message_types_by_name'
> I've examined setup.py and determined that the version of protobuf it 
> includes (2.4.1) does, indeed, contain a FileDescriptor class in 
> descriptor.py that sets self.message_types_by_name, so I'm not sure what the 
> issue is. Is this a bug? Or is there a step I'm missing? Do I need to also 
> build/install protobuf?
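
One way to check which copy of protobuf Python actually imports is sketched below. It is a generic diagnostic, not a confirmed fix: an older protobuf elsewhere on the path is only one possible cause and is not established in this thread. `describe_module` is a hypothetical helper, and `importlib` needs Python 2.7 or later.

```python
import importlib


def describe_module(name):
    """Return (file, version) for an importable module, or None if missing."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Not every module defines __version__, so fall back to None.
    return (getattr(module, "__file__", None),
            getattr(module, "__version__", None))


# Prints None if protobuf is not installed; otherwise the path shows
# which installation is being picked up.
print(describe_module("google.protobuf"))
```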
