[ https://issues.apache.org/jira/browse/MESOS-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253485#comment-13253485 ]

Jessica J commented on MESOS-183:
---------------------------------

Here's what I've tried, in case it helps Harvey (assuming he will be the one to 
fix the framework). Following the example of the python test_framework, I 
created a new framework using mesos_pb2.FrameworkInfo() and passed that as the 
second argument to the MesosSchedulerDriver constructor. Running nmpiexec with 
this change resulted in an assertion failure:

python: ./common/try.hpp:77: T Try<T>::get() const [with T = mesos::internal::MasterDetector*]: Assertion `state == SOME' failed.
Aborted

I eventually determined that this failure occurs because the MPI framework 
does not accept the master URL in the same format as the rest of the project. 
(This should be changed for consistency, i.e., 
mesos://master@[ipaddress]:[port] rather than [ipaddress]:[port].)
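
The two formats differ only in the mesos://master@ prefix, so a small helper 
could accept either one and emit whichever form the driver expects. The sketch 
below is illustrative only: parse_master and format_master are hypothetical 
names, not part of Mesos or the MPI framework, and the pattern assumes a plain 
hostname or IPv4 address followed by a numeric port.

```python
import re

# Hypothetical helpers, not part of Mesos. They accept either
# "[ipaddress]:[port]" or "mesos://master@[ipaddress]:[port]".
_MASTER_RE = re.compile(
    r"^(?:mesos://master@)?(?P<host>[^@/:\s]+):(?P<port>\d+)$"
)


def parse_master(url):
    """Return (host, port) for a master address in either format."""
    match = _MASTER_RE.match(url.strip())
    if match is None:
        raise ValueError("unrecognized master URL: %r" % (url,))
    return match.group("host"), int(match.group("port"))


def format_master(host, port, with_scheme):
    """Render the address with or without the mesos://master@ prefix."""
    address = "%s:%d" % (host, port)
    if with_scheme:
        return "mesos://master@" + address
    return address
```

With something like this, the framework could call parse_master on whatever 
the user supplies and then format_master with the setting that matches the 
rest of the project.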

Using the correct URL allows the framework to find the master, but then this 
error shows up on the master:

W0413 11:36:30.357491 30017 protobuf.hpp:255] Initialization errors: framework.executor
libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "mesos.internal.RegisterFrameworkMessage" because it is missing required fields: framework.executor

However, attempting to assign anything to framework.executor (before passing it 
to the MesosSchedulerDriver constructor) results in an AttributeError:

AttributeError: 'FrameworkInfo' object has no attribute 'executor'
                
> Included MPI Framework Fails to Start
> -------------------------------------
>
>                 Key: MESOS-183
>                 URL: https://issues.apache.org/jira/browse/MESOS-183
>             Project: Mesos
>          Issue Type: Bug
>          Components: documentation, framework
>         Environment: Scientific Linux Cluster
>            Reporter: Jessica J
>            Assignee: Harvey Feng 
>            Priority: Blocker
>              Labels: documentation, mpi, setup
>
> There are really two facets to this issue. The first is that no good 
> documentation exists for setting up and using the included MPI framework. The 
> second, and more important, issue is that the framework will not run. The 
> second issue is possibly related to the first, in that I may not be setting 
> it up properly. 
> By trial and error, I determined that to test the MPI framework I needed to 
> run python setup.py build and python setup.py install in the 
> MESOS-HOME/src/python directory. Now when I try to run nmpiexec -h, I get the 
> AttributeError below: 
> Traceback (most recent call last):
>   File "./nmpiexec.py", line 2, in <module>
>     import mesos
>   File "/usr/lib64/python2.6/site-packages/mesos-0.9.0-py2.6-linux-x86_64.egg/mesos.py", line 22, in <module>
>     import _mesos
>   File "/usr/lib64/python2.6/site-packages/mesos-0.9.0-py2.6-linux-x86_64.egg/mesos_pb2.py", line 1286, in <module>
>     DESCRIPTOR.message_types_by_name['FrameworkID'] = _FRAMEWORKID
> AttributeError: 'FileDescriptor' object has no attribute 'message_types_by_name'
> I've examined setup.py and determined that the version of protobuf it 
> includes (2.4.1) does, indeed, contain a FileDescriptor class in 
> descriptor.py that sets self.message_types_by_name, so I'm not sure what the 
> issue is. Is this a bug? Or is there a step I'm missing? Do I need to also 
> build/install protobuf?
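
One thing that may be worth checking for the AttributeError quoted above is 
which protobuf installation Python actually imports, since a different copy 
elsewhere on the path could be picked up ahead of the bundled 2.4.1. That 
explanation is an assumption and is not confirmed anywhere in this issue. The 
sketch below uses a hypothetical helper, describe_module, to report where a 
module was loaded from; older protobuf releases may not define __version__, so 
that value can come back as None.

```python
import importlib


def describe_module(name):
    """Return where a module was imported from, or None if it is missing.

    Hypothetical diagnostic helper; not part of Mesos or protobuf.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        "name": module.__name__,
        "file": getattr(module, "__file__", None),
        # Not every release defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
    }


if __name__ == "__main__":
    for name in ("google.protobuf", "google.protobuf.descriptor"):
        print(name, describe_module(name))
```

If the reported file path is outside the mesos egg or the bundled protobuf 
build, that would at least suggest a second installation is involved.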

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
