[ 
https://issues.apache.org/jira/browse/MESOS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985801#comment-13985801
 ] 

brian wickman commented on MESOS-857:
-------------------------------------

I propose that we restructure the Mesos Python project.  Right now it is 
split up haphazardly, even though the Python packaging ecosystem provides 
idioms for doing this correctly.

For example, src/cli is a mishmash of C++ and Python, and it contains a 
redeclaration of 'mesos' in unpackaged form that would conflict with the 
existing code in src/python.  Meanwhile, src/python bundles mesos_pb2.py, 
mesos.py and _mesos.so in the top-level namespace.  Ordinarily, if you 'pip 
install baz', you expect one top-level package name with everything residing 
underneath it, e.g. 'import baz' with baz.foo, baz.bar, baz.bak subpackages.

We should structure the mesos namespace such that bits and pieces of mesos can 
be installed a la carte.  Right now you have to go all-in and bring in the C 
extensions (which are challenging to build and have no pure source 
distribution available yet), which is a hindrance to adoption.

It seems reasonable that I might just want API stubs or the code-generated 
protobuf classes or just the CLI.  We can do this in a few ways, but it means 
splitting everything into different packages with dependencies between each 
(codified by "install_requires" in setup.py.)  The following proposal uses a 
top-level 'mesos' namespace package, but it could be done with separate 
top-level packages, e.g. mesos_api, mesos_driver, instead of mesos.api or 
mesos.driver.

I propose the following packages (which would also mirror the import namespace):

{noformat}
  mesos [nspkg]
  mesos.api [pkg]
  mesos.cli [pkg]
  mesos.driver [pkg]
  mesos.native [pkg]
  mesos.protocol [pkg]
{noformat}

mesos should be a namespace package: it contains no symbols.  But by default it 
would have install_requires on everything provided within the mesos project, so 
that 'pip install mesos' does approximately the correct thing.  But in and of 
itself, it would contain no sources.
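
As a rough sketch of what the setup.py for that top-level 'mesos' distribution 
might pass to setuptools: the distribution names follow the list above, while 
the version string and the exact pinning style are assumptions made up for 
illustration.

```python
# Hypothetical keyword arguments for the setup.py of the 'mesos' meta
# distribution. The version is a placeholder, not a real release.
VERSION = "0.0.0"

# Sub-distributions named in the proposal.
SUBPROJECTS = ["api", "cli", "driver", "native", "protocol"]

META_KWARGS = dict(
    name="mesos",
    version=VERSION,
    packages=[],  # the meta distribution ships no sources of its own
    install_requires=[
        "mesos.%s==%s" % (sub, VERSION) for sub in SUBPROJECTS
    ],
)

# A real setup.py would end with: setuptools.setup(**META_KWARGS)
```

With something like this, 'pip install mesos' pulls in every piece, while 
'pip install mesos.api' on its own stays lightweight.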

mesos.api should contain just the Scheduler, SchedulerDriver, Executor, 
ExecutorDriver (and in the future, possibly Log, LogDriver, Containerizer, 
ContainerizerDriver) stubs.  It has no dependencies on anything else.

mesos.cli should contain all the CLI commands.  It also shouldn't need to 
depend on any other packages except maybe mesos.protocol.  We can use the 
console_scripts entry point in mesos.cli to handle script installation (see 
http://www.scotttorborg.com/python-packaging/command-line-scripts.html#the-console-scripts-entry-point ).  
This means 'pip install mesos.cli' would create wrapper scripts for 
mesos-cat, mesos-ps, etc. that invoke the underlying Python modules with all 
the dependencies set up correctly, and that are put on the $PATH in the same 
place as your Python interpreter.
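
To illustrate the console_scripts idea, here is a minimal sketch of the 
relevant setup() arguments.  The command names come from the text above; the 
module paths (mesos.cli.cat, mesos.cli.ps) and the 'main' function name are 
hypothetical.

```python
# Map each command name to a hypothetical "module:function" target.
CLI_COMMANDS = {
    "mesos-cat": "mesos.cli.cat:main",
    "mesos-ps": "mesos.cli.ps:main",
}

# Keyword arguments a setup.py for mesos.cli might pass to setuptools.setup().
CLI_KWARGS = dict(
    name="mesos.cli",
    entry_points={
        "console_scripts": [
            "%s = %s" % (command, target)
            for command, target in sorted(CLI_COMMANDS.items())
        ],
    },
)
```

Each "name = module:function" line asks the installer to generate a wrapper 
executable with that name which imports the module and calls the function.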

mesos.driver should be a small wrapper package around pkg_resources 
find_packages + get_entry_map, used to detect any Python packages in the 
environment that export concrete driver implementations (e.g. 
_mesos.MesosSchedulerDriver or _mesos.MesosExecutorDriver).  This would be 
done via EntryPoints (see 
https://pythonhosted.org/setuptools/pkg_resources.html#entry-points ).
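
A possible shape for that discovery logic is sketched below.  Note the 
assumptions: it uses the standard library's importlib.metadata instead of the 
pkg_resources API mentioned above, and the group name 'mesos.driver' is made 
up for illustration.

```python
from importlib import metadata

DRIVER_GROUP = "mesos.driver"  # hypothetical entry point group name


def installed_driver_entry_points(group=DRIVER_GROUP):
    """Return the entry points that installed distributions export in group."""
    found = metadata.entry_points()
    if hasattr(found, "select"):  # Python 3.10 and later
        return list(found.select(group=group))
    return list(found.get(group, []))  # older dict-style return value


def load_drivers(entry_points):
    """Map each entry point name to the object it refers to."""
    return {ep.name: ep.load() for ep in entry_points}
```

A caller would then do something like 
load_drivers(installed_driver_entry_points()) and pick a scheduler or executor 
driver by name, without importing _mesos directly.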

mesos.native should be the package that contains _mesos.so and, in its 
setup.py, the entry_point metadata expected by mesos.driver.  We could even go 
so far as to publish mesos.native.el5 or mesos.native.el6 binary wheels to 
PyPI in order to differentiate Linux ABIs, and still have them correctly 
detected and picked up by mesos.driver at runtime.  This strategy is also 
compatible with the pesos project (https://github.com/wickman/pesos ), which 
would just publish PesosSchedulerDriver and PesosExecutorDriver entry points 
for mesos.driver, allowing a pure Python scheduler or executor to be 
implemented.
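
For illustration, the entry point metadata in the mesos.native setup.py could 
look roughly like this.  The group name 'mesos.driver', the entry point names 
and the '_mesos' import path are assumptions; only the two class names come 
from the text above.

```python
# Hypothetical setup() keyword arguments for mesos.native. Any distribution
# exporting the same group (e.g. pesos) could be discovered the same way.
NATIVE_KWARGS = dict(
    name="mesos.native",
    entry_points={
        "mesos.driver": [
            "scheduler = _mesos:MesosSchedulerDriver",
            "executor = _mesos:MesosExecutorDriver",
        ],
    },
)
```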

Finally, mesos.protocol would be the package containing all of the 
code-generated protobuf stubs.  We could even split mesos.protocol out as a 
namespace package with separate subpackages mesos.protocol.pb and 
mesos.protocol.json.  Currently protobuf only supports Python 2.x (there are 
some branches out there with support for 3.x, but as far as I know there is no 
plan for those to reach master).  mesos.protocol.pb would have an 
install_requires on protobuf, and mesos.protocol.json would be 
dependency-free and hence friendly to Python 3.x.  Ideally there would be 
helper messages for constructing the body of libprocess messages (the "wire 
protocol").  In the future that could be ported over to the Event/Call 
interface that Ben has described.

In order to support legacy applications, we could have the mesos.legacy 
package, which would map all the above names into their _mesos, mesos_pb2 and 
mesos.* counterparts.
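
One way such a shim might work, under one reading of the mapping (old names 
resolving to the new modules), is to alias the legacy module names in 
sys.modules.  The sketch below uses a stand-in module object; the helper name 
alias_module and the stand-in FrameworkInfo class are hypothetical.

```python
import sys
import types


def alias_module(legacy_name, module):
    """Make 'import <legacy_name>' resolve to an already imported module."""
    sys.modules[legacy_name] = module
    return module


# Stand-in for a restructured module such as mesos.protocol.pb.
new_pb = types.ModuleType("mesos.protocol.pb")
new_pb.FrameworkInfo = type("FrameworkInfo", (), {})  # placeholder class

alias_module("mesos_pb2", new_pb)

import mesos_pb2  # resolved from sys.modules, no file on disk is needed
```

Existing code that does 'import mesos_pb2' would then keep working as long as 
the shim has been imported first.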

> restructure mesos python namespace
> ----------------------------------
>
>                 Key: MESOS-857
>                 URL: https://issues.apache.org/jira/browse/MESOS-857
>             Project: Mesos
>          Issue Type: Improvement
>          Components: python api
>            Reporter: brian wickman
>
> Right now the mesos_pb2 and mesos dependencies are bundled together into the 
> mesos egg. We have some tooling that uses just the compiled protobufs, but 
> because they're lumped together with the mesos egg, we get all the 
> dependency/platform nightmare that comes along with it, not to mention the 
> bloat of including 20MB of .so files.  This proposes splitting the mesos 
> protobufs into a separate mesos_pb distribution that the mesos distribution 
> should depend upon via install_requires (e.g. "mesos_pb==0.15.0-rc4")



--
This message was sent by Atlassian JIRA
(v6.2#6252)
