Wes McKinney created ARROW-8518:
-----------------------------------

             Summary: [Python] Create tools to enable optional components (like 
Gandiva, Flight) to be built and deployed as separate Python packages
                 Key: ARROW-8518
                 URL: https://issues.apache.org/jira/browse/ARROW-8518
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Packaging, Python
            Reporter: Wes McKinney
             Fix For: 1.0.0


Our current monolithic approach to Python packaging isn't likely to be 
sustainable long-term.

At a high level, I would propose a structure like this:

{code}
# core package containing libarrow, libarrow_python, and any other common
# bundled C++ library dependencies
pip install pyarrow

pip install pyarrow-flight  # installs pyarrow, pyarrow_flight

pip install pyarrow-gandiva # installs pyarrow, pyarrow_gandiva
{code}

We can maintain the semantic appearance of a single {{pyarrow}} package by 
having thin API modules that would look like

{code}
# Contents of pyarrow/flight.py

from pyarrow_flight import *
{code}
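
As a rough sketch of how such a thin module could behave when the child package is not installed, the helpers below wrap the import and re-export step. The names import_optional_component and reexport_public_names are hypothetical and not part of pyarrow; the error message wording is also only an assumption.

```python
import importlib


def import_optional_component(module_name, pip_name):
    """Import a separately packaged component, or fail with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        # Only translate the error when the component itself is missing,
        # not when one of its own dependencies fails to import.
        if exc.name != module_name:
            raise
        raise ImportError(
            f"{module_name} is not installed; try `pip install {pip_name}`"
        ) from exc


def reexport_public_names(module, namespace):
    """Copy a module's public names into namespace, like `from m import *`."""
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in vars(module) if not n.startswith("_")]
    for name in names:
        namespace[name] = getattr(module, name)
    return sorted(names)
```

Under these assumptions, pyarrow/flight.py could call import_optional_component("pyarrow_flight", "pyarrow-flight") and then reexport_public_names(module, globals()), so that a user without the child wheel sees an install hint rather than a bare import failure.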

Obviously, this is more difficult to build and package:

* CMake and setup.py files must be refactored a bit so that we can reuse code 
between the parent and child packages
* Separate conda and wheel packages must be produced. With conda this seems 
straightforward, but because the child wheels depend on the parent core 
wheel, the wheel build process seems more complicated
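
To illustrate the kind of code that could be shared between the child packages' setup.py files, here is a minimal sketch of a helper that builds the setup() keyword arguments for one optional component. The helper name child_package_metadata is hypothetical, and pinning the core wheel to the exact same version is an assumption, not something decided in this issue.

```python
def child_package_metadata(component, arrow_version):
    """Build setup() keyword arguments for one optional component."""
    return {
        # Distribution name, e.g. "pyarrow-flight"
        "name": f"pyarrow-{component}",
        "version": arrow_version,
        # Importable package name, e.g. "pyarrow_flight"
        "packages": [f"pyarrow_{component}"],
        # Assumption: pin the core wheel to the same release so the child
        # is always installed alongside a matching pyarrow.
        "install_requires": [f"pyarrow=={arrow_version}"],
    }
```

A child setup.py could then be reduced to something like setup(**child_package_metadata("flight", "1.0.0")), plus whatever extension-building arguments the refactored CMake integration ends up needing.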

In any case, I don't think these challenges are insurmountable. This will have 
several benefits:

* Smaller installation footprint for simple use cases (though note we are STILL 
duplicating shared libraries in the wheels, which is quite bad)
* Less developer anxiety about expanding the scope of what Python code is 
shipped from apache/arrow. If in 5 years we are shipping 5 different Python 
wheels with each Apache Arrow release, that sounds completely fine to me. 


