wesm opened a new pull request #7334:
URL: https://github.com/apache/arrow/pull/7334


   Current manylinux wheel packages on master:
   
   * .whl manylinux1 package is **61 MB**
   * Installed size is **223 MB**
   
   With this patch:
   
   * .whl package is **15 MB**
   * Installed size is **57 MB**
   
   That's roughly a 4x size reduction in both cases. There are several changes in this patch:
   
   * We no longer ship 2 copies of the shared libraries in the wheels. We now 
ship only the SO-versioned shared libraries. Because this creates problems for 
linkers (`-larrow -lparquet` etc. won't work as is), I added a function that 
tries to create the necessary symlinks when you call 
`pyarrow.get_library_dirs()`. If pyarrow is installed somewhere where you can't 
create symlinks and the symlinks don't exist, it will print a message 
instructing you to run the symlinking function as root. This was the simplest 
strategy I could think of to get out of this mess.
   * Gandiva is disabled. If we're going to ship Gandiva as a wheel, I think we 
should do it as an add-on `pyarrow_gandiva` package per ARROW-8518.
   * Added the environment variable PYARROW_INSTALL_TESTS, which can be used to 
skip installing `pyarrow.tests` (about 2.3 MB uncompressed). I don't think we 
need to ship the tests in the wheels. 
   * Compiled Cython sources are no longer shipped. 
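   
   To illustrate the symlinking idea from the first bullet, here is a minimal 
sketch. It is not the code in this patch: the function name 
`create_unversioned_symlinks`, the regular expression, and the wording of the 
message are assumptions made for illustration, and it only handles Linux-style 
`libfoo.so.N` names.

```python
import os
import re

# Matches SO-versioned names such as "libarrow.so.100" (Linux naming only).
_VERSIONED_SO = re.compile(r"^(lib.+\.so)\.\d+(\.\d+)*$")


def create_unversioned_symlinks(lib_dir):
    """Create libfoo.so -> libfoo.so.N symlinks in lib_dir.

    Hypothetical helper: returns the list of link names it created.
    """
    created = []
    for name in sorted(os.listdir(lib_dir)):
        match = _VERSIONED_SO.match(name)
        if match is None:
            continue  # not an SO-versioned shared library
        link_name = match.group(1)
        link_path = os.path.join(lib_dir, link_name)
        if os.path.lexists(link_path):
            continue  # a symlink or real file is already there
        try:
            # Relative target, so the link stays valid if the directory moves.
            os.symlink(name, link_path)
        except OSError as exc:
            # Typically a permissions problem in a system-wide install.
            print(f"Could not create {link_path} ({exc}); "
                  "re-run with sufficient permissions, e.g. as root.")
            continue
        created.append(link_name)
    return created
```

   Calling the helper a second time on the same directory creates nothing, 
since existing links are skipped, so it is safe to invoke on every lookup of 
the library directories.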
   
   I'll need some help kicking the tires on macOS and Windows and to make sure 
the Crossbow builds are all passing. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

