[ 
https://issues.apache.org/jira/browse/ARROW-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17031544#comment-17031544
 ] 

Michael Marino commented on ARROW-5158:
---------------------------------------

Hi Wes, thanks for the response.  I understand the issue, and that it isn't a 
priority on the immediate timeline.  We currently work around it, so it is not 
yet critical for us.  However, especially with AWS pushing serverless for 
handling data workflows, I do expect it to become an issue for us and for 
others sometime soon. 

 

I have started looking at some possible solutions and will try to submit a PR, 
but I would need some guidance on the external requirements of the package.  
Given the conversation about this 
[here|https://discuss.python.org/t/symbolic-links-in-wheels/1945/5], it sounds 
like the libraries are packaged so that they are usable by other tools 
(e.g. pyspark?).  If this is *not* the case, then I would focus on updating 
how pyarrow itself loads the libraries, so that it handles the case where they 
come from within the wheel.  

 

 

> [Packaging][Wheel] Symlink libraries in wheels
> ----------------------------------------------
>
>                 Key: ARROW-5158
>                 URL: https://issues.apache.org/jira/browse/ARROW-5158
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Packaging, Python
>            Reporter: Krisztian Szucs
>            Priority: Major
>              Labels: wheel
>
> Libraries are copied instead of symlinked in the linux and osx wheels, which 
> results in quite big binaries:
>  
> This is what the wheel contains before running auditwheel:
>  
> {code}
> -rwxr-xr-x  1 root root 128K Apr  3 09:02 libarrow_boost_filesystem.so
> -rwxr-xr-x  1 root root 128K Apr  3 09:02 libarrow_boost_filesystem.so.1.66.0
> -rwxr-xr-x  1 root root 1.2M Apr  3 09:02 libarrow_boost_regex.so
> -rwxr-xr-x  1 root root 1.2M Apr  3 09:02 libarrow_boost_regex.so.1.66.0
> -rwxr-xr-x  1 root root  30K Apr  3 09:02 libarrow_boost_system.so
> -rwxr-xr-x  1 root root  30K Apr  3 09:02 libarrow_boost_system.so.1.66.0
> -rwxr-xr-x  1 root root 1.4M Apr  3 09:02 libarrow_python.so
> -rwxr-xr-x  1 root root 1.4M Apr  3 09:02 libarrow_python.so.14
> -rwxr-xr-x  1 root root  12M Apr  3 09:02 libarrow.so
> -rwxr-xr-x  1 root root  12M Apr  3 09:02 libarrow.so.14
> -rw-r--r--  1 root root 6.1M Apr  3 09:02 lib.cpp
> -rwxr-xr-x  1 root root 2.4M Apr  3 09:02 lib.cpython-36m-x86_64-linux-gnu.so
> -rwxr-xr-x  1 root root  55M Apr  3 09:02 libgandiva.so
> -rwxr-xr-x  1 root root  55M Apr  3 09:02 libgandiva.so.14
> -rwxr-xr-x  1 root root 2.9M Apr  3 09:02 libparquet.so
> -rwxr-xr-x  1 root root 2.9M Apr  3 09:02 libparquet.so.14
> -rwxr-xr-x  1 root root 309K Apr  3 09:02 libplasma.so
> -rwxr-xr-x  1 root root 309K Apr  3 09:02 libplasma.so.14
>  {code}
> After running auditwheel, the repaired wheel contains:
>  
> {code}
> -rwxr-xr-x  1 root root 128K Apr  3 09:02 libarrow_boost_filesystem.so
> -rwxr-xr-x  1 root root 128K Apr  3 09:02 libarrow_boost_filesystem.so.1.66.0
> -rwxr-xr-x  1 root root 1.2M Apr  3 09:02 libarrow_boost_regex.so
> -rwxr-xr-x  1 root root 1.2M Apr  3 09:02 libarrow_boost_regex.so.1.66.0
> -rwxr-xr-x  1 root root  30K Apr  3 09:02 libarrow_boost_system.so
> -rwxr-xr-x  1 root root  30K Apr  3 09:02 libarrow_boost_system.so.1.66.0
> -rwxr-xr-x  1 root root 1.6M Apr  3 09:55 libarrow_python.so
> -rwxr-xr-x  1 root root 1.4M Apr  3 09:02 libarrow_python.so.14
> -rwxr-xr-x  1 root root  12M Apr  3 09:55 libarrow.so
> -rwxr-xr-x  1 root root  12M Apr  3 09:02 libarrow.so.14
> -rw-r--r--  1 root root 6.1M Apr  3 09:02 lib.cpp
> -rwxr-xr-x  1 root root 2.5M Apr  3 09:55 lib.cpython-36m-x86_64-linux-gnu.so
> -rwxr-xr-x  1 root root  59M Apr  3 09:55 libgandiva.so
> -rwxr-xr-x  1 root root  55M Apr  3 09:02 libgandiva.so.14
> -rwxr-xr-x  1 root root 3.5M Apr  3 09:55 libparquet.so
> -rwxr-xr-x  1 root root 2.9M Apr  3 09:02 libparquet.so.14
> -rwxr-xr-x  1 root root 345K Apr  3 09:55 libplasma.so
> -rwxr-xr-x  1 root root 309K Apr  3 09:02 libplasma.so.14
> {code}
>  
> Here is the output of auditwheel: 
> [https://travis-ci.org/kszucs/crossbow/builds/514605723#L3340]
> They should be symlinks; we have special code for this: 
> https://github.com/apache/arrow/blob/4495305092411e8551c60341e273c8aa3c14b282/python/setup.py#L489-L499
> This is probably not going into the wheel, as wheels are zip files, which 
> don't support symlinks by default. So we probably need to pass the 
> `--symlinks` parameter to the wheel code.
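
To illustrate the point about zip files and symlinks, here is a minimal sketch 
of how a symlink entry can be recorded in a zip archive with Python's standard 
`zipfile` module, by setting the Unix file-type bits in `external_attr`.  The 
helper names `add_symlink` and `is_symlink` are hypothetical and are not part 
of pyarrow, wheel, or auditwheel.  This only shows what the archive format 
allows; whether installers honor such entries when unpacking a wheel is a 
separate question, and the discussion linked in the comment above suggests 
they may not.

```python
import io
import stat
import zipfile


def add_symlink(zf, link_name, target):
    """Store `link_name` in the archive as a symlink pointing at `target`.

    Hypothetical helper: the link target is written as the entry's data and
    the entry is flagged as a symlink through its Unix mode bits.
    """
    info = zipfile.ZipInfo(link_name)
    info.create_system = 3  # 3 = Unix, so the mode bits below are meaningful
    info.external_attr = (stat.S_IFLNK | 0o777) << 16
    zf.writestr(info, target)


def is_symlink(info):
    """Return True if a ZipInfo entry carries the Unix symlink file type."""
    return stat.S_ISLNK(info.external_attr >> 16)


# Build a small in-memory archive: one regular file plus a symlink to it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("libarrow.so.14", b"placeholder library contents")
    add_symlink(zf, "libarrow.so", "libarrow.so.14")

# Read the archive back and report which entries are flagged as symlinks.
with zipfile.ZipFile(buf) as zf:
    for entry in zf.infolist():
        kind = "symlink" if is_symlink(entry) else "file"
        print(entry.filename, kind, entry.file_size)
```

Note that `zipfile` itself does not recreate symlinks on extraction; a consumer 
has to check the mode bits and create the link explicitly, which is part of why 
this is an illustration rather than a proposed fix.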



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
