[ https://issues.apache.org/jira/browse/ARROW-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026792#comment-17026792 ]
Vladimir commented on ARROW-7728:
---------------------------------
Thank you for the response, [~apitrou]!
Can I manually remove one of them from the package after installation, or
would that have unintended consequences?
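
Before removing anything, it may help to confirm which files in the installed package are byte-identical copies rather than symlinks. Below is a minimal sketch of such a check, assuming the package directory is passed in by the caller; `file_digest` and `find_duplicate_files` are hypothetical helper names, not part of pyarrow. It only reports duplicates and says nothing about whether removing one is safe.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_files(root):
    """Group regular files under `root` by content.

    Returns a list of groups (sorted lists of paths), one per set of
    byte-identical files. Symlinks are skipped because they do not
    take up extra space.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicate_files(os.path.dirname(pyarrow.__file__))` after `import pyarrow` would list pairs such as the two libarrow files from the description, provided they are stored as separate copies.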
> Duplicated binaries in the python package
> -----------------------------------------
>
> Key: ARROW-7728
> URL: https://issues.apache.org/jira/browse/ARROW-7728
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.15.1
> Reporter: Vladimir
> Priority: Minor
>
> Hello,
>
> I'm not sure if it is a desired feature or not, but there's no "question"
> issue type, so I'm opening it as a bug - please correct if necessary.
>
> Most of the binary files in the Python "pyarrow" package are present in two
> versions, e.g.:
>
> {code:java}
> libarrow.so
> libarrow.so.15
> {code}
> or
> {code:java}
> libarrow.dylib
> libarrow.15.dylib
> {code}
> (I presume that ".15" corresponds to the version of pyarrow?)
> The two files are actually identical:
> {code:java}
> $ diff libarrow.15.dylib libarrow.dylib # returns nothing
> {code}
> So let me ask:
> - Is it necessary to have both of them in the distribution?
> - Which one is actually imported, and is it safe to remove the other one?
>
> Out of the 130 MB of a full pyarrow install, 105 MB are those binaries, so
> removing the duplicates would save quite some space (especially important when
> using pyarrow in AWS Lambda, where the function size is limited).
--
This message was sent by Atlassian Jira
(v8.3.4#803005)