On 3 April 2019 at 02:23, Wes McKinney wrote:
>
> $ ll Library/lib/
> total 741796
> -rw-r--r-- 1 wesm wesm   1507048 Mar 27 23:34 arrow.lib
> -rw-r--r-- 1 wesm wesm     76184 Mar 27 23:35 arrow_python.lib
> -rw-r--r-- 1 wesm wesm  61322082 Mar 27 23:36 arrow_python_static.lib
> -rw-r--r-- 1 wesm wesm 328090044 Mar 27 23:37 arrow_static.lib
> drwxr-xr-x 3 wesm wesm      4096 Apr  2 19:12 cmake/
> -rw-r--r-- 1 wesm wesm    302496 Mar 27 23:38 gandiva.lib
> -rw-r--r-- 1 wesm wesm 239314018 Mar 27 23:40 gandiva_static.lib
> -rw-r--r-- 1 wesm wesm    491292 Mar 27 23:41 parquet.lib
> -rw-r--r-- 1 wesm wesm 128473780 Mar 27 23:42 parquet_static.lib
> drwxr-xr-x 2 wesm wesm      4096 Apr  2 19:12 pkgconfig/
>
> As a mitigating measure in the meantime, I would suggest that we stop
> bundling the static libraries in the arrow-cpp conda package, since
> we're just hurting release managers and users with a large package
> download when they `conda install pyarrow`.

Agreed.

> Can someone open a JIRA
> issue about this?

See https://issues.apache.org/jira/browse/ARROW-5101

> There's something very odd here, though, which is that libgandiva.so
> and libgandiva.so.13 appear to be distinct.

Not only those: libparquet.so, libplasma.so and libarrow.so are distinct
as well. This means that we may be building those libraries twice
instead of copying the files. By the way, I don't understand why those
are not symlinks.

> That seems buggy to me. We might also investigate if there's a way to
> trim the binary sizes in some way.

Well, there's always "strip -s", but it doesn't seem to remove much
(libgandiva.so shrinks from 60 to 50 MB, and you lose all debug
information).

One issue seems to be that libgandiva.so links LLVM statically, but
doesn't hide LLVM symbols. That said, libllvmlite.so (which hides LLVM
symbols) has grown quite large recently as well (around 40 MB).

Perhaps Gandiva needs to be packaged separately...

Regards

Antoine.
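
For anyone who wants to reproduce the observation about the unversioned and versioned libraries, here is a minimal sketch of how one might tell a symlink, a byte-identical copy, and a genuinely different build apart. The helper names (`file_digest`, `classify_pair`) are hypothetical, and comparing SHA-256 digests is only an assumption about what "distinct" should mean here; nothing in this snippet comes from the Arrow build or packaging scripts.

```python
import hashlib
import os
import tempfile


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path` (symlinks are followed)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def classify_pair(unversioned, versioned):
    """Classify how e.g. libfoo.so relates to libfoo.so.13 (hypothetical helper)."""
    if os.path.islink(unversioned):
        # The usual layout: the unversioned name is a symlink to the versioned file.
        if os.path.realpath(unversioned) == os.path.realpath(versioned):
            return "symlink"
        return "symlink-elsewhere"
    if os.path.samefile(unversioned, versioned):
        # Same inode under two names.
        return "hardlink"
    if file_digest(unversioned) == file_digest(versioned):
        # Separate files with identical bytes: the file was copied, not rebuilt.
        return "copy"
    # Separate files with different bytes: possibly two separate builds.
    return "distinct"
```

Calling `classify_pair("lib/libgandiva.so", "lib/libgandiva.so.13")` inside an unpacked package (paths are illustrative) would return "distinct" if the two files really differ byte for byte, which is the case that suggests the library is being built twice.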