[ https://issues.apache.org/jira/browse/ARROW-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881282#comment-16881282 ]
Uwe L. Korn commented on ARROW-5886:
------------------------------------

Actually the work of {{auditwheel repair}} should be to rename these libs and use {{patchelf}} on all binaries so that they link to the renamed ones. If there is a binary that still links to the old name, then there is a bug in {{auditwheel repair}}.

> [Python][Packaging] Manylinux1/2010 compliance issue with libz
> --------------------------------------------------------------
>
> Key: ARROW-5886
> URL: https://issues.apache.org/jira/browse/ARROW-5886
> Project: Apache Arrow
> Issue Type: Bug
> Components: Packaging, Python
> Affects Versions: 0.14.0
> Reporter: Krisztian Szucs
> Priority: Major
>
> So we statically link liblz4 in the manylinux1 wheels
> {code}
> # ldd pyarrow-manylinux1/libarrow.so.14 | grep z
> libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fc28cef4000)
> {code}
> but dynamically in the manylinux2010 wheels
> {code}
> # ldd pyarrow-manylinux2010/libarrow.so.14 | grep z
> liblz4.so.1 => not found (already deleted to reproduce the issue)
> libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f56f7440000)
> {code}
> this is what this PR resolves.
> What I find strange is that auditwheel seems to bundle libz for manylinux1:
> {code}
> # ls -lah pyarrow-manylinux1/*z*so.*
> -rwxr-xr-x 1 root root 115K Jun 29 00:14 pyarrow-manylinux1/libz-7f57503f.so.1.2.11
> {code}
> while ldd still uses the system libz:
> {code}
> # ldd pyarrow-manylinux1/libarrow.so.14 | grep z
> libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f91fcf3f000)
> {code}
> For manylinux2010 we also have liblz4:
> {code}
> # ls -lah pyarrow-manylinux2010/*z*so.*
> -rwxr-xr-x 1 root root 191K Jun 28 23:38 pyarrow-manylinux2010/liblz4-8cb8bdde.so.1.8.3
> -rwxr-xr-x 1 root root 115K Jun 28 23:38 pyarrow-manylinux2010/libz-c69b9943.so.1.2.11
> {code}
> and ldd similarly tries to load the system libs:
> {code}
> # ldd pyarrow-manylinux2010/libarrow.so.14 | grep z
> liblz4.so.1 => not found
> libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fd72764e000)
> {code}
> Inspecting manylinux1 with `LD_DEBUG=files,libs ldd libarrow.so.14`, it seems
> to search the right path but cannot find the hashed version of libz,
> `libz-7f57503f.so.1.2.11`:
> {code}
> 463: file=libz.so.1 [0]; needed by ./libarrow.so.14 [0]
> 463: find library=libz.so.1 [0]; searching
> 463: search path=/tmp/pyarrow-manylinux1/. (RPATH from file ./libarrow.so.14)
> 463: trying file=/tmp/pyarrow-manylinux1/./libz.so.1
> 463: search cache=/etc/ld.so.cache
> 463: trying file=/lib/x86_64-linux-gnu/libz.so.1
> {code}
> There is no `libz.so.1`, just `libz-7f57503f.so.1.2.11`.
> Similarly for manylinux2010 and libz:
> {code}
> 470: file=libz.so.1 [0]; needed by ./libarrow.so.14 [0]
> 470: find library=libz.so.1 [0]; searching
> 470: search path=/tmp/pyarrow-manylinux2010/.
> (RPATH from file ./libarrow.so.14)
> 470: trying file=/tmp/pyarrow-manylinux2010/./libz.so.1
> 470: search cache=/etc/ld.so.cache
> 470: trying file=/lib/x86_64-linux-gnu/libz.so.1
> {code}
> for liblz4 (again, I've deleted the system one):
> {code}
> 470: file=liblz4.so.1 [0]; needed by ./libarrow.so.14 [0]
> 470: find library=liblz4.so.1 [0]; searching
> 470: search path=/tmp/pyarrow-manylinux2010/. (RPATH from file ./libarrow.so.14)
> 470: trying file=/tmp/pyarrow-manylinux2010/./liblz4.so.1
> 470: search cache=/etc/ld.so.cache
> 470: search path=/lib/x86_64-linux-gnu/tls/x86_64:/lib/x86_64-linux-gnu/tls:/lib/x86_64-linux-gnu/x86_64:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu/tls/x86_64:/usr/lib/x86_64-linux-gnu/tls:/usr/lib/x86_64-linux-gnu/x86_6$
> :/usr/lib/x86_64-linux-gnu:/lib/tls/x86_64:/lib/tls:/lib/x86_64:/lib:/usr/lib/tls/x86_64:/usr/lib/tls:/usr/lib/x86_64:/usr/lib
> (system search path)
> {code}
> There is neither `libz.so.1` nor `liblz4.so.1`, just `libz-c69b9943.so.1.2.11`
> and `liblz4-8cb8bdde.so.1.8.3`.
> According to https://www.python.org/dev/peps/pep-0571/ neither `liblz4` nor `libz`
> is part of the whitelist, and while these are bundled with the wheel, they
> seemingly cannot be found - perhaps because of the hash in the library name?
> I've tried to inspect the wheels with `auditwheel show` using versions `2` and
> `1.10`; both say the following:
> {code}
> # auditwheel show pyarrow-0.14.0-cp37-cp37m-manylinux2010_x86_64.whl
> pyarrow-0.14.0-cp37-cp37m-manylinux2010_x86_64.whl is consistent with
> the following platform tag: "linux_x86_64".
> The wheel references external versioned symbols in these system-
> provided shared libraries: libgcc_s.so.1 with versions {'GCC_3.3',
> 'GCC_3.4', 'GCC_3.0'}, libpthread.so.0 with versions {'GLIBC_2.3.3',
> 'GLIBC_2.12', 'GLIBC_2.2.5', 'GLIBC_2.3.2'}, libc.so.6 with versions
> {'GLIBC_2.4', 'GLIBC_2.6', 'GLIBC_2.2.5', 'GLIBC_2.7', 'GLIBC_2.3.4',
> 'GLIBC_2.3.2', 'GLIBC_2.3'}, libstdc++.so.6 with versions
> {'CXXABI_1.3', 'GLIBCXX_3.4.10', 'GLIBCXX_3.4.9', 'GLIBCXX_3.4.11',
> 'GLIBCXX_3.4.5', 'GLIBCXX_3.4', 'CXXABI_1.3.2', 'CXXABI_1.3.3'},
> librt.so.1 with versions {'GLIBC_2.2.5'}, libm.so.6 with versions
> {'GLIBC_2.2.5'}, libdl.so.2 with versions {'GLIBC_2.2.5'}, libz.so.1
> with versions {'ZLIB_1.2.0'}
> This constrains the platform tag to "manylinux2010_x86_64". In order
> to achieve a more compatible tag, you would need to recompile a new
> wheel from source on a system with earlier versions of these
> libraries, such as a recent manylinux image.
> {code}
> {code}
> # auditwheel show pyarrow-0.14.0-cp37-cp37m-manylinux1_x86_64.whl
> pyarrow-0.14.0-cp37-cp37m-manylinux1_x86_64.whl is consistent with the
> following platform tag: "linux_x86_64".
> The wheel references external versioned symbols in these system-
> provided shared libraries: libgcc_s.so.1 with versions {'GCC_3.4',
> 'GCC_3.0', 'GCC_3.3'}, libc.so.6 with versions {'GLIBC_2.3',
> 'GLIBC_2.2.5', 'GLIBC_2.3.4', 'GLIBC_2.4', 'GLIBC_2.3.2'},
> libstdc++.so.6 with versions {'CXXABI_1.3', 'GLIBCXX_3.4.5',
> 'GLIBCXX_3.4'}, librt.so.1 with versions {'GLIBC_2.2.5'}, libm.so.6
> with versions {'GLIBC_2.2.5'}, libpthread.so.0 with versions
> {'GLIBC_2.3.3', 'GLIBC_2.3.2', 'GLIBC_2.2.5'}, libdl.so.2 with
> versions {'GLIBC_2.2.5'}, libz.so.1 with versions {'ZLIB_1.2.0'}
> The following external shared libraries are required by the wheel:
> {
>     "libc.so.6": "/lib/x86_64-linux-gnu/libc-2.24.so",
>     "libcrypt.so.1": "/lib/x86_64-linux-gnu/libcrypt-2.24.so",
>     "libdl.so.2": "/lib/x86_64-linux-gnu/libdl-2.24.so",
>     "libgcc_s.so.1": "/lib/x86_64-linux-gnu/libgcc_s.so.1",
>     "libm.so.6": "/lib/x86_64-linux-gnu/libm-2.24.so",
>     "libnsl.so.1": "/lib/x86_64-linux-gnu/libnsl-2.24.so",
>     "libpthread.so.0": "/lib/x86_64-linux-gnu/libpthread-2.24.so",
>     "librt.so.1": "/lib/x86_64-linux-gnu/librt-2.24.so",
>     "libstdc++.so.6": "/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22",
>     "libutil.so.1": "/lib/x86_64-linux-gnu/libutil-2.24.so",
>     "libz.so.1": "/lib/x86_64-linux-gnu/libz.so.1.2.8"
> }
> In order to achieve the tag platform tag "manylinux2010_x86_64" the
> following shared library dependencies will need to be eliminated:
> libz.so.1
> In order to achieve the tag platform tag "manylinux1_x86_64" the
> following shared library dependencies will need to be eliminated:
> libz.so.1
> {code}
> I think there is more left to do with the wheels. IMO the manylinux1 wheels
> are not compliant because of `libz`, and the manylinux2010 wheels are not
> compliant because of both `libz` and `liblz4` (but incorrectly reported by
> auditwheel?).
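The mismatch traced above (the binary still asks for `libz.so.1` while only `libz-7f57503f.so.1.2.11` is bundled) could be checked mechanically. What follows is only a sketch, not what auditwheel does internally. It assumes the NEEDED entries are taken from `readelf -d` text output and that bundled copies follow the `<name>-<8 hex digits>.so.<version>` pattern seen in the listings above. The names `parse_needed` and `stale_references` are made up for illustration, and the sample `readelf` text is hand-written, not captured from the wheels.

```python
import re

# Hand-written sample in the style of `readelf -d libarrow.so.14` (an assumption,
# not real output from the wheels discussed in this issue).
SAMPLE_READELF = """
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [liblz4.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libarrow.so.14]
"""

# File names as listed for the manylinux2010 wheel in the report.
BUNDLED = ["libz-c69b9943.so.1.2.11", "liblz4-8cb8bdde.so.1.8.3"]


def parse_needed(readelf_output):
    """Return the DT_NEEDED sonames found in `readelf -d` text output."""
    return re.findall(r"\(NEEDED\)\s+Shared library: \[([^\]]+)\]", readelf_output)


def stale_references(needed, bundled):
    """Map each needed soname to bundled hashed copies it fails to reference.

    A soname is reported when a file such as libz-<8 hex>.so.* is bundled
    but the binary still asks for the unhashed name (e.g. libz.so.1).
    """
    bundled = set(bundled)
    stale = {}
    for soname in needed:
        if soname in bundled:
            continue  # already points at a bundled file
        stem = soname.split(".so", 1)[0]  # "libz.so.1" -> "libz"
        # Assumed naming scheme, inferred from the file listings in this issue.
        pattern = re.compile(rf"^{re.escape(stem)}-[0-9a-f]{{8}}\.so(\.|$)")
        matches = sorted(name for name in bundled if pattern.match(name))
        if matches:
            stale[soname] = matches
    return stale


if __name__ == "__main__":
    for soname, copies in stale_references(parse_needed(SAMPLE_READELF), BUNDLED).items():
        print(f"{soname} is still referenced, but the bundled copy is {copies}")
```

An empty result would mean that every NEEDED entry with a bundled copy already points at the renamed file, which is the state {{auditwheel repair}} is supposed to leave behind.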
> We also need to make sure to run {{auditwheel show}} on the produced wheels in
> the manylinux-test script
> https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/manylinux-test.sh

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)