Just as a point of reference, I don't think that we get any pushback at MapR
for not supporting RHEL 5, and that has been our policy for a few years now.

That experience should be pretty similar for Arrow, except that I would
expect that new adoptions might be even more canted towards current
versions.




On Tue, Sep 4, 2018 at 3:24 PM Wes McKinney <wesmck...@gmail.com> wrote:

> hi folks,
>
> Surfacing a JIRA discussion ([4]) to the mailing list for discussion.
>
> The manylinux1 ABI was developed to provide a mechanism for portable
> Python packages with pre-compiled binary extensions supporting C and
> C++, including C++11, on a wide variety of Linux distributions without
> the need for distribution-specific packages. This is accomplished using
> RedHat's devtoolset-2, which performs selective static linking of
> symbols from libstdc++ that cause ABI conflicts when used on systems
> with older standard libraries.
>
> The base image for producing these binaries is specified in a Dockerfile
> [1].
>
> The problem that we are having is that some C++ libraries, notably
> Google's Abseil C++ library, require a version of glibc that is too
> new for RHEL5. By building with CentOS6 / RHEL6 as the base image, we
> would get a new enough glibc (version 2.12). But building against
> glibc 2.12 would leave behind the RHEL5 folks.
>
> There is the in-discussion manylinux2010 standard, which uses RHEL6 as
> its base, but it is not yet finalized or in production.
>
> Some modern C++ projects shipping to Python have already left behind
> the manylinux1 standard even though their Python binaries claim to
> implement the standard. Both PyTorch and TensorFlow are tagged as
> manylinux1 although they have a different ABI. See [2] and [3] for
> examples.
>
> In my view there are two paths forward, neither perfect:
>
> 1) Stick with the manylinux1 ABI and do not use third-party libraries
> requiring a newer glibc
> 2) "Cheat" on manylinux1 by using CentOS 6 instead of CentOS 5 as the
> base image for the wheel builds. This is what PyTorch is doing
>
> Since centos5 / RHEL5 are already past EOL those would be the primary
> casualties, but I'm not sure how many users would be affected. My
> guess is that they represent a small minority of our users at this
> point. RedHat is offering extended support for RHEL5 through end of
> 2020 but those are probably fairly exceptional cases and unlikely
> (IMHO) to be working on the bleeding edge of Python data engineering.
>
> Personally I would like to go with Option 2 and hope that this
> particular Python packaging issue gets sorted out in the next 12-24
> months, as we've already suffered problems due to TensorFlow and
> PyTorch's non-conformity with the manylinux1 ABI.
>
> Interested in the opinions of others.
>
> - Wes
>
> [1]:
> https://github.com/pypa/manylinux/blob/master/docker/Dockerfile-x86_64
> [2]:
> https://github.com/NVIDIA/nvidia-docker/issues/348#issuecomment-288875848
> [3]: https://github.com/pypa/manylinux/issues/96
> [4]: https://issues.apache.org/jira/browse/ARROW-2461
>
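
For anyone wanting to check which side of the glibc 2.12 line a given
machine falls on, here is a rough sketch in Python. The names MIN_GLIBC,
parse_version, supports and local_glibc are made up for illustration and
are not part of Arrow or any packaging tool. The 2.12 figure is the one
quoted above for CentOS6 / RHEL6; the 2.5 figure for CentOS5 is the glibc
baseline that PEP 513 gives for manylinux1.

```python
import platform
import re

# Minimum glibc provided by each candidate base image (illustrative table).
# 2.12 comes from the thread; 2.5 is the manylinux1 baseline from PEP 513.
MIN_GLIBC = {"centos5": (2, 5), "centos6": (2, 12)}


def parse_version(text):
    """Turn a string such as '2.12' or '2.17-260' into a tuple like (2, 12)."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("not a glibc version string: %r" % (text,))
    return int(match.group(1)), int(match.group(2))


def supports(glibc_version, base_image):
    """True if a system with this glibc can load binaries built on base_image."""
    return parse_version(glibc_version) >= MIN_GLIBC[base_image]


def local_glibc():
    """Return the local glibc version string, or None if it cannot be detected."""
    lib, version = platform.libc_ver()
    return version if lib == "glibc" and version else None


if __name__ == "__main__":
    version = local_glibc()
    if version is None:
        print("glibc not detected on this system")
    else:
        for image in sorted(MIN_GLIBC):
            print(image, "wheels loadable:", supports(version, image))
```

The versions are compared as integer tuples rather than strings because a
plain string comparison would rank "2.5" above "2.12".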
