Neal Richardson created ARROW-6793:
--------------------------------------
Summary: [R] Arrow C++ binary packaging for Linux
Key: ARROW-6793
URL: https://issues.apache.org/jira/browse/ARROW-6793
Project: Apache Arrow
Issue Type: Improvement
Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson
Fix For: 1.0.0

Our current installation experience on Linux isn't ideal. Unless you've already
installed the Arrow C++ library, installing the R package gives you only a
shell package that tells you to install the C++ library. That approach was
useful for getting the package onto CRAN, which makes installation easy for
macOS and Windows users, but it does nothing to improve the experience for
Linux users. This is an impediment to adoption of arrow not only by users but
also by package maintainers who might want to depend on arrow.

macOS and Windows have a better experience because at installation time, the
configure scripts download and statically link a prebuilt C++ library. CRAN
bundles the whole thing up and delivers that as a binary R package.

Python wheels do a similar thing: they're binaries that contain all external
dependencies. And there are pyarrow wheels for Linux. This suggests that we
could do something similar for R: build a generic Linux binary of the C++
library and download it in the R package configure script at install time.

I experimented with having the R package use the Arrow C++ binaries included in
the Python wheels; see the discussion at the end of ARROW-5956. This worked on
macOS (not useful for R, but it proved the concept) and almost worked on Linux,
but it turned out that the "manylinux2010" standard is too old to work with
contemporary Rcpp.

Proposal: follow a workflow similar to the manylinux2010 pyarrow build, just
with a slightly more modern compiler and settings. Publish the resulting C++
binary package to Bintray. Then have the R configure script download it if a
local/system package isn't found.
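
The lookup order in this proposal can be sketched roughly as below. This is
only an illustration in Python, not the actual configure script: the function
names and the returned labels are hypothetical, and the download step is passed
in as a callable so that no URL or archive layout is assumed. The system check
assumes the installed Arrow C++ library provides an `arrow` pkg-config module.

```python
import shutil
import subprocess
from typing import Callable, Optional


def system_arrow_available() -> bool:
    """Return True if pkg-config reports an installed Arrow C++ library.

    Assumption: the system install ships an `arrow` pkg-config module.
    """
    if shutil.which("pkg-config") is None:
        return False
    result = subprocess.run(
        ["pkg-config", "--exists", "arrow"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def resolve_libarrow(
    find_system: Callable[[], bool],
    download_binary: Callable[[], Optional[str]],
) -> str:
    """Pick where the C++ library comes from (hypothetical labels).

    1. Prefer a local/system package if one is found.
    2. Otherwise try the prebuilt binary; `download_binary` returns the
       unpacked directory, or None if the download or unpack failed.
    3. Otherwise keep the current behavior: the shell package.
    """
    if find_system():
        return "system"
    prebuilt_dir = download_binary()
    if prebuilt_dir is not None:
        return "prebuilt:" + prebuilt_dir
    return "shell"
```

A call such as `resolve_libarrow(system_arrow_available, my_downloader)`, where
`my_downloader` is a placeholder for whatever fetches the binary, only reaches
the download step when no system library is detected.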

Once we have a basic version working, test against various distros on
[R-hub|https://builder.r-hub.io/advanced] to make sure we're solid everywhere
and/or confirm that we fall back to the current behavior on any distro where
the binary doesn't work. If necessary, we can make multiple flavors of this
C++ binary for Debian, CentOS, etc.
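
If per-distro flavors do become necessary, choosing one could look something
like the sketch below. It is a hedged illustration in Python, not a proposed
implementation: the `FLAVORS` table and function names are hypothetical, and it
assumes the distro can be identified from the `ID` and `ID_LIKE` fields of
`/etc/os-release`. An unrecognized distro yields `None`, which would mean
falling back to the current behavior.

```python
from typing import Dict, Optional

# Hypothetical mapping from os-release identifiers to binary flavors.
FLAVORS = {"debian": "debian", "ubuntu": "debian", "centos": "centos"}


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse KEY=value lines from os-release content, dropping quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def pick_flavor(os_release_text: str) -> Optional[str]:
    """Return the binary flavor for this distro, or None to fall back."""
    fields = parse_os_release(os_release_text)
    # Try the distro's own ID first, then the distros it declares itself like.
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for candidate in candidates:
        if candidate in FLAVORS:
            return FLAVORS[candidate]
    return None
```

In use, the text would come from reading `/etc/os-release`; a distro with
`ID=linuxmint` and `ID_LIKE="ubuntu debian"` would resolve through `ID_LIKE`.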