[R] Improving documentation and transparency for Arrow build and packaging work for R

Wes McKinney Sat, 16 Mar 2019 13:11:09 -0700

Hi folks,

I have noticed there is work under way to prepare Apache Arrow for
submission to the CRAN package manager for R users. I'm slightly
concerned about the lack of information and documentation in the
project regarding what is involved with this effort. This patch in
particular raised some eyebrows:


https://github.com/apache/arrow/pull/3932

This introduces a dependency into the project on pre-built static
libraries based on processes that aren't documented in the project. I
see this repository containing these static libraries for the R
Windows toolchain, but if I needed to produce them myself I would not
know what to do:

https://github.com/rwinlib/arrow

Additionally, in general, if I wanted to build and test Arrow and R
from source on Windows, I also would not know what to do.

In the Python world, this would be akin to depending on e.g.
conda-forge packages for Windows development, but not having any
information in the repository about how to build Arrow C++ and Python from
source on Windows.

So I would like to see some transparency / documentation around the
scripts and processes involved with this so that we don't end up with
a "bus factor" problem where Arrow PMC members are unable to undertake
basic maintenance and release management activities. Currently the
work that is going on seems opaque to me and as such feels contrary to
the Apache Way.

I understand that there is some urgency to make the Arrow libraries
available to R users, but I want to make sure we are working in a
sustainable manner to grow a community of developers who are able to
do work on each part of the project.

Thanks,
Wes
