rolweber opened a new issue #10226: URL: https://github.com/apache/arrow/issues/10226
Hello,

I'm building container images with conda environments that include both pyarrow and R arrow from CRAN. The builds were stable for several weeks. Then there were problems two weeks ago, which eventually resolved (see [ARROW-12502](https://issues.apache.org/jira/browse/ARROW-12502) in JIRA). Today, builds are breaking again, not related to the previous problem afaict. I'm looking for advice to

1. Get the builds working again.
2. Make the installation less fragile for the future.

On Linux x86, I first install pyarrow 3.0.0 from PyPI, which uses a wheel with pre-built native libs. Then I install Arrow 3.0.0 from CRAN, which tries to build its own native libs, I think. Then I run a few unit tests to make sure that both pyarrow and R arrow are working and can exchange data.

When there are problems with installing R arrow from CRAN, the build doesn't necessarily fail at that step, but only later in the unit tests. That's what's happening today. I've set ARROW_R_DEV=true to get some information about the installation problem, as the default output doesn't even print an error message. This is the problem today (builds were still working last Friday):

```txt
-- thrift_ep configure command succeeded. See also /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-configure-*.log
[ 50%] Performing build step for 'thrift_ep'
-- thrift_ep build command succeeded. See also /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-build-*.log
[ 51%] Performing install step for 'thrift_ep'
CMake Error at /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-install-RELEASE.cmake:37 (message):
Command failed: 2
'make' 'install'
See also
/tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-install-*.log
-- stdout output is:
-- stderr output is:
make[3]: *** No rule to make target 'install'. Stop.
CMake Error at /tmp/RtmpE3CJCi/file19d083c11a8/thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-install-RELEASE.cmake:47 (message):
Stopping after outputting logs.
make[2]: *** [CMakeFiles/thrift_ep.dir/build.make:93: thrift_ep-prefix/src/thrift_ep-stamp/thrift_ep-install] Error 1
make[1]: *** [CMakeFiles/Makefile2:758: CMakeFiles/thrift_ep.dir/all] Error 2
gmake: *** [Makefile:160: all] Error 2
+ popd
/tmp/RtmpFiYDeK/R.INSTALL19b11d5af287/arrow
------------------------- NOTE ---------------------------
See https://arrow.apache.org/docs/r/articles/install.html
for help installing Arrow C++ libraries
```

I'm also building images for Linux on Power (ppc64le). There, I couldn't install pyarrow from PyPI, because there are no wheels for that platform, and the source compilation failed. I eventually built a custom version of conda packages pyarrow and arrow-cpp. Then I'm installing Arrow from CRAN. This is still working today. I've enabled debug output here as well, to compare with x86. This is what I see there:

```txt
inst/build_arrow_static.sh: line 54: /tmp/RtmpR54GjL/file28b55b5e1923/cmake-3.19.2-Linux-x86_64/bin/cmake: cannot execute binary file: Exec format error
+ /tmp/RtmpR54GjL/file28b55b5e1923/cmake-3.19.2-Linux-x86_64/bin/cmake --build . --target install
inst/build_arrow_static.sh: line 84: /tmp/RtmpR54GjL/file28b55b5e1923/cmake-3.19.2-Linux-x86_64/bin/cmake: cannot execute binary file: Exec format error
+ popd
/tmp/RtmpfCx9q8/R.INSTALL28965204fdde/arrow
PKG_CFLAGS=-I/tmp/RtmpfCx9q8/R.INSTALL28965204fdde/arrow/libarrow/arrow-3.0.0/include -DARROW_R_WITH_ARROW
PKG_LIBS=-larrow_dataset -lparquet -larrow
** libs
```

So `cmake` is not even running on that platform, yet I get Docker images that work and pass the unit tests. Apparently, the native libs from the arrow-cpp conda package are found automatically, and satisfy whatever the R installation needs to compile its `arrow.so` library.
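
Since the install step itself does not fail in either case, one way to stop the build early might be to capture the install output and scan it for the error markers that appear in the two logs above. The following is only a minimal sketch under that assumption: the function name `find_install_failures` and the marker list are hypothetical, taken from these two logs, and not part of arrow or of any existing tool.

```python
import re

# Failure markers copied from the two logs in this issue.
# This list is an assumption and certainly not exhaustive.
FAILURE_PATTERNS = [
    re.compile(r"CMake Error"),
    re.compile(r"cannot execute binary file"),
    re.compile(r"\*\*\* .*\bStop\."),        # e.g. "make[3]: *** No rule ... Stop."
    re.compile(r"\*\*\* \[.*\] Error \d+"),  # e.g. "gmake: *** [Makefile:160: all] Error 2"
]


def find_install_failures(log_text):
    """Return the stripped log lines that match any known failure marker."""
    hits = []
    for line in log_text.splitlines():
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS):
            hits.append(line.strip())
    return hits
```

A build script could pass the captured output of the R package installation to this function and exit non-zero when the returned list is not empty. On ppc64le that check would flag the `Exec format error` lines even though the resulting image works, so it would have to be relaxed there.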
Ideally, I'd want the native libs from the PyPI wheel to be used by R arrow on the x86 platform. But symlinking the files into the lib directory where arrow-cpp puts them on ppc64le didn't do the trick. Is there a way to tell R Arrow to use existing libs, and where those libs are located? If I have to install from a source tarball instead of CRAN, that would be OK. I'm more concerned about robustness than comfort.

I'll try to collect more information about the thrift_ep problem. Because the installation does not fail, temporary files get removed. Maybe the problem even auto-resolves in a day or two, just as suddenly as it appeared. But I'm afraid this won't be the last time that something breaks during the R arrow installation, so I'd prefer to reduce the number of things that need to be downloaded and compiled at installation time.
