[
https://issues.apache.org/jira/browse/ARROW-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459457#comment-17459457
]
hu geme commented on ARROW-15072:
---------------------------------
Hi [~willjones127], thanks a lot for putting your time into this. Just a small
piece of feedback: running your Dockerfile on macOS 11.6.1 results in
#7 451.8 make[2]: *** [CMakeFiles/awssdk_ep.dir/build.make:132: awssdk_ep-prefix/src/awssdk_ep-stamp/awssdk_ep-build] Error 1
#7 451.8 make[1]: *** [CMakeFiles/Makefile2:760: CMakeFiles/awssdk_ep.dir/all] Error 2
#7 451.8 gmake: *** [Makefile:160: all] Error 2
#7 451.8 **** Error building Arrow C++.
#7 457.3 ------------------------- NOTE ---------------------------
#7 457.3 There was an issue preparing the Arrow C++ libraries.
#7 457.3 See https://arrow.apache.org/docs/r/articles/install.html
#7 457.3 ---------------------------------------------------------
#7 457.5 ERROR: configuration failed for package ‘arrow’
#7 457.6 * removing ‘/usr/local/lib/R/site-library/arrow’
#7 458.2
#7 458.2 The downloaded source packages are in
#7 458.2 ‘/tmp/downloaded_packages’
#7 458.2 Error: installation of package ‘arrow’ had non-zero exit status
#7 458.2 In addition: Warning message:
#7 458.2 In install.packages(pkgs, ...) :
#7 458.2 installation of package ‘arrow’ had non-zero exit status
#7 ERROR: executor failed running [/bin/sh -c install2.r --error arrow]: exit code: 1
However, the same build runs without any issues on Ubuntu 18.04.6 LTS (GNU/Linux
5.4.0-1053-gcp x86_64).
After modifying the Dockerfile as follows, it seems to work on macOS 11.6.1,
though I do not know why :)
{code:java}
FROM rocker/r-base:4.1.2
# TO READ FROM S3
RUN apt update -qq \
&& apt install -t unstable -y --no-install-recommends \
libcurl4-openssl-dev
ENV LIBARROW_MINIMAL false
ENV ARROW_DEV true
ENV LIBARROW_BINARY true
RUN R -e "install.packages('arrow', type = 'source')"{code}
Thanks a lot! I will close this issue, and my apologies for categorising it as a bug.
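For anyone hitting the same error, a quick way to confirm whether a given build
has Dataset and S3 support is to print the package's capability report inside
the container. The following is only a minimal sketch and not part of the
Dockerfile above: the function name check_arrow_features is made up for
illustration, and it assumes Rscript and the arrow package are present in the
image. It relies on arrow::arrow_info(), which reports what the installed build
was compiled with.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration only): print the
# capability report of the installed arrow R package.
check_arrow_features() {
  # Assumption: this runs inside the container, where Rscript is on PATH.
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found; run this inside the R container" >&2
    return 2
  fi
  # arrow_info() lists the build's capabilities (dataset, s3, ...).
  Rscript -e 'print(arrow::arrow_info())'
}
```

This could be called interactively in the running container or from a final
RUN step. If the dataset or s3 capability is reported as FALSE, calls such as
open_dataset() on an S3 path will most likely fail as described in this issue.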
> [R] Error: This build of the arrow package does not support Datasets
> --------------------------------------------------------------------
>
> Key: ARROW-15072
> URL: https://issues.apache.org/jira/browse/ARROW-15072
> Project: Apache Arrow
> Issue Type: Bug
> Components: Parquet, R
> Affects Versions: 6.0.1
> Environment: x86_64-pc-linux-gnu (64-bit) via rocker/docker
> rocker/r-base:4.1.2
> Reporter: hu geme
> Priority: Minor
> Fix For: 6.0.1
>
>
> Hello,
> I would like to report a possible issue (or I may not have understood the
> documentation, in which case I apologize in advance).
> I'm trying to use R with arrow on Docker in {*}order to read Parquet files
> from S3{*}:
>
> {code:java}
> FROM rocker/r-base:4.1.2
> # TO READ FROM S3
> RUN apt update -qq \
> && apt install -t unstable -y --no-install-recommends \
> libcurl4-openssl-dev
> ENV LIBARROW_MINIMAL false
> RUN apt update && \
> apt install -y -V ca-certificates lsb-release wget && \
> wget "https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb" && \
> apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
> RUN apt update && \
> apt install -y -V -f \
> libarrow-dev \
> libarrow-dataset-dev \
> libarrow-glib-dev \
> libarrow-flight-dev \
> libparquet-dev \
> libparquet-glib-dev
> RUN install2.r --error \
> arrow{code}
> This is the output of sessionInfo() from the container running R:
>
> {code:java}
> sessionInfo()
> R version 4.1.2 (2021-11-01)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Debian GNU/Linux 11 (bullseye)
>
> Matrix products: default
> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.18.so
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8
> [9] LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] arrow_6.0.1 DBI_1.1.1
>
> loaded via a namespace (and not attached):
> [1] tidyselect_1.1.1 bit_4.0.4 compiler_4.1.2 magrittr_2.0.1
> [5] assertthat_0.2.1 R6_2.5.1 tools_4.1.2 glue_1.5.1
> [9] bit64_4.0.5 vctrs_0.3.8 RJDBC_0.2-8 rlang_0.4.12
> [13] rJava_1.0-5 AWR.Athena_2.0.7-0 purrr_0.3.4 {code}
> As far as I understand, all requirements for using datasets are fulfilled:
> R version 4.1.2
> Platform: x86_64-pc-linux-gnu (64-bit)
> arrow_6.0.1
>
> {code:java}
> > .Machine$sizeof.pointer < 8
> [1] FALSE
> > getRversion() < "4.0.0"
> [1] FALSE
> > tolower(Sys.info()[["sysname"]]) == "windows"
> [1] FALSE
> > {code}
> Nevertheless, I get
> Error: This build of the arrow package does not support Datasets
> when calling
> {code:java}
> arrow::open_dataset(sources = path) {code}
> Appreciate any help!