nealrichardson commented on a change in pull request #11001:
URL: https://github.com/apache/arrow/pull/11001#discussion_r697443624
##########
File path: r/R/util.R
##########
@@ -183,3 +183,63 @@ repeat_value_as_array <- function(object, n) {
}
return(Scalar$create(object)$as_array(n))
}
+
+
+#' Download all optional Arrow dependencies
+#'
+#' @param deps_dir Directory to save files into. Will be created if necessary.
+#'
+#' @return TRUE/FALSE for whether the downloads were successful
+#'
+#' This function is used for setting up an offline build. If it's possible to
+#' download at build time, don't use this function. Instead, let `cmake`
+#' download them for you.
+#' If the files already exist in `deps_dir`, they will be re-downloaded and
+#' overwritten. Other files are not changed.
+#' These saved files are only used in the build if `ARROW_DEPENDENCY_SOURCE`
+#' is `BUNDLED` or `AUTO`.
+#' https://arrow.apache.org/docs/developers/cpp/building.html#offline-builds
+#'
+#' Steps for an offline install with optional dependencies:
+#' - Install the `arrow` package on a computer with internet access
Review comment:
I'd put this function in `install-arrow.R` and then recommend that you
can just source that file (like we note for `install_arrow()`), no installation
required.
##########
File path: r/R/util.R
##########
@@ -183,3 +183,63 @@ repeat_value_as_array <- function(object, n) {
}
return(Scalar$create(object)$as_array(n))
}
+
+
+#' Download all optional Arrow dependencies
+#'
+#' @param deps_dir Directory to save files into. Will be created if necessary.
+#'
+#' @return TRUE/FALSE for whether the downloads were successful
+#'
+#' This function is used for setting up an offline build. If it's possible to
+#' download at build time, don't use this function. Instead, let `cmake`
+#' download them for you.
+#' If the files already exist in `deps_dir`, they will be re-downloaded and
+#' overwritten. Other files are not changed.
+#' These saved files are only used in the build if `ARROW_DEPENDENCY_SOURCE`
+#' is `BUNDLED` or `AUTO`.
+#' https://arrow.apache.org/docs/developers/cpp/building.html#offline-builds
+#'
+#' Steps for an offline install with optional dependencies:
+#' - Install the `arrow` package on a computer with internet access
+#' - Run this function
+#' - Copy the saved dependency files to a computer without internet access
+#' - Create a environment variable called `ARROW_THIRDPARTY_DEPENDENCY_DIR` that
+#' points to the folder.
+#' - Install the `arrow` package on the computer without internet access
+#' - Run [arrow_info()] to check installed capabilities
+#'
+#' @examples
+#' \dontrun{
+#' download_optional_dependencies("arrow-thirdparty")
+#' list.files("arrow-thirdparty", "thrift-*") # "thrift-0.13.0.tar.gz" or similar
+#' }
+#' @export
+download_optional_dependencies <- function(deps_dir) {
+ # This script is copied over from arrow/cpp/... to arrow/r/tools/cpp/...
Review comment:
```suggestion
# This script is copied over from arrow/cpp/... to arrow/r/inst/...
```
##########
File path: r/R/util.R
##########
@@ -183,3 +183,63 @@ repeat_value_as_array <- function(object, n) {
}
return(Scalar$create(object)$as_array(n))
}
+
+
+#' Download all optional Arrow dependencies
+#'
+#' @param deps_dir Directory to save files into. Will be created if necessary.
+#'
+#' @return TRUE/FALSE for whether the downloads were successful
+#'
+#' This function is used for setting up an offline build. If it's possible to
+#' download at build time, don't use this function. Instead, let `cmake`
+#' download them for you.
+#' If the files already exist in `deps_dir`, they will be re-downloaded and
+#' overwritten. Other files are not changed.
+#' These saved files are only used in the build if `ARROW_DEPENDENCY_SOURCE`
+#' is `BUNDLED` or `AUTO`.
+#' https://arrow.apache.org/docs/developers/cpp/building.html#offline-builds
+#'
+#' Steps for an offline install with optional dependencies:
+#' - Install the `arrow` package on a computer with internet access
+#' - Run this function
+#' - Copy the saved dependency files to a computer without internet access
+#' - Create a environment variable called `ARROW_THIRDPARTY_DEPENDENCY_DIR` that
+#' points to the folder.
+#' - Install the `arrow` package on the computer without internet access
+#' - Run [arrow_info()] to check installed capabilities
+#'
+#' @examples
+#' \dontrun{
+#' download_optional_dependencies("arrow-thirdparty")
+#' list.files("arrow-thirdparty", "thrift-*") # "thrift-0.13.0.tar.gz" or similar
+#' }
+#' @export
+download_optional_dependencies <- function(deps_dir) {
+ # This script is copied over from arrow/cpp/... to arrow/r/tools/cpp/...
+ download_dependencies_sh <- system.file(
+ "thirdparty/download_dependencies.sh",
+ package = "arrow",
+ mustWork = TRUE
+ )
+ # Make sure the directory is sort of reasonable before creating it
+ deps_dir <- trimws(deps_dir)
+ stopifnot(nchar(deps_dir) >= 1)
Review comment:
I don't think you need this: `dir.create()` seems to validate enough:
```
> dir.create(4)
Error in dir.create(4) : invalid 'path' argument
> dir.create(NULL)
Error in dir.create(NULL) : invalid 'path' argument
> dir.create(c("a", "b"))
Error in dir.create(c("a", "b")) : invalid 'path' argument
```
```suggestion
```
##########
File path: r/tools/nixlibs.R
##########
@@ -329,24 +290,34 @@ build_libarrow <- function(src_dir, dst_dir) {
env_vars <- paste0(names(env_var_list), '="', env_var_list, '"', collapse = " ")
env_vars <- with_s3_support(env_vars)
env_vars <- with_mimalloc(env_vars)
- if (tolower(Sys.info()[["sysname"]]) %in% "sunos") {
- # jemalloc doesn't seem to build on Solaris
- # nor does thrift, so turn off parquet,
- # and arrowExports.cpp requires parquet for dataset (ARROW-11994), so turn that off
- # xsimd doesn't compile, so set SIMD level to NONE to skip it
- # re2 and utf8proc do compile,
- # but `ar` fails to build libarrow_bundled_dependencies, so turn them off
- # so that there are no bundled deps
- env_vars <- paste(env_vars, "ARROW_JEMALLOC=OFF ARROW_PARQUET=OFF ARROW_DATASET=OFF ARROW_WITH_RE2=OFF ARROW_WITH_UTF8PROC=OFF EXTRA_CMAKE_FLAGS=-DARROW_SIMD_LEVEL=NONE")
+ # turn_off_thirdparty_features() needs to happen after with_mimalloc() and
+ # with_s3_support(), since those might turn features ON.
+ thirdparty_deps_unavailable <- !download_ok &&
+ !dir.exists(Sys.getenv("ARROW_THIRDPARTY_DEPENDENCY_DIR")) &&
+ !env_is("ARROW_DEPENDENCY_SOURCE", "system")
+ if (thirdparty_deps_unavailable || is_solaris()) {
+ # Note that JSON support does work on Solaris, but will be turned off with
+ # the rest of the thirdparty dependencies (when ARROW-13768 is resolved and
+ # JSON can be turned off at all). All other dependencies don't compile
+ # (e.g thrift, jemalloc, and xsimd) or do compile but `ar` fails to build
+ # libarrow_bundled_dependencies (e.g. re2 and utf8proc).
+ env_vars <- turn_off_thirdparty_features(env_vars)
Review comment:
How about adding a message pointing the user to how to handle thirdparty
deps if offline?
```suggestion
if (is_solaris()) {
# Note that JSON support does work on Solaris, but will be turned off with
# the rest of the thirdparty dependencies (when ARROW-13768 is resolved and
# JSON can be turned off at all). All other dependencies don't compile
# (e.g thrift, jemalloc, and xsimd) or do compile but `ar` fails to build
# libarrow_bundled_dependencies (e.g. re2 and utf8proc).
env_vars <- turn_off_thirdparty_features(env_vars)
} else if (thirdparty_deps_unavailable) {
cat("*** Something something we're offline so building without many deps/features; see vignette\n")
env_vars <- turn_off_thirdparty_features(env_vars)
}
```
##########
File path: r/tools/nixlibs.R
##########
@@ -329,24 +290,34 @@ build_libarrow <- function(src_dir, dst_dir) {
env_vars <- paste0(names(env_var_list), '="', env_var_list, '"', collapse = " ")
env_vars <- with_s3_support(env_vars)
env_vars <- with_mimalloc(env_vars)
- if (tolower(Sys.info()[["sysname"]]) %in% "sunos") {
- # jemalloc doesn't seem to build on Solaris
- # nor does thrift, so turn off parquet,
- # and arrowExports.cpp requires parquet for dataset (ARROW-11994), so turn that off
- # xsimd doesn't compile, so set SIMD level to NONE to skip it
- # re2 and utf8proc do compile,
- # but `ar` fails to build libarrow_bundled_dependencies, so turn them off
- # so that there are no bundled deps
- env_vars <- paste(env_vars, "ARROW_JEMALLOC=OFF ARROW_PARQUET=OFF ARROW_DATASET=OFF ARROW_WITH_RE2=OFF ARROW_WITH_UTF8PROC=OFF EXTRA_CMAKE_FLAGS=-DARROW_SIMD_LEVEL=NONE")
+ # turn_off_thirdparty_features() needs to happen after with_mimalloc() and
+ # with_s3_support(), since those might turn features ON.
+ thirdparty_deps_unavailable <- !download_ok &&
+ !dir.exists(Sys.getenv("ARROW_THIRDPARTY_DEPENDENCY_DIR")) &&
+ !env_is("ARROW_DEPENDENCY_SOURCE", "system")
+ if (thirdparty_deps_unavailable || is_solaris()) {
+ # Note that JSON support does work on Solaris, but will be turned off with
+ # the rest of the thirdparty dependencies (when ARROW-13768 is resolved and
+ # JSON can be turned off at all). All other dependencies don't compile
+ # (e.g thrift, jemalloc, and xsimd) or do compile but `ar` fails to build
+ # libarrow_bundled_dependencies (e.g. re2 and utf8proc).
+ env_vars <- turn_off_thirdparty_features(env_vars)
}
+ # If $ARROW_THIRDPARTY_DEPENDENCY_DIR has files, add their *_SOURCE_URL env vars
+ env_vars <- set_thirdparty_urls(env_vars)
+
cat("**** arrow", ifelse(quietly, "", paste("with", env_vars)), "\n")
status <- suppressWarnings(system(
paste(env_vars, "inst/build_arrow_static.sh"),
ignore.stdout = quietly, ignore.stderr = quietly
))
if (status != 0) {
# It failed :(
- cat("**** Error building Arrow C++. Re-run with ARROW_R_DEV=true for debug information.\n")
+ cat(
Review comment:
Good call 👍
##########
File path: r/tools/nixlibs.R
##########
@@ -413,10 +392,114 @@ cmake_version <- function(cmd = "cmake") {
)
}
+turn_off_thirdparty_features <- function(env_vars) {
+ # Because these are done as environment variables (as opposed to build flags),
+ # setting these to "OFF" overrides any previous setting. We don't need to
+ # check the existing value.
+ turn_off <- c(
+ "ARROW_MIMALLOC=OFF",
+ "ARROW_JEMALLOC=OFF",
+ "ARROW_PARQUET=OFF", # depends on thrift
+ "ARROW_DATASET=OFF", # depends on parquet
+ "ARROW_S3=OFF",
+ "ARROW_WITH_BROTLI=OFF",
+ "ARROW_WITH_BZ2=OFF",
+ "ARROW_WITH_LZ4=OFF",
+ "ARROW_WITH_SNAPPY=OFF",
+ "ARROW_WITH_ZLIB=OFF",
+ "ARROW_WITH_ZSTD=OFF",
+ "ARROW_WITH_RE2=OFF",
+ "ARROW_WITH_UTF8PROC=OFF",
+ # NOTE: this code sets the environment variable ARROW_JSON to "OFF", but
+ # that setting is will *not* be honored by build_arrow_static.sh until
+ # ARROW-13768 is resolved.
+ "ARROW_JSON=OFF",
+ # The syntax to turn off XSIMD is different.
+ 'EXTRA_CMAKE_FLAGS="-DARROW_SIMD_LEVEL=NONE"'
+ )
+ if (Sys.getenv("EXTRA_CMAKE_FLAGS") != "") {
+ # Error rather than overwriting EXTRA_CMAKE_FLAGS
+ # (Correctly inserting the flag into an existing quoted string is tricky)
+ stop("Sorry, setting EXTRA_CMAKE_FLAGS is not supported at this time.")
+ }
+ paste(env_vars, paste(turn_off, collapse = " "))
+}
+
+set_thirdparty_urls <- function(env_vars) {
+ deps_dir <- Sys.getenv("ARROW_THIRDPARTY_DEPENDENCY_DIR")
+ files <- list.files(deps_dir, full.names = FALSE)
+ if (length(files) == 0) {
+ # This will be true if the variable is unset, if it's set but the directory
+ # doesn't exist, or if it exists but is empty.
+ return(env_vars)
+ }
+ dep_names <- c(
+ "absl", # not used; seems to be a dependency of gRPC
+ "aws-sdk-cpp",
+ "aws-checksums",
+ "aws-c-common",
+ "aws-c-event-stream",
+ "boost",
+ "brotli",
+ "bzip2",
+ "cares", # not used; "a dependency of gRPC"
+ "gbenchmark", # not used; "Google benchmark, for testing"
+ "gflags", # not used; "for command line utilities (formerly Googleflags)"
+ "glog", # not used; "for logging"
+ "grpc", # not used; "for remote procedure calls"
+ "gtest", # not used; "Googletest, for testing"
+ "jemalloc",
+ "lz4",
+ "mimalloc",
+ "orc", # not used; "for Apache ORC format support"
+ "protobuf", # not used; "Google Protocol Buffers, for data serialization"
+ "rapidjson",
+ "re2",
+ "snappy",
+ "thrift",
+ "utf8proc",
+ "xsimd",
+ "zlib",
+ "zstd"
+ )
+ dep_regex <- paste0("^(", paste(dep_names, collapse = "|"), ").*")
+ # If there were extra files in the folder (not matching our regex) drop them.
+ files <- files[grepl(dep_regex, files, perl = TRUE)]
+ # Convert e.g. "thrift-0.13.0.tar.gz" to ARROW_THRIFT_URL
+ # Note that if there's no file called thrift*, we won't add
+ # ARROW_THRIFT_URL to env_vars.
+ url_env_varname <- sub(dep_regex, "ARROW_\\1_URL", files, perl = TRUE)
+ url_env_varname <- toupper(gsub("-", "_", url_env_varname, fixed = TRUE))
+ # Special case: ARROW_AWSSDK_URL for aws-sdk-cpp-<version>.tar.gz
+ url_env_varname <- sub("ARROW_AWS_SDK_CPP_URL", "ARROW_AWSSDK_URL", url_env_varname, fixed = TRUE)
+ if (anyDuplicated(url_env_varname)) {
+ warning("Unexpected files in ", deps_dir,
+ "\nDo you have multiple copies of a dependency?",
+ .call = FALSE
+ )
+ return(env_vars)
+ }
Review comment:
This is a gnarly system of regexes but it works and doesn't require
hard-coding the list of dependencies, which I think would lead to issues in the
future when versions.txt changes.
```suggestion
url_env_varname <- toupper(sub("(.*?)-.*", "ARROW_\\1_URL", files))
# Special handling for the aws dependencies
aws <- grepl("^aws", files)
url_env_varname[aws] <- sub("AWS_SDK_CPP", "AWSSDK",
gsub("-", "_",
sub("(AWS.*)-.*", "ARROW_\\1_URL",
toupper(files[aws])
)
)
)
```
I tested this by doing
```
source versions.txt && echo $DEPENDENCIES > tmp.txt
```
(newlines and spaces were mangled if I tried to use `system()` from R)
and then in R:
```
versions <- matrix(unlist(strsplit(readLines("tmp.txt"), " ")), ncol=3,
byrow=TRUE)
files <- versions[,2]
urls <- versions[,1]
...
identical(url_env_varname, urls)
```
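The same check can be tried on a handful of file names without sourcing
`versions.txt`. This is only a sketch of the suggested regexes: apart from
`thrift-0.13.0.tar.gz`, the file names and versions below are made up for
illustration and are not taken from the PR.

```r
# Illustrative file names; only the thrift one appears in the PR itself.
files <- c(
  "thrift-0.13.0.tar.gz",
  "zstd-v1.5.0.tar.gz",
  "aws-sdk-cpp-1.8.133.tar.gz",
  "aws-c-common-v0.5.10.tar.gz"
)

# General case: whatever precedes the first hyphen names the dependency.
url_env_varname <- toupper(sub("(.*?)-.*", "ARROW_\\1_URL", files))

# AWS case: these names contain hyphens themselves, so keep everything
# before the last hyphen, then convert hyphens to underscores.
aws <- grepl("^aws", files)
url_env_varname[aws] <- sub(
  "AWS_SDK_CPP", "AWSSDK",
  gsub("-", "_", sub("(AWS.*)-.*", "ARROW_\\1_URL", toupper(files[aws])))
)

print(url_env_varname)
```

The AWS branch relies on the greedy `(AWS.*)` capturing up to the last hyphen,
so it assumes the version part of the file name contains no hyphen.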
##########
File path: r/tools/nixlibs.R
##########
@@ -329,24 +290,34 @@ build_libarrow <- function(src_dir, dst_dir) {
env_vars <- paste0(names(env_var_list), '="', env_var_list, '"', collapse = " ")
env_vars <- with_s3_support(env_vars)
env_vars <- with_mimalloc(env_vars)
- if (tolower(Sys.info()[["sysname"]]) %in% "sunos") {
- # jemalloc doesn't seem to build on Solaris
- # nor does thrift, so turn off parquet,
- # and arrowExports.cpp requires parquet for dataset (ARROW-11994), so turn that off
- # xsimd doesn't compile, so set SIMD level to NONE to skip it
- # re2 and utf8proc do compile,
- # but `ar` fails to build libarrow_bundled_dependencies, so turn them off
- # so that there are no bundled deps
- env_vars <- paste(env_vars, "ARROW_JEMALLOC=OFF ARROW_PARQUET=OFF ARROW_DATASET=OFF ARROW_WITH_RE2=OFF ARROW_WITH_UTF8PROC=OFF EXTRA_CMAKE_FLAGS=-DARROW_SIMD_LEVEL=NONE")
+ # turn_off_thirdparty_features() needs to happen after with_mimalloc() and
+ # with_s3_support(), since those might turn features ON.
+ thirdparty_deps_unavailable <- !download_ok &&
+ !dir.exists(Sys.getenv("ARROW_THIRDPARTY_DEPENDENCY_DIR")) &&
+ !env_is("ARROW_DEPENDENCY_SOURCE", "system")
+ if (thirdparty_deps_unavailable || is_solaris()) {
+ # Note that JSON support does work on Solaris, but will be turned off with
+ # the rest of the thirdparty dependencies (when ARROW-13768 is resolved and
+ # JSON can be turned off at all). All other dependencies don't compile
+ # (e.g thrift, jemalloc, and xsimd) or do compile but `ar` fails to build
+ # libarrow_bundled_dependencies (e.g. re2 and utf8proc).
+ env_vars <- turn_off_thirdparty_features(env_vars)
}
+ # If $ARROW_THIRDPARTY_DEPENDENCY_DIR has files, add their *_SOURCE_URL env vars
+ env_vars <- set_thirdparty_urls(env_vars)
Review comment:
Maybe put this inside an `else` block, just for readability
##########
File path: r/R/util.R
##########
@@ -183,3 +183,63 @@ repeat_value_as_array <- function(object, n) {
}
return(Scalar$create(object)$as_array(n))
}
+
+
+#' Download all optional Arrow dependencies
+#'
+#' @param deps_dir Directory to save files into. Will be created if necessary.
+#'
+#' @return TRUE/FALSE for whether the downloads were successful
+#'
+#' This function is used for setting up an offline build. If it's possible to
+#' download at build time, don't use this function. Instead, let `cmake`
+#' download them for you.
+#' If the files already exist in `deps_dir`, they will be re-downloaded and
+#' overwritten. Other files are not changed.
+#' These saved files are only used in the build if `ARROW_DEPENDENCY_SOURCE`
+#' is `BUNDLED` or `AUTO`.
+#' https://arrow.apache.org/docs/developers/cpp/building.html#offline-builds
+#'
+#' Steps for an offline install with optional dependencies:
+#' - Install the `arrow` package on a computer with internet access
+#' - Run this function
+#' - Copy the saved dependency files to a computer without internet access
+#' - Create a environment variable called `ARROW_THIRDPARTY_DEPENDENCY_DIR` that
+#' points to the folder.
+#' - Install the `arrow` package on the computer without internet access
+#' - Run [arrow_info()] to check installed capabilities
+#'
+#' @examples
+#' \dontrun{
+#' download_optional_dependencies("arrow-thirdparty")
+#' list.files("arrow-thirdparty", "thrift-*") # "thrift-0.13.0.tar.gz" or similar
+#' }
+#' @export
+download_optional_dependencies <- function(deps_dir) {
+ # This script is copied over from arrow/cpp/... to arrow/r/tools/cpp/...
+ download_dependencies_sh <- system.file(
+ "thirdparty/download_dependencies.sh",
+ package = "arrow",
+ mustWork = TRUE
+ )
+ # Make sure the directory is sort of reasonable before creating it
+ deps_dir <- trimws(deps_dir)
+ stopifnot(nchar(deps_dir) >= 1)
+ dir.create(deps_dir, showWarnings = FALSE, recursive = TRUE)
+
+ # Run download_dependencies.sh
+ cat(paste0("*** Downloading optional dependencies to ", deps_dir, "\n"))
+ return_status <- system2(download_dependencies_sh,
+ args = deps_dir, stdout = FALSE, stderr = FALSE
+ )
+ download_successful <- isTRUE(return_status == 0)
+ if (download_successful) {
+ cat(paste0(
+ "**** Set environment variable on offline machine and re-build arrow:\n",
+ "export ARROW_THIRDPARTY_DEPENDENCY_DIR=<downloaded directory>\n"
+ ))
+ } else {
+ warning("Failed to download optional dependencies")
Review comment:
```suggestion
stop("Failed to download optional dependencies", call. = FALSE)
```
##########
File path: r/tests/testthat/test-install-arrow.R
##########
@@ -37,3 +37,20 @@ r_only({
})
})
})
+
+
+r_only({
+ test_that("download_optional_dependencies", {
+ skip_if_offline()
+ deps_dir <- tempfile()
+ download_successful <- expect_output(
+ download_optional_dependencies(deps_dir),
+ "export ARROW_THRIFT_URL"
Review comment:
I think this test is outdated. And actually I think we should delete it:
we'll test it in the offline build CI job. I don't want to download all of
these files every time I run the test suite.
##########
File path: r/vignettes/install.Rmd
##########
@@ -303,10 +307,12 @@ By default, these are all unset. All boolean variables are case-insensitive.
won't look for Arrow libraries on your system and instead will look to
download/build them.
Use this if you have a version mismatch between installed system libraries
and the version of the R package you're installing.
-* `LIBARROW_DOWNLOAD`: Unless set to `false`, the build script
- will attempt to download C++ binary or source bundles.
+* `TEST_OFFLINE_BUILD`: Unless set to `true`, the build script
Review comment:
🤷 perhaps so; I don't expect anyone to use it other than us in testing
##########
File path: r/tools/nixlibs.R
##########
@@ -329,24 +290,34 @@ build_libarrow <- function(src_dir, dst_dir) {
env_vars <- paste0(names(env_var_list), '="', env_var_list, '"', collapse = " ")
env_vars <- with_s3_support(env_vars)
env_vars <- with_mimalloc(env_vars)
- if (tolower(Sys.info()[["sysname"]]) %in% "sunos") {
- # jemalloc doesn't seem to build on Solaris
- # nor does thrift, so turn off parquet,
- # and arrowExports.cpp requires parquet for dataset (ARROW-11994), so turn that off
- # xsimd doesn't compile, so set SIMD level to NONE to skip it
- # re2 and utf8proc do compile,
- # but `ar` fails to build libarrow_bundled_dependencies, so turn them off
- # so that there are no bundled deps
- env_vars <- paste(env_vars, "ARROW_JEMALLOC=OFF ARROW_PARQUET=OFF ARROW_DATASET=OFF ARROW_WITH_RE2=OFF ARROW_WITH_UTF8PROC=OFF EXTRA_CMAKE_FLAGS=-DARROW_SIMD_LEVEL=NONE")
+ # turn_off_thirdparty_features() needs to happen after with_mimalloc() and
+ # with_s3_support(), since those might turn features ON.
Review comment:
We could have those check `download_ok` too. It's also worth considering
whether it would be easier to work with `env_var_list` throughout here and
only paste it into `env_vars` when calling `system()`. That's scope creep, but
I'm pointing it out since you're fighting against the string here, and it
might be more natural to carry around a list that you can update rather than
opaque strings.
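A minimal sketch of that idea, assuming `env_var_list` is a named character
vector (it may be a list in the real script). `turn_off_features()` and
`format_env_vars()` are hypothetical names, not functions in `nixlibs.R`; the
`paste0()` call is the one already used in `build_libarrow()`.

```r
# Starting values are invented for illustration.
env_var_list <- c(ARROW_S3 = "ON", ARROW_MIMALLOC = "ON", ARROW_JEMALLOC = "ON")

# Hypothetical helper: update settings by name instead of appending to a string.
# Assigning to a name that is not present yet simply adds a new entry.
turn_off_features <- function(env_var_list, features) {
  env_var_list[features] <- "OFF"
  env_var_list
}

# Hypothetical helper: build the opaque string only when it is needed,
# using the same format as the existing paste0() call.
format_env_vars <- function(env_var_list) {
  paste0(names(env_var_list), '="', env_var_list, '"', collapse = " ")
}

env_var_list <- turn_off_features(env_var_list, c("ARROW_S3", "ARROW_PARQUET"))
env_vars <- format_env_vars(env_var_list)
cat(env_vars, "\n")
```

With this shape there is no ordering concern between the functions that turn
features on and the one that turns them off: the last assignment to a name wins.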
##########
File path: r/tools/nixlibs.R
##########
@@ -501,12 +572,10 @@ if (!file.exists(paste0(dst_dir, "/include/arrow/api.h"))) {
unlink(bin_file)
} else if (build_ok) {
# (2) Find source and build it
- if (download_ok) {
+ src_dir <- find_local_source()
+ if (is.null(src_dir) && download_ok) {
src_dir <- download_source()
}
Review comment:
```suggestion
```
##########
File path: r/tools/nixlibs.R
##########
@@ -413,10 +392,114 @@ cmake_version <- function(cmd = "cmake") {
)
}
+turn_off_thirdparty_features <- function(env_vars) {
+ # Because these are done as environment variables (as opposed to build
flags),
+ # setting these to "OFF" overrides any previous setting. We don't need to
+ # check the existing value.
+ turn_off <- c(
+ "ARROW_MIMALLOC=OFF",
+ "ARROW_JEMALLOC=OFF",
+ "ARROW_PARQUET=OFF", # depends on thrift
+ "ARROW_DATASET=OFF", # depends on parquet
+ "ARROW_S3=OFF",
+ "ARROW_WITH_BROTLI=OFF",
+ "ARROW_WITH_BZ2=OFF",
+ "ARROW_WITH_LZ4=OFF",
+ "ARROW_WITH_SNAPPY=OFF",
+ "ARROW_WITH_ZLIB=OFF",
+ "ARROW_WITH_ZSTD=OFF",
+ "ARROW_WITH_RE2=OFF",
+ "ARROW_WITH_UTF8PROC=OFF",
+ # NOTE: this code sets the environment variable ARROW_JSON to "OFF", but
+ # that setting is will *not* be honored by build_arrow_static.sh until
+ # ARROW-13768 is resolved.
+ "ARROW_JSON=OFF",
+ # The syntax to turn off XSIMD is different.
+ 'EXTRA_CMAKE_FLAGS="-DARROW_SIMD_LEVEL=NONE"'
+ )
+ if (Sys.getenv("EXTRA_CMAKE_FLAGS") != "") {
+ # Error rather than overwriting EXTRA_CMAKE_FLAGS
+ # (Correctly inserting the flag into an existing quoted string is tricky)
+ stop("Sorry, setting EXTRA_CMAKE_FLAGS is not supported at this time.")
+ }
+ paste(env_vars, paste(turn_off, collapse = " "))
+}
+
+set_thirdparty_urls <- function(env_vars) {
+ deps_dir <- Sys.getenv("ARROW_THIRDPARTY_DEPENDENCY_DIR")
+ files <- list.files(deps_dir, full.names = FALSE)
+ if (length(files) == 0) {
+ # This will be true if the variable is unset, if it's set but the directory
Review comment:
Should we error explicitly if the variable is set but the dir doesn't
exist or is empty?
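Something along these lines is what I have in mind. It is only a sketch:
`check_thirdparty_dir()` is a hypothetical helper, not something in the PR, and
the messages are placeholders.

```r
# Hypothetical helper: fail loudly when ARROW_THIRDPARTY_DEPENDENCY_DIR is set
# but unusable, and stay quiet when it is not set at all.
check_thirdparty_dir <- function(deps_dir = Sys.getenv("ARROW_THIRDPARTY_DEPENDENCY_DIR")) {
  if (!nzchar(deps_dir)) {
    # Variable unset (or empty): nothing to validate
    return(invisible(FALSE))
  }
  if (!dir.exists(deps_dir)) {
    stop(
      "ARROW_THIRDPARTY_DEPENDENCY_DIR is set to '", deps_dir,
      "' but that directory does not exist",
      call. = FALSE
    )
  }
  if (length(list.files(deps_dir)) == 0) {
    stop(
      "ARROW_THIRDPARTY_DEPENDENCY_DIR is set to '", deps_dir,
      "' but that directory is empty",
      call. = FALSE
    )
  }
  invisible(TRUE)
}
```

`set_thirdparty_urls()` could call this first and keep its early return for the
unset case only.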