nealrichardson commented on a change in pull request #9896:
URL: https://github.com/apache/arrow/pull/9896#discussion_r607341142
##########
File path: r/R/arrow-package.R
##########
@@ -47,12 +47,27 @@
}
# Create these once, at package build time
- dplyr_functions$dataset <- build_function_list(build_dataset_expression)
- dplyr_functions$array <- build_function_list(build_array_expression)
-
+ if (arrow_available()) {
+ dplyr_functions$dataset <- build_function_list(build_dataset_expression)
+ dplyr_functions$array <- build_function_list(build_array_expression)
+ }
invisible()
}
+.onAttach <- function(libname, pkgname) {
+ if (!arrow_available()) {
+ msg <- paste(
+ 'The Arrow C++ library is not available. To retry installation with debug output, run:',
+
+ ' install_arrow(verbose = TRUE)',
+
+ 'See `vignette("install", package = "arrow")` for more guidance and troubleshooting.',
Review comment:
> * Should we refer to the arrow c++ as a dependency? I think that's a
> word that people with less experience might look for.
I'm not sure. It's different from the "dependencies" that R users are used to,
where if one is missing, you install it and things work. It's also different
from reticulate packages, blogdown, etc., where there is an external thing you
have to install after you install the R package (`install_tensorflow()`,
`install_hugo()`, etc.): here, you have to reinstall the R package itself.
> * Should we mention that in typical installations the arrow c++ library is
> either bundled with the installation (e.g. our binaries/CRAN) or built as part
> of the package installation process. I can't think of a short and pithy way to
> say this, but what I think we _don't_ want is someone to think that this means
> they need to now go and `apt install arrow` or the like (that should work for
> releases as long as the versions match up, but there are lots of ways that
> could be wrong).
Right. We want to convey the message that the fix for this is to reinstall
the R package. But really, if they're here, what we probably want is for them
to reinstall with verbosity and then file a JIRA with that output (should we
add a `report_installation_issue()` function to do that? ;). Users shouldn't be
here normally.
> * Is there any prohibition on offering the url
> https://arrow.apache.org/docs/r/articles/install.html it's the same content as
> the vignette, but I suspect more people are familiar with pkgdown sites than
> they are with vignettes these days.
Sure, that's probably better.
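A rough sketch of how the message text might read with the pkgdown URL swapped in for the vignette reference. `unavailable_msg()` is a hypothetical helper name used only for illustration, not something in the PR, and the wording is copied from the diff above apart from the URL:

```r
# Hypothetical helper (not in the PR): builds the text shown when the
# Arrow C++ library is missing, pointing at the pkgdown article.
unavailable_msg <- function() {
  paste(
    "The Arrow C++ library is not available. To retry installation with debug output, run:",
    "  install_arrow(verbose = TRUE)",
    "See https://arrow.apache.org/docs/r/articles/install.html for more guidance and troubleshooting.",
    sep = "\n"
  )
}

# In .onAttach() this would presumably be emitted with
# packageStartupMessage(unavailable_msg()); here it is just printed.
cat(unavailable_msg(), "\n")
```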
> This is separate from this message, but I wonder if we should have another
> message onload (ideally onload, only the first time like many of the tidyverse
> packages do) that tests for `codec_is_available("snappy")` (or all of the ones
> not in the minimal build) that gently nudges that in the case of linux you'll
> want to set libarrow_binary=true or libarrow_minimal=false to get a more fully
> featured build? I know that a few times when I was managing rstudio server on
> linux I installed arrow, thought everything was fine and then later someone had
> to bug me that snappy wasn't enabled anymore so I had to go back and install
> it. It's not a huge deal but getting it out of the way when I'm in the
> installing mindset would have been nice compared to finding out later and
> having to go back to it.
We could, but if you look at `arrow_info()`, there are lots of features you
might not have, and I don't know that we want to message about all of them.
What if we checked `arrow_info()$capabilities`, possibly excluded some we
don't think are interesting (for example, I don't have `lzo`, whatever that is,
and maybe it's not interesting whether you have all of the memory allocators
built), and if any of the rest are false, emit a message like "Some Arrow
features are not enabled in this build; see arrow_info() for details and
?arrow_info (or link to vignette) for how to install them"?
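The check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `missing_features_msg()` and its `ignore` argument are hypothetical names, and `capabilities` is assumed to be a named logical vector (the shape `arrow_info()$capabilities` is assumed to have), so the sketch runs without the arrow package installed:

```r
# Hypothetical helper (not part of arrow). `capabilities` is assumed to be
# a named logical vector; `ignore` lists the entries we choose not to
# message about (for example "lzo").
missing_features_msg <- function(capabilities, ignore = c("lzo")) {
  # Drop the capabilities we have decided are not interesting
  relevant <- capabilities[!names(capabilities) %in% ignore]
  # Nothing to say if every remaining capability is enabled
  if (isTRUE(all(relevant))) {
    return(NULL)
  }
  paste(
    "Some Arrow features are not enabled in this build;",
    "see arrow_info() for details."
  )
}

# Made-up capability values, for illustration only
caps <- c(snappy = FALSE, gzip = TRUE, lzo = FALSE)
missing_features_msg(caps)

# With the real package, the call would presumably look like:
# msg <- missing_features_msg(arrow_info()$capabilities)
# if (!is.null(msg)) packageStartupMessage(msg)
```

Returning `NULL` when nothing relevant is missing keeps the decision about how (or whether) to emit the message with the caller.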