jonkeane commented on a change in pull request #9896:
URL: https://github.com/apache/arrow/pull/9896#discussion_r607351892
##########
File path: r/R/arrow-package.R
##########
@@ -47,12 +47,27 @@
}
# Create these once, at package build time
- dplyr_functions$dataset <- build_function_list(build_dataset_expression)
- dplyr_functions$array <- build_function_list(build_array_expression)
-
+ if (arrow_available()) {
+ dplyr_functions$dataset <- build_function_list(build_dataset_expression)
+ dplyr_functions$array <- build_function_list(build_array_expression)
+ }
invisible()
}
+.onAttach <- function(libname, pkgname) {
+ if (!arrow_available()) {
+ msg <- paste(
+ 'The Arrow C++ library is not available. To retry installation with debug output, run:',
+
+ ' install_arrow(verbose = TRUE)',
+
+ 'See `vignette("install", package = "arrow")` for more guidance and troubleshooting.',
Review comment:
> I'm not sure. It's different from "dependencies" that R users are used
> to where if it's missing, you install it and now things work. And likewise with
> reticulate packages, blogdown, etc., where you have this external thing you
> have to install after you install the R package (install_tensorflow(),
> install_hugo(), etc.), this is different because you have to re-install the R
> package.

Fair enough, `C++ library` is probably sufficient here
> Right. We want to convey the message that the fix for this is to reinstall
> the R package. But really, if they're here, what we probably want is for them
> to reinstall with verbosity and then file a JIRA with that output (should we
> add a report_installation_issue() function to do that? ;). Users shouldn't be
> here normally.

`report_installation_issue()` would be slick, especially if it could turn on
`ARROW_R_DEV` and copy the input/output automatically.
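
To make the idea concrete, here is a rough sketch of what such a helper might look like. `report_installation_issue()` does not exist today, so its signature, the `install` and `log_file` arguments, and the `"true"` value for `ARROW_R_DEV` are all assumptions for illustration. The installer is passed in as a function so the sketch can be tried without reinstalling anything. One caveat: `capture.output()` only sees what is printed inside the R session, so output from a child build process would likely need a different capture mechanism.

```r
# Hypothetical helper, not part of arrow: run an installer with ARROW_R_DEV
# switched on, save whatever it prints, and restore the environment after.
report_installation_issue <- function(
    install = function() arrow::install_arrow(verbose = TRUE),
    log_file = tempfile(fileext = ".log")) {
  # Remember the previous value (NA if unset) so it can be restored on exit
  old <- Sys.getenv("ARROW_R_DEV", unset = NA)
  Sys.setenv(ARROW_R_DEV = "true")  # assumed value that enables dev output
  on.exit(
    if (is.na(old)) Sys.unsetenv("ARROW_R_DEV") else Sys.setenv(ARROW_R_DEV = old),
    add = TRUE
  )

  # Only captures output printed from within this R session
  output <- utils::capture.output(install(), type = "output")
  writeLines(output, log_file)
  invisible(log_file)
}

# Try it with a stand-in installer instead of a real reinstall
fake_install <- function() cat("ARROW_R_DEV is", Sys.getenv("ARROW_R_DEV"), "\n")
log_path <- report_installation_issue(install = fake_install)
```

The resulting log file could then be attached to a Jira ticket by hand; copying it automatically is left out of the sketch.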
> We could, but if you look at arrow_info(), there are lots of features you
> might not have, and I don't know that we want to message about all of them.
>
> What if we checked arrow_info()$capabilities, possibly excluded some we
> don't think are interesting (for example, I don't have lzo, whatever that is,
> and maybe it's not interesting whether you have all of the memory allocators
> built), and if any of the rest are false, emit a message like "Some Arrow
> features are not enabled in this build; see arrow_info() for details and
> ?arrow_info (or link to vignette) for how to install them"?

Yeah, honestly I think the only two that are important are snappy and S3
support. Those are the only two that I've run into or seen issues with on Jira.
What if we only alert on those two and leave the other ones be for now?
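
Something along these lines is what I have in mind. It is only a sketch: it assumes `arrow_info()$capabilities` is a named logical vector with entries called `snappy` and `s3`, the helper names `missing_features()` and `feature_message()` are made up, and the sample vector stands in for the real call so the snippet runs without arrow installed.

```r
# Hypothetical helper: which of the "important" features are disabled?
# `capabilities` is assumed to look like arrow_info()$capabilities,
# i.e. a named logical vector.
missing_features <- function(capabilities, important = c("snappy", "s3")) {
  # Ignore names that this build does not report at all
  present <- intersect(important, names(capabilities))
  present[!capabilities[present]]
}

# Hypothetical helper: build the message text, or NULL if nothing is missing
feature_message <- function(capabilities, important = c("snappy", "s3")) {
  missing <- missing_features(capabilities, important)
  if (length(missing) == 0) {
    return(NULL)
  }
  paste0(
    "Some Arrow features are not enabled in this build (",
    paste(missing, collapse = ", "),
    "); see arrow_info() for details."
  )
}

# Stand-in for arrow_info()$capabilities
caps <- c(snappy = TRUE, s3 = FALSE, lzo = FALSE, jemalloc = TRUE)
msg <- feature_message(caps)
if (!is.null(msg)) message(msg)
```

Keeping the list of important features as an argument means lzo or the allocators stay quiet by default but can be opted into later without changing the check.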