nealrichardson commented on a change in pull request #10326:
URL: https://github.com/apache/arrow/pull/10326#discussion_r639742821
##########
File path: r/R/util.R
##########
@@ -110,3 +110,12 @@ handle_embedded_nul_error <- function(e) {
}
stop(e)
}
+
+handle_parquet_io_error <- function(e, format) {
+ msg <- conditionMessage(e)
+ if (grepl("Parquet magic bytes not found in footer", msg) && is.null(format)) {
+ e$message <- paste0(msg, "\nDid you mean to specify a 'format' other than the default (parquet)?")
+ }
+ stop(e)
Review comment:
Agree that `abort(..., i = helpful_message)` is a good idea.
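
For illustration, a minimal sketch of what that could look like, assuming the rlang package is available and that `abort()` is given a character vector whose `i`-named element holds the hint. The test on `is.null(format)` is carried over from the diff as-is and may need to change:

```r
# Sketch only: assumes the rlang package is installed.
handle_parquet_io_error <- function(e, format) {
  msg <- conditionMessage(e)
  if (grepl("Parquet magic bytes not found in footer", msg) && is.null(format)) {
    # A character vector with an element named "i" is rendered by
    # rlang::abort() as the main message followed by an info bullet.
    rlang::abort(c(
      msg,
      i = "Did you mean to specify a 'format' other than the default (parquet)?"
    ))
  }
  # Any other error is re-raised unchanged.
  stop(e)
}
```
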
It is probably better to add `format` as an arg to `open_dataset()`. We don't
need it there, since it is just passed through, but it would make it more
explicit in the docs that the argument exists, what the default is, and what
the supported values are (via `match.arg()`). I don't think it's a problem that
some kinds of inputs to `open_dataset()` wouldn't use the `format` argument:
you can document that it is ignored unless `sources` is a string/character
vector. And you wouldn't be making a breaking change or changing the defaults;
you'd just be duplicating the `match.arg()` handling from
`DatasetFactory$create()`. (That duplication is the only downside I see, but
it's probably worth it on balance.)
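
A rough sketch of the idea, in base R only. The function name `open_dataset_sketch`, the list of format values, and the returned list are all placeholders for illustration; the real function would pass `format` through to `DatasetFactory$create()`:

```r
# Hypothetical sketch, not the actual arrow implementation.
# The set of formats below is an assumption made for this example.
open_dataset_sketch <- function(sources,
                                format = c("parquet", "arrow", "ipc", "feather",
                                           "csv", "tsv", "text"),
                                ...) {
  if (is.character(sources)) {
    # match.arg() returns the first choice ("parquet") when `format` is not
    # supplied, and raises an error for a value outside the listed choices.
    format <- match.arg(format)
    # Stand-in for DatasetFactory$create(sources, format = format, ...)
    return(list(sources = sources, format = format))
  }
  # For other kinds of input, `format` is ignored (and never evaluated).
  list(sources = sources, format = NULL)
}
```

With this shape, the default and the supported values show up in the function signature and therefore in the generated docs, at the cost of repeating the choices in two places.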