nealrichardson commented on a change in pull request #10326:
URL: https://github.com/apache/arrow/pull/10326#discussion_r639742821



##########
File path: r/R/util.R
##########
@@ -110,3 +110,12 @@ handle_embedded_nul_error <- function(e) {
   }
   stop(e)
 }
+
+handle_parquet_io_error <- function(e, format) {
+  msg <- conditionMessage(e)
+  if (grepl("Parquet magic bytes not found in footer", msg) && is.null(format)) {
+      e$message <- paste0(msg, "\nDid you mean to specify a 'format' other than the default (parquet)?")
+  }
+  stop(e)

Review comment:
       Agree that `abort(..., i = helpful_message)` is a good idea.
   
   It is probably better to add `format` as an arg to `open_dataset()`. We don't 
need it there, since it is just passed through, but it would make it more 
explicit in the docs that it exists, what the default is, and what the 
supported values are (via `match.arg()`). I don't think it's a problem that some 
kinds of inputs to `open_dataset()` wouldn't use the `format` argument: you can 
document that it is ignored unless `sources` is a string/character vector. And 
you wouldn't be making a breaking change or changing the defaults; you'd just 
be duplicating the `match.arg()` handling from `DatasetFactory$create()`. (That 
duplication is the only downside I see, but it's probably worth it on balance.)
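
   A rough sketch of the shape this could take, in base R only. The function 
name `open_dataset_sketch` is hypothetical, and the list of format values is an 
assumption about what `DatasetFactory$create()` accepts, so check it against 
the real signature before copying anything:

```r
# Hypothetical sketch, not the actual arrow implementation.
# The choices below are ASSUMED to mirror DatasetFactory$create().
open_dataset_sketch <- function(sources,
                                format = c("parquet", "arrow", "ipc", "feather",
                                           "csv", "tsv", "text")) {
  if (is.character(sources)) {
    # The first choice ("parquet") is the default; anything outside the
    # listed choices raises an error that names the supported values.
    format <- match.arg(format)
  } else {
    # Documented behavior: `format` is ignored unless `sources` is a
    # string/character vector.
    format <- NULL
  }
  # Stand-in for handing off to the factory; returns what would be passed on.
  list(sources = sources, format = format)
}
```

Because the choices live in the signature, the generated docs show both the 
default and the supported values, and a call with an unsupported `format` 
fails up front in `match.arg()` instead of deeper in the factory.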



