thisisnic commented on a change in pull request #10326:
URL: https://github.com/apache/arrow/pull/10326#discussion_r634570064
##########
File path: r/R/dataset.R
##########
@@ -93,8 +93,19 @@ open_dataset <- function(sources,
return(dataset___UnionDataset__create(sources, schema))
}
factory <- DatasetFactory$create(sources, partitioning = partitioning, ...)
- # Default is _not_ to inspect/unify schemas
- factory$Finish(schema, isTRUE(unify_schemas))
+
+ tryCatch(
+ # Default is _not_ to inspect/unify schemas
+ factory$Finish(schema, isTRUE(unify_schemas)),
+ error = function (e) {
+ msg <- conditionMessage(e)
+ if(grep("Parquet magic bytes not found in footer", msg)){
+ stop("Looks like these are not parquet files, did you mean to specify
a 'format'?", call. = FALSE)
Review comment:
I don't like the C++ error messages being exposed to the end user, so
I'm with you on that. However, I'm now a bit concerned that this error can be
triggered by several different things, and it's not easy to work out which one
applies, so I don't want the error message to mislead the user. I'll have a
think about whether there's a rephrasing that fixes this.
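One possible direction, as a minimal sketch only: the helper name
`finish_with_hint` and the hint wording are hypothetical and not part of the
arrow package. It matches with `grepl()` rather than `grep()`, since `grep()`
returns `integer(0)` when there is no match and `if ()` then fails. It also
phrases the hint conditionally and re-raises any other error unchanged, so the
message does not claim a cause it cannot confirm.

```r
# Hypothetical helper (an assumption for illustration, not arrow API):
# `finish` is any zero-argument function that may raise an error.
finish_with_hint <- function(finish) {
  tryCatch(
    finish(),
    error = function(e) {
      msg <- conditionMessage(e)
      # grepl() always returns TRUE or FALSE, so it is safe inside if().
      # fixed = TRUE matches the text literally rather than as a regex.
      if (grepl("Parquet magic bytes not found in footer", msg, fixed = TRUE)) {
        # Keep the original message and append a conditional hint.
        stop(
          msg,
          "\nIf these are not Parquet files, did you mean to specify a 'format'?",
          call. = FALSE
        )
      }
      # Any other error is re-raised as-is.
      stop(e)
    }
  )
}

# Usage with a stand-in for factory$Finish(...):
res <- tryCatch(
  finish_with_hint(function() stop("Parquet magic bytes not found in footer")),
  error = function(e) conditionMessage(e)
)
```

This keeps the underlying message so that other causes stay diagnosable. If
hiding the C++ text matters more, `msg` could be dropped from the `stop()`
call at the cost of losing that detail.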
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]