jonkeane commented on a change in pull request #10326:
URL: https://github.com/apache/arrow/pull/10326#discussion_r634492852



##########
File path: r/R/dataset.R
##########
@@ -93,8 +93,19 @@ open_dataset <- function(sources,
     return(dataset___UnionDataset__create(sources, schema))
   }
   factory <- DatasetFactory$create(sources, partitioning = partitioning, ...)
-  # Default is _not_ to inspect/unify schemas
-  factory$Finish(schema, isTRUE(unify_schemas))
+  
+  tryCatch(
+    # Default is _not_ to inspect/unify schemas
+    factory$Finish(schema, isTRUE(unify_schemas)),
+    error = function (e) {
+      msg <- conditionMessage(e)
+      if(grep("Parquet magic bytes not found in footer", msg)){
+        stop("Looks like these are not parquet files, did you mean to specify a 'format'?", call. = FALSE)
+      } else {
+        stop(e)

Review comment:
       Ah, I was thinking about the `stop(e)` on line 105. I'm not certain that 
passing `e` there will behave the same as it does now for errors that aren't 
this parquet issue, such as providing a non-existent directory. Currently (on the 
master branch) that looks like:
   
   ```
   > ds <- open_dataset("~/does/not/exist") 
   Error: IOError: Cannot list directory '/Users/jkeane/does/not/exist'. Detail: [errno 2] No such file or directory
   ```
   
   But with this change, unless we also add `call. = FALSE` here, it might look 
something like the following (though I'm not certain of that!):
   
   ```
   Error in open_dataset() : Error: IOError: Cannot list directory '/Users/jkeane/does/not/exist'. Detail: [errno 2] No such file or directory
   ```
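
   One way to check this without a real dataset is a small base-R sketch. 
`fake_finish()` and `open_sketch()` are hypothetical stand-ins, not arrow 
functions, and the sketch assumes the underlying error is raised without a call, 
as the master output above suggests. It uses `grepl()` rather than the diff's 
`grep()`, because `grep()` returns `integer(0)` when nothing matches and `if()` 
cannot evaluate that.

   ```r
   # Hypothetical stand-in for factory$Finish(); NOT an arrow function.
   # Assumption: the real error carries no call (raised with call. = FALSE).
   fake_finish <- function(kind) {
     if (identical(kind, "parquet")) {
       stop("Invalid: Parquet magic bytes not found in footer", call. = FALSE)
     }
     stop("IOError: Cannot list directory '/does/not/exist'", call. = FALSE)
   }

   # Hypothetical wrapper mirroring the tryCatch() in the diff.
   open_sketch <- function(kind) {
     tryCatch(
       fake_finish(kind),
       error = function(e) {
         msg <- conditionMessage(e)
         # grepl() always returns TRUE or FALSE, so it is safe inside if().
         if (grepl("Parquet magic bytes not found in footer", msg, fixed = TRUE)) {
           stop(
             "Looks like these are not parquet files, did you mean to specify a 'format'?",
             call. = FALSE
           )
         }
         # Re-signal the original condition object unchanged.
         stop(e)
       }
     )
   }

   # Capture the re-raised error and inspect it instead of eyeballing the output.
   caught <- tryCatch(open_sketch("missing"), error = function(e) e)
   print(conditionMessage(caught))
   print(is.null(conditionCall(caught)))
   ```

   If the re-raised condition still has a `NULL` call, the message should print 
with a plain `Error:` prefix; if arrow's real errors do carry a call, the prefix 
may differ, so this is worth confirming against an actual `open_dataset()` call.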




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

