nealrichardson commented on a change in pull request #10326:
URL: https://github.com/apache/arrow/pull/10326#discussion_r638438195



##########
File path: r/R/dataset.R
##########
@@ -93,8 +93,11 @@ open_dataset <- function(sources,
     return(dataset___UnionDataset__create(sources, schema))
   }
   factory <- DatasetFactory$create(sources, partitioning = partitioning, ...)
-  # Default is _not_ to inspect/unify schemas
-  factory$Finish(schema, isTRUE(unify_schemas))
+  tryCatch(
+    # Default is _not_ to inspect/unify schemas
+    factory$Finish(schema, isTRUE(unify_schemas)),
+    error = handle_parquet_io_error

Review comment:
One last thought: since we're in `open_dataset()` here, we know whether the
user has specified a format. "Did you mean to specify a 'format'?" may be a
good message if `format` is not specified, but we should probably give a
different message if the user specified `format = "parquet"` (i.e., there is a
non-parquet file among what they think are all parquet files).

The default message should probably also say that parquet is the default,
since that may not be obvious. Something like "Did you mean to specify a
'format' other than the default (parquet)?"
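
A rough sketch of the branching I have in mind, in base R only. The helper name
`parquet_error_hint()` is hypothetical (it is not in the arrow package), treating
`NULL` as "format not specified" is an assumption about how the caller would
detect that case, and the wording of the second message is only illustrative:

```r
# Hypothetical helper, not part of the arrow package: pick a hint to attach
# to a Parquet read error based on what the caller passed as `format`.
# Assumption: NULL means the user did not specify a format at all.
parquet_error_hint <- function(format = NULL) {
  if (is.null(format)) {
    # No format given, so the default (parquet) was used implicitly.
    "Did you mean to specify a 'format' other than the default (parquet)?"
  } else if (identical(format, "parquet")) {
    # Parquet was requested explicitly; a stray non-parquet file is likely.
    # Illustrative wording only.
    "Is there a non-parquet file among the files you are reading?"
  } else {
    # Any other format: no parquet-specific hint applies.
    NULL
  }
}

hint_default  <- parquet_error_hint()
hint_explicit <- parquet_error_hint("parquet")
hint_other    <- parquet_error_hint("csv")
```

The error handler passed to `tryCatch()` could then append whichever hint comes
back (if any) to the original error message before re-raising it.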



