nealrichardson commented on code in PR #13336:
URL: https://github.com/apache/arrow/pull/13336#discussion_r893711427


##########
r/R/dplyr.R:
##########
@@ -24,6 +24,21 @@ arrow_dplyr_query <- function(.data) {
   # RecordBatch, or Dataset) and the state of the user's dplyr query--things
   # like selected columns, filters, and group vars.
   # An arrow_dplyr_query can contain another arrow_dplyr_query in .data
+
+  supported <- c(
+    "Dataset", "RecordBatch", "RecordBatchReader",
+    "Table", "arrow_dplyr_query", "data.frame"
+  )
+  if (!inherits(.data, supported)) {
+    stop(
+      "'dataset' must be a ",

Review Comment:
   `substitute(.data, parent.frame())` is probably good enough since, as you 
observe, `arrow_dplyr_query()` is only ever called from another function, so 
we'd want to get its argument name from where it was called. 
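
   A minimal sketch of that idea, not the arrow code itself: `build_query()` 
and `my_verb()` are hypothetical stand-ins for `arrow_dplyr_query()` and one of 
its callers, and a plain `data.frame` check replaces the real class list so the 
example runs without arrow.

```r
# Hypothetical stand-in for arrow_dplyr_query(); not the arrow implementation.
build_query <- function(.data) {
  # Look up the symbol `.data` in the caller's frame. There it is a promise,
  # so substitute() returns the expression the user originally supplied.
  input_expr <- substitute(.data, parent.frame())
  if (!inherits(.data, "data.frame")) {
    stop("'", deparse(input_expr), "' must be a data.frame", call. = FALSE)
  }
  .data
}

# Hypothetical caller. Like the dplyr verbs, it names its own argument `.data`.
my_verb <- function(.data) build_query(.data)

# Capture the error message instead of aborting the script.
msg <- tryCatch(my_verb(letters), error = conditionMessage)
print(msg)
```

   This relies on the caller also naming its argument `.data`. If the calling 
function uses a different argument name, `.data` is not bound in its frame and 
`substitute()` returns the symbol unchanged.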
   
   An alternative, if you weren't comfortable with that, would be to not name 
the input in the error message. Say something like "A query can only be built 
with a Dataset, RecordBatch, ...".
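
   A rough sketch of that alternative, assuming a hypothetical helper 
`check_query_input()`; only the class list is taken from the diff above, and 
the exact message wording is illustrative.

```r
# Hypothetical helper; the real check would live inside arrow_dplyr_query().
check_query_input <- function(.data) {
  supported <- c(
    "Dataset", "RecordBatch", "RecordBatchReader",
    "Table", "arrow_dplyr_query", "data.frame"
  )
  # inherits() returns TRUE if .data has any of the listed classes.
  if (!inherits(.data, supported)) {
    stop(
      "A query can only be built with a ",
      paste(supported, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(.data)
}

# A data.frame passes; an integer vector triggers the generic message.
ok <- check_query_input(data.frame(x = 1))
msg <- tryCatch(check_query_input(1:3), error = conditionMessage)
print(msg)
```

   Because the message never names the argument, it reads the same no matter 
which function called the check.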
   
   The validation should be in `arrow_dplyr_query()`, not `write_dataset()`, 
because the input type restriction is about what we can do with a query, not 
just about writing.


