ianmcook commented on a change in pull request #10001:
URL: https://github.com/apache/arrow/pull/10001#discussion_r611874806



##########
File path: r/NEWS.md
##########
@@ -21,14 +21,47 @@
 
 ## dplyr methods
 
-* `dplyr::mutate()` is now supported in Arrow for many applications. For 
queries on `Table` and `RecordBatch` that are not yet supported in Arrow, the 
implementation falls back to pulling data into an R `data.frame` first, as in 
the previous release. For queries on `Dataset`, it raises an error if the 
feature is not implemented.
+Many more `dplyr` verbs are supported on Arrow objects:
+
+* `dplyr::mutate()` is now supported in Arrow for many applications. For 
queries on `Table` and `RecordBatch` that are not yet supported in Arrow, the 
implementation falls back to pulling data into an R `data.frame` first, as in 
the previous release. For queries on `Dataset`, it raises an error if the 
function is not implemented. The main `mutate()` features that cannot yet be 
called on Arrow objects are (1) `mutate()` after `group_by()` (which is 
typically used in combination with aggregation) and (2) queries that use 
`dplyr::across()`.

Review comment:
       Consider adding a few words like this to explain why the two 
behaviors differ:
   ```suggestion
   * `dplyr::mutate()` is now supported in Arrow for many applications. For 
queries on `Table` and `RecordBatch` that are not yet supported in Arrow, the 
implementation falls back to pulling data into an in-memory R `data.frame` 
first, as in the previous release. For queries on `Dataset` (which can be 
larger than memory), it raises an error if the function is not implemented. The 
main `mutate()` features that cannot yet be called on Arrow objects are (1) 
`mutate()` after `group_by()` (which is typically used in combination with 
aggregation) and (2) queries that use `dplyr::across()`.
   ```
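
For illustration, here is a rough sketch of the two behaviors the suggested text describes. It assumes the `arrow` and `dplyr` packages are installed and uses `mtcars` and a temporary directory as stand-in data. The `across()` query is just one of the unsupported cases named above; newer releases may evaluate it natively, so the actual outcome depends on the installed version.

```r
library(arrow)
library(dplyr)

# In-memory Table: per the notes above, a mutate() that Arrow cannot
# evaluate falls back to pulling the data into an R data.frame first
# (assumption: the fallback is typically accompanied by a warning).
tab <- Table$create(mtcars)
res_table <- tab %>%
  mutate(across(c(mpg, hp), ~ .x * 2)) %>%
  collect()

# Dataset: the data may be larger than memory, so the same query is
# expected to raise an error rather than fall back. tryCatch() captures
# the error message so the script keeps running either way.
tmp <- tempfile()
write_dataset(mtcars, tmp)
ds <- open_dataset(tmp)
res_ds <- tryCatch(
  ds %>%
    mutate(across(c(mpg, hp), ~ .x * 2)) %>%
    collect(),
  error = function(e) conditionMessage(e)
)
```

If the feature is unsupported in the installed version, `res_ds` holds the error message as a character string; otherwise it holds the collected result.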




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

