wjones127 commented on code in PR #14337:
URL: https://github.com/apache/arrow/pull/14337#discussion_r998523465


##########
r/NEWS.md:
##########
@@ -19,6 +19,65 @@
 
 # arrow 9.0.0.9000
 
+## Arrow dplyr queries
+
+Several new functions can be used in queries: 
+
+* `dplyr::across()` can be used to apply the same computation across multiple 
+  columns;
+* `add_filename()` can be used to get the filename a row came from (only 
+  available when querying `?Dataset`);
+* Added five functions in the `slice_*` family: `dplyr::slice_min()`, 
+  `dplyr::slice_max()`, `dplyr::slice_head()`, `dplyr::slice_tail()`, and
+  `dplyr::slice_sample()`.
+
+A full list of functions available in queries is available at `?acero`.
+
+A few new features and bugfixes were implemented for joins.
+Extension arrays are now supported in joins, allowing, for example, joining 
+datasets that contain [geoarrow](https://paleolimbot.github.io/geoarrow/) data.
+The `keep` argument is now supported, allowing separate columns for the left
+and right hand side join keys in join output. Full joins now coalesce the 
+join keys (when `keep = FALSE`), avoiding the issue where the join keys would 
+be all `NA` for rows in the right hand side without any matches on the left.
+
+A few breaking changes: Calling `dplyr::pull()` will return a `?ChunkedArray` 
+instead of an R vector. Calling `dplyr::compute()` on a query that is grouped 
+returns a `?Table`, instead of an query object.

Review Comment:
   ```suggestion
   returns a `?Table`, instead of a query object.
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

