[GitHub] [arrow] ianmcook commented on a change in pull request #11425: ARROW-14304: [R] Update news for 6.0.0

GitBox Tue, 19 Oct 2021 06:41:59 -0700


ianmcook commented on a change in pull request #11425:
URL: https://github.com/apache/arrow/pull/11425#discussion_r731879881




##########
File path: r/NEWS.md
##########
@@ -21,32 +21,64 @@
 
 There are now two ways to query Arrow data:
 
-## 1. Grouped aggregation in Arrow
+## 1. Expanded Arrow-native queries: aggregation and joins
 
 `dplyr::summarize()`, both grouped and ungrouped, is now implemented for Arrow 
Datasets, Tables, and RecordBatches. Because data is scanned in chunks, you can 
aggregate over larger-than-memory datasets backed by many files. Supported 
aggregation functions include `n()`, `n_distinct()`, `min(),` `max()`, `sum()`, 
`mean()`, `var()`, `sd()`, `any()`, and `all()`. `median()` and `quantile()` 
with one probability are also supported and currently return approximate 
results using the t-digest algorithm.
 
+Along with `summarize()`, you can also call `count()`, `tally()`, and 
`distinct()`, which effectively wrap `summarize()`.
+
 This enhancement does change the behavior of `summarize()` and `collect()` in 
some cases: see "Breaking changes" below for details.
 
-New compute functions include `str_to_title()` and `strftime()`.
+In addition to `summarize()`, equality joins (`left_join()`, `inner_join()`, 
`semi_join()`, et al.) are also supported natively in Arrow.

Review comment:
       Let's just list all these instead of using "et al."
   ```suggestion
   In addition to `summarize()`, mutating and filtering equality joins 
(`inner_join()`, `left_join()`, `right_join()`, `full_join()`, `semi_join()`, 
and `anti_join()`) are also supported natively in Arrow.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
