[
https://issues.apache.org/jira/browse/ARROW-13434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385517#comment-17385517
]
Neal Richardson commented on ARROW-13434:
-----------------------------------------
We do mutate() to add the column, but see if you can spot the hole in the
logic: https://github.com/apache/arrow/blob/master/r/R/dplyr-group-by.R#L28-L34
> [R] group_by() with an expression
> ---------------------------------
>
> Key: ARROW-13434
> URL: https://issues.apache.org/jira/browse/ARROW-13434
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Reporter: Jonathan Keane
> Priority: Major
>
> With dplyr, when we group_by with an expression, a column is added to the
> dataframe that has the result of the expression.
> {code}
> > example_data %>%
> + group_by(int < 4) %>% collect()
> # A tibble: 10 x 8
> # Groups:   int < 4 [3]
>      int   dbl  dbl2 lgl   false chr   fct   `int < 4`
>    <int> <dbl> <dbl> <lgl> <lgl> <chr> <fct> <lgl>
>  1     1   1.1     5 TRUE  FALSE a     a     TRUE
>  2     2   2.1     5 NA    FALSE b     b     TRUE
>  3     3   3.1     5 TRUE  FALSE c     c     TRUE
>  4    NA   4.1     5 FALSE FALSE d     d     NA
>  5     5   5.1     5 TRUE  FALSE e     NA    FALSE
>  6     6   6.1     5 NA    FALSE NA    NA    FALSE
>  7     7   7.1     5 NA    FALSE g     g     FALSE
>  8     8   8.1     5 FALSE FALSE h     h     FALSE
>  9     9    NA     5 FALSE FALSE i     i     FALSE
> 10    10  10.1     5 NA    FALSE j     j     FALSE
> {code}
> Arrow doesn't do this, however:
> {code}
> > Table$create(example_data) %>%
> + group_by(int < 4) %>% collect()
> Error: Invalid: No match for FieldRef.Name(int < 4) in int: int32
> dbl: double
> dbl2: double
> lgl: bool
> false: bool
> chr: string
> fct: dictionary<values=string, indices=int8, ordered=0>
> {code}
> This isn't a big deal right now since grouped aggregations aren't (quite)
> here yet, but once we start having support for that, we will have people
> using examples like this. This might actually be something we need/want to do
> in C++ instead of in the R client.
> The workaround is relatively simple: add the expression in a mutate, then
> group_by that.
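
The workaround described in the issue could look roughly like the sketch below. It is an illustration under stated assumptions, not the fix for this issue: `example_data` here is a cut-down stand-in for the data in the report, the column name `int_lt_4` is a hypothetical choice, and the fallback to a plain tibble exists only so the pipeline can be run where the arrow package is not installed.

```r
library(dplyr)

# Cut-down stand-in for example_data (assumption: only the `int` and `chr`
# columns from the report are reproduced here).
example_data <- tibble(
  int = c(1:3, NA, 5:10),
  chr = c("a", "b", "c", "d", "e", NA, "g", "h", "i", "j")
)

# Use an Arrow Table when arrow is available; otherwise fall back to the
# tibble so the shape of the pipeline can still be exercised.
source_data <- if (requireNamespace("arrow", quietly = TRUE)) {
  arrow::Table$create(example_data)
} else {
  example_data
}

result <- source_data %>%
  mutate(int_lt_4 = int < 4) %>%  # materialise the expression as a named column
  group_by(int_lt_4) %>%          # group by that column, not by the expression
  collect()
```

Naming the column explicitly sidesteps the field lookup for `int < 4` that fails in the error above. If matching dplyr's output matters, the column could be named with backticks (`` `int < 4` ``) instead of the hypothetical `int_lt_4`.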