[
https://issues.apache.org/jira/browse/ARROW-14586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441151#comment-17441151
]
Dewey Dunnington commented on ARROW-14586:
------------------------------------------
Another case where a better error would help is a summarise() call that ends
up with a non-aggregate expression:
{code:R}
library(arrow, warn.conflicts = FALSE)
library(dplyr, warn.conflicts = FALSE)
RecordBatch$create(x = c(0, 1, 1), y = c(2, 3, 5), z = c(8, 13, 21)) %>%
  mutate(new_col = x + 0.1) %>%
  group_by(x) %>%
  summarise(r = ifelse(new_col > 1, mean(y), mean(z))) %>%
  collect()
#> Warning: Error : Expression ifelse(new_col > 1, mean(y), mean(z)) not supported
#> in Arrow; pulling data into R
#> `summarise()` has grouped output by 'x'. You can override using the `.groups` argument.
#> # A tibble: 3 × 2
#> # Groups:   x [2]
#>       x     r
#>   <dbl> <dbl>
#> 1     0     8
#> 2     1     4
#> 3     1     4
{code}
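A rough sketch of how the two cases could be told apart up front, so that a targeted message is possible instead of the generic fallback warning. This is not how arrow handles it: {{agg_funs}}, {{is_agg_call()}}, {{has_agg()}} and {{classify_agg()}} are hypothetical names, and the list of aggregate functions is an assumption. It uses base R only and walks the unevaluated expression:
{code:R}
# Assumed, incomplete set of aggregate function names (illustration only).
agg_funs <- c("mean", "sum", "min", "max", "n")

# TRUE if `e` is a call whose function is a bare name listed in agg_funs.
is_agg_call <- function(e) {
  is.call(e) && is.name(e[[1]]) && as.character(e[[1]]) %in% agg_funs
}

# TRUE if `e` is, or contains anywhere inside it, an aggregate call.
# Note: calls with empty arguments such as x[1, ] are not handled here.
has_agg <- function(e) {
  if (!is.call(e)) return(FALSE)
  is_agg_call(e) || any(vapply(as.list(e)[-1], has_agg, logical(1)))
}

# Classify an unevaluated summarise() expression:
#   "nested"  - an aggregate call has another aggregate call in its arguments
#   "wrapped" - aggregate calls appear inside a non-aggregate call
#   "ok"      - a single top-level aggregate call, or no aggregate at all
classify_agg <- function(expr) {
  if (!has_agg(expr)) return("ok")
  inner <- any(vapply(as.list(expr)[-1], has_agg, logical(1)))
  if (is_agg_call(expr)) {
    if (inner) "nested" else "ok"
  } else {
    "wrapped"
  }
}

classify_agg(quote(mean(mean(x))))
classify_agg(quote(ifelse(new_col > 1, mean(y), mean(z))))
{code}
The first call is classified as "nested" and the second as "wrapped". Whether a given "wrapped" expression can actually be evaluated is up to the engine; the sketch only shows that the shape of the expression is enough to choose a more specific message.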
> [R] summarise() with nested aggregate expressions has a confusing error
> -----------------------------------------------------------------------
>
> Key: ARROW-14586
> URL: https://issues.apache.org/jira/browse/ARROW-14586
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Reporter: Dewey Dunnington
> Assignee: Dewey Dunnington
> Priority: Minor
>
> This affects code along the lines of {{summarise(mean(mean(var)))}} where the
> inner expression is an aggregate function. This is probably not useful, but
> the error it gives is not particularly helpful:
> {code:R}
> library(arrow, warn.conflicts = FALSE)
> library(dplyr, warn.conflicts = FALSE)
> RecordBatch$create(x = 4) %>%
>   summarise(y = mean(mean(x)))
> #> Warning: Error in mean(..temp0) : object '..temp0' not found; pulling data into
> #> R
> #> # A tibble: 1 × 1
> #>       y
> #>   <dbl>
> #> 1     4
> {code}