[
https://issues.apache.org/jira/browse/ARROW-14051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438009#comment-17438009
]
Dewey Dunnington commented on ARROW-14051:
------------------------------------------
The (partial) stack trace generating the error is this:
`Expression$type_id()`:
[https://github.com/apache/arrow/blob/master/r/R/expression.R#L140]
`nse_funcs$is.factor()`:
[https://github.com/apache/arrow/blob/master/r/R/dplyr-functions.R#L229]
`nse_funcs$if_else()`:
[https://github.com/apache/arrow/blob/master/r/R/dplyr-functions.R#L895]
It looks like the `Expression` that gets created has its `$schema` field
dropped in `make_field_refs()`, leaving a `NULL` schema, and that is what
triggers the error: [https://github.com/apache/arrow/blob/master/r/R/dplyr-summarize.R#L222]
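To make the suspected failure mode concrete, here is a minimal base R sketch. It does not use arrow at all: `make_expr()`, `type_id()` and `is_factor_expr()` are hypothetical stand-ins that I made up to mimic an expression whose type can only be resolved through its schema, so treat it as an illustration of the idea rather than the package's actual code.

```r
# Toy model only: these are NOT arrow's real classes, just plain R lists
# that mimic an expression carrying an optional schema.
make_expr <- function(field, schema = NULL) {
  list(field = field, schema = schema)
}

# Hypothetical stand-in for Expression$type_id(): the type can only be
# resolved by looking the field up in the schema.
type_id <- function(expr) {
  if (is.null(expr$schema)) {
    stop("cannot resolve type of '", expr$field, "': schema is NULL")
  }
  expr$schema[[expr$field]]
}

# Hypothetical stand-in for a type check such as nse_funcs$is.factor().
is_factor_expr <- function(expr) {
  identical(type_id(expr), "dictionary")
}

# A made-up schema mapping field names to type names.
schema <- list(y = "double", z = "double")

# With the schema attached, the type check can be answered.
with_schema <- make_expr("y", schema = schema)
is_factor_expr(with_schema)

# Simulates the field reference being rebuilt without its schema:
# the same type check now fails instead of returning an answer.
without_schema <- make_expr("y")
result <- tryCatch(
  is_factor_expr(without_schema),
  error = function(e) conditionMessage(e)
)
result
```

If the real cause matches this sketch, carrying the schema through when the field references are rebuilt should be enough for the type check inside `if_else()` to succeed.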
> [R] Handle conditionals enclosing aggregate expressions
> -------------------------------------------------------
>
> Key: ARROW-14051
> URL: https://issues.apache.org/jira/browse/ARROW-14051
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Reporter: Ian Cook
> Assignee: Dewey Dunnington
> Priority: Major
> Labels: query-engine
>
> This type of {{summarise()}} expression does not work in arrow:
> {code:r}
> Table$create(x = c(0, 1, 1), y = c(2, 3, 5), z = c(8, 13, 21)) %>%
>   group_by(x) %>%
>   summarise(r = ifelse(n() > 1, mean(y), mean(z)))
> {code}