[ https://issues.apache.org/jira/browse/ARROW-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606134#comment-17606134 ]
SHIMA Tatsuya commented on ARROW-17738:
---------------------------------------
Ah, is this the intended behavior?
I don't understand why this behavior would be intended. I think compute() should
return a Table here, just as dbplyr and dtplyr do.
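Until that is settled, a minimal workaround sketch, based only on the two cases
in the description that do return a Table (ungrouping before compute(), or
calling as_arrow_table()). The variable names are illustrative, and it assumes
group_vars() is available for arrow dplyr queries.
{code:r}
library(arrow)
library(dplyr)

grouped_query <- mtcars |> arrow_table() |> group_by(cyl)

# Remember the grouping columns, because ungroup() discards them.
grouping <- group_vars(grouped_query)

# Per the report, compute() returns a Table once the query is ungrouped.
tbl_from_compute <- grouped_query |> ungroup() |> compute()
inherits(tbl_from_compute, "Table")

# Per the report, as_arrow_table() returns a Table even for a grouped query.
tbl_from_convert <- grouped_query |> as_arrow_table()
inherits(tbl_from_convert, "Table")
{code}
The saved grouping columns can be reapplied afterwards if the caller still
needs them. This only sidesteps the problem and does not change what compute()
returns for grouped queries.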
> [R] dplyr::compute does not work for grouped arrow dplyr query
> --------------------------------------------------------------
>
> Key: ARROW-17738
> URL: https://issues.apache.org/jira/browse/ARROW-17738
> Project: Apache Arrow
> Issue Type: Bug
> Components: R
> Affects Versions: 9.0.0
> Reporter: SHIMA Tatsuya
> Assignee: SHIMA Tatsuya
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> {{dplyr::compute()}} is expected to evaluate an arrow dplyr query and return
> a Table, but for grouped queries it returns an {{arrow_dplyr_query}} instead
> of a Table.
> {code:r}
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::compute() |>
> class()
> #> [1] "arrow_dplyr_query"
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::ungroup() |>
> dplyr::compute() |> class()
> #> [1] "Table" "ArrowTabular" "ArrowObject" "R6"
> {code}
> In contrast, {{as_arrow_table()}} returns a Table as expected.
> {code:r}
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> class()
> #> [1] "arrow_dplyr_query"
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |> dplyr::compute() |>
> class()
> #> [1] "arrow_dplyr_query"
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |>
> dplyr::collect(as_data_frame = FALSE) |> class()
> #> [1] "arrow_dplyr_query"
> mtcars |> arrow::arrow_table() |> dplyr::group_by(cyl) |>
> arrow::as_arrow_table() |> class()
> #> [1] "Table" "ArrowTabular" "ArrowObject" "R6"
> {code}
> The result seems to be converted back to an {{arrow_dplyr_query}} at the following lines.
> [https://github.com/apache/arrow/blob/7cfdfbb0d5472f8f8893398b51042a3ca1dd0adf/r/R/dplyr-collect.R#L73-L75]
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)