This is an automated email from the ASF dual-hosted git repository.

jonkeane pushed a commit to branch maint-17.0.0-r
in repository https://gitbox.apache.org/repos/asf/arrow.git

commit 9754885cf9563b6e1a82612477ae7a0f07020836
Author: Jonathan Keane <[email protected]>
AuthorDate: Sun Jul 7 11:57:29 2024 -0500

    GH-43153: [R] pull on a grouped query returns the wrong column (#43172)

    ### Rationale for this change

    Fix a bug in our implementation of `pull` on grouped datasets: the result was indexed by position (`[[1]]`) rather than by the requested column name.

    ### What changes are included in this PR?

    An additional test, the fix.

    ### Are these changes tested?

    Yes, with the test I added.

    ### Are there any user-facing changes?

    Users will now get the expected behavior when using `pull` on grouped queries.

    **This PR contains a "Critical Fix".**

    * GitHub Issue: #43153

    Lead-authored-by: Jonathan Keane <[email protected]>
    Co-authored-by: Neal Richardson <[email protected]>
    Signed-off-by: Jonathan Keane <[email protected]>
---
 r/R/dplyr-collect.R                 | 2 +-
 r/tests/testthat/test-dplyr-query.R | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/r/R/dplyr-collect.R b/r/R/dplyr-collect.R
index c3232c6ff7..08555cd9f3 100644
--- a/r/R/dplyr-collect.R
+++ b/r/R/dplyr-collect.R
@@ -64,7 +64,7 @@ pull.Dataset <- function(.data,
   .data <- as_adq(.data)
   var <- vars_pull(names(.data), !!enquo(var))
   .data$selected_columns <- set_names(.data$selected_columns[var], var)
-  out <- dplyr::compute(.data)[[1]]
+  out <- dplyr::compute(.data)[[var]]
   handle_pull_as_vector(out, as_vector)
 }
 pull.RecordBatchReader <- pull.arrow_dplyr_query <- pull.Dataset
diff --git a/r/tests/testthat/test-dplyr-query.R b/r/tests/testthat/test-dplyr-query.R
index bab81a463e..7c75a84234 100644
--- a/r/tests/testthat/test-dplyr-query.R
+++ b/r/tests/testthat/test-dplyr-query.R
@@ -87,6 +87,7 @@ test_that("pull", {
     .input %>%
       filter(int > 4) %>%
       rename(strng = chr) %>%
+      group_by(dbl) %>%
       pull(strng) %>%
       as.vector(),
     tbl
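
The difference between `[[1]]` and `[[var]]` can be illustrated with a small base-R sketch. It assumes, as the added `group_by(dbl)` test suggests, that the computed result of a grouped query can carry the grouping column ahead of the selected one. The data frame `result` and its values are hypothetical stand-ins for what `dplyr::compute()` returns, not Arrow's actual output.

```r
# Hypothetical stand-in for the computed result of a grouped query:
# the grouping column (dbl) comes first, the pulled column (strng) second.
result <- data.frame(
  dbl = c(5.1, 6.2, 7.3),
  strng = c("e", "f", "g"),
  stringsAsFactors = FALSE
)
var <- "strng"

# Positional indexing returns whichever column happens to be first,
# here the grouping column rather than the requested one.
by_position <- result[[1]]

# Indexing by name returns the requested column regardless of column order.
by_name <- result[[var]]

print(by_position)
print(by_name)
```

Indexing by name makes the lookup independent of how many columns precede the requested one.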
