This is an automated email from the ASF dual-hosted git repository.
jonkeane pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/main by this push:
new cad13bf8a6 GH-43153: [R] pull on a grouped query returns the wrong
column (#43172)
cad13bf8a6 is described below
commit cad13bf8a65eaf13ec9feb78447d9b2f14c63965
Author: Jonathan Keane <[email protected]>
AuthorDate: Sun Jul 7 11:57:29 2024 -0500
GH-43153: [R] pull on a grouped query returns the wrong column (#43172)
### Rationale for this change
Fix a bug in our implementation of `pull` on grouped datasets: it returned the wrong column.
### What changes are included in this PR?
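The fix (`pull` now extracts the result column by name instead of by position) and a grouped case added to the existing `pull` test.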
### Are these changes tested?
Yes, by the existing `pull` test, which now includes a `group_by()` step.
### Are there any user-facing changes?
Users will now get the expected behavior when using `pull` on grouped
queries.
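
A minimal base-R sketch of why positional extraction goes wrong. It assumes (this is not stated in the commit message) that the table computed for a grouped query carries the grouping column ahead of the selected one; `computed` is a hypothetical stand-in for the result of `dplyr::compute(.data)`, not actual arrow output.

```r
# Hypothetical stand-in for the computed result of a grouped query:
# the grouping column (dbl) precedes the selected column (strng).
computed <- data.frame(
  dbl = c(5.1, 6.2, 7.3),
  strng = c("e", "f", "g"),
  stringsAsFactors = FALSE
)
var <- "strng"

# Extraction by position takes whatever column happens to come first.
by_position <- computed[[1]]

# Extraction by name takes the requested column regardless of position.
by_name <- computed[[var]]

print(by_position)
print(by_name)
```

Under that assumption, `by_position` holds the `dbl` values while `by_name` holds the `strng` values, which is the difference between `[[1]]` and `[[var]]` in the diff below.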
**This PR contains a "Critical Fix".**
* GitHub Issue: #43153
Lead-authored-by: Jonathan Keane <[email protected]>
Co-authored-by: Neal Richardson <[email protected]>
Signed-off-by: Jonathan Keane <[email protected]>
---
r/R/dplyr-collect.R | 2 +-
r/tests/testthat/test-dplyr-query.R | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/r/R/dplyr-collect.R b/r/R/dplyr-collect.R
index c3232c6ff7..08555cd9f3 100644
--- a/r/R/dplyr-collect.R
+++ b/r/R/dplyr-collect.R
@@ -64,7 +64,7 @@ pull.Dataset <- function(.data,
.data <- as_adq(.data)
var <- vars_pull(names(.data), !!enquo(var))
.data$selected_columns <- set_names(.data$selected_columns[var], var)
- out <- dplyr::compute(.data)[[1]]
+ out <- dplyr::compute(.data)[[var]]
handle_pull_as_vector(out, as_vector)
}
pull.RecordBatchReader <- pull.arrow_dplyr_query <- pull.Dataset
diff --git a/r/tests/testthat/test-dplyr-query.R b/r/tests/testthat/test-dplyr-query.R
index bab81a463e..7c75a84234 100644
--- a/r/tests/testthat/test-dplyr-query.R
+++ b/r/tests/testthat/test-dplyr-query.R
@@ -87,6 +87,7 @@ test_that("pull", {
.input %>%
filter(int > 4) %>%
rename(strng = chr) %>%
+ group_by(dbl) %>%
pull(strng) %>%
as.vector(),
tbl