eitsupi commented on issue #14872:
URL: https://github.com/apache/arrow/issues/14872#issuecomment-1342658842
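A simpler example. After the second `group_by()`, the `value` column should hold the string `"foo"`, but it comes back as a `<dbl>` copy of `mpg`.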
``` r
mtcars |>
  arrow::arrow_table() |>
  dplyr::group_by(mpg, cyl) |>
  dplyr::summarise(value = "foo") |>
  dplyr::group_by(mpg, value) |>
  dplyr::collect()
#> # A tibble: 27 × 3
#> # Groups: mpg, value [25]
#>      mpg   cyl value
#>    <dbl> <dbl> <dbl>
#>  1    21     6    21
#>  2  22.8     4  22.8
#>  3  21.4     6  21.4
#>  4  18.7     8  18.7
#>  5  18.1     6  18.1
#>  6  14.3     8  14.3
#>  7  19.2     6  19.2
#>  8  17.3     8  17.3
#>  9  14.7     8  14.7
#> 10  33.9     4  33.9
#> # … with 17 more rows
```
<sup>Created on 2022-12-08 with [reprex
v2.0.2](https://reprex.tidyverse.org)</sup>
By the way, I don't think this is a `summarise` problem, because the same
thing happens if we use `mutate` instead of `summarise`.
``` r
mtcars |>
  arrow::arrow_table() |>
  dplyr::group_by(mpg, cyl) |>
  dplyr::mutate(value = "foo") |>
  dplyr::group_by(mpg, value) |>
  dplyr::collect()
#> # A tibble: 32 × 12
#> # Groups: mpg, value [25]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb value
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1    21     6   160   110   3.9  2.62  16.5     0     1     4     4    21
#>  2    21     6   160   110   3.9  2.88  17.0     0     1     4     4    21
#>  3  22.8     4   108    93  3.85  2.32  18.6     1     1     4     1  22.8
#>  4  21.4     6   258   110  3.08  3.22  19.4     1     0     3     1  21.4
#>  5  18.7     8   360   175  3.15  3.44  17.0     0     0     3     2  18.7
#>  6  18.1     6   225   105  2.76  3.46  20.2     1     0     3     1  18.1
#>  7  14.3     8   360   245  3.21  3.57  15.8     0     0     3     4  14.3
#>  8  24.4     4  147.    62  3.69  3.19    20     1     0     4     2  24.4
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2  22.8
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4  19.2
#> # … with 22 more rows
```
<sup>Created on 2022-12-08 with [reprex
v2.0.2](https://reprex.tidyverse.org)</sup>
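For comparison, here is a minimal sketch of the same pipeline on a plain data frame, with the `arrow_table()` step left out. It assumes that dplyr's in-memory result is the expected behavior; `expected` is only an illustrative name, and `.groups = "drop_last"` is added just to silence the regrouping message.

``` r
# Same pipeline as the first example, but without converting to an Arrow table.
# Assumption: the plain-dplyr result is what the Arrow query should return.
expected <- mtcars |>
  dplyr::group_by(mpg, cyl) |>
  dplyr::summarise(value = "foo", .groups = "drop_last") |>
  dplyr::group_by(mpg, value) |>
  dplyr::collect()

# `value` should be a character column that contains only "foo",
# and the result should be grouped by `mpg` and `value`.
stopifnot(
  is.character(expected$value),
  all(expected$value == "foo"),
  identical(dplyr::group_vars(expected), c("mpg", "value"))
)
```

If these checks pass on the data frame but fail once `arrow::arrow_table()` is put back in, that would point at how the Arrow query handles the second `group_by()`, not at `summarise` or `mutate`.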