rafapereirabr opened a new issue, #37529:
URL: https://github.com/apache/arrow/issues/37529
### Describe the bug, including details regarding any error messages, version, and platform.
The function `dplyr::mutate_at()` does not work with an arrow Dataset or Table.
# Reprex:
## data
```
library(arrow)
library(dplyr)
data("mtcars")
head(mtcars)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
```
## how it works with a regular data.frame
```
# columns I'd like to change
cols_to_change <- c('disp', 'hp')
# function to change columns
fchange <- function(col) {
  if_else(col > 200, 'Yes', 'No')
}
mutate_at(mtcars,
          .vars = cols_to_change,
          .funs = fchange)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 No No 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 No No 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 22.8 4 No No 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 21.4 6 Yes No 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 18.7 8 Yes No 3.15 3.440 17.02 0 0 3 2
#> Valiant 18.1 6 Yes No 2.76 3.460 20.22 1 0 3 1
#> Duster 360 14.3 8 Yes Yes 3.21 3.570 15.84 0 0 3 4
```
## BUG
Now let's try this with arrow:
```
# save mtcars to a temp parquet file
mtcars_parquet <- tempfile(pattern = 'mtcars', fileext = '.parquet')
arrow::write_parquet(mtcars, mtcars_parquet)
# open mtcars as an arrow Dataset and Table
arrwD <- arrow::open_dataset(mtcars_parquet)
arrwT <- arrow::read_parquet(mtcars_parquet, as_data_frame = FALSE)
mutate_at(arrwD,
          .vars = cols_to_change,
          .funs = fchange)
```
> Error in UseMethod("tbl_vars") : no applicable method for 'tbl_vars'
applied to an object of class "c('Table', 'ArrowTabular', 'ArrowObject', 'R6')"
```
mutate_at(arrwT,
          .vars = cols_to_change,
          .funs = fchange)
```
> Error in UseMethod("tbl_vars") : no applicable method for 'tbl_vars'
applied to an object of class "c('Table', 'ArrowTabular', 'ArrowObject', 'R6')"
The function `dplyr::mutate()` works fine, though.
```
mutate(arrwD,
       disp = if_else(disp > 200, 'Yes', 'No'))
mutate(arrwT,
       disp = if_else(disp > 200, 'Yes', 'No'))
```
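A possible workaround is to express the same transformation with `mutate()` and `across()`, which supersedes `mutate_at()` in dplyr. The sketch below runs on the plain data.frame only. Whether the same call translates on `arrwD` or `arrwT` depends on the `across()` support in the installed arrow version, which is an assumption here and was not verified as part of this report. The name `res` is purely illustrative.

```r
library(dplyr)

data("mtcars")
cols_to_change <- c('disp', 'hp')

# across() applies the lambda to every selected column;
# all_of() selects the columns named in a character vector
res <- mutate(mtcars,
              across(all_of(cols_to_change),
                     ~ if_else(.x > 200, 'Yes', 'No')))

head(res[, cols_to_change])
```

If arrow accepts this form, swapping `mtcars` for `arrwD` or `arrwT` and finishing with `collect()` would be the natural thing to try.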
### Component(s)
R