nealrichardson commented on a change in pull request #8579:
URL: https://github.com/apache/arrow/pull/8579#discussion_r517464426
##########
File path: r/tests/testthat/test-Table.R
##########
@@ -179,10 +183,34 @@ test_that("[[<- assignment", {
# can use $
tab$new <- NULL
expect_null(as.vector(tab$new))
- expect_identical(dim(tab), c(10L, 5L))
+ expect_identical(dim(tab), c(10L, 4L))
tab$int <- 1:10
expect_vector(tab$int, 1:10)
+
+ # recycling
+ tab[["atom"]] <- 1L
+ expect_vector(tab[["atom"]], rep(1L, 10))
+
+ expect_error(
+ tab[["atom"]] <- 1:6,
+ paste0(
+ "Invalid: Added column's length must match table's length. ",
+ "Expected length 10 but got length 6"
+ )
+ )
+
+ # assign Arrow array and chunked_array
+ array <- Array$create(c(10:1))
+ tab$array <- array
+ expect_vector(tab$array, 10:1)
+
+ tab$chunked <- chunked_array(1:10)
+ expect_vector(tab$chunked, 1:10)
+
+ # nonsense indexes
+ expect_error(tab[[NA]] <- letters[10:1], "missing value where TRUE/FALSE
needed")
Review comment:
I'm not sure that's the right error to raise there so maybe there should
be better validation here.
Would also be important to test `NA_integer_` and `NA_real_` because those
will not get rejected by the R->cpp type mapping.
##########
File path: r/tests/testthat/test-Table.R
##########
@@ -179,10 +183,34 @@ test_that("[[<- assignment", {
# can use $
tab$new <- NULL
expect_null(as.vector(tab$new))
- expect_identical(dim(tab), c(10L, 5L))
+ expect_identical(dim(tab), c(10L, 4L))
tab$int <- 1:10
expect_vector(tab$int, 1:10)
+
+ # recycling
+ tab[["atom"]] <- 1L
+ expect_vector(tab[["atom"]], rep(1L, 10))
+
+ expect_error(
+ tab[["atom"]] <- 1:6,
+ paste0(
+ "Invalid: Added column's length must match table's length. ",
+ "Expected length 10 but got length 6"
+ )
+ )
+
+ # assign Arrow array and chunked_array
+ array <- Array$create(c(10:1))
+ tab$array <- array
+ expect_vector(tab$array, 10:1)
+
+ tab$chunked <- chunked_array(1:10)
+ expect_vector(tab$chunked, 1:10)
+
+ # nonsense indexes
+ expect_error(tab[[NA]] <- letters[10:1], "missing value where TRUE/FALSE
needed")
+ expect_error(tab[[NULL]] <- letters[10:1], "argument is of length zero")
Review comment:
```suggestion
expect_error(tab[[NULL]] <- letters[10:1])
```
Since that is an error message coming from base R (right?), we shouldn't
assert its exact message because it may be translated to local language (i.e.
it only will pass if you're in an English locale).
##########
File path: r/R/table.R
##########
@@ -261,28 +261,38 @@ names.Table <- function(x) x$ColumnNames()
`[[<-.Table` <- function(x, i, value) {
if (is.null(value)) {
if (is.character(i)) {
- i <- match(i, names(x)) - 1L
+ i <- match(i, names(x))
}
- x <- x$RemoveColumn(i)
+ x <- x$RemoveColumn(i - 1L)
} else {
if (!is.character(i)) {
# get or create a/the column name
- if (i <= ncol(x)) {
- i <- names(x)[[i]]
+ if (i <= x$num_columns) {
+ i <- names(x)[i]
} else {
i <- as.character(i)
}
}
+ # auto-magic recycling
Review comment:
1) We should probably only do recycling if `value` is not an ArrowObject
(perhaps with a TODO to support Arrow Scalar recycling)
2) https://vctrs.r-lib.org/reference/vec_recycle.html may be a good solution
instead of this bespoke code (we already import vctrs so it's available), or
else use `base::rep_len` (cf.
https://github.com/apache/arrow/blob/master/r/R/array.R#L291)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]