I am a bit confused about the semantics of classes, [, and [[. For at least some important built-in classes (factors and dates), both the getter and the setter methods of [ operate on the class. For [[, however, the getter method operates on the class while the setter method operates on the underlying vector. Is this behavior documented? (I haven't found any documentation of it.) Is it intentional, i.e. is it a bug or a feature? There are also cases where invalid assignments don't signal an error.
A simple example:

> fact <- factor(2,levels=2:4)   # master copy

> f0 <- fact; f0; dput(f0)
[1] 2
Levels: 2 3 4
structure(1L, .Label = c("2", "3", "4"), class = "factor")

> f0 <- fact; f0[1] <- 3; f0; dput(f0)   # use [ setter
[1] 3
Levels: 2 3 4
structure(2L, .Label = c("2", "3", "4"), class = "factor")

> f0 <- fact; f0[[1]] <- 3L; f0; dput(f0)   # use [[ setter
[1] 4                     # ? didn't convert 3 to factor
Levels: 2 3 4
structure(3L, .Label = c("2", "3", "4"), class = "factor")   # modified underlying vector
> f0[1]
[1] 4
Levels: 2 3 4             # but result is a valid factor

> f0 <- fact; f0[[1]] <- 3; f0; dput(f0)   # use [[ setter
[1] 4
Levels: 2 3 4
structure(3, .Label = c("2", "3", "4"), class = "factor")   # didn't convert to 3L
> f0[1]
Error in class(y) <- oldClass(x) :
  adding class "factor" to an invalid object

I suppose f0[1] and f0[[1]] fail here because the underlying vector must be integer and not numeric? If so, why didn't assigning to f0[[1]] cause an error? And why didn't printing f0 cause the same error?

Here are some more examples. Consider

fac <- factor(c("b","a","c"),levels=c("b","c","a"))

f <- fac; f[1] <- "c"; dput(f)
# structure(c(2L, 3L, 2L), .Label = c("b", "c", "a"), class = "factor")
#### OK, implicit conversion of "c" to factor(c) was performed

f <- fac; f[1] <- 25; dput(f)
# Warning message:
# In `[<-.factor`(`*tmp*`, 1, value = 25) :
#   invalid factor level, NAs generated
# structure(c(NA, 3L, 2L), .Label = c("b", "c", "a"), class = "factor")
#### OK, warning given for invalid value, which becomes an NA
#### Same thing happens for f[1]<-"foo"

So far, so good. Now compare to what happens with fac[[...]] <- ...

f <- fac; f[[1]] <- 25; dput(f)
# structure(c(25, 3, 2), .Label = c("b", "c", "a"), class = "factor")
#### No error given, but invalid factor generated

f <- fac; f[[1]] <- "c"; dput(f)
# structure(c("c", "3", "2"), .Label = c("b", "c", "a"), class = "factor")
#### No conversion performed; no error given; invalid factor generated

f
# [1] <NA> <NA> <NA>
# Levels: b c a
#### Prints as though it were factor(c(NA,NA,NA)) with no warning/error

f[]
# Error in class(y) <- oldClass(x) :
#   adding class "factor" to an invalid object
#### But f[] gives an error
#### Same error with f[1] and f[[1]]

Another interesting case is f[1] <- list(NULL) -- which correctly gives an error -- versus f[[1]] <- list(), which gives no error but results in an f which is not a factor at all:

f <- fac; f[[1]]<-list(); class(f); dput(f)
[1] "list"
list(list(), 3L, 2L)

I can see that being able to modify the underlying vector of a classed object directly would be very valuable functionality, but there is an asymmetry here: f[[1]]<- modifies the underlying vector, but f[[1]] accesses the classed vector. Presumably you need to do unclass(f)[[1]] to see the underlying value. But on the other hand, unclass doesn't have a setter (`unclass<-`), so you can't say unclass(f)[[1]] <- ...

I have not been able to find documentation of all this in the R Language Definition or in the man page for [/[[, but perhaps I'm looking in the wrong place?

             -s

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
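
On the last point (there is no `unclass<-`), here is a minimal sketch of one possible workaround: pull out the underlying codes, modify them, and rebuild the factor from the levels. The helper name set_code is hypothetical and not part of base R, and the sketch assumes a plain (unordered) factor; it is an illustration under those assumptions, not a statement of how [[<- is meant to behave.

```r
# Hypothetical helper (not in base R): set one underlying integer code of a
# plain, unordered factor and return a valid factor.
set_code <- function(f, i, code) {
  stopifnot(is.factor(f), length(i) == 1L, length(code) == 1L)
  code <- as.integer(code)
  # Reject codes that do not correspond to an existing level.
  if (is.na(code) || code < 1L || code > nlevels(f)) {
    stop("code must be an integer between 1 and ", nlevels(f))
  }
  codes <- as.integer(f)       # underlying codes, attributes dropped
  codes[[i]] <- code           # modify the plain integer vector
  # Rebuild through factor() so the result is always a valid factor.
  factor(levels(f)[codes], levels = levels(f))
}

fac <- factor(c("b", "a", "c"), levels = c("b", "c", "a"))
unclass(fac)[[1]]              # read the underlying code of element 1
f <- set_code(fac, 1, 3)       # point element 1 at the third level, "a"
dput(as.integer(f))
```

Rebuilding through factor() rather than re-attaching the class by hand means the integer storage type and the range of the codes are checked in one place, which sidesteps the invalid objects shown above.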