Oops, I missed that the question was about whole column replacements. I'm generally using := now. Will have a think about this ...
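
For reference, a minimal sketch of a whole-column replacement with `:=`. It assumes a recent data.table release, so behaviour in the 2011 version discussed in this thread may differ. The table mirrors the `DT` example quoted below, and the full-length right-hand side built with `rep()` is an illustrative choice, not something taken from the thread.

```r
library(data.table)

# Same shape as the DT example quoted below: two integer columns, keyed on x.
DT <- data.table(x = 1:10, y = 1:10, key = "x")

# Replace the whole non-key column y by reference.
# Supplying one value per row (.N values) replaces the column outright,
# so its type can change from integer to character.
DT[, y := rep("foo", .N)]

class(DT$y)
```

Because `:=` modifies `DT` in place, no `DT <- ...` reassignment is needed afterwards.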
"Chris Neff" <[email protected]> wrote in message news:caauy0ruusbcvsftyqbg4m+cwa3zhwrjmwpl-q0fblaq+egn...@mail.gmail.com... This was surprising to me. I would've thought test.dt$X <- "foo" and test.dt[["X"]] <- "foo" should do the exact same thing because the entire column is being overwritten, no? I would understand the issue if it was test.dt$X[i] <- "foo", but it isn't. In a related note, I'm surprised this doesn't work either: > DT=data.table(x=1:10,y=1:10,key="x") > DT$x="foo" Warning message: In `[<-.data.table`(x, j = name, value = value) : NAs introduced by coercion > DT[["x"]]="foo" # Works fine and sets DT$x to be a character vector with > "foo" repeated. DT$x is allowed to be overwritten by other classes (i.e. DT$x <- rnorm(10) when DT$x starts as an integer), why not a character, or change "foo" to a factor and assign it if that is what must be done? This is also true with a non-key variable, like DT$y. On 8 September 2011 18:15, Matthew Dowle <[email protected]> wrote: > It's intended. data.table requires sorted factor levels. If the levels > somehow become unsorted then binary search joins don't work. We might > be able to allow it and make data.table maintain things (with a speed > penalty due to the resort of levels and re-write of the entire integer > column), but, allowing character columns is (hopefully) not far away > which should be much better solution. > > Or, in the meantime, you can go 'under the hood'. Add "something > different" to the end of the factor levels using levels()<-, then the > assignment to the column should work. If that column is part of a key > then make sure to recall setkey() which will check and resort the levels > for you (with a warning message). If an existing level is changing name, > then it's faster to assign (once) directly to the levels(). But again, > careful to maintain sorted levels if that column is to be used in joins. 
>
> Matthew
>
>
> On Thu, 2011-09-08 at 14:27 -0500, Damian Betebenner wrote:
>> Not sure whether the following is an intended behavior with
>> data.table. Perhaps it is something idiosyncratic with factors.
>>
>> It is different than what one gets with data.frame
>>
>> test.df <- data.frame(X=letters[1:10], Y=rnorm(10))
>> test.dt <- data.table(X=letters[1:10], Y=rnorm(10))
>>
>> test.df$X <- "Something Different"
>> test.dt$X <- "Something Different"
>> Error in `[<-.data.table`(x, j = name, value = value) :
>>   Some or all RHS not present in factor column levels
>>
>>
>> test.df <- data.frame(X=letters[1:10], Y=rnorm(10))
>> test.dt <- data.table(X=letters[1:10], Y=rnorm(10))
>>
>> test.df[["X"]] <- "Something Different"
>> test.dt[["X"]] <- "Something Different"
>>
>>
>> Damian Betebenner
>> Center for Assessment
>> PO Box 351
>> Dover, NH 03821-0351
>>
>> Phone (office): (603) 516-7900
>> Phone (cell): (857) 234-2474
>> Fax: (603) 516-7910
>>
>> [email protected]
>> www.nciea.org
>>
>> _______________________________________________
>> datatable-help mailing list
>> [email protected]
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
