On 5 August 2011 08:37, Matthew Dowle <[email protected]> wrote:
> That is indeed odd. Please file a bug.report(). Intended was option B.

Done: https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1496&group_id=240&atid=975

> I see why you prefer A, but it is B so that compound syntax works; e.g.
> DT[i,done:=TRUE][,sum(done)]

Ah, that makes sense. If someone absent-mindedly does

    DT <- DT[i, z:=1:10]

are they incurring any copying there? I'm guessing no, because <- is by reference.

> Compound syntax is also why it doesn't return the number of rows updated by
> :=. Verbosity returns the 'x rows updated' message.
>
> However, even if it worked, the usual copy-on-write semantics would still
> not work (and that is deliberate).
>
> The correct way to copy a data.table is (now) :
>
>   out <- data.table(DT[,z:=10])
>
> As per (new) examples in ?setkey. Does that work ok in this case?

Unexpected, but acceptable and reasonable in the context of the package. I saw the examples in setkey and those work for me in terms of making a copy. I really wasn't even interested in making a copy; I was just doing things like DT[,z:=10] and was perplexed by the output.

However, playing a bit more, I've found another weird thing with copy vs. reference:

    DT <- data.table(x=1:10, y=1:10)
    DT2 <- DT          # DT2 is a reference to DT at the moment
    DT2[, z := 1:10]   # Both DT and DT2 have z as 1:10
    DT2$z <- 2:11      # DT2 now becomes a copy of DT with updated z column. DT$z is still 1:10

Is it intentional that DT2$z should convert a reference to a copy?

> So, if you really want to copy a (potentially very huge) data.table then
> I've (deliberately) made it harder for you (and me myself) to copy (often by
> accident). But, being able to copy is still possible if you need to.

Yeah, I actually appreciate the awareness it gives you about these things.

> Thinking about it, perhaps we need a new copy() function, or even
> duplicate(), since data.table() is a bit too heavy if all you need is a
> mere copy.

+1 to this.

> Note that force() in a function body doesn't force local copies of
> data.tables, either, even on copy-on-write.
> That is deliberate, too. If you really need a copy, then you really must
> explicitly copy to a new variable name AND use data.table() to explicitly
> create (potentially a huge amount of) new memory.
>
> It isn't actually data.table itself per se, that doesn't copy, it's the
> functions that operate on it. So, setkey has been changing DT by reference
> since 1.6.2, and now := does too.
>
> Matthew
>
>
> "Chris Neff" <[email protected]> wrote in message
> news:caauy0ruvjjw1pj7j-net8yvx8nbxtfnlxs8p1gwd7-a9vgl...@mail.gmail.com...
> Now that I've played with := for a little bit, what is the rationale
> for the following?
>
>> DT <- data.table(x=1:10, y=1:10)
>> out <- DT[, z:=1:10]
>> out
>        x  y
>  [1,]  1  1
>  [2,]  2  2
>  [3,]  3  3
>  [4,]  4  4
>  [5,]  5  5
>  [6,]  6  6
>  [7,]  7  7
>  [8,]  8  8
>  [9,]  9  9
> [10,] 10 10
>> DT
>        x  y  z
>  [1,]  1  1  1
>  [2,]  2  2  2
>  [3,]  3  3  3
>  [4,]  4  4  4
>  [5,]  5  5  5
>  [6,]  6  6  6
>  [7,]  7  7  7
>  [8,]  8  8  8
>  [9,]  9  9  9
> [10,] 10 10 10
>
>
> I would have expected the return from DT[, z:=1:10] to be either A)
> nothing, which is what I think is the preferred thing if you are
> really trying to drill home the idea of in place assignment, or B) the
> newly updated version of DT with z in it (but I think that muddles
> what := does). Why does it return what it does?
>
> _______________________________________________
> datatable-help mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
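
For anyone reading this thread later, the reference-versus-copy behaviour discussed here can be sketched roughly as follows. This is only a minimal sketch, and it rests on two assumptions that are not part of the messages in this thread: that you are running a later data.table release in which the copy() function proposed here is available, and that := returns the updated table (the intended "option B"), so that compound syntax works. The column name done and the variables shared, n_done, DT3 and unchanged are made up for illustration.

```r
library(data.table)

DT <- data.table(x = 1:10, y = 1:10)

# Plain assignment binds a second name to the same table; no copy is made.
DT2 <- DT
DT2[, z := 1:10]                  # := adds the column by reference
shared <- "z" %in% names(DT)      # z is visible through DT as well

# Compound syntax relies on := returning the updated table (option B).
# Rows with x <= 5 get NA in the new column, hence na.rm = TRUE.
n_done <- DT[x > 5, done := TRUE][, sum(done, na.rm = TRUE)]

# An explicit copy (assumes copy() is available) is independent of DT.
DT3 <- copy(DT)
DT3[, z := 2:11]
unchanged <- identical(DT$z, 1:10)
```

If the assumptions hold, shared and unchanged should both be TRUE: the update made through DT2 shows up in DT, while the update made to the explicit copy DT3 does not.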
