On 11/01/14 14:31, Arunkumar Srinivasan wrote:
Thanks for reporting. That's expected behaviour. Use an explicit |copy|.
In short, when you do: |DT1 <- DT2|, there's *|no copy|* being made.
They still reference/point to the same location (try doing
|tracemem(DT1)| and |tracemem(DT2)|).
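A quick way to check this, as a minimal sketch: it assumes a current data.table where address() is exported, and uses it instead of tracemem() only because it returns a plain string that is easy to compare. The table contents are made up for illustration.

```r
library(data.table)

DT2 <- data.table(a = letters[1:4], b = 1:4)
DT1 <- DT2   # binds a second name to the same object; nothing is copied

# address() returns the memory address of its argument as a string.
same_object <- identical(address(DT1), address(DT2))
print(same_object)   # TRUE: both names point at one object
```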
Just to be clear that's no different to base. DF1 <- DF2 makes no copy
in base either. In fact x <- y never makes a copy in R regardless of
what x and y are.
The phrase "copy-on-write" is terribly named because it might imply DF1
<- DF2 copies. I think the term should be "copy-on-subassign" because
that's really what R does. Only at the point of changing a sub-element
of an object does <- copy (if another symbol is pointing to that same
object). It is switching from <- to set* and := that does things by
reference, not switching from data.frame to data.table.
Subassigning to a data.table using <- will still copy the entire
data.table, just like base. Only set* and := can modify by
reference. In fact, set* can be used on data.frame too, and other
objects; e.g. setattr is often useful on non-data.tables and
therefore copy() is useful on non-data.tables too. Hope that
clarifies.
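A small sketch of the points above, assuming a current data.table: setattr() changing a plain data.frame by reference, copy() protecting against that, and := changing a data.table seen through another name. Object names and values are hypothetical.

```r
library(data.table)

# setattr() on a plain data.frame: both names see the change.
DF2 <- data.frame(a = letters[1:4], b = 1:4)
DF1 <- DF2
setattr(DF1, "names", c("aa", "bb"))
print(names(DF2))   # "aa" "bb"

# copy() gives an independent object, also for a data.frame.
DF3 <- copy(DF2)
setattr(DF3, "names", c("x", "y"))
print(names(DF2))   # still "aa" "bb"

# := modifies in place, so the other name sees the new values.
DT2 <- data.table(a = letters[1:4], b = 1:4)
DT1 <- DT2
DT1[, b := b * 10L]
print(DT2$b)        # 10 20 30 40
```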
So when you change the names of one |DT| by reference, the other one
will get changed as well - they're both pointing to the same location.
To overcome this, when you want to duplicate a |DT|, explicitly use
|copy|. That is, |DT1 <- copy(DT2)|. Now if you |setnames(DT1, c("x",
"y"))|, then |DT2| names won't get changed.
I think there's a FR somewhere on documenting this... Thanks again for
reporting (with nice example).
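For reference, the explicit copy suggested above looks roughly like this. It is a sketch with made-up data, reusing the DT1/DT2 names from the explanation.

```r
library(data.table)

DT2 <- data.table(a = letters[1:4], b = 1:4)
DT1 <- copy(DT2)              # explicit copy: DT1 is now a separate object

setnames(DT1, c("x", "y"))    # renames DT1 by reference

print(names(DT1))   # "x" "y"
print(names(DT2))   # "a" "b", unchanged
```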
Arun
------------------------------------------------------------------------
From: Holger Kirsten <[email protected]>
Reply: Holger Kirsten <[email protected]>
Date: January 11, 2014 at 3:19:02 PM
To: [email protected]
Subject: [datatable-help] setnames changes names of other data.table
In a debugging session, I found that setnames changed the names of an
identical data.table even though it is bound to a different name:
> ############### using setnames()
> require(data.table)
> mytab = data.table(a = letters[1:4], b = 1:4 )
> str(mytab)
Classes 'data.table' and 'data.frame': 4 obs. of 2 variables:
$ a: chr "a" "b" "c" "d"
$ b: int 1 2 3 4
- attr(*, ".internal.selfref")=<externalptr>
> mytab
a b
1: a 1
2: b 2
3: c 3
4: d 4
>
> othertab = mytab
> othertab
a b
1: a 1
2: b 2
3: c 3
4: d 4
> setnames(othertab, c("a", "b"), c("aa","bb"))
> othertab
aa bb
1: a 1
2: b 2
3: c 3
4: d 4
> mytab ## names have unexpectedly changed too
aa bb
1: a 1
2: b 2
3: c 3
4: d 4
>
> ############### using names()
> mytab = data.table(a = letters[1:4], b = 1:4 )
> str(mytab)
Classes 'data.table' and 'data.frame': 4 obs. of 2 variables:
$ a: chr "a" "b" "c" "d"
$ b: int 1 2 3 4
- attr(*, ".internal.selfref")=<externalptr>
> mytab
a b
1: a 1
2: b 2
3: c 3
4: d 4
>
> othertab = mytab
> othertab
a b
1: a 1
2: b 2
3: c 3
4: d 4
> names(othertab) = c("aa","bb")
Warning message:
In `names<-.data.table`(`*tmp*`, value = c("aa", "bb")) :
The names(x)<-value syntax copies the whole table. This is due to
<- in R itself. Please change to setnames(x,old,new) which does not
copy and is faster. See help('setnames'). You can safely ignore this
warning if it is inconvenient to change right now. Setting
options(warn=2) turns this warning into an error, so you can then use
traceback() to find and change your names<- calls.
> othertab
aa bb
1: a 1
2: b 2
3: c 3
4: d 4
> mytab ## names unchanged as expected
a b
1: a 1
2: b 2
3: c 3
4: d 4
>
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
LC_MONETARY=German_Germany.1252 LC_NUMERIC=C LC_TIME=German_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.8.10
loaded via a namespace (and not attached):
[1] tools_3.0.1
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help