I get one warning per data.table in the list
(line breaks added by me):
-Rick
=============================================
Warning messages:
1: In `[.data.table`(DT, , `:=`(c("Col3", "Col4"), list(C3, C4))) :
Invalid .internal.selfref detected and fixed by taking a copy
of the whole table so that := can add this new column by
reference. At an earlier point, this data.table has been copied
by R (or been created manually using structure() or similar).
Avoid key<-, names<- and attr<- which in R currently (and oddly)
may copy the whole data.table. Use set* syntax instead to avoid
copying: ?set, ?setnames and ?setattr. Also, in R<v3.1.0,
list(DT1,DT2) copied the entire DT1 and DT2 (R's list() used to
copy named objects); please upgrade to R>=v3.1.0 if that is
biting. If this message doesn't help, please report to
datatable-help so the root cause can be fixed.
2: In `[.data.table`(DT, , `:=`(c("Col3", "Col4"), list(C3, C4))) :
Invalid .internal.selfref detected and fixed by taking a copy
of the whole table so that := can add this new column by
reference. At an earlier point, this data.table has been copied
by R (or been created manually using structure() or similar).
Avoid key<-, names<- and attr<- which in R currently (and oddly)
may copy the whole data.table. Use set* syntax instead to avoid
copying: ?set, ?setnames and ?setattr. Also, in R<v3.1.0,
list(DT1,DT2) copied the entire DT1 and DT2 (R's list() used to
copy named objects); please upgrade to R>=v3.1.0 if that is
biting. If this message doesn't help, please report to
datatable-help so the root cause can be fixed.
=============================================
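For reference, here is a minimal sketch of the set* functions the warning points to, run on a toy table. The table, the column names and the "note" attribute are made up for illustration; setnames(), setattr() and set() are the data.table functions named in the message.

```r
library(data.table)

# Toy table, for illustration only
DT <- data.table(Col1 = 1:3, Col2 = 4:6)

# Rename a column by reference, instead of names(DT)[2] <- "B"
setnames(DT, "Col2", "B")

# Set an attribute by reference, instead of attr(DT, "note") <- "example"
setattr(DT, "note", "example")

# Add a new column by reference with set(); j names the column to create
set(DT, j = "Col3", value = 7:9)

names(DT)
```

None of these calls goes through R's replacement functions (names<-, attr<-), which are the ones the warning says may copy the whole table.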
On Fri, Sep 20, 2013 at 12:49 PM, Matthew Dowle
<[email protected]> wrote:
Hi,
What's the warning?
Matthew
On 20/09/13 14:48, Ricardo Saporta wrote:
I've encountered the following issue when iterating over a list
of data.tables.
The issue occurs only with mapply, not with lapply.
Given a list of data.tables, mapply'ing over the list directly
fails to modify them in place.
In addition, attempting to add a new column produces an "Invalid
.internal.selfref" warning.
Modifying an existing column does not issue a warning, but
still fails to modify in place.
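One way to check whether mapply hands the function the original data.table or a copy is to compare memory addresses using data.table's address(). This is only a diagnostic sketch on a small sample list; the variable names are made up, and the result may differ across R and data.table versions, so no output is shown.

```r
library(data.table)

list.DT <- list(
  DT1 = data.table(Col1 = 111:115, Col2 = 121:125),
  DT2 = data.table(Col1 = 211:215, Col2 = 221:225)
)

# Address of each data.table as seen when iterating with lapply-style code
addr.in.list <- vapply(list.DT, function(DT) address(DT), character(1))

# Address of each data.table as seen from inside the mapply'd function
addr.in.mapply <- mapply(function(DT) address(DT), list.DT)

# TRUE: both see the same object; FALSE: mapply received a different object
same.object <- unname(addr.in.list) == unname(addr.in.mapply)
same.object
```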
WORKAROUND:
----------
The workaround is to iterate over an index into the list and
modify each data.table via list.of.DTs[[i]][ .. ]
**Interestingly, this issue occurs with `mapply`, but not
`lapply`.**
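The index-based workaround can also be written as a plain for loop. The sketch below is a substitution for illustration (the report itself uses mapply over the index) and reuses the sample data from the example that follows.

```r
library(data.table)

list.DT <- list(
  DT1 = data.table(Col1 = 111:115, Col2 = 121:125),
  DT2 = data.table(Col1 = 211:215, Col2 = 221:225)
)
list.Col3 <- list(131:135, 231:235)
list.Col4 <- list(141:145, 241:245)

# Index into the list so that := acts on the element stored in list.DT
for (i in seq_along(list.DT)) {
  list.DT[[i]][, c("Col3", "Col4") := list(list.Col3[[i]], list.Col4[[i]])]
}

# Each data.table in the list should now carry Col3 and Col4
lapply(list.DT, names)
```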
EXAMPLE:
--------
# Given a list of DT's and two lists of vectors,
# we want to add the corresponding vectors as columns to
# the DT.
## ---------------- ##
## SAMPLE DATA: ##
## ---------------- ##
library(data.table)
# list of data.tables
list.DT <- list(
  DT1=data.table(Col1=111:115, Col2=121:125),
  DT2=data.table(Col1=211:215, Col2=221:225)
)
# lists of columns to add
list.Col3 <- list(131:135, 231:235)
list.Col4 <- list(141:145, 241:245)
## ------------------------------------ ##
## Iterating over the list elements ##
## adding a new column ##
## ------------------------------------ ##
## Will issue warning and ##
## will fail to modify in place ##
## ------------------------------------ ##
mapply(
  function(DT, C3, C4)
    DT[, c("Col3", "Col4") := list(C3, C4)],
  list.DT,    # iterating over the list
  list.Col3, list.Col4,
  SIMPLIFY=FALSE
)
## Note the lack of change
list.DT
## ------------------------------------ ##
## Iterating over an index ##
## ------------------------------------ ##
mapply(
  function(i, C3, C4)
    list.DT[[i]][, c("Col3", "Col4") := list(C3, C4)],
  seq_along(list.DT),    # iterating over an index to the list
  list.Col3, list.Col4,
  SIMPLIFY=FALSE
)
## Note each DT _has_ been modified
list.DT
## ------------------------------------ ##
## Iterating over the list elements ##
## modifying existing column ##
## ------------------------------------ ##
## No warning issued, but ##
## Will fail to modify in place ##
## ------------------------------------ ##
mapply(
  function(DT, C3, C4)
    DT[, c("Col3", "Col4") := list(Col3*1e3, Col4*1e4)],
  list.DT,    # iterating over the list
  list.Col3, list.Col4,
  SIMPLIFY=FALSE
)
## Note the lack of change (compare with the output printed by `mapply` above)
list.DT
## ------------------------------------ ##
## ##
## `lapply` works as expected. ##
## ##
## ------------------------------------ ##
## NOW WITH lapply
lapply(list.DT,
  function(DT)
    DT[, newCol := LETTERS[1:5]]
)
## Note the new column:
list.DT
# ========================== #
## NON-WORKAROUNDS ##
##
## I also tried all of the following alternatives
## in hopes of being able to iterate over the list
## directly, using `mapply`.
## None of these worked.
# (1) Creating the DTs first, then creating the list from them
DT1 <- data.table(Col1=111:115, Col2=121:125)
DT2 <- data.table(Col1=211:215, Col2=221:225)
list.DT <- list(DT1=DT1, DT2=DT2)
# (2) Same as 1, and using `copy()` in the call to `list()`
list.DT <- list(DT1=copy(DT1),
                DT2=copy(DT2))
# (3) lapply'ing `copy` and then iterating over that list
list.DT <- lapply(list.DT, copy)
# (4) Not naming the list elements
list.DT <- list(DT1, DT2)
# and tried
list.DT <- list(copy(DT1), copy(DT2))
## All of the above still failed to modify in place
## (and also issued the same warning if trying to add a
## column)
## when iterating using mapply
mapply(function(DT, C3, C4)
         DT[, c("Col3", "Col4") := list(C3, C4)],
       list.DT, list.Col3, list.Col4,
       SIMPLIFY=FALSE)
# ========================== #
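One further option, not among the attempts above, is to keep iterating over the list itself and assign the result of mapply back to list.DT. Since `:=` returns the data.table it acted on, the reassigned list should hold the updated tables whether or not a copy was taken along the way, although the warning may still be printed on affected versions. A minimal sketch with the same sample data:

```r
library(data.table)

list.DT <- list(
  DT1 = data.table(Col1 = 111:115, Col2 = 121:125),
  DT2 = data.table(Col1 = 211:215, Col2 = 221:225)
)
list.Col3 <- list(131:135, 231:235)
list.Col4 <- list(141:145, 241:245)

# Keep the tables returned by the function instead of relying on
# modification in place
list.DT <- mapply(
  function(DT, C3, C4) DT[, c("Col3", "Col4") := list(C3, C4)],
  list.DT, list.Col3, list.Col4,
  SIMPLIFY = FALSE
)

lapply(list.DT, names)
```

This gives up the by-reference update of the original list elements, so it is less attractive for very large tables than iterating over an index.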
Ricardo Saporta
Rutgers University, New Jersey
e: [email protected]
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help