Does this sentence from the warning help?

" Also, in R<v3.1.0, list(DT1,DT2) copied the entire DT1 and DT2 (R's list() used to copy named objects); please upgrade to R>=v3.1.0 if that is biting. "
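
One rough way to see whether list() is copying on a given R installation is to compare addresses before and after building the list. This is only a sketch: it assumes data.table's address() helper, and the names DT1, DT2, L, same1 and same2 are made up for illustration.

```r
library(data.table)

DT1 <- data.table(Col1 = 111:115, Col2 = 121:125)
DT2 <- data.table(Col1 = 211:215, Col2 = 221:225)

# Put the existing tables into a list.
L <- list(DT1, DT2)

# TRUE means the list element is the same object as the original (no copy);
# FALSE means list() copied it, which is what the warning describes for R < 3.1.0.
same1 <- identical(address(L[[1]]), address(DT1))
same2 <- identical(address(L[[2]]), address(DT2))
print(c(same1, same2))
```

If either value is FALSE, := on the list element is working on a copy rather than on the original table.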

Matthew

On 20/09/13 19:01, Ricardo Saporta wrote:
One warning per DT in the list
  (I added the line breaks)
-Rick
=============================================
Warning messages:

1: In `[.data.table`(DT, , `:=`(c("Col3", "Col4"), list(C3, C4))) :

Invalid .internal.selfref detected and fixed by taking a copy of the whole table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or been created manually using structure() or similar). Avoid key<-, names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. Also, in R<v3.1.0, list(DT1,DT2) copied the entire DT1 and DT2 (R's list() used to copy named objects); please upgrade to R>=v3.1.0 if that is biting. If this message doesn't help, please report to datatable-help so the root cause can be fixed.

2: In `[.data.table`(DT, , `:=`(c("Col3", "Col4"), list(C3, C4))) :

Invalid .internal.selfref detected and fixed by taking a copy of the whole table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or been created manually using structure() or similar). Avoid key<-, names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. Also, in R<v3.1.0, list(DT1,DT2) copied the entire DT1 and DT2 (R's list() used to copy named objects); please upgrade to R>=v3.1.0 if that is biting. If this message doesn't help, please report to datatable-help so the root cause can be fixed.
=============================================
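
For reference, the set* functions that the warning points to can be tried as below. This is a minimal sketch: DT, its column names and the "note" attribute are invented for illustration, and address() is used only to check that the table itself is not reallocated.

```r
library(data.table)

DT <- data.table(Col1 = 111:115, Col2 = 121:125)
before <- address(DT)

setnames(DT, "Col1", "First")             # instead of names(DT)[1] <- "First"
setattr(DT, "note", "illustration only")  # instead of attr(DT, "note") <- ...
set(DT, j = "Col3", value = 131:135)      # add a column by reference

after <- address(DT)
print(identical(before, after))  # compares the table's address before and after
print(names(DT))
```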




On Fri, Sep 20, 2013 at 12:49 PM, Matthew Dowle wrote:


    Hi,

    What's the warning?

    Matthew



    On 20/09/13 14:48, Ricardo Saporta wrote:
    I've encountered the following issue when iterating over a list of
    data.tables.
    The issue occurs only with mapply, not with lapply.

    Given a list of data.tables, a function mapply'd over the list
    directly cannot modify its elements in place.

    Also, attempting to add a new column produces an "Invalid
    .internal.selfref" warning.
    Modifying an existing column does not issue a warning, but still
    fails to modify in place.

    WORKAROUND:
    ----------
    The workaround is to iterate over an index to the list, then to
      modify each data.table via list.of.DTs[[i]][ .. ]

    **Interestingly, this issue occurs with `mapply`, but not `lapply`.**

    EXAMPLE:
    --------
      # Given a list of DT's and two lists of vectors,
      #   we want to add the corresponding vectors as columns to the DT.

    ## ---------------- ##
    ##   SAMPLE DATA:   ##
    ## ---------------- ##
      # list of data.tables
      list.DT <- list(
        DT1=data.table(Col1=111:115, Col2=121:125),
        DT2=data.table(Col1=211:215, Col2=221:225)
        )

      # lists of columns to add
      list.Col3 <- list(131:135, 231:235)
      list.Col4 <- list(141:145, 241:245)


    ## ------------------------------------ ##
    ##   Iterating over the list elements ##
    ##     adding a new column  ##
    ## ------------------------------------ ##
    ##   Will issue warning and ##
    ##     will fail to modify in place ##
    ## ------------------------------------ ##
      mapply (
          function(DT, C3, C4)
             DT[, c("Col3", "Col4") := list(C3, C4)],
          list.DT,  # iterating over the list
          list.Col3, list.Col4,
          SIMPLIFY=FALSE
        )

      ## Note the lack of change
      list.DT


    ## ------------------------------------ ##
    ##   Iterating over an index  ##
    ## ------------------------------------ ##
      mapply (
          function(i, C3, C4)
             list.DT[[i]] [, c("Col3", "Col4") := list(C3, C4)],
          seq_along(list.DT),   # iterating over an index to the list
          list.Col3, list.Col4,
          SIMPLIFY=FALSE
        )

      ## Note each DT _has_ been modified
      list.DT

    ## ------------------------------------ ##
    ##   Iterating over the list elements ##
    ##     modifying existing column  ##
    ## ------------------------------------ ##
    ##   No warning issued, but ##
    ##     Will fail to modify in place ##
    ## ------------------------------------ ##
      mapply (
          function(DT, C3, C4)
             DT[, c("Col3", "Col4") := list(Col3*1e3, Col4*1e4)],

          list.DT,  # iterating over the list
          list.Col3, list.Col4,
          SIMPLIFY=FALSE
        )

      ## Note the lack of change (compare with output from `mapply`)
      list.DT

    ## ------------------------------------ ##
    ##  ##
    ##   `lapply` works as expected.  ##
    ##  ##
    ## ------------------------------------ ##
      ## NOW WITH lapply
      lapply(list.DT,
        function(DT)
          DT[, newCol := LETTERS[1:5]]
      )

      ## Note the new column:
      list.DT



    # ========================== #

    ##   NON-WORKAROUNDS   ##
    ##
    ## I also tried all of the following alternatives
    ##   in hopes of being able to iterate over the list
    ##   directly, using `mapply`.
    ## None of these worked.

    # (1) Creating the DTs first, then creating the list from them
        DT1 <- data.table(Col1=111:115, Col2=121:125)
        DT2 <- data.table(Col1=211:215, Col2=221:225)

        list.DT <- list(DT1=DT1,DT2=DT2 )


    # (2) Same as 1, and using `copy()` in the call to `list()`
        list.DT <- list(DT1=copy(DT1),
                        DT2=copy(DT2) )

    # (3) lapply'ing `copy` and then iterating over that list
        list.DT <- lapply(list.DT, copy)

    # (4) Not naming the list elements
        list.DT <- list(DT1, DT2)
        # and tried
        list.DT <- list(copy(DT1), copy(DT2))

    ## All of the above still failed to modify in place
    ##   (and also issued the same warning if trying to add a column)
    ##    when iterating using mapply

      mapply(function(DT, C3, C4)
        DT[, c("Col3", "Col4") := list(C3, C4)],
        list.DT, list.Col3, list.Col4,
        SIMPLIFY=FALSE)


    # ========================== #


    Ricardo Saporta
    Rutgers University, New Jersey



    _______________________________________________
    datatable-help mailing list
    [email protected]
    https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help



