Ha. Yes, we certainly don't hold back from making the messages as long and as helpful as possible. If the code knows, or can know, exactly what is wrong, it's a deliberate policy to put that information right there in the message. data.table is written by users; i.e. we wrote it for ourselves doing real jobs. I think that may be the root of that. If any messages could be more helpful, suggestions are very welcome.

Matt

On 12/02/14 17:58, John Laing wrote:
Thanks, Matt! With a slight amendment that works great:
for (x in c("foo", "bar", "qux")) set(fbq, which(is.na(fbq[[x]])), x, FALSE)

Which highlights an opportunity to say that I really appreciate the unusually helpful error messages in this package.

-John
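
For readers following along, here is a self-contained sketch of the amended loop, rebuilt on the table from the original question. The named arguments (i, j, value) are spelled out only for readability; the which() wrapper reflects the amendment above, since set() wants row numbers in i rather than a logical vector.

```r
library(data.table)

# Rebuild the merged table from the original question.
foo <- data.table(k = 1:4, foo = TRUE, key = "k")
bar <- data.table(k = 3:6, bar = TRUE, key = "k")
qux <- data.table(k = 5:8, qux = TRUE, key = "k")
fbq <- merge(merge(foo, bar, all = TRUE), qux, all = TRUE)

# For each column, find the row numbers holding NA and overwrite them
# with FALSE by reference (no copy of fbq is made).
for (x in c("foo", "bar", "qux")) {
  set(fbq, i = which(is.na(fbq[[x]])), j = x, value = FALSE)
}

print(fbq)
```

Because set() modifies fbq in place, take a copy() first if the original table with NAs is still needed.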


On Wed, Feb 12, 2014 at 12:44 PM, Matt Dowle wrote:


    Hi John,

    In examples like this I'd use set() and [[,  since it's a bit
    easier to write but memory efficient too.

    for (x in c("foo", "bar", "qux"))   set(fbq, is.na(fbq[[x]]), x, FALSE)           [untested]

    A downside here is one repetition of the "fbq" symbol, but I can
    live with that.  If you have a large number of columns (and I've
    been surprised just how many columns some people have!) then
    calling set() many times has lower overhead than DT[, :=]; see
    ?set.  Note also that [[ is base R, doesn't copy the column, and
    is often useful with data.table.

    Or, use get() in either i or j rather than eval().

    HTH, Matt
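
A minimal sketch of the get() suggestion above, on a small hypothetical table (the data values are made up for illustration). The eval() attempts in the question have no effect because eval("foo") simply returns the string "foo", so is.na() sees a non-missing string and selects no rows; get(x) instead looks up the column whose name is stored in x. The (x) := form on the left-hand side is assumed to be available, as it is in recent data.table releases.

```r
library(data.table)

# Hypothetical toy table; column names follow the original question.
fbq <- data.table(k   = 1:4,
                  foo = c(TRUE, NA, TRUE, NA),
                  bar = c(NA, NA, TRUE, TRUE))

for (x in c("foo", "bar")) {
  # get(x) in i fetches the column named by the string in x.
  # (x) on the left of := assigns to that column, not to a column called "x".
  fbq[is.na(get(x)), (x) := FALSE]
}

print(fbq)
```

This keeps the DT[i, j] style of the per-column code; for very many columns the set() loop has less overhead per call, as noted above.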



    On 12/02/14 17:24, John Laing wrote:
    Let's say I merge together several data.tables such that I wind up
    with lots of NAs:

    require(data.table)
    foo <- data.table(k=1:4, foo=TRUE, key="k")
    bar <- data.table(k=3:6, bar=TRUE, key="k")
    qux <- data.table(k=5:8, qux=TRUE, key="k")
    fbq <- merge(merge(foo, bar, all=TRUE), qux, all=TRUE)
    print(fbq)
    #    k  foo  bar  qux
    # 1: 1 TRUE   NA   NA
    # 2: 2 TRUE   NA   NA
    # 3: 3 TRUE TRUE   NA
    # 4: 4 TRUE TRUE   NA
    # 5: 5   NA TRUE TRUE
    # 6: 6   NA TRUE TRUE
    # 7: 7   NA   NA TRUE
    # 8: 8   NA   NA TRUE

    I want to go through those columns and turn each NA into FALSE. I can
    do this by writing code for each column:

    fbq.cp <- copy(fbq)
    fbq.cp[is.na(foo), foo:=FALSE]
    fbq.cp[is.na(bar), bar:=FALSE]
    fbq.cp[is.na(qux), qux:=FALSE]
    print(fbq.cp)
    #    k   foo   bar   qux
    # 1: 1  TRUE FALSE FALSE
    # 2: 2  TRUE FALSE FALSE
    # 3: 3  TRUE  TRUE FALSE
    # 4: 4  TRUE  TRUE FALSE
    # 5: 5 FALSE  TRUE  TRUE
    # 6: 6 FALSE  TRUE  TRUE
    # 7: 7 FALSE FALSE  TRUE
    # 8: 8 FALSE FALSE  TRUE

    But I can't figure out how to do it in a loop. More precisely, I can't
    figure out how to make the [ operator evaluate its first argument in
    the context of the data.table. All of these have no effect:
    for (x in c("foo", "bar", "qux")) fbq[is.na(x), eval(x):=FALSE]
    for (x in c("foo", "bar", "qux")) fbq[is.na(eval(x)), eval(x):=FALSE]
    for (x in c("foo", "bar", "qux")) fbq[eval(is.na(x)), eval(x):=FALSE]

    I'm running R 3.0.2 on Linux, data.table 1.8.10.

    Thanks in advance,
    John


    _______________________________________________
    datatable-help mailing list
    [email protected]
    https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help


