The data.table FAQ 1.11 states:

"When you write x[y,foo*boo], data.table automatically inspects the j
expression to see which columns it uses.

It will only subset, or group, those columns only. Memory is only created
for the columns the j uses.

Let’s say foo is in x, and boo is in y (along with 20 other columns in y).

Isn’t x[y,foo*boo] quicker to program and quicker to run than a merge step
followed by another subset step ?"


Contrary to what it says above, I get an error when I try to access a
y-column in the "j" argument of x[y,j].

See the sequence of code below.


> x <- data.table( foo = c(1,1,1,2,2,3), a = 1:6, key = 'foo')
> y <- data.table( foo = c(1,2), boo = 10:11, key = 'foo')

# the below works as expected
> x[y]
     foo a
[1,]   1 1
[2,]   2 4

> with( merge(x,y), foo*boo)
[1] 10 10 10 22 22

# I want to achieve the same result as the above using the
# syntactically more compact (and faster?) code below:


> x[y, foo * boo ]
Error in eval(expr, envir, enclos) : object 'boo' not found


So is the FAQ just wrong, or am I misunderstanding something?
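
To be explicit about the result I am after, here is a minimal sketch in base R only (plain data frames, no data.table), so it does not depend on how any particular data.table version evaluates j. The names x_df, y_df, idx, matched and result are made up for this illustration; this is not how data.table does the join internally, just a lookup-and-multiply that should give the same numbers as the merge() call above.

```r
# Plain data frames mirroring x and y above (base R only, no data.table).
x_df <- data.frame(foo = c(1, 1, 1, 2, 2, 3), a = 1:6)
y_df <- data.frame(foo = c(1, 2), boo = 10:11)

# For each row of x_df, find the row of y_df with the same foo.
# match() returns NA where there is no match (here, foo == 3).
idx <- match(x_df$foo, y_df$foo)

# Keep only the matched rows (an inner join), then multiply foo
# by the boo value looked up from y_df.
matched <- !is.na(idx)
result <- x_df$foo[matched] * y_df$boo[idx[matched]]

print(result)
```

Only foo and boo are touched here, which is what I understood the FAQ to be promising for x[y, foo*boo].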
