Hi,
On Wed, Mar 12, 2014 at 3:59 AM, Todd A. Johnson <[email protected]>
wrote:
I am using data.table Version 1.9.2 with R 3.0.2 on Mac OS 10.6.8.
I've looked through 6 months worth of the mailing list as well as the Bug
reports and of course the FAQ vignette. However, while my question seems
related to FAQ 2.21, that answer seems to say that returning DT when
assigning DT[i,col:=value] was made invisible in v1.8.3.
My question comes from observing different behavior for assignment by
reference to a column when a data.table DT is surrounded by braces compared
to without braces (such as within an if..else statement).
Here's a simple test program:
library(data.table)
DT <- data.table(a=c(1,2,3), b=c(4,5,6))
DT[,d:=7]
DT <- data.table(a=c(1,2,3), b=c(4,5,6))
if( nrow(DT)>0 ){DT[,d:=7]}
I can reproduce what you're seeing, but I don't think it has anything
to do with DT being surrounded by {}. A simple:
if (nrow(DT) > 0) DT[, d := 7]
will trigger a dump to the console as well.
So, should the second assignment within the 'if' statement print out DT?
I don't think it should. Note that if the := isn't the last clause in
the expression block, nothing is printed, e.g. this will be silent:
if (nrow(DT) > 0) {
  DT[, d := 7]
  x <- 1
}
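If you'd rather not add a dummy statement, another option (my suggestion, not something from the FAQ) is to wrap the assignment in base R's invisible(). A minimal sketch, using withVisible() only to check whether the top level would auto-print the result:

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# withVisible() reports whether the result would be auto-printed
# at the top level; invisible() switches that flag off.
res <- withVisible(if (nrow(DT) > 0) invisible(DT[, d := 7]))

res$visible          # FALSE, so nothing is auto-printed
"d" %in% names(DT)   # TRUE, the column was still added by reference
```

In a script you would just write `if (nrow(DT) > 0) invisible(DT[, d := 7])` and drop the withVisible() check.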
To get rid of this effect in my scripts (which potentially could result in
printing out tens-of-thousands of rows of data into a log file...),
That wouldn't happen: data.table "dumps" are always trimmed if they
are too long (this is configured by the 'datatable.print.nrows' and
'datatable.print.topn' options).
By default, if the data.table has more than 100 rows, only the
top 5 and bottom 5 rows are printed.
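To see the trimming for yourself, here is a small sketch. The table `big` and its columns are made up for illustration, and the exact number of printed lines may differ between data.table versions, so I don't rely on a specific count:

```r
library(data.table)

# Hypothetical table, well over the print threshold.
big <- data.table(id = 1:1000, value = rnorm(1000))

getOption("datatable.print.nrows")  # the row threshold (100 unless changed)
getOption("datatable.print.topn")   # rows shown at each end (5 unless changed)

# Capture what print() would write instead of flooding the console.
shown <- capture.output(print(big))
length(shown)  # a handful of lines, nowhere near 1000

# 'topn' can also be passed per call.
print(big, topn = 3)
```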
In fact, as a workaround for you, if you set:
options(datatable.print.nrows=0)
Your "problem" will now go away, meaning:
if (nrow(DT) > 0) DT[, d := 7]
will be silent.
But so will all of your data.table "console dumps". Which is to say,
just typing `DT` would not print anything to the console. You'd now
have to explicitly pass the 'nrows' argument in a call to `print` to see
your data.table, e.g. `print(DT, nrows=100)`, so you could explore the
data.table on the console.
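If you only want that silence around one part of a script, you can lean on the fact that base R's options() returns the previous values, so the setting can be put back afterwards. A rough sketch, assuming the option behaves as described above in your data.table version:

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# options() returns the old values invisibly; keep them for later.
old <- options(datatable.print.nrows = 0)

if (nrow(DT) > 0) DT[, d := 7]

# Restore whatever was set before, so later prints work as usual.
options(old)

print(DT)
```

Inside a function, `on.exit(options(old))` right after the options() call does the same restore even if an error occurs.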
There are people who say you should never dump a data.table or
data.frame to the console, but rather look at str(dt) ... not sure
that I agree with that, but that is another thing to consider if you
hammer datatable.print.nrows to 0.
HTH,
-steve