Hi Steve,

Thanks for your thorough answer. I suppose my problem was that, for some
iterations of my script, the last update of DT was to a DT with just under
100 rows, so the not-so-silent column update printed those rows into my log
file, making certain log files very different in size from the others.
Setting options(datatable.print.nrows=0) at the top of my script seems more
elegant than finding the last DT[,d:=7] update in a script and wrapping it
in 'invisible'. :-)

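A minimal sketch of the two workarounds discussed in this thread, for
comparison. The column `e` and the name `old_opts` are made up for
illustration, and the printing behavior is as reported here for data.table
1.9.2; other versions may behave differently.

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# Workaround 1: wrap the last := in invisible(), so the value of the
# if-expression is not auto-printed at top level.
if (nrow(DT) > 0) invisible(DT[, d := 7])

# Workaround 2: switch off data.table auto-printing for the script.
# options() returns the previous values, so they can be restored later.
old_opts <- options(datatable.print.nrows = 0)
if (nrow(DT) > 0) DT[, e := 8]

# While the option is 0, an explicit nrows argument is needed to see DT.
print(DT, nrows = 100)

# Restore the earlier print settings.
options(old_opts)
```

The first form only silences one statement; the second affects every
data.table print until the option is restored.
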
Todd

On 3/13/14 2:47 AM, "Steve Lianoglou" <[email protected]> wrote:

> Hi,
>
> On Wed, Mar 12, 2014 at 3:59 AM, Todd A. Johnson <[email protected]>
> wrote:
>> I am using data.table Version 1.9.2 with R 3.0.2 on Mac OS 10.6.8.
>>
>> I've looked through 6 months worth of the mailing list as well as the Bug
>> reports and of course the FAQ vignette. However, while my question seems
>> related to FAQ 2.21, that answer seems to say that returning DT when
>> assigning DT[i,col:=value] was made invisible in v1.8.3.
>>
>> My question comes from observing different behavior for assignment by
>> reference to a column when a data.table DT is surrounded by braces compared
>> to without braces (such as within an if..else statement).
>>
>> Here's a simple test program:
>>
>> library(data.table)
>> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
>> DT[,d:=7]
>>
>> DT <- data.table(a=c(1,2,3), b=c(4,5,6))
>> if( nrow(DT)>0 ){DT[,d:=7]}
>
> I can reproduce what you're seeing, but I don't think it has anything
> to do with DT being surrounded by {}, a simple:
>
>   if (nrow(DT) > 0) DT[, d := 7]
>
> will trigger a dump to the console as well
>
>> So, should the second assignment within the 'if' statement print out DT?
>
> I don't think it should. Note that if the := isn't the last clause in
> the expression block, nothing is printed, eg. this will be silent:
>
>   if (nrow(DT) > 0) {
>     DT[, d := 7]
>     x <- 1
>   }
>
>> To
>> get rid of this effect in my scripts (which potentially could result in
>> printing out tens-of-thousands of rows of data into a log file...),
>
> That wouldn't happen, data.table "dumps" are always trimmed if they
> are too long (this is configured by the 'datatable.print.nrows' and
> 'datatable.print.topn' options).
>
> By default, if the data.table is > 100 rows, you will only print the
> top 5 and bottom 5 rows.
>
> In fact, as a workaround for you, if you set:
>
>   options(datatable.print.nrows=0)
>
> Your "problem" will now go away, meaning:
>
>   if (nrow(DT) > 0) DT[, d := 7]
>
> will be silent
>
> But so will all of your data.table "console dumps". Which is to say,
> just typing `DT` would not print anything to the console. You'd now
> have to explicitly set the 'nrows' option in a call to `print` to see
> your data.table, eg: `print(DT, nrows=100)` so you could explore the
> data.table on the console.
>
> There are people who say you should never dump a data.table or
> data.frame to the console, but rather look at str(dt) ... not sure
> that I agree with that, but that is another thing to consider if you
> hammer datatable.print.nrows to 0.
>
> HTH,
> -steve

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
