Interesting. This happens because the result of DT[,d:=7] is DT itself. That's so that compound statements can work, e.g.

    DT[is.na(d),d:=0][,sum(a),by=d]
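
To make that chain concrete, here is a small self-contained sketch. The toy data (columns a and d, and the values in them) is made up for illustration and is not from the original report:

```r
library(data.table)

# Toy data (assumed): d has missing values that we want to treat as 0
DT <- data.table(a = 1:4, d = c(1, NA, 2, NA))

# First [] updates d by reference and returns DT,
# so the second [] can aggregate on the updated column straight away
res <- DT[is.na(d), d := 0][, sum(a), by = d]
```

The second set of brackets only works because the first returns the (updated) table rather than NULL.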

If DT[,d:=7] is the last line of a function or the last line inside braces, then R prints the result. It's not DT[,:=] printing, per se. You don't have to 'wrap' it with invisible(); as another option, it's quite common for the last line of a function to be invisible() on its own with no arguments.
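
A minimal sketch of the invisible() option, assuming a hypothetical helper called add_d (the name is just for illustration). withVisible() is base R and reports whether a value would be auto-printed:

```r
library(data.table)

# Hypothetical helper: adds column d by reference, then returns nothing visible
add_d <- function(DT) {
  DT[, d := 7]
  invisible()   # bare invisible() as the last line; returns NULL invisibly
}

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))
res <- withVisible(add_d(DT))   # res$visible is FALSE, so nothing would print
```

DT still gets the new column, since := updates it by reference; only the return value of the function changes.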

I'll take a look to see if we can trap DT[,:=] printing when it's the return value. Could you file an item on the tracker, please? It's a new one, so I haven't considered it before.

Matt

On 14/03/14 11:48, Todd A. Johnson wrote:
Hi Steve,

Thanks for your thorough answer.  I suppose that my problem was that for
some iterations of my script, the last update of DT was to a DT with just
under 100 rows, so the not-so-silent column update then printed those rows
into my log file, making the size of certain log files very different from
the others.   Setting options(datatable.print.nrows=0) at the top of my
script seems like a more elegant way than finding the last DT[,d:=7] update
in a script and surrounding it with 'invisible'. :-)


Todd


On 3/13/14 2:47 AM, "Steve Lianoglou" <[email protected]> wrote:

Hi,

On Wed, Mar 12, 2014 at 3:59 AM, Todd A. Johnson <[email protected]>
wrote:
I am using data.table Version 1.9.2 with R 3.0.2 on Mac OS 10.6.8.

I've looked through 6 months worth of the mailing list as well as the Bug
reports and of course the FAQ vignette.  However, while my question seems
related to FAQ 2.21, that answer seems to say that returning DT when
assigning DT[i,col:=value] was made invisible in v1.8.3.

My question comes from observing different behavior for assignment by
reference to a column when a data.table DT is surrounded by braces compared
to without braces (such as within an if..else statement).

Here's a simple test program:

library(data.table)
DT <- data.table(a=c(1,2,3), b=c(4,5,6))
DT[,d:=7]

DT <- data.table(a=c(1,2,3), b=c(4,5,6))
if( nrow(DT)>0 ){DT[,d:=7]}

I can reproduce what you're seeing, but I don't think it has anything
to do with DT being surrounded by {}, a simple:

     if (nrow(DT) > 0) DT[, d := 7]

will trigger a dump to the console as well.

So, should the second assignment within the 'if' statement print out DT?

I don't think it should. Note that if the := isn't the last clause in
the expression block, nothing is printed, e.g. this will be silent:

     if (nrow(DT) > 0) {
       DT[, d := 7]
       x <- 1
     }

To get rid of this effect in my scripts (which potentially could result in
printing out tens-of-thousands of rows of data into a log file...),

That wouldn't happen: data.table "dumps" are always trimmed if they
are too long (this is configured by the 'datatable.print.nrows' and
'datatable.print.topn' options).

By default, if the data.table has more than 100 rows, only the top 5
and bottom 5 rows are printed.

In fact, as a workaround for you, if you set:

     options(datatable.print.nrows=0)

Your "problem" will now go away, meaning:

     if (nrow(DT) > 0) DT[, d := 7]

will be silent.

But so will all of your data.table "console dumps". Which is to say,
just typing `DT` would not print anything to the console. You'd now
have to explicitly set the 'nrows' option in a call to `print` to see
your data.table, eg: `print(DT, nrows=100)` so you could explore the
data.table on the console.
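
Putting those pieces together, a rough sketch of the workaround for a script. Saving and restoring the old option value is my own addition here, not something required by data.table:

```r
library(data.table)

DT <- data.table(a = c(1, 2, 3), b = c(4, 5, 6))

# Turn off data.table console dumps, remembering the previous setting
old <- options(datatable.print.nrows = 0)

if (nrow(DT) > 0) DT[, e := 1]   # no dump into the log with the option at 0

print(DT, nrows = 100)           # explicit nrows overrides the option

options(old)                     # put the previous setting back
```

Restoring the option at the end keeps the change local to the script, so interactive sessions that source it still get normal printing afterwards.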

There are people who say you should never dump a data.table or
data.frame to the console, but rather look at str(dt) ... not sure
that I agree with that, but that is another thing to consider if you
hammer datatable.print.nrows to 0.

HTH,
-steve

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help

