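Oh, and since you're calling := or set() in a loop, setting options(warn=-1) once before the loop (and restoring the old value afterwards) is probably faster than repeated calls to suppressWarnings(). Note it has to be a negative value: warn=0 is the default and only defers warnings.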
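
Roughly what I mean, as a base R sketch. noisy_add() and add_quietly() are made-up names; noisy_add() just stands in for your := or set() call and warns on every iteration. A negative value of warn is what makes R ignore warnings (0 is the default and only defers them), and on.exit() puts the caller's setting back.

```r
# Hypothetical stand-in for DT[, newcol := ...] or set(): warns on every call.
noisy_add <- function(x, i) {
  warning("tl is greater than 1000 items over-allocated")
  x[[paste0("col", i)]] <- NA
  x
}

# Hypothetical wrapper: silence warnings once around the whole loop,
# instead of calling suppressWarnings() on each iteration.
add_quietly <- function(x, n) {
  old <- options(warn = -1)   # negative value: warnings are ignored
  on.exit(options(old))       # restore the previous setting, even on error
  for (i in seq_len(n)) x <- noisy_add(x, i)
  x
}

res <- add_quietly(list(id = 1L), 5)
length(res)   # 6: the original element plus 5 added ones
```

Whether this is measurably faster than suppressWarnings() in your loop I haven't timed, so treat that part as a guess.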
>
> :)
>
> When the column allocation is full, there's a formula to decide how much
> to grow the allocation by. The check is there (iirc) to make sure that's
> not growing the table too much. If you have 1 million columns, you
> probably don't want to double that to 2 million, just to add 1 column. But
> if you do, then use alloc.col first. That was the thinking. But that
> thinking is biting in your case.
>
> Simplest might be to downgrade the warning to a message when verbosity is
> on, then.
>
> In the meantime, does wrapping with suppressWarnings() work around it for
> now? Since in your case you know that over-allocating by more than 1000 is
> appropriate.
>
> suppressWarnings(DT[,newcol:=])
>
> Thanks for reporting. Interesting use case.
>
> Matthew
>
>> I'm running into this "truelength is greater than 1000 items
>> over-allocated" warning/error as I use := to add columns to a
>> data.frame, e.g.:
>>
>> tl (1346) is greater than 1000 items over-allocated (ncol = 308). If you
>> didn't set the datatable.alloccol option very large, please report this
>> to datatable-help including the result of sessionInfo().
>>
>> The long preamble to this is a stackoverflow thread
>> (http://stackoverflow.com/questions/10015544) in which I needed to
>> update the contents of one data.table with the contents of another.
>>
>> The solution required the columns of both data.tables to match, hence my
>> pre-processing loop to add columns to each data.table to satisfy the
>> identical(names(dt1),names(dt2)) criteria. I may have to re-architect
>> this depending on what is going on with this allocation business.
>>
>> If, for example, dt1 has 200 columns, and dt2 has 2000, and together
>> they have 2100 unique columns, I'm going to add 1900 columns to dt1. If
>> I set alloc.col to 2100 before my column-adding loop, I'll get slapped
>> because 2100 is more than 1000 greater than the 200 columns present in
>> dt1.
>>
>> So do I need to spoon-feed alloc.col? Every iteration through the loop
>> set it to length(dt1)+1 before adding a column? That seems rather
>> brutal. Alternatively checking for the delta between truelength and
>> length, and how close that is to the magic 1000 number, and then only
>> adjusting the setting seems fragile.
>>
>> I did try to make sense of the help for alloc.col. Regarding the bit
>> about "if two or more variables are bound to the same data.table"; the
>> column addition is within a function, and only one variable references
>> the data.table, at least in the scope of the function. The function
>> calling that function has a variable for the data.table too, so I don't
>> know if that counts. Then there is mention of using copy (not sure how
>> that helps, and BTW the hyperlink for copy goes to the page for setkey,
>> which does mention copy, but suggests "See ?copy" which just conjures up
>> the setkey page again), setting alloc.col, or changing
>> datatable.alloccol (doesn't seem to help).
>>
>> The warning asked for sessionInfo; FWIW, here it is:
>>
>> R version 2.15.0 (2012-03-30)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=C                 LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods
>> [7] base
>>
>> other attached packages:
>> [1] data.table_1.8.2
>>
>> Thanks
>> George
>>
>> _______________________________________________
>> datatable-help mailing list
>> [email protected]
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
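
For reference, the arithmetic behind the warning text quoted above, as a plain R sketch. over_allocated() is a made-up helper, not data.table's internal code; the limit of 1000 and the strict "greater than" comparison are read off the warning message, so treat both as assumptions about the real check.

```r
# Hypothetical helper mirroring the wording of the warning:
# "tl (1346) is greater than 1000 items over-allocated (ncol = 308)".
over_allocated <- function(tl, ncol, limit = 1000) {
  (tl - ncol) > limit
}

over_allocated(1346, 308)   # TRUE:  1346 - 308 = 1038 spare slots
over_allocated(2100, 200)   # TRUE:  the 200 vs 2100 column scenario, 1900 spare
over_allocated(1200, 200)   # FALSE: exactly 1000 spare is not "greater than"
```

So allocating for all 2100 columns up front on a 200-column table lands well past the limit, which matches what George describes.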
