You are probably already aware of everything I describe below; I just want to add a few suggestions.
My understanding was that a data.frame is a list of column vectors, and so is a data.table. What I just learned is that a data.frame column can now apparently be something else too, such as another data.frame or a list:

> DF = data.frame(A = 1:3, B = rnorm(3))
> DF$C = data.frame(a=1:3,b=rnorm(3))
> DF$D = list(i=1:6, j = 1,k="?")
> print(DF)
  A         B C.a        C.b                D
1 1 -0.949565   1 -0.5815717 1, 2, 3, 4, 5, 6
2 2 -1.903233   2 -0.5087712                1
3 3  1.559566   3  1.4596933                ?
> class(DF$C)
[1] "data.frame"
> class(DF$D)
[1] "list"

This is very cool to me! I can think of many benefits from this feature. A very common example: D is a function of B with variable output size, and I want to do fast grouping or sorting based on key A. Before I knew about this, I would have had to save them as separate objects, which adds complexity to my code.

In data.frame this only adds coding and management sugar; there are no performance benefits yet. But I think data.table can make a difference here, just as it does for data.frame. There is no sorted / indexed list object yet, right? If my variable-size outputs have millions of elements, any aggregating operation on a less structured object like this will be painful. In principle, data.table could make it a sorted list, so that it benefits from data.table's high performance and syntax.

I ran some tests using a data.table as a "data.list", but most of the syntax that works for data.frame does not work for data.table. I would expect this to be an easy feature to add, since data.frame already supports it fairly smoothly.

Just a suggestion. *^^*

Best regards,

_______________________________________________
datatable-help mailing list
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
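
To make the use case above concrete (a variable-length result D computed from B and kept next to key A), here is a minimal base-R sketch. It uses only data.frame, not data.table, and `var_out` is a hypothetical function made up for illustration; the values of A and B are also invented.

```r
# Illustration only: A is the key, B drives the size of the result.
DF <- data.frame(A = c(3L, 1L, 2L), B = c(2, 1, 3))

# Hypothetical function of B whose output length varies with its input.
var_out <- function(b) seq_len(b)

# lapply() returns a list with one element per row, so it can be
# stored as a list column, the same way DF$D is assigned above.
DF$D <- lapply(DF$B, var_out)

# Sort by key A; the list column is reordered together with the rows.
DF <- DF[order(DF$A), ]

lens <- lengths(DF$D)       # size of each variable-length result: 1 3 2
sums <- sapply(DF$D, sum)   # a per-row aggregate over the list column: 1 6 3
print(lens)
print(sums)
```

Everything here goes through plain lists and `order()`, so nothing is indexed; that is the part I would hope a keyed data.table could speed up.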
