On 6 October 2011 00:15, Matthew Dowle <[email protected]> wrote:
> Indeed. Or columns 11 and 12 of BED files (genomics). Near on the agenda
> is a fast file loader straight into data.table and list columns
> (dual-delimited files such as BED).
>
Is this a fast file loader for any files that could be read using
read.table, or just dual-delimited files? If you can make a way to load
things that is faster than read.table with the normal speed tweaks that
get mentioned for it, I'd be ecstatic.

> I don't believe SQL has an analogous concept to list columns? To achieve
> that people may be using comma delimited strings in varchar columns, I
> guess.
>
> On Wed, 2011-10-05 at 16:19 -0500, Branson Owen wrote:
>> Thank you very, very much Matthew. I think this is a very valuable (at
>> least to me), and unique feature for more powerful calculation. A very
>> useful application I can immediately think of is for options chains
>> and order book modeling. It's much easier to track and model the whole
>> option chains or order book for each time stamp or symbol, and also
>> save a lot of replicating time stamps and symbols.
>>
>> 2011/10/4 Matthew Dowle <[email protected]>
>> On Sun, 2011-10-02 at 15:14 +0800, Branson Owen wrote:
>>
>> > Oh, sorry, I was testing the syntax like:
>> >
>> > DT = data.table(A = 1:2, B = list('a', 2i))
>> >
>> > It didn't work, and I thought this feature has not been implemented.
>> > Thank you for pointing it out with a good example.
>>
>>
>> Natural to assume that should work. Now in 1.6.7 :
>>
>>   o data.table() now accepts list columns directly rather than
>>     needing to add list columns to an existing data.table;
>>     e.g.,
>>
>>       DT = data.table(x=1:3,y=list(4:6,3.14,matrix(1:12,3)))
>>
>>     Thanks to Branson Owen for reminding.
>>
>> Accordingly, one item has been added to FAQ 2.17 (differences
>> between data.frame and data.table) :
>>   "data.frame(list(1:2,"k",1:4)) creates 3 columns, data.table
>>   creates one list column"
>>
>> As before, list columns can be created via grouping; e.g.,
>>
>>   DT = data.table(x=c(1,1,2,2,2,3,3),y=1:7)
>>   DT2 = DT[,list(list(unique(y))),by=x]
>>   DT2
>>        x      V1
>>   [1,] 1    1, 2
>>   [2,] 2 3, 4, 5
>>   [3,] 3    6, 7
>>
>> and list columns can be grouped; e.g.,
>>
>>   DT2[,sum(unlist(V1)),by=list(x%%2)]
>>        x V1
>>   [1,] 1 16
>>   [2,] 0 12
>>
>
> _______________________________________________
> datatable-help mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
