Yes, the DelayedArray framework that handles HDF5Array, etc. seems like the right choice?
--t

On Fri, Feb 24, 2017 at 1:26 PM, Aaron Lun <a...@wehi.edu.au> wrote:

> Hi everyone,
>
> I just attended the Human Cell Atlas meeting in Stanford, and people were
> talking about gene expression matrices for >1 million cells. If we assume
> that we can get non-zero expression profiles for ~5000 genes, we’d be
> talking about a 5000 x 1 million matrix for the raw count data. This would
> be 20-40 GB in size, which would clearly benefit from sparse (via Matrix)
> or disk-backed representations (bigmatrix, BufferedMatrix, rhdf5, etc.).
>
> I’m wondering whether there is any appetite amongst us for making a
> consistent BioC API to handle these matrices, sort of like what
> BiocParallel does for multicore and snow. It goes without saying that the
> different matrix representations should have consistent functions at the R
> level (rbind/cbind, etc.) but it would also be nice to have an integrated
> C/C++ API (accessible via LinkingTo). There are many non-trivial things that
> can be done with this type of data, and it is often faster and more memory
> efficient to do these complex operations in compiled code.
>
> I was thinking of something where you could supply any supported matrix
> representation to a registered function via .Call; the C++ constructor
> would recognise the type of matrix during class instantiation; and
> operations (row/column/random read access, also possibly various ways of
> writing a matrix) would be overloaded and behave as required for the class.
> Only the implementation of the API would need to care about the nitty
> gritty of each representation, and we would all be free to write code that
> actually does the interesting analytical stuff.
>
> Anyway, just throwing some thoughts out there. Any comments appreciated.
>
> Cheers,
>
> Aaron
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
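
For what it's worth, the "one API, many representations" idea in the quoted message could be sketched roughly as below. This is only an illustration under stated assumptions, not the DelayedArray design or any existing Bioconductor API: the names AnyMatrix, DenseMatrix, SparseMatrix and row_sums are all hypothetical, the dense class assumes column-major storage (as in an ordinary R matrix), the sparse class assumes compressed-column storage (in the style of the Matrix package), and the step where a constructor would recognise the R object passed through .Call is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical interface: analysis code sees only this, never the storage.
class AnyMatrix {
public:
    virtual ~AnyMatrix() = default;
    virtual std::size_t nrow() const = 0;
    virtual std::size_t ncol() const = 0;
    // Copy column c into out as a dense vector of length nrow().
    virtual void get_col(std::size_t c, std::vector<double>& out) const = 0;
};

// Dense representation, column-major (assumed layout).
class DenseMatrix : public AnyMatrix {
    std::size_t nr_, nc_;
    std::vector<double> x_;
public:
    DenseMatrix(std::size_t nr, std::size_t nc, std::vector<double> x)
        : nr_(nr), nc_(nc), x_(std::move(x)) { assert(x_.size() == nr_ * nc_); }
    std::size_t nrow() const override { return nr_; }
    std::size_t ncol() const override { return nc_; }
    void get_col(std::size_t c, std::vector<double>& out) const override {
        assert(c < nc_);
        out.assign(x_.begin() + c * nr_, x_.begin() + (c + 1) * nr_);
    }
};

// Sparse representation, compressed-column: p = column pointers,
// i = row indices of the non-zeros, x = their values (assumed layout).
class SparseMatrix : public AnyMatrix {
    std::size_t nr_, nc_;
    std::vector<std::size_t> p_, i_;
    std::vector<double> x_;
public:
    SparseMatrix(std::size_t nr, std::size_t nc, std::vector<std::size_t> p,
                 std::vector<std::size_t> i, std::vector<double> x)
        : nr_(nr), nc_(nc), p_(std::move(p)), i_(std::move(i)), x_(std::move(x)) {
        assert(p_.size() == nc_ + 1 && i_.size() == x_.size());
    }
    std::size_t nrow() const override { return nr_; }
    std::size_t ncol() const override { return nc_; }
    void get_col(std::size_t c, std::vector<double>& out) const override {
        assert(c < nc_);
        out.assign(nr_, 0.0);  // start from zeros, then fill the non-zeros
        for (std::size_t k = p_[c]; k < p_[c + 1]; ++k) out[i_[k]] = x_[k];
    }
};

// "Interesting analytical stuff": written once against the interface.
std::vector<double> row_sums(const AnyMatrix& m) {
    std::vector<double> sums(m.nrow(), 0.0), col;
    for (std::size_t c = 0; c < m.ncol(); ++c) {
        m.get_col(c, col);
        for (std::size_t r = 0; r < m.nrow(); ++r) sums[r] += col[r];
    }
    return sums;
}
```

The point of the sketch is only that row_sums never needs to know which representation it was given; a disk-backed class would slot in the same way by implementing get_col with a read from file.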