Hi everyone, I use R to create tools for the analysis of microarrays (http://www.bioconductor.org).
I'm in a situation where I need to handle 6 matrices with 1M rows and about 2K columns, i.e. each matrix takes roughly 15 GB of RAM. The procedure I'm working on can be divided in two parts: 1) I read an input file, from which I compute one column for each of the matrices mentioned above; 2) after the matrices are ready, all the computations I do can be performed in batches of rows (say 10K rows at a time), so there's no need to have all the matrices in memory at the same time.

My (very naive) idea was to use SQLite to store these matrices (via the RSQLite package). So I have the following:

CREATE TABLE alleleA (sample1 REAL, sample2 REAL, <all the way to> sample2000 REAL);

When I have the data for sample1, I use an INSERT statement, which takes about 4 seconds. For all the other columns, I use an UPDATE statement, which has been running for hours (more than 8 now).

What are the obvious things I'm missing here? Or do you have any other suggestions for improving the performance? A simplified sketch of what I'm doing is included below.

Thank you very much,
b
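For reference, here is a stripped-down R sketch of the approach described above. It is not my actual code: it uses only 3 sample columns instead of 2000, the data vectors are just placeholders for the values I actually compute from the input file, the database file name is made up, and the UPDATE is keyed on SQLite's implicit rowid purely for illustration.

library(RSQLite)   # loads DBI as well

con <- dbConnect(SQLite(), dbname = "alleleA.db")  # hypothetical file name

## One REAL column per sample (only 3 here; the real table has ~2000)
dbExecute(con, "CREATE TABLE alleleA (sample1 REAL, sample2 REAL, sample3 REAL)")

## First column: a plain INSERT, one value per row (this is the fast step, ~4 s)
## 'values1' stands in for the ~1M values computed from the input file
values1 <- rnorm(1e6)
dbExecute(con, "INSERT INTO alleleA (sample1) VALUES (?)",
          params = list(values1))

## Every other column: an UPDATE per column, one row at a time
## (this is the step that is taking hours)
values2 <- rnorm(1e6)
dbExecute(con, "UPDATE alleleA SET sample2 = ? WHERE rowid = ?",
          params = list(values2, seq_along(values2)))

dbDisconnect(con)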