Hi Everyone,

I use R to create tools for analysis of microarrays
(http://www.bioconductor.org).

I'm in a situation where I need to handle 6 matrices with 1M rows and
about 2K columns, i.e. each matrix takes roughly 15GB of RAM.
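(That figure is just the size of a dense double-precision matrix; a
quick sanity check in R:

  1e6 * 2000 * 8 / 2^30
  # ~ 14.9, i.e. about 15GB of doubles per matrix
)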

The procedure I'm working on can be divided in two parts:

1) I read an input file, from which I compute one column for each of
the matrices I mentioned above;

2) After the matrices are ready, all the computations I do can be
performed in batches of rows (say 10K rows at a time), so there's no
need to have all the matrices in memory at the same time.

My (very naive) idea was to use SQLite to store these matrices (via
RSQLite package). So I have the following:

CREATE TABLE alleleA (sample1 REAL, sample2 REAL <all the way to>
sample2000 REAL);
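In R terms, I build the table roughly like this (a minimal sketch;
the connection object and database file name are just placeholders):

  library(RSQLite)
  con <- dbConnect(SQLite(), dbname = "matrices.db")   # placeholder file name
  # build "sample1 REAL, sample2 REAL, ..., sample2000 REAL"
  cols <- paste0("sample", 1:2000, " REAL", collapse = ", ")
  dbGetQuery(con, paste0("CREATE TABLE alleleA (", cols, ")"))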

When I have the data for sample1, I use an INSERT statement, which
takes about 4 secs.

For all the other columns, I use an UPDATE statement, which is taking
hours (more than 8 now).
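To make that concrete, here is a simplified sketch of the pattern
(variable names like col1/col2 are placeholders, and I'm matching rows
by rowid):

  # first column: one INSERT per row, inside a transaction (this part is fast)
  dbBegin(con)
  for (v in col1)                                   # col1: numeric vector, one value per row
      dbGetQuery(con, sprintf("INSERT INTO alleleA (sample1) VALUES (%f)", v))
  dbCommit(con)

  # every later column: one UPDATE per row -- this is the part taking hours
  dbBegin(con)
  for (i in seq_along(col2))
      dbGetQuery(con, sprintf("UPDATE alleleA SET sample2 = %f WHERE rowid = %d",
                              col2[i], i))
  dbCommit(con)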

What are the obvious things I'm missing here? Or do you have any other
suggestions to improve the performance?

Thank you very much,

b
