wow! thanks.
On 11/14/07, Kees Nuyt <[EMAIL PROTECTED]> wrote:
>
>
> On Tue, 13 Nov 2007 20:46:06 -0500, "Benilton Carvalho"
> <[EMAIL PROTECTED]> wrote:
>
> >I'm in the process of creating the whole db now and will give it a
> >try... maybe creation is the worst part, as I'm going to be accessing
On Tue, 13 Nov 2007 20:46:06 -0500, "Benilton Carvalho"
<[EMAIL PROTECTED]> wrote:
>I'm in the process of creating the whole db now and will give it a
>try... maybe creation is the worst part, as I'm going to be accessing
>contiguous sets of rows (all columns) at a time... something like:
>
>SEL
I have only glanced at the problem, so I may have missed something, but
my approach to a large matrix would be to realise it is a flat file and
mmap it. Your program would then treat it as a memory-resident
structure. The VM features of the OS would perform paging as necessary
to keep a working set in memory.
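A minimal sketch of the flat-file idea in R, assuming the matrix is
stored row-major as raw 8-byte doubles. Base R has no mmap binding, so
this uses seek()/readBin() to get the same random access; the file
name and dimensions are placeholders:

  ncols  <- 2000                       # columns per row (assumed)
  rowlen <- ncols * 8                  # bytes per row of doubles
  con    <- file("matrix.bin", open = "rb")

  read.rows <- function(first, n) {
    # jump straight to the first requested row, then read n contiguous rows
    seek(con, where = (first - 1) * rowlen, origin = "start")
    matrix(readBin(con, what = "double", n = n * ncols, size = 8),
           nrow = n, ncol = ncols, byrow = TRUE)
  }

  block <- read.rows(5001, 100)        # rows 5001..5100, all columns
  close(con)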
Hello Benilton,
Some years ago I came across pyTables ( http://www.pytables.org ).
It's a wrapper for the HDF5 format. PyTables claims to handle high
data throughput very well. It supports matrix/array formats as
typically used in scientific projects. PyTables does not provide any
That's what I'm testing right now... thanks a lot for the heads up...
b
On Nov 14, 2007 1:41 AM, Trevor Talbot <[EMAIL PROTECTED]> wrote:
> On 11/13/07, Benilton Carvalho <[EMAIL PROTECTED]> wrote:
>
> > Then each column is added on the fly, using:
> >
> > ALTER TABLE table ADD COLUMN column REAL;
On 11/13/07, Benilton Carvalho <[EMAIL PROTECTED]> wrote:
> Then each column is added on the fly, using:
>
> ALTER TABLE table ADD COLUMN column REAL;
>
> The columns are then filled using UPDATE as described previously.
I haven't investigated the storage layout of this, but from what the
docs imply
It appears my storage bandwidth would be awesome if I had only 10
columns (as my toy example did).
My script creates the table and adds values to the primary key field.
Then each column is added on the fly, using:
ALTER TABLE table ADD COLUMN column REAL;
The columns are then filled using UPDATE as described previously.
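A rough sketch of that workflow with RSQLite, assuming a placeholder
table "mat" keyed by an integer row_id (all names here are made up).
Note that SQLite rewrites a row's whole record on UPDATE, so filling
2K columns one UPDATE pass at a time rewrites every row 2K times,
which may be where the time goes:

  library(RSQLite)
  db <- dbConnect(dbDriver("SQLite"), dbname = "matrix.db")
  dbGetQuery(db, "CREATE TABLE mat (row_id INTEGER PRIMARY KEY)")
  dbGetQuery(db, "INSERT INTO mat (row_id) VALUES (1)")  # ... and so on
  # one ALTER per incoming sample column:
  for (j in 1:3)
    dbGetQuery(db, sprintf("ALTER TABLE mat ADD COLUMN col%d REAL", j))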
On 11/13/07, Benilton Carvalho <[EMAIL PROTECTED]> wrote:
> I have "fixed" my problem; working late at night and not sleeping
> usually causes that: I missed a WHERE on the UPDATE statement, so I
> ended up modifying multiple rows at a time. Basically what I did was
> create a primary key (integer), which
Unfortunately the matrix is not sparse, nor can the direction of
operations be changed.
The reason is that I have 2K samples, each given in its own file
(2K binary files). Each of the files has roughly 7M numbers, which I
process by running some algorithms I have, reducing from 7M to 1M
elements.
Regarding: building a 100 row x 2000 column matrix
If by chance these are sparse matrices (i.e., the majority of the values
are empty) then conceivably you could store only the available values.
The schema might then be something like:
CREATE TABLE myTable (
matrixRow INTEGER,
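The CREATE TABLE above is cut off; a plausible completion of the
(row, column, value) triplet scheme follows, given a connection db as
before (every name besides matrixRow is a guess):

  dbGetQuery(db, "CREATE TABLE myTable (
                    matrixRow INTEGER,
                    matrixCol INTEGER,
                    value     REAL,
                    PRIMARY KEY (matrixRow, matrixCol))")
  # fetch one dense row of the sparse matrix:
  dbGetQuery(db, "SELECT matrixCol, value FROM myTable WHERE matrixRow = 1")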
On 11/13/07, John Stanton <[EMAIL PROTECTED]> wrote:
> The point is that the more statements you have in each transaction the
> better, up to a limit of perhaps a thousand or so.
Actually, I get the impression that the number of statements is not
really the problem here.
On 11/13/07, Benilton Carvalho <[EMAIL PROTECTED]> wrote:
The point is that the more statements you have in each transaction the
better, up to a limit of perhaps a thousand or so.
Benilton Carvalho wrote:
I'm using RSQLite functions like:
dbBeginTransaction(db2)
dbGetPreparedQuery(db2, sql, bind.data=df)
dbCommit(db2)
where db2 is the connection to
I'm using RSQLite functions like:
dbBeginTransaction(db2)
dbGetPreparedQuery(db2, sql, bind.data=df)
dbCommit(db2)
where db2 is the connection to the SQLite db, sql is either the INSERT
or UPDATE statement I mentioned, and df is the R object (data.frame)
containing the data to be inserted/updated.
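To make the transaction batching concrete: if the old RSQLite binding
rules are as I remember, the ?s bind positionally to the columns of
bind.data and the statement runs once per row of the data.frame, all
inside one transaction (table and column names are assumptions):

  df <- data.frame(val = rnorm(1000), id = 1:1000)   # toy data
  dbBeginTransaction(db2)
  dbGetPreparedQuery(db2,
      "UPDATE mat SET col1 = ? WHERE row_id = ?",
      bind.data = df)        # one UPDATE execution per row of df
  dbCommit(db2)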
How are you using transactions?
Benilton Carvalho wrote:
Hi Everyone,
I use R to create tools for analysis of microarrays
(http://www.bioconductor.org).
I'm in a situation where I need to handle 6 matrices with 1M rows and
about 2K columns, i.e. each matrix takes roughly 15 GB of RAM.
The procedure I'm working on can be divided into two parts:
Hi Everyone,
I use R to create tools for analysis of microarrays
(http://www.bioconductor.org).
I'm in a situation where I need to handle 6 matrices with 1M rows and
about 2K columns, i.e. each matrix takes roughly 15 GB of RAM.
The procedure I'm working on can be divided into two parts:
1) I read an i