Re: [base] Nimblegen

2006-09-14 Thread Keith Ching




unfortunately, we have to store all the raw data and make it queryable.

so it would be something like 14 million rows x 300 / month = 4.2 billion
rows / month, or about 50 billion rows per year.
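
As a quick check of the arithmetic above, here is a minimal sketch. The variable names are illustrative only, and the factor of 300 per month is taken exactly as stated in the estimate:

```python
# Back-of-the-envelope row counts, using the figures quoted in the estimate.
ROWS_PER_SCAN = 14_000_000   # one row per probe in a whole genome scan
MONTHLY_FACTOR = 300         # the "x 300 / month" multiplier from the estimate

rows_per_month = ROWS_PER_SCAN * MONTHLY_FACTOR
rows_per_year = rows_per_month * 12

print(f"{rows_per_month:,} rows/month")  # 4,200,000,000 rows/month
print(f"{rows_per_year:,} rows/year")    # 50,400,000,000 rows/year
```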

i guess MySQL would bog down.

however, since these are tiling arrays with evenly spaced probes, one
can calculate the position of each probe given the starting point, the
probe spacing, and the number of probes from the start.
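
A minimal sketch of that calculation, assuming a constant spacing between probe start coordinates and a 0-based probe index. The function name and the example numbers are hypothetical, not taken from any NimbleGen file format:

```python
def probe_position(start: int, spacing: int, index: int) -> int:
    """Return the start coordinate of the probe `index` steps after the first.

    Assumes evenly spaced probes: probe 0 sits at `start`, and each
    following probe begins `spacing` bases further along.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    return start + index * spacing


# Hypothetical tiling: first probe at position 1000, one probe every 100 bases.
print(probe_position(1000, 100, 5))  # 1500
```

With this, a probe's position never has to be stored per row; only the start and the spacing are needed for each evenly tiled region.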

could the information be stored more efficiently if the probes were
compacted into groups of 10k or 100k? then we're talking about millions
of rows instead of billions.
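
One way such compaction might look, as a rough sketch only: each block of consecutive probes becomes a single record holding the block's start position, the spacing, and the measured values packed into one binary field. The `ProbeBlock` layout, the function names, and the use of packed floats are assumptions for illustration; this is not how BASE2 stores data.

```python
from array import array
from dataclasses import dataclass


@dataclass
class ProbeBlock:
    first_index: int  # index of the first probe in this block
    start: int        # genomic position of that first probe
    spacing: int      # constant distance between neighbouring probes
    values: bytes     # packed C floats, one per probe (would be a BLOB column)


def pack_blocks(values, start, spacing, block_size=10_000):
    """Group consecutive probe values into blocks of `block_size` probes."""
    blocks = []
    for first in range(0, len(values), block_size):
        chunk = array("f", values[first:first + block_size])
        blocks.append(
            ProbeBlock(first, start + first * spacing, spacing, chunk.tobytes())
        )
    return blocks


def lookup(blocks, probe_index, block_size=10_000):
    """Return (position, value) for one probe by unpacking only its block."""
    block = blocks[probe_index // block_size]
    chunk = array("f")
    chunk.frombytes(block.values)
    offset = probe_index - block.first_index
    return block.start + offset * block.spacing, chunk[offset]


# Hypothetical data: 25,000 probes, first at position 1000, spaced 100 apart.
blocks = pack_blocks([float(i) for i in range(25_000)], start=1000, spacing=100)
print(len(blocks))               # 3
print(lookup(blocks, 12_345))    # (1235500, 12345.0)
```

The trade-off is that the database can no longer filter or index on individual probe values; any query on values has to unpack whole blocks first.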

i've heard that Oracle can handle billions of rows of data, but i can't
imagine that it's very fast, even if indexed properly.

-keith

Nicklas Nordborg wrote:

> Keith Ching wrote:
>
>> Hi,
>>
>> I am looking into using BASE2 to store ChIP-chip data from the NimbleGen
>> platform.
>> Each whole genome scan has 14 million probes, divided up into 38 arrays
>> of 370k probes each.
>>
>> What is the feasibility of storing this information in BASE2?  Say we
>> had 100+ whole genome scans.
>> Would it even be practical?  Should I just store the raw datafiles as
>> file attachments?  It would be nice
>> to have some compression built into the file attachments as this could
>> save 75% on the disk space as each
>> expt is 3 gigs or so.
>
> Wow, that is really a lot of data. I wouldn't store that in the
> database. It would suck the performance out of the entire application.
> You could compress the files before you upload them to Base 2. Or, you
> could let the operating system automatically compress the folder where
> the file uploads are stored.
>
> Note however, if you store the data in files, you will not be able to
> use any of the existing plugins to analyze the data. If you want to do
> that you will need to create a plugin that generates a more manageable
> data set from the files. We have created such a plugin for Affymetrix
> files. See http://lev.thep.lu.se/trac/baseplugins/wiki/thep.lu.se.RMAExpress
> for more information about it.
>
> /Nicklas


-
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnkkid=120709bid=263057dat=121642
___
The BASE general discussion mailing list
basedb-users@lists.sourceforge.net
unsubscribe: send a mail with subject "unsubscribe" to
[EMAIL PROTECTED]

  





Re: [base] Nimblegen

2006-08-29 Thread Nicklas Nordborg

Keith Ching wrote:
> Hi,
>
> I am looking into using BASE2 to store ChIP-chip data from the NimbleGen
> platform.
> Each whole genome scan has 14 million probes, divided up into 38 arrays
> of 370k probes each.
>
> What is the feasibility of storing this information in BASE2?  Say we
> had 100+ whole genome scans.
> Would it even be practical?  Should I just store the raw datafiles as
> file attachments?  It would be nice
> to have some compression built into the file attachments as this could
> save 75% on the disk space as each
> expt is 3 gigs or so.

Wow, that is really a lot of data. I wouldn't store that in the 
database. It would suck the performance out of the entire application. 
You could compress the files before you upload them to Base 2. Or, you 
could let the operating system automatically compress the folder where 
the file uploads are stored.
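
Compressing a file before upload could be done with any standard tool. As one sketch, here is a small Python helper using the standard `gzip` module; the function name is hypothetical, and the actual saving depends on the data, so the 75% figure above is only the poster's estimate:

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(src: str) -> Path:
    """Write a gzip-compressed copy of `src` next to it and return its path."""
    source = Path(src)
    target = source.with_name(source.name + ".gz")
    # Stream the copy so a multi-gigabyte file is never held in memory at once.
    with open(source, "rb") as fin, gzip.open(target, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return target
```

Calling `gzip_file` on a raw data file leaves the original in place and produces a `.gz` copy alongside it, which is the file that would then be uploaded.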

Note however, if you store the data in files, you will not be able to 
use any of the existing plugins to analyze the data. If you want to do 
that you will need to create a plugin that generates a more manageable 
data set from the files. We have created such a plugin for Affymetrix 
files. See http://lev.thep.lu.se/trac/baseplugins/wiki/thep.lu.se.RMAExpress
for more information about it.

/Nicklas

