Re: [base] import of affymetrix data - which approach

2007-08-10 Thread Nicklas Nordborg
Reha Yildirimman wrote:

 I have another approach and if that seems ineffective to you please
 drop me a line! 

We also had that approach to begin with. However, we were a bit worried 
about database performance since the Affymetrix chips contain a lot of 
probes. Since most tools needed the CEL and/or CDF files anyway, we 
decided not to put Affymetrix data into the database.

 My approach is to change the affymetrix data type
 inside the raw-data-type.xml from a simple file-definition to a
 database-definition like the other data-types defined inside the xml.

You should not change that. I am not sure what will go wrong, but I 
think there is code that depends on the default setup. We hope to make 
this more flexible in the future.

 I added properties for the values: MEAN (float), STDV (float),
 NPIXELS (float). To be able to import data via a base-plugin from a
 CEL file each data line has to have an association to a reporter -
 thus I added a column holding the ProbeSet ID for each CEL file.

 My problem is that when I associate a CDF file to an array design -
 which I need for reference purpose - it states under Properties:
 Features: Yes(0)

This will not work unless you also have the CEL file at the raw 
bioassay. And if you have the CEL file you can't have the data in the 
database...

I would recommend that you use the Affymetrix setup as it is. If you 
need a different way of calculating the intensity values this should be 
done by a plug-in.

/Nicklas

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now   http://get.splunk.com/
___
The BASE general discussion mailing list
basedb-users@lists.sourceforge.net
unsubscribe: send a mail with subject unsubscribe to
[EMAIL PROTECTED]


Re: [base] import of affymetrix data - which approach

2007-08-10 Thread Jari Häkkinen
Hi,

The plier and rmaexpress plug-ins calculate the probeset expression 
values using information from the cel/cdf files. The plugins are 
presented on the BASE plugin site:
http://baseplugins.thep.lu.se/wiki/se.lu.thep.affymetrix
How the expression values are calculated is presented elsewhere (please 
follow the links from the plugin website).

The raw data for Affymetrix is stored in the original cel files and is 
not stored in database tables like GenePix and other data. 
Affymetrix-related expression data is not stored in the database tables 
until the first root bioassay set is created.
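
To make concrete what "raw data in the cel file" means, here is a minimal Python sketch of reading the per-cell values. It assumes a text-format (version 3) CEL file whose [INTENSITY] section has the columns X, Y, MEAN, STDV and NPIXELS; binary CEL files need a dedicated reader. The names Cell and read_intensities and the sample values are made up for illustration and are not part of BASE or of any plug-in.

```python
from typing import Iterable, Iterator, NamedTuple


class Cell(NamedTuple):
    x: int
    y: int
    mean: float
    stdv: float
    npixels: int


def read_intensities(lines: Iterable[str]) -> Iterator[Cell]:
    """Yield one Cell per data row of the [INTENSITY] section."""
    in_section = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new section header starts; only [INTENSITY] is of interest.
            in_section = line == "[INTENSITY]"
            continue
        if not in_section or not line or "=" in line:
            # Skip other sections, blank lines, and key=value lines
            # such as NumberCells=... and CellHeader=...
            continue
        x, y, mean, stdv, npixels = line.split()
        yield Cell(int(x), int(y), float(mean), float(stdv), int(npixels))


# Tiny made-up sample in the assumed text layout (tab-separated columns).
sample = """[CEL]
Version=3

[INTENSITY]
NumberCells=2
CellHeader=X\tY\tMEAN\tSTDV\tNPIXELS
  0\t  0\t120.5\t15.2\t 16
  1\t  0\t8040.0\t1020.8\t 16

[MASKS]
NumberCells=0
""".splitlines()

cells = list(read_intensities(sample))
for cell in cells:
    print(cell)
```

Note that these are values per cell (probe), not per probe set; turning them into probe set expression values is what the plier and rmaexpress plug-ins do.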

You do not need to create another affymetrix plugin if you are happy 
with the algorithms provided by plier and rmaexpress. However, if you 
need something else then you may need to write your own plugin (or help 
us improve the current affymetrix plugin).

You may want to change the extended-properties.xml; some information on 
this is available here:
http://base.thep.lu.se/chrome/site/doc/admin/extended-properties.html

The problem with using CDFs in your array design is most likely related 
to missing reporters (probesets). Have you imported the reporters 
associated with the design (CDF)?
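
A quick way to check for missing reporters outside BASE is to compare the probe set IDs of the design with the reporter IDs you have imported. The sketch below assumes you have both lists available as plain strings (for example exported to text files); the function name missing_reporters and the example IDs are only illustrative.

```python
def missing_reporters(design_probesets, imported_reporters):
    """Return the probe set IDs that have no matching imported reporter."""
    imported = set(imported_reporters)
    return sorted(p for p in set(design_probesets) if p not in imported)


# Illustrative IDs only; use the IDs from your own CDF and reporter list.
design = ["1007_s_at", "1053_at", "117_at"]
reporters = ["1007_s_at", "1053_at"]

missing = missing_reporters(design, reporters)
print(missing)
```

An empty result means every probe set in the design has a reporter; anything listed still needs to be imported.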

Please read the information provided on the plugin site and in 
http://base.thep.lu.se/chrome/site/doc/user/getting_started.html (a nice 
example of the workflow in BASE with references to Affymetrix data). 
And, of course, you can also use the mailing list.


Cheers,

Jari


Reha Yildirimman wrote:
 Hello,

 like some other people I am trying to import data from Affymetrix CEL
 files. Searching the archive I have read that people use the RMAExpress
 and PLIER plugins. As far as I understood, the plug-in
 extracts/calculates the intensities, but where are those data stored,
 since the associated raw data type for Affymetrix has no database table
 definition?
 Furthermore, are those intensities the stated MEAN values inside the CEL
 file, and are intensities extracted for each coordinate/probe or only
 for probe sets?

 I have another approach and if that seems ineffective to you please drop
 me a line!
 My approach is to change the affymetrix data type inside the
 raw-data-type.xml from a simple file-definition to a database-definition
 like the other data-types defined inside the xml.
 I added properties for the values: MEAN (float), STDV (float), NPIXELS
 (float).
 To be able to import data via a base-plugin from a CEL file each data
 line has to have an association to a reporter - thus I added a column
 holding the ProbeSet ID for each CEL file.

 My problem is that when I associate a CDF file to an array design -
 which I need for reference purpose - it states under Properties:

 Features: Yes(0)

 This makes the raw data importer fail with:

 Item not found: Feature[row=null, column=null, block=null, metaGridX=null, metaGridY=null] on line 2

 If I take out the connection to the CDF - thus turning the Feature
 property to No - the raw data importer works.
 How can I get around the problem without having to add fake numbers for
 the features row, column, block, ... to each line inside the CEL file?

 Thanks a lot in advance.

 Best,

 Reha
 

-- 
Jari Hakkinen, PhD
Complex Systems Division                mailto:[EMAIL PROTECTED]
Department of Theoretical Physics       phone: +46 (0)46 2229347
Lund University                         fax:   +46 (0)46 2229686
Solvegatan 14a, SE-223 62 Lund, Sweden  http://www.thep.lu.se
