Re: [base] Illumina design not playing its role

2009-08-24 Thread Nicklas Nordborg
I have added two tickets. One is for the Bead summary importer in the
Illumina package: http://baseplugins.thep.lu.se/ticket/240

The other is for the Overview functionality in the BASE web client:
http://base.thep.lu.se/ticket/1362

We'll probably be able to fix #1362 this week, before BASE 2.13 is
released. I don't know when the next Illumina package (1.4) will be released,
though, since some other issues are not yet complete.

/Nicklas



Kjell Petersen wrote:
 
 Nicklas Nordborg wrote:
 The array design concept was intended to provide a way of checking if 
 the right raw data files are linked with the right samples/extracts in the 
 experiment, and that the right data is analyzed in the experiment. It seems 
 that it might not be the case for Illumina.

 Do you think there is a way to improve this and provide control over 
 what gets imported with what design?
 
 One problem is that the raw data files contain no information whatsoever
 about which array design was used. Do you have any ideas yourself? Given a
 BGX file and two raw data files (one matching and one non-matching), how do
 you tell them apart?
   
 I know our lab people are able to trace this somehow; I'll forward the 
 question.

 The only thing I can think of is to report an error if the number of skipped
 data lines goes above a certain threshold. Any idea about what a good value
 for that threshold might be? 10? 100?
   
 Wouldn't a percentage make sense here? If more than, say, 5% of the 
 features in a raw data file do not match the design (= bgx file), then 
 there is probably an error. This will allow for some changes in the bgx 
 file that then wouldn't require the definition of a new array design.
 
 Another possibility is to add a check in the experiment overview that compares
 the number of features on the array design with the number of raw data spots
 in the raw bioassay. If the difference is too big, a warning could be generated.
   
 That would also be a nice solution.
 
 
 best,
 Kjell
 


--
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with 
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
___
The BASE general discussion mailing list
basedb-users@lists.sourceforge.net
unsubscribe: send a mail with subject unsubscribe to
basedb-users-requ...@lists.sourceforge.net


Re: [base] Illumina design not playing its role

2009-08-24 Thread Pawel Sztromwasser
Thank you!



Re: [base] Illumina design not playing its role

2009-08-21 Thread Nicklas Nordborg
Pawel Sztromwasser wrote:
 Hello BASErs,
 
 After a little hands-on session with BASE I have reported some 
 bugs/comments on trac:
 
 http://base.thep.lu.se/ticket/1360#preview
 http://base.thep.lu.se/ticket/1361#preview

Great. We'll have a look at them.


 These are rather minor things, but there is something that worries us 
 more. I know that the array design concept does not really hold when 
 it comes to Illumina technology. It is more like a list of features. I am 
 also aware that some features are removed in successive 
 versions of .bgx files. Thus the 'skip feature' option had to be 
 introduced in the import plugin, allowing it to skip features that are 
 present in raw data files but not in .bgx files. It is common practice 
 to run this plugin with the skip option set to true.

 This has a serious implication: one can accidentally import data from a 
 human IBS file into a raw bioassay that has a mouse (for example) array 
 design attached to it. If features are allowed to be skipped, a common 
 subset of features will be imported into the db and no warnings will be 
 issued about the wrong design. To mislead the user even more, the raw data 
 file will be marked as validated (because the plugin finished its run with 
 no errors).
 
 The array design concept was intended to provide a way of checking if 
 the right raw data files are linked with the right samples/extracts in the 
 experiment, and that the right data is analyzed in the experiment. It seems 
 that it might not be the case for Illumina.
 
 Do you think there is a way to improve this and provide control over 
 what gets imported with what design?

One problem is that the raw data files contain no information whatsoever about
which array design was used. Do you have any ideas yourself? Given a BGX file
and two raw data files (one matching and one non-matching), how do you tell
them apart?

The only thing I can think of is to report an error if the number of skipped
data lines goes above a certain threshold. Any idea about what a good value
for that threshold might be? 10? 100?

Another possibility is to add a check in the experiment overview that compares
the number of features on the array design with the number of raw data spots in
the raw bioassay. If the difference is too big, a warning could be generated.
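
The overview check described above could look roughly like the following
Python sketch. It is only an illustration, not BASE code: the function name
`feature_count_warning`, the 5% default tolerance, and the example counts are
all assumptions made here.

```python
def feature_count_warning(num_design_features, num_raw_spots, max_relative_diff=0.05):
    """Return a warning string if the two counts differ too much, else None.

    Hypothetical helper: the name and the 5% default tolerance are
    assumptions for illustration, not part of BASE.
    """
    if num_design_features <= 0:
        raise ValueError("array design has no features")
    # Difference relative to the size of the array design.
    relative_diff = abs(num_design_features - num_raw_spots) / num_design_features
    if relative_diff > max_relative_diff:
        return (
            f"Raw bioassay has {num_raw_spots} spots but the array design has "
            f"{num_design_features} features ({relative_diff:.1%} difference)"
        )
    return None


# Illustrative counts only, not taken from any real design.
print(feature_count_warning(48000, 47500))  # small difference, no warning
print(feature_count_warning(48000, 24000))  # large difference, warning text
```

A count comparison alone cannot tell apart two designs with a similar number
of features, so it would complement a check on the skipped lines rather than
replace it.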

/Nicklas



Re: [base] Illumina design not playing its role

2009-08-21 Thread Kjell Petersen


Nicklas Nordborg wrote:

 The array design concept was intended to provide a way of checking if 
 the right raw data files are linked with the right samples/extracts in the 
 experiment, and that the right data is analyzed in the experiment. It seems 
 that it might not be the case for Illumina.

 Do you think there is a way to improve this and provide control over 
 what gets imported with what design?

 One problem is that the raw data files contain no information whatsoever
 about which array design was used. Do you have any ideas yourself? Given a
 BGX file and two raw data files (one matching and one non-matching), how do
 you tell them apart?

I know our lab people are able to trace this somehow; I'll forward the 
question.

 The only thing I can think of is to report an error if the number of skipped
 data lines goes above a certain threshold. Any idea about what a good value
 for that threshold might be? 10? 100?

Wouldn't a percentage make sense here? If more than, say, 5% of the 
features in a raw data file do not match the design (= bgx file), then 
there is probably an error. This will allow for some changes in the bgx 
file that then wouldn't require the definition of a new array design.
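
A minimal sketch of the percentage idea in Python, assuming the importer can
see the feature IDs of both the raw data file and the design. The function
names, the `max_skipped_fraction` parameter, and the probe IDs are
hypothetical, and nothing here is the Illumina plugin's actual code.

```python
def skipped_fraction(raw_feature_ids, design_feature_ids):
    """Fraction of raw data features that are not present in the design."""
    design = set(design_feature_ids)
    raw = list(raw_feature_ids)
    if not raw:
        raise ValueError("raw data file contains no features")
    skipped = sum(1 for feature_id in raw if feature_id not in design)
    return skipped / len(raw)


def check_design_match(raw_feature_ids, design_feature_ids, max_skipped_fraction=0.05):
    """Raise ValueError if too many raw features would have to be skipped.

    The 5% default follows the figure suggested in this thread; it is a
    starting point, not a validated value.
    """
    fraction = skipped_fraction(raw_feature_ids, design_feature_ids)
    if fraction > max_skipped_fraction:
        raise ValueError(
            f"{fraction:.1%} of the raw data features are missing from the design; "
            "the wrong array design may be attached"
        )
    return fraction


# Made-up IDs: a design with 100 features, a raw file that lost 3 of them
# in a newer bgx version, and a raw file from a different design.
design = [f"probe_{i}" for i in range(100)]
raw_ok = design[:97] + ["old_1", "old_2", "old_3"]
raw_wrong = design[:20] + [f"other_{i}" for i in range(80)]

print(check_design_match(raw_ok, design))  # 3 of 100 skipped, within the limit
try:
    check_design_match(raw_wrong, design)
except ValueError as err:
    print("import rejected:", err)
```

Because the limit is relative, a handful of features dropped between bgx
versions still imports, while a file from another design is rejected instead
of being marked as validated.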

 Another possibility is to add a check in the experiment overview that compares
 the number of features on the array design with the number of raw data spots
 in the raw bioassay. If the difference is too big, a warning could be generated.

That would also be a nice solution.


best,
Kjell
