Re: [galaxy-dev] Tophat non Sanger input

2011-09-08 Thread Anton Nekrutenko
Dear Stephen (and others):

The sole reason for requiring fastq-sanger input to all of our wrappers was to 
force the users to run their data through the groomer. It is slow, but it 
checks data consistency in a way that is more robust than just checking 'four 
lines per fastq block' and prevents a lot of problems downstream. Here on 
Galaxy @ Penn State we see a lot of fastq files edited in MS Word and other 
similar horrors. The groomer catches these and keeps users from 
running into problems later on (which also cuts down on the support overhead: 
investigating why groomer has failed is a lot easier than researching why a 
particular set of polymorphisms derived from a Word-edited fastq file clusters 
Ukrainians with parasitic worms). In addition, even though Illumina did switch 
to Sanger encoding, there is still a lot of old data out there. However, we are 
open to suggestions ... What we are thinking of lately is switching to 
unaligned BAM for everything. One of the benefits here is the ability to add 
read groups from day 1, simplifying multisample analyses down the road.
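
To illustrate what "more robust than four lines per block" can mean, here is a minimal sketch of a per-record consistency check. It is not the groomer's actual code: the function name `validate_fastq` and the specific checks are assumptions chosen for illustration, and it assumes unwrapped four-line records with Sanger (Phred+33) qualities.

```python
import io

# Printable ASCII range that a Phred+33 (Sanger) quality string may use.
SANGER_MIN, SANGER_MAX = 33, 126


def validate_fastq(handle):
    """Return the number of records, or raise ValueError on the first problem.

    Hypothetical helper: assumes each record is exactly four lines
    (header, sequence, '+' line, qualities) with no line wrapping.
    """
    count = 0
    while True:
        header = handle.readline()
        if not header:
            return count  # clean end of file
        if not header.strip():
            continue  # tolerate blank lines between or after records
        header = header.rstrip("\r\n")
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        n = count + 1
        if not header.startswith("@"):
            raise ValueError("record %d: header does not start with '@'" % n)
        if not plus.startswith("+"):
            raise ValueError("record %d: third line does not start with '+'" % n)
        if not seq or len(seq) != len(qual):
            raise ValueError("record %d: sequence/quality length mismatch" % n)
        if any(not (SANGER_MIN <= ord(c) <= SANGER_MAX) for c in qual):
            raise ValueError("record %d: quality character outside Sanger range" % n)
        count += 1


if __name__ == "__main__":
    print(validate_fastq(io.StringIO("@r1\nACGT\n+\nIIII\n")))
```

A file that a word processor has touched typically fails one of these checks, for example through a "smart quote" in the quality line or a truncated record, so the failure shows up at this step rather than in a downstream analysis.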

a.


Anton Nekrutenko
http://galaxyproject.org




On Sep 8, 2011, at 10:14 AM, Stephen Taylor wrote:

 On 08/09/2011 14:17, Hans-Rudolf Hotz wrote:
 
 
 On 09/08/2011 09:47 AM, Stephen Taylor wrote:
 On 07/09/2011 20:22, Edward Kirton wrote:
 seems unnecessary since illumina switched over to fastqsanger now.
 
 http://www.illumina.com/truseq/quality_101/quality_scores.ilmn
 
 Eventually...unfortunately we still get a lot of fastqillumina :-(
 
 
 I might miss your point... but why can't you use the fastq groomer tool?
 
 
 - Duplication of data (disk space usage)
 - Groomer is slow and puts more demand on the CPU, whereas the conversion can be done 
 easily on the fly by tophat
 - Consistency (bowtie does it)
 
 From the responses (or lack of :-)) we've been spurred on to change the 
 wrapper. If there is interest we will commit it to the code base when done.
 
 Cheers,
 
 Steve
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
 
 http://lists.bx.psu.edu/


Re: [galaxy-dev] Tophat non Sanger input

2011-09-07 Thread Edward Kirton
seems unnecessary since illumina switched over to fastqsanger now.

http://www.illumina.com/truseq/quality_101/quality_scores.ilmn
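
For context, the difference between the two datatypes is the ASCII offset of the quality string: fastqillumina (Illumina 1.3+) stores Phred scores at offset 64, fastqsanger at offset 33. A minimal sketch of the conversion follows. The function name `illumina_to_sanger` is hypothetical, this is not the groomer's or any wrapper's code, and it does not cover the older Solexa scale.

```python
def illumina_to_sanger(qual):
    """Convert a Phred+64 quality string to Phred+33.

    Hypothetical helper for illustration; assumes Illumina 1.3+ input
    (Phred scores at ASCII offset 64), not the older Solexa scale.
    """
    out = []
    for c in qual:
        q = ord(c) - 64  # recover the Phred score
        if q < 0:
            raise ValueError("character %r is below the Phred+64 range" % c)
        out.append(chr(q + 33))  # re-encode at the Sanger offset
    return "".join(out)


if __name__ == "__main__":
    print(illumina_to_sanger("hhB@"))
```

Because the mapping is a fixed shift of 31 per character, it can be applied while the reads are streamed, without writing a second copy of the file.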

On Wed, Aug 31, 2011 at 12:45 AM, Stephen Taylor 
stephen.tay...@imm.ox.ac.uk wrote:

 Hi,

 Are there any plans to enhance the tophat wrapper to accept non Sanger
 fastqs, as for bowtie?

 https://bitbucket.org/galaxy/galaxy-central/changeset/7a9476924daf

 ?

 Kind regards and thanks,

 Steve