On Tue, Mar 29, 2011 at 1:31 AM, Assaf Gordon gor...@cshl.edu wrote:
Hello all,
We're developing alternative bowtie tools that more closely suit our
needs, and we're happy to share (and get comments).
The main differences are:
1. separate tools for paired-end and single-end
Sounds sensible
Hi Peter,
Peter Cock wrote, On 03/29/2011 05:39 AM:
2. the tools accept FASTA, FASTQ in both Sanger and Illumina
format (no more need for grooming). Illumina is the default for
newly uploaded FASTQ files.
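For readers unfamiliar with the two encodings: the difference under discussion is the ASCII offset used for quality scores (Sanger uses Phred+33, Illumina 1.3+ uses Phred+64). A minimal Python sketch of the conversion follows. The function name `illumina_to_sanger` is hypothetical and is not taken from Galaxy, the Groomer, or the tools in this thread. Older Solexa scores use a different (log-odds) scale and are not handled here.

```python
def illumina_to_sanger(quality: str) -> str:
    """Convert a Phred+64 (Illumina 1.3+) quality string to Phred+33 (Sanger).

    Hypothetical helper for illustration only.
    """
    converted = []
    for ch in quality:
        phred = ord(ch) - 64  # decode the Illumina 1.3+ offset
        if phred < 0:
            # A character below '@' cannot be a valid Phred+64 score,
            # so the input is probably Sanger-encoded already.
            raise ValueError(f"{ch!r} is not a valid Phred+64 quality character")
        converted.append(chr(phred + 33))  # re-encode with the Sanger offset
    return "".join(converted)
```

For example, the Illumina quality character `h` (Phred 40) becomes `I` in Sanger encoding.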
I think that's a bad idea - use Sanger FASTQ as the default to be
consistent with
Hi Assaf,
Just a quick note that the standard bowtie tool in Galaxy was enhanced in
changeset 5157:7a9476924daf to work on 'fastqillumina' and 'fastqsolexa'
variants in addition to the already possible 'fastqsanger'. In general, it is
not a good idea to have a tool accept dataset.ext=='fastq'
Hi Dan,
Daniel Blankenberg wrote, On 03/29/2011 10:55 AM:
When files are added to Galaxy, the datatype can be directly set to
any of the fastq variants (e.g. fastqillumina), which removes the
requirement of grooming (but should only be done when users know what
they are doing).
I'm not using
On Tue, Mar 29, 2011 at 4:46 PM, Assaf Gordon gor...@cshl.edu wrote:
Hi Dan,
Daniel Blankenberg wrote, On 03/29/2011 10:55 AM:
When files are added to Galaxy, the datatype can be directly set to
any of the fastq variants (e.g. fastqillumina), which removes the
requirement of grooming (but
Note about multithreaded bowtie:
Currently the tools use 10 threads (hard-coded in the XML files), which is
easily changeable.
If possible, have the user indicate as a parameter how many threads they
wish to use.
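As a rough sketch of that suggestion, the thread count could be passed through to bowtie's `-p` option instead of being fixed at 10. The wrapper function `build_bowtie_command` and its argument names are assumptions made for illustration. They are not part of the tools described in this thread, and a real Galaxy tool would wire the value in through its XML parameter definitions.

```python
from typing import List


def build_bowtie_command(index_base: str, reads_path: str,
                         output_path: str, threads: int = 1) -> List[str]:
    """Build a bowtie command line with a user-chosen thread count.

    Hypothetical wrapper: only the `-p` (threads) option is set here.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # bowtie [options] <index> <reads> [<output>]
    return ["bowtie", "-p", str(threads), index_base, reads_path, output_path]
```

The returned list can be handed to `subprocess.run` as is, which avoids shell quoting problems with file names.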
The Grooming step is currently very time-consuming and can be quite wasteful of
disk space if the source and target fastq files are the same, but I have seen
many occasions where Grooming has 'saved the day' by e.g. detecting truncated
files that may have gone undetected by downstream tools or
Dan and Peter,
Peter Cock wrote, On 03/29/2011 12:08 PM:
Why not do the Illumina to Sanger conversion as part of your
pipeline that gets the data into Galaxy (and mark the files as
fastqsanger)? As Glen said, with a C tool that isn't really so slow.
That future proofs you for the pending
I would humbly guess that most of those truncated files are due to
problematic HTTP uploads - so it saves the day from another problem,
which should be avoided altogether.
Maybe most, but definitely not all. We see all kinds of strange
corruption.
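To illustrate the kind of check that catches a truncated file, here is a minimal Python sketch. It is not the Groomer's actual logic: it assumes plain four-line FASTQ records with no line wrapping, and the function name `find_fastq_problem` is hypothetical.

```python
from typing import Iterable, Optional


def find_fastq_problem(lines: Iterable[str]) -> Optional[str]:
    """Return a description of the first structural problem, or None.

    Assumes four-line records (header, sequence, '+', quality).
    """
    record = []
    count = 0
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line and not record:
            continue  # ignore blank lines between records or at the end
        record.append(line)
        if len(record) < 4:
            continue
        count += 1
        header, seq, plus, qual = record
        if not header.startswith("@"):
            return f"record {count}: header does not start with '@'"
        if not plus.startswith("+"):
            return f"record {count}: separator line does not start with '+'"
        if len(seq) != len(qual):
            return f"record {count}: sequence and quality lengths differ"
        record = []
    if record:
        # The input ended partway through a record, e.g. an interrupted upload.
        return f"record {count + 1}: file ends after {len(record)} of 4 lines"
    return None
```

A check like this only looks at structure. It would not notice a file that happens to be cut exactly on a record boundary.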
However, I have been thinking about