On Thu, Feb 3, 2011 at 11:54 AM, Peter Cock <p.j.a.c...@googlemail.com> wrote:
> Hi all,
> I'm currently working with some 454 data where the sample was
> amplified with selective primers, and therefore the reads need a
> little processing to remove the primer sequences before assembly
> or mapping (something that sff_extract cleverly spots and warns
> the user about when doing an SFF to FASTA/FASTQ conversion).
> The actual processing I want to do is very similar to spotting
> and removing barcodes or adapters - except that PCR primers
> are often degenerate, i.e. have an N in them representing the
> fact it is a pool of primers covering A, C, G and T at that point,
> and primers may come in pairs.
> Looking over the provided tools in Galaxy, the only relevant ones
> I saw are as follows:
> emboss_5/emboss_primersearch.xml - the text output does not
> look helpful for trimming my sequences - nothing else in Galaxy
> uses this format, does it?
> fastx_toolkit/fastx_barcode_splitter.xml - copes with 5' or 3'
> barcodes, but only handles fastqsolexa (discussed recently on the
> mailing list - I guess it could handle fastqsanger and fastqillumina
> as well), not FASTA or SFF. Also according to the FASTX docs for
> fastx_barcode_splitter.pl it requires non-ambiguous barcodes
> (i.e. ACGT only), so using it with ambiguous primers won't work:
> http://hannonlab.cshl.edu/fastx_toolkit/commandline.html
> I did look on the tool shed and noticed Edward Kirton has done
> some wrappers for the "Suite of Newbler tools", but his sfffile
> wrapper does not (yet) include support for splitting SFF files using
> Roche's MID barcodes.
> Are there any other relevant tools I have overlooked?

I forgot to mention fastx_toolkit/fastx_clipper.xml aka "Clip" which
does handle FASTA and FASTQ files, but apparently only deals
with 3' adapters (although perhaps the poorly documented -d
switch is relevant for a 5' adapter?), and appears to only handle
one adapter sequence at a time. The documentation doesn't
mention what happens if you want to use an ambiguous adapter
sequence (e.g. with an N in it).
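
To make the requirement concrete, here is a rough Python sketch of the
kind of trimming I have in mind. It is not taken from any existing
Galaxy tool; the function names (primer_regex, reverse_complement,
trim_primers) are made up for illustration. It assumes the forward
primer sits at the very start of the read, that a paired reverse primer
(if any) appears reverse-complemented towards the 3' end, and that only
exact matches are wanted (no mismatches or indels, which real 454 reads
would probably need). Ambiguous bases in the primer are expanded using
the standard IUPAC codes; an N in the read itself will not match.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

# Complement table covering the ambiguity codes as well as A, C, G, T.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Compile a (possibly degenerate) primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def reverse_complement(seq):
    """Reverse complement a sequence, keeping IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def trim_primers(read, forward, reverse=None):
    """Strip a 5' forward primer and, optionally, a 3' reverse primer.

    Returns the trimmed (upper case) read, or None if the forward
    primer is not found at the start of the read. If a reverse primer
    is given but not found, only the forward primer is removed.
    """
    read = read.upper()
    match = primer_regex(forward).match(read)  # anchored at the 5' end
    if match is None:
        return None
    start, end = match.end(), len(read)
    if reverse:
        # The reverse primer is expected as its reverse complement.
        hit = primer_regex(reverse_complement(reverse)).search(read, start)
        if hit:
            end = hit.start()
    return read[start:end]
```

For example, trim_primers("ACGTAGGGTTTT", "ACNTA") drops the five
bases matching the degenerate primer and returns "GGGTTTT"; adding
reverse="AAAA" also removes the trailing TTTT and returns "GGG".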

galaxy-dev mailing list