Hello Peter,

If these are standard-length PCR primers, then UCSC's In-Silico PCR tool would be an option. It is a variant of BLAT, and the source is available from Kent Informatics (send Jim Kent an email for a copy). Here is a UCSC link to the online version:

http://genome.ucsc.edu/cgi-bin/hgPcr?command=start

A wrapper could be made for your own instance, or you could just run it from the command line before loading data. If this is not what you had in mind, please let us know.

Best,

Jen
Galaxy team


On 2/3/11 4:14 AM, Peter Cock wrote:
On Thu, Feb 3, 2011 at 11:54 AM, Peter Cock<p.j.a.c...@googlemail.com>  wrote:
Hi all,

I'm currently working with some 454 data where the sample was
amplified with selective primers, and therefore the reads need a
little processing to remove the primer sequences before assembly
or mapping (something that sff_extract cleverly spots and warns
the user about when doing an SFF to FASTA/FASTQ conversion).

The actual processing I want to do is very similar to spotting
and removing barcodes or adapters - except that PCR primers
are often degenerate, i.e. have an N in them representing the
fact it is a pool of primers covering A, C, G and T at that point,
and primers may come in pairs.
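
To make the requirement concrete, here is a minimal Python sketch of
the kind of trimming described above. It is only an illustration under
stated assumptions, not an existing Galaxy tool: the function names
(primer_to_regex, reverse_complement, trim_primers) are hypothetical,
primers are assumed to sit at the very start and end of the read, and
only exact matches to the degenerate pattern are accepted (no
mismatches or indels, which real 454 reads would likely need).

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

# Complement table that also covers the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_to_regex(primer):
    """Turn a (possibly degenerate) primer into a regex pattern string."""
    return "".join(IUPAC[base] for base in primer.upper())


def reverse_complement(seq):
    """Reverse complement, preserving IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def trim_primers(read, forward, reverse=None):
    """Remove the forward primer from the 5' end of the read and, if a
    reverse primer is given, its reverse complement from the 3' end.

    Hypothetical helper: exact pattern matches only, anchored at the
    read ends. Reads without a match are returned untrimmed.
    """
    seq = read.upper()
    match = re.match(primer_to_regex(forward), seq)
    if match:
        seq = seq[match.end():]
    if reverse:
        pattern = primer_to_regex(reverse_complement(reverse)) + "$"
        match = re.search(pattern, seq)
        if match:
            seq = seq[:match.start()]
    return seq
```

For example, trim_primers("ACAGTTTTTGGGG", "ACNGT", "CCCN") strips the
forward primer (the N matching the A at that position) and the
reverse-complemented reverse primer, leaving "TTTT".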

Looking over the provided tools in Galaxy, the only relevant ones
I saw are as follows:

emboss_5/emboss_primersearch.xml - the text output does not
look helpful for trimming my sequences - nothing else in Galaxy
uses this format, does it?

fastx_toolkit/fastx_barcode_splitter.xml - copes with 5' or 3'
barcodes, but only handles fastqsolexa (discussed recently on the
mailing list - I guess it could handle fastqsanger and fastqillumina
as well), not FASTA or SFF. Also according to the FASTX docs for
fastx_barcode_splitter.pl it requires non-ambiguous barcodes
(i.e. ACGT only), so using it with ambiguous primers won't work:
http://hannonlab.cshl.edu/fastx_toolkit/commandline.html

I did look on the tool shed and noticed Edward Kirton has done
some wrappers for the "Suite of Newbler tools", but his sfffile
wrapper does not (yet) include support for splitting SFF files using
Roche's MID barcodes.

Are there any other relevant tools I have overlooked?

I forgot to mention fastx_toolkit/fastx_clipper.xml aka "Clip" which
does handle FASTA and FASTQ files, but apparently only deals
with 3' adapters (although perhaps the poorly documented -d
switch is relevant for a 5' adapter?), and appears to only handle
one adapter sequence at a time. The documentation doesn't
mention what happens if you want to use an ambiguous adapter
sequence (e.g. with an N in it).

Peter
_______________________________________________
galaxy-dev mailing list
galaxy-dev@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-dev

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org