Re: [Bioc-sig-seq] Adapter removal

Sean Davis Thu, 17 Jul 2008 07:44:16 -0700

On Thu, Jul 17, 2008 at 9:47 AM, Krys Kelly <[EMAIL PROTECTED]> wrote:
> I have inherited a pipeline for Solexa sequence data using Perl, BioPerl,
> SSAHA and MySQL.  As an R/Bioconductor user I am interested in ShortRead and
> BiostringsCinterfaceDemo.
>
> However, in the short term I need to use the current pipeline.  The imaging
> is done by the Sequencing Facility and we get fastq files with the 3'
> adapter still attached. The adapter removal is currently done by a Perl
> script which just keeps sequences which match any number of letters in
> [ACGT] followed by the first 8 letters of the adapter.  This seems pretty
> crude (e.g. only using 8 letters, not allowing for mismatches, not allowing
> for the diminishing quality along the length of the read).
>
> Google has not revealed any algorithms or code for this part of the
> pipeline.  Does anyone know what algorithms are being used or, even better,
> could anyone point me in the direction of some code?


I believe that MAQ will do this for you.  You can then use the
ShortRead package to read the MAQ output (VERY, VERY fast).
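
To make the trimming rule from the question concrete (any run of [ACGT]
followed by the first 8 letters of the adapter), here is a minimal Python
sketch, together with one possible relaxation that tolerates mismatches.
This is neither MAQ's algorithm nor the Perl script from the pipeline. The
function names, the adapter string, the leftmost-match choice, and the 10%
mismatch threshold are all assumptions made for illustration, and base
qualities are ignored.

```python
import re

# Placeholder adapter used only for illustration (not taken from the thread).
ADAPTER = "TCGTATGCCGTCTTCTGCTTG"


def trim_exact(read, adapter=ADAPTER, prefix_len=8):
    """Exact rule: [ACGT]* followed by the first `prefix_len` adapter letters.

    Returns the part of the read before the adapter, or None when the
    adapter prefix is not found. Taking the leftmost match is an assumption.
    """
    prefix = adapter[:prefix_len]
    m = re.match(r"^([ACGT]*?)" + re.escape(prefix), read)
    return m.group(1) if m else None


def trim_with_mismatches(read, adapter=ADAPTER, max_mismatch_rate=0.1,
                         min_overlap=8):
    """Relaxed rule (hypothetical): accept the leftmost position where the
    rest of the read matches the start of the adapter with at most
    `max_mismatch_rate` mismatches per compared base.

    Returns the part of the read before the adapter, or None.
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - start)
        window = read[start:start + overlap]
        mismatches = sum(1 for a, b in zip(window, adapter) if a != b)
        if mismatches <= max_mismatch_rate * overlap:
            return read[:start]
    return None


if __name__ == "__main__":
    insert = "AAAAAAAAAA"
    clean = insert + "TCGTATGCCGTCTTC"   # adapter start, no errors
    noisy = insert + "TCGAATGCCGTCTTC"   # one sequencing error in the adapter
    print(trim_exact(clean))             # exact rule finds the adapter
    print(trim_exact(noisy))             # exact rule misses it
    print(trim_with_mismatches(noisy))   # relaxed rule still finds it
```

The second function shows why the exact 8-letter rule loses reads: a single
error inside the adapter prefix is enough to make the read fail the match,
whereas comparing the whole overlap with a mismatch budget still recovers it.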

Sean

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
