On Mon, Jul 6, 2009 at 1:32 PM, Peter Rice <[email protected]> wrote:
>
>> I am aware of people using EMBOSS tools (I assume water) to identify
>> (known) adaptor sequences in raw Solexa/Illumina data. I considered
>> doing something similar myself when trying to remove primer sequences
>> from 454 data. Such a pipeline using the current EMBOSS water would be
>> doing this matching at a purely fixed nucleotide level (ignoring the
>> qualities), which isn't ideal. Upgrading to a probabilistic version of
>> water should be an improvement.
>
> Would be interesting.
>
> Where can I look up adaptor calling methods?

The particular example I had in mind was the thread with Giles Weaver
on the BioPerl mailing list, which I see you have just replied to:

http://lists.open-bio.org/pipermail/bioperl-l/2009-June/030398.html
http://lists.open-bio.org/pipermail/bioperl-l/2009-July/030404.html

I think I made a typo earlier (needle versus water). If you are
comparing a short but complete adaptor sequence to a read (which you
expect may contain the full adaptor), doing a global alignment is more
sensible than a local one. On re-reading, Giles did actually say he
was using needle:

http://lists.open-bio.org/pipermail/bioperl-l/2009-July/030411.html

Peter

_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
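
To illustrate the global versus local point, here is a minimal Python
sketch. It is not how needle or water are implemented: the function
names, the toy sequences and the simple match/mismatch/linear-gap
scores are all assumptions made for the example (the EMBOSS tools use
a scoring matrix and affine gap penalties). fit_score() forces the
whole adaptor to be aligned while leaving the flanking parts of the
read free, which as far as I know is roughly what needle gives you when
end gaps are not penalised. local_score() is a plain Smith-Waterman
score in the style of water.

```python
def _sub(a, b, match, mismatch):
    """Score for aligning two single bases (hypothetical toy scoring)."""
    return match if a == b else mismatch


def fit_score(adaptor, read, match=1, mismatch=-1, gap=-2):
    """Best score aligning the WHOLE adaptor to any stretch of the read.

    Read bases before and after the aligned stretch cost nothing;
    every adaptor base must be matched, mismatched or gapped.
    """
    # Row 0 is all zeros: the adaptor may start anywhere in the read.
    prev = [0] * (len(read) + 1)
    for i in range(1, len(adaptor) + 1):
        # Column 0: adaptor bases aligned against nothing are gaps.
        curr = [prev[0] + gap]
        for j in range(1, len(read) + 1):
            diag = prev[j - 1] + _sub(adaptor[i - 1], read[j - 1], match, mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    # The adaptor may end anywhere in the read.
    return max(prev)


def local_score(adaptor, read, match=1, mismatch=-1, gap=-2):
    """Best Smith-Waterman style score over any pair of substrings."""
    best = 0
    prev = [0] * (len(read) + 1)
    for i in range(1, len(adaptor) + 1):
        curr = [0]
        for j in range(1, len(read) + 1):
            diag = prev[j - 1] + _sub(adaptor[i - 1], read[j - 1], match, mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    adaptor = "ACGTACGT"                   # made-up adaptor
    full_read = "TTTTACGTACGTGGGG"         # contains the complete adaptor
    partial_read = "TTTTACGTCCCCCC"        # only a chance partial match

    for name, read in (("full", full_read), ("partial", partial_read)):
        print(name, "fit:", fit_score(adaptor, read),
              "local:", local_score(adaptor, read))
    # full    fit: 8 local: 8
    # partial fit: 2 local: 5
```

With these toy scores both approaches give 8 for the read that really
contains the adaptor. For the read with only a partial chance match,
the local score is still 5 (it simply reports the best-matching
fragment), while the score that must account for the whole adaptor
drops to 2, so a simple threshold separates the two cases more cleanly.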
