On Mon, Jul 6, 2009 at 10:35 AM, Peter Rice <[email protected]> wrote:
>
> Peter Cock or biopython wrote:
>> Hi Peter R. et al,
>>
>> I gather EMBOSS is looking for feedback for new applications (given
>> the recent funding from the BBSRC - congratulations again). How about
>> suggestions for extensions to existing EMBOSS applications?
>>
>> I've used bits of EMBOSS for several years now (thank you!). Something
>> I have sometimes wanted to do is a many-to-many pairwise sequence
>> alignment with the EMBOSS tools needle and water.
>>
>> Right now, needle and water take two files (here referred to as A and
>> B), file A has just one sequence, and file B can have one or more
>> sequences. I'd like to be able to supply two files both with multiple
>> entries, and have needle/water do pairwise alignments between all the
>> sequences in A against all the sequences in B. This might be useful
>> for finding reciprocal best hits in comparative genomics (as a slower
>> but exact alternative to FASTA or BLAST).
>
> The application is easy to add (after the release)
>
> The usual problem with all-against-all is that it involves loading one
> of the inputs as a sequence set entirely in memory - to avoid reading
> one input many times over.
>
> We have an application supermatcher which does this - the first sequence
> is streamed through, the second is a sequence set loaded into memory. It
> uses word matching to find seed alignments then runs a limited alignment
> around the hits.
>
> superwater would be a possible name (or superneedle).
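
For anyone following along, the stream-one-input, hold-the-other-in-memory
pattern described above can be sketched roughly as follows in Python. This is
only an illustration and not EMBOSS code: the names read_fasta, global_score
and all_against_all are made up for the example, and the scoring is a plain
Needleman-Wunsch score with a linear gap penalty and match/mismatch values
chosen arbitrarily, which is an assumption and much simpler than what needle
actually does.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs one record at a time from a FASTA handle."""
    name = None
    chunks: List[str] = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def global_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Hypothetical scoring values; only two rows of the matrix are kept.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def all_against_all(stream_handle: TextIO,
                    set_handle: TextIO) -> Iterator[Tuple[str, str, int]]:
    """Align every record of the streamed input against an in-memory set."""
    # The second input is loaded once and kept entirely in memory.
    seq_set = list(read_fasta(set_handle))
    # The first input is read exactly once, one record at a time.
    for name_a, seq_a in read_fasta(stream_handle):
        for name_b, seq_b in seq_set:
            yield name_a, name_b, global_score(seq_a, seq_b)


if __name__ == "__main__":
    file_a = io.StringIO(">a1\nACGT\n>a2\nAAAA\n")
    file_b = io.StringIO(">b1\nACGT\n>b2\nACG\n")
    for name_a, name_b, score in all_against_all(file_a, file_b):
        print(name_a, name_b, score)
```

Memory use here grows with the size of the second input only, so in practice
one would pass the smaller file as the in-memory set.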

I see EMBOSS 6.2 has a new tool "needleall" (although if there is a
matching "waterall" the changelog doesn't mention it):
http://lists.open-bio.org/pipermail/emboss/2010-January/003823.html

I'll have to try this out...

Peter
_______________________________________________
EMBOSS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/emboss
