Thanks Andreas. I am switching from Python/Perl, so my Java is not great, but with the implementation you mention, would I need to pass in each sequence and run the alignments one by one? SSEARCH is also 'slow' (SW), but it has a lot of optimization in place, so in the end it does not take that long to run. It's in C++ though.
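For reference, the one-by-one pattern asked about here can be sketched in plain Java without any library. This is only a minimal sketch of Smith-Waterman scoring, not BioJava's implementation and not SSEARCH's optimized one. The class name `NaiveSwSearch`, the scoring values (match +2, mismatch -1, linear gap -2) and the in-memory map standing in for a parsed FASTA file are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; not part of BioJava.
class NaiveSwSearch {
    // Assumed scoring scheme, chosen only for illustration.
    static final int MATCH = 2;
    static final int MISMATCH = -1;
    static final int GAP = -2;

    /** Best local alignment score of query against target (linear gap penalty). */
    static int score(String query, String target) {
        // Only two rows of the dynamic-programming matrix are kept in memory.
        int[] prev = new int[target.length() + 1];
        int[] curr = new int[target.length() + 1];
        int best = 0;
        for (int i = 1; i <= query.length(); i++) {
            curr[0] = 0;
            for (int j = 1; j <= target.length(); j++) {
                boolean same = query.charAt(i - 1) == target.charAt(j - 1);
                int diag = prev[j - 1] + (same ? MATCH : MISMATCH);
                int up = prev[j] + GAP;
                int left = curr[j - 1] + GAP;
                // Local alignment: a cell never drops below zero.
                curr[j] = Math.max(0, Math.max(diag, Math.max(up, left)));
                best = Math.max(best, curr[j]);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return best;
    }

    /** Runs the same query against every target in turn, one alignment per target. */
    static Map<String, Integer> search(String query, Map<String, String> targets) {
        Map<String, Integer> scores = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : targets.entrySet()) {
            scores.put(entry.getKey(), score(query, entry.getValue()));
        }
        return scores;
    }

    public static void main(String[] args) {
        // Stand-in for sequences read from a local FASTA file.
        Map<String, String> targets = new LinkedHashMap<>();
        targets.put("tx1", "GGGACGTACGTTTT");
        targets.put("tx2", "CCCCCCCC");
        System.out.println(search("ACGTACGT", targets));
    }
}
```

The cost is proportional to query length times target length for every target, which is why an unoptimized loop like this is slow against a whole transcriptome. The sketch reports scores only; recovering mismatch and gap positions would need a traceback over the full matrix.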
Peter

On Friday, 17 January 2014, 18:09, Andreas Prlic <[email protected]> wrote:

We do have a Smith Waterman implementation in Biojava. However the algorithm is based on dynamic programming, which by definition is "slow" but gives you the optimal alignment...

http://biojava.org/wiki/BioJava:CookBook3:PSA#Local_alignment

Andreas

On Fri, Jan 17, 2014 at 9:50 AM, Peter S <[email protected]> wrote:

Thanks, I will give it a try.
>
>Does it mean there is no fast implementation of SW in java that I can use?
>
>Best,
>Peter
>
>On Friday, 17 January 2014, 17:45, Khalil El Mazouari <[email protected]> wrote:
>
>Hi Peter,
>
>give it a try with Levenshtein Distance. You can use StringUtils from apache common lang. it has a getLevenshteinDistance method.
>
>best,
>
>Khalil
>
>On 17 Jan 2014, at 18:37, Peter S <[email protected]> wrote:
>
>Hi Khalil,
>>
>>By short sequence I mean 12-18 nt long. I need to make alignment against the entire transcriptome and detect matches with up to 3 mismatches. This is the reason I need something quite fast but sensitive at the same time.
>>
>>Many thanks,
>>Peter
>>
>>On Friday, 17 January 2014, 17:26, Khalil El Mazouari <[email protected]> wrote:
>>
>>Hi,
>>
>>what do you mean by short sequences? NT or AA?
>>
>>Best
>>
>>Khalil
>>
>>On 17 Jan 2014, at 18:00, [email protected] wrote:
>>
>>> Today's Topics:
>>>
>>>    1. Database search with Smith and Waterman (Peter S)
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Message: 1
>>> Date: Fri, 17 Jan 2014 13:27:17 +0000 (GMT)
>>> From: Peter S <[email protected]>
>>> Subject: [Biojava-l] Database search with Smith and Waterman
>>> To: "[email protected]" <[email protected]>
>>>
>>> Dear All,
>>>
>>> I'm looking for an implementation of Smith and Waterman algorithm to use in the Java desktop application I want to develop.
>>>
>>> I did find some information on pairwise aligners but what I would ideally want to have is something similar to the SSEARCH package that can perform alignments against a very big databases, saved locally in a fasta format. Speed is quite important and ideally I would need an output that I can easily parse, identifying mismatch/gap positions etc.
>>>
>>> Any suggestions if there is any java implementation that would fit the description? I will be working on short sequences so sensitivity is crucial.
>>>
>>> Thanks very much for your help,
>>> Peter
>>>
>>> ------------------------------
>>>
>>> _______________________________________________
>>> Biojava-l mailing list - [email protected]
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
>>>
>>> End of Biojava-l Digest, Vol 131, Issue 3
>>
>>-----
>>
>>Confidentiality Notice: This e-mail and any files transmitted with it are private and confidential and are solely for the use of the addressee. It may contain material which is legally privileged. If you are not the addressee or the person responsible for delivering to the addressee, please notify that you have received this e-mail in error and that any use of it is strictly prohibited. It would be helpful if you could notify the author by replying to it.
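The requirement stated in this thread (12-18 nt queries, hits with up to 3 mismatches) can also be sketched as a plain sliding-window scan that counts substitutions at every offset of a target sequence. This is a minimal sketch under stated assumptions: the class name `MismatchScan` and the method `find` are hypothetical, sequences are plain strings already loaded in memory, and only substitutions are counted. Unlike Levenshtein distance or Smith-Waterman, it does not allow insertions or deletions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only.
class MismatchScan {
    /**
     * Returns the 0-based start offsets in target where query aligns,
     * ungapped, with at most maxMismatches substitutions.
     */
    static List<Integer> find(String query, String target, int maxMismatches) {
        List<Integer> hits = new ArrayList<>();
        int m = query.length();
        for (int start = 0; start + m <= target.length(); start++) {
            int mismatches = 0;
            // Stop comparing this window as soon as the budget is exceeded.
            for (int k = 0; k < m && mismatches <= maxMismatches; k++) {
                if (query.charAt(k) != target.charAt(start + k)) {
                    mismatches++;
                }
            }
            if (mismatches <= maxMismatches) {
                hits.add(start);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String query = "ACGTACGTACGT";          // 12 nt query
        String target = "TTTTACGTACCTACGTTTTT"; // stand-in for one transcript
        System.out.println(find(query, target, 3));
    }
}
```

Because the reported offsets are exact, mismatch positions can be recovered by comparing the query with the window at each hit. For a full transcriptome this brute-force scan would still need an index or seeding step to be fast; that part is left out here.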
