BioJava3 adds a validation layer and extra object creation: it goes from a DNA sequence to an RNA sequence and then applies the appropriate translation rules to return a protein sequence. It could easily be twice as fast if you went directly from the DNA sequence to a ProteinSequence, which would put it at about 8 seconds. We are always going to carry a performance penalty for setting everything up as proper objects versus doing a simple String-to-String translation.
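To make the comparison concrete, here is a rough sketch of what I mean by a simple String-to-String translation. It is not BioJava code: the class name CodonTableSketch and its translate method are made up for illustration. It assumes the standard genetic code, maps unrecognised codons to 'X', and drops any trailing partial codon. It does no validation and creates no sequence objects, which is exactly the work the object-based route pays for.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of BioJava.
final class CodonTableSketch {
    // Standard genetic code, codons ordered with bases T, C, A, G
    // (first base varies slowest, third base fastest). '*' marks a stop.
    private static final String AMINO_ACIDS =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    // Maps an ASCII character to its base index (0..3), or -1 if it is not a base.
    private static final int[] BASE_INDEX = new int[128];

    static {
        Arrays.fill(BASE_INDEX, -1);
        String bases = "TCAG";
        for (int i = 0; i < bases.length(); i++) {
            char b = bases.charAt(i);
            BASE_INDEX[b] = i;
            // Accept lower case directly so no separate upper-casing pass is needed.
            BASE_INDEX[Character.toLowerCase(b)] = i;
        }
        // Treat U like T so RNA input translates the same way.
        BASE_INDEX['U'] = 0;
        BASE_INDEX['u'] = 0;
    }

    private static int index(char c) {
        return c < 128 ? BASE_INDEX[c] : -1;
    }

    // Translates frame 1 of the input in a single pass.
    static String translate(String dna) {
        int codons = dna.length() / 3;
        char[] protein = new char[codons];
        for (int i = 0; i < codons; i++) {
            int a = index(dna.charAt(3 * i));
            int b = index(dna.charAt(3 * i + 1));
            int c = index(dna.charAt(3 * i + 2));
            // If any index is -1 the bitwise OR is negative: unknown codon.
            protein[i] = (a | b | c) < 0 ? 'X' : AMINO_ACIDS.charAt(a * 16 + b * 4 + c);
        }
        return new String(protein);
    }
}
```

Calling CodonTableSketch.translate("ATGGCCTAA") gives "MA*". Because the lookup array accepts both cases, the input is only read once, with no upper-casing step in front of it.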
On Wed, Oct 13, 2010 at 12:34 PM, Pjotr Prins <[email protected]> wrote:

> On Wed, Oct 13, 2010 at 05:25:41PM +0100, Andy Yates wrote:
> > That's great news and should be even faster once we get rid of the
> > requirement to upper case since you're having to parse the same
> > sequence twice.
> >
> > I wonder what the C version does to make itself even faster
>
> The EMBOSS implementation is fastest by a mile - takes less than 3
> seconds. But the code is, uhm, hard to read.
>
> I think table lookups will win in C, whatever you try. But it may be an
> interesting exercise if we can get close. Note I am perhaps not using the
> fastest JVM.
>
> java version "1.6.0_20"
> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
> Java HotSpot(TM) Server VM (build 16.3-b01, mixed mode)
>
> Pj.
> _______________________________________________
> Biojava-l mailing list - [email protected]
> http://lists.open-bio.org/mailman/listinfo/biojava-l
