Dear all,

We now have a src-1.4 directory in BioJava, courtesy of Thomas. If you are on a 1.4-compliant platform, this source tree will be built along with src. If you are on an earlier platform, it will be silently ignored.
The first addition to this directory is an implementation of the SSAHA searching algorithm developed at the Sanger Centre. It currently doesn't scale: it is bound by a 2GB limit on hash-table size, and I'm not sure that the NIO packages in the 1.4 release are bug-free. I will be working on it to ensure that it can handle the full 2^64 byte data tables available via the C++ implementations. The Java and C++ hash tables are unlikely to be binary compatible. The Java hash tables should be network portable, assuming that you move them as binary ;-)

If someone is keen, they could write a NIO-based socket server for the SSAHA search engine so that we could set up highly efficient client-server search services (it should be able to handle 1000s of clients with NIO and a thread pool).

Also, it currently reports hits, but not as collections of HSPs. There is the possibility of doing bounded alignments using SSAHA hits as anchor points. By replacing the Packing object, we could do a codon-based SSAHA, a protein SSAHA, or any other funky alphabet you can come up with. The rule for discarding frequent words is crude at the moment (an absolute threshold), so it could be replaced with some nice histogram maths.

I don't have the time to tidy all of this, but perhaps you do. NIO rocks!

Have fun,

Matthew

_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
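For anyone who wants to play with the ideas above before reading the real code, here is a minimal in-memory sketch of an SSAHA-style word index. It is not the code in src-1.4: the class name KmerIndexSketch, the Packing interface shown here, and the parameter names are all made up for illustration, and it uses a plain HashMap rather than NIO-backed tables. It also assumes a Java 8 or later compiler, not 1.4. It packs each word into a long via a pluggable packing, indexes non-overlapping words of the subject, scans overlapping words of the query, and drops words that occur more often than an absolute threshold.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the BioJava SSAHA classes. */
public class KmerIndexSketch {
    /** Stand-in for the role of a packing: maps a symbol to a small integer code. */
    public interface Packing {
        int bitsPerSymbol();
        int pack(char symbol); // returns -1 for symbols outside the alphabet
    }

    /** 2-bit DNA packing; swap in another Packing for codons, protein, etc. */
    public static final Packing DNA = new Packing() {
        public int bitsPerSymbol() { return 2; }
        public int pack(char symbol) {
            switch (Character.toLowerCase(symbol)) {
                case 'a': return 0;
                case 'c': return 1;
                case 'g': return 2;
                case 't': return 3;
                default:  return -1;
            }
        }
    };

    private final Packing packing;
    private final int wordLength;
    private final int maxHitsPerWord; // absolute threshold for "too frequent" words
    private final Map<Long, List<Integer>> table = new HashMap<>();

    public KmerIndexSketch(Packing packing, int wordLength, int maxHitsPerWord) {
        // Keep packed words non-negative so -1 can signal "unpackable word".
        if (packing.bitsPerSymbol() * wordLength > 62) {
            throw new IllegalArgumentException("word does not fit in a long");
        }
        this.packing = packing;
        this.wordLength = wordLength;
        this.maxHitsPerWord = maxHitsPerWord;
    }

    /** Packs the word starting at offset, or returns -1 if it has an unknown symbol. */
    long packWord(String seq, int offset) {
        long word = 0;
        for (int i = 0; i < wordLength; i++) {
            int code = packing.pack(seq.charAt(offset + i));
            if (code < 0) return -1;
            word = (word << packing.bitsPerSymbol()) | code;
        }
        return word;
    }

    /** Indexes non-overlapping words of the subject sequence. */
    public void index(String subject) {
        for (int pos = 0; pos + wordLength <= subject.length(); pos += wordLength) {
            long word = packWord(subject, pos);
            if (word >= 0) {
                table.computeIfAbsent(word, k -> new ArrayList<>()).add(pos);
            }
        }
    }

    /** Slides over the query one symbol at a time; returns {queryPos, subjectPos} pairs. */
    public List<int[]> search(String query) {
        List<int[]> hits = new ArrayList<>();
        for (int q = 0; q + wordLength <= query.length(); q++) {
            long word = packWord(query, q);
            List<Integer> positions = (word < 0) ? null : table.get(word);
            // Skip unknown words and words over the absolute frequency threshold.
            if (positions == null || positions.size() > maxHitsPerWord) continue;
            for (int s : positions) hits.add(new int[] {q, s});
        }
        return hits;
    }
}
```

The threshold check in search() is the obvious place to plug in something smarter, such as a cutoff derived from the histogram of word counts, and grouping the returned pairs by (subjectPos - queryPos) would be one way to start collecting hits into diagonals for HSPs.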
