Hello everyone,

In case you don't know, SSAHA is a fast algorithm for searching sequences against a database. It converts the sequences and the database into bit-strings and then uses shifts and equality tests to find matches. Take a look at the Sanger software page to find the papers and the C/C++ implementations.
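To make the bit-string idea concrete, here is a minimal sketch in Java. It is not the SSAHA or BioJava code: the class name `KmerBits`, the 2-bit encoding (A=00, C=01, G=10, T=11) and the simple linear scan are assumptions chosen for illustration. The real algorithm uses a precomputed hash table of word positions rather than scanning the database.

```java
import java.util.ArrayList;
import java.util.List;

public class KmerBits {
    // Map a base to 2 bits. This particular encoding is an assumption.
    static int code(char base) {
        switch (base) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Unexpected base: " + base);
        }
    }

    // Pack the k bases starting at 'start' into one long, 2 bits per base.
    static long pack(String seq, int start, int k) {
        long word = 0L;
        for (int i = 0; i < k; i++) {
            word = (word << 2) | code(seq.charAt(start + i));
        }
        return word;
    }

    // Return every offset in 'db' where the packed window equals the packed query.
    // Only queries of 1 to 32 bases fit in a single long.
    static List<Integer> find(String db, String query) {
        int k = query.length();
        List<Integer> hits = new ArrayList<>();
        if (k == 0 || k > 32 || db.length() < k) {
            return hits;
        }
        long target = pack(query, 0, k);
        long mask = (k == 32) ? -1L : (1L << (2 * k)) - 1;
        long window = pack(db, 0, k);
        if (window == target) {
            hits.add(0);
        }
        for (int i = k; i < db.length(); i++) {
            // Shift in the next base and drop the oldest one via the mask.
            window = ((window << 2) | code(db.charAt(i))) & mask;
            if (window == target) {
                hits.add(i - k + 1);
            }
        }
        return hits;
    }
}
```

For example, `KmerBits.find("ACGTACGT", "CGT")` compares one packed word per position instead of comparing characters one by one.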
I'm about to delve back into the SSAHA implementation in BioJava. At the moment it uses the nio buffer mapping code to map the SSAHA hash table into memory. This is problematic because, in their infinite wisdom, the architects of nio used integer indices. We need longs when using genome-sized data sets. The nio channel API lets us deal with long offsets, but is a little trickier to use. On the other hand, it lets us use files of sizes up to Long.MAX_VALUE, which is probably big enough for searching EMBL ;-)

Anyway, I don't know if anybody is using SSAHAj, and I hope it will be binary compatible once I'm finished with it, but I thought it'd be polite to tell you before I potentially break anything. At worst, you will need to rebuild your hash tables.

Matthew

_______________________________________________
Biojava-l mailing list - [EMAIL PROTECTED]
http://biojava.org/mailman/listinfo/biojava-l
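
For reference, the long-offset channel access described above can be sketched roughly as follows. This is a minimal sketch and not the BioJava code: the class name `LongOffsetReader` and the assumption that the table stores 4-byte big-endian ints are hypothetical. It relies only on the standard `FileChannel.read(ByteBuffer, long)` call, which takes an absolute `long` position.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class LongOffsetReader {
    // Read one 4-byte int at an absolute byte offset, which may exceed Integer.MAX_VALUE.
    static int readIntAt(FileChannel channel, long offset) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(4); // big-endian by default
        long pos = offset;
        // A single read may return fewer bytes than requested, so loop until full.
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                throw new EOFException("Offset " + offset + " is past the end of the file");
            }
            pos += n;
        }
        buf.flip();
        return buf.getInt();
    }

    // Convenience wrapper that opens the file read-only for a single lookup.
    static int readIntAt(Path file, long offset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return readIntAt(channel, offset);
        }
    }
}
```

The extra trickiness is visible in the loop: unlike a mapped buffer, each lookup is an explicit read that has to handle short reads and end-of-file.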