> We have been using MUMmer3 (http://mummer.sourceforge.net) for rapid
> alignments of whole genomes, genomes and contigs, and searching for

Thanks, that looks like a good tool that I didn't know about. I noticed they advertise E. coli results, which prompted me to go back and check my own. I'd have to check the suffix tree literature to see exactly what they claim to do in 17 seconds on E. coli, but under cygwin I was able to index all matching strings of length 25 or more in about 67 seconds,

$ date;$progpath/string_test -fastas both_fasta -index 8 -length 25 -fix 12 -output 3 -filterN -filterID -status -fcompare_all> anchors ;date
Sat Nov 10 18:45:23 EST 2007
string_test.cpp177 loaded 2 fastas
Sat Nov 10 18:46:30 EST 2007

and create a coarse alignment in another 25 seconds,

$ date; $progpath/mm_align_tool -fastas both_fasta -v -pair_rules anchors -doall -pair_align 0 -output text> align1 ;date
Sat Nov 10 18:50:01 EST 2007
mm_hit_classes.h389 annotation_model.h57 Loaded 33373 pair rules.
mm_align_tool.cpp309 Doing string PAIR align with cutoff 3
mm_align_tool.h227 do_all with only one rule, did you mean -mrules?
mm_align_tool.cpp318 doing 0 vs 1
mm_align_tool.cpp326 do hit dump rules
Sat Nov 10 18:50:26 EST 2007

Do you have actual timing tests for various complete tasks, or is 17 seconds about it? So, OK, 67+25=92 seconds is not very impressive compared to 17, and I'm not sure how much of that I can blame on cygwin :) I guess once I'm sure I have a useful algorithm, I can subtract the IO time, which has been significant in many cases.

Someone also privately suggested blast's bl2seq, and I would point out that it is quite fast on pairs of 50k sequences.

Mike Marchywka
586 Saint James Walk
Marietta GA 30067-7165
404-788-1216 (C)<- leave message
989-348-4796 (P)<- emergency only
[EMAIL PROTECTED]

Note: Hotmail is blocking my mom's entire ISP, claiming it is to reduce spam but probably to force users to use hotmail. Please DON'T assume I am ignoring you, and try me on [EMAIL PROTECTED] if no reply here. Thanks.
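
P.S. For anyone wondering what "index all matching strings of length 25 or more" amounts to, here is a minimal Python sketch of one common way to do it: hash every k-mer of one sequence, look up the k-mers of the other, and extend each seed hit to the right. This is NOT the code in string_test, and it is not what MUMmer does (that uses suffix trees); the function names and the k=8 / min_len=25 defaults are my own illustrative assumptions.

```python
from collections import defaultdict


def index_kmers(seq, k):
    """Map each k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def exact_matches(a, b, k=8, min_len=25):
    """Return maximal exact matches (pos_a, pos_b, length) with length >= min_len.

    Hypothetical helper for illustration; assumes min_len >= k.
    """
    index = index_kmers(a, k)
    matches = set()
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            # Skip seeds that are not the left end of their run; the seed
            # one position to the left will report the same match.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend the seed to the right while the characters agree.
            n = k
            while i + n < len(a) and j + n < len(b) and a[i + n] == b[j + n]:
                n += 1
            if n >= min_len:
                matches.add((i, j, n))
    return sorted(matches)


if __name__ == "__main__":
    shared = "ACGTTGCAAGCTTAGGCTAACGGATCCTAG"  # 30 bases common to both
    a = "TTTT" + shared + "GG"
    b = "CC" + shared + "AAAA"
    for pos_a, pos_b, length in exact_matches(a, b):
        print(pos_a, pos_b, length)
```

On real genomes the dictionary gets large and repeats produce many seed hits, so treat this as a way to see the idea rather than something to time against the numbers above.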
_______________________________________________
BBB mailing list
[email protected]
http://www.bioinformatics.org/mailman/listinfo/bbb
