Hmm, I just downloaded all bacterial sequences from GenBank (~20 GB).
You can always easily search these files for a keyword
(a protein subsequence),
or search for a set of such subsequences simultaneously.
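
To illustrate the simultaneous search, here is a minimal Python sketch. It is only one possible way to do it, not necessarily what was actually run: the function name find_keywords and the toy sequence are made up for the example, and it assumes the sequence is already in memory as one string (for 20 GB of files you would have to read record by record).

```python
from collections import defaultdict

def find_keywords(sequence, keywords):
    """Return sorted (position, keyword) hits for all keywords.

    Keywords are grouped by length so that each length needs only one
    sliding pass over the sequence with a set lookup per window.
    """
    by_length = defaultdict(set)
    for kw in keywords:
        if kw:  # ignore empty keywords
            by_length[len(kw)].add(kw)

    hits = []
    for length, group in by_length.items():
        for i in range(len(sequence) - length + 1):
            window = sequence[i:i + length]
            if window in group:
                hits.append((i, window))
    return sorted(hits)

# Toy protein fragment and keywords, invented for illustration only.
hits = find_keywords("MKTAYIAKQR", ["TAY", "AKQ", "KT", "WWW"])
```

The cost grows with the number of distinct keyword lengths, not with the number of keywords, so a large set of equal-length subsequences costs about the same as a single one.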
With viruses I built a binary database of 16-nucleotide subsequences
and searched for all 24-nucleotide subsequences all 9 of whose overlapping
16-nucleotide subsubsequences were marked. This was pretty fast.
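
A rough Python sketch of that idea, under assumptions: "binary database" is read here as a plain presence table of 16-mers, and a Python set of 2-bit-packed integers stands in for it (a real table could be a bitmap of 4^16 = 2^32 bits, i.e. 512 MiB). The names encode_kmers, build_index and find_windows are hypothetical, and only uppercase A, C, G, T are handled; any other character breaks the run.

```python
K = 16        # length of the indexed subsequences
WINDOW = 24   # length of the reported subsequences
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode_kmers(seq, k=K):
    """Yield (start, packed_int) for every k-mer made only of A, C, G, T."""
    mask = (1 << (2 * k)) - 1
    value, valid = 0, 0
    for i, base in enumerate(seq):
        code = CODE.get(base)
        if code is None:          # N or other symbol: restart the k-mer
            value, valid = 0, 0
            continue
        value = ((value << 2) | code) & mask
        valid += 1
        if valid >= k:
            yield i - k + 1, value

def build_index(sequences, k=K):
    """Mark every k-mer that occurs in the reference sequences."""
    marked = set()
    for seq in sequences:
        for _, packed in encode_kmers(seq, k):
            marked.add(packed)
    return marked

def find_windows(query, marked, k=K, window=WINDOW):
    """Return starts of windows whose (window - k + 1) k-mers are all marked."""
    need = window - k + 1         # 9 for window=24, k=16
    hits, run, prev = [], 0, None
    for pos, packed in encode_kmers(query, k):
        if packed not in marked:
            run = 0
        elif run and pos == prev + 1:
            run += 1              # extends a run of adjacent marked k-mers
        else:
            run = 1
        prev = pos
        if run >= need:
            hits.append(pos - need + 1)
    return hits

# Invented 30-nt reference, for illustration only.
ref = "ACGTTGCAAGCTTAGGCTAACGGATCCATG"
marked = build_index([ref])
```

Each query position costs one shift, one mask and one lookup, which would be consistent with the search being fast.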
I'm not sure yet what to do with bacteria and amino acids.
A BLAST search against all bacterial sequences must be quite slow?!