Hi Xue-Li,
There is no universal measurement of biological distance. There is even controversy over the use of the term itself.
Furthermore, when you "measure" something, you want your "measurement" to be comparable to others, so we need to know what we are talking about. For example, if you do such a "measurement" on your four proteins today, you may want to compare it with "measurements" that you take tomorrow on another four proteins. Or another 500...
A generalized concept of "distance" may be applied to evaluate similarity. In doing that you might be interested in the conservation of motifs in proteins, so you might want to find out what is conserved to start with. And you might be willing to consider similarity where non-silent mutations have been involved.
You may want to use a widespread multiple sequence alignment program. The simple way is to use CLUSTALW to get a multiple sequence alignment. This program has a bunch of parameters that can be left at their default values, but the careful user would look for ways of using these parameters to better adapt to a concrete biological situation. If you use CLUSTALW, you should state which parameters you used each time. And that will depend on the concrete problem that you want to address. For example, the substitution matrix that encodes mutation rates should reflect the problem. It may not be indifferent whether you are studying proteins from mammals, prokaryotes or plants! It may be good to play with more parameters and tell the program whether you are aligning very dissimilar or very similar proteins.
But there is no harm in using CLUSTALW with all the defaults to see what happens, provided that you know that you should not use the output without thinking before stating things about "distances". CLUSTALW outputs distances in the form of a tree file. Several formats are commonly available for this. Check the documentation on the way these distances are calculated and on their possible further use.
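One way to make sure you can always state which parameters you used is to spell them all out on the command line and keep that line with your results. Here is a minimal Python sketch of that idea. It only builds and prints the command, it does not run the program. The executable name `clustalw2`, the input file name and the numeric values are assumptions for illustration; check the documentation of your own CLUSTALW version for the options and defaults that actually apply.

```python
import shlex


def clustalw_command(infile, matrix="GONNET", gapopen=10.0, gapext=0.2,
                     executable="clustalw2"):
    """Build an explicit CLUSTALW command line for a protein alignment.

    Every parameter is written out, even when it matches a default, so the
    command itself documents how the alignment was produced.
    The executable name and the default values here are assumptions.
    """
    return [
        executable,
        f"-INFILE={infile}",
        "-ALIGN",                # do a full multiple alignment
        "-TYPE=PROTEIN",
        f"-MATRIX={matrix}",     # protein weight matrix series
        f"-GAPOPEN={gapopen}",   # gap opening penalty
        f"-GAPEXT={gapext}",     # gap extension penalty
    ]


# "four_proteins.fasta" is a hypothetical file name.
cmd = clustalw_command("four_proteins.fasta", matrix="BLOSUM")

# Keep this line next to your results so the run can be repeated and compared.
print(shlex.join(cmd))
```

If you later align another four (or 500) proteins, building the command the same way makes it easy to see whether the two runs are comparable at all.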
As you may have understood by now, getting these numbers is only the beginning, the tip of the iceberg. And, mind you, if you just want to judge similarity you may opt for not using the term "distance" in the first place!
Good luck,
Pedro
--
Pedro Fernandes
Instituto Gulbenkian de Ciência
Apartado 14
2781 OEIRAS
PORTUGAL
