Re: [ccp4bb] 3D modeling program
- Dima Klenchin [EMAIL PROTECTED] wrote:

But how do we establish phylogeny? - Based on simple similarity! (Structural/morphological in early days and largely on sequence identity today). It's clearly a circular logic:

Hardly. Two sequences can be similar and non-homologous at all levels. Also, two similar proteins can be homologous at one level but not at another. It's also possible for two proteins that have no detectable similarity above random sequences to be homologous. Hence there is no circularity.

Of course there is. Just how do you establish that the two are not homologous? - By finding that they don't belong to the same branch. And how do you decide what constitutes the same branch? - By looking at how similar things are!

But you have not established that there is circularity. Logical circularity means that you assume (as an essential premise) one of your conclusions. What exactly is the argument you are criticizing, and what is the conclusion that is assumed? When we conclude that two proteins are homologous at some level, we have not assumed that they are homologous at that level. Rather, the conclusion of homology is an inference that uses similarity as relevant evidence.

Plus, presumably all living things trace their ancestry to the primordial soup - so the presence or a lack of ancestry is just a matter of how deeply one is willing to look.

This is also wrong. Even if all organisms trace back to one common ancestor, that does not mean all proteins are homologous. New protein coding genes can and do arise independently, and hence they are not homologous to any other existing proteins.

Just how do they arise independently? Would that be independent of DNA sequence? And if not, then why can't shared ancestry of the DNA sequence fully qualify for homology?

Perhaps it could (although in some cases no), but still the new protein would not be homologous to any other protein *at the protein level*.

You also ignore the levels of homology concept -- just because two proteins are homologous at one level does not mean they are homologous at others. For example, consider these three TIM barrel proteins: human IMPDH, hamster IMPDH, and chicken triose phosphate isomerase. They are all three homologous as TIM barrels. However, they are not all homologous as dehydrogenases -- only the human and hamster proteins are homologous as dehydrogenases.

... And all that is concluded based on sequence similarities [of other proteins/DNAs] to construct phylogenetic tree. So, ultimately, homology ~ similarity.

This is a non sequitur. Yes, homology inference uses similarity as evidence, but that does not mean homology is equivalent to similarity. Two facile counterexamples to your claim: two proteins can be very similar yet non-homologous, and two very dissimilar proteins can be homologous. Homology is thus not equivalent to similarity. QED.

The generic concept of homology used to be used as a proof of evolution. Today, things seem to be reversed and evolution is being used to infer homology. A useful concept turned into a statement with little or no utility.

In fact quite the opposite is true. Before evolutionary theory, homology was a vacuous, mysterious concept with no utility. It was simply the descriptive observation that similar structures could have different functions. Now we know why that is the case. You have already pointed out that we have redefined homology (evolutionary homology is not the same as generic, pre-evolutionary homology), and this fact proves that the logic is non-circular: we assume generic homology and conclude evolutionary homology. This could only be circular if the two concepts were identical, which you admit they are not. Your argument founders on an equivocation.

Cheers,
Douglas
Re: [ccp4bb] 3D modeling program
I suspect everyone is referring to Rost's twilight zone in sequence similarity where homology modeling trials had better be avoided. If so, the twilight zone would rather correspond to any indefinite or transitional condition(s) with no applicable or ever relevant binary constraint(s).

actually, it was russ doolittle who coined the term twilight zone. burkhard rost added the concept of the midnight zone.

from some of the off-list mails i've been getting some people seem to be confused by the fact that establishing the probability of common ancestry of two proteins/domains (based on structure and sequence comparison, for instance) can be very difficult, and that there may be varying degrees of evidence for or against a common ancestry hypothesis. however, this does not change the fact that they either do or do not have a common ancestor.

i also realised that the simile between homology and pregnancy can be extended: in both cases you sometimes have to worry about xenology -the possibility that lateral gene transfer has taken place- see http://www.massey.ac.nz/~kbirks/gender/whosdad.htm

--dvd

**
Gerard J. Kleywegt
[Research Fellow of the Royal Swedish Academy of Sciences]
Dept. of Cell & Molecular Biology
University of Uppsala
Biomedical Centre
Box 596
SE-751 24 Uppsala
SWEDEN
http://xray.bmc.uu.se/gerard/
mailto:[EMAIL PROTECTED]
**
The opinions in this message are fictional. Any similarity to actual opinions, living or dead, is purely coincidental.
**
Re: [ccp4bb] 3D modeling program
Having a generic dictionary definition is nice and dandy. However, in the present context, the term 'homology' has a much more specific meaning: it pertains to the having (or not) of a common ancestor. Thus, it is a binary concept. (*)

But how do we establish phylogeny? - Based on simple similarity! (Structural/morphological in early days and largely on sequence identity today). It's clearly a circular logic: Let's not use the generic definition; instead, let's use a specialized definition; and let's not notice that the specialized definition wholly depends on a system that is built using the generic definition to begin with.

Plus, presumably all living things trace their ancestry to the primordial soup - so the presence or a lack of ancestry is just a matter of how deeply one is willing to look.

In other words, it's nice and dandy to have a theoretical binary concept but in practice it is just as fuzzy as anything else.

IMHO, the phylogenetic concept of homology in biology does not buy you much of anything useful. It seems to be just a leftover from pre-Darwinian days - redefined since but still lacking solid foundation.

Dima
Re: [ccp4bb] 3D modeling program
I think we are getting a bit too philosophical on a matter which is mainly terminology.

1. To quantify how similar two proteins are, one should best refer to 'percent identity'. That's clear, correct and unambiguous.
2. One can also refer to similarity. In that case it should be clarified what is considered to be similar, mainly which comparison matrix was used to quantify the similarity.
3. Homology means common evolutionary origin. One understanding is that homology refers to the genome of 'LUCA', the hypothetical last universal common ancestor.

I am not an evolutionary biologist, but I would clearly disagree that homology is a leftover pre-Darwinian term. The very notion of homology is only meaningful in the context of evolution.

Thus, to me:

1. "These proteins are 56% identical" is clear.
2. "These proteins are 62% similar" is unclear.
3. "These proteins are 62% similar using the Dayhoff-50 matrix" is Ok.
4. "These proteins are homologous" is clear, but can be subjective as to what homology is.
5. "These proteins are 32% homologous" is simply wrong.

Sorry for the non-crystallographic late evening blabber.

A.

On 6 Dec 2008, at 21:09, Dima Klenchin wrote:

Having a generic dictionary definition is nice and dandy. However, in the present context, the term 'homology' has a much more specific meaning: it pertains to the having (or not) of a common ancestor. Thus, it is a binary concept. (*)

But how do we establish phylogeny? - Based on simple similarity! (Structural/morphological in early days and largely on sequence identity today). It's clearly a circular logic: Let's not use the generic definition; instead, let's use a specialized definition; and let's not notice that the specialized definition wholly depends on a system that is built using the generic definition to begin with.

Plus, presumably all living things trace their ancestry to the primordial soup - so the presence or a lack of ancestry is just a matter of how deeply one is willing to look.

In other words, it's nice and dandy to have a theoretical binary concept but in practice it is just as fuzzy as anything else.

IMHO, the phylogenetic concept of homology in biology does not buy you much of anything useful. It seems to be just a leftover from pre-Darwinian days - redefined since but still lacking solid foundation.

Dima
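
The difference between "percent identical" and "percent similar" in the message above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two aligned strings are invented, and the two residue groupings are hypothetical stand-ins for a substitution matrix, not the Dayhoff matrix or any other published one. The point is just that, for one fixed alignment, the similarity figure moves with the grouping while the identity figure does not.

```python
# Toy illustration: percent identity vs. percent similarity for one fixed alignment.
# The sequences and both residue groupings are hypothetical, not real matrices.

def percent_identity(a, b):
    """Identical residues over gap-free aligned columns, as a percentage."""
    cols = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    same = sum(1 for x, y in cols if x == y)
    return 100.0 * same / len(cols)

def percent_similarity(a, b, groups):
    """Gap-free columns whose residues are identical or share a group."""
    group_of = {res: i for i, grp in enumerate(groups) for res in grp}
    cols = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    alike = sum(
        1 for x, y in cols
        if x == y or (x in group_of and group_of.get(x) == group_of.get(y))
    )
    return 100.0 * alike / len(cols)

seq_a = "MKVLDEAI"
seq_b = "MRILEDGV"

strict_groups = ["KR", "DE"]                # only charge-conserving swaps count
loose_groups = ["KR", "DE", "ILVM", "AG"]   # also aliphatic and small residues

print(percent_identity(seq_a, seq_b))
print(percent_similarity(seq_a, seq_b, strict_groups))
print(percent_similarity(seq_a, seq_b, loose_groups))
```

With these made-up inputs the same pair is 25% identical, 62.5% similar under the strict grouping and 100% similar under the loose one, which is why a bare "62% similar" says little until the scoring scheme is named.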
Re: [ccp4bb] 3D modeling program
I agree with previous posts that the reality of inferring evolutionary relationships is often messy, but there is no excuse for being unclear on the concepts and, in particular, for use of the "% homology" construct, still far too common in supposedly good journals.

BTW, %identity is clear but not always unambiguous...

May AC. Percent sequence identity; the need to be explicit. Structure. 2004 May;12(5):737-8. PMID: 15130466

Dan

On Sat, 2008-12-06 at 21:33 +0100, Anastassis Perrakis wrote:

I think we are getting a bit too philosophical on a matter which is mainly terminology.

1. To quantify how similar two proteins are, one should best refer to 'percent identity'. That's clear, correct and unambiguous.
2. One can also refer to similarity. In that case it should be clarified what is considered to be similar, mainly which comparison matrix was used to quantify the similarity.
3. Homology means common evolutionary origin. One understanding is that homology refers to the genome of 'LUCA', the hypothetical last universal common ancestor.

I am not an evolutionary biologist, but I would clearly disagree that homology is a leftover pre-Darwinian term. The very notion of homology is only meaningful in the context of evolution.

Thus, to me:

1. "These proteins are 56% identical" is clear.
2. "These proteins are 62% similar" is unclear.
3. "These proteins are 62% similar using the Dayhoff-50 matrix" is Ok.
4. "These proteins are homologous" is clear, but can be subjective as to what homology is.
5. "These proteins are 32% homologous" is simply wrong.

Sorry for the non-crystallographic late evening blabber.

A.

On 6 Dec 2008, at 21:09, Dima Klenchin wrote:

Having a generic dictionary definition is nice and dandy. However, in the present context, the term 'homology' has a much more specific meaning: it pertains to the having (or not) of a common ancestor. Thus, it is a binary concept. (*)

But how do we establish phylogeny? - Based on simple similarity! (Structural/morphological in early days and largely on sequence identity today). It's clearly a circular logic: Let's not use the generic definition; instead, let's use a specialized definition; and let's not notice that the specialized definition wholly depends on a system that is built using the generic definition to begin with.

Plus, presumably all living things trace their ancestry to the primordial soup - so the presence or a lack of ancestry is just a matter of how deeply one is willing to look.

In other words, it's nice and dandy to have a theoretical binary concept but in practice it is just as fuzzy as anything else.

IMHO, the phylogenetic concept of homology in biology does not buy you much of anything useful. It seems to be just a leftover from pre-Darwinian days - redefined since but still lacking solid foundation.

Dima

--
Dr Daniel John Rigden          Tel:(+44) 151 795 4467
School of Biological Sciences  FAX:(+44) 151 795 4406
Room 101, Biosciences Building
University of Liverpool
Crown St., Liverpool L69 7ZB, U.K.
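
One way a percent-identity figure can be ambiguous, even for a single fixed alignment, is the choice of denominator. The sketch below is an illustration under stated assumptions rather than a summary of the paper cited above: the aligned strings are invented, and the function name `identity_variants` and its four dictionary keys are hypothetical labels chosen for this example.

```python
# Sketch: one pairwise alignment, several defensible "percent identity" values.
# The aligned strings are invented for illustration; "-" marks a gap.

def identity_variants(a, b):
    """Return percent identity computed over four different denominators."""
    assert len(a) == len(b), "aligned strings must have equal length"
    # Columns where both sequences carry the same residue (gaps never match).
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    # Columns where neither sequence has a gap.
    aligned_cols = sum(1 for x, y in zip(a, b) if x != "-" and y != "-")
    len_a = len(a.replace("-", ""))
    len_b = len(b.replace("-", ""))
    return {
        "over_aligned_columns": 100.0 * matches / aligned_cols,
        "over_alignment_length": 100.0 * matches / len(a),
        "over_shorter_sequence": 100.0 * matches / min(len_a, len_b),
        "over_longer_sequence": 100.0 * matches / max(len_a, len_b),
    }

aln_a = "MKTAYIAKQR-LGS"
aln_b = "MKT--IAKHRGLG-"

for name, value in identity_variants(aln_a, aln_b).items():
    print(f"{name}: {value:.1f}%")
```

For this toy alignment the four conventions give values between roughly 64% and 90%, so quoting "%identity" without saying which denominator (and which alignment) was used leaves room for misreading.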
Re: [ccp4bb] 3D modeling program
But how do we establish phylogeny? - Based on simple similarity!

ah! the old rhetorical trick of changing the problem or question a posteriori! all i pointed out was that things can't be 25% homologous (well, i can think of a contrived example in which two four-domain proteins have one homologous domain in common, but that's not how the concept is normally (ab)used)

current thinking about support for a hypothesis of common ancestry is summarised here (thank you, Wayback Machine! the prime source of webpages you want to find again but that have disappeared from the web altogether - see: http://www.archive.org/web/web.php):

http://web.archive.org/web/20061020081239/http://opbs.okstate.edu/~melcher/ProtEvolOut.html

# Summary of current views

* Statistically significant sequence and structural similarity strongly imply common ancestry
* Statistically significant sequence or structural similarity
  o weakly imply common ancestry;
  o could result from convergent evolution, often not considered seriously enough. (It's still evolution, though!)
* Functional similarity supports a common ancestry hypothesis, but is not sufficient to prove it. Functional dissimilarity does not disprove common ancestry.
* Intelligent design is a near-sighted and unrealistic argument, inconsistent with known properties of chemical and biological systems.

IMHO, the phylogenetic concept of homology in biology does not buy you much of anything useful. It seems to be just a leftover from pre-Darwinian days - redefined since but still lacking solid foundation.

i'm glad your opinion is humble here, because it has much to be humble about :-) do you really think that property (e.g., structure and function) prediction is not useful? and i can't even begin to understand how you can think that 'homology' in its present-day meaning is a pre-darwinian concept.

okay, so can we all agree now that we won't be saying and writing things like "the two proteins are X% homologous" anymore from now on?

--dvd

**
Gerard J. Kleywegt
[Research Fellow of the Royal Swedish Academy of Sciences]
Dept. of Cell & Molecular Biology
University of Uppsala
Biomedical Centre
Box 596
SE-751 24 Uppsala
SWEDEN
http://xray.bmc.uu.se/gerard/
mailto:[EMAIL PROTECTED]
**
The opinions in this message are fictional. Any similarity to actual opinions, living or dead, is purely coincidental.
**
Re: [ccp4bb] 3D modeling program
But how do we establish phylogeny? - Based on simple similarity!

ah! the old rhetorical trick of changing the problem or question a posteriori! all i pointed out was that things can't be 25% homologous

Well, you were right that in today's definition things can't be. But you seem to be missing my point that today's definition is essentially meaningless (relies on circular logic and has no epistemologic value) and that nothing would be lost if the term reverted to its generic usage, "similar". There would still be a question to be asked, "similar for what reason?" - the same question that is presumed to be answered whenever one invokes phylogeny-based homology.

i'm glad your opinion is humble here, because it has much to be humble about :-) do you really think that property (e.g., structure and function) prediction is not useful? and i can't even begin to understand how you can think that 'homology' in its present-day meaning is a pre-darwinian concept.

Homology is a pre-Darwinian concept that was *redefined* post-Darwin. That's what I wrote.

okay, so can we all agree now that we won't be saying and writing things like "the two proteins are X% homologous" anymore from now on?

IMHO, it truly does not matter if we do or do not as long as we understand each other. Like I wrote in the original reply, paying too much attention to definitions of fuzzy abstract concepts is not worth it.

Dima
Re: [ccp4bb] 3D modeling program
- Dima Klenchin [EMAIL PROTECTED] wrote:

But how do we establish phylogeny? - Based on simple similarity!

This is a common misconception. Modern phylogenetic methods (Bayesian, maximum likelihood, and some distance-based) rely on explicit models of molecular evolution, and the *patterns* of similarity they create. Even maximum parsimony, which is not model-based, does not reconstruct phylogenies based on simple similarity.

ah! the old rhetorical trick of changing the problem or question a posteriori! all i pointed out was that things can't be 25% homologous

Well, you were right that in today's definition things can't be. But you seem to be missing my point that today's definition is essentially meaningless (relies on circular logic and has no epistemologic value) and that nothing would be lost if the term reverted to its generic usage, "similar". There would still be a question to be asked, "similar for what reason?" - the same question that is presumed to be answered whenever one invokes phylogeny-based homology.

How does this make any sense? Two proteins can have certain similarities in sequence (or structure) due to either convergence or homology. That is the answer to your question of "similar for what reason", and hence you have just shown that similarity is not the same as homology, and that homology is not meaningless.

i'm glad your opinion is humble here, because it has much to be humble about :-) do you really think that property (e.g., structure and function) prediction is not useful? and i can't even begin to understand how you can think that 'homology' in its present-day meaning is a pre-darwinian concept.

Homology is a pre-Darwinian concept that was *redefined* post-Darwin. That's what I wrote.

okay, so can we all agree now that we won't be saying and writing things like "the two proteins are X% homologous" anymore from now on?

IMHO, it truly does not matter if we do or do not as long as we understand each other.

You are hard to understand if you say that two proteins are 25% homologous. Do you mean that one domain, out of four, is homologous between the proteins? That is the only sense in which that could be construed as correct.

Like I wrote in the original reply, paying too much attention to definitions of fuzzy abstract concepts is not worth it.

The homology concept is often misunderstood, that is true. But there are still blatantly incorrect uses, and substituting "25% homologous" for "25% similar" is unequivocally wrong.

An important point to note is that homology must be qualified. There are levels of homology, and a structure can be homologous at one level but not at another. The classic example is bird and bat wings. They are homologous as vertebrate forelimbs, but not as wings.
Re: [ccp4bb] 3D modeling program
- Anastassis Perrakis [EMAIL PROTECTED] wrote:

I think we are getting a bit too philosophical on a matter which is mainly terminology.

1. To quantify how similar two proteins are, one should best refer to 'percent identity'. That's clear, correct and unambiguous.
2. One can also refer to similarity. In that case it should be clarified what is considered to be similar, mainly which comparison matrix was used to quantify the similarity.
3. Homology means common evolutionary origin. One understanding is that homology refers to the genome of 'LUCA', the hypothetical last universal common ancestor.

I am not an evolutionary biologist, but I would clearly disagree that homology is a leftover pre-Darwinian term. The very notion of homology is only meaningful in the context of evolution.

Thus, to me: 1. "These proteins are 56% identical" is clear.

Even this is unclear without qualification. Identity is always determined by alignment, and you can get different %ID by using different matrices.

2. "These proteins are 62% similar" is unclear.
3. "These proteins are 62% similar using the Dayhoff-50 matrix" is Ok.
4. "These proteins are homologous" is clear, but can be subjective as to what homology is.
5. "These proteins are 32% homologous" is simply wrong.

Sorry for the non-crystallographic late evening blabber.

A.

On 6 Dec 2008, at 21:09, Dima Klenchin wrote:

Having a generic dictionary definition is nice and dandy. However, in the present context, the term 'homology' has a much more specific meaning: it pertains to the having (or not) of a common ancestor. Thus, it is a binary concept. (*)

But how do we establish phylogeny? - Based on simple similarity! (Structural/morphological in early days and largely on sequence identity today). It's clearly a circular logic: Let's not use the generic definition; instead, let's use a specialized definition; and let's not notice that the specialized definition wholly depends on a system that is built using the generic definition to begin with.

Plus, presumably all living things trace their ancestry to the primordial soup - so the presence or a lack of ancestry is just a matter of how deeply one is willing to look.

In other words, it's nice and dandy to have a theoretical binary concept but in practice it is just as fuzzy as anything else.

IMHO, the phylogenetic concept of homology in biology does not buy you much of anything useful. It seems to be just a leftover from pre-Darwinian days - redefined since but still lacking solid foundation.

Dima
Re: [ccp4bb] 3D modeling program
- Dima Klenchin [EMAIL PROTECTED] wrote:

Having a generic dictionary definition is nice and dandy. However, in the present context, the term 'homology' has a much more specific meaning: it pertains to the having (or not) of a common ancestor. Thus, it is a binary concept. (*)

But how do we establish phylogeny? - Based on simple similarity! (Structural/morphological in early days and largely on sequence identity today). It's clearly a circular logic:

Hardly. Two sequences can be similar and non-homologous at all levels. Also, two similar proteins can be homologous at one level but not at another. It's also possible for two proteins that have no detectable similarity above random sequences to be homologous. Hence there is no circularity.

Let's not use the generic definition; instead, let's use a specialized definition; and let's not notice that the specialized definition wholly depends on a system that is built using the generic definition to begin with.

Plus, presumably all living things trace their ancestry to the primordial soup - so the presence or a lack of ancestry is just a matter of how deeply one is willing to look.

This is also wrong. Even if all organisms trace back to one common ancestor, that does not mean all proteins are homologous. New protein coding genes can and do arise independently, and hence they are not homologous to any other existing proteins.

You also ignore the levels of homology concept -- just because two proteins are homologous at one level does not mean they are homologous at others. For example, consider these three TIM barrel proteins: human IMPDH, hamster IMPDH, and chicken triose phosphate isomerase. They are all three homologous as TIM barrels. However, they are not all homologous as dehydrogenases -- only the human and hamster proteins are homologous as dehydrogenases.

In other words, it's nice and dandy to have a theoretical binary concept but in practice it is just as fuzzy as anything else.

IMHO, the phylogenetic concept of homology in biology does not buy you much of anything useful. It seems to be just a leftover from pre-Darwinian days - redefined since but still lacking solid foundation.

Dima
Re: [ccp4bb] 3D modeling program
But how do we establish phylogeny? - Based on simple similarity! (Structural/morphological in early days and largely on sequence identity today). It's clearly a circular logic: Hardly. Two sequences can be similar and non-homologous at all levels. Also, two similar proteins can be homologous at one level but not at another. It's also possible for two proteins that have no detectable similarity above random sequences to be homologous. Hence there is no circularity. Of course there is. Just how do you establish that the two are not homologous? - By finding that they don't belong to the same branch. And how do you decide what constitutes the same branch? - By looking at how similar things are! Plus, presumably all living things trace their ancestry to the primordial soup - so the presence or a lack of ancestry is just a matter of how deeply one is willing to look. This is also wrong. Even if all organisms trace back to one common ancestor, that does not mean all proteins are homologous. New protein coding genes can and do arise independently, and hence they are not homologous to any other existing proteins. Just how do they arise independently? Would that be independent of DNA sequence? And if not, then why can't shared ancestry of the DNA sequence fully qualify for homology? You also ignore the levels of homology concept -- just because two proteins are homologous at one level does not mean they are homologous at others. For example, consider these three TIM barrel proteins: human IMPDH, hamster IMPDH, and chicken triose phosphate isomerase. They are all three homologous as TIM barrels. However, they are not all homologous as dehydrogenases -- only the human and hamster proteins are homologous as dehydrogenases. ... And all that is concluded based on sequence similarities [of other proteins/DNAs] to construct phylogenetic tree. So, ultimately, homology ~ similarity. The generic concept of homology used to be used as a proof of evolution. 
Today, things seem to be reversed and evolution is being used to infer homology. A useful concept turned into a statement with little or no utility. Dima
Re: [ccp4bb] 3D modeling program
Folks, This discussion is now dangerously close to a philosophical discourse regarding the differences between homoplasy, homology, and analogy. Throw into the mix synapomorphy and symplesiomorphy - and we've got ourselves a cladistic analysis soup sprinkled with the croutons of phylogeny. I do not claim to even be a novice in this field as my knowledge of the associated science(*) is microscopic -- but I do have a deep respect for the underlying philosophy, logic, and mathematics and therefore would hazard to suggest the following: Maximum likelihood, maximum parsimony, or Bayesian inference (or other approaches) are all 'apparently good' methods that have found many practically useful applications. We've adopted many of the terms from statistics and taxonomy - and sometimes we inadvertently twist their meaning to the point of error. May we all be forgiven for this - because the alternative to such lighthearted forgiveness is the requirement for absolute technical correctness of every piece of scientific text anyone has ever published. I know that I cannot pass the perfection test, and I do not think that any of us can, either. I think that we don't just live in glass houses - a more correct analogy in this case would be houses built of soap bubbles. With this in mind I'd like to wish us all Happy Holidays (whichever ones you prefer to celebrate). May your structures grow fat and happy. Artem * It is helpful to remember that the terminology we (structural biologists) use to compare protein structures and sequences is mostly derived from advanced statistics and taxonomic analysis that both predate structural biology (in its modern sense) by a fair margin. While it is fun and useful to assign relationships and build taxonomic trees - it may help to remember that what we end up with are models and/or estimates. 
We cannot entirely avoid bias in taxonomic statistical analysis because optimality criteria are something we come up with ourselves, and there is no inherent principle by which they can be judged. -Original Message- From: CCP4 bulletin board [mailto:[EMAIL PROTECTED] On Behalf Of Douglas Theobald Sent: Saturday, December 06, 2008 9:12 PM To: CCP4BB@JISCMAIL.AC.UK Subject: Re: [ccp4bb] 3D modeling program - Dima Klenchin [EMAIL PROTECTED] wrote: But how do we establish phylogeny? - Based on simple similarity! This is a common, but erroneous, misconception. Modern phylogenetic methods (Bayesian, maximum likelihood, and some distance-based) rely on explicit models of molecular evolution, and the *patterns* of similarity they create. Even maximum parsimony, which is not model-based, does not reconstruct phylogenies based on simple similarity. ah! the old rhetorical trick of changing the problem or question a posteriori! all i pointed out was that things can't be 25% homologous Well, you were right that in today's definition things can't be. But you seem to be missing my point that today's definition is essentially meaningless (relies on circular logic and has no epistemologic value) and that nothing would be lost if the term reverted to its generic usage, similar. There would still be a question to be asked similar for what reason? - same question that is presumed to be answered whenever one invokes phylogeny-based homology. How does this make any sense? Two proteins can have certain similarities in sequence (or structure) due to either convergence or homology. That is the answer to your question of similar for what reason, and hence you have just shown that similarity is not the same as homology, and that homology is not meaningless. i'm glad your opinion is humble here, because it has much to be humble about :-) do you really think that property (e.g., structure and function) prediction is not useful? 
and i can't even begin to understand how you can think that 'homology' in its present-day meaning is a pre-darwinian concept.

Homology is a pre-Darwinian concept that was *redefined* post-Darwin. That's what I wrote.

okay, so can we all agree now that we won't be saying and writing things like "the two proteins are X% homologous" anymore from now on?

IMHO, it truly does not matter if we do or do not as long as we understand each other.

You are hard to understand if you say that two proteins are 25% homologous. Do you mean that one domain, out of four, is homologous between the proteins? That is the only sense in which that could be construed as correct.

Like I wrote in the original reply, paying too much attention to definitions of fuzzy abstract concepts is not worth it.

The homology concept is often misunderstood, that is true. But there are still blatantly incorrect uses, and substituting "25% homologous" for "25% similar" is unequivocally wrong.

An important point to note is that homology must be qualified. There are levels of homology, and a structure can be homologous at one level but not at another. The classic example is bird and bat wings