Re: [R-sig-phylo] Multistate Trait Polymorphism
Yes, Joe is correct: there is more to this problem than meets the eye. My implementation assumes equal probability of each unknown state, which is quite different from modeling an actual polymorphic character. I'm sure that doing something different might matter in many cases.

lh

On Apr 7, 2011, at 8:14 AM, Joe Felsenstein wrote:

Luke Harmon said --

Yes Emmanuel is correct, fitDiscrete does not deal with polymorphic data. I have a fix that I made for a specific project that I'm sending to Charles; if anyone else is interested, email me off-list. It's very clunky.

I suspect this is not just a technical programming issue or a matter of standardizing formats of files, but depends on what you want to assume about the mode of evolution of a polymorphic character. Not a trivial matter at all, and not one where you just want to accept any old arbitrary rule.

For example, there is a very old (1967) parsimony method called polymorphism parsimony, but it makes specific assumptions -- namely that polymorphism is hard to retain along a lineage, easy to lose but hard to regain. So do you want to assume that, or what?

Joe

Joe Felsenstein j...@gs.washington.edu
Department of Genome Sciences and Department of Biology, University of Washington, Box 355065, Seattle, WA 98195-5065 USA

Luke Harmon
Assistant Professor
Biological Sciences
University of Idaho
208-885-0346
lu...@uidaho.edu

___
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Re: [R-sig-phylo] Multistate Trait Polymorphism
What the polymorphism represents (ambiguity vs. an actual polymorphism) is key. In my case, I am trying to code an actual polymorphism -- that is, a given taxon exhibits multiple states of a given trait. However, my taxonomic level is the family, so these aren't polymorphisms in the population sense, but rather different species with different traits.

From a purely practical standpoint, does it seem reasonable to re-draw the taxon (family) as a polytomy with short branch lengths and have each new tip exhibit a different character state? That is, I would be representing a given family as a polytomy with as many taxa as polymorphic states. In species with significant population structure correlated with a given polymorphism, would this 'practical' approach be applicable as well?

C

On Fri, Apr 8, 2011 at 8:03 AM, Joe Felsenstein j...@gs.washington.edu wrote:

Luke Harmon wrote:

Yes Joe is correct, there is more to this problem than meets the eye. My implementation assumes equal probability of each unknown state, which is quite different from modeling an actual polymorphic character. I'm sure that doing something different might matter in many cases.

Assuming equal probability of each possible state might be thought of as a model of ambiguity of state, not polymorphism. But even for that it is not a complete likelihood treatment. In likelihood machinery, one uses conditional likelihoods, which give a likelihood of 1 to each possible state. This is not as crazy as it sounds (see pages 255-256 of my book). It is simply that what we have in the conditional likelihoods is NOT the probability of the state, but the probability of the ambiguous observation given the state.

So, for example, if we see a purine but don't know whether it is A or G (in a DNA sequence case), the probability of seeing purine, given that we only can see purineness or pyrimidineness, and the state really is A, is 1, and similarly if it is really G. So the conditional likelihoods for the four nucleotides (in the order A, C, G, T) are (1,0,1,0). Sounds wrong but it isn't.

Polymorphism is totally different: you have actually seen both states. For discrete 0/1 characters, one can use Sewall Wright's (1934) threshold model, which I have discussed (briefly in the book and more extensively in a 2005 paper in the Philosophical Transactions of the Royal Society B). I have a paper under revision at a major journal about it and will release my program Threshml soon in a pre-PHYLIP version. Unlike Mark Pagel and Paul Lewis's Mk model, it predicts polymorphism in a natural way. The population has an underlying unobservable quantitative character, the liability, that implies some frequency of both 0 and 1 states. I think Ted Garland and others also use a log-linear model that has somewhat similar properties but is not exactly the same.

To get these models to deal with multiple character states is possible but very very nontrivial. If you see states 0, 1, 2, is 1 intermediate between 0 and 2, or is it off at right angles to both? There are possible threshold models that could do either -- telling the difference between them requires lots of data. With, say, 6 states it would be a nightmare.

Joe

Joe Felsenstein, j...@gs.washington.edu
Dept. of Genome Sciences, Univ. of Washington
Box 355065, Seattle, WA 98195-5065 USA
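
The ambiguity coding Joe describes can be illustrated with a small sketch. It is written in Python purely for illustration and is not taken from PHYLIP, ape, or geiger. The two-tip tree, the equal-rates substitution model, the branch lengths, and all function names are assumptions made for this example. It shows that a tip coded (1,0,1,0) gives a likelihood equal to the sum of the likelihoods obtained by coding the tip as A and as G.

```python
import math

STATES = "ACGT"


def tip_vector(observed):
    """Conditional likelihoods at a tip: P(observation | true state).

    `observed` lists the states compatible with what was seen,
    e.g. "AG" for "a purine". Compatible states get 1, the rest 0.
    """
    return [1.0 if s in observed else 0.0 for s in STATES]


def equal_rates_p(t, rate=1.0, k=4):
    """Transition matrix P(t) for an equal-rates model with k states.

    `rate` is the instantaneous rate of change to each other state
    (an assumption chosen for this illustration).
    """
    decay = math.exp(-k * rate * t)
    same = 1.0 / k + (k - 1.0) / k * decay
    diff = 1.0 / k - decay / k
    return [[same if i == j else diff for j in range(k)] for i in range(k)]


def cherry_likelihood(tip1, t1, tip2, t2, rate=1.0):
    """Likelihood of two tips joined at a root with a uniform root prior."""
    k = len(STATES)
    p1 = equal_rates_p(t1, rate, k)
    p2 = equal_rates_p(t2, rate, k)
    total = 0.0
    for x in range(k):
        # Probability of everything below the root, given root state x.
        down1 = sum(p1[x][y] * tip1[y] for y in range(k))
        down2 = sum(p2[x][y] * tip2[y] for y in range(k))
        total += (1.0 / k) * down1 * down2
    return total


purine = tip_vector("AG")
lik_purine = cherry_likelihood(purine, 0.1, tip_vector("A"), 0.2)
lik_a = cherry_likelihood(tip_vector("A"), 0.1, tip_vector("A"), 0.2)
lik_g = cherry_likelihood(tip_vector("G"), 0.1, tip_vector("A"), 0.2)

print(purine)
print(lik_purine, lik_a + lik_g)
```

The two numbers on the last line agree, which is the sense in which the vector holds the probability of the ambiguous observation given each state rather than the probability of each state.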
Re: [R-sig-phylo] Multistate Trait Polymorphism
Luke Harmon said --

Yes Emmanuel is correct, fitDiscrete does not deal with polymorphic data. I have a fix that I made for a specific project that I'm sending to Charles; if anyone else is interested, email me off-list. It's very clunky.

I suspect this is not just a technical programming issue or a matter of standardizing formats of files, but depends on what you want to assume about the mode of evolution of a polymorphic character. Not a trivial matter at all, and not one where you just want to accept any old arbitrary rule.

For example, there is a very old (1967) parsimony method called polymorphism parsimony, but it makes specific assumptions -- namely that polymorphism is hard to retain along a lineage, easy to lose but hard to regain. So do you want to assume that, or what?

Joe

Joe Felsenstein j...@gs.washington.edu
Department of Genome Sciences and Department of Biology, University of Washington, Box 355065, Seattle, WA 98195-5065 USA
Re: [R-sig-phylo] Multistate Trait Polymorphism
Hi,

Charles Willis wrote on 06/04/2011 01:24:

Hi,

I want to run some comparative methods in R (ace, fitDiscrete, etc.) on a categorical trait that has 6 states. My problem is that certain taxa are polymorphic for this trait (they have multiple states). Is there a way to code polymorphisms in R? I know you can code polymorphisms in Mesquite (e.g., 12) and in BayesTraits (e.g., 12), but I cannot seem to find a description of how to code similar data in R. Importing polymorphic data from Mesquite (read.nexus) doesn't appear to be an option.

I plan to run the analyses coding the trait as 6 independent binary traits, but I wanted to see if it was possible to run it as a polymorphic multistate trait as well.

ace() takes multistate characters into account. Apparently not fitDiscrete().

Best,

Emmanuel

Thanks!

Charlie

Duke University
Department of Biology
125 Science Drive
Durham NC 27708
CP (605) 553-1057
charlie.wil...@duke.edu
http://www.duke.edu/~cgw6/

--
Emmanuel Paradis
IRD, Jakarta, Indonesia
http://ape.mpl.ird.fr/
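
Charles's fallback plan, coding the 6-state trait as 6 binary traits, can be sketched as follows. The sketch is in Python purely for illustration. The taxon names, the assumption that states are labelled with the single digits 1 to 6, and the function name `to_binary_traits` are all hypothetical. The code string format (e.g. "12" for a taxon showing states 1 and 2) follows the example given in the message above.

```python
def to_binary_traits(codes, n_states=6):
    """Map each taxon's code string to a list of 0/1 presence flags.

    `codes` maps taxon name -> string of single-digit state labels,
    e.g. "12" for a taxon polymorphic for states 1 and 2.
    """
    table = {}
    for taxon, code in codes.items():
        present = {int(ch) for ch in code}
        out_of_range = sorted(s for s in present if not 1 <= s <= n_states)
        if out_of_range:
            raise ValueError(
                f"{taxon}: states {out_of_range} outside 1..{n_states}"
            )
        # One column per state: 1 if the taxon shows that state, else 0.
        table[taxon] = [
            1 if s in present else 0 for s in range(1, n_states + 1)
        ]
    return table


# Hypothetical data: three families, two of them polymorphic.
codes = {"FamilyA": "1", "FamilyB": "12", "FamilyC": "356"}
binary = to_binary_traits(codes)
for taxon, flags in binary.items():
    print(taxon, flags)
```

Each column of the resulting table can then be treated as its own 0/1 character in a separate analysis.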