Re: [R-sig-phylo] Constraining node values in an OUCH analysis (Aaron King)
Re: Constraining node values in an OUCH analysis (Krzysztof Bartoszek)
Hi Nathan,

A somewhat late answer because of vacation, but you can also try my mvSLOUCH package (on CRAN). While it (still) does not allow for explicit fossil species, you can add a very short tip branch at the place where the fossil should be. For missing observations you write NAs; the package has no problem handling missing data on some variables in a species. It also allows for non-ultrametric trees.

Hope this will be useful if you are still working on this!

Best wishes,
Krzysztof

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
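As a rough illustration of the data side of this suggestion, the sketch below builds a trait matrix in which a fossil, attached to the tree as a very short tip branch, has NA for the trait it does not preserve. The species names, trait names, and values are all made up, and the step of passing the matrix to mvSLOUCH is deliberately left out because the exact call depends on the package version.

```r
# Hypothetical tip labels: three extant species plus one fossil that has
# been attached to the tree as a very short tip branch.
tips <- c("sp_A", "sp_B", "sp_C", "fossil_1")

# Made-up measurements; NA marks a trait the fossil does not preserve.
traits <- cbind(
  femur_length = c(10.2, 11.5, 9.8, 10.9),
  skull_width  = c(4.1, 4.6, 3.9, NA)
)
rownames(traits) <- tips   # row names should match the tree's tip labels

# Count missing variables per species and show the affected ones.
n_missing <- rowSums(is.na(traits))
print(n_missing[n_missing > 0])
```

Keeping the fossil's row in the matrix, rather than dropping it, is what lets the observed trait still contribute to the fit.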
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Hi Daniel,

There’s a difference between a method being able to handle fossil data, that is, a dataset consisting of a non-ultrametric tree and data for all tips including non-contemporaneous ones, and a method allowing you to directly specify trait values at nodes. Most trait evolution methods allow you to do the former (I don’t know for sure but I expect OUCH does). For the latter, which you want to do, there is a function in geiger (described in Slater, Harmon, and Alfaro 2012 Evolution) that allows you to place informative prior probability distributions on node trait values based on the fossil record. But this only allows for fitting simple models and not complex OU scenarios you might want to test.

As a hack, I’d suggest adding zero length branches to all nodes in your tree and assigning your reconstructed node values to these. This will produce identical results to specifying node values directly. Zero length branches can be problematic for matrix operations required to compute likelihoods in R and so you might need to explore minimally short branch lengths (10^-5 time units has worked for me in the past). This all should have the same effect as specifying node values, but Aaron will need to confirm that it would work in OUCH.

I would question though whether this is a good strategy - you’re assuming your ML estimates of node states, presumably inferred under BM, are robust enough to be fixed for subsequent macroevolutionary analyses. Given how dicey ASRs are, even when you include fossil data, this seems a big stretch. If this is a route you really want to go, perhaps explore using a restricted set of inferred node states - for example only those nodes in the extant taxon tree that are directly ancestral to a fossil taxon - to explore how much this approach influences your results.

g

Graham Slater
Peter Buck Post-Doctoral Fellow
Department of Paleobiology
National Museum of Natural History
The Smithsonian Institution [NHB, MRC 121]
P.O. Box 37012
(202) 633-1316
slat...@si.edu
www.fourdimensionalbiology.com

On Jun 4, 2015, at 2:50 PM, Daniel Fulop <dfulop@gmail.com> wrote:

Isn't at least some of this functionality in mvSLOUCH and/or geiger? ...it's definitely the case that mvSLOUCH can handle missing data at the tips, and I think fossil data can be incorporated in it and geiger as well. At least Slater 2013 has code for incorporating fossils in geiger or modified geiger functions.
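The short-branch workaround described in this message could look roughly like the following in R with the ape package. This is only a sketch under assumptions: the toy tree, the label `anc_AB`, and the trait values are made up, a single node is handled rather than all nodes, and whether OUCH accepts the resulting tree would still need to be confirmed.

```r
library(ape)

# Toy tree, made up for illustration.
tree <- read.tree(text = "((A:1,B:1):1,(C:1.5,D:0.5):0.5);")

eps <- 1e-5  # near-zero branch length, in place of an exact zero

# Internal node whose reconstructed value we want to pin down:
# here, the most recent common ancestor of A and B.
node <- getMRCA(tree, c("A", "B"))

# A one-tip "tree" holding the pseudo-tip (the label is hypothetical).
pseudo <- list(
  edge        = matrix(c(2L, 1L), nrow = 1),
  tip.label   = "anc_AB",
  edge.length = eps,
  Nnode       = 1L
)
class(pseudo) <- "phylo"

# Attach the pseudo-tip directly at the chosen node.
tree2 <- bind.tree(tree, pseudo, where = node, position = 0)

# The reconstructed node value is then supplied as data for the new tip.
# The numbers are invented.
trait <- c(A = 2.1, B = 2.4, C = 3.0, D = 2.8, anc_AB = 2.2)
trait <- trait[tree2$tip.label]  # order the data to match the tip labels
```

Node numbers change after each call to `bind.tree`, so extending this to several nodes means looking each node up again (for example with `getMRCA` on its descendant tips) before every attachment.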
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Oops - sorry Daniel, yes that should have been addressed to Nathan...

Graham Slater

On Jun 4, 2015, at 4:27 PM, Daniel Fulop <dfulop@gmail.com> wrote:

Thanks, Graham ...but I'm not the OP. I was just shooting off a quick lead without actually checking the specifics in case it was useful.
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Thanks, Graham ...but I'm not the OP. I was just shooting off a quick lead without actually checking the specifics in case it was useful.
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Hi Nathan,

Although it is still possible to impute the missing data prior to the analysis, you can fit multivariate models with missing cases (NA values) using mvMORPH (development version) from my GitHub: https://github.com/JClavel/mvMORPH/

Non-ultrametric trees (trees with fossil species) are also allowed, and you can compute the root state using the param list.

HTH,
Julien

Short example derived from the code on the help page of mvOU:

    data2 <- data
    # put some missing cases
    data2[8,2] <- NA
    data2[25,1] <- NA
    # then you can fit either data2 or data
    mvOU(tree, data2)

From: slat...@si.edu
To: dfulop@gmail.com
Date: Thu, 4 Jun 2015 21:22:46 +
CC: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Couldn't remember, so went and looked. Turns out that NAs are a problem in the tips. This isn't necessitated by the structure of the problem, only by the structure of the package, i.e., because ouchtrees are constructed in ignorance of where the data are. Unfortunately, it will require substantial refactoring to get this little bit of extra functionality.

As for the original poster's question: no, there is currently no way in 'ouch' to incorporate fossil data (i.e., data at internal nodes). Does anyone know: has someone else already implemented this for the Ornstein-Uhlenbeck or Brownian motion processes?

A.

On Thu, Jun 4, 2015 at 1:00 PM, David Bapst <dwba...@gmail.com> wrote:

While contemplating Nate's question, I wondered, doesn't hansen currently support NA codings for missing variables for tip taxa? Unfortunately the donotrun{} example for hansen() using geiger data isn't currently functioning, so I couldn't test this.

--
Aaron A. King, Ph.D.
Ecology & Evolutionary Biology
Mathematics
Center for the Study of Complex Systems
University of Michigan
GPG Public Key: 0x15780975
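Given that NAs at the tips are currently a problem for ouch, one pragmatic step before fitting is to check which tips have incomplete records, so they can be dropped or imputed first. The sketch below uses only base R; the data frame and its column names are invented, and nothing here calls ouch itself.

```r
# Invented example data: rows are tip taxa, columns are traits.
dat <- data.frame(
  trait1 = c(1.2, 0.8, NA, 1.5),
  trait2 = c(3.4, NA, 2.9, 3.1),
  row.names = c("t1", "t2", "t3", "t4")
)

ok <- complete.cases(dat)            # TRUE where every trait is observed
incomplete_tips <- rownames(dat)[!ok]
print(incomplete_tips)

# One option: keep only the fully observed tips (the tree would need
# pruning to match); another is to impute the NAs beforehand.
dat_complete <- dat[ok, , drop = FALSE]
```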
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Aaron,

While contemplating Nate's question, I wondered, doesn't hansen currently support NA codings for missing variables for tip taxa? Unfortunately the donotrun{} example for hansen() using geiger data isn't currently functioning, so I couldn't test this.

-Dave Bapst

On Thu, Jun 4, 2015 at 10:46 AM, Aaron King <kin...@umich.edu> wrote:

Interesting question, Nate. Do I understand you to say that you have data on some variables (and not others) at internal nodes? If so, what happens when you just add those to the data, with NA to indicate missing values? Have you tried this? A.

--
David W. Bapst, PhD
Adjunct Asst. Professor, Geology and Geol. Eng.
South Dakota School of Mines and Technology
501 E. St. Joseph
Rapid City, SD 57701
http://webpages.sdsmt.edu/~dbapst/
http://cran.r-project.org/web/packages/paleotree/index.html
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Interesting question, Nate. Do I understand you to say that you have data on some variables (and not others) at internal nodes? If so, what happens when you just add those to the data, with NA to indicate missing values? Have you tried this?

A.

On Thu, Jun 4, 2015 at 11:10 AM, Nathan Thompson <nathan.thomp...@stonybrook.edu> wrote:

Hi all,

I am performing a multivariate analysis in ouch on a group of extant species. I would ideally like to include information for fossil taxa in the analysis; however, no single fossil taxon preserves all of the variables of interest. However, I have performed univariate ancestral state reconstructions (with fossils) and obtained estimates of the node values. Is there any way, in a multivariate ouch analysis, to 'constrain' nodes to certain values based on this a priori knowledge of ancestral states?

I realize the alternative would be to just run separate univariate OU analyses for each variable (including fossils), but I would like to do this in a multivariate framework.

Thank you,

Nathan E Thompson
Doctoral Candidate
Dept. of Anatomical Sciences
Stony Brook University
nathan.thomp...@stonybrook.edu
Re: [R-sig-phylo] Constraining node values in an OUCH analysis
Isn't at least some of this functionality in mvSLOUCH and/or geiger? ...it's definitely the case that mvSLOUCH can handle missing data at the tips, and I think fossil data can be incorporated in it and geiger as well. At least Slater 2013 has code for incorporating fossils in geiger or modified geiger functions.