Re: [R-sig-phylo] MCMCglmm for categorical data with more than 2 levels - prior specification?
Hi Sereina, You should not get that error message when you do not specify a prior - but if you do can you let me know. For the prior you specified you get the error message because us(trait):units is specifying a 3x3 covariance matrix, and yet your prior, R=list(V=1,nu=0.002), is specifying a 1x1 matrix. V should be a 3x3 matrix, but note that the residual covariance matrix with categorical data cannot be estimated from the data. For this reason most people would not fit a weak prior (i.e. nu=0.002) but fit a very strong prior (fixing it at some value a priori using fix=1 in the prior specification). The choice of residual covariance matrix is arbitrary - the results can always be expressed in a way that do not depend on the choice of residual covariance matrix (See the CourseNotes). The fixed and random effect formulae are also a bit odd. This type of model is essentially equivalent to a trivariate model where the three traits (on the latent scale) are the differences on a log scale between the probability of being in categories 2,3 or 4 compared to category 1: log(Pr(nominal[2]))-log(Pr(nominal[1])) log(Pr(nominal[3]))-log(Pr(nominal[1])) log(Pr(nominal[4]))-log(Pr(nominal[1])) where nominal[1] is called the baseline category. You can change the baseline category by reordering the factor levels in nominal. By having ~animal in the random formula you are assuming that a) the phylogenetic variance for each contrast is equal and b) that the correlation between the phylogenetic effects is one. This may make sense in some models and with some types of base-line category, but not generally I think. us(trait):animal allows the phylogenetic variances to differ over the traits and for each pair of traits to have a unique phylogenetic correlation. There are also other variance structures that can be fitted that are somewhere between these two extremes. For the same reason you probably want to have trait specific intercepts and trait specific regression coefficients for the covariates. This can be achieved by having: ~ trait-1+trait:lnBrain + trait:binary.x I remove the global intercept (-1) because I find the model output easier to interpret, but it is not necessary. You need to be careful with this type of model on these type of data, because generally there is not much information from data on extant taxa about the parameters of comparative analyses, particularly when the data are categorical. This means that priors, even ones that appear innocuous such as flat priors, may have a substantial influence on the posterior. In addition, numerical problems may exist in categorical models when the posterior distribution for the phylogenetic intra-class correlations has support in regions close to one (either because the true value is close to one, or because the posterior distribution is very wide because the data are not very informative). This can be checked by saving the latent variables (pL=TRUE in the call to MCMCglmm) and making sure that the absolute values of the latent variables do not regularly exceed 20. Lastly, mixing may be (very) poor so you may have to wait an inordinate amount of time to completely sample the posterior. Cheers, Jarrod Doing this is fine: you can always rescale the model parameters post analysis Quoting Sereina Graber sereina.gra...@gmx.ch on Fri, 2 Aug 2013 10:17:58 +0200 (CEST): Hi all, I am doing a phylogenetic analysis using the MCMCglmm package with the phylogenetic tree as the pedigree (Hadfield Nakagawa 2010). I have a categorical response variable (nominal) with more than 2 categories (4 categories in total) and a continuous and a binary explanatory variable. My model: mod-MCMCglmm(nominal ~ lnBrain + binary.x, random= ~animal, family=categorical,rcov=~us(trait):units, prior=prior4, data=bird.data, pedigree=bird.tree) Now there is always the following error message appearing if I do not specify any priors, thus, using the default: Error in priorformat(if (NOpriorG) { : V is the wrong dimension for some prior$G/prior$R elements However, I then tried different priors which didn`t work, because I would have the wrong dimensions in the prior...can any one help me with how I have to specifiy the priors correctly, what dimensions do I need? My priors: var2-cbind(c(1e+08,0.1,0.1), c(0.1,1e+08,0.1),c(0.1,0.1,1e+08)) prior4-list(R=list(V=1,nu=0.002), G=list(G1=list(V=1, nu=0.002)),B=list(mu=rep(0,3), V=var2)) These priors lead to the error: Error in priorformat(if (NOpriorG) { : V is the wrong dimension for some prior$G/prior$R elements For any help I am very grateful. Best -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable
Re: [R-sig-phylo] PGLS vs lm
My goal, it seems to me, is to get a bunch of replications of data in which one trait shows a phylogenetic signal, but the other one does not, but also that both share some predefined correlation with each other (over time). I can then test different kinds of methods to see which would be most appropriate statistical method for this kind of problem. I can see how I could simulate traits evolving with a given correlation value over a given tree, using sim.char() in R. However, won't this leave me with traits in which both have the same phylogenetic signal? Is my only option to simulate huge numbers of traits, half of which are evolving consistent with some tree, and the other half are independent of the tree (i.e., random numbers?), and then correlate pairs (one from each of these groups), retaining just those that have the level of correlation I'm interested in exploring? Thanks for any suggestions, -Tom On Jul 26, 2013, at 6:42 PM, Theodore Garland Jr theodore.garl...@ucr.edu wrote: Hi Tom, So far I have resisted jumping in here, but maybe this will help. Come up with a model for how you think your traits of interest might evolve together in a correlated fashion along a phylogenetic tree. Now implement it in a computer simulation along a phylogenetic tree. Also implement the model with no correlation between the traits. Analyze the data with whatever methods you choose. Check the Type I error rate and then the power of each method. Also check the bias and means squared error for the parameter you are trying to estimate. See what method works best. Use that method for your data if you have some confidence that the model you used to simulate trait evolution is reasonable, based on your understanding (and intuition) about the biology involved. Lots of us have done this sort of thing, e.g., check this: Martins, E. P., and T. Garland, Jr. 1991. Phylogenetic analyses of the correlated evolution of continuous characters: a simulation study. Evolution 45:534-557. Cheers, Ted Theodore Garland, Jr., Professor Department of Biology University of California, Riverside Riverside, CA 92521 Office Phone: (951) 827-3524 Wet Lab Phone: (951) 827-5724 Dry Lab Phone: (951) 827-4026 Home Phone: (951) 328-0820 Skype: theodoregarland Facsimile: (951) 827-4286 = Dept. office (not confidential) Email: tgarl...@ucr.edu http://www.biology.ucr.edu/people/faculty/Garland.html http://scholar.google.com/citations?hl=enuser=iSSbrhwJ Inquiry-based Middle School Lesson Plan: Born to Run: Artificial Selection Lab http://www.indiana.edu/~ensiweb/lessons/BornToRun.html From: r-sig-phylo-boun...@r-project.org [r-sig-phylo-boun...@r-project.org] on behalf of Tom Schoenemann [t...@indiana.edu] Sent: Friday, July 26, 2013 3:21 PM To: Tom Schoenemann Cc: r-sig-phylo@r-project.org Subject: Re: [R-sig-phylo] PGLS vs lm OK, so I haven't gotten any responses that convince me that PGLS isn't biologically suspect. At the risk of thinking out loud to myself here, I wonder if my finding might have to do with the method detecting phylogenetic signal in the error (residuals?): From: Revell, L. J. (2010). Phylogenetic signal and linear regression on species data. Methods in Ecology and Evolution, 1(4), 319-329. I note the following: ...the suitability of a phylogenetic regression should actually be diagnosed by estimating phylogenetic signal in the residual deviations of Y given our predictors (X1, X2, etc.). Let's say one variable, A, has a strong evolutionary signal, but the other, variable B, does not. Would we expect this to affect a PGLS differently if we use A to predict B, vs. using B to predict A? If so, it would explain my findings. However, given the difference, I can have no confidence that there is, or is not, a significant covariance between A and B independent of phylogeny. Doesn't this finding call into question the method itself? More directly, how is one to interpret such a finding? Is there, or is there not, a significant biological association? -Tom On Jul 21, 2013, at 11:47 PM, Tom Schoenemann t...@indiana.edu wrote: Thanks Liam, A couple of questions: How does one do a hypothesis test on a regression, controlling for phylogeny, if not using PGLS as I am doing? I realize one could use independent contrasts, though I was led to believe that is equivalent to a PGLS with lambda = 1. I take it from what you wrote that the PGLS in caper does a ML of lambda only on y, when doing the regression? Isn't this patently wrong, biologically speaking? Phylogenetic effects could have been operating on both x and y - we can't assume that it would only be relevant to y. Shouldn't phylogenetic methods account for both? You say you aren't sure it is a good idea to jointly optimize lambda for x y. Can you expand on this? What would be a better solution