Re: [R-sig-phylo] WG: Re: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?
Hi Jarrod, hi all,

I am still struggling with the MCMCglmm function.

First, in the course notes I read that, for a reason that should become clearer later in the text, the IJ matrix is used for the prior of the residuals and the random effects in the multinomial model. Why this matrix in particular?

Second, probably a very stupid question: if the model did not converge, you have to run it for longer, i.e. increase the number of iterations, right? However, when I increase the number of iterations (from 12,000 to 100,000), there are still trends in the time series plots. What can I do then? What else might be the problem here?

Related to that, in your last email you wrote that there might be a problem due to my small effective sample sizes; however, those do not seem to increase with the number of iterations either.

I would be very thankful for some help.

Cheers,
Sereina

Sent: Friday, 02 August 2013 at 14:54
From: Jarrod Hadfield j.hadfi...@ed.ac.uk
To: Sereina Graber sereina.gra...@gmx.ch
Cc: r-sig-phylo@r-project.org
Subject: Re: Aw: Re: [R-sig-phylo] WG: Re: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

Hi,

They are the effects of the covariates on the probability of being in categories 2, 3 and 4 versus category 1.

Note that your effective sample sizes are very small, which means mixing is a problem and you need to run it for longer. Numerical/inferential problems can also occur if the joint distribution of the predictors and the outcomes results in `extreme category problems'. You then might want to follow Gelman's advice on priors for fixed effects. See the function gelman.prior.

Cheers,
Jarrod

Quoting Sereina Graber sereina.gra...@gmx.ch on Fri, 2 Aug 2013 14:48:44 +0200 (CEST):

Great, thanks a lot! Then I have one last question: How do I have to interpret the following output of the location effects?
The first three lines, I guess, represent the intercepts of categories 2 to 4, but how should I interpret the rest, given the two covariates lnBrain (continuous) and binary (binary)? With the following model...

myMCMC.phyl <- MCMCglmm(nominal ~ trait - 1 + trait:lnBrain + trait:binary,
                        random = ~us(trait):species,
                        rcov = ~us(trait):units,
                        pedigree = bird.tree,
                        data = bird.data, family = "categorical",
                        prior = Prior.phyl6)

...I got the following location effects:

Location effects: nominal ~ trait - 1 + trait:lnBrain + trait:binary

                       post.mean  l-95% CI  u-95% CI eff.samp pMCMC
traitnominal.2           5.59844   4.49565   6.90609    9.676 0.001 ***
traitnominal.3          -4.12383  -5.58366  -2.65665    7.794 0.001 ***
traitnominal.4          -1.70863  -2.86831  -0.38491   12.770 0.006 **
traitnominal.2:lnBrain  -0.08244  -2.10570   1.57463    3.228 0.880
traitnominal.3:lnBrain  -1.29069  -3.36790   1.08456    3.790 0.376
traitnominal.4:lnBrain  -0.53814  -2.76265   1.67985    3.859 0.762
traitnominal.2:binary2  -9.59263 -16.21345  -3.88906    3.403 0.001 ***
traitnominal.3:binary2  13.37745   9.26769  19.93064    4.247 0.001 ***
traitnominal.4:binary2   8.61585   3.82747  15.54171    3.446 0.001 ***
---

Best, and thank you so much for your help!

Sent: Friday, 02 August 2013 at 13:55
From: Jarrod Hadfield j.hadfi...@ed.ac.uk
To: sereina.graber sereina.gra...@gmx.ch
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] WG: Re: Aw: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

Hi,

1.) There is no difference between the arguments pedigree=bird.tree and ginverse=list(species=Ainv), where Ainv is defined by Ainv=inverseA(bird.tree)$Ainv. The latter argument was added after the first version in order to provide more flexibility (for example, if multiple phylogenies are to be fitted).

2.) and 4.) You have also fixed the phylogenetic covariance matrix in the prior (by using fix=1). You should remove the fix=1 if you want to actually estimate it rather than fix it.
You should also add trait as a main effect to allow the traits to have different intercepts. It's hard to know what to recommend regarding prior information, but you could perhaps start with V=IJ and a low nu (see the CourseNotes).

3.) The number of traits is one less than the number of categories, so for a binary response there is only one trait. This is because if you know the probability of being in one state (Pr(A)), you already know the probability of being in the other state (1-Pr(A)). The covariance matrix specification in the prior should therefore be 1x1, not 2x2. You should also drop trait from the model terms and just have ~species, ~units etc.

Cheers,
Jarrod

Quoting sereina.graber sereina.gra...@gmx.ch on Fri, 02 Aug 2013 12:54:00 +0200:

Original message
Subject: Re: Aw: Re: [R-sig-phylo] MCMCglmm for categorical data with more than 2 levels - prior specification?
From: Jarrod Hadfield j.hadfi...@ed.ac.uk
To: Sereina Graber sereina.gra...@gmx.ch
CC:

Quoting Sereina Graber sereina.gra...@gmx.ch on Fri, 2 Aug 2013 12:12:41 +0200 (CEST):

Hi Jarrod,

Thanks a lot for tho
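
Note: a minimal sketch of what "V=IJ and a low nu" from the reply above might look like as an R prior list for a four-category response, next to the 1x1 case for a binary response. The names k, J, IJ, Prior.sketch and Prior.binary are made up for illustration, the 1/k scaling and nu = 0.002 are assumed placeholder values, and holding the residual covariance fixed is an assumption rather than something stated in this thread.

```r
# Illustrative only: none of these objects come from the original model.
k  <- 4          # number of response categories, as in the thread
J  <- k - 1      # number of latent traits (contrasts against category 1)

# Matrix proportional to I + J; the 1/k scaling is an assumed choice.
IJ <- (1 / k) * (diag(J) + matrix(1, J, J))

Prior.sketch <- list(
  # Residual covariance held fixed (assumption, see note above).
  R = list(V = IJ, fix = 1),
  # Phylogenetic covariance estimated (no fix = 1); nu is a placeholder "low" value.
  G = list(G1 = list(V = IJ, nu = 0.002))
)

# Binary response: a single trait, so both covariance "matrices" are 1x1.
Prior.binary <- list(
  R = list(V = 1, fix = 1),
  G = list(G1 = list(V = 1, nu = 0.002))
)
```

Whether these values suit a given dataset is a separate question; the sketch only shows the shape of the specification.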
Re: [R-sig-phylo] WG: Re: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?
Hi,

The IJ prior (or posterior) implies that the variance in each probability is constant and that the probabilities of different outcomes are mutually independent, conditional on the constraint that they must sum to one. To see why, let V be the covariance matrix of the log-contrasts (at either the phylogenetic or the residual level). Then

  V[1,1] = VAR(LP_2-LP_1) = VAR(LP_2)+VAR(LP_1)-2COV(LP_2,LP_1)

and

  V[1,2] = COV(LP_2-LP_1, LP_3-LP_1) = COV(LP_2,LP_3)-COV(LP_2,LP_1)-COV(LP_3,LP_1)+VAR(LP_1)

where LP_i = log(Pr(nominal[i])) from previous emails, and LP_1 is the log probability of the baseline category. If we would like a prior where VAR(LP_i) is constant (VAR(LP)) for all i, and COV(LP_i, LP_j) = 0 for all i != j, then

  V[1,1] = 2*VAR(LP)

and

  V[1,2] = VAR(LP)

so a sensible prior is proportional to an I+J matrix, where I is the identity matrix and J is a unit matrix (a matrix of all ones).

My guess is that the mixing/convergence problems are due to numerical issues, if this is the same dataset that your other post (comp.gee not converging) refers to. Check the latent variables, as I have already suggested: do their absolute values exceed 25? If so, you need to find out why (very high phylogenetic heritability, extreme category problems for the fixed effects etc.).

Cheers,
Jarrod

Quoting Sereina Graber sereina.gra...@gmx.ch on Thu, 8 Aug 2013 15:02:20 +0200 (CEST):

Hi Jarrod, hi all,

I am still struggling with the MCMCglmm function.

First, in the course notes I read that, for a reason that should become clearer later in the text, the IJ matrix is used for the prior of the residuals and the random effects in the multinomial model. Why this matrix in particular?

Second, probably a very stupid question: if the model did not converge, you have to run it for longer, i.e. increase the number of iterations, right? However, when I increase the number of iterations (from 12,000 to 100,000), there are still trends in the time series plots. What can I do then?
What else might be the problem here?

Related to that, in your last email you wrote that there might be a problem due to my small effective sample sizes; however, those do not seem to increase with the number of iterations either.

I would be very thankful for some help.

Cheers,
Sereina

Sent: Friday, 02 August 2013 at 14:54
From: Jarrod Hadfield j.hadfi...@ed.ac.uk
To: Sereina Graber sereina.gra...@gmx.ch
Cc: r-sig-phylo@r-project.org
Subject: Re: Aw: Re: [R-sig-phylo] WG: Re: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

Hi,

They are the effects of the covariates on the probability of being in categories 2, 3 and 4 versus category 1.

Note that your effective sample sizes are very small, which means mixing is a problem and you need to run it for longer. Numerical/inferential problems can also occur if the joint distribution of the predictors and the outcomes results in `extreme category problems'. You then might want to follow Gelman's advice on priors for fixed effects. See the function gelman.prior.

Cheers,
Jarrod

Quoting Sereina Graber sereina.gra...@gmx.ch on Fri, 2 Aug 2013 14:48:44 +0200 (CEST):

Great, thanks a lot! Then I have one last question: How do I have to interpret the following output of the location effects? The first three lines, I guess, represent the intercepts of categories 2 to 4, but how should I interpret the rest, given the two covariates lnBrain (continuous) and binary (binary)? With the following model...
myMCMC.phyl <- MCMCglmm(nominal ~ trait - 1 + trait:lnBrain + trait:binary,
                        random = ~us(trait):species,
                        rcov = ~us(trait):units,
                        pedigree = bird.tree,
                        data = bird.data, family = "categorical",
                        prior = Prior.phyl6)

...I got the following location effects:

Location effects: nominal ~ trait - 1 + trait:lnBrain + trait:binary

                       post.mean  l-95% CI  u-95% CI eff.samp pMCMC
traitnominal.2           5.59844   4.49565   6.90609    9.676 0.001 ***
traitnominal.3          -4.12383  -5.58366  -2.65665    7.794 0.001 ***
traitnominal.4          -1.70863  -2.86831  -0.38491   12.770 0.006 **
traitnominal.2:lnBrain  -0.08244  -2.10570   1.57463    3.228 0.880
traitnominal.3:lnBrain  -1.29069  -3.36790   1.08456    3.790 0.376
traitnominal.4:lnBrain  -0.53814  -2.76265   1.67985    3.859 0.762
traitnominal.2:binary2  -9.59263 -16.21345  -3.88906    3.403 0.001 ***
traitnominal.3:binary2  13.37745   9.26769  19.93064    4.247 0.001 ***
traitnominal.4:binary2   8.61585   3.82747  15.54171    3.446 0.001 ***
---

Best, and thank you so much for your help!

Sent: Friday, 02 August 2013 at 13:55
From: Jarrod Hadfield j.hadfi...@ed.ac.uk
To: sereina.graber sereina.gra...@gmx.ch
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] WG: Re: Aw: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

Hi,

1.) There is no difference between the arguments pedigree=bird.tree and ginverse=list(species=Ainv), where Ainv is defined by Ainv=inverseA(bird.tree)$Ainv. The latter
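
Note: a minimal numerical check of the I+J argument given at the top of this reply, assuming the log-probabilities LP_1..LP_k have a common variance and zero covariances. The names k, sigma2, Sigma and C are introduced here for illustration only, and the value of sigma2 is arbitrary.

```r
k      <- 4                  # number of categories
sigma2 <- 1.5                # common VAR(LP), arbitrary illustrative value
Sigma  <- sigma2 * diag(k)   # covariance of LP_1..LP_k: constant variance, no covariance

# Contrast matrix: row i maps (LP_1, ..., LP_k) to LP_(i+1) - LP_1.
C <- cbind(-1, diag(k - 1))

# Covariance matrix of the log-contrasts.
V <- C %*% Sigma %*% t(C)

# I + J: identity matrix plus a matrix of all ones.
IJ <- diag(k - 1) + matrix(1, k - 1, k - 1)

# Diagonal entries equal 2*VAR(LP), off-diagonal entries equal VAR(LP),
# so V is proportional to I + J.
stopifnot(isTRUE(all.equal(V, sigma2 * IJ)))
```

The check passes for any k >= 2 and any positive sigma2, which is the sense in which the prior is "proportional to" I+J.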