Re: [R-sig-phylo] A perfect storm: phylogenetic trees, random effects and zero-inflated binomial data

Jarrod Hadfield Wed, 14 Oct 2015 07:42:02 -0700

Dear Diederik,

The lack of convergence arises because the residual variance is non-identifiable with binary data, yet you have a very weak prior on it. You should fix the residual variance at some value (I usually use 1):

prior.test<-list(R=list(V=1,fix=1), G=list(G1=list(V=1, nu=0.002),G2 =list(V=1, nu=0.002)))
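
As a hedged illustration of why the residual variance cannot be estimated from binary data, here is a minimal sketch in base R. It assumes a latent-variable (probit-type) formulation in which an observation is 1 whenever a normally distributed latent value exceeds zero. The function name p_success and the numbers are purely illustrative and are not part of MCMCglmm.

```r
# Probability of observing a 1 when the latent value l ~ N(mu, V)
# and y = 1 whenever l > 0 (assumed probit-type formulation).
p_success <- function(mu, V) pnorm(mu / sqrt(V))

# Rescaling the latent mean by k and the variance by k^2 leaves the
# probability of a 1 unchanged, so binary data cannot separate the two.
p1 <- p_success(mu = -2,  V = 1)
p2 <- p_success(mu = -4,  V = 4)
p3 <- p_success(mu = -20, V = 100)

print(c(p1, p2, p3))
print(isTRUE(all.equal(p1, p2)) && isTRUE(all.equal(p1, p3)))
```

Under this assumption every such rescaling fits the data equally well, so only the prior sets the scale, which is why fixing V removes the problem.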

You might also want to consider using "threshold" rather than "categorical". The models are similar, but the first uses a probit link and the second a logit link, and the MCMC algorithm is more efficient for the former. I would also advise updating to version 2.22, as it now performs better for phylogenetic models.
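
A minimal sketch of how the fixed residual prior and the "threshold" family could be combined, written as a wrapper so that nothing runs until data are supplied. The wrapper name fit_threshold and the object name prior.fixed are hypothetical. The formula, random terms and run lengths are taken from the original post, and the arguments dat and Ainv stand in for the poster's data.trade and inv.mytree$Ainv, which are not available here.

```r
# Prior with the residual variance fixed at 1; the G-structure priors
# are unchanged from the original post.
prior.fixed <- list(R = list(V = 1, fix = 1),
                    G = list(G1 = list(V = 1, nu = 0.002),
                             G2 = list(V = 1, nu = 0.002)))

# Hypothetical wrapper: the MCMCglmm package is needed only when called.
fit_threshold <- function(dat, Ainv,
                          nitt = 1000000, thin = 500, burnin = 10000) {
  MCMCglmm::MCMCglmm(invasiveStatus ~ CloseCentralInv,
                     random = ~species + country,
                     ginverse = list(species = Ainv),
                     family = "threshold",   # probit link rather than logit
                     data = dat,
                     prior = prior.fixed,
                     nitt = nitt, thin = thin, burnin = burnin,
                     verbose = FALSE)
}
```

A call would then look something like fit_threshold(data.trade, inv.mytree$Ainv), assuming those objects exist as in the original post.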

You may still find that problems persist with rare-event data, but get back to the list if this is the case after fixing the prior.

It has been argued that the h2 (or lambda if you prefer) for categorical traits should be fixed at one rather than estimated. If you did want to do this (I am not convinced it is always a reasonable thing to do) then you can use MCMCglmmRAM (http://jarrod.bio.ed.ac.uk/MCMCglmmRAM_2.22.tar.gz).


Cheers,

Jarrod

Quoting Diederik Strubbe <[email protected]> on Wed, 14 Oct 2015 16:18:28 +0200:

Dear all,



 A while ago, I was kindly advised to try MCMCglmm to investigate
invasion success of non-native species while accounting for phylogenetic
relatedness. I have managed to run some explanatory models but have run
into convergence problems…



The data are (1) /introduction events/ of non-native species. There are
about 4,000 introduction events, but only about 40 have resulted in an
established population, so the number of ‘events’ is low.
Successful introductions are coded as “1”, failures as “0”, (2) the
/phylogenetic tree/ is an ultrametric majority-rule consensus tree
(class “phylo”,  number of tips: 359, number of nodes: 298), (3)
/country/ in which species are introduced is included as a random
effect, (4) a set of continuous /explanatory variables/.



I have mainly explored univariate models (i.e. one explanatory variable
at a time), but including multiple explanatory variables also results in
convergence problems. I use the following model, with the continuous
variable ‘CloseCentralInv’ as the single explanatory variable.



prior.test <- list(R = list(V = 1, nu = 0.002),
                   G = list(G1 = list(V = 1, nu = 0.002),
                            G2 = list(V = 1, nu = 0.002)))

test.phylo1 <- MCMCglmm(invasiveStatus ~ CloseCentralInv,
                        random = ~species + country,
                        ginverse = list(species = inv.mytree$Ainv),
                        nitt = 1000000, thin = 500, burnin = 10000,
                        family = "categorical",
                        data = data.trade,
                        prior = prior.test,
                        verbose = FALSE)



As far as I understand, the basic model output looks reasonable (here
<https://www.dropbox.com/s/kffj8o3nf9nz0xn/summary%28test.phylo1%29.png?dl=0>,
1), although effective sample sizes for the random effects seem to be
small.



However, /Heidelberger and Welch's/ convergence diagnostic often passes
the stationarity test, but not the halfwidth mean test (example here
<https://www.dropbox.com/s/t25y37slvkqyd8l/heidel.diag.png?dl=0>, 2).
/Gelman and Rubin/'s convergence diagnostic (calculated on two chains)
indicates potential scale reduction factors that are often well above 1
(example here
<https://www.dropbox.com/s/kbxrc73ep1kjli5/gelman.diag.png?dl=0>, 3). A
plot of Gelman and Rubin also does not clearly indicate after how many
iterations convergence is to be expected (example here, 4). I tried to
use Raftery and Lewis's diagnostic to estimate the number of necessary
iterations, but this invariably outputs “You need a sample size of at
least 3746 with these values of q, r and s”. Traces for this model run
can be accessed here
<https://www.dropbox.com/s/7vvqf0r7she71hu/traces.png?dl=0> (5).
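
One possible reading of the Raftery and Lewis message, sketched in base R. It assumes the diagnostic was run with q = 0.025, r = 0.005 and s = 0.95; under that assumption the minimum number of samples it asks for works out to the 3746 quoted in the message, while the run settings in the model call above store fewer samples than that. The variable names are illustrative.

```r
# Stored samples for the run settings used in the model call
nitt <- 1000000; burnin <- 10000; thin <- 500
n_stored <- (nitt - burnin) / thin

# Minimum sample size for the Raftery-Lewis diagnostic, assuming
# quantile q, accuracy r and probability s as below
q <- 0.025; r <- 0.005; s <- 0.95
n_min <- ceiling((qnorm((1 + s) / 2) * sqrt(q * (1 - q)) / r)^2)

print(c(stored = n_stored, required = n_min))
```

If this reading is right, the message reflects the number of stored samples (1980 here) rather than non-convergence as such, and a smaller thinning interval or a longer run would be needed before the diagnostic can be computed.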



I tried increasing the number of iterations to 1,000,000 for a number of
variables, but this does not seem to ameliorate the convergence problems.



I can see the following causes: (1) I misspecified the model and it is
not doing what I think it should be doing, (2) I actually need many more
iterations, (3) I need to specify stronger priors, (4) I am trying to
get blood from a stone (i.e. the data do not allow such an analysis).



I might add that if I look at the p-values and estimates obtained
from the (non-converging) MCMCglmm runs, the results are very
similar to those of a ‘simpler’ glmm where phylogeny is accounted for by
using a nested random effect for taxon and genus.



Any suggestions on how to proceed are much appreciated.



Best wishes and thanks in advance,



Diederik





1. https://www.dropbox.com/s/kffj8o3nf9nz0xn/summary%28test.phylo1%29.png?dl=0
2. https://www.dropbox.com/s/t25y37slvkqyd8l/heidel.diag.png?dl=0
3. https://www.dropbox.com/s/kbxrc73ep1kjli5/gelman.diag.png?dl=0
4. https://www.dropbox.com/s/0e2unw7rmr7rt2y/gelman.plot.png?dl=0
5. https://www.dropbox.com/s/7vvqf0r7she71hu/traces.png?dl=0






On 6/3/2015 5:37 PM, Jörg Albrecht wrote:

Hi Diederik,

you can use MCMCglmm. The package allows for inclusion of phylogenetic
information, random effects and zero-inflated response variables.
However, it may take some time to get familiar with the package.

Best,

J


—
Jörg Albrecht, PhD
Postdoctoral researcher
Institute of Nature Conservation
Polish Academy of Sciences
Mickiewicza 33
31-120 Krakow, Poland
www.carpathianbear.pl <http://www.carpathianbear.pl>
www.globeproject.pl <http://www.globeproject.pl>
www.iop.krakow.pl <http://www.iop.krakow.pl>

On 03.06.2015 at 17:25, Diederik Strubbe <[email protected]> wrote:

Dear all,



I am struggling with analysing a dataset aimed at explaining invasion
success of non-native species. At a country level, I need to relate
invasion success (binomial: 0 for failed invasions, 1 for success) to
socio-economic variables, taking into account

-          Phylogenetic relatedness among introduced species: including
a phylogenetic tree

-          Country as a random effect

-          The fact that data are zero-inflated (most introductions
fail).



Any suggestions for R packages that can handle a binomial response
variable, phylogenetic trees, random effects and zero-inflation?



Thanks in advance,



Diederik

--
Dr.Diederik Strubbe
Evolutionary Ecology Group
Department of Biology
University of Antwerp
Middelheimcampus GV310
Groenenborgerlaan 171
2020 Antwerpen, Belgium
office: +32 3 265 34 69
mobile phone: +32 477445568
skype user name: lakrinn



_______________________________________________
R-sig-phylo mailing list - [email protected]
<mailto:[email protected]>
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at
http://www.mail-archive.com/[email protected]/






--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
