Re: [R-sig-phylo] PGLS vs lm

2013-08-02 Thread Tom Schoenemann
My goal, it seems to me, is to get a bunch of replications of data in which one 
trait shows a phylogenetic signal, but the other one does not, but also that 
both share some predefined correlation with each other (over time). I can then 
test different kinds of methods to see which would be most appropriate 
statistical method for this kind of problem.

I can see how I could simulate traits evolving with a given correlation value 
over a given tree, using sim.char() in R. However, won't this leave me with 
traits in which both have the same phylogenetic signal?

Is my only option to simulate huge numbers of traits, half of which are 
evolving consistent with some tree, and the other half are independent of the 
tree (i.e., random numbers?), and then correlate pairs (one from each of these 
groups), retaining just those that have the level of correlation I'm interested 
in exploring? 

Thanks for any suggestions,

-Tom


On Jul 26, 2013, at 6:42 PM, Theodore Garland Jr  
wrote:

> Hi Tom,
> 
> So far I have resisted jumping in here, but maybe this will help.
> Come up with a model for how you think your traits of interest might evolve 
> together in a correlated fashion along a phylogenetic tree.
> Now implement it in a computer simulation along a phylogenetic tree.
> Also implement the model with no correlation between the traits.  
> Analyze the data with whatever methods you choose.
> Check the Type I error rate and then the power of each method.  Also check 
> the bias and means squared error for the parameter you are trying to estimate.
> See what method works best.
> Use that method for your data if you have some confidence that the model you 
> used to simulate trait evolution is reasonable, based on your understanding 
> (and intuition) about the biology involved.
> 
> Lots of us have done this sort of thing, e.g., check this:
> 
> Martins, E. P., and T. Garland, Jr. 1991. Phylogenetic analyses of the 
> correlated evolution of continuous characters: a simulation study. Evolution 
> 45:534-557.
> 
> 
> 
> Cheers,
> Ted
> 
> Theodore Garland, Jr., Professor
> Department of Biology
> University of California, Riverside
> Riverside, CA 92521
> Office Phone:  (951) 827-3524
> Wet Lab Phone:  (951) 827-5724
> Dry Lab Phone:  (951) 827-4026
> Home Phone:  (951) 328-0820
> Skype:  theodoregarland
> Facsimile:  (951) 827-4286 = Dept. office (not confidential)
> Email:  tgarl...@ucr.edu
> http://www.biology.ucr.edu/people/faculty/Garland.html
> http://scholar.google.com/citations?hl=en&user=iSSbrhwJ
> 
> Inquiry-based Middle School Lesson Plan:
> "Born to Run: Artificial Selection Lab"
> http://www.indiana.edu/~ensiweb/lessons/BornToRun.html
> 
> From: r-sig-phylo-boun...@r-project.org [r-sig-phylo-boun...@r-project.org] 
> on behalf of Tom Schoenemann [t...@indiana.edu]
> Sent: Friday, July 26, 2013 3:21 PM
> To: Tom Schoenemann
> Cc: r-sig-phylo@r-project.org
> Subject: Re: [R-sig-phylo] PGLS vs lm
> 
> OK, so I haven't gotten any responses that convince me that PGLS isn't 
> biologically suspect. At the risk of thinking out loud to myself here, I 
> wonder if my finding might have to do with the method detecting phylogenetic 
> signal in the error (residuals?):
> 
> From:
> Revell, L. J. (2010). Phylogenetic signal and linear regression on species 
> data. Methods in Ecology and Evolution, 1(4), 319-329.
> 
> I note the following: "...the suitability of a phylogenetic regression should 
> actually be diagnosed by estimating phylogenetic signal in the residual 
> deviations of Y given our predictors (X1, X2, etc.)."
> 
> Let's say one variable, "A", has a strong evolutionary signal, but the other, 
> variable "B", does not. Would we expect this to affect a PGLS differently if 
> we use A to predict B, vs. using B to predict A?  
> 
> If so, it would explain my findings. However, given the difference, I can 
> have no confidence that there is, or is not, a significant covariance between 
> A and B independent of phylogeny. Doesn't this finding call into question the 
> method itself?
> 
> More directly, how is one to interpret such a finding? Is there, or is there 
> not, a significant biological association?
> 
> -Tom
> 
> 
> On Jul 21, 2013, at 11:47 PM, Tom Schoenemann  wrote:
> 
> > Thanks Liam,
> > 
> > A couple of questions: 
> > 
> > How does one do a hypothesis test on a regression, controlling for 
> > phylogeny, if not using PGLS as I am doing?  I realize one could use 
> > independent contrasts, though I was led to believe that is equivalent to a 
> > PGLS with lambda = 1.  
> > 
> > I take it from what you wrote that the PGLS in caper does a ML of lambda 
> > only on y, when doing the regression? Isn't this patently wrong, 
> > biologically speaking? Phylogenetic effects could have been operating on 
> > both x and y - we can't assume that it would only be relevant to y. 
> > Shouldn't phylogenetic methods account for both?
> > 
> > You say you aren't sure it is a good idea to jointly optim

Re: [R-sig-phylo] WG: Re: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

2013-08-02 Thread Jarrod Hadfield

Hi,

They are the effect of the covariates on the probability of being in  
the categories 2,3,4 versus category 1. Note that your effective  
sample sizes are very small which means mixing is a problem and you  
need to run it for longer. Numerical/Inferential problems can also  
occur if the joint distribution of the predictors and the outcomes  
results in `extreme categorical problems'. You then might want to  
follow Gelman's advice on priors for fixed effects. See the function  
gelman.prior.


Cheers,

Jarrod



  Quoting Sereina Graber  on Fri, 2 Aug 2013  
14:48:44 +0200 (CEST):




Great, thanks a lot! Then I have one last question: How do I have to
interpret the following output of the location effects? the first
three lines I guess represent the intercepts of categories 2 to 4, but
how I should I interpret the rest having the two covariates lnBrain
(continuous) and binary (binary).

With the following model...

myMCMC.phyl<- MCMCglmm(nominal ~ trait-1+ trait:lnBrain +
trait:binary, random=~us(trait):species, rcov = ~us(trait):units,
pedigree=bird.tree,
+ data = bird.data, family="categorical",
+ prior=Prior.phyl6)

...I got the following location effects:

 Location effects: nominal ~ trait - 1 + trait:lnBrain + trait:binary
   post.mean  l-95% CI  u-95% CI eff.samp  pMCMC

traitnominal.2   5.59844   4.49565   6.906099.676 <0.001
***
traitnominal.3  -4.12383  -5.58366  -2.656657.794 <0.001
***
traitnominal.4  -1.70863  -2.86831  -0.38491   12.770  0.006
**
traitnominal.2:lnBrain  -0.08244  -2.10570   1.574633.228  0.880

traitnominal.3:lnBrain  -1.29069  -3.36790   1.084563.790  0.376

traitnominal.4:lnBrain  -0.53814  -2.76265   1.679853.859  0.762

traitnominal.2:binary2  -9.59263 -16.21345  -3.889063.403 <0.001
***
traitnominal.3:binary2  13.37745   9.26769  19.930644.247 <0.001
***
traitnominal.4:binary2   8.61585   3.82747  15.541713.446 <0.001
***
---

Best & thank you so much for your help!

GESENDET: Freitag, 02. August 2013 um 13:55 Uhr
VON: "Jarrod Hadfield" 
AN: "sereina.graber" 
CC: r-sig-phylo@r-project.org
BETREFF: Re: [R-sig-phylo] WG: Re: Aw: Re: MCMCglmm for categorical
data with more than 2 levels - prior specification?
Hi,

1.) There is no difference between the arguments pedigree=bird.tree
and "ginverse = list(species=Ainv)" where Ainv is defined by
"Ainv=inverseA(bird.tree)$Ainv". The latter argument was added after
the first version in order to provide more flexibility (for example if
multiple phylogenies are to be fitted).

2.)and 4.) You have also fixed the phylogenetic covariance matrix in
the prior (by using fix=1). You should remove the fix=1 if you want to
actually estimate it rather than fix it. You should also add trait as
a main effect to allow the traits to have different intercepts. Its
hard to know what to recommend regarding prior information, but you
could start perhaps with V=IJ and nu low (see CourseNotes).

3.) The number of traits is one less than the number of categories, so
for a binary response there is only one trait. This is because if yuo
know the probability of being in one state (Pr(A)), you already know
the probability of being in the other state (1-Pr(A)). The covariance
matrix specification in the prior should therefore be 1x1 not 2x2. You
should also drop trait from the models and just have ~species, ~units
etc.

Cheers,

Jarrod

Quoting "sereina.graber"  on Fri, 02 Aug 2013
12:54:00 +0200:





 Ursprüngliche Nachricht 
Betreff: Re: Aw: Re: [R-sig-phylo] MCMCglmm for categorical data
with more than 2 levels - prior specification?
Von: Jarrod Hadfield 
An: Sereina Graber 
CC:



Quoting Sereina Graber  on Fri, 2 Aug 2013
12:12:41 +0200 (CEST):



Hi Jarrod,

Thanks a lot for those helpful tips. However,  now I ran into some
further problems:

1.)  What is the difference between the arguments pedigree=bird.tree
and "ginverse = list(species=Ainv)" where Ainv is defined by
"Ainv=inverseA(bird.tree)$Ainv"?

2.) Now the nominal model seems to work with the following code, did

I

implement it right like that?

Prior.phyl3 = list(R = list(V = diag(3), fix=1),G = list( G1 =

list(V

= diag(3) ,fix=1)))
myMCMC.phyl<- MCMCglmm(nominal ~ trait:lnBrain,
random=~us(trait):species, rcov = ~us(trait):units, ginverse =
list(species=Ainv), data = bird.data, family="categorical",
prior=Prior.phyl3)

However, if I try the same code with an adjusted prior specification
for a binary response variable

Prior.phyl31 = list(R = list(V = diag(2), fix=1),G = list( G1 =

list(V

= diag(2),fix=1)))
myMCMC.phyl<- MCMCglmm(binary ~ trait:lnBrain,
random=~us(trait):species, rcov = ~us(trait):units,
pedigree=bird.tree,
data = bird.data, family="categorical",
prior=Prior.phyl31)

Then I always get the error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
   contrasts can be applied only to factors with 2 or more lev

Re: [R-sig-phylo] WG: Re: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

2013-08-02 Thread Sereina Graber

Great, thanks a lot! Then I have one last question: How do I have to interpret the following output of the location effects? the first three lines I guess represent the intercepts of categories 2 to 4, but how I should I interpret the rest having the two covariates lnBrain (continuous) and binary (binary).

 

With the following model...

 

myMCMC.phyl<- MCMCglmm(nominal ~ trait-1+ trait:lnBrain + trait:binary, random=~us(trait):species, rcov = ~us(trait):units, pedigree=bird.tree,
+ data = "" family="categorical",
+ prior=Prior.phyl6)

 

...I got the following location effects: 

 


 Location effects: nominal ~ trait - 1 + trait:lnBrain + trait:binary 

                       post.mean  l-95% CI  u-95% CI eff.samp  pMCMC    
traitnominal.2           5.59844   4.49565   6.90609    9.676 <0.001 ***
traitnominal.3          -4.12383  -5.58366  -2.65665    7.794 <0.001 ***
traitnominal.4          -1.70863  -2.86831  -0.38491   12.770  0.006 ** 
traitnominal.2:lnBrain  -0.08244  -2.10570   1.57463    3.228  0.880    
traitnominal.3:lnBrain  -1.29069  -3.36790   1.08456    3.790  0.376    
traitnominal.4:lnBrain  -0.53814  -2.76265   1.67985    3.859  0.762    
traitnominal.2:binary2  -9.59263 -16.21345  -3.88906    3.403 <0.001 ***
traitnominal.3:binary2  13.37745   9.26769  19.93064    4.247 <0.001 ***
traitnominal.4:binary2   8.61585   3.82747  15.54171    3.446 <0.001 ***
---



 

Best & thank you so much for your help!

 

 

 


Gesendet: Freitag, 02. August 2013 um 13:55 Uhr
Von: "Jarrod Hadfield" 
An: "sereina.graber" 
Cc: r-sig-phylo@r-project.org
Betreff: Re: [R-sig-phylo] WG: Re: Aw: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

Hi,

1.) There is no difference between the arguments pedigree=bird.tree
and "ginverse = list(species=Ainv)" where Ainv is defined by
"Ainv=inverseA(bird.tree)$Ainv". The latter argument was added after
the first version in order to provide more flexibility (for example if
multiple phylogenies are to be fitted).

2.)and 4.) You have also fixed the phylogenetic covariance matrix in
the prior (by using fix=1). You should remove the fix=1 if you want to
actually estimate it rather than fix it. You should also add trait as
a main effect to allow the traits to have different intercepts. Its
hard to know what to recommend regarding prior information, but you
could start perhaps with V=IJ and nu low (see CourseNotes).

3.) The number of traits is one less than the number of categories, so
for a binary response there is only one trait. This is because if yuo
know the probability of being in one state (Pr(A)), you already know
the probability of being in the other state (1-Pr(A)). The covariance
matrix specification in the prior should therefore be 1x1 not 2x2. You
should also drop trait from the models and just have ~species, ~units
etc.

Cheers,

Jarrod

Quoting "sereina.graber"  on Fri, 02 Aug 2013
12:54:00 +0200:

>
>
>
>  Ursprüngliche Nachricht 
> Betreff: Re: Aw: Re: [R-sig-phylo] MCMCglmm for categorical data
> with more than 2 levels - prior specification?
> Von: Jarrod Hadfield 
> An: Sereina Graber 
> CC:
>
>
>
> Quoting Sereina Graber  on Fri, 2 Aug 2013 
> 12:12:41 +0200 (CEST):
>
>
>
> Hi Jarrod,
>
> Thanks a lot for those helpful tips. However,  now I ran into some
> further problems:
>
> 1.)  What is the difference between the arguments pedigree=bird.tree
> and "ginverse = list(species=Ainv)" where Ainv is defined by
> "Ainv=inverseA(bird.tree)$Ainv"?
>
> 2.) Now the nominal model seems to work with the following code, did I
> implement it right like that?
>
> Prior.phyl3 = list(R = list(V = diag(3), fix=1),G = list( G1 = list(V
> = diag(3) ,fix=1)))
> myMCMC.phyl<- MCMCglmm(nominal ~ trait:lnBrain,
> random=~us(trait):species, rcov = ~us(trait):units, ginverse =
> list(species=Ainv), data = "" family="categorical",
> prior=Prior.phyl3)
>
> However, if I try the same code with an adjusted prior specification
> for a binary response variable
>
> Prior.phyl31 = list(R = list(V = diag(2), fix=1),G = list( G1 = list(V
> = diag(2),fix=1)))
> myMCMC.phyl<- MCMCglmm(binary ~ trait:lnBrain,
> random=~us(trait):species, rcov = ~us(trait):units,
> pedigree=bird.tree,
>     data = "" family="categorical",
>     prior=Prior.phyl31)
>
> Then I always get the error:
> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>    contrasts can be applied only to factors with 2 or more levels
>
> What is the problem here?
>
> 3.) And about prior specifications: I found out that depending on what
> prior you specifiy, the results are completely different. So for the
> example above, I used a identity matrix for the variance-covariance
> matrix, so the covariances are equal to zero. In my case now having 4
> levels in my nominal response variable should I put some covariances
> or not, and if yes, how do I choose them? What kind of prior would you
> take?
> And for G: G is the cova

Re: [R-sig-phylo] WG: Re: Aw: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

2013-08-02 Thread Jarrod Hadfield

Hi,

1.) There is no difference between the arguments pedigree=bird.tree  
and "ginverse = list(species=Ainv)" where Ainv is defined by  
"Ainv=inverseA(bird.tree)$Ainv".  The latter argument was added after  
the first version in order to provide more flexibility (for example if  
multiple phylogenies are to be fitted).


2.)and 4.) You have also fixed the phylogenetic covariance matrix in  
the prior (by using fix=1). You should remove the fix=1 if you want to  
actually estimate it rather than fix it. You should also add trait as  
a main effect to allow the traits to have different intercepts. Its  
hard to know what to recommend regarding prior information, but you  
could start perhaps with V=IJ and nu low (see CourseNotes).


3.) The number of traits is one less than the number of categories, so  
for a binary response there is only one trait. This is because if yuo  
know the probability of being in one state (Pr(A)), you already know  
the probability of being in the other state (1-Pr(A)).  The covariance  
matrix specification in the prior should therefore be 1x1 not 2x2. You  
should also drop trait from the models and just have ~species, ~units  
etc.


Cheers,

Jarrod

Quoting "sereina.graber"  on Fri, 02 Aug 2013  
12:54:00 +0200:






 Ursprüngliche Nachricht 
Betreff: Re: Aw: Re: [R-sig-phylo] MCMCglmm for categorical data  
with more than 2 levels - prior specification?

Von: Jarrod Hadfield 
An: Sereina Graber 
CC:



Quoting Sereina Graber  on Fri, 2 Aug 2013 
12:12:41 +0200 (CEST):



Hi Jarrod,

Thanks a lot for those helpful tips. However,  now I ran into some
further problems:

1.)  What is the difference between the arguments pedigree=bird.tree
and "ginverse = list(species=Ainv)" where Ainv is defined by
"Ainv=inverseA(bird.tree)$Ainv"?

2.) Now the nominal model seems to work with the following code, did I
implement it right like that?

Prior.phyl3 = list(R = list(V = diag(3), fix=1),G = list( G1 = list(V
= diag(3) ,fix=1)))
myMCMC.phyl<- MCMCglmm(nominal ~ trait:lnBrain,
random=~us(trait):species, rcov = ~us(trait):units, ginverse =
list(species=Ainv), data = bird.data, family="categorical",
prior=Prior.phyl3)

However, if I try the same code with an adjusted prior specification
for a binary response variable

Prior.phyl31 = list(R = list(V = diag(2), fix=1),G = list( G1 = list(V
= diag(2),fix=1)))
myMCMC.phyl<- MCMCglmm(binary ~ trait:lnBrain,
random=~us(trait):species, rcov = ~us(trait):units,
pedigree=bird.tree,
    data = bird.data, family="categorical",
    prior=Prior.phyl31)

Then I always get the error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
   contrasts can be applied only to factors with 2 or more levels

What is the problem here?

3.) And about prior specifications: I found out that depending on what
prior you specifiy, the results are completely different. So for the
example above, I used a identity matrix for the variance-covariance
matrix, so the covariances are equal to zero. In my case now having 4
levels in my nominal response variable should I put some covariances
or not, and if yes, how do I choose them? What kind of prior would you
take?
And for G: G is the covariance  structure of the random effects, but
aren`t actually the (co)variances of the random effects specified by
the vcv of the tree? so why is that a 3times3 matrix? For the
(co)variance structure of B: if I have two covariates, and I simulated
them to be correlated with r=0.5, what covariances in B can I take?
would 0.5 be appropriate assuming SD=1?

Thanks a lot & Best,
Sereina

GESENDET: Freitag, 02. August 2013 um 11:03 Uhr
VON: "Jarrod Hadfield" 
AN: "Sereina Graber" 
CC: r-sig-phylo@r-project.org
BETREFF: Re: [R-sig-phylo] MCMCglmm for categorical data with more
than 2 levels - prior specification?
Hi Sereina,

You should not get that error message when you do not specify a prior
- but if you do can you let me know.

For the prior you specified you get the error message because
us(trait):units is specifying a 3x3 covariance matrix, and yet your
prior, R=list(V=1,nu=0.002), is specifying a 1x1 matrix. V should be a
3x3 matrix, but note that the residual covariance matrix with
categorical data cannot be estimated from the data. For this reason
most people would not fit a weak prior (i.e. nu=0.002) but fit a very
strong prior (fixing it at some value a priori using fix=1 in the
prior specification). The choice of residual covariance matrix is
arbitrary - the results can always be expressed in a way that do not
depend on the choice of residual covariance matrix (See the
CourseNotes).

The fixed and random effect formulae are also a bit odd. This type of
model is essentially equivalent to a trivariate model where the three
traits (on the latent scale) are the differences on a log scale
between the probability of being in categories 2,3 or 4 compared to
category 1:

log(Pr(nominal[2]))-log(Pr(nominal[1]))
log(Pr(nomi

[R-sig-phylo] WG: Re: Aw: Re: MCMCglmm for categorical data with more than 2 levels - prior specification?

2013-08-02 Thread sereina.graber



 Ursprüngliche Nachricht 
Betreff: Re: Aw: Re: [R-sig-phylo] MCMCglmm for categorical data with more than 
2 levels - prior specification?
Von: Jarrod Hadfield 
An: Sereina Graber 
CC: 



Quoting Sereina Graber  on Fri, 2 Aug 2013  
12:12:41 +0200 (CEST):



Hi Jarrod,

Thanks a lot for those helpful tips. However,  now I ran into some
further problems:

1.)  What is the difference between the arguments pedigree=bird.tree
and "ginverse = list(species=Ainv)" where Ainv is defined by
"Ainv=inverseA(bird.tree)$Ainv"?

2.) Now the nominal model seems to work with the following code, did I
implement it right like that?

Prior.phyl3 = list(R = list(V = diag(3), fix=1),G = list( G1 = list(V
= diag(3) ,fix=1)))
myMCMC.phyl<- MCMCglmm(nominal ~ trait:lnBrain,
random=~us(trait):species, rcov = ~us(trait):units, ginverse =
list(species=Ainv), data = bird.data, family="categorical",
prior=Prior.phyl3)

However, if I try the same code with an adjusted prior specification
for a binary response variable

Prior.phyl31 = list(R = list(V = diag(2), fix=1),G = list( G1 = list(V
= diag(2),fix=1)))
myMCMC.phyl<- MCMCglmm(binary ~ trait:lnBrain,
random=~us(trait):species, rcov = ~us(trait):units,
pedigree=bird.tree,
                        data = bird.data, 
family="categorical",
                        prior=Prior.phyl31)

Then I always get the error:
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
   contrasts can be applied only to factors with 2 or more levels

What is the problem here?

3.) And about prior specifications: I found out that depending on what
prior you specifiy, the results are completely different. So for the
example above, I used a identity matrix for the variance-covariance
matrix, so the covariances are equal to zero. In my case now having 4
levels in my nominal response variable should I put some covariances
or not, and if yes, how do I choose them? What kind of prior would you
take?
And for G: G is the covariance  structure of the random effects, but
aren`t actually the (co)variances of the random effects specified by
the vcv of the tree? so why is that a 3times3 matrix? For the
(co)variance structure of B: if I have two covariates, and I simulated
them to be correlated with r=0.5, what covariances in B can I take?
would 0.5 be appropriate assuming SD=1?

Thanks a lot & Best,
Sereina

GESENDET: Freitag, 02. August 2013 um 11:03 Uhr
VON: "Jarrod Hadfield" 
AN: "Sereina Graber" 
CC: r-sig-phylo@r-project.org
BETREFF: Re: [R-sig-phylo] MCMCglmm for categorical data with more
than 2 levels - prior specification?
Hi Sereina,

You should not get that error message when you do not specify a prior
- but if you do can you let me know.

For the prior you specified you get the error message because
us(trait):units is specifying a 3x3 covariance matrix, and yet your
prior, R=list(V=1,nu=0.002), is specifying a 1x1 matrix. V should be a
3x3 matrix, but note that the residual covariance matrix with
categorical data cannot be estimated from the data. For this reason
most people would not fit a weak prior (i.e. nu=0.002) but fit a very
strong prior (fixing it at some value a priori using fix=1 in the
prior specification). The choice of residual covariance matrix is
arbitrary - the results can always be expressed in a way that do not
depend on the choice of residual covariance matrix (See the
CourseNotes).

The fixed and random effect formulae are also a bit odd. This type of
model is essentially equivalent to a trivariate model where the three
traits (on the latent scale) are the differences on a log scale
between the probability of being in categories 2,3 or 4 compared to
category 1:

log(Pr(nominal[2]))-log(Pr(nominal[1]))
log(Pr(nominal[3]))-log(Pr(nominal[1]))
log(Pr(nominal[4]))-log(Pr(nominal[1]))

where nominal[1] is called the baseline category. You can change the
baseline category by reordering the factor levels in nominal.

By having ~animal in the random formula you are assuming that a) the
phylogenetic variance for each contrast is equal and b) that the
correlation between the phylogenetic effects is one. This may make
sense in some models and with some types of base-line category, but
not generally I think. us(trait):animal allows the phylogenetic
variances to differ over the traits and for each pair of traits to
have a unique phylogenetic correlation. There are also other variance
structures that can be fitted that are somewhere between these two
extremes.

For the same reason you probably want to have trait specific
intercepts and trait specific regression coefficients for the
covariates. This can be achieved by having:

~ trait-1+trait:lnBrain + trait:binary.x

I remove the global intercept (-1) because I find the model output
easier to interpret, but it is not necessary.

You need to be careful with this type of model on these type of data,
because generally there is not much information from data on extant
tax

Re: [R-sig-phylo] MCMCglmm for categorical data with more than 2 levels - prior specification?

2013-08-02 Thread Jarrod Hadfield

Hi Sereina,

You should not get that error message when you do not specify a prior  
- but if you do can you let me know.


For the prior you specified you get the error message because  
us(trait):units is specifying a 3x3 covariance matrix, and yet your  
prior, R=list(V=1,nu=0.002), is specifying a 1x1 matrix. V should be a  
3x3 matrix, but note that the residual covariance matrix with  
categorical data cannot be estimated from the data. For this reason  
most people would not fit a weak prior (i.e. nu=0.002) but fit a very  
strong prior (fixing it at some value a priori using fix=1 in the  
prior specification). The choice of residual covariance matrix is  
arbitrary - the results can always be expressed in a way that do not  
depend on the choice of residual covariance matrix (See the  
CourseNotes).


The fixed and random effect formulae are also a bit odd.  This type of  
model is essentially equivalent to a trivariate model where the three  
traits (on the latent scale) are the differences on a log scale  
between the probability of being in categories 2,3 or 4 compared to  
category 1:


 log(Pr(nominal[2]))-log(Pr(nominal[1]))
 log(Pr(nominal[3]))-log(Pr(nominal[1]))
 log(Pr(nominal[4]))-log(Pr(nominal[1]))

where nominal[1] is called the baseline category. You can change the  
baseline category by reordering the factor levels in nominal.


By having ~animal in the random formula you are assuming that a) the  
phylogenetic variance for each contrast is equal  and b) that the  
correlation between the phylogenetic effects is one. This may make  
sense in some models and with some types of base-line category, but  
not generally I think. us(trait):animal allows the phylogenetic  
variances to differ over the traits and for each pair of traits to  
have a unique phylogenetic correlation. There are also other variance  
structures that can be fitted that are somewhere between these two  
extremes.


For the same reason you probably want to have trait specific  
intercepts and trait specific regression coefficients for the  
covariates.  This can be achieved by having:


~ trait-1+trait:lnBrain + trait:binary.x

I remove the global intercept (-1) because I find the model output  
easier to interpret, but it is not necessary.


You need to be careful with this type of model on these type of data,  
because generally there is not much information from data on extant  
taxa about the parameters of comparative analyses, particularly when  
the data are categorical. This means that priors, even ones that  
appear innocuous such as flat priors, may have a substantial influence  
on the posterior. In addition, numerical problems may exist in  
categorical models when the posterior distribution for the  
phylogenetic intra-class correlations has support in regions close to  
one (either because the true value is close to one, or because the  
posterior distribution is very wide because the data are not very  
informative).  This can be checked by saving the latent variables  
(pL=TRUE in the call to MCMCglmm) and making sure that the absolute  
values of the latent variables do not regularly exceed 20. Lastly,  
mixing may be (very) poor so you may have to wait an inordinate amount  
of time to completely sample the posterior.


Cheers,

Jarrod





Doing this is fine: you can always rescale the model parameters post analysis



Quoting Sereina Graber  on Fri, 2 Aug 2013  
10:17:58 +0200 (CEST):




Hi all,

I am doing a phylogenetic analysis using the MCMCglmm package with the
phylogenetic tree as the pedigree (Hadfield & Nakagawa 2010). I have a
categorical response variable ("nominal") with more than 2 categories
(4 categories in total) and a continuous and a binary explanatory
variable. My model:

mod<-MCMCglmm(nominal ~ lnBrain + binary.x, random= ~animal,
family="categorical",rcov=~us(trait):units, prior=prior4,
data=bird.data, pedigree=bird.tree)

Now there is always the following error message appearing if I do not
specify any priors, thus, using the default:

 Error in priorformat(if (NOpriorG) { :
  V is the wrong dimension for some prior$G/prior$R elements

However, I then tried different priors which didn`t work, because I
would have the wrong dimensions in the prior...can any one help me
with how I have to specifiy the priors correctly, what dimensions do I
need? My priors:

var2<-cbind(c(1e+08,0.1,0.1), c(0.1,1e+08,0.1),c(0.1,0.1,1e+08))
prior4<-list(R=list(V=1,nu=0.002), G=list(G1=list(V=1,
nu=0.002)),B=list(mu=rep(0,3), V=var2))

These priors lead to the error:

Error in priorformat(if (NOpriorG) { :
  V is the wrong dimension for some prior$G/prior$R elements

For any help I am very grateful.

Best


--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://

[R-sig-phylo] MCMCglmm for categorical data with more than 2 levels - prior specification?

2013-08-02 Thread Sereina Graber

Hi all,



 

I am doing a phylogenetic analysis using the MCMCglmm package with the phylogenetic tree as the pedigree (Hadfield & Nakagawa 2010). I have a categorical response variable ("nominal") with more than 2 categories (4 categories in total) and a continuous and a binary explanatory variable. My model:

 

mod<-MCMCglmm(nominal ~ lnBrain + binary.x, random= ~animal, family="categorical",rcov=~us(trait):units, prior=prior4, data="" pedigree=bird.tree)

 

Now there is always the following error message appearing if I do not specify any priors, thus, using the default:

 

 Error in priorformat(if (NOpriorG) { : 
  V is the wrong dimension for some prior$G/prior$R elements

 

However, I then tried different priors which didn`t work, because I would have the wrong dimensions in the prior...can any one help me with how I have to specifiy the priors correctly, what dimensions do I need? My priors:

 

var2<-cbind(c(1e+08,0.1,0.1), c(0.1,1e+08,0.1),c(0.1,0.1,1e+08))
prior4<-list(R=list(V=1,nu=0.002), G=list(G1=list(V=1, nu=0.002)),B=list(mu=rep(0,3), V=var2))

 

These priors lead to the error:

 

Error in priorformat(if (NOpriorG) { : 
  V is the wrong dimension for some prior$G/prior$R elements

 

 

For any help I am very grateful.

 

Best




___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/