Re: [R-sig-phylo] Threshold models using threshBayes vs MCMCglmmRAM

2016-12-15 Thread Jarrod Hadfield

Hi Chris,

OK - stick with the RAM model; the h2 is so high that you will run into 
numerical issues otherwise. In the two-trait model you might want to add 
us(at.level(trait,1)):units to the random effects (make sure it is 
not the last term in the random formula) in case log.dep has an h2 
substantially less than 1. Having a multi-level response will help with 
power, so I would go for it. threshBayes does handle ordinal responses, 
but you would probably have to run it for a VERY long time to sample the 
posterior adequately.
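For concreteness, a rough sketch of what that two-trait call could look like (not code from the thread; the object names follow the other messages, and prior3 is a hypothetical prior that mirrors prior2 but adds a G element for the extra term):

prior3 <- list(R = list(V = diag(2)*1e-15, fix = 1),
               G = list(G1 = list(V = diag(1), nu = 0.002),
                        G2 = list(V = diag(2), nu = 0.002, fix = 2)))

ellb.log.dep2 <- MCMCglmm(cbind(log.med.depth, parity) ~ trait - 1,
                          # extra term for the Gaussian trait placed first,
                          # so it is not the last term in the random formula
                          random = ~ us(at.level(trait, 1)):units + us(trait):animal,
                          rcov = ~ us(trait):units,
                          pedigree = shark.tree,
                          reduced = TRUE,
                          data = traits,
                          prior = prior3,
                          pr = TRUE, pl = TRUE,
                          family = c("gaussian", "threshold"))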


Cheers,

Jarrod

On 16/12/2016 07:11, Chris Mull wrote:

Hi Jarrod,
I hadn't appreciated that the clustering of reproductive modes on the 
tree might limit our ability to detect some of these relationships. 
This is in fact a step in testing reproduction as an ordinal variable 
(egg-laying, lecithotrophic live-bearing, and matrotrophic 
live-bearing) which follows gradients of depth and latitude, and 
threshBayes cannot handle ordinal variables. Perhaps treating the data 
this way will help given more transitions. I have run the model in 
MCMCglmm and have appended the summary and attached the histogram of 
the liabilities. Thanks so much for your help with this...


summary(dep2)
Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000

 DIC: 31.2585

 G-structure:  ~animal

              post.mean l-95% CI u-95% CI eff.samp
animal            82.18    35.88    140.1    6.266

 R-structure:  ~units

              post.mean l-95% CI u-95% CI eff.samp
units                 1        1        1        0

 Location effects: parity ~ log.med.depth

              post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)      0.4250 -13.5697  13.7913    28.54 0.946
log.med.depth   -0.3601  -4.4399   3.8022    16.48 0.862

On Thu, Dec 15, 2016 at 11:10 PM, Jarrod Hadfield wrote:


Hi Chris,

I think MCMCglmm is probably giving you the right answer. There
are huge chunks of the phylogeny that are either egg-laying or
live-bearing. The non-phylogenetic model shows a strong
relationship between reproductive mode and depth, and that might
be causal or it might just be because certain taxa tend to live at
greater depths and *happen to have* the same reproductive mode.
There's not much information in the phylogenetic spread of
reproductive modes to distinguish between these hypotheses, hence
the wide confidence intervals.   Just to be sure can you

a) just perform independent contrasts (not really suitable for
binary data, but probably won't give you an answer far off the
truth and is a nice simple sanity check).

b) using MCMCglmm (not MCMCglmmRAM) fit

prior.dep2<-list(R=list(V=diag(1), fix=1),
G=list(G1=list(V=diag(1), nu=0.002)))

dep2<-MCMCglmm(parity~log.med.depth,
   random=~animal,
   rcov=~units,
   pedigree=shark.tree,
   data=traits,
   prior=prior.dep2,
   pr=TRUE,
   pl=TRUE,
   family="threshold")

and send me the summary and hist(dep2$Liab)

Cheers,

Jarrod



On 16/12/2016 07:02, Jarrod Hadfield wrote:


Hi Chris,

I think ngen in threshbayes is not the number of full iterations
(i.e. a full update of all parameters), but the number of full
iterations multiplied by the number of nodes (2n-1). With n=600
species this means threshbayes has only really done about 8,000
iterations (i.e. about 1000X less than MCMCglmm). Simulations
suggest threshbayes is about half as efficient per full iteration
as MCMCglmm which means that it may have only collected
0.5*1132/1200 = 0.47 effective samples from the posterior. The
very extreme value and the surprisingly tight credible intervals
(+/-0.007) also suggest a problem.
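
In R, the back-of-envelope arithmetic behind that estimate (the 1132 is the 
covariance eff.samp from the MCMCglmm fit quoted further down in the thread):

1e7 / (2*600 - 1)   # ~8340 full iterations actually performed by threshBayes
0.5 * 1132 / 1200   # ~0.47 effective samples expected at half the per-iteration efficiency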

**However**, the low effective sample size for the covariance in
the MCMCglmm model is also worrying given the length of the chain, and
may point to potential problems. Are egg-laying/live-bearing
scattered throughout the tree, or do they tend to aggregate a
lot? Can you send me the output from:

prior.dep<-list(R=list(V=diag(1)*1e-15, fix=1),
G=list(G1=list(V=diag(1), fix=1)))

dep<-MCMCglmm(parity~log.med.depth,
   random=~animal,
   rcov=~units,
   pedigree=shark.tree,
   reduced=TRUE,
   data=traits,
   prior=prior.dep,
   pr=TRUE,
   pl=TRUE,
   family="threshold")

summary(dep)

summary(glm(parity~log.med.depth, data=traits,
family=binomial(link=probit)))

Cheers,

Jarrod



On 15/12/2016 20:59, Chris Mull wrote:

Hi All,
I am trying to look at the 

Re: [R-sig-phylo] Threshold models using threshBayes vs MCMCglmmRAM

2016-12-15 Thread Chris Mull
Hi Jarrod,
I hadn't appreciated that the clustering of reproductive modes on the tree
might limit our ability to detect some of these relationships. This is in
fact a step in testing reproduction as an ordinal variable (egg-laying,
lecithotrophic live-bearing, and matrotrophic live-bearing) which follows
gradients of depth and latitude, and threshBayes cannot handle ordinal
variables. Perhaps treating the data this way will help given more
transitions. I have run the model in MCMCglmm and have appended the summary
and attached the histogram of the liabilities. Thanks so much for your help
with this...

summary(dep2)
Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000

 DIC: 31.2585

 G-structure:  ~animal

              post.mean l-95% CI u-95% CI eff.samp
animal            82.18    35.88    140.1    6.266

 R-structure:  ~units

              post.mean l-95% CI u-95% CI eff.samp
units                 1        1        1        0

 Location effects: parity ~ log.med.depth

              post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)      0.4250 -13.5697  13.7913    28.54 0.946
log.med.depth   -0.3601  -4.4399   3.8022    16.48 0.862

On Thu, Dec 15, 2016 at 11:10 PM, Jarrod Hadfield 
wrote:

> Hi Chris,
>
> I think MCMCglmm is probably giving you the right answer. There are huge
> chunks of the phylogeny that are either egg-laying or live-bearing. The
> non-phylogenetic model shows a strong relationship between reproductive
> mode and depth, and that might be causal or it might just be because
> certain taxa tend to live at greater depths and *happen to have* the same
> reproductive mode. There's not much information in the phylogenetic spread
> of reproductive modes to distinguish between these hypotheses, hence the
> wide confidence intervals.   Just to be sure can you
>
> a) just perform independent contrasts (not really suitable for binary
> data, but probably won't give you an answer far off the truth and is a nice
> simple sanity check).
>
> b) using MCMCglmm (not MCMCglmmRAM) fit
>
> prior.dep2<-list(R=list(V=diag(1), fix=1), G=list(G1=list(V=diag(1),
> nu=0.002)))
>
> dep2<-MCMCglmm(parity~log.med.depth,
>random=~animal,
>rcov=~units,
>pedigree=shark.tree,
>data=traits,
>prior=prior.dep2,
>pr=TRUE,
>pl=TRUE,
>family="threshold")
>
> and send me the summary and hist(dep2$Liab)
>
> Cheers,
>
> Jarrod
>
>
>
> On 16/12/2016 07:02, Jarrod Hadfield wrote:
>
> Hi Chris,
>
> I think ngen in threshbayes is not the number of full iterations (i.e. a
> full update of all parameters), but the number of full iterations
> multiplied by the number of nodes (2n-1). With n=600 species this means
> threshbayes has only really done about 8,000 iterations (i.e. about 1000X
> less than MCMCglmm). Simulations suggest threshbayes is about half as
> efficient per full iteration as MCMCglmm which means that it may have only
> collected 0.5*1132/1200 = 0.47 effective samples from the posterior. The
> very extreme value and the surprisingly tight credible intervals (+/-0.007)
> also suggest a problem.
>
> **However**, the low effective sample size for the covariance in the
> MCMCglmm model is also worrying given the length of the chain, and may point to
> potential problems. Are egg-laying/live-bearing scattered throughout the
> tree, or do they tend to aggregate a lot? Can you send me the output from:
>
> prior.dep<-list(R=list(V=diag(1)*1e-15, fix=1),
> G=list(G1=list(V=diag(1), fix=1)))
>
> dep<-MCMCglmm(parity~log.med.depth,
>random=~animal,
>rcov=~units,
>pedigree=shark.tree,
>reduced=TRUE,
>data=traits,
>prior=prior.dep,
>pr=TRUE,
>pl=TRUE,
>family="threshold")
>
> summary(dep)
>
> summary(glm(parity~log.med.depth, data=traits,
> family=binomial(link=probit)))
>
> Cheers,
>
> Jarrod
>
>
>
> On 15/12/2016 20:59, Chris Mull wrote:
>
> Hi All,
> I am trying to look at the correlated evolution of traits using the
> threshold model as implemented in phytools::threshBayes (Revell 2014) and
> MCMCglmmRAM (Hadfield 2015). My understanding from Hadfield 2015 is that
> the reduced animal models should yield equivalent results, yet having run
> both I am having trouble reconciling the results. I have a tree covering
> ~600 species of sharks, skates, and rays and am interested in testing for
> the correlated evolution between reproductive mode (egg-laying and
> live-bearing) and depth. When I test for this correlation using
> phytools::threshBayes there is a clear result with a high correlation
> coefficient (-0.986), as I would predict. When I run the same analysis using
> MCMCglmmRAM I get a very different result, with 

Re: [R-sig-phylo] Threshold models using threshBayes vs MCMCglmmRAM

2016-12-15 Thread Chris Mull
Hi Jarrod,
Thanks very much for your fast reply. Egg-laying and live-bearing are
dispersed throughout the tree ( I have attached a PDF of a traitplot with
egg-laying and live-bearing on it; blue is egg-laying and red is
live-bearing), being universal in chimaeras and skates, and found in
several families of galeomorph sharks. Here are the summaries of the two
models:

#
>summary(dep)
Iterations = 3001:12991
 Thinning interval  = 10
 Sample size  = 1000

 DIC: 62.7561

 G-structure:  ~animal

              post.mean l-95% CI u-95% CI eff.samp
animal                1        1        1        0

 R-structure:  ~units

              post.mean l-95% CI u-95% CI eff.samp
units                 1        1        1        0

 Location effects: parity ~ log.med.depth

              post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept)     0.13854 -0.97336  1.40576    52.98  0.87
log.med.depth  -0.06105 -0.37972  0.32122    14.31  0.69

#
>summary(glm)
Call:
glm(formula = parity ~ log.med.depth, family = binomial(link = probit),
data = traits)

Deviance Residuals:
Min   1Q   Median   3Q  Max
-2.5976  -1.0564   0.5410   0.8522   1.6867

Coefficients:
  Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.6195 0.2428  10.789   <2e-16 ***
log.med.depth  -0.9815 0.1030  -9.526   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 784.17  on 609  degrees of freedom
Residual deviance: 683.16  on 608  degrees of freedom
AIC: 687.16

Number of Fisher Scoring iterations: 4


Please let me know if there is any more info I can provide...


Cheers,
Chris

On Thu, Dec 15, 2016 at 11:02 PM, Jarrod Hadfield 
wrote:

> Hi Chris,
>
> I think ngen in threshbayes is not the number of full iterations (i.e. a
> full update of all parameters), but the number of full iterations
> multiplied by the number of nodes (2n-1). With n=600 species this means
> threshbayes has only really done about 8,000 iterations (i.e. about 1000X
> less than MCMCglmm). Simulations suggest threshbayes is about half as
> efficient per full iteration as MCMCglmm which means that it may have only
> collected 0.5*1132/1200 = 0.47 effective samples from the posterior. The
> very extreme value and the surprisingly tight credible intervals (+/-0.007)
> also suggest a problem.
>
> **However**, the low effective sample size for the covariance in the
> MCMCglmm model is also worrying given the length of the chain, and may point to
> potential problems. Are egg-laying/live-bearing scattered throughout the
> tree, or do they tend to aggregate a lot? Can you send me the output from:
>
> prior.dep<-list(R=list(V=diag(1)*1e-15, fix=1),
> G=list(G1=list(V=diag(1), fix=1)))
>
> dep<-MCMCglmm(parity~log.med.depth,
>random=~animal,
>rcov=~units,
>pedigree=shark.tree,
>reduced=TRUE,
>data=traits,
>prior=prior.dep,
>pr=TRUE,
>pl=TRUE,
>family="threshold")
>
> summary(dep)
>
> summary(glm(parity~log.med.depth, data=traits,
> family=binomial(link=probit)))
>
> Cheers,
>
> Jarrod
>
>
>
> On 15/12/2016 20:59, Chris Mull wrote:
>
> Hi All,
> I am trying to look at the correlated evolution of traits using the
> threshold model as implemented in phytools::threshBayes (Revell 2014) and
> MCMCglmmRAM (Hadfield 2015). My understanding from Hadfield 2015 is that
> the reduced animal models should yield equivalent results, yet having run
> both I am having trouble reconciling the results. I have a tree covering
> ~600 species of sharks, skates, and rays and am interested in testing for
> the correlated evolution between reproductive mode (egg-laying and
> live-bearing) and depth. When I test for this correlation using
> phytools::threshBayes there is a clear result with a high correlation
> coefficient (-0.986), as I would predict. When I run the same analysis using
> MCMCglmmRAM I get a very different result, with a low degree of covariation
> and a CI crossing zero (-0.374; -3.638 to 3.163, 95% CI). Both models are run
> for 10,000,000 generations with the same burn-in and sampling period. I
> have looked at the trace plots of the parameter estimates and likelihood for
> each model using coda. I'm not sure how to reconcile the differences in
> these results and any advice would be greatly appreciated. I have appended
> the code and outputs below.
>
>
> ###
> #phytools::threshBayes#
> ###
>
> X<-shark.data[c("parity","log.med.depth")]
> X<-as.matrix(X)
>
> #mcmc parameters (also see control options)
> ngen<-1000
> burnin<-0.2*ngen
> sample<-500
>
> thresh<-threshBayes(shark.tree,
>X,
>types=c("discrete","continuous"),
>

Re: [R-sig-phylo] Threshold models using threshBayes vs MCMCglmmRAM

2016-12-15 Thread Jarrod Hadfield

Hi Chris,

I think MCMCglmm is probably giving you the right answer. There are huge 
chunks of the phylogeny that are either egg-laying or live-bearing. The 
non-phylogenetic model shows a strong relationship between reproductive 
mode and depth, and that might be causal or it might just be because 
certain taxa tend to live at greater depths and *happen to have* the 
same reproductive mode. There's not much information in the phylogenetic 
spread of reproductive modes to distinguish between these hypotheses, 
hence the wide confidence intervals.   Just to be sure can you


a) just perform independent contrasts (not really suitable for binary 
data, but probably won't give you an answer far off the truth and is a 
nice simple sanity check).


b) using MCMCglmm (not MCMCglmmRAM) fit

prior.dep2<-list(R=list(V=diag(1), fix=1), G=list(G1=list(V=diag(1), 
nu=0.002)))


dep2<-MCMCglmm(parity~log.med.depth,
   random=~animal,
   rcov=~units,
   pedigree=shark.tree,
   data=traits,
   prior=prior.dep2,
   pr=TRUE,
   pl=TRUE,
   family="threshold")

and send me the summary and hist(dep2$Liab)

Cheers,

Jarrod



On 16/12/2016 07:02, Jarrod Hadfield wrote:


Hi Chris,

I think ngen in threshbayes is not the number of full iterations (i.e. 
a full update of all parameters), but the number of full iterations 
multiplied by the number of nodes (2n-1). With n=600 species this 
means threshbayes has only really done about 8,000 iterations (i.e. 
about 1000X less than MCMCglmm). Simulations suggest threshbayes is 
about half as efficient per full iteration as MCMCglmm which means 
that it may have only collected 0.5*1132/1200 = 0.47 effective samples 
from the posterior. The very extreme value and the surprisingly tight 
credible intervals (+/-0.007) also suggest a problem.


*However*, the low effective sample size for the covariance in the 
MCMCglmm model is also worrying given the length of the chain, and may 
point to potential problems. Are egg-laying/live-bearing scattered 
throughout the tree, or do they tend to aggregate a lot? Can you send 
me the output from:


prior.dep<-list(R=list(V=diag(1)*1e-15, fix=1), 
G=list(G1=list(V=diag(1), fix=1)))


dep<-MCMCglmm(parity~log.med.depth,
   random=~animal,
   rcov=~units,
   pedigree=shark.tree,
   reduced=TRUE,
   data=traits,
   prior=prior.dep,
   pr=TRUE,
   pl=TRUE,
   family="threshold")

summary(dep)

summary(glm(parity~log.med.depth, data=traits, 
family=binomial(link=probit)))


Cheers,

Jarrod



On 15/12/2016 20:59, Chris Mull wrote:

Hi All,
I am trying to look at the correlated evolution of traits using the
threshold model as implemented in phytools::threshBayes (Revell 2014) and
MCMCglmmRAM (Hadfield 2015). My understanding from Hadfield 2015 is that
the reduced animal models should yield equivalent results, yet having run
both I am having trouble reconciling the results. I have a tree covering
~600 species of sharks, skates, and rays and am interested in testing for
the correlated evolution between reproductive mode (egg-laying and
live-bearing) and depth. When I test for this correlation using
phytools::threshBayes there is a clear result with a high correlation
coefficient (-0.986), as I would predict. When I run the same analysis using
MCMCglmmRAM I get a very different result, with a low degree of covariation
and a CI crossing zero (-0.374; -3.638 to 3.163, 95% CI). Both models are run
for 10,000,000 generations with the same burn-in and sampling period. I
have looked at the trace plots of the parameter estimates and likelihood for
each model using coda. I'm not sure how to reconcile the differences in
these results and any advice would be greatly appreciated. I have appended
the code and outputs below.


###
#phytools::threshBayes#
###

X<-shark.data[c("parity","log.med.depth")]
X<-as.matrix(X)

#mcmc parameters (also see control options)
ngen<-1000
burnin<-0.2*ngen
sample<-500

thresh<-threshBayes(shark.tree,
X,
types=c("discrete","continuous"),
ngen=ngen,
control = list(sample=sample))

The return correlation is -0.986 (-0.99 to -0.976, 95% HPD)
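
For reference, one way to summarise the sampled correlation with coda; this is a 
sketch that assumes the posterior draws are stored in thresh$par with a column 
named "r" -- check names(thresh$par) for your phytools version, and drop the 
burn-in rows first:

library(coda)
r.post <- mcmc(thresh$par[, "r"])   # sampled correlation, assuming a column "r"
mean(r.post)                        # posterior mean
HPDinterval(r.post, prob = 0.95)    # 95% HPD interval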


#
#MCMCglmm bivariate-gaussian#
#


prior2=list(R=list(V=diag(2)*1e-15, fix=1),
            G=list(G1=list(V=diag(2), nu=0.002, fix=2)))

ellb.log.dep<-MCMCglmm(cbind(log.med.depth,parity)~trait-1,
   random=~us(trait):animal,
   rcov=~us(trait):units,
   pedigree=shark.tree,
   reduced=TRUE,
   data=traits,

Re: [R-sig-phylo] Threshold models using threshBayes vs MCMCglmmRAM

2016-12-15 Thread Jarrod Hadfield

Hi Chris,

I think ngen in threshbayes is not the number of full iterations (i.e. a 
full update of all parameters), but the number of full iterations 
multiplied by the number of nodes (2n-1). With n=600 species this means 
threshbayes has only really done about 8,000 iterations (i.e. about 
1000X less than MCMCglmm). Simulations suggest threshbayes is about half 
as efficient per full iteration as MCMCglmm which means that it may have 
only collected 0.5*1132/1200 = 0.47 effective samples from the 
posterior. The very extreme value and the surprisingly tight credible 
intervals (+/-0.007) also suggest a problem.


*However*, the low effective sample size for the covariance in the 
MCMCglmm model is also worrying given the length of the chain, and may point 
to potential problems. Are egg-laying/live-bearing scattered throughout 
the tree, or do they tend to aggregate a lot? Can you send me the output 
from:


prior.dep<-list(R=list(V=diag(1)*1e-15, fix=1), 
G=list(G1=list(V=diag(1), fix=1)))


dep<-MCMCglmm(parity~log.med.depth,
   random=~animal,
   rcov=~units,
   pedigree=shark.tree,
   reduced=TRUE,
   data=traits,
   prior=prior.dep,
   pr=TRUE,
   pl=TRUE,
   family="threshold")

summary(dep)

summary(glm(parity~log.med.depth, data=traits, 
family=binomial(link=probit)))


Cheers,

Jarrod



On 15/12/2016 20:59, Chris Mull wrote:

Hi All,
I am trying to look at the correlated evolution of traits using the
threshold model as implemented in phytools::threshBayes (Revell 2014) and
MCMCglmmRAM (Hadfield 2015). My understanding from Hadfield 2015 is that
the reduced animal models should yield equivalent results, yet having run
both I am having trouble reconciling the results. I have a tree covering
~600 species of sharks, skates, and rays and am interested in testing for
the correlated evolution between reproductive mode (egg-laying and
live-bearing) and depth. When I test for this correlation using
phytools::threshBayes there is a clear result with a high correlation
coefficient (-0.986), as I would predict. When I run the same analysis using
MCMCglmmRAM I get a very different result, with a low degree of covariation
and a CI crossing zero (-0.374; -3.638 to 3.163, 95% CI). Both models are run
for 10,000,000 generations with the same burn-in and sampling period. I
have looked at the trace plots of the parameter estimates and likelihood for
each model using coda. I'm not sure how to reconcile the differences in
these results and any advice would be greatly appreciated. I have appended
the code and outputs below.


###
#phytools::threshBayes#
###

X<-shark.data[c("parity","log.med.depth")]
X<-as.matrix(X)

#mcmc parameters (also see control options)
ngen<-1000
burnin<-0.2*ngen
sample<-500

thresh<-threshBayes(shark.tree,
X,
types=c("discrete","continuous"),
ngen=ngen,
control = list(sample=sample))

The return correlation is -0.986 (-0.99 to -0.976, 95% HPD)


#
#MCMCglmm bivariate-gaussian#
#


prior2=list(R=list(V=diag(2)*1e-15, fix=1),
            G=list(G1=list(V=diag(2), nu=0.002, fix=2)))

ellb.log.dep<-MCMCglmm(cbind(log.med.depth,parity)~trait-1,
   random=~us(trait):animal,
   rcov=~us(trait):units,
   pedigree=shark.tree,
   reduced=TRUE,
   data=traits,
   prior=prior2,
   pr=TRUE,
   pl=TRUE,
   family=c("gaussian", "threshold"),
   thin=500,
   burnin = 100,
   nitt = 1000)

summary(ellb.log.dep)

Iterations = 101:501
Thinning interval  = 500
Sample size  = 18000
DIC: 2930.751
G-structure:  ~us(trait):animal
                                         post.mean l-95% CI u-95% CI eff.samp
traitscale.depth:traitscale.depth.animal    16.965   15.092   18.864    18000
traitparity:traitscale.depth.animal         -0.374   -3.638    3.163     1132
traitscale.depth:traitparity.animal         -0.374   -3.638    3.163     1132
traitparity:traitparity.animal               1.000    1.000    1.000        0
R-structure:  ~us(trait):units
                                         post.mean l-95% CI u-95% CI eff.samp
traitscale.depth:traitscale.depth.units     16.965   15.092   18.864    18000
traitparity:traitscale.depth.units          -0.374   -3.638    3.163     1132
traitscale.depth:traitparity.units          -0.374   -3.638    3.163     1132
traitparity:traitparity.units                1.000    1.000    1.000        0
Location effects: cbind(scale.depth, parity) ~ trait - 1
 post.mean l-95% CI u-95% CI eff.samp pMCMC

[R-sig-phylo] Threshold models using threshBayes vs MCMCglmmRAM

2016-12-15 Thread Chris Mull
Hi All,
I am trying to look at the correlated evolution of traits using the
threshold model as implemented in phytools::threshBayes (Revell 2014) and
MCMCglmmRAM (Hadfield 2015). My understanding from Hadfield 2015 is that
the reduced animal models should yield equivalent results, yet having run
both I am having trouble reconciling the results. I have a tree covering
~600 species of sharks, skates, and rays and am interested in testing for
the correlated evolution between reproductive mode (egg-laying and
live-bearing) and depth. When I test for this correlation using
phytools::threshBayes there is a clear result with a high correlation
coefficient (-0.986), as I would predict. When I run the same analysis using
MCMCglmmRAM I get a very different result, with a low degree of covariation
and a CI crossing zero (-0.374; -3.638 to 3.163, 95% CI). Both models are run
for 10,000,000 generations with the same burn-in and sampling period. I
have looked at the trace plots of the parameter estimates and likelihood for
each model using coda. I'm not sure how to reconcile the differences in
these results and any advice would be greatly appreciated. I have appended
the code and outputs below.


###
#phytools::threshBayes#
###
>X<-shark.data[c("parity","log.med.depth")]
>X<-as.matrix(X)
>
>#mcmc parameters (also see control options)
>ngen<-1000
>burnin<-0.2*ngen
>sample<-500
>
>thresh<-threshBayes(shark.tree,
>X,
>types=c("discrete","continuous"),
>ngen=ngen,
>control = list(sample=sample))

The return correlation is -0.986 (-0.99 to -0.976, 95% HPD)


#
#MCMCglmm bivariate-gaussian#
#

>prior2=list(R=list(V=diag(2)*1e-15, fix=1), G=list(G1=list(V=diag(2),
nu=0.002, fix=2)))
>
>ellb.log.dep<-MCMCglmm(cbind(log.med.depth,parity)~trait-1,
>   random=~us(trait):animal,
>   rcov=~us(trait):units,
>   pedigree=shark.tree,
>   reduced=TRUE,
>   data=traits,
>   prior=prior2,
>   pr=TRUE,
>   pl=TRUE,
>   family=c("gaussian", "threshold"),
>   thin=500,
>   burnin = 100,
>   nitt = 1000)
>
>summary(ellb.log.dep)

Iterations = 101:501
Thinning interval  = 500
Sample size  = 18000
DIC: 2930.751
G-structure:  ~us(trait):animal
                                         post.mean l-95% CI u-95% CI eff.samp
traitscale.depth:traitscale.depth.animal    16.965   15.092   18.864    18000
traitparity:traitscale.depth.animal         -0.374   -3.638    3.163     1132
traitscale.depth:traitparity.animal         -0.374   -3.638    3.163     1132
traitparity:traitparity.animal               1.000    1.000    1.000        0
R-structure:  ~us(trait):units
                                         post.mean l-95% CI u-95% CI eff.samp
traitscale.depth:traitscale.depth.units     16.965   15.092   18.864    18000
traitparity:traitscale.depth.units          -0.374   -3.638    3.163     1132
traitscale.depth:traitparity.units          -0.374   -3.638    3.163     1132
traitparity:traitparity.units                1.000    1.000    1.000        0
Location effects: cbind(scale.depth, parity) ~ trait - 1
                   post.mean  l-95% CI  u-95% CI eff.samp pMCMC
traitscale.depth     0.12297  -3.63655   4.02005    18000 0.949
traitparity         -0.02212  -1.00727   0.93387    17058 0.971

Thanks for any and all input.

Cheers,
Chris

-- 
Christopher Mull
PhD Candidate, Shark Biology and Evolutionary Neuroecology
Dulvy Lab
Simon Fraser University
Burnaby,B.C.
V5A 1S6
Canada
(778) 782-3989
twitter: @mrsharkbrain
e-mail:cm...@sfu.ca
www.christophermull.weebly.com
www.earth2ocean.org


___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] bonferroni corrections in Blomberg's K and Pagel's lambda

2016-12-15 Thread Carmelo Fruciano

Hi Ting-Wen,
as Ted pointed out, whether and how one should correct for multiple tests 
is a huge topic. Perhaps looking at the literature and forming your own 
opinion on the matter would be the best choice (for example, Perneger 
1998 - British Medical Journal and Garcia 2004 - Oikos present two 
different points of view in a very accessible way).


A few extra points:
- There are other methods that are generally less conservative than 
the Holm procedure ("sequential Bonferroni"); these include the 
Benjamini-Hochberg procedure (Benjamini & Hochberg 1995 - Journal of 
the Royal Statistical Society; see also Benjamini & Yekutieli 2001 - 
The Annals of Statistics) and other more recent procedures (such as 
Carvajal-Rodriguez & de Una-Alvarez 2011 - PLoS One); most of these 
are worth checking out, forming your own opinion on and, possibly, 
using (most of them have R implementations; see the short p.adjust() 
sketch after this list)


- If your dataset is multidimensional/multivariate in nature, you  
might want to consider using multivariate approaches for estimating  
and testing for phylogenetic signal (e.g., Adams 2014 - Systematic  
Biology), rather than many univariate tests


- If your dataset/hypothesis is multivariate in nature, testing 
individual PCs separately might be a poor choice (especially for PCs 
of order higher than 1); if it is not multivariate in nature, there 
might be no reason to use PCA
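
For illustration, a minimal sketch with made-up p-values showing how the 
Holm and Benjamini-Hochberg adjustments are obtained with stats::p.adjust:

p <- c(0.001, 0.004, 0.012, 0.03, 0.2, 0.6)  # hypothetical p-values, for illustration only
p.adjust(p, method = "holm")                 # sequential Bonferroni: controls family-wise error
p.adjust(p, method = "BH")                   # Benjamini-Hochberg: controls false discovery rate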


I hope this is of some help.
Best,
Carmelo



--
Carmelo Fruciano
Postdoctoral Fellow - Queensland University of Technology - Brisbane,  
Australia

Honorary Fellow - University of Catania - Catania, Italy
e-mail c.fruci...@unict.it
http://www.fruciano.it/research/



"Chen, Ting-Wen"  ha scritto:


Dear Ted, dear Fabio,

thank you so much for your suggestions. I found that people have applied 
Bonferroni corrections to p-values for Pagel's lambda, as shown in 
this paper: http://www.pnas.org/content/106/43/18097.abstract


In my case, I have decided to use p.adjust(x, method="holm", n=length(x)) 
to correct the p-values, which I hope is better. I also ran a 
PCA to reduce the trait dimensions and avoid correlations among the 
traits, and then tested for phylogenetic signal in the first 
several PCs.


All the best
Ting-Wen

--
Ting-Wen Chen
J.F. Blumenbach Institute of Zoology and Anthropology
Georg August University Goettingen
Berliner Str. 28
D-37073 Goettingen, Germany
Tel: +49-55139-10943

On 13 December 2016 at 19:00, r-sig-phylo-requ...@r-project.org wrote:


Send R-sig-phylo mailing list submissions to
r-sig-phylo@r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
or, via email, send a message with subject or body 'help' to
r-sig-phylo-requ...@r-project.org

You can reach the person managing the list at
r-sig-phylo-ow...@r-project.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-sig-phylo digest..."


Today's Topics:

  1. Re: bonferroni corrections in Blomberg's K and Pagel's lambda
 (Fábio Machado)


--

Message: 1
Date: Tue, 13 Dec 2016 08:49:59 -0200
From: Fábio Machado 
To: "r-sig-phylo@r-project.org" 
Subject: Re: [R-sig-phylo] bonferroni corrections in Blomberg's K and
Pagel's lambda
Message-ID: 
Content-Type: text/plain; charset="UTF-8"

Recently I came across a similar issue, but I had a far greater 
number of repeated tests, 595 actually. If I used 10,000 
permutations to assess significance, I noticed that I could never 
reject the null hypothesis. That is because with 10,000 permutations 
the minimum p-value is 1e-04, and the Bonferroni-adjusted value is 
0.0595. So, instead of using any multiple-test correction, I simply 
rejected the null hypothesis when the observed statistic fell 
completely outside the distribution constructed by simulation. This 
produced nearly identical results to a parametric approach with 
Bonferroni correction (I was analyzing correlations in that case).


I don't know if that adds to the question initially raised, since 
for 18 tests the minimum adjusted p-values are still lower than 
0.05, but this experience led me to believe that applying a Bonferroni 
correction to non-parametric p-value estimates is not that 
straightforward. I don't know whether another multiple-test correction 
method is more adequate for these cases. Any thoughts would be 
appreciated.
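
Restating that arithmetic in R:

1/1e4          # 1e-04, the smallest p-value obtainable from 10,000 permutations
595 * 1/1e4    # 0.0595, its Bonferroni-adjusted value over 595 tests -- never below 0.05
0.05/595       # ~8.4e-05, the per-test Bonferroni threshold, unreachable here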


best regards,

Fabio Andrade Machado
Laboratório de Evolução de Mamíferos
Departamento de Genética e Biologia Evolutiva - USP
f.mach...@usp.br  ; macfa...@gmail.com  


+55 11 982631029
skype: fabio_a_machado

Lattes: http://lattes.cnpq.br/3673327633303737  

[R-sig-phylo] ape chronos error

2016-12-15 Thread Riana Rishad Minocher
Hi, 

I’m writing with an issue using the chronos function in ape:

I have a rooted supertree of 186 taxa (genetic & linguistic data), and am 
trying to time-calibrate with a set of divergence dates (genetic & linguistic; 
available for about 1/3 of nodes). 

I’m using chronos and calling agemin and agemax from a table of dates: 

tree.dated <- chronos(tree, calibration = makeChronosCalib(tree, node=dat$node, 
age.min=dat$agemin, age.max=dat$agemax))

I have the following error:

Setting initial dates...
Fitting in progress... get a first set of estimates

Error in nlminb(start.para, f, g, control = opt.ctrl, lower = LOW, upper = UP) 
: 
  gradient function must return a numeric vector of length 161
In addition: Warning message:
In nlminb(start.para, f, g, control = opt.ctrl, lower = LOW, upper = UP) :
  NA/NaN function evaluation

I have had a brief look at the chronos function, and cannot determine where the 
error is generated. The error message is slightly different if I alter the 
substitution model, and seems to trip up at a different point. I’m not sure if 
it begins in setting “ini.rate” as this appears to be where the error appears 
for model = “discrete”. 

I have tried updating ape, altering my set of dates to a few nodes, removing 
the date for the root, listing node numbers and ages rather than calling it 
from the table, and reading my tree with read.nexus and in parenthetic format 
in case the tree object was problematic. I’ve noticed that the vector length 
the gradient function must generate varies depending on the number of known 
node ages provided - but I can’t figure out much more than that.

I was hoping someone had some suggestion on how I can deal with this error - 
essentially where the source of the error might be so I can adjust my input 
accordingly! Or, alternatively if anyone has a suggestion for different ways to 
time-calibrate the tree (other than chronos).

Thank you!

Best, 

Riana

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Midpoint root a tree with root() function ape ver 4.0

2016-12-15 Thread Todd Knutson
Hi Klaus, 

Actually, I need to apologize to the community. I was wrong — the midpoint() 
function works perfectly as is!

I had built a little example with dummy data that was confusing me. But, I 
changed my example and now I understand what’s going on. After running the 
midpoint() function, it simply labels both branches on either side of the 
midpoint root with the same support value (which is correct and makes sense). 
In my poor example, I thought this was a bug.

Thank you very much for this function. It’s very useful for my work!
Todd





> On Dec 15, 2016, at 3:22 PM, Klaus Schliep  wrote:
> 
> Hi Knud,
> in phangorn I try to take care that the node labels are assigned to the right 
> nodes after using midpoint, see attached code and pic. It seems to work quite 
> nicely. It would be useful if you could supply a reproducible example or the 
> tree you have problems with and open otherwise an issue on github. 
> Regards, 
> Klaus
> 
>   
> 
> 
> On Thu, Dec 15, 2016 at 9:14 AM, Todd Knutson wrote:
> Hi,
> 
> Is there any way to use the updated root() function in ape ver 4.0 to find 
> and set the midpoint root in a tree? I greatly appreciate the updated root() 
> function to include the “edgelabel = TRUE” option, so that when I have 
> bootstrapping support values listed as node labels, they get assigned to the 
> proper edge of the tree after re-rooting.
> 
> However, using the midpoint.root() function from phytools or midpoint() from 
> the phangorn packages alter the node labels after rooting, and the 
> bootstrapping values get assigned to the wrong edges.
> 
> Thus, I would love to use the new root() function, with “edgelabel = TRUE” 
> option enabled, to midpoint a tree. I would appreciate any suggestions.
> 
> Thanks,
> Todd
> 
> 
> -- 
> Klaus Schliep
> Postdoctoral Fellow
> Revell Lab, University of Massachusetts Boston
> http://www.phangorn.org/ 
> 
> 



___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Midpoint root a tree with root() function ape ver 4.0

2016-12-15 Thread Klaus Schliep
Hi Knud,
in phangorn I try to take care that the node labels are assigned to the
right nodes after using midpoint, see attached code and pic. It seems to
work quite nicely. It would be useful if you could supply a reproducible
example or the tree you have problems with and open otherwise an issue on
github.
Regards,
Klaus




On Thu, Dec 15, 2016 at 9:14 AM, Todd Knutson  wrote:

> Hi,
>
> Is there any way to use the updated root() function in ape ver 4.0 to find
> and set the midpoint root in a tree? I greatly appreciate the updated
> root() function to include the “edgelabel = TRUE” option, so that when I
> have bootstrapping support values listed as node labels, they get assigned
> to the proper edge of the tree after re-rooting.
>
> However, using the midpoint.root() function from phytools or midpoint()
> from the phangorn packages alter the node labels after rooting, and the
> bootstrapping values get assigned to the wrong edges.
>
> Thus, I would love to use the new root() function, with “edgelabel = TRUE”
> option enabled, to midpoint a tree. I would appreciate any suggestions.
>
> Thanks,
> Todd




-- 
Klaus Schliep
Postdoctoral Fellow
Revell Lab, University of Massachusetts Boston
http://www.phangorn.org/
library(phangorn)
library(ape)

data("Laurasiatherian")
tree <- nj(dist.ml(Laurasiatherian))
set.seed(42)
trees <- bootstrap.phyDat(Laurasiatherian, function(x)nj(dist.ml(x)))

tree1 <- plotBS(tree, trees, "phylogram")
pdf("Bootstrap.pdf")
par(mar=c(1,1,1,1))
par(mfrow=c(3,1))
plot(tree1, show.node.label = TRUE)
tree2 <- midpoint(tree1)
plot(tree2, show.node.label = TRUE)
tree$tip.label[43]
tree3 = tree1
# make edge to Cat large
ind = which(tree1$edge[,2]==43)
tree3$edge.length[ind]=.5
tree3 = midpoint(tree3)
plot(tree3, show.node.label = TRUE)
dev.off()


Bootstrap.pdf
Description: Adobe PDF document
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

[R-sig-phylo] Midpoint root a tree with root() function ape ver 4.0

2016-12-15 Thread Todd Knutson
Hi,

Is there any way to use the updated root() function in ape ver 4.0 to find and 
set the midpoint root in a tree? I greatly appreciate the updated root() 
function to include the “edgelabel = TRUE” option, so that when I have 
bootstrapping support values listed as node labels, they get assigned to the 
proper edge of the tree after re-rooting. 

However, using the midpoint.root() function from phytools or midpoint() from 
the phangorn packages alter the node labels after rooting, and the 
bootstrapping values get assigned to the wrong edges. 

Thus, I would love to use the new root() function, with “edgelabel = TRUE” 
option enabled, to midpoint a tree. I would appreciate any suggestions. 

Thanks,
Todd 
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] compressTipLabel as an option to read.trees()

2016-12-15 Thread Emmanuel Paradis
It seems that .compressTipLabel's running times are proportional to N 
(number of trees) and to log(n) (n: nb of tips).


R> tr <- rmtree(1e4, 1e4) # takes ~5 minutes
R> system.time(a <- .compressTipLabel(tr))
   user  system elapsed
 20.904   0.376  21.275

R> print(object.size(tr), unit = "Gb")
8.2 Gb
R> print(object.size(a), unit = "Gb")
3 Gb

If I divide N by 10, it takes 10 times less time:

R> tr <- rmtree(1e3, 1e4)
R> system.time(a <- .compressTipLabel(tr))
   user  system elapsed
  2.088   0.000   2.088

To be compared with my previous message with N=1 and n=1000 which 
took 20 times less time (~1.2 sec).


I guess reading the tree file (either in Newick or in NEXUS) will be 
much longer than any of these.
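
So in practice one can simply compress after reading; a sketch, with a 
hypothetical file name:

trees <- read.tree("big_trees.nwk")   # a "multiPhylo" object when the file holds several trees
trees <- .compressTipLabel(trees)     # share a single tip.label vector across all trees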


Best,

Emmanuel

Le 14/12/2016 à 22:44, Yan Wong a écrit :


On 14 Dec 2016, at 20:57, Emmanuel Paradis  wrote:


What is the size of your problem?


Erm, quite large. I am looking at tree comparison metrics for roughly 10,000 
trees with perhaps 10,000 tips on each, replicated several times. The Newick 
files themselves take up gigabytes uncompressed. For this sized problem I'm 
likely to implement my own comparison metrics, but I want to trial this out 
with a tested library before rolling my own.


Do you use a recent version of ape? This function was improved one or two years 
ago.


Yes, 4.0.

But I’m happy for the moment to just leave this stuff running for days on a 
server, so it was just a quick suggestion really.

Thanks for the quick reply

Yan






___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/