Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-17 Thread Julien Clavel
Hi Krzysztof,

I agree with you that if you want to test explicitly adaptation in a regression 
framework (i.e. environmental variables as predictors), slouch/mvslouch is more 
sounding than doing a simple regression ((p)GLS) or multiple regression. 
I emphasize here that the coefficients of a regression model and the 
correlations are really distincts. It's because all is about the nature of the 
variation in the response variable in a regression model (and it's why that 
changing the variables in a multiple regression assume different nature of 
errors terms).

The initial question was on how to obtain the correlations between the traits, 
and possibly their significance, while CCA provides only the correlation 
between the traits sets.

The significance of the correlations computed through multivariate approaches 
may be impeded by the number of parameters to estimate/sample size (a power 
issue), but in multiple regression setting if there is correlated independent 
variables the standard errors will be impacted and thus the p-values (in a more 
misleading way). 
It's also true that, as Krzystof pointed out, BM may not be the best 
evolutionary model for describing the correlation in the residuals.
However the number of parameters to estimate will be even greater (for OU 
multivariate model such as in mvSLOUCH), and we may lack power in the 
significance tests of the correlations, no?


Julien


> Date: Sat, 17 Oct 2015 14:09:25 +0200
> From: krzysztofbartos...@wp.pl
> To: r-sig-phylo@r-project.org
> CC: r-sig-phylo@r-project.org
> Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
>
> Hi Sandra, Julien and Liam!
> One thought that I had is since you have an environmental variable and one 
> supposedly correlated with it is that then the slouch/mvSLOUCH (on CRAN) 
> models could be a possibility? In them you can have the environmental 
> variable(s) evolving as a Brownian motion (i.e. random drift, no selecetion) 
> and then the responding one(s) as an OU whose primary optimum depends on the 
> state of the environment. As far as I followed from the discussion you had 
> either both environment and responding trait
> following a Brownian motion (meaning there is correlation between them but 
> not adaptation) or both following an OU process (meaning that the environment 
> is trying to adapt to something which might not be justified biologically).
>
> Sandra if you think such a setup - a trait adapting to a primary optimum 
> defined by a randomly evolving environment (this is not the same as being 
> correlated with it) is something useful for you I can help you out with using 
> mvSLOUCH.
>
> You can have a look at:
> * Hansen, T.F.,Pienaar,J.,Orzack,S.H.,2008. A comparative method for studying 
> adaptation to a randomly evolving environment.Evolution 62,19651977. 
> (for slouch model)
> * Bartoszek, K.,Pienaar,J.,Mostad,P.,Andersson,S.,Hansen,T.F.,2012. A 
> phylogenetic comparative method for studying multivariate adaptation 
> .J.Theor.Biol. 314,204215. (for mvSLOUCH model)
>
> Best wishes
> Krzysztof
>
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
  
___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-17 Thread krzysztofbartoszek
Hi Sandra, Julien and Liam!
One thought that I had is since you have an environmental variable and one 
supposedly correlated with it is that then the slouch/mvSLOUCH (on CRAN) models 
could be a possibility? In them you can have the environmental variable(s) 
evolving as a Brownian motion (i.e. random drift, no selecetion) and then the 
responding one(s) as an OU whose primary optimum depends on the state of the 
environment. As far as I followed from the discussion you had either both 
environment and responding trait 
following a Brownian motion (meaning there is correlation between them but not 
adaptation) or both following an OU process (meaning that the environment is 
trying to adapt to something which might not be justified biologically).

Sandra if you think such a setup - a trait adapting to a primary optimum 
defined by a randomly evolving environment (this is not the same as being 
correlated with it) is something useful for you I can help you out with using 
mvSLOUCH.

You can have a look at:
* Hansen, T.F.,Pienaar,J.,Orzack,S.H.,2008. A comparative method for studying 
adaptation to a randomly evolving environment.Evolution 62,19651977. 
(for slouch model)
* Bartoszek, K.,Pienaar,J.,Mostad,P.,Andersson,S.,Hansen,T.F.,2012. A 
phylogenetic comparative method for studying multivariate adaptation 
.J.Theor.Biol. 314,204215. (for mvSLOUCH model)

Best wishes
Krzysztof

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-17 Thread krzysztofbartoszek
Hi Julien,

You are right that power is a big (and generally still unexplored in the PCM 
context) issue. 
One way to address the question is after estimating the parameters of the 
BM/OU/mvSLOUCH model
you can calculate the covariance/correlation function of the traits, say 
Cov(Y)(t)/Corr(Y)(t) where Y is the 2D vector trait+environment variable. The 
mvSLOUCH package returns these, I guess similar as stationary(model_ou) in 
mvMorph? The of course one can look at the magnitude of the trait+environment 
correlation coefficient.  Regarding power difficult to say, model selection BM 
v OUBM might be difficult if the sample is small. But on the other hand in the 
mvSLOUCH paper Bartoszek et. al. 
(2012) we had a n=33 Cervidae set and a 2dim OU+1dim BM clearly outperformed 
3dim BM (by AICc). If the trait is related to the environment then I would 
hazard that this should be picked up. Estimation of individual parameters in 
the SDE can be a tough thing but from my experience the function Cov(Y)(now) is 
decently estimated. Which makes sense - this is a complex combination of the 
SDE parameters and the ML tries to fit them to the contemporary sample which is 
described by E(Y)(now), 
Cov(Y)(now). In the 1dim OU,BM cases this was studied quite well by us (Serik 
Sagitov) and Cecile Ane et. al. and there seems to be quite a bit of power to 
get at these parameter combinations. 

Another  possibility to look at is at the conditional expectation 
E(trait | environment)(now) - the evolutionary regression or its limit the 
optimal/limiting regression E(trait | environment)(infinity) .

To get significance is not that easy, I guess the best way is a parametric 
bootstrap as you suggest below. But this depends on the pylogeny size, if it 
will not take overly long.

I raised this point because depending on the model i.e. BM/OU/mvSLOUCH the 
design matrix of the regression should be different (Hansen et. al. 2008 
discuss this at length). And the interpretation of the regression parameters 
might be different - e.g. you could have a phylogenetic lag in the estimated 
regression coefficient if the design matrix is just the measured variable.

Best wishes
Krzysztof

> Hi Krzysztof,
> 
> I agree with you that if you want to test explicitly adaptation in a 
> regression framework (i.e. environmental variables as predictors), 
> slouch/mvslouch is more sounding than doing a simple regression ((p)GLS) 
> or multiple regression. 
> I emphasize here that the coefficients of a regression model and the 
> correlations are really distincts. It's because all is about the nature 
> of the variation in the response variable in a regression model (and 
> it's why that changing the variables in a multiple regression assume 
> different nature of errors terms).
> 
> The initial question was on how to obtain the correlations between the 
> traits, and possibly their significance, while CCA provides only the 
> correlation between the traits sets.
> 
> The significance of the correlations computed through multivariate 
> approaches may be impeded by the number of parameters to estimate/sample 
> size (a power issue), but in multiple regression setting if there is 
> correlated independent variables the standard errors will be impacted 
> and thus the p-values (in a more misleading way). 
> It's also true that, as Krzystof pointed out, BM may not be the best 
> evolutionary model for describing the correlation in the residuals.
> However the number of parameters to estimate will be even greater (for 
> OU multivariate model such as in mvSLOUCH), and we may lack power in the 
> significance tests of the correlations, no?
> 
> 
> Julien
> 
> 
> > Date: Sat, 17 Oct 2015 14:09:25 +0200
> > From: krzysztofbartos...@wp.pl
> > To: r-sig-phylo@r-project.org
> > CC: r-sig-phylo@r-project.org
> > Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
> >
> > Hi Sandra, Julien and Liam!
> > One thought that I had is since you have an environmental variable and 
> one supposedly correlated with it is that then the slouch/mvSLOUCH (on 
> CRAN) models could be a possibility? In them you can have the 
> environmental variable(s) evolving as a Brownian motion (i.e. random 
> drift, no selecetion) and then the responding one(s) as an OU whose 
> primary optimum depends on the state of the environment. As far as I 
> followed from the discussion you had either both environment and 
> responding trait
> > following a Brownian motion (meaning there is correlation between them 
> but not adaptation) or both following an OU process (meaning that the 
> environment is trying to adapt to something which might not be justified 
> biologically).
> >
> > Sandra if you think such a setup - a trait adapting to a primary optimum 
> def

Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-15 Thread Julien Clavel
Hi Sandra,

Both approaches are (p)GLS, but they represent different models: multivariate 
and multiple regression.
In the first you have several dependent variables, in the second you have 
several independent variables. 
The error associated with the different responses (or dependent) variables may 
have different variances and may be correlated in the multivariate model.
For sure you can dilute the power of the analysis when you try to make more 
estimates with the same amount of data (as it is the case in the multivariate 
model). 

The approach which consist in replacing the dependent and independent variable, 
is similar to what is called "reverse regression". But doing so you make 
particular assumptions because the nature of the stochastic error is different 
in each regressions. Maybe someone have more thought on this and if this can be 
misleading?


Julien


> Date: Wed, 14 Oct 2015 16:13:03 -0300
> Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
> From: gou...@mnhn.fr
> To: julien.cla...@hotmail.fr
> CC: liam.rev...@umb.edu; r-sig-phylo@r-project.org
>
> Oi Julien,
>
> Yes you are right, I was comparing things that are not the same!
> However, even changing the p-values as you suggested, the conclusions
> (which is really what i'm interested in) are still much different
> between the two methods. For the example I gave above, the correlation
> is still not significant (p-value = 0.16).
>
> Overall i get much less significant correlations with this method
> compared to pGLS. Is there an explanation? Could it be a problem of
> over-parametrisation?
>
> Best,
> Sandra.
>
>
> 2015-10-14 13:37 GMT-03:00 Julien Clavel
> <julien.cla...@hotmail.fr<mailto:julien.cla...@hotmail.fr>>:
> Sandra,
>
> I just figured out, are you directly comparing the coefficients
> (slopes) of the multiple regression analysis with the evolutionary
> correlations computed with mvBM?
>
> These are two distincts things although they are related (same sign)!
> For instance, while correlations (and evolutionary correlations) range
> between -1 and 1, the range of the slope can take higher values.
>
> Best,
>
> Julien
>
> 
>> From: julien.cla...@hotmail.fr<mailto:julien.cla...@hotmail.fr>
>> To: gou...@mnhn.fr<mailto:gou...@mnhn.fr>;
> liam.rev...@umb.edu<mailto:liam.rev...@umb.edu>
>> Date: Wed, 14 Oct 2015 17:50:37 +0200
>> CC: r-sig-phylo@r-project.org<mailto:r-sig-phylo@r-project.org>
>> Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
>>
>> Hi Sandra,
>>
>> Maybe I choose a bad example to illustrate the approach. In fact I
> simulated positive correlations so the way the p-value was computed was
> for illustrative purpose and is wrong for negative correlations (we
> need a two-sided test). In fact you must look whether your 95%
> confidence interval for the parameter include 0 or not.
>>
>> Just replace:
>> p_values <- apply(correl_boot,2,function(x){ mean(x < 0) })
>>
>> by something like that (if I'm not wrong)
>>
>> pvalue <- function(x){
>> neg <- mean(x>0)
>> pos <- mean(x<0)
>> pval <- min(neg,pos)*2
>> return(pval)
>> }
>>
>> p_values <- apply(correl_boot,2, pvalue)
>>
>> This should fix the problem.
>>
>> Julien
>>
>> 
>>> Date: Wed, 14 Oct 2015 11:35:43 -0300
>>> Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
>>> From: gou...@mnhn.fr<mailto:gou...@mnhn.fr>
>>> To: liam.rev...@umb.edu<mailto:liam.rev...@umb.edu>
>>> CC: julien.cla...@hotmail.fr<mailto:julien.cla...@hotmail.fr>;
> r-sig-phylo@r-project.org<mailto:r-sig-phylo@r-project.org>
>>>
>>> Dear Liam and Julien,
>>>
>>> Thank you both for your help. I have tried both solutions on the same
>>> dataset under a BM model to compare them. I expected to get similar
>>> results, at least for the general conclusions on the significance of
>>> correlations.
>>>
>>> for Julien's solution, I ran:
>>>
>>> model<-mvBM(tree, data, model="BM1",method="pic")
>>> cov2cor(model$sigma)
>>>
>>> # bootstrapping to get p-values
>>> data_boot <- simulate(model, tree=tree, nsim=100)
>>> model_boot <- lapply(data_boot, function(x){ cov2cor(mvBM(tree, x,
>>> model="BM1", method="pic", echo=FALSE, diagnostic=FALSE)$sigma) })
>>> correl_boot <- t(sapply(model_boot, function(x){ x[upper.tri(x)] }))
>>&

Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-14 Thread sandra goutte
ta, model="BM1",method="pic",
>> param=list(constraint="diagonal"))
>>
>> # Test the significance with an LRT test
>> LRT(model,model_2)
>>
>>
>>
>>
>> # But if you want to assess the significance for each traits when there
>> is more than two traits, use bootstrap instead:
>>
>> # Simulate data under the MLE and then fit the model to simulated traits
>> data_boot <- simulate(model, tree=tree, nsim=100)
>> model_boot <- lapply(data_boot, function(x){ cov2cor(mvBM(tree, x,
>> model="BM1", method="pic", echo=FALSE, diagnostic=FALSE)$sigma) })
>>
>> # Retrieve the results
>> correl_boot <- t(sapply(model_boot, function(x){ x[upper.tri(x)] }))
>> colnames(correl_boot)<-c("[1,2]","[1,3]","[2,3]")
>>
>> # Plot the results
>> boxplot(correl_boot, main="Correlations")
>>
>> # Assess if simulated data under the MLE are significantly different from
>> zero
>> p_values <- apply(correl_boot,2,function(x){ mean(x < 0) })
>> p_values
>>
>>
>>
>> # You can also use the same approach described above if the best fit
>> process is an Ornstein-Uhlenbeck (which is unlikely here); e.g.:
>>
>> model_ou <- mvOU(tree, data, model="OU1", method="rpf",
>> param=list(vcv="ouch"))
>> cov2cor(stationary(model_ou))
>>
>> There is some related examples in the package vignette.
>> HTH,
>>
>>
>> Julien
>>
>> Date: Mon, 12 Oct 2015 14:34:26 -0300
>>> From: gou...@mnhn.fr
>>> To: r-sig-phylo@r-project.org
>>> Subject: [R-sig-phylo] Question on pCCA (phyl.cca)
>>>
>>> Dear list,
>>>
>>> I have 2 sets of continuous variables (acoustic and environmental) and a
>>> phylogeny for a group of species. I am looking for correlations between
>>> the
>>> environmental variables and the acoustic ones. I ran a pCCA (phyl.cca
>>> {phytools} ), which gave me p-values for the correlations of each of my
>>> environmental variables with* all* acoustic variables. However, i have
>>> specific hyptotheses I would like to test such as: is environemental
>>> variable A correlated with acoustic variable Z?
>>>
>>> If I run univariate pGLS for each of my acoustic variables, I have the
>>> problem that I am not taking into account the potential colinearity
>>> between
>>> my acoustic variables... So I can first run pPCA, but then i'm not really
>>> testing my original hypotheses on specific variables.
>>>
>>> So my question is: is there a way to test the significance of correlation
>>> among continuous variables from two sets (taking the phylogenetic
>>> componant
>>> into account)?
>>>
>>> Any input will be very much apreciated!
>>> Best,
>>> Sandra.
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> ___
>>> R-sig-phylo mailing list - R-sig-phylo@r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>> Searchable archive at
>>> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>>>
>>
>> [[alternative HTML version deleted]]
>>
>> ___
>> R-sig-phylo mailing list - R-sig-phylo@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>> Searchable archive at
>> http://www.mail-archive.com/r-sig-phylo@r-project.org/
>>
>>


-- 
Ph.D.
Laboratório de História Natural de Anfíbios Brasileiros (LaHNAB
<http://www.naturalhistory.com.br/>)
Departamento de Biologia Animal (Bloco P)
Instituto de Biologia (IB)
Universidade Estadual de Campinas (UNICAMP)
Rua Monteiro Lobato, 255, CEP 13083-862
Campinas, São Paulo
Brazil
tel: +55 (19) 99978-7173
Flickr: https://www.flickr.com/gp/133250906@N05/g4196q

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-14 Thread Julien Clavel
Hi Sandra,

Maybe I choose a bad example to illustrate the approach. In fact I simulated 
positive correlations so the way the p-value was computed was for illustrative 
purpose and is wrong for negative correlations (we need a two-sided test). In 
fact you must look whether your 95% confidence interval for the parameter 
include 0 or not.

Just replace:
p_values <- apply(correl_boot,2,function(x){ mean(x < 0) }) 

by something like that (if I'm not wrong)

pvalue <- function(x){
   neg <- mean(x>0)
   pos <- mean(x<0)
   pval <- min(neg,pos)*2
  return(pval)
}

p_values <- apply(correl_boot,2, pvalue) 

This should fix the problem.

Julien


> Date: Wed, 14 Oct 2015 11:35:43 -0300 
> Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca) 
> From: gou...@mnhn.fr 
> To: liam.rev...@umb.edu 
> CC: julien.cla...@hotmail.fr; r-sig-phylo@r-project.org 
> 
> Dear Liam and Julien, 
> 
> Thank you both for your help. I have tried both solutions on the same 
> dataset under a BM model to compare them. I expected to get similar 
> results, at least for the general conclusions on the significance of 
> correlations. 
> 
> for Julien's solution, I ran: 
> 
> model<-mvBM(tree, data, model="BM1",method="pic") 
> cov2cor(model$sigma) 
> 
> # bootstrapping to get p-values 
> data_boot <- simulate(model, tree=tree, nsim=100) 
> model_boot <- lapply(data_boot, function(x){ cov2cor(mvBM(tree, x, 
> model="BM1", method="pic", echo=FALSE, diagnostic=FALSE)$sigma) }) 
> correl_boot <- t(sapply(model_boot, function(x){ x[upper.tri(x)] })) 
> p_values <- apply(correl_boot,2,function(x){ mean(x < 0) }) 
> 
> # the correlations and associated p-values 
> correl <- t( cov2cor(model$sigma)[upper.tri(cov2cor(model$sigma))] ) 
> res<-rbind(correl, p_values) 
> 
> 
> For Liam's solution, I ran: 
> 
> corbr<-corBrownian(phy=tree) 
> 
> glsp1<- 
> gls(ac.var1~env.var1+env.var2+body.size+ac.var2+ac.var3+ac.var4, 
> correlation=corbr, data=data) 
> glsp2<- 
> gls(ac.var2~env.var1+env.var2+body.size+ac.var1+ac.var3+ac.var4, 
> correlation=corbr, data=data) 
> glsp3<- 
> gls(ac.var3~env.var1+env.var2+body.size+ac.var1+ac.var2+ac.var4, 
> correlation=corbr, data=data) 
> glsp4<- 
> gls(ac.var4~env.var1+env.var2+body.size+ac.var1+ac.var2+ac.var3, 
> correlation=corbr, data=data) 
> 
> 
> I'm getting much more significant correlations with pGLS than with 
> mvBM, and the correlation coefficients are sometimes very different as 
> well. The results I get from the pGLS are generally more intuitive 
> biologically. For example, for the dominant frequency of calls: 
> 
> 
> DF~body.size with mvBM: -0.29 p-value=0.92 and with pGLS: -0.26 
> p-value=0.016. Here the coefficients are close, but the significance is 
> very different. Biologically, we expect a significant correlation 
> between these two variables, so it is quite surprising to have such a 
> high p-value for the mvBM method. 
> 
> DF~env.var1 with mvBM:0.35 p-value=0.03 and with pGLS: 0.021 p-value= 
> 0.09. Here the coefficients are really quite different... 
> 
> Is it normal that I'm getting such different results? Did I miss something? 
> 
> Best, 
> Sandra. 
> 
> 
> 
> 
> 
> 
> 
> 2015-10-13 15:18 GMT-03:00 Liam J. Revell 
> <liam.rev...@umb.edu<mailto:liam.rev...@umb.edu>>: 
> Hi Sandra. 
> 
> If you are interested in the individual correlations, why not just do a 
> PGLS in which (on at a time) the variables of interest are set as the 
> independent variable. 
> 
> So, for instance, to test "is environemental variable A correlated with 
> acoustic variable Z" (controlling for all other environmental variables 
> & acoustic variables), one would do: 
> 
> acoustic variable Z ~ environmental variable A + all other 
> environmental variable + all other acoustic variables + phylogenetic 
> residual error 
> 
> If you have collinearity, then you may not be able distinguish between 
> the effects of different variables - but this will be true for all 
> statistical models. (Imagine the worst case in which variables A & B 
> were perfectly collinear, well then it becomes theoretically impossible 
> to determine if A is affecting Z or B, no matter what we do.) 
> 
> All the best, Liam 
> 
> Liam J. Revell, Associate Professor of Biology 
> University of Massachusetts Boston 
> web: http://faculty.umb.edu/liam.revell/ 
> email: liam.rev...@umb.edu<mailto:liam.rev...@umb.edu> 
> blog: http://blog.phytools.org 
> 
> 
> On 10/13/2015 5:47 AM, Julien Clavel wrote: 
> Hi Sandra, 
> 
> I think, a mult

Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-14 Thread Julien Clavel
Sandra,

I just figured out, are you directly comparing the coefficients (slopes) of the 
multiple regression analysis with the evolutionary correlations computed with 
mvBM?

These are two distincts things although they are related (same sign)! For 
instance, while correlations (and evolutionary correlations) range between -1 
and 1, the range of the slope can take higher values.

Best,

Julien 


> From: julien.cla...@hotmail.fr
> To: gou...@mnhn.fr; liam.rev...@umb.edu
> Date: Wed, 14 Oct 2015 17:50:37 +0200
> CC: r-sig-phylo@r-project.org
> Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
>
> Hi Sandra,
>
> Maybe I choose a bad example to illustrate the approach. In fact I simulated 
> positive correlations so the way the p-value was computed was for 
> illustrative purpose and is wrong for negative correlations (we need a 
> two-sided test). In fact you must look whether your 95% confidence interval 
> for the parameter include 0 or not.
>
> Just replace:
> p_values <- apply(correl_boot,2,function(x){ mean(x < 0) })
>
> by something like that (if I'm not wrong)
>
> pvalue <- function(x){
> neg <- mean(x>0)
> pos <- mean(x<0)
> pval <- min(neg,pos)*2
> return(pval)
> }
>
> p_values <- apply(correl_boot,2, pvalue)
>
> This should fix the problem.
>
> Julien
>
> 
>> Date: Wed, 14 Oct 2015 11:35:43 -0300
>> Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
>> From: gou...@mnhn.fr
>> To: liam.rev...@umb.edu
>> CC: julien.cla...@hotmail.fr; r-sig-phylo@r-project.org
>>
>> Dear Liam and Julien,
>>
>> Thank you both for your help. I have tried both solutions on the same
>> dataset under a BM model to compare them. I expected to get similar
>> results, at least for the general conclusions on the significance of
>> correlations.
>>
>> for Julien's solution, I ran:
>>
>> model<-mvBM(tree, data, model="BM1",method="pic")
>> cov2cor(model$sigma)
>>
>> # bootstrapping to get p-values
>> data_boot <- simulate(model, tree=tree, nsim=100)
>> model_boot <- lapply(data_boot, function(x){ cov2cor(mvBM(tree, x,
>> model="BM1", method="pic", echo=FALSE, diagnostic=FALSE)$sigma) })
>> correl_boot <- t(sapply(model_boot, function(x){ x[upper.tri(x)] }))
>> p_values <- apply(correl_boot,2,function(x){ mean(x < 0) })
>>
>> # the correlations and associated p-values
>> correl <- t( cov2cor(model$sigma)[upper.tri(cov2cor(model$sigma))] )
>> res<-rbind(correl, p_values)
>>
>>
>> For Liam's solution, I ran:
>>
>> corbr<-corBrownian(phy=tree)
>>
>> glsp1<-
>> gls(ac.var1~env.var1+env.var2+body.size+ac.var2+ac.var3+ac.var4,
>> correlation=corbr, data=data)
>> glsp2<-
>> gls(ac.var2~env.var1+env.var2+body.size+ac.var1+ac.var3+ac.var4,
>> correlation=corbr, data=data)
>> glsp3<-
>> gls(ac.var3~env.var1+env.var2+body.size+ac.var1+ac.var2+ac.var4,
>> correlation=corbr, data=data)
>> glsp4<-
>> gls(ac.var4~env.var1+env.var2+body.size+ac.var1+ac.var2+ac.var3,
>> correlation=corbr, data=data)
>>
>>
>> I'm getting much more significant correlations with pGLS than with
>> mvBM, and the correlation coefficients are sometimes very different as
>> well. The results I get from the pGLS are generally more intuitive
>> biologically. For example, for the dominant frequency of calls:
>>
>>
>> DF~body.size with mvBM: -0.29 p-value=0.92 and with pGLS: -0.26
>> p-value=0.016. Here the coefficients are close, but the significance is
>> very different. Biologically, we expect a significant correlation
>> between these two variables, so it is quite surprising to have such a
>> high p-value for the mvBM method.
>>
>> DF~env.var1 with mvBM:0.35 p-value=0.03 and with pGLS: 0.021 p-value=
>> 0.09. Here the coefficients are really quite different...
>>
>> Is it normal that I'm getting such different results? Did I miss something?
>>
>> Best,
>> Sandra.
>>
>>
>>
>>
>>
>>
>>
>> 2015-10-13 15:18 GMT-03:00 Liam J. Revell
>> <liam.rev...@umb.edu<mailto:liam.rev...@umb.edu>>:
>> Hi Sandra.
>>
>> If you are interested in the individual correlations, why not just do a
>> PGLS in which (on at a time) the variables of interest are set as the
>> independent variable.
>>
>> So, for instance, to test "is environemental variable A correlated with
>> acoustic variable Z" (c

Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-14 Thread sandra goutte
Oi Julien,

Yes you are right, I was comparing things that are not the same! However,
even changing the p-values as you suggested, the conclusions (which is
really what i'm interested in) are still much different between the two
methods. For the example I gave above, the correlation is still not
significant (p-value = 0.16).

Overall i get much less significant correlations with this method compared
to pGLS. Is there an explanation? Could it be a problem of
over-parametrisation?

Best,
Sandra.


2015-10-14 13:37 GMT-03:00 Julien Clavel <julien.cla...@hotmail.fr>:

> Sandra,
>
> I just figured out, are you directly comparing the coefficients (slopes)
> of the multiple regression analysis with the evolutionary correlations
> computed with mvBM?
>
> These are two distincts things although they are related (same sign)! For
> instance, while correlations (and evolutionary correlations) range between
> -1 and 1, the range of the slope can take higher values.
>
> Best,
>
> Julien
>
> 
> > From: julien.cla...@hotmail.fr
> > To: gou...@mnhn.fr; liam.rev...@umb.edu
> > Date: Wed, 14 Oct 2015 17:50:37 +0200
> > CC: r-sig-phylo@r-project.org
> > Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
> >
> > Hi Sandra,
> >
> > Maybe I choose a bad example to illustrate the approach. In fact I
> simulated positive correlations so the way the p-value was computed was for
> illustrative purpose and is wrong for negative correlations (we need a
> two-sided test). In fact you must look whether your 95% confidence interval
> for the parameter include 0 or not.
> >
> > Just replace:
> > p_values <- apply(correl_boot,2,function(x){ mean(x < 0) })
> >
> > by something like that (if I'm not wrong)
> >
> > pvalue <- function(x){
> > neg <- mean(x>0)
> > pos <- mean(x<0)
> > pval <- min(neg,pos)*2
> > return(pval)
> > }
> >
> > p_values <- apply(correl_boot,2, pvalue)
> >
> > This should fix the problem.
> >
> > Julien
> >
> > 
> >> Date: Wed, 14 Oct 2015 11:35:43 -0300
> >> Subject: Re: [R-sig-phylo] Question on pCCA (phyl.cca)
> >> From: gou...@mnhn.fr
> >> To: liam.rev...@umb.edu
> >> CC: julien.cla...@hotmail.fr; r-sig-phylo@r-project.org
> >>
> >> Dear Liam and Julien,
> >>
> >> Thank you both for your help. I have tried both solutions on the same
> >> dataset under a BM model to compare them. I expected to get similar
> >> results, at least for the general conclusions on the significance of
> >> correlations.
> >>
> >> for Julien's solution, I ran:
> >>
> >> model<-mvBM(tree, data, model="BM1",method="pic")
> >> cov2cor(model$sigma)
> >>
> >> # bootstrapping to get p-values
> >> data_boot <- simulate(model, tree=tree, nsim=100)
> >> model_boot <- lapply(data_boot, function(x){ cov2cor(mvBM(tree, x,
> >> model="BM1", method="pic", echo=FALSE, diagnostic=FALSE)$sigma) })
> >> correl_boot <- t(sapply(model_boot, function(x){ x[upper.tri(x)] }))
> >> p_values <- apply(correl_boot,2,function(x){ mean(x < 0) })
> >>
> >> # the correlations and associated p-values
> >> correl <- t( cov2cor(model$sigma)[upper.tri(cov2cor(model$sigma))] )
> >> res<-rbind(correl, p_values)
> >>
> >>
> >> For Liam's solution, I ran:
> >>
> >> corbr<-corBrownian(phy=tree)
> >>
> >> glsp1<-
> >> gls(ac.var1~env.var1+env.var2+body.size+ac.var2+ac.var3+ac.var4,
> >> correlation=corbr, data=data)
> >> glsp2<-
> >> gls(ac.var2~env.var1+env.var2+body.size+ac.var1+ac.var3+ac.var4,
> >> correlation=corbr, data=data)
> >> glsp3<-
> >> gls(ac.var3~env.var1+env.var2+body.size+ac.var1+ac.var2+ac.var4,
> >> correlation=corbr, data=data)
> >> glsp4<-
> >> gls(ac.var4~env.var1+env.var2+body.size+ac.var1+ac.var2+ac.var3,
> >> correlation=corbr, data=data)
> >>
> >>
> >> I'm getting much more significant correlations with pGLS than with
> >> mvBM, and the correlation coefficients are sometimes very different as
> >> well. The results I get from the pGLS are generally more intuitive
> >> biologically. For example, for the dominant frequency of calls:
> >>
> >>
> >> DF~body.size with mvBM: -0.29 p-value=0.92 and with pGLS: -0.26
> >> p-value=0.016. Here the coefficien

Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-13 Thread Julien Clavel
Hi Sandra,

I think, a multivariate correlational model with your continuous traits as 
response variables can be used to do what you want.
For instance, assuming Brownian motion:


library(mvMORPH)
set.seed(123)

## First simulate some data (3 traits)
# simulate a tree
tree <- pbtree(n=100)

# Define a correlation matrix
C <- matrix(c(1,0.7,0.3,0.7,1,0.3,0.3,0.3,1),3)

# Define the variance to build a covariance matrix
D <- diag(c(sqrt(0.3),sqrt(0.2),sqrt(0.1)),3)

# Compute the covariance matrix
sigma <- D%*%C%*%D

# Ancestral states
theta <- c(0,0,0)

# simulate the traits
data<-mvSIM(tree, param=list(sigma=sigma, ntraits=3, mu=theta,
names_traits=c("Trait 1","Trait 2", "Trait 3")), 
model="BM1", nsim=1)




## Now we fit the multivariate brownian model
model<-mvBM(tree, data, model="BM1",method="pic")

# Obtain the traits correlations (very close to the generating values):
cov2cor(model$sigma)

# You can compare the fit of a model with independent Brownian motion
model_2<-mvBM(tree, data, model="BM1",method="pic", 
param=list(constraint="diagonal"))

# Test the significance with an LRT test
LRT(model,model_2)




# But if you want to assess the significance for each traits when there is more 
than two traits, use bootstrap instead:

# Simulate data under the MLE and then fit the model to simulated traits
data_boot <- simulate(model, tree=tree, nsim=100)
model_boot <- lapply(data_boot, function(x){ cov2cor(mvBM(tree, x, model="BM1", 
method="pic", echo=FALSE, diagnostic=FALSE)$sigma) })

# Retrieve the results
correl_boot <- t(sapply(model_boot, function(x){ x[upper.tri(x)] }))
colnames(correl_boot)<-c("[1,2]","[1,3]","[2,3]")

# Plot the results
boxplot(correl_boot, main="Correlations")

# Assess if simulated data under the MLE are significantly different from zero
p_values <- apply(correl_boot,2,function(x){ mean(x < 0) })
p_values



# You can also use the same approach described above if the best fit process is 
an Ornstein-Uhlenbeck (which is unlikely here); e.g.:

model_ou <- mvOU(tree, data, model="OU1", method="rpf", param=list(vcv="ouch"))
cov2cor(stationary(model_ou))

There is some related examples in the package vignette.
HTH,


Julien

> Date: Mon, 12 Oct 2015 14:34:26 -0300
> From: gou...@mnhn.fr
> To: r-sig-phylo@r-project.org
> Subject: [R-sig-phylo] Question on pCCA (phyl.cca)
> 
> Dear list,
> 
> I have 2 sets of continuous variables (acoustic and environmental) and a
> phylogeny for a group of species. I am looking for correlations between the
> environmental variables and the acoustic ones. I ran a pCCA (phyl.cca
> {phytools} ), which gave me p-values for the correlations of each of my
> environmental variables with* all* acoustic variables. However, i have
> specific hyptotheses I would like to test such as: is environemental
> variable A correlated with acoustic variable Z?
> 
> If I run univariate pGLS for each of my acoustic variables, I have the
> problem that I am not taking into account the potential colinearity between
> my acoustic variables... So I can first run pPCA, but then i'm not really
> testing my original hypotheses on specific variables.
> 
> So my question is: is there a way to test the significance of correlation
> among continuous variables from two sets (taking the phylogenetic componant
> into account)?
> 
> Any input will be very much apreciated!
> Best,
> Sandra.
> 
> [[alternative HTML version deleted]]
> 
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
  
[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


Re: [R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-13 Thread Liam J. Revell

Hi Sandra.

If you are interested in the individual correlations, why not just do a 
PGLS in which (on at a time) the variables of interest are set as the 
independent variable.


So, for instance, to test "is environemental variable A correlated with 
acoustic variable Z" (controlling for all other environmental variables 
& acoustic variables), one would do:


acoustic variable Z ~ environmental variable A + all other environmental 
variable + all other acoustic variables + phylogenetic residual error


If you have collinearity, then you may not be able distinguish between 
the effects of different variables - but this will be true for all 
statistical models. (Imagine the worst case in which variables A & B 
were perfectly collinear, well then it becomes theoretically impossible 
to determine if A is affecting Z or B, no matter what we do.)


All the best, Liam

Liam J. Revell, Associate Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 10/13/2015 5:47 AM, Julien Clavel wrote:

Hi Sandra,

I think, a multivariate correlational model with your continuous traits as 
response variables can be used to do what you want.
For instance, assuming Brownian motion:


library(mvMORPH)
set.seed(123)

## First simulate some data (3 traits)
# simulate a tree
tree <- pbtree(n=100)

# Define a correlation matrix
C <- matrix(c(1,0.7,0.3,0.7,1,0.3,0.3,0.3,1),3)

# Define the variance to build a covariance matrix
D <- diag(c(sqrt(0.3),sqrt(0.2),sqrt(0.1)),3)

# Compute the covariance matrix
sigma <- D%*%C%*%D

# Ancestral states
theta <- c(0,0,0)

# simulate the traits
data<-mvSIM(tree, param=list(sigma=sigma, ntraits=3, mu=theta,
names_traits=c("Trait 1","Trait 2", "Trait 3")), 
model="BM1", nsim=1)




## Now we fit the multivariate brownian model
model<-mvBM(tree, data, model="BM1",method="pic")

# Obtain the traits correlations (very close to the generating values):
cov2cor(model$sigma)

# You can compare the fit of a model with independent Brownian motion
model_2<-mvBM(tree, data, model="BM1",method="pic", 
param=list(constraint="diagonal"))

# Test the significance with an LRT test
LRT(model,model_2)




# But if you want to assess the significance for each traits when there is more 
than two traits, use bootstrap instead:

# Simulate data under the MLE and then fit the model to simulated traits
data_boot <- simulate(model, tree=tree, nsim=100)
model_boot <- lapply(data_boot, function(x){ cov2cor(mvBM(tree, x, model="BM1", 
method="pic", echo=FALSE, diagnostic=FALSE)$sigma) })

# Retrieve the results
correl_boot <- t(sapply(model_boot, function(x){ x[upper.tri(x)] }))
colnames(correl_boot)<-c("[1,2]","[1,3]","[2,3]")

# Plot the results
boxplot(correl_boot, main="Correlations")

# Assess if simulated data under the MLE are significantly different from zero
p_values <- apply(correl_boot,2,function(x){ mean(x < 0) })
p_values



# You can also use the same approach described above if the best fit process is 
an Ornstein-Uhlenbeck (which is unlikely here); e.g.:

model_ou <- mvOU(tree, data, model="OU1", method="rpf", param=list(vcv="ouch"))
cov2cor(stationary(model_ou))

There is some related examples in the package vignette.
HTH,


Julien


Date: Mon, 12 Oct 2015 14:34:26 -0300
From: gou...@mnhn.fr
To: r-sig-phylo@r-project.org
Subject: [R-sig-phylo] Question on pCCA (phyl.cca)

Dear list,

I have 2 sets of continuous variables (acoustic and environmental) and a
phylogeny for a group of species. I am looking for correlations between the
environmental variables and the acoustic ones. I ran a pCCA (phyl.cca
{phytools} ), which gave me p-values for the correlations of each of my
environmental variables with* all* acoustic variables. However, i have
specific hyptotheses I would like to test such as: is environemental
variable A correlated with acoustic variable Z?

If I run univariate pGLS for each of my acoustic variables, I have the
problem that I am not taking into account the potential colinearity between
my acoustic variables... So I can first run pPCA, but then i'm not really
testing my original hypotheses on specific variables.

So my question is: is there a way to test the significance of correlation
among continuous variables from two sets (taking the phylogenetic componant
into account)?

Any input will be very much apreciated!
Best,
Sandra.

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archi

[R-sig-phylo] Question on pCCA (phyl.cca)

2015-10-12 Thread sandra goutte
Dear list,

I have 2 sets of continuous variables (acoustic and environmental) and a
phylogeny for a group of species. I am looking for correlations between the
environmental variables and the acoustic ones. I ran a pCCA (phyl.cca
{phytools} ), which gave me p-values for the correlations of each of my
environmental variables with* all* acoustic variables. However, i have
specific hyptotheses I would like to test such as: is environemental
variable A correlated with acoustic variable Z?

If I run univariate pGLS for each of my acoustic variables, I have the
problem that I am not taking into account the potential colinearity between
my acoustic variables... So I can first run pPCA, but then i'm not really
testing my original hypotheses on specific variables.

So my question is: is there a way to test the significance of correlation
among continuous variables from two sets (taking the phylogenetic componant
into account)?

Any input will be very much apreciated!
Best,
Sandra.

[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/