Tom Oliver <toli...@ceh.ac.uk> a écrit :
Hi Emmanuel,
Here is the script I used for randomizing independently on both
vectors. Similar results occur for sampling without replacement.
for (i in (1:30)){
resp<-data[,5]
explanatory<-data[,36]
resp<-sample(slope,replace=T)
Is that normal that you sampled 'slope' while you extracted 'resp' above?
EP
explanatory<-sample(explanatory,replace=T)
names(resp)<-names(explanatory)<-data[,2]
data2<-data.frame(resp,explanatory)
data2<-na.omit(data2)
print(summary(lm(resp~explanatory)))
print(compar.gee(resp~explanatory,data2,phy=tr2))
}
Thanks,
Tom
Tom Oliver
Biological Records Centre
Centre for Ecology and Hydrology (CEH)
Maclean Building,
Benson Lane,
Crowmarsh Gifford,
Wallingford, Oxfordshire, OX10 8BB
Tel: 01491 692517
Emmanuel Paradis <emmanuel.para...@mpl.ird.fr> 02/19/09 6:06 PM >>>
Tom Oliver <toli...@ceh.ac.uk> a écrit :
Hi Emmanuel,
Thanks for your post. I can accept that sometimes normal OLS
regressions will give non-significant correlations yet after
accounting for covariance among species using phylogenetic methods
then the traits may show significant correlations. However, I am
confused how I can randomly shuffle the response and explanatory
variables in the tree, (in theory reducing correlation to the lowest
it possibly can be),
Ummm... did you randomize keeping the pairs of variables (ie,
randomizing rows of your data frame)? Or independently on both
vectors? (Ideally, you can post the command used to do the
randomization: this could help... unless it is a 200-line script :) )
yet still get significant correlations on more than half of
occasions (rather than 1 in 20 at p<0.05). Shouldn't the
phylogenetic test be telling me the significance of the correlation
given the struture of the tree, rather than the tree structure
influencing estimates of significance?
I don't see the difference. (It's like looking at the same thing from
two different directions.)
Could it be that there are issues with the distribution of my
response variable? As I mentioned it is unimodal with a negative
skew, yet according to a Shapiro-Wilk Normality Test it is
non-normal. Would this bias the results do you think?
I don't think so.
EP
shapiro.test(temp.slope) # signif dif from normal dist
Shapiro-Wilk normality test
data: temp.slope
W = 0.6392, p-value = 6.403e-09
Many thanks,
Tom
Emmanuel Paradis <emmanuel.para...@mpl.ird.fr> 02/18/09 9:52 am >>>
Hi Tom,
We had a discussion related to this topic last year. Here's my main comment:
https://stat.ethz.ch/pipermail/r-sig-phylo/2008-April/000070.html
You may have a look also at other messages in the same thread, of course.
HTH
EP
Tom Oliver <toli...@ceh.ac.uk> a écrit :
Hello Helplist,
I was wondering if anyone could shed some light on this problem..
I have been using the compar.gee package (Package ape version 2.2)
for a comparative analaysis using a binary categorical variable.
My phylogenetic tree has 42 tips with branch lengths set to 1 and
the response variable I am using is slightly negatively skewed (i.e.
non-normal). However, from background reading apparently GEEs are
fairly robust to such skew.
The output suggests my explanatory variable is highly correlated
with my response across species (output #1 below). When I run a
normal anova, however, my result is nowhere near the p<0.05
significance level (output #2 below). I was quite suspicious about
the results so I decided to randomly shuffle the response and
explanatory vectors and repeat the tests 30 times. For the normal
ANOVA 2 out of 30 results were significant (p<0.05 without
bonferroni). But for the GEE, 17 out of 30 results were still
significant.
Clearly something is not quite right! I have also tried using a
continuous explanatory variable and the GEE model still seems to
overestimate significance. I can find very little material detailing
the use or limits of GEEs in R. The R package help page contains no
information on GEE assumptions etc., nor does the book by
E.Paradis- Analysis of Phylogenetics and Evolution with R.
I have noticed that if I specificy the GEE regressions to go through
the origin (output #3) then the results are far less significant.
Indeed, zero out of the 30 tests are significant. However, in the
example in the compar.gee help page it suggests that GEE regressions
NOT through the origin give more comparable results to
phylogentically independent contrasts (?pic help page example).
Any help you can offer is much appreciated.
Tom Oliver
########
output #1
print(compar.gee(slope~explanatory,data2,phy=tr2))
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
(Intercept) explanatoryy
-0.22694925 0.02282178
Call:
formula: slope ~ explanatory
Number of observations: 42
Model:
Link: identity
Variance to Mean Relation: gaussian
Summary of Residuals:
Min 1Q Median 3Q Max
-6.1949077 -0.5137222 0.2786307 0.5084768 3.7319044
Coefficients:
Estimate S.E. t Pr(T > |t|)
(Intercept) 0.3961744 0.7632086 0.5190905 0.6241267392
explanatoryy -0.8182404 0.1159283 -7.0581610 0.0006200716
Estimated Scale Parameter: 2.100696
"Phylogenetic" df (dfP): 7.439024
#######'################
output #2
print(summary(lm(slope~explanatory)))
Call:
lm(formula = slope ~ explanatory)
Residuals:
Min 1Q Median 3Q Max
-6.4128 -0.0539 0.1215 0.2905 3.5140
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.22695 0.38631 -0.587 0.560
explanatoryy 0.02282 0.46490 0.049 0.961
Residual standard error: 1.393 on 40 degrees of freedom
Multiple R-squared: 6.024e-05, Adjusted R-squared: -0.02494
F-statistic: 0.00241 on 1 and 40 DF, p-value: 0.961
##############
output #3
print(compar.gee(slope~explanatory-1,data2,phy=tr2))
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
explanatoryn explanatoryy
-0.2269493 -0.2041275
Call:
formula: slope ~ explanatory - 1
Number of observations: 42
Model:
Link: identity
Variance to Mean Relation: gaussian
Summary of Residuals:
Min 1Q Median 3Q Max
-6.1949077 -0.5137222 0.2786307 0.5084768 3.7319044
Coefficients:
Estimate S.E. t Pr(T > |t|)
explanatoryn 0.3961744 0.7632086 0.5190905 0.6241267
explanatoryy -0.4220661 0.7595631 -0.5556695 0.6005236
Estimated Scale Parameter: 2.100696
"Phylogenetic" df (dfP): 7.439024
########
end
Tom Oliver
Biological Records Centre
Centre for Ecology and Hydrology (CEH)
Maclean Building,
Benson Lane,
Crowmarsh Gifford,
Wallingford, Oxfordshire, OX10 8BB
Tel: 01491 692517
----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.
--
This message (and any attachments) is for the recipien...{{dropped:13}}
_______________________________________________
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo