Tom Oliver <toli...@ceh.ac.uk> wrote:
Hi Emmanuel,

Here is the script I used for randomizing independently on both
vectors. Similar results occur for sampling without replacement.

for (i in (1:30)){
resp<-data[,5]
explanatory<-data[,36]
resp<-sample(slope,replace=T)

Is it intended that you sampled 'slope' while you extracted 'resp' above?

EP

explanatory<-sample(explanatory,replace=T)
names(resp)<-names(explanatory)<-data[,2]
data2<-data.frame(resp,explanatory)
data2<-na.omit(data2)
print(summary(lm(resp~explanatory)))
print(compar.gee(resp~explanatory,data2,phy=tr2))
}

Thanks,
Tom

Tom Oliver
Biological Records Centre
Centre for Ecology and Hydrology (CEH)
Maclean Building,
Benson Lane,
Crowmarsh Gifford,
Wallingford, Oxfordshire, OX10 8BB
Tel: 01491 692517
Emmanuel Paradis <emmanuel.para...@mpl.ird.fr> 02/19/09 6:06 PM >>>
Tom Oliver <toli...@ceh.ac.uk> wrote:
Hi Emmanuel,

Thanks for your post. I can accept that ordinary OLS regressions will
sometimes give non-significant correlations while, after accounting
for covariance among species using phylogenetic methods, the traits
show significant correlations. However, I am confused about how I can
randomly shuffle the response and explanatory variables in the tree
(in theory reducing the correlation to the lowest it can possibly be),

Ummm... did you randomize keeping the pairs of variables (ie,
randomizing rows of your data frame)? Or independently on both
vectors? (Ideally, you can post the command used to do the
randomization: this could help... unless it is a 200-line script :) )
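
For readers following the thread, the two randomization schemes distinguished above can be sketched in base R as follows. The data frame `d` and its columns are made up for illustration and are not the data discussed here.

```r
set.seed(1)
# Hypothetical data: 42 species, one response and one explanatory variable
d <- data.frame(resp = rnorm(42), explanatory = rnorm(42))

# (a) Shuffle whole rows: each (resp, explanatory) pair stays together,
#     so the association between the two columns is unchanged.
d_rows <- d[sample(nrow(d)), ]

# (b) Shuffle each vector independently: the pairs are broken up, so any
#     association between the two variables is destroyed on average.
d_indep <- data.frame(resp = sample(d$resp),
                      explanatory = sample(d$explanatory))

# The correlation is preserved under (a) but not, in general, under (b)
cor(d$resp, d$explanatory)
cor(d_rows$resp, d_rows$explanatory)
cor(d_indep$resp, d_indep$explanatory)
```

Only scheme (b) gives a null distribution for the association between the two variables themselves.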

yet still get significant correlations on more than half of
occasions (rather than 1 in 20 at p<0.05). Shouldn't the
phylogenetic test be telling me the significance of the correlation
given the structure of the tree, rather than the tree structure
influencing estimates of significance?

I don't see the difference. (It's like looking at the same thing from
two different directions.)

Could it be that there are issues with the distribution of my
response variable? As I mentioned, it is unimodal with a negative
skew, and according to a Shapiro-Wilk normality test it is
non-normal. Do you think this would bias the results?

I don't think so.

EP

shapiro.test(temp.slope)    # significantly different from a normal distribution

        Shapiro-Wilk normality test

data:  temp.slope
W = 0.6392, p-value = 6.403e-09

Many thanks,
Tom

Emmanuel Paradis <emmanuel.para...@mpl.ird.fr> 02/18/09 9:52 am >>>
Hi Tom,

We had a discussion related to this topic last year. Here's my main comment:

https://stat.ethz.ch/pipermail/r-sig-phylo/2008-April/000070.html

You may have a look also at other messages in the same thread, of course.

HTH

EP

Tom Oliver <toli...@ceh.ac.uk> wrote:

Hello Helplist,

I was wondering if anyone could shed some light on this problem.

I have been using the compar.gee function (package ape, version 2.2)
for a comparative analysis using a binary categorical variable.
My phylogenetic tree has 42 tips with branch lengths set to 1, and
the response variable I am using is slightly negatively skewed (i.e.
non-normal). However, from background reading, GEEs are apparently
fairly robust to such skew.

The output suggests my explanatory variable is highly correlated
with my response across species (output #1 below). When I run an
ordinary ANOVA, however, my result is nowhere near the p<0.05
significance level (output #2 below). I was quite suspicious about
the results, so I decided to randomly shuffle the response and
explanatory vectors and repeat the tests 30 times. For the ordinary
ANOVA, 2 out of 30 results were significant (p<0.05 without
Bonferroni correction). But for the GEE, 17 out of 30 results were
still significant.
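
The shuffle-and-count check described above can be sketched for the ordinary linear model alone, using simulated data. The vectors below are hypothetical stand-ins, not the original data set, and the GEE step is left out. When both vectors are shuffled independently, about 5% of tests are expected to fall below p = 0.05, i.e. one or two out of 30 on average.

```r
set.seed(42)
n_species <- 42
n_reps <- 30

# Hypothetical stand-ins for the real data: a continuous response and a
# binary explanatory factor with levels "n" and "y"
slope <- rnorm(n_species)
explanatory <- factor(sample(c("n", "y"), n_species, replace = TRUE))

p_values <- numeric(n_reps)
for (i in seq_len(n_reps)) {
  # Shuffle both vectors independently (without replacement)
  resp_i <- sample(slope)
  expl_i <- sample(explanatory)
  fit <- summary(lm(resp_i ~ expl_i))
  # Row 2 of the coefficient table is the effect of level "y" vs "n"
  p_values[i] <- fit$coefficients[2, "Pr(>|t|)"]
}

# Number of shuffles that came out significant at the 5% level
sum(p_values < 0.05)
```

Storing the p-values rather than printing each summary makes it easy to compare the rejection rate with the nominal 5% level.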

Clearly something is not quite right! I have also tried using a
continuous explanatory variable and the GEE model still seems to
overestimate significance. I can find very little material detailing
the use or limits of GEEs in R. The R package help page contains no
information on GEE assumptions etc., nor does the book by
E. Paradis, Analysis of Phylogenetics and Evolution with R.

I have noticed that if I specify the GEE regressions to go through
the origin (output #3) then the results are far less significant.
Indeed, zero out of the 30 tests are significant. However, the
example in the compar.gee help page suggests that GEE regressions
NOT through the origin give results more comparable to
phylogenetically independent contrasts (?pic help page example).
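
One point that may help when comparing fits with and without an intercept: when the explanatory variable is a two-level factor, removing the intercept with `- 1` does not change the fitted model, only its parameterization. With an intercept, the second coefficient is the difference between the two levels; without it, each coefficient is a level mean, and its test asks whether that mean differs from zero. The sketch below shows this with `lm` on made-up data. That the same reasoning carries over to the compar.gee fits in this thread is an assumption, although the estimates in output #3 are consistent with it (0.3961744 - 0.8182404 is about -0.42207, the second coefficient there).

```r
set.seed(7)
# Hypothetical data: binary factor with levels "n" and "y"
x <- factor(rep(c("n", "y"), each = 21))
y <- rnorm(42)

fit_int   <- lm(y ~ x)      # intercept + difference ("y" minus "n")
fit_noint <- lm(y ~ x - 1)  # one mean per level

b_int   <- coef(fit_int)
b_noint <- coef(fit_noint)

# Same fitted values: the two models are reparameterizations of each other
all.equal(unname(fitted(fit_int)), unname(fitted(fit_noint)))

# Mean of level "y" equals intercept plus difference
all.equal(unname(b_noint["xy"]),
          unname(b_int["(Intercept)"] + b_int["xy"]))
```

So the p-values from the two parameterizations answer different questions and are not directly comparable.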

Any help you can offer is much appreciated.
Tom Oliver


########

output #1

print(compar.gee(slope~explanatory,data2,phy=tr2))
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
 (Intercept) explanatoryy
 -0.22694925   0.02282178

Call:
  formula: slope ~ explanatory

Number of observations:  42

Model:
 Link:                      identity
 Variance to Mean Relation: gaussian

Summary of Residuals:
       Min         1Q     Median         3Q        Max
-6.1949077 -0.5137222  0.2786307  0.5084768  3.7319044


Coefficients:
               Estimate      S.E.          t  Pr(T > |t|)
(Intercept)   0.3961744 0.7632086  0.5190905 0.6241267392
explanatoryy -0.8182404 0.1159283 -7.0581610 0.0006200716

Estimated Scale Parameter:  2.100696
"Phylogenetic" df (dfP):  7.439024

########

output #2

print(summary(lm(slope~explanatory)))

Call:
lm(formula = slope ~ explanatory)

Residuals:
    Min      1Q  Median      3Q     Max
-6.4128 -0.0539  0.1215  0.2905  3.5140

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.22695    0.38631  -0.587    0.560
explanatoryy  0.02282    0.46490   0.049    0.961

Residual standard error: 1.393 on 40 degrees of freedom
Multiple R-squared: 6.024e-05,  Adjusted R-squared: -0.02494
F-statistic: 0.00241 on 1 and 40 DF,  p-value: 0.961

##############

output #3

 print(compar.gee(slope~explanatory-1,data2,phy=tr2))
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
explanatoryn explanatoryy
  -0.2269493   -0.2041275

Call:
  formula: slope ~ explanatory - 1

Number of observations:  42

Model:
 Link:                      identity
 Variance to Mean Relation: gaussian

Summary of Residuals:
       Min         1Q     Median         3Q        Max
-6.1949077 -0.5137222  0.2786307  0.5084768  3.7319044


Coefficients:
               Estimate      S.E.          t Pr(T > |t|)
explanatoryn  0.3961744 0.7632086  0.5190905   0.6241267
explanatoryy -0.4220661 0.7595631 -0.5556695   0.6005236

Estimated Scale Parameter:  2.100696
"Phylogenetic" df (dfP):  7.439024

########
end






_______________________________________________
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
