Yanthe,
Paolo,
Thanks for your replies...
You understood correctly: log(BodySize) is the dependent variable, and habitat
and location of symbionts are categorical independent variables.
At first there was also a continuous independent variable, and I was
interested in assessing the effects of continuous and categorical variables
simultaneously. That's why I used PGLS.
I agree that a regression model is quite counterintuitive for categorical
independent variables. Like most people, I was taught (I miss that time :-)
that ANOVA was for categorical variables and regression for continuous
variables. Later I was told that regression and ANOVA actually have a lot in
common.
As far as I understand the suggestions of others here and the literature,
PGLS can be applied to my data. It may even be the only way to include
continuous and categorical variables simultaneously. I treated my categorical
variables as such (factors), and my understanding of PGLS is that it uses
some dummy coding to perform the regression.
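
To make the dummy-coding idea concrete, here is a minimal sketch (in Python, for illustration only) of treatment coding for two binary factors and their interaction. The data values and the helper names dummy and design_matrix are made up for this example; it only assumes the usual treatment-contrast convention, where the reference level is coded 0, and is not taken from any PGLS implementation.

```python
# Hypothetical toy data: two binary factors, coded "0"/"1" as in the thread.
habitat = ["0", "1", "1", "0", "1"]
location = ["0", "0", "1", "1", "1"]


def dummy(values, reference):
    """Treatment (dummy) coding: 0 for the reference level, 1 otherwise."""
    return [0 if v == reference else 1 for v in values]


def design_matrix(f1, f2):
    """Rows of [intercept, f1 dummy, f2 dummy, f1:f2 interaction]."""
    d1 = dummy(f1, "0")
    d2 = dummy(f2, "0")
    # The interaction column is the product of the two dummy columns.
    return [[1, a, b, a * b] for a, b in zip(d1, d2)]


X = design_matrix(habitat, location)
for row in X:
    print(row)
```

With this coding the intercept is the fitted mean of the reference cell (both factors at level 0), and the other coefficients are differences from it, which is why dropping the intercept changes what the remaining coefficients mean.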
My results from PGLS are the same as from phy.anova. Yet I agree that the
latter has a more intuitive interpretation, and as my continuous independent
variable has no effect, I think I don't really need PGLS: phy.anova is
enough.
Sorry Paolo, I actually meant there was a difference between:
log(BodySize)~symbiont location*habitat VS
log(BodySize)~Habitat*Symbiont location
Once the effect of one variable is removed, the other one is not significant...
and the result depends on the order in which the variables are considered...
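
This kind of order dependence is what one would expect if the tests are sequential (each term tested after the ones listed before it) and the two factors are unevenly crossed. The sketch below is plain ordinary least squares in Python on made-up numbers, with no phylogenetic correction; the data and the helper rss are hypothetical and only illustrate the mechanism.

```python
def rss(columns, y):
    """Residual sum of squares of an OLS fit of y on the given columns."""
    n, k = len(y), len(columns)
    # Augmented normal equations: (X'X | X'y)
    A = [[sum(columns[i][r] * columns[j][r] for r in range(n)) for j in range(k)]
         + [sum(columns[i][r] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    coef = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(coef[j] * columns[j][r] for j in range(k)) for r in range(n)]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Hypothetical, unbalanced data: the two factors mostly co-occur.
hab = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
loc = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
y = [1.0, 1.2, 0.9, 2.0, 1.9, 3.1, 2.8, 3.0, 3.2, 2.9]
one = [1] * len(y)

# Sequential sums of squares: each term is tested after those before it.
hab_first = rss([one], y) - rss([one, hab], y)
loc_after_hab = rss([one, hab], y) - rss([one, hab, loc], y)
loc_first = rss([one], y) - rss([one, loc], y)
hab_after_loc = rss([one, loc], y) - rss([one, loc, hab], y)

print("habitat first:", hab_first, " habitat after location:", hab_after_loc)
print("location first:", loc_first, " location after habitat:", loc_after_hab)
```

Whichever factor enters first absorbs the variation the two share, so the second looks weak. The total explained by both is the same in either order.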
Nice idea to test combinations of factors. After checking my data more
carefully, it unfortunately seems that I don't have enough data to keep
statistical power while doing so (using either ANOVA or PGLS). So I will stick
to simple phy.anovas.
Cheers
Julien
On Jun 7, 2012, at 12:47 AM, Yanthe Pearson wrote:
Based on what you say it sounds like Y=log(body size) is the dependent
variable and X1=habitat and X2=location are independent variables?
Yanthe E. Pearson
Postdoctoral Researcher
Dept. of Biology, Fagan Lab
University of Maryland College Park
Email: ypear...@umd.edu
From: r-sig-phylo-boun...@r-project.org [r-sig-phylo-boun...@r-project.org]
On Behalf Of ppi...@uniroma3.it [ppi...@uniroma3.it]
Sent: Wednesday, June 06, 2012 9:55 AM
To: Julien Lorion
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] PGLS, categorical data and regression through
origin
Dear Julien,
maybe I don't understand your model, but IF your
model has one continuous dep. and one categorical
(binary) indep., it looks like an ANOVA model: in this
case phy.anova() [or phy.manova() if you have more than 1
dependent] in the R package geiger does it. IF the
model is different, please explain better.
Inverting the dependence-independence
relationship depends on your hypothesis testing.
When the categorical becomes the dependent you need to
apply a phylogenetic logistic regression (in the case
of binary) or multinomial logistic (I think MCMCglmm
does it).
IF you have several factor variables as
dependent, applying comparative methods may be more
complicated, but a trick could be useful.
Say you have a two-level and a four-level factor
variable. You can test *pairwise* (!!) your
(**CONTINUOUS**!!) dependent against a new factor
in which you code every level identifiable from all
occurring combinations of the levels of the two factor
variables. The number of levels will not necessarily
equal the number of levels of the first factor variable *
the number of levels of the second one: it depends on the real data.
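
A minimal sketch of the combined-factor trick, in Python for illustration only. The level names ("a", "b", "w" to "z") and the data are invented; the point is just that only the combinations that actually occur become levels of the new factor, and that those levels can then be compared pairwise.

```python
from itertools import combinations

# Hypothetical data: a two-level factor and a four-level factor.
f2 = ["a", "a", "a", "b", "b", "b", "a", "b"]
f4 = ["w", "x", "w", "y", "z", "y", "x", "w"]

# New factor: one level per combination that occurs in the data.
combined = [f"{u}.{v}" for u, v in zip(f2, f4)]
levels = sorted(set(combined))

# All pairs of levels to test the continuous dependent against.
pairs = list(combinations(levels, 2))

print(levels)      # fewer than 2 * 4 = 8 levels, because not all combinations occur
print(len(pairs))  # number of pairwise comparisons
```

Note that the number of pairwise comparisons grows quickly with the number of levels, so some correction for multiple testing would normally be needed.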
But you wrote: symbiont location ~ habitat VS habitat
~ symbiont location
Here all things are categorical, aren't they? Where
is the body size?
Best
Paolo
Dear colleagues,
I am testing the impact of categorical binary
characters (habitat and presence/absence of symbionts)
on a continuous variable (log of body size) using
PGLS...
I am not sure whether I should remove the intercept from
the formulae, or what the biological interpretation of
an absent intercept is for categorical variables. All
papers I found on the issue of regression through the
origin were about PIC and continuous characters.
It does not change my conclusions when I test
individually each variable (BOTH have a HIGHLY
significant impact on body size), but it does when I
test them simultaneously:
Coefficients:
                    Estimate Std. Error t value  Pr(>|t|)
(Intercept)         3.252335   0.731056  4.4488 5.115e-05 ***
Habitat1            0.706823   0.434013  1.6286    0.1099
Location1           0.598868   0.810679  0.7387    0.4637
Habitat1:Location1 -0.078772   0.905514 -0.0870    0.9310
F-statistic: 3.744 on 4 and 48 DF, p-value: 0.009906
Coefficients:
Estimate Std. Error t value
Pr(|t