-----Original Message-----
From: Christian A. Parker [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 22, 2008 10:39 AM
To: Landis, R Matthew
Cc: 'r-sig-ecology@r-project.org'
Subject: Re: [R-sig-eco] nlme model specification
Matthew
Please correct me if I am wrong (anyone) but because your observations
are not independent across your desired groups (years) your error terms
will be biased which will then influence your significant tests. So
regardless of the factor that you are interested you would
still want to
account for the fact that all measurements were taken on the same trees
each year by doing a repeated measures model of some sort.
Hope this helps,
-Chris
Landis, R Matthew wrote:
Dear R-sig-eco:
Many thanks to all of those who took the time to reply to my
question. The diversity of replies has made me go back and
try to clarify my question. Apologies for the length of the
e-mail. Thanks in advance to anyone willing to plow through
this and understand it. If you're ever in Middlebury I'll buy
you a beer.
To repeat, I have 300 trees, ranging in size from 10 - 150
cm diameter (big trees). To simplify my original question,
let's say I want to understand the relationship between growth
and two variables, diameter (continuous) and vine load
(ordinal index from 1-4). I'd also like to know the relative
importance of diameter vs. vine load, e.g. by partial R2. If
I had one year of data, this would be a simple regression.
However, I have 9 years of annual measurements on the trees.
It's as if I have the above analysis repeated 9 times. There
was no initial treatment, so I view these 9 years as a random
sample of the years in the life of the tree, and unlike most
examples of repeated measures I have read, the time effect is
of no interest whatsoever. That is, I am not interested in
viewing xyplot(growth ~ time|id). I don't expect to see any
consistent directional response to time. In a way, it's as if
the 9 years represent blocks, (except that it's the same 300
trees in each block) -- this is why I view the yr as a random
effect, and as the grouping variable.
If I were to graph the data, I would use xyplot(growth ~
diameter|yr) to see what I am most interested in. Grouping by
individual doesn't make sense to me here because each
individual only represents a very small slice of the full
range of measurements - e.g. over the ten years, each tree
only grows from 10 cm - 14 cm, so I can't really estimate the
growth vs. diameter relationship for each tree. xyplot(growth
~ diameter|id) would not be useful. This is why I don't
consider the individual to be the grouping variable, but
perhaps I am wrong on this.
So, now, as before, I am back to
fit <- lme(fixed = growth ~ diameter * vines, random = ~ 1|year)
I'm expecting that this will estimate separate intercepts
for each year. Which is what I want (I would like to fit
separate slopes by year too, but that model didn't converge).
I guess what I'm most concerned about is whether the
significance tests obtained for each term use the appropriate
error term and the appropriate degrees of freedom. I'm
currently using something like the following command to test
the effect of diameter
anova(fit.full.model, update(fit.full.model, . ~ vines))
But maybe I'm way off base there.
Thanks very much!
Matt Landis
-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Landis, R Matthew
Sent: Wednesday, May 21, 2008 1:55 PM
To: 'r-sig-ecology@r-project.org'
Subject: [R-sig-eco] nlme model specification
Greetings R-eco folks,
I'm trying to analyze a dataset on tree growth rates to see
which factors are important (and their relative importance
too, if I can get that), and I'm having some trouble figuring
out how to specify the model, despite having carefully read
Pinheiro and Bates, the help files for nlme, Crawley's book on
Statistics with S, MASS, and other books besides.
The dataset consists of ~ 300 trees measured annually for 10
years. So, I have 9 pseudo-replicated intervals over which to
assess growth (about 2700 rows in the dataset). There are 5
different explanatory factors, which are a combination of
continuous variables and categorical factors. Some of these
vary with time. In the end, I would like to get both
coefficient estimates and partial R2 (or some other way of
ranking them) for each factor. Unlike most time-series
examples in the books, I am not interested in how growth
varies with time, nor am I particular interested in
interactions of explanatory factors with time.
Based on this, I've convinced myself that I should specify the
model as:
fit <- lme(fixed = growth ~ (x1 + x2 + x3+ x4 + x5)^2, random
= ~1|year, method = 'ML')
Year is clearly a random effect, and is the grouping variable
for the analysis. Each of the other coefficients is "inner"
to this variable. I'm ignoring individual tree as a grouping
factor, since I don't want to estimate separate coefficients
for each tree. Does this sound like the correct way to do this?
Thanks for any help. Apologies if this is more of a
statistics question and less of an R question.
Matt Landis
****************************************************
R. Matthew Landis, Ph.D.
Dept. Biology
Middlebury College
Middlebury, VT 05753
tel.: 802.443.3484
**************************************************
[[alternative HTML version deleted]]
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology