Matthew, I support Chris' point of view. The random effects should model the dependence structure between the measurements. The parameters of your fixed effect will be better when you have specified the random effect correctly. Define the most complex model that are willing to accept, run it with different random effects and compare those models (by LRT or AIC, both under REML). Take the best one and refine your fixed effects (under ML).
A few suggestions: M0 <- gls(fixed = growth ~ (x1 + x2 + x3+ x4 + x5)^2, method = 'REML') Ma <- lme(fixed = growth ~ (x1 + x2 + x3+ x4 + x5)^2, random = ~1|year, method = 'REML') M1 <- lme(fixed = growth ~ (x1 + x2 + x3+ x4 + x5)^2, random = ~1|id, method = 'REML') M2 <- lme(fixed = growth ~ (x1 + x2 + x3+ x4 + x5)^2, random = ~year|id, method = 'REML') M3 <- lme(fixed = growth ~ (x1 + x2 + x3+ x4 + x5)^2, random = ~factor(year)|id, method = 'REML') M4 <- lme(fixed = growth ~ (x1 + x2 + x3+ x4 + x5)^2, random = list(id = pdCompSymm(~factor(year))), method = 'REML') anova(M0, Ma) anova(M0, M1, M2) anova(M0, M1, M3) anova(M0, M1, M4) Warning: M3 may take some time to calculate as it will have estimate 45 parameters for the random effect. A random slope by year when you group by year makes no sense since you have in each group only data from one year. Hence it is impossible to have a slope by year in each group. Thierry ------------------------------------------------------------------------ ---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 [EMAIL PROTECTED] www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Namens Christian A. Parker Verzonden: donderdag 22 mei 2008 16:39 Aan: Landis, R Matthew CC: 'r-sig-ecology@r-project.org' Onderwerp: Re: [R-sig-eco] nlme model specification Matthew Please correct me if I am wrong (anyone) but because your observations are not independent across your desired groups (years) your error terms will be biased which will then influence your significant tests. So regardless of the factor that you are interested you would still want to account for the fact that all measurements were taken on the same trees each year by doing a repeated measures model of some sort. Hope this helps, -Chris Landis, R Matthew wrote: > Dear R-sig-eco: > > Many thanks to all of those who took the time to reply to my question. The diversity of replies has made me go back and try to clarify my question. Apologies for the length of the e-mail. Thanks in advance to anyone willing to plow through this and understand it. If you're ever in Middlebury I'll buy you a beer. > > To repeat, I have 300 trees, ranging in size from 10 - 150 cm diameter (big trees). To simplify my original question, let's say I want to understand the relationship between growth and two variables, diameter (continuous) and vine load (ordinal index from 1-4). I'd also like to know the relative importance of diameter vs. vine load, e.g. by partial R2. If I had one year of data, this would be a simple regression. > > However, I have 9 years of annual measurements on the trees. It's as if I have the above analysis repeated 9 times. There was no initial treatment, so I view these 9 years as a random sample of the years in the life of the tree, and unlike most examples of repeated measures I have read, the time effect is of no interest whatsoever. That is, I am not interested in viewing xyplot(growth ~ time|id). I don't expect to see any consistent directional response to time. In a way, it's as if the 9 years represent blocks, (except that it's the same 300 trees in each block) -- this is why I view the yr as a random effect, and as the grouping variable. > > If I were to graph the data, I would use xyplot(growth ~ diameter|yr) to see what I am most interested in. Grouping by individual doesn't make sense to me here because each individual only represents a very small slice of the full range of measurements - e.g. over the ten years, each tree only grows from 10 cm - 14 cm, so I can't really estimate the growth vs. diameter relationship for each tree. xyplot(growth ~ diameter|id) would not be useful. This is why I don't consider the individual to be the grouping variable, but perhaps I am wrong on this. > > So, now, as before, I am back to > > fit <- lme(fixed = growth ~ diameter * vines, random = ~ 1|year) > > I'm expecting that this will estimate separate intercepts for each year. Which is what I want (I would like to fit separate slopes by year too, but that model didn't converge). > > I guess what I'm most concerned about is whether the significance tests obtained for each term use the appropriate error term and the appropriate degrees of freedom. I'm currently using something like the following command to test the effect of diameter > > anova(fit.full.model, update(fit.full.model, . ~ vines)) > > But maybe I'm way off base there. > > Thanks very much! > > Matt Landis > > >> -----Original Message----- >> From: [EMAIL PROTECTED] >> [mailto:[EMAIL PROTECTED] On Behalf Of >> Landis, R Matthew >> Sent: Wednesday, May 21, 2008 1:55 PM >> To: 'r-sig-ecology@r-project.org' >> Subject: [R-sig-eco] nlme model specification >> >> Greetings R-eco folks, >> >> I'm trying to analyze a dataset on tree growth rates to see >> which factors are important (and their relative importance >> too, if I can get that), and I'm having some trouble figuring >> out how to specify the model, despite having carefully read >> Pinheiro and Bates, the help files for nlme, Crawley's book on >> Statistics with S, MASS, and other books besides. >> >> The dataset consists of ~ 300 trees measured annually for 10 >> years. So, I have 9 pseudo-replicated intervals over which to >> assess growth (about 2700 rows in the dataset). There are 5 >> different explanatory factors, which are a combination of >> continuous variables and categorical factors. Some of these >> vary with time. In the end, I would like to get both >> coefficient estimates and partial R2 (or some other way of >> ranking them) for each factor. Unlike most time-series >> examples in the books, I am not interested in how growth >> varies with time, nor am I particular interested in >> interactions of explanatory factors with time. >> >> Based on this, I've convinced myself that I should specify the >> model as: >> >> fit <- lme(fixed = growth ~ (x1 + x2 + x3+ x4 + x5)^2, random >> = ~1|year, method = 'ML') >> >> Year is clearly a random effect, and is the grouping variable >> for the analysis. Each of the other coefficients is "inner" >> to this variable. I'm ignoring individual tree as a grouping >> factor, since I don't want to estimate separate coefficients >> for each tree. Does this sound like the correct way to do this? >> >> Thanks for any help. Apologies if this is more of a >> statistics question and less of an R question. >> >> Matt Landis >> >> **************************************************** >> R. Matthew Landis, Ph.D. >> Dept. Biology >> Middlebury College >> Middlebury, VT 05753 >> >> tel.: 802.443.3484 >> ************************************************** >> >> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> R-sig-ecology mailing list >> R-sig-ecology@r-project.org >> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology >> >> > > _______________________________________________ > R-sig-ecology mailing list > R-sig-ecology@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology > > _______________________________________________ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology _______________________________________________ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology