Re: [R] Naming a random effect in lmer
Hi Bill, I'm about to take a look at this. If I understand the issue, very long expressions for what I call the grouping factor of a random effects term (the expressions on the right hand side of the vertical bar) are encountering problems with deparse. I should have realized that, any time one uses deparse, disaster looms. I can tell you the reason that the collection of random-effects terms is being named is partly for the printed form and partly so that terms with the same grouping factor can be associated. I guess my simplistic solution to the problem would be to precompute these sums and give them names, if it is the sum like Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 that is important, why not evaluate the sum testGroupSamp - within(testGroupSamp, Z29 - Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9) and use Z29 as the grouping factor. Even the use of variables with names like Z1, Z2, ... and the use of expressions like paste(Z, 2:9, sep = ) is not idiomatic R/S code. It's an SPSS/SASism. (You know I never realized before how close the word SASism, meaning a construction that is natural in SAS, is to Sadism.) Why not create a matrix Z and evaluate these sums as matrix/vector products? Zs2- paste(Z,2:9,sep=) On Fri, May 22, 2009 at 5:30 PM, William Dunlap wdun...@tibco.com wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Friday, May 22, 2009 3:01 PM To: Leigh Ann Starcevich Cc: r-help@r-project.org Subject: Re: [R] Naming a random effect in lmer [ ... elided statistical advice ... ] If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading lme4_0.999375-30.tar.gz from http://cran.fhcrc.org/web/packages/lme4/index.html;), then walk through the code line by line using the debug function (or browser or the debug package). From this, you will likely see either (a) how you can do what you want differently to achieve the same result or (b) how to modify the code so it does what you want. The coding error is right in the error message: Error in names(bars) - unlist(lapply(bars, function(x) deparse(x[[3]]))) and I suspect that traceback() would tell you that came from a call to lmerFactorList. That code implicitly assumes that deparse() will produce a scalar character vector, but it doesn't if the input expression is complicated enough. Changing the deparse(x[[3]]) to deparse(x[[3]])[1] or paste(collapse= , deparse(x[[3]])[1]) would fix it. The first truncates the name and the second my make a very long name. There is at least one other use of that idiom in the lme4 code and your dataset and analysis may require that all of them be fixed. Hope this helps. Spencer Leigh Ann Starcevich wrote: Here is a test data set and code. I am including the data set after the code and discussion to make reading easier. Apologies for the size of the data set, but my problem occurs when there are a lot of Z variables. Thanks for your time. # Enter data below # Sample code library(lme4) mb- length(unique(testsamp$WYear)) # Create the formula for the set of identically distributed random effects Zs- paste(Z,2:(mb-1)),sep=) Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs,collapse=+), fittest-lmer(Trendformula, data = testsamp) summary(fittest) # Here I get an error because the name of the random effect is too long to print # in the random effects output (I think). # The error message is: Error in names(bars) - unlist(lapply(bars, function(x) # deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the vector [2] # However, when fewer Z variables are used in the random portion of the model, # there is no error. # Using only Z2 + ... + Z9 for the random intercept Zs2- paste(Z,2:9,sep=) Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs2,collapse=+), fittest2-lmer(Trendformula2, data = testsamp) summary(fittest2) # Is there a way to either name the set of iid random effects something else or # to define a random variable that could be used in the model to create a # random intercept? # I have had some success in lme, but it would be helpful for my simulation if I # could conduct this analysis with lmer. My model in lme is not correctly # estimating one of the variance components (random Site intercept). # I am using: detach(package:lme4) library(nlme) random.model.lme -as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),col lapse=+))) n-dim(testsamp)[1] testsampgroup - rep(1,n) testsamp.lme -cbind(testsamp,testsampgroup) testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup, inner= ~Site, data= data.frame
Re: [R] Naming a random effect in lmer
Thanks for your thoughts on this. I tried the approach for the grouping variable using the within function. I looked at a subset of my data for which I do not get the deparse error in lmer and compared the results. The approach using the within function to form the grouping variable underestimates that variance component. Strangely, the Zt matrix for the random effects in the proposed approach does not appear to contain rows for the Zgrouped variable, although it does report a variance estimate for the random effect. The results for the standard approach appear correct for this simulated data set. Any other ideas on how to make the grouping work? Thanks -- Leigh Ann # Assume that only mb=10 years of data are available to compare to results # that avoid deparse error # Group the grouping variable using within testsamp1-testsamp Zs2- paste(Z,2:9,sep=) Zsum2 - paste(Zs2,collapse=+) testsamp1 - within(testsamp1, Zgrouped-Zsum2) fittest29.1-lmer(LogY ~ WYear + (1+WYear|Site) + (1|Zgrouped), data = testsamp1[testsamp1$WYear=10,]) Linear mixed model fit by REML Formula: LogY ~ WYear + (1 + WYear | Site) + (1 | Zgrouped) Data: testsamp1[testsamp1$WYear = 10, ] AICBIC logLik deviance REMLdev -42.96 -26.55 28.48 -63.37 -56.96 Random effects: Groups NameVariance Std.Dev. Corr Site (Intercept) 0.07351401 0.271135 WYear 0.02620422 0.161877 0.057 Zgrouped (Intercept) 0.00036891 0.019207 Residual 0.01065189 0.103208 Number of obs: 77, groups: Site, 7; Zgrouped, 1 Fixed effects: Estimate Std. Error t value (Intercept) 0.9857992 0.1065573 9.251 WYear -0.0002676 0.0612968 -0.004 Correlation of Fixed Effects: (Intr) WYear 0.044 # Use the standard formulation for the grouping variable with reduced number of # terms to avoid deparse error Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs2,collapse=+), fittest29.2-lmer(Trendformula2, data = testsamp[testsamp$WYear=10,]) summary(fittest29.2) Linear mixed model fit by REML Formula: Trendformula2 Data: testsamp[testsamp$WYear = 10, ] AICBIC logLik deviance REMLdev -147.5 -131.1 80.77 -167.7 -161.5 Random effects: GroupsNameVariance Std.Dev. Corr Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 (Intercept) 0.0095237 0.097589 Site (Intercept) 0.0765445 0.276667 WYear 0.0262909 0.162145 0.046 Residual 0.0011282 0.033589 Number of obs: 77, groups: Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9, 11; Site, 7 Fixed effects: Estimate Std. Error t value (Intercept) 0.9857992 0.1183912 8.327 WYear -0.0002676 0.0619989 -0.004 Correlation of Fixed Effects: (Intr) WYear -0.020 fittest2...@zt 15 x 77 sparse Matrix of class dgCMatrix fittest2...@zt 25 x 77 sparse Matrix of class dgCMatrix At 09:37 AM 5/24/2009 -0500, Douglas Bates wrote: Hi Bill, I'm about to take a look at this. If I understand the issue, very long expressions for what I call the grouping factor of a random effects term (the expressions on the right hand side of the vertical bar) are encountering problems with deparse. I should have realized that, any time one uses deparse, disaster looms. I can tell you the reason that the collection of random-effects terms is being named is partly for the printed form and partly so that terms with the same grouping factor can be associated. I guess my simplistic solution to the problem would be to precompute these sums and give them names, if it is the sum like Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 that is important, why not evaluate the sum testGroupSamp - within(testGroupSamp, Z29 - Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9) and use Z29 as the grouping factor. Even the use of variables with names like Z1, Z2, ... and the use of expressions like paste(Z, 2:9, sep = ) is not idiomatic R/S code. It's an SPSS/SASism. (You know I never realized before how close the word SASism, meaning a construction that is natural in SAS, is to Sadism.) Why not create a matrix Z and evaluate these sums as matrix/vector products? Zs2- paste(Z,2:9,sep=) On Fri, May 22, 2009 at 5:30 PM, William Dunlap wdun...@tibco.com wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Friday, May 22, 2009 3:01 PM To: Leigh Ann Starcevich Cc: r-help@r-project.org Subject: Re: [R] Naming a random effect in lmer [ ... elided statistical advice ... ] If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading lme4_0.999375-30.tar.gz from http://cran.fhcrc.org/web/packages/lme4
Re: [R] Naming a random effect in lmer
I am having trouble modifying the lmer code to make the change that William Dunlap suggested, and I believe that this is because the environment is locked. I tried: 1. Revising the lmerFactorList function in a word processor and copying/pasting into lmer, but this did not work because the environment is locked. 2. Setting the environment and modifying the functionality of the lmerFactorList function as recommended in this post (https://stat.ethz.ch/pipermail/r-devel/2006-July/038547.html), but the unmodified lmerFactorList function was still used in the lmer call and the same error was generated. 3. Creating a new version of the lme4 package for my modified function with the makecdfenv package, but I do not know how to create a CDF file, feel that I am well beyond my programming expertise, and hope there is a more simple resolution to this issue. Thanks for any help -- las Leigh Ann Starcevich Doctoral student Oregon State University Corvallis, Oregon At 03:30 PM 5/22/2009 -0700, William Dunlap wrote: -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Friday, May 22, 2009 3:01 PM To: Leigh Ann Starcevich Cc: r-help@r-project.org Subject: Re: [R] Naming a random effect in lmer [ ... elided statistical advice ... ] If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading lme4_0.999375-30.tar.gz from http://cran.fhcrc.org/web/packages/lme4/index.html;), then walk through the code line by line using the debug function (or browser or the debug package). From this, you will likely see either (a) how you can do what you want differently to achieve the same result or (b) how to modify the code so it does what you want. The coding error is right in the error message: Error in names(bars) - unlist(lapply(bars, function(x) deparse(x[[3]]))) and I suspect that traceback() would tell you that came from a call to lmerFactorList. That code implicitly assumes that deparse() will produce a scalar character vector, but it doesn't if the input expression is complicated enough. Changing the deparse(x[[3]]) to deparse(x[[3]])[1] or paste(collapse= , deparse(x[[3]])[1]) would fix it. The first truncates the name and the second my make a very long name. There is at least one other use of that idiom in the lme4 code and your dataset and analysis may require that all of them be fixed. Hope this helps. Spencer Leigh Ann Starcevich wrote: Here is a test data set and code. I am including the data set after the code and discussion to make reading easier. Apologies for the size of the data set, but my problem occurs when there are a lot of Z variables. Thanks for your time. # Enter data below # Sample code library(lme4) mb- length(unique(testsamp$WYear)) # Create the formula for the set of identically distributed random effects Zs- paste(Z,2:(mb-1)),sep=) Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs,collapse=+), fittest-lmer(Trendformula, data = testsamp) summary(fittest) # Here I get an error because the name of the random effect is too long to print # in the random effects output (I think). # The error message is: Error in names(bars) - unlist(lapply(bars, function(x) # deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the vector [2] # However, when fewer Z variables are used in the random portion of the model, # there is no error. # Using only Z2 + ... + Z9 for the random intercept Zs2- paste(Z,2:9,sep=) Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs2,collapse=+), fittest2-lmer(Trendformula2, data = testsamp) summary(fittest2) # Is there a way to either name the set of iid random effects something else or # to define a random variable that could be used in the model to create a # random intercept? # I have had some success in lme, but it would be helpful for my simulation if I # could conduct this analysis with lmer. My model in lme is not correctly # estimating one of the variance components (random Site intercept). # I am using: detach(package:lme4) library(nlme) random.model.lme -as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),col lapse=+))) n-dim(testsamp)[1] testsampgroup - rep(1,n) testsamp.lme -cbind(testsamp,testsampgroup) testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup, inner= ~Site, data= data.frame(testsamp.lme)) fittest3-lme(LogY ~ WYearCen, random= pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)), pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))),data
Re: [R] Naming a random effect in lmer
The first exaample on the lmer help page uses a formula Reaction ~ Days + (Days|Subject). Here, Subject is the name of a column in the data.frame sleepstudy, with levels 308, 309, ... . Does this answer your question? If no, please provide commented, minimal, self-contained, reproducible code, as requested in the posting guide http://www.R-project.org/posting-guide.html;. Your example is not self-contained, reproducible. Hope this helps. Spencer Leigh Ann Starcevich wrote: Dear guRus: I am using lmer for a mixed model that includes a random intercept for a set of effects that have the same distribution, Normal(0, sig2b). This set of effects is of variable size, so I am using an as.formula statement to create the formula for lmer. For example, if the set of random effects has dimension 8, then the lmer call is: Zs- paste(Z,1:mb,sep=) Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, paste(paste(Zs,collapse=+), fit2.4a-lmer(Trendformula, data = testsamp) which, for mb=8, expands to: fit1-lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+ Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8), data = testsamp) I have no problems with this. However, if the set of random effects has a dimension of 30, then the lmer call is: fit2-lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 + Z16 + Z17 + Z18 + Z19 + Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ Z30), data = testsamp) In this case, I get an error because the name Z1+Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 + Z16 + Z17 + Z18 + Z19 + Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ Z30 is too long to print in the output. Is there any way to name the random effect in lmer so that the shorter (and more descriptive) name may be used and the error avoided? Or is there a way to combine these into a single variable prior to the lmer function call? In SAS, I am able to parameterize these as a Toeplitz structure with bandwidth 1. Thanks for any help. Leigh Ann Starcevich Doctoral student Oregon State University Corvallis, Oregon [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Naming a random effect in lmer
Here is a test data set and code. I am including the data set after the code and discussion to make reading easier. Apologies for the size of the data set, but my problem occurs when there are a lot of Z variables. Thanks for your time. # Enter data below # Sample code library(lme4) mb- length(unique(testsamp$WYear)) # Create the formula for the set of identically distributed random effects Zs- paste(Z,2:(mb-1)),sep=) Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs,collapse=+), fittest-lmer(Trendformula, data = testsamp) summary(fittest) # Here I get an error because the name of the random effect is too long to print # in the random effects output (I think). # The error message is: Error in names(bars) - unlist(lapply(bars, function(x) # deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the vector [2] # However, when fewer Z variables are used in the random portion of the model, # there is no error. # Using only Z2 + + Z9 for the random intercept Zs2- paste(Z,2:9,sep=) Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs2,collapse=+), fittest2-lmer(Trendformula2, data = testsamp) summary(fittest2) # Is there a way to either name the set of iid random effects something else or # to define a random variable that could be used in the model to create a # random intercept? # I have had some success in lme, but it would be helpful for my simulation if I # could conduct this analysis with lmer. My model in lme is not correctly # estimating one of the variance components (random Site intercept). # I am using: detach(package:lme4) library(nlme) random.model.lme -as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),collapse=+))) n-dim(testsamp)[1] testsampgroup - rep(1,n) testsamp.lme -cbind(testsamp,testsampgroup) testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup, inner= ~Site, data= data.frame(testsamp.lme)) fittest3-lme(LogY ~ WYearCen, random= pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)), pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))),data= testgroupSamp) summary(fittest3) VarCorr(fittest3) # Data testsamp - structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022), WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14), Site = c(4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94), LogY = c(0.848648866298552, 0.809143925760456, 0.734173952725014, 1.46749967704437, 0.716106254860468, 0.843871512951468, 1.09120092433378, 0.800893809796851, 0.996674977596997, 0.917613604481207, 1.71928772722884, 0.797853604855215, 0.922298691760041, 0.964654422529188, 0.782903180421921, 1.13457969553106, 1.21628917384868, 2.03084776647495, 0.872954085910578, 1.02192794559856, 0.746307251774509, 0.439812203188778, 1.11164109549224, 1.18414357836729, 2.00157711459358, 0.66577753155877, 0.856374433428581, 0.343862060001402, 0.278505653147057, 1.20632152691478, 1.32289150679746, 2.19814430598707, 0.538363941164496, 0.820038163290321, 0.070054765524828, 0.0479738684639024, 1.29137364087568, 1.52436357249377, 2.32150525025777, 0.595507040392793, 0.851417610550757, -0.115908144193410, 0.0118306018140099, 1.39448009350962, 1.71677754106603, 2.59146662837284, 0.595750060671620, 0.855387479311679, -0.430729785591898, 0.0178423104900579, 1.60964246000316, 1.99184029256509, 2.86865842252168, 0.69512483409, 0.96175860396451, -0.600991172113926, -0.174420349224615, 1.73794158380868, 2.06718359946362, 3.04502112974038, 0.730730638403177, 0.961110819398807, -0.856693722990918,
Re: [R] Naming a random effect in lmer
I miscommunicated: In every application I've seen with large numbers of parameters to estimate, most of those parameters are specific instances of different levels of a random effect. For example, a colleague recently did a fixed effects analysis of a longitudinal abundance survey of a large number of species of wildlife. This colleague claimed that treating species as a random effect was inappropriate, because the species were not selected by use of random numbers. With only two or three species, if the scientists were only concerned with those two or three species, this may be reasonable. However, most scientists -- and people who study their work -- will want to generalize beyond only the species in the study. Even if scientists were only interested in two or three species for which they had data, Stein's phenomenon suggests that even a rabid Bayesophobe would likely get lower mean square error at the expense of some bias from pretending the species were randomly selected from some larger population (http://en.wikipedia.org/wiki/James-Stein_estimator). If I were asked to referee a paper or serve on a thesis committee for a study estimating 30 random effects, I would not be happy unless there were a convincing discussion of why it was appropriate to estimate all those random effects separately; right now, I can not imagine an application where that would be appropriate. If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading lme4_0.999375-30.tar.gz from http://cran.fhcrc.org/web/packages/lme4/index.html;), then walk through the code line by line using the debug function (or browser or the debug package). From this, you will likely see either (a) how you can do what you want differently to achieve the same result or (b) how to modify the code so it does what you want. Hope this helps. Spencer Leigh Ann Starcevich wrote: Here is a test data set and code. I am including the data set after the code and discussion to make reading easier. Apologies for the size of the data set, but my problem occurs when there are a lot of Z variables. Thanks for your time. # Enter data below # Sample code library(lme4) mb- length(unique(testsamp$WYear)) # Create the formula for the set of identically distributed random effects Zs- paste(Z,2:(mb-1)),sep=) Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs,collapse=+), fittest-lmer(Trendformula, data = testsamp) summary(fittest) # Here I get an error because the name of the random effect is too long to print # in the random effects output (I think). # The error message is: Error in names(bars) - unlist(lapply(bars, function(x) # deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the vector [2] # However, when fewer Z variables are used in the random portion of the model, # there is no error. # Using only Z2 + … + Z9 for the random intercept Zs2- paste(Z,2:9,sep=) Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs2,collapse=+), fittest2-lmer(Trendformula2, data = testsamp) summary(fittest2) # Is there a way to either name the set of iid random effects something else or # to define a random variable that could be used in the model to create a # random intercept? # I have had some success in lme, but it would be helpful for my simulation if I # could conduct this analysis with lmer. My model in lme is not correctly # estimating one of the variance components (random Site intercept). # I am using: detach(package:lme4) library(nlme) random.model.lme -as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),collapse=+))) n-dim(testsamp)[1] testsampgroup - rep(1,n) testsamp.lme -cbind(testsamp,testsampgroup) testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup, inner= ~Site, data= data.frame(testsamp.lme)) fittest3-lme(LogY ~ WYearCen, random= pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)), pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))),data= testgroupSamp) summary(fittest3) VarCorr(fittest3) # Data testsamp - structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022), WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3,
Re: [R] Naming a random effect in lmer
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg Sent: Friday, May 22, 2009 3:01 PM To: Leigh Ann Starcevich Cc: r-help@r-project.org Subject: Re: [R] Naming a random effect in lmer [ ... elided statistical advice ... ] If you and your advisor still feel that what you are doing makes sense, I suggest you first get the source code via svn checkout svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading lme4_0.999375-30.tar.gz from http://cran.fhcrc.org/web/packages/lme4/index.html;), then walk through the code line by line using the debug function (or browser or the debug package). From this, you will likely see either (a) how you can do what you want differently to achieve the same result or (b) how to modify the code so it does what you want. The coding error is right in the error message: Error in names(bars) - unlist(lapply(bars, function(x) deparse(x[[3]]))) and I suspect that traceback() would tell you that came from a call to lmerFactorList. That code implicitly assumes that deparse() will produce a scalar character vector, but it doesn't if the input expression is complicated enough. Changing the deparse(x[[3]]) to deparse(x[[3]])[1] or paste(collapse= , deparse(x[[3]])[1]) would fix it. The first truncates the name and the second my make a very long name. There is at least one other use of that idiom in the lme4 code and your dataset and analysis may require that all of them be fixed. Hope this helps. Spencer Leigh Ann Starcevich wrote: Here is a test data set and code. I am including the data set after the code and discussion to make reading easier. Apologies for the size of the data set, but my problem occurs when there are a lot of Z variables. Thanks for your time. # Enter data below # Sample code library(lme4) mb- length(unique(testsamp$WYear)) # Create the formula for the set of identically distributed random effects Zs- paste(Z,2:(mb-1)),sep=) Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs,collapse=+), fittest-lmer(Trendformula, data = testsamp) summary(fittest) # Here I get an error because the name of the random effect is too long to print # in the random effects output (I think). # The error message is: Error in names(bars) - unlist(lapply(bars, function(x) # deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the vector [2] # However, when fewer Z variables are used in the random portion of the model, # there is no error. # Using only Z2 + ... + Z9 for the random intercept Zs2- paste(Z,2:9,sep=) Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, randommodel=paste(paste(Zs2,collapse=+), fittest2-lmer(Trendformula2, data = testsamp) summary(fittest2) # Is there a way to either name the set of iid random effects something else or # to define a random variable that could be used in the model to create a # random intercept? # I have had some success in lme, but it would be helpful for my simulation if I # could conduct this analysis with lmer. My model in lme is not correctly # estimating one of the variance components (random Site intercept). # I am using: detach(package:lme4) library(nlme) random.model.lme -as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),col lapse=+))) n-dim(testsamp)[1] testsampgroup - rep(1,n) testsamp.lme -cbind(testsamp,testsampgroup) testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup, inner= ~Site, data= data.frame(testsamp.lme)) fittest3-lme(LogY ~ WYearCen, random= pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)), pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))),data= testgroupSamp) summary(fittest3) VarCorr(fittest3) # Data testsamp - structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015, 2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016, 2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022), WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13
[R] Naming a random effect in lmer
Dear guRus: I am using lmer for a mixed model that includes a random intercept for a set of effects that have the same distribution, Normal(0, sig2b). This set of effects is of variable size, so I am using an as.formula statement to create the formula for lmer. For example, if the set of random effects has dimension 8, then the lmer call is: Zs- paste(Z,1:mb,sep=) Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, paste(paste(Zs,collapse=+), fit2.4a-lmer(Trendformula, data = testsamp) which, for mb=8, expands to: fit1-lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+ Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8), data = testsamp) I have no problems with this. However, if the set of random effects has a dimension of 30, then the lmer call is: fit2-lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 + Z16 + Z17 + Z18 + Z19 + Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ Z30), data = testsamp) In this case, I get an error because the name Z1+Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 + Z16 + Z17 + Z18 + Z19 + Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ Z30 is too long to print in the output. Is there any way to name the random effect in lmer so that the shorter (and more descriptive) name may be used and the error avoided? Or is there a way to combine these into a single variable prior to the lmer function call? In SAS, I am able to parameterize these as a Toeplitz structure with bandwidth 1. Thanks for any help. Leigh Ann Starcevich Doctoral student Oregon State University Corvallis, Oregon [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.