Re: [R] Naming a random effect in lmer

2009-05-24 Thread Douglas Bates
Hi Bill,

I'm about to take a look at this.  If I understand the issue, very
long expressions for what I call the grouping factor of a random
effects term (the expressions on the right hand side of the vertical
bar) are encountering problems with deparse.  I should have realized
that, any time one uses deparse, disaster looms.

I can tell you the reason that the collection of random-effects terms
is being named is partly for the printed form and partly so that terms
with the same grouping factor can be associated.

I guess my simplistic solution to the problem would be to precompute
these sums and give them names, if it is the sum like

Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9

that is important, why not evaluate the sum

testGroupSamp - within(testGroupSamp, Z29 - Z2 + Z3 + Z4 + Z5 + Z6 +
Z7 + Z8 + Z9)

and use Z29 as the grouping factor.

Even the use of variables with names like Z1, Z2, ... and the use of
expressions like paste(Z, 2:9, sep = ) is not idiomatic R/S code.
It's an SPSS/SASism.  (You know I never realized before how close the
word SASism, meaning a construction that is natural in SAS, is to
Sadism.)  Why not create a matrix Z and evaluate these sums as
matrix/vector products?


Zs2- paste(Z,2:9,sep=)


On Fri, May 22, 2009 at 5:30 PM, William Dunlap wdun...@tibco.com wrote:
 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg
 Sent: Friday, May 22, 2009 3:01 PM
 To: Leigh Ann Starcevich
 Cc: r-help@r-project.org
 Subject: Re: [R] Naming a random effect in lmer

 [ ... elided statistical advice ... ]
 If you and your advisor still feel that what you are doing
 makes sense,
 I suggest you first get the source code via svn checkout
 svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading
 lme4_0.999375-30.tar.gz from
 http://cran.fhcrc.org/web/packages/lme4/index.html;), then
 walk through
 the code line by line using the debug function (or browser or the
 debug package). From this, you will likely see either (a)
 how you can
 do what you want differently to achieve the same result or (b) how to
 modify the code so it does what you want.

 The coding error is right in the error message:
  Error in names(bars) - unlist(lapply(bars, function(x)
 deparse(x[[3]])))
 and I suspect that traceback() would tell you that came from a call
 to lmerFactorList.

 That code implicitly assumes that deparse() will produce a scalar
 character
 vector, but it doesn't if the input expression is complicated enough.
 Changing the
    deparse(x[[3]])
 to
    deparse(x[[3]])[1]
 or
    paste(collapse= , deparse(x[[3]])[1])
 would fix it.  The first truncates the name and the second my make a
 very
 long name.

 There is at least one other use of that idiom in the lme4 code and your
 dataset and analysis may require that all of them be fixed.



 Hope this helps.
 Spencer


 Leigh Ann Starcevich wrote:
  Here is a test data set and code. I am including the data set after
  the code and discussion to make reading easier. Apologies
 for the size
  of the data set, but my problem occurs when there are a lot of Z
  variables. Thanks for your time.
 
  # Enter data below
 
  # Sample code
  library(lme4)
  mb- length(unique(testsamp$WYear))
 
  # Create the formula for the set of identically distributed random
  effects
  Zs- paste(Z,2:(mb-1)),sep=)
  Trendformula -as.formula(paste(LogY ~ WYear +
 (1+WYear|Site) + (1|,
  randommodel=paste(paste(Zs,collapse=+), 
 
  fittest-lmer(Trendformula, data = testsamp)
  summary(fittest)
 
  # Here I get an error because the name of the random effect is too
  long to print
  # in the random effects output (I think).
  # The error message is: Error in names(bars) - unlist(lapply(bars,
  function(x)
  # deparse(x[[3]]))) : 'names' attribute [3] must be the
 same length as
  the vector [2]
 
  # However, when fewer Z variables are used in the random portion of
  the model,
  # there is no error.
  # Using only Z2 + ... + Z9 for the random intercept
 
  Zs2- paste(Z,2:9,sep=)
  Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) +
  (1|, randommodel=paste(paste(Zs2,collapse=+), 
  fittest2-lmer(Trendformula2, data = testsamp)
  summary(fittest2)
 
 
  # Is there a way to either name the set of iid random effects
  something else or
  # to define a random variable that could be used in the
 model to create a
  # random intercept?
 
  # I have had some success in lme, but it would be helpful for my
  simulation if I
  # could conduct this analysis with lmer. My model in lme is
 not correctly
  # estimating one of the variance components (random Site intercept).
  # I am using:
 
  detach(package:lme4)
  library(nlme)
  random.model.lme
 
 -as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),col
 lapse=+)))
 
 
  n-dim(testsamp)[1]
  testsampgroup - rep(1,n)
  testsamp.lme -cbind(testsamp,testsampgroup)
  testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup,
 inner= ~Site,
  data= data.frame

Re: [R] Naming a random effect in lmer

2009-05-24 Thread Leigh Ann Starcevich
Thanks for your thoughts on this.  I tried the approach for the grouping 
variable using the within function.  I looked at a subset of my data for 
which I do not get the deparse error in lmer and compared the results.  The 
approach using the within function to form the grouping variable 
underestimates that variance component.  Strangely, the Zt matrix for the 
random effects in the proposed approach does not appear to contain rows for 
the Zgrouped variable, although it does report a variance estimate for the 
random effect.  The results for the standard approach appear correct for 
this simulated data set.  Any other ideas on how to make the grouping work?

Thanks -- Leigh Ann


# Assume that only mb=10 years of data are available to compare to results
# that avoid deparse error

# Group the grouping variable using within
testsamp1-testsamp
Zs2- paste(Z,2:9,sep=)
Zsum2 - paste(Zs2,collapse=+)
testsamp1 - within(testsamp1, Zgrouped-Zsum2)
fittest29.1-lmer(LogY ~ WYear + (1+WYear|Site) +  (1|Zgrouped), data = 
testsamp1[testsamp1$WYear=10,])

Linear mixed model fit by REML
Formula: LogY ~ WYear + (1 + WYear | Site) + (1 | Zgrouped)
Data: testsamp1[testsamp1$WYear = 10, ]
 AICBIC logLik deviance REMLdev
  -42.96 -26.55  28.48   -63.37  -56.96
Random effects:
  Groups   NameVariance   Std.Dev. Corr
  Site (Intercept) 0.07351401 0.271135
   WYear   0.02620422 0.161877 0.057
  Zgrouped (Intercept) 0.00036891 0.019207
  Residual 0.01065189 0.103208
Number of obs: 77, groups: Site, 7; Zgrouped, 1

Fixed effects:
   Estimate Std. Error t value
(Intercept)  0.9857992  0.1065573   9.251
WYear   -0.0002676  0.0612968  -0.004

Correlation of Fixed Effects:
   (Intr)
WYear 0.044

# Use the standard formulation for the grouping variable with reduced 
number of
# terms to avoid deparse error

Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) +  (1|, 
randommodel=paste(paste(Zs2,collapse=+), 
fittest29.2-lmer(Trendformula2, data = testsamp[testsamp$WYear=10,])
summary(fittest29.2)

Linear mixed model fit by REML
Formula: Trendformula2
Data: testsamp[testsamp$WYear = 10, ]
 AICBIC logLik deviance REMLdev
  -147.5 -131.1  80.77   -167.7  -161.5
Random effects:
  GroupsNameVariance  Std.Dev. Corr
  Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9 (Intercept) 0.0095237 0.097589
  Site  (Intercept) 0.0765445 0.276667
WYear   0.0262909 0.162145 0.046
  Residual  0.0011282 0.033589
Number of obs: 77, groups: Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9, 11; Site, 7

Fixed effects:
   Estimate Std. Error t value
(Intercept)  0.9857992  0.1183912   8.327
WYear   -0.0002676  0.0619989  -0.004

Correlation of Fixed Effects:
   (Intr)
WYear -0.020

  fittest2...@zt
15 x 77 sparse Matrix of class dgCMatrix
  fittest2...@zt
25 x 77 sparse Matrix of class dgCMatrix






At 09:37 AM 5/24/2009 -0500, Douglas Bates wrote:
Hi Bill,

I'm about to take a look at this.  If I understand the issue, very
long expressions for what I call the grouping factor of a random
effects term (the expressions on the right hand side of the vertical
bar) are encountering problems with deparse.  I should have realized
that, any time one uses deparse, disaster looms.

I can tell you the reason that the collection of random-effects terms
is being named is partly for the printed form and partly so that terms
with the same grouping factor can be associated.

I guess my simplistic solution to the problem would be to precompute
these sums and give them names, if it is the sum like

Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9

that is important, why not evaluate the sum

testGroupSamp - within(testGroupSamp, Z29 - Z2 + Z3 + Z4 + Z5 + Z6 +
Z7 + Z8 + Z9)

and use Z29 as the grouping factor.

Even the use of variables with names like Z1, Z2, ... and the use of
expressions like paste(Z, 2:9, sep = ) is not idiomatic R/S code.
It's an SPSS/SASism.  (You know I never realized before how close the
word SASism, meaning a construction that is natural in SAS, is to
Sadism.)  Why not create a matrix Z and evaluate these sums as
matrix/vector products?


Zs2- paste(Z,2:9,sep=)


On Fri, May 22, 2009 at 5:30 PM, William Dunlap wdun...@tibco.com wrote:
  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg
  Sent: Friday, May 22, 2009 3:01 PM
  To: Leigh Ann Starcevich
  Cc: r-help@r-project.org
  Subject: Re: [R] Naming a random effect in lmer
 
  [ ... elided statistical advice ... ]
  If you and your advisor still feel that what you are doing
  makes sense,
  I suggest you first get the source code via svn checkout
  svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading
  lme4_0.999375-30.tar.gz from
  http://cran.fhcrc.org/web/packages/lme4

Re: [R] Naming a random effect in lmer

2009-05-23 Thread Leigh Ann Starcevich
I am having trouble modifying the lmer code to make the change that William 
Dunlap suggested, and I believe that this is because the environment is 
locked.

I tried:

1.  Revising the lmerFactorList function in a word processor and 
copying/pasting into lmer, but this did not work because the environment is 
locked.

2.  Setting the environment and modifying the functionality of the 
lmerFactorList function as recommended in this post 
(https://stat.ethz.ch/pipermail/r-devel/2006-July/038547.html), but the 
unmodified lmerFactorList function was still used in the lmer call and the 
same error was generated.

3.  Creating a new version of the lme4 package for my modified function 
with the makecdfenv package, but I do not know how to create a CDF file, 
feel that I am well beyond my programming expertise, and hope there is a 
more simple resolution to this issue.

Thanks for any help -- las

Leigh Ann Starcevich
Doctoral student
Oregon State University
Corvallis, Oregon


At 03:30 PM 5/22/2009 -0700, William Dunlap wrote:
  -Original Message-
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg
  Sent: Friday, May 22, 2009 3:01 PM
  To: Leigh Ann Starcevich
  Cc: r-help@r-project.org
  Subject: Re: [R] Naming a random effect in lmer
 
  [ ... elided statistical advice ... ]
  If you and your advisor still feel that what you are doing
  makes sense,
  I suggest you first get the source code via svn checkout
  svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading
  lme4_0.999375-30.tar.gz from
  http://cran.fhcrc.org/web/packages/lme4/index.html;), then
  walk through
  the code line by line using the debug function (or browser or the
  debug package). From this, you will likely see either (a)
  how you can
  do what you want differently to achieve the same result or (b) how to
  modify the code so it does what you want.

The coding error is right in the error message:
   Error in names(bars) - unlist(lapply(bars, function(x)
deparse(x[[3]])))
and I suspect that traceback() would tell you that came from a call
to lmerFactorList.

That code implicitly assumes that deparse() will produce a scalar
character
vector, but it doesn't if the input expression is complicated enough.
Changing the
 deparse(x[[3]])
to
 deparse(x[[3]])[1]
or
 paste(collapse= , deparse(x[[3]])[1])
would fix it.  The first truncates the name and the second my make a
very
long name.

There is at least one other use of that idiom in the lme4 code and your
dataset and analysis may require that all of them be fixed.

 
 
  Hope this helps.
  Spencer
 
 
  Leigh Ann Starcevich wrote:
   Here is a test data set and code. I am including the data set after
   the code and discussion to make reading easier. Apologies
  for the size
   of the data set, but my problem occurs when there are a lot of Z
   variables. Thanks for your time.
  
   # Enter data below
  
   # Sample code
   library(lme4)
   mb- length(unique(testsamp$WYear))
  
   # Create the formula for the set of identically distributed random
   effects
   Zs- paste(Z,2:(mb-1)),sep=)
   Trendformula -as.formula(paste(LogY ~ WYear +
  (1+WYear|Site) + (1|,
   randommodel=paste(paste(Zs,collapse=+), 
  
   fittest-lmer(Trendformula, data = testsamp)
   summary(fittest)
  
   # Here I get an error because the name of the random effect is too
   long to print
   # in the random effects output (I think).
   # The error message is: Error in names(bars) - unlist(lapply(bars,
   function(x)
   # deparse(x[[3]]))) : 'names' attribute [3] must be the
  same length as
   the vector [2]
  
   # However, when fewer Z variables are used in the random portion of
   the model,
   # there is no error.
   # Using only Z2 + ... + Z9 for the random intercept
  
   Zs2- paste(Z,2:9,sep=)
   Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) +
   (1|, randommodel=paste(paste(Zs2,collapse=+), 
   fittest2-lmer(Trendformula2, data = testsamp)
   summary(fittest2)
  
  
   # Is there a way to either name the set of iid random effects
   something else or
   # to define a random variable that could be used in the
  model to create a
   # random intercept?
  
   # I have had some success in lme, but it would be helpful for my
   simulation if I
   # could conduct this analysis with lmer. My model in lme is
  not correctly
   # estimating one of the variance components (random Site intercept).
   # I am using:
  
   detach(package:lme4)
   library(nlme)
   random.model.lme
  
  -as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),col
  lapse=+)))
  
  
   n-dim(testsamp)[1]
   testsampgroup - rep(1,n)
   testsamp.lme -cbind(testsamp,testsampgroup)
   testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup,
  inner= ~Site,
   data= data.frame(testsamp.lme))
  
   fittest3-lme(LogY ~ WYearCen, random=
   pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)),
   pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))),data

Re: [R] Naming a random effect in lmer

2009-05-22 Thread spencerg
 The first exaample on the lmer help page uses a formula 
Reaction ~ Days + (Days|Subject).  Here, Subject is the name of a 
column in the data.frame sleepstudy, with levels 308, 309, ... . 



 Does this answer your question?  If no, please provide commented, 
minimal, self-contained, reproducible code, as requested in the posting 
guide http://www.R-project.org/posting-guide.html;.  Your example is 
not self-contained, reproducible. 



 Hope this helps. 
 Spencer



Leigh Ann Starcevich wrote:

Dear guRus:

I am using lmer for a mixed model that includes a random intercept for a 
set of effects that have the same distribution, Normal(0, sig2b).  This set 
of effects is of variable size, so I am using an as.formula statement to 
create the formula for lmer.  For example, if the set of random effects has 
dimension 8, then the lmer call is:


Zs- paste(Z,1:mb,sep=)
Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) +  (1|, 
paste(paste(Zs,collapse=+), 

fit2.4a-lmer(Trendformula, data = testsamp)

which, for mb=8, expands to:

fit1-lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+ Z2 + Z3 + Z4 + 
Z5 + Z6 + Z7 + Z8), data = testsamp)



I have no problems with this.  However, if the set of random effects has a 
dimension of 30, then the lmer call is:


fit2-lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+Z2 + Z3 + Z4 + 
Z5 + Z6 + Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 +  Z16 + Z17 + 
Z18 + Z19 + Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ 
Z30), data = testsamp)


In this case, I get an error because the name Z1+Z2 + Z3 + Z4 + Z5 + Z6 + 
Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 +  Z16 + Z17 + Z18 + Z19 + 
Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ Z30 is too long 
to print in the output.   Is there any way to name the random effect in 
lmer so that the shorter (and more descriptive) name may be used and the 
error avoided?   Or is there a way to combine these into a single variable 
prior to the lmer function call?  In SAS, I am able to parameterize these 
as a Toeplitz structure with bandwidth 1.


Thanks for any help.

Leigh Ann Starcevich
Doctoral student
Oregon State University
Corvallis, Oregon
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Naming a random effect in lmer

2009-05-22 Thread Leigh Ann Starcevich
Here is a test data set and code.  I am including the data set after the 
code and discussion to make reading easier.  Apologies for the size of the 
data set, but my problem occurs when there are a lot of Z 
variables.   Thanks for your time.

# Enter data below

# Sample code
library(lme4)
mb- length(unique(testsamp$WYear))

# Create the formula for the set of identically distributed random effects
Zs- paste(Z,2:(mb-1)),sep=)
Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) +  (1|, 
randommodel=paste(paste(Zs,collapse=+), 

fittest-lmer(Trendformula, data = testsamp)
summary(fittest)

# Here I get an error because the name of the random effect is too long to 
print
# in the random effects output (I think).
# The error message is:  Error in names(bars) - unlist(lapply(bars, 
function(x)
# deparse(x[[3]]))) : 'names' attribute [3] must be the same length as the 
vector [2]

# However, when fewer Z variables are used in the random portion of the model,
# there is no error.
# Using only Z2 + … + Z9 for the random intercept

Zs2- paste(Z,2:9,sep=)
Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) +  (1|, 
randommodel=paste(paste(Zs2,collapse=+), 
fittest2-lmer(Trendformula2, data = testsamp)
summary(fittest2)


# Is there a way to either name the set of iid random effects something 
else or
# to define a random variable that could be used in the model to create a
# random intercept?

# I have had some success in lme, but it would be helpful for my simulation 
if I
# could conduct this analysis with lmer.  My model in lme is not correctly
# estimating one of the variance components (random Site intercept).
# I am using:

detach(package:lme4)
library(nlme)
random.model.lme 
-as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),collapse=+)))

n-dim(testsamp)[1]
testsampgroup - rep(1,n)
testsamp.lme -cbind(testsamp,testsampgroup)
testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup, inner= ~Site,
data= data.frame(testsamp.lme))

fittest3-lme(LogY ~ WYearCen, random= 
pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)), 
pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))),data= testgroupSamp)
summary(fittest3)
VarCorr(fittest3)


# Data

testsamp -
structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008,
2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010,
2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012,
2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013,
2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015,
2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018,
2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019,
2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021,
2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022),
 WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2,
 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4,
 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7,
 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10,
 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12,
 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14,
 14, 14), Site = c(4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40,
 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67,
 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75,
 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94,
 4, 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4,
 18, 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18,
 26, 40, 67, 75, 94, 4, 18, 26, 40, 67, 75, 94, 4, 18, 26,
 40, 67, 75, 94), LogY = c(0.848648866298552, 0.809143925760456,
 0.734173952725014, 1.46749967704437, 0.716106254860468, 
0.843871512951468,
 1.09120092433378, 0.800893809796851, 0.996674977596997, 
0.917613604481207,
 1.71928772722884, 0.797853604855215, 0.922298691760041, 
0.964654422529188,
 0.782903180421921, 1.13457969553106, 1.21628917384868, 2.03084776647495,
 0.872954085910578, 1.02192794559856, 0.746307251774509, 
0.439812203188778,
 1.11164109549224, 1.18414357836729, 2.00157711459358, 0.66577753155877,
 0.856374433428581, 0.343862060001402, 0.278505653147057,
 1.20632152691478, 1.32289150679746, 2.19814430598707, 0.538363941164496,
 0.820038163290321, 0.070054765524828, 0.0479738684639024,
 1.29137364087568, 1.52436357249377, 2.32150525025777, 0.595507040392793,
 0.851417610550757, -0.115908144193410, 0.0118306018140099,
 1.39448009350962, 1.71677754106603, 2.59146662837284, 0.595750060671620,
 0.855387479311679, -0.430729785591898, 0.0178423104900579,
 1.60964246000316, 1.99184029256509, 2.86865842252168, 0.69512483409,
 0.96175860396451, -0.600991172113926, -0.174420349224615,
 1.73794158380868, 2.06718359946362, 3.04502112974038, 0.730730638403177,
 0.961110819398807, -0.856693722990918, 

Re: [R] Naming a random effect in lmer

2009-05-22 Thread spencerg
 I miscommunicated: In every application I've seen with large numbers 
of parameters to estimate, most of those parameters are specific 
instances of different levels of a random effect. For example, a 
colleague recently did a fixed effects analysis of a longitudinal 
abundance survey of a large number of species of wildlife. This 
colleague claimed that treating species as a random effect was 
inappropriate, because the species were not selected by use of random 
numbers. With only two or three species, if the scientists were only 
concerned with those two or three species, this may be reasonable.



However, most scientists -- and people who study their work -- will want 
to generalize beyond only the species in the study. Even if scientists 
were only interested in two or three species for which they had data, 
Stein's phenomenon suggests that even a rabid Bayesophobe would likely 
get lower mean square error at the expense of some bias from pretending 
the species were randomly selected from some larger population 
(http://en.wikipedia.org/wiki/James-Stein_estimator).



If I were asked to referee a paper or serve on a thesis committee for a 
study estimating 30 random effects, I would not be happy unless there 
were a convincing discussion of why it was appropriate to estimate all 
those random effects separately; right now, I can not imagine an 
application where that would be appropriate.



If you and your advisor still feel that what you are doing makes sense, 
I suggest you first get the source code via svn checkout 
svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading 
lme4_0.999375-30.tar.gz from 
http://cran.fhcrc.org/web/packages/lme4/index.html;), then walk through 
the code line by line using the debug function (or browser or the 
debug package). From this, you will likely see either (a) how you can 
do what you want differently to achieve the same result or (b) how to 
modify the code so it does what you want.



Hope this helps.
Spencer


Leigh Ann Starcevich wrote:
Here is a test data set and code. I am including the data set after 
the code and discussion to make reading easier. Apologies for the size 
of the data set, but my problem occurs when there are a lot of Z 
variables. Thanks for your time.


# Enter data below

# Sample code
library(lme4)
mb- length(unique(testsamp$WYear))

# Create the formula for the set of identically distributed random 
effects

Zs- paste(Z,2:(mb-1)),sep=)
Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + (1|, 
randommodel=paste(paste(Zs,collapse=+), 


fittest-lmer(Trendformula, data = testsamp)
summary(fittest)

# Here I get an error because the name of the random effect is too 
long to print

# in the random effects output (I think).
# The error message is: Error in names(bars) - unlist(lapply(bars, 
function(x)
# deparse(x[[3]]))) : 'names' attribute [3] must be the same length as 
the vector [2]


# However, when fewer Z variables are used in the random portion of 
the model,

# there is no error.
# Using only Z2 + … + Z9 for the random intercept

Zs2- paste(Z,2:9,sep=)
Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + 
(1|, randommodel=paste(paste(Zs2,collapse=+), 

fittest2-lmer(Trendformula2, data = testsamp)
summary(fittest2)


# Is there a way to either name the set of iid random effects 
something else or

# to define a random variable that could be used in the model to create a
# random intercept?

# I have had some success in lme, but it would be helpful for my 
simulation if I

# could conduct this analysis with lmer. My model in lme is not correctly
# estimating one of the variance components (random Site intercept).
# I am using:

detach(package:lme4)
library(nlme)
random.model.lme 
-as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),collapse=+))) 



n-dim(testsamp)[1]
testsampgroup - rep(1,n)
testsamp.lme -cbind(testsamp,testsampgroup)
testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup, inner= ~Site,
data= data.frame(testsamp.lme))

fittest3-lme(LogY ~ WYearCen, random= 
pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)), 
pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))),data= 
testgroupSamp)

summary(fittest3)
VarCorr(fittest3)


# Data

testsamp -
structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008,
2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010,
2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012,
2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013,
2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015,
2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016,
2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018,
2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019,
2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021,
2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022),
WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2,
2, 2, 2, 2, 2, 3, 3, 3, 3, 

Re: [R] Naming a random effect in lmer

2009-05-22 Thread William Dunlap
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of spencerg
 Sent: Friday, May 22, 2009 3:01 PM
 To: Leigh Ann Starcevich
 Cc: r-help@r-project.org
 Subject: Re: [R] Naming a random effect in lmer
 
 [ ... elided statistical advice ... ]
 If you and your advisor still feel that what you are doing 
 makes sense, 
 I suggest you first get the source code via svn checkout 
 svn://svn.r-forge.r-project.org/svnroot/lme4 (or by downloading 
 lme4_0.999375-30.tar.gz from 
 http://cran.fhcrc.org/web/packages/lme4/index.html;), then 
 walk through 
 the code line by line using the debug function (or browser or the 
 debug package). From this, you will likely see either (a) 
 how you can 
 do what you want differently to achieve the same result or (b) how to 
 modify the code so it does what you want.

The coding error is right in the error message:
  Error in names(bars) - unlist(lapply(bars, function(x)
deparse(x[[3]])))
and I suspect that traceback() would tell you that came from a call
to lmerFactorList.

That code implicitly assumes that deparse() will produce a scalar
character
vector, but it doesn't if the input expression is complicated enough.
Changing the
deparse(x[[3]])
to
deparse(x[[3]])[1]
or
paste(collapse= , deparse(x[[3]])[1])
would fix it.  The first truncates the name and the second my make a
very
long name.

There is at least one other use of that idiom in the lme4 code and your
dataset and analysis may require that all of them be fixed.
 
 
 
 Hope this helps.
 Spencer
 
 
 Leigh Ann Starcevich wrote:
  Here is a test data set and code. I am including the data set after 
  the code and discussion to make reading easier. Apologies 
 for the size 
  of the data set, but my problem occurs when there are a lot of Z 
  variables. Thanks for your time.
 
  # Enter data below
 
  # Sample code
  library(lme4)
  mb- length(unique(testsamp$WYear))
 
  # Create the formula for the set of identically distributed random 
  effects
  Zs- paste(Z,2:(mb-1)),sep=)
  Trendformula -as.formula(paste(LogY ~ WYear + 
 (1+WYear|Site) + (1|, 
  randommodel=paste(paste(Zs,collapse=+), 
 
  fittest-lmer(Trendformula, data = testsamp)
  summary(fittest)
 
  # Here I get an error because the name of the random effect is too 
  long to print
  # in the random effects output (I think).
  # The error message is: Error in names(bars) - unlist(lapply(bars, 
  function(x)
  # deparse(x[[3]]))) : 'names' attribute [3] must be the 
 same length as 
  the vector [2]
 
  # However, when fewer Z variables are used in the random portion of 
  the model,
  # there is no error.
  # Using only Z2 + ... + Z9 for the random intercept
 
  Zs2- paste(Z,2:9,sep=)
  Trendformula2 -as.formula(paste(LogY ~ WYear + (1+WYear|Site) + 
  (1|, randommodel=paste(paste(Zs2,collapse=+), 
  fittest2-lmer(Trendformula2, data = testsamp)
  summary(fittest2)
 
 
  # Is there a way to either name the set of iid random effects 
  something else or
  # to define a random variable that could be used in the 
 model to create a
  # random intercept?
 
  # I have had some success in lme, but it would be helpful for my 
  simulation if I
  # could conduct this analysis with lmer. My model in lme is 
 not correctly
  # estimating one of the variance components (random Site intercept).
  # I am using:
 
  detach(package:lme4)
  library(nlme)
  random.model.lme 
  
 -as.formula(paste(~-1+,paste(paste(Z,2:(mb-1),sep=),col
 lapse=+))) 
 
 
  n-dim(testsamp)[1]
  testsampgroup - rep(1,n)
  testsamp.lme -cbind(testsamp,testsampgroup)
  testgroupSamp- groupedData(LogY ~ WYearCen|testsampgroup, 
 inner= ~Site,
  data= data.frame(testsamp.lme))
 
  fittest3-lme(LogY ~ WYearCen, random= 
  pdBlocked(list(pdIdent(~-1+WYearCen:as.factor(Site)), 
  pdIdent(~-1+as.factor(Site)), pdIdent(random.model.lme))),data= 
  testgroupSamp)
  summary(fittest3)
  VarCorr(fittest3)
 
 
  # Data
 
  testsamp -
  structure(list(Year = c(2008, 2008, 2008, 2008, 2008, 2008, 2008,
  2009, 2009, 2009, 2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010,
  2010, 2010, 2010, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2012,
  2012, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013,
  2013, 2013, 2014, 2014, 2014, 2014, 2014, 2014, 2014, 2015, 2015,
  2015, 2015, 2015, 2015, 2015, 2016, 2016, 2016, 2016, 2016, 2016,
  2016, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2018, 2018, 2018,
  2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019, 2019, 2019, 2019,
  2020, 2020, 2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021,
  2021, 2021, 2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022),
  WYear = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 2, 2,
  2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4,
  5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7,
  7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10,
  10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12,
  12, 12, 12, 13, 13, 13, 13, 13, 13, 13

[R] Naming a random effect in lmer

2009-05-21 Thread Leigh Ann Starcevich
Dear guRus:

I am using lmer for a mixed model that includes a random intercept for a 
set of effects that have the same distribution, Normal(0, sig2b).  This set 
of effects is of variable size, so I am using an as.formula statement to 
create the formula for lmer.  For example, if the set of random effects has 
dimension 8, then the lmer call is:

Zs- paste(Z,1:mb,sep=)
Trendformula -as.formula(paste(LogY ~ WYear + (1+WYear|Site) +  (1|, 
paste(paste(Zs,collapse=+), 
fit2.4a-lmer(Trendformula, data = testsamp)

which, for mb=8, expands to:

fit1-lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+ Z2 + Z3 + Z4 + 
Z5 + Z6 + Z7 + Z8), data = testsamp)


I have no problems with this.  However, if the set of random effects has a 
dimension of 30, then the lmer call is:

fit2-lmer(LogY ~ WYear + (1 | Site) + (1 | Year) + (1 | Z1+Z2 + Z3 + Z4 + 
Z5 + Z6 + Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 +  Z16 + Z17 + 
Z18 + Z19 + Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ 
Z30), data = testsamp)

In this case, I get an error because the name Z1+Z2 + Z3 + Z4 + Z5 + Z6 + 
Z7 + Z8 + Z9 + Z10 + Z11 + Z12 + Z13 + Z14 + Z15 +  Z16 + Z17 + Z18 + Z19 + 
Z20 + Z21 + Z22 + Z23 + Z24 + Z25 + Z26 + Z27 + Z28 + Z29+ Z30 is too long 
to print in the output.   Is there any way to name the random effect in 
lmer so that the shorter (and more descriptive) name may be used and the 
error avoided?   Or is there a way to combine these into a single variable 
prior to the lmer function call?  In SAS, I am able to parameterize these 
as a Toeplitz structure with bandwidth 1.

Thanks for any help.

Leigh Ann Starcevich
Doctoral student
Oregon State University
Corvallis, Oregon
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.