[R] gamm(): nested tensor product smooths
I'd like to compare tests based on the mixed model representation of additive models, testing among others y=f(x1)+f(x2) vs y=f(x1)+f(x2)+f(x1,x2) (testing for additivity) In mixed model representation, where X represents the unpenalized part of the spline functions and Z the wiggly parts, this would be: y=X%*%beta+ Z_1%*%b_1+ Z_2%*%b_2 vs y=X%*%beta+ Z_1%*%b_1+ Z_2%*%b_2 + Z_12 %*% b_12 where b are random effect vectors and the hypothesis to be tested is H_0: Var(b_12)=0 (= b_12_i == 0 for all i) the problem: gamm() does not seem to support the use of nested tensor product splines, does anybody know how to work around this? example code: (you'll need to source the P-spline constructor from ?p.spline beforehand) ### test1-function(x,z,sx=0.3,sz=0.4) { (pi**sx*sz)*(1.2*exp(-(x-0.2)^2/sx^2-(z-0.3)^2/sz^2)+ 0.8*exp(-(x-0.7)^2/sx^2-(z-0.8)^2/sz^2)) } n-400 x-runif(n);z-runif(n); f - test1(x,z) y - f + rnorm(n)*0.1 b-gam(y~s(x,bs=ps,m=c(3,2))+s(z,bs=ps,m=c(3,2))+te(x,z,bs=c(ps,ps),m=c(3,1),mp=F)) ##works fine b - gamm(y~s(x,bs=ps,m=c(3,2),k=10))+s(z,bs=ps,m=c(3,2),k=10)+te(x,z,bs=c(ps,ps),m=c(3,1),k=c(5,5))) # : Singularity in backsolve at level 0, block 1 b - gamm(y~te(x,z,bs=c(ps,ps),m=c(3,0),k=c(5,5))) #works fine # Additionallly, i'd like to work with # just one smoothness parameter # for the interaction, but mp=F doesn't work either: b - gamm(y~s(x,bs=ps,m=c(3,2),k=10)+s(z,bs=ps,m=c(3,2),k=10)+te(x,z,bs=c(ps,ps),m=c(3,1),k=c(5,5),mp=F)) # Tensor product penalty rank appears to be too low # which i don't really understand, since the penalty matrix for the # p-spline should be S-t(diff(diag(5),diff=1))%*%(diff(diag(5),diff=1)) # penalizing deviations from constant function # for the tensor product spline -- m=c(3,1) S # [,1] [,2] [,3] [,4] [,5] #[1,]1 -1000 #[2,] -12 -100 #[3,]0 -12 -10 #[4,]00 -12 -1 #[5,]000 -11 #and tensor.prod.penalties(list(S,S)) #which should be the penalties for the tensor prod smooth, # has rank 20- that's not enough ? sum(eigen(S[[2]])$values1e-10) # [1] 20 ## I hope some of the experts out there can help me out- any hints would be appreciated the problem is not caused by the p-splines, it also douesn't work with bs=tp. thank you for your time. -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix manipulation with a for loop
Your for-loops aren't set up properly: try for(i in 1:NCOL(F.zoo)) HTH, Fabian Scheipl -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] gamm(): degrees of freedom of the fit
I wonder whether any of you know of an efficient way to calculate the approximate degrees of freedom of a gamm() fit. Calculating the smoother/projection matrix S: y - \hat y and then its trace by sum(eigen(S))$values is what I've been doing so far- but I was hoping there might be a more efficient way than doing the spectral decomposition of an NxN-matrix. The degrees of freedom associated with the linear and smooth parts can be read from the gam-part of the fitted gamm()-model, but not the degrees of freedom associated with the random effects. For lmer()-models, there is hatTrace() but I know of nothing of the sort for lme()-Objects. Maybe somebody has some code or ideas to share? -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Conservative ANOVA tables in lmer
Dear list, I have written functions to perform simulation-based tests of HO: Var(Random Effects)=0, beta_foo=0 in linear mixed models based on the exact distribution of the LR- and Restricted LR-test statistic (see research presented in Crainiceanu, Ruppert: LRT in LMM with 1 variance component, J. R. Statist. Soc. B (2004), 66, Part 1, pp. 165-185 ) -they are about 15-20 times faster than the parametric bootstrap. At the moment, the exact distributions are only easily simulated for the case of 1 single variance component/random effect and i.i.d. errors; feasible approximations for the multivariate case are currently being investigated and will be implemented soon. the syntax looks something like this: #begin code: data(sleepstudy) summary(sleepstudy) #Effect of sleep deprivation on reaction time xyplot(Reaction~Days|Subject, data=sleepstudy) m-lmer(Reaction~Days+(Days-1|Subject),data=sleepstudy) #random slopes, but no random intercept #doesna make sense, but it's just an example summary(m) #test for individual heterogeneity based on RLRT #(No restrictions on fixed effects under H0) #HO: lambda=Var(RandomSlopes)/Var(error)==0 == Var(RandomSlopes)==0 t3-RLRT1SimTest(m, lambda0=0, seed=5, nsim=1) #will produce output: #HO: lambda = 0 ; p-value = 0 # observed lambda = 0.06259639 #test for influence of Days based on LRT #(restriction on fixed efects: beta_Days==0) m0-lm(Reaction~1,data=sleepstudy) t4-LRT1SimTest(m, m0, seed=10, nsim=1) #will produce output: #Model under HO: Reaction ~ 1 ; #Model under HA: Reaction ~ Days + (Days - 1 | Subject) ; # p-value = 0 # observed lambda = 0.06259639 #end code If you are interested in using these functions i'll be glad to send them to you- be aware, however, that you can only use them for testing 1 Random Effect vs. no Random Effect in a model with i.i.d. errors!! The plan is to put them in a package beginning next year and use them as a basis for an (exact) anova.lmer() method. Greetings, Fabian Scheipl -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Conservative ANOVA tables in lmer
Dear list, I have written functions to perform simulation-based tests of HO: Var(Random Effects)=0, beta_foo=0 in linear mixed models based on the exact distribution of the LR- and Restricted LR-test statistic (see research presented in Crainiceanu, Ruppert: LRT in LMM with 1 variance component, J. R. Statist. Soc. B (2004), 66, Part 1, pp. 165-185 ) -they are about 15-20 times faster than the parametric bootstrap. At the moment, the exact distributions are only easily simulated for the case of 1 single variance component/random effect and i.i.d. errors; feasible approximations for the multivariate case are currently being investigated and will be implemented soon. the syntax looks something like this: #begin code: data(sleepstudy) summary(sleepstudy) #Effect of sleep deprivation on reaction time xyplot(Reaction~Days|Subject, data=sleepstudy) m-lmer(Reaction~Days+(Days-1|Subject),data=sleepstudy) #random slopes, but no random intercept #doesna make sense, but it's just an example summary(m) #test for individual heterogeneity based on RLRT #(No restrictions on fixed effects under H0) #HO: lambda=Var(RandomSlopes)/Var(error)==0 == Var(RandomSlopes)==0 t3-RLRT1SimTest(m, lambda0=0, seed=5, nsim=1) #will produce output: #HO: lambda = 0 ; p-value = 0 # observed lambda = 0.06259639 #test for influence of Days based on LRT #(restriction on fixed efects: beta_Days==0) m0-lm(Reaction~1,data=sleepstudy) t4-LRT1SimTest(m, m0, seed=10, nsim=1) #will produce output: #Model under HO: Reaction ~ 1 ; #Model under HA: Reaction ~ Days + (Days - 1 | Subject) ; # p-value = 0 # observed lambda = 0.06259639 #end code If you are interested in using these functions i'll be glad to send them to you- be aware, however, that you can only use them for testing 1 Random Effect vs. no Random Effect in a model with i.i.d. errors!! The plan is to put them in a package beginning next year and use them as a basis for an (exact) anova.lmer() method. Greetings, Fabian Scheipl -- -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lmer(): specifying i.i.d random slopes for multiple covariates
Dear readers, Is it possible to specify a model y=X %*% beta + Z %*% b ; b=(b_1,..,b_k) and b_i~N(0,v^2) for i=1,..,k that is, a model where the random slopes for different covariates are i.i.d., in lmer() and how? In lme() one needs a constant grouping factor (e.g.: all=rep(1,n)) and would then specify: lme(fixed= y~X, random= list(all=pdIdent(~Z-1)) ) , that´s how it's done in the lmeSplines- documentation. Any hints would be greatly appreciated- I'm trying to write a suite of functions that will transform additive models into their mixed-effects representation like lmeSplines but using lmer() instead of lme(). Thank you for your time, Fabian Scheipl -- Echte DSL-Flatrate dauerhaft für 0,- Euro*. Nur noch kurze Zeit! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Loss of numerical precision from conversion to list ?
Thank you both very much for your help. Peter Dalgaard is right- i didn't consider the fact that elementwise multiplication is column-wise rather than row-wise. Sorry for taking up timespace with such a trivial mistake. Original-Nachricht Datum: 21 Jul 2006 10:07:31 +0200 Von: Peter Dalgaard [EMAIL PROTECTED] An: Duncan Murdoch [EMAIL PROTECTED] Betreff: Re: [R] Loss of numerical precision from conversion to list ? Duncan Murdoch [EMAIL PROTECTED] writes: R tries to use the maximum precision (64 bit mantissa) in the floating ... Or perhaps your problem has nothing to do with this; I didn't really look at it in detail. It hasn't. I was off speculating about sum vs rowSums too, but: num.v- rowSums(((lambda-lambda0)*mu*w.k.sq[,-(K+1)])/(1+lambda*mu)) Inside this, we have mu*w.k.sq[,-(K+1)] . mu is a vector of length 27, and w.k.sq has 10 rows and 28 *columns*. Column-major storage and vector recycling kicks in... If mu has identical elements (never mind the magnitude), of course, the recycling doesn't matter. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 -- Echte DSL-Flatrate dauerhaft für 0,- Euro*! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Loss of numerical precision from conversion to list ?
I´m working on an R-implementation of the simulation-based finite-sample null-distribution of (R)LR-Test in Mixed Models (i.e. testing for Var(RandomEffect)=0) derived by C. M. Crainiceanu and D. Ruppert. I'm in the beginning stages of this project and while comparing quick and dirty grid-search-methods and more exact optim()/optimize()-based methods to find the maximum of a part of the RLR-Test-Statistic i stumbled upon the following problem: It seems to me that R produces different results depending on whether originally identical numbers involved in the exact same computations are read from a matrix or a list. (I need both in order to do quick vectorized computation for the grid-search with matrices and list-based computation so that i can put the function to be maximized in something like mapply(...,optim(foo),...)- I can elaborate if desired) However, the problem goes away once a number involved in the computation is set from almost zero (e-15) to 4. I'm completely mystified by this; especially since this number that I change is NOT one of the numbers that are switched from matrix to list. Here's the code: library(nlme) data(Orthodont)#108 dental measurements on 27 subjects # m1-lme(distance~age,random=~1|Subject,data=Orthodont) # summary(m1) # ... # Random effects: # Formula: ~1 | Subject # (Intercept) Residual # StdDev:2.114724 1.431592 - lambda.REML=2.114^2/1.431^2 = 2.182382 #DesignMatrix for fixed Effects X-cbind(rep(1,108),Orthodont$age) #DesignMatrix of RandomEffects Z-matrix(data=c(rep(1,4),rep(0,108)),nrow=108,ncol=27) #Corr(RanEf)^0.5 = 27 x 27 Identity, since RandomIntercepts are independent sqrt.Sigma-diag(27) K-27 #number of subjects/ random intercepts n-nrow(X) p-ncol(X) lambda0 - 2.182382 #actually not a sensible choice as Null-Hypothesis, but that doesn't pertain to the problem #Projection-Matrix for Fixed-Effects-Model: Y - errors P0=diag(n)-X%*%solve((t(X)%*%X))%*%t(X) mu-eigen(sqrt.Sigma%*%t(Z)%*%P0%*%Z%*%sqrt.Sigma)$values # mu # [1] 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 #[11] 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 #[21] 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 4.0e+00 5.77316e-15 # ! Notice the last (27th) value very close to 0 nsim-10 set.seed(10) #nsim x K array of ChiSq(1)-variates w.k.sq.mat-matrix(rchisq(nsim*K,1),nrow=nsim) #nsim x 1 array of ChiSq(n-p-K)-variates w.sum2-rchisq(nsim,n-p-K) ### vectorized computation of nsim=10 realizations ### of a part of the RLR-statistic under the Null: w.k.sq- cbind(w.k.sq.mat,w.sum2) #nsim x (K+1) #vector-based results: num.v- rowSums(((lambda-lambda0)*mu*w.k.sq[,-(K+1)])/(1+lambda*mu)) den.v- rowSums(((1+lambda0*mu)*w.k.sq[,-(K+1)]) / (1+lambda*mu)) + w.k.sq[,K+1] ### list-based computation of nsim=10 realizations ### of a part of the RLR-statistic under the Null: w.k.sq-list() length(w.k.sq)-nsim #put the nsim rows into list-slots: for(i in 1:nsim) w.k.sq[[i]]-c(w.k.sq.mat[i,],w.sum2[i]) num.l-numeric(0) den.l-numeric(0) for(i in 1:nsim) { num.l[i]-sum(((lambda-lambda0)*mu*w.k.sq[[i]][-(K+1)])/(1+lambda*mu)) #exactly analogous to num.v den.v, except list-elements instead of vector den.l[i]-sum(((1+lambda0*mu)*w.k.sq[[i]][-(K+1)]) / (1+lambda*mu)) + w.k.sq[[i]][K+1] } # Now the actual problem: # notice the discrepancies between the results from vectorized computation # and the results from list-based computation # Since discrepancies disappear if mu[27] is changed # from 5.77316e-15 to 4, i'm guessing somewhere in the conversion to # list there must be a loss of precision or is there an entirely # different problem? num.l # [1] -25.93322 -17.65486 -18.80239 -19.49974 num.v # [1] -23.84733 -17.62233 -27.22975 -19.50294 den.l # [1] 117.30246 92.59041 92.91491 112.90113 ... den.v # [1] 115.21657 92.55789 101.34228 112.90433 ... #now i set mu[27]-4 #and reran the computation of num.l /.v and den.l /.v from above: num.l # [1] -26.25565 -17.67423 -27.47259 -20.97961 ... num.v # [1] -26.25565 -17.67423 -27.47259 -20.97961 ... den.l # [1] 117.62489 92.60979 101.58511 114.38100 ... den.v # [1] 117.62489 92.60979 101.58511 114.38100 ... what i would like to know now is: 1) which of the two calculations yields a more precise result? or rather: 2) how can i avoid these discrepancies in the future since i need to be able to compare these two methods? and, most importantly, 3) what in R.A.Fisher's name is happening here? version information: Version 2.3.1 (2006-06-01) i386-pc-mingw32 .Machine$double.eps is 2.220446e-16 (does it matter?) thanks for your time, -- Fabian Scheipl [EMAIL PROTECTED] Feel free – 10 GB Mailbox, 100 FreeSMS/Monat ... __ R-help@stat.math.ethz.ch mailing list https