[R] gamm(): nested tensor product smooths

2006-11-07 Thread Fabian Scheipl
I'd like to compare tests based on the mixed model representation of additive
models, testing, among others,

y=f(x1)+f(x2) vs y=f(x1)+f(x2)+f(x1,x2)
(testing for additivity)

In mixed model representation, where X represents the unpenalized part of the 
spline functions and Z the wiggly parts, this would be:

y = X%*%beta + Z_1%*%b_1 + Z_2%*%b_2
vs
y = X%*%beta + Z_1%*%b_1 + Z_2%*%b_2 + Z_12%*%b_12

where b are random effect vectors and the hypothesis to be tested is

H_0: Var(b_12) = 0  (<=> b_12_i = 0 for all i)
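
(For concreteness, here's a minimal sketch of the reparameterization I mean --
this is not mgcv code; B and S are placeholder names for one smooth's basis
and penalty matrix. The penalty's null space supplies the unpenalized columns,
its range space, rescaled, supplies Z with b ~ N(0, v^2 I):)

eS <- eigen(S, symmetric = TRUE)
r <- sum(eS$values > 1e-10)   # rank of the penalty
# unpenalized part: null-space columns of the penalty
X.f <- B %*% eS$vectors[, -(1:r)]
# "wiggly" part: range-space columns, scaled so that b is i.i.d.
Z <- B %*% eS$vectors[, 1:r] %*% diag(1/sqrt(eS$values[1:r]))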

The problem: gamm() does not seem to support nested tensor product smooths.
Does anybody know how to work around this?

Example code (you'll need to source the P-spline constructor from ?p.spline
beforehand):

###
library(mgcv)

test1 <- function(x, z, sx = 0.3, sz = 0.4)
 { (pi**sx*sz)*(1.2*exp(-(x-0.2)^2/sx^2-(z-0.3)^2/sz^2)+
   0.8*exp(-(x-0.7)^2/sx^2-(z-0.8)^2/sz^2))
 }
n <- 400
x <- runif(n); z <- runif(n)
f <- test1(x, z)
y <- f + rnorm(n)*0.1

 
b <- gam(y ~ s(x, bs="ps", m=c(3,2)) + s(z, bs="ps", m=c(3,2)) +
         te(x, z, bs=c("ps","ps"), m=c(3,1), mp=FALSE))

## works fine

b <- gamm(y ~ s(x, bs="ps", m=c(3,2), k=10) + s(z, bs="ps", m=c(3,2), k=10) +
          te(x, z, bs=c("ps","ps"), m=c(3,1), k=c(5,5)))

# Error: Singularity in backsolve at level 0, block 1

b <- gamm(y ~ te(x, z, bs=c("ps","ps"), m=c(3,0), k=c(5,5)))  # works fine

# Additionally, I'd like to work with just one smoothness parameter
# for the interaction, but mp=FALSE doesn't work either:

b <- gamm(y ~ s(x, bs="ps", m=c(3,2), k=10) + s(z, bs="ps", m=c(3,2), k=10) +
          te(x, z, bs=c("ps","ps"), m=c(3,1), k=c(5,5), mp=FALSE))

# Error: Tensor product penalty rank appears to be too low

# which I don't really understand, since the penalty matrix for the
# P-spline should be

S <- t(diff(diag(5), diff=1)) %*% (diff(diag(5), diff=1))
# penalizing deviations from a constant function
# for the tensor product spline -- m=c(3,1)
S
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1   -1    0    0    0
# [2,]   -1    2   -1    0    0
# [3,]    0   -1    2   -1    0
# [4,]    0    0   -1    2   -1
# [5,]    0    0    0   -1    1

# and
S <- tensor.prod.penalties(list(S, S))

# which should give the penalties for the tensor product smooth --
# each has rank 20; is that not enough?
 
sum(eigen(S[[2]])$values > 1e-10)
# [1] 20
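
# As a further check (a sketch using the same objects as above): the combined
# penalty's null space should contain only the constant function, so the sum
# of the two component penalties should have rank 24 out of 25:
sum(eigen(S[[1]] + S[[2]], symmetric = TRUE)$values > 1e-10)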

##

I hope some of the experts out there can help me out -- any hints would be
appreciated. The problem is not caused by the P-splines; it also doesn't
work with bs="tp".

thank you for your time.


Re: [R] matrix manipulation with a for loop

2006-11-02 Thread Fabian Scheipl

Your for-loops aren't set up properly:

try 

for(i in 1:NCOL(F.zoo))

HTH, Fabian Scheipl




[R] gamm(): degrees of freedom of the fit

2006-11-01 Thread Fabian Scheipl
I wonder whether any of you know of an efficient way to calculate the 
approximate degrees of freedom of a gamm() fit.

Calculating the smoother/projection matrix S: y -> \hat{y} and then its trace
by sum(eigen(S)$values) is what I've been doing so far -- but I was hoping
there might be a more efficient way than doing the spectral decomposition of
an N x N matrix.

The degrees of freedom associated with the linear and smooth parts can be read
from the gam part of the fitted gamm() model, but not the degrees of freedom
associated with the random effects. For lmer() models there is hatTrace(), but
I know of nothing of the sort for lme() objects.
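
Two shortcuts I have been wondering about (a sketch only; trace.est and Su are
made-up names, not from any package): if S can be formed at all, its trace
needs no spectral decomposition, and if S is too large to form explicitly, a
randomized (Hutchinson-type) estimate only needs products S %*% u:

edf <- sum(diag(S))   # trace directly from the diagonal

trace.est <- function(Su, n, nprobe = 50) {
  # Su: a function returning S %*% u, e.g. by applying the fitted smoother
  # to u as if it were a response (an assumption, details left open)
  mean(replicate(nprobe, { u <- sample(c(-1, 1), n, replace = TRUE)
                           sum(u * Su(u)) }))
}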

Maybe somebody has some code or ideas to share?





[R] Conservative ANOVA tables in lmer

2006-09-08 Thread Fabian Scheipl

Dear list,

I have written functions to perform simulation-based tests of
H0: Var(Random Effects) = 0, beta_foo = 0
in linear mixed models, based on the exact distribution of the LR and
restricted LR test statistics (see the research presented in
Crainiceanu, Ruppert: "Likelihood ratio tests in linear mixed models with
one variance component", J. R. Statist. Soc. B (2004), 66, Part 1,
pp. 165-185). They are about 15-20 times faster than the parametric
bootstrap.

At the moment, the exact distributions are only easily simulated for the
case of a single variance component/random effect and i.i.d. errors; feasible
approximations for the multivariate case are currently being investigated
and will be implemented soon.

The syntax looks something like this:

#begin code:

library(lme4)     # for lmer() and the sleepstudy data
library(lattice)  # for xyplot()

data(sleepstudy)
summary(sleepstudy)  # effect of sleep deprivation on reaction time
xyplot(Reaction ~ Days | Subject, data = sleepstudy)
m <- lmer(Reaction ~ Days + (Days - 1 | Subject), data = sleepstudy)
# random slopes, but no random intercept --
# doesn't make much sense, but it's just an example
summary(m)

# test for individual heterogeneity based on the RLRT
# (no restrictions on fixed effects under H0)
# H0: lambda = Var(RandomSlopes)/Var(error) == 0  <=>  Var(RandomSlopes) == 0

t3 <- RLRT1SimTest(m, lambda0 = 0, seed = 5, nsim = 1)

# will produce output:
# H0: lambda = 0 ; p-value = 0
# observed lambda = 0.06259639

# test for influence of Days based on the LRT
# (restriction on fixed effects: beta_Days == 0)

m0 <- lm(Reaction ~ 1, data = sleepstudy)
t4 <- LRT1SimTest(m, m0, seed = 10, nsim = 1)

# will produce output:
# Model under H0:  Reaction ~ 1 ;
# Model under HA:  Reaction ~ Days + (Days - 1 | Subject) ;
# p-value = 0
# observed lambda = 0.06259639

#end code 
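
(For reference, a minimal sketch of the kind of parametric bootstrap the
timing above is compared against, written for the sleepstudy example; pboot
is a made-up name, and for a proper LRT both fits should use ML -- details
are left out here:)

pboot <- function(nsim = 1000) {
  m1 <- lmer(Reaction ~ Days + (Days - 1 | Subject), data = sleepstudy)
  m0 <- lm(Reaction ~ 1, data = sleepstudy)
  obs <- as.numeric(2 * (logLik(m1) - logLik(m0)))
  stat <- replicate(nsim, {
    d <- sleepstudy
    d$Reaction <- unlist(simulate(m0))   # simulate the response under H0
    mA <- lmer(Reaction ~ Days + (Days - 1 | Subject), data = d)
    2 * (as.numeric(logLik(mA)) - as.numeric(logLik(lm(Reaction ~ 1, data = d))))
  })
  mean(stat >= obs)   # bootstrap p-value
}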
  
If you are interested in using these functions, I'll be glad to send them to
you -- be aware, however, that you can only use them for testing one random
effect vs. no random effect in a model with i.i.d. errors!

The plan is to put them in a package beginning next year and to use them as
a basis for an (exact) anova.lmer() method.

Greetings,
Fabian Scheipl


[R] lmer(): specifying i.i.d random slopes for multiple covariates

2006-08-24 Thread Fabian Scheipl
Dear readers,

Is it possible to specify a model

y = X %*% beta + Z %*% b ;  b = (b_1, ..., b_k) with b_i ~ N(0, v^2) for i = 1, ..., k

that is, a model where the random slopes for different covariates are i.i.d.,
in lmer(), and if so, how?

In lme() one needs a constant grouping factor (e.g. all = rep(1, n)) and would
then specify
lme(fixed = y ~ X, random = list(all = pdIdent(~ Z - 1))),
which is how it's done in the lmeSplines documentation.
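
Spelled out as runnable code, a toy sketch with made-up data just to make the
lme() formulation above concrete (following the lmeSplines docs, Z is stored
as a matrix column of the data frame):

library(nlme)
set.seed(1)
n <- 100; k <- 5
X <- runif(n)                      # one fixed-effect covariate
Z <- matrix(runif(n*k), n, k)      # k covariates with i.i.d. random slopes
y <- 1 + 2*X + Z %*% rnorm(k, sd = 0.5) + rnorm(n, sd = 0.1)
dat <- data.frame(y = as.vector(y), X = X, all = rep(1, n))
dat$Z <- Z                         # matrix column, as in lmeSplines
fit <- lme(fixed = y ~ X, random = list(all = pdIdent(~ Z - 1)), data = dat)
summary(fit)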

Any hints would be greatly appreciated -- I'm trying to write a suite of
functions that will transform additive models into their mixed-effects
representation, like lmeSplines, but using lmer() instead of lme().

Thank you for your time,
Fabian Scheipl


Re: [R] Loss of numerical precision from conversion to list ?

2006-07-21 Thread Fabian Scheipl
Thank you both very much for your help.

Peter Dalgaard is right -- I didn't consider the fact that elementwise
multiplication is column-wise rather than row-wise.
Sorry for taking up time and space with such a trivial mistake.
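
For the archives, a tiny demonstration of the effect:

m <- matrix(1:6, nrow = 2)   # 2 x 3, stored column-major
v <- c(10, 100)              # recycled down the columns, not across the rows
m * v
#      [,1] [,2] [,3]
# [1,]   10   30   50
# [2,]  200  400  600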



-------- Original Message --------
Date: 21 Jul 2006 10:07:31 +0200
From: Peter Dalgaard [EMAIL PROTECTED]
To: Duncan Murdoch [EMAIL PROTECTED]
Subject: Re: [R] Loss of numerical precision from conversion to list ?

Duncan Murdoch [EMAIL PROTECTED] writes:

> R tries to use the maximum precision (64 bit mantissa) in the floating
> ...
> Or perhaps your problem has nothing to do with this; I didn't really
> look at it in detail.

It hasn't. I was off speculating about sum vs rowSums too, but:

> num.v <- rowSums(((lambda-lambda0)*mu*w.k.sq[,-(K+1)])/(1+lambda*mu))

Inside this, we have mu*w.k.sq[,-(K+1)]. mu is a vector of length 27,
and w.k.sq has 10 rows and 28 *columns*. Column-major storage and
vector recycling kick in... If mu has identical elements (never mind
the magnitude), of course, the recycling doesn't matter.
 
 -- 
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45)
 35327907



[R] Loss of numerical precision from conversion to list ?

2006-07-20 Thread Fabian Scheipl

I'm working on an R implementation of the simulation-based finite-sample
null distribution of the (R)LR test in mixed models (i.e. testing for
Var(RandomEffect) = 0) derived by C. M. Crainiceanu and D. Ruppert.

I'm in the beginning stages of this project, and while comparing quick-and-dirty
grid-search methods with more exact optim()/optimize()-based methods for finding
the maximum of a part of the RLR test statistic, I stumbled upon the following
problem:

It seems to me that R produces different results depending on whether
originally identical numbers involved in the exact same computations are read
from a matrix or from a list.
(I need both in order to do quick vectorized computation for the grid search
with matrices, and list-based computation so that I can put the function to be
maximized in something like mapply(..., optim(foo), ...) -- I can elaborate if
desired.)

However, the problem goes away once a number involved in the computation is
changed from almost zero (5.77e-15) to 4. I'm completely mystified by this,
especially since the number I change is NOT one of the numbers that are
switched from matrix to list.

Here's the code:

library(nlme)
data(Orthodont)  # 108 dental measurements on 27 subjects
# m1 <- lme(distance ~ age, random = ~1|Subject, data = Orthodont)
# summary(m1)
# ...
# Random effects:
#  Formula: ~1 | Subject
#         (Intercept) Residual
# StdDev:    2.114724 1.431592  -> lambda.REML = 2.114^2/1.431^2 = 2.182382

# design matrix for the fixed effects
X <- cbind(rep(1,108), Orthodont$age)
# design matrix of the random effects (27 random intercepts, 4 observations each)
Z <- matrix(data = c(rep(1,4), rep(0,108)), nrow = 108, ncol = 27)

# Corr(RanEf)^0.5 = 27 x 27 identity, since the random intercepts are independent
sqrt.Sigma <- diag(27)

K <- 27        # number of subjects / random intercepts
n <- nrow(X)
p <- ncol(X)
lambda0 <- 2.182382  # actually not a sensible choice as null hypothesis, but
                     # that doesn't pertain to the problem
lambda <- 1          # NOTE: lambda is used below but its value was not given
                     # in the original post; set arbitrarily here so the code
                     # runs (the printed results below depend on it)

# projection matrix for the fixed-effects model: Y -> errors
P0 <- diag(n) - X %*% solve(t(X) %*% X) %*% t(X)
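
# (added sketch, not in the original post: P0 should be an orthogonal
#  projection, so P0 %*% P0 - P0 should vanish up to rounding error)
max(abs(P0 %*% P0 - P0))   # expect something of the order 1e-15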

mu <- eigen(sqrt.Sigma %*% t(Z) %*% P0 %*% Z %*% sqrt.Sigma)$values
# mu
#  [1] 4.00000e+00 4.00000e+00 4.00000e+00 ...   (entries 1 to 26 all equal 4)
# [27] 5.77316e-15
# ! notice the last (27th) value, very close to 0

nsim <- 10
set.seed(10)
# nsim x K matrix of ChiSq(1) variates
w.k.sq.mat <- matrix(rchisq(nsim*K, 1), nrow = nsim)
# nsim-vector of ChiSq(n-p-K) variates
w.sum2 <- rchisq(nsim, n-p-K)

### vectorized computation of nsim=10 realizations
### of a part of the RLR statistic under the null:
w.k.sq <- cbind(w.k.sq.mat, w.sum2)   # nsim x (K+1)
# vector-based results:
num.v <- rowSums(((lambda-lambda0)*mu*w.k.sq[,-(K+1)])/(1+lambda*mu))
den.v <- rowSums(((1+lambda0*mu)*w.k.sq[,-(K+1)]) / (1+lambda*mu)) +
         w.k.sq[,K+1]

### list-based computation of nsim=10 realizations
### of a part of the RLR statistic under the null:
w.k.sq <- list()
length(w.k.sq) <- nsim
# put the nsim rows into list slots:
for(i in 1:nsim) w.k.sq[[i]] <- c(w.k.sq.mat[i,], w.sum2[i])
num.l <- numeric(0)
den.l <- numeric(0)
for(i in 1:nsim)
{
  # exactly analogous to num.v & den.v, except list elements instead of vectors
  num.l[i] <- sum(((lambda-lambda0)*mu*w.k.sq[[i]][-(K+1)])/(1+lambda*mu))
  den.l[i] <- sum(((1+lambda0*mu)*w.k.sq[[i]][-(K+1)]) / (1+lambda*mu)) +
              w.k.sq[[i]][K+1]
}
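
# (added sketch: quantify the disagreement between the two computations
#  directly instead of eyeballing the printed values)
all.equal(num.v, num.l)
max(abs(den.v - den.l))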

#  Now the actual problem:
#  notice the discrepancies between the results from the vectorized computation
#  and the results from the list-based computation.
#  Since the discrepancies disappear if mu[27] is changed
#  from 5.77316e-15 to 4, I'm guessing that somewhere in the conversion to
#  list there must be a loss of precision -- or is there an entirely
#  different problem?


num.l
# [1] -25.93322 -17.65486 -18.80239 -19.49974 
num.v
# [1] -23.84733 -17.62233 -27.22975 -19.50294 

den.l
# [1] 117.30246  92.59041  92.91491 112.90113 ...
den.v
# [1] 115.21657  92.55789 101.34228 112.90433 ...

# now I set
mu[27] <- 4
# and rerun the computation of num.l/.v and den.l/.v from above:

num.l
# [1] -26.25565 -17.67423 -27.47259 -20.97961 ...
num.v
# [1] -26.25565 -17.67423 -27.47259 -20.97961 ...
den.l
# [1] 117.62489  92.60979 101.58511 114.38100 ...
den.v
# [1] 117.62489  92.60979 101.58511 114.38100 ...

What I would like to know now is:

1) Which of the two calculations yields the more precise result?
or rather:
2) How can I avoid these discrepancies in the future, since I need to be able
to compare these two methods?
and, most importantly,
3) What in R. A. Fisher's name is happening here?

version information:

Version 2.3.1 (2006-06-01) 
i386-pc-mingw32 
.Machine$double.eps is 2.220446e-16 (does it matter?)

thanks for your time,



-- 
Fabian Scheipl

