Re: [R] hypothesis testing for rank-deficient linear models

2006-01-12 Thread dhinds
[EMAIL PROTECTED] wrote:
 Take the following example:

 a - rnorm(100)
 b - trunc(3*runif(100))
 g - factor(trunc(4*runif(100)),labels=c('A','B','C','D'))
 y - rnorm(100) + a + (b+1) * (unclass(g)+2)
...

Here's a cleaned-up function to compute estimable within-group effects
for rank deficient models.  For the above data, it could be invoked
as:

 m - lm(y~a+b*g, subset=(b==0 | g!='B'))
 subgroup.effects(m, 'b',  g=c('A','B','C','D'))
  g Estimate  StdError  t.value  p.value
1 A 2.779167 0.4190213  6.63252 4.722978e-09
2 B   NANA   NA   NA
3 C 4.572431 0.3074402 14.87258 6.226445e-24
4 D 5.920809 0.3502251 16.90572 3.995266e-27

Again, I'm not sure whether this is a good approach, or whether there
is an easier way using existing R functions.  One problem is figuring
exactly which terms are not estimable from the available data.  My
hack using alias() is not satisfactory and I've already run into cases
where it fails.  But I'm having trouble coming up with a more general,
correct test?

-- David Hinds



subgroup.effects - function(model, term, ...)
{
my.coef - function(n)
{
contr - lapply(names(args), function(i)
contr.treatment(args[[i]], unclass(gr[n,i])))
names(contr) - names(args)
u - update(model, formula=model$formula,
data=model$data, contrasts=contr)
uc - coef(summary(u))[term,]
if (any(is.na(coef(u))) 
any(!is.na(alias(u)$Complete)))
uc[1:4] - NA
uc
}
args - list(...)
gr - expand.grid(...)
d - data.frame(gr, t(sapply(1:nrow(gr), my.coef)))
names(d) - c(names(gr),'Estimate','StdError','t.value','p.value')
d
}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] hypothesis testing for rank-deficient linear models

2006-01-10 Thread dhinds
Take the following example:

a - rnorm(100)
b - trunc(3*runif(100))
g - factor(trunc(4*runif(100)),labels=c('A','B','C','D'))
y - rnorm(100) + a + (b+1) * (unclass(g)+2)
m - lm(y~a+b*g)
summary(m)

Here b is discrete but not treated as a factor.  I am interested in
computing the effect of b within groups defined by the factor g.  One
way to do that is with the estimable() function from gmodels:

ct - cbind(1,contr.treatment(4))
dimnames(ct)[[2]] - c('b','b:gB','b:gC','b:gD')
estimable(m, ct, conf.int=0.95)

Another way is the contrast() function from the Design library:

dd - datadist(a,b,g)
options(datadist='dd')
o - ols(y~a+b*g)
contrast(o, list(b=1,g=levels(g)), list(b=0,g=levels(g)))

Now take the situation where there are empty cells in the b x g table,
so that the model is rank deficient, i.e.:

m - lm(y~a+b*g, subset=(b==0 | g!='B'))
summary(m)

Now there's trouble.  The estimable() function and Design library do
not seem to handle rank deficient models well.  Here is my current
best attempt to get what I want:

my.coef - function(n)
{
ct - contr.treatment(levels(g), base=n)
u - update(m, contrasts=list(g=ct))
uc - coef(summary(u))['b',]
if (any(is.na(coef(u)))  any(!is.na(alias(u)$Complete)))
uc[1:4] - NA
uc
}
t(sapply(1:4, my.coef))

This seems adequate for my immediate purposes and I can clean it up to
be more general.  But I'm wondering if there is an easier/better way
using existing R facilities?  Here, I'm doing more model fitting than
necessary, but (for now) speed is not a factor and I was having
trouble writing a concise solution that worked on just the original
model object.

-- David Hinds

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html