Re: [R] Running geeglm unstructured corstr

2007-09-07 Thread Ulrich Halekoh
Dear Niccolo,
 

a) Your example program works for me. On a Pentium 4 (3 GHz, 1 GB RAM),
the model fit takes
   user  system elapsed 
 146.92    0.76  143.32 
 
library(geepack)
set.seed(299)
header <- rep.int(seq(1:615), sample(seq(1:19), size=615, replace=TRUE))
inr <- rlnorm(length(header), 0.8434359, 0.3268392)
group_cod <- sample(c(0,1), size=length(header), replace=TRUE)
inside <- sample(c(0,1), size=length(header), replace=TRUE)
gee.frame <- data.frame(header, inr, group_cod, inside)
 
geeglm.model <- geeglm(inside ~ group_cod, family=binomial, data=gee.frame,
                       id=header, corstr="unstructured")
 
 
 

Version of R:
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          5.1
year           2007
month          06
day            27
svn rev        42083
language       R
version.string R version 2.5.1 (2007-06-27)
 

b)

> - for corstr = ar1
>
> In my dataset I also have a variable weeks, that I use to specify waves in
> the geeglm. When I use this, the output for an autoregressive gee model
> gives me only the estimates, when I don't use it everything's alright. The
> problem's that I have unbalanced observations, so the use of waves could
> take this into account while estimating parameters. Is this a conflict
> between these options, or what?

 
The argument waves should work in conjunction with the ar1 working
correlation.
Make sure that the observations are ordered with respect to the cluster id,
as described in ?geeglm for the 'id' argument.
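
A minimal sketch of that ordering step, assuming a data frame with an id column and a weeks column (the data, the column names and the model below are made up for illustration and are not the poster's dataset). The rows are sorted by cluster and then by wave before the fit; the geeglm call only runs if geepack is installed:

```r
# Hypothetical unbalanced data: 50 subjects, each seen in 2 to 5 of 5 weeks.
set.seed(1)
dat <- do.call(rbind, lapply(1:50, function(i) {
  data.frame(id = i, weeks = sort(sample(1:5, size = sample(2:5, 1))))
}))
dat$x <- rbinom(nrow(dat), 1, 0.5)
dat$y <- rbinom(nrow(dat), 1, 0.5)
dat <- dat[sample(nrow(dat)), ]   # shuffle rows to mimic unsorted input

# Sort by cluster id, then by wave within each cluster, before fitting.
dat <- dat[order(dat$id, dat$weeks), ]

# Fit with an ar1 working correlation and waves (skipped without geepack).
if (requireNamespace("geepack", quietly = TRUE)) {
  fit <- geepack::geeglm(y ~ x, family = binomial, data = dat,
                         id = id, waves = weeks, corstr = "ar1")
  print(summary(fit))
}
```

If the standard errors are still missing after sorting, the ordering was probably not the cause and the problem lies elsewhere.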
 
 
 
c) If you build your own fixed working correlation, there are two
issues:
 
 1) You must leave out clusters of size 1 when constructing the
zcor vector.
 
 If all the other clusters have the same size, you can use geeglm with that
zcor vector.
   Example:
 
# With corstr="fixed", clusters with only 1 observation must be
# left out of the construction of the zcor vector.
# The following example is for the case where the clusters of size larger
# than 1 have no missing observations and are all of the same size.
 
library(geepack)
data(seizure)
## Diggle, Liang, and Zeger (1994) pp166-168, compare Table 8.10
seiz.l <- reshape(seizure,
                  varying=list(c("base", "y1", "y2", "y3", "y4")),
                  v.names="y", times=0:4, direction="long")
seiz.l <- seiz.l[order(seiz.l$id, seiz.l$time),]
seiz.l$t <- ifelse(seiz.l$time == 0, 8, 2)
seiz.l$x <- ifelse(seiz.l$time == 0, 0, 1)
 
 
 
## defining fixed correlation matrix
cor.fixed <- matrix(c(1, 0.5, 0.25, 0.125, 0.125,
                      0.5, 1, 0.25, 0.125, 0.125,
                      0.25, 0.25, 1, 0.5, 0.125,
                      0.125, 0.125, 0.5, 1, 0.125,
                      0.125, 0.125, 0.125, 0.125, 1), 5, 5)
 
zcor <- rep(cor.fixed[lower.tri(cor.fixed)], 59)
g1 <- geeglm(y ~ offset(log(t)) + x + trt + x:trt, id = id,
             data = seiz.l, family = poisson,
             corstr = "fixed", zcor = zcor)
 
# reducing clusters 1, 3 and 58 to only one observation
seiz.reduc <- subset(seiz.l, !((id==1 | id==3 | id==58) & time > 0))
 
# zcor is only constructed for clusters with size larger than 1
n.larger.one <- sum(table(seiz.reduc$id) > 1)
zcor <- c(rep(cor.fixed[lower.tri(cor.fixed)], n.larger.one))
 

g2 <- geeglm(y ~ offset(log(t)) + x + trt + x:trt, id = id,
             data = seiz.reduc, family = poisson,
             corstr = "fixed", zcor = zcor)
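
The length rule behind this can be checked by hand. The geese.fit error quoted in the original question says that zcor needs sum(clusz * (clusz - 1) / 2) rows, i.e. one entry per within-cluster pair of observations, so a cluster of size 1 contributes nothing. A small sketch with made-up cluster ids (the id vector is hypothetical):

```r
# Hypothetical cluster ids: three clusters of sizes 1, 3 and 5.
id <- c(1, 2, 2, 2, 3, 3, 3, 3, 3)

# Cluster sizes, in the order the clusters appear in the sorted data.
clusz <- as.vector(table(id))

# One zcor entry per pair of observations within a cluster:
# 0 pairs for size 1, 3 pairs for size 3, 10 pairs for size 5.
expected_rows <- sum(clusz * (clusz - 1) / 2)
print(expected_rows)
```

Comparing this number with length(zcor) before calling geeglm shows whether the vector was built for the right set of clusters.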
 
 
 

2) If the clusters have different sizes, then you must construct the
zcor vector according to the missing-data structure;
 using the waves argument to specify the missing structure does
not work.
 

Example:
 

# With corstr="fixed" and clusters with an unbalanced number of
# observations, the waves argument does not work; the zcor vector
# must be constructed 'by hand'.
library(geepack)
data(seizure)
## Diggle, Liang, and Zeger (1994) pp166-168, compare Table 8.10
seiz.l <- reshape(seizure,
                  varying=list(c("base", "y1", "y2", "y3", "y4")),
                  v.names="y", times=0:4, direction="long")
seiz.l <- seiz.l[order(seiz.l$id, seiz.l$time),]
seiz.l$t <- ifelse(seiz.l$time == 0, 8, 2)
seiz.l$x <- ifelse(seiz.l$time == 0, 0, 1)
 
# transform time such that the initial time is 1
seiz.l$time <- seiz.l$time + 1
 
# taking only a subset of the data, such that the data are unbalanced
# with respect to cluster size
set.seed(88)
seiz <- subset(seiz.l, !((id==1 | id==3 | id==58) & time > 2))
 
## Construction of a fixed correlation matrix
cor.fixed <- matrix(c(1, 0.7, 0.5, 0.25, 0.12,
                      0.7, 1, 0.71, 0.125, 0.125,
                      0.5, 0.71, 1, 0.29, 0.123,
                      0.25, 0.125, 0.29, 1, 0.119,
                      0.120, 0.125, 0.123, 0.119, 1), 5, 5)
 
# The zcor-vector is constructed only for times which exist in the data
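
One way such a vector could be built is sketched below. This is an illustration under assumptions, not code from the original message: the toy data frame and the helper zcor_for_cluster are hypothetical, and the pair ordering simply follows the cor.fixed[lower.tri(cor.fixed)] convention used in the balanced example. For each cluster, only the rows and columns of the correlation matrix that match its observed times are kept.

```r
# Same fixed correlation matrix as in the example above (5 time points).
cor.fixed <- matrix(c(1, 0.7, 0.5, 0.25, 0.12,
                      0.7, 1, 0.71, 0.125, 0.125,
                      0.5, 0.71, 1, 0.29, 0.123,
                      0.25, 0.125, 0.29, 1, 0.119,
                      0.120, 0.125, 0.123, 0.119, 1), 5, 5)

# Hypothetical toy data, sorted by id and time: cluster 1 observed at
# times 1-2, cluster 2 at times 1-5, cluster 3 at time 3 only.
toy <- data.frame(id   = c(1, 1, 2, 2, 2, 2, 2, 3),
                  time = c(1, 2, 1, 2, 3, 4, 5, 3))

# Hypothetical helper: correlations for all pairs of observed times in
# one cluster; a cluster of size 1 contributes nothing.
zcor_for_cluster <- function(times, cormat) {
  if (length(times) < 2) return(numeric(0))
  sub <- cormat[times, times, drop = FALSE]
  sub[lower.tri(sub)]
}

# Concatenate the per-cluster pieces in cluster order.
zcor <- unlist(lapply(split(toy$time, toy$id), zcor_for_cluster,
                      cormat = cor.fixed), use.names = FALSE)
print(length(zcor))
```

The same helper could be applied to split(seiz$time, seiz$id), since time was shifted to start at 1 and can therefore index cor.fixed directly.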

[R] Running geeglm unstructured corstr

2007-09-05 Thread Niccolò Bassani
Dear R users,
I've got a serious problem running some gee functions, and I really can't
fix it. My dataframe is made of several rows and columns (say 7600 x 15),
like the one below:

header   inr    ...   inside   group_cod
1        2.25   ...   0        1
1        3.46   ...   0        0
1        ..     ...   1        0
1        ..     ...   1        ...
1        ..     ...   .        ...
         ..     ...   .        ...
         ..     ...   .        ...
         ..     ...   .        ...
         ..     ...   .        ...
         ..     ...   .        ...
615      ..     ...   .        ...
615      ..     ...   .        ...
615      ..     ...   .        ...

As you can see I've got several repeated measures, resulting in clusters (
i.e. ID) of different sizes. You can get a frame like this by running this
code:

header <- rep.int(seq(1:615), sample(seq(1:19), size=615, replace=TRUE))
inr <- rlnorm(length(header), 0.8434359, 0.3268392)
group_cod <- sample(c(0,1), size=length(header), replace=TRUE)
inside <- sample(c(0,1), size=length(header), replace=TRUE)
gee.frame <- data.frame(header, inr, group_cod, inside)

When I try running a longitudinal model with geeglm, the
corstr="unstructured" option gives me an assertion failure of R itself (a
sort of error window), while the ar1 option for corstr returns a model with
estimates only, and no s.e. or Wald test. The same goes for the anova
method. I removed subjects with only one observation, but the problem is
still the same. This is the code used:

- for corstr = unstructured

geeglm.model <- geeglm(inside ~ group_cod, family=binomial, data=gee.frame,
                       id=header, corstr="unstructured")

This model gives me an error (not the kind printed in the R GUI: the
program simply seems to stop without producing anything) and no result,
forcing me to quit R. The same happens when I use a user-defined matrix
built with the genZcor function using corstrv=4 (i.e. unstructured). My
initial thought was that the problem came from the highly unbalanced
observations and from having too many correlation parameters to estimate,
but restricting the cluster size did not bring any improvement.
Is there a problem with the code, or could it be due to the data?

- for corstr = ar1

In my dataset I also have a variable weeks, that I use to specify waves in
the geeglm. When I use this, the output for an autoregressive gee model
gives me only the estimates, when I don't use it everything's alright. The
problem's that I have unbalanced observations, so the use of waves could
take this into account while estimating parameters. Is this a conflict
between these options, or what?

In addition, I've built the empirical correlation structure: a 19x19
matrix, i.e. with nrow equal to the maximum cluster size. When I use this
as zcor the program gives me

 geeglm.fixed <- geeglm(inside ~ weeks + group_cod + età, family=binomial,
                        data=frame.model, id=header, waves=weeks,
                        zcor=corr.gee, corstr="userdefined")
Error in geese.fit(xx, yy, id, offset, soffset, w, waves = waves, zsca,  :

nrow(zcor) need to be equal sum(clusz * (clusz - 1) / 2) for
unstructured or userdefined corstr.

I really don't understand the meaning of this: I thought the correlation
matrix should have a number of rows equal to the maximum cluster size. The
details in the online help also say something about the dimension of zcor,
but I can't make sense of that either...
I hope I've been clear.
Thanks in advance
niccolò


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.