[R] Plot of Fine and Gray model

2013-01-02 Thread Laura Bonnett
Dear all,

Happy New year!

I have used the 'crr' function to fit the 'proportional subdistribution
hazards' regression model described in Fine and Gray (1999).

dat1 is a three column dataset where:
- ccr is the time to event variable
- Crcens is an indicator variable equal to 0 if the event was achieved, 1
if the event wasn't achieved due to death, or 2 if the event wasn't achieved
due to disease progression
- pre is an indicator variable (and the covariate of interest)

I want to investigate if pre has a significant impact on time to event for
patients who died and for those who suffered disease progression (as well
as its impact on the overall time to event).

The code I have used is as follows:

fitd <- crr(dat1$ccr, dat1$Crcens, dat1$pre, failcode=1, cencode=0)
fitp <- crr(dat1$ccr, dat1$Crcens, dat1$pre, failcode=2, cencode=0)

In these cases I get p-values of 0 and 0.66 respectively.

What I would now like to do, is to plot two cumulative incidence curves -
one for the 'pre' variable status for patients who didn't achieve the event
due to death and one for those who didn't achieve it due to progression.

How can I do this?  I can only see things involving plot.predict.crr which
doesn't seem to be what I need?
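For reference, a minimal sketch of one way such curves might be obtained (assuming the cmprsk package and the dat1 columns described above; cuminc() gives nonparametric cumulative incidence curves per group and failure type):

library(cmprsk)
ci <- cuminc(ftime   = dat1$ccr,      # time to event
             fstatus = dat1$Crcens,   # 1 = death, 2 = progression
             group   = dat1$pre,      # split the curves by the covariate of interest
             cencode = 0)             # 0 treated as censoring, matching the crr calls above
plot(ci, xlab="Time", ylab="Cumulative incidence")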


Many thanks,
Laura

Using Windows 7 and R 2.14.1




[R] Multiple forest plots with the same x-axis and colour coded estimates and lines

2011-08-22 Thread Laura Bonnett
Dear all,

I would like to draw three forest plots to represent results at years 1, 2
and 3.  I have the data as point estimates and 95% confidence intervals.
Using the following code I can get three basic forest plots - the first of
which has the table of results.  I have to plot each separately as the usual
par(mfrow=c(3,1)) does not work with the function forestplot within rmeta.
I can easily put them next to one another within powerpoint or similar
though so that's not a problem.

# Read data in
dattabrem1 <- read.table("risk factors rem1.txt", header=TRUE)  # year 1 results
dattabrem2 <- read.table("risk factors rem2.txt", header=TRUE)  # year 2 results
dattabrem3 <- read.table("risk factors rem3.txt", header=TRUE)  # year 3 results

# Set up table of results for the three plots
plotextr <-
rbind(c("Age","Gender","Seizures","Treatment"),
      c("10","M","2","CBZ"),  c("10","F","2","CBZ"),
      c("10","M","2","LTG"),  c("10","F","2","LTG"),
      c("10","M","10","CBZ"), c("10","F","10","CBZ"),
      c("10","M","10","LTG"), c("10","F","10","LTG"),
      c("40","M","2","CBZ"),  c("40","F","2","CBZ"),
      c("40","M","2","LTG"),  c("40","F","2","LTG"),
      c("40","M","10","CBZ"), c("40","F","10","CBZ"),
      c("40","M","10","LTG"), c("40","F","10","LTG"),
      c("75","M","2","CBZ"),  c("75","F","2","CBZ"),
      c("75","M","2","LTG"),  c("75","F","2","LTG"),
      c("75","M","10","CBZ"), c("75","F","10","CBZ"),
      c("75","M","10","LTG"), c("75","F","10","LTG"))

# 1 year results
estimate1y <- c(NA, dattabrem1$HR)
lowerd1y   <- c(NA, dattabrem1$CIlower)
upperd1y   <- c(NA, dattabrem1$CIupper)
# 2 year results
estimate2y <- c(NA, dattabrem2$HR)
lowerd2y   <- c(NA, dattabrem2$CIlower)
upperd2y   <- c(NA, dattabrem2$CIupper)
# 3 year results
estimate3y <- c(NA, dattabrem3$HR)
lowerd3y   <- c(NA, dattabrem3$CIlower)
upperd3y   <- c(NA, dattabrem3$CIupper)

# Draw forest plots
forestplot(plotextr, estimate1y, lowerd1y, upperd1y,
           is.summary=c(TRUE, rep(FALSE,24)), zero=1, align="c",
           xlab="Remission @ 1 year")
forestplot(plotext2r, estimate2y, lowerd2y, upperd2y, zero=1,
           xlab="Remission @ 2 years")
forestplot(plotext2r, estimate3y, lowerd3y, upperd3y, zero=1,
           xlab="Remission @ 3 years")

Having managed to obtain these basic plots I need the x-axes to be the
same.  Usually xlim would be sufficient to do this, but that argument is not
available within forestplot, so does anyone know how I can make the x-axes
the same over all the plots?
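For what it is worth, one possible workaround (a sketch only; it assumes rmeta's clip argument effectively fixes the plotted interval, which I have not verified): compute a common range over all three years and pass it to each call.

xrange <- range(c(lowerd1y, upperd1y, lowerd2y, upperd2y,
                  lowerd3y, upperd3y), na.rm=TRUE)
forestplot(plotextr, estimate1y, lowerd1y, upperd1y,
           is.summary=c(TRUE, rep(FALSE,24)),
           clip=xrange, zero=1, xlab="Remission @ 1 year")
# ...and similarly for the 2- and 3-year plots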

Additionally, within each plot, two treatments are compared (the top two
blocks are treatment 1, the next two blocks are treatment 2, the next two
are treatment 1, etc.).  Is there any way I can colour code the boxes to show this?

Many thanks,
Laura
P.S. I'm using Windows, R 2.9.2




[R] Confidence interval for difference in Harrell's c statistics (or equivalently Somers' D statistics)

2011-05-05 Thread Laura Bonnett
Dear All,

I am trying to calculate a 95% confidence interval for the difference in two
c statistics (or equivalently D statistics).  In Stata I gather that this
can be done using the lincom command.  Is there anything similar in R?

As you can see below I have two datasets (that are actually two independent
subsets of the same data) and the respective c statistics for the variables
in both cases.  What I would now like to do is to show that there is no
statistically significant difference between the statistics (i.e. between
the dev and val datasets).

Any help would be much appreciated.

> rdev <- rcorrcens(Surv(stimes1,eind1)~gendat1+neurodat1)
> rdev

Somers' Rank Correlation for Censored Data
Response variable: Surv(stimes1, eind1)

              C    Dxy  aDxy    SD    Z      P    n
gendat1   0.534  0.069 0.069 0.017 3.98 0.0001 1500
neurodat1 0.482 -0.036 0.036 0.011 3.18 0.0015 1500

> rval <- rcorrcens(Surv(stimes2,eind2)~gendat2+neurodat2)
> rval

Somers' Rank Correlation for Censored Data
Response variable: Surv(stimes2, eind2)

              C    Dxy  aDxy    SD    Z     P    n
gendat2   0.543  0.085 0.085 0.017 4.94 0e+00 1500
neurodat2 0.481 -0.038 0.038 0.011 3.44 6e-04 1500
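For reference, a rough sketch of the sort of calculation I have in mind (my own back-of-envelope approach rather than a built-in R function; it assumes the two subsets are independent so the variances add, and uses C = Dxy/2 + 0.5):

d.dxy  <- 0.085 - 0.069               # difference in Dxy for gender (val - dev)
se.dxy <- sqrt(0.017^2 + 0.017^2)     # SDs reported by rcorrcens above
ci.dxy <- d.dxy + c(-1.96, 1.96)*se.dxy
d.c    <- d.dxy/2                     # the same difference on the C scale
ci.c   <- ci.dxy/2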

Many thanks,
Laura
P.S. I'm using Windows XP, R 2.9.2




Re: [R] Confidence interval for difference in Harrell's c statistics (or equivalently Somers' D statistics)

2011-05-05 Thread Laura Bonnett
Dear David,

Thank you for your reply.  I have come across rcorrp.cens before.  However,
I'm not sure it does quite what I want it to.  It seems to compare whether
one predictor is more concordant than another within the same survival
function.  I want to see whether one predictor is more concordant than
another over two survival functions, hence I fitted two separate rcorrcens calls.

E.g. if I have a development data set with a variable for age and a
validation data set for age then I want to know if the concordance is the
same over the development and validation data sets.

Thank you,
Laura

On 5 May 2011 16:09, David Winsemius dwinsem...@comcast.net wrote:


 On May 5, 2011, at 8:20 AM, Laura Bonnett wrote:

  Dear All,

 I am trying to calculate a 95% confidence interval for the difference in
 two
 c statistics (or equivalently D statistics).  In Stata I gather that this
 can be done using the lincom command.  Is there anything similar in R?


 Have you looked at rcorrp.cens {Hmisc}?

 snipped


[[alternative HTML version deleted]]



 Please post in plain text.

 --
 David Winsemius, MD
 West Hartford, CT






[R] repeat write.table with the same code many times

2010-11-30 Thread Laura Bonnett
Dear all,

I am using R version 2.9.2 in Windows.

I would like to output the results of a function I have written to a .txt
file.  I know that I can do this by using the code
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE) etc.  However, I
would like to bootstrap my function 'boothd' several times and get each
vector of results as a new line in my text file.  Is there a way to do this?

I usually just set the code up to do bootstrapping around the function (i.e.
I perform the replications within the function and output a matrix of
results).  However in the case of 'boothd' I am dealing with rare events and
so sometimes I get an empty vector as output which makes mathematical
sense.  Unfortunately this causes the bootstrapping code to crash.

I'm hoping that writing the results out line by line will remove this
problem.  I have tried rep(write.table(...),15) say but because of the
occasional null vector the table is not written.

Thank you for any help you can give.

By the way,
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE)
write.table(boothd(10),"boothd10.txt",sep="\t",append=TRUE) etc. works but if
I want to look at 1000 replications this is very time consuming!
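For reference, a sketch of the sort of loop I have in mind (it assumes boothd() returns either a results vector or an empty vector for the problematic replications):

for (i in 1:1000) {
  res <- try(boothd(10), silent=TRUE)
  if (!inherits(res, "try-error") && length(res) > 0) {
    write.table(t(res), "boothd10.txt", sep="\t", append=TRUE,
                col.names=FALSE, row.names=FALSE)   # one row per replication
  }
}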

Thanks,
Laura




[R] Fractional Polynomials - Hazard Ratios and Relative Hazard Plots

2010-07-21 Thread Laura Bonnett
Dear All,

I am using Windows and R version 2.9.2 with libraries cmprsk, mfp and
Design.

I have a dataset with approximately 1700 patients (1 row per patient) and I
have 12 outcomes, three of which are continuous.  I have performed
univariate analyses to see if any factors are associated with a higher
likelihood of the event of interest (achieving 12 month remission from
epileptic seizures) and also an analysis adjusted for multiple variables.

I have tried log and fractional polynomial (FP) transformations of each
continuous variable.  In the case of age, used for the example below, the FP
transformation led to the best model fit according to the AIC.  I have
therefore applied this transformation for all analyses.

To begin with I have fitted a Cox model stratified by randomisation period,
rpa, (either before or after a certain date).

fit1 <- mfp(Surv(Remtime,Rcens) ~ fp(age) + strata(rpa), family=cox,
            data=nearma, select=0.05, verbose=TRUE)

I would like two things from this model: hazard ratios and an associated
hazard ratio plot.  I am aware that the hazard ratios produced from a
fractional polynomial transformation are not to be used directly (i.e. those
obtained from summary(coxfitf1)).  Instead the derived functional form of
the variable should be used to estimate hazard ratios post hoc.  I have
attached a word document explaining how hazard ratios and confidence
intervals can be derived, and have given a worked example for the variable age.
The univariate results are:

Age (years)   HR (95% CI)
≤10           1.00
(10 to 25]    1.00 (1.00 to 1.00)
(25 to 37]    0.99 (0.99 to 1.00)
(37 to 50]    0.99 (0.99 to 0.99)
(50 to 71]    0.99 (0.98 to 0.99)
>71           0.98 (0.97 to 0.99)

To create a plot of the relative hazard I have used the code:
plot(nearma$age, predict(fit1,type="risk"), xlab="Age", ylab="Relative Hazard")
The produced plot is attached.
As you can clearly see, the hazard ratios above and the relative hazard plot
do not agree.
This is also the case for the other two continuous variables that have been
transformed via the FP approach.
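One way to put the plot on the same footing as the table might be to express the fitted curve relative to a chosen reference age rather than the sample average; a sketch (assuming predict(fit1, type="lp") behaves as for a coxph fit, and taking age 10 as an arbitrary reference):

lp  <- predict(fit1, type="lp")
ref <- lp[which.min(abs(nearma$age - 10))]   # subject closest to the reference age
ord <- order(nearma$age)
plot(nearma$age[ord], exp(lp - ref)[ord], type="l",
     xlab="Age", ylab="Hazard ratio relative to age 10")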

The hazard ratios for age using the model adjusted for multiple variables
are as follows, which do coincide with the plot:

Age (years)   HR (95% CI)
≤10           1.00
(10 to 25]    0.86 (0.77 to 0.95)
(25 to 37]    0.78 (0.65 to 0.94)
(37 to 50]    0.80 (0.64 to 1.00)
(50 to 71]    1.01 (0.81 to 1.25)
>71           1.61 (1.27 to 2.05)

Can anyone therefore explain why the univariate hazard ratios do not
coincide with the relative hazard plot and yet the hazard ratios from the
multivariable model do?  I know that the calculations are correct for both
sets of hazard ratios.

Thank you for any help you can provide as I am at a loss to explain the
difference between the plot and the calculations - they should, after all,
be saying the same thing!

Thank you,
Laura


[R] Relative Risk/Hazard Ratio plots for continuous variables

2010-05-25 Thread Laura Bonnett
Dear all,

I am using Windows and R 2.9.2 for my analyses.  I have a large dataset and
I am particularly interested in looking at time to an event for a continuous
variable.  I would like to produce a plot of log(relative risk) or relative
risk (also known as hazard ratio) against the continuous variable.

I have spent a long time looking for advice on how to do this but my search
has proved fruitless - sorry if I've missed something obvious.  It seems
that there are options such as muhaz, survfit, coxph and cph that may enable
some plots to be produced, but none that specifically produces the relative
risk plot.

In addition to the survival analysis, I have incorporated the mfp function
(from package mfp).

I currently use code such as,

library(mfp)
library(Design)

coxfit1 <- coxph(Surv(rtime,rcens)~cts, data=data1)
or
coxfit2 <-
mfp(Surv(rtime,rcens)~fp(cts), family=cox, data=data1, select=0.05, verbose=TRUE)

Neither plot(coxfit1) nor plot(coxfit2) produces the plot of relative risk vs.
the continuous variable that I need.
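For reference, one possibility I have not yet tried (a sketch; it assumes termplot() accepts a coxph fit and draws the term's contribution to the log relative risk):

termplot(coxfit1, data=data1, terms="cts", se=TRUE)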

Can anyone help?

Thank you,
Laura




[R] Baseline survival estimate

2009-12-16 Thread Laura Bonnett
Dear R-help,

I am trying to obtain the baseline survival estimate of a fitted Cox model
(S_0 (t)).  I know that previous posts have said use 'basehaz' but this
gives the baseline hazard function and not the baseline survival estimate.
Is there a way to obtain the baseline survival estimate or do I have to use
the formula which does something like S(t) = exp[- the integral from 0 to t
of h(u) du]?

Thank you for your assistance,

Laura

fit1 <-
coxph(Surv(tsecond/365,seccens)~stroke(smess1)+othnd(smess1)+relat(smess1)+
      asleep(smess1)+abeeg1(smess1)+treat(smess1), data=smess1)
basehaz(fit1)

where stroke is a function which creates a binary variable from the dataset
smess1 etc.
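For reference, a sketch of the direct calculation (it relies only on the relation S_0(t) = exp(-H_0(t)); basehaz() returns the cumulative baseline hazard, and centered=FALSE refers the estimate to covariate values of zero rather than their means):

bh <- basehaz(fit1, centered=FALSE)
S0 <- exp(-bh$hazard)               # baseline survival estimate
plot(bh$time, S0, type="s", xlab="Time (years)", ylab="Baseline survival")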




[R] Using stepAIC to produce a p-value for when a particular variable was taken out of the model

2009-11-30 Thread Laura Bonnett
Dear all,

I have decided after much deliberation to use backward elimination and
forward selection to produce a multivariate model.  Having read about the
problems with choosing selection values I have chosen to base my decisions
of inclusion and exclusion on the AIC and am consequently using the stepAIC
function.  This post however does not relate to whether or not this is the
correct decision!

I am interested in determining what the p-value was when a particular
variable was taken out of the model.  If I choose trace=TRUE then I
obviously can see each step of the elimination process together with the AIC
and the degrees of freedom for each variable and for the null model.  When
the stepwise process is complete it is possible to call the anova value
which shows deviances and associated degrees of freedom for variables left
in the model.  Therefore I could use this information to calculate
p-values.  However, is it possible to do the same for the variables which
were thrown out of the model?

There doesn't seem to be any literature on how to use AICs to get p-values
as the distribution isn't quite a chi-squared one.  Does anyone therefore
know how I can determine the p-value for a variable when it was taken out of
the model?
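For reference, the kind of check I have been trying (a sketch; fit.previous is a placeholder for the model as it stood just before the variable of interest was removed): drop1() with a likelihood-ratio test reports, for each term, the p-value associated with removing it.

drop1(fit.previous, test="Chisq")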

Thank you,

Laura




Re: [R] Using stepAIC to produce a p-value for when a particular variable was taken out of the model

2009-11-30 Thread Laura Bonnett
I've just realised that this is a very silly post as I can't read!!!  The
output in the anova is the excluded variables - very sorry!

Laura

2009/11/30 Laura Bonnett l.j.bonn...@googlemail.com

 Dear all,

 I have decided after much deliberation to use backward elimination and
 forward selection to produce a multivariate model.  Having read about the
 problems with choosing selection values I have chosen to base my decisions
 of inclusion and exclusion on the AIC and am consequently using the stepAIC
 function.  This post however does not relate to whether or not this is the
 correct decision!

 I am interested in determining what the p-value was when a particular
 variable was taken out of the model.  If I choose trace=TRUE then I
 obviously can see each step of the elimination process together with the AIC
 and the degrees of freedom for each variable and for the null model.  When
 the stepwise process is complete it is possible to call the anova value
 which shows deviances and associated degrees of freedom for variables left
 in the model.  Therefore I could use this information to calculate
 p-values.  However, is it possible to do the same for the variables which
 were thrown out of the model?

 There doesn't seem to be any literature on how to use AICs to get p-values
 as the distribution isn't quite a chi-squared one.  Does anyone therefore
 know how I can determine the p-value for a variable when it was taken out of
 the model?

 Thank you,

 Laura





[R] Unusual error while using coxph

2009-10-05 Thread Laura Bonnett
Hi all,

I'm very confused!  I've been using the same code for many weeks without any
bother for various covariates.  I'm now looking at another covariate and
whenever I run the code you can see below I get an error message:
Error in rep(0, nrow(data)) : invalid 'times' argument

This code works:
# remove 'missing' cases from data #
snearma <- function(data)
{
for(i in 1:nrow(data)){
if(is.na(data$all.sp)[i])
data$all.sp[i] <- 0
if(is.na(data$all.cp)[i])
data$all.cp[i] <- 0
if(is.na(data$all.scgtc)[i])
data$all.scgtc[i] <- 0
if(is.na(data$all.tc)[i])
data$all.tc[i] <- 0
if(is.na(data$all.ta)[i])
data$all.ta[i] <- 0
if(is.na(data$all.aa)[i])
data$all.aa[i] <- 0
if(is.na(data$all.m)[i])
data$all.m[i] <- 0
if(is.na(data$all.otc)[i])
data$all.otc[i] <- 0
if(is.na(data$all.o)[i])
data$all.o[i] <- 0
}
dummy <- rep(0,nrow(data))
for(i in 1:nrow(data)){
if(data$all.sp[i]==0 && data$all.cp[i]==0 && data$all.scgtc[i]==0 &&
data$all.tc[i]==0 && data$all.ta[i]==0 && data$all.aa[i]==0 &&
data$all.m[i]==0 && data$all.otc[i]==0 && data$all.o[i]==0)
dummy[i] <- i
}
return(data[-dummy,])
}
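# (For reference, a compact sketch of the same NA-replacement and
#  row-dropping steps, column names assumed from the function above:
#  cols <- c("all.sp","all.cp","all.scgtc","all.tc","all.ta",
#            "all.aa","all.m","all.otc","all.o")
#  data[cols][is.na(data[cols])] <- 0            # set missing counts to 0
#  data <- data[rowSums(data[cols] != 0) > 0, ]  # drop rows where all are 0
# )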
# create smaller dataset with missing cases removed #
nmarma <- snearma(nearma)
# create short stratification variable #
nmrpa <- randp(nmarma)
# create censoring variable for the covariate #
stypea <- function(data)
{
for(i in 1:nrow(data)){
if(is.na(data$all.sp)[i])
data$all.sp[i] <- 0
if(is.na(data$all.cp)[i])
data$all.cp[i] <- 0
if(is.na(data$all.scgtc)[i])
data$all.scgtc[i] <- 0
if(is.na(data$all.tc)[i])
data$all.tc[i] <- 0
if(is.na(data$all.ta)[i])
data$all.ta[i] <- 0
if(is.na(data$all.aa)[i])
data$all.aa[i] <- 0
if(is.na(data$all.m)[i])
data$all.m[i] <- 0
if(is.na(data$all.otc)[i])
data$all.otc[i] <- 0
if(is.na(data$all.o)[i])
data$all.o[i] <- 0
}
stype <- rep(0,nrow(data))
for(i in 1:nrow(data)){
if(data$all.type[i]=="P" && data$all.sp[i]>=1 && data$all.scgtc[i]==0)
stype[i] <- 1
if(data$all.type[i]=="P" && data$all.cp[i]>=1 && data$all.scgtc[i]==0)
stype[i] <- 1
if(data$all.type[i]=="P" && data$all.scgtc[i]>=1)
stype[i] <- 2
if(data$all.type[i]=="P" && data$all.sp[i]==0 && data$all.cp[i]==0 &&
data$all.scgtc[i]==0 && data$all.otc[i]==0 && data$all.o[i]==0)
stype[i] <- 2
if(data$all.type[i]=="G" && data$all.tc[i]>=1 && data$all.m[i]==0 &&
data$all.ta[i]==0 && data$all.aa[i]==0)
stype[i] <- 3
if(data$all.type[i]=="G" && data$all.m[i]>=1 && data$all.tc[i]==0)
stype[i] <- 3
if(data$all.type[i]=="G" && data$all.ta[i]>=1 && data$all.tc[i]==0)
stype[i] <- 3
if(data$all.type[i]=="G" && data$all.aa[i]>=1 && data$all.tc[i]==0)
stype[i] <- 3
if(data$all.type[i]=="G" && data$all.m[i]>=1 && data$all.tc[i]>=1)
stype[i] <- 3
if(data$all.type[i]=="G" && data$all.ta[i]>=1 && data$all.tc[i]>=1)
stype[i] <- 3
if(data$all.type[i]=="G" && data$all.aa[i]>=1 && data$all.tc[i]>=1)
stype[i] <- 3
if(data$all.otc[i]>=1)
stype[i] <- 4
if(data$all.o[i]>=1)
stype[i] <- 4
}
return(stype)
}
fita <- survdiff(Surv(rem.Remtime,rem.Rcens)~stypea(nmarma)+strata(nmrpa),
                 data=nmarma)
fita
lrpvalue=1-pchisq(fita$chisq,3)
xx <- cuminc(nmarma$rem.Remtime/365, nmarma$rem.Rcens, stypea(nmarma),
             strata=nmrpa)
plot(xx, curvlab=c("Simple/Complex","SC+2gentc or 2gentc",
     "TC or My/Ab or My/Ab+gentc","Other"), lty=1, color=c(2:5),
     xlab="Time from randomisation (years)",
     ylab="Probability of 12-month remission",
     main="Time to 12-month remission", wh=c(2.0,0.4))
text(4, 0.5, cex=0.85,
     paste("Log-rank test=", round(fita$chisq,3),
           "p-value=", round(lrpvalue,3)))

whereas this doesn't:
par <- function(data)
{
dummy <- rep(0,nrow(data))
for(i in 1:nrow(data)){
if(is.na(data$all.frontlob)[i] && is.na(data$all.templob)[i] &&
is.na(data$all.parlob)[i] && is.na(data$all.occlob)[i] &&
is.na(data$all.notspec)[i])
dummy[i] <- i
}
for(i in 1:nrow(data)){
if(is.na(data$all.frontlob)[i])
data$all.frontlob[i] <- "N"
if(is.na(data$all.templob)[i])
data$all.templob[i] <- "N"
if(is.na(data$all.parlob)[i])
data$all.parlob[i] <- "N"
if(is.na(data$all.occlob)[i])
data$all.occlob[i] <- "N"
if(is.na(data$all.notspec)[i])
data$all.notspec[i] <- "N"
}
for(i in 1:nrow(data)){
if(data$all.frontlob[i]=="N" && data$all.templob[i]=="N" &&
data$all.parlob[i]=="N" && data$all.occlob[i]=="N" &&
data$all.notspec[i]=="N")
dummy[i] <- i
if(data$all.frontlob[i]=="Y" && data$all.parlob[i]=="Y")
dummy[i] <- i
}
return(data[-dummy,])
}
shortpar <- par(nearma)
shortrpa <- randp(shortpar)
lobe <- function(data)
{
for(i in 1:nrow(data)){
if(is.na(data$all.frontlob)[i])
data$all.frontlob[i] <- "N"
if(is.na(data$all.templob)[i])
data$all.templob[i] <- "N"
if(is.na(data$all.occlob)[i])
data$all.occlob[i] <- "N"
if(is.na(data$all.parlob)[i])
data$all.parlob[i] <- "N"
if(is.na(data$all.notspec)[i])
data$all.notspec[i] <- "N"
}
lobe <- rep(0,nrow(data))
for(i in 1:nrow(data)){
if(data$all.frontlob[i]=="Y")

Re: [R] crr - computationally singular

2009-07-06 Thread Laura Bonnett
Hi Everyone,

Thank you for all your comments and suggestions.

I determined that I had a full rank model matrix by using the code:
 qr(covaeb)$rank
This is 17 which is equal to the number of covariates in the matrix, covaeb.

I cannot invert the model matrix using 'solve' as my matrix is not
square.  In the matrices ending in a, there are 1677 rows and 15
columns/covariates while in the matrices ending in b, there are 701
rows and 17 columns.

Thank you,

Laura

2009/6/26 Ravi Varadhan rvarad...@jhmi.edu:

 How did you determine that you have full rank model matrix comprising 17
 predictors?  Are you able to invert the model matrix using `solve'?  If not,
 you still have collinearity problem.

 If you are, then the problem might be in the Newton's method used by `crr'
 to solve the partial-likelihood optimization.  The hessian matrix of the
 parameters might be singular during the iterations.  If this is the case,
 your best bet would be to just simplify the model, i.e. use fewer
 predictors.

 Ravi.

 

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: rvarad...@jhmi.edu

 Webpage:
 http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html



 
 


 -Original Message-
 From: Laura Bonnett [mailto:l.j.bonn...@googlemail.com]
 Sent: Friday, June 26, 2009 6:22 AM
 To: Ravi Varadhan
 Cc: r-help@r-project.org
 Subject: Re: [R] crr - computationally singular

 Dear Sir,

 Thank you for your response.  You were correct, I had 1 linearly dependent
 column.  I have solved this problem and now the rank of 'covaeb' is 17
 (qr(covaeb)$rank = 17).  However, I still get the same error message when I
 use covaeb in the 'crr' function.

 fit=crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0)
 8 cases omitted due to missing values
 Error in drop(.Call(La_dgesv, a, as.matrix(b), tol, PACKAGE = base)) :
  system is computationally singular: reciprocal condition number =
 3.45905e-25

 Are there any other reasons why this may be happening?

 Thank you,

 Laura

 2009/6/25 Ravi Varadhan rvarad...@jhmi.edu:
 This means that your design matrix or model matrix is rank deficient,
 i.e it does not have linearly independent columns.  Your predictors are
 collinear!


 Just take your design matrices covaea or covaeb with 17 predictors
 and compute their rank or try to invert them.  You will see the problem.

 Ravi.


 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: rvarad...@jhmi.edu

 Webpage:
 http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html



 


 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of Laura Bonnett
 Sent: Thursday, June 25, 2009 11:39 AM
 To: r-help@r-project.org
 Subject: [R] crr - computationally singular

 Dear R-help,

 I'm very sorry to ask 2 questions in a week.  I am using the package 'crr'
 and it does exactly what I need it to when I use the dataset a.
 However, when I use dataset b I get the following error message:
 Error in drop(.Call(La_dgesv, a, as.matrix(b), tol, PACKAGE = base)) :
  system is computationally singular: reciprocal condition number =
 1.28654e-24

 This is obviously as a result of a problem with the data but apart
 from dataset a having 1674 rows and dataset b having 701 rows there is
 really no difference between them.

 The code I am using is as follows where covaea and covaeb are matrices
 of covariates, all coded as binary variables.
 In case a:
 covaea <-
 cbind(sexa,fsha,fdra,nsigna,eega,th1a,th2a,stype1a,stype2a,stype3a,
       pgu1a,pgu2a,log(agea),firstinta/1000,totsezbasea)
 fita <- crr(snearma$with.Withtime,csaea,covaea,failcode=2,cencode=0)

 and in case b:
 covaeb <-
 cbind(sexb,fshb,fdrb,nsignb,eegb,th1b,th2b,stype1b,stype2b,stype3b,
       stype4b,stype5b,pgu1b,pgu2b,(ageb/10)^(-1),firstintb,log(totsezbaseb))
 fitb <- crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0)

 csaea and csaeb are the censoring indicators for a and b respectively
 which equal 1 for the event of interest, 2 for the competing risks
 event and 0 otherwise.

 Can anyone suggest a reason for the error message?  I've tried running
 fitb with variants of covaeb and irrespective of the order of the
 covariates in the matrix, the code runs fine with 16 of the 17
 covariates included but then produces an error message when the 17th
 is added.

Re: [R] crr - computationally singular

2009-06-26 Thread Laura Bonnett
Dear Sir,

Thank you for your response.  You were correct, I had 1 linearly
dependent column.  I have solved this problem and now the rank of
'covaeb' is 17 (qr(covaeb)$rank = 17).  However, I still get the same
error message when I use covaeb in the 'crr' function.

 fit=crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0)
8 cases omitted due to missing values
Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) :
  system is computationally singular: reciprocal condition number = 3.45905e-25

Are there any other reasons why this may be happening?

Thank you,

Laura

2009/6/25 Ravi Varadhan rvarad...@jhmi.edu:
 This means that your design matrix or model matrix is rank deficient, i.e it
 does not have linearly independent columns.  Your predictors are collinear!


 Just take your design matrices covaea or covaeb with 17 predictors and
 compute their rank or try to invert them.  You will see the problem.

 Ravi.

 

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: rvarad...@jhmi.edu

 Webpage:
 http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html



 
 


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Laura Bonnett
 Sent: Thursday, June 25, 2009 11:39 AM
 To: r-help@r-project.org
 Subject: [R] crr - computationally singular

 Dear R-help,

 I'm very sorry to ask 2 questions in a week.  I am using the package 'crr'
 and it does exactly what I need it to when I use the dataset a.
 However, when I use dataset b I get the following error message:
 Error in drop(.Call(La_dgesv, a, as.matrix(b), tol, PACKAGE = base)) :
  system is computationally singular: reciprocal condition number =
 1.28654e-24

 This is obviously as a result of a problem with the data but apart from
 dataset a having 1674 rows and dataset b having 701 rows there is really no
 difference between them.

 The code I am using is as follows where covaea and covaeb are matrices of
 covariates, all coded as binary variables.
 In case a:
 covaea <-
 cbind(sexa,fsha,fdra,nsigna,eega,th1a,th2a,stype1a,stype2a,stype3a,
       pgu1a,pgu2a,log(agea),firstinta/1000,totsezbasea)
 fita <- crr(snearma$with.Withtime,csaea,covaea,failcode=2,cencode=0)

 and in case b:
 covaeb <-
 cbind(sexb,fshb,fdrb,nsignb,eegb,th1b,th2b,stype1b,stype2b,stype3b,
       stype4b,stype5b,pgu1b,pgu2b,(ageb/10)^(-1),firstintb,log(totsezbaseb))
 fitb <- crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0)

 csaea and csaeb are the censoring indicators for a and b respectively which
 equal 1 for the event of interest, 2 for the competing risks event and 0
 otherwise.

 Can anyone suggest a reason for the error message?  I've tried running fitb
 with variants of covaeb and irrespective of the order of the covariates in
 the matrix, the code runs fine with 16 of the 17 covariates included but
 then produces an error message when the 17th is added.

 Thank you for your help,

 Laura






Re: [R] crr - computationally singular

2009-06-26 Thread Laura Bonnett
But I have centred all the dummy variables for the covariates...

2009/6/26 David Winsemius dwinsem...@comcast.net:
 Still the same reasons. It is possible to have collinearity without having
 any one column be a multiple of another.

 xyz <- data.frame(x=sample(1:1000, 5), y=sample(1:1000, 5),
 xx=sample(1:1000, 5), yy=sample(1:1000, 5))
 xyz$z <- xyz$x + xyz$y + xyz$xx
 solve(xyz)
 Error in solve.default(xyz) :
  system is computationally singular: reciprocal condition number =
 6.39164e-20

 On Jun 26, 2009, at 6:22 AM, Laura Bonnett wrote:

 Dear Sir,

 Thank you for your response.  You were correct, I had 1 linearly
 dependent column.  I have solved this problem and now the rank of
 'covaeb' is 17 (qr(covaeb)$rank = 17).  However, I still get the same
 error message when I use covaeb in the 'crr' function.

 fit=crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0)

 8 cases omitted due to missing values
 Error in drop(.Call(La_dgesv, a, as.matrix(b), tol, PACKAGE = base)) :
  system is computationally singular: reciprocal condition number =
 3.45905e-25

 Are there any other reasons why this may be happening?

 Thank you,

 Laura

 2009/6/25 Ravi Varadhan rvarad...@jhmi.edu:

 This means that your design matrix or model matrix is rank deficient, i.e
 it
 does not have linearly independent columns.  Your predictors are
 collinear!


 Just take your design matrices covaea or covaeb with 17 predictors
 and
 compute their rank or try to invert them.  You will see the problem.

 Ravi.


 

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: rvarad...@jhmi.edu

 Webpage:

 http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html




 
 


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of Laura Bonnett
 Sent: Thursday, June 25, 2009 11:39 AM
 To: r-help@r-project.org
 Subject: [R] crr - computationally singular

 Dear R-help,

 I'm very sorry to ask 2 questions in a week.  I am using the package
 'crr'
 and it does exactly what I need it to when I use the dataset a.
 However, when I use dataset b I get the following error message:
 Error in drop(.Call(La_dgesv, a, as.matrix(b), tol, PACKAGE = base))
 :
  system is computationally singular: reciprocal condition number =
 1.28654e-24

 This is obviously as a result of a problem with the data but apart from
 dataset a having 1674 rows and dataset b having 701 rows there is really
 no
 difference between them.

 The code I am using is as follows where covaea and covaeb are matrices of
 covariates, all coded as binary variables.
 In case a:

  covaea <-
  cbind(sexa,fsha,fdra,nsigna,eega,th1a,th2a,stype1a,stype2a,stype3a,
        pgu1a,pgu2a,log(agea),firstinta/1000,totsezbasea)
  fita <- crr(snearma$with.Withtime,csaea,covaea,failcode=2,cencode=0)

 and in case b:

  covaeb <-
  cbind(sexb,fshb,fdrb,nsignb,eegb,th1b,th2b,stype1b,stype2b,stype3b,
        stype4b,stype5b,pgu1b,pgu2b,(ageb/10)^(-1),firstintb,log(totsezbaseb))
  fitb <- crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0)

 csaea and csaeb are the censoring indicators for a and b respectively
 which
 equal 1 for the event of interest, 2 for the competing risks event and 0
 otherwise.

 Can anyone suggest a reason for the error message?  I've tried running
 fitb
 with variants of covaeb and irrespective of the order of the covariates
 in
 the matrix, the code runs fine with 16 of the 17 covariates included but
 then produces an error message when the 17th is added.

 Thank you for your help,

 Laura





 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





[R] crr - computationally singular

2009-06-25 Thread Laura Bonnett
Dear R-help,

I'm very sorry to ask 2 questions in a week.  I am using the package
'crr' and it does exactly what I need it to when I use the dataset a.
However, when I use dataset b I get the following error message:
Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) :
  system is computationally singular: reciprocal condition number = 1.28654e-24

This is obviously as a result of a problem with the data but apart
from dataset a having 1674 rows and dataset b having 701 rows there is
really no difference between them.

The code I am using is as follows where covaea and covaeb are matrices
of covariates, all coded as binary variables.
In case a:
covaea <-
cbind(sexa,fsha,fdra,nsigna,eega,th1a,th2a,stype1a,stype2a,stype3a,pgu1a,pgu2a,
      log(agea),firstinta/1000,totsezbasea)
fita <- crr(snearma$with.Withtime,csaea,covaea,failcode=2,cencode=0)

and in case b:
covaeb <-
cbind(sexb,fshb,fdrb,nsignb,eegb,th1b,th2b,stype1b,stype2b,stype3b,stype4b,
      stype5b,pgu1b,pgu2b,(ageb/10)^(-1),firstintb,log(totsezbaseb))
fitb <- crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0)

csaea and csaeb are the censoring indicators for a and b respectively
which equal 1 for the event of interest, 2 for the competing risks
event and 0 otherwise.

Can anyone suggest a reason for the error message?  I've tried running
fitb with variants of covaeb and irrespective of the order of the
covariates in the matrix, the code runs fine with 16 of the 17
covariates included but then produces an error message when the 17th
is added.
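For reference, a sketch of some quick checks on the covariate matrix (the singular-value line is just a further idea, not something from the crr documentation):

qr(covaeb)$rank    # should equal ncol(covaeb) if the columns are linearly independent
svd(covaeb)$d      # singular values near zero indicate near-collinearity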

Thank you for your help,

Laura



[R] Fractional Polynomials in Competing Risks setting

2009-06-23 Thread Laura Bonnett
Dear All,

I have analysed time to event data for continuous variables by
considering the multivariable fractional polynomial (MFP) model and
comparing this to the untransformed and log transformed model to
determine which transformation, if any, is best.  This was possible as
the Cox model was the underlying model.  However, I am now at the
situation where the assumption that the competing risks are
independent is no longer true and therefore I cannot use the Cox
model.

The code I used to get the MFP model was:
coxfitf <-
mfp(Surv(with.Withtime,with.Wcens)~fp(all.age), family=cox, data=nearma,
    select=0.05, verbose=TRUE)
where with.Withtime is the time to treatment withdrawal, with.Wcens is
the censoring indicator for the event and all.firstint is the age at
baseline.

To look at the competing risks regression modelling when age in
untransformed, I can use the following code:
fitn <- crr(nearma$with.Withtime, censaeb, as.matrix(nearma$all.age), failcode=2, cencode=0)
where censaeb is the censoring indicator which is coded 1 for the
event of interest (time to treatment failure), 2 for the competing
risk and 0 for the censored value.

Can anyone suggest how I can effectively combine these situations i.e.
is there a way to apply the fractional polynomial transformation to
the variable to ascertain whether the transformation improves the
model fit?  I've tried the following code but it doesn't actually
transform the variable:
fitf=crr(nearmb$with.Withtime,censaeb,as.matrix(fp(nearmb$all.firstint)),failcode=2,cencode=0)
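For reference, a sketch of the workaround I am considering (my own idea, not from the documentation of either package): take the powers selected by the Cox-model mfp fit, build the transformed columns by hand, and pass them to crr() as an ordinary covariate matrix.  The powers 0.5 and 1 below are purely illustrative.

age    <- nearma$all.age
cov.fp <- cbind((age/10)^0.5, (age/10)^1)   # replace with the powers mfp selected
fitfp  <- crr(nearma$with.Withtime, censaeb, cov.fp, failcode=2, cencode=0)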

Thank you for your help,

Laura



[R] Isolating a single plot from plots produced simultaneously

2009-06-09 Thread Laura Bonnett
Dear R-Help,

I am using the 'mfp' package.  It produces three plots (as I am using
the Cox model) simultaneously which can be viewed together using the
following code:

fit <-
mfp(Surv(rem.Remtime,rem.Rcens)~fp(age)+strata(rpa), family=cox, data=nearma,
    select=0.05, verbose=TRUE)
par(mfrow=c(2,2))
plot(fit)

They can be viewed separately but the return key must be pressed after
each graph appears (Click or hit ENTER for next page).

I'd like to isolate the second plot produced (the estimated functional
form of the influence of age on the log relative hazard) so that I can
use the 'points' function to add the linear predictors for the
untransformed and the log-transformed models.  In the usual situation
one would produce a plot and then type:

coxfitu <- coxph(Surv(rem.Remtime,rem.Rcens)~age+strata(rpa), data=nearma)
points(coxfitu$linear.predictor, col=2)
coxfitl <- coxph(Surv(rem.Remtime,rem.Rcens)~log(age)+strata(rpa), data=nearma)
points(coxfitl$linear.predictor, col=3)

Can anyone tell me how to isolate just the second plot produced?
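For reference, a sketch of the fallback I have in mind if the plot cannot be isolated (assuming predict() on the mfp fit returns the linear predictor as it does for coxph):

lp  <- predict(fit, type="lp")
ord <- order(nearma$age)
plot(nearma$age[ord], lp[ord], type="l",
     xlab="Age", ylab="Log relative hazard")
# the points() calls above could then be added to this plot as usual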

Thank you for your help,

Laura



[R] Forcing a variable into a model using stepAIC

2009-05-22 Thread Laura Bonnett
Dear All,

I am attempting to use forward and/or backward selection to determine
the best model for the variables I have.  Unfortunately, because I am
dealing with patients and every patient is receiving treatment I need
to force the variable for treatment into the model.  Is there a way to
do this using R?  (Additionally, the model is stratified by
randomisation period).  I know that SAS can be used to do this but my
SAS coding is poor and consequently it would be easier for me to use
R, especially given the fractional polynomial transformations!

Currently the model is as follows (without treatment).

coxfita=coxph(Surv(rem.Remtime,rem.Rcens)~sind(nearma)+fsh(nearma)+fdr(nearma)+th1(nearma)+th2(nearma)+fp(cage)+fp(fint)+fp(tsb)+strata(rpa),data=nearma)
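For reference, a sketch of the kind of call I am after (MASS::stepAIC; treat, x1, x2 and x3 below are placeholder names rather than my real variables): putting the treatment term in the lower bound of scope keeps it in every model considered.

library(MASS)
full <- coxph(Surv(rem.Remtime,rem.Rcens)~treat+x1+x2+x3+strata(rpa), data=nearma)
sel  <- stepAIC(full,
                scope=list(lower=~treat+strata(rpa),
                           upper=~treat+x1+x2+x3+strata(rpa)),
                direction="both")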


Thank you for your help,

Laura



[R] Survreg/psm output

2009-04-16 Thread Laura Bonnett
Dear R-listers,

I know that there have been many, many posts on the output from
Survreg.  To summarise what I have read, Scale is 1/shape of the
Weibull which is also the standard deviation of the normal
distribution which is also the standard deviation of the log survival
time and Intercept is log(scale).  I also know that the hazard
function can be calculated from the output to give something like:
h(time)=exp(-intercept)^scale x scale x time^(1-scale)
So, for example if the intercept was -1.11 and the scale was 1.17 then
the hazard function is h(t) = exp(1.11)^1.17 x 1.17 x t^0.17 = 4.30
t^0.17

However, how can I work out what the hazard is, i.e. by how much does
the risk of death increase or decrease with time?  Some people have
said that if the scale parameter is greater than 1 then the risk of
death increases with age.  Is this correct?
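For reference, a small worked sketch using the quoted values and the standard survreg Weibull parameterisation (shape = 1/scale, Weibull scale b = exp(intercept)); on that parameterisation the hazard increases with time exactly when shape > 1, i.e. when the reported scale is below 1:

int   <- -1.11                 # intercept from the example above
scl   <- 1.17                  # survreg scale from the example above
shape <- 1/scl
haz   <- function(t) shape * exp(-int*shape) * t^(shape-1)
curve(haz, from=0.1, to=10, xlab="t", ylab="h(t)")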

I am also aware that Harrell has posted previously that there is a
case study in his book.  I have a copy of the book and have read all
the case studies but I'm still not convinced I know what the hazard of
death is?!

Sorry that this relates to previously posted queries, but as yet there
doesn't seem to be a satisfactory solution to the question of the
output?

 Thank you for your help as always,

Laura



Re: [R] Schoenfeld residuals

2009-04-14 Thread Laura Bonnett
Hi,

Thank you for your comments and apologies for the delay in replying.
rem.Rcens =1 for the censored variables.  The problem arises because I
am not strictly looking at time to death.  Instead I am looking at
time to 12-month remission in epilepsy.  Therefore a lot of people
have the same event i.e. they successfully achieve 12-month remission
from day 1 of the treatment.

I think I shall avoid the problem by 'excluding' patients with
immediate 12-month remission i.e. I will look at patients with
immediate success in a separate analysis to patients with delayed
success.

Thanks for your help,

Laura


2009/4/6 Terry Therneau thern...@mayo.edu:
 Laura Bonnett was kind enough to send me a copy of the data that caused the
 plotting error, since it was an error I had not seen before.

 1. The latest version of survival gives a nicer error message:

 fit <- coxph(Surv(rem.Remtime, rem.Rcens) ~ all.sex, nearma)
 cfit <- cox.zph(fit)
 plot(cfit)
 Error in plot.cox.zph(cfit) :
   Spline fit is singular, try a smaller degrees of freedom

 2. What's the problem?
  There are 1085 events in the data set (rem.Rcens==1), and of these 502 are
 tied events on exactly day 365.  The plot.cox.zph function tries to fit a
 smoothing spline to the data to help the eye; the fit gives weight 1 to each
 death and having this high a proportion of ties creates problems for the
 underlying regression.

 3.
 plot(cfit, df=2)
  Warning messages:
 1: In approx(xx, xtime, seq(min(xx), max(xx), length = 17)[2 * (1:8)]) :
  collapsing to unique 'x' values
 2: In approx(xtime, xx, temp) : collapsing to unique 'x' values

  These warning messages are ignorable.  I'll work on making them go away.


 4. A shot in the dark -- is perchance the variable rem.Rcens=1 a marker of a
 censored observation, and the events are 0?  (A whole lot of events at 1 year 
 is
 suspicious, but half censored at one year is believable.) Then the proper 
 coxph
 code is

 fit2 <- coxph(Surv(rem.Remtime, rem.Rcens==0) ~ all.sex, nearma)

        Terry Therneau







[R] Schoenfeld Residuals

2009-04-03 Thread Laura Bonnett
Dear All,

Sorry to bother you again.

I have a model:
coxfita=coxph(Surv(rem.Remtime/365,rem.Rcens)~all.sex,data=nearma)
and I'm trying to do a plot of Schoenfeld residuals using the code:
plot(cox.zph(coxfita))
abline(h=0,lty=3)

The error message I get is:
Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In sqrt(x$var[i, i] * seval) : NaNs produced
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

My data (nearma) has a lot of rem.Remtime entries which are equal i.e
large amounts of tied data.  If I remove the entries where this is the
case from the dataset I get the results I want!

Please can someone explain why removing patients with tied remission
time has such an effect on the code and also how to remedy the problem
without removing patients?

Thank you very much,

Laura.



Re: [R] Schoenfeld Residuals

2009-04-03 Thread Laura Bonnett
Thank you for your comments.  I have about 200 out of 2000 tied data
points which makes the situation more complicated!  I'll have a look
at the book section you referred to.  With regards to making the ylim
finite, I'm not sure how I can go about that given that I don't
understand why it isn't already!

Thank you for your help,

Laura

2009/4/3 David Winsemius dwinsem...@comcast.net:
 I am not sure that ties are the only reason. If I create a few ties in the
 ovarian dataset that Therneau and Lumley provide, all I get are some
 warnings:
 ovarian[4:5, 1] <- mean(ovarian[4:5, 1])
 ovarian[6:8, 1] <- mean(ovarian[6:8, 1])
 fit <- coxph(Surv(futime, fustat) ~ age + rx, ovarian)
 temp <- cox.zph(fit)

 plot(temp)
 Warning messages:
 1: In approx(xx, xtime, seq(min(xx), max(xx), length.out = 17)[2 *  :
  collapsing to unique 'x' values
 2: In approx(xtime, xx, temp) : collapsing to unique 'x' values

 The error message you get is requesting a finite ylim. Have you considered
 acceding with that request?

 Alternative: Assuming the number of tied survival times is modest, have you
 tried jitter-ing the rem.Remtime variable a few times to see if the results
 are stable?

 If the number of ties is large, then you need to review Therneau & Grambsch
 section 3.3

 --
 David Winsemius

 On Apr 3, 2009, at 7:57 AM, Laura Bonnett wrote:

 Dear All,

 Sorry to bother you again.

 I have a model:
 coxfita=coxph(Surv(rem.Remtime/365,rem.Rcens)~all.sex,data=nearma)
 and I'm trying to do a plot of Schoenfeld residuals using the code:
 plot(cox.zph(coxfita))
 abline(h=0,lty=3)

 The error message I get is:
 Error in plot.window(...) : need finite 'ylim' values
 In addition: Warning messages:
 1: In sqrt(x$var[i, i] * seval) : NaNs produced
 2: In min(x) : no non-missing arguments to min; returning Inf
 3: In max(x) : no non-missing arguments to max; returning -Inf

 My data (nearma) has a lot of rem.Remtime entries which are equal i.e
 large amounts of tied data.  If I remove the entries where this is the
 case from the dataset I get the results I want!

 Please can someone explain why removing paients with tied remission
 time has such an effect on the code and also how to remedy the problem
 without removing patients?

 Thank you very much,

 Laura.


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





[R] Centring variables in Cox Proportional Hazards Model

2009-03-26 Thread Laura Bonnett
Dear All,

I am contemplating centering the covariates in my Cox model to reduce
multicollinearity between the predictors and the interaction term and
to render a more meaningful interpretation of the regression
coefficient.  Suppose I have two indicator variables, x1 and x2 which
represent age categories (x1 is patients less than 16 while x2 is for
patients older than 65).  If I use the following Cox model, is there
anyway I can centre the variables?  Do I have to do it before I fit
them into the model and if so, how?

fit2=coxph(Surv(rem.Remtime,rem.Rcens)~x1(partial)+x2(partial),data=partial,method="breslow")
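For reference, a sketch of the pre-centring I have in mind (here x1 and x2 stand for plain numeric indicator columns of the data frame partial, rather than the x1()/x2() helper functions above):

partial$x1c <- partial$x1 - mean(partial$x1)
partial$x2c <- partial$x2 - mean(partial$x2)
fit2c <- coxph(Surv(rem.Remtime,rem.Rcens)~x1c+x2c, data=partial, method="breslow")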

Thank you,

Laura



[R] impcat='tree'

2009-03-06 Thread Laura Bonnett
Dear All,

I am going through a worked example provided by Harrell, Lee and Mark
(1996, Stats in Medicine, 15, 361-387).  I know that the code provided
is for S-PLUS and R but the languages don't differ enough for this to
be a problem.

I am using the Hmisc and Design libraries and have used the following
code (as shown in the example provided in the referenced paper):

'%in%' <- function(a,b) match(a,b,nomatch=0) > 0 # Define function for
easy determination of whether a value is in a list
levels(ekg)[levels(ekg) %in% c('oldMI','recentMI')] <- 'MI' # Combines
last 2 levels and uses a new name, MI
pf.coded <- as.integer(pf) # Save original pf, re-code to 1-4
levels(pf) <- c(levels(pf)[1:3],levels(pf)[3]) # Combine last 2 levels
of original

This is where I have the problem.  I am writing an imputation rule:
w <-
transcan(~sz+sg+ap+sbp+dbp+age+wt+hg+ekg+pf+bm+hx, imputed=TRUE,
         data=prostate, impcat='tree')

However I get the following error message(s)
Convergence criterion: 1.511 0.787 0.41 0.215 0.115 0.062
Error: could not find function "tree"
In addition: Warning messages:
1: In approx(y, x, xout = aty, rule = rule) :
  collapsing to unique 'x' values
2: In approx(y, x, xout = aty, rule = rule) :
  collapsing to unique 'x' values
3: In approx(y, x, xout = aty, rule = rule) :
  collapsing to unique 'x' values
4: In approx(y, x, xout = aty, rule = rule) :
  collapsing to unique 'x' values

Has anyone had a similar problem?  If so, any solution?
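For reference, my current guess at a fix (unverified): the error message suggests impcat='tree' calls tree() from the tree package, so that package would need to be installed and loaded first.

install.packages("tree")   # if not already installed
library(tree)
w <- transcan(~sz+sg+ap+sbp+dbp+age+wt+hg+ekg+pf+bm+hx, imputed=TRUE,
              data=prostate, impcat='tree')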

Thank you,

Laura



Re: [R] Dose Profile

2008-10-29 Thread Laura Bonnett
Hi Jim,

Thank you very much for your help.  It is much appreciated.

Bert, the reason I didn't ask someone at work is because none of us knew.
The person I referred to in my email thought it might be possible but didn't
know how.  I can now inform everyone.

Thanks,

Laura

On Wed, Oct 29, 2008 at 8:46 AM, Jim Lemon [EMAIL PROTECTED] wrote:

 Laura Bonnett wrote:

 Hi Everyone,

 I have data in a long format e.g. there is one row per patient but each
 follow-up appointment is included in the row.  So, a snippet of the data
 looks like this:
  TrialNo   Drug Sex Rand  Adate1     Date1 Dose1 Time1 Adate2     Date2 Dose2 Time2
  B1001029  LTG  M   15719 30/04/2003 15825 150   106   29/08/2003 15946 200   227
  B1117003  LTG  M   15734 30/04/2003 15825 200   91    03/09/2003 15951 250   217
  B138015   LTG  M   14923 06/02/2001 15012 225   89    08/05/2001 15103 300   180
  B112003   TPM  F   14914 15/01/2001 14990 60    76    05/03/2001 15039 100   125
 Of course, not everyone has the same number of follow-up appointments and
 so
 there may be some column entries that are NAs.

 What I'd like to do is a dose profile i.e. a plot of time on the x-axis
 against dose on the y-axis for each patient, ideally colouring lines
 according to drug.  Does anyone know how I can do this?  Someone at work
 has
 suggested that I use plot and then loess.smooth but I don't really know
 what
 to do.



 Hi Laura,
 You can get a basic plot like this:

 matplot(rbind(lbonnett$Time1,lbonnett$Time2),
 rbind(lbonnett$Dose1,lbonnett$Dose2), type="l",
 col=lbonnett$Drug, lty=1)

 As you can see by running this, you will get a line for each patient, with
 the color of each line determined by the drug (you can select different
 colors, I was just being lazy). The problem you are likely to face is that
 there will be gaps in the lines if you have NAs. You can rejig matplot or
 write a similar function to get around this so that the lines are just drawn
 across the gaps if that is what you want.

 Jim






[R] Dose Profile

2008-10-28 Thread Laura Bonnett
Hi Everyone,

I have data in a long format e.g. there is one row per patient but each
follow-up appointment is included in the row.  So, a snippet of the data
looks like this:
  TrialNo   Drug Sex Rand  Adate1     Date1 Dose1 Time1 Adate2     Date2 Dose2 Time2
  B1001029  LTG  M   15719 30/04/2003 15825 150   106   29/08/2003 15946 200   227
  B1117003  LTG  M   15734 30/04/2003 15825 200   91    03/09/2003 15951 250   217
  B138015   LTG  M   14923 06/02/2001 15012 225   89    08/05/2001 15103 300   180
  B112003   TPM  F   14914 15/01/2001 14990 60    76    05/03/2001 15039 100   125
Of course, not everyone has the same number of follow-up appointments and so
there may be some column entries that are NAs.

What I'd like to do is a dose profile i.e. a plot of time on the x-axis
against dose on the y-axis for each patient, ideally colouring lines
according to drug.  Does anyone know how I can do this?  Someone at work has
suggested that I use plot and then loess.smooth but I don't really know what
to do.

Thank you very much for your help,

Laura




[R] na.pass

2008-10-13 Thread Laura Bonnett
Hi All,

I have a data frame which has columns comprised mainly of NAs.  I know
there are functions na.pass and na.omit etc which can be used in these
situations however I can't get them to work in this case.  I have a function
which returns the data according to some rule i.e. removal of N in this
code:

nep <- function(data)
{
dummy <- rep(0,378)
for(i in 1:378){
if(is.na(data$with.Wcode)[i])
data$with.Wcode[i] <- "O"
}
for(i in 1:378){
if(data$with.Wcode[i]=="N")
dummy[i] <- i
}
return(data[-dummy,])
}

However, I really don't want to replace the NAs with "O".  I'd just like to
gloss over them.  I can't simply delete them because the structure of the data
frame needs to be maintained.  Can anyone suggest how I can write in a line
or two to ignore the NAs instead of replacing them?  I've tried this code
but it doesn't work!

nep <- function(data)
{
    dummy <- rep(0, 378)
    for (i in 1:378) {
        na.pass(data$with.Wcode[i])
        if (data$with.Wcode[i] == "N")
            dummy[i] <- i
    }
    return(data[-dummy, ])
}


Thank you,

Laura

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] na.pass

2008-10-13 Thread Laura Bonnett
I have a data frame.  It has lots of patient information: their age, their
gender, etc.  I need to keep all this information whilst selecting the
relevant rows.  So, in the example code I provided, I want to remove all
those patients who have entry "N" in the column with.Wcode.  The dimension of
the data is 378, i.e. 378 patients, and currently I am replacing any NA
entries in column with.Wcode with the letter "O", as this is another level of
the same column.  Does that make more sense?
nep <- function(data)
{
    dummy <- rep(0, 378)
    for (i in 1:378) {
        if (is.na(data$with.Wcode)[i])
            data$with.Wcode[i] <- "O"
    }
    for (i in 1:378) {
        if (data$with.Wcode[i] == "N")
            dummy[i] <- i
    }
    return(data[-dummy, ])
}

How can I therefore not replace NA with the level "O" but instead ignore the
NAs and effectively gloss over them?

Thank you,

Laura


On Mon, Oct 13, 2008 at 12:42 PM, jim holtman [EMAIL PROTECTED] wrote:

 Not sure exactly what you are trying to do since you did not provide
 commented, minimal, self-contained, reproducible code.  Let me take a
 guess that you also have to test for NAs:

 > x <- sample(c("N", "A", "B", NA), 20, TRUE)
 > x
  [1] "A" "A" "B" NA  "N" NA  NA  "B" "B" "N" "N" "N" "B" "A" NA  "A" "B" NA
 [19] "A" NA
 > x != "N"
  [1]  TRUE  TRUE  TRUE    NA FALSE    NA    NA  TRUE  TRUE FALSE FALSE FALSE
 [13]  TRUE  TRUE    NA  TRUE  TRUE    NA  TRUE    NA
 > x[x != "N"]
  [1] "A" "A" "B" NA  NA  NA  "B" "B" "B" "A" NA  "A" "B" NA  "A" NA
 > (!is.na(x)) & (x != "N")
  [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE
 [13]  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE FALSE
 > x[(!is.na(x)) & (x != "N")]
  [1] "A" "A" "B" "B" "B" "B" "A" "A" "B" "A"
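
As an editorial aside, a minimal sketch of the same idea applied to the data
frame in the original question (dat is a placeholder name for the data frame;
the column name with.Wcode is taken from the post):

# keep every row except those where with.Wcode equals "N";
# rows where with.Wcode is NA are kept unchanged
keep <- is.na(dat$with.Wcode) | dat$with.Wcode != "N"
dat[keep, ]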
 


 On Mon, Oct 13, 2008 at 7:15 AM, Laura Bonnett
 [EMAIL PROTECTED] wrote:
  Hi All,
 
  I have a data frame which has columns comprised mainly of NAs.  I know
  there are functions na.pass and na.omit etc which can be used in these
  situations however I can't them to work in this case.  I have a function
  which returns the data according to some rule i.e. removal of N in this
  code:
 
   nep <- function(data)
   {
       dummy <- rep(0, 378)
       for (i in 1:378) {
           if (is.na(data$with.Wcode)[i])
               data$with.Wcode[i] <- "O"
       }
       for (i in 1:378) {
           if (data$with.Wcode[i] == "N")
               dummy[i] <- i
       }
       return(data[-dummy, ])
   }
 
  However, I really don't want to replace the NAs with O.  I'd just like
 to
  gloss over them.  I can't just delete them because the structure of the
 data
  frame needs to be maintained.  Can anyone suggest how I can write in a
 line
  or two to ignore the NAs instead of replacing them?  I've tried this code
  but it doesn't work!
 
   nep <- function(data)
   {
       dummy <- rep(0, 378)
       for (i in 1:378) {
           na.pass(data$with.Wcode[i])
           if (data$with.Wcode[i] == "N")
               dummy[i] <- i
       }
       return(data[-dummy, ])
   }
 
 
  Thank you,
 
  Laura
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hazard plot

2008-09-29 Thread Laura Bonnett
Hi All,

This sounds a relatively simple query, and I hope it is!

I am looking at a continuous variable, age.  I am looking at time to
12-month remission and can calculate the HR and 95% confidence interval as
follows:
coxfita = coxph(Surv(rem.Remtime,rem.Rcens)~nearma$all.age,data=nearma)
exp(coxfita$coefficients)
exp(confint(coxfita))


However, because I am looking at age as a continuous variable I cannot draw
a Kaplan-Meier curve.  Instead I need to draw a plot of hazard against age.
How do I do this? I don't think
plot(nearma$all.age,coxfita$linear.predictors) is quite right.

Thank you for your help,

Laura
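
For reference, a minimal sketch of one way to do this, assuming the nearma
data frame and the variable names used above (the plot shows the hazard
relative to a patient of average age; termplot() is an alternative that adds
standard-error bands to the linear predictor):

library(survival)
fita <- coxph(Surv(rem.Remtime, rem.Rcens) ~ all.age, data = nearma)
ages <- seq(min(nearma$all.age), max(nearma$all.age), length.out = 100)
rel.haz <- exp(coef(fita) * (ages - mean(nearma$all.age)))
plot(ages, rel.haz, type = "l", xlab = "Age",
     ylab = "Hazard relative to average age")
termplot(fita, data = nearma, se = TRUE, rug = TRUE)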

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generalising to n-dimensions

2008-09-26 Thread Laura Bonnett
Sorry to hassle you, but I really need to get my code up and running.
Please can you therefore explain what a and v are?

Thank you,

Laura

On Wed, Sep 24, 2008 at 8:27 PM, Laura Bonnett
[EMAIL PROTECTED]wrote:

 Can I ask what a and v are?

 Thanks,

 Laura


 On Sat, Aug 23, 2008 at 11:41 AM, Robin Hankin [EMAIL PROTECTED] wrote:

 Laura Bonnett wrote:

 crosstable[,,expand[d,1],expand[d,2],expand[d,3],...expand[d,n]]
  crosstable is just a crosstabulation of an n+2-dimensional dataset and I
 am trying to pick out those that are in combination 'd' of expand. So for
 example, for 5-dimensional data using your example:
     Var1 Var2 Var3
  1     1    1    1
  2     2    1    1
  3     3    1    1
  4     1    2    1
  5     2    2    1
  6     3    2    1
  7     1    1    2
  8     2    1    2
  9     3    1    2
  10    1    2    2
  11    2    2    2
  12    3    2    2
  d refers to the row of the matrix above - d=2 is 2,1,1 so
 crosstable[,,2,1,1] would retrieve all the data where Var1 =2, Var2=1,
 Var3=1 and the two remaining variables are given in the crosstabulations for
 all values.
  Is that any better?


 OK  I think I understand.  The magic package uses this type of
 construction extensively, but not this particular one.

 It's trickier than I'd have expected.

 Try this:

 f <- function(a, v){
   jj <- sapply(dim(a)[seq_len(length(dim(a)) - length(v))],
                seq_len, simplify = FALSE)
   jj <- c(jj, as.list(v))
   do.call("[", c(list(a), jj, drop = TRUE))
 }



 [you will have to coerce the output from expand.grid() to a matrix in
 order to extract a row from it]
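
(Editorial note: in this function a is the array being indexed, i.e. the
cross-tabulation, and v is the vector of trailing indices, one per dimension
being fixed.  A minimal usage sketch, with purely illustrative object names:

set.seed(1)
dat <- data.frame(V1 = sample(1:3, 200, TRUE), V2 = sample(1:3, 200, TRUE),
                  V3 = sample(1:3, 200, TRUE), V4 = sample(1:2, 200, TRUE),
                  V5 = sample(1:2, 200, TRUE))
crosstable <- table(dat)                     # a 5-dimensional table
expand <- as.matrix(expand.grid(sapply(dim(crosstable)[-(1:2)], seq_len)))
d <- 2                                       # row (2, 1, 1) of expand
all(f(crosstable, expand[d, ]) == crosstable[, , 2, 1, 1])   # TRUE

so f(crosstable, expand[d, ]) is equivalent to writing out
crosstable[, , 2, 1, 1] by hand.)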


 HTH

 rksh








 --
 Robin K. S. Hankin
 Senior Research Associate
 Cambridge Centre for Climate Change Mitigation Research (4CMR)
 Department of Land Economy
 University of Cambridge
 [EMAIL PROTECTED]
 01223-764877




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generalising to n-dimensions

2008-09-24 Thread Laura Bonnett
Can I ask what a and v are?

Thanks,

Laura

On Sat, Aug 23, 2008 at 11:41 AM, Robin Hankin [EMAIL PROTECTED] wrote:

 Laura Bonnett wrote:

 crosstable[,,expand[d,1],expand[d,2],expand[d,3],...expand[d,n]]
  crosstable is just a crosstabulation of an n+2-dimensional dataset and I
 am trying to pick out those that are in combination 'd' of expand. So for
 example, for 5-dimensional data using your example:
     Var1 Var2 Var3
  1     1    1    1
  2     2    1    1
  3     3    1    1
  4     1    2    1
  5     2    2    1
  6     3    2    1
  7     1    1    2
  8     2    1    2
  9     3    1    2
  10    1    2    2
  11    2    2    2
  12    3    2    2
  d refers to the row of the matrix above - d=2 is 2,1,1 so
 crosstable[,,2,1,1] would retrieve all the data where Var1 =2, Var2=1,
 Var3=1 and the two remaining variables are given in the crosstabulations for
 all values.
  Is that any better?


 OK  I think I understand.  The magic package uses this type of construction
 extensively, but not this particular one.

 It's trickier than I'd have expected.

 Try this:

 f <- function(a, v){
   jj <- sapply(dim(a)[seq_len(length(dim(a)) - length(v))],
                seq_len, simplify = FALSE)
   jj <- c(jj, as.list(v))
   do.call("[", c(list(a), jj, drop = TRUE))
 }



 [you will have to coerce the output from expand.grid() to a matrix in order
 to extract a row from it]


 HTH

 rksh








 --
 Robin K. S. Hankin
 Senior Research Associate
 Cambridge Centre for Climate Change Mitigation Research (4CMR)
 Department of Land Economy
 University of Cambridge
 [EMAIL PROTECTED]
 01223-764877



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Generalising to n-dimensions

2008-09-23 Thread Laura Bonnett
 Hi R-helpers,

I have two queries relating to generalising to n dimensions:

What I want to do in the first one is generalise the following statement:
expand <- expand.grid(1:x[1], 1:x[2], ..., 1:x[n]) where x is a vector of
integers and expand.grid gives every combination of the set of numbers, so
for example, expand.grid(1:2, 1:3) takes 1,2 and 1,2,3 and gives 1,1  2,1
1,2  2,2  1,3  2,3.
My x vector has varying lengths and I can't find a way of giving it every
set without stating each set individually.

Secondly and similarly, I want to get the table within crosstable that has
the elements defined by the combinations given in expand above
crosstable[,,expand[d,1],expand[d,2],expand[d,3],...expand[d,n]] where
crosstable is just a crosstabulation of an n+2-dimensional dataset and I am
trying to pick out those that are in combination 'd' of expand.
So for example, using x[1]=2 and x[2]=3 as in the example above, if d = 2 then
the order is 2,1, so I take crosstable[,,2,1].

Can anyone suggest a way to give the code every set without stating each set
individually?

Thank you

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Generalising to n-dimensions

2008-09-23 Thread Laura Bonnett
 crosstable[,,expand[d,1],expand[d,2],expand[d,3],...expand[d,n]]

crosstable is just a crosstabulation of an n+2-dimensional dataset and I am
trying to pick out those that are in combination 'd' of expand.
So for example, for 5-dimensional data using your example:

   Var1 Var2 Var3
1     1    1    1
2     2    1    1
3     3    1    1
4     1    2    1
5     2    2    1
6     3    2    1
7     1    1    2
8     2    1    2
9     3    1    2
10    1    2    2
11    2    2    2
12    3    2    2

d refers to the row of the matrix above - d=2 is 2,1,1 so
crosstable[,,2,1,1] would retrieve all the data where Var1 =2, Var2=1,
Var3=1 and the two remaining variables are given in the crosstabulations for
all values.

Is that any better?


On Sat, Aug 23, 2008 at 10:40 AM, Robin Hankin [EMAIL PROTECTED] wrote:

 First bit:

 > x <- c(3,2,2)
 > expand.grid(sapply(x, seq_len))
    Var1 Var2 Var3
 1     1    1    1
 2     2    1    1
 3     3    1    1
 4     1    2    1
 5     2    2    1
 6     3    2    1
 7     1    1    2
 8     2    1    2
 9     3    1    2
 10    1    2    2
 11    2    2    2
 12    3    2    2
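
(One caveat worth noting as an editorial aside, not from the thread itself:
sapply() simplifies its result to a matrix when every element of x is equal,
in which case expand.grid() no longer receives a list of index vectors.  A
version that always passes a list is:

expand <- do.call(expand.grid, lapply(x, seq_len))
expand <- as.matrix(expand)    # so that rows such as expand[d, ] can be taken

which behaves the same for the example above.)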
 


 second bit I'm not sure about.  I didn't quite get why d=2 implied the
 order is 2,1.
 Could you post a small self-contained example?

 HTH

 rksh



 Laura Bonnett wrote:

  Hi R-helpers,

 I have two queries relating to generalising to n dimensions:

 What I want to do in the first one is generalise the following statement:
 expand-expand.grid(1:x[1],1:x[2],...1:x[n]) where x is a vector of
 integers
 and expand.grid gives every combination of the set of numbers, so for
 example, expand.grid(1:2, 1:3) takes 1,2 and 1,2,3  and gives 1,1   2,1
 1,2   2,2   1,3   2,3
 My x vector has varying lengths and I can't find a way of giving it every
 set without stating each set individually.

 Secondly and similarly, I want to get the table within crosstable that has
 the elements defined by the combinations given in expand above
 crosstable[,,expand[d,1],expand[d,2],expand[d,3],...expand[d,n]] where
 crosstable is just a crosstabulation of an n+2-dimensional dataset and I
 am
 trying to pick out those that are in combination 'd' of expand.
 So for example, using x[1]=2 and x[2]=3 as above example, if d =2 then the
 order is 2,1 so I take crosstable[,,2,1].

 Can anyone suggest a way to give the code every set without stating each
 set
 individually?

 Thank you

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Robin K. S. Hankin
 Senior Research Associate
 Cambridge Centre for Climate Change Mitigation Research (4CMR)
 Department of Land Economy
 University of Cambridge
 [EMAIL PROTECTED]
 01223-764877



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variance-covariance matrix

2008-08-27 Thread Laura Bonnett
Hi all,

Sorry to ask again, but I'm still not sure how to get the full
variance-covariance matrix.  Peter suggested a three-level treatment
factor.  However, I thought that the censoring variable could only take
values 0 or 1, so how do you programme such a factor?

Alternatively, is there another way to produce the required covariance?

Thank you,

Laura

On Tue, Aug 26, 2008 at 11:37 AM, Laura Bonnett
[EMAIL PROTECTED]wrote:

 The standard treatment is the same in both comparison.

 How do you do a three-level treatment factor?
 I thought you had to have a censoring indicator which took values 0 or 1
 not 1, 2 or 3?

 Thanks,

 Laura

 On Tue, Aug 26, 2008 at 11:05 AM, Peter Dalgaard [EMAIL PROTECTED]
  wrote:

 Laura Bonnett wrote:
  Dear R help forum,
 
  I am using the function 'coxph' to obtain hazard ratios for the
 comparison
  of a standard treatment to new treatments.  This is easily obtained by
  fitting the relevant model and then calling exp(coef(fit1)) say.
 
  I now want to obtain the hazard ratio for the comparison of two
 non-standard
  treatments.
  From a statistical point of view, this can be achieved by dividing the
  exponentiated coefficients of 2 comparisions. E.g. to compared new
 treatment
  1 (nt1) to new treatment 2 (nt2) we can fit 2 models:
  fit1 = standard treatment vs nt1
  fit2 = standard treatment vs nt2.
  The required hazard ratio is therefore exp(coef(fit1))/exp(coef(fit2))
 
  In order to obtain an associated confidence interval for this I require
 the
  covariance of this comparison.  I know that R gives the
 variance-covariance
  matrix by the command 'fit$var'.  However, this only gives the
 covariance
  matrix for non standard drugs and not the full covariance matrix.
 
  Can anyone tell me how to obtain the full covariance matrix?
 
 
 What kind of data do you have? Is the standard treatment group the
 same in both comparisons?  If so, why not just have a three-level
 treatment factor and compare nt1 to nt2 directly. If the control groups
 are completely separate, then the covariance between fits made on
 independent data is of course zero.

  Thank you,
 
  Laura
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


 --
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variance-covariance matrix

2008-08-27 Thread Laura Bonnett
Here is the exact code I have written which does the standard vs nt1 and
standard vs nt2 and also gives me the hazard ratio for nt1 vs nt2.

with <- read.table("allwiths.txt", header = TRUE)
fix(arm)
function (data)
{
    dummy <- rep(0, 2437)
    for (i in 1:2437) {
        if (data$Arm[i] == "CBZ")
            dummy[i] <- i
    }
    return(data[-dummy, ])
}
fix(x1)
function (data)
{
    x1 <- rep(0, 716)
    for (i in 1:716) {
        if (data$Treat[i] == "TPM")
            x1[i] <- 0
        if (data$Treat[i] == "VPS")
            x1[i] <- 0
        if (data$Treat[i] == "LTG")
            x1[i] <- 1
    }
    return(x1)
}
fix(x2)
function (data)
{
    x2 <- rep(0, 716)
    for (i in 1:716) {
        if (data$Treat[i] == "TPM")
            x2[i] <- 1
        if (data$Treat[i] == "VPS")
            x2[i] <- 0
        if (data$Treat[i] == "LTG")
            x2[i] <- 0
    }
    return(x2)
}
fit1 <- coxph(Surv(Withtime, Wcens) ~ x1(armb), data = armb, method = "breslow")
exp(fit1$coefficients)
exp(confint(fit1))
fit2 <- coxph(Surv(Withtime, Wcens) ~ x2(armb), data = armb, method = "breslow")
exp(fit2$coefficients)
exp(confint(fit2))
exp(fit2$coefficients)/exp(fit1$coefficients)

From that, how do I get the necessary variance-covariance matrix?

Sorry if I appear dense.  It really isn't my intention.

Laura
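
For reference, a minimal sketch of the three-level treatment factor Peter
describes, reusing the column names from the code above (the reference level
and the resulting coefficient names, TreatLTG and TreatTPM, are assumptions):

library(survival)
armb$Treat <- relevel(factor(armb$Treat), ref = "VPS")
fit <- coxph(Surv(Withtime, Wcens) ~ Treat, data = armb, method = "breslow")
exp(coef(fit))                       # hazard ratios LTG:VPS and TPM:VPS
V <- vcov(fit)                       # the full 2 x 2 variance-covariance matrix
# hazard ratio LTG:TPM and its 95% CI from the difference of coefficients
est <- coef(fit)["TreatLTG"] - coef(fit)["TreatTPM"]
se  <- sqrt(V["TreatLTG", "TreatLTG"] + V["TreatTPM", "TreatTPM"] -
            2 * V["TreatLTG", "TreatTPM"])
exp(est + c(0, -1.96, 1.96) * se)    # estimate, lower and upper limits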


On Wed, Aug 27, 2008 at 10:36 AM, Peter Dalgaard
[EMAIL PROTECTED]wrote:

 Laura Bonnett wrote:
  Hi all,
 
  Sorry to ask again but I'm still not sure how to get the full
  variance-covariance matrix.  Peter suggested a three-level treatment
  factor.  However, I thought that the censoring variable could only take
  values 0 or 1 so how do you programme such a factor.
 
 Well, not to put it too diplomatically: If you can confuse the treatment
 factor and the censoring indicator, there might be more wrong with your
 analysis than we originally assumed, so how about showing us a bit more
 of what you actually did?

  Alternatively, is there another way to produce the required covariance?
 
  Thank you,
 
  Laura
 
  On Tue, Aug 26, 2008 at 11:37 AM, Laura Bonnett
  [EMAIL PROTECTED]wrote:
 
 
  The standard treatment is the same in both comparison.
 
  How do you do a three-level treatment factor?
  I thought you had to have a censoring indicator which took values 0 or 1
  not 1, 2 or 3?
 
  Thanks,
 
  Laura
 
  On Tue, Aug 26, 2008 at 11:05 AM, Peter Dalgaard 
 [EMAIL PROTECTED]
 
  wrote:
 
  Laura Bonnett wrote:
 
  Dear R help forum,
 
  I am using the function 'coxph' to obtain hazard ratios for the
 
  comparison
 
  of a standard treatment to new treatments.  This is easily obtained by
  fitting the relevant model and then calling exp(coef(fit1)) say.
 
  I now want to obtain the hazard ratio for the comparison of two
 
  non-standard
 
  treatments.
  From a statistical point of view, this can be achieved by dividing
 the
  exponentiated coefficients of 2 comparisions. E.g. to compared new
 
  treatment
 
  1 (nt1) to new treatment 2 (nt2) we can fit 2 models:
  fit1 = standard treatment vs nt1
  fit2 = standard treatment vs nt2.
  The required hazard ratio is therefore exp(coef(fit1))/exp(coef(fit2))
 
  In order to obtain an associated confidence interval for this I
 require
 
  the
 
  covariance of this comparison.  I know that R gives the
 
  variance-covariance
 
  matrix by the command 'fit$var'.  However, this only gives the
 
  covariance
 
  matrix for non standard drugs and not the full covariance matrix.
 
  Can anyone tell me how to obtain the full covariance matrix?
 
 
 
  What kind of data do you have? Is the standard treatment group the
  same in both comparisons?  If so, why not just have a three-level
  treatment factor and compare nt1 to nt2 directly. If the control groups
  are completely separate, then the covariance between fits made on
  independent data is of course zero.
 
 
  Thank you,
 
  Laura
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 
  http://www.R-project.org/posting-guide.html
 
  and provide commented, minimal, self-contained, reproducible code.
 
 
  --
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
   (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
 35327918
  ~~ - ([EMAIL PROTECTED])  FAX: (+45)
 35327907
 
 
 
 
 
[[alternative HTML version deleted]]
 
 
  
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


 --
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University

[R] Variance-covariance matrix

2008-08-26 Thread Laura Bonnett
Dear R help forum,

I am using the function 'coxph' to obtain hazard ratios for the comparison
of a standard treatment to new treatments.  This is easily obtained by
fitting the relevant model and then calling exp(coef(fit1)) say.

I now want to obtain the hazard ratio for the comparison of two non-standard
treatments.
From a statistical point of view, this can be achieved by dividing the
exponentiated coefficients of 2 comparisions. E.g. to compared new treatment
1 (nt1) to new treatment 2 (nt2) we can fit 2 models:
fit1 = standard treatment vs nt1
fit2 = standard treatment vs nt2.
The required hazard ratio is therefore exp(coef(fit1))/exp(coef(fit2))

In order to obtain an associated confidence interval for this I require the
covariance of this comparison.  I know that R gives the variance-covariance
matrix by the command 'fit$var'.  However, this only gives the covariance
matrix for non standard drugs and not the full covariance matrix.

Can anyone tell me how to obtain the full covariance matrix?

Thank you,

Laura

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variance-covariance matrix

2008-08-26 Thread Laura Bonnett
The standard treatment is the same in both comparisons.

How do you do a three-level treatment factor?
I thought you had to have a censoring indicator which took values 0 or 1, not
1, 2 or 3?

Thanks,

Laura

On Tue, Aug 26, 2008 at 11:05 AM, Peter Dalgaard
[EMAIL PROTECTED]wrote:

 Laura Bonnett wrote:
  Dear R help forum,
 
  I am using the function 'coxph' to obtain hazard ratios for the
 comparison
  of a standard treatment to new treatments.  This is easily obtained by
  fitting the relevant model and then calling exp(coef(fit1)) say.
 
  I now want to obtain the hazard ratio for the comparison of two
 non-standard
  treatments.
  From a statistical point of view, this can be achieved by dividing the
  exponentiated coefficients of 2 comparisions. E.g. to compared new
 treatment
  1 (nt1) to new treatment 2 (nt2) we can fit 2 models:
  fit1 = standard treatment vs nt1
  fit2 = standard treatment vs nt2.
  The required hazard ratio is therefore exp(coef(fit1))/exp(coef(fit2))
 
  In order to obtain an associated confidence interval for this I require
 the
  covariance of this comparison.  I know that R gives the
 variance-covariance
  matrix by the command 'fit$var'.  However, this only gives the covariance
  matrix for non standard drugs and not the full covariance matrix.
 
  Can anyone tell me how to obtain the full covariance matrix?
 
 
 What kind of data do you have? Is the standard treatment group the
 same in both comparisons?  If so, why not just have a three-level
 treatment factor and compare nt1 to nt2 directly. If the control groups
 are completely separate, then the covariance between fits made on
 independent data is of course zero.

  Thank you,
 
  Laura
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


 --
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Variance-covariance matrix for parameter estimates

2008-08-06 Thread Laura Bonnett
Dear All,

I am currently working with the coxph function within the package survival.
I have the model h_ij = h_0(t) exp(b1x1 + b2x2) where the indicator
variables
are as follows:
     x1  x2
VPS   0   0
LTG   1   0
TPM   0   1

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Variance-covariance matrix for parameter estimates

2008-08-06 Thread Laura Bonnett
(Sorry, my last email appeared to be missing the important bits so I'll try
again!)

Dear All,

I am currently working with the coxph function within the package survival.
I have the model h_ij = h_0(t) exp(b1x1 + b2x2) where the indicator
variables are as follows:
    x1   x2
A    0    0
B    1    0
C    0    1

The hazard rates are:
B   h_ij(t)=h_0(t)exp(b1)
C   h_ij(t)=h_0(t)exp(b2)
A   h_ij(t)=h_0(t)

This therefore means that the hazard ratios are
B:A   h_0(t)exp(b1)/h_0(t) = exp(b1)
C:A   exp(b2)
B:C   exp(b1)/exp(b2)

Using coxph, it is easy to determine B:A and C:A with their associated
confidence intervals and it is fairly trivial to obtain the hazard ratio for
B:C (by isolating the exponentiated coefficients for both previously fitted
models and dividing them.)  However, I cannot work out how to determine the
confidence interval for B:C.

I know that it will be clear from the variance-covariance matrix, but how do
I obtain that?  I have looked at functions such as vcov and Var, but the
problem I have is that B:A is what I have called fit1 and C:A is called fit2
and there appears to be no easy way to combine these fitted models to
produce the required matrix.

Thank you for your help,

Laura
P.S. Sorry again for the 1st attempt.
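
For reference, a minimal sketch of how the B:C interval follows from the
fitted model's variance-covariance matrix, since Var(b1 - b2) = Var(b1) +
Var(b2) - 2 Cov(b1, b2) (the object names time, status and dat are
placeholders for the actual survival data):

library(survival)
fit <- coxph(Surv(time, status) ~ x1 + x2, data = dat)  # one model, both indicators
V <- vcov(fit)                                          # 2 x 2 covariance matrix
se.diff <- sqrt(V[1, 1] + V[2, 2] - 2 * V[1, 2])
exp(coef(fit)[1] - coef(fit)[2] + c(0, -1.96, 1.96) * se.diff)  # B:C HR and 95% CI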

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] If statements for vectors

2008-04-09 Thread Laura Bonnett
Dear Sirs,

I am using both the Bioconductor adds on (Affy, AffyPLM,...) and the
'standard' R-package.

I am trying to select a list of genes which all have expression values below
a certain threshold.
I have done this by creating a vector which has 0s where the expression is
greater than the threshold and 1s where it is less than or equal to it.
Multiplying this vector by the expression values produces a list of 0s and
expression values below the threshold value.

However, now I need to remove the 0s.  I thought that this would be
relatively trivial but it appears it isn't!!!


The dimension of the list (with the 0s and values) is 506994.  So I wrote
the following:

for (i in 1:506994) {
    if (exp2[i] > 0) {
        exp3 <- append(1, exp2[i])
    }
    return(exp3)
}

where exp2 is the vector of 0s and threshold values.
However I have since discovered that 'if' does not work on vectors.  The
suggestions I have seen on this forum include 'ifelse' which I don't believe
to be relevant in this situation and 'sapply' which again I don't think is
relevant.

Can anyone therefore tell me how I can produce a new vector from an old one
by removing the 0 entries?


Thank you for your help,

Laura
University of Warwick
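
For reference, a minimal sketch of the usual vectorised answer to this kind
of question, assuming exp2 is the numeric vector described above (no loop or
if is needed):

exp3 <- exp2[exp2 > 0]              # keep only the non-zero entries
# or subset the original expression values directly
# (exprs and threshold are placeholder names):
# exp3 <- exprs[exprs <= threshold]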

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.