[R] Plot of Fine and Gray model
Dear all, Happy New Year! I have used the 'crr' function to fit the 'proportional subdistribution hazards' regression model described in Fine and Gray (1999). dat1 is a three-column dataset where:
- ccr is the time-to-event variable
- Crcens is an indicator variable equal to 0 if the event was achieved, 1 if the event wasn't achieved due to death, or 2 if the event wasn't achieved due to disease progression
- pre is an indicator variable (and the covariate of interest)

I want to investigate whether pre has a significant impact on time to event for patients who died and for those who suffered disease progression (as well as its impact on the overall time to event). The code I have used is as follows:

fitd <- crr(dat1$ccr, dat1$Crcens, dat1$pre, failcode=1, cencode=0)
fitp <- crr(dat1$ccr, dat1$Crcens, dat1$pre, failcode=2, cencode=0)

In these cases I get p-values of 0 and 0.66 respectively. What I would now like to do is plot two cumulative incidence curves - one by 'pre' status for patients who didn't achieve the event due to death, and one for those who didn't achieve it due to progression. How can I do this? I can only see things involving plot.predict.crr, which doesn't seem to be what I need.

Many thanks,
Laura

Using Windows 7 and R 2.14.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
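For what it's worth, plot.predict.crr may actually be the relevant tool here: predict.crr evaluates the fitted cumulative incidence function at chosen covariate values, and its plot method draws one curve per covariate profile. A minimal sketch, assuming dat1 and the fits above (the single-column covariate layout is an assumption):

```r
library(cmprsk)

# Model-based curves: predicted cumulative incidence of death for pre = 0
# and pre = 1 from the Fine-Gray fit 'fitd' (one row per covariate profile).
profiles <- matrix(c(0, 1), ncol = 1)
cif.death <- predict(fitd, profiles)   # column 1 = time, columns 2-3 = CIFs
plot(cif.death, lty = 1:2, color = 1:2,
     xlab = "Time", ylab = "Cumulative incidence of death")

# Non-parametric alternative: group-specific cumulative incidence curves
# for both competing events, ignoring the regression model.
ci <- cuminc(dat1$ccr, dat1$Crcens, group = dat1$pre, cencode = 0)
plot(ci)
```

The same predict/plot step with fitp in place of fitd gives the progression curves.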
[R] Multiple forest plots with the same x-axis and colour coded estimates and lines
Dear all, I would like to draw three forest plots to represent results at years 1, 2 and 3. I have the data as point estimates and 95% confidence intervals. Using the following code I can get three basic forest plots - the first of which has the table of results. I have to plot each separately, as the usual par(mfrow=c(3,1)) does not work with the forestplot function within rmeta. I can easily put them next to one another within PowerPoint or similar, though, so that's not a problem.

# Read data in
dattabrem1 <- read.table("risk factors rem1.txt", header=TRUE)  # year 1 results
dattabrem2 <- read.table("risk factors rem2.txt", header=TRUE)  # year 2 results
dattabrem3 <- read.table("risk factors rem3.txt", header=TRUE)  # year 3 results

# Set up table of results for the three plots
plotextr <- rbind(c("Age","Gender","Seizures","Treatment"),
                  c("10","M","2","CBZ"),  c("10","F","2","CBZ"),
                  c("10","M","2","LTG"),  c("10","F","2","LTG"),
                  c("10","M","10","CBZ"), c("10","F","10","CBZ"),
                  c("10","M","10","LTG"), c("10","F","10","LTG"),
                  c("40","M","2","CBZ"),  c("40","F","2","CBZ"),
                  c("40","M","2","LTG"),  c("40","F","2","LTG"),
                  c("40","M","10","CBZ"), c("40","F","10","CBZ"),
                  c("40","M","10","LTG"), c("40","F","10","LTG"),
                  c("75","M","2","CBZ"),  c("75","F","2","CBZ"),
                  c("75","M","2","LTG"),  c("75","F","2","LTG"),
                  c("75","M","10","CBZ"), c("75","F","10","CBZ"),
                  c("75","M","10","LTG"), c("75","F","10","LTG"))

# 1 year results
estimate1y <- c(NA, dattabrem1$HR)
lowerd1y <- c(NA, dattabrem1$CIlower)
upperd1y <- c(NA, dattabrem1$CIupper)
# 2 year results
estimate2y <- c(NA, dattabrem2$HR)
lowerd2y <- c(NA, dattabrem2$CIlower)
upperd2y <- c(NA, dattabrem2$CIupper)
# 3 year results
estimate3y <- c(NA, dattabrem3$HR)
lowerd3y <- c(NA, dattabrem3$CIlower)
upperd3y <- c(NA, dattabrem3$CIupper)

# Draw forest plots
forestplot(plotextr, estimate1y, lowerd1y, upperd1y,
           is.summary=c(TRUE, rep(FALSE,24)), zero=1, xlab="Remission @ 1 year")
forestplot(plotext2r, estimate2y, lowerd2y, upperd2y, zero=1, xlab="Remission @ 2 years")
forestplot(plotext2r, estimate3y, lowerd3y, upperd3y, zero=1, xlab="Remission @ 3 years")

Having managed to obtain these basic plots, I need the x-axes to be the same. Usually xlim would be sufficient to do this, but this argument is not available within forestplot, so does anyone know how I can make the x-axes the same over all the plots? Additionally, within each plot two treatments are compared (the top two blocks are treatment 1, the second two blocks are treatment 2, the next two are treatment 1, etc.). Is there any way I can colour code the boxes to show this?

Many thanks,
Laura

P.S. I'm using Windows, R 2.9.2
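rmeta's forestplot has no xlim, but its clip argument bounds the confidence-interval axis (estimates beyond it are drawn as arrows), so passing an identical clip to every call gives comparable axes; colours go through col=meta.colors(). A sketch under those assumptions - the clip range and colour choices below are illustrative, and whether box accepts a per-row vector depends on the rmeta version:

```r
library(rmeta)

# Force the same x-axis range on every plot via an identical 'clip',
# and colour the boxes/lines with meta.colors().
forestplot(plotextr, estimate1y, lowerd1y, upperd1y,
           is.summary = c(TRUE, rep(FALSE, 24)),
           zero = 1, clip = c(0.25, 4),          # assumed common HR range
           xlab = "Remission @ 1 year",
           col = meta.colors(box = "royalblue", lines = "darkblue"))
```

Repeating the call with the year-2 and year-3 vectors (and the same clip) keeps all three plots on one scale.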
[R] Confidence interval for difference in Harrell's c statistics (or equivalently Somers' D statistics)
Dear All, I am trying to calculate a 95% confidence interval for the difference in two c statistics (or equivalently D statistics). In Stata I gather that this can be done using the lincom command. Is there anything similar in R? As you can see below, I have two datasets (which are actually two independent subsets of the same data) and the respective c statistics for the variables in both cases. What I would now like to do is show that there is no statistically significant difference between the statistics (between the dev and val datasets). Any help would be much appreciated.

rdev <- rcorrcens(Surv(stimes1,eind1) ~ gendat1 + neurodat1)
rdev

Somers' Rank Correlation for Censored Data
Response variable: Surv(stimes1, eind1)

              C    Dxy  aDxy    SD    Z      P    n
gendat1   0.534  0.069 0.069 0.017 3.98 0.0001 1500
neurodat1 0.482 -0.036 0.036 0.011 3.18 0.0015 1500

rval <- rcorrcens(Surv(stimes2,eind2) ~ gendat2 + neurodat2)
rval

Somers' Rank Correlation for Censored Data
Response variable: Surv(stimes2, eind2)

              C    Dxy  aDxy    SD    Z      P    n
gendat2   0.543  0.085 0.085 0.017 4.94 0e+00 1500
neurodat2 0.481 -0.038 0.038 0.011 3.44 6e-04 1500

Many thanks,
Laura

P.S. I'm using Windows XP, R 2.9.2
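Since the dev and val subsets are independent, the two Dxy estimates are independent and the standard error of their difference is sqrt(SD1^2 + SD2^2). A hand-calculation sketch using the values printed above for gendat (not a packaged function):

```r
# 95% CI and p-value for the difference in Somers' Dxy between two
# independent samples, using the estimates and SDs printed by rcorrcens.
dxy.dev <- 0.069; sd.dev <- 0.017   # gendat1 (development)
dxy.val <- 0.085; sd.val <- 0.017   # gendat2 (validation)

diff <- dxy.val - dxy.dev
se   <- sqrt(sd.dev^2 + sd.val^2)   # independence assumed
ci   <- diff + c(-1, 1) * qnorm(0.975) * se
p    <- 2 * pnorm(-abs(diff / se))  # two-sided normal-theory p-value
```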
Re: [R] Confidence interval for difference in Harrell's c statistics (or equivalently Somers' D statistics)
Dear David, Thank you for your reply. I have come across rcorrp.cens before. However, I'm not sure it does quite what I want. It seems to compare whether one predictor is more concordant than another within the same survival function. I want to see whether one predictor is more concordant than another over two survival functions, hence I fitted two rcorrcens functions. E.g. if I have a development data set with a variable for age and a validation data set for age, then I want to know if the concordance is the same over the development and validation data sets. Thank you, Laura

On 5 May 2011 16:09, David Winsemius dwinsem...@comcast.net wrote:
On May 5, 2011, at 8:20 AM, Laura Bonnett wrote:
Dear All, I am trying to calculate a 95% confidence interval for the difference in two c statistics (or equivalently D statistics). In Stata I gather that this can be done using the lincom command. Is there anything similar in R?

Have you looked at rcorrp.cens {Hmisc}? [snipped] Please post in plain text.
-- David Winsemius, MD, West Hartford, CT
[R] repeat write.table with the same code many times
Dear all, I am using R version 2.9.2 in Windows. I would like to output the results of a function I have written to a .txt file. I know that I can do this by using code such as

write.table(boothd(10), "boothd10.txt", sep="\t", append=TRUE)

However, I would like to bootstrap my function 'boothd' several times and get each vector of results as a new line in my text file. Is there a way to do this? I usually just set the code up to do bootstrapping around the function (i.e. I perform the replications within the function and output a matrix of results). However, in the case of 'boothd' I am dealing with rare events, and so sometimes I get an empty vector as output, which makes mathematical sense. Unfortunately this causes the bootstrapping code to crash. I'm hoping that writing the results out line by line will remove this problem. I have tried rep(write.table(...), 15), say, but because of the occasional null vector the table is not written. Thank you for any help you can give. By the way,

write.table(boothd(10), "boothd10.txt", sep="\t", append=TRUE)
write.table(boothd(10), "boothd10.txt", sep="\t", append=TRUE)
write.table(boothd(10), "boothd10.txt", sep="\t", append=TRUE)

etc. works, but if I want to look at 1000 replications this is very time consuming!

Thanks,
Laura
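A plain for loop with a guard for the empty-vector case should do what rep() cannot (rep evaluates its argument once and repeats the value, so at most one result is ever written). A sketch, taking 'boothd' and the replication count from the post:

```r
# 1000 bootstrap replications, one output line per non-empty result.
# The length() guard skips rare-event replicates that return an empty
# vector, so nothing crashes and no blank lines are written.
for (i in 1:1000) {
  res <- boothd(10)
  if (length(res) > 0)
    write.table(t(res), "boothd10.txt", sep = "\t", append = TRUE,
                row.names = FALSE, col.names = FALSE)
}
```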
[R] Fractional Polynomials - Hazard Ratios and Relative Hazard Plots
Dear All, I am using Windows and R version 2.9.2 with libraries cmprsk, mfp and Design. I have a dataset with approximately 1700 patients (1 row per patient) and I have 12 outcomes, three of which are continuous. I have performed univariate analyses to see if any factors are associated with a higher likelihood of the event of interest (achieving 12-month remission from epileptic seizures) and also an analysis adjusted for multiple variables. I have tried log and fractional polynomial (FP) transformations of each continuous variable. In the case of age, used for the example below, the FP transformation led to the best model fit according to the AIC. I have therefore applied this transformation for all analyses. To begin with I have fitted a Cox model stratified by randomisation period, rpa (either before or after a certain date).

fit1 <- mfp(Surv(Remtime, Rcens) ~ fp(age) + strata(rpa), family=cox,
            data=nearma, select=0.05, verbose=TRUE)

I would like two things from this model: hazard ratios and an associated hazard ratio plot. I am aware that the hazard ratios produced from a fractional polynomial transformation are not to be used directly (i.e. those obtained from summary(coxfitf1)). Instead the derived functional form of the variable should be used to estimate hazard ratios post hoc. I have attached a Word document explaining how hazard ratios and confidence intervals can be derived, with a worked example for the variable age. The univariate results are:

Age (years)   HR (95% CI)
≤10           1.00
(10 to 25]    1.00 (1.00 to 1.00)
(25 to 37]    0.99 (0.99 to 1.00)
(37 to 50]    0.99 (0.99 to 0.99)
(50 to 71]    0.99 (0.98 to 0.99)
>71           0.98 (0.97 to 0.99)

To create a plot of the relative hazard I have used the code:

plot(nearma$age, predict(fit1, type="risk"), xlab="Age", ylab="Relative Hazard")

The produced plot is attached. As you can clearly see, the hazard ratios above and the relative hazard plot do not agree. This is also the case for the other two continuous variables that have been transformed via the FP approach. The hazard ratios for age using the model adjusted for multiple variables are as follows, and these do coincide with the plot:

Age (years)   HR (95% CI)
≤10           1.00
(10 to 25]    0.86 (0.77 to 0.95)
(25 to 37]    0.78 (0.65 to 0.94)
(37 to 50]    0.80 (0.64 to 1.00)
(50 to 71]    1.01 (0.81 to 1.25)
>71           1.61 (1.27 to 2.05)

Can anyone therefore explain why the univariate hazard ratios do not coincide with the relative hazard plot and yet the hazard ratios from the multivariable model do? I know that the calculations are correct for both sets of hazard ratios. Thank you for any help you can provide, as I am at a loss to explain the difference between the plot and the calculations - they should, after all, be saying the same thing!

Thank you,
Laura
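For reference, the post-hoc hazard ratios the attachment describes can be sketched in code: with a fitted FP transform f(age) and its coefficient beta, the hazard ratio relative to a reference age a0 is exp(beta * (f(a) - f(a0))). The transform, coefficient and reference age below are placeholders, not the values actually selected by mfp:

```r
# Hypothetical sketch of a hazard-ratio curve for a first-degree fractional
# polynomial; replace f() and beta with the transform and coefficient
# reported by the mfp fit.
f    <- function(a) (a / 10)^(-1)     # assumed FP transform (power -1)
beta <- -0.5                          # assumed coefficient from the fit
ref  <- 10                            # reference age, where HR = 1
ages <- 1:80
hr   <- exp(beta * (f(ages) - f(ref)))
plot(ages, hr, type = "l", xlab = "Age (years)", ylab = "Hazard ratio")
```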
[R] Relative Risk/Hazard Ratio plots for continuous variables
Dear all, I am using Windows and R 2.9.2 for my analyses. I have a large dataset and I am particularly interested in looking at time to an event for a continuous variable. I would like to produce a plot of log(relative risk), or relative risk (also known as hazard ratio), against the continuous variable. I have spent a long time looking for advice on how to do this but my search has proved fruitless - sorry if I've missed something obvious. It seems that there are options such as muhaz, survfit, coxph and cph that may enable some plots to be produced, but none that specifically give the relative risk plot. In addition to the survival analysis, I have incorporated the mfp function (from package mfp). I currently use code such as:

library(mfp)
library(Design)
coxfit1 <- coxph(Surv(rtime, rcens) ~ cts, data=data1)

or

coxfit2 <- mfp(Surv(rtime, rcens) ~ fp(cts), family=cox, data=data1,
               select=0.05, verbose=TRUE)

Neither plot(coxfit1) nor plot(coxfit2) produces the relative risk vs. continuous variable plot that I need. Can anyone help?

Thank you,
Laura
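One possible route (a sketch, not the only way): termplot() on a coxph fit plots each term's partial contribution to the linear predictor, which for a Cox model is exactly the log relative hazard; with a penalised spline term this gives a smooth curve against the covariate. Variable names follow the post:

```r
library(survival)

# Smooth log relative hazard as a function of a continuous covariate.
fit <- coxph(Surv(rtime, rcens) ~ pspline(cts), data = data1)
termplot(fit, term = 1, se = TRUE,
         xlab = "cts", ylab = "log relative hazard")
```

Exponentiating the plotted term (relative to a chosen reference value) would give the hazard-ratio scale instead of the log scale.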
[R] Baseline survival estimate
Dear R-help, I am trying to obtain the baseline survival estimate of a fitted Cox model, S_0(t). I know that previous posts have said to use 'basehaz', but this gives the baseline hazard function and not the baseline survival estimate. Is there a way to obtain the baseline survival estimate, or do I have to use the formula S(t) = exp[- integral from 0 to t of h(u) du]?

Thank you for your assistance,
Laura

fit1 <- coxph(Surv(tsecond/365, seccens) ~ stroke(smess1) + othnd(smess1) + relat(smess1) + asleep(smess1) + abeeg1(smess1) + treat(smess1), data=smess1)
basehaz(fit1)

where stroke is a function which creates a binary variable from the dataset smess1, etc.
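Note that basehaz() returns the *cumulative* baseline hazard H_0(t), so the integral in the formula above is already done; the baseline survival is then simply exp(-H_0(t)). A sketch assuming fit1 as above:

```r
# Baseline survival from the cumulative baseline hazard: S0(t) = exp(-H0(t)).
bh <- basehaz(fit1, centered = FALSE)  # H0(t) with all covariates at zero
S0 <- exp(-bh$hazard)
plot(bh$time, S0, type = "s", xlab = "Time (years)", ylab = "Baseline survival")

# Alternatively, survfit(fit1) returns a survival curve directly
# (by default evaluated at the mean covariate values, not at zero).
```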
[R] Using stepAIC to produce a p-value for when a particular variable was taken out of the model
Dear all, I have decided after much deliberation to use backward elimination and forward selection to produce a multivariate model. Having read about the problems with choosing selection values, I have chosen to base my decisions of inclusion and exclusion on the AIC, and am consequently using the stepAIC function. This post, however, does not relate to whether or not this is the correct decision! I am interested in determining what the p-value was when a particular variable was taken out of the model. If I choose trace=TRUE then I can obviously see each step of the elimination process, together with the AIC and the degrees of freedom for each variable and for the null model. When the stepwise process is complete it is possible to call the anova component, which shows deviances and associated degrees of freedom for variables left in the model. Therefore I could use this information to calculate p-values. However, is it possible to do the same for the variables which were thrown out of the model? There doesn't seem to be any literature on how to use AICs to get p-values, as the distribution isn't quite a chi-squared one. Does anyone therefore know how I can determine the p-value for a variable when it was taken out of the model?

Thank you,
Laura
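One hedged way to recover such p-values is to ask drop1() for likelihood-ratio tests of every candidate term at the model from which the variable was removed; 'fullfit' below is an assumed name for the starting model, not something defined in the post:

```r
library(MASS)

# Run the stepwise selection as before.
final <- stepAIC(fullfit, direction = "both", trace = TRUE)

# Likelihood-ratio p-value for dropping each term from the full model,
# including terms that stepAIC later removed. Rerunning drop1() on the
# intermediate models reported in the trace gives the p-value at the
# exact step a variable left.
drop1(fullfit, test = "Chisq")
```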
Re: [R] Using stepAIC to produce a p-value for when a particular variable was taken out of the model
I've just realised that this is a very silly post, as I can't read! The output in the anova is the excluded variables - very sorry!

Laura

2009/11/30 Laura Bonnett l.j.bonn...@googlemail.com:
Dear all, I have decided after much deliberation to use backward elimination and forward selection to produce a multivariate model. Having read about the problems with choosing selection values, I have chosen to base my decisions of inclusion and exclusion on the AIC, and am consequently using the stepAIC function. This post, however, does not relate to whether or not this is the correct decision! I am interested in determining what the p-value was when a particular variable was taken out of the model. If I choose trace=TRUE then I can obviously see each step of the elimination process, together with the AIC and the degrees of freedom for each variable and for the null model. When the stepwise process is complete it is possible to call the anova component, which shows deviances and associated degrees of freedom for variables left in the model. Therefore I could use this information to calculate p-values. However, is it possible to do the same for the variables which were thrown out of the model? There doesn't seem to be any literature on how to use AICs to get p-values, as the distribution isn't quite a chi-squared one. Does anyone therefore know how I can determine the p-value for a variable when it was taken out of the model? Thank you, Laura
[R] Unusual error while using coxph
Hi all, I'm very confused! I've been using the same code for many weeks without any bother for various covariates. I'm now looking at another covariate, and whenever I run the code you can see below I get an error message:

Error in rep(0, nrow(data)) : invalid 'times' argument

This code works:

# remove 'missing' cases from data #
snearma <- function(data) {
  for (i in 1:nrow(data)) {
    if (is.na(data$all.sp)[i])    data$all.sp[i]    <- 0
    if (is.na(data$all.cp)[i])    data$all.cp[i]    <- 0
    if (is.na(data$all.scgtc)[i]) data$all.scgtc[i] <- 0
    if (is.na(data$all.tc)[i])    data$all.tc[i]    <- 0
    if (is.na(data$all.ta)[i])    data$all.ta[i]    <- 0
    if (is.na(data$all.aa)[i])    data$all.aa[i]    <- 0
    if (is.na(data$all.m)[i])     data$all.m[i]     <- 0
    if (is.na(data$all.otc)[i])   data$all.otc[i]   <- 0
    if (is.na(data$all.o)[i])     data$all.o[i]     <- 0
  }
  dummy <- rep(0, nrow(data))
  for (i in 1:nrow(data)) {
    if (data$all.sp[i]==0 && data$all.cp[i]==0 && data$all.scgtc[i]==0 &&
        data$all.tc[i]==0 && data$all.ta[i]==0 && data$all.aa[i]==0 &&
        data$all.m[i]==0 && data$all.otc[i]==0 && data$all.o[i]==0)
      dummy[i] <- i
  }
  return(data[-dummy,])
}

# create smaller dataset with missing cases removed #
nmarma <- snearma(nearma)
# create short stratification variable #
nmrpa <- randp(nmarma)

# create censoring variable for the covariate #
stypea <- function(data) {
  for (i in 1:nrow(data)) {
    if (is.na(data$all.sp)[i])    data$all.sp[i]    <- 0
    if (is.na(data$all.cp)[i])    data$all.cp[i]    <- 0
    if (is.na(data$all.scgtc)[i]) data$all.scgtc[i] <- 0
    if (is.na(data$all.tc)[i])    data$all.tc[i]    <- 0
    if (is.na(data$all.ta)[i])    data$all.ta[i]    <- 0
    if (is.na(data$all.aa)[i])    data$all.aa[i]    <- 0
    if (is.na(data$all.m)[i])     data$all.m[i]     <- 0
    if (is.na(data$all.otc)[i])   data$all.otc[i]   <- 0
    if (is.na(data$all.o)[i])     data$all.o[i]     <- 0
  }
  stype <- rep(0, nrow(data))
  for (i in 1:nrow(data)) {
    if (data$all.type[i]=="P" && data$all.sp[i]>=1 && data$all.scgtc[i]==0) stype[i] <- 1
    if (data$all.type[i]=="P" && data$all.cp[i]>=1 && data$all.scgtc[i]==0) stype[i] <- 1
    if (data$all.type[i]=="P" && data$all.scgtc[i]>=1) stype[i] <- 2
    if (data$all.type[i]=="P" && data$all.sp[i]==0 && data$all.cp[i]==0 &&
        data$all.scgtc[i]==0 && data$all.otc[i]==0 && data$all.o[i]==0) stype[i] <- 2
    if (data$all.type[i]=="G" && data$all.tc[i]>=1 && data$all.m[i]==0 &&
        data$all.ta[i]==0 && data$all.aa[i]==0) stype[i] <- 3
    if (data$all.type[i]=="G" && data$all.m[i]>=1 && data$all.tc[i]==0) stype[i] <- 3
    if (data$all.type[i]=="G" && data$all.ta[i]>=1 && data$all.tc[i]==0) stype[i] <- 3
    if (data$all.type[i]=="G" && data$all.aa[i]>=1 && data$all.tc[i]==0) stype[i] <- 3
    if (data$all.type[i]=="G" && data$all.m[i]>=1 && data$all.tc[i]>=1) stype[i] <- 3
    if (data$all.type[i]=="G" && data$all.ta[i]>=1 && data$all.tc[i]>=1) stype[i] <- 3
    if (data$all.type[i]=="G" && data$all.aa[i]>=1 && data$all.tc[i]>=1) stype[i] <- 3
    if (data$all.otc[i]>=1) stype[i] <- 4
    if (data$all.o[i]>=1) stype[i] <- 4
  }
  return(stype)
}

fita <- survdiff(Surv(rem.Remtime, rem.Rcens) ~ stypea(nmarma) + strata(nmrpa), data=nmarma)
fita
lrpvalue <- 1 - pchisq(fita$chisq, 3)
xx <- cuminc(nmarma$rem.Remtime/365, nmarma$rem.Rcens, stypea(nmarma), strata=nmrpa)
plot(xx, curvlab=c("Simple/Complex", "SC+2gentc or 2gentc",
                   "TC or My/Ab or My/Ab+gentc", "Other"),
     lty=1, color=c(2:5), xlab="Time from randomisation (years)",
     ylab="Probability of 12-month remission",
     main="Time to 12-month remission", wh=c(2.0, 0.4))
text(4, 0.5, cex=0.85, paste("Log-rank test =", round(fita$chisq, 3),
     "p-value =", round(lrpvalue, 3)))

whereas this doesn't:

par <- function(data) {   # note: this name masks graphics::par()
  dummy <- rep(0, nrow(data))
  for (i in 1:nrow(data)) {
    if (is.na(data$all.frontlob)[i] && is.na(data$all.templob)[i] &&
        is.na(data$all.parlob)[i] && is.na(data$all.occlob)[i] &&
        is.na(data$all.notspec)[i]) dummy[i] <- i
  }
  for (i in 1:nrow(data)) {
    if (is.na(data$all.frontlob)[i]) data$all.frontlob[i] <- "N"
    if (is.na(data$all.templob)[i])  data$all.templob[i]  <- "N"
    if (is.na(data$all.parlob)[i])   data$all.parlob[i]   <- "N"
    if (is.na(data$all.occlob)[i])   data$all.occlob[i]   <- "N"
    if (is.na(data$all.notspec)[i])  data$all.notspec[i]  <- "N"
  }
  for (i in 1:nrow(data)) {
    if (data$all.frontlob[i]=="N" && data$all.templob[i]=="N" &&
        data$all.parlob[i]=="N" && data$all.occlob[i]=="N" &&
        data$all.notspec[i]=="N") dummy[i] <- i
    if (data$all.frontlob[i]=="Y" && data$all.parlob[i]=="Y") dummy[i] <- i
  }
  return(data[-dummy,])
}

shortpar <- par(nearma)
shortrpa <- randp(shortpar)

lobe <- function(data) {
  for (i in 1:nrow(data)) {
    if (is.na(data$all.frontlob)[i]) data$all.frontlob[i] <- "N"
    if (is.na(data$all.templob)[i])  data$all.templob[i]  <- "N"
    if (is.na(data$all.occlob)[i])   data$all.occlob[i]   <- "N"
    if (is.na(data$all.parlob)[i])   data$all.parlob[i]   <- "N"
    if (is.na(data$all.notspec)[i])  data$all.notspec[i]  <- "N"
  }
  lobe <- rep(0, nrow(data))
  for (i in 1:nrow(data)) {
    if (data$all.frontlob[i]=="Y")
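As an aside, the per-row clean-up loops in the functions above can be collapsed into a few vectorised lines, which makes this sort of error much easier to isolate. A sketch, with column names taken from the post (whether the data frame name matches yours is an assumption):

```r
# Vectorised NA replacement and row filtering, equivalent to the loops above.
zero_cols <- c("all.sp", "all.cp", "all.scgtc", "all.tc", "all.ta",
               "all.aa", "all.m", "all.otc", "all.o")
nearma[zero_cols][is.na(nearma[zero_cols])] <- 0

# Drop rows where every seizure-type column is zero.
keep    <- rowSums(nearma[zero_cols] != 0) > 0
nearma2 <- nearma[keep, ]
```

Renaming the failing function away from `par` would also avoid masking graphics::par().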
Re: [R] crr - computationally singular
Hi Everyone, Thank you for all your comments and suggestions. I determined that I had a full rank model matrix by using the code:

qr(covaeb)$rank

This is 17, which is equal to the number of covariates in the matrix covaeb. I cannot invert the model matrix using 'solve' as my matrix is not square. In the matrices ending in a there are 1677 rows and 15 columns/covariates, while in the matrices ending in b there are 701 rows and 17 columns.

Thank you,
Laura

2009/6/26 Ravi Varadhan rvarad...@jhmi.edu:
How did you determine that you have a full rank model matrix comprising 17 predictors? Are you able to invert the model matrix using `solve'? If not, you still have a collinearity problem. If you are, then the problem might be in the Newton's method used by `crr' to solve the partial-likelihood optimization. The hessian matrix of the parameters might be singular during the iterations. If this is the case, your best bet would be to just simplify the model, i.e. use fewer predictors.

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619  Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message-----
From: Laura Bonnett [mailto:l.j.bonn...@googlemail.com]
Sent: Friday, June 26, 2009 6:22 AM
To: Ravi Varadhan
Cc: r-help@r-project.org
Subject: Re: [R] crr - computationally singular

Dear Sir, Thank you for your response. You were correct, I had 1 linearly dependent column. I have solved this problem and now the rank of 'covaeb' is 17 (qr(covaeb)$rank = 17). However, I still get the same error message when I use covaeb in the 'crr' function.

fit <- crr(snearmb$with.Withtime, csaeb, covaeb, failcode=2, cencode=0)
8 cases omitted due to missing values
Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) :
  system is computationally singular: reciprocal condition number = 3.45905e-25

Are there any other reasons why this may be happening?

Thank you,
Laura

2009/6/25 Ravi Varadhan rvarad...@jhmi.edu:
This means that your design matrix or model matrix is rank deficient, i.e. it does not have linearly independent columns. Your predictors are collinear! Just take your design matrices covaea or covaeb with 17 predictors and compute their rank or try to invert them. You will see the problem.

Ravi.

[signature snipped]

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Laura Bonnett
Sent: Thursday, June 25, 2009 11:39 AM
To: r-help@r-project.org
Subject: [R] crr - computationally singular

Dear R-help, I'm very sorry to ask 2 questions in a week. I am using the 'crr' function (from package cmprsk) and it does exactly what I need it to when I use dataset a. However, when I use dataset b I get the following error message:

Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) :
  system is computationally singular: reciprocal condition number = 1.28654e-24

This is obviously a result of a problem with the data, but apart from dataset a having 1674 rows and dataset b having 701 rows, there is really no difference between them. The code I am using is as follows, where covaea and covaeb are matrices of covariates, all coded as binary variables.

In case a:

covaea <- cbind(sexa, fsha, fdra, nsigna, eega, th1a, th2a, stype1a, stype2a,
                stype3a, pgu1a, pgu2a, log(agea), firstinta/1000, totsezbasea)
fita <- crr(snearma$with.Withtime, csaea, covaea, failcode=2, cencode=0)

and in case b:

covaeb <- cbind(sexb, fshb, fdrb, nsignb, eegb, th1b, th2b, stype1b, stype2b,
                stype3b, stype4b, stype5b, pgu1b, pgu2b, (ageb/10)^(-1),
                firstintb, log(totsezbaseb))
fitb <- crr(snearmb$with.Withtime, csaeb, covaeb, failcode=2, cencode=0)

csaea and csaeb are the censoring indicators for a and b respectively, which equal 1 for the event of interest, 2 for the competing risks event and 0 otherwise. Can anyone suggest a reason for the error message? I've tried running fitb with variants of covaeb and, irrespective of the order of the covariates in the matrix, the code runs fine with 16 of the 17 covariates included but then produces an error message when the 17th is added.
Re: [R] crr - computationally singular
Dear Sir, Thank you for your response. You were correct, I had 1 linearly dependent column. I have solved this problem and now the rank of 'covaeb' is 17 (qr(covaeb)$rank = 17). However, I still get the same error message when I use covaeb in the 'crr' function.

fit <- crr(snearmb$with.Withtime, csaeb, covaeb, failcode=2, cencode=0)
8 cases omitted due to missing values
Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) :
  system is computationally singular: reciprocal condition number = 3.45905e-25

Are there any other reasons why this may be happening?

Thank you,
Laura

2009/6/25 Ravi Varadhan rvarad...@jhmi.edu:
This means that your design matrix or model matrix is rank deficient, i.e. it does not have linearly independent columns. Your predictors are collinear! Just take your design matrices covaea or covaeb with 17 predictors and compute their rank or try to invert them. You will see the problem.

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619  Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Laura Bonnett
Sent: Thursday, June 25, 2009 11:39 AM
To: r-help@r-project.org
Subject: [R] crr - computationally singular

Dear R-help, I'm very sorry to ask 2 questions in a week. I am using the 'crr' function (from package cmprsk) and it does exactly what I need it to when I use dataset a. However, when I use dataset b I get the following error message:

Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) :
  system is computationally singular: reciprocal condition number = 1.28654e-24

This is obviously a result of a problem with the data, but apart from dataset a having 1674 rows and dataset b having 701 rows, there is really no difference between them. The code I am using is as follows, where covaea and covaeb are matrices of covariates, all coded as binary variables.

In case a:

covaea <- cbind(sexa, fsha, fdra, nsigna, eega, th1a, th2a, stype1a, stype2a,
                stype3a, pgu1a, pgu2a, log(agea), firstinta/1000, totsezbasea)
fita <- crr(snearma$with.Withtime, csaea, covaea, failcode=2, cencode=0)

and in case b:

covaeb <- cbind(sexb, fshb, fdrb, nsignb, eegb, th1b, th2b, stype1b, stype2b,
                stype3b, stype4b, stype5b, pgu1b, pgu2b, (ageb/10)^(-1),
                firstintb, log(totsezbaseb))
fitb <- crr(snearmb$with.Withtime, csaeb, covaeb, failcode=2, cencode=0)

csaea and csaeb are the censoring indicators for a and b respectively, which equal 1 for the event of interest, 2 for the competing risks event and 0 otherwise. Can anyone suggest a reason for the error message? I've tried running fitb with variants of covaeb and, irrespective of the order of the covariates in the matrix, the code runs fine with 16 of the 17 covariates included but then produces an error message when the 17th is added.

Thank you for your help,
Laura
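A diagnostic worth trying here (a sketch, not a definitive answer): crr reports "8 cases omitted due to missing values", and a matrix that is full rank on all 701 rows can lose rank, or become numerically near-singular, on the rows actually used; kappa() also flags near-collinearity that qr()'s default tolerance misses.

```r
# Rank and conditioning of the covariate matrix restricted to the rows
# crr actually uses (complete cases only).
cc <- covaeb[complete.cases(covaeb), ]
qr(cc)$rank    # may drop below 17 once the incomplete rows are removed
kappa(cc)      # very large values indicate near-singularity
```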
Re: [R] crr - computationally singular
But I have centred all the dummy variables for the covariates... 2009/6/26 David Winsemius dwinsem...@comcast.net: Still the same reasons. It is possible to have collinearity without having any one column be a multiple of another. xyz - data.frame(x=sample(1:1000, 5), y=sample(1:1000, 5) , xx=sample(1:1000, 5) ,yy=sample(1:1000, 5) ) xyz$z - xyz$x + xyz$y + xyz$xx solve(xyz) Error in solve.default(xyz) : system is computationally singular: reciprocal condition number = 6.39164e-20 On Jun 26, 2009, at 6:22 AM, Laura Bonnett wrote: Dear Sir, Thank you for your response. You were correct, I had 1 linearly dependent column. I have solved this problem and now the rank of 'covaeb' is 17 (qr(covaeb)$rank = 17). However, I still get the same error message when I use covaeb in the 'crr' function. fit=crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0) 8 cases omitted due to missing values Error in drop(.Call(La_dgesv, a, as.matrix(b), tol, PACKAGE = base)) : system is computationally singular: reciprocal condition number = 3.45905e-25 Are there any other reasons why this may be happening? Thank you, Laura 2009/6/25 Ravi Varadhan rvarad...@jhmi.edu: This means that your design matrix or model matrix is rank deficient, i.e it does not have linearly independent columns. Your predictors are collinear! Just take your design matrices covaea or covaeb with 17 predcitors and compute their rank or try to invert them. You will see the problem. Ravi. --- Ravi Varadhan, Ph.D. 
Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvarad...@jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h tml -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Laura Bonnett Sent: Thursday, June 25, 2009 11:39 AM To: r-help@r-project.org Subject: [R] crr - computationally singular Dear R-help, I'm very sorry to ask 2 questions in a week. I am using the package 'crr' and it does exactly what I need it to when I use the dataset a. However, when I use dataset b I get the following error message: Error in drop(.Call(La_dgesv, a, as.matrix(b), tol, PACKAGE = base)) : system is computationally singular: reciprocal condition number = 1.28654e-24 This is obviously as a result of a problem with the data but apart from dataset a having 1674 rows and dataset b having 701 rows there is really no difference between them. The code I am using is as follows where covaea and covaeb are matrices of covarites, all coded as binary variables. In case a: covaea - cbind(sexa,fsha,fdra,nsigna,eega,th1a,th2a,stype1a,stype2a,stype3a,pgu 1a,pgu2a,log(agea),firstinta/1000,totsezbasea) fita - crr(snearma$with.Withtime,csaea,covaea,failcode=2,cencode=0) and in case b: covaeb - cbind(sexb,fshb,fdrb,nsignb,eegb,th1b,th2b,stype1b,stype2b,stype3b,sty pe4b,stype5b,pgu1b,pgu2b,(ageb/10)^(-1),firstintb,log(totsezbaseb)) fitb - crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0) csaea and csaeb are the censoring indicators for a and b respectively which equal 1 for the event of interest, 2 for the competing risks event and 0 otherwise. Can anyone suggest a reason for the error message? 
I've tried running fitb with variants of covaeb and, irrespective of the order of the covariates in the matrix, the code runs fine with 16 of the 17 covariates included but then produces an error message when the 17th is added. Thank you for your help, Laura

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD Heritage Laboratories West Hartford, CT
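The symptom in this thread (a full-rank QR yet a singular solve inside crr) can be diagnosed in base R. The sketch below uses simulated stand-in data, not the actual covaeb matrix: qr()$rank detects exact dependence, kappa() flags near-dependence that a loose rank tolerance can miss, and checking the rank column by column locates the offender.

```r
# Sketch with simulated stand-in data (the real covaeb is not available):
# 16 independent columns plus a 17th that is a linear combination of two others.
set.seed(1)
covaeb <- matrix(rnorm(701 * 16), ncol = 16)
covaeb <- cbind(covaeb, covaeb[, 1] + covaeb[, 2])   # collinear 17th column

qr(covaeb)$rank   # rank < 17 exposes exact dependence
kappa(covaeb)     # a very large condition number exposes near-dependence
                  # even when qr() still reports full rank

# locate an offending column by checking the rank as columns are added
for (k in seq_len(ncol(covaeb))) {
  if (qr(covaeb[, 1:k, drop = FALSE])$rank < k)
    cat("column", k, "is linearly dependent on the earlier ones\n")
}
```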
[R] crr - computationally singular
Dear R-help, I'm very sorry to ask 2 questions in a week. I am using the package 'crr' and it does exactly what I need it to when I use dataset a. However, when I use dataset b I get the following error message:

Error in drop(.Call("La_dgesv", a, as.matrix(b), tol, PACKAGE = "base")) : system is computationally singular: reciprocal condition number = 1.28654e-24

This is obviously as a result of a problem with the data, but apart from dataset a having 1674 rows and dataset b having 701 rows there is really no difference between them. The code I am using is as follows, where covaea and covaeb are matrices of covariates, all coded as binary variables. In case a:

covaea <- cbind(sexa,fsha,fdra,nsigna,eega,th1a,th2a,stype1a,stype2a,stype3a,pgu1a,pgu2a,log(agea),firstinta/1000,totsezbasea)
fita <- crr(snearma$with.Withtime,csaea,covaea,failcode=2,cencode=0)

and in case b:

covaeb <- cbind(sexb,fshb,fdrb,nsignb,eegb,th1b,th2b,stype1b,stype2b,stype3b,stype4b,stype5b,pgu1b,pgu2b,(ageb/10)^(-1),firstintb,log(totsezbaseb))
fitb <- crr(snearmb$with.Withtime,csaeb,covaeb,failcode=2,cencode=0)

csaea and csaeb are the censoring indicators for a and b respectively, which equal 1 for the event of interest, 2 for the competing risks event and 0 otherwise. Can anyone suggest a reason for the error message? I've tried running fitb with variants of covaeb and, irrespective of the order of the covariates in the matrix, the code runs fine with 16 of the 17 covariates included but then produces an error message when the 17th is added. Thank you for your help, Laura
[R] Fractional Polynomials in Competing Risks setting
Dear All, I have analysed time-to-event data for continuous variables by considering the multivariable fractional polynomial (MFP) model and comparing this to the untransformed and log-transformed models to determine which transformation, if any, is best. This was possible as the Cox model was the underlying model. However, I am now in the situation where the assumption that the competing risks are independent is no longer true and therefore I cannot use the Cox model. The code I used to get the MFP model was:

coxfitf <- mfp(Surv(with.Withtime,with.Wcens)~fp(all.age),family=cox,data=nearma,select=0.05,verbose=TRUE)

where with.Withtime is the time to treatment withdrawal, with.Wcens is the censoring indicator for the event and all.firstint is the age at baseline. To look at the competing risks regression modelling when age is untransformed, I can use the following code:

fitn <- crr(nearma$with.Withtime,censaeb,as.matrix(nearma$all.age),failcode=2,cencode=0)

where censaeb is the censoring indicator which is coded 1 for the event of interest (time to treatment failure), 2 for the competing risk and 0 for a censored value. Can anyone suggest how I can effectively combine these situations, i.e. is there a way to apply the fractional polynomial transformation to the variable to ascertain whether the transformation improves the model fit? I've tried the following code but it doesn't actually transform the variable:

fitf=crr(nearmb$with.Withtime,censaeb,as.matrix(fp(nearmb$all.firstint)),failcode=2,cencode=0)

Thank you for your help, Laura
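Since crr has no fp() machinery, one pragmatic route is to apply candidate fractional-polynomial powers to the covariate by hand and compare the resulting crr fits. This is a sketch under assumptions: simulated data stand in for the real variables, the power pair (0, 1) — i.e. log(x) and x — is one hypothetical choice from the usual FP set {-2, -1, -0.5, 0, 0.5, 1, 2, 3}, and the loglik component (the log pseudo-likelihood) is assumed to be returned by your cmprsk version.

```r
library(cmprsk)

# Simulated stand-in data for the competing-risks setting
set.seed(1)
ftime   <- rexp(200)                         # follow-up time
fstatus <- sample(0:2, 200, replace = TRUE)  # 0 censored, 1 event, 2 competing risk
age     <- runif(200, 10, 80)

# Hand-built FP2 design with powers (0, 1): log(age) and age
cov1 <- cbind(log(age), age)
fit1 <- crr(ftime, fstatus, cov1, failcode = 2, cencode = 0)
fit1$loglik   # compare this across candidate power pairs to pick the best
```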
[R] Isolating a single plot from plots produced simultaneously
Dear R-Help, I am using the 'mfp' package. It produces three plots (as I am using the Cox model) simultaneously which can be viewed together using the following code:

fit <- mfp(Surv(rem.Remtime,rem.Rcens)~fp(age)+strata(rpa),family=cox,data=nearma,select=0.05,verbose=TRUE)
par(mfrow=c(2,2))
plot(fit)

They can be viewed separately, but the return key must be pressed after each graph appears ("Click or hit ENTER for next page"). I'd like to isolate the second plot produced (the estimated functional form of the influence of age on the log relative hazard) so that I can use the 'points' function to add the linear predictors for the untransformed and the log-transformed models. In the usual situation one would produce a plot and then type:

coxfitu <- coxph(Surv(rem.Remtime,rem.Rcens)~age+strata(rpa),data=nearma)
points(coxfitu$linear.predictor,col=2)
coxfitl <- coxph(Surv(rem.Remtime,rem.Rcens)~log(age)+strata(rpa),data=nearma)
points(coxfitl$linear.predictor,col=3)

Can anyone tell me how to isolate just the second plot produced? Thank you for your help, Laura
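One generic base-graphics workaround (a sketch, not an mfp-specific feature): low-level calls such as points() always draw on the most recently completed plot, so if you stop advancing through the sequence once the panel of interest is on screen, additions land on that panel. The objects fit and coxfitu below are assumed to be those from the post.

```r
# Sketch: step through the plots and stop on the panel of interest.
par(mfrow = c(1, 1), ask = TRUE)  # one plot per page, pause between pages
plot(fit)                         # advance once so the second panel (the
                                  # functional form of age) is the current
                                  # plot, then interrupt (Esc) rather than
                                  # pressing ENTER again
points(coxfitu$linear.predictor, col = 2)  # added to that panel
```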
[R] Forcing a variable into a model using stepAIC
Dear All, I am attempting to use forward and/or backward selection to determine the best model for the variables I have. Unfortunately, because I am dealing with patients and every patient is receiving treatment, I need to force the variable for treatment into the model. Is there a way to do this using R? (Additionally, the model is stratified by randomisation period.) I know that SAS can be used to do this but my SAS coding is poor and consequently it would be easier for me to use R, especially given the fractional polynomial transformations! Currently the model is as follows (without treatment):

coxfita=coxph(Surv(rem.Remtime,rem.Rcens)~sind(nearma)+fsh(nearma)+fdr(nearma)+th1(nearma)+th2(nearma)+fp(cage)+fp(fint)+fp(tsb)+strata(rpa),data=nearma)

Thank you for your help, Laura
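A sketch of the usual way to force a term with MASS::stepAIC: any term listed in the `lower` component of `scope` is kept in every candidate model. The survival package's lung data and its ph.ecog variable stand in for the real dataset and the treatment variable.

```r
library(MASS)       # stepAIC
library(survival)

# Complete-case stand-in data (stepAIC needs a constant row count)
lung2 <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog", "ph.karno")])
full  <- coxph(Surv(time, status) ~ age + sex + ph.ecog + ph.karno, data = lung2)

# ph.ecog (the stand-in for "treatment") appears in `lower`, so selection
# may drop age, sex or ph.karno but can never drop ph.ecog; a strata()
# term can be included in both scopes in the same way.
sel <- stepAIC(full,
               scope = list(lower = ~ ph.ecog,
                            upper = ~ age + sex + ph.ecog + ph.karno),
               direction = "both", trace = FALSE)
sel$formula   # the selected model, with ph.ecog retained
```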
[R] Survreg/psm output
Dear R-listers, I know that there have been many, many posts on the output from survreg. To summarise what I have read: Scale is 1/shape of the Weibull, which is also the standard deviation of the log survival time, and Intercept is log(scale). I also know that the hazard function can be calculated from the output to give something like:

h(time) = exp(-intercept)^scale x scale x time^(scale-1)

So, for example, if the intercept was -1.11 and the scale was 1.17 then the hazard function is h(t) = exp(1.11)^1.17 x 1.17 x t^0.17 = 4.30 t^0.17. However, how can I work out what the hazard is, i.e. by how much does the risk of death increase or decrease with time? Some people have said that if the scale parameter is greater than 1 then the risk of death increases with age. Is this correct? I am also aware that Harrell has posted previously that there is a case study in his book. I have a copy of the book and have read all the case studies but I'm still not convinced I know what the hazard of death is?! Sorry that this relates to previously posted queries, but as yet there doesn't seem to be a satisfactory solution to the question of the output? Thank you for your help as always, Laura
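To make the algebra concrete, here is a sketch using survreg's parameterisation: with intercept mu and reported scale sigma, the Weibull shape is 1/sigma and the hazard is h(t) = shape * exp(-mu/sigma) * t^(shape - 1). The direction of the hazard is read off the exponent: it increases with time when sigma < 1 (shape > 1) and decreases when sigma > 1. The lung data stand in for the real dataset.

```r
library(survival)

fit   <- survreg(Surv(time, status) ~ 1, data = lung, dist = "weibull")
mu    <- unname(coef(fit)[1])   # intercept
sigma <- fit$scale              # survreg's "Scale" = 1 / Weibull shape
shape <- 1 / sigma

# Weibull hazard implied by the fit: increasing in t iff shape > 1
h <- function(t) shape * exp(-mu / sigma) * t^(shape - 1)
curve(h, from = 1, to = 1000, xlab = "t (days)", ylab = "hazard")
```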
Re: [R] Schoenfeld residuals
Hi, Thank you for your comments and apologies for the delay in replying. rem.Rcens = 1 for the censored variables. The problem arises because I am not strictly looking at time to death. Instead I am looking at time to 12-month remission in epilepsy. Therefore a lot of people have the same event time, i.e. they successfully achieve 12-month remission from day 1 of the treatment. I think I shall avoid the problem by 'excluding' patients with immediate 12-month remission, i.e. I will look at patients with immediate success in a separate analysis to patients with delayed success. Thanks for your help, Laura

2009/4/6 Terry Therneau thern...@mayo.edu: Laura Bonnett was kind enough to send me a copy of the data that caused the plotting error, since it was an error I had not seen before.

1. The latest version of survival gives a nicer error message:

fit <- coxph(Surv(rem.Remtime, rem.Rcens) ~ all.sex, nearma)
cfit <- cox.zph(fit)
plot(cfit)
Error in plot.cox.zph(cfit) : Spline fit is singular, try a smaller degrees of freedom

2. What's the problem? There are 1085 events in the data set (rem.Rcens==1), and of these 502 are tied events on exactly day 365. The plot.cox.zph function tries to fit a smoothing spline to the data to help the eye; the fit gives weight 1 to each death, and having this high a proportion of ties creates problems for the underlying regression.

3. plot(cfit, df=2)
Warning messages:
1: In approx(xx, xtime, seq(min(xx), max(xx), length = 17)[2 * (1:8)]) : collapsing to unique 'x' values
2: In approx(xtime, xx, temp) : collapsing to unique 'x' values

These warning messages are ignorable. I'll work on making them go away.

4. A shot in the dark -- is perchance the variable rem.Rcens=1 a marker of a censored observation, and the events are 0? (A whole lot of events at 1 year is suspicious, but half censored at one year is believable.)
Then the proper coxph code is

fit2 <- coxph(Surv(rem.Remtime, rem.Rcens==0) ~ all.sex, nearma)

Terry Therneau
[R] Schoenfeld Residuals
Dear All, Sorry to bother you again. I have a model:

coxfita=coxph(Surv(rem.Remtime/365,rem.Rcens)~all.sex,data=nearma)

and I'm trying to do a plot of Schoenfeld residuals using the code:

plot(cox.zph(coxfita))
abline(h=0,lty=3)

The error message I get is:

Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In sqrt(x$var[i, i] * seval) : NaNs produced
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

My data (nearma) has a lot of rem.Remtime entries which are equal, i.e. large amounts of tied data. If I remove the entries where this is the case from the dataset I get the results I want! Please can someone explain why removing patients with tied remission times has such an effect on the code, and also how to remedy the problem without removing patients? Thank you very much, Laura.
Re: [R] Schoenfeld Residuals
Thank you for your comments. I have about 200 out of 2000 tied data points which makes the situation more complicated! I'll have a look at the book section you referred to. With regards to making the ylim finite, I'm not sure how I can go about that given that I don't understand why it isn't already! Thank you for your help, Laura

2009/4/3 David Winsemius dwinsem...@comcast.net: I am not sure that ties are the only reason. If I create a few ties in the ovarian dataset that Therneau and Lumley provide, all I get are some warnings:

ovarian[4:5, 1] <- mean(ovarian[4:5, 1])
ovarian[6:8, 1] <- mean(ovarian[6:8, 1])
fit <- coxph(Surv(futime, fustat) ~ age + rx, ovarian)
temp <- cox.zph(fit)
plot(temp)
Warning messages:
1: In approx(xx, xtime, seq(min(xx), max(xx), length.out = 17)[2 * : collapsing to unique 'x' values
2: In approx(xtime, xx, temp) : collapsing to unique 'x' values

The error message you get is requesting a finite ylim. Have you considered acceding to that request? Alternative: assuming the number of tied survival times is modest, have you tried jitter-ing the rem.Remtime variable a few times to see if the results are stable? If the number of ties is large, then you need to review Therneau & Grambsch section 3.3 -- David Winsemius

On Apr 3, 2009, at 7:57 AM, Laura Bonnett wrote: Dear All, Sorry to bother you again. I have a model:

coxfita=coxph(Surv(rem.Remtime/365,rem.Rcens)~all.sex,data=nearma)

and I'm trying to do a plot of Schoenfeld residuals using the code:

plot(cox.zph(coxfita))
abline(h=0,lty=3)

The error message I get is:

Error in plot.window(...) : need finite 'ylim' values
In addition: Warning messages:
1: In sqrt(x$var[i, i] * seval) : NaNs produced
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

My data (nearma) has a lot of rem.Remtime entries which are equal, i.e. large amounts of tied data.
If I remove the entries where this is the case from the dataset I get the results I want! Please can someone explain why removing patients with tied remission times has such an effect on the code, and also how to remedy the problem without removing patients? Thank you very much, Laura.

David Winsemius, MD Heritage Laboratories West Hartford, CT
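David's jitter suggestion can be sketched as follows, with the survival package's lung data standing in for nearma; refitting with a few different seeds and comparing the smoothed curves shows whether the ties, rather than the data, are driving the instability.

```r
library(survival)

set.seed(42)
d      <- lung
d$time <- jitter(d$time, amount = 0.5)   # break ties by a fraction of a day

fit <- coxph(Surv(time, status) ~ sex, data = d)
plot(cox.zph(fit))     # repeat with other seeds; stable curves suggest the
abline(h = 0, lty = 3) # ties caused the non-finite ylim, not the covariate
```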
[R] Centring variables in Cox Proportional Hazards Model
Dear All, I am contemplating centring the covariates in my Cox model to reduce multicollinearity between the predictors and the interaction term, and to render a more meaningful interpretation of the regression coefficients. Suppose I have two indicator variables, x1 and x2, which represent age categories (x1 is patients less than 16 while x2 is for patients older than 65). If I use the following Cox model, is there any way I can centre the variables? Do I have to do it before I fit them into the model, and if so, how?

fit2=coxph(Surv(rem.Remtime,rem.Rcens)~x1(partial)+x2(partial),data=partial,method="breslow")

Thank you, Laura
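A sketch of centring before fitting: scale(..., scale = FALSE) subtracts the column mean without rescaling. Note that a Cox model has no intercept, so centring a main effect leaves its coefficient unchanged; the benefit shows up when interaction terms are involved. The lung data and age stand in for the real variables.

```r
library(survival)

lung2       <- lung
lung2$age.c <- as.numeric(scale(lung2$age, scale = FALSE))  # centred age

fit.raw <- coxph(Surv(time, status) ~ age,   data = lung2)
fit.cen <- coxph(Surv(time, status) ~ age.c, data = lung2)
all.equal(unname(coef(fit.raw)), unname(coef(fit.cen)))  # same slope

# With an interaction, centring changes (and makes interpretable)
# the main-effect coefficients:
fit.int <- coxph(Surv(time, status) ~ age.c * sex, data = lung2)
```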
[R] impcat='tree'
Dear All, I am going through a worked example provided by Harrell, Lee and Mark (1996, Statistics in Medicine, 15, 361-387). I know that the code provided is for S-PLUS and R but the languages don't differ enough for this to be a problem. I am using the Hmisc and Design libraries and have used the following code (as shown in the example provided in the referenced paper):

'%in%' <- function(a,b) match(a,b,nomatch=0) > 0   # Define function for easy determination of whether a value is in a list
levels(ekg)[levels(ekg) %in% c('oldMI','recentMI')] <- 'MI'   # Combine last 2 levels and use a new name, MI
pf.coded <- as.integer(pf)   # Save original pf, re-code to 1-4
levels(pf) <- c(levels(pf)[1:3],levels(pf)[3])   # Combine last 2 levels of original

This is where I have the problem. I am writing an imputation rule:

w <- transcan(~sz+sg+ap+sbp+dbp+age+wt+hg+ekg+pf+bm+hx,imputed=TRUE,data=prostate,impcat='tree')

However, I get the following error message(s):

Convergence criterion: 1.511 0.787 0.41 0.215 0.115 0.062
Error: could not find function "tree"
In addition: Warning messages:
1: In approx(y, x, xout = aty, rule = rule) : collapsing to unique 'x' values
2: In approx(y, x, xout = aty, rule = rule) : collapsing to unique 'x' values
3: In approx(y, x, xout = aty, rule = rule) : collapsing to unique 'x' values
4: In approx(y, x, xout = aty, rule = rule) : collapsing to unique 'x' values

Has anyone had a similar problem? If so, any solution? Thank you, Laura
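The "could not find function tree" error most likely means the tree package, which transcan(impcat='tree') calls internally, is not installed or attached. A sketch of the check (the fallback impcat value is an assumption — check the transcan help page of your Hmisc version for the impcat options it actually supports):

```r
# Make the tree package available before calling transcan(impcat = 'tree')
if (!requireNamespace("tree", quietly = TRUE))
  install.packages("tree")
library(tree)

# then re-run the imputation rule as posted:
# w <- transcan(~ sz + sg + ap + sbp + dbp + age + wt + hg + ekg + pf + bm + hx,
#               imputed = TRUE, data = prostate, impcat = 'tree')
```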
Re: [R] Dose Profile
Hi Jim, Thank you very much for your help. It is much appreciated. Bert, the reason I didn't ask someone at work is because none of us knew. The person I referred to in my email thought it might be possible but didn't know how. I can now inform everyone. Thanks, Laura

On Wed, Oct 29, 2008 at 8:46 AM, Jim Lemon [EMAIL PROTECTED] wrote: Laura Bonnett wrote: Hi Everyone, I have data in a wide format, i.e. there is one row per patient but each follow-up appointment is included in the row. So, a snippet of the data looks like this:

TrialNo  Drug Sex Rand  Adate1     Date1 Dose1 Time1 Adate2     Date2 Dose2 Time2
B1001029 LTG  M   15719 30/04/2003 15825 150   106   29/08/2003 15946 200   227
B1117003 LTG  M   15734 30/04/2003 15825 200   91    03/09/2003 15951 250   217
B138015  LTG  M   14923 06/02/2001 15012 225   89    08/05/2001 15103 300   180
B112003  TPM  F   14914 15/01/2001 14990 60    76    05/03/2001 15039 100   125

Of course, not everyone has the same number of follow-up appointments and so there may be some column entries that are NAs. What I'd like to do is a dose profile, i.e. a plot of time on the x-axis against dose on the y-axis for each patient, ideally colouring lines according to drug. Does anyone know how I can do this? Someone at work has suggested that I use plot and then loess.smooth but I don't really know what to do.

Hi Laura, You can get a basic plot like this:

matplot(rbind(lbonnett$Time1,lbonnett$Time2),
        rbind(lbonnett$Dose1,lbonnett$Dose2),
        type="l", col=lbonnett$Drug, lty=1)

As you can see by running this, you will get a line for each patient, with the color of each line determined by the drug (you can select different colors, I was just being lazy). The problem you are likely to face is that there will be gaps in the lines if you have NAs. You can rejig matplot or write a similar function to get around this so that the lines are just drawn across the gaps if that is what you want.
Jim
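Jim's matplot approach breaks lines at NAs; one way around this, sketched below on a tiny stand-in for the posted snippet, is to draw each patient's line yourself after dropping the missing visits, so the line runs across the gaps.

```r
# Stand-in data shaped like the posted snippet (two visits per patient,
# one visit missing for the second patient)
doses <- data.frame(Drug  = factor(c("LTG", "LTG", "TPM")),
                    Time1 = c(106, 91, 76),  Dose1 = c(150, 200, 60),
                    Time2 = c(227, NA, 125), Dose2 = c(200, NA, 100))

plot(NA, xlim = c(0, 250), ylim = c(0, 320),
     xlab = "Time (days)", ylab = "Dose", main = "Dose profiles")
for (i in seq_len(nrow(doses))) {
  tt <- as.numeric(doses[i, c("Time1", "Time2")])
  dd <- as.numeric(doses[i, c("Dose1", "Dose2")])
  ok <- !is.na(tt) & !is.na(dd)            # skip missing visits
  lines(tt[ok], dd[ok], col = as.integer(doses$Drug)[i])
}
legend("topleft", legend = levels(doses$Drug),
       col = seq_along(levels(doses$Drug)), lty = 1)
```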
[R] Dose Profile
Hi Everyone, I have data in a wide format, i.e. there is one row per patient but each follow-up appointment is included in the row. So, a snippet of the data looks like this:

TrialNo  Drug Sex Rand  Adate1     Date1 Dose1 Time1 Adate2     Date2 Dose2 Time2
B1001029 LTG  M   15719 30/04/2003 15825 150   106   29/08/2003 15946 200   227
B1117003 LTG  M   15734 30/04/2003 15825 200   91    03/09/2003 15951 250   217
B138015  LTG  M   14923 06/02/2001 15012 225   89    08/05/2001 15103 300   180
B112003  TPM  F   14914 15/01/2001 14990 60    76    05/03/2001 15039 100   125

Of course, not everyone has the same number of follow-up appointments and so there may be some column entries that are NAs. What I'd like to do is a dose profile, i.e. a plot of time on the x-axis against dose on the y-axis for each patient, ideally colouring lines according to drug. Does anyone know how I can do this? Someone at work has suggested that I use plot and then loess.smooth but I don't really know what to do. Thank you very much for your help, Laura
[R] na.pass
Hi All, I have a data frame which has columns comprised mainly of NAs. I know there are functions na.pass and na.omit etc. which can be used in these situations; however, I can't get them to work in this case. I have a function which returns the data according to some rule, i.e. removal of "N" in this code:

nep <- function(data) {
  dummy <- rep(0,378)
  for(i in 1:378){
    if(is.na(data$with.Wcode)[i]) data$with.Wcode[i] <- "O"
  }
  for(i in 1:378){
    if(data$with.Wcode[i]=="N") dummy[i] <- i
  }
  return(data[-dummy,])
}

However, I really don't want to replace the NAs with "O". I'd just like to gloss over them. I can't just delete them because the structure of the data frame needs to be maintained. Can anyone suggest how I can write in a line or two to ignore the NAs instead of replacing them? I've tried this code but it doesn't work!

nep <- function(data) {
  dummy <- rep(0,378)
  for(i in 1:378){
    na.pass(data$with.Wcode[i])
    if(data$with.Wcode[i]=="N") dummy[i] <- i
  }
  return(data[-dummy,])
}

Thank you, Laura
Re: [R] na.pass
I have a data frame. It has lots of patient information: their age, their gender, etc. I need to keep all this information whilst selecting relevant rows. So, in the example of code I provided, I want to remove all those patients who have entry "N" in the column with.Wcode. The dimension of the data is 378, i.e. 378 patients, and currently I am replacing any NA entries in column with.Wcode with the letter "O" as this is another level of the same column. Does that make more sense?

nep <- function(data) {
  dummy <- rep(0,378)
  for(i in 1:378){
    if(is.na(data$with.Wcode)[i]) data$with.Wcode[i] <- "O"
  }
  for(i in 1:378){
    if(data$with.Wcode[i]=="N") dummy[i] <- i
  }
  return(data[-dummy,])
}

How can I therefore not replace NA with level "O" but instead ignore the NAs and effectively gloss over them? Thank you, Laura

On Mon, Oct 13, 2008 at 12:42 PM, jim holtman [EMAIL PROTECTED] wrote: Not sure exactly what you are trying to do since you did not provide commented, minimal, self-contained, reproducible code. Let me take a guess in that you also have to test for NAs:

x <- sample(c("N", "A", "B", NA), 20, TRUE)
x
 [1] "A" "A" "B" NA  "N" NA  NA  "B" "B" "N" "N" "N" "B" "A" NA  "A" "B" NA  "A" NA
x != "N"
 [1]  TRUE  TRUE  TRUE    NA FALSE    NA    NA  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE    NA  TRUE  TRUE    NA
[19]  TRUE    NA
x[x != "N"]
 [1] "A" "A" "B" NA NA NA "B" "B" "B" "A" NA "A" "B" NA "A" NA
(!is.na(x)) & (x != "N")
 [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE
[19]  TRUE FALSE
x[(!is.na(x)) & (x != "N")]
 [1] "A" "A" "B" "B" "B" "B" "A" "A" "B" "A"

On Mon, Oct 13, 2008 at 7:15 AM, Laura Bonnett [EMAIL PROTECTED] wrote: Hi All, I have a data frame which has columns comprised mainly of NAs. I know there are functions na.pass and na.omit etc. which can be used in these situations; however, I can't get them to work in this case. I have a function which returns the data according to some rule, i.e. removal of "N" in this code:

nep <- function(data) {
  dummy <- rep(0,378)
  for(i in 1:378){
    if(is.na(data$with.Wcode)[i]) data$with.Wcode[i] <- "O"
  }
  for(i in 1:378){
    if(data$with.Wcode[i]=="N") dummy[i] <- i
  }
  return(data[-dummy,])
}

However, I really don't want to replace the NAs with "O". I'd just like to gloss over them. I can't just delete them because the structure of the data frame needs to be maintained. Can anyone suggest how I can write in a line or two to ignore the NAs instead of replacing them? I've tried this code but it doesn't work!

nep <- function(data) {
  dummy <- rep(0,378)
  for(i in 1:378){
    na.pass(data$with.Wcode[i])
    if(data$with.Wcode[i]=="N") dummy[i] <- i
  }
  return(data[-dummy,])
}

Thank you, Laura

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
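Jim's guess can be turned directly into a one-line replacement for the loop: keep a row when with.Wcode is NA or anything other than "N". A self-contained sketch on toy data standing in for the real 378-row frame:

```r
# Toy stand-in for the real data frame
df <- data.frame(id         = 1:6,
                 with.Wcode = c("N", "O", NA, "N", NA, "O"),
                 stringsAsFactors = FALSE)

# keep NAs, drop only the explicit "N" rows -- no loop, no recoding to "O"
nep <- function(data) data[is.na(data$with.Wcode) | data$with.Wcode != "N", ]
nep(df)   # rows 2, 3, 5 and 6 survive; the NA rows are untouched
```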
[R] Hazard plot
Hi All, This sounds a relatively simple query, and I hope it is! I am looking at a continuous variable, age, and at time to 12-month remission. I can calculate the HR and 95% confidence interval as follows:

coxfita = coxph(Surv(rem.Remtime,rem.Rcens)~nearma$all.age,data=nearma)
exp(coxfita$coefficients)
exp(confint(coxfita))

However, because I am looking at age as a continuous variable I cannot draw a Kaplan-Meier curve. Instead I need to draw a plot of hazard against age. How do I do this? I don't think plot(nearma$all.age,coxfita$linear.predictors) is quite right. Thank you for your help, Laura
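One standard display for a continuous covariate, sketched below with lung standing in for nearma: plot the fitted relative hazard exp(beta * (age - reference)) over the observed age range, here taking the mean age as the reference.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age, data = lung)

age.grid <- seq(min(lung$age), max(lung$age), length.out = 100)
rel.haz  <- exp(coef(fit)["age"] * (age.grid - mean(lung$age)))

plot(age.grid, rel.haz, type = "l",
     xlab = "Age (years)", ylab = "Hazard ratio relative to mean age")
abline(h = 1, lty = 3)   # reference line at the mean age
```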
Re: [R] Generalising to n-dimensions
Sorry to hassle you, but I really need to get my code up and running. Please can you therefore explain what a and v are? Thank you, Laura

On Wed, Sep 24, 2008 at 8:27 PM, Laura Bonnett [EMAIL PROTECTED] wrote: Can I ask what a and v are? Thanks, Laura

On Sat, Aug 23, 2008 at 11:41 AM, Robin Hankin [EMAIL PROTECTED] wrote: Laura Bonnett wrote:

crosstable[,,expand[d,1],expand[d,2],expand[d,3],...,expand[d,n]]

crosstable is just a crosstabulation of an n+2-dimensional dataset and I am trying to pick out those that are in combination 'd' of expand. So for example, for 5-dimensional data using your example:

   Var1 Var2 Var3
1     1    1    1
2     2    1    1
3     3    1    1
4     1    2    1
5     2    2    1
6     3    2    1
7     1    1    2
8     2    1    2
9     3    1    2
10    1    2    2
11    2    2    2
12    3    2    2

d refers to the row of the matrix above - d=2 is 2,1,1 so crosstable[,,2,1,1] would retrieve all the data where Var1=2, Var2=1, Var3=1, and the two remaining variables are given in the crosstabulations for all values. Is that any better?

OK, I think I understand. The magic package uses this type of construction extensively, but not this particular one. It's trickier than I'd have expected. Try this:

f <- function(a,v){
  jj <- sapply(dim(a)[seq_len(length(dim(a))-length(v))],seq_len,simplify=FALSE)
  jj <- c(jj, as.list(v))
  do.call("[", c(list(a), jj, drop=TRUE))
}

[you will have to coerce the output from expand.grid() to a matrix in order to extract a row from it] HTH rksh

-- Robin K. S. Hankin, Senior Research Associate, Cambridge Centre for Climate Change Mitigation Research (4CMR), Department of Land Economy, University of Cambridge [EMAIL PROTECTED] 01223-764877
Re: [R] Generalising to n-dimensions
Can I ask what a and v are? Thanks, Laura

On Sat, Aug 23, 2008 at 11:41 AM, Robin Hankin [EMAIL PROTECTED] wrote: Laura Bonnett wrote:

crosstable[,,expand[d,1],expand[d,2],expand[d,3],...,expand[d,n]]

crosstable is just a crosstabulation of an n+2-dimensional dataset and I am trying to pick out those that are in combination 'd' of expand. So for example, for 5-dimensional data using your example:

   Var1 Var2 Var3
1     1    1    1
2     2    1    1
3     3    1    1
4     1    2    1
5     2    2    1
6     3    2    1
7     1    1    2
8     2    1    2
9     3    1    2
10    1    2    2
11    2    2    2
12    3    2    2

d refers to the row of the matrix above - d=2 is 2,1,1 so crosstable[,,2,1,1] would retrieve all the data where Var1=2, Var2=1, Var3=1, and the two remaining variables are given in the crosstabulations for all values. Is that any better?

OK, I think I understand. The magic package uses this type of construction extensively, but not this particular one. It's trickier than I'd have expected. Try this:

f <- function(a,v){
  jj <- sapply(dim(a)[seq_len(length(dim(a))-length(v))],seq_len,simplify=FALSE)
  jj <- c(jj, as.list(v))
  do.call("[", c(list(a), jj, drop=TRUE))
}

[you will have to coerce the output from expand.grid() to a matrix in order to extract a row from it] HTH rksh

-- Robin K. S. Hankin, Senior Research Associate, Cambridge Centre for Climate Change Mitigation Research (4CMR), Department of Land Economy, University of Cambridge [EMAIL PROTECTED] 01223-764877
[R] Generalising to n-dimensions
Hi R-helpers, I have two queries relating to generalising to n dimensions. What I want to do in the first one is generalise the following statement:

expand <- expand.grid(1:x[1],1:x[2],...,1:x[n])

where x is a vector of integers and expand.grid gives every combination of the set of numbers. So, for example, expand.grid(1:2, 1:3) takes 1,2 and 1,2,3 and gives 1,1 2,1 1,2 2,2 1,3 2,3. My x vector has varying lengths and I can't find a way of giving it every set without stating each set individually.

Secondly and similarly, I want to get the table within crosstable that has the elements defined by the combinations given in expand above:

crosstable[,,expand[d,1],expand[d,2],expand[d,3],...,expand[d,n]]

where crosstable is just a crosstabulation of an n+2-dimensional dataset and I am trying to pick out those that are in combination 'd' of expand. So for example, using x[1]=2 and x[2]=3 as above, if d=2 then the order is 2,1 so I take crosstable[,,2,1]. Can anyone suggest a way to give the code every set without stating each set individually? Thank you
Re: [R] Generalising to n-dimensions
crosstable[,,expand[d,1],expand[d,2],expand[d,3],...,expand[d,n]]

crosstable is just a crosstabulation of an n+2-dimensional dataset and I am trying to pick out those that are in combination 'd' of expand. So for example, for 5-dimensional data using your example:

   Var1 Var2 Var3
1     1    1    1
2     2    1    1
3     3    1    1
4     1    2    1
5     2    2    1
6     3    2    1
7     1    1    2
8     2    1    2
9     3    1    2
10    1    2    2
11    2    2    2
12    3    2    2

d refers to the row of the matrix above - d=2 is 2,1,1 so crosstable[,,2,1,1] would retrieve all the data where Var1=2, Var2=1, Var3=1, and the two remaining variables are given in the crosstabulations for all values. Is that any better?

On Sat, Aug 23, 2008 at 10:40 AM, Robin Hankin [EMAIL PROTECTED] wrote: First bit:

x <- c(3,2,2)
expand.grid(sapply(x,seq_len))
   Var1 Var2 Var3
1     1    1    1
2     2    1    1
3     3    1    1
4     1    2    1
5     2    2    1
6     3    2    1
7     1    1    2
8     2    1    2
9     3    1    2
10    1    2    2
11    2    2    2
12    3    2    2

second bit I'm not sure about. I didn't quite get why d=2 implied the order is 2,1. Could you post a small self-contained example? HTH rksh

Laura Bonnett wrote: Hi R-helpers, I have two queries relating to generalising to n dimensions. What I want to do in the first one is generalise the following statement:

expand <- expand.grid(1:x[1],1:x[2],...,1:x[n])

where x is a vector of integers and expand.grid gives every combination of the set of numbers. So, for example, expand.grid(1:2, 1:3) takes 1,2 and 1,2,3 and gives 1,1 2,1 1,2 2,2 1,3 2,3. My x vector has varying lengths and I can't find a way of giving it every set without stating each set individually.

Secondly and similarly, I want to get the table within crosstable that has the elements defined by the combinations given in expand above:

crosstable[,,expand[d,1],expand[d,2],expand[d,3],...,expand[d,n]]

where crosstable is just a crosstabulation of an n+2-dimensional dataset and I am trying to pick out those that are in combination 'd' of expand. So for example, using x[1]=2 and x[2]=3 as above, if d=2 then the order is 2,1 so I take crosstable[,,2,1].
Can anyone suggest a way to give the code every set without stating each set individually? Thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Robin K. S. Hankin Senior Research Associate Cambridge Centre for Climate Change Mitigation Research (4CMR) Department of Land Economy University of Cambridge [EMAIL PROTECTED] 01223-764877 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
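For the second part, one way to avoid writing out each index is to build the index list programmatically and hand it to `[` with do.call. This is only a sketch in base R, with a small made-up `crosstable` so it is self-contained; `lapply` is used instead of `sapply` so the result is always a list even when all entries of x are equal:

```r
# Generalised expand.grid over a vector of run lengths:
x <- c(2, 3)
expand <- expand.grid(lapply(x, seq_len))

# Toy (n+2)-dimensional crosstabulation, here n = 2:
crosstable <- array(seq_len(2 * 2 * 2 * 3), dim = c(2, 2, 2, 3))

# Row d of 'expand' picks one combination; TRUE in the first two slots
# means "keep all values" of the two remaining variables:
d <- 2
ix <- c(rep(list(TRUE), 2), unname(as.list(expand[d, ])))
slice <- do.call(`[`, c(list(crosstable), ix))  # same as crosstable[, , 2, 1]
```

Because `ix` is built from `expand[d, ]`, the same two lines work for any n without spelling the indices out.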
Re: [R] Variance-covariance matrix
Hi all,

Sorry to ask again, but I'm still not sure how to get the full variance-covariance matrix. Peter suggested a three-level treatment factor. However, I thought that the censoring variable could only take the values 0 or 1, so how do you programme such a factor? Alternatively, is there another way to produce the required covariance?

Thank you,
Laura

On Tue, Aug 26, 2008 at 11:37 AM, Laura Bonnett [EMAIL PROTECTED] wrote:

The standard treatment is the same in both comparisons. How do you do a three-level treatment factor? I thought you had to have a censoring indicator which took the values 0 or 1, not 1, 2 or 3?

Thanks,
Laura

On Tue, Aug 26, 2008 at 11:05 AM, Peter Dalgaard [EMAIL PROTECTED] wrote:

Laura Bonnett wrote: [...] Can anyone tell me how to obtain the full covariance matrix?

What kind of data do you have? Is the standard treatment group the same in both comparisons? If so, why not just have a three-level treatment factor and compare nt1 to nt2 directly? If the control groups are completely separate, then the covariance between fits made on independent data is of course zero.

-- O__ Peter Dalgaard, Øster Farimagsgade 5, Entr. B, Dept. of Biostatistics, PO Box 2099, 1014 Cph. K, University of Copenhagen, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907
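A sketch of the approach Peter describes, with invented data (variable and level names here are made up, not from the original analysis): the *treatment* variable carries the three levels, while the censoring indicator stays 0/1, and releveling lets you read off nt1 vs nt2 directly:

```r
library(survival)

# Hypothetical data: 'treat' is a three-level factor; 'status' stays 0/1.
set.seed(42)
d <- data.frame(
  time   = rexp(150),
  status = rbinom(150, 1, 0.7),
  treat  = factor(rep(c("std", "nt1", "nt2"), each = 50),
                  levels = c("std", "nt1", "nt2"))
)

# One model gives both comparisons against the standard treatment:
fit <- coxph(Surv(time, status) ~ treat, data = d)
exp(coef(fit))   # HRs for nt1:std and nt2:std
vcov(fit)        # full 2x2 variance-covariance matrix of the coefficients

# To compare nt1 to nt2 directly, refit with nt2 as the reference level:
fit2 <- coxph(Surv(time, status) ~ relevel(treat, ref = "nt2"), data = d)
exp(confint(fit2))  # includes the CI for the nt1:nt2 hazard ratio
```

The point is that because all three arms sit in one model, vcov(fit) already contains the covariance needed for any contrast between them.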
Re: [R] Variance-covariance matrix
Here is the exact code I have written, which does standard vs nt1 and standard vs nt2 and also gives me the hazard ratio for nt1 vs nt2.

    with <- read.table("allwiths.txt", header = TRUE)

    fix(arm)
    function (data)
    {
        dummy <- rep(0, 2437)
        for (i in 1:2437) {
            if (data$Arm[i] == "CBZ") dummy[i] <- i
        }
        return(data[-dummy, ])
    }

    fix(x1)
    function (data)
    {
        x1 <- rep(0, 716)
        for (i in 1:716) {
            if (data$Treat[i] == "TPM") x1[i] <- 0
            if (data$Treat[i] == "VPS") x1[i] <- 0
            if (data$Treat[i] == "LTG") x1[i] <- 1
        }
        return(x1)
    }

    fix(x2)
    function (data)
    {
        x2 <- rep(0, 716)
        for (i in 1:716) {
            if (data$Treat[i] == "TPM") x2[i] <- 1
            if (data$Treat[i] == "VPS") x2[i] <- 0
            if (data$Treat[i] == "LTG") x2[i] <- 0
        }
        return(x2)
    }

    fit1 <- coxph(Surv(Withtime, Wcens) ~ x1(armb), data = armb, method = "breslow")
    exp(fit1$coefficients)
    exp(confint(fit1))

    fit2 <- coxph(Surv(Withtime, Wcens) ~ x2(armb), data = armb, method = "breslow")
    exp(fit2$coefficients)
    exp(confint(fit2))

    exp(fit2$coefficients)/exp(fit1$coefficients)

From that, how do I get the necessary variance-covariance matrix? Sorry if I appear dense. It really isn't my intention.

Laura

On Wed, Aug 27, 2008 at 10:36 AM, Peter Dalgaard [EMAIL PROTECTED] wrote:

Laura Bonnett wrote: [...] Peter suggested a three-level treatment factor. However, I thought that the censoring variable could only take values 0 or 1, so how do you programme such a factor?

Well, not to put it too diplomatically: if you can confuse the treatment factor and the censoring indicator, there might be more wrong with your analysis than we originally assumed, so how about showing us a bit more of what you actually did?
Thanks, Laura
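As an aside on the code in this thread: the hand-rolled fix() loops that build the indicator variables can be replaced by vectorised comparisons. This is only a sketch; the column names Arm and Treat echo the post, but the data frame here is invented for illustration:

```r
# Invented stand-in for the real data frame:
dat <- data.frame(Arm   = c("CBZ", "LTG", "CBZ", "TPM", "VPS"),
                  Treat = c("CBZ", "LTG", "CBZ", "TPM", "VPS"),
                  stringsAsFactors = FALSE)

armb <- dat[dat$Arm != "CBZ", ]        # drop the CBZ arm in one step
x1 <- as.numeric(armb$Treat == "LTG")  # 1 for LTG, else 0
x2 <- as.numeric(armb$Treat == "TPM")  # 1 for TPM, else 0
```

No loop lengths (2437, 716) need to be hard-coded, so the same three lines work whatever the sample size.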
[R] Variance-covariance matrix
Dear R help forum,

I am using the function 'coxph' to obtain hazard ratios for the comparison of a standard treatment to new treatments. This is easily obtained by fitting the relevant model and then calling exp(coef(fit1)), say. I now want to obtain the hazard ratio for the comparison of two non-standard treatments. From a statistical point of view, this can be achieved by dividing the exponentiated coefficients of two comparisons. E.g. to compare new treatment 1 (nt1) to new treatment 2 (nt2) we can fit two models:

    fit1 = standard treatment vs nt1
    fit2 = standard treatment vs nt2

The required hazard ratio is therefore exp(coef(fit1))/exp(coef(fit2)). In order to obtain an associated confidence interval for this I require the covariance of this comparison. I know that R gives the variance-covariance matrix by the command 'fit$var'. However, this only gives the covariance matrix for the non-standard drugs and not the full covariance matrix. Can anyone tell me how to obtain the full covariance matrix?

Thank you,
Laura
Re: [R] Variance-covariance matrix
The standard treatment is the same in both comparisons. How do you do a three-level treatment factor? I thought you had to have a censoring indicator which took the values 0 or 1, not 1, 2 or 3?

Thanks,
Laura

On Tue, Aug 26, 2008 at 11:05 AM, Peter Dalgaard [EMAIL PROTECTED] wrote:

Laura Bonnett wrote: [...] Can anyone tell me how to obtain the full covariance matrix?

What kind of data do you have? Is the standard treatment group the same in both comparisons? If so, why not just have a three-level treatment factor and compare nt1 to nt2 directly? If the control groups are completely separate, then the covariance between fits made on independent data is of course zero.
[R] Variance-covariance matrix for parameter estimates
Dear All,

I am currently working with the coxph function within the package survival. I have the model

    h_ij = h_0(t) exp(b1*x1 + b2*x2)

where the indicator variables are as follows:

          x1   x2
    VPS    0    0
    LTG    1    0
    TPM    0    1
[R] Variance-covariance matrix for parameter estimates
(Sorry, my last email appeared to be missing the important bits, so I'll try again!)

Dear All,

I am currently working with the coxph function within the package survival. I have the model

    h_ij = h_0(t) exp(b1*x1 + b2*x2)

where the indicator variables are as follows:

        x1   x2
    A    0    0
    B    1    0
    C    0    1

The hazard rates are:

    B: h_ij(t) = h_0(t) exp(b1)
    C: h_ij(t) = h_0(t) exp(b2)
    A: h_ij(t) = h_0(t)

This therefore means that the hazard ratios are:

    B:A  h_0(t) exp(b1) / h_0(t) = exp(b1)
    C:A  exp(b2)
    B:C  exp(b1) / exp(b2)

Using coxph, it is easy to determine B:A and C:A with their associated confidence intervals, and it is fairly trivial to obtain the hazard ratio for B:C (by isolating the exponentiated coefficients of both previously fitted models and dividing them). However, I cannot work out how to determine the confidence interval for B:C. I know that it will be clear from the variance-covariance matrix, but how do I obtain that? I have looked at functions such as vcov and Var, but the problem I have is that B:A is what I have called fit1 and C:A is called fit2, and there appears to be no easy way to combine these fitted models to produce the required matrix.

Thank you for your help,
Laura

P.S. Sorry again for the 1st attempt.
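One way to get that interval, sketched with made-up data (it assumes all three arms sit in one data frame so a single model can be fitted, rather than the separate fit1/fit2 of the post): with A as the reference level, the log of the B:C hazard ratio is b1 - b2, and its variance follows from the full matrix as var(b1) + var(b2) - 2 cov(b1, b2).

```r
library(survival)

# Hypothetical data with the three arms A, B, C in one data frame:
set.seed(1)
d <- data.frame(
  time   = rexp(120),
  status = rbinom(120, 1, 0.8),
  arm    = factor(rep(c("A", "B", "C"), each = 40))
)
fit <- coxph(Surv(time, status) ~ arm, data = d)

b <- coef(fit)   # b["armB"] = b1, b["armC"] = b2
V <- vcov(fit)   # full variance-covariance matrix of (b1, b2)

# log(B:C hazard ratio) = b1 - b2, with variance from the matrix:
est <- b["armB"] - b["armC"]
se  <- sqrt(V["armB", "armB"] + V["armC", "armC"] - 2 * V["armB", "armC"])
hr.bc <- exp(est)                        # point estimate for B:C
ci.bc <- exp(est + c(-1.96, 1.96) * se)  # approximate 95% Wald CI
```

The same interval can also be obtained without the arithmetic by refitting with C as the reference level (relevel) and reading off confint directly.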
[R] If statements for vectors
Dear Sirs,

I am using both the Bioconductor add-ons (affy, affyPLM, ...) and the 'standard' R package. I am trying to select a list of genes which all have expression values below a certain threshold. I have done this by creating a vector which has 0s where the expression is greater than the threshold and 1s where it is less than or equal to it. Multiplying this vector by the expression values produces a list of 0s and expression values below the threshold. However, now I need to remove the 0s. I thought that this would be relatively trivial, but it appears it isn't! The length of the list (with the 0s and values) is 506994, so I wrote the following:

    for (i in 1:506994) {
        if (exp2[i] > 0) {
            exp3 <- append(1, exp2[i])
        }
        return(exp3)
    }

where exp2 is the vector of 0s and threshold values. However, I have since discovered that 'if' does not work on vectors. The suggestions I have seen on this forum include 'ifelse', which I don't believe to be relevant in this situation, and 'sapply', which again I don't think is relevant. Can anyone therefore tell me how I can produce a new vector from an old one by removing the 0 entries?

Thank you for your help,
Laura
University of Warwick

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
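For what it's worth, the selection above can be done without any loop: logical subsetting drops the zeros in one step, and the intermediate 0/1 vector is not needed at all. The names exp2/threshold echo the post, but the values here are made up:

```r
# Toy stand-in for the real expression values and threshold:
values    <- c(1.4, 0.2, 3.1, 0.05, 0.9)
threshold <- 1.0

# The post's two-step construction, vectorised:
exp2 <- values * (values <= threshold)  # 0s where above threshold
exp3 <- exp2[exp2 > 0]                  # drop the 0 entries

# Simpler still: select below-threshold values directly
exp3.direct <- values[values <= threshold]
```

Note the one caveat of the 0/1-multiplication approach: a genuine expression value of exactly 0 would also be dropped, which the direct comparison avoids.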