Re: [R] what does the S.D. returned by {Hmisc} rcorr.cens measure?
Thanks for your reply Prof Harrell!! Could you kindly list some references of the fomula for calculating the SD of Somer's D in this kind of application? Because I couldnt find any.. -- View this message in context: http://r.789695.n4.nabble.com/what-does-the-S-D-returned-by-Hmisc-rcorr-cens-measure-tp3329609p3331186.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] which does the S.D. returned by {Hmisc} rcorr.cens measure?
Dear R-help, This is an example in the {Hmisc} manual under rcorr.cens function: set.seed(1) x - round(rnorm(200)) y - rnorm(200) round(rcorr.cens(x, y, outx=F),4) C IndexDxy S.D. nmissing uncensored Relevant Pairs Concordant Uncertain 0.4831-0.0338 0.0462 200. 0. 200. 39800. 19228. 0. That S.D. confuses me!! It is obviously not the standard deviation of x or y.. but there is only one realization of the c-index or Dxy for this sample dataset, where does the variation come from..?? if I use the conventional formula for calculating the standard deviation of proportions: sqrt((C Index)*(1-C Index)/n), I get 0.0353 instead of 0.0462.. Any advice is appreciated. Vikki -- View this message in context: http://r.789695.n4.nabble.com/which-does-the-S-D-returned-by-Hmisc-rcorr-cens-measure-tp3329609p3329609.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph
Thank you very very very much Prof Harrell!! You've made my day!! -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3325844.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph
Dear Prof Frank, I tried to simulate an example data set as close as possible to my own real data with the codes below. There are only two covariates, tumor(3 levels) and ecog(3 levels). rx is treatment (4 levels). Validation with the stratified model (by rx) had a negative R2.. and the R2 under the training column was so high...? n-1000 set.seed(110222) data-matrix(rep(0,5000),ncol=5) data[,1]-sample(1:3,n,rep=TRUE,prob=c(.32, .30, .38)) for (i in 1:1000){ if (data[i,1]==1) data[i,2]-sample(1:3,1,prob=c(.76, .18, .06)) if (data[i,1]==2) data[i,2]-sample(1:3,1,prob=c(.67, .24, .09)) if (data[i,1]==3) data[i,2]-sample(1:3,1,prob=c(.47, .37, .16))} for (i in 1:1000){ if (data[i,1]==1) data[i,3]-sample(1:4,1,prob=c(.70, .19, .03, .08)) if (data[i,1]==2) data[i,3]-sample(1:4,1,prob=c(.42, .28, .12, .18)) if (data[i,1]==3) data[i,3]-sample(1:4,1,prob=c(.11, .29, .30, .30))} for (i in 1:1000){ if (data[i,3]==1) data[i,4]-12*rgamma(1000,rate=0.4,shape=1.7)[c(sample(26:975,1,prob=c(rep(1/950,950] if (data[i,3]==2) data[i,4]-12*rgamma(1000,rate=0.9,shape=1.7)[c(sample(26:975,1,prob=c(rep(1/950,950] if (data[i,3]==3) data[i,4]-12*rgamma(1000,rate=1.2,shape=0.6)[c(sample(26:975,1,prob=c(rep(1/950,950] if (data[i,3]==4) data[i,4]-12*rgamma(1000,rate=1.5,shape=0.7)[c(sample(26:975,1,prob=c(rep(1/950,950]} for (i in 1:1000){ if (data[i,3]==1) data[i,5]-sample(c(0,1),1,prob=c(.53, .47)) if (data[i,3]==2) data[i,5]-sample(c(0,1),1,prob=c(.17, .83)) if (data[i,3]==3) data[i,5]-sample(c(0,1),1,prob=c(.05, .95)) if (data[i,3]==4) data[i,5]-sample(c(0,1),1,prob=c(.06, .94))} colnames(data)-c(tumor,ecog,rx,os,censor) data-data.frame(data) attach(data) library(rms) surv.obj-Surv(os,censor) validate(cph(surv.obj~factor(tumor)+factor(ecog),x=T,y=T,surv=T),method=b,B=100,dxy=T) validate(cph(surv.obj~factor(tumor)+factor(ecog)+strat(rx), x=T,y=T,surv=T,time.inc=60),method=b,B=100,dxy=T,u=60) Best regards, Vikki -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3323016.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph
I really appreciate your help Prof Harrell! I followed your instruction and re-ran the second model without strat but with surv=TRUE, time.inc=30, and u=30 to validate, the Dxy was really the same as that in the first model output! But this confused me...shouldn't the Dxy be positive in this case because u was specified to estimate the concordance between surv prob and surv time??? library(survival) attach(colon) S-Surv(time,status) obstruct0-factor(obstruct) perfor0-factor(perfor) adhere0-factor(adhere) differ0-factor(differ) extent0-factor(extent) node40-factor(node4) f2-cph(S~obstruct0+perfor0+adhere0+differ0+extent0+node40,x=T,y=T,surv=T,time.inc=30) set.seed(110221) validate(f2,method=b,B=100,dxy=T,pr=F,u=30) index.orig trainingtest optimism index.corrected n Dxy -0.2918 -0.2932 -0.2861 -0.0070 -0.2847 100 R20.1145 0.1191 0.1104 0.0088 0.1057 100 Slope 1. 1. 0.9626 0.0374 0.9626 100 D 0.0170 0.0178 0.0164 0.0014 0.0156 100 U-0.0002 -0.0002 0.0001 -0.0003 0.0001 100 Q 0.0172 0.0179 0.0162 0.0017 0.0155 100 g 0.5472 0.5590 0.5348 0.0242 0.5230 100 (the Dxy's of -0.54 or 0.6 were estimated from my own data and were not shown here because of the difficulty to produce codes that simulate my data, sorry for the confusion!) Best regards, Vikki -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3318554.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph
Dear R-help, I am having a problem with the interpretation of result from validate.cph in the Design package. My purpose is to fit a cox model and validate the Somer's Dxy. I used the hypothetical data given in the help manual with modification to the cox model fit. My research problem is very similar to this example. This is the model without stratification: library(Design) f1 - cph(S ~ age, x=TRUE, y=TRUE) coef(f) age 0.0440095 set.seed(1) validate(f1, B=10, dxy=T) index.orig training test optimism index.corrected n Dxy -0.3376784858 -0.3287537760 -0.3376784858 0.0089247099 -0.34660320 10 R2 0.0627722521 0.0636136044 0.0627722521 0.0008413523 0.06193090 10 Slope 1.00 1.00 0.9896987441 0.0103012559 0.98969874 10 D 0.0237993965 0.0239476118 0.0237993965 0.0001482153 0.02365118 10 U -0.0008208378 -0.0008141441 0.0007104737 -0.0015246178 0.00070378 10 Q 0.0246202342 0.0247617559 0.0230889228 0.0016728331 0.02294740 10 But if I fit a stratified cox model to the same data, the result becomes: f2- cph(S ~ age + strat(sex), x=TRUE, y=TRUE, surv=TRUE, time.inc=2) coef(f) age 0.04271953 set.seed(1) validate(f2, dxy=TRUE, u=2, B=10) index.orig training test optimism index.corrected n Dxy0.3514778665 0.3259011492 0.3044982080 0.02140294120.3300749254 10 R2 0.0622369082 0.0651967502 0.0622369082 0.00295984190.0592770663 10 Slope 1.00 1.00 0.9621830568 0.03781694320.9621830568 10 D 0.0257519780 0.0267239073 0.0257519780 0.00097192930.0247800486 10 U -0.0009257142 -0.0009125388 0.0009102968 -0.00182283560.0008971213 10 Q 0.0266776922 0.0276364461 0.0248416812 0.00279476490.0238829273 10 The coefficients are similar between the models, so I expect the results would be somewhat similar, yet the two models give totally contrasting Dxy. I reckon a negative Dxy value is normal in the sense that the survival time and the prediction are concordant, but why does the result become discordant when stratification is used? Is there something wrong or is there a sensible interpretation for this? This problem is very critical to me because the Design package is the only one I can use for my purpose. Any advice is greatly appreciated. Best regards, Vikki -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3316820.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph
Thank you very much Prof Harrell! Sorry that I am new to this forum, and so ain't familiar with how to post message appropriately. I repeated the same procedure using a dataset from the {survival} package. This time I used the {rms} package, and 100 bootstrap samples: library(rms) library(survival) attach(colon) S-Surv(time,status) f-cph(S~factor(obstruct)+factor(perfor)+factor(adhere)+factor(differ)+factor(extent)+factor(node4),x=T,y=T,surv=T) # no stratification set.seed(110221) validate(f,method=b,B=100,dxy=T,pr=F) index.orig trainingtest optimism index.corrected n Dxy -0.2918 -0.2932 -0.2861 -0.0070 -0.2847 100 R20.1145 0.1191 0.1104 0.0088 0.1057 100 Slope 1. 1. 0.9626 0.0374 0.9626 100 D 0.0170 0.0178 0.0164 0.0014 0.0156 100 U-0.0002 -0.0002 0.0001 -0.0003 0.0001 100 Q 0.0172 0.0179 0.0162 0.0017 0.0155 100 g 0.5472 0.5590 0.5348 0.0242 0.5230 100 f2-cph(S~factor(obstruct)+factor(perfor)+factor(adhere)+factor(differ)+factor(extent)+factor(node4)+strat(rx),x=T,y=T,surv=T) # with stratification set.seed(110221) validate(f2,method=b,B=100,dxy=T,pr=F,u=30) index.orig training test optimism index.corrected n Dxy 0.1567 0.1966 0.1826 0.0140 0.1426 100 R20.1154 0.1191 0. 0.0081 0.1073 100 Slope 1. 1. 0.9720 0.0280 0.9720 100 D 0.0203 0.0210 0.0195 0.0015 0.0188 100 U-0.0002 -0.0002 0.0001 -0.0003 0.0001 100 Q 0.0205 0.0212 0.0193 0.0018 0.0186 100 g 0.5523 0.5591 0.5402 0.0189 0.5333 100 The same situation happened again. The Dxy's were all in opposite directions. In fact my case is even worse than these examples - the Dxy for non-stratified model was -0.54 but the Dxy for stratified model was almost +0.6; and the bootstrap validated R^2 was even negative!! But..why does this happen?? Thanks a lot, Vikki -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3317743.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] contrasting Somer's D from Design package
Dear R help, I am having a problem with the Design package and my problem is detailed here. I fit a cox model to my data and validate the Somer's Dxy using the Design package. (Because of computation time problem, i only try 10 bootstrap samples for the time being) This is the model without stratification: library(Design) cox1a-cph(surv.obj~factor(ecog)+factor(grade)+factor(tumor)+factor(extra),x=T,y=T) coef1a-coef(cox1a) coef1a ecog=1 ecog=2 grade=2 grade=3 tumor=2 tumor=3 extra=1 0.3578954 0.8993140 0.4834090 0.5716166 0.7600330 1.5974558 0.8112942 validate(cox1a,dxy=T,method=b,B=10) index.orig training test optimism index.corrected n Dxy -5.450122e-01 -5.434214e-01 -5.437194e-01 2.979857e-04 -5.453102e-01 10 R2 4.470271e-01 4.470172e-01 4.458043e-01 1.212920e-03 4.458142e-01 10 Slope 1.00e+00 1.00e+00 9.910835e-01 8.916521e-03 9.910835e-01 10 D 5.315241e-02 5.346057e-02 5.295422e-02 5.063561e-04 5.264606e-02 10 U -4.651872e-05 -4.677365e-05 2.708456e-05 -7.385821e-05 2.733949e-05 10 Q 5.319893e-02 5.350735e-02 5.292713e-02 5.802143e-04 5.261872e-02 10 But if I fit a stratified cox model to the same data, the result becomes: cox1b-cph(surv.obj~factor(ecog)+factor(grade)+factor(tumor)+factor(extra)+strat(ft),x=T,y=T,surv=T) coef1b-coef(cox1b) coef1b ecog=1 ecog=2 grade=2 grade=3 tumor=2 tumor=3 extra=1 0.1058620 0.5074692 0.3190904 0.4728515 0.5631035 1.2190005 0.3500670 validate(cox1a,dxy=T,method=b,B=10,u=12) #for whatever u values I try, similar result is returned index.orig training test optimism index.corrected n Dxy6.004062e-01 5.469688e-01 0.54184724 0.005121538 0.595284689 10 R2 1.702137e-01 4.463073e-01 0.16343334 0.282873951-0.112660217 10 Slope 1.00e+00 1.00e+00 0.64516984 0.354830161 0.645169839 10 D 2.110517e-02 6.405669e-02 0.02018354 0.043873150-0.022767980 10 U -5.878953e-05 -5.625082e-05 0.00600501 -0.006061261 0.006002471 10 Q 2.116396e-02 6.411294e-02 0.01417853 0.049934410-0.028770451 10 The coefficients are different between the models, but the relative magnitudes among the coefficients in each model are in fact quite stable. I expect the results would be somewhat similar, yet the two models give totally contrasting Dxy, one very negative and the other very positive. Also the corrected R2 of the stratified model is negative, which is impossible. Is there something wrong or is there a sensible interpretation for this? Many thanks in advance. Best regards, Vikki -- View this message in context: http://r.789695.n4.nabble.com/contrasting-Somer-s-D-from-Design-package-tp3314217p3314217.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.