Re: [R] survfit number of variables != number of variable names

2012-11-19 Thread Terry Therneau

I can't reproduce the problem.

Tell us what version of R and what version of the survival package you are using.
Create a reproducible example.  I don't know if some variables are numeric and some are 
factors, how/where the surv object was defined, etc.
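
For reference, a minimal sketch of how the requested details might be collected. It assumes the survival package is installed and uses its built-in lung data purely as a hypothetical stand-in for the poster's data frame; the object names r_ver, pkg_ver, dat and col_classes are illustrative only.

```r
library(survival)

# Version information to include in a report
r_ver   <- R.version.string           # e.g. "R version x.y.z (date)"
pkg_ver <- packageVersion("survival") # version of the survival package
print(r_ver)
print(pkg_ver)

# Hypothetical stand-in for the poster's data (assumption: lung replaces 'data')
dat <- lung

# Show which variables are numeric, factor, logical, etc.
col_classes <- sapply(dat, class)
print(col_classes)
str(dat[1:5, ])
```

Pasting this output alongside the failing call usually answers the questions about variable types and package versions in one go.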


Terry Therneau



On 11/17/2012 05:00 AM, r-help-requ...@r-project.org wrote:

This works ok:


  cox = coxph(surv ~ bucket*(today + accor + both) + activity, data = data)
  fit = survfit(cox, newdata=data[1:100,])

but using strata leads to problems:


  cox.s = coxph(surv ~  bucket*(today + accor + both) + strata(activity),
  data = data)
  fit.s = survfit(cox.s, newdata=data[1:100,])

Error in model.frame.default(data = data[1:100, ], formula = ~bucket +  :
   number of variables != number of variable names

Note that the following give rise to the same error:


  fit.s = survfit(cox.s, newdata=data)

Error in model.frame.default(data = data, formula = ~bucket + today +  :
   number of variables != number of variable names

but if I use data implicitly, all is working fine:

  fit.s = survfit(cox.s)

Any idea on how I could solve this?

Best, and thank you,

ge


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survfit number of variables != number of variable names

2012-11-19 Thread Georges Dupret

Hi Terry,

I attached a small data set to this email. This is what I get (I
restricted the formula to avoid NA's):

> surv = with(small, Surv(time=absence, event=(censored==FALSE)))
> (cox.s = coxph(surv ~ bucket*(today) + strata(activity), data = small))
Call:
coxph(formula = surv ~ bucket * (today) + strata(activity), data = small)


                       coef exp(coef) se(coef)      z    p
bucket575            0.4526     1.572    0.740  0.612 0.54
todayTRUE           -0.0886     0.915    0.676 -0.131 0.90
bucket575:todayTRUE -0.1670     0.846    0.794 -0.210 0.83

Likelihood ratio test=2.32  on 3 df, p=0.509  n= 100, number of events= 100
> fit = survfit(cox.s, newdata=small[1:50,])
Error in model.frame.default(data = small[1:50, ], formula = ~bucket +  :
  number of variables != number of variable names

 

also:

> R.version
               _
platform       x86_64-redhat-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          15.1
year           2012
month          06
day            22
svn rev        59600
language       R
version.string R version 2.15.1 (2012-06-22)
nickname       Roasted Marshmallows

package ‘survival’ is version 2.36-14

Finally, the variable absence is numeric, bucket and activity are
factors, and all other variables are logical.

I tested the same formula without 'strata' and I had no problem.

Best and thank you,

ge



Re: [R] survfit number of variables != number of variable names

2012-11-19 Thread Carina Salt
Hi - I've seen a similar issue going on with survfit when using strata in
the model, although I get a different error message from ge.  If it helps
to track down the problem (rather than confusing things further) here is
some code that should reproduce the issue I've seen.  I'm running R 2.15.2,
with survival version 2.36-14 .

Cheers,
Carina

##

library(survival)

# use aml dataset from survival & create 3 imaginary possible strata -
# strat3 has 3 levels, strat4 has 4 levels, strat5 has 5 levels
aml$strat3 <- as.factor(rep(c(1:3), length=nrow(aml)))
aml$strat4 <- as.factor(rep(c(1:4), length=nrow(aml)))
aml$strat5 <- as.factor(rep(c(1:5), length=nrow(aml)))

# create a counting process format dataset from aml - call this aml2
aml2 <- survSplit(aml, cut=c(5,10,50), end="time", start="start",
                  event="status", episode="i", id="indiv")

# create a dataset of 4 'new' observations - call this aml.new
aml.new <- aml[1:4,]
aml.new$time <- c(30,50,70,100)
aml.new$status <- 1
aml.new$x[1:4] <- c(rep("Maintained",2), rep("Nonmaintained",2))
aml.new$strat3[1:4] <- 1
aml.new$strat4[1:4] <- 1
aml.new$strat5[1:4] <- 1

# create a counting process format dataset from aml.new - call this aml2.new
aml2.new <- survSplit(aml.new, cut=c(5,10,50), end="time", start="start",
                      event="status", episode="i", id="indiv")

# First a model using no strata - survfit works fine on new dataset
myModel <- coxph(Surv(start, time, status) ~ x, data = aml2)
plot(survfit(myModel, aml2.new, id=indiv))

# Now a model using strata = strat4 (which has 4 levels) - survfit again
# works fine on new dataset (which has 4 new individuals)
myModel <- coxph(Surv(start, time, status) ~ x+strata(strat4), data = aml2)
plot(survfit(myModel, aml2.new, id=indiv))

# Now a model using strata = strat3 (which has 3 levels) - survfit works
# here too
myModel <- coxph(Surv(start, time, status) ~ x+strata(strat3), data = aml2)
plot(survfit(myModel, aml2.new, id=indiv))

# Now a model using strata = strat5 (which has 5 levels) - survfit now does
# not work, with error saying
# Error in survfitcoxph.fit(y, x, wt, x2, risk, newrisk, strata, se.fit,  :
#  'names' attribute [5] must be the same length as the vector [4]
myModel <- coxph(Surv(start, time, status) ~ x+strata(strat5), data = aml2)
plot(survfit(myModel, aml2.new, id=indiv))

# Now recreate aml.new but with 3 rather than 4 'new' observations
rm(aml.new)
aml.new <- aml[1:3,]
aml.new$time <- c(30,50,70)
aml.new$status <- 1
aml.new$x[1:3] <- c(rep("Maintained",2), rep("Nonmaintained",1))
aml.new$strat3[1:3] <- 1
aml.new$strat4[1:3] <- 1
aml.new$strat5[1:3] <- 1

# create a counting process format dataset from aml.new
aml2.new <- survSplit(aml.new, cut=c(5,10,50), end="time", start="start",
                      event="status", episode="i", id="indiv")


# Survfit on model using strat3 still works
myModel <- coxph(Surv(start, time, status) ~ x+strata(strat3), data = aml2)
plot(survfit(myModel, aml2.new, id=indiv))

# But Survfit on model using strat4 doesn't work now
myModel <- coxph(Surv(start, time, status) ~ x+strata(strat4), data = aml2)
plot(survfit(myModel, aml2.new, id=indiv))

# Survfit on strat5 model doesn't work either
myModel <- coxph(Surv(start, time, status) ~ x+strata(strat5), data = aml2)
plot(survfit(myModel, aml2.new, id=indiv))








Re: [R] survfit number of variables != number of variable names

2012-11-19 Thread Georges Dupret
Hi!

In answer to:


I noticed that you were using what might be called an externally  
created Surv object. I have a memory that Terry Therneau has  
criticized that practice. I cannot remember if it was in exactly this  
situation but I might ask if setting up the model as:

cox = coxph(Surv(stime, event) ~ bucket*(today + accor + both) +  
activity, data = data)

... might give the survival machinery a better handle on where  
everything might be found. 


I tried to create the Surv object internally but I face the same issue:

> (cox.s = coxph(Surv(time=absence, event=(censored==FALSE)) ~
+ bucket*(today) + strata(activity), data = small))
Call:
coxph(formula = Surv(time = absence, event = (censored == FALSE)) ~ 
    bucket * (today) + strata(activity), data = small)

                       coef exp(coef) se(coef)      z    p
bucket575            0.4526     1.572    0.740  0.612 0.54
todayTRUE           -0.0886     0.915    0.676 -0.131 0.90
bucket575:todayTRUE -0.1670     0.846    0.794 -0.210 0.83

Likelihood ratio test=2.32  on 3 df, p=0.509  n= 100, number of events= 100 
> fit = survfit(cox.s, newdata=small[1:50,])
Error in model.frame.default(data = small[1:50, ], formula = ~bucket +  : 
  number of variables != number of variable names

Best, and thank you for the suggestion.

ge





Re: [R] survfit number of variables != number of variable names

2012-11-19 Thread David Winsemius

On Nov 19, 2012, at 5:33 PM, Georges Dupret wrote:

 Hi David,
 
 Sorry for the signature files... this is automatic. I should disable that.
 
 Please find in attachment a copy of small.csv.gz

I found it, but I suspect nobody else will. I think Terry Therneau already got a 
copy when you attached it earlier, but the rest of R-help did not, since .gz 
files get scrubbed by the list server.


 Best,
 
 ge
 
 On 11/19/2012 02:37 PM, David Winsemius wrote:
 
 On Nov 19, 2012, at 2:23 PM, David Winsemius wrote:
 
 
 On Nov 19, 2012, at 11:07 AM, Georges Dupret wrote:
 
 Hi!
 
 In answer to:
 
 
 I noticed that you were using what might be called an externally  
 created Surv object. I have a memory that Terry Therneau has  
 criticized that practice. I cannot remember if it was in exactly this  
 situation but I might ask if setting up the model as:
 
 cox = coxph(Surv(stime, event) ~ bucket*(today + accor + both) +  
 activity, data = data)
 
 ... might give the survival machinery a better handle on where  
 everything might be found. 
 
 
 I tried to create the Surv object internally but I face the same issue:
 
 (cox.s = coxph(Surv(time=absence, event=(censored==FALSE)) ~ 
 bucket*(today) + strata(activity), data = small))
 Call:
 coxph(formula = Surv(time = absence, event = (censored == FALSE)) ~ 
  bucket * (today) + strata(activity), data = small)

All of your 'censored' were FALSE so all of your events were TRUE. My guess is 
that you are having problems because you end up with different model designs in 
the different strata:

> with( small, table(activity, today))
                today
activity         FALSE TRUE
  (100,121]          1   13
  (121,149]          2    8
  (149,196]          0    4
  (196,1.33e+03]     1    8
  (30,42]            1    8
  (42,55]            4   12
  (55,68]            2    9
  (68,83]            2    9
  (83,100]           2    6
  [11,30]            0    8


I do not think it matters that the levels of your factor variable are not 
in the expected order:

> table(small$activity)

     (100,121]      (121,149]      (149,196] (196,1.33e+03]        (30,42]
            14             10              4              9              9
       (42,55]        (55,68]        (68,83]       (83,100]        [11,30]
            16             11             11              8              8


But I do also wonder if the small numbers in each stratum might be causing 
problems. Is it really necessary to stratify so finely?
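
A rough sketch of how sparse strata could be flagged before fitting. Since small.csv.gz did not reach the list, this assumes the built-in lung data as a stand-in, with ph.ecog playing the role of the stratifying variable and sex the role of the covariate; the threshold of 10 observations is an arbitrary illustration, not a recommendation.

```r
library(survival)

# Assumption: lung/ph.ecog/sex are stand-ins for small/activity/today
tab <- with(lung, table(ph.ecog, sex))

# Observations per stratum, and strata below an (arbitrary) size threshold
n_per_stratum <- rowSums(tab)
sparse <- n_per_stratum < 10

# Stratum-by-covariate cells with no observations at all
empty_cells <- which(tab == 0, arr.ind = TRUE)

print(tab)
print(n_per_stratum[sparse])
print(empty_cells)
```

Strata that show up in sparse or empty_cells are candidates for merging into neighbouring groups before refitting.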

-- 
David.

 
                        coef exp(coef) se(coef)      z    p
 bucket575            0.4526     1.572    0.740  0.612 0.54
 todayTRUE           -0.0886     0.915    0.676 -0.131 0.90
 bucket575:todayTRUE -0.1670     0.846    0.794 -0.210 0.83
 
 Likelihood ratio test=2.32  on 3 df, p=0.509  n= 100, number of events= 100 
 fit = survfit(cox.s, newdata=small[1:50,])
 Error in model.frame.default(data = small[1:50, ], formula = ~bucket +  : 
 number of variables != number of variable names
 
 OK. Thanks for doing that. You might want to know that the only attachment 
 that made it through to the emailing list was a file named small.csv.gz.sig 
  That's not a format that my system knows how to decompress ( I tried 
 downloading GnuPG and compiling it but 
 
 
 (hit sent button too soon. )    was unable to figure out how to 
 decompress with GnuPG either. (It's hard to imagine this needed to be 
 encrypted.)
 
 small.csv.gz

David Winsemius, MD
Alameda, CA, USA



Re: [R] survfit number of variables != number of variable names

2012-11-19 Thread Georges Dupret
Hi!

It seems the data file wasn't transmitted. Please find a copy attached.

Best,

ge

 


small.csv.gz (1K) 
http://r.789695.n4.nabble.com/attachment/4650122/0/small.csv.gz






Re: [R] survfit number of variables != number of variable names

2012-11-17 Thread David Winsemius


On Nov 16, 2012, at 6:05 PM, Georges Dupret wrote:


This works ok:

cox = coxph(surv ~ bucket*(today + accor + both) + activity, data =  
data)

fit = survfit(cox, newdata=data[1:100,])


but using strata leads to problems:

cox.s = coxph(surv ~  bucket*(today + accor + both) +  
strata(activity),

data = data)
fit.s = survfit(cox.s, newdata=data[1:100,])


Error in model.frame.default(data = data[1:100, ], formula = ~bucket  
+  :

 number of variables != number of variable names

Note that the following give rise to the same error:


fit.s = survfit(cox.s, newdata=data)
Error in model.frame.default(data = data, formula = ~bucket + today  
+  :

 number of variables != number of variable names

but if I use data implicitly, all is working fine:

fit.s = survfit(cox.s)


Any idea on how I could solve this?



I noticed that you were using what might be called an externally  
created Surv object. I have a memory that Terry Therneau has  
criticized that practice. I cannot remember if it was in exactly this  
situation but I might ask if setting up the model as:


cox = coxph(Surv(stime, event) ~ bucket*(today + accor + both) +  
activity, data = data)


... might give the survival machinery a better handle on where  
everything might be found.
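
To make the suggestion concrete, here is a minimal sketch of the in-formula Surv() form combined with strata() and newdata. It assumes the built-in lung data as a stand-in for the poster's data, with sex as the stratifying variable; whether it reproduces or avoids the reported error depends on the installed survival version, so treat it as a template rather than a fix.

```r
library(survival)

# Assumption: lung stands in for 'data', sex for the 'activity' strata variable.
# Surv() is created inside the formula so all variables are found in 'data'.
fit <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# newdata carries every right-hand-side variable, including the strata variable
nd <- lung[1:5, c("age", "sex")]
sf <- survfit(fit, newdata = nd)

print(sf)
```

Swapping lung, age and sex for the real data frame and variables gives a small test case that can be posted together with the version information.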


--
David.


David Winsemius, MD
Alameda, CA, USA
