from:"moleps"

[R] confidence interval for logistic joinpoint regression from package ljr

2010-11-30 Thread moleps

I´m trying to run a logistic joinpoint regression utilising the ljr package. 
I´ve been using the forward selection technique to get the number of knots for 
the analysis, but I´m uncertain as to my results and the interpretation. The 
documentation is rather brief ( in the package and the stats in medicine 
article is quite technical) and without any good examples. At the moment I´m 
thinking 
1)find the number of knots both using forward and backward techniques and see 
if they are close
2)present the annual percent change (APC) for each of the intervals, ie my 
present data (1950-2010 in 5 year intervals) is giving me

   Variables  Coef
b0 Intercept -131.20404630
g0 t0.06146463
g1 max(t-tau1,0)   -0.51582466
g2 max(t-tau2,0)0.43429615

Joinpoints:
  
1 tau1= 1990.5
2 tau2= 1995.5

APC 1950-1990=exp(0.06)=1.06--6%
1990-1995=exp(0.06-0.51)=exp(-0.45)=0.63-- -37%
1995-2010=exp(0.06-0.51+0.43)---2%

3) Preferably a confidence interval for the APC should be given. However, this 
I havent figured out yet.

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] coef(summary) and plyr

2010-08-09 Thread moleps

Dear all,

I´m having trouble getting a list of regression variables back into a 
dataframe. 

mydf - data.frame(x1=rnorm(100), x2=rnorm(100), x3=rnorm(100))

mydf$fac-factor(sample((0:2),replace=T,100))

 mydf$y- mydf$x1+0.01+mydf$x2*3-mydf$x3*19+rnorm(100)

dlply(mydf,.(fac),function(df) lm(y~x1+x2+x3,data=df))-dl

here I´d like to use 

ldply(dl,coef(summary)) or something similar but I cant figure it out...



Best,

M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] coef(summary) and plyr

2010-08-09 Thread moleps

ldply doesnt need a grouping variable as far as I understand the command..

Description

For each element of a list, apply function then combine results into a data 
frame

Usage

ldply(.data, .fun = NULL, ..., .progress = none)


regards,

M


On 9. aug. 2010, at 15.33, David Winsemius wrote:

 
 On Aug 9, 2010, at 7:51 AM, moleps wrote:
 
 Dear all,
 
 I´m having trouble getting a list of regression variables back into a 
 dataframe.
 
 mydf - data.frame(x1=rnorm(100), x2=rnorm(100), x3=rnorm(100))
 
 mydf$fac-factor(sample((0:2),replace=T,100))
 
 mydf$y- mydf$x1+0.01+mydf$x2*3-mydf$x3*19+rnorm(100)
 
 dlply(mydf,.(fac),function(df) lm(y~x1+x2+x3,data=df))-dl
 
 here I´d like to use
 
 ldply(dl,coef(summary)) or something similar but I cant figure it out...
 
 dfdl - ldply(dl, function(x) coef(summary(x)) )
 
 Doesn't create a grouping variable, so:
 
 dfdl$group=rep(0:2, each=4)
 
 
 David Winsemius, MD
 West Hartford, CT
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] coef(summary) and plyr

2010-08-10 Thread moleps

Upon reading the plyr documentation that was the distinct impression I got and 
I´m glad that whatever expectations  I had developed regarding plyr were 
fulfilled. Thx for the input Hadley. 

 Maybe this is a cumbersome solution, but it works.. 

And Matthew, I will most definitively look into the datatable library. 

mydf - data.frame(x1=rnorm(100), x2=rnorm(100), x3=rnorm(100))
mydf$fac-factor(sample((0:2),replace=T,100))
 mydf$y- mydf$x1+0.01+mydf$x2*3-mydf$x3*19+rnorm(100)
dlply(mydf,.(fac),function(df) lm(y~x1+x2+x3,data=df))-dl


test-function(a){
coef(summary(a))-lo
a-colnames(lo)
b-rownames(lo)
c-length(a)
e-character(0)
r-NULL
for (x in (1:c)){
d-rep(paste(a[1:c],b[x],sep= ))
e-paste(c(e,d))
t-lo[x,]
r-c(r,t)
names(r)-e
}
return(r)
}


ldply(dl,function(x) test(x))-g
g


Regards,

Moleps


On 9. aug. 2010, at 19.55, Hadley Wickham wrote:

 That's exactly what dlply does - so you should never have to do that
 yourself.
 
 I'm unclear what you are saying. Are you saying that the plyr function
 _should_ have examined the objects in that list and determined that there
 were 4 rows and properly labeled the rows to indicate which list they came
 from?
 
 Yes, exactly.  It's the output from coef(summary(x)) that makes it
 look like this isn't happening.
 
 Hadley
 
 -- 
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] coef(summary) and plyr

2010-08-10 Thread moleps

correction...

Col and rows were mixed up and loop only worked when rows were less than or 
equal to number of columns

//M


test-function(a){
coef(summary(a))-lo
a-colnames(lo)
b-rownames(lo)
c-length(a)
e-character(0)
r-NULL
for (x in (1:length(b))){
d-rep(paste(a[1:c],b[x],sep= ))
e-paste(c(e,d))
t-lo[x,]
r-c(r,t)
names(r)-e
}
return(r)
}

On 9. aug. 2010, at 19.55, Hadley Wickham wrote:

 That's exactly what dlply does - so you should never have to do that
 yourself.
 
 I'm unclear what you are saying. Are you saying that the plyr function
 _should_ have examined the objects in that list and determined that there
 were 4 rows and properly labeled the rows to indicate which list they came
 from?
 
 Yes, exactly.  It's the output from coef(summary(x)) that makes it
 look like this isn't happening.
 
 Hadley
 
 -- 
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Multiple imputation and matching

2010-08-20 Thread moleps

Dear all,

Is it possible to impute a dataset and create a summary table with summary from 
Hmisc and convert it to latex? I´m mostly familiar with cem and amelia hence 
the example from the documentation in cem. The imbalance command is not exactly 
what I was looking for...

library (cem)

if(require(Amelia)){
 data(LL)
 n - dim(LL)[1]
 k - dim(LL)[2]

 set.seed(123)

 LL1 - LL
 idx - sample(1:n, .3*n)
 invisible(sapply(idx, function(x) LL1[x,sample(2:k,1)] - NA))

 imputed - amelia(LL1,noms=c(black,hispanic,treated,married,
  nodegree,u74,u75)) 
 imputed - imputed$imputations[1:5]

##Here I´d like to produce a table

 mat1 - cem(treated, datalist=imputed, drop=re78)
 mat1

## here I´d like to produce a table with the matched elements.

Regards,

M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] accessing the attr(*,label.table) after importing from spss

2010-08-25 Thread moleps

Dear all,
I just received a file from a colleague in spss. The read.spss could not finish 
the file due to an error (Unrecognized record type 7, subtype 18 encountered in 
system file) so instead I converted the file using stat-transfer. Looking at my 
data I see that most labels are in the attributes and I´d love to access them 
and assign the pertinent variables to factors without doing the whole 
factor(levels,labels) thing manually. Is there any remedy to this ??

regards,

M


 str(dat)
'data.frame':   860 obs. of  19 variables:
 $ Ag: int  15 15 15 15 15 15 15 15 15 15 ...
 $ G: int  2 2 2 1 1 1 1 1 1 2 ...
 $ GCQ: int  15 15 15 15 15 15 15 15 15 15 ...
 $ Amn: int  2 2 2 2 2 2 1 1 1 1 ...
 $ HI  : int  1 1 1 1 1 1 2 2 2 2 ...
 $ Hos: int  2 2 2 2 2 2 2 2 2 2 ...
 $ Risk   : int  2 2 2 2 2 2 2 2 2 2 ...
 $ CTO : int  2 2 2 2 2 1 1 1 1 1 ...
 $ pat  : int  NA NA NA NA NA 2 2 2 2 2 ...
 $ Day: int  7 7 7 5 4 6 5 7 7 5 ...
 $ Ho  : int  NA NA NA NA NA NA NA NA NA NA ...
 $ coh  : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Comp : int  1 1 1 1 1 1 1 1 1 1 ...
 $ Ethan: int  2 2 2 2 2 2 2 2 2 2 ...
 $ Pro   : num  NA NA NA NA NA NA NA NA NA NA ...
 $ Ye  : int  1 1 1 3 3 3 1 1 1 3 ...
 $ Ageg: int  1 1 1 1 1 1 1 1 1 1 ...
 $ BAC: int  0 0 0 0 0 0 0 0 0 0 ...
 - attr(*, val.labels)= chr   VL_Gender  VL_Amnesia ...
 - attr(*, var.labels)= chr  Age (years) Gender GCQSSA Amnesty ...
 - attr(*, label.table)=List of 19
  ..$ : NULL
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Male Female
  ..$ : NULL
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : NULL
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2 3 4 5 6 7
  .. ..- attr(*, names)= chr  Monday Tuesday Wednesday Thursday ...
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2 3 4 5 6 7
  .. ..- attr(*, names)= chr  Yes Overtriage  Undertriage with 
admission Overtriage with pos ...
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : NULL
  ..$ : NULL
  ..$ : Named num  1 2 3 4
  .. ..- attr(*, names)= chr  15 - 24 25-39 40-59 60-
  ..$ : Named num  1 2 3
  .. ..- attr(*, names)= chr  0.10-0.99 1.00-1.99 2.00-
  ..$ : Named num  0 1 2 3 4
  .. ..- attr(*, names)= chr  sorry undeweight 0.10-0.99 1.00-1.99 ...
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] accessing the attr(*, label.table) after importing from spss

2010-08-26 Thread moleps

Thx... The following seems to work. However I´m sure there is a more elegant 
solution to it


tre-dat
b-length(dat)
 attr(dat,label.table)-a
 for (i in 1:b){
if(!is.null(a[[i]])  
length(levels(as.factor(dat[,i])))==length(a[[i]]))
{
tre[,i]-factor(dat[,i],labels=names(a[[i]]))
}
}
tre

Regards,

//M



On 25. aug. 2010, at 23.05, Bert Gunter wrote:

 ?attr
 ?attributes
 
 -- Bert Gunter
 
 On Wed, Aug 25, 2010 at 1:26 PM, moleps mole...@gmail.com wrote:
 Dear all,
 I just received a file from a colleague in spss. The read.spss could not 
 finish the file due to an error (Unrecognized record type 7, subtype 18 
 encountered in system file) so instead I converted the file using 
 stat-transfer. Looking at my data I see that most labels are in the 
 attributes and I´d love to access them and assign the pertinent variables to 
 factors without doing the whole factor(levels,labels) thing manually. Is 
 there any remedy to this ??
 
 regards,
 
 M
 
 
 str(dat)
 'data.frame':   860 obs. of  19 variables:
  $ Ag: int  15 15 15 15 15 15 15 15 15 15 ...
  $ G: int  2 2 2 1 1 1 1 1 1 2 ...
  $ GCQ: int  15 15 15 15 15 15 15 15 15 15 ...
  $ Amn: int  2 2 2 2 2 2 1 1 1 1 ...
  $ HI  : int  1 1 1 1 1 1 2 2 2 2 ...
  $ Hos: int  2 2 2 2 2 2 2 2 2 2 ...
  $ Risk   : int  2 2 2 2 2 2 2 2 2 2 ...
  $ CTO : int  2 2 2 2 2 1 1 1 1 1 ...
  $ pat  : int  NA NA NA NA NA 2 2 2 2 2 ...
  $ Day: int  7 7 7 5 4 6 5 7 7 5 ...
  $ Ho  : int  NA NA NA NA NA NA NA NA NA NA ...
  $ coh  : int  1 1 1 1 1 1 1 1 1 1 ...
  $ Comp : int  1 1 1 1 1 1 1 1 1 1 ...
  $ Ethan: int  2 2 2 2 2 2 2 2 2 2 ...
  $ Pro   : num  NA NA NA NA NA NA NA NA NA NA ...
  $ Ye  : int  1 1 1 3 3 3 1 1 1 3 ...
  $ Ageg: int  1 1 1 1 1 1 1 1 1 1 ...
  $ BAC: int  0 0 0 0 0 0 0 0 0 0 ...
  - attr(*, val.labels)= chr   VL_Gender  VL_Amnesia ...
  - attr(*, var.labels)= chr  Age (years) Gender GCQSSA Amnesty ...
  - attr(*, label.table)=List of 19
  ..$ : NULL
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Male Female
  ..$ : NULL
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : NULL
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2 3 4 5 6 7
  .. ..- attr(*, names)= chr  Monday Tuesday Wednesday Thursday ...
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : Named num  1 2 3 4 5 6 7
  .. ..- attr(*, names)= chr  Yes Overtriage  Undertriage with 
 admission Overtriage with pos ...
  ..$ : Named num  1 2
  .. ..- attr(*, names)= chr  Yes No
  ..$ : NULL
  ..$ : NULL
  ..$ : Named num  1 2 3 4
  .. ..- attr(*, names)= chr  15 - 24 25-39 40-59 60-
  ..$ : Named num  1 2 3
  .. ..- attr(*, names)= chr  0.10-0.99 1.00-1.99 2.00-
  ..$ : Named num  0 1 2 3 4
  .. ..- attr(*, names)= chr  sorry undeweight 0.10-0.99 1.00-1.99 
 ...
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] glm prb (Error in `contrasts-`(`tmp`, value = contr.treatment) : )

2010-08-29 Thread moleps


glm(A~B+C+D+E+F,family = binomial(link = logit),data=tre,na.action=na.omit)
Error in `contrasts-`(`*tmp*`, value = contr.treatment) : 
  contrasts can be applied only to factors with 2 or more levels

however,

glm(A~B+C+D+E,family = binomial(link = logit),data=tre,na.action=na.omit)

runs fine

glm(A~B+C+D+F,family = binomial(link = logit),data=tre,na.action=na.omit)

runs fine



glm(A~E+F,family = binomial(link = logit),data=tre,na.action=na.omit)
Error in `contrasts-`(`*tmp*`, value = contr.treatment) : 
  contrasts can be applied only to factors with 2 or more levels

Why is this? Could it be due to collinearity between the two?


Regards,

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] summary in Hmisc and Latex

2010-08-31 Thread moleps

Dear all,

With the latest update of Hmisc I no longer have any problems with latex. 
However using the ctable option produces latex code that at least on both the 
miktex distribution at work and mactex distribution at home refuses to run due 
to an extra blank line inserted between the multicolumn lines in the latex 
code... It runs fine if the line is deleted or if the ctable option is left 
out.  Does this apply to other people as well?

Regards,
//M





library(Hmisc)

options(digits=3)
set.seed(173)
sex- factor(sample(c(m,f), 500, rep=TRUE))
age- rnorm(500, 50, 5)
treatment- factor(sample(c(Drug,Placebo), 500, rep=TRUE))
symp- c('Headache','Stomach Ache','Hangnail',
 'Muscle Ache','Depressed')
symptom1- sample(symp, 500,TRUE)
symptom2- sample(symp, 500,TRUE)
symptom3- sample(symp, 500,TRUE)
Symptoms- mChoice(symptom1, symptom2, symptom3, label='Primary Symptoms')
table (Symptoms)
table(symptom1,symptom2)
f- summary(treatment ~ age + sex + Symptoms, method=reverse, test=TRUE)
latex(f,file=)
latex(f,file=,ctable=T)
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] multiple graphs

2010-09-07 Thread moleps

Dear all,

I´m trying to create multiple graphs on the same page, but they are all stacked 
on top of each other.

My code:


par(mfrow=c(2,2))
a-list(levels(bar$h.r)[c(1,3,6)])
print(a)

lapply(a,function(x){
a-subset(bar,h.r==x)
with(a,cdplot(wh~Age,ylab=x))
#plot.new()
})

The plot.new command doesnt help...

Any ideas??


//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] multiple graphs

2010-09-08 Thread moleps

Problem solved..

My bad. No prb with cdplot or graphics-part. The problem was the a-list.. 
command which resulted in all three levels of bar$h.r in a[[1]]. Skipping the 
list function sorted it out.


par(mfrow=c(2,2))
a-levels(bar$h.r)[c(1,3,6)]
print(a)
lapply(a,function(x){
a-subset(bar,h.r==x)
with(a, cdplot(wh~Age,ylab=x))
#plot.new()
})

Regards,

//M



On 8. sep. 2010, at 03.37, David Winsemius wrote:

 
 On Sep 7, 2010, at 8:02 PM, moleps wrote:
 
 Dear all,
 
 I´m trying to create multiple graphs on the same page, but they are all 
 stacked on top of each other.
 
 My code:
 
 
 par(mfrow=c(2,2))
 a-list(levels(bar$h.r)[c(1,3,6)])
 print(a)
 
 lapply(a,function(x){
  a-subset(bar,h.r==x)
  with(a, cdplot(wh~Age,ylab=x))
  #plot.new()
  })
 
 The plot.new command doesnt help...
 
 Any ideas??
 
 ?layout  # assuming that the undescribed plotting function is base graphics. 
 Some plotting functions are hard coded and are able to defeat the usual 
 formatting options.
 
 -- 
 David Winsemius, MD
 West Hartford, CT
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] data-management: Rowwise NA

2010-06-03 Thread moleps

Dear R´ers..

In this mock dataset how can I generate a logical variable based on whether 
just tes or tes3 are NA in each row?? 

test-sample(c(A,NA,B),100,replace=T)
test2-sample(c(A,NA,B),100,replace=T)
test3-sample(c(A,NA,B),100,replace=T)

tes-cbind(test,test2,test3)

sam-c(test,test3)
apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))

However this just tests whether each variable is missing or not per row. I´d 
like an -or- function in here that would provide one true/false per row based 
on whether test or tes3 are NA. I guess it would be easy to do it by subsetting 
in the example but I figure there is a more elegant way of doing it when -sam- 
contains 50 variables...

//M



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] data-management: Rowwise NA

2010-06-03 Thread moleps

-Any- was my fix... Appreciate it.

//M


On 3. juni 2010, at 21.33, Phil Spector wrote:

 ?any
 
 Not really a reproducible answer, but I think you're looking
 for
 
 apply(tes[,sam],1,function(x)any(is.na(x)))
 
 
   - Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu
 
 
 On Thu, 3 Jun 2010, moleps wrote:
 
 Dear R?ers..
 
 In this mock dataset how can I generate a logical variable based on whether 
 just tes or tes3 are NA in each row??
 
 test-sample(c(A,NA,B),100,replace=T)
 test2-sample(c(A,NA,B),100,replace=T)
 test3-sample(c(A,NA,B),100,replace=T)
 
 tes-cbind(test,test2,test3)
 
 sam-c(test,test3)
 apply(subset(tes,select=sam),1,FUN=function(x) is.na(x))
 
 However this just tests whether each variable is missing or not per row. I?d 
 like an -or- function in here that would provide one true/false per row 
 based on whether test or tes3 are NA. I guess it would be easy to do it by 
 subsetting in the example but I figure there is a more elegant way of doing 
 it when -sam- contains 50 variables...
 
 //M
 
 
 
  [[alternative HTML version deleted]]
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] two columns into one

2010-06-03 Thread moleps

Dear R´ers,


How can I create one single factor variable from two variables incorporating 
all possible combinations of the values??


test-sample(c(A,NA,B),100,replace=T)
test2-sample(c(E,F,A),100,replace=T)


tes-cbind(test,test2)


pseduocode:

r-function(test,test2)

r

AE
AF
AA
NAE
NAF
NAA
BE
BF
BA


//M




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] apply min function rowwise

2010-06-05 Thread moleps

I´m trying to tease out the minimum value from a row in a dataframe where all 
the variables are dates. 

apply(canc[,vec],1,function(x)min(x,na.rm=T))


However it only returns empty strings for the entire dataframe except for one 
date value (which is not the minimum date).

I´ve also tried 

apply(canc[,vec],1,function(x)max(x,na.rm=T))

which provides values rowwise, but many of them are not in fact the largest in 
the row.


Any advice?


Regards,

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply min function rowwise

2010-06-05 Thread moleps

Appreciate it...

//M

On 5. juni 2010, at 20.11, Joshua Wiley wrote:

 On Sat, Jun 5, 2010 at 10:22 AM, moleps mole...@gmail.com wrote:
 thx.
 
 It was only the first instance that was class date. The rest were factors. 
 So that explains it.
 
 If I want to change the rest in vec into class date (there are many of 
 them...)
 
 neither  as.Date(canc[,vec],%d.%m.%Y) nor 
 sapply(canc[,vec],FUN=function(x) as.date(x,%d.%m.%Y)) works
 
 What is the easy solution to this?
 
 This is the nicest solution that comes to mind:
 
 as.data.frame(lapply(X=samp.dat, FUN=as.Date, format=%d.%m.%Y))
 
 I believe the problem is that sapply() coerces the results (by default
 when simplify=TRUE) using as.vector() leaving you with the number of
 days since the origin.  Anyway, using as.data.frame() on the list
 output from lapply() seems to work.
 
 Josh
 
 
 Regards,
 
 //M
 
 
 
 On 5. juni 2010, at 18.30, Joshua Wiley wrote:
 
 Hello M,
 
 My guess is that it has something to do with the class of the
 variables.  Perhaps you could provide a small sample dataframe?  Also
 you might try running str() on your data frame and seeing if the
 results are what you would expect.  As a side note, it is not
 necessary to make an anonymous function here, as you are allowed to
 pass arguments to the function applied.
 
 apply(canc[,vec],1, min, na.rm=TRUE)
 
 Best regards,
 
 Josh
 
 On Sat, Jun 5, 2010 at 8:30 AM, moleps mole...@gmail.com wrote:
 I´m trying to tease out the minimum value from a row in a dataframe where 
 all the variables are dates.
 
 apply(canc[,vec],1,function(x)min(x,na.rm=T))
 
 
 However it only returns empty strings for the entire dataframe except for 
 one date value (which is not the minimum date).
 
 I´ve also tried
 
 apply(canc[,vec],1,function(x)max(x,na.rm=T))
 
 which provides values rowwise, but many of them are not in fact the 
 largest in the row.
 
 
 Any advice?
 
 
 Regards,
 
 //M
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 --
 Joshua Wiley
 Senior in Psychology
 University of California, Riverside
 http://www.joshuawiley.com/
 
 
 
 
 
 -- 
 Joshua Wiley
 Senior in Psychology
 University of California, Riverside
 http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] paste together a string object later to be utilized in a function

2010-06-06 Thread moleps

Dear r-listers,

I need to pass a string to a function. However the length of the string is 
dependent on the length of a vector. 


b-length(h)
v-paste(rep(names(ts$a[,1:b,]),ts$a[,1:b,]),sep=)


Is it possible somehow to pass this as an argument to a function later on ? 



Regards,

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] paste together a string object later to be utilized in a function

2010-06-07 Thread moleps

Sorry for bothering all of you. In the end it turned out to be much simpler 
than I thought. Takes a while to get used to the vectorizing idea. 



require(survival)
require(ggplot2)

ggkm-function(time,event,stratum) {

stratum-as.factor(stratum)
m2s-Surv(time,as.numeric(event))

fit - survfit(m2s ~ stratum)

w-fit$time

k-fit$surv
o-length(levels(stratum))
strata-c(rep(names(fit$strata[1:o]),fit$strata[1:o]))

upper-fit$upper
lower-fit$lower
f-data.frame(w,k,strata,upper,lower)

r-ggplot 
(f,aes(x=w,y=k,fill=strata,group=strata))+geom_line(aes(color=strata))+geom_ribbon(aes(ymin=lower,ymax=upper),alpha=0.3)+xlim(0,fit$maxtime)+ylim(0,1)

r-r+scale_fill_brewer(f$strata,palette=Set1)+scale_color_brewer(f$strata,palette=Set1)
return(r)
}





data(lung)
with(lung,ggkm(time,status,sex))

with(lung,ggkm(time,status,pat.karno))





//M





On 7. juni 2010, at 22.31, Greg Snow wrote:

 Does the collapse argument to the paste function do what you want?  Possibly 
 nested inside another paste.
 
 -- 
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111
 
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of moleps
 Sent: Sunday, June 06, 2010 1:51 PM
 To: r-help@r-project.org
 Subject: [R] paste together a string object later to be utilized in a
 function
 
 Dear r-listers,
 
 I need to pass a string to a function. However the length of the string
 is dependent on the length of a vector.
 
 
 b-length(h)
 v-paste(rep(names(ts$a[,1:b,]),ts$a[,1:b,]),sep=)
 
 
 Is it possible somehow to pass this as an argument to a function later
 on ?
 
 
 
 Regards,
 
 //M
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] marginal structural models

2010-06-09 Thread moleps

Dear listers,

Does anyone have any experience running marginal structural models in r or can 
point me in the direction of any good tutorials on this?

Regards,

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Latex and r

2010-06-16 Thread moleps

Dear R´ers 

I´m trying to get a summary table  using latex and summary in the rms package 
to no avail. I´m running R 2.10.1, Mac OS X snow leopard and I have the mactex 
2009 distribution installed. Any obvious things I´m missing?

//M

 

options(digits=3)
set.seed(173)
sex - factor(sample(c(m,f), 500, rep=TRUE))
age - rnorm(500, 50, 5)
treatment - factor(sample(c(Drug,Placebo), 500, rep=TRUE))
f - summary(treatment ~ age + sex + Symptoms, method=reverse, test=TRUE)

latex(f)

results in the following:


This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009)
entering extended mode

(/var/folders/q9/q9COp2FREsikCyHB7w+OxE+++TI/-Tmp-//RtmpVIk0iB/file587f83cb.tex
LaTeX2e 2009/09/24
Babel v3.8l and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, german-x-2009-06-19, ngerman-x-2009-06-19, ancientgreek, ibycus, ar
abic, basque, bulgarian, catalan, pinyin, coptic, croatian, czech, danish, dutc
h, esperanto, estonian, farsi, finnish, french, galician, german, ngerman, mono
greek, greek, hungarian, icelandic, indonesian, interlingua, irish, italian, ku
rmanji, latin, latvian, lithuanian, mongolian, mongolian2a, bokmal, nynorsk, po
lish, portuguese, romanian, russian, sanskrit, serbian, slovak, slovenian, span
ish, swedish, turkish, ukenglish, ukrainian, uppersorbian, welsh, loaded.
(/usr/local/texlive/2009/texmf-dist/tex/latex/base/report.cls
Document Class: report 2007/10/19 v1.4h Standard LaTeX document class
(/usr/local/texlive/2009/texmf-dist/tex/latex/base/size10.clo))
(/usr/local/texlive/2009/texmf-dist/tex/latex/geometry/geometry.sty
(/usr/local/texlive/2009/texmf-dist/tex/latex/graphics/keyval.sty)
(/usr/local/texlive/2009/texmf-dist/tex/generic/oberdiek/ifpdf.sty)
(/usr/local/texlive/2009/texmf-dist/tex/generic/oberdiek/ifvtex.sty)
(/usr/local/texlive/2009/texmf-dist/tex/xelatex/xetexconfig/geometry.cfg))
No file file587f83cb.aux.
*geometry auto-detecting driver*
*geometry detected driver: dvips*

Overfull \hbox (1.14412pt too wide) in paragraph at lines 9--23
 [] 
[1] (./file587f83cb.aux)

LaTeX Warning: Label(s) may have changed. Rerun to get cross-references right.

 )
(see the transcript file for additional information)
Output written on file587f83cb.dvi (1 page, 1620 bytes).
Transcript written on file587f83cb.log.
sh: xdvi: command not found

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Latex and r

2010-06-16 Thread moleps

xdvi is installed in the same location as yours. I even did a reinstallment of 
mactex. Still doesnt work. But since I´m now convinced its related to my latex 
distribution I´ll take the problem elsewhere..

Regards,

//M



On 16. juni 2010, at 17.19, Prof Brian Ripley wrote:

 On Wed, 16 Jun 2010, Erik Iverson wrote:
 
 
 
 moleps wrote:
 Dear R´ers
 I´m trying to get a summary table  using latex and summary in the rms
 package to no avail. I´m running R 2.10.1, Mac OS X snow leopard and
 I have the mactex 2009 distribution installed. Any obvious things I´m
 missing?
 
 file587f83cb.log. sh: xdvi: command not found
 
 You apparently don't have xdvi installed, which is used to view the resulting
 
 xdvi is part of MacTeX 2009.  So the latter may be installed, but it is not 
 in the path or incomplete or   I have
 
 tystie% which xdvi
 /usr/texbin/xdvi
 
 
 document. How you get that installed on your OS, I don't know. Depending on 
 what you're doing, you might want to use the file argument of latex 
 function to output a latex file, and then do further processing.  Also, 
 assigning the latex function call to a variable, x - latex(...) will 
 suppress printing of the object, which is ultimately what is trying to use 
 xdvi.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 -- 
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Latex and r (using summary from RMS)

2010-06-16 Thread moleps

Dear all,
After spending all day and most of the night on this I did a new R-installation 
and it works. The question now is - upon running this code (from the Hmisc 
library-latex function) I believe the call to summary.formula is allright, but 
the latex command results in a totally different table where all the numbers 
and test columns are wrong are wrong. Is this still a matter for the 
installation or is there something in the latex syntax I havent grasped?

//M



library(Hmisc)

options(digits=3)
set.seed(173)
sex - factor(sample(c(m,f), 500, rep=TRUE))
age - rnorm(500, 50, 5)
treatment - factor(sample(c(Drug,Placebo), 500, rep=TRUE))
symp - c('Headache','Stomach Ache','Hangnail',
  'Muscle Ache','Depressed')
symptom1 - sample(symp, 500,TRUE)
symptom2 - sample(symp, 500,TRUE)
symptom3 - sample(symp, 500,TRUE)
Symptoms - mChoice(symptom1, symptom2, symptom3, label='Primary Symptoms')
table (Symptoms)
table(symptom1,symptom2)
f - summary(treatment ~ age + sex + Symptoms, method=reverse, test=TRUE)
g - summary(treatment ~ age + sex + symptom1, method=reverse, test=TRUE)

latex(g)

Produces nice tables-but the numbers are all wrong as you can see below from 
the latex file...

 latex(g,file=)
% latex.default(cstats, title = title, caption = caption, rowlabel = rowlabel,  
col.just = col.just, numeric.dollar = FALSE, insert.bottom = legend,  
rowname = lab, dcolumn = dcolumn, extracolheads = extracolheads,  
extracolsize = Nsize, ...) 
%
\begin{table}[!tbp]
 \caption{Descriptive Statistics by treatment\label{g}} 
 \begin{center}
 \begin{tabular}{lccc}\hline\hline
\multicolumn{1}{l}{}\multicolumn{1}{c}{Drug}\multicolumn{1}{c}{Placebo}\multicolumn{1}{c}{Test
 Statistic}\tabularnewline

\multicolumn{1}{c}{{\scriptsize $N=263$}}\multicolumn{1}{c}{{\scriptsize 
$N=237$}}\tabularnewline
\hline
age114\tabularnewline
sex~:~m672\tabularnewline
symptom1~:~Depressed433\tabularnewline
Hangnail561\tabularnewline
Headache421\tabularnewline
Muscle~Ache351\tabularnewline
Stomach~Ache241\tabularnewline
\hline
\end{tabular}

\end{center}

\noindent {\scriptsize $a$\ }{$b$\ }{\scriptsize $c$\ } represent the lower 
quartile $a$, the median $b$, and the upper quartile $c$\ for continuous 
variables.\\Numbers after percents are frequencies.\\\indent Tests 
used:\\\textsuperscript{\normalfont 1}Wilcoxon test; 
\textsuperscript{\normalfont 2}Pearson test
\end{table}


Then I did another example from Harrell´s statistical tables and plots 

rm(list=ls())



library(Hmisc)
getHdata(prostate)
# Variables in prostate had units in ( ) inside variable labels. Move
# these units of measurements to separate units attributes
# wt is an exception. It has ( ) in its label but this does not denote units
# Also make hg have a legal R plotmath expression
prostate-upData(prostate, moveUnits=TRUE,units=c(wt=, 
hg=g/100*ml),labels=c(wt=Weight Index = wt(kg)-ht(cm)+200))

attach(prostate)
stage- factor(stage, 3:4, c(Stage 3,Stage 4))
s6-summary(stage~rx+age+wt+pf+hx+sbp+dbp+ekg+hg+sz+sg+ap+bm,method=reverse, 
overall=TRUE, test=TRUE)
options(digits=2)
w-latex(s6, size=smaller[3], outer.size=smaller, 
Nsize=smaller,long=TRUE, prmsd=TRUE, msdsize=smaller,middle.bold=TRUE, 
ctable=TRUE)


This refused to run ( as long as the ctable=T was included), but without it 

latex (s6)

I do get a nicely formated table, but again the numbers are all wrong... Also 

latex(s6, long=TRUE, prmsd=TRUE, msdsize=smaller,middle.bold=TRUE)

makes no difference from latex(s6) alone with regards to formatting...

Quite frustrating-Any suggestions??


//M






On 16. juni 2010, at 20.10, Kevin E. Thorpe wrote:

 moleps wrote:
 Dear R´ers I´m trying to get a summary table  using latex and summary in the 
 rms package to no avail. I´m running R 2.10.1, Mac OS X snow leopard and I 
 have the mactex 2009 distribution installed. Any obvious things I´m missing?
 //M
 options(digits=3)
 set.seed(173)
 sex - factor(sample(c(m,f), 500, rep=TRUE))
 age - rnorm(500, 50, 5)
 treatment - factor(sample(c(Drug,Placebo), 500, rep=TRUE))
 f - summary(treatment ~ age + sex + Symptoms, method=reverse, test=TRUE)
 latex(f)
 results in the following:
 This is pdfTeX, Version 3.1415926-1.40.10 (TeX Live 2009)
 entering extended mode
 (/var/folders/q9/q9COp2FREsikCyHB7w+OxE+++TI/-Tmp-//RtmpVIk0iB/file587f83cb.tex
 LaTeX2e 2009/09/24
 Babel v3.8l and hyphenation patterns for english, usenglishmax, dumylang, 
 noh
 yphenation, german-x-2009-06-19, ngerman-x-2009-06-19, ancientgreek, ibycus, 
 ar
 abic, basque, bulgarian, catalan, pinyin, coptic, croatian, czech, danish, 
 dutc
 h, esperanto, estonian, farsi, finnish, french, galician, german, ngerman, 
 mono
 greek, greek, hungarian, icelandic, indonesian, interlingua, irish, italian, 
 ku
 rmanji, latin, latvian, lithuanian, mongolian, mongolian2a, bokmal, nynorsk, 
 po
 lish, portuguese, romanian, russian, sanskrit, serbian, slovak, slovenian, 
 span
 ish, swedish, turkish, ukenglish, ukrainian

[R] Remove squares from scatter3D

2010-06-22 Thread moleps

Dear All,

I´ve been trying to find an option to scatter3D from rcmdr to remove the 
individual points from the plots but to no help so far. Removing the residuals 
is easy, but I cannot find a similar point option. Is there such an option that 
can be set to FALSE?


Best,

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Data manipulation problem

2010-04-05 Thread moleps

Dear R´ers.

I´ve got a dataset with age and year of diagnosis. In order to age-standardize 
the incidence I need to transform the data into a matrix with age-groups 
(divided in 5 or 10 years) along one axis and year divided into 5 years along 
the other axis. Each cell should contain the number of cases for that age group 
and for that period. 

I.e.
My data format now is
ID-age (to one decimal)-year(yearly data).

What I´d like is 


age 1960-1965 1966-1970 etc...
0-5 3 8 10 15
6-10 2 5 8 13
etc..


Any good ideas?

Regards,
M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data manipulation problem

2010-04-05 Thread moleps

I already did try the regression modeling approach. However the epidemiologists 
(referee) turns out to be quite fond of comparing the incidence rates to 
different standard populations, hence the need for this labourius approach. 
And trying the cutting approach I ended up with :

 table (age5)
age5
   (0,5]   (5,10]  (10,15]  (15,20]  (20,25]  (25,30]  (30,35]  (35,40]  
(40,45]  (45,50]  (50,55]  (55,60]  (60,65]  (65,70]  (70,75]  (75,80]  (80,85] 
(85,100] 
  35   34   33   47   51  109  157  231  
362  511  745  926 1002  866  547  247   82 
  18 
 table (yr5)
yr5
(1950,1955] (1955,1960] (1960,1965] (1965,1970] (1970,1975] (1975,1980] 
(1980,1985] (1985,1990] (1990,1995] (1995,2000] (2000,2005] (2005,2009] 
  3   5   5   5   5   5 
  5   5   5   5   5   3 
 table (yr5,age5)
Error in table(yr5, age5) : all arguments must have the same length

Sincerely,
M





On 5. apr. 2010, at 20.59, Bert Gunter wrote:

 You have tempted, and being weak, I yield to temptation:
 
 Any good ideas?
 
 Yes. Don't do this.
 
 (what you probably really want to do is fit a model with age as a factor,
 which can be done statistically e.g. by logistic regression; or graphically
 using conditioning plots, e.g. via trellis graphics (the lattice package).
 This avoids the arbitrariness and discontinuities of binning by age range.)
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of moleps
 Sent: Monday, April 05, 2010 11:46 AM
 To: r-help@r-project.org
 Subject: [R] Data manipulation problem
 
 Dear R´ers.
 
 I´ve got a dataset with age and year of diagnosis. In order to
 age-standardize the incidence I need to transform the data into a matrix
 with age-groups (divided in 5 or 10 years) along one axis and year divided
 into 5 years along the other axis. Each cell should contain the number of
 cases for that age group and for that period. 
 
 I.e.
 My data format now is
 ID-age (to one decimal)-year(yearly data).
 
 What I´d like is 
 
 
 age 1960-1965 1966-1970 etc...
 0-5 3 8 10 15
 6-10 2 5 8 13
 etc..
 
 
 Any good ideas?
 
 Regards,
 M
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data manipulation problem

2010-04-05 Thread moleps


Thx Erik,
I have no idea what went wrong with the other code snippet, but this one 
works.. Appreciate it.

qta- table(cut(age,breaks = seq(0, 100, by = 10),include.lowest = 
TRUE),cut(year,breaks=seq(1950,2010,by=5),include.lowest=TRUE))

M


On 5. apr. 2010, at 21.45, Erik Iverson wrote:

 I don't know what your data are like, since you haven't given a reproducible 
 example. I was imagining something like:
 
 ## generate fake data
 age - sample(20:90, 100, replace = TRUE)
 year - sample(1950:2000, 100, replace = TRUE)
 
 ##look at big table
 table(age, year)
 
 ## categorize data
 ## see include.lowest and right arguments to cut
 age.factor - cut(age, breaks = seq(20, 90, by = 10),
  include.lowest = TRUE)
 
 year.factor - cut(year, breaks = seq(1950, 2000, by = 10),
   include.lowest = TRUE)
 
 table(age.factor, year.factor)
 
 moleps wrote:
 I already did try the regression modeling approach. However the 
 epidemiologists (referee) turns out to be quite fond of comparing the 
 incidence rates to different standard populations, hence the need for this 
 labourius approach. And trying the cutting approach I ended up with :
 table (age5)
 age5
   (0,5]   (5,10]  (10,15]  (15,20]  (20,25]  (25,30]  (30,35]  (35,40]  
 (40,45]  (45,50]  (50,55]  (55,60]  (60,65]  (65,70]  (70,75]  (75,80]  
 (80,85] (85,100]   35   34   33   47   51  109  
 157  231  362  511  745  926 1002  866  547  
 247   82   18 
 table (yr5)
 yr5
 (1950,1955] (1955,1960] (1960,1965] (1965,1970] (1970,1975] (1975,1980] 
 (1980,1985] (1985,1990] (1990,1995] (1995,2000] (2000,2005] (2005,2009]  
  3   5   5   5   5   5   
 5   5   5   5   5   3 
 table (yr5,age5)
 Error in table(yr5, age5) : all arguments must have the same length
 Sincerely,
 M
 On 5. apr. 2010, at 20.59, Bert Gunter wrote:
 You have tempted, and being weak, I yield to temptation:
 
 Any good ideas?
 
 Yes. Don't do this.
 
 (what you probably really want to do is fit a model with age as a factor,
 which can be done statistically e.g. by logistic regression; or graphically
 using conditioning plots, e.g. via trellis graphics (the lattice package).
 This avoids the arbitrariness and discontinuities of binning by age range.)
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of moleps
 Sent: Monday, April 05, 2010 11:46 AM
 To: r-help@r-project.org
 Subject: [R] Data manipulation problem
 
 Dear R´ers.
 
 I´ve got a dataset with age and year of diagnosis. In order to
 age-standardize the incidence I need to transform the data into a matrix
 with age-groups (divided in 5 or 10 years) along one axis and year divided
 into 5 years along the other axis. Each cell should contain the number of
 cases for that age group and for that period. 
 I.e.
 My data format now is
 ID-age (to one decimal)-year(yearly data).
 
 What I´d like is 
 
 age 1960-1965 1966-1970 etc...
 0-5 3 8 10 15
 6-10 2 5 8 13
 etc..
 
 
 Any good ideas?
 
 Regards,
 M
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data manipulation problem

2010-04-09 Thread moleps

In the end after going at it from scratch...This worked out allright...


##set up data
age.cat-seq(0,100,10)
 year-(1953:(1953+55))
 dat.vec-sample(1:10,(length(age.cat)*length(year)))
 dat.matrix-matrix(dat.vec,c(length(age.cat),length(year)))
 rownames(dat.matrix)-age.cat
 colnames(dat.matrix)-year
 year.int-seq(1950,2010,5)
 age.div-cut(year,year.int,include.lowest=T)
 
##summarise by another variable

 a-do.call(cbind,by(t(dat.matrix),age.div,function(x)colSums(x)));a
 
//M









On 6. apr. 2010, at 21.41, David Winsemius wrote:

 
 On Apr 6, 2010, at 3:30 PM, David Winsemius wrote:
 
 
 On Apr 6, 2010, at 9:56 AM, moleps islon wrote:
 
 OK... next question.. Which is still a data manipulation problem so I
 believe the heading is still OK.
 
 ##So now I read my population data from excel.
 
 No, you read it from a text file and providing the first ten lines of that 
 text file should have been really easy. Read the Posting Guide for advice 
 about offering datasets either as structure() objects with dput or dump or 
 as attached files with *.txt extension (not .csv). Just change the file 
 name with your file browser.
 
 pop-read.csv(pop.csv)
 
 typeof(pop) ## yields a list
 
 Really? I would have guessed it to yield just list.
 
 where I have age-specific population rows
 and a yearly column population, where the years are suffixed by X
 
 And had you used class(pop) you would have learned it was a dataframe and 
 even more informative would have been str(pop).
 
 c-(1953:2008)
 
 No, no, no. Do not use variable names that are important function names. The 
 R interpreter can (usually) keep things straight but it is our brains that 
 experience problems.  Other  function names to avoid: data, df, cut, mean, 
 sd, list, vector, matrix
 
 names(pop)-c
 c.div-cut(c,break=seq(1950,2010,by=5)
 
 (You should have gotten an error here.) After fixing the error, did you you 
 notice that there were only 3 of the first level???
 
 Watch out for cut(). It uses the default convention of ( , ] , i.e. open 
 interval at right
  er,  
   ^left^
 
 which is backwards to what some (most?) of us think natural. Because of that 
 the lowest level gets dropped unless you take special precautions.  That is 
 undoubtedly why Harrell set up his Hmisc::cut2 to have the default be [ , )
 
 Aggregating across columns? Certainly possible, but maybe not as natural a 
 fit to functions like split as would occur with working across rows. I 
 suppose you could use something like this untested (because _still_ no 
 sample dataset provided) code:
 
 apply(pop, 1,# this works a row a time
   function(x) tapply(x, list(c.div), sum) ) )  # or use aggregate which uses 
 tapply
 
 I'm not sure it will work, since I don't know if the column names would get 
 carried over into x by apply(). You might need to create a separate index 
 that used the numeric positions of the columns rather than their names. 
 Perhaps use c.div -  seq(0,(2008-1953)) %/% 5  or some such inside tapply.
 
 
 Now I'd like to sum the agespecific population over the individual
 levels of -c.div- and generate a new table for this with agespecific
 rows and columns containing the 5-year bins instead of the original
 yearly data. Do I have to program this from scratch or is it possible
 to use an already existing function?
 
 I think you ought to read more introductory material (and the Posting Guide 
 regarding how to offer example datasets). In this case there are many 
 functions that do data aggregation and most of them should be illustrated in 
 a good introductory text.
 
 -- 
 David.
 
 
 //M
 
 qta- table(cut(age,breaks = seq(0, 100, by = 10),include.lowest =
 TRUE),cut(year,breaks=seq(1950,2010,by=5),include.lowest=TRUE
 
 On Mon, Apr 5, 2010 at 10:11 PM, moleps mole...@gmail.com wrote:
 
 Thx Erik,
 I have no idea what went wrong with the other code snippet, but this one 
 works.. Appreciate it.
 
 qta- table(cut(age,breaks = seq(0, 100, by = 10),include.lowest = 
 TRUE),cut(year,breaks=seq(1950,2010,by=5),include.lowest=TRUE))
 
 M
 
 
 On 5. apr. 2010, at 21.45, Erik Iverson wrote:
 
 I don't know what your data are like, since you haven't given a 
 reproducible example. I was imagining something like:
 
 ## generate fake data
 age - sample(20:90, 100, replace = TRUE)
 year - sample(1950:2000, 100, replace = TRUE)
 
 ##look at big table
 table(age, year)
 
 ## categorize data
 ## see include.lowest and right arguments to cut
 age.factor - cut(age, breaks = seq(20, 90, by = 10),
   include.lowest = TRUE)
 
 year.factor - cut(year, breaks = seq(1950, 2000, by = 10),
include.lowest = TRUE)
 
 table(age.factor, year.factor)
 
 moleps wrote:
 I already did try the regression modeling approach. However the 
 epidemiologists (referee) turns out to be quite fond of comparing the 
 incidence rates to different standard populations, hence the need

[R] ctree and survival problem

2011-04-27 Thread moleps

Dear all,

I was intrigued by the ctree command and wanted to check it out. I first ran 
the demo with example(ctree) and did get the survival graphs in the end. Upon 
doing this with my own data and yielding a Invalid operation on a survival 
time I tried to rerun example(ctree) and now I also get Invalid operation on 
a survival time after the example runs plot(GBSG2ct)... any ideas what is 
going on here. I´m running R 2.11.1

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ctree and survival problem

2011-04-27 Thread moleps

Forgot to mention that the ctree command is from the party library.
//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ctree and survival problem

2011-04-28 Thread moleps

sessionInfo yields the following:

 sessionInfo()
R version 2.11.1 (2010-05-31) 
x86_64-apple-darwin9.8.0 

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] stats4tcltk splines   grid  stats graphics  grDevices 
utils datasets  methods   base 

other attached packages:
 [1] ipred_0.8-8class_7.3-3mlbench_2.1-0  rpart_3.1-46  
 party_0.9-1vcd_1.2-9  colorspace_1.0-1   strucchange_1.4-3  
coin_1.0-18mvtnorm_0.9-95
[11] modeltools_0.2-17  Design_2.3-0   car_2.0-8  nnet_7.3-1
 sandwich_2.2-6 boot_1.2-43Amelia_1.2-18  lmtest_0.9-27  
zoo_1.6-4  MASS_7.3-7
[21] mgcv_1.7-2 cem_1.0.142randomForest_4.6-2 lattice_0.19-17   
 nlme_3.1-97rgl_0.92.798   Hmisc_3.8-3survival_2.36-2
ggplot2_0.8.9  proto_0.3-8   
[31] reshape_0.8.3  plyr_1.4   foreign_0.8-41

loaded via a namespace (and not attached):
[1] cluster_1.13.2 digest_0.4.2   Matrix_0.999375-46 tools_2.11.1  


As i described in the original post I ran

 example(ctree) 

and the following part yields an error:


  if (require(ipred)) {
data(GBSG2, package = ipred)
GBSG2ct - ctree(Surv(time, cens) ~ .,data = GBSG2)
plot(GBSG2ct)
treeresponse(GBSG2ct, newdata = GBSG2[1:2,])
}

Error in Summary.Surv(c(1814, 2018, 712, 1807, 772, 448, 2172, 2161, 471,  : 
  Invalid operation on a survival time


//M





On 28. apr. 2011, at 14.19, Jonathan Daily wrote:

 It would help people who know more about R's guts than me if you
 posted your sessionInfo() output and exactly what commands produced
 your error.
 
 It is also recommended that you try simply upgrading R to the latest
 version and see if you get an error with the latest version of
 'party'. My guess is that the error will go away.
 
 On Wed, Apr 27, 2011 at 3:40 PM, moleps mole...@gmail.com wrote:
 Forgot to mention that the ctree command is from the party library.
 //M
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 -- 
 ===
 Jon Daily
 Technician
 ===
 #!/usr/bin/env outside
 # It's great, trust me.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ctree and survival problem

2011-04-28 Thread moleps

Yup, thats the culprit. Thx.

//M


On 28. apr. 2011, at 18.36, Achim Zeileis wrote:

 On Thu, 28 Apr 2011, moleps wrote:
 
 sessionInfo yields the following:
 
 OK, the Design package causes the problem here. When you load the Design 
 package, it provides a new Surv() and related methods. This clashes with the 
 computations of ctree() based on Surv(). So it's better not to load both 
 packages simultaneously...
 Z
 
 sessionInfo()
 R version 2.11.1 (2010-05-31)
 x86_64-apple-darwin9.8.0
 
 locale:
 [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
 
 attached base packages:
 [1] stats4tcltk splines   grid  stats graphics  grDevices 
 utils datasets  methods   base
 
 other attached packages:
 [1] ipred_0.8-8class_7.3-3mlbench_2.1-0  rpart_3.1-46
party_0.9-1vcd_1.2-9  colorspace_1.0-1   
 strucchange_1.4-3  coin_1.0-18mvtnorm_0.9-95
 [11] modeltools_0.2-17  Design_2.3-0   car_2.0-8  nnet_7.3-1 
 sandwich_2.2-6 boot_1.2-43Amelia_1.2-18  lmtest_0.9-27   
zoo_1.6-4  MASS_7.3-7
 [21] mgcv_1.7-2 cem_1.0.142randomForest_4.6-2 
 lattice_0.19-17nlme_3.1-97rgl_0.92.798   Hmisc_3.8-3
 survival_2.36-2ggplot2_0.8.9  proto_0.3-8
 [31] reshape_0.8.3  plyr_1.4   foreign_0.8-41
 
 loaded via a namespace (and not attached):
 [1] cluster_1.13.2 digest_0.4.2   Matrix_0.999375-46 tools_2.11.1
 
 
 As i described in the original post I ran
 
 example(ctree)
 
 and the following part yields an error:
 
 
 if (require(ipred)) {
   data(GBSG2, package = ipred)
   GBSG2ct - ctree(Surv(time, cens) ~ .,data = GBSG2)
   plot(GBSG2ct)
   treeresponse(GBSG2ct, newdata = GBSG2[1:2,])
   }
 
 Error in Summary.Surv(c(1814, 2018, 712, 1807, 772, 448, 2172, 2161, 471,  :
 Invalid operation on a survival time
 
 
 //M
 
 
 
 
 
 On 28. apr. 2011, at 14.19, Jonathan Daily wrote:
 
 It would help people who know more about R's guts than me if you
 posted your sessionInfo() output and exactly what commands produced
 your error.
 
 It is also recommended that you try simply upgrading R to the latest
 version and see if you get an error with the latest version of
 'party'. My guess is that the error will go away.
 
 On Wed, Apr 27, 2011 at 3:40 PM, moleps mole...@gmail.com wrote:
 Forgot to mention that the ctree command is from the party library.
 //M
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 --
 ===
 Jon Daily
 Technician
 ===
 #!/usr/bin/env outside
 # It's great, trust me.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Hmisc, summary.formula and catTest

2011-01-06 Thread moleps

Dear all,

I´m specifying the fisher.exact test for use with summary.formula as follows:




u-function(a,b){

j-fisher.test(a)
p-list(P=j$p.value,stat=NA,df=NA,testname=j$method,statname=)
return(p)

}

However I´m also required to specify stat  df. However this doesnt apply to 
the fisher test. I´ve tried specifying them as NA and  without success-throws 
either a blank or an error msg trying to round a non-numeric value respectively.



reproducible example:
ex-pbc
summary(trt~sex+ascites,data=ex,test=T,method=reverse)

summary(trt~sex+ascites,data=ex,test=T,method=reverse,catTest=u)


The closest I get is 


u-function(a,b){

j-fisher.test(a)
p-list(P=j$p.value,stat=1,df=1,testname=j$method,statname=)
return(p)

}

However then I manually have to edit the output. Is there a smart way of doing 
this?

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Hmisc, summary.formula and catTest

2011-01-06 Thread moleps

Allright..Works like a charm. However I do believe that the prtest vector 
should have been mentioned in the catTest or conTest option. Appreciate your 
time and effort.
Best,
//M

On 6. jan. 2011, at 23.24, Erik Iverson wrote:

 
 Does the prtest argument help when you actually use the 'print' function
 around your summary.formula object? I think that's how I
 solve it.
 
 I.e.,
 
 sf1 - summary(trt~sex+ascites,data=ex,test=T,method=reverse,catTest=u)
 
 print(sf1, prtest = P)
 
 
 Descriptive Statistics by trt
 
 +---+---+-+-+---+
 |   |N  |1|2|P-value|
 |   |   |(N=158)  |(N=154)  |   |
 +---+---+-+-+---+
 |sex : f|418|87% (137)|90% (139)|  0.377|
 +---+---+-+-+---+
 |ascites|312| 9% ( 14)| 6% ( 10)|  0.526|
 +---+---+-+-+---+

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Hmisc, summary.formula and catTest

2011-01-06 Thread moleps

Is it at all possible to specify this so that different tests display different 
parameters,

ie have the continous test display F, df and p while tes categorical test 
display only P values?

sf1 - summary(trt~sex+ascites+age,data=ex,test=T,method=reverse,catTest=u)

print(sf1, prtest = P)

//M
 




On 6. jan. 2011, at 23.24, Erik Iverson wrote:

 
 Does the prtest argument help when you actually use the 'print' function
 around your summary.formula object? I think that's how I
 solve it.
 
 I.e.,
 
 sf1 - summary(trt~sex+ascites,data=ex,test=T,method=reverse,catTest=u)
 
 print(sf1, prtest = P)
 
 
 Descriptive Statistics by trt
 
 +---+---+-+-+---+
 |   |N  |1|2|P-value|
 |   |   |(N=158)  |(N=154)  |   |
 +---+---+-+-+---+
 |sex : f|418|87% (137)|90% (139)|  0.377|
 +---+---+-+-+---+
 |ascites|312| 9% ( 14)| 6% ( 10)|  0.526|
 +---+---+-+-+---+

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] mChoice prb in rms

2011-06-02 Thread moleps

Dear all,
I´m trying to get a output table for age and the summary of a and b, stratified 
by epo as follows using summary.formula 

h-data.frame(a=sample(c(A,NA),100,replace=T),b=sample(c(B,NA),100,replace=T),age=rnorm(100,50,25),epo=sample(c(Y,N),100,T))

library(rms)

summary.formula(epo~age+mChoice(a,b,label=test),method=reverse,data=h,na.rm=TRUE,test=T)



Descriptive Statistics by epo

+-+---+---++
| |N  |Y  |  Test  |
| |(N=51) |(N=49) |Statistic   |
+-+---+---++
|age  | 43.8/53.5/74.6| 30.8/48.8/69.3|F=1.68 d.f.=1,98 P=0.198|
+-+---+---++
|test : NA| 0% (0)| 0% (0)||
+-+---+---++
|A| 0% (0)| 0% (0)||
+-+---+---++
|B| 0% (0)| 0% (0)||
+-+---+---++


Digging deeper I find that

summary(mChoice(h$a,h$b))
h$a   4 unique combinations
Frequencies of Numbers of Choices Per Observation

nchoices
 1  2 
30 70 

Pairwise Frequencies (Diagonal Contains Marginal Frequencies)
0 x 0 matrix

 Frequencies of All Combinations 

  NA  A;B NA;A NA;B 
  30   24   23   23 


Somehow this doesnt carry over into the summary.formula output...

Also, running example(mChoice) reproduces this whereby the 0% categories are 
also shown.

Any ideas?

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Plot survival analysis with time dependent variables

2012-12-31 Thread moleps

Dear all,
Is there an implementation of Simon  Makuch method of plotting the survival 
function with time-dependent variables. I´m only able to find event.chart in 
Hmisc for the purpose and I would prefer the Simon and Makuch method. I believe 
stata has it implemented for this purpose, but I cannot find it on CRAN.


Simon R, Makuch RW. A non-parametric graphical representation of the 
relationship between survival and the occurrence of an event: application to 
responder versus non-responder bias.  Statistics in Medicine 1984; 3: 35-44. 

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] p value for joint probability

2011-01-31 Thread moleps

Dear all,

Given

rr-data.frame(r1-rnorm(1000,10,5),r2-rnorm(1000,220,5))


How can I add a column (rr$p) for the joint probability of each r1  r2 pair?

 I know how to add the column.. I just dont know how to compute the p value for 
joint probabilities given the two samples.


//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] p value for joint probability

2011-02-01 Thread moleps


My terminology is probably way off. I´ll try again in plain english. 

I´d like  to generate a scatter plot of r1  r2 and color code each pair 
according to the probability of observing the pair given that the two samples 
(r1  r2) are drawn from two independent normal distributions.

rr-data.frame(r1=-rnorm(1000,10,5),r2=-rnorm(1000,220,5))

with(rr,plot(r1,r2))

Best,
//M




On 31. jan. 2011, at 23.13, Peter Ehlers wrote:

 On 2011-01-31 12:42, moleps wrote:
 Dear all,
 
 Given
 
 rr-data.frame(r1-rnorm(1000,10,5),r2-rnorm(1000,220,5))
 
 
 How can I add a column (rr$p) for the joint probability of each r1  r2 pair?
 
 If you take the values in each pair to be observations
 from two independent Normal distributions, it's easy:
 The joint probability of those values is zero.
 
 But I suspect you mean something else by joint probability.
 Can you elaborate?
 
 Peter Ehlers
 
  I know how to add the column.. I just dont know how to compute the p value 
 for joint probabilities given the two samples.
 
 
 //M
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] p value for joint probability

2011-02-01 Thread moleps

Allright.. Appreciate the input on non-zero terminology (:-). What I wanted was:

rr-data.frame(r1=rnorm(1000,10,5),r2=rnorm(1000,220,5))
with(rr,plot(r1,r2))
r3-kde2d(r1,r2,lims=c(2,18,200,240))

filled.contour(r3)


//M


On 1. feb. 2011, at 21.26, David Winsemius wrote:

 
 On Feb 1, 2011, at 2:31 PM, moleps wrote:
 
 
 My terminology is probably way off. I´ll try again in plain english.
 
 I´d like  to generate a scatter plot of r1  r2 and color code each pair 
 according to the probability of observing the pair given that the two 
 samples (r1  r2) are drawn from two independent normal distributions.
 
 The answer is still zero. If you want to ask a different question that might 
 have a non-zero answer, it might be: How can I color points on the basis of 
 their joint density with an assumption of no correlation , you might get a 
 better answer. Densities are not probabilities. You would need to specify 
 whether the arguments to the rnorm functions (i.e. the theoretic values) were 
 to be used or did you intend to use sample values for mean and sd?
 
 
 
 rr-data.frame(r1=-rnorm(1000,10,5),r2=-rnorm(1000,220,5))
 
 with(rr,plot(r1,r2))
 
 Best,
 //M
 
 
 
 
 On 31. jan. 2011, at 23.13, Peter Ehlers wrote:
 
 On 2011-01-31 12:42, moleps wrote:
 Dear all,
 
 Given
 
 rr-data.frame(r1-rnorm(1000,10,5),r2-rnorm(1000,220,5))
 
 
 How can I add a column (rr$p) for the joint probability of each r1  r2 
 pair?
 
 If you take the values in each pair to be observations
 from two independent Normal distributions, it's easy:
 The joint probability of those values is zero.
 
 But I suspect you mean something else by joint probability.
 Can you elaborate?
 
 Peter Ehlers
 
 I know how to add the column.. I just dont know how to compute the p value 
 for joint probabilities given the two samples.
 
 
 //M
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 David Winsemius, MD
 West Hartford, CT
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Using R to analyze multiple MRI studies

2014-07-06 Thread moleps

Yes-I did look through the CRAN view and could not find any package that 
featured a function whereby an MRI set was transformed into Talairach or MNI 
space. 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] paste together a string object later to be utilized in a function

2010-06-07 Thread moleps islon

I'm sorry for not clearer describing my motive with this--

So this is what I'm trying to do- Take a survival object and utilize
it in ggplot.

ggkm-function(time,event,stratum) {


m2s-Surv(time,as.numeric(event))

fit - survfit(m2s ~ stratum)

f$time-fit$time

f$surv-fit$surv


f$strata-c(rep(names(fit$strata[1]),fit$strata[1]),rep(names(fit$strata[2]),fit$strata[2]))

f$upper-fit$upper
f$lower-fit$lower

r-ggplot 
(f,aes(x=time,y=surv,fill=strata,group=strata))+geom_line()+geom_ribbon(aes(ymin=lower,ymax=upper),alpha=0.3)
return(r)
}


My problem is that I can have more than two strata, and I would like
the function to automatically detect the number of strata. Hence a
quick hack as to how I can create f$strata when I dont know the number
of strata would be appreciated.


The 
paste(rep(names(fit$strata[,1:length(fit$strata),]),fit$strata[,1:length(fit$strata),]),sep=)
is as close as I get. However this results in multiple strings and I
havent discovered yet how I can pass this so f$strata is created. This
could easily be done in stata appending to a macro, but as I am a
recent convertee I dont know how  to do this in R. (yet...?)


Regards,

//M

On Mon, Jun 7, 2010 at 2:50 AM, Joris Meys jorism...@gmail.com wrote:
 Wild guess, but it looks like you are looking at :

 ts - list(a=1:5)
 names(ts$a) - letters[1:5]
 v-paste(rep(names(ts$a[,1:b,]),ts$a[,1:b,]),sep=)

 sapply(v,function(x){eval(parse(,text=x))})
 $`rep(names(ts$a[1]),ts$a[1])`
 [1] a

 $`rep(names(ts$a[2]),ts$a[2])`
 [1] b b

 $`rep(names(ts$a[3]),ts$a[3])`
 [1] c c c

 $`rep(names(ts$a[4]),ts$a[4])`
 [1] d d d d

 $`rep(names(ts$a[5]),ts$a[5])`
 [1] e e e e e

 assign(test,eval(parse(text=v[3])))
 test
 [1] c c c


 Cheers
 Joris

 On Sun, Jun 6, 2010 at 9:51 PM, moleps mole...@gmail.com wrote:
 Dear r-listers,

 I need to pass a string to a function. However the length of the string is 
 dependent on the length of a vector.


 b-length(h)
 v-paste(rep(names(ts$a[,1:b,]),ts$a[,1:b,]),sep=)


 Is it possible somehow to pass this as an argument to a function later on ?



 Regards,

 //M

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Ghent University
 Faculty of Bioscience Engineering
 Department of Applied mathematics, biometrics and process control

 tel : +32 9 264 59 87
 joris.m...@ugent.be
 ---
 Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Latex problem in Hmisc (3.8-1) and Mac Os X with R 2.11.1

2010-06-18 Thread moleps islon

Dear all,

I did post this more or less identical mail in a follow up to another
question I posted, but under another heading. I try again, but now
under the correct header.


upon running this code (from the Hmisc library-latex function) I
believe the call to summary.formula is allright and produces wonderful
tables, but the latex command results in a correct formatted table but
where all the numbers and the test columns are wrong. I've pasted in
both the R code and the resulting latex code annotated with comments
from the run. Does the same code produce correct cell-entries in other
installation ?

//M



library(Hmisc)

options(digits=3)
set.seed(173)
sex - factor(sample(c(m,f), 500, rep=TRUE))
age - rnorm(500, 50, 5)
treatment - factor(sample(c(Drug,Placebo), 500, rep=TRUE))
symp - c('Headache','Stomach Ache','Hangnail',
 'Muscle Ache','Depressed')
symptom1 - sample(symp, 500,TRUE)
symptom2 - sample(symp, 500,TRUE)
symptom3 - sample(symp, 500,TRUE)
Symptoms - mChoice(symptom1, symptom2, symptom3, label='Primary Symptoms')
table (Symptoms)
table(symptom1,symptom2)
f - summary(treatment ~ age + sex + Symptoms, method=reverse, test=TRUE)
g - summary(treatment ~ age + sex + symptom1, method=reverse, test=TRUE)

latex(g)

 latex(g,file=)
% latex.default(cstats, title = title, caption = caption, rowlabel =
rowlabel,  col.just = col.just, numeric.dollar = FALSE,
insert.bottom = legend,  rowname = lab, dcolumn = dcolumn,
extracolheads = extracolheads,  extracolsize = Nsize, ...)
%
\begin{table}[!tbp]
 \caption{Descriptive Statistics by treatment\label{g}}
 \begin{center}
 \begin{tabular}{lccc}\hline\hline
\multicolumn{1}{l}{}\multicolumn{1}{c}{Drug}\multicolumn{1}{c}{Placebo}\multicolumn{1}{c}{Test
Statistic}\tabularnewline

\multicolumn{1}{c}{{\scriptsize
$N=263$}}\multicolumn{1}{c}{{\scriptsize $N=237$}}\tabularnewline
\hline
age114\tabularnewline
sex~:~m672\tabularnewline
symptom1~:~Depressed433\tabularnewline
Hangnail561\tabularnewline
Headache421\tabularnewline
Muscle~Ache351\tabularnewline
Stomach~Ache241\tabularnewline
\hline
\end{tabular}

\end{center}

\noindent {\scriptsize $a$\ }{$b$\ }{\scriptsize $c$\ } represent the
lower quartile $a$, the median $b$, and the upper quartile $c$\ for
continuous variables.\\Numbers after percents are
frequencies.\\\indent Tests used:\\\textsuperscript{\normalfont
1}Wilcoxon test; \textsuperscript{\normalfont 2}Pearson test
\end{table}


###Then I did another example from Harrell´s statistical tables and plots

rm(list=ls())



library(Hmisc)
getHdata(prostate)
# Variables in prostate had units in ( ) inside variable labels. Move
# these units of measurements to separate units attributes
# wt is an exception. It has ( ) in its label but this does not denote units
# Also make hg have a legal R plotmath expression
prostate-upData(prostate, moveUnits=TRUE,units=c(wt=,
hg=g/100*ml),labels=c(wt=Weight Index = wt(kg)-ht(cm)+200))

attach(prostate)
stage- factor(stage, 3:4, c(Stage 3,Stage 4))
s6-summary(stage~rx+age+wt+pf+hx+sbp+dbp+ekg+hg+sz+sg+ap+bm,method=reverse,
overall=TRUE, test=TRUE)
options(digits=2)
w-latex(s6, size=smaller[3], outer.size=smaller,
Nsize=smaller,long=TRUE, prmsd=TRUE,
msdsize=smaller,middle.bold=TRUE, ctable=TRUE)


##This refused to run ( as long as the ctable=T was included), but without it

latex (s6)

##I do get a nicely formated table, but again the numbers are all wrong... Also

##latex(s6, long=TRUE, prmsd=TRUE, msdsize=smaller,middle.bold=TRUE)

##makes no difference from latex(s6) alone with regards to formatting...



Quite frustrating-Any suggestions??


//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data manipulation problem

2010-04-06 Thread moleps islon

OK... next question.. Which is still a data manipulation problem so I
believe the heading is still OK.

##So now I read my population data from excel.
pop-read.csv(pop.csv)

typeof(pop) ## yields a list where I have age-specific population rows
and a yearly column population, where the years are suffixed by X

c-(1953:2008)
names(pop)-c
c.div-cut(c,break=seq(1950,2010,by=5)

Now I'd like to sum the agespecific population over the individual
levels of -c.div- and generate a new table for this with agespecific
rows and columns containing the 5-year bins instead of the original
yearly data. Do I have to program this from scratch or is it possible
to use an already existing function?


//M






qta- table(cut(age,breaks = seq(0, 100, by = 10),include.lowest =
TRUE),cut(year,breaks=seq(1950,2010,by=5),include.lowest=TRUE

On Mon, Apr 5, 2010 at 10:11 PM, moleps mole...@gmail.com wrote:

 Thx Erik,
 I have no idea what went wrong with the other code snippet, but this one 
 works.. Appreciate it.

 qta- table(cut(age,breaks = seq(0, 100, by = 10),include.lowest = 
 TRUE),cut(year,breaks=seq(1950,2010,by=5),include.lowest=TRUE))

 M


 On 5. apr. 2010, at 21.45, Erik Iverson wrote:

 I don't know what your data are like, since you haven't given a reproducible 
 example. I was imagining something like:

 ## generate fake data
 age - sample(20:90, 100, replace = TRUE)
 year - sample(1950:2000, 100, replace = TRUE)

 ##look at big table
 table(age, year)

 ## categorize data
 ## see include.lowest and right arguments to cut
 age.factor - cut(age, breaks = seq(20, 90, by = 10),
                  include.lowest = TRUE)

 year.factor - cut(year, breaks = seq(1950, 2000, by = 10),
                   include.lowest = TRUE)

 table(age.factor, year.factor)

 moleps wrote:
 I already did try the regression modeling approach. However the 
 epidemiologists (referee) turns out to be quite fond of comparing the 
 incidence rates to different standard populations, hence the need for this 
 labourius approach. And trying the cutting approach I ended up with :
 table (age5)
 age5
   (0,5]   (5,10]  (10,15]  (15,20]  (20,25]  (25,30]  (30,35]  (35,40]  
 (40,45]  (45,50]  (50,55]  (55,60]  (60,65]  (65,70]  (70,75]  (75,80]  
 (80,85] (85,100]       35       34       33       47       51      109      
 157      231      362      511      745      926     1002      866      547 
      247       82       18
 table (yr5)
 yr5
 (1950,1955] (1955,1960] (1960,1965] (1965,1970] (1970,1975] (1975,1980] 
 (1980,1985] (1985,1990] (1990,1995] (1995,2000] (2000,2005] (2005,2009]     
       3           5           5           5           5           5         
   5           5           5           5           5           3
 table (yr5,age5)
 Error in table(yr5, age5) : all arguments must have the same length
 Sincerely,
 M
 On 5. apr. 2010, at 20.59, Bert Gunter wrote:
 You have tempted, and being weak, I yield to temptation:

 Any good ideas?

 Yes. Don't do this.

 (what you probably really want to do is fit a model with age as a factor,
 which can be done statistically e.g. by logistic regression; or graphically
 using conditioning plots, e.g. via trellis graphics (the lattice package).
 This avoids the arbitrariness and discontinuities of binning by age range.)

 Bert Gunter
 Genentech Nonclinical Biostatistics

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of moleps
 Sent: Monday, April 05, 2010 11:46 AM
 To: r-help@r-project.org
 Subject: [R] Data manipulation problem

 Dear R´ers.

 I´ve got a dataset with age and year of diagnosis. In order to
 age-standardize the incidence I need to transform the data into a matrix
 with age-groups (divided in 5 or 10 years) along one axis and year divided
 into 5 years along the other axis. Each cell should contain the number of
 cases for that age group and for that period.
 I.e.
 My data format now is
 ID-age (to one decimal)-year(yearly data).

 What I´d like is

 age 1960-1965 1966-1970 etc...
 0-5 3 8 10 15
 6-10 2 5 8 13
 etc..


 Any good ideas?

 Regards,
 M

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data manipulation problem

2010-04-07 Thread moleps islon

So.. here we try again.

##generate dataset
age.cat-seq(0,100,10)
year-(1953:(1953+55))
data.vec-sample(1:1,(age.cat*year))
data.matrix-matrix(data.vec,c(length(age.cat),length(year))
rownames(data.matrix)-age.cat
colnames(data.matrix)-year

##divide into 5 year periods
age.div-cut(year,seq(1950,2010,6),include.lowest=T) ##interval is
beyond my datainterval so I doubt the include.lowest matters

Now what I'd like to do is summarise the rows within the 5-year intervals.

I did read about apply in its different variants and Dahlgaard, but I
do not know understand how it could be applied in this setting.

I tried making an array and summarise by that (used the vector and
applied it into a
length(age.cat)*max(vector(table(age.div)*length(age.div) array. It
worked but required a bit of tweaking (inserting null columns) and I
find myself in this situation quite often whereby I need to add
multiple columns based on another vector so I'd be very interested in
another more general approach.

//M



On Tue, Apr 6, 2010 at 9:41 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Apr 6, 2010, at 3:30 PM, David Winsemius wrote:


 On Apr 6, 2010, at 9:56 AM, moleps islon wrote:

 OK... next question.. Which is still a data manipulation problem so I
 believe the heading is still OK.

 ##So now I read my population data from excel.

 No, you read it from a text file and providing the first ten lines of that
 text file should have been really easy. Read the Posting Guide for advice
 about offering datasets either as structure() objects with dput or dump or
 as attached files with *.txt extension (not .csv). Just change the file
 name with your file browser.

 pop-read.csv(pop.csv)

 typeof(pop) ## yields a list

 Really? I would have guessed it to yield just list.

 where I have age-specific population rows
 and a yearly column population, where the years are suffixed by X

 And had you used class(pop) you would have learned it was a dataframe and
 even more informative would have been str(pop).

 c-(1953:2008)

 No, no, no. Do not use variable names that are important function names.
 The R interpreter can (usually) keep things straight but it is our brains
 that experience problems.  Other  function names to avoid: data, df, cut,
 mean, sd, list, vector, matrix

 names(pop)-c
 c.div-cut(c,break=seq(1950,2010,by=5)

 (You should have gotten an error here.) After fixing the error, did you
 you notice that there were only 3 of the first level???

 Watch out for cut(). It uses the default convention of ( , ] , i.e. open
 interval at right

  er,
^left^

 which is backwards to what some (most?) of us think natural. Because of
 that the lowest level gets dropped unless you take special precautions.
  That is undoubtedly why Harrell set up his Hmisc::cut2 to have the default
 be [ , )

 Aggregating across columns? Certainly possible, but maybe not as natural a
 fit to functions like split as would occur with working across rows. I
 suppose you could use something like this untested (because _still_ no
 sample dataset provided) code:

 apply(pop, 1,# this works a row a time
   function(x) tapply(x, list(c.div), sum) ) )  # or use aggregate which
 uses tapply

 I'm not sure it will work, since I don't know if the column names would
 get carried over into x by apply(). You might need to create a separate
 index that used the numeric positions of the columns rather than their
 names. Perhaps use c.div -  seq(0,(2008-1953)) %/% 5  or some such inside
 tapply.


 Now I'd like to sum the agespecific population over the individual
 levels of -c.div- and generate a new table for this with agespecific
 rows and columns containing the 5-year bins instead of the original
 yearly data. Do I have to program this from scratch or is it possible
 to use an already existing function?

 I think you ought to read more introductory material (and the Posting
 Guide regarding how to offer example datasets). In this case there are many
 functions that do data aggregation and most of them should be illustrated in
 a good introductory text.

 --
 David.


 //M

 qta- table(cut(age,breaks = seq(0, 100, by = 10),include.lowest =
 TRUE),cut(year,breaks=seq(1950,2010,by=5),include.lowest=TRUE

 On Mon, Apr 5, 2010 at 10:11 PM, moleps mole...@gmail.com wrote:

 Thx Erik,
 I have no idea what went wrong with the other code snippet, but this one
 works.. Appreciate it.

 qta- table(cut(age,breaks = seq(0, 100, by = 10),include.lowest =
 TRUE),cut(year,breaks=seq(1950,2010,by=5),include.lowest=TRUE))

 M


 On 5. apr. 2010, at 21.45, Erik Iverson wrote:

 I don't know what your data are like, since you haven't given a
 reproducible example. I was imagining something like:

 ## generate fake data
 age - sample(20:90, 100, replace = TRUE)
 year - sample(1950:2000, 100, replace = TRUE)

 ##look at big table
 table(age, year)

 ## categorize data
 ## see include.lowest

Re: [R] Plot survival analysis with time dependent variables

2012-12-31 Thread moleps islon

I did.But could only find the citation-not an implementation.

On Tuesday, January 1, 2013, David Winsemius wrote:


 On Dec 31, 2012, at 4:38 PM, moleps wrote:

  Dear all,
 Is there an implementation of Simon  Makuch method of plotting the
 survival function with time-dependent variables. I´m only able to find
 event.chart in Hmisc for the purpose and I would prefer the Simon and
 Makuch method. I believe stata has it implemented for this purpose, but I
 cannot find it on CRAN.


 Simon R, Makuch RW. A non-parametric graphical representation of the
 relationship between survival and the occurrence of an event: application
 to responder versus non-responder bias.  Statistics in Medicine 1984; 3:
 35-44.


 Have you done any searching?

 http://markmail.org/search/?q=**list%3Aorg.r-project.r-help+**
 Simon+%26+Makuch#query:list%**3Aorg.r-project.r-help%**
 20Simon%20%26%20Makuch+page:1+**mid:p7kssr6awkrwnnoo+state:**resultshttp://markmail.org/search/?q=list%3Aorg.r-project.r-help+Simon+%26+Makuch#query:list%3Aorg.r-project.r-help%20Simon%20%26%20Makuch+page:1+mid:p7kssr6awkrwnnoo+state:results




 --

 David Winsemius, MD
 Alameda, CA, USA



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] SetInternet2, RCurl and proxy

2011-01-31 Thread moleps islon

Dear all,

Using the SetInternet2(TRUE) option works wonders with R in my sealed
down work-environment. However, I'd like to use RCurl and apparently
the proxy settings are not carried over. Is it possible to figure out
the proxy-IP and port number from R after invoking SetInternet2?

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SetInternet2, RCurl and proxy

2011-01-31 Thread moleps islon

I know that the proxy usually can be found in internet
explorer..However the hospital IE version has been altered so that
nothing is visible. And the IT dept is not very keen on revealing the
proxy settings.




On Mon, Jan 31, 2011 at 2:08 PM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:
 On Mon, 31 Jan 2011, moleps islon wrote:

 Dear all,

 Using the SetInternet2(TRUE) option works wonders with R in my sealed
 down work-environment. However, I'd like to use RCurl and apparently
 the proxy settings are not carried over. Is it possible to figure out
 the proxy-IP and port number from R after invoking SetInternet2?

 No, but it should be from your browser: all SetInternet2 does is to switch
 to use Internet Explorer internals.  cURL (and hence RCurl) knows nothing
 about IE's settings.

 Note that if all you need is the proxy details, then you don't need
 SetInternet2: see ?download.file.  However, many sites need to authenticate
 to the proxy 

 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] MatchIt and sensitivity analysis

2011-02-10 Thread moleps islon

Dear all,

Is there a package that allows me to run a sensitivy analysis on a
matched dataset created using MatchIt? I am aware of both rbounds and
the sensitivy function in the twang package but they do not allow
matched objects from MatchIt as input.

//M

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Using R to analyze multiple MRI studies

2014-07-03 Thread moleps islon

I need to analyze multiple T1 contrast enhanced MRI studies from different
patients. They are all in DICOM format. I see that there are different
packages for loading individual studies in DICOM format, however I have had
limited luck so far researching how the different studies can be tranformed
into MNI or Talairach space. Is there an R-implementation of this?

Best,

M

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

50 matches

Mail list logo