Try this. The function takes a vector of dates of the form yyyy-mm and returns a character vector of dates in the same form, where each output date is the beginning of the 6-month period in which the input date lies. The 6-month intervals are measured from the minimum date in the input vector.

date.grouping <- function(d) {
  # for each date in d, calculate the date beginning the 6-month period
  # that contains it

  # 2 x n matrix: row 1 holds the years, row 2 the months
  mat <- matrix(as.numeric(unlist(strsplit(as.character(d), "-"))), nrow = 2)
  f <- function(x) do.call("ISOdate", as.list(x))
  # append day 1 to each column; apply() drops the POSIXct class,
  # so adding the epoch turns the numbers back into date-times
  POSIXct.dates <- apply(rbind(mat, 1), 2, f) + ISOdate(1970, 1, 1)
  # one break every 6 months starting at the earliest date, open at the top
  breaks <- c(seq(from = min(POSIXct.dates), along.with = POSIXct.dates,
                  by = "6 months"), Inf)
  format(as.POSIXct(cut(POSIXct.dates, breaks, include.lowest = TRUE)), "%Y-%m")
}

patients2 <- with(patients, tapply(cost, list(ID, date.grouping(date)), sum))
patients2 <- as.data.frame(patients2)
summary(patients2)
boxplot(patients2)

--- Ricardo Pietrobon <[EMAIL PROTECTED]> wrote:
>Hi,
>
>I am new to R, coming from a few years using Stata. I've been twisting my
>brain and checking several R and S references over the last few days to
>try to solve this data management problem: I have a data set with a unique
>patient identifier that is repeated along multiple rows, a variable with
>month of patient encounter, and a continuous variable for cost of
>individual encounters. The data looks like this:
>
>ID  date       cost
>1   "2001-01"  200.00
>1   "2001-01"  123.94
>1   "2001-03"  100.23
>1   "2001-04"  150.34
>2   "2001-03"  296.34
>2   "2002-05"  156.36
>
>I would like to obtain the median costs and boxplots for the sum of
>encounters happening in the first six months after the index encounter
>(first patient encounter) for each patient, then the mean and median costs
>for the costs happening from 6 to 12 months after the index encounter, and
>so on. Notice that the first ID has two encounters during the index date,
>making it more difficult to define a single row with the index encounter.
>
>Any help would be appreciated,
>
>Ricardo
>
>Ricardo Pietrobon, MD
>Assistant Professor of Surgery
>Duke University Medical Center
>Durham, NC 27710 US
>
>______________________________________________
>[EMAIL PROTECTED] mailing list
>https://www.stat.math.ethz.ch/mailman/listinfo/r-help