I have a large dataset that includes subjects (ID), dates, and events that need
to be counted. Not every date includes an event, and I need to count only one
event per 30 days, per subject. In essence, I need to create a 30-day "blackout"
period for each subject, during which an event cannot be counted. The
reason is that a rule has been set up whereby a subject can be counted only
once per 30-day period (the 30-day window includes the day the event of
interest is counted).
The solution should count only the following events per subject (per the 30-day
blackout rule):
ID Date
auto1 1/1/2010
auto2 2/12/2010
auto2 4/21/2011
auto3 3/1/2010
auto3 5/3/2010
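One way to read the rule is as a greedy scan over each subject's sorted event dates: count the first event, skip everything inside its window, count the next event after the window, and so on. Below is a minimal sketch in base R for a single subject. The function name count_events and the window argument are illustrative, not from any package, and it assumes a counted event on day d blocks days d through d + window - 1 (so the counted day is part of the window). If the cutoff should instead block day d + 30 as well, window = 31 would be the value to try.

```r
# Hypothetical helper: return the dates that get counted for ONE subject.
# Assumption: a counted event on day d blocks days d .. d + window - 1.
count_events <- function(dates, window = 30) {
  dates <- sort(unique(as.Date(dates)))  # duplicates on one day count once
  keep <- logical(length(dates))
  last_counted <- NULL
  for (i in seq_along(dates)) {
    # count this date if nothing has been counted yet,
    # or if it falls outside the window of the last counted event
    if (is.null(last_counted) ||
        as.numeric(dates[i] - last_counted) >= window) {
      keep[i] <- TRUE
      last_counted <- dates[i]
    }
  }
  dates[keep]
}

# Event dates for one subject; the third falls inside the second's window
count_events(as.Date(c("2010-03-01", "2010-05-03", "2010-05-20")))
```

Applied to each subject's event dates separately (for example via split() on the ID column), this should give one vector of counted dates per subject.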
I have created a multistep process to do this, but it is extremely clumsy
(detailed below). I have to believe that one of you has a much more elegant
solution. Thank you all in advance for any help!!!!
## example data
data1 <- structure(list(ID = structure(c(2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,3L, 4L,
4L, 4L, 4L, 4L), .Label = c("", "auto1", "auto2", "auto3"), class = "factor"),
Date = structure(c(14610, 14610, 14627,14680, 14652, 14660, 14725, 15085,
15086, 14642, 14669, 14732,14747, 14749), class = "Date"), event = c(1L, 1L,
1L, 0L, 1L,1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L)), .Names = c("ID",
"Date","event"), class = "data.frame", row.names = c(NA, 14L))
## remove non events
data2 <- data1[data1$event==1,]
library(doBy)
## create a table of first events
step1 <- summaryBy(Date~ID, data = data2, FUN=min)
step1$Date30 <- step1$Date.min+30
step2 <- merge(data2, step1, by.x="ID", by.y="ID")
## use an ifelse to essentially remove any events that shouldn't be counted
step2$event <- ifelse(as.numeric(step2$Date) >= step2$Date.min &
as.numeric(step2$Date) <= step2$Date30, 0, step2$event)
## basically repeat steps above until I get an error (no more events)
data3 <- step2[step2$event==1,]
data3<- data3[,1:3]
step3 <- summaryBy(Date~ID, data = data3, FUN=min)
step3$Date30 <- step3$Date.min+30
step4 <- merge(data3, step3, by.x="ID", by.y="ID")
step4$event <- ifelse(as.numeric(step4$Date) >= step4$Date.min &
as.numeric(step4$Date) <= step4$Date30, 0, step4$event)
## then I rbind the "keepers"
## in this case steps 1 and 3 above
final <- rbind(step1,step3)
## then reformat
final <- final[,1:2]
final$Date.min <- as.Date(final$Date.min,origin="1970-01-01")
## again, extremely clumsy, but it works... HELP! :)
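The repeated steps above could probably be folded into a loop that runs until no uncounted events remain, so nothing has to be repeated by hand. Here is a rough sketch in base R that uses ave() in place of summaryBy() and merge(). The function name blackout_loop and the days argument are illustrative. It assumes a data frame with ID, Date and event columns like data1, and it keeps the same cutoff as the ifelse() above (events up to and including first date + 30 are dropped).

```r
# Hypothetical helper: loop version of the "first event, drop 30 days, repeat" steps.
# Assumption: df has columns ID, Date (class Date) and event (1 = event, 0 = none).
blackout_loop <- function(df, days = 30) {
  remaining <- df[df$event == 1, c("ID", "Date")]
  kept <- remaining[0, ]                       # empty frame, same columns
  while (nrow(remaining) > 0) {
    d <- as.numeric(remaining$Date)
    # earliest remaining event date per subject, repeated on every row
    first <- ave(d, as.character(remaining$ID), FUN = min)
    # that earliest event is counted (once, even if duplicated on the day)
    kept <- rbind(kept, unique(remaining[d == first, ]))
    # keep only events after the blackout window for the next pass
    remaining <- remaining[d > first + days, ]
  }
  kept <- kept[order(kept$ID, kept$Date), ]
  rownames(kept) <- NULL
  kept
}
```

With the example data, blackout_loop(data1) should return the five ID/Date pairs listed near the top of this message. The loop runs once per counted event of the busiest subject, so it should stay cheap even on a large dataset.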