Dear all, I'm writing to you because I'm not able to construct in a long data
2 new variables based on other columns for each id. Small example with only one
subject:dt <- data.table( id = rep(1,7), bday = rep(as.Date("1960-10-29"),7),
start = rep(as.Date("2005-02-27"),7), marker0 = rep(125,7), datep =
as.Date(c('2005-04-20','2005-10-28','2005-12-31','2006-08-10','2006-12-31','2007-02-19','2007-05-15')),
marker = c(10,2,0,5,3,7,1) ) I would want to construct sistematically a new
data table from three conditions (I have the first): 1) It only keeps rows
whose datep variable is 31st december or is the last date within id (I can get
it)newdt <- unique(rbind(dt[which(month(datep)==12 &
as.POSIXlt(datep)$mday==31)], dt[, .SD[.N],
by='id'])[,list(id,init=0,marker0,sum=0,dsum=datep)])[order(id,dsum)] id init
marker0 sum dsum1: 1 0 125 0 2005-12-312: 1
0 125 0 2006-12-313: 1 0 125 0
2007-05-15 2) It has a so-called init variable, whose value is defined in
1st row as difference (yr) between start and bday, in 2nd row as difference
between dsum of 1st row and bday and finally the 3rd row as difference
between dsum of 2nd row and bday: id bday
ini marker0 sum
dsum1: 1 1960-10-29 difftime(start,
bday)/365.25 125 0 2005-12-31 2: 1 1960-10-29
difftime(as.Date("2005-12-31"), bday)/365.25 125 0 2006-12-31
3: 1 1960-10-29 difftime(as.Date("2006-12-31"), bday)/365.25 125
0 2007-05-15 3) It has also a sum variable, whose value is defined
as follows: 3a) For first row within each id: The corresponding marker0
value. 3b) For each of the following rows within id: Previous sum value plus
the sum of marker's values across the previous year. id init marker0
sum dsum 1: 1 44.3 125 125
2005-12-31 2: 1 45.2 125 125+10+2=137
2006-12-31 3: 1 46.2 125 137+5+3=145 2007-05-15 Thanks
to all R-data table help community for your continuous help!
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help