Francisco J. Zagmutt wrote:
If you want to obtain a data frame you can use the functions head and
tail like:
dat=data.frame(id=rep(1:5,3),num=rnorm(15), num2=rnorm(15))  # Creates data frame with id
last=do.call(rbind,by(dat,dat$id,tail,1))  # Selects the last observation for each id
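As a rough sketch of how the same idea extends to the first record as well: head(x, 1) picks the first row of each group in the same way tail(x, 1) picks the last. The set.seed() call and the names first and both are additions for illustration only; the rest follows the example above.

```r
# Sketch only: same toy data as above, with a seed so the run is repeatable
set.seed(1)
dat <- data.frame(id = rep(1:5, 3), num = rnorm(15), num2 = rnorm(15))

# First and last observation for each id (rows are taken in their existing order)
first <- do.call(rbind, by(dat, dat$id, head, 1))
last  <- do.call(rbind, by(dat, dat$id, tail, 1))

# Stack the two and sort by id so each id's first row precedes its last row
both <- rbind(first, last)
both <- both[order(both$id), ]
```

Note that an id with a single row would appear twice in both, once as its first and once as its last record.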
I have a dataframe that contains fields such as patid, labdate, labvalue.
The same patid may show up in multiple rows because of lab measurements on
multiple days. Is there a simple way to obtain just the first and last
record for each patient, or do I need to write some code that performs
If you have your data.frame ordered by the patid, you can use the
function rle in combination with cumsum. As a vector example:
a <- rep(c('a','b','c'),10)
a
[1] a b c a b c a b c a b c a b c a b c a
[20] b c a b c a b c a b c
b <- a[order(a)]
b
[1] a a a a a a a a a a b b b b b b b b b
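The quoted example stops before the rle step, so here is one possible way the rle/cumsum idea might be completed; it is a sketch under assumptions, not necessarily what the poster had in mind. The vector patid and the names runs, first_idx and last_idx are made up for illustration, and the ids are assumed to be sorted already.

```r
# Hypothetical, already sorted patient ids
patid <- c(1, 1, 1, 2, 2, 3)

# rle() gives the length of each run of identical ids
runs <- rle(patid)

# The cumulative sum of the run lengths is the index of the last row per id;
# stepping back by the run length (plus one) gives the index of the first row
last_idx  <- cumsum(runs$lengths)
first_idx <- last_idx - runs$lengths + 1

# Row indices to keep; unique() avoids repeating ids that have a single row
keep <- sort(unique(c(first_idx, last_idx)))
```

With a data frame sorted by patid, yourframe[keep, ] would then return the first and last record for each patient.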
I think by() is simpler:
by(yourframe,factor(yourframe$patid),function(x)x[c(1,nrow(x)),])
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
The business of the statistician is to catalyze the scientific learning
process. - George E. P. Box
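For anyone who wants a single data frame back rather than the list that by() returns, the by() call above can be combined with do.call(rbind, ...). This is only a sketch: the data frame labs and its values are invented to match the patid, labdate and labvalue fields in the question, and unique() is added so that a patient with one row is not listed twice.

```r
# Hypothetical lab data with the fields named in the question
labs <- data.frame(
  patid    = c(1, 1, 1, 2, 3, 3),
  labdate  = as.Date(c("2005-01-03", "2005-01-10", "2005-02-01",
                       "2005-01-05", "2005-03-01", "2005-03-15")),
  labvalue = c(5.1, 4.8, 5.0, 6.2, 3.9, 4.1)
)

# Sort so that "first" and "last" mean earliest and latest lab date
labs <- labs[order(labs$patid, labs$labdate), ]

# First and last row per patient; unique() collapses single-row patients
pieces <- by(labs, labs$patid, function(x) x[unique(c(1, nrow(x))), ])

# Bind the per-patient pieces back into one data frame
first_last <- do.call(rbind, pieces)
```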