Re: [R] calcul of the mean in a period of time

2013-05-23 Thread arun
HI GG,
I should have checked with multiple t=0-only rows.
Apologies!
Check if this works (I changed the thread name, as the solution applies to that
problem):

dat2 <- read.csv("dat6.csv", header=TRUE, sep="\t", row.names=1)
str(dat2)
#'data.frame':    3896 obs. of  3 variables:
# $ patient_id: int  2 2 2 2 2 2 2 2 2 2 ...
# $ t : int  0 1 2 3 4 5 6 7 8 9 ...
# $ basdai    : num  2.83 4.05 3.12 3.12 2.42 ...
 
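library(plyr)
# expand each patient to every integer t between min(t) and max(t),
# then join the observed scores back (missing t get NA)
dat2New <- ddply(dat2, .(patient_id), summarize, t=seq(min(t), max(t)))
res <- join(dat2New, dat2, type="full")

# per patient: split the t>0 rows into 3-unit periods; the t=0 row joins period 1
lst1 <- lapply(split(res, res$patient_id), function(x) {
  x1 <- x[x$t != 0, ]
  do.call(rbind, lapply(split(x1, ((x1$t - 1) %/% 3) + 1), function(y) {
    y1 <- if (any(y$t == 1)) rbind(x[x$t == 0, ], y) else y
    data.frame(patient_id=unique(y1$patient_id), t=head(y1$t, 1),
               basdai=mean(y1$basdai, na.rm=TRUE))
  }))
})

# rows for patients whose only observation is at t=0
dat3 <- dat2[unlist(with(dat2, tapply(t, patient_id,
          FUN=function(x) x == 0 & length(x) == 1)), use.names=FALSE), ]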
 head(dat3,3)
#    patient_id t basdai
#143 10 0  5.225
#555 37 0  2.450
#627 42 0  6.950

lst2 <- split(dat3, seq_len(nrow(dat3)))

# fill each empty (t=0-only) patient slot with that patient's own row
lst1[lapply(lst1, length) == 0] <- mapply(rbind, lst1[lapply(lst1, length) == 0], lst2, SIMPLIFY=FALSE)
res1 <- do.call(rbind, lst1)
row.names(res1) <- 1:nrow(res1)
res2 <- res1[, -2]
res2$period <- with(res2, ave(patient_id, patient_id, FUN=seq_along))
 #res2
#selected rows
res2[c(48:51,189:192,210:215),]
#    patient_id   basdai period
#48   9 3.625000  8
#49  10 5.225000  1 #t=0 only row
#50  11 6.018750  1
#51  11 6.00  2
#189 36 6.17  1
#190 37 2.45  1 #t=0 only row
#191 38 3.10  1
#192 38 3.575000  2
#210 41 1.918750  1
#211 41 4.025000  2
#212 41 2.975000  3
#213 41 1.725000  4
#214 42 6.95  1 #t=0 only row
#215 44 4.30  1

A.K.
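
A toy illustration of the difference from the earlier version (the 2013-05-22 message further down). This is only a sketch, not from the thread: `lst1` and `dat3` below are made-up stand-ins, and `empty`, `old` and `new` are hypothetical names. It suggests why the earlier fill could repeat each t=0-only patient once per such patient, while the `mapply` version pairs one row with one slot.

```r
# Made-up stand-ins: patients "3" and "5" have only a t=0 row,
# so their per-period result in lst1 is NULL.
lst1 <- list(`1` = data.frame(patient_id = 1, t = 1, scores = 2.05),
             `3` = NULL,
             `5` = NULL)
dat3 <- data.frame(patient_id = c(3, 5), t = 0, scores = c(1.2, 0.9))
empty <- vapply(lst1, is.null, logical(1))

# Earlier fill: every empty slot receives ALL t=0-only rows.
old <- lapply(lst1[empty], function(x) dat3)

# Revised fill: split dat3 into one-row pieces and pair them with the
# empty slots in order (assumes both are ordered by patient_id).
lst2 <- split(dat3, seq_len(nrow(dat3)))
new <- mapply(rbind, lst1[empty], lst2, SIMPLIFY = FALSE)

sapply(old, nrow)  # 2 rows per slot: each patient appears twice after rbind
sapply(new, nrow)  # 1 row per slot
```

The pairing relies on the order of the empty slots matching the row order of `dat3`, which holds here because both follow `patient_id`.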







From: GUANGUAN LUO guanguan...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Thursday, May 23, 2013 9:50 AM
Subject: Re: how to calculate the mean in a period of time?



Hello, Arun, sorry to trouble you again.
I tried your method and I found that for patient_id==10, patient_id==37, etc.,
the scores are repeated 51 times. I don't understand why this occurred.

Thank you so much.

GG

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] calcul of the mean in a period of time

2013-05-22 Thread arun
Hi,
I guess you meant this:


dat2 <- read.table(text="
patient_id t scores
1 0 1.6
1 1 2.6
1 2 2.2
1 3 1.8
2 0 2.3
2 2 2.5
2 4 2.6
2 5 1.5
3 0 1.2
4 0 1.3
4 1 1.8
", sep="", header=TRUE)

library(plyr)
dat2New <- ddply(dat2, .(patient_id), summarize, t=seq(min(t), max(t)))
res <- join(dat2New, dat2, type="full")

lst1 <- lapply(split(res, res$patient_id), function(x) {
  x1 <- x[x$t != 0, ]
  do.call(rbind, lapply(split(x1, ((x1$t - 1) %/% 3) + 1), function(y) {
    y1 <- if (any(y$t == 1)) rbind(x[x$t == 0, ], y) else y
    data.frame(patient_id=unique(y1$patient_id), t=head(y1$t, 1),
               scores=mean(y1$scores, na.rm=TRUE))
  }))
})

# fill the empty slots with the t=0-only rows
# (only correct when a single patient has just a t=0 row; see the 2013-05-23 follow-up)
lst1[lapply(lst1, length) == 0] <- lapply(lst1[lapply(lst1, length) == 0], function(x)
  dat2[unlist(with(dat2, tapply(t, patient_id,
    FUN=function(x) x == 0 & length(x) == 1)), use.names=FALSE), ])
res1 <- do.call(rbind, lst1)
row.names(res1) <- 1:nrow(res1)
res2 <- res1[, -2]
res2$period <- with(res2, ave(patient_id, patient_id, FUN=seq_along))
res2
# patient_id scores period
#1  1   2.05  1
#2  2   2.40  1
#3  2   2.05  2
#4  3   1.20  1
#5  4   1.55  1
A.K.
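
For comparison, a base-R sketch of the same period means without plyr. This is not the method used in the thread: it assigns each row a 3-unit period directly and folds t=0 into period 1 with `pmax()`, so a patient with only a t=0 row needs no special case. The `period` column here is the window index, and windows with no observations are dropped rather than reported as NaN, so results may differ from the code above on data with gaps.

```r
dat2 <- read.table(text = "
patient_id t scores
1 0 1.6
1 1 2.6
1 2 2.2
1 3 1.8
2 0 2.3
2 2 2.5
2 4 2.6
2 5 1.5
3 0 1.2
4 0 1.3
4 1 1.8
", header = TRUE)

# Map t to a 3-unit period: t = 1..3 -> 1, t = 4..6 -> 2, ...
# (0 - 1) %/% 3 + 1 is 0 in R, so pmax() folds t = 0 into period 1.
dat2$period <- pmax(1, (dat2$t - 1) %/% 3 + 1)

# Mean score per patient and period, ordered by patient then period.
res <- aggregate(scores ~ patient_id + period, data = dat2, FUN = mean)
res <- res[order(res$patient_id, res$period), ]
res
```

On this sample the means agree with the table above (2.05, 2.40, 2.05, 1.20, 1.55).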


From: GUANGUAN LUO guanguan...@gmail.com
To: arun smartpink...@yahoo.com 
Sent: Wednesday, May 22, 2013 5:42 AM
Subject: calcul of the mean in a period of time



Hello, AK, this is the code which you have written.

dat2 <- read.table(text="
patient_id t scores
1 0 1.6
1 1 2.6
1 2 2.2
1 3 1.8
2 0 2.3
2 2 2.5
2 4 2.6
2 5 1.5
", sep="", header=TRUE)

library(plyr)
dat2New <- ddply(dat2, .(patient_id), summarize, t=seq(min(t), max(t)))
res <- join(dat2New, dat2, type="full")
res1 <- do.call(rbind, lapply(split(res, res$patient_id), function(x) {
  x1 <- x[x$t != 0, ]
  do.call(rbind, lapply(split(x1, ((x1$t - 1) %/% 3) + 1), function(y) {
    y1 <- if (any(y$t == 1)) rbind(x[x$t == 0, ], y) else y
    data.frame(patient_id=unique(y1$patient_id), scores=mean(y1$scores, na.rm=TRUE))
  }))
}))
row.names(res1) <- 1:nrow(res1)
res1$period <- with(res1, ave(patient_id, patient_id, FUN=seq_along))
res1
#  patient_id scores period
#1  1   2.05  1
#2  2   2.40  1
#3  2   2.05  2


For the same problem: in the code you wrote, you selected the data with
x[x$t!=0,]. If some patients have only one observation, at t=0, can I
change the code a little so that I keep the information at t=0?
That is, when a patient has only one score, I regard the score at t=0
as the average of period 1 for that patient.
Thank you so much for your help. I have never done any programming before, so
I really don't understand much of it.
You are really helpful. Thank you so much.

GG
