Hi Tom,
> I have a dataset consists of duplicated sequences within day for each
> patient (see below data) and I want to reshape the data with patient as time
> variable. However the reshape function only takes the first sequence of the
> replicates and ignores the second. How can I 1) average the duplicates and 2)
> give the duplicated sequences unique names before reshaping the data ?
>
> > data
> patient day seq y
> 1 10 1 acdf -0.52416066
> 2 10 1 cdsv 0.62551539
> 3 10 1 dlfg -1.54668047
> 4 10 1 acdf 0.82404978
> 5 10 1 cdsv -1.17459914
> 6 10 2 acdf 0.47238216
You might find that the functions in the reshape package give you a bit
more flexibility.
library(reshape)
# The reshape package expects the data to have
# the value variable named "value"
d2 <- rename(data, c("y" = "value"))
# I think this is the format you want: one row per day/seq combination,
# one column per patient, with the replicates averaged
cast(d2, day + seq ~ patient, mean)
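
If you would rather stay in base R, here is a rough sketch of both parts of the question, using the six rows you posted. It is only one possible approach under my reading of your data: aggregate() averages the replicates, ave() numbers them within each patient/day/seq group, and the column names rep and seq_id are made up for illustration.

```r
# Toy data copied from the rows shown in the question
data <- data.frame(
  patient = rep(10, 6),
  day     = c(1, 1, 1, 1, 1, 2),
  seq     = c("acdf", "cdsv", "dlfg", "acdf", "cdsv", "acdf"),
  y       = c(-0.52416066, 0.62551539, -1.54668047,
              0.82404978, -1.17459914, 0.47238216),
  stringsAsFactors = FALSE
)

# 1) Average the duplicated sequences within each patient/day/seq group
avg <- aggregate(y ~ patient + day + seq, data = data, FUN = mean)

# Reshape the averaged data to wide format, one y column per patient
wide <- reshape(avg, idvar = c("day", "seq"), timevar = "patient",
                direction = "wide")

# 2) Alternatively, number the replicates so every sequence gets a
#    unique name (rep and seq_id are hypothetical column names)
data$rep <- ave(seq_along(data$y), data$patient, data$day, data$seq,
                FUN = seq_along)
data$seq_id <- paste(data$seq, data$rep, sep = ".")
```

With the unique seq_id in place, you could reshape on seq_id instead of seq if you want to keep the replicates separate rather than average them.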
Hadley