Re: [R] trajectory plot (growth curve)
On Mon, Aug 23, 2010 at 3:58 PM, Lei Liu liu...@virginia.edu wrote:
> Hi there,
>
> I want to make trajectory plots for data as follows:
>
> ID time y
> 1 1 1.4
> 1 2 2.0
> 1 3 2.5
> 2 1.5 2.3
> 2 4 4.5
> 2 5.5 1.6
> 2 6 2.0
> ...
>
> That is, I will plot a growth curve for each subject ID, with y on the y axis and time on the x axis. I would like to have all growth curves in the same plot. Is there a simple way in R to do it? Thanks a lot!

Try this.

Lines <- "ID time y
1 1 1.4
1 2 2.0
1 3 2.5
2 1.5 2.3
2 4 4.5
2 5.5 1.6
2 6 2.0"

library(zoo)
# z <- read.zoo("myfile.dat", header = TRUE, split = 1, index = 2)
z <- read.zoo(textConnection(Lines), header = TRUE, split = 1, index = 2)
plot(z) # each in separate panel
plot(z, screens = 1, col = 1:2) # all on same plot in different colors

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] trajectory plot (growth curve)
On Mon, Aug 23, 2010 at 4:16 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> library(zoo)
> # z <- read.zoo("myfile.dat", header = TRUE, split = 1, index = 2)
> z <- read.zoo(textConnection(Lines), header = TRUE, split = 1, index = 2)
> plot(z) # each in separate panel
> plot(z, screens = 1, col = 1:2) # all on same plot in different colors

or better:

plot(na.approx(z))
plot(na.approx(z), screens = 1, col = 1:2)
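A likely reason the interpolated version reads better: with split = 1 each subject becomes a column indexed by the union of all time points, so every subject has an NA wherever only the other subject was measured, and line segments are not drawn across an NA. na.approx() fills those interior gaps by linear interpolation. Below is a base-R sketch of the same idea, for illustration only. The names dat, times, wide and filled are made up here, and approx() is used as a stand-in for what na.approx() does inside zoo.

```r
# Example data from the original question
dat <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2),
  time = c(1, 2, 3, 1.5, 4, 5.5, 6),
  y    = c(1.4, 2.0, 2.5, 2.3, 4.5, 1.6, 2.0)
)

# Union of all observation times, like the index of the split zoo object
times <- sort(unique(dat$time))

# One column per subject; NA where that subject was not observed
wide <- sapply(split(dat, dat$ID), function(d) d$y[match(times, d$time)])

# Linear interpolation within each subject's observed range;
# times outside that range stay NA (approx() default, rule = 1)
filled <- sapply(split(dat, dat$ID),
                 function(d) approx(d$time, d$y, xout = times)$y)

# All trajectories on one set of axes, one colour per subject
matplot(times, filled, type = "l", lty = 1, col = 1:2,
        xlab = "time", ylab = "y")
```

Comparing wide with filled shows which entries were interpolated; the values at each subject's own observation times are unchanged.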
Re: [R] trajectory plot (growth curve)
and some more options...

dat <- structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L),
        .Label = c("1", "2"), class = "factor"),
    time = c(1, 2, 3, 1.5, 4, 5.5, 6),
    y = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2)),
    .Names = c("ID", "time", "y"), row.names = c(NA, -7L),
    class = "data.frame")

library(lattice)
xyplot(y ~ time | ID, data = dat, type = 'l')          # one panel per subject
xyplot(y ~ time, data = dat, groups = ID, type = 'l')  # all subjects in one panel

library(ggplot2)
qplot(time, y, data = dat, facets = . ~ ID, geom = 'line')
qplot(time, y, data = dat, group = ID, color = ID, geom = 'line')

hth,
Kingsford Jones

On Mon, Aug 23, 2010 at 1:58 PM, Lei Liu liu...@virginia.edu wrote:
> That is, I will plot a growth curve for each subject ID, with y on the y axis and time on the x axis. I would like to have all growth curves in the same plot. Is there a simple way in R to do it? Thanks a lot!
>
> Lei Liu
> Associate Professor
> Division of Biostatistics and Epidemiology
> Department of Public Health Sciences
> University of Virginia School of Medicine
> http://people.virginia.edu/~ll9f/
Re: [R] trajectory plot (growth curve)
On Mon, 2010-08-23 at 15:58 -0400, Lei Liu wrote:
> That is, I will plot a growth curve for each subject ID, with y on the y axis and time on the x axis. I would like to have all growth curves in the same plot. Is there a simple way in R to do it? Thanks a lot!

The article "Fitting Value-added Models in R" by Harold Doran is relevant, useful, and interesting.

www-stat.stanford.edu/~rag/ed351longit/doran.pdf

--
Stuart Luppescu -=- slu .at. ccsr.uchicago.edu
University of Chicago -=- CCSR
才文と智奈美の父 -=- Kernel 2.6.33-gentoo-r2
"I have mentioned several times on this list that I'm in the process of developing a new and wonderful implementation of lme and I would prefer to continue working on that rather than modifying old-style code." -- Douglas Bates, R-help (March 2004)
Re: [R] trajectory plot (growth curve)
Hi Lei,

Hope you don't mind that I'm moving this back to the list in case others may benefit. Answers below...

On Mon, Aug 23, 2010 at 3:37 PM, Lei Liu liu...@virginia.edu wrote:
> Hi Kingsford,
>
> Thanks a lot! I got some help from my colleague by using the following code:
>
> xyplot(y~month,group=id, type="l")
>
> the same as you suggested. It worked fine. However, when I tried to add an additional line for the mean at each time point by the following code:
>
> y.mean=aggregate(y, by=list(time), FUN=mean)[, 2]
> uniq.time=sort(unique(time))
> lines(uniq.time, y.mean, type="l", lty=1, lw=2)
>
> I find the line of mean does not overlap well with the trajectory plot!!! It seems to me that the lines statement does not work well under xyplot! I tried different strategies, e.g., adding xlim and ylim in both the xyplot and lines statements, but the problem still exists. I also tried the ggplot2 package and it had the same problem. Any help here? Thanks!

Both lattice and ggplot2 use grid graphics, which is a different beast from base graphics. I don't believe the lines function has methods to add to grid plots. There are many approaches you could take here.

The first thing that comes to my mind is to add another subject (named 'mean' below) whose values are the observed averages within time points:

# the original data (no replicates within time points)
dat <- structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L),
        .Label = c("1", "2"), class = "factor"),
    time = c(1, 2, 3, 1.5, 4, 5.5, 6),
    y = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2)),
    .Names = c("ID", "time", "y"), row.names = c(NA, -7L),
    class = "data.frame")

# adding another subject to introduce replicates
id3 <- data.frame(ID = as.factor(rep(3, 4)), time = c(1, 1.5, 2, 5.5),
    y = c(1, 2.2, 3, 2))
dat <- rbind(dat, id3)

# mean of y at each time point, appended as a pseudo-subject called 'mean'
# (the formula is passed unnamed so the call works across R versions)
mean.y <- aggregate(y ~ time, data = dat, FUN = mean)
mean.y <- cbind(ID = as.factor('mean'), mean.y)
dat <- rbind(dat, mean.y)
dat

library(ggplot2)
qplot(time, y, data = dat, group = ID, color = ID, geom = c('point', 'line'))

best,
Kingsford Jones
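Another of the many possible approaches, sketched here as an illustration rather than as something posted in the thread: draw the mean inside a lattice panel function, where panel.lines() plays the role that lines() plays in base graphics. The data frame repeats the three-subject example without the 'mean' rows; the names dat3, mean.y and p are chosen just for this sketch.

```r
library(lattice)

# Three subjects, as in the example data (before any 'mean' rows are added)
dat3 <- data.frame(
  ID   = factor(c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)),
  time = c(1, 2, 3, 1.5, 4, 5.5, 6, 1, 1.5, 2, 5.5),
  y    = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2, 1, 2.2, 3, 2)
)

# Mean of y at each distinct time point
mean.y <- aggregate(y ~ time, data = dat3, FUN = mean)

# The panel function first draws the per-subject lines, then overlays the
# mean inside the same panel, so both share one coordinate system
p <- xyplot(y ~ time, data = dat3, groups = ID, type = "l",
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              panel.lines(mean.y$time, mean.y$y, col = "black", lwd = 3)
            })
print(p)
```

Because the mean is drawn by the panel function, the data frame itself stays free of the extra pseudo-subject, at the cost of the mean not appearing in any automatically generated key.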
Re: [R] trajectory plot (growth curve)
Hi:

On Mon, Aug 23, 2010 at 4:19 PM, Kingsford Jones kingsfordjo...@gmail.com wrote:
> The first thing that comes to my mind is to add another subject (named 'mean' below) whose values are the observed averages within time points:

This is an excellent idea - the only snag might occur if someone wants the mean line to be thicker :) Having said that, it's usually easier to 'fix' the problem externally in the data rather than to fiddle with graphics commands.

> library(ggplot2)
> qplot(time, y, data = dat, group = ID, color = ID, geom = c('point', 'line'))

A lattice version with a legend is:

mykey <- list(space = 'right', title = 'ID', cex.title = 1.2,
    text = list(levels(dat$ID), cex = 0.8),
    lines = list(lty = 1, col = 1:4))
xyplot(y ~ time, data = dat, lty = 1, col.lines = 1:4, col = 1:4,
    groups = ID, type = c('g', 'p', 'l'), key = mykey)

Defining the key externally modularizes the problem, lets one define the features one wants it to contain, and simplifies the high-level xyplot() call. There is a type = 'a' (shorthand for panel.average()) that can be used to good effect in xyplot(), but it creates 'holes' where missing data reside, so taking care of the problem externally at the data level is much cleaner.

HTH,
Dennis
Re: [R] trajectory plot (growth curve)
On Mon, Aug 23, 2010 at 6:19 PM, Dennis Murphy djmu...@gmail.com wrote:
> This is an excellent idea - the only snag might occur if someone wants the mean line to be thicker :)

Fortunately, with your lattice solution this is easily accomplished by passing a vector to lwd:

i <- c(1, 1, 1, 3)
mykey <- list(space = 'right', title = 'ID', cex.title = 1.2,
    text = list(levels(dat$ID), cex = 0.8),
    lines = list(lty = i, lwd = i, col = 1:4))
xyplot(y ~ time, data = dat, lty = i, lwd = i, col.lines = 1:4, col = 1:4,
    groups = ID, type = c('g', 'p', 'l'), key = mykey)

but I didn't have luck trying the same with qplot:

qplot(time, y, data = dat, group = ID, color = ID,
    geom = c('point', 'line'), lty = i, lwd = i)
Error in data.frame(colour = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, :
  arguments imply differing number of rows: 18, 4

Perhaps using the construct ggplot(...) + geom_line(...) would be more fruitful?

King
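The error message appears to come from a length mismatch: qplot() treats lty and lwd as values to line up row by row with the data, and i has 4 entries (one per level of ID) while dat has 18 rows. A base-R sketch of that mismatch, and of expanding a per-level vector to a per-row one, follows. The name width.by.row is made up for this sketch, and whether qplot() would then render such a vector as intended is not something the thread establishes.

```r
# The combined data from the thread: three subjects plus the 'mean' rows
dat <- data.frame(
  ID   = factor(c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)),
  time = c(1, 2, 3, 1.5, 4, 5.5, 6, 1, 1.5, 2, 5.5),
  y    = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2, 1, 2.2, 3, 2)
)
mean.y <- aggregate(y ~ time, data = dat, FUN = mean)
dat <- rbind(dat, cbind(ID = factor("mean"), mean.y))

# One width per level of ID ...
i <- c(1, 1, 1, 3)
names(i) <- levels(dat$ID)

# ... expanded to one width per row by indexing with the factor's integer codes
width.by.row <- unname(i[as.integer(dat$ID)])

length(i)             # 4, one per level of ID
length(width.by.row)  # 18, one per row of dat
```

lattice accepts the length-4 vector because it assigns graphical parameters per group; the ggplot(...) + geom_line(...) construct mentioned above sidesteps the issue by drawing the mean in its own layer.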
Re: [R] trajectory plot (growth curve)
Hi:

I think it would be tough to do that in qplot(), but it's easier in ggplot(), even if you don't add the mean information to the data frame. Here's one way - use the three-person data frame (call it dat1) and the mean.y data frame that you created from aggregate(), without adding the factor info, as follows:

# Set up the framework of the plot:
g <- ggplot(dat1, aes(x = time, y = y, group = ID, colour = ID))

# This associates colors with groups. Next, add the points and
# lines from dat1 and then add the mean data with a separate
# geom_line() call, where the mean line is about twice as thick.
# group = 'mean' overrides the inherited group = ID, a column that
# mean.y does not have:
g + geom_point(size = 2) + geom_line() +
    geom_line(data = mean.y,
        aes(x = time, y = y, colour = 'mean', group = 'mean'), size = 1.5)

# Notice how the name 'mean' that we associated with colour got into the
# legend. This is because we *mapped* the same aesthetic (colour) in the second
# geom_line() call to the one existing for IDs. ggplot2 is smart enough to pick
# this up. [We just have to be smart enough to realize it :)]. To exert more
# control over line colors, add the following:
last_plot() + scale_colour_manual(values = c('1' = 'red', '2' = 'green',
    '3' = 'blue', 'mean' = 'black'))

The LHS is the value of ID, the RHS the color to associate with it. As usual, it took me about five iterations of scale_* to get it right :) The line thicknesses in the legend are all the same as the thickest, but I see that as a feature rather than a bug :)

One more comment below.

On Mon, Aug 23, 2010 at 6:20 PM, Kingsford Jones kingsfordjo...@gmail.com wrote:
> fortunately, with your lattice solution this is easily accomplished by passing a vector to lwd:
>
> i <- c(1, 1, 1, 3)

I was going to do that, too, but I used 1.5 instead of 3, saw no material difference, and gave up...should have kept trying, huh?

HTH,
Dennis