Re: [R] trajectory plot (growth curve)

2010-08-23 Thread Gabor Grothendieck
On Mon, Aug 23, 2010 at 3:58 PM, Lei Liu liu...@virginia.edu wrote:
 Hi there,

 I want to make trajectory plots for data as follows:

 ID      time    y
 1       1       1.4
 1       2       2.0
 1       3       2.5
 2       1.5     2.3
 2       4       4.5
 2       5.5     1.6
 2       6       2.0

 ...

 That is, I want to plot a growth curve for each subject ID, with y on the y
 axis and time on the x axis. I would like to have all growth curves in the
 same plot. Is there a simple way to do this in R? Thanks a lot!


Try this.

Lines <- "ID  time  y
1   1   1.4
1   2   2.0
1   3   2.5
2   1.5 2.3
2   4   4.5
2   5.5 1.6
2   6   2.0"

library(zoo)

# z <- read.zoo("myfile.dat", header = TRUE, split = 1, index = 2)
z <- read.zoo(textConnection(Lines), header = TRUE, split = 1, index = 2)

plot(z) # each in separate panel
plot(z, screens = 1, col = 1:2) # all on same plot in different colors

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] trajectory plot (growth curve)

2010-08-23 Thread Gabor Grothendieck
On Mon, Aug 23, 2010 at 4:16 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
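 plot(z) # each in separate panel
 plot(z, screens = 1, col = 1:2) # all on same plot in different colors


or better:

# the subjects are observed at different times, so the merged series contain
# NAs; na.approx() interpolates them so the lines are drawn without gaps
plot(na.approx(z))
plot(na.approx(z), screens = 1, col = 1:2)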
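
To see why the interpolated version plots better, it may help to look at the
merged object directly. The following is only a sketch, assuming a zoo version
whose read.zoo() accepts the text argument; the helper z_at() is a hypothetical
name introduced here for illustration, and na.rm = FALSE is passed only so that
every time point is kept for inspection.

```r
library(zoo)

Lines <- "ID time y
1 1 1.4
1 2 2.0
1 3 2.5
2 1.5 2.3
2 4 4.5
2 5.5 1.6
2 6 2.0"

# One column per subject, indexed by the union of all observation times
z <- read.zoo(text = Lines, header = TRUE, split = 1, index.column = 2)

# Hypothetical helper: value of the first subject's series at time t
z_at <- function(obj, t) unname(coredata(obj)[index(obj) == t, 1])

# Subject 1 was not observed at time 1.5, so the merged object holds NA there
before <- z_at(z, 1.5)

# Linear interpolation over the time index fills the interior gaps;
# na.rm = FALSE keeps all time points, including leading/trailing NAs
zi <- na.approx(z, na.rm = FALSE)
after <- z_at(zi, 1.5)

print(before)
print(after)
```

A line plot cannot connect two points that are separated by an NA, which is
why the uninterpolated series can show up with broken or missing segments.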



Re: [R] trajectory plot (growth curve)

2010-08-23 Thread Kingsford Jones
and some more options...

dat <- structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L),
  .Label = c("1", "2"), class = "factor"),
  time = c(1, 2, 3, 1.5, 4, 5.5, 6),
  y = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2)),
  .Names = c("ID", "time", "y"),
  row.names = c(NA, -7L), class = "data.frame")

library(lattice)
xyplot(y ~ time|ID, data = dat, type = 'l')
xyplot(y ~ time, data = dat, group = ID, type = 'l')

library(ggplot2)
qplot(time, y, data = dat, facets = .~ID, geom = 'line')
qplot(time, y, data = dat, group = ID, color = ID, geom = 'line')
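
For readers on a recent ggplot2, where qplot() is deprecated (as of 3.4.0), the
same two plots can be sketched with ggplot(). This is an illustrative
translation rather than part of the original reply; the object names p_facets
and p_single are made up here.

```r
library(ggplot2)

# Same example data as above, built directly as a data frame
dat <- data.frame(
  ID   = factor(c(1, 1, 1, 2, 2, 2, 2)),
  time = c(1, 2, 3, 1.5, 4, 5.5, 6),
  y    = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2)
)

# One panel per subject (counterpart of facets = . ~ ID)
p_facets <- ggplot(dat, aes(x = time, y = y)) +
  geom_line() +
  facet_grid(. ~ ID)

# All subjects in one panel, one coloured line per subject
p_single <- ggplot(dat, aes(x = time, y = y, group = ID, colour = ID)) +
  geom_line()

print(p_facets)
print(p_single)
```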


hth,

Kingsford Jones

On Mon, Aug 23, 2010 at 1:58 PM, Lei Liu liu...@virginia.edu wrote:
 Hi there,

 I want to make trajectory plots for data as follows:

 ID      time    y
 1       1       1.4
 1       2       2.0
 1       3       2.5
 2       1.5     2.3
 2       4       4.5
 2       5.5     1.6
 2       6       2.0

 ...

 That is, I will plot a growth curve for each subject ID, with y in the y
 axis, and time in the x axis. I would like to have all growth curves in the
 same plot. Is there a simple way in R to do it? Thanks a lot!

 Lei Liu
 Associate Professor
 Division of Biostatistics and Epidemiology
 Department of Public Health Sciences
 University of Virginia School of Medicine

 http://people.virginia.edu/~ll9f/



Re: [R] trajectory plot (growth curve)

2010-08-23 Thread Stuart Luppescu
On Mon, 2010-08-23 at 15:58 -0400, Lei Liu wrote:
 That is, I will plot a growth curve for each subject ID, with y in 
 the y axis, and time in the x axis. I would like to have all growth 
 curves in the same plot. Is there a simple way in R to do it? Thanks a
 lot! 

This article, "Fitting Value-added Models in R" by Harold Doran, is relevant,
very useful, and interesting.
www-stat.stanford.edu/~rag/ed351longit/doran.pdf 
-- 
Stuart Luppescu -=- slu .at. ccsr.uchicago.edu
University of Chicago -=- CCSR 
Father of 才文 and 智奈美 -=- Kernel 2.6.33-gentoo-r2
 I have mentioned
 several times on this list that I'm in the process
 of developing a new and wonderful implementation
 of lme and I would prefer to continue working on
 that rather than modifying old-style code.--
 Douglas Bates   R-help (March 2004)
 
 
 


Re: [R] trajectory plot (growth curve)

2010-08-23 Thread Kingsford Jones
Hi Lei,

Hope you don't mind I'm moving this back to the list in case others
may benefit.  Answers below...

On Mon, Aug 23, 2010 at 3:37 PM, Lei Liu liu...@virginia.edu wrote:
 Hi Kingsford,

 Thanks a lot! I got some help from my colleague by using the following code:

  xyplot(y~month, group=id, type="l"), the same as you suggested. It worked
 fine.

 However, when I tried to add an additional line for the mean at each time
 point by the following code:

  y.mean=aggregate(y, by=list(time), FUN=mean)[, 2]
  uniq.time=sort(unique(time))

  lines(uniq.time, y.mean, type="l", lty=1, lwd=2)

 I find the line of mean does not overlap well with the trajectory plot!!! It
 seems to me that the lines statement does not work well under xyplot! I tried
 different strategies, e.g., adding xlim and ylim in both the xyplot and lines
 statements, but the problem still exists. I also tried the ggplot2 package
 and it had the same problem. Any help here? Thanks!

Both lattice and ggplot2 use grid graphics, which is a different beast
from base graphics.  I don't believe the lines function has
methods to add to grid plots.  There are many approaches you could
take here.  The first thing that comes to my mind is to add another
subject (named 'mean' below) whose values are the observed averages
within time points:

#the original data (no replicates within time points)
dat <- structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L),
 .Label = c("1", "2"), class = "factor"),
 time = c(1, 2, 3, 1.5, 4, 5.5, 6),
 y = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2)),
 .Names = c("ID", "time", "y"),
 row.names = c(NA, -7L), class = "data.frame")

#adding another subject to introduce replicates
id3 <- data.frame(ID = as.factor(rep(3, 4)), time = c(1, 1.5, 2, 5.5),
 y = c(1, 2.2, 3, 2))
dat <- rbind(dat, id3)
#mean of y at each time point (the formula is left unnamed because the name of
#the first argument of aggregate()'s formula method changed in R 4.2.0)
mean.y <- aggregate(y ~ time, data = dat, FUN = mean)
mean.y <- cbind(ID = as.factor('mean'), mean.y)
dat <- rbind(dat, mean.y)
dat
library(ggplot2)
qplot(time, y, data=dat, group = ID, color = ID, geom = c('point', 'line'))
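
Another of the "many approaches", for anyone who would rather keep the data
frame unchanged: in lattice, extra lines can be drawn from inside a custom
panel function with panel.lines(), which uses the panel's own coordinate
system (unlike base lines()). This is only a sketch with the three-subject
example data; the black colour and lwd = 3 are arbitrary choices.

```r
library(lattice)

# Three-subject example data, without the extra 'mean' rows
dat <- data.frame(
  ID   = factor(c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)),
  time = c(1, 2, 3, 1.5, 4, 5.5, 6, 1, 1.5, 2, 5.5),
  y    = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2, 1, 2.2, 3, 2)
)

# Mean of y at each observed time point, sorted by time
mean.y <- aggregate(y ~ time, data = dat, FUN = mean)

p <- xyplot(y ~ time, data = dat, groups = ID, type = "l",
            panel = function(x, y, ...) {
              # one trajectory per subject (groups is passed on via ...)
              panel.xyplot(x, y, ...)
              # mean line drawn in the same panel coordinates
              panel.lines(mean.y$time, mean.y$y, col = "black", lwd = 3)
            })
print(p)
```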


best,

Kingsford Jones



Re: [R] trajectory plot (growth curve)

2010-08-23 Thread Dennis Murphy
Hi:

On Mon, Aug 23, 2010 at 4:19 PM, Kingsford Jones
kingsfordjo...@gmail.com wrote:

 Both lattice and ggplot2 use grid graphics, which is a different beast
 from base graphics.  I don't believe the lines function has
 methods to add to grid plots.  There are many approaches you could
 take here.  The first thing that comes to my mind is to add another
 subject (named 'mean' below) whose values are the observed averages
 within time points:


This is an excellent idea - the only snag might occur if someone wants
the mean line to be thicker :)  Having said that, it's usually easier to
'fix' the problem externally in the data rather than to fiddle with graphics
commands.


A lattice version with a legend is:

mykey <- list(space = 'right',
  title = 'ID',
  cex.title = 1.2,
  text = list(levels(dat$ID), cex = 0.8),
  lines = list(lty = 1, col = 1:4))

xyplot(y ~ time, data = dat, lty = 1, col.lines = 1:4, col = 1:4,
 groups = ID, type = c('g', 'p', 'l'), key = mykey)

Defining the key externally modularizes the problem, lets one define
the features one wants it to contain, and simplifies the high-level
xyplot() call.

There is a type = 'a' (shorthand for panel.average()) that can be
used to good effect in xyplot(), but it creates 'holes' where missing
data reside, so taking care of the problem externally at the data
level is much cleaner.

HTH,
Dennis




Re: [R] trajectory plot (growth curve)

2010-08-23 Thread Kingsford Jones
On Mon, Aug 23, 2010 at 6:19 PM, Dennis Murphy djmu...@gmail.com wrote:

 This is an excellent idea - the only snag might occur if someone wants
 the mean line to be thicker :)

fortunately, with your lattice solution this is easily accomplished by
passing a vector to lwd:

i <- c(1, 1, 1, 3)

mykey <- list(space = 'right',
  title = 'ID',
  cex.title = 1.2,
  text = list(levels(dat$ID), cex = 0.8),
  lines = list(lty = i, lwd = i, col = 1:4))

xyplot(y ~ time, data = dat, lty = i, lwd = i, col.lines = 1:4, col = 1:4,
 groups = ID, type = c('g', 'p', 'l'), key = mykey)


but I didn't have luck trying the same with qplot:

> qplot(time, y, data = dat, group = ID, color = ID,
+ geom = c('point', 'line'), lty = i, lwd = i)
Error in data.frame(colour = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,  :
  arguments imply differing number of rows: 18, 4

perhaps using the construct ggplot(...) + geom_line(...) would be more fruitful?

King






Re: [R] trajectory plot (growth curve)

2010-08-23 Thread Dennis Murphy
Hi:

I think it would be tough to do that in qplot(), but it's easier in
ggplot(), even if you don't add the mean information to the data frame.
Here's one way - use the three person data frame (call it dat1) and the
mean.y data frame that you created from aggregate() without adding the
factor info as follows:

# Set up the framework of the plot:
g <- ggplot(dat1, aes(x = time, y = y, colour = ID))

# This associates colors with groups. Next, add the points and
# lines from dat1 and then add the mean data with a separate
# geom_line() call, where the mean line is about twice as thick
# (in ggplot2 >= 3.4.0, use linewidth = 1.5 instead of size = 1.5 for lines):
g + geom_point(size = 2) + geom_line() +
  geom_line(data = mean.y, aes(x = time, y = y, colour = 'mean'), size = 1.5)

# Notice how the name 'mean' that we associated with colour got into the
# legend. This is because we *mapped* the same aesthetic (color) in the second
# geom_line() call to the one existing for IDs. ggplot2 is smart enough to pick
# this up. [We just have to be smart enough to realize it :)]. To exert more
# control over line colors, add the following:
last_plot() + scale_colour_manual(values = c('1' = 'red', '2' = 'green',
 '3' = 'blue', 'mean' = 'black'))

The LHS is the value of ID, the RHS the color to associate with it.

As usual, it took me about five iterations of scale_* to get it right :) The
line thicknesses in the scale are all the same as the thickest, but I see that
as a feature rather than a bug :)
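
Since dat1 is not spelled out anywhere in the thread, here is a self-contained
sketch of the same idea. It assumes dat1 is the three-subject data frame
without the 'mean' rows, and a ggplot2 version (3.4.0 or later) that takes
linewidth for line thickness; the object name p is made up here.

```r
library(ggplot2)

# Assumed contents of dat1: the three subjects, no 'mean' rows
dat1 <- data.frame(
  ID   = factor(c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)),
  time = c(1, 2, 3, 1.5, 4, 5.5, 6, 1, 1.5, 2, 5.5),
  y    = c(1.4, 2, 2.5, 2.3, 4.5, 1.6, 2, 1, 2.2, 3, 2)
)

# Mean of y at each observed time point; columns are `time` and `y` only
mean.y <- aggregate(y ~ time, data = dat1, FUN = mean)

p <- ggplot(dat1, aes(x = time, y = y, colour = ID)) +
  geom_point(size = 2) +
  geom_line() +
  # separate layer for the mean, mapped to the same colour scale
  geom_line(data = mean.y, aes(colour = "mean"), linewidth = 1.5) +
  scale_colour_manual(values = c("1" = "red", "2" = "green",
                                 "3" = "blue", "mean" = "black"))
print(p)
```

Because the mean layer overrides only the colour mapping and inherits x and y,
mean.y does not need an ID column.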

One more comment below.


On Mon, Aug 23, 2010 at 6:20 PM, Kingsford Jones
kingsfordjo...@gmail.com wrote:

 On Mon, Aug 23, 2010 at 6:19 PM, Dennis Murphy djmu...@gmail.com wrote:
 
  This is an excellent idea - the only snag might occur if someone wants
  the mean line to be thicker :)

 fortunately, with your lattice solution this is easily accomplished by
 passing a vector to lwd:

 i <- c(1, 1, 1, 3)


I was going to do that, too, but I used 1.5 instead of 3, saw no material
difference, and gave up...should have kept trying, huh?

HTH,
Dennis

