Two questions:

1. How do I replace the NAs with 0?
2. How can I sort the observations by id instead of by time? (Actually, I
can see that what you produced is automatically sorted by id, but in my
case the output data is sorted by time.)
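
For reference, a minimal sketch of how both steps might look in base R, assuming the padded data frame is called df.long2 and has columns id, time and x as in the quoted reply below. The x values here are made up for illustration, and the reshape() call names df.long explicitly. Note that the zeros land at the missing time points rather than at the end of each id.

```r
# Example data in long format (x values are illustrative only)
df.long <- data.frame(id   = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
                      x    = c(10, 30, 20, 10, 20, 40, 80, 70, 20, 40),
                      time = c(1, 2, 5, 1, 2, 3, 4, 5, 2, 4))

# Long -> wide -> long pads every id out to the full set of time points,
# leaving NA where an id had no observation at that time
df.wide  <- reshape(df.long, idvar = "id", timevar = "time",
                    v.names = "x", direction = "wide")
df.long2 <- reshape(df.wide, direction = "long")

# 1. Replace NA with 0 using a logical index
df.long2$x[is.na(df.long2$x)] <- 0

# 2. Sort by id, then by time within each id
df.long2 <- df.long2[order(df.long2$id, df.long2$time), ]
rownames(df.long2) <- NULL

df.long2
```

order() accepts several keys, so swapping the two arguments would sort by time first and id second instead.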


On 1/27/07, Chuck Cleland <[EMAIL PROTECTED]> wrote:
>
> gallon li wrote:
> > i have a large longitudinal data set. The number of observations for each
> > subject is not the same across the sample. The largest number of a subject
> > is 5 and the smallest number is 1.
> >
> > now i want to make each subject to have the same number of observations by
> > filling zero, e.g., my original sample is
> >
> > id x
> > 001 10
> > 001 30
> > 001 20
> > 002 10
> > 002 20
> > 002 40
> > 002 80
> > 002 70
> > 003 20
> > 003 40
> > 004 ......
> >
> > now i wish to make the data like
> >
> >  id x
> > 001 10
> > 001 30
> > 001 20
> > 001 0
> > 001 0
> > 002 10
> > 002 20
> > 002 40
> > 002 80
> > 002 70
> > 003 20
> > 003 40
> > 003 0
> > 003 0
> > 003 0
> > 004 ......
> >
> > so that each id has exactly 5 observations. is there a function which can
> > allow me do this quickly?
>
> Filling in with zeros seems like a bad idea, but here is an approach
> to filling in with NAs.  I will leave replacing the NAs with zeros to you.
>
> df.long <- data.frame(id = c(1,1,1,2,2,2,2,2,3,3), x = runif(10),
>                      time = c(1,2,5,1,2,3,4,5,2,4))
>
> df.long
>   id          x time
> 1   1 0.72888215    1
> 2   1 0.60893548    2
> 3   1 0.41347690    5
> 4   2 0.79388248    1
> 5   2 0.05810054    2
> 6   2 0.02451654    3
> 7   2 0.85464775    4
> 8   2 0.15970365    5
> 9   3 0.22856183    2
> 10  3 0.38291471    4
>
> df.wide <- reshape(df.long, idvar = "id", v.names = "x", direction="wide")
>
> df.wide
>   id       x.1       x.2       x.5       x.3       x.4
> 1  1 0.6375135 0.1651258 0.3210223        NA        NA
> 4  2 0.9878134 0.8909020 0.9853269 0.7747615 0.3834130
> 9  3        NA 0.3586109        NA        NA 0.8310539
>
> df.long2 <- reshape(df.wide, direction="long")
>
> df.long2
>    id time         x
> 1.1  1    1 0.6375135
> 2.1  2    1 0.9878134
> 3.1  3    1        NA
> 1.2  1    2 0.1651258
> 2.2  2    2 0.8909020
> 3.2  3    2 0.3586109
> 1.5  1    5 0.3210223
> 2.5  2    5 0.9853269
> 3.5  3    5        NA
> 1.3  1    3        NA
> 2.3  2    3 0.7747615
> 3.3  3    3        NA
> 1.4  1    4        NA
> 2.4  2    4 0.3834130
> 3.4  3    4 0.8310539
>
> This assumes that your data in the "long" format has a time variable.
> See the help page for reshape() for more details.
>
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > [email protected] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Chuck Cleland, Ph.D.
> NDRI, Inc.
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
>

