On Jun 11, 2009, at 12:13 PM, Mark Na wrote:

Hi R-helpers,

I have the following (dummy) dataframe:

test
 DATE LOCATION  KIND CLASS COUNT
1    1        1   CAR     A     2
2    1        1 TRUCK     D     3
3    1        1   BUS     E     4
4    1        2   CAR     E     2
5    1        2 TRUCK     A     7
6    1        2   BUS     F     1

That I would like to turn into this:

test2
  DATE LOCATION  KIND CLASS
1     1        1   CAR     A
2     1        1   CAR     A
3     1        1 TRUCK     D
4     1        1 TRUCK     D
5     1        1 TRUCK     D
6     1        1   BUS     E
7     1        1   BUS     E
8     1        1   BUS     E
9     1        1   BUS     E
10    1        2   CAR     E
11    1        2   CAR     E
12    1        2 TRUCK     A
13    1        2 TRUCK     A
14    1        2 TRUCK     A
15    1        2 TRUCK     A
16    1        2 TRUCK     A
17    1        2 TRUCK     A
18    1        2 TRUCK     A
19    1        2   BUS     F

So, basically it's a case of expanding (adding rows to) the first dataframe
by the value in the COUNT column.

I have solved this problem with the following code:

test2<-with(test, data.frame(DATE=rep(DATE,COUNT),
LOCATION=rep(LOCATION,COUNT), KIND=rep(KIND,COUNT), CLASS=rep(CLASS,COUNT)))

but I'm unsatisfied with that solution because it's verbose and I think there must a more elegant way. If I had more variables than 4 (which I do in my real data) it would be a nuisance to repeat each column within the rep
function.

I would prefer to do this with Base R or package(reshape) than relying on
another package.

Any ideas? Thanks!

Mark Na


Mark,

A quick and dirty solution:

> test[rep(1:nrow(test), test$COUNT), -ncol(test)]
    DATE LOCATION  KIND CLASS
1      1        1   CAR     A
1.1    1        1   CAR     A
2      1        1 TRUCK     D
2.1    1        1 TRUCK     D
2.2    1        1 TRUCK     D
3      1        1   BUS     E
3.1    1        1   BUS     E
3.2    1        1   BUS     E
3.3    1        1   BUS     E
4      1        2   CAR     E
4.1    1        2   CAR     E
5      1        2 TRUCK     A
5.1    1        2 TRUCK     A
5.2    1        2 TRUCK     A
5.3    1        2 TRUCK     A
5.4    1        2 TRUCK     A
5.5    1        2 TRUCK     A
5.6    1        2 TRUCK     A
6      1        2   BUS     F


For a more general solution to taking a tabulated data frame and converting it back to the raw data see my expand.dft() function:

  https://stat.ethz.ch/pipermail/r-help/2009-January/185561.html

For example:

> expand.dft(test, freq = "COUNT")
   DATE LOCATION  KIND CLASS
1     1        1   CAR     A
2     1        1   CAR     A
3     1        1 TRUCK     D
4     1        1 TRUCK     D
5     1        1 TRUCK     D
6     1        1   BUS     E
7     1        1   BUS     E
8     1        1   BUS     E
9     1        1   BUS     E
10    1        2   CAR     E
11    1        2   CAR     E
12    1        2 TRUCK     A
13    1        2 TRUCK     A
14    1        2 TRUCK     A
15    1        2 TRUCK     A
16    1        2 TRUCK     A
17    1        2 TRUCK     A
18    1        2 TRUCK     A
19    1        2   BUS     F



HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to