On 12-12-07 7:27 AM, Dimitri Liakhovitski wrote:
Dear R-ers,

my task is simple: to assign cases to desired groupings based on the
combined values on 2 variables. I can think of 3 methods of doing it.
Method 1 seems to me pretty r-like, but it requires a lot of lines of code
- onerous.

Since your groups are so regular, you can compute the group numbers directly. Convert each column to a factor (this might already have happened automatically, depending on your data and options), then use as.integer() to turn each factor into its integer level codes.

So a simple solution would be

mydata$mygroup.m4 <- with(mydata,
                             4*(2-as.integer(factor(sex)))
                             + as.integer(factor(age)))

It would be a little simpler if you wanted the sex factor in alphabetical order; then you wouldn't need to subtract from 2.
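
To see why the subtraction is there, it may help to look at the integer codes themselves. This is only a sketch, assuming sex is coded "m"/"f" and age runs 1 to 4 as in the example data; sex_code, age_code and the column name mygroup.alpha are made up for illustration.

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# factor() sorts its levels alphabetically, so "f" gets code 1 and "m" code 2
sex_code <- as.integer(factor(mydata$sex))
age_code <- as.integer(factor(mydata$age))

# Desired numbering (m first): 2 - sex_code maps m -> 0 and f -> 1
mydata$mygroup.m4 <- 4 * (2 - sex_code) + age_code

# Alphabetical numbering (f first): no subtraction from 2 is needed
# (mygroup.alpha is a hypothetical column name)
mydata$mygroup.alpha <- 4 * (sex_code - 1) + age_code
```

The multiplier 4 is the number of age levels; with a different number of age groups it would have to change accordingly.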

If your real data weren't so regular, another approach would be to set up a matrix, indexed by sex and age, that gives the desired group number. That is somewhat like your "groupings" solution; I'm not sure it would be preferable to what you did.
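
A rough sketch of that matrix idea, under the assumption that every sex/age combination in the data has a cell in the table; the names lookup, idx and mygroup.m5 are hypothetical and not from the original post.

```r
mydata <- data.frame(sex = rep(c(rep("m", 4), rep("f", 4)), 2),
                     age = rep(c(1:4, 1:4), 2))

# Lookup table: rows are sexes, columns are ages, cells are group numbers.
# byrow = TRUE fills the "m" row with 1:4 and the "f" row with 5:8; an
# irregular numbering could simply be typed in here instead.
lookup <- matrix(1:8, nrow = 2, byrow = TRUE,
                 dimnames = list(c("m", "f"), as.character(1:4)))

# A two-column character matrix is matched against the dimnames, giving one
# (row name, column name) pair per row of mydata, so no loop is needed.
idx <- cbind(as.character(mydata$sex), as.character(mydata$age))
mydata$mygroup.m5 <- lookup[idx]
```

The whole assignment is a single vectorized index, so its cost should grow with the number of rows in mydata rather than with rows times groups.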

Duncan Murdoch

Method 2 is a loop, so not very good - as it loops through all rows of
mydata.
Method 3 is a loop but loops through fewer lines, so it seems to me more
efficient.
Can you please tell me:
1. Which of my methods is more efficient?
2. Is there maybe an even more efficient r-like way of doing it?
Imagine - "mydata" is actually a very tall data frame.
Thanks a lot!
Dimitri

### My Data:
mydata<-data.frame(sex=rep(c(rep("m",4),rep("f",4)),2),age=rep(c(1:4,1:4),2))
(mydata)

### My desired assignments (in column "mygroup")
groupings<-data.frame(sex=c(rep("m",4),rep("f",4)),age=c(1:4,1:4),mygroup=1:8)
(groupings)

# No, I don't need a solution where the last column of "groupings" is
# stacked twice and bound to "mydata"

# Method 1 of assigning to groups - requires a lot of lines of code:
mydata$mygroup.m1<-NA
mydata[(mydata$sex %in% "m")&(mydata$age %in% 1),"mygroup.m1"]<-1
mydata[(mydata$sex %in% "m")&(mydata$age %in% 2),"mygroup.m1"]<-2
mydata[(mydata$sex %in% "m")&(mydata$age %in% 3),"mygroup.m1"]<-3
mydata[(mydata$sex %in% "m")&(mydata$age %in% 4),"mygroup.m1"]<-4
mydata[(mydata$sex %in% "f")&(mydata$age %in% 1),"mygroup.m1"]<-5
mydata[(mydata$sex %in% "f")&(mydata$age %in% 2),"mygroup.m1"]<-6
mydata[(mydata$sex %in% "f")&(mydata$age %in% 3),"mygroup.m1"]<-7
mydata[(mydata$sex %in% "f")&(mydata$age %in% 4),"mygroup.m1"]<-8
(mydata)

# Method 2 of assigning to groups - very "loopy":
mydata$mygroup.m2<-NA
for(i in 1:nrow(mydata)){  # i<-1
   mysex<-mydata[i,"sex"]
   myage<-mydata[i,"age"]
   mydata[i,"mygroup.m2"]<-groupings[(groupings$sex %in% mysex)&
                                     (groupings$age %in% myage),"mygroup"]
}
(mydata)

# Method 3 of assigning to groups - also "loopy", but less than Method 2:
mydata$mygroup.m3<-NA
for(i in 1:nrow(groupings)){  # i<-1
   mysex<-groupings[i,"sex"]
   myage<-groupings[i,"age"]
   mydata[(mydata$sex %in% mysex)&(mydata$age %in% myage),
          "mygroup.m3"]<-groupings[i,"mygroup"]
}
(mydata)


