R-Helpers,
I am trying to add a column of proportions to a dataframe. For each distinct
value of a variable, I count the individuals who have that value and divide by
the number of rows. I then want to write that proportion into a new column at
the end of the dataframe, but only in the rows of the individuals with that
value. The problem is that the proportions are not being attached only to the
rows with the corresponding value.
Below is an example of the code that I am using. The data in the example
dataframe are made up; they should give you an idea, but the original has NA in
many rows. The output reported further below comes from the original data.
# Read in data
age.int <- data.frame(IND_ID = seq(1, 140, 10),
                      rs1042364 = sample(c("(1,1)", "(1,2)", "(2,2)"), 14, replace = TRUE),
                      first_drink = sample(5:17, 14, replace = TRUE))
asubs112 <- subset(age.int, rs1042364 != "(2,2)")
ages112 <- sort(unique(na.omit(asubs112$first_drink)))
for (i in ages112) {
  indce <- which(na.omit(asubs112$first_drink == i))
  prop <- length(indce) / nrow(asubs112)
  asubs112[indce, 4] <- prop
  asubs112[indce, ]
}
Below is the output that I get from the script above when it is run on the
original data. Notice that the first row with NA in first_drink gets a
proportion, but none of the other NA rows do. I am not sure what I am doing
wrong; any suggestions would be a big help.
TIA,
Adrian
asubs112[1:50,]
     IND_ID rs1042364 first_drink age_int          V5
4  10008007     (1,2)          NA      16 0.003891051
6  10013012     (1,2)          13      14 0.116731518
7  10015006     (1,2)          12      17 0.105058366
8  10015007     (1,1)          12      16 0.105058366
10 10021009     (1,2)          NA      15          NA
14 10039036     (1,2)          NA      15          NA
15 10039037     (1,2)          NA      13          NA
17 10045005     (1,2)          13      17 0.116731518
18 10045014     (1,2)          13      14 0.116731518
21 10055022     (1,2)          NA      15          NA
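
One possible cause, offered as a sketch rather than a diagnosis of the original data: in the loop, na.omit() is applied to the logical vector asubs112$first_drink == i before which() is called. The positions that which() returns are then counted in the shortened vector and may no longer match the rows of asubs112. The data frame `toy` and the column name `prop` below are made up for illustration and are not part of the original script.

```r
# Hypothetical toy data (not the original): first_drink contains NAs
toy <- data.frame(
  IND_ID      = 1:8,
  first_drink = c(NA, 13, 12, 12, NA, 13, 13, NA)
)

# Comparing against a vector with NAs gives NA in the logical result
hit <- toy$first_drink == 13

# which() skips NA on its own, so these positions refer to rows of 'toy'
which(hit)            # 2 6 7

# na.omit() drops the NA elements first, so the positions are counted in
# the shortened vector and no longer line up with the rows of 'toy'
which(na.omit(hit))   # 1 4 5

# The same loop without na.omit() around the comparison
toy$prop <- NA_real_
for (i in sort(unique(na.omit(toy$first_drink)))) {
  idx <- which(toy$first_drink == i)
  toy$prop[idx] <- length(idx) / nrow(toy)
}

# A vectorised alternative: ave() computes the group size per row and
# leaves rows whose first_drink is NA as NA
toy$prop2 <- ave(toy$first_drink, toy$first_drink, FUN = length) / nrow(toy)

print(toy)
```

As in the posted loop, the denominator here is the number of rows of the whole data frame, NA rows included. Note also that a bare asubs112[indce, ] inside a for loop prints nothing; it needs to be wrapped in print() to show the rows.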
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html