Dear list,
Below I've written a clunky for loop that counts the NA's in each row and
replaces the whole row with NA if more than 3 values are missing; otherwise
the row is left unchanged. This is sample code from a very large dataframe
I'm trying to clean up.
I know there are many simpler, more elegant solutions to this little problem.
Would someone be willing to show me how to create a function that I can APPLY
to each row rather than looping? I've tried and can't get it to work.
Thank you,
Zack
#####################################################
## Count NA's in each row. IF > 3 NA's in a row, make all NA
#####################################################
## test dataframe ##
x1 <- c(1,NA,NA,NA,NA,1,2,2)
x2 <- c(1,NA,NA,NA,NA,1,2,1)
x3 <- c(1,NA,1,1,1,1,1,5)
x4 <- c(1,NA,NA,NA,NA,NA,NA,5)
x <- rbind(x1,x2,x3,x4)
test <- rowSums(is.na(x))   ## count number of NA's in each row
x <- cbind(x, test)         ## add row NA count to data (column 9)
x <- data.frame(x)          ## make dataframe

# FOR LOOP to apply across all rows of dataframe
for (i in 1:nrow(x)) {
  if (x[i, 9] > 3) {        # more than 3 NA's: blank all 8 data columns
    x[i, 1:8] <- NA
  }                         # otherwise the row is left as it is
}
#####################################################
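
For comparison, here is a minimal sketch of the row-wise function asked about above. It assumes the rule is "more than 3 NA's blanks the whole row" and works on the matrix before the count column is added. The names blank_sparse_row and max_na are made up for illustration; nothing here is the only or the definitive way to do it.

```r
# Test data as in the post: four rows (x1..x4), eight columns each.
x <- rbind(
  x1 = c(1, NA, NA, NA, NA, 1, 2, 2),
  x2 = c(1, NA, NA, NA, NA, 1, 2, 1),
  x3 = c(1, NA, 1, 1, 1, 1, 1, 5),
  x4 = c(1, NA, NA, NA, NA, NA, NA, 5)
)

# Hypothetical helper: blank a row that has more than `max_na` missing values.
blank_sparse_row <- function(row, max_na = 3) {
  if (sum(is.na(row)) > max_na) {
    row[] <- NA               # keep length and type, set every element to NA
  }
  row
}

# apply() over rows (MARGIN = 1) returns each result as a column,
# so transpose to get back the original orientation.
cleaned <- t(apply(x, 1, blank_sparse_row))

# Vectorised alternative without apply(): pick the rows to blank by index.
cleaned2 <- x
cleaned2[rowSums(is.na(x)) > 3, ] <- NA
```

With the test data, rows x1, x2 and x4 have more than 3 NA's and are blanked, while x3 is kept. Note that apply() coerces a dataframe to a matrix first, so on a large dataframe with mixed column types the rowSums() indexing form is probably the safer choice.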
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.