Hi,

I have a data set whose values are _not_ missing at random. I am using a package for learning Bayesian networks that does not accept NA as a value. From a database I query data frames with k, k+n, k+2n, ... variables (the leftmost k columns are always present). Applying rbind.fill from the reshape package to two such data frames gives a data frame like this:
  trg_type   child_type_1
1 Scientists NA
2 of         used

To get rid of the NA values I use the following function, which works for data frames that contain only factor columns:

    substitute_na <- function(tok, na_factor_level = "NOT_REALIZED") {
      # Add the replacement level to every (factor) column
      for (i in seq_along(tok)) {
        levels(tok[, i]) <- c(levels(tok[, i]), na_factor_level)
      }
      # Replace every NA cell with the new level
      tok[is.na(tok)] <- as.factor(na_factor_level)
      return(tok)
    }

Is there a better/faster way to do it? It would also be great to be able to distinguish factor columns from numeric columns and use a special numeric value for the latter. The current version of rbind.fill makes no direct reference to a fill value that I could change for my purpose.

Thanks!
Ingmar

--
Ingmar Schuster
Natural Language Processing Group
Department of Computer Science
University of Leipzig
Johannisgasse 26
04103 Leipzig, Germany
Tel. +49 341 9732205
http://asv.informatik.uni-leipzig.de/en/staff/Ingmar_Schuster
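
For reference, the factor/numeric distinction asked about above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name substitute_na_by_type, the sentinel -999, and the extra numeric column "count" are all hypothetical and not part of the original data or of any package.

```r
# Hypothetical helper: fill NA per column type.
# Factor columns get an extra level; numeric columns get a sentinel value.
substitute_na_by_type <- function(df,
                                  na_factor_level = "NOT_REALIZED",
                                  na_numeric_value = -999) {
  for (col in names(df)) {
    x <- df[[col]]
    if (is.factor(x)) {
      # Add the replacement level only if it is not present yet
      if (!(na_factor_level %in% levels(x))) {
        levels(x) <- c(levels(x), na_factor_level)
      }
      x[is.na(x)] <- na_factor_level
    } else if (is.numeric(x)) {
      x[is.na(x)] <- na_numeric_value
    }
    # Columns of other types are left unchanged
    df[[col]] <- x
  }
  df
}

# Example data modelled on the table above; "count" is an assumed numeric column
tok <- data.frame(
  trg_type     = factor(c("Scientists", "of")),
  child_type_1 = factor(c(NA, "used")),
  count        = c(3, NA)
)
res <- substitute_na_by_type(tok)
```

Working column by column avoids building the full is.na() matrix for the whole data frame, which may help on wide data, though that has not been benchmarked here.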