You can remove NAs with:

    train <- subset(train, !is.na(TargetVariable))

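As a quick illustration of what that call does, here is a minimal sketch on made-up data. The data frame and the column name var1 are assumptions for the example only, not the poster's actual file. Note that subset() used this way drops rows where TargetVariable is NA; it does not remove any columns.

```r
# Toy stand-in for the poster's data; the values are invented for illustration.
train <- data.frame(
  TargetVariable = c(1, NA, 0, 1),
  var1           = c(5, 5, 5, 5)
)

# Keep only the rows whose TargetVariable is not NA.
train <- subset(train, !is.na(TargetVariable))

nrow(train)   # one row was dropped
ncol(train)   # both columns are still present, including the constant var1
```
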
I am not sure what you mean by constant values. You could use 'table' to
determine which values appear the most and then remove them:

    x <- table(train$TargetVariable)
    train <- subset(train, !(TargetVariable %in% names(x)[x > someCountAboveWhichToDelete]))

But you probably need to look at your data and determine which numbers are
in the set that you need to delete.

On Sat, Jul 10, 2010 at 6:28 PM, pdb <ph...@philbrierley.com> wrote:
>
> Hi all,
>
> I have a large data set and want to immediately build a 'blind' model
> without first examining the data. Now it appears in the data there are a lot
> of fields that are constant or all missing values - which prevents the model
> from being built.
>
> Can someone point me the right direction as to how I can automatically purge
> my data file of these useless fields.
>
> Thanks in advance,
>
> pdb
>
> train <- read.csv("TrainingData.csv")
> library(gbm)
> i.gbm<-gbm(TargetVariable ~ . ,data=train,distribution="bernoulli.....
>
> 1: In gbm.fit(x, y, offset = offset, distribution = distribution, ... :
>  variable 5: var1 has no variation.
> --
> View this message in context:
> http://r.789695.n4.nabble.com/eliminating-constant-variables-tp2284831p2284831.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
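
If "constant or all missing fields" in the quoted question means whole columns rather than values of TargetVariable, something along these lines might be closer to what was asked. This is only a sketch under assumptions: the helper is_useless and the toy columns var1 to var3 are hypothetical, and it treats a column as constant when it has at most one distinct non-NA value, which may or may not match what gbm considers "no variation".

```r
# Hypothetical helper: TRUE when a column is entirely NA or has only one
# distinct non-NA value.
is_useless <- function(col) {
  vals <- unique(col[!is.na(col)])
  length(vals) <= 1
}

# Toy data standing in for TrainingData.csv; the values are invented.
train <- data.frame(
  TargetVariable = c(1, 0, 1, 0),
  var1           = c(5, 5, 5, 5),        # constant
  var2           = c(NA, NA, NA, NA),    # all missing
  var3           = c(0.2, 1.5, NA, 3.1)  # varies, with one NA
)

# Flag each column, then keep only the ones that are not flagged.
useless <- vapply(train, is_useless, logical(1))
train   <- train[, !useless, drop = FALSE]

names(train)  # TargetVariable and var3 remain
```

Using vapply() with logical(1) keeps the result a plain logical vector, and drop = FALSE keeps train a data frame even if only one column survives. It would still be worth checking which columns were flagged before fitting the model.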