Eric Doviak <edoviak <at> earthlink.net> writes:

> Dear useRs,
>
> I recently began a job at a very large and heavily bureaucratic organization. We're setting up a research
> office and statistical analysis will form the backbone of our work. We'll be working with large datasets
> such as the SIPP as well as our own administrative data.

We need to know more about what you need to do with those large data sets in order to help -- giving some specific examples would be useful. In many situations you can set up a database connection or use Perl to select carefully and only load the observations/variables you need into R, but it's hard to make completely general suggestions.
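
As one illustration of loading only what you need, base R's read.csv() can skip whole columns and stop after a fixed number of rows. This is only a rough sketch: the file, the column names (id, state, income) and the values are invented for the example and have nothing to do with the SIPP layout.

```r
# Sketch only: the file and column names below are made up for illustration.
tmp <- tempfile(fileext = ".csv")
full <- data.frame(id = 1:5,
                   state = c("NY", "NJ", "CT", "NY", "PA"),
                   income = c(100.5, 200, NA, 50.25, 75))
write.csv(full, tmp, row.names = FALSE)

# colClasses = "NULL" tells read.csv to skip that column entirely;
# nrows limits how many data rows are read.
small <- read.csv(tmp,
                  colClasses = c("integer", "NULL", "numeric"),
                  nrows = 3)
unlink(tmp)

print(small)
```

Whether this is enough depends on the size and layout of the real files; for very large data a database query that returns only the needed rows may still be the better route.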
I'm not sure what the purpose is of your code that reads a few lines of a data file and writes them to a CSV file ... ?

"Vectorizing" your code means figuring out a way to tell R how to do what you want as a single 'vector' operation -- for example, to remove NAs from a vector you could do this:

    newvec = numeric(0)
    for (i in seq(along=oldvec)) {
      if (!is.na(oldvec[i])) newvec = c(newvec,oldvec[i])
    }

but this would be incredibly slow, because newvec is copied and extended on every pass through the loop --

    newvec = oldvec[!is.na(oldvec)]

or

    newvec = na.omit(oldvec)

would be far faster.

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.