"mark salsburg" <[EMAIL PROTECTED]> writes:
> How do I manipulate the read.table function to read in only the 2nd
> column???
If your data is small, you can read in all columns and then subset the
resulting data frame. Try that first.
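
For a small file, the read-then-subset approach might look like the sketch below. The toy file and the variable names are made up for illustration. The second variant relies on the documented behaviour that a colClasses entry of "NULL" makes read.table skip that column, which avoids holding the unwanted columns in memory at all.

```r
## Toy file standing in for your data (three whitespace-separated columns);
## the file contents and variable names are assumptions for this example.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id name score",
             "1 a 10.5",
             "2 b 20.25",
             "3 c 30"), tmp)

## Option 1: read everything, then keep only the 2nd column.
## drop = FALSE keeps the result a data frame rather than a vector.
all.cols <- read.table(tmp, header = TRUE)
second <- all.cols[, 2, drop = FALSE]

## Option 2: skip unwanted columns while reading. "NULL" drops a
## column; NA lets read.table guess the type of the column we keep.
second.only <- read.table(tmp, header = TRUE,
                          colClasses = c("NULL", NA, "NULL"))
unlink(tmp)
```

Option 2 needs one colClasses entry per column, so you have to know how many columns the file has.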
Perhaps there is a nicer way to do this that I don't know about, but
I recently coded up the following to allow for a "streamy" read.table.
I've adjusted a few things but haven't tested the result, so it may not
work as is, but it should give you an idea.
+ seth
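readBatch <- function(con, batch.size) {
    ## Every column is read as character; set 20 to the number of
    ## columns in your file.
    colClasses <- rep("character", 20)
    ## Adjust the column index to pick out the columns that you want
    ## (use 2 for just the 2nd column); drop=FALSE keeps a data frame.
    read.csv(con, colClasses=colClasses, as.is=TRUE,
             nrows=batch.size, header=FALSE)[, 1:2, drop=FALSE]
}

readTableStreamily <- function(filePath) {
    BATCH_SIZE <- 5000  ## no idea what a good value is; depends on file and RAM
    con <- file(filePath, 'r')
    on.exit(close(con))  ## close the connection even if something fails
    ## The first line holds the column names; flatten the one-row
    ## data frame into a plain character vector.
    colNames <- unlist(readBatch(con, batch.size=1), use.names=FALSE)
    chunks <- list()
    i <- 1
    done <- FALSE
    while (!done) {
        done <- tryCatch({
            cat(".")
            chunks[[i]] <- readBatch(con, batch.size=BATCH_SIZE)
            i <- i + 1
            FALSE
        }, error=function(e) TRUE)  ## read.csv errors once no lines remain
    }
    cat("\n")
    df <- do.call("rbind", chunks)
    names(df) <- colNames
    df
}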
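
Here is a small self-contained sketch of the same chunked-reading idea, combined with a colClasses entry of "NULL" so that only the wanted column is ever stored. The function name readColumnInChunks, its arguments (col, ncols, batch.size), and the toy file are all hypothetical and made up for illustration. It assumes a comma-separated file whose first line holds the column names.

```r
## Hypothetical helper: read column `col` of a comma-separated file
## with `ncols` columns, `batch.size` rows at a time.
readColumnInChunks <- function(filePath, col, ncols, batch.size = 5000) {
    colClasses <- rep("NULL", ncols)   # skip every column ...
    colClasses[col] <- "character"     # ... except the one we want
    con <- file(filePath, "r")
    on.exit(close(con))                # always release the connection
    ## The first line holds the column names
    header <- read.csv(con, header = FALSE, nrows = 1,
                       colClasses = colClasses)
    chunks <- list()
    repeat {
        ## read.csv signals an error once no lines remain; note that
        ## this also ends the loop on any other read error.
        chunk <- tryCatch(read.csv(con, header = FALSE,
                                   nrows = batch.size,
                                   colClasses = colClasses),
                          error = function(e) NULL)
        if (is.null(chunk)) break
        chunks[[length(chunks) + 1]] <- chunk
    }
    df <- do.call("rbind", chunks)
    names(df) <- unlist(header, use.names = FALSE)
    df
}

## Toy usage: pull the 2nd column out of a 3-column file, 2 rows at a time
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name,score", "1,a,10", "2,b,20", "3,c,30"), tmp)
res <- readColumnInChunks(tmp, col = 2, ncols = 3, batch.size = 2)
unlink(tmp)
```

Because every error ends the loop, a malformed line in the middle of the file would silently truncate the result, so it is worth checking the row count against what you expect.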