[R] Scanning only specific columns into R from a VERY large file

Josh B Fri, 16 Apr 2010 15:13:12 -0700

Hi,

I turn to you, the R Sages, once again for help. You've never let me down!


(1) Please make the following toy files:

x <- read.table(textConnection("var.1 var.2 var.3 var.1000
indv.1 1 5 9 7
indv.210000 2 9 3 8"), header = TRUE)

y <- read.table(textConnection("var.3 var.1000"), header = TRUE)

write.csv(x, file = "x.csv")
write.csv(y, file = "y.csv")

(2) Pretend you are starting with the files "x.csv" and "y.csv." They come from 
another source -- an online database. Pretend that these files are much, much, 
much larger. Specifically: 
    (a) Pretend that "x.csv" contains 1000 columns by 210,000 rows. 
    (b) "y.csv" contains just header titles. Pretend that there are 90 header 
titles in "y.csv" in total. These header titles are a subset of the header 
titles in "x.csv."

(3) What I want to do is scan (or import, or whatever the appropriate word is) 
only a subset of the columns from "x.csv" into R. Specifically, I only want 
to scan the columns of data from "x.csv" into R that are indicated in the file 
"y.csv." I still want to scan in all 210000 rows from "x.csv," but only for the 
aforementioned columns listed in "y.csv."

Can you guys recommend a strategy for me? I think I need to use the scan 
command, based on the hugeness of "x.csv," but I don't know what exactly to do. 
Specific code that gets the job done would be the most useful. 
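
For reference, here is a minimal sketch of one possible approach on the toy 
files from step (1). It is not a tested solution for the full-size data. It 
assumes that "x.csv" and "y.csv" were written by write.csv as above (so the 
first header field is the empty row-name label), and it relies on the 
colClasses argument of read.csv, where "NULL" marks a column to skip. The 
names x.names, y.names, wanted, keep, cc and x.sub are made up for 
illustration.

```r
# Recreate the toy files from step (1) so the sketch runs on its own.
x <- read.table(textConnection("var.1 var.2 var.3 var.1000
indv.1 1 5 9 7
indv.210000 2 9 3 8"), header = TRUE)
y <- read.table(textConnection("var.3 var.1000"), header = TRUE)
write.csv(x, file = "x.csv")
write.csv(y, file = "y.csv")

# Read only the first (header) line of each file as character fields.
x.names <- scan("x.csv", what = "", sep = ",", nlines = 1, quiet = TRUE)
y.names <- scan("y.csv", what = "", sep = ",", nlines = 1, quiet = TRUE)

# Drop the empty row-name label that write.csv puts in the first field.
wanted <- y.names[y.names != ""]

# Build one colClasses entry per column of x.csv:
# NA = let R guess the type, "NULL" = skip the column entirely.
keep <- x.names %in% wanted
keep[1] <- TRUE                 # first column holds the row labels
cc <- ifelse(keep, NA, "NULL")
cc[1] <- "character"            # read the row labels as plain strings

# Read all rows, but only the columns named in y.csv.
x.sub <- read.csv("x.csv", colClasses = cc, row.names = 1)
print(x.sub)
```

On the real file the skipped columns are still parsed line by line, so this 
mainly saves memory and may not save much time.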

Thank you very much in advance!
Josh

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.