[R] Executing the same function on consecutive files
Hi all,

I have the following problem: I have a matrix of size 8,000,000 x 18. My personal computer freezes, so I have split my original file into 100 separate files. I have written a function that should be run on each of these files.

So imagine I need to read the data from files q1 to q100,

    data <- read.table("q1.txt", sep=)

and each time I read one file, run my own function (I get some stats). My final goal is to add up the partial stats.

My question is: is it possible to write something similar to this?

    for (i in 1:100) {
      data[i] <- read.table(q[i].txt, sep=)
      execute .
    }

Many thanks in advance

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Executing the same function on consecutive files
This looks something like what you want:
http://r.789695.n4.nabble.com/Reading-in-a-series-of-files-using-a-for-loop-td906101.html

--- On Mon, 6/27/11, Trying To learn again tryingtolearnag...@gmail.com wrote:
From: Trying To learn again tryingtolearnag...@gmail.com
Subject: [R] Executing the same function on consecutive files
To: r-help@r-project.org
Received: Monday, June 27, 2011, 6:01 PM
[...]
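The linked thread is not reproduced here. As a rough sketch of the for-loop form the question asks about, the file name can be built from the loop index with paste0() and each partial result added to a running total. The temporary directory, the three demo files, and my_stats() are assumptions made up for illustration; the real function and the real sep value are not given in the thread.

```r
# Hypothetical setup: write q1.txt..q3.txt to a temporary directory so the
# sketch runs on its own; the real case would have q1.txt..q100.txt.
dir <- tempfile("qfiles")
dir.create(dir)
n_files <- 3
for (i in seq_len(n_files)) {
  chunk <- data.frame(a = i * (1:4), b = rep(i, 4))
  write.table(chunk, file.path(dir, paste0("q", i, ".txt")), row.names = FALSE)
}

# Hypothetical per-file statistic: column sums plus the row count.
my_stats <- function(d) c(colSums(d), n = nrow(d))

total <- NULL
for (i in seq_len(n_files)) {
  # Build the file name from the loop index; q[i].txt is not valid R.
  fname <- file.path(dir, paste0("q", i, ".txt"))
  data <- read.table(fname, header = TRUE)
  part <- my_stats(data)
  # Add this file's partial result to the running total.
  total <- if (is.null(total)) part else total + part
}
total
```

Only one file is held in memory at a time, because data is overwritten on every pass. This works for statistics that can be summed across files (counts, sums, sums of squares); a statistic such as a median cannot be combined this way.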
Re: [R] Executing the same function on consecutive files
Hi:

One approach:

(1) Put your files into a separate directory.
(2) Use list.files() to grab the individual file names.
(3) Write a function that takes a file name as an argument, reads the data, and does the necessary processing.
(4) Use lapply() or ldply()/llply() from the plyr package to run the function on each file in the list. lapply() and llply() will return lists, ldply() would return a data frame. If you intend to use ldply(), then the function in (3) needs to return a data frame.

Here's a small demo. I have five data sets in my starting directory with variables x1, x2, y. The function reads in the data and returns the fitted regression model; when lapply() is run on it, the five models are returned as a list. One can then cherry-pick output from the list of models.

    files <- paste('dat', 1:5, '.csv', sep = '')
    myfun <- function(d) {
        df <- read.csv(d, header = TRUE)   # read one file
        lm(y ~ ., data = df)               # regress y on all other columns
    }
    lout <- lapply(files, myfun)           # list of five fitted models

    library(plyr)
    ldply(lout, function(x) coef(x))               # coefficients
    ldply(lout, function(x) summary(x)$r.squared)  # R^2

One could also use

    do.call(rbind, lapply(lout, function(x) coef(x)))
    do.call(rbind, lapply(lout, function(x) summary(x)$r.squared))

but ldply() has a somewhat simpler syntax.

Hopefully, you can adapt these steps to your problem.

Dennis

On Mon, Jun 27, 2011 at 3:01 PM, Trying To learn again tryingtolearnag...@gmail.com wrote:
[...]
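Steps (1) to (4) above might be adapted to the "add up partial stats" goal roughly as follows. This is only a sketch under stated assumptions: the temporary directory, the simulated dat1.csv..dat5.csv files, and chunk_stats() are invented for illustration, and the mean and variance of y merely stand in for whatever statistics the original function computes.

```r
# Hypothetical setup: five small CSV files in their own temporary directory.
dir <- tempfile("chunks")
dir.create(dir)
set.seed(1)
for (i in 1:5) {
  chunk <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
  chunk$y <- 1 + 2 * chunk$x1 - chunk$x2 + rnorm(20, sd = 0.1)
  write.csv(chunk, file.path(dir, paste0("dat", i, ".csv")), row.names = FALSE)
}

# Step (2): collect the file names rather than building them by hand.
files <- list.files(dir, pattern = "^dat[0-9]+\\.csv$", full.names = TRUE)

# Step (3): read one file and return a one-row data frame of partial sums.
chunk_stats <- function(path) {
  df <- read.csv(path, header = TRUE)
  data.frame(file = basename(path), n = nrow(df),
             sum_y = sum(df$y), sum_y2 = sum(df$y^2))
}

# Step (4): apply the function to every file, then stack the rows.
parts <- do.call(rbind, lapply(files, chunk_stats))

# Combine the partial sums into overall statistics.
n_total <- sum(parts$n)
mean_y <- sum(parts$sum_y) / n_total
var_y <- (sum(parts$sum_y2) - n_total * mean_y^2) / (n_total - 1)
```

Note that list.files() returns names in alphabetical order, so with q1.txt to q100.txt the file q10.txt would come before q2.txt. That does not matter for sums, but it does if the order of the rows in parts is important.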