[R] Executing the same function on consecutive files

2011-06-27 Thread Trying To learn again
Hi all,

My problem is the following: I have a matrix of size 8,000,000 x 18. My personal
computer hangs on it, so I have split the original file into 100 separate files.

I have written a function that should be run on each of these files.

So imagine

I need to read data from files q1 to q100:

data <- read.table("q1.txt", sep = "")

Each time I read one file, I run my own function (which computes some statistics),
and my final goal is to add up the partial statistics.

My question is:

Is it possible to write something similar to this?

for (i in 1:100) {
  data[i] <- read.table("q[i].txt", sep = "")
  execute ...
}

Many thanks in advance

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Executing the same function on consecutive files

2011-06-27 Thread John Kane
This looks something like what you want.

http://r.789695.n4.nabble.com/Reading-in-a-series-of-files-using-a-for-loop-td906101.html


Re: [R] Executing the same function on consecutive files

2011-06-27 Thread Dennis Murphy
Hi:

One approach:

(1) Put your files into a separate directory.
(2) Use list.files() to grab the individual file names.
(3) Write a function that reads one file into a data frame and does
the necessary processing.
(4) Use lapply(), or ldply()/llply() from the plyr package, to run the
function on each file in the list. lapply() and llply() return lists;
ldply() returns a data frame. If you intend to use ldply(), the function
in (3) needs to return a data frame (or a vector that can become one row).
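
Steps (1) and (2) might be sketched as follows. The directory, the
q1.txt, q2.txt, ... naming pattern and the twelve toy files are assumptions
for illustration only. Note that list.files() sorts alphabetically, so
q10.txt comes before q2.txt unless the names are reordered by number.

```r
# Assumption: the chunks are named q1.txt, q2.txt, ... and sit in one directory.
# A temporary directory with 12 tiny files stands in for the real data.
chunk_dir <- file.path(tempdir(), "chunks_demo")
dir.create(chunk_dir, showWarnings = FALSE)
for (i in 1:12) {
  write.table(data.frame(a = i, b = 2 * i),
              file = file.path(chunk_dir, paste0("q", i, ".txt")),
              row.names = FALSE)
}

# Step (2): collect the file names; full.names = TRUE keeps the directory part.
files <- list.files(chunk_dir, pattern = "^q[0-9]+\\.txt$", full.names = TRUE)

# list.files() returns names in alphabetical order (q1, q10, q11, q12, q2, ...),
# so reorder by the number embedded in each name if the order matters.
num <- as.integer(sub("^q([0-9]+)\\.txt$", "\\1", basename(files)))
files <- files[order(num)]
basename(files)
```

The resulting character vector can be passed straight to lapply() in step (4).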

Here's a small demo. I have five data sets in my starting directory
with variables x1, x2, y. The function reads in the data and returns
the output of a regression model; when lapply() is run on it, the
output of the five models is returned as a list. One can then cherry
pick output from the list of models.

files <- paste('dat', 1:5, '.csv', sep = '')
myfun <- function(d) {
    df <- read.csv(d, header = TRUE)
    lm(y ~ ., data = df)
}
lout <- lapply(files, myfun)

library(plyr)
ldply(lout, function(x) coef(x))                # coefficients
ldply(lout, function(x) summary(x)$r.squared)   # R^2

One could also use
do.call(rbind, lapply(lout, function(x) coef(x)))
do.call(rbind, lapply(lout, function(x) summary(x)$r.squared))

but ldply() has a somewhat simpler syntax.

Hopefully, you can adapt these steps to your problem.
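
For the original question (reading q1.txt to q100.txt one at a time and
adding up the partial statistics), a plain for loop also works. The sketch
below is only an illustration: the statistic (column sums and row counts,
combined into overall column means), the helper name partial_stats and the
three simulated files are assumptions, not the poster's actual function or
data.

```r
# Simulate three small chunk files in a temporary directory
# (stand-ins for q1.txt ... q100.txt).
set.seed(1)
chunk_dir <- file.path(tempdir(), "partial_demo")
dir.create(chunk_dir, showWarnings = FALSE)
full <- data.frame(x = rnorm(30), y = runif(30))
n_files <- 3
for (i in seq_len(n_files)) {
  rows <- ((i - 1) * 10 + 1):(i * 10)
  write.table(full[rows, ], file.path(chunk_dir, sprintf("q%d.txt", i)),
              row.names = FALSE)
}

# Hypothetical per-chunk function: returns the pieces needed to combine later.
partial_stats <- function(d) {
  list(n = nrow(d), sums = colSums(d))
}

# Read one file per iteration and overwrite d each time, so the chunks are
# never all held at once; accumulate the partial results as we go.
total_n <- 0
total_sums <- 0
for (i in seq_len(n_files)) {
  d <- read.table(file.path(chunk_dir, sprintf("q%d.txt", i)), header = TRUE)
  p <- partial_stats(d)
  total_n <- total_n + p$n
  total_sums <- total_sums + p$sums
}
overall_means <- total_sums / total_n
overall_means
```

This only works directly for statistics that can be combined from
per-chunk pieces (sums, counts, minima, maxima); a median, for example,
cannot be obtained by adding per-file medians.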

Dennis
