[R] how to edit my R codes into a efficient way
Hello everyone,

I am a student and a new learner of R, and I am trying to do my homework in R. I have 10 files that need to be read and processed separately. I would like to write the code as something like a macro, so that I do not have to repeat similar lines 10 times. The following is part of my code; I extracted only three lines from each repeating section.

data.1 <- read.csv("http://pegasus.cc.ucf.edu/~xsu/CLASS/STA6704/pat1.csv", header = TRUE, sep = ",", quote = "", fill = TRUE);
data.2 <- read.csv("http://pegasus.cc.ucf.edu/~xsu/CLASS/STA6704/pat3.csv", header = TRUE, sep = ",", quote = "", fill = TRUE);
data.3 <- read.csv("http://pegasus.cc.ucf.edu/~xsu/CLASS/STA6704/pat4.csv", header = TRUE, sep = ",", quote = "", fill = TRUE);

baby.1 <- data.frame(cuff=data.1$avg_value, time=seq(1,dim(data.1)[1]), patient=rep(1, dim(data.1)[1]))
baby.2 <- data.frame(cuff=data.2$avg_value, time=seq(1,dim(data.2)[1]), patient=rep(3, dim(data.2)[1]))
baby.3 <- data.frame(cuff=data.3$avg_value, time=seq(1,dim(data.3)[1]), patient=rep(4, dim(data.3)[1]))

I also tried the code below, but it doesn't work.

for(n in 1:10){
mm <- data.frame(cuff=paste("data",n, sep=".")$avg_value, time=seq(1,dim(paste("data",n, sep="."))[1]), patient=rep(1,paste("data",n, sep="."))[1]))
assign(paste("baby",n,sep="."), mm)}

I am looking forward to your help. Thanks very much!

Xuhong

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to edit my R codes into a efficient way
Xuhong Zhu wrote:
> I also tried the code below, but it doesn't work.
>
> for(n in 1:10){
> mm <- data.frame(cuff=paste("data",n, sep=".")$avg_value, time=seq(1,dim(paste("data",n, sep="."))[1]), patient=rep(1,paste("data",n, sep="."))[1]))
> assign(paste("baby",n,sep="."), mm)}

This cannot work, since paste() returns a character string, while functions like data.frame() etc. expect the R object itself, not a quoted string that holds its name.

You can use paste() when reading the individual csv files:

for(n in 1:10){
  mydata <- read.csv(file=paste('...STA6704/pat',n,'.csv',sep=""), header = TRUE, sep = ",", quote = "", fill = TRUE)
  # ... further lines to process mydata ...
}

A faster way of computing would involve reading the individual files into a list of data frames and using lapply() on that list, rather than processing the data inside the loop.

Petr

-- 
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
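The point about paste() can be illustrated with a small sketch. It is not part of the original reply: the toy data frames and the `patient.ids` vector are made up for illustration, and get() is swapped in as one possible way to turn a name held in a string back into the object it refers to.

```r
# Toy stand-ins for the data frames read from the CSV files (made-up values).
data.1 <- data.frame(avg_value = c(80, 82, 85))
data.2 <- data.frame(avg_value = c(90, 91))

# paste() only builds a character string; it is not the data frame itself.
nm <- paste("data", 1, sep = ".")
is.character(nm)
# nm$avg_value would raise an error, because `$` does not work on a string.

# Hypothetical vector of patient numbers, taken from the file names pat1, pat3.
patient.ids <- c(1, 3)

for (n in seq_along(patient.ids)) {
  # get() looks up the object whose name is stored in the string.
  d <- get(paste("data", n, sep = "."))
  mm <- data.frame(cuff    = d$avg_value,
                   time    = seq_len(nrow(d)),
                   patient = rep(patient.ids[n], nrow(d)))
  # assign() creates baby.1, baby.2, ... from a name built with paste().
  assign(paste("baby", n, sep = "."), mm)
}
```

This keeps the numbered-variable layout of the original post. Storing the data frames in a list avoids get() and assign() altogether and is usually easier to work with.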
Re: [R] how to edit my R codes into a efficient way
Have you read An Introduction to R? If not, do so before posting any further questions.

Once you have read it, pay attention to what it says about lists, which are a very general data structure (indeed, **the** most general) and very convenient for this sort of task. The general approach that one uses is something like:

ContentsOfFiles <- lapply(filenameVector, functionThatReadsFile, additionalParametersToFunction)

More specifically,

ContentsOfFiles <- lapply(filenameVector, read.csv, header=TRUE, quote="", fill=TRUE)

See ?lapply

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
650-467-7374

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Xuhong Zhu
Sent: Tuesday, March 06, 2007 7:19 AM
To: r-help@stat.math.ethz.ch
Subject: [R] how to edit my R codes into a efficient way
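The lapply() approach suggested in this reply might look roughly like the sketch below. It is only an illustration under stated assumptions: the two small CSV files are written to a temporary directory so the example runs without network access, and the names `patient.ids`, `make.baby`, `babies` and `all.babies` are hypothetical, not taken from the thread.

```r
# Write two small demo files (made-up values) so the sketch is self-contained;
# the real input would be the pat*.csv files from the original post.
tmp <- tempfile("pat_demo_")
dir.create(tmp)
write.csv(data.frame(avg_value = c(80, 82, 85)),
          file.path(tmp, "pat1.csv"), row.names = FALSE, quote = FALSE)
write.csv(data.frame(avg_value = c(90, 91)),
          file.path(tmp, "pat3.csv"), row.names = FALSE, quote = FALSE)

# Patient numbers need not be consecutive, so keep them in a vector.
patient.ids <- c(1, 3)
filenameVector <- file.path(tmp, paste("pat", patient.ids, ".csv", sep = ""))

# Read every file into one list of data frames.
ContentsOfFiles <- lapply(filenameVector, read.csv,
                          header = TRUE, quote = "", fill = TRUE)

# Hypothetical helper: build the cuff/time/patient data frame for one file.
make.baby <- function(d, id) {
  data.frame(cuff    = d$avg_value,
             time    = seq_len(nrow(d)),
             patient = rep(id, nrow(d)))
}

# Map() pairs each data frame with its patient number.
babies <- Map(make.baby, ContentsOfFiles, patient.ids)

# Optionally stack everything into a single data frame.
all.babies <- do.call(rbind, babies)
```

With the data in a list, `babies[[2]]` plays the role of `baby.2`, and any further per-patient processing can be applied with another lapply() call instead of repeated lines.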
Re: [R] how to edit my R codes into a efficient way
Again, thanks a lot.

Xuhong