On May 3, 2012, at 5:40 PM, victor jimenez wrote:

> First of all, thank you for the answers. I did not know about zoo. However,
> it seems that none of the approaches can do exactly what I want (please
> correct me if I am wrong).
>
> Probably it was not clear in my original question. The CSV files only
> contain the performance values. The other two columns (ASSOC and SIZE) are
> obtained from the existing values in the directory tree. So, in my opinion,
> none of the proposed solutions would work, unless every single "data.csv"
> file contained all three columns (ASSOC, SIZE and PERF).
>
> In my case, my experimentation framework basically outputs a CSV with some
> values read from the processor's performance counters (PMCs). For each
> cache size and associativity I conduct an experiment, creating a CSV file
> and placing that file into its own directory. I could modify the
> experimentation framework so that it also outputs the cache size and
> associativity, but that may not be ideal in some circumstances. I also
> have a significant amount of old results, and I want to keep using them
> without manually fixing the CSV files.
>
You don't need to touch the CSV files, simply add values at load time -
this is all easily doable in one line ;)

> do.call("rbind",lapply(Sys.glob("*/*/data.csv"),function(d)
+   cbind(read.csv(d),as.data.frame(t(strsplit(d,"/")[[1]])))))
  A B V1 V2       V3
1 1 2  1  a data.csv
2 3 4  1  a data.csv
3 1 2  1  b data.csv
4 3 4  1  b data.csv
5 1 2  2  a data.csv
6 3 4  2  a data.csv

> Has anyone else faced such a situation? Any good solutions?
>
> Thank you,
> Victor
>
> On Thu, May 3, 2012 at 8:54 PM, Gabor Grothendieck
> <ggrothendi...@gmail.com> wrote:
>
>> On Thu, May 3, 2012 at 2:07 PM, victor jimenez <betaband...@gmail.com>
>> wrote:
>>> Sometimes I have hundreds of CSV files scattered in a directory tree,
>>> resulting from experiments' executions. For instance, giving an example
>>> from my field, I may want to collect the performance of a processor for
>>> several design parameters such as "cache size" (possible values: 2, 4, 8
>>> and 16) and "cache associativity" (possible values: direct-mapped, 4-way,
>>> fully-associative). The results of all these experiments will be stored
>>> in a directory tree like:
>>>
>>> results
>>> |-- direct-mapped
>>> |   |-- 2 -- data.csv
>>> |   |-- 4 -- data.csv
>>> |   |-- 8 -- data.csv
>>> |   |-- 16 -- data.csv
>>> |-- 4-way
>>> |   |-- 2 -- data.csv
>>> |   |-- 4 -- data.csv
>>> ...
>>> |-- fully-associative
>>> |   |-- 2 -- data.csv
>>> |   |-- 4 -- data.csv
>>> ...
>>>
>>> I am developing a package that would allow me to gather all those CSV
>>> into a single data frame. Currently, I just need to execute the following
>>> statement:
>>>
>>> dframe <- gather("results/@ASSOC@/@SIZE@/data.csv")
>>>
>>> and this command returns a data frame containing the columns ASSOC, SIZE
>>> and all the remaining columns inside the CSV files (in my case the
>>> processor performance), effectively loading all the CSV files into a
>>> single data frame.
>>> So, I would get something like:
>>>
>>> ASSOC, SIZE, PERF
>>> direct-mapped, 2, 1.4
>>> direct-mapped, 4, 1.6
>>> direct-mapped, 8, 1.7
>>> direct-mapped, 16, 1.7
>>> 4-way, 2, 1.4
>>> 4-way, 4, 1.5
>>> ...
>>>
>>> I would like to ask whether there is any similar functionality already
>>> implemented in R. If so, there is no need to reinvent the wheel :)
>>> If it is not implemented and the R community believes that this feature
>>> would be useful, I would be glad to contribute my code.
>>>
>>
>> If your csv files all have the same columns and represent time series
>> then read.zoo in the zoo package can read multiple csv files in at
>> once using a single read.zoo command producing a single zoo object.
>>
>> library(zoo)
>> ?read.zoo
>> vignette("zoo-read")
>>
>> Also see the other zoo vignettes and help files.
>>
>> --
>> Statistics & Software Consulting
>> GKX Group, GKX Associates Inc.
>> tel: 1-877-GKX-GROUP
>> email: ggrothendieck at gmail.com
>>
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
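
For reference, the "add values at load time" idea from the reply can also
be spelled out with named columns. What follows is only a sketch, not
Victor's gather() package and not an existing R function: gather_csv(), the
throwaway tree under tempdir() and the PERF values are all made up for
illustration. It also assumes that the SIZE directory names are integers and
that every data.csv has at least one row.

```r
# Hypothetical helper (not from the thread): reads every
# <root>/<ASSOC>/<SIZE>/data.csv and prepends the two directory names
# as columns, so the CSV files themselves stay untouched.
gather_csv <- function(root) {
  files <- Sys.glob(file.path(root, "*", "*", "data.csv"))
  pieces <- lapply(files, function(f) {
    size  <- basename(dirname(f))           # innermost directory name
    assoc <- basename(dirname(dirname(f)))  # directory one level up
    # Length-1 values are recycled to the number of rows read from f.
    data.frame(ASSOC = assoc, SIZE = as.integer(size), read.csv(f),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

# Build a tiny throwaway tree to try it on (invented PERF values).
root <- file.path(tempdir(), "results")
for (assoc in c("direct-mapped", "4-way")) {
  for (size in c("2", "4")) {
    d <- file.path(root, assoc, size)
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
    write.csv(data.frame(PERF = as.numeric(size) / 10),
              file.path(d, "data.csv"), row.names = FALSE)
  }
}

dframe <- gather_csv(root)
print(dframe)
```

Taking the names with basename()/dirname() rather than strsplit() keeps the
two columns correct however deep the root directory is, and avoids the
extra constant column (V3 in the output above) that holds the file name.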