Thanks Henrik, that is exactly what I was hoping for!

Best,
Ista

On Fri, Mar 11, 2011 at 1:02 PM, Henrik Bengtsson <h...@biostat.ucsf.edu> wrote:
> Hi,
>
> the R.filesets package was designed for this. It is heavily used by
> the aroma framework (http://www.aroma-project.org/), so it has got a fair
> bit of mileage by now (in a good way). Here is how you could set up
> your data set and work with the data.
>
>
> # - - - - - - - - - - - -
> # Setup file data set
> # - - - - - - - - - - - -
> library("R.filesets");
> paths <- list.files(path="deleteme", full.names=TRUE);
> dsList <- lapply(paths, FUN=function(path) TabularTextFileSet$byPath(path));
> ds <- Reduce(append, dsList);
>
> # Fullname translator: Los Angeles/data1.csv => Los Angeles,data1.csv
> setFullNamesTranslator(ds, function(name, file, ...) {
>   path <- getPath(file);
>   paste(c(basename(path), name), collapse=",");
> });
>
>
>
> # - - - - - - - - - - - -
> # Examples
> # - - - - - - - - - - - -
> # Get the full names (a fullname consists of
> # a name and comma-separated tags)
>> getFullNames(ds)
> [1] "Los Angeles,data1" "Los Angeles,data2"
> [3] "New York,data1" "New York,data2"
>
> # Get the names
>> getNames(ds)
> [1] "Los Angeles" "Los Angeles"
> [3] "New York" "New York"
>
>> ds
> TabularTextFileSet:
> Name: Los Angeles
> Tags:
> Full name: Los Angeles
> Number of files: 4
> Names: Los Angeles, Los Angeles, New York, New York [4]
> Path (to the first file): deleteme/Los Angeles
> Total file size: 0.00 MB
> RAM: 0.01MB
>
>
> # Get 2nd file
>> df <- getFile(ds, 2)
>> df
>
> TabularTextFile:
> Name: Los Angeles
> Tags: data2
> Full name: Los Angeles,data2
> Pathname: deleteme/Los Angeles/data2.csv
> File size: 80 bytes
> RAM: 0.01 MB
> Number of data rows: 10
> Columns [2]: '', 'x'
> Number of text lines: 11
>
>
>
> # Read one data file
>> data <- readDataFrame(df)
>> data
>        x
> 1   1  1
> 2   2  2
> 3   3  3
> 4   4  4
> 5   5  5
> 6   6  6
> 7   7  7
> 8   8  8
> 9   9  9
> 10 10 10
>
>
> # Read all data files
>> dataList <- lapply(ds, readDataFrame)
>> dataList
> $`Los Angeles,data1`
>        x
> 1   1  1
> 2   2  2
> 3   3  3
> 4   4  4
> 5   5  5
> 6   6  6
> 7   7  7
> 8   8  8
> 9   9  9
> 10 10 10
>
> $`Los Angeles,data2`
>        x
> 1   1  1
> 2   2  2
> 3   3  3
> 4   4  4
> 5   5  5
> 6   6  6
> 7   7  7
> 8   8  8
> 9   9  9
> 10 10 10
>
> $`New York,data1`
>        x
> 1   1  1
> 2   2  2
> 3   3  3
> 4   4  4
> 5   5  5
> 6   6  6
> 7   7  7
> 8   8  8
> 9   9  9
> 10 10 10
>
> $`New York,data2`
>        x
> 1   1  1
> 2   2  2
> 3   3  3
> 4   4  4
> 5   5  5
> 6   6  6
> 7   7  7
> 8   8  8
> 9   9  9
> 10 10 10
>
> Most methods in R.filesets are currently poorly documented (no
> time/resources/...), but there is more in there than is documented, so
> feel free to ask if you have any questions.
>
> Hope this helps
>
> /Henrik
>
> On Fri, Mar 11, 2011 at 8:52 AM, Ista Zahn <iz...@psych.rochester.edu> wrote:
>> Hi helpeRs,
>>
>> I have inherited a set of data files that use the file system as a
>> sort of poor man's database, i.e., the data files are nested in
>> directories that indicate which city they come from. For example:
>>
>> dir.create("deleteme")
>> for(i in paste("deleteme", c("New York", "Los Angeles"), sep="/")) {
>>   dir.create(i)
>>   for(j in paste("data", 1:2, ".csv", sep="")) {
>>     write.csv(data.frame(x=1:10), file=paste(i, j, sep="/"))
>>   }
>> }
>>
>> list.files("deleteme", recursive=TRUE)
>>
>> What I want to end up with is
>>
>> x  city         wave
>> 1  New York     1
>> 1  Los Angeles  1
>> 1  New York     2
>> 1  Los Angeles  2
>>
>> I've started writing a simple function to do this, but it seems like
>> a common situation and I'm wondering if there are any packages or
>> functions that might make this easier.
>>
>> Thanks!
>> Ista
>> --
>> Ista Zahn
>> Graduate student
>> University of Rochester
>> Department of Clinical and Social Psychology
>> http://yourpsyche.org
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
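
For the archives, here is a rough base-R sketch of how the single combined data frame from the original question (x, city, wave) might be built. It does not use R.filesets and is not Henrik's method; it is just one plain alternative. Assumptions: the files are written under tempdir() rather than the working directory, they are written with row.names=FALSE so there is no unnamed first column, the wave number is taken from the digits in "dataN.csv", and read_city_file() is a hypothetical helper name.

```r
# Recreate the example layout from the question, but under tempdir()
# (assumption: the original wrote to "deleteme" in the working directory).
root <- file.path(tempdir(), "deleteme")
for (city in c("New York", "Los Angeles")) {
  dir.create(file.path(root, city), recursive = TRUE, showWarnings = FALSE)
  for (wave in 1:2) {
    # row.names = FALSE is an assumption made here to avoid an unnamed column
    write.csv(data.frame(x = 1:10),
              file = file.path(root, city, paste0("data", wave, ".csv")),
              row.names = FALSE)
  }
}

# Hypothetical helper: read one file and tag its rows with the city
# (name of the parent directory) and the wave (digits in the file name).
read_city_file <- function(pathname) {
  d <- read.csv(pathname)
  d$city <- basename(dirname(pathname))
  d$wave <- as.integer(sub("^data([0-9]+)\\.csv$", "\\1", basename(pathname)))
  d
}

# Find every CSV file below the root directory and stack the results.
pathnames <- list.files(root, pattern = "\\.csv$", recursive = TRUE,
                        full.names = TRUE)
combined <- do.call(rbind, lapply(pathnames, read_city_file))

# 2 cities x 2 waves x 10 observations
str(combined)
```

With the real files, which were written with row names, passing row.names=1 to read.csv() should drop the extra first column.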