Hello. I've got many (5-20k) files with time series in a text format like this:
1359635460 2.006747
1359635520 1.886745
1359635580 3.066988
1359635640 3.633578
1359635700 2.140082
1359635760 2.033564
1359635820 1.980123
1359635880 2.060131
1359635940 2.113416
1359636000 2.440172

The first field is a Unix timestamp, the second is a float. It's a text export of
http://graphite.readthedocs.org/en/latest/whisper.html databases. The time series
can have different resolutions, different start/end times, and possibly gaps inside.

My current way of importing them:

library(zoo)

## read one exported file into a one-column zoo series named after the file
read.file <- function(file.name) {
  read.zoo(
    file.name,
    na.strings="None",
    colClasses=c("integer", "numeric"),
    col.names=c("time", basename(file.name)),
    FUN=function(t) {as.POSIXct(t, origin="1970-01-01 00:00:00", tz="UTC")},
    drop=FALSE)
}

## read every file in `path` and merge all series on their time index
load.metrics <- function(path=".") {
  do.call(merge.zoo, lapply(list.files(path, full.names=TRUE), read.file))
}

This works for 6k time series with 2k points each, but fails with an out-of-memory
error on a 16 GB box when I try to import 10k time series with 10k points each.
I tried to make the merging incremental by using Reduce, but then the import speed
became unacceptable:

load.metrics <- function(path=".") {
  Reduce(
    function(a, b) {
      if (class(a) == "character") {
        a <- read.file(a)
      }
      merge.zoo(a, read.file(b))
    },
    list.files(path, full.names=TRUE))
}

Is there a faster and less memory-consuming way to import and merge a lot of time
series?

Regards,
Anton Lebedevich.
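
P.S. In case it helps to reproduce the problem, here is a minimal sketch that
generates a few small files in the format shown above and runs the import on them.
The write.sample helper is only for illustration (it is not part of my real setup,
which exports the files from whisper databases); it assumes the zoo-based
read.file/load.metrics functions from above are already defined.

library(zoo)

## hypothetical helper for illustration: writes `n.files` text files, each with
## `n.points` lines of "unix_timestamp value" at one-minute resolution, with
## staggered start times so the series do not line up exactly
write.sample <- function(dir, n.files=3, n.points=100) {
  dir.create(dir, showWarnings=FALSE)
  for (i in seq_len(n.files)) {
    start  <- 1359635460 + (i - 1) * 60
    times  <- start + seq(0, by=60, length.out=n.points)
    values <- round(runif(n.points, 1, 4), 6)
    ## interleave times and values so each output line is "time value"
    write(t(cbind(times, values)),
          file=file.path(dir, paste0("metric", i, ".txt")),
          ncolumns=2)
  }
}

set.seed(42)
write.sample("metrics")
str(load.metrics("metrics"))   # merges the sample series into one multi-column zoo object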