Please explain the big picture. I don't see why you're attempting to
do.call merge.

> Thanks for your reply, Matthew.
> On 10 ts, 10000 values each, it takes 5.7 seconds to reshape. I'd like
> to cut that time at least in half, which would reduce my total batch
> time by approximately 10 minutes (out of 70 minutes total).
>
> I'm trying to do something like:
> do.call(merge, x[, .SD, by=ID]) but data.table is not designed to work
> this way (it returns a data.table); there is no problem in data.table
> itself.
> I'm trying to extract K (10) data.tables from a data.table with keys ID,
> DATE and then CJ.
>
>
> Thanks in advance for any help.
> Best regards,
> Daniele
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Matthew
> Dowle
> Sent: 05 April 2011 14:20
> To: [email protected]
> Subject: Re: [datatable-help] How to speed up grouping time series,help
> please
>
> It's easier to help if you provide timings along with your example
> reproducible code, please.
> How long is it taking, and how long do you think it should take?
> Please also try to avoid phrases such as "without success". Does that mean
> you got an error message (if so, what was it) or wrong result (if so, what
> was wrong)?
> Matthew
>
> "Daniele Amberti" <[email protected]> wrote in message
> news:[email protected]...
>>I retrieve, a few hundred times, a group of time series (10-15 ts
>>with 10000 values each); on every group I do some calculation, graphs
>>etc. I wonder if there is a faster method than the one presented below to
>>get an appropriate timeseries object.
>>
>> Making a query with RODBC for every group I get a data frame like this:
>>
>> > X
>>    ID                DATE     VALUE
>> 14  3 2000-01-01 00:00:03 0.5726334
>>  4  1 2000-01-01 00:00:03 0.8830174
>>  1  1 2000-01-01 00:00:00 0.2875775
>> 15  3 2000-01-01 00:00:04 0.1029247
>> 11  3 2000-01-01 00:00:00 0.9568333
>>  9  2 2000-01-01 00:00:03 0.5514350
>>  7  2 2000-01-01 00:00:01 0.5281055
>>  6  2 2000-01-01 00:00:00 0.0455565
>> 12  3 2000-01-01 00:00:01 0.4533342
>>  8  2 2000-01-01 00:00:02 0.8924190
>>  3  1 2000-01-01 00:00:02 0.4089769
>> 13  3 2000-01-01 00:00:02 0.6775706
>>
>> And I want to get a timeSeries object or xts object like this:
>>
>>                             1         2         3
>> 2000-01-01 00:00:00 0.2875775 0.0455565 0.9568333
>> 2000-01-01 00:00:01        NA 0.5281055 0.4533342
>> 2000-01-01 00:00:02 0.4089769 0.8924190 0.6775706
>> 2000-01-01 00:00:03 0.8830174 0.5514350 0.5726334
>> 2000-01-01 00:00:04        NA        NA 0.1029247
>>
>> Both classes accept a matrix, so if I can create a matrix like the one
>> shown above, plus a character vector of dates, faster than is possible
>> with xts:::merge, for example, I will have a faster implementation. This
>> is why I'm writing to datatable-help: I read the vignettes and tests, and
>> tried to generate a set of data.tables (using .SD and by = ID) and then
>> CJ, but without success so far. Any input to test this approach will be
>> greatly appreciated.
>>
>> Input data can be sorted or unsorted (the example shows the most
>> complicated case: unsorted with missing data); I can sort in the query
>> if that gives an advantage.
>>
>> Below is some code to generate the test case above.
>>
>> Thanks in advance for any input, best regards, Daniele
>>
>>
>> set.seed(123)
>> N <- 100 # number of observations, use 5 to replicate test case above
>> K <- 3 # number of timeseries ID
>>
>> X <- data.frame(
>>   ID = rep(1:K, each = N),
>>   DATE = as.character(rep(as.POSIXct("2000-01-01", tz = "GMT") + 0:(N-1), K)),
>>   VALUE = runif(N*K), stringsAsFactors = FALSE)
>>
>> X <- X[sample(1:(N*K), N*K),] # sample observations to get random order (optional)
>> X <- X[-(sample(1:nrow(X), floor(nrow(X)*0.2))),] # 20% missing
>>
>> head(X, 15)
>>
>>
>> # an implementation in xts:
>> xtsSplit <- function(x)
>> {
>>   library(xts)
>>   x <- xts(x[,c("ID","VALUE")], as.POSIXct(x[,"DATE"]))
>>   x <- do.call(merge, split(x$VALUE,x$ID))
>>   return(x)
>> }
>>
>> xtsSplitTime <- replicate(50,
>>   system.time(xtsSplit(X))[[1]])
>> median(xtsSplitTime)
>>
>>
>> ORS Srl
>>
>> Via Agostino Morando 1/3 12060 Roddi (Cn) - Italy
>> Tel. +39 0173 620211
>> Fax. +39 0173 620299 / +39 0173 433111
>> Web Site www.ors.it
>>
>> ------------------------------------------------------------------------
>> Any unauthorized use of this message and its attachments is prohibited
>> and may constitute a criminal offence.
>> If you have received this message in error, we would be grateful if you
>> would destroy it and any attachments.
>> Opinions, conclusions or other information contained in this e-mail that
>> do not relate to the business and/or corporate mission of O.R.S. Srl are
>> not to be attributed to the company, nor do they bind it in any way.
>> _______________________________________________
>> datatable-help mailing list
>> [email protected]
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
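For reference, the reshape asked about in this thread (long ID/DATE/VALUE rows into a DATE x ID matrix, with NA where an observation is missing) can be sketched in base R using matrix indexing. This is only an illustration, not the answer given on the list: the helper name longToWide is made up here, it assumes each (ID, DATE) pair occurs at most once, and it relies on the dates being ISO-formatted character strings so that sorting them as text is chronological.

```r
# Hypothetical helper (not from the thread): reshape long ID/DATE/VALUE
# data into a DATE x ID matrix, leaving NA where an observation is missing.
# Assumes each (ID, DATE) pair appears at most once.
longToWide <- function(x) {
  ids   <- sort(unique(x$ID))
  dates <- sort(unique(x$DATE))  # ISO-formatted strings sort chronologically
  m <- matrix(NA_real_, nrow = length(dates), ncol = length(ids),
              dimnames = list(dates, as.character(ids)))
  # Two-column index matrix: one (row, column) pair per observation
  m[cbind(match(x$DATE, dates), match(x$ID, ids))] <- x$VALUE
  m
}

# The 12-row example from the original post, in its unsorted order
X <- data.frame(
  ID = c(3, 1, 1, 3, 3, 2, 2, 2, 3, 2, 1, 3),
  DATE = paste("2000-01-01",
               c("00:00:03", "00:00:03", "00:00:00", "00:00:04", "00:00:00",
                 "00:00:03", "00:00:01", "00:00:00", "00:00:01", "00:00:02",
                 "00:00:02", "00:00:02")),
  VALUE = c(0.5726334, 0.8830174, 0.2875775, 0.1029247, 0.9568333, 0.5514350,
            0.5281055, 0.0455565, 0.4533342, 0.8924190, 0.4089769, 0.6775706),
  stringsAsFactors = FALSE)

W <- longToWide(X)
print(W)
```

The matrix W and as.POSIXct(rownames(W), tz = "GMT") could then be handed to xts(). Whether this is actually faster than do.call(merge, split(...)) on 10 series of 10000 values would have to be timed on the real data.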
