Small correction:

# to ensure 0, although it will be overwritten when assigning hour
origin = as.POSIXct("1970-01-01") - as.numeric(as.POSIXct("1970-01-01"))
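To illustrate why the subtraction matters, here is a minimal sketch. as.POSIXct("1970-01-01") is interpreted in the local time zone, so its numeric value need not be 0. The explicit tz = "Europe/London" argument and the names origin_local and origin_zero are assumptions chosen for illustration and do not appear in the original message.

```r
# Illustrative sketch only; the time zone and variable names are assumptions.
# Midnight on 1970-01-01 in a non-UTC zone is generally not 0 seconds
# since the epoch.
origin_local <- as.POSIXct("1970-01-01", tz = "Europe/London")

# Subtracting the object's own numeric value shifts it to exactly 0 seconds
# while keeping the POSIXct class.
origin_zero <- origin_local - as.numeric(origin_local)

print(as.numeric(origin_local))  # offset in seconds; depends on the time zone
print(as.numeric(origin_zero))   # 0 by construction
```

The zeroed origin only matters for fields that are not overwritten afterwards, which is why the comment notes that the hour is reassigned anyway.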
Dr Oleg Sklyar
Technology Group
Man Investments Ltd
+44 (0)20 7144 3803
[EMAIL PROTECTED]

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sklyar, Oleg (MI London)
> Sent: 10 April 2008 14:52
> To: R-devel@r-project.org
> Subject: [Rd] ISOdate/ISOdatetime performance suggestions, other date/time questions
>
> Dear list:
>
> working with date/times I have come across a problem that ISOdate and
> ISOdatetime are too slow on large vectors of data. I was surprised just
> until I looked at the implementation and the man page: "ISOdatetime and
> ISOdate are convenience wrappers for strptime". In other terms, they
> convert data to character representation first in order to create a
> POSIXlt object that is then converted to POSIXct. And POSIXct, i.e. the
> number of seconds since 1970, is really what one wants most often.
>
> Obviously this is not a bug, but it is really a suboptimal
> implementation of a pretty important function as the example below
> shows.
>
> Now my questions are:
>
> - any chance that the implementation can be changed in R (suggested,
>   well tz needs to be added)?
> - is there a better pure-R (no-C) way than that shown below to convert
>   to POSIXct?
> - any idea why in the example below fooling R into thinking a list is
>   POSIXlt is faster than just creating a POSIXlt by rep or seq? It's not
>   a huge difference, but still. Unfortunately seq on POSIXlt returns
>   POSIXct anyway, so the class of 'origin' is set correctly.
> - any idea why seq is faster than rep when applied on POSIXct? There is
>   hardly anything simpler than on double values...
>
> Thanks in advance for your comments,
> Oleg
>
> It's common in finance to work with time stamps stored in a form like
> %Y%m%d.%H%M%OS, e.g.
> 20080410.140444 for now, this is what 'ts' in the example below is:
>
> ts = 1e4*trunc(rnorm(50000,2008,2)) + 1e2*trunc(runif(50000,1,12)) +
>      trunc(runif(50000,1,28)) + 1e-2*trunc(runif(50000,1,24)) +
>      1e-4*trunc(runif(50000,1,60)) + 1e-6*runif(50000,1,60)
>
> posix.viaISOdate = function(x) {
>   date = trunc([EMAIL PROTECTED])
>   time = round([EMAIL PROTECTED],2)
>   rtime = round(time)
>   z = list(sec=rtime%%1e2 + time%%1,
>            min=(rtime%/%1e2)%%1e2,
>            hour=rtime%/%1e4,
>            mday=date%%100,
>            mon=(date%/%100)%%100,
>            year=date%/%10000)
>   ISOdate(z$year,z$mon,z$mday,z$hour,z$min,z$sec) # to POSIXct
> }
>
> ## This is just a test of how is it faster to create a long POSIXlt object
> ## before another implementations are given
>
> origin = as.POSIXct("1970-01-01")
>
> mean(sapply(1:25,function(i) system.time(
>   as.POSIXlt(rep(origin,600000))
> ))[1,])
> # [1] 0.3972
>
> mean(sapply(1:25,function(i) system.time(
>   as.POSIXlt(seq(origin, origin, length.out=600000))
> ))[1,])
> # [1] 0.30528
>
> posix.viaPOSIXlt1 = function(x) {
>   origin = as.POSIXct("1970-01-01")
>   z = as.POSIXlt(seq(origin, origin, length.out=length(x)))
>   date = trunc([EMAIL PROTECTED])
>   time = round([EMAIL PROTECTED],2)
>   rtime = round(time)
>   z$sec=rtime%%1e2 + time%%1
>   z$min=(rtime%/%1e2)%%1e2
>   z$hour=rtime%/%1e4
>   z$mday=date%%100
>   z$mon=(date%/%100)%%100-1
>   z$year=date%/%10000-1900
>   as.double(z) # to POSIXct
> }
>
> posix.vialist = function(x) {
>   date = trunc([EMAIL PROTECTED])
>   time = round([EMAIL PROTECTED],2)
>   rtime = round(time)
>   na = rep(0.0,length(x))
>   z = list(sec=rtime%%1e2 + time%%1,
>            min=(rtime%/%1e2)%%1e2,
>            hour=rtime%/%1e4,
>            mday=date%%100,
>            mon=(date%/%100)%%100-1,
>            year=date%/%10000-1900,
>            wday=na,yday=na,isdst=na)
>   class(z) = c("POSIXt","POSIXlt")
>   as.double(z) # to POSIXct
> }
>
> v1 = posix.viaISOdate(ts)
> v2 = posix.viaPOSIXlt1(ts)
> v3 = posix.vialist(ts)
>
> all(v1==v2 & v2==v3)
> # [1] TRUE
>
> mean(sapply(1:25,function(i) system.time(
>   system.time(posix.viaISOdate(ts))
> ))[1,])
> # [1] 1.54244
>
> mean(sapply(1:25,function(i) system.time(
>   system.time(posix.viaPOSIXlt1(ts))
> ))[1,])
> # [1] 0.37624
>
> mean(sapply(1:25,function(i) system.time(
>   system.time(posix.vialist(ts))
> ))[1,])
> # [1] 0.35488
>
> > sessionInfo()
> R version 2.6.2 (2008-02-08)
> x86_64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=C;
> LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;
> LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;
> LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] rcompgen_0.1-17

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel