I wonder if there's also an effect of the CPU cache...

Andy
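For what it's worth, here is a rough sketch of one way to separate a first-run effect from the steady-state timings. It assumes an R version where system.time() accepts the `gcFirst' argument that Roger mentions for R-devel. The helper name time_repeated is made up for illustration, and only the base-R paste() method is timed so that nothing beyond base R is needed. It says nothing about CPU cache as such; it only shows how much the first call differs from the later ones once a garbage collection is forced before each timing.

```r
# Hypothetical helper (not from the thread): time f() several times,
# forcing a garbage collection before each run so that a pending GC
# is not charged to whichever call happens to come first.
time_repeated <- function(f, reps = 5) {
  sapply(seq_len(reps), function(i) {
    system.time(f(), gcFirst = TRUE)[["elapsed"]]
  })
}

# Same kind of test data as in the quoted code, with a smaller n
# so the sketch runs quickly.
n <- 10000
y <- sample(1970:2004, n, replace = TRUE)
m <- sample(1:12, n, replace = TRUE)
d <- sample(1:28, n, replace = TRUE)

# The "standard" conversion from the quoted message.
paste_method <- function() as.POSIXlt(paste(y, m, d, sep = "-"))

elapsed <- time_repeated(paste_method, reps = 5)
print(elapsed)               # first entry is the "cold" run
print(median(elapsed[-1]))   # typical time once warmed up
```

When comparing two methods, running each through the same helper and reporting the median of the later runs should make the result much less sensitive to which one was called first.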
> From: Roger D. Peng
>
> I think the first time is potentially much slower because of a
> garbage collection.  R-devel has a flag `gcFirst' for
> system.time() which (I think) forces a garbage collection before
> timing.
>
> -roger
>
> Patrick Connolly wrote:
> > I tried the code that Richard O'Keefe posted last week, to wit:
> >
> >     library(chron)
> >     ymd.to.POSIXlt <-
> >         function (y, m, d) as.POSIXlt(chron(julian(y=y, x=m, d=d)))
> >     n <- 100000
> >     y <- sample(1970:2004, n, replace=TRUE)
> >     m <- sample(1:12, n, replace=TRUE)
> >     d <- sample(1:28, n, replace=TRUE)
> >     system.time(ymd.to.POSIXlt(y, m, d))
> >     [1]  8.78  0.10 31.76  0.00  0.00
> >     system.time(as.POSIXlt(paste(y,m,d, sep="-")))
> >     [1] 14.64  0.13 53.30  0.00  0.00
> >
> > On a somewhat newer machine, I got
> >
> > $ R --vanilla
> >
> > R : Copyright 2004, The R Foundation for Statistical Computing
> > Version 1.9.0 (2004-04-12), ISBN 3-900051-00-3
> >
> > [...]
> >
> > > library(chron)
> > > ymd.to.POSIXlt <-
> > +     function (y, m, d) as.POSIXlt(chron(julian(y=y, x=m, d=d)))
> > > n <- 100000
> > > y <- sample(1970:2004, n, replace=TRUE)
> > > m <- sample(1:12, n, replace=TRUE)
> > > d <- sample(1:28, n, replace=TRUE)
> > >
> > > system.time(ymd.to.POSIXlt(y, m, d))
> > [1] 1.67 0.24 2.01 0.00 0.00
> > > system.time(as.POSIXlt(paste(y,m,d, sep="-")))
> > [1] 3.06 0.02 3.08 0.00 0.00
> >
> > But then I tried a few more times...
> >
> > > system.time(ymd.to.POSIXlt(y, m, d))
> > [1] 1.09 0.04 1.13 0.00 0.00
> > > system.time(ymd.to.POSIXlt(y, m, d))
> > [1] 1.11 0.09 1.20 0.00 0.00
> >
> > The second time is a lot faster, but subsequent ones don't
> > "improve further".
> >
> > But with the "standard" function,
> >
> > > system.time(as.POSIXlt(paste(y,m,d, sep="-")))
> > [1] 2.64 0.02 2.66 0.00 0.00
> > > system.time(as.POSIXlt(paste(y,m,d, sep="-")))
> > [1] 2.82 0.03 2.85 0.00 0.00
> >
> > ... it does improve slightly but rather a lot less.
> >
> > THEN
> >
> > If I compare the two methods in the reverse order,
> >
> > $ R --vanilla
> >
> > R : Copyright 2004, The R Foundation for Statistical Computing
> > Version 1.9.0 (2004-04-12), ISBN 3-900051-00-3
> >
> > [....]
> >
> > > library(chron)
> > > ymd.to.POSIXlt <-
> > +     function (y, m, d) as.POSIXlt(chron(julian(y=y, x=m, d=d)))
> > > n <- 100000
> > > y <- sample(1970:2004, n, replace=TRUE)
> > > m <- sample(1:12, n, replace=TRUE)
> > > d <- sample(1:28, n, replace=TRUE)
> > > system.time(as.POSIXlt(paste(y,m,d, sep="-")))
> > [1] 3.66 0.02 3.76 0.00 0.00
> > > system.time(ymd.to.POSIXlt(y, m, d))
> > [1] 1.65 0.05 1.70 0.00 0.00
> > >
> > > system.time(as.POSIXlt(paste(y,m,d, sep="-")))
> > [1] 2.59 0.02 2.61 0.00 0.00
> > > system.time(as.POSIXlt(paste(y,m,d, sep="-")))
> > [1] 2.73 0.00 2.74 0.00 0.00
> > > system.time(ymd.to.POSIXlt(y, m, d))
> > [1] 1.29 0.01 1.30 0.00 0.00
> > > system.time(ymd.to.POSIXlt(y, m, d))
> > [1] 0.94 0.00 0.94 0.00 0.00
> > > system.time(ymd.to.POSIXlt(y, m, d))
> > [1] 1.06 0.01 1.07 0.00 0.00
> >
> > It seems as though the first simulation makes it "easier" for
> > subsequent simulations of the same type AND also for simulations of a
> > somewhat different type also.  The degree to which it "helps" varies
> > according to just what is being run (no surprise there).  What I can't
> > figure out is what is happening that makes it quicker for second and
> > subsequent runs.
> >
> > I even tried doing a gc() and setting seeds before each run to make a
> > more direct comparison, but it made no difference other than being
> > slightly less variable.  I have seen a similar phenomenon in other
> > types of simulations.
> >
> > In the case of this code, it makes no difference whether n is 100 or
> > 10000000.  Would that be attributable to lazy evaluation?
> >
> > > version
> >          _
> > platform i686-pc-linux-gnu
> > arch     i686
> > os       linux-gnu
> > system   i686, linux-gnu
> > status
> > major    1
> > minor    9.0
> > year     2004
> > month    04
> > day      12
> > language R
> >
> > It's not exactly a problem, but it could have a bearing on comparing
> > processing times which is something that happens from time to time.
> > In the comparison that gave rise to the code above, the order would
> > have made a substantial difference to the perceived effectiveness of
> > Richard's code.
> >
>
> --
> Roger D. Peng
> http://www.biostat.jhsph.edu/~rpeng/
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
