Full_Name: Hong Ooi
Version: 2.10.0
OS: Windows XP
Submission from: (NULL) (203.110.235.1)

While trying to get summary statistics on a duration variable (the
difference between a start and end date), I ran into the following issue.
Using summary or quantile (which summary calls) on a difftime object takes
an extremely long time if the object is even moderately large.

A reproducible example:

> x <- as.Date(1:10000, origin="1900-01-01")
> x[1:10]
 [1] "1900-01-02" "1900-01-03" "1900-01-04" "1900-01-05" "1900-01-06"
 [6] "1900-01-07" "1900-01-08" "1900-01-09" "1900-01-10" "1900-01-11"
> d <- x - as.Date("1900-01-01")
> d[1:10]
Time differences in days
 [1]  1  2  3  4  5  6  7  8  9 10
> system.time(summary(d[1:10]))
   user  system elapsed
   0.01    0.00    0.01
> system.time(summary(d[1:100]))
   user  system elapsed
   0.21    0.00    0.20
> system.time(summary(d[1:1000]))
   user  system elapsed
   3.02    0.00    3.02
> system.time(summary(d[1:10000]))
   user  system elapsed
  43.56    0.04   43.66

If I unclass d, there is no problem:

> system.time(summary(unclass(d[1:10000])))
   user  system elapsed
      0       0       0

Testing with Rprof() indicates that the problem lies in [.difftime,
although the code for that function seems innocuous enough.

> sessionInfo()
R version 2.10.0 (2009-10-26)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
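
Since the unclassed vector is fast, one possible workaround is to compute the quantiles on the bare numeric values and reattach the units afterwards. This is only a minimal sketch, assuming that the numeric values and the units are all that matter for the summary. The variable names u, q_num and q are illustrative and do not come from the report above.

```r
# Build a difftime vector the same way as in the report (smaller here).
x <- as.Date(1:1000, origin = "1900-01-01")
d <- x - as.Date("1900-01-01")

# Remember the units, then compute quantiles on the plain numbers,
# which avoids dispatching to the difftime subsetting method.
u <- units(d)
q_num <- quantile(as.numeric(d))

# Reattach the difftime class and units to the result.
q <- as.difftime(q_num, units = u)
print(q)
```

The result is again a difftime, so it prints with its units, but the quantile computation itself never touches [.difftime.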