Prof Brian Ripley writes:
> One other possible difference would be locale, but this is slow on FC3
> (2.3.4 now) in the C locale.  Almost all the time is in strptime:
> R profiling shows
>
> > summaryRprof()
> $by.self
>                     self.time self.pct total.time total.pct
> "strptime"              29.58     99.7      29.58      99.7
> "as.Date.character"      0.10      0.3      29.68     100.0
> "as.Date"                0.00      0.0      29.68     100.0
> "eval"                   0.00      0.0      29.68     100.0
> "system.time"            0.00      0.0      29.68     100.0
>
> Now on a glibc 2.3.x system R's internal replacement for strptime will be
> used (to work around bugs) so it must be some other part of the POSIX
> time-handling that has changed.
>
> The next step would be to do C-level profiling and then retrofit the
> crucial code from glibc 2.3.2.

Thanks for these suggestions.  C-level profiling yields the following:

  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 36.01      5.34     5.34   100000     0.00     0.00  get_locale_strings
  4.32      5.98     0.64   100000     0.00     0.00  mktime00
  3.98      6.57     0.59   277462     0.00     0.00  Rf_eval
  3.71      7.12     0.55   472935     0.00     0.00  Rf_findVarInFrame3
  3.64      7.66     0.54   100000     0.00     0.00  strptime_internal
  3.51      8.18     0.52        1     0.52     7.51  do_strptime

It looks like strftime is called from get_locale_strings, which might
be the culprit.  Any suggestions on where I might go from here?

> It does seem a pretty unusual application of R for 10^5 date conversions
> to be needed and for 30 secs to be an appreciable part of the analysis
> time on such a data set.

This is an issue for me when interactively loading a sizable
timeseries dataset into R from Postgres, converting character strings
into objects of class Date.

Thanks,

Jeff

>
> On Wed, 4 May 2005, Jeff Enos wrote:
>
> > R-devel,
> >
> > The performance of as.Date differs by a large degree between one of my
> > machines with glibc 2.3.2:
> >
> >> system.time(x <- as.Date(rep("01-01-2005", 100000), format = "%m-%d-%Y"))
> > [1] 1.17 0.00 1.18 0.00 0.00
> >
> > and a comparable machine with glibc 2.3.3:
> >
> >> system.time(x <- as.Date(rep("01-01-2005", 100000), format = "%m-%d-%Y"))
> > [1] 31.20 46.89 81.01  0.00  0.00
> >
> > both with the same R version:
> >
> >> R.version
> >          _
> > platform i686-pc-linux-gnu
> > arch     i686
> > os       linux-gnu
> > system   i686, linux-gnu
> > status
> > major    2
> > minor    1.0
> > year     2005
> > month    04
> > day      18
> > language R
> >
> > I'm focusing on differences in glibc versions because of as.Date's use
> > of strptime.
> >
> > Does it seem likely that the cause of this discrepancy is in fact
> > glibc?  If so, can anyone tell me how to make the performance of the
> > second machine more like the first?
> >
> > I have verified that using the chron package, which I don't believe
> > uses strptime, for the above character conversion performs equally
> > well on both machines.
>
> --
> Brian D. Ripley,                  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
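
On the observation above that strftime is called from
get_locale_strings, once per conversion according to the call counts:
if rebuilding the locale strings on every call really is the cost, one
possible remedy is to build them once and reuse them.  What follows is
only a minimal sketch of that caching idea in plain C.  It is not R's
source code.  Every name in it (cached_month_name,
invalidate_month_names, month_name_fill_count) is hypothetical, and it
caches only the full month names, whereas a real implementation would
presumably also need weekday names and AM/PM strings.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical names throughout; this is an illustration, not R's code. */

static char month_names[12][32];  /* cached full month names from "%B" */
static int  names_ready = 0;      /* 1 once the cache has been filled */
static int  fill_count  = 0;      /* number of times the cache was built */

/* Build the month-name table by calling strftime twelve times. */
static void fill_month_names(void)
{
    struct tm tm;
    memset(&tm, 0, sizeof tm);
    tm.tm_mday = 1;
    tm.tm_year = 100;             /* years since 1900, i.e. 2000 */
    for (int m = 0; m < 12; m++) {
        tm.tm_mon = m;
        if (strftime(month_names[m], sizeof month_names[m], "%B", &tm) == 0)
            month_names[m][0] = '\0';   /* name did not fit: store empty */
    }
    fill_count++;
}

/* Return the cached name for month 0..11, filling the cache on first use.
   Returns NULL for an out-of-range month. */
const char *cached_month_name(int month)
{
    if (month < 0 || month > 11)
        return NULL;
    if (!names_ready) {
        fill_month_names();
        names_ready = 1;
    }
    return month_names[month];
}

/* Call after a locale change so that the next lookup rebuilds the table. */
void invalidate_month_names(void)
{
    names_ready = 0;
}

/* Report how many times the table has been built (for checking only). */
int month_name_fill_count(void)
{
    return fill_count;
}

#ifdef CACHE_DEMO_MAIN
int main(void)
{
    /* 10^5 lookups, mirroring the number of conversions in the thread. */
    for (int i = 0; i < 100000; i++)
        (void) cached_month_name(i % 12);
    printf("%s, table built %d time(s)\n",
           cached_month_name(0), month_name_fill_count());
    return 0;
}
#endif
```

Compiling with -DCACHE_DEMO_MAIN gives a small program that performs
10^5 lookups and reports how often the table was built.  The trade-off
of such a cache is that it has to be invalidated whenever the locale
changes, which is why the sketch exposes invalidate_month_names()
rather than assuming the locale is fixed.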