I was pondering this perl-core todo item:

    =head2 A decent benchmark
    C<perlbench> seems impervious to any recent changes made to the
    perl core.  It would be useful to have a reasonable general
    benchmarking suite that roughly represented what current perl
    programs do, and measurably reported whether tweaks to the core
    improve, degrade or don't really affect performance, to guide
    people attempting to optimise the guts of perl.  Gisle would
    welcome new tests for perlbench.

IMO, the strategy there is:

 - ghetto-ize the problem - leave it to perlbench
 - tighten up the measurements, including an empty-loop baseline /
   calibration (also done by Benchmark)
 - hammer specific language constructs, hoping to stress specific
   code paths

What's notable in its absence is any *real* use of perl-dist's tests,
i.e. the core test suite itself.  I dug into the code and found that
this already works:

 $> HARNESS_TIMER=1 make test

 ext/threads/t/end.............................................ok      60 ms
 ext/threads/t/err.............................................ok    1090 ms
 ext/threads/t/exit............................................ok     365 ms
 ext/threads/t/free2...........................................ok    2367 ms
 ext/threads/t/free............................................ok   12383 ms

so clearly:

 - somebody thought of this already ;-)
 - does Test::* support this feature?
 - t/harness doesn't, but HARNESS_OPTIONS could allow a 't'
 - it wouldn't be too hard to collect this data
 - it could be on by default for -Dusedevel builds, if t/TEST (etc.)
   saved the data persistently - Storable, SQLite files, etc.

For example:

 $data = {
     "./perl -Ilib -V" => `./perl -Ilib -V`,
     ok_times  => { ... },   # successful tests => times
     nok_tests => [ ... ],   # for completeness
     platform_static => { "/proc/cpuinfo" => `cat /proc/cpuinfo` },
     platform_load   => { "uptime" => [ `uptime && sleep 5` ] },  # notionally, sampled repeatedly over the run
 };

... and collected over time, say into
~/.perl-core-hacking/blead-run-times.stor.  Each bleader (assuming
this is on by default in -Dusedevel builds) would:

 - collect significant longitudinal data on their favorite build
   (for example: -des)
 - have the same data (just less of it) for other systematic and/or
   ad-hoc configs

If we then integrate that data, there'd be enough info (eventually) to:

 - see statistical groupings around important build options;
   -D/-Uusethreads and -U/-DDEBUGGING are 2 obvious ones
 - see statistical groupings around platforms
 - see the fuzz due to system-load noise, etc. (finally satisfying,
   not frustrating)
 - see which tests are more/less load- and build-option-dependent
 - etc.

This kind of thing could be useful for CPAN modules too, presumably
through some optional keyword-ish:

 perf_test_results_to => foo-module-p...@cpan.org,

So, I'm attaching a strawman patch to t/TEST which collects the
timing info.
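The patch below only captures per-run numbers; the longitudinal part
is still notional.  As a minimal sketch of what the accumulation step
could look like (the add_run name, file layout, and append-a-hashref
policy are all my assumptions, not part of the patch):

 use strict;
 use warnings;
 use Storable qw(retrieve store);

 # hypothetical helper -- append one run's results (a hashref shaped
 # like $data above) to the longitudinal store
 sub add_run {
     my ($run) = @_;
     my $file  = "$ENV{HOME}/.perl-core-hacking/blead-run-times.stor";
     my $runs  = -e $file ? retrieve($file) : [];
     push @$runs, $run;    # one entry per make-test run
     store($runs, $file);
 }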
Running it a bunch of times and averaging the results, I get data
like this:

 run/switchF.t    19  avg: 0.005  sdev: 0.000  fuzz: 0.08
 run/switchF1.t   19  avg: 0.016  sdev: 0.009  fuzz: 0.54
 run/switchI.t    19  avg: 0.047  sdev: 0.013  fuzz: 0.28
 run/switcha.t    19  avg: 0.006  sdev: 0.003  fuzz: 0.42
 run/switchd.t    19  avg: 0.075  sdev: 0.027  fuzz: 0.36
 run/switches.t   19  avg: 0.555  sdev: 0.048  fuzz: 0.09
 run/switchn.t    19  avg: 0.007  sdev: 0.006  fuzz: 0.84
 run/switchp.t    19  avg: 0.007  sdev: 0.004  fuzz: 0.53
 run/switcht.t    19  avg: 0.057  sdev: 0.018  fuzz: 0.31
 run/switchx.t    19  avg: 0.062  sdev: 0.020  fuzz: 0.33
 uni/cache.t      19  avg: 0.115  sdev: 0.055  fuzz: 0.48
 uni/chomp.t      19  avg: 0.837  sdev: 0.261  fuzz: 0.31
 uni/chr.t        19  avg: 0.382  sdev: 0.168  fuzz: 0.44
 uni/class.t      19  avg: 4.028  sdev: 3.601  fuzz: 0.89
 uni/fold.t       19  avg: 0.807  sdev: 0.369  fuzz: 0.46
 uni/greek.t      19  avg: 0.247  sdev: 0.127  fuzz: 0.51
 uni/latin2.t     19  avg: 0.164  sdev: 0.032  fuzz: 0.19
 uni/lex_utf8.t   19  avg: 0.112  sdev: 0.067  fuzz: 0.60
 uni/lower.t      19  avg: 0.386  sdev: 0.065  fuzz: 0.17
 uni/overload.t   19  avg: 0.617  sdev: 0.321  fuzz: 0.52
 uni/sprintf.t    19  avg: 0.035  sdev: 0.019  fuzz: 0.55
 uni/tie.t        19  avg: 0.075  sdev: 0.072  fuzz: 0.96
 uni/title.t      19  avg: 0.269  sdev: 0.069  fuzz: 0.26
 uni/tr_7jis.t    19  avg: 0.135  sdev: 0.078  fuzz: 0.58
 uni/tr_eucjp.t   19  avg: 0.117  sdev: 0.037  fuzz: 0.31

(19 make-test runs --> 19 elapsed-time samples per test)

"fuzz" is standard-deviation / avg, i.e. the coefficient of variation
-- low fuzz means the test's runtimes are quite consistent across the
sample set (see the sketch after the list below).  The low-fuzz tests
are the good ones for detecting actual runtime improvements, since
changes there are less hidden in the noise.

I'd like to discuss all of this, towards:

 - devising an acceptable interface to use in t/TEST, presumably a
   require plus a fully-qualified function call
 - fitting the same into Test::* && TAP, as applicable
 - whether the body of data we could get is useful - if we collect
   it, will they come?
 - how this fits with other plans for blead, Test::, etc.
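As referenced above, a minimal standalone sketch of how the
avg/sdev/fuzz columns are computed (the sample values and test name
are illustrative only, not real measurements):

 use strict;
 use warnings;

 # avg (mean), sdev (population standard deviation), and
 # fuzz = sdev / avg, for one test's elapsed times in seconds
 sub stats {
     my @t = @_;
     my $n = @t or die "no samples\n";
     my $avg = 0;
     $avg += $_ / $n for @t;
     my $var = 0;
     $var += ($_ - $avg) ** 2 / $n for @t;
     my $sdev = sqrt($var);
     my $fuzz = $avg ? $sdev / $avg : 0;
     return ($avg, $sdev, $fuzz);
 }

 # illustrative -- in practice, 19 values per test, one per run
 my @samples = (0.115, 0.098, 0.171, 0.089, 0.120);
 printf "some/test.t  %2d  avg: %.3f  sdev: %.3f  fuzz: %.2f\n",
        scalar(@samples), stats(@samples);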
diff --git a/t/TEST b/t/TEST
index 2634485..e75c7bb 100755
--- a/t/TEST
+++ b/t/TEST
@@ -19,6 +19,7 @@ delete $ENV{PERL5LIB};
 # remove empty elements due to insertion of empty symbols via "''p1'" syntax
 @ARGV = grep($_,@ARGV) if $^O eq 'VMS';
 our $show_elapsed_time = $ENV{HARNESS_TIMER} || 0;
+our %elapsed; # collect actuals
 
 # Cheesy version of Getopt::Std. Maybe we should replace it with that.
 {
@@ -32,6 +33,7 @@ our $show_elapsed_time = $ENV{HARNESS_TIMER} || 0;
     $::with_utf8 = 1 if $1 eq 'utf8';
     $::with_utf16 = 1 if $1 eq 'utf16';
     $::taintwarn = 1 if $1 eq 'taintwarn';
+    $::show_elapsed_time = 1 if $1 eq 'times'; # alt for HARNESS_TIMER
     $ENV{PERL_CORE_MINITEST} = 1 if $1 eq 'minitest';
     if ($1 =~ /^deparse(,.+)?$/) {
         $::deparse = 1;
@@ -70,6 +72,25 @@ $ENV{EMXSHELL} = 'sh';	# For OS/2
 use TestInit;
 use File::Spec;
 if ($show_elapsed_time) { require Time::HiRes }
+END { # maybe collect runtime data
+    return unless -d "$ENV{HOME}/.p5p"; # no dest-dir
+    # collect additional ID-ish info: branch/HEAD, build-conf, etc
+    # require Devel::Test
+    # $elapsed{platform} = Devel::Test::platform_info();
+    # write it if desired format is given
+    if ($show_elapsed_time =~ /stor/i) {
+        require Storable;
+        Storable::store(\%elapsed,
+                        "$ENV{HOME}/.p5p/mktest-perf.$$.stor");
+    }
+    elsif ($show_elapsed_time =~ /Dumper/i) {
+        require Data::Dumper;
+        $Data::Dumper::Sortkeys = 1;
+        open(my $rtdata, ">$ENV{HOME}/.p5p/mktest-perf.$$")
+            or die "can't write perf-data: $!\n";
+        print $rtdata Data::Dumper::Dumper(\%elapsed);
+    }
+}
 
 my $curdir = File::Spec->curdir;
 my $updir = File::Spec->updir;
@@ -508,12 +529,11 @@ EOT
     }
     else {
         if ($max) {
-            my $elapsed;
+            my $elapsed = "";
             if ( $show_elapsed_time ) {
-                $elapsed = sprintf( " %8.0f ms", (Time::HiRes::time() - $test_start_time) * 1000 );
-            }
-            else {
-                $elapsed = "";
+                my $et = Time::HiRes::time() - $test_start_time;
+                $elapsed = sprintf( " %8.0f ms", $et * 1000 );
+                $elapsed{$test} = $et;
             }
             print "${te}ok$elapsed\n";
             $good_files++;
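For clarity, this is how I've been driving the patch (reading its
regexes literally: any HARNESS_TIMER value matching /stor/i selects
the Storable output path, and ~/.p5p must already exist):

 $> mkdir -p ~/.p5p
 $> HARNESS_TIMER=stor make test
 # each run leaves a ~/.p5p/mktest-perf.<pid>.stor file

and a one-liner to eyeball a collected file (dumps the first matching
file; the pid in the name varies per run):

 $> perl -MStorable -MData::Dumper -e 'print Dumper(Storable::retrieve(shift))' ~/.p5p/mktest-perf.*.stor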