On Thu, 05 Oct 2006 11:32:37 -0700, Carl Worth wrote:
> 
> Another tool that will be helpful to have is something for doing
> historical comparison over several runs. The simplest tool, and very
> useful, would be a "performance diff" that takes two runs and reports
> the difference, (perhaps only showing tests where the results differ
> more than a single standard deviation). I would use that kind of tool
> constantly to ensure that submitted patches provide the desired
> performance improvements.

I've written that program now. It's called cairo-perf-diff and it's
built in the cairo/perf directory. The Makefile won't install it or
anything, as I figure it's easy enough for interested people to just
manually copy it to ~/bin or whatever.

The interesting part of making this program work well is in what it
_doesn't_ show. Currently it discards as uninteresting any change
for which the mean values are not separated by more than 3 standard
deviations of each.

Ideally, that's the only kind of discarding we would do, but it's not
quite working well enough yet. So, currently I'm also discarding any
changes below a given threshold, (5% by default, but can also be
specified as the third argument on the command line).
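
As a rough illustration of the two discard rules described above, here
is a minimal sketch in Python (the tool itself lives in cairo's C
tree). The names PerfResult and is_interesting are hypothetical, and
two readings are assumptions: that the means must be separated by more
than 3 standard deviations of each run taken separately, and that the
percentage change is measured as how far the old/new ratio sits above
1.0.

```python
from dataclasses import dataclass


@dataclass
class PerfResult:
    """One row of a perf run (hypothetical structure, for illustration)."""
    name: str
    mean_ticks: float    # mean tick count over the iterations
    std_dev_frac: float  # standard deviation as a fraction of the mean


def is_interesting(old, new, min_change=0.05, sigmas=3.0):
    """Return True if the change from old to new should be reported."""
    if old.mean_ticks <= 0 or new.mean_ticks <= 0:
        return False  # no meaningful ratio to compute
    gap = abs(new.mean_ticks - old.mean_ticks)
    # Assumed reading: the means must differ by more than `sigmas`
    # standard deviations of *each* run, taken separately.
    if gap <= sigmas * old.std_dev_frac * old.mean_ticks:
        return False
    if gap <= sigmas * new.std_dev_frac * new.mean_ticks:
        return False
    # Assumed definition of the percentage change: how far the
    # larger/smaller ratio is above 1.0 (0.05 means 5%).
    hi = max(old.mean_ticks, new.mean_ticks)
    lo = min(old.mean_ticks, new.mean_ticks)
    return hi / lo - 1.0 >= min_change
```

Passing min_change=0.0 in this sketch leaves only the
standard-deviation test in effect.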

Even then, it's still not discarding all the noise. There's a really
easy test for this. Just run cairo-perf twice (saving the output from
each run into first.perf and second.perf) and then run:

        cairo-perf-diff first.perf second.perf 0.0

(That third argument, 0.0, forces it to discard only on the basis of
overlapping probability distributions, using the 3 standard deviations,
and not to discard things because the percentage change is too small.)

If everything were working correctly, the output from the above would
be empty, since there should be no interesting changes in the
performance results, (and any variation should be captured by the
reported standard deviations). But the results aren't empty yet.

I did some things to attempt to improve this already. For example, I've
made cairo-perf output the number of ticks it measures in addition to
the time in milliseconds it estimates, (based on an estimate of the CPU
frequency that it measures). So cairo-perf-diff computes only on the
ticks columns, (but puts the time in its output for readability).
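
To see why comparing ticks is safer than comparing the estimated
times, consider this small Python sketch. The function name
ticks_to_ms and the tick rates are made up for illustration; the only
point is that two runs with identical tick counts can report different
millisecond values if their frequency estimates differ.

```python
def ticks_to_ms(ticks, ticks_per_ms):
    """Convert a tick count to milliseconds using an estimated tick rate."""
    return ticks / ticks_per_ms


# The same measured work (identical tick counts) in two runs...
ticks_first = 8_000_000
ticks_second = 8_000_000

# ...but each run made its own, slightly different, frequency estimate.
# These rates are made-up numbers for illustration.
ms_first = ticks_to_ms(ticks_first, ticks_per_ms=2_000_000.0)
ms_second = ticks_to_ms(ticks_second, ticks_per_ms=1_990_000.0)

print(ticks_first == ticks_second)  # the tick comparison sees no change
print(f"{ms_first:.3f} ms vs {ms_second:.3f} ms")  # the times disagree
```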

I think other problems are the fixed-percentage outlier elimination
and early bailout based on a stably low standard deviation. I think
these prevent the standard deviation from capturing the true amount of
variation. I started some work to eliminate the early bailout and to
do adaptive outlier detection, (based on the conventional "1.5 times
the interquartile range above the third quartile or below the first
quartile" http://mathworld.wolfram.com/Outlier.html ).
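
For reference, the conventional rule quoted above can be sketched in a
few lines of Python. This is not the code in cairo-perf; split_outliers
is a hypothetical name, and the choice of quartile convention
("inclusive" here) is an assumption, since several conventions are in
common use and they can give slightly different fences.

```python
import statistics


def split_outliers(samples, k=1.5):
    """Split samples into (kept, outliers) using the k * IQR rule."""
    if len(samples) < 4:
        return list(samples), []  # too few points to estimate quartiles
    # Quartile conventions vary; "inclusive" is one common choice.
    q1, _, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [s for s in samples if low <= s <= high]
    outliers = [s for s in samples if s < low or s > high]
    return kept, outliers


# Made-up tick counts: one iteration was disturbed by something else.
kept, outliers = split_outliers([10, 11, 10, 12, 11, 10, 11, 50])
print(outliers)
```

Unlike a fixed-percentage cut, the fences here move with the spread of
the data, so a tight run discards little and a noisy run is not
trimmed to look artificially stable.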

I haven't succeeded at making great improvements along those lines,
(particularly in light of the fact that removing the early bailout
slows things down a lot). And I really need to start using this tool
to land cairo patches rather than continuing to develop it. So if anyone else wants
to improve things to try to get the command above to report nothing,
then that would be greatly appreciated.

In the meantime, here's a sample showing what the output can look
like. Here's what cairo-perf-diff gives me when I give it the results
of cairo-perf before and after the patch that Monty provided for
fixing the subimage_copy performance bug in cairo:

-Carl

Speedups
========
 xlib-rgba              subimage_copy-512    3.93 2.46% ->   0.07 2.71%: 52.91x faster
███████████████████████████████████████████████████▉
 xlib-rgb               subimage_copy-512    4.03 1.97% ->   0.09 2.61%: 44.74x faster
███████████████████████████████████████████▊
 xlib-rgba              subimage_copy-256    1.02 2.25% ->   0.07 0.56%: 14.42x faster
█████████████▍
 xlib-rgba        text_image_rgb_over-256   63.21 1.53% ->  11.87 2.17%:  5.33x faster
████▍
 xlib-rgba       text_image_rgba_over-256   62.31 0.72% ->  11.87 2.82%:  5.25x faster
████▎
 xlib-rgba     text_image_rgba_source-256   67.97 0.85% ->  16.48 2.23%:  4.13x faster
███▏
 xlib-rgba      text_image_rgb_source-256   68.82 0.55% ->  16.93 2.10%:  4.07x faster
███▏
 xlib-rgba              subimage_copy-128    0.19 1.72% ->   0.06 0.85%:  3.10x faster
██▏
 xlib-rgb         text_image_rgb_over-256  108.22 0.40% ->  57.47 0.37%:  1.88x faster
▉
 xlib-rgb        text_image_rgba_over-256  107.32 0.59% ->  57.32 0.78%:  1.87x faster
▉
 xlib-rgb       text_image_rgb_source-256  114.92 0.44% ->  61.73 0.79%:  1.86x faster
▉
 xlib-rgb      text_image_rgba_source-256  114.01 0.51% ->  61.69 0.51%:  1.85x faster
▉
 xlib-rgba              subimage_copy-64     0.11 2.24% ->   0.06 0.73%:  1.83x faster
▉
 xlib-rgb               subimage_copy-256    2.81 1.57% ->   1.65 1.19%:  1.71x faster
▊
 xlib-rgba        text_image_rgb_over-128    4.78 2.22% ->   2.85 1.06%:  1.68x faster
▋
 xlib-rgba       text_image_rgba_over-128    4.72 1.38% ->   2.83 0.92%:  1.67x faster
▋
 xlib-rgba      text_image_rgb_source-128    5.82 0.22% ->   3.92 0.57%:  1.48x faster
▌
 xlib-rgba     text_image_rgba_source-128    5.79 0.25% ->   3.93 1.56%:  1.47x faster
▌
 xlib-rgba       text_image_rgba_over-64     1.53 1.03% ->   1.13 0.42%:  1.35x faster
▍
 xlib-rgba        text_image_rgb_over-64     1.52 0.45% ->   1.13 1.15%:  1.34x faster
▍
 xlib-rgb               subimage_copy-64     0.25 1.04% ->   0.19 2.61%:  1.34x faster
▍
 xlib-rgb               subimage_copy-128    0.64 1.65% ->   0.50 1.09%:  1.27x faster
▎
 xlib-rgba      fill_radial_rgba_over-256    9.75 0.95% ->   7.81 2.55%:  1.25x faster
▎
 xlib-rgba        fill_image_rgb_over-256    2.56 0.77% ->   2.07 1.49%:  1.24x faster
▎
 xlib-rgba       fill_image_rgba_over-256    2.55 0.41% ->   2.06 1.01%:  1.23x faster
▎
 xlib-rgba      text_image_rgb_source-64     2.27 0.91% ->   1.88 0.20%:  1.21x faster
▎
 xlib-rgba       fill_radial_rgb_over-256    9.68 0.60% ->   8.17 0.51%:  1.18x faster
▏
 xlib-rgba     fill_image_rgba_source-256    3.95 2.11% ->   3.35 1.51%:  1.18x faster
▏
 xlib-rgba              subimage_copy-32     0.07 1.57% ->   0.06 0.91%:  1.17x faster
▏
 xlib-rgba     text_image_rgba_source-64     2.25 0.28% ->   1.92 1.57%:  1.17x faster
▏
 xlib-rgba      fill_image_rgb_source-256    3.85 0.39% ->   3.32 1.20%:  1.16x faster
▏
 xlib-rgb         text_image_rgb_over-64     4.60 2.34% ->   4.06 0.51%:  1.13x faster
▏
 xlib-rgb         text_image_rgb_over-128   16.05 1.57% ->  14.24 1.86%:  1.13x faster
▏
 xlib-rgb       text_image_rgb_source-128   17.20 2.02% ->  15.32 1.76%:  1.12x faster
▏
 xlib-rgb        text_image_rgba_over-64     4.54 0.71% ->   4.11 1.08%:  1.10x faster
▏
 xlib-rgb       text_image_rgb_source-64     5.03 0.35% ->   4.59 0.16%:  1.10x faster
▏
 xlib-rgba       fill_image_rgba_over-64     0.36 1.78% ->   0.33 0.61%:  1.09x faster
▏
 xlib-rgb      text_image_rgba_source-64     4.99 0.20% ->   4.61 0.49%:  1.08x faster
▏
 xlib-rgb               subimage_copy-32     0.11 1.24% ->   0.10 1.13%:  1.07x faster
▏
 xlib-rgba     fill_radial_rgb_source-128    2.54 0.44% ->   2.38 0.31%:  1.07x faster
▏
 xlib-rgba     fill_image_rgba_source-64     0.48 0.65% ->   0.45 0.58%:  1.07x faster
▏
 xlib-rgba       fill_radial_rgb_over-128    2.19 0.33% ->   2.06 1.00%:  1.06x faster
▏
 xlib-rgba      fill_image_rgb_source-64     0.48 0.60% ->   0.45 0.72%:  1.06x faster

Slowdowns
=========
 xlib-rgba  paint_similar_rgba_source-256    0.12 2.52% ->   0.16 2.81%:  1.33x slower
▍
image-rgba    paint_image_rgba_source-256    0.08 0.39% ->   0.10 2.45%:  1.25x slower
▎
image-rgba  paint_similar_rgba_source-256    0.09 0.38% ->   0.10 2.35%:  1.20x slower
▎
image-rgb        paint_solid_rgb_over-512    0.64 1.12% ->   0.74 1.57%:  1.17x slower
▏
image-rgb     paint_solid_rgba_source-512    0.64 1.21% ->   0.74 0.44%:  1.17x slower
▏
image-rgb      paint_solid_rgb_source-512    0.64 0.93% ->   0.74 0.59%:  1.16x slower
▏
image-rgb     paint_radial_rgb_source-512   53.05 2.18% ->  60.76 2.07%:  1.15x slower
▏
 xlib-rgba       text_radial_rgb_over-64     3.95 0.57% ->   4.48 1.09%:  1.14x slower
▏
image-rgba    paint_solid_rgba_source-512    0.66 1.65% ->   0.73 1.10%:  1.12x slower
▏
image-rgba     paint_solid_rgb_source-512    0.66 1.90% ->   0.73 0.74%:  1.11x slower
▏
image-rgb   paint_similar_rgba_source-256    0.26 1.09% ->   0.29 0.98%:  1.11x slower
▏
image-rgb       fill_radial_rgba_over-256    5.57 0.30% ->   6.11 0.24%:  1.10x slower
▏
image-rgb      paint_radial_rgba_over-512   55.79 1.42% ->  60.80 0.68%:  1.09x slower
▏
image-rgb     fill_radial_rgba_source-128    1.64 0.20% ->   1.78 0.15%:  1.09x slower
▏
image-rgb     fill_radial_rgba_source-256    6.02 0.49% ->   6.55 0.26%:  1.09x slower
▏
image-rgb       fill_radial_rgba_over-128    1.54 1.08% ->   1.66 0.15%:  1.07x slower
▏
image-rgb     fill_radial_rgba_source-64     0.56 0.47% ->   0.60 0.46%:  1.07x slower
▏
image-rgb      paint_image_rgb_source-256    0.08 0.39% ->   0.09 0.78%:  1.06x slower

image-rgb       fill_radial_rgba_over-64     0.53 0.14% ->   0.56 0.47%:  1.06x slower

 xlib-rgba     fill_radial_rgb_source-64     0.83 0.38% ->   0.88 0.46%:  1.05x slower
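
Judging only from the sample output above, each bar appears to be
(factor - 1.0) character cells long, rounded to the nearest eighth of a
cell, so that 5.25x draws four full blocks plus a quarter block. Here
is a Python sketch of that reading. It is inferred from the sample, not
taken from the cairo-perf-diff source, and the names change_factor and
render_bar are hypothetical. Note also that the tool computes its
factors from ticks, so dividing the rounded millisecond columns will
not reproduce the printed factors exactly.

```python
FULL_BLOCK = "█"
# Partial blocks for 1/8 through 7/8 of a cell; index 0 draws nothing.
EIGHTHS = ["", "▏", "▎", "▍", "▌", "▋", "▊", "▉"]


def change_factor(old, new):
    """Return (factor, direction) with factor >= 1.0."""
    if new < old:
        return old / new, "faster"
    return new / old, "slower"


def render_bar(factor):
    """Draw a bar whose length is (factor - 1.0) cells, to the nearest 1/8."""
    eighths = int((factor - 1.0) * 8 + 0.5)  # round to nearest eighth
    full, part = divmod(max(eighths, 0), 8)
    return FULL_BLOCK * full + EIGHTHS[part]


factor, direction = change_factor(108.22, 57.47)
print(f"{factor:.2f}x {direction}")
print(render_bar(factor))
```

Under this reading, a change smaller than about 1.06x rounds down to an
empty bar, which would explain the blank lines after the last few
entries.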


_______________________________________________
Performance-list mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/performance-list