I got a similar performance result about one month ago.

The gvim test shows a large performance increase, about 360%. The other tests
show neither a regression nor an improvement.

I tested it on a 32-bit userland. I will test it again against the current git
code and post the data later.

Regards,
Xinyun

On Fri, Dec 03, 2010 at 12:00:22AM +0800, Siarhei Siamashka wrote:
 
> Just did some benchmarks with the 'gnome-system-monitor' cairo perf trace
> which happens to use 'src_x888_8888' operation. It looks like SSSE3 has
> about the same performance as SSE2, providing no measurable benefits on
> this use case. The low complexity of this operation and the hardware
> prefetcher make any differences between implementations much less
> noticeable in general.
> 
> Intel Atom N450, 64-bit userland, gcc 4.5.1
> 
> $ CAIRO_TEST_TARGET=image ./cairo-perf-trace gnome-system-monitor.trace 
> 
> ======= C slow path =========
> 
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [ # ]    image: pixman 0.21.3
> [  0]    image         gnome-system-monitor   15.011   15.034   0.08%    6/6
> 
> ======= C fast path [1] =========
> 
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [ # ]    image: pixman 0.21.3
> [  0]    image         gnome-system-monitor   14.659   14.697   0.20%    6/6
> 
> ======= SSE2 fast path [2] =====
> 
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [ # ]    image: pixman 0.21.3
> [  0]    image         gnome-system-monitor   14.431   14.496   0.19%    6/6
> 
> ======= SSSE3 fast path [3] ======
> 
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [ # ]    image: pixman 0.21.3
> [  0]    image         gnome-system-monitor   14.455   14.496   0.17%    6/6
> 
> ====== artificial test with just an empty stub for src_x888_8888 ========
> 
> [ # ]  backend                         test   min(s) median(s) stddev. count
> [ # ]    image: pixman 0.21.3
> [  0]    image         gnome-system-monitor   12.215   12.241   0.11%    6/6
> 
> ---
> 
> So I'm not sure if this SSSE3 code is very useful as is. But of course, if some
> practical use case makes heavy use of this function so that it works on
> data residing in L1/L2 caches, then it could show a big improvement.
> 
> 
> 1. http://cgit.freedesktop.org/pixman/commit/?id=16bae834
> 2. http://cgit.freedesktop.org/pixman/commit/?id=e0b430a1
> 3. http://lists.freedesktop.org/archives/pixman/2010-November/000742.html
> 
> -- 
> Best regards,
> Siarhei Siamashka

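For readers unfamiliar with the operation being benchmarked: 'src_x888_8888'
copies x8r8g8b8 pixels into an a8r8g8b8 destination, so the only per-pixel work
is forcing the alpha byte to 0xff. The sketch below is a minimal illustration in
C, not pixman's actual code; the function name copy_x888_to_8888 and its
single-scanline signature are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of pixman): copy one scanline of
 * x8r8g8b8 pixels to a8r8g8b8. The unused top byte of each source
 * pixel is replaced with a fully opaque alpha value.
 */
static void
copy_x888_to_8888 (uint32_t *dst, const uint32_t *src, size_t width)
{
    size_t i;

    for (i = 0; i < width; i++)
        dst[i] = src[i] | 0xff000000u;  /* one load, one OR, one store */
}
```

With one load, one OR and one store per pixel, a loop like this is likely to be
limited by memory bandwidth rather than arithmetic when the data does not fit
in cache, which is consistent with the small gap between the C, SSE2 and SSSE3
results quoted above.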
