Re: [Pixman] [PATCH 1/2] Add empty SSSE3 implementation

2013-09-05 Thread Matt Turner
On Thu, Aug 29, 2013 at 10:02 AM, Søren Sandmann Pedersen wrote:
> This commit adds a new, empty SSSE3 implementation and the associated
> build system support.
>
> configure.ac: detect whether the compiler understands SSSE3
> intrinsics and set up the required CFLAGS
>
> Makefil

Re: [Pixman] [PATCH 1/2] Add empty SSSE3 implementation

2013-09-05 Thread Siarhei Siamashka
On Thu, 29 Aug 2013 13:02:52 -0400 "Søren Sandmann Pedersen" wrote:
> This commit adds a new, empty SSSE3 implementation and the associated
> build system support.
>
> configure.ac: detect whether the compiler understands SSSE3
> intrinsics and set up the required CFLAGS
>
> M

Re: [Pixman] [PATCH 2/2] ssse3: Add iterator for separable bilinear scaling

2013-09-05 Thread Siarhei Siamashka
On Thu, 29 Aug 2013 13:02:53 -0400 "Søren Sandmann Pedersen" wrote:
> This new iterator uses the SSSE3 instructions pmaddubsw and pabsw to
> implement a fast iterator for bilinear scaling.
This patch shows some really good performance for upscaling. In fact even better than I expected. And the t

Re: [Pixman] [PATCH 0/2] SSSE3 iterator and fast path selection issues

2013-09-05 Thread Siarhei Siamashka
d a combiner. And the non-separable bilinear code still seems to be somewhat competitive for downscaling. But the scaling ratio, where the separable implementation becomes faster, differs for different generations of hardware: http://people.freedesktop.org/~siamashka/files/20130905/ http://peo

Re: [Pixman] [PATCH] Drop support for 8-bit precision in bilinear filtering

2013-09-05 Thread Matt Turner
On Wed, Sep 4, 2013 at 7:49 PM, Søren Sandmann wrote:
> From: Søren Sandmann Pedersen
>
> The default has been 7-bit for a while now, and the quality
> improvement with 8-bit precision is not enough to justify keeping the
> code around as a compile-time option.
> ---
I'm fine with this change, b

Re: [Pixman] [PATCH] test: safeguard the scaling-bench test against COW

2013-09-05 Thread Siarhei Siamashka
On Wed, 4 Sep 2013 03:12:51 +0300 Siarhei Siamashka wrote:
> The calloc call from pixman_image_create_bits may still
> rely on http://en.wikipedia.org/wiki/Copy-on-write
> Explicitly initializing the destination image results in
> a more predictable behaviour.
A newer revision of this patch. To

Re: [Pixman] [PATCH] sse2: faster bilinear scaling (pack 4 pixels to write with MOVDQA)

2013-09-05 Thread Siarhei Siamashka
s an iterator, primarily intended for downscaling. Here are some benchmarks, comparing the SSE2 and SSSE3 implementations of src__ fast paths with the performance of SSSE3 iterator (using the scaling-bench program, which has been modified to use SRC instead of OVER, and pixman code patched t