Re: [Pixman] [PATCH 3/4] sse2: affine bilinear fetcher

2013-02-01 Thread Siarhei Siamashka
On Tue, Jan 29, 2013 at 11:21 AM, Siarhei Siamashka wrote:
> > +if (BILINEAR_INTERPOLATION_BITS < 8)
> > +{
> > +    const __m128i xmm_xorc7 = _mm_set_epi16 (0, BMSK, 0, BMSK, 0, BMSK, 0, BMSK);
> > +    const __m128i xmm_addc7 = _mm_set_epi16 (0, 1, 0, 1, 0, 1, 0, 1);
> > +    con

Re: [Pixman] [PATCH 3/4] sse2: affine bilinear fetcher

2013-01-31 Thread Søren Sandmann
Siarhei Siamashka writes:
> As for the affine transforms, they really depend on accessing memory
> in a cache-friendly way.
A simple experiment that could be done would be to just switch to a tiled access pattern in pixman-general.c and see what the performance impact of that would be.
> I w

Re: [Pixman] [PATCH 3/4] sse2: affine bilinear fetcher

2013-01-29 Thread Bill Spitzak
Siarhei Siamashka wrote: Going forward, we need to also add support for separable bilinear scaling (first horizontal interpolation for single scanlines to temporary buffers in L1 cache, then vertical interpolation of these buffers to get the final result). Unless I misunderstood something, Soere
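
The two-pass scheme described in this message (horizontal interpolation of single scanlines into temporary buffers, then vertical interpolation of those buffers) can be sketched roughly as below. This is only an illustration, not code from pixman or from the patch: the function names hscale_line, vblend_lines and scale_separable, the single 8-bit channel, the 16.16 fixed-point coordinates, the 8-bit weights and the MAX_WIDTH limit are all assumptions made for the example.

```c
#include <stdint.h>

#define MAX_WIDTH 1024 /* assumed limit so the temporary lines stay small */

/* Pass 1 (hypothetical helper): horizontally interpolate one source
 * scanline into a 16-bit temporary line. x and ux are 16.16 fixed point. */
static void
hscale_line (const uint8_t *src, int src_w,
             uint16_t *tmp, int dst_w, uint32_t x, uint32_t ux)
{
    for (int i = 0; i < dst_w; i++)
    {
        int xi = (int) (x >> 16);
        int wx = (int) ((x >> 8) & 0xff); /* 8-bit horizontal weight */
        int x0 = xi < src_w ? xi : src_w - 1;
        int x1 = xi + 1 < src_w ? xi + 1 : src_w - 1;

        tmp[i] = (uint16_t) (src[x0] * (256 - wx) + src[x1] * wx);
        x += ux;
    }
}

/* Pass 2 (hypothetical helper): vertically blend two temporary lines. */
static void
vblend_lines (const uint16_t *top, const uint16_t *bot,
              uint8_t *dst, int dst_w, int wy)
{
    for (int i = 0; i < dst_w; i++)
        dst[i] = (uint8_t) ((top[i] * (256 - wy) + bot[i] * wy + 0x8000) >> 16);
}

/* Scale a tightly packed single-channel image. Each source row is
 * horizontally scaled at most once per run of destination rows that
 * need it: row r is cached in buf[r & 1]. Returns 0 on success. */
static int
scale_separable (const uint8_t *src, int src_w, int src_h,
                 uint8_t *dst, int dst_w, int dst_h,
                 uint32_t ux, uint32_t uy)
{
    uint16_t buf[2][MAX_WIDTH];
    int cached[2] = { -1, -1 };
    uint32_t y = 0;

    if (dst_w > MAX_WIDTH || src_w < 1 || src_h < 1)
        return -1;

    for (int j = 0; j < dst_h; j++)
    {
        int y0 = (int) (y >> 16);
        int wy = (int) ((y >> 8) & 0xff); /* 8-bit vertical weight */
        int y1;

        if (y0 > src_h - 1)
            y0 = src_h - 1;
        y1 = y0 + 1 < src_h ? y0 + 1 : src_h - 1;

        if (cached[y0 & 1] != y0)
        {
            hscale_line (src + y0 * src_w, src_w, buf[y0 & 1], dst_w, 0, ux);
            cached[y0 & 1] = y0;
        }
        if (cached[y1 & 1] != y1)
        {
            hscale_line (src + y1 * src_w, src_w, buf[y1 & 1], dst_w, 0, ux);
            cached[y1 & 1] = y1;
        }

        vblend_lines (buf[y0 & 1], buf[y1 & 1], dst + j * dst_w, dst_w, wy);
        y += uy;
    }
    return 0;
}
```

When upscaling, consecutive destination rows usually share the same pair of source rows, so in this sketch the horizontal pass runs far less often than once per destination row, and the vertical pass only touches two small buffers.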

Re: [Pixman] [PATCH 3/4] sse2: affine bilinear fetcher

2013-01-29 Thread Siarhei Siamashka
On Sun, 27 Jan 2013 14:10:27 + Chris Wilson wrote:
> On an SNB i5-2500 using cairo-image:
>
> firefox-canvas 17.8 -> 10.3: 1.72x speedup
> firefox-tron 46.3 -> 28.4: 1.63x speedup
> swfdec-youtube 1.7 -> 1.4: 1.22x speedup
> firefox-fishbowl 64.6 -> 53.7: 1.

[Pixman] [PATCH 3/4] sse2: affine bilinear fetcher

2013-01-27 Thread Chris Wilson
On an SNB i5-2500 using cairo-image:

firefox-canvas 17.8 -> 10.3: 1.72x speedup
firefox-tron 46.3 -> 28.4: 1.63x speedup
swfdec-youtube 1.7 -> 1.4: 1.22x speedup
firefox-fishbowl 64.6 -> 53.7: 1.20x speedup
firefox-paintball 40.8 -> 36.8: 1.11x speedup
firefo