On Mon, 25 Jun 2012 02:00:27 +0300, Siarhei Siamashka
siarhei.siamas...@gmail.com wrote:
Does it actually make sense? I remember somebody was strongly opposing
the idea of spawning threads in pixman in the past, but can't find
this e-mail right now.
The only caveat from my point of view is
On Mon, Jun 25, 2012 at 12:50 AM, Siarhei Siamashka
siarhei.siamas...@gmail.com wrote:
Using _mm_loadl_epi64() to load two pixels at once (pairs of top
and bottom pixels) is faster than loading each pixel separately
and combining them with _mm_set_epi32().
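As a rough illustration of the two loading strategies described above, here is a minimal SSE2 sketch. The function names are hypothetical and are not taken from the patch; only the intrinsics _mm_loadl_epi64() and _mm_set_epi32() come from the message. It assumes each row pointer addresses two adjacent 32-bit pixels.

```c
#include <assert.h>
#include <emmintrin.h>
#include <stdint.h>

/* Per-pixel variant: four scalar loads combined with _mm_set_epi32().
 * Arguments are given from the highest 32-bit lane down to the lowest. */
static __m128i load_2x2_separate(const uint32_t *top, const uint32_t *bottom)
{
    return _mm_set_epi32((int)bottom[1], (int)bottom[0],
                         (int)top[1], (int)top[0]);
}

/* Paired variant: one 64-bit load per row fetches two adjacent pixels,
 * then the two halves are merged into a single register. */
static __m128i load_2x2_paired(const uint32_t *top, const uint32_t *bottom)
{
    __m128i t = _mm_loadl_epi64((const __m128i *)top);    /* top[0], top[1] */
    __m128i b = _mm_loadl_epi64((const __m128i *)bottom); /* bottom[0], bottom[1] */
    return _mm_unpacklo_epi64(t, b);
}

/* Hypothetical self-check: both variants must yield identical registers. */
static void check_loads_agree(const uint32_t *top, const uint32_t *bottom)
{
    __m128i a = load_2x2_separate(top, bottom);
    __m128i b = load_2x2_paired(top, bottom);
    assert(_mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF);
}
```

Both functions leave the pixels in the same lane order (top left, top right, bottom left, bottom right), so the paired variant can replace the per-pixel one without touching later code.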
=== cairo-perf-trace ===
before:
Bill Spitzak spit...@gmail.com writes:
A problem is the pseudo-random starting point. This is necessary so
that you do not get vertical stripes when the rows have the same
data. But if you want your composites to not change, it has to be
deterministic. I'm not sure but perhaps a hash of the y
Søren Sandmann wrote on 25.6.2012 at 20:47:
Bill Spitzak spit...@gmail.com writes:
A problem is the pseudo-random starting point. This is necessary so
that you do not get vertical stripes when the rows have the same
data. But if you want your composites to not change, it has to be
Chris Wilson ch...@chris-wilson.co.uk writes:
On Mon, 25 Jun 2012 02:00:27 +0300, Siarhei Siamashka
siarhei.siamas...@gmail.com wrote:
Does it actually make sense? I remember somebody was strongly opposing
the idea of spawning threads in pixman in the past, but can't find
this e-mail right
Added optimizations for several bilinear fast paths:
- src__8_
- src__8_0565
- src_0565_8_x888
- src_0565_8_0565
- add__8_
Benchmark results (using a tweaked version of lowlevel-blt-bench which does
bilinear scaling using an almost-identity matrix) on a Malta board (@ 1 GHz)
Macro BILINEAR_INTERPOLATION_BITS in pixman-private.h configures
the number of fractional bits used for bilinear interpolation.
scaling-test and affine-test have checksums for 7-bit and 8-bit
configurations.
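To make the role of the macro concrete, here is a minimal scalar sketch of bilinear interpolation for one 8-bit channel with a configurable number of fractional bits. Only the name BILINEAR_INTERPOLATION_BITS comes from the message; BILINEAR_RANGE, fixed_to_weight() and bilinear_channel() are hypothetical names for illustration, and the 16.16 fixed-point coordinate format is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#ifndef BILINEAR_INTERPOLATION_BITS
#define BILINEAR_INTERPOLATION_BITS 7   /* try 8 for the higher-precision variant */
#endif
#define BILINEAR_RANGE (1 << BILINEAR_INTERPOLATION_BITS)

/* Hypothetical helper: keep the top BILINEAR_INTERPOLATION_BITS bits of the
 * fractional part of a 16.16 fixed-point coordinate. */
static unsigned fixed_to_weight(int32_t x)
{
    return ((uint32_t)x >> (16 - BILINEAR_INTERPOLATION_BITS)) &
           (BILINEAR_RANGE - 1);
}

/* Blend four neighbouring samples; the four weights always sum to
 * BILINEAR_RANGE squared, so the final shift renormalises to 0..255. */
static uint8_t bilinear_channel(uint8_t tl, uint8_t tr, uint8_t bl, uint8_t br,
                                unsigned distx, unsigned disty)
{
    assert(distx < BILINEAR_RANGE && disty < BILINEAR_RANGE);

    unsigned wtl = (BILINEAR_RANGE - distx) * (BILINEAR_RANGE - disty);
    unsigned wtr = distx * (BILINEAR_RANGE - disty);
    unsigned wbl = (BILINEAR_RANGE - distx) * disty;
    unsigned wbr = distx * disty;

    unsigned sum = tl * wtl + tr * wtr + bl * wbl + br * wbr;
    return (uint8_t)(sum >> (2 * BILINEAR_INTERPOLATION_BITS));
}
```

Changing the bit count changes how the fractional position is quantised, so results can differ by small amounts between the 7-bit and 8-bit settings, which would explain why the tests need a separate checksum for each configuration.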
---
pixman/pixman-arm-neon-asm-bilinear.S | 119
Reducing interpolation precision allows the use of PMADDWD instruction.
It is much faster:
8-bit: image firefox-fishtank 57.584 58.349 0.74% 3/3
7-bit: image firefox-fishtank 51.139 51.229 0.30% 3/3
8-bit: src__ = L1: 228.71 L2: 226.52
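A hedged sketch of why 7-bit weights suit PMADDWD: the instruction (exposed as _mm_madd_epi16()) multiplies signed 16-bit operands. After vertical interpolation with 7-bit weights a channel is at most 255 * 128 = 32640, which fits in a signed 16-bit lane, whereas with 8-bit weights it could reach 255 * 256 = 65280, which does not. The function below is a hypothetical illustration of that arithmetic for a single pixel, not the code from the patch.

```c
#include <assert.h>
#include <emmintrin.h>
#include <stdint.h>

/* Interpolate one 32-bit pixel (four 8-bit channels) from its four
 * neighbours using 7-bit weights: distx and disty are in [0, 128). */
static uint32_t bilinear_pixel_7bit(uint32_t tl, uint32_t tr,
                                    uint32_t bl, uint32_t br,
                                    int distx, int disty)
{
    assert(distx >= 0 && distx < 128 && disty >= 0 && disty < 128);

    const __m128i zero = _mm_setzero_si128();
    const __m128i wy_top = _mm_set1_epi16((short)(128 - disty));
    const __m128i wy_bot = _mm_set1_epi16((short)disty);

    /* Widen each pixel's channels from 8 to 16 bits. */
    __m128i tl16 = _mm_unpacklo_epi8(_mm_cvtsi32_si128((int)tl), zero);
    __m128i tr16 = _mm_unpacklo_epi8(_mm_cvtsi32_si128((int)tr), zero);
    __m128i bl16 = _mm_unpacklo_epi8(_mm_cvtsi32_si128((int)bl), zero);
    __m128i br16 = _mm_unpacklo_epi8(_mm_cvtsi32_si128((int)br), zero);

    /* Vertical pass: each lane is at most 255 * 128 = 32640. */
    __m128i left = _mm_add_epi16(_mm_mullo_epi16(tl16, wy_top),
                                 _mm_mullo_epi16(bl16, wy_bot));
    __m128i right = _mm_add_epi16(_mm_mullo_epi16(tr16, wy_top),
                                  _mm_mullo_epi16(br16, wy_bot));

    /* Horizontal pass: interleave (left, right) channel pairs and let
     * PMADDWD compute left * (128 - distx) + right * distx per channel. */
    __m128i lr = _mm_unpacklo_epi16(left, right);
    __m128i wx = _mm_set1_epi32((distx << 16) | (128 - distx));
    __m128i acc = _mm_madd_epi16(lr, wx);

    /* Drop the 7 + 7 fractional bits and pack back to four bytes. */
    acc = _mm_srli_epi32(acc, 14);
    acc = _mm_packs_epi32(acc, acc);
    acc = _mm_packus_epi16(acc, acc);
    return (uint32_t)_mm_cvtsi128_si32(acc);
}
```

In this sketch one PMADDWD handles the multiply and the add of the horizontal pass for all four channels at once, which is the kind of saving the message attributes to the reduced precision.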
On Mon, Jun 25, 2012 at 7:45 PM, Matt Turner matts...@gmail.com wrote:
On Mon, Jun 25, 2012 at 1:00 AM, Siarhei Siamashka
siarhei.siamas...@gmail.com wrote:
OK, I got 7-bit variant of SSE2 bilinear scaling working. It shows
quite a good speed boost thanks to PMADDWD instruction, which can be
Siarhei wrote this patch and we've been using it in the Mozilla tree since May.
Before this patch it was often faster to scale and repeat in two passes because
each pass used a fast path vs. the slow path that the single pass approach
takes. This makes it so that the single pass approach has
This patch adds fast paths for bilinear scaling of (SRC, r5g6b5, r5g6b5),
(OVER, a8r8g8b8, r5g6b5), and (OVER, a8r8g8b8, a8r8g8b8). These make a
noticeable improvement in the performance of Firefox on Android.
-Jeff
And here's the patch.
patch
Description: Binary data
___
Pixman mailing list
Pixman@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/pixman