[Pixman] [PATCH 2/9] MIPS: DSPr2: Added over_8888_8888 nearest neighbor fast path.

2013-04-15 Thread Nemanja Lukic
Performance numbers before/after on MIPS-74kc @ 1GHz (lowlevel-blt-bench results):
Referent (before): over_8888_8888 = L1: 19.47 L2: 16.30 M: 11.24 ( 59.69%) HT: 9.54 VT: 9.29 R: 9.47 RT: 6.24 ( 37Kops/s)
Optimized:         over_8888_8888 = L1: 43.67 L2: 33.30 M: 16.32

[Pixman] [PATCH 3/9] MIPS: DSPr2: Added over_8888_0565 nearest neighbor fast path.

2013-04-15 Thread Nemanja Lukic
Performance numbers before/after on MIPS-74kc @ 1GHz (lowlevel-blt-bench results):
Referent (before): over_8888_0565 = L1: 13.22 L2: 12.02 M: 9.77 ( 38.92%) HT: 8.58 VT: 8.35 R: 8.38 RT: 5.78 ( 35Kops/s)
Optimized:         over_8888_0565 = L1: 26.20 L2: 22.97 M: 15.92

[Pixman] [PATCH 4/9] MIPS: DSPr2: Added src_0565_8888 nearest neighbor fast path.

2013-04-15 Thread Nemanja Lukic
Performance numbers before/after on MIPS-74kc @ 1GHz (lowlevel-blt-bench results):
Referent (before): src_0565_8888 = L1: 20.70 L2: 19.22 M: 12.50 ( 49.79%) HT: 10.45 VT: 10.18 R: 9.99 RT: 5.31 ( 31Kops/s)
Optimized:         src_0565_8888 = L1: 62.98 L2: 53.44 M: 23.07

[Pixman] [PATCH 1/9] MIPS: DSPr2: Fix bug in over_n_8888_8888_ca/over_n_8888_0565_ca routines

2013-04-15 Thread Nemanja Lukic
After the introduction of the new PRNG (pseudorandom number generator), a bug in two DSPr2 routines was revealed. The bug manifested as wrong calculations in the composite and glyph tests, which caused make check to fail for the MIPS DSPr2 optimizations. The bug was in the calculation of *dst = over (src, *dst) when ma
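For readers unfamiliar with the operator named in this message, here is a minimal scalar sketch of OVER on one premultiplied a8r8g8b8 pixel. It is not the DSPr2 assembly from the patch and does not reproduce the component-alpha (_ca) mask handling; the helper names over_mul_un8 and over_8888_pixel are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: round(a * b / 255) for 8-bit values. */
static uint8_t over_mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Hypothetical helper: result = src + dst * (255 - src_alpha) / 255,
 * applied per channel to premultiplied a8r8g8b8 pixels. */
static uint32_t over_8888_pixel(uint32_t src, uint32_t dst)
{
    uint8_t inv_alpha = (uint8_t)(255 - (src >> 24));
    uint32_t out = 0;

    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = (dst >> shift) & 0xff;
        uint32_t c = s + over_mul_un8((uint8_t)d, inv_alpha);

        if (c > 255)        /* saturate in case src is not premultiplied */
            c = 255;
        out |= c << shift;
    }
    return out;
}
```

An opaque source replaces the destination, and a fully transparent source leaves it unchanged.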

[Pixman] [PATCH 5/9] MIPS: DSPr2: Fix for bug in in_n_8 routine.

2013-04-15 Thread Nemanja Lukic
The rounding logic was not implemented correctly: logical shifts were used instead of the rounding version of the 8-bit shift. The code also used unnecessary multiplications, which can be avoided by packing 4 destination (a8) pixels into one 32-bit register. There were also unnecessary spills to the stack.
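The difference between a truncating and a rounding 8-bit shift, and the idea of handling 4 packed a8 pixels per 32-bit word, can be sketched in portable C. This is an illustration under stated assumptions, not the DSPr2 assembly from the patch: the function names are hypothetical, and the two-lane multiply is a generic technique for scaling bytes held in a 32-bit word.

```c
#include <stdint.h>

/* Truncating variant: a plain logical shift rounds down. */
static uint8_t mul_un8_truncating(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint32_t)a * b) >> 8);
}

/* Rounding variant: round(a * b / 255) for 8-bit inputs. */
static uint8_t mul_un8_rounding(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Scale two bytes held in bits 0-7 and 16-23 with one multiplication. */
static uint32_t mul_two_lanes(uint32_t lanes, uint8_t a)
{
    uint32_t t = (lanes & 0x00ff00ffu) * a + 0x00800080u;
    t = (t + ((t >> 8) & 0x00ff00ffu)) >> 8;
    return t & 0x00ff00ffu;
}

/* Hypothetical sketch of IN for a solid source alpha and 4 packed a8
 * destination pixels: each byte becomes round(dst * src_alpha / 255). */
static uint32_t in_n_8_x4(uint8_t src_alpha, uint32_t dst4)
{
    uint32_t even = mul_two_lanes(dst4, src_alpha);
    uint32_t odd  = mul_two_lanes(dst4 >> 8, src_alpha);
    return even | (odd << 8);
}
```

With the truncating shift, 255 * 255 yields 254 instead of 255, which is the kind of off-by-one error that a randomized test run can expose.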

[Pixman] [PATCH 6/9] test: add src_0888_8888_rev and src_0888_0565_rev to lowlevel-blt-bench

2013-04-15 Thread Nemanja Lukic
---
 test/lowlevel-blt-bench.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/test/lowlevel-blt-bench.c b/test/lowlevel-blt-bench.c
index 4e16f7b..a1657ea 100644
--- a/test/lowlevel-blt-bench.c
+++ b/test/lowlevel-blt-bench.c
@@ -643,6 +643,8 @@ tests_tbl[] = {

[Pixman] [PATCH 7/9] test: add pixbuf and rpixbuf to lowlevel-blt-bench

2013-04-15 Thread Nemanja Lukic
Add the necessary support to the lowlevel-blt benchmark for benchmarking the pixbuf and rpixbuf fast paths. The bench_composite function now checks for the string "pixbuf" in testname and, if it is detected, uses the same bits for the src and mask images.
---
 test/lowlevel-blt-bench.c | 11 +--
 1 files changed, 9
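The name check described above can be sketched as follows. This is an assumption-laden illustration, not the code from the patch: testname_is_pixbuf and choose_mask_bits are hypothetical helpers, and the real bench_composite signature is not shown in this snippet.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true when the test name contains "pixbuf".
 * A substring match also covers "rpixbuf". */
static int testname_is_pixbuf(const char *testname)
{
    return strstr(testname, "pixbuf") != NULL;
}

/* Hypothetical helper: for pixbuf tests the mask shares the source
 * bits; otherwise the caller's own mask buffer is used. */
static const void *choose_mask_bits(const char *testname,
                                    const void *src_bits,
                                    const void *default_mask_bits)
{
    return testname_is_pixbuf(testname) ? src_bits : default_mask_bits;
}
```

Using a substring test keeps both benchmark names on one code path without listing them separately.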

[Pixman] [PATCH 8/9] MIPS: DSPr2: Added pixbuf fast path.

2013-04-15 Thread Nemanja Lukic
Performance numbers before/after on MIPS-74kc @ 1GHz (lowlevel-blt-bench results):
Referent (before): pixbuf = L1: 18.18 L2: 16.47 M: 13.36 (107.27%) HT: 10.16 VT: 10.07 R: 9.84 RT: 5.54 ( 35Kops/s)
Optimized:         pixbuf = L1: 43.54 L2: 36.02 M: 17.08 (137.09%) HT: