Re: [Pixman] [PATCH] MIPS: DSPr2: Fix bug in over_n_8888_8888_ca/over_n_8888_0565_ca routines

2013-03-04 Thread Nemanja Lukic
Are you referring to MIPS implementation of the following code? http://cgit.freedesktop.org/pixman/tree/pixman/pixman-fast-path.c?id=pixman-0.29.2#n389 Yes. Looks like a lot of changes for only adding a missing shift. Are you really just fixing a single bug and not also introducing

[Pixman] MIPS DSPr2: Fix for in_n_8 routine.

2013-03-04 Thread Nemanja Lukic
Increasing the number of iterations in blitters-test revealed a bug in the DSPr2 optimization. The bug is in the in_n_8 routine: the rounding logic was not implemented correctly. Also, the code used unnecessary multiplications, which could be avoided by packing 4 destination (a8) pixels into one 32-bit register. There

[Pixman] [PATCH] MIPS: DSPr2: Fix for bug in in_n_8 routine.

2013-03-04 Thread Nemanja Lukic
The rounding logic was not implemented correctly: instead of the rounding version of the 8-bit shift, logical shifts were used. Also, the code used unnecessary multiplications, which could be avoided by packing 4 destination (a8) pixels into one 32-bit register. There were also unnecessary spills onto the stack.
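The difference between a plain logical shift and a rounding 8-bit shift, and the idea of packing four a8 pixels into one 32-bit register, can be sketched in portable C. This is only an illustration and not the DSPr2 assembly from the patch: the function names are hypothetical, and it assumes that in_n_8 scales each a8 destination pixel by the alpha of the solid source.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating variant: a plain logical shift always rounds down,
 * so results are biased low (255 * 255 yields 254, not 255). */
static uint8_t mul_un8_truncating(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint32_t)a * b) >> 8);
}

/* Rounding variant: computes round(a * b / 255) for 8-bit inputs. */
static uint8_t mul_un8_rounding(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Scale four packed a8 pixels by one 8-bit factor. The pixels are
 * split into two pairs of 16-bit lanes, so two multiplications
 * cover all four pixels instead of four separate ones. */
static uint32_t mul_un8x4_rounding(uint32_t pixels, uint8_t a)
{
    const uint32_t mask = 0x00ff00ffu;
    uint32_t lo = (pixels & mask) * a + 0x00800080u;
    uint32_t hi = ((pixels >> 8) & mask) * a + 0x00800080u;

    lo = ((lo + ((lo >> 8) & mask)) >> 8) & mask;
    hi = ((hi + ((hi >> 8) & mask)) >> 8) & mask;
    return lo | (hi << 8);
}

/* Checks that the packed version agrees with the scalar rounding
 * version for every pixel value and every factor. */
static void self_check(void)
{
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned p = 0; p < 256; p++) {
            uint32_t packed = p * 0x01010101u;
            uint32_t expect = mul_un8_rounding((uint8_t)p, (uint8_t)a) * 0x01010101u;
            assert(mul_un8x4_rounding(packed, (uint8_t)a) == expect);
        }
    }
}
```

A truncating multiply is off by one for many inputs, which is the kind of small discrepancy that only shows up once a randomized test runs enough iterations.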

[Pixman] [PATCH 00/12] ARMv6: Assorted improvements

2013-03-04 Thread Ben Avison
While I have some pending contributions relating to pad-repeated images and over_n_ from 2013-02-06 and 2013-02-13, I've been continuing to work in other areas. These patches have been rebased at the current head of git (as I understand is list policy), though the Cairo benchmark results

[Pixman] [PATCH 01/12] ARMv6: Fix some indentation in the composite macros

2013-03-04 Thread Ben Avison
---
 pixman/pixman-arm-simd-asm.h | 12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/pixman/pixman-arm-simd-asm.h b/pixman/pixman-arm-simd-asm.h
index 6543606..74400c1 100644
--- a/pixman/pixman-arm-simd-asm.h
+++ b/pixman/pixman-arm-simd-asm.h
@@ -755,18 +755,18

[Pixman] [PATCH 03/12] ARMv6: Support for very variable-hungry composite operations

2013-03-04 Thread Ben Avison
Previously, the variable ARGS_STACK_OFFSET was available to extract values from function arguments during the init macro. Now this changes dynamically around stack operations in the function as a whole so that arguments can be accessed at any point. It is also joined by LOCALS_STACK_OFFSET, which

[Pixman] [PATCH 05/12] ARMv6: Force fast paths to have fixed alignment to the BTAC

2013-03-04 Thread Ben Avison
Trying to produce repeatable, trustworthy profiling results from the cairo-perf-trace benchmark suite has proved tricky, especially when testing changes that have only a marginal ( ~5%) effect upon the runtime as a whole. One of the problems is that some traces appear to show statistically

[Pixman] [PATCH 06/12] Add extra test to lowlevel-blt-bench and fix an existing one

2013-03-04 Thread Ben Avison
in_reverse__ is one of the more commonly used operations in the cairo-perf-trace suite that hasn't been in lowlevel-blt-bench until now. The source for over_reverse_n_ needed to be marked as solid. --- test/lowlevel-blt-bench.c |3 ++- 1 files changed, 2 insertions(+), 1

[Pixman] [PATCH 07/12] ARMv6: Macro to permit testing for early returns or alternate implementations

2013-03-04 Thread Ben Avison
When the source or mask is solid (as opposed to a bitmap) there is the possibility of an immediate exit, or a branch to an alternate, more optimal implementation in some cases. This is best achieved with a brief prologue to the function; to permit this, the necessary boilerplate for setting up a
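As a rough illustration of the idea (not the ARMv6 macro itself), the following C sketch shows an OVER operation with a solid source that tests for an immediate exit and for an alternate implementation before entering the general loop. The function name and the choice of conditions are assumptions made for this example; pixels are assumed to be premultiplied a8r8g8b8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounding multiply: round(a * b / 255). */
static uint8_t mul_un8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Saturating 8-bit add. */
static uint8_t add_un8_sat(uint8_t a, uint8_t b)
{
    unsigned s = (unsigned)a + b;
    return (uint8_t)(s > 255 ? 255 : s);
}

/* Hypothetical OVER with a solid source onto a row of pixels. */
static void over_n_8888_sketch(uint32_t src, uint32_t *dst, size_t count)
{
    assert(dst != NULL || count == 0);

    uint8_t alpha = (uint8_t)(src >> 24);

    /* Prologue, case 1: a fully transparent source changes nothing,
     * so the function can return immediately. */
    if (src == 0)
        return;

    /* Prologue, case 2: an opaque source degenerates to a plain fill,
     * which stands in for the "alternate implementation". */
    if (alpha == 0xff) {
        for (size_t i = 0; i < count; i++)
            dst[i] = src;
        return;
    }

    /* General case: dst = src + dst * (255 - alpha) / 255 per channel. */
    uint8_t inv = (uint8_t)(255 - alpha);
    for (size_t i = 0; i < count; i++) {
        uint32_t d = dst[i], out = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            uint8_t s = (uint8_t)(src >> shift);
            uint8_t c = (uint8_t)(d >> shift);
            out |= (uint32_t)add_un8_sat(s, mul_un8(c, inv)) << shift;
        }
        dst[i] = out;
    }
}
```

Because the source is solid, both tests depend only on a single value and cost nothing per pixel, which is why a short prologue is a natural place for them.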

[Pixman] [PATCH 08/12] ARMv6: Added fast path for over_n_8888_8888_ca

2013-03-04 Thread Ben Avison
lowlevel-blt-bench results:

        Before          After
        Mean   StdDev   Mean   StdDev  Confidence  Change
    L1  2.7    0.0      16.2   0.1     100.0%      +501.7%
    L2  2.4    0.0      14.8   0.2     100.0%      +502.5%
    M   2.4    0.0      15.0   0.0     100.0%      +525.7%
    HT  2.2    0.0      10.2

[Pixman] [PATCH 09/12] ARMv6: Add fast path for in_reverse_8888_8888

2013-03-04 Thread Ben Avison
lowlevel-blt-bench results:

        Before          After
        Mean   StdDev   Mean   StdDev  Confidence  Change
    L1  21.3   0.1      32.5   0.2     100.0%      +52.1%
    L2  12.1   0.2      19.5   0.5     100.0%      +61.2%
    M   11.0   0.0      17.1   0.0     100.0%      +54.6%
    HT  8.7    0.0      12.8

Re: [Pixman] [PATCH 12/12] ARMv6: Add fast path for src_x888_0565

2013-03-04 Thread Chris Wilson
On Mon, Mar 04, 2013 at 05:42:29PM +, Ben Avison wrote: This isn't used in the trimmed cairo-perf-trace tests at all, but these are the lowlevel-blt-bench results: Did you try with image16? I think it should be hit somewhere, would seem like somebody would use it eventually... -Chris --

Re: [Pixman] [PATCH 12/12] ARMv6: Add fast path for src_x888_0565

2013-03-04 Thread Ben Avison
On Mon, 04 Mar 2013 17:53:01 -, Chris Wilson ch...@chris-wilson.co.uk wrote: Did you try with image16? I think it should be hit somewhere, would seem like somebody would use it eventually... Thanks for the tip; I wasn't aware of that. I've been working from Siarhei's trimmed set of traces

[Pixman] [PATCH] test: larger 0xFF/0x00 filled clusters in random images for blitters-test

2013-03-04 Thread Siarhei Siamashka
Current blitters-test program had difficulties detecting a bug in over_n___ca implementation for MIPS DSPr2: http://lists.freedesktop.org/archives/pixman/2013-March/002645.html In order to hit the buggy code path, two consecutive mask values had to be equal to 0x because of
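The general approach named in the subject line, seeding random test images with larger runs of 0xFF and 0x00, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: the function names, the parameters, and the small LCG used as a random source are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny deterministic LCG so the sketch needs no external PRNG. */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state >> 8;
}

/* Fill buf with random bytes, then overwrite `clusters` randomly
 * placed runs of `run_len` bytes with either 0x00 or 0xFF. */
static void fill_random_with_clusters(uint8_t *buf, size_t len,
                                      size_t clusters, size_t run_len,
                                      uint32_t seed)
{
    assert(buf != NULL && run_len > 0 && run_len <= len);

    uint32_t state = seed;
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)lcg_next(&state);

    for (size_t c = 0; c < clusters; c++) {
        size_t start = lcg_next(&state) % (len - run_len + 1);
        uint8_t value = (lcg_next(&state) & 1) ? 0xff : 0x00;
        for (size_t j = 0; j < run_len; j++)
            buf[start + j] = value;
    }
}

/* Length of the longest run of identical 0x00 or 0xFF bytes. */
static size_t longest_extreme_run(const uint8_t *buf, size_t len)
{
    size_t best = 0, cur = 0;
    for (size_t i = 0; i < len; i++) {
        int extreme = (buf[i] == 0x00 || buf[i] == 0xff);
        if (!extreme)
            cur = 0;
        else if (i > 0 && buf[i] == buf[i - 1])
            cur++;
        else
            cur = 1;
        if (cur > best)
            best = cur;
    }
    return best;
}
```

With purely random bytes, several adjacent pixels that are all fully opaque or all fully transparent are very unlikely, so code paths special-cased for those values are rarely exercised; forcing such runs into the image makes them much easier to hit.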