Are you referring to MIPS implementation of the following code?
http://cgit.freedesktop.org/pixman/tree/pixman/pixman-fast-path.c?id=pixman-0.29.2#n389
Yes.
Looks like a lot of changes for only adding a missing shift. Are you
really just fixing a single bug and not also introducing
Increasing the number of iterations in blitters-test revealed a bug in the DSPr2
optimization. The bug is in the in_n_8 routine. Rounding logic was not implemented
correctly. Also, the code used unnecessary multiplications, which could be avoided
by packing 4 destination (a8) pixels into one 32-bit register. There
Rounding logic was not implemented correctly.
Instead of using the rounding version of the 8-bit shift, logical shifts were used.
Also, the code used unnecessary multiplications, which could be avoided by packing
4 destination (a8) pixels into one 32-bit register. There were also unnecessary
spills to the stack.
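To illustrate the two points above, here is a minimal C sketch, not the DSPr2
assembly from the patch. It assumes the usual rounded 8-bit multiply (add 0x80,
then fold the high byte back in before shifting) and shows how four a8 pixels
packed in one 32-bit word can be scaled with two multiplications instead of
four. The function names are hypothetical and chosen only for this example.

```c
#include <stdint.h>

/* Rounded 8-bit multiply: approximates a * b / 255 with rounding. */
static uint8_t mul_un8_round(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 0x80;      /* add one half */
    return (uint8_t)((t + (t >> 8)) >> 8);    /* rounding shift */
}

/* Truncating variant: a plain logical shift, which loses the rounding. */
static uint8_t mul_un8_trunc(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint32_t)a * b) >> 8);
}

/* Scale four a8 pixels packed in one 32-bit word by a single alpha value.
 * Even and odd bytes are handled as two pairs of 16-bit lanes, so only
 * two multiplications are needed for four pixels. */
static uint32_t mul_un8x4_round(uint32_t x, uint8_t a)
{
    uint32_t lo = (x & 0x00ff00ff) * a + 0x00800080;         /* bytes 0 and 2 */
    uint32_t hi = ((x >> 8) & 0x00ff00ff) * a + 0x00800080;  /* bytes 1 and 3 */

    lo = ((lo + ((lo >> 8) & 0x00ff00ff)) >> 8) & 0x00ff00ff;
    hi = ((hi + ((hi >> 8) & 0x00ff00ff)) >> 8) & 0x00ff00ff;

    return lo | (hi << 8);
}
```

With these definitions, mul_un8_round(255, 255) gives 255 while
mul_un8_trunc(255, 255) gives 254, which is the kind of off-by-one difference
a longer blitters-test run can expose.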
While I have some pending contributions relating to pad-repeated
images and over_n_ from 2013-02-06 and 2013-02-13, I've been
continuing to work in other areas. These patches have been rebased
at the current head of git (as I understand is list policy), though
the Cairo benchmark results
---
pixman/pixman-arm-simd-asm.h | 12 ++--
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/pixman/pixman-arm-simd-asm.h b/pixman/pixman-arm-simd-asm.h
index 6543606..74400c1 100644
--- a/pixman/pixman-arm-simd-asm.h
+++ b/pixman/pixman-arm-simd-asm.h
@@ -755,18 +755,18
Previously, the variable ARGS_STACK_OFFSET was available to extract values
from function arguments during the init macro. Now this changes dynamically
around stack operations in the function as a whole so that arguments can be
accessed at any point. It is also joined by LOCALS_STACK_OFFSET, which
Trying to produce repeatable, trustworthy profiling results from the
cairo-perf-trace benchmark suite has proved tricky, especially when testing
changes that have only a marginal (~5%) effect upon the runtime as a whole.
One of the problems is that some traces appear to show statistically
in_reverse__ is one of the more commonly used operations in the
cairo-perf-trace suite that hasn't been in lowlevel-blt-bench until now.
The source for over_reverse_n_ needed to be marked as solid.
---
test/lowlevel-blt-bench.c |    3 ++-
1 files changed, 2 insertions(+), 1
When the source or mask is solid (as opposed to a bitmap) there is the
possibility of an immediate exit, or a branch to an alternate, more optimal
implementation in some cases. This is best achieved with a brief prologue to
the function; to permit this, the necessary boilerplate for setting up a
lowlevel-blt-bench results:
         Before          After
      Mean  StdDev    Mean  StdDev   Confidence   Change
L1     2.7    0.0     16.2    0.1      100.0%    +501.7%
L2     2.4    0.0     14.8    0.2      100.0%    +502.5%
M      2.4    0.0     15.0    0.0      100.0%    +525.7%
HT     2.2    0.0     10.2
lowlevel-blt-bench results:
         Before          After
      Mean  StdDev    Mean  StdDev   Confidence   Change
L1    21.3    0.1     32.5    0.2      100.0%     +52.1%
L2    12.1    0.2     19.5    0.5      100.0%     +61.2%
M     11.0    0.0     17.1    0.0      100.0%     +54.6%
HT     8.7    0.0     12.8
On Mon, Mar 04, 2013 at 05:42:29PM +, Ben Avison wrote:
This isn't used in the trimmed cairo-perf-trace tests at all, but these are
the lowlevel-blt-bench results:
Did you try with image16? I think it should be hit somewhere, would seem
like somebody would use it eventually...
-Chris
--
On Mon, 04 Mar 2013 17:53:01 -, Chris Wilson ch...@chris-wilson.co.uk
wrote:
Did you try with image16? I think it should be hit somewhere, would seem
like somebody would use it eventually...
Thanks for the tip; I wasn't aware of that. I've been working from
Siarhei's trimmed set of traces
The current blitters-test program had difficulty detecting a bug in the
over_n___ca implementation for MIPS DSPr2:
http://lists.freedesktop.org/archives/pixman/2013-March/002645.html
In order to hit the buggy code path, two consecutive mask values had
to be equal to 0x because of