Re: [Pixman] [PATCH 2/5] vmx: adjust macros when loading vectors on ppc64le

2015-07-02 Thread Pekka Paalanen
On Thu, 25 Jun 2015 14:41:37 +0300 Oded Gabbay oded.gab...@gmail.com wrote: On Thu, Jun 25, 2015 at 2:05 PM, Pekka Paalanen ppaala...@gmail.com wrote: On Tue, 16 Jun 2015 18:27:59 +0300 Oded Gabbay oded.gab...@gmail.com wrote: From: Fernando Seiti Furusato ferse...@linux.vnet.ibm.com

Re: [Pixman] [PATCH v2 3/5] vmx: encapsulate the temporary variables inside the macros

2015-07-02 Thread Oded Gabbay
On Thu, Jul 2, 2015 at 10:08 AM, Pekka Paalanen ppaala...@gmail.com wrote: On Thu, 25 Jun 2015 15:59:55 +0300 Oded Gabbay oded.gab...@gmail.com wrote: v2: fixed whitespace and indentation issues Signed-off-by: Oded Gabbay oded.gab...@gmail.com Reviewed-by: Adam Jackson a...@redhat.com ---

[Pixman] [PATCH 00/12] Implement more vmx fast paths

2015-07-02 Thread Oded Gabbay
Hi, This patch-set implements the most heavily used fast paths, according to profiling done by me using the cairo traces package. The patch-set adds many helper functions, to ease the conversion of fast paths between the sse2 implementations (which I used as a base) and the vmx implementations.

[Pixman] [PATCH 05/12] vmx: implement fast path vmx_composite_copy_area

2015-07-02 Thread Oded Gabbay
No changes were observed when running cairo trimmed benchmarks. Signed-off-by: Oded Gabbay oded.gab...@gmail.com --- pixman/pixman-vmx.c | 26 ++ 1 file changed, 26 insertions(+) diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c index b42288b..e69d530 100644 ---

[Pixman] [PATCH 02/12] vmx: add helper functions

2015-07-02 Thread Oded Gabbay
This patch adds the following helper functions for code reuse, to hide BE/LE differences, and to improve maintainability. All of the functions are defined as static force_inline. Names were copied from pixman-sse2.c so that converting fast paths between sse2 and vmx will be easier from now on. Therefore,
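As a rough illustration of the idea (not code from the patch), a helper that hides a byte-order difference behind one always-inlined function could look like the sketch below. Every name here is hypothetical: `sketch_force_inline` stands in for pixman's `force_inline`, and the helper itself is invented for the example. Compilers that do not define `__BYTE_ORDER__` are assumed to target little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pixman's force_inline macro. */
#if defined(__GNUC__)
#define sketch_force_inline __inline__ __attribute__ ((__always_inline__))
#else
#define sketch_force_inline inline
#endif

/* Hypothetical helper: index of the alpha byte of a native-endian
 * a8r8g8b8 pixel when that pixel is viewed as bytes in memory.
 * Assumption: targets without __BYTE_ORDER__ are little-endian. */
static sketch_force_inline int
sketch_alpha_byte_index (void)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return 0;   /* most significant byte is stored first */
#else
    return 3;   /* most significant byte is stored last */
#endif
}

/* Callers use this helper and never test the byte order themselves. */
static sketch_force_inline uint8_t
sketch_alpha_from_memory (const uint32_t *pixel)
{
    const uint8_t *bytes = (const uint8_t *) pixel;

    assert (pixel != NULL);
    return bytes[sketch_alpha_byte_index ()];
}
```

The byte-order test lives in one place, so a fast path written against `sketch_alpha_from_memory` reads the same on ppc64 and ppc64le.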

[Pixman] [PATCH 11/12] vmx: implement fast path iterator vmx_fetch_r5g6b5

2015-07-02 Thread Oded Gabbay
No changes were observed when running cairo trimmed benchmarks. Signed-off-by: Oded Gabbay oded.gab...@gmail.com --- pixman/pixman-vmx.c | 52 1 file changed, 52 insertions(+) diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c index

[Pixman] [PATCH 12/12] vmx: implement fast path iterator vmx_fetch_a8

2015-07-02 Thread Oded Gabbay
No changes were observed when running cairo trimmed benchmarks. Signed-off-by: Oded Gabbay oded.gab...@gmail.com --- pixman/pixman-vmx.c | 46 ++ 1 file changed, 46 insertions(+) diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c index

[Pixman] [PATCH 01/12] vmx: add LOAD_VECTOR macro

2015-07-02 Thread Oded Gabbay
This patch adds a macro for loading a single vector. It also makes the other LOAD_VECTORx macros use this macro as a base so that code is reused. In addition, I fixed minor coding style issues. Signed-off-by: Oded Gabbay oded.gab...@gmail.com --- pixman/pixman-vmx.c | 50
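The layering described here, one base macro that loads a single vector and multi-vector macros built on top of it, can be sketched in portable C as follows. This is an illustration only and not the patch's code: `sketch_vec`, `SKETCH_LOAD_VECTOR` and `SKETCH_LOAD_VECTORS2` are hypothetical names, and `memcpy` stands in for the real VMX load so that the example compiles on any platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector register. */
typedef struct { uint32_t lane[4]; } sketch_vec;

/* Base macro: load one vector from a possibly unaligned address.
 * memcpy replaces the real vector load in this portable sketch. */
#define SKETCH_LOAD_VECTOR(dst, src) \
    memcpy (&(dst), (src), sizeof (sketch_vec))

/* Two-vector variant expressed purely in terms of the base macro,
 * so a change to the load strategy is made in one place only. */
#define SKETCH_LOAD_VECTORS2(d0, s0, d1, s1) \
    do {                                     \
        SKETCH_LOAD_VECTOR (d0, s0);         \
        SKETCH_LOAD_VECTOR (d1, s1);         \
    } while (0)

/* Small user of the macros: add the first lane of two loaded vectors. */
static uint32_t
sketch_sum_first_lanes (const uint32_t *src, const uint32_t *dest)
{
    sketch_vec vsrc, vdest;

    assert (src != NULL && dest != NULL);
    SKETCH_LOAD_VECTORS2 (vsrc, src, vdest, dest);
    return vsrc.lane[0] + vdest.lane[0];
}
```

Wrapping the multi-statement macro in `do { ... } while (0)` keeps it safe to use as a single statement, for instance in an unbraced `if`.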

[Pixman] [PATCH 04/12] vmx: implement fast path vmx_blt

2015-07-02 Thread Oded Gabbay
No changes were observed when running cairo trimmed benchmarks. Signed-off-by: Oded Gabbay oded.gab...@gmail.com --- pixman/pixman-vmx.c | 124 1 file changed, 124 insertions(+) diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c index

[Pixman] [PATCH 07/12] vmx: implement fast path vmx_composite_over_n_8_8888

2015-07-02 Thread Oded Gabbay
POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le.
reference memcpy speed = 24764.8MB/s (6191.2MP/s for 32bpp fills)

      Before    After     Change
L1    90.21     133.21    +47.67%
L2

[Pixman] [PATCH 10/12] vmx: implement fast path iterator vmx_fetch_x8r8g8b8

2015-07-02 Thread Oded Gabbay
POWER8, 8 cores, 3.4GHz, RHEL 7.1 ppc64le. cairo trimmed benchmarks : Speedups t-firefox-asteroids 533.92 - 489.94 : 1.09x Signed-off-by: Oded Gabbay oded.gab...@gmail.com --- pixman/pixman-vmx.c | 48 1 file changed, 48
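The benchmark figures in these entries can be reproduced with simple arithmetic. The sketch below assumes the two cairo trace numbers are elapsed times where lower is better, so the speedup is before divided by after, and it assumes the lowlevel-blt-bench style figures are throughputs where higher is better, so the change is the relative increase. The function names are hypothetical.

```c
#include <assert.h>

/* Speedup for a "lower is better" measurement such as elapsed time. */
static double
sketch_speedup (double before, double after)
{
    assert (after > 0.0);
    return before / after;
}

/* Relative change in percent for a "higher is better" measurement
 * such as throughput. */
static double
sketch_percent_change (double before, double after)
{
    assert (before > 0.0);
    return (after - before) / before * 100.0;
}
```

With the numbers quoted in this thread, `sketch_speedup (533.92, 489.94)` is about 1.0898, which rounds to the reported 1.09x, and `sketch_percent_change (90.21, 133.21)` is about 47.67, matching the reported L1 change.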

Re: [Pixman] [PATCH 0/9] lowlevel-blt-bench improvements for automated testing

2015-07-02 Thread Ben Avison
On Wed, 10 Jun 2015 14:32:49 +0100, Pekka Paalanen ppaala...@gmail.com wrote: most of the patches are trivial cleanups. The meat are the last two: CSV output mode and skipping the memory speed benchmark. Both new features are designed for an external benchmarking harness, that runs several