Re: [libav-devel] [PATCH 4/4] h264/aarch64: add intra loop filter neon asm

2019-01-02 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


Add my neon asm from x264 relicensed under the LGPL 2.1 or later. Ported
(x264 uses nv12 chroma) and optimized.

Cycle count for checkasm --bench on a Snapdragon 820e:
h264_h_loop_filter_luma_intra_8bpp_c: 60.0
h264_h_loop_filter_luma_intra_8bpp_neon: 54.2
h264_v_loop_filter_luma_intra_8bpp_c: 148.3
h264_v_loop_filter_luma_intra_8bpp_neon: 73.8
h264_h_loop_filter_chroma_intra_8bpp_c: 27.8
h264_h_loop_filter_chroma_intra_8bpp_neon: 21.4
h264_h_loop_filter_chroma_mbaff_intra_8bpp_c: 15.8
h264_h_loop_filter_chroma_mbaff_intra_8bpp_neon: 15.7
h264_v_loop_filter_chroma_intra_8bpp_c: 45.8
h264_v_loop_filter_chroma_intra_8bpp_neon: 17.3
---
libavcodec/aarch64/h264dsp_init_aarch64.c |  16 ++
libavcodec/aarch64/h264dsp_neon.S | 297 ++
2 files changed, 313 insertions(+)


LGTM

// Martin

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

Re: [libav-devel] [PATCH 3/4] h264/aarch64: optimize neon loop filter

2019-01-02 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


Exit as soon as possible if no filtering will be done.

Improves the checkasm --bench cycle count on a Snapdragon 820e:
h264_h_loop_filter_luma_8bpp_c:        72.4 ->  72.5
h264_h_loop_filter_luma_8bpp_neon:     97.1 ->  56.3
h264_v_loop_filter_luma_8bpp_c:       174.0 -> 173.5
h264_v_loop_filter_luma_8bpp_neon:     62.9 ->  60.9
h264_h_loop_filter_chroma_8bpp_c:      30.2 ->  30.3
h264_h_loop_filter_chroma_8bpp_neon:   51.6 ->  25.7
h264_v_loop_filter_chroma_8bpp_c:      57.3 ->  57.3
h264_v_loop_filter_chroma_8bpp_neon:   28.0 ->  24.0
---
libavcodec/aarch64/h264dsp_neon.S | 33 ++-
1 file changed, 19 insertions(+), 14 deletions(-)


LGTM

// Martin
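
For readers of the archive, the early exit described in the commit message can be pictured with a small C model. This is only a sketch and not the code from the patch: it assumes the check is the usual H.264 edge-activity test (|p0 - q0| < alpha, |p1 - p0| < beta, |q1 - q0| < beta), and the function name edge_needs_filtering and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libav: returns 1 if at least one of
 * the n pixel positions along the edge passes the edge-activity test,
 * 0 otherwise. A filter could return right away in the 0 case.
 * pix points at q0; xstride steps across the edge, ystride along it. */
static int edge_needs_filtering(const uint8_t *pix, ptrdiff_t xstride,
                                ptrdiff_t ystride, int n,
                                int alpha, int beta)
{
    for (int i = 0; i < n; i++, pix += ystride) {
        int p1 = pix[-2 * xstride];
        int p0 = pix[-1 * xstride];
        int q0 = pix[0];
        int q1 = pix[xstride];

        /* Standard H.264 condition for filtering one pixel position. */
        if (abs(p0 - q0) < alpha &&
            abs(p1 - p0) < beta  &&
            abs(q1 - q0) < beta)
            return 1;
    }
    return 0; /* nothing along this edge would be modified */
}
```

With alpha or beta equal to 0 the condition can never hold, so the model always returns 0 in that case. A NEON version would typically build the same mask for all positions at once and test whether it is all zero.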


Re: [libav-devel] [PATCH 2/4] checkasm/h264: add loop filter tests

2019-01-02 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


---
tests/checkasm/h264dsp.c | 124 +++
1 file changed, 124 insertions(+)


Looks ok to me

// Martin


Re: [libav-devel] [PATCH 1/4] h264/aarch64: sign extend int stride in loop filter asm

2019-01-02 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


---
libavcodec/aarch64/h264dsp_neon.S | 3 +++
1 file changed, 3 insertions(+)

diff --git a/libavcodec/aarch64/h264dsp_neon.S b/libavcodec/aarch64/h264dsp_neon.S
index 9b4610a4d4..60ffa24500 100644
--- a/libavcodec/aarch64/h264dsp_neon.S
+++ b/libavcodec/aarch64/h264dsp_neon.S
@@ -130,6 +130,7 @@ endfunc

 function ff_h264_h_loop_filter_luma_neon, export=1
         h264_loop_filter_start
+        sxtw            x1,  w1

         sub             x0,  x0,  #4
         ld1             {v6.8B},  [x0], x1
@@ -210,6 +211,7 @@ endfunc

 function ff_h264_v_loop_filter_chroma_neon, export=1
         h264_loop_filter_start
+        sxtw            x1,  w1

         sub             x0,  x0,  x1, lsl #1
         ld1             {v18.8B}, [x0], x1
@@ -228,6 +230,7 @@ endfunc

 function ff_h264_h_loop_filter_chroma_neon, export=1
         h264_loop_filter_start
+        sxtw            x1,  w1

         sub             x0,  x0,  #2
         ld1             {v18.S}[0], [x0], x1
--
2.20.1


LGTM

// Martin
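
The patch carries no commit message, so as background: on AArch64 a 32-bit int argument arrives in the low half of a 64-bit register, and the procedure call standard leaves the upper 32 bits unspecified. The loads in the diff use the full x1 as the post-increment, so the stride has to be sign-extended first. The C model below is a sketch of that effect only; sxtw_model and advance_row are hypothetical names, and the 0xDEADBEEF upper half in the usage note is made-up garbage for illustration.

```c
#include <stdint.h>

/* Hypothetical model of "sxtw x1, w1": keep the low 32 bits of a
 * 64-bit register value and sign-extend them to 64 bits. Written
 * without relying on implementation-defined narrowing conversions. */
static int64_t sxtw_model(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;
    return (low & 0x80000000u) ? (int64_t)low - ((int64_t)1 << 32)
                               : (int64_t)low;
}

/* Hypothetical model of the post-increment in "ld1 {...}, [x0], x1":
 * the full 64-bit register value is added to the address. */
static uint64_t advance_row(uint64_t addr, uint64_t stride_reg)
{
    return addr + stride_reg;
}
```

For a stride of -32 passed with a zeroed upper half (0x00000000FFFFFFE0), advance_row(1000, 0xFFFFFFE0) moves the address forward by about 4 GiB, while advance_row(1000, (uint64_t)sxtw_model(0xFFFFFFE0)) gives 968 as intended. The same holds when the upper half contains garbage such as 0xDEADBEEF.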
