> On Apr 9, 2018, at 10:12, Yingming Fan wrote:
>
> From: Yingming Fan
>
> ---
> Hi, there.
> I plan to submit our arm32 NEON code for qpel and epel.
> Before that, I will submit the hevc_mc checkasm code.
> This hevc_mc checkasm codes check
LGTM.
Regards,
Shengbin Meng
> On 27 Mar 2018, at 20:43, Yingming Fan <yingming...@gmail.com> wrote:
>
> From: Meng Wang <wangmeng.k...@bytedance.com>
>
> Signed-off-by: Meng Wang <wangmeng.k...@bytedance.com>
> ---
> This v3 patch removed unused codes 's
> On 22 Mar 2018, at 20:51, Yingming Fan wrote:
>
> From: Meng Wang
>
> Signed-off-by: Meng Wang
> ---
> This v2 patch removes the unused code 'stride_dst /= sizeof(uint8_t);' compared
> to v1. V1 has this code
The code looks good to me. I think the wrapper is fine, because that part of
the code is not suitable for NEON assembly.
But you can remove the use of `sizeof(uint8_t)`, as suggested by Carl.
Shengbin Meng
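For context on the `sizeof(uint8_t)` remark above: where `uint8_t` exists it occupies exactly one byte, so dividing a byte stride by `sizeof(uint8_t)` leaves the value unchanged. A minimal C sketch of that point; the helper name `stride_in_elements` is hypothetical and does not come from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: converts a stride given
 * in bytes into a stride counted in uint8_t elements. */
static ptrdiff_t stride_in_elements(ptrdiff_t stride_bytes)
{
    /* sizeof(uint8_t) is 1, so this division does not change the value. */
    return stride_bytes / (ptrdiff_t)sizeof(uint8_t);
}
```

Because the result always equals the input, the statement can be dropped without changing behavior.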
> On 19 Mar 2018, at 12:41, Yingming Fan <yingming...@gmail.com> wrote
Hi,
With the checkasm benchmark, I can see a speedup of ~3x for band mode and ~6x for
edge mode on my device (the device has an aarch64 CPU, but I configured ffmpeg
with `--arch=arm`). FATE passed as well.
Results of a checkasm run:
$ ./tests/checkasm/checkasm --test=hevc_sao --bench
$ sudo
> On 22 Nov 2017, at 20:26, Michael Niedermayer <mich...@niedermayer.cc> wrote:
>
> On Wed, Nov 22, 2017 at 07:12:01PM +0800, Shengbin Meng wrote:
>> From: Meng Wang <wangmeng.k...@bytedance.com>
>>
>> Signed-off-by: Meng Wang <wangmeng.k.
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/hevcdsp_epel_neon.S | 10 ++
libavcodec/arm/hevcdsp_qpel_neon.S | 24
2 files changed, 30 insertions(+), 4 deletions(-)
diff --git
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/Makefile | 3 +-
libavcodec/arm/hevcdsp_init_neon.c | 62 +
libavcodec/arm/hevcdsp_sao_neon.S | 181 +
3 files
New code is written for qpel, and the qpel code is then reused for epel,
because whole-pixel interpolation is identical in qpel and epel.
Signed-off-by: Shengbin Meng <shengbinm...@gmail.com>
---
libavcodec/arm/hevcdsp_init_neon.c | 107 ++
libavcodec/arm/hevcdsp_qpel_
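To illustrate why the whole-pixel case can share code: at an integer sample position neither the luma (qpel) nor the chroma (epel) interpolation filter is applied, and both paths reduce to scaling the source sample up to the 14-bit intermediate precision. A minimal C sketch for 8-bit input, assuming strides counted in elements; the function name and signature are hypothetical, and this is not the NEON code from the patch:

```c
#include <stddef.h>
#include <stdint.h>

#define BIT_DEPTH 8  /* assumption: 8-bit video only */

/* Hypothetical scalar reference for the whole-pixel (no fractional offset)
 * case. The same routine would serve both qpel and epel, since no filter
 * taps are involved. Strides are counted in elements, not bytes. */
static void put_pel_pixels_8(int16_t *dst, ptrdiff_t dst_stride,
                             const uint8_t *src, ptrdiff_t src_stride,
                             int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = (int16_t)(src[x] << (14 - BIT_DEPTH));
        src += src_stride;
        dst += dst_stride;
    }
}
```

Since the loop body never looks at neighboring samples, one implementation can be registered for both the qpel and epel whole-pixel entries.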
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/Makefile | 3 +-
libavcodec/arm/hevcdsp_epel_neon.S | 2068
libavcodec/arm/hevcdsp_init_neon.c | 458
3 files
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/hevcdsp_init_neon.c | 67 +
libavcodec/arm/hevcdsp_qpel_neon.S | 509 +
2 files changed, 576 insertions(+)
diff --git
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/hevcdsp_idct_neon.S | 241 +
libavcodec/arm/hevcdsp_init_neon.c | 2 +
2 files changed, 243 insertions(+)
diff --git
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/Makefile | 3 +-
libavcodec/arm/hevcdsp_epel_neon.S | 2068
libavcodec/arm/hevcdsp_init_neon.c | 459
3 files
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/hevcdsp_epel_neon.S | 10 ++
libavcodec/arm/hevcdsp_qpel_neon.S | 24
2 files changed, 30 insertions(+), 4 deletions(-)
diff --git
New code is written for qpel, and the qpel code is then reused for epel,
because whole-pixel interpolation is identical in qpel and epel.
Signed-off-by: Shengbin Meng <shengbinm...@gmail.com>
---
libavcodec/arm/hevcdsp_init_neon.c | 106 ++
libavcodec/arm/hevcdsp_qpel_
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/Makefile | 3 +-
libavcodec/arm/hevcdsp_init_neon.c | 62 +
libavcodec/arm/hevcdsp_sao_neon.S | 181 +
3 files
NEON optimization for sao
avcodec/hevcdsp: Add NEON optimization for idct16x16
Shengbin Meng (1):
avcodec/hevcdsp: Add NEON optimization for whole-pixel interpolation
libavcodec/arm/Makefile | 4 +-
libavcodec/arm/hevcdsp_epel_neon.S | 2078
From: Meng Wang
Signed-off-by: Meng Wang
---
libavcodec/arm/hevcdsp_init_neon.c | 66 +
libavcodec/arm/hevcdsp_qpel_neon.S | 509 +
2 files changed, 575 insertions(+)
diff --git
> On 19 Nov 2017, at 01:35, Rafal Dabrowa wrote:
>
>
> This is a proposal for performance optimizations for 8-bit
> HEVC video decoding on the aarch64 platform with the NEON (SIMD) extension.
Nice to see the work for aarch64!
We are also in the process of doing NEON
Hi,
I’d like to know if anyone is working on, or interested in, ARM optimization for the
native HEVC decoder in FFmpeg.
We can see that some time-consuming operations in HEVC decoding have not been
optimized using NEON, e.g., qpel and epel interpolation, SAO, and IDCT of large
blocks.
I have some
20 matches