Re: [FFmpeg-devel] [PATCH] avcodec/arm/hevcdsp_sao : add NEON optimization for sao

2018-03-22 Thread Shengbin Meng
The code looks good to me. I think the wrapper is fine, because that part of the code is not suitable for NEON assembly. But you can remove the use of `sizeof(uint8_t)`, as suggested by Carl.
Shengbin Meng
> On 19 Mar 2018, at 12:41, Yingming Fan wrote:
>
> Hi, is there

Re: [FFmpeg-devel] [PATCH] avcodec/arm/hevcdsp_sao : add NEON optimization for sao

2018-03-22 Thread Shengbin Meng
Hi, With the checkasm benchmark, I see a speedup of ~3x for band mode and ~6x for edge mode on my device (the device has an aarch64 CPU, but I configured ffmpeg with `--arch=arm`). FATE passed as well. Results of a checkasm run:
$./tests/checkasm/checkasm --test=hevc_sao --bench
$ sudo

Re: [FFmpeg-devel] [PATCH] avcodec/arm/hevcdsp_sao : add NEON optimization for sao

2018-03-18 Thread Yingming Fan
Hi, is there any review of this patch? What’s your opinion about the wrapper we used in this patch?
Yingming Fan
> On 11 Mar 2018, at 8:59 PM, Yingming Fan wrote:
>
>
>> On 11 Mar 2018, at 8:54 PM, Carl Eugen Hoyos wrote:
>>
>> 2018-03-08 8:03

Re: [FFmpeg-devel] [PATCH] avcodec/arm/hevcdsp_sao : add NEON optimization for sao

2018-03-11 Thread Yingming Fan
> On 11 Mar 2018, at 8:54 PM, Carl Eugen Hoyos wrote:
>
> 2018-03-08 8:03 GMT+01:00 Yingming Fan :
>> From: Meng Wang
>
>> +stride_dst /= sizeof(uint8_t);
>> +stride_src /= sizeof(uint8_t);
>
> FFmpeg requires

Re: [FFmpeg-devel] [PATCH] avcodec/arm/hevcdsp_sao : add NEON optimization for sao

2018-03-11 Thread Carl Eugen Hoyos
2018-03-08 8:03 GMT+01:00 Yingming Fan :
> From: Meng Wang
> +stride_dst /= sizeof(uint8_t);
> +stride_src /= sizeof(uint8_t);
FFmpeg requires sizeof(uint8_t) to be 1, please simplify your patch accordingly.
Why is the wrapper
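To illustrate the review comment about `sizeof(uint8_t)`, here is a minimal sketch, not code from the patch. The helper name `stride_in_elements` is hypothetical and exists only for this example. It assumes a C11 compiler and shows that, once `sizeof(uint8_t)` is 1, dividing a byte stride by it leaves the stride unchanged, so the two quoted lines can simply be dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time check of the property the review relies on. */
static_assert(sizeof(uint8_t) == 1, "uint8_t is expected to occupy one byte");

/*
 * Hypothetical helper (not part of FFmpeg or the patch): converts a stride
 * given in bytes to a stride in uint8_t elements. The cast keeps the
 * division signed, so negative strides are preserved.
 */
static ptrdiff_t stride_in_elements(ptrdiff_t stride_bytes)
{
    return stride_bytes / (ptrdiff_t)sizeof(uint8_t);
}
```

Because the divisor is 1, `stride_in_elements(s)` returns `s` for every `s`, which is why the division adds nothing for 8-bit samples.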

Re: [FFmpeg-devel] [PATCH] avcodec/arm/hevcdsp_sao : add NEON optimization for sao

2018-03-10 Thread Yingming Fan
Hi there. I have already pushed a patch that adds a hevc_sao checkasm test, and it was accepted. You can verify this optimization by running checkasm on an ARM device: `checkasm --test=hevc_sao --bench`. hevc_sao_band is sped up ~2x and hevc_sao_edge ~4x. FATE also passes on the ARM platform.