On Tue, 13 Sep 2022, Hubert Mazur wrote:

Provide optimized implementation for pix_median_abs16 function.

Forgot to update this part of the commit message here too.

Performance comparison tests are shown below.
- median_sad_1_c: 273.7
- median_sad_1_neon: 98.2

Benchmarks and tests run with checkasm tool on AWS Graviton 3.

Signed-off-by: Hubert Mazur <h...@semihalf.com>
---
libavcodec/aarch64/me_cmp_init_aarch64.c |  3 ++
libavcodec/aarch64/me_cmp_neon.S         | 65 ++++++++++++++++++++++++
2 files changed, 68 insertions(+)

The same comments as for patch 1/3 apply: this looks reasonable, but there are a bunch of leftover mov instructions that I don't see the need for. Also, please avoid the extra single-lane handling; just do plain vector operations and extract the single lane at the end.
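
As a rough illustration of that last point, here is a sketch in portable C; it is not the code from this patch. It computes a plain SAD rather than the median variant, and the name sad16_lanes and the array standing in for a vector register are assumptions made for the example. Every lane gets the same element-wise treatment inside the loop, and the only horizontal step is one reduction after the loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define LANES 16

/* Hypothetical helper, for illustration only: SAD over a 16-pixel-wide
 * block. The acc[] array models a vector accumulator, one partial sum
 * per lane. Assumes h <= 257 so a 16-bit lane cannot overflow. */
static int sad16_lanes(const uint8_t *pix1, const uint8_t *pix2,
                       ptrdiff_t stride, int h)
{
    uint16_t acc[LANES] = {0};
    int sum = 0;

    for (int y = 0; y < h; y++) {
        /* Same operation on every lane; no per-lane special cases. */
        for (int x = 0; x < LANES; x++)
            acc[x] += (uint16_t)abs(pix1[x] - pix2[x]);
        pix1 += stride;
        pix2 += stride;
    }

    /* Single horizontal reduction, done once at the very end. */
    for (int x = 0; x < LANES; x++)
        sum += acc[x];
    return sum;
}
```

In NEON terms the inner loop would map to ordinary full-width vector instructions (for instance an absolute-difference accumulate such as uabal/uabal2), and the final loop to one across-lanes add such as uaddlv, after which the scalar result is moved out of lane 0.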

// Martin
