On Tue, 3 Dec 2024, Zhao Zhili wrote:
> From: Zhao Zhili <zhiliz...@tencent.com>
>
> Test on rpi 5 with gcc 12:
> apply_bdof_8_8x16_c:        7315.2 ( 1.00x)
> apply_bdof_8_8x16_neon:     1876.8 ( 3.90x)
> apply_bdof_8_16x8_c:        7170.5 ( 1.00x)
> apply_bdof_8_16x8_neon:     1752.8 ( 4.09x)
> apply_bdof_8_16x16_c:      14695.2 ( 1.00x)
> apply_bdof_8_16x16_neon:    3490.5 ( 4.21x)
> apply_bdof_10_8x16_c:       7371.5 ( 1.00x)
> apply_bdof_10_8x16_neon:    1863.8 ( 3.96x)
> apply_bdof_10_16x8_c:       7172.0 ( 1.00x)
> apply_bdof_10_16x8_neon:    1766.0 ( 4.06x)
> apply_bdof_10_16x16_c:     14551.5 ( 1.00x)
> apply_bdof_10_16x16_neon:   3576.0 ( 4.07x)
> apply_bdof_12_8x16_c:       7236.5 ( 1.00x)
> apply_bdof_12_8x16_neon:    1863.8 ( 3.88x)
> apply_bdof_12_16x8_c:       7316.5 ( 1.00x)
> apply_bdof_12_16x8_neon:    1758.8 ( 4.16x)
> apply_bdof_12_16x16_c:     14691.2 ( 1.00x)
> apply_bdof_12_16x16_neon:   3480.5 ( 4.22x)
> ---
>  libavcodec/aarch64/vvc/dsp_init.c    |  21 ++
>  libavcodec/aarch64/vvc/inter.S       | 399 +++++++++++++++++++++++++++
>  libavcodec/aarch64/vvc/of_template.c |  64 +++++
>  3 files changed, 484 insertions(+)
>  create mode 100644 libavcodec/aarch64/vvc/of_template.c

Thanks, this looks ok to me.
(I don't personally use .req to give registers local names like this patch
does, but it works fine with our tools, so if you prefer doing it that
way, that's totally ok.)
// Martin