On Fri, 30 May 2025, Dmitriy Kovalenko wrote:
I've found quite a few ways to optimize existing ffmpeg's rgb to yuv
subsampled conversion. In this patch stack I'll try to
improve the perofrmance.
This particular set of changes is a small improvement to all the
existing functions and macro. The biggest performance gain is
coming from post loading increment of the pointer and immediate
prefetching of the memory blocks and interleaving the multiplication shifting
operations of
different registers for better scheduling.
Also changed a bunch of places where cmp + b.le was used instead
of one instruction cbnz/tbnz and some other small cleanups.
Here are checkasm results on the macbook pro with the latest M4 max
<before>
bgra_to_uv_1080_c: 257.5 ( 1.00x)
bgra_to_uv_1080_neon: 211.9 ( 1.22x)
bgra_to_uv_1920_c: 467.1 ( 1.00x)
bgra_to_uv_1920_neon: 379.3 ( 1.23x)
bgra_to_uv_half_1080_c: 198.9 ( 1.00x)
bgra_to_uv_half_1080_neon: 125.7 ( 1.58x)
bgra_to_uv_half_1920_c: 346.3 ( 1.00x)
bgra_to_uv_half_1920_neon: 223.7 ( 1.55x)
<after>
bgra_to_uv_1080_c: 268.3 ( 1.00x)
bgra_to_uv_1080_neon: 176.0 ( 1.53x)
bgra_to_uv_1920_c: 456.6 ( 1.00x)
bgra_to_uv_1920_neon: 307.7 ( 1.48x)
bgra_to_uv_half_1080_c: 193.2 ( 1.00x)
bgra_to_uv_half_1080_neon: 96.8 ( 2.00x)
bgra_to_uv_half_1920_c: 347.2 ( 1.00x)
bgra_to_uv_half_1920_neon: 182.6 ( 1.92x)
With my proprietary test on IOS it gives around 70% of performance
improvement converting bgra 1920x1920 image to yuv420p
On my linux arm cortex-r processing the performance improvement not that
visible but still consistently faster by 5-10% than the current
implementation.
---
libswscale/aarch64/input.S | 181 +++++++++++++++++++++++--------------
1 file changed, 112 insertions(+), 69 deletions(-)
This doesn't compile even on macOS any longer, what is this?
I will not respond to this thread any more until _every_ _single_ inline
comment has been replied to and motivated, and until the prefetch
intructions have split to a separate patch.
// Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".