> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Martin Storsjö
> Sent: Wednesday, 21 May 2025 14:22
> To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] gcc: Remove auto-vectorization limitation.
>
> On Wed, 21 May 2025, Andreas Rheinhardt wrote:
>
> > Martin Storsjö:
> >> On Wed, 21 May 2025, Andreas Rheinhardt wrote:
> >>
> >>> Jiawei:
> >>>> This patch modifies the FFmpeg build system to remove the explicit
> >>>> disabling of GCC's auto-vectorization feature.
> >>>>
> >>>> Modern GCC versions (>= 10.0) have demonstrated stable
> >>>> auto-vectorization capabilities through extensive optimizations in
> >>>> loop analysis and SIMD code generation. The explicit
> >>>> -fno-tree-vectorize flag originally added in commit 973859f (2009)
> >>>> to workaround early GCC vectorization instability is no longer
> >>>> necessary.
> >>>>
> >>>> Key improvements justifying this change:
> >>>> 1. Enhanced heuristics for loop vectorization cost models
> >>>> 2. Mature handling of alignment and memory access patterns
> >>>> 3. Robust fallback mechanisms for unsupported architectures
> >>>>
> >>>> This change allows FFmpeg to benefit from automated SIMD optimizations
> >>>> when built with -O3 optimization level, particularly improving
> >>>> performance on x86_64 (AVX), ARM64 (SVE) and RISC-V(RVV) architectures.
> >>>>
> >>>> [1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191
> >>>>
> >>>> ---
> >>>>  configure | 1 -
> >>>>  1 file changed, 1 deletion(-)
> >>>>
> >>>> diff --git a/configure b/configure
> >>>> index 3730b0524c..b9e95ce4ec 100755
> >>>> --- a/configure
> >>>> +++ b/configure
> >>>> @@ -7656,7 +7656,6 @@ if enabled icc; then
> >>>>          disable aligned_stack
> >>>>      fi
> >>>>  elif enabled gcc; then
> >>>> -    check_optflags -fno-tree-vectorize
> >>>>      check_cflags -Werror=format-security
> >>>>      check_cflags -Werror=implicit-function-declaration
> >>>>      check_cflags -Werror=missing-prototypes
> >>>
> >>> FYI: The last discussion about auto-vectorization is here:
> >>> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299405.html
> >>> It contains a report about a failing build with vectorization enabled:
> >>> https://ffmpeg.org/pipermail/ffmpeg-devel/2022-July/299421.html
> >>> I don't know whether this is still reproducible with the latest GCC.
> >>
> >> The issue which was reported last time, when compiling for i686 mingw32
> >> with --cpu=haswell, seems to have gone away in
> >> 182663a58a7a099e02e76da3b0f96d63e5c26a6d, where we made the whole
> >> problematic x86 inline cabac assembly noinline on i386. (That whole
> >> inline assembly block has been problematic in a large number of cases
> >> anyway.)
> >>
> >
> > So there are currently no known miscompilations due to vectorization
> > with GCC?
>
> I'm not aware of any, but I haven't tested widely. It certainly is worth
> evaluating.
>
> (From dav1d, I can anecdotally add that autovectorization does seem to
> help, somewhat, especially when there's not 100% assembly coverage for the
> use case. For some cases it makes things slower than without
> autovectorization, but generally the net result is positive.)
>
> // Martin

Hi,

a few years ago, I spent days on that subject. Intel has some great tools that allow precise analysis of how the compiler applies those vectorization and loop optimizations, and they also work on code compiled with gcc, which is what I was investigating.

The focus was the code in the vf_tonemap filter; later I briefly confirmed my findings by looking at some other examples. The platform was x86_64 only.

The outcome was that enabling tree-vectorize is beneficial, but combining it with -O3 has adverse effects. Since then, we have been using -O2 with tree-vectorization enabled on all platforms.

For CPU tone mapping, I still ended up doing a SIMD implementation using Intel intrinsics 😊

Best
sw
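
For anyone who wants to try the -O2 plus tree-vectorize combination on a small case, here is a minimal sketch of the kind of loop the vectorizer can be asked to report on. The function name scale_clip and its parameters are made up for illustration; this is not code from vf_tonemap, and whether GCC actually vectorizes it depends on the compiler version and target flags.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical example loop (not taken from FFmpeg): scale each sample
 * by `gain` and clip the result to `peak`.
 *
 * `restrict` tells the compiler that dst and src do not overlap, which
 * removes one common obstacle to auto-vectorization.
 */
void scale_clip(float *restrict dst, const float *restrict src,
                size_t n, float gain, float peak)
{
    assert(dst && src);

    for (size_t i = 0; i < n; i++) {
        float v = src[i] * gain;
        dst[i] = v > peak ? peak : v;   /* simple select, no function call */
    }
}
```

Assuming the file is saved as scale_clip.c, something like `gcc -O2 -ftree-vectorize -fopt-info-vec -fopt-info-vec-missed -c scale_clip.c` should make GCC print which loops it vectorized and which it gave up on. Repeating the run with -O3, or with -fno-tree-vectorize, gives a quick way to compare the generated code for one loop before benchmarking anything larger.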