On 2025/5/21 17:04, Zhao Zhili wrote:

On May 21, 2025, at 14:17, Jiawei <jia...@iscas.ac.cn> wrote:

This patch modifies the FFmpeg build system to remove the explicit disabling
of GCC's auto-vectorization feature.

Modern GCC versions (>= 10.0) have demonstrated stable auto-vectorization
capabilities through extensive optimizations in loop analysis and SIMD
code generation. The explicit -fno-tree-vectorize flag, originally added
in commit 973859f (2009) to work around early GCC vectorization instability,
is no longer necessary.

This isn’t the whole story.

The flag was added by 973859f in 2009.
Then it was reverted by cb8646af in 2016.
Shortly after that, the revert was reverted again by fd6dbc5 in 2016.

Key improvements justifying this change:
1. Enhanced heuristics for loop vectorization cost models
2. Mature handling of alignment and memory access patterns
3. Robust fallback mechanisms for unsupported architectures

This change allows FFmpeg to benefit from automated SIMD optimizations
when built with -O3 optimization level, particularly improving
performance on x86_64 (AVX), ARM64 (SVE) and RISC-V (RVV) architectures.

Those flags can only be enabled in tightly controlled environments (e.g., built
and run on the same machine), whereas FFmpeg relies on hand-written assembly,
runtime CPU probing and dynamic binding/dispatch.

Those auto-vectorization and ARCH flags can be enabled manually, but be careful.
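
As a rough illustration of the runtime probe and dispatch model mentioned above, here is a minimal C sketch. The names add_bytes_c, add_bytes_avx2 and select_add_bytes are hypothetical and are not FFmpeg's API. It assumes GCC or Clang, whose __builtin_cpu_supports() builtin is used for the x86 probe. The file is meant to be built without a global -mavx2 flag, so only the function carrying the target attribute may contain AVX2 code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not FFmpeg's actual DSP API. */
typedef void (*add_bytes_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Portable baseline, compiled with the default (baseline) ISA only. */
static void add_bytes_c(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* The target attribute lets the compiler use AVX2 in this function only. */
static __attribute__((target("avx2")))
void add_bytes_avx2(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}
#endif

/* Probe the CPU once at run time and return the best available variant. */
add_bytes_fn select_add_bytes(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return add_bytes_avx2;
#endif
    return add_bytes_c;
}
```

Because the AVX2 variant is only reached after the probe succeeds, the same binary should still run on CPUs without AVX2, which a build with a global -mavx2 flag cannot guarantee.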

Thank you for pointing this out. I am using x86-64 AVX2 and RISC-V RVV. When I enable the vector
extensions with -O3 -mavx (-march=rv64gcv for RISC-V), configure still adds the `-fno-tree-vectorize`
option automatically.

The result still contains vector load/store instructions, but no vector operations.


GCC provides an explicit option to control whether vectorized instructions are generated. It is fine
to use -O3 without auto-vectorization.
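
To see the effect of the option on a single loop, a small experiment along these lines may help. The file name scale.c and the function scale_add are hypothetical. The commands in the comment assume an x86-64 GCC, and -fopt-info-vec is the GCC option that reports which loops were vectorized.

```c
#include <stddef.h>

/*
 * A simple loop that GCC can typically auto-vectorize at -O3.
 * Assumed commands for comparison (scale.c is a hypothetical file name):
 *   gcc -O3 -mavx2 -fopt-info-vec -c scale.c
 *   gcc -O3 -mavx2 -fno-tree-vectorize -fopt-info-vec -c scale.c
 * The second command is expected to report no vectorized loop.
 */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += k * src[i];
}
```

Comparing the assembly from both builds (for example with -S instead of -c) shows whether vector arithmetic is emitted or only scalar code.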


[1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/973859f5230e77beea7bb59dc081870689d6d191

---
configure | 1 -
1 file changed, 1 deletion(-)

diff --git a/configure b/configure
index 3730b0524c..b9e95ce4ec 100755
--- a/configure
+++ b/configure
@@ -7656,7 +7656,6 @@ if enabled icc; then
             disable aligned_stack
     fi
 elif enabled gcc; then
-    check_optflags -fno-tree-vectorize
     check_cflags -Werror=format-security
     check_cflags -Werror=implicit-function-declaration
     check_cflags -Werror=missing-prototypes
--
2.43.0

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".