Re: [FFmpeg-devel] [PATCH] get_cabac_inline_x86: Don't inline if 32-bit clang on windows

James Almer Thu, 19 Aug 2021 11:41:11 -0700

On 8/18/2021 7:01 AM, Martin Storsjö wrote:

On Tue, 17 Aug 2021, James Almer wrote:
On 8/17/2021 12:35 PM, Christopher Degawa wrote:
Fixes https://trac.ffmpeg.org/ticket/8903

relevant https://github.com/msys2/MINGW-packages/discussions/9258

Signed-off-by: Christopher Degawa <[email protected]>
---
  libavcodec/x86/cabac.h | 9 +++++++--
  1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/libavcodec/x86/cabac.h b/libavcodec/x86/cabac.h
index 53d74c541e..b046a56a6b 100644
--- a/libavcodec/x86/cabac.h
+++ b/libavcodec/x86/cabac.h
@@ -177,8 +177,13 @@
    #if HAVE_7REGS && !BROKEN_COMPILER
  #define get_cabac_inline get_cabac_inline_x86
-static av_always_inline int get_cabac_inline_x86(CABACContext *c,
-                                                 uint8_t *const state)
+static
+#if defined(_WIN32) && !defined(_WIN64) && defined(__clang__)
Can you do some benchmarks to see how not inlining this compares to simply disabling this code for this target? Because it sounds like you may want to add this case to the BROKEN_COMPILER macro, and not use this code at all.
I tried benchmarking it, and in short, this patch seems to be the best solution.
I tested 3 configurations: with this patch (changing av_always_inline into av_noinline), setting BROKEN_COMPILER (skipping these inline asm functions), and configuring with --cpu=i686 (which means it passes -march=i686 to the compiler, which disallows the use of inline MMX/SSE). I benchmarked single-threaded decoding of a high bitrate H264 clip (listing the lowest measured time out of 3 runs):
av_noinline: 90.94 seconds
BROKEN_COMPILER: 98.92 seconds
-march=i686: 94.63 seconds
(The fact that building with -march=i686 is faster than using some but not all inline MMX/SSE is a bit surprising.)
I also tested the same setup on x86_64 (on a different machine, with Apple Clang), where I tested the above and compared it with the default configuration (using av_always_inline):
av_always_inline: 74.65 seconds
av_noinline: 73.74 seconds
BROKEN_COMPILER: 78.10 seconds
So av_noinline actually seems to be generally favourable here (and for some reason, actually a bit faster than the always_inline case, although I'm not sure if that bit is deterministic in general or not).
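
For illustration, the approach in the patch (choosing the inlining attribute per target instead of disabling the function) could be sketched roughly as below. This is only a minimal, standalone sketch and not the actual FFmpeg code: the TOY_* macros and toy_* functions are hypothetical stand-ins for av_always_inline, av_noinline and get_cabac_inline_x86, and only the preprocessor condition is taken from the patch.

```c
/* Hypothetical stand-ins for av_always_inline / av_noinline. */
#if defined(__GNUC__) || defined(__clang__)
#  define TOY_ALWAYS_INLINE __attribute__((always_inline)) inline
#  define TOY_NOINLINE      __attribute__((noinline))
#elif defined(_MSC_VER)
#  define TOY_ALWAYS_INLINE __forceinline
#  define TOY_NOINLINE      __declspec(noinline)
#else
#  define TOY_ALWAYS_INLINE inline
#  define TOY_NOINLINE
#endif

/* Same condition as in the patch: 32-bit Windows built with clang. */
#if defined(_WIN32) && !defined(_WIN64) && defined(__clang__)
#  define TOY_CABAC_INLINE    TOY_NOINLINE
#  define TOY_CABAC_MODE_NAME "noinline"
#else
#  define TOY_CABAC_INLINE    TOY_ALWAYS_INLINE
#  define TOY_CABAC_MODE_NAME "always_inline"
#endif

/* Toy function standing in for the real decoder routine (assumption:
 * the body is irrelevant here, only the attribute selection matters). */
static TOY_CABAC_INLINE int toy_decode_step(int low, int range)
{
    return low >= range ? 1 : 0;
}

/* Reports which attribute was selected at compile time. */
static const char *toy_inline_mode(void)
{
    return TOY_CABAC_MODE_NAME;
}
```

The function stays available on every target; only the inlining hint differs, which is what separates this from the BROKEN_COMPILER route.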
// Martin


Alright, LGTM then.
_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".
