Matt and I talked about whether we needed the compile check. He didn't think we did, because we already require a GCC that supports -msse4.1, but the HAVE vs. USE mismatch is a real bug (someone else noticed that too).
CC'ing Matt in case I'm misremembering something.
Quoting Scott D Phillips (2018-01-24 10:28:53)
> Before we were adding -DHAVE_SSE41 which isn't what the code is
> looking for, so some uses of the sse4.1 code were always being
> skipped.
>
> Fixes: 84486f6462 ("meson: Enable SSE4.1 optimizations")
> ---
> meson.build | 20 +++++++++++++++-----
> 1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/meson.build b/meson.build
> index 97619f786b..3bbda53ccf 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -771,9 +771,9 @@ foreach a : ['-Werror=pointer-arith', '-Werror=vla']
> endif
> endforeach
>
> +with_sse41 = false
> +sse41_args = []
> if host_machine.cpu_family().startswith('x86')
> - pre_args += '-DHAVE_SSE41'
> - with_sse41 = true
> sse41_args = ['-msse4.1']
>
> # GCC on x86 (not x86_64) with -msse* assumes a 16 byte aligned stack, but
> @@ -781,9 +781,19 @@ if host_machine.cpu_family().startswith('x86')
> if host_machine.cpu_family() == 'x86'
> sse41_args += '-mstackrealign'
> endif
> -else
> - with_sse41 = false
> - sse41_args = []
> +
> + if cc.compiles('''#include <smmintrin.h>
> + int param;
> + int main () {
> + __m128i a = _mm_set1_epi32 (param), b = _mm_set1_epi32 (param + 1), c;
> + c = _mm_max_epu32(a, b);
> + return _mm_cvtsi128_si32(c);
> + }''',
> + name : 'SSE4.1 intrinsics',
> + args : sse41_args)
> + with_sse41 = true
> + pre_args += '-DUSE_SSE41'
> + endif
> endif
>
> # Check for GCC style atomics
> --
> 2.14.3
>
_______________________________________________
mesa-dev mailing list
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
