Re: Did clang 14 lose some intrinsics support?

2022-09-26 Thread Warner Losh
On Mon, Sep 26, 2022, 7:54 AM Lev Serebryakov  wrote:

> On 26.09.2022 13:03, Dimitry Andric wrote:
>
> > Sure, but if you are compiling without -mavx, why would you want the AVX
> > intrinsics? You cannot use AVX intrinsics anyway, if AVX is not enabled.
> Because autovectorization (generation of SSE or AVX instructions by the
> compiler itself, without intrinsics) can pessimize code.
>
> Sometimes it is valuable to know exactly where AVX is used. I don't
> have examples at hand, but I've seen situations where autovectorized code
> was much slower than scalar code.
>

The detection method that dim@ outlined will work just fine for the
autodetect script. It just replaces the internal, changing, undocumented
names with the standard ones.

How you later compile individual bits of code is orthogonal.

Warner



Re: Did clang 14 lose some intrinsics support?

2022-09-26 Thread Lev Serebryakov

On 26.09.2022 13:03, Dimitry Andric wrote:


> Sure, but if you are compiling without -mavx, why would you want the AVX
> intrinsics? You cannot use AVX intrinsics anyway, if AVX is not enabled.

  Because autovectorization (generation of SSE or AVX instructions by the
compiler itself, without intrinsics) can pessimize code.

  Sometimes it is valuable to know exactly where AVX is used. I don't have
examples at hand, but I've seen situations where autovectorized code was much
slower than scalar code.

--
// Lev Serebryakov




Re: Did clang 14 lose some intrinsics support?

2022-09-26 Thread Alexander Leidinger
Quoting Dimitry Andric (from Mon, 26 Sep 2022 12:03:03 +0200):

> Sure, but if you are compiling without -mavx, why would you want the AVX
> intrinsics? You cannot use AVX intrinsics anyway, if AVX is not enabled.
>
> So I don't fully understand the problem this configure scripting is
> supposed to solve?


Think about a run-time check of the available CPU features, and then using
this code only for performance-critical sections. That allows you to build
programs whose main code paths are generic to all CPUs, and which can
switch to high-performance implementations of critical code paths
depending on the features of the CPU.
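
A minimal sketch of that pattern in C, under some assumptions: the function
names (abs_i32 and friends) are made up for illustration, and it relies on
two GCC/clang extensions, __attribute__((target("avx2"))) and
__builtin_cpu_supports(). The file is compiled without -mavx2, so only the
one marked function may contain AVX2 instructions; everything else stays
generic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wrapping absolute value: INT32_MIN maps to itself on gcc/clang,
 * which matches what the pabsd instruction does. */
static int32_t abs32_wrap(int32_t x)
{
    uint32_t u = (uint32_t)x;
    return (int32_t)(x < 0 ? 0u - u : u);
}

/* Generic path, usable on any CPU. */
static void abs_i32_scalar(int32_t *dst, const int32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = abs32_wrap(src[i]);
}

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* AVX2 code generation is enabled for this function only. */
__attribute__((target("avx2")))
static void abs_i32_avx2(int32_t *dst, const int32_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i v = _mm256_loadu_si256((const __m256i *)(src + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_abs_epi32(v));
    }
    for (; i < n; i++)          /* leftover elements */
        dst[i] = abs32_wrap(src[i]);
}
#endif

/* Hypothetical public entry point: picks an implementation at run time. */
void abs_i32(int32_t *dst, const int32_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        abs_i32_avx2(dst, src, n);
        return;
    }
#endif
    abs_i32_scalar(dst, src, n);
}
```

Callers only ever see abs_i32(); whether the AVX2 or the scalar body runs
is decided by the CPU the binary happens to be running on.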


Bye,
Alexander.

--
http://www.Leidinger.net  alexan...@leidinger.net : PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org    : PGP 0x8F31830F9F2772BF




Re: Did clang 14 lose some intrinsics support?

2022-09-26 Thread Dimitry Andric
On 25 Sep 2022, at 23:38, Christian Weisgerber  wrote:
> 
> Dimitry Andric:
> 
>>> See https://github.com/llvm/llvm-project/commit/e5147f82e1cb
>>> 
>>> - Instead of __builtin_ia32_pabsd128 maybe use _mm_abs_epi32
>>> - Instead of __builtin_ia32_pabsd256 maybe use _mm256_abs_epi32
>> 
>> I'm wondering why this rather fragile method is chosen? If you want to
>> know whether SSE is supported, you check for __SSE__, and similarly
>> __SSE2__, __AVX__ and a bunch of others. That is also portable to gcc.
> 
> __AVX__, for instance, is not defined unless you compile with -mavx,
> which also allows the compiler to issue AVX instructions during
> normal code generation.

Sure, but if you are compiling without -mavx, why would you want the AVX
intrinsics? You cannot use AVX intrinsics anyway, if AVX is not enabled.

So I don't fully understand the problem this configure scripting is
supposed to solve?

In my opinion, if you want to know whether the compiler supports AVX in
any mode, you would first attempt to run "$CC -mavx" and, if that
succeeds, run a test case which checks for the __AVX__ define. If both
succeed, then AVX intrinsics work, otherwise they don't. Rinse and
repeat for any other particular extension you want to check. This
should work for both clang and gcc.
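
As a rough illustration of such a test case (a sketch only, not the actual
configure check of any port): the helper names and the report function
below are hypothetical, and the output format simply mirrors the
*_SUPPORTED defines quoted earlier in the thread. The configure script
would compile it once per flag, e.g. with "$CC -mavx", and treat a compile
failure as "flag not supported".

```c
#include <stdio.h>

/* Each helper returns 1 if the compiler predefines the feature macro
 * for the flags this file was compiled with, and 0 otherwise. */
static int have_sse2(void)
{
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

static int have_ssse3(void)
{
#ifdef __SSSE3__
    return 1;
#else
    return 0;
#endif
}

static int have_avx(void)
{
#ifdef __AVX__
    return 1;
#else
    return 0;
#endif
}

static int have_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* Print one line per feature the compiler reports as enabled. */
void report_simd_macros(FILE *out)
{
    if (have_sse2())
        fputs("#define SSE2_SUPPORTED 1\n", out);
    if (have_ssse3())
        fputs("#define SSSE3_SUPPORTED 1\n", out);
    if (have_avx())
        fputs("#define AVX_SUPPORTED 1\n", out);
    if (have_avx2())
        fputs("#define AVX2_SUPPORTED 1\n", out);
}
```

Because it only looks at the standard predefined macros, the same file
should behave identically under clang and gcc.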

-Dimitry





Re: Did clang 14 lose some intrinsics support?

2022-09-25 Thread Christian Weisgerber
Dimitry Andric:

> > See https://github.com/llvm/llvm-project/commit/e5147f82e1cb
> > 
> > - Instead of __builtin_ia32_pabsd128 maybe use _mm_abs_epi32
> > - Instead of __builtin_ia32_pabsd256 maybe use _mm256_abs_epi32
> 
> I'm wondering why this rather fragile method is chosen? If you want to
> know whether SSE is supported, you check for __SSE__, and similarly
> __SSE2__, __AVX__ and a bunch of others. That is also portable to gcc.

__AVX__, for instance, is not defined unless you compile with -mavx,
which also allows the compiler to issue AVX instructions during
normal code generation.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: Did clang 14 lose some intrinsics support?

2022-09-25 Thread Dimitry Andric
On 25 Sep 2022, at 21:02, Jan Beich  wrote:
> 
> Christian Weisgerber  writes:
> 
>> Did we lose support for SSSE3 and AVX2 intrinsics on amd64 with
>> clang 14?
> 
> __builtin_* appear unstable unlike _mm* intrinsics. Clang 15 seems
> to hide more but I'm not sure about the cause (need bisecting).

Yeah, these internal names are constantly changing. I don't know the
reason for it, but it's terribly annoying when diagnosing failures with
preprocessed files, as these tend to break when compiled with a much
earlier or later copy of clang.


> ===> clang version 15.0.1
> #define SSE2_SUPPORTED 1
> #define SSE_SUPPORTED 1
> 
> ===> clang version 15.0.1 with -march=native
> #define AVX_SUPPORTED 1
> #define FMA_SUPPORTED 1
> #define SSE2_SUPPORTED 1
> #define SSE4_1_SUPPORTED 1
> #define SSE_SUPPORTED 1
> 
>> #if __has_builtin(__builtin_ia32_pabsd128)
>>  #define SSSE3_SUPPORTED 1
>> #endif
> [...]
>> #if __has_builtin(__builtin_ia32_pabsd256)
>>  #define AVX2_SUPPORTED 1
>> #endif
> 
> See https://github.com/llvm/llvm-project/commit/e5147f82e1cb
> 
> - Instead of __builtin_ia32_pabsd128 maybe use _mm_abs_epi32
> - Instead of __builtin_ia32_pabsd256 maybe use _mm256_abs_epi32
> 

I'm wondering why this rather fragile method is chosen? If you want to
know whether SSE is supported, you check for __SSE__, and similarly
__SSE2__, __AVX__ and a bunch of others. That is also portable to gcc.

-Dimitry





Re: Did clang 14 lose some intrinsics support?

2022-09-25 Thread Jan Beich
Christian Weisgerber  writes:

> Did we lose support for SSSE3 and AVX2 intrinsics on amd64 with
> clang 14?

__builtin_* appear unstable unlike _mm* intrinsics. Clang 15 seems
to hide more but I'm not sure about the cause (need bisecting).

===> clang version 15.0.1
#define SSE2_SUPPORTED 1
#define SSE_SUPPORTED 1

===> clang version 15.0.1 with -march=native
#define AVX_SUPPORTED 1
#define FMA_SUPPORTED 1
#define SSE2_SUPPORTED 1
#define SSE4_1_SUPPORTED 1
#define SSE_SUPPORTED 1

> #if __has_builtin(__builtin_ia32_pabsd128)
>   #define SSSE3_SUPPORTED 1
> #endif
[...]
> #if __has_builtin(__builtin_ia32_pabsd256)
>   #define AVX2_SUPPORTED 1
> #endif

See https://github.com/llvm/llvm-project/commit/e5147f82e1cb

- Instead of __builtin_ia32_pabsd128 maybe use _mm_abs_epi32
- Instead of __builtin_ia32_pabsd256 maybe use _mm256_abs_epi32
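
For illustration, here is roughly what relying on the documented intrinsic
instead of the internal builtin could look like. This is a sketch under
assumptions: abs4_ssse3 and try_abs4_ssse3 are made-up names, and the
target attribute and __builtin_cpu_supports() are GCC/clang extensions. A
configure check could try to compile a file like this without any -m flag;
the SSSE3 code stays confined to one function.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* SSSE3 code generation is enabled for this function only. */
__attribute__((target("ssse3")))
static void abs4_ssse3(int32_t out[4], const int32_t in[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_abs_epi32(v));
}
#endif

/* Returns 1 if the SSSE3 path ran, 0 if it is unavailable on this
 * CPU or architecture (out is left untouched in that case). */
int try_abs4_ssse3(int32_t out[4], const int32_t in[4])
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("ssse3")) {
        abs4_ssse3(out, in);
        return 1;
    }
#endif
    (void)out;
    (void)in;
    return 0;
}
```

The AVX2 case would be analogous, with target("avx2") and
_mm256_abs_epi32 on a __m256i.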