Re: [PATCH] libstdc++: Add missing constexpr to simd

2023-05-22 Thread Marc Glisse via Gcc-patches

On Mon, 22 May 2023, Jonathan Wakely via Libstdc++ wrote:


* subscripting vector builtins is not allowed in constant expressions


Is that just because nobody made it work (yet)?


Yes.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101651 and others.


* if the implementation would otherwise call SIMD intrinsics/builtins


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80517 and others.

Makes sense to work around them for now.

--
Marc Glisse


Re: [PATCH 2/2] libstdc++: use new built-in trait __add_const

2023-03-21 Thread Marc Glisse via Gcc-patches

On Tue, 21 Mar 2023, Ken Matsui via Libstdc++ wrote:


  /// add_const
+#if __has_builtin(__add_const)
+  template<typename _Tp>
+    struct add_const
+    { using type = __add_const(_Tp); };
+#else
   template<typename _Tp>
     struct add_const
     { using type = _Tp const; };
+#endif


Is that really better? You asked elsewhere if you should measure for each 
patch, and I think that at least for such a trivial case, you need to 
demonstrate that there is a point. The drawbacks are obvious: more code in 
libstdc++, non-standard, and more builtins in the compiler.


Using builtins makes more sense for complicated traits where you can save 
several instantiations. Now that you have done a couple simple cases to 
see how it works, I think you should concentrate on the more complicated 
cases.
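
For illustration only, here is a minimal sketch of why the library-only 
version is already cheap: it is a single class template with no helper 
instantiations. The name sketch_add_const is made up for this example and 
is not part of libstdc++.

```cpp
#include <type_traits>

// Hypothetical stand-in for the non-builtin branch quoted above:
// one primary template, nothing else to instantiate.
template<typename T>
  struct sketch_add_const
  { using type = T const; };

// It agrees with std::add_const on ordinary, already-const, reference
// and pointer types (const applied to a reference type is ignored).
static_assert(std::is_same<sketch_add_const<int>::type,
                           std::add_const<int>::type>::value, "");
static_assert(std::is_same<sketch_add_const<const int>::type,
                           const int>::value, "");
static_assert(std::is_same<sketch_add_const<int&>::type, int&>::value, "");
static_assert(std::is_same<sketch_add_const<int*>::type,
                           int* const>::value, "");
```

Whether a builtin beats this in compile time is exactly what would need to 
be measured; the sketch only shows how little there is to save.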


--
Marc Glisse


Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-16 Thread Marc Glisse via Gcc-patches

On Fri, 4 Nov 2022, Hongyu Wang via Gcc-patches wrote:


This is a follow-up patch for PR98167

The sequence
c1 = VEC_PERM_EXPR (a, a, mask)
c2 = VEC_PERM_EXPR (b, b, mask)
c3 = c1 op c2
can be optimized to
c = a op b
c3 = VEC_PERM_EXPR (c, c, mask)
for all integer vector operations, and for float operations with a
full permutation.


Hello,

I assume the "full permutation" condition is to avoid performing some 
extra operations that would raise exception flags. If so, are there 
conditions (-fno-trapping-math?) where the transformation would be safe 
with arbitrary shuffles?
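
To make the question concrete, here is a rough model of the quoted 
sequence using plain arrays instead of real vector types. Vec4, Mask4 and 
the function names are invented for this sketch; it models a one-input 
shuffle and uses integer addition as "op", and it is not how GCC 
represents VEC_PERM_EXPR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<std::size_t, 4>;

// Model of VEC_PERM_EXPR (v, v, mask): lane i takes v[mask[i]].
Vec4 permute(const Vec4& v, const Mask4& mask) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i) {
    assert(mask[i] < v.size());
    r[i] = v[mask[i]];
  }
  return r;
}

// Lane-wise addition, standing in for "op".
Vec4 add(const Vec4& a, const Vec4& b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] + b[i];
  return r;
}

// Original form: c3 = perm(a) op perm(b).
Vec4 permute_then_add(const Vec4& a, const Vec4& b, const Mask4& m) {
  return add(permute(a, m), permute(b, m));
}

// Transformed form: c3 = perm(a op b).
Vec4 add_then_permute(const Vec4& a, const Vec4& b, const Mask4& m) {
  return permute(add(a, b), m);
}
```

With a mask such as {0, 0, 2, 2}, both forms give the same lanes, but the 
transformed form also computes a[1] + b[1] and a[3] + b[3], which the 
original never evaluated. For floating point those extra lanes are where 
a spurious exception flag could come from.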


--
Marc Glisse


Re: [PATCH] Optimize (X<

2022-09-14 Thread Marc Glisse via Gcc-patches

On Tue, 13 Sep 2022, Roger Sayle wrote:


This patch tweaks the match.pd transformation previously added to fold
(X<

In https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html , I 
read:


"Bitwise operators act on the representation of the value including both 
the sign and value bits, where the sign bit is considered immediately 
above the highest-value value bit. Signed ‘>>’ acts on negative numbers by 
sign extension.


As an extension to the C language, GCC does not use the latitude given in 
C99 and C11 only to treat certain aspects of signed ‘<<’ as undefined."


To me, this means that for instance INT_MIN<<1 is well defined and 
evaluates to 0. But with this patch we turn (INT_MIN<<1)+(INT_MIN<<1) into 
(INT_MIN+INT_MIN)<<1, which is UB.


If we decide not to support this extension anymore, I think we need to 
change the documentation first.
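
A small sketch of the two sides, assuming GCC's documented behavior for 
signed '<<' (C++20 defines the same result). The helper names are made up 
for this example; __builtin_add_overflow is the existing GCC/Clang builtin 
and is used here only to show that the addition in the transformed 
expression cannot be represented in int.

```cpp
#include <climits>

// Left-hand side of the fold: shift each operand, then add.
// For x == y == INT_MIN each shift yields 0, so the sum is 0.
int shift_then_add(int x, int y) {
  return (x << 1) + (y << 1);
}

// The transformed expression would first compute x + y in int.
// This reports whether that addition overflows, without performing
// the undefined operation itself.
bool add_overflows(int x, int y) {
  int ignored;
  return __builtin_add_overflow(x, y, &ignored);
}
```

For INT_MIN and INT_MIN the first function is fine under the documented 
extension, while the second reports an overflow, which is the signed 
addition the rewritten form would introduce.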


--
Marc Glisse


Re: [PATCH] libstdc++: Implement std::unreachable() for C++23 (P0627R6)

2022-03-31 Thread Marc Glisse via Gcc-patches

On Thu, 31 Mar 2022, Jonathan Wakely wrote:


On Thu, 31 Mar 2022 at 17:03, Marc Glisse via Libstdc++ wrote:


On Thu, 31 Mar 2022, Matthias Kretz via Gcc-patches wrote:


I like it. But I'd like it even more if we could have

#elif defined _UBSAN
   __ubsan_invoke_ub("reached std::unreachable()");

But to my knowledge UBSAN has no hooks for the library like this (yet).


-fsanitize=undefined already replaces __builtin_unreachable with its own
thing, so I was indeed going to ask if the assertion / trap provide a
better debugging experience compared to plain __builtin_unreachable, with
the possibility to get a stack trace (UBSAN_OPTIONS=print_stacktrace=1),
etc? Detecting if (the right subset of) ubsan is enabled sounds like a
good idea.


Does UBsan define a macro that we can use to detect it?


https://github.com/google/sanitizers/issues/765 seems to say no (it could 
be outdated though), but they were asking for use cases to motivate adding 
one. Apparently there is a macro for clang, although I don't think it is 
fine-grained.


Adding one to cppbuiltin.cc testing SANITIZE_UNREACHABLE looks easy. Maybe 
we can do just this one; we don't need to go overboard and define macros 
for all possible suboptions of ubsan right now.


I don't think any of that prevents you from pushing your patch as is for 
gcc-12.
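
As a rough sketch of the choice being discussed (trap versus plain 
__builtin_unreachable), something along these lines could be imagined. 
SKETCH_CHECKED_UNREACHABLE, sketch_unreachable and lanes_for are 
hypothetical names for this example only; they are not an existing 
libstdc++, GCC or UBSan macro or function.

```cpp
// Hypothetical switch for this sketch; a real implementation would key
// off whatever macro the library or compiler ends up providing.
#ifndef SKETCH_CHECKED_UNREACHABLE
#define SKETCH_CHECKED_UNREACHABLE 1
#endif

[[noreturn]] inline void sketch_unreachable() {
#if SKETCH_CHECKED_UNREACHABLE
  // Checked mode: stop immediately so the failure is visible.
  __builtin_trap();
#else
  // Unchecked mode: plain undefined behavior. This is the builtin that
  // -fsanitize=undefined replaces with its own handling.
  __builtin_unreachable();
#endif
}

// Example caller: every valid input returns before the marker.
int lanes_for(int kind) {
  switch (kind) {
    case 0: return 4;
    case 1: return 8;
  }
  sketch_unreachable();
}
```

If a macro for the unreachable sanitizer existed, the unchecked branch 
could be selected whenever it is defined, leaving the diagnostics and 
stack trace to ubsan.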


--
Marc Glisse


Re: [PATCH] libstdc++: Implement std::unreachable() for C++23 (P0627R6)

2022-03-31 Thread Marc Glisse via Gcc-patches

On Thu, 31 Mar 2022, Matthias Kretz via Gcc-patches wrote:


I like it. But I'd like it even more if we could have

#elif defined _UBSAN
   __ubsan_invoke_ub("reached std::unreachable()");

But to my knowledge UBSAN has no hooks for the library like this (yet).


-fsanitize=undefined already replaces __builtin_unreachable with its own 
thing, so I was indeed going to ask if the assertion / trap provide a 
better debugging experience compared to plain __builtin_unreachable, with 
the possibility to get a stack trace (UBSAN_OPTIONS=print_stacktrace=1), 
etc? Detecting if (the right subset of) ubsan is enabled sounds like a 
good idea.


--
Marc Glisse


Re: [PATCH] PR tree-optimization/101895: Fold VEC_PERM to help recognize FMA.

2022-03-12 Thread Marc Glisse via Gcc-patches

On Fri, 11 Mar 2022, Roger Sayle wrote:

+(match vec_same_elem_p
+  CONSTRUCTOR@0
+  (if (uniform_vector_p (TREE_CODE (@0) == SSA_NAME
+? gimple_assign_rhs1 (SSA_NAME_DEF_STMT (@0)) : @0

Ah, I didn't remember we needed that, we don't seem to be very consistent 
about it. Probably for this reason, the transformation "Prefer vector1 << 
scalar to vector1 << vector2" does not match


typedef int vec __attribute__((vector_size(16)));
vec f(vec a, int b){
  vec bb = { b, b, b, b };
  return a << bb;
}

which is only optimized at vector lowering time.

+/* Push VEC_PERM earlier if that may help FMA perception (PR101895).  */
+(for plusminus (plus minus)
+  (simplify
+(plusminus (vec_perm (mult@0 @1 vec_same_elem_p@2) @0 @3) @4)
+(plusminus (mult (vec_perm @1 @1 @3) @2) @4)))

Don't you want :s on mult and vec_perm?

--
Marc Glisse