Re: [PATCH] libstdc++: Add missing constexpr to simd
On Mon, 22 May 2023, Jonathan Wakely via Libstdc++ wrote:

>> * subscripting vector builtins is not allowed in constant expressions
>
> Is that just because nobody made it work (yet)?

Yes. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101651 and others.

>> * if the implementation would otherwise call SIMD intrinsics/builtins

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80517 and others. Makes sense to work around them for now.

--
Marc Glisse
Re: [PATCH 2/2] libstdc++: use new built-in trait __add_const
On Tue, 21 Mar 2023, Ken Matsui via Libstdc++ wrote:

>    /// add_const
> +#if __has_builtin(__add_const)
> +  template<typename _Tp>
> +    struct add_const
> +    { using type = __add_const(_Tp); };
> +#else
>    template<typename _Tp>
>      struct add_const
>      { using type = _Tp const; };
> +#endif

Is that really better? You asked elsewhere if you should measure for each patch, and I think that at least for such a trivial case, you need to demonstrate that there is a point. The drawbacks are obvious: more code in libstdc++, non-standard, and more builtins in the compiler. Using builtins makes more sense for complicated traits where you can save several instantiations. Now that you have done a couple of simple cases to see how it works, I think you should concentrate on the more complicated cases.

--
Marc Glisse
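For readers who want to try the pattern under discussion outside libstdc++, here is a minimal sketch of a trait that prefers a compiler builtin and falls back to the plain definition. The namespace `sketch` and the macro `SKETCH_HAS_ADD_CONST` are hypothetical names introduced for illustration. Whether `__add_const` is available depends on the compiler, so the fallback branch may be the one that is actually compiled.

```cpp
#include <type_traits>

// Hypothetical feature macro: set only when the compiler reports an
// __add_const builtin. Compilers without __has_builtin take the fallback.
#if defined(__has_builtin)
# if __has_builtin(__add_const)
#  define SKETCH_HAS_ADD_CONST 1
# endif
#endif

namespace sketch {

#ifdef SKETCH_HAS_ADD_CONST
// Builtin path: no extra language-level machinery, the compiler computes the type.
template<typename T>
struct add_const { using type = __add_const(T); };
#else
// Fallback path: the plain definition, which is already a single instantiation.
template<typename T>
struct add_const { using type = T const; };
#endif

} // namespace sketch

// Both branches must agree with the standard trait.
static_assert(std::is_same<sketch::add_const<int>::type, const int>::value, "");
static_assert(std::is_same<sketch::add_const<const int>::type, const int>::value, "");
// const applied to a reference type through a template parameter is ignored.
static_assert(std::is_same<sketch::add_const<int&>::type, int&>::value, "");
```

Both branches instantiate exactly one class template per use, which is the point made above: for a trait this simple the builtin has little to save.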
Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]
On Fri, 4 Nov 2022, Hongyu Wang via Gcc-patches wrote:

> This is a follow-up patch for PR98167
>
> The sequence
>   c1 = VEC_PERM_EXPR (a, a, mask)
>   c2 = VEC_PERM_EXPR (b, b, mask)
>   c3 = c1 op c2
> can be optimized to
>   c = a op b
>   c3 = VEC_PERM_EXPR (c, c, mask)
> for all integer vector operation, and float operation with full permutation.

Hello,

I assume the "full permutation" condition is to avoid performing some extra operations that would raise exception flags. If so, are there conditions (-fno-trapping-math?) where the transformation would be safe with arbitrary shuffles?

--
Marc Glisse
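The identity behind the transformation can be checked with a small scalar model. This is only a sketch using `std::array` in place of real vector types; `Vec4`, `Mask4`, `permute`, `add`, and the other helper names are made up for illustration and are not GCC internals. It also shows why a non-full permutation is the delicate case: operating first computes lanes that the shuffle then discards, and for floating point those extra operations are the ones that could raise exception flags.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<int, 4>;
using Mask4 = std::array<std::size_t, 4>;

// Single-input shuffle: lane i of the result is v[mask[i]].
inline Vec4 permute(const Vec4& v, const Mask4& mask) {
    return {v[mask[0]], v[mask[1]], v[mask[2]], v[mask[3]]};
}

// Lane-wise addition, standing in for the generic "op".
inline Vec4 add(const Vec4& a, const Vec4& b) {
    return {a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]};
}

// Original sequence: shuffle both inputs, then operate.
inline Vec4 shuffle_then_add(const Vec4& a, const Vec4& b, const Mask4& mask) {
    return add(permute(a, mask), permute(b, mask));
}

// Transformed sequence: operate once, then shuffle the result.
inline Vec4 add_then_shuffle(const Vec4& a, const Vec4& b, const Mask4& mask) {
    return permute(add(a, b), mask);
}

// True when every source lane is selected exactly once. Otherwise the
// transformed sequence computes lanes whose results are never used.
inline bool is_full_permutation(const Mask4& mask) {
    bool seen[4] = {false, false, false, false};
    for (std::size_t m : mask) {
        if (m >= 4 || seen[m]) return false;
        seen[m] = true;
    }
    return true;
}
```

With integer lanes the two sequences agree for any mask, including a broadcast such as `{0, 0, 0, 0}`; the question raised above is whether the floating-point case can be relaxed in the same way when trapping math is disabled.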
Re: [PATCH] Optimize (X<
On Tue, 13 Sep 2022, Roger Sayle wrote:

> This patch tweaks the match.pd transformation previously added to fold (X<

In https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html , I read:

"Bitwise operators act on the representation of the value including both the sign and value bits, where the sign bit is considered immediately above the highest-value value bit. Signed ‘>>’ acts on negative numbers by sign extension.

As an extension to the C language, GCC does not use the latitude given in C99 and C11 only to treat certain aspects of signed ‘<<’ as undefined."

To me, this means that for instance INT_MIN<<1 is well defined and evaluates to 0. But with this patch we turn (INT_MIN<<1)+(INT_MIN<<1) into (INT_MIN+INT_MIN)<<1, which is UB.

If we decide not to support this extension anymore, I think we need to change the documentation first.

--
Marc Glisse
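The arithmetic in the INT_MIN example can be checked without actually executing the overflowing addition. The sketch below models the documented shift semantics through unsigned arithmetic and uses `__builtin_add_overflow`, which GCC and Clang provide, to detect that the addition in the rewritten form overflows. The helper names `shl1_on_representation` and `add_overflows` are made up for illustration.

```cpp
#include <climits>

// x << 1 computed on the representation, as the quoted documentation
// describes. Going through unsigned avoids relying on signed-shift rules;
// the conversion back to int wraps on GCC (implementation-defined before C++20).
inline int shl1_on_representation(int x) {
    return static_cast<int>(static_cast<unsigned>(x) << 1);
}

// Reports whether a + b overflows int, without performing the UB addition.
inline bool add_overflows(int a, int b) {
    int ignored;
    return __builtin_add_overflow(a, b, &ignored);
}

// Left-hand side of the fold: (INT_MIN<<1) + (INT_MIN<<1) is 0 + 0.
inline int lhs_value() {
    return shl1_on_representation(INT_MIN) + shl1_on_representation(INT_MIN);
}

// Right-hand side of the fold would first need INT_MIN + INT_MIN,
// which overflows; that is the step that introduces the UB.
inline bool rhs_inner_add_overflows() {
    return add_overflows(INT_MIN, INT_MIN);
}
```

So the original expression has a defined value of 0 under the documented extension, while the folded form contains a signed addition that overflows.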
Re: [PATCH] libstdc++: Implement std::unreachable() for C++23 (P0627R6)
On Thu, 31 Mar 2022, Jonathan Wakely wrote:

> On Thu, 31 Mar 2022 at 17:03, Marc Glisse via Libstdc++ wrote:
>>
>> On Thu, 31 Mar 2022, Matthias Kretz via Gcc-patches wrote:
>>
>>> I like it. But I'd like it even more if we could have
>>>
>>> #elif defined _UBSAN
>>>     __ubsan_invoke_ub("reached std::unreachable()");
>>>
>>> But to my knowledge UBSAN has no hooks for the library like this (yet).
>>
>> -fsanitize=undefined already replaces __builtin_unreachable with its own thing, so I was indeed going to ask if the assertion / trap provide a better debugging experience compared to plain __builtin_unreachable, with the possibility to get a stack trace (UBSAN_OPTIONS=print_stacktrace=1), etc? Detecting if (the right subset of) ubsan is enabled sounds like a good idea.
>
> Does UBsan define a macro that we can use to detect it?

https://github.com/google/sanitizers/issues/765 seems to say no (it could be outdated though), but they were asking for use cases to motivate adding one. Apparently there is a macro for clang, although I don't think it is fine-grained.

Adding one to cppbuiltin.cc testing SANITIZE_UNREACHABLE looks easy, maybe we can do just this one, we don't need to go overboard and define macros for all possible suboptions of ubsan right now.

I don't think any of that prevents from pushing your patch as is for gcc-12.

--
Marc Glisse
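To make the trade-off concrete, here is a rough sketch of an unreachable helper that defers to the sanitizer when one can be detected and otherwise asserts in debug builds. It is not the libstdc++ patch. The namespace `sketch` and the macro `SKETCH_UBSAN` are hypothetical, and the only detection used is Clang's `__has_feature(undefined_behavior_sanitizer)`, which is not fine-grained; no GCC macro is assumed to exist.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical detection macro. Only Clang's coarse feature test is used;
// on compilers without __has_feature this stays undefined.
#if defined(__has_feature)
# if __has_feature(undefined_behavior_sanitizer)
#  define SKETCH_UBSAN 1
# endif
#endif

namespace sketch {

[[noreturn]] inline void unreachable() {
#if defined(SKETCH_UBSAN)
    // Let the sanitizer replace the builtin with its own diagnostic,
    // so UBSAN_OPTIONS=print_stacktrace=1 can report a stack trace.
    __builtin_unreachable();
#elif !defined(NDEBUG)
    // Debug build without a detectable sanitizer: fail loudly.
    assert(false && "reached sketch::unreachable()");
    std::abort();
#else
    // Release build: plain optimizer hint.
    __builtin_unreachable();
#endif
}

enum class Color { Red, Green };

// Typical use: mark the fall-through after an exhaustive switch.
inline int color_code(Color c) {
    switch (c) {
        case Color::Red:   return 1;
        case Color::Green: return 2;
    }
    unreachable();
}

} // namespace sketch
```

A GCC-side macro tied to SANITIZE_UNREACHABLE, as suggested above, would let the first branch be selected precisely when the relevant ubsan suboption is active.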
Re: [PATCH] libstdc++: Implement std::unreachable() for C++23 (P0627R6)
On Thu, 31 Mar 2022, Matthias Kretz via Gcc-patches wrote:

> I like it. But I'd like it even more if we could have
>
> #elif defined _UBSAN
>     __ubsan_invoke_ub("reached std::unreachable()");
>
> But to my knowledge UBSAN has no hooks for the library like this (yet).

-fsanitize=undefined already replaces __builtin_unreachable with its own thing, so I was indeed going to ask if the assertion / trap provide a better debugging experience compared to plain __builtin_unreachable, with the possibility to get a stack trace (UBSAN_OPTIONS=print_stacktrace=1), etc? Detecting if (the right subset of) ubsan is enabled sounds like a good idea.

--
Marc Glisse
Re: [PATCH] PR tree-optimization/101895: Fold VEC_PERM to help recognize FMA.
On Fri, 11 Mar 2022, Roger Sayle wrote:

> +(match vec_same_elem_p
> + CONSTRUCTOR@0
> + (if (uniform_vector_p (TREE_CODE (@0) == SSA_NAME
> +? gimple_assign_rhs1 (SSA_NAME_DEF_STMT (@0)) : @0

Ah, I didn't remember we needed that, we don't seem to be very consistent about it. Probably for this reason, the transformation "Prefer vector1 << scalar to vector1 << vector2" does not match

typedef int vec __attribute__((vector_size(16)));
vec f(vec a, int b){
  vec bb = { b, b, b, b };
  return a << bb;
}

which is only optimized at vector lowering time.

> +/* Push VEC_PERM earlier if that may help FMA perception (PR101895). */
> +(for plusminus (plus minus)
> + (simplify
> +(plusminus (vec_perm (mult@0 @1 vec_same_elem_p@2) @0 @3) @4)
> +(plusminus (mult (vec_perm @1 @1 @3) @2) @4)))

Don't you want :s on mult and vec_perm?

--
Marc Glisse
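The two forms that the "Prefer vector1 << scalar to vector1 << vector2" transformation relates can be compared directly with the GCC vector extension. This is a small sketch; the function names are made up, and it relies on GCC's documented convenience of allowing a scalar operand in a binary vector operation.

```cpp
typedef int vec __attribute__((vector_size(16)));

// Form from the message: the scalar is first splatted into a uniform
// vector, and the shift count is that vector.
inline vec shift_by_splat(vec a, int b) {
    vec bb = { b, b, b, b };
    return a << bb;
}

// Preferred form: the shift count is the scalar itself.
inline vec shift_by_scalar(vec a, int b) {
    return a << b;
}

// Lane-by-lane comparison using vector subscripting.
inline bool same_lanes(vec x, vec y) {
    for (int i = 0; i < 4; ++i)
        if (x[i] != y[i]) return false;
    return true;
}
```

Both functions compute the same lanes; the observation above is that the match.pd pattern does not recognise the splatted form, so it is only simplified later, at vector lowering time.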