https://gcc.gnu.org/g:6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0

commit r16-3317-g6dd2a42ab652e243b73e46b7f6e4ecfb3346e8b0
Author: Luc Grosheintz <luc.groshei...@gmail.com>
Date:   Sun Aug 3 22:57:29 2025 +0200

    libstdc++: Improve extents::operator==.
    
    An interesting case to consider is:
    
      bool same11(const std::extents<int, dyn,   2, 3>& e1,
                  const std::extents<int, dyn, dyn, 3>& e2)
      { return e1 == e2; }
    
    Which has the following properties:
    
      - There are no mismatching static extents, which prevents any
        short-circuiting.
    
      - There's a comparison between dynamic and static extents.
    
      - There's one trivial comparison: ... && 3 == 3.
    
    Let E[i] denote the array of static extents (with E[i] == -1 marking a
    dynamic extent), D[k] denote the array of dynamic extents and k[i] be
    the index of the i-th extent in D. (Naturally, k[i] is only meaningful
    if the i-th extent is dynamic.)
    
    The previous implementation results in assembly that's more or less a
    literal translation of:
    
      for (i = 0; i < 3; ++i) {
        e1 = E1[i] == -1 ? D1[k1[i]] : E1[i];
        e2 = E2[i] == -1 ? D2[k2[i]] : E2[i];
        if (e1 != e2)
          return false;
      }
      return true;
    
    The proposed method, by contrast, results in assembly for
    
      if (D1[0] != D2[0]) return false;
      return 2 == D2[1];
    
    i.e.
    
      110:  8b 17                  mov    edx,DWORD PTR [rdi]
      112:  31 c0                  xor    eax,eax
      114:  39 16                  cmp    DWORD PTR [rsi],edx
      116:  74 08                  je     120 <same11+0x10>
      118:  c3                     ret
      119:  0f 1f 80 00 00 00 00   nop    DWORD PTR [rax+0x0]
      120:  83 7e 04 02            cmp    DWORD PTR [rsi+0x4],0x2
      124:  0f 94 c0               sete   al
      127:  c3                     ret
    
    It has the following nice properties:
    
      - It eliminated the indirection D[k[i]], because k[i] is known at
        compile time. This saves us the comparison E[i] == -1 and the
        conditional load of k[i].
    
      - It eliminated the trivial condition 3 == 3.
    
    The result is code that only loads the required values and performs
    exactly the number of comparisons needed by the algorithm. It also
    results in smaller object files. Therefore, this seems like a sensible
    change. We've checked several other examples, including fully statically
    determined cases and high-rank examples. The example given above
    illustrates the other cases well.
    
    The constexpr condition:
    
      if constexpr (!_S_is_compatible_extents<...>)
        return false;
    
    is no longer needed, because the optimizer correctly handles this case.
    However, it's retained for clarity/certainty.
    
    libstdc++-v3/ChangeLog:
    
            * include/std/mdspan (extents::operator==): Replace loop with
            pack expansion.
    
    Reviewed-by: Tomasz Kamiński <tkami...@redhat.com>
    Signed-off-by: Luc Grosheintz <luc.groshei...@gmail.com>

Diff:
---
 libstdc++-v3/include/std/mdspan | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
index 4b271116a028..fdec81300d79 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -407,10 +407,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
            return false;
          else
            {
-             for (size_t __i = 0; __i < __self.rank(); ++__i)
-               if (!cmp_equal(__self.extent(__i), __other.extent(__i)))
-                 return false;
-             return true;
+             auto __impl = [&__self, &__other]<size_t... _Counts>(
+                 index_sequence<_Counts...>)
+               { return (cmp_equal(__self.extent(_Counts),
+                                   __other.extent(_Counts)) && ...); };
+             return __impl(make_index_sequence<__self.rank()>());
            }
        }

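The change from a runtime loop to a pack expansion can be tried outside of
libstdc++ with a small stand-in type. The following is only a sketch under
stated assumptions: toy_extents, dyn, dynamic_index, equal_loop and
equal_fold are hypothetical names invented for illustration, the layout
follows the E[i] / D[k] / k[i] notation of the commit message, and plain ==
on a single index type replaces cmp_equal. It is not the libstdc++
implementation.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Marker for an extent known only at run time (assumption: plays the
// role of std::dynamic_extent, i.e. size_t(-1)).
inline constexpr std::size_t dyn = static_cast<std::size_t>(-1);

// Hypothetical, simplified stand-in for std::extents.
template <std::size_t... E>
struct toy_extents
{
  static constexpr std::size_t rank = sizeof...(E);
  static constexpr std::size_t rank_dynamic
    = (std::size_t{0} + ... + (E == dyn ? 1 : 0));

  // E[i]: static extents, with dyn marking the dynamic ones.
  static constexpr std::array<std::size_t, rank> static_extents{E...};

  // D[k]: values of the dynamic extents only.
  std::array<std::size_t, rank_dynamic> dynamic_extents{};

  // k[i]: position of the i-th extent inside D, i.e. the number of
  // dynamic extents that precede position i.
  static constexpr std::size_t dynamic_index(std::size_t i)
  {
    std::size_t k = 0;
    for (std::size_t j = 0; j < i; ++j)
      if (static_extents[j] == dyn)
        ++k;
    return k;
  }

  constexpr std::size_t extent(std::size_t i) const
  {
    return static_extents[i] == dyn ? dynamic_extents[dynamic_index(i)]
                                    : static_extents[i];
  }
};

// Old shape: a runtime loop over all extents.
template <class X1, class X2>
constexpr bool equal_loop(const X1& a, const X2& b)
{
  static_assert(X1::rank == X2::rank);  // simplification: equal ranks only
  for (std::size_t i = 0; i < X1::rank; ++i)
    if (a.extent(i) != b.extent(i))
      return false;
  return true;
}

// New shape: one comparison per index, joined by a fold over &&.
template <class X1, class X2, std::size_t... I>
constexpr bool equal_fold_impl(const X1& a, const X2& b,
                               std::index_sequence<I...>)
{
  return ((a.extent(I) == b.extent(I)) && ...);
}

template <class X1, class X2>
constexpr bool equal_fold(const X1& a, const X2& b)
{
  static_assert(X1::rank == X2::rank);  // simplification: equal ranks only
  return equal_fold_impl(a, b, std::make_index_sequence<X1::rank>{});
}

// Same signature shape as the same11 example from the commit message.
inline bool same11(const toy_extents<dyn, 2, 3>& e1,
                   const toy_extents<dyn, dyn, 3>& e2)
{ return equal_fold(e1, e2); }
```

In the fold version every call to extent() receives a constant index, so
the test against dyn and the lookup of k[i] can be resolved per term at
compile time. Whether a particular compiler then emits code as compact as
the listing in the commit message depends on its version and optimization
flags.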