On Wed, May 21, 2025 at 09:21:40AM +0200, Jakub Jelinek wrote:
> On Wed, May 21, 2025 at 10:13:27AM +0800, Yang Yujie wrote:
> > On Tue, May 20, 2025 at 03:44:09PM GMT, Jakub Jelinek wrote:
> > > I'd suggest working on it incrementally rather than with a full patch set.
> > > In one or multiple patches handle the promote_mode stuff, the atomic
> > > extension and expr.cc changes with the feedback incorporated.
> > 
> > Ok.
> > 
> > > For gimple-lower-bitint.cc I'd really like to see what testing you've done
> > > to decide on a case by case basis.
> > > 
> > > > > Are you sure all those changes were really necessary (rather than 
> > > > > doing them
> > > > > just in case)?  I believe most of gimple-lower-bitint.cc already 
> > > > > should be
> > > > > sign or zero extending the partial limbs when storing stuff, there 
> > > > > can be
> > > > > some corner cases (I think one of the shift directions at least).
> > > > 
> > > > The modifications to gimple-lower-bitint.cc are based on testing, 
> > > 
> > > The tests weren't included :(.
> > 
> > "The tests" refer to the existing regression tests in testsuite/gcc.dg and
> > testsuite/gcc.dg/torture, specifically bitint-*.c.
> 
> The existing testsuite never tests whether the padding bits are sign or zero
> extended or contain unknown values.
> In the s390x patch, info->extended is set to true, yet all bitint tests but
> bitint-64.c at -O3 pass.
> 
> So, I'm really interested to know in which cases you found the existing code
> not doing the extensions, ideally simple {,un}signed _BitInt(N) with values
> X and Y doing arithmetic or logical operation OP.
> And put that into the new test(s), so that we can ensure that the extensions
> are performed correctly.

Tried to add incremental
--- gcc/testsuite/gcc.dg/torture/bitint-82.c    2025-05-20 17:16:38.715970734 
+0200
+++ gcc/testsuite/gcc.dg/torture/bitint-82.c    2025-05-21 09:59:15.212997874 
+0200
@@ -39,6 +39,9 @@
   BEXTC (s);
   unsigned _BitInt(105) t = d << (38 - j);
   BEXTC (t);
+  _Atomic _BitInt(5) u = 15;
+  u += 8U;
+  BEXTC (u);
   return a + 4;
 }
 #endif

for the _Atomic case, but even that doesn't FAIL on s390x.
With
--- gcc/testsuite/gcc.dg/torture/bitint-82.c.jj 2025-05-20 22:39:28.647171638 
+0200
+++ gcc/testsuite/gcc.dg/torture/bitint-82.c    2025-05-21 10:26:06.496171881 
+0200
@@ -39,6 +39,11 @@ f1 (_BitInt(9) a, unsigned _BitInt(12) b
   BEXTC (s);
   unsigned _BitInt(105) t = d << (38 - j);
   BEXTC (t);
+  _Atomic _BitInt(5) u = 15;
+  u += 8U;
+  BEXTC (u);
+  _BitInt(135) v = e << 28;
+  BEXTC (v);
   return a + 4;
 }
 #endif
instead it seems to fail on x86_64 with #if 1 in bitintext.h, but still not
on s390x.
I wonder about
[[gnu::noipa]] void
do_copy (void *p, const void *q, __SIZE_TYPE__ r)
{
  __builtin_memcpy (p, q, r);
}

/* Macro to test whether (on targets where psABI requires it) _BitInt
   with padding bits have those filled with sign or zero extension.  */
#if defined(__s390x__) || defined(__arm__) || defined(__loongarch__)
#define BEXTC(x) \
  do {                                                          \
    if ((typeof (x)) -1 < 0)                                    \
      {                                                         \
        _BitInt(sizeof (x) * __CHAR_BIT__) __x;                 \
        do_copy (&__x, &(x), sizeof (__x));                     \
        if (__x != (x))                                         \
          __builtin_abort ();                                   \
      }                                                         \
    else                                                        \
      {                                                         \
        unsigned _BitInt(sizeof (x) * __CHAR_BIT__) __x;        \
        do_copy (&__x, &(x), sizeof (__x));                     \
        if (__x != (x))                                         \
          __builtin_abort ();                                   \
      }                                                         \
  } while (0)
#else
#define BEXTC(x) do { (void) (x); } while (0)
#endif
(hiding the memcpy so that the vars are really address taken and
memcpy isn't optimized into VCE which could do the extension), but
even that doesn't help.

        Jakub

Reply via email to