Re: [PATCH] lower-bitint: Fix arithmetics followed by extension by many bits [PR112809]

2023-12-05 Thread Richard Biener
On Tue, 5 Dec 2023, Jakub Jelinek wrote:

> Hi!
> 
> A zero or sign extension of the result of some upwards_2limb operation
> is implemented in lower_mergeable_stmt as an extra loop which fills in
> the extra bits with 0s or 1s.
> If the delta between the extended and unextended bit counts is small, the
> code doesn't use a loop and emits up to a couple of stores to constant
> indexes, but if the delta is large, it uses
> cnt = (bo_bit != 0) + 1 + (rem != 0);
> statements.  bo_bit is non-zero for bit-field loads, and in that case the
> first statement is emitted as straight-line code; the unconditional 1 is
> for a loop which handles most of the limbs in the delta; and finally
> (rem != 0) is for the case when the extended precision is not a multiple
> of limb_prec, which is again done in straight-line code (after the loop).
> The testcase ICEs because the decision which idx to use was incorrect
> for kind == bitint_prec_huge (i.e. when the precision delta is very large)
> and rem == 0 (i.e. the extended precision is a multiple of limb_prec).
> In that case cnt is either 1 (if bo_bit == 0) or 2, and idx should be
> either first size_int (start) and then the result of create_loop (for
> bo_bit != 0) or just the result of create_loop, but by mistake the last
> statement used size_int (end).  When the precision is a multiple of
> limb_prec, that stores above the precision (which ICEs) and also fails
> to emit the loop which is needed.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
> for trunk?

OK.

> 2023-12-05  Jakub Jelinek  
> 
>   PR tree-optimization/112809
>   * gimple-lower-bitint.cc (bitint_large_huge::lower_mergeable_stmt): For
>   separate_ext in kind == bitint_prec_huge mode if rem == 0, create for
>   i == cnt - 1 the loop rather than using size_int (end).
> 
>   * gcc.dg/bitint-48.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj 2023-12-05 09:48:14.0 +0100
> +++ gcc/gimple-lower-bitint.cc 2023-12-05 18:55:58.996323144 +0100
> @@ -2624,7 +2624,7 @@ bitint_large_huge::lower_mergeable_stmt
>  	    {
>  	      if (kind == bitint_prec_large || (i == 0 && bo_bit != 0))
>  		idx = size_int (start + i);
> -	      else if (i == cnt - 1)
> +	      else if (i == cnt - 1 && (rem != 0))
>  		idx = size_int (end);
>  	      else if (i == (bo_bit != 0))
>  		idx = create_loop (size_int (start + i), &idx_next);
> --- gcc/testsuite/gcc.dg/bitint-48.c.jj   2023-12-05 19:00:19.593664966 +0100
> +++ gcc/testsuite/gcc.dg/bitint-48.c  2023-12-05 19:00:14.599735086 +0100
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/112809 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 512
> +_BitInt (512) a;
> +_BitInt (256) b;
> +_BitInt (256) c;
> +
> +int
> +foo (void)
> +{
> +  return a == (b | c);
> +}
> +
> +void
> +bar (void)
> +{
> +  a /= b - 2;
> +}
> +#else
> +int i;
> +#endif
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[PATCH] lower-bitint: Fix arithmetics followed by extension by many bits [PR112809]

2023-12-05 Thread Jakub Jelinek
Hi!

A zero or sign extension of the result of some upwards_2limb operation
is implemented in lower_mergeable_stmt as an extra loop which fills in
the extra bits with 0s or 1s.
If the delta between the extended and unextended bit counts is small, the
code doesn't use a loop and emits up to a couple of stores to constant
indexes, but if the delta is large, it uses
  cnt = (bo_bit != 0) + 1 + (rem != 0);
statements.  bo_bit is non-zero for bit-field loads, and in that case the
first statement is emitted as straight-line code; the unconditional 1 is
for a loop which handles most of the limbs in the delta; and finally
(rem != 0) is for the case when the extended precision is not a multiple
of limb_prec, which is again done in straight-line code (after the loop).
The testcase ICEs because the decision which idx to use was incorrect
for kind == bitint_prec_huge (i.e. when the precision delta is very large)
and rem == 0 (i.e. the extended precision is a multiple of limb_prec).
In that case cnt is either 1 (if bo_bit == 0) or 2, and idx should be
either first size_int (start) and then the result of create_loop (for
bo_bit != 0) or just the result of create_loop, but by mistake the last
statement used size_int (end).  When the precision is a multiple of
limb_prec, that stores above the precision (which ICEs) and also fails
to emit the loop which is needed.
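
As a reading aid, the idx decision described above can be modelled outside
of GCC.  The sketch below is not code from gimple-lower-bitint.cc: ext_cnt,
choose_idx, self_check, enum idx_kind and the huge/fixed flags are
hypothetical names made up for illustration, and only the three branches
visible in the hunk are modelled (anything after them is lumped into
IDX_NOT_SHOWN).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for how idx is obtained in one iteration.  */
enum idx_kind { IDX_START_CONST, IDX_END_CONST, IDX_LOOP, IDX_NOT_SHOWN };

/* Number of statements emitted for the extension, as in the description.  */
unsigned
ext_cnt (unsigned bo_bit, unsigned rem)
{
  return (bo_bit != 0) + 1 + (rem != 0);
}

/* Model of the idx choice.  HUGE stands in for kind == bitint_prec_huge
   (otherwise bitint_prec_large); FIXED selects the patched condition.  */
enum idx_kind
choose_idx (bool huge, bool fixed, unsigned i, unsigned bo_bit, unsigned rem)
{
  unsigned cnt = ext_cnt (bo_bit, rem);
  if (!huge || (i == 0 && bo_bit != 0))
    return IDX_START_CONST;	/* size_int (start + i) */
  else if (i == cnt - 1 && (!fixed || rem != 0))
    return IDX_END_CONST;	/* size_int (end) */
  else if (i == (unsigned) (bo_bit != 0))
    return IDX_LOOP;		/* create_loop (...) */
  return IDX_NOT_SHOWN;		/* branches outside the quoted hunk */
}

/* Checks the cases discussed in the description.  */
void
self_check (void)
{
  /* rem == 0, bo_bit == 0: cnt is 1 and the only statement must be the loop.  */
  assert (ext_cnt (0, 0) == 1);
  assert (choose_idx (true, false, 0, 0, 0) == IDX_END_CONST);	/* old: bug */
  assert (choose_idx (true, true, 0, 0, 0) == IDX_LOOP);	/* patched */
  /* rem == 0, bo_bit != 0: constant index first, then the loop.  */
  assert (ext_cnt (3, 0) == 2);
  assert (choose_idx (true, true, 0, 3, 0) == IDX_START_CONST);
  assert (choose_idx (true, true, 1, 3, 0) == IDX_LOOP);
  /* rem != 0: unchanged, the last statement still uses size_int (end).  */
  assert (choose_idx (true, true, 1, 0, 5) == IDX_END_CONST);
}
```

Calling self_check () from a small driver exercises the rem == 0 cases; in
this model the old condition picks the constant end index where the loop is
required, and the patched condition picks the loop.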

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
for trunk?

2023-12-05  Jakub Jelinek  

	PR tree-optimization/112809
	* gimple-lower-bitint.cc (bitint_large_huge::lower_mergeable_stmt): For
	separate_ext in kind == bitint_prec_huge mode if rem == 0, create for
	i == cnt - 1 the loop rather than using size_int (end).

	* gcc.dg/bitint-48.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2023-12-05 09:48:14.0 +0100
+++ gcc/gimple-lower-bitint.cc  2023-12-05 18:55:58.996323144 +0100
@@ -2624,7 +2624,7 @@ bitint_large_huge::lower_mergeable_stmt
 	    {
 	      if (kind == bitint_prec_large || (i == 0 && bo_bit != 0))
 		idx = size_int (start + i);
-	      else if (i == cnt - 1)
+	      else if (i == cnt - 1 && (rem != 0))
 		idx = size_int (end);
 	      else if (i == (bo_bit != 0))
 		idx = create_loop (size_int (start + i), &idx_next);
--- gcc/testsuite/gcc.dg/bitint-48.c.jj 2023-12-05 19:00:19.593664966 +0100
+++ gcc/testsuite/gcc.dg/bitint-48.c 2023-12-05 19:00:14.599735086 +0100
@@ -0,0 +1,23 @@
+/* PR tree-optimization/112809 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-O2" } */
+
+#if __BITINT_MAXWIDTH__ >= 512
+_BitInt (512) a;
+_BitInt (256) b;
+_BitInt (256) c;
+
+int
+foo (void)
+{
+  return a == (b | c);
+}
+
+void
+bar (void)
+{
+  a /= b - 2;
+}
+#else
+int i;
+#endif

Jakub