Hi!

[ Please don't say "patch v1", it's just clutter. ]

The point of the patch is to improve some code that evokes undefined
behaviour.  The sanitizer for that is how these problems were found,
you can remark that somewhere later in the message, but ubsan is not
what this patch is about, it does not belong in the title.

On Thu, Jun 26, 2025 at 01:26:00PM +0530, Kishan Parmar wrote:
> The following patch has been bootstrapped and regtested on powerpc64le-linux.

Details like that belong at the *end* of a patch message.  They do not
usually belong in a commit message at all.

> While building GCC with --with-build-config=bootstrap-ubsan on
> powerpc64le-unknown-linux-gnu, multiple UBSAN runtime errors were
> encountered in rs6000.cc and rs6000.md due to undefined behavior
> involving left shifts on negative values and shift exponents equal to
> or exceeding the type width.

And you are changing the code so it does not do UB anymore.  Good good.

> The issue was in bit pattern recognition code
> (in can_be_rotated_to_negative_lis and can_be_built_by_li_and_rldic),
> where signed values were shifted without handling negative inputs or
> guarding against shift counts equal to the type width, causing UB.
> The fix ensures shifts and rotations are done unsigned HOST_WIDE_INT,
> and casting back only where needed (like for arithmetic right shifts)
> with proper guards to prevent shift-by-64.

Okido.

> 
> 2025-06-26  Kishan Parmar  <kis...@linux.ibm.com>
> 
> gcc:
>       PR target/118890
>       * config/rs6000/rs6000.cc (can_be_rotated_to_negative_lis):
>       Avoid left shift of negative value and guard shift count.

Please dont put newlines after colons, it looks like you forgot a piece
of text here.  Changelogs are 80 chars wide, including the 8 chars
indent.  So these lines should be something like

        * config/rs6000/rs6000.cc (can_be_rotated_to_negative_lis): Avoid left
        shift of negative value and guard shift count.

>       (can_be_built_by_li_and_rldic): Likewise.
>       (rs6000_emit_set_long_const): Likewise.
>       * config/rs6000/rs6000.md : Avoid signed overflow.

No space before colon.

The pattern you modified here is a nameless splitter, so describe it a
bit?  "rs6000.md (splitter for plus into two 16-bit parts):", something
like that.

"Avoid signed overflow" is not a great message.  First off, *all*
overflow in C is signed, "unsigned" arithmetic can never overflow (it is
arithmetic in some finite binary ring, *finite*, it cannot overflow; if
you consider the signed arithmetic as being in Z (the ring of integers)
it can result in values that are not representable on the machine).

Secondly, all overflow in C is always undefined behaviour (or not valid
C code in some other way), so of course it needs to be avoided.  "Fix
error" would not be a very informative log message either, but it would
be better already ;-)

> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10309,15 +10309,18 @@ can_be_rotated_to_negative_lis (HOST_WIDE_INT c, 
> int *rot)
>  
>    /* case b. xx0..01..1xx: some of 15 x's (and some of 16 0's) are
>       rotated over the highest bit.  */
> -  int pos_one = clz_hwi ((c << 16) >> 16);
> -  middle_zeros = ctz_hwi (c >> (HOST_BITS_PER_WIDE_INT - pos_one));
> -  int middle_ones = clz_hwi (~(c << pos_one));
> -  if (middle_zeros >= 16 && middle_ones >= 33)
> +  unsigned HOST_WIDE_INT uc = (unsigned HOST_WIDE_INT)c;

Space after cast:
  unsigned HOST_WIDE_INT uc = (unsigned HOST_WIDE_INT) c;

(throughout).

> +  int pos_one = clz_hwi ((HOST_WIDE_INT)(uc << 16) >> 16);
  int pos_one = clz_hwi ((HOST_WIDE_INT) (uc << 16) >> 16);

But why do you want a cast here?   uc is already unsigned!

> +  if (pos_one != 0)

Why this?  Shifting by zero is perfectly fine.

>      {
> -      *rot = pos_one;
> -      return true;
> +      middle_zeros = ctz_hwi (c >> (HOST_BITS_PER_WIDE_INT - pos_one));
> +      int middle_ones = clz_hwi (~(uc << pos_one));
> +      if (middle_zeros >= 16 && middle_ones >= 33)
> +     {
> +       *rot = pos_one;
> +       return true;
> +     }
>      }
> -
>    return false;
>  }
>  
> @@ -10434,7 +10437,7 @@ can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int 
> *shift, HOST_WIDE_INT *mask)
>    if (lz >= HOST_BITS_PER_WIDE_INT)
>      return false;
>  
> -  int middle_ones = clz_hwi (~(c << lz));
> +  int middle_ones = clz_hwi (~(((unsigned HOST_WIDE_INT)c) << lz));
>    if (tz + lz + middle_ones >= ones
>        && (tz - lz) < HOST_BITS_PER_WIDE_INT
>        && tz < HOST_BITS_PER_WIDE_INT)
> @@ -10468,7 +10471,7 @@ can_be_built_by_li_and_rldic (HOST_WIDE_INT c, int 
> *shift, HOST_WIDE_INT *mask)
>    if (!IN_RANGE (pos_first_1, 1, HOST_BITS_PER_WIDE_INT-1))
>      return false;
>  
> -  middle_ones = clz_hwi (~c << pos_first_1);
> +  middle_ones = clz_hwi ((~(unsigned HOST_WIDE_INT)c) << pos_first_1);
>    middle_zeros = ctz_hwi (c >> (HOST_BITS_PER_WIDE_INT - pos_first_1));
>    if (pos_first_1 < HOST_BITS_PER_WIDE_INT
>        && middle_ones + middle_zeros < HOST_BITS_PER_WIDE_INT
> @@ -10570,7 +10573,8 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
> c, int *num_insns)
>      {
>        /* li/lis; rldicX */
>        unsigned HOST_WIDE_INT imm = (c | ~mask);
> -      imm = (imm >> shift) | (imm << (HOST_BITS_PER_WIDE_INT - shift));
> +      if (shift != 0)
> +     imm = (imm >> shift) | (imm << (HOST_BITS_PER_WIDE_INT - shift));

Same, why is this needed?  The code is nicer without it.

If it improves generated code somewhat, that should be in a separate
patch, but all such unnecessary work should be automatically optimised
away already.  Did you see that not working?

>        count_or_emit_insn (temp, GEN_INT (imm));
>        if (shift != 0)
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 9c718ca2a22..8fc079a4297 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -1971,7 +1971,9 @@
>  {
>    HOST_WIDE_INT val = INTVAL (operands[2]);
>    HOST_WIDE_INT low = sext_hwi (val, 16);
> -  HOST_WIDE_INT rest = trunc_int_for_mode (val - low, <MODE>mode);
> +  /* Avoid signed overflow by computing difference in unsigned domain.  */
> +  unsigned HOST_WIDE_INT urest = (unsigned HOST_WIDE_INT)val - (unsigned 
> HOST_WIDE_INT)low;

You never need to cast both arms of a binary operator.  C will promote
one side automatically.

> +  HOST_WIDE_INT rest = trunc_int_for_mode (urest, <MODE>mode);
>  
>    operands[4] = GEN_INT (low);
>    if (<MODE>mode == SImode || satisfies_constraint_L (GEN_INT (rest)))

So, it is probably best if you use somnething like
  unsigned HOST_WIDE_INT val = UINTVAL (operands[2]);
at the start, and no other changes are needed?


Segher

Reply via email to