On Wed, Aug 20, 2025 at 1:14 AM Richard Biener <richard.guent...@gmail.com>
wrote:

> On Tue, Aug 19, 2025 at 11:24 PM Andrew Pinski
> <andrew.pin...@oss.qualcomm.com> wrote:
> >
> > On Tue, Aug 19, 2025 at 12:26 AM Richard Biener <
> richard.guent...@gmail.com> wrote:
> >>
> >> On Sat, Aug 16, 2025 at 7:34 PM Andrew Pinski
> >> <andrew.pin...@oss.qualcomm.com> wrote:
> >> >
> >> > Just like r16-465-gf2bb7ffe84840d8 but this time
> >> > instead of a VCE there is a full on load from a boolean.
> >> > This showed up when trying to remove the extra copy
> >> > in the testcase from the revision mentioned above (pr120122-1.c).
> >> > So when moving loads from a boolean type from being conditional
> >> > to non-conditional, the load needs to become a full load and then
> >> > casted into a bool so that the upper bits are correct.
> >> >
> >> > We could use build_ref_for_offset if that supported a similar ability
> >> > to add the statements into the sequence. Instead we copy a modified
> >> > version that does a similar thing into
> rewrite_to_defined_unconditional.
> >> >
> >> > Bootstrapped and tested on x86_64-linux-gnu.
> >> >
> >> >         PR tree-optimization/121279
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> >         * gimple-fold.cc (gimple_needing_rewrite_undefined): Return
> >> >         true for boolean loads.
> >> >         (rewrite_to_defined_unconditional): Handle boolean loads.
> >> >
> >> > gcc/testsuite/ChangeLog:
> >> >
> >> >         * gcc.dg/torture/pr121279-1.c: New test.
> >> >
> >> > Signed-off-by: Andrew Pinski <andrew.pin...@oss.qualcomm.com>
> >> > ---
> >> >  gcc/gimple-fold.cc                        | 113
> +++++++++++++++++++++-
> >> >  gcc/testsuite/gcc.dg/torture/pr121279-1.c |  49 ++++++++++
> >> >  2 files changed, 161 insertions(+), 1 deletion(-)
> >> >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr121279-1.c
> >> >
> >> > diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
> >> > index 85319b33492..40707d03946 100644
> >> > --- a/gcc/gimple-fold.cc
> >> > +++ b/gcc/gimple-fold.cc
> >> > @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3.  If not see
> >> >  #include "varasm.h"
> >> >  #include "internal-fn.h"
> >> >  #include "gimple-range.h"
> >> > +#include "alias.h"
> >> >
> >> >  enum strlen_range_kind {
> >> >    /* Compute the exact constant string length.  */
> >> > @@ -10520,6 +10521,19 @@ gimple_needing_rewrite_undefined (gimple
> *stmt)
> >> >        && !POINTER_TYPE_P (lhs_type))
> >> >      return false;
> >> >    tree rhs = gimple_assign_rhs1 (stmt);
> >> > +  /* Boolean loads need special handling as they are treated as a
> full MODE load
> >> > +     and don't mask off the bits for the precision.  */
> >> > +  if (gimple_assign_load_p (stmt)
> >> > +      && TREE_CODE (lhs_type) == BOOLEAN_TYPE
> >>
> >> I think the relevant thing is the code of the RHS type.  Also does this
> >> not apply to non-bitfield loads of !type_has_mode_precision_p types
> >> in general?  I am thinking of the fact that we treat a conversion
> >> to/from a same precision/signedness enum/integer type as useless,
> >> so singling out BOOLEAN_TYPE here looks "dangerous" at least?
> >> You are not specifically looking for TYPE_PRECISION == 1 (which is OK
> >> I think).
> >
> >
> > I think it does apply to non-bitfields loads of
> !type_has_mode_precision_p in general and not just boolean types.
> > I will remove the check for BOOLEAN_TYPE.
>

In the end I added back the check for BOOLEAN_TYPE. The reason is the
middle-end always does the correct sign/zero extension for non-mode
precision types (this is not a VCE but a "load" [yes there is PR 115405
which I might get around to fixing since it is related to this])


> >
> >>
> >>
> >> > +      && !type_has_mode_precision_p (lhs_type)
> >> > +      /* Loads directly from a decl don't need the
> >> > +        rewrite as it should always contain a valid value.  */
> >> > +      && !DECL_P (rhs)
> >>
> >> What's the reasoning here that doesn't apply to other cases
> >> [that are not within a union]?  Is it that for a DECL we know
> >> the ref is not inside a union?  Wouldn't that extent to all refs
> >> based on a decl?  Note at least with C++ you can placement-new
> >> a union { bool b; char c; } onto a 'bool' decl ...
> >
> >
> > It was to work around the issue where lim would create a temporary
> variable which is not renamed yet and we would get an ICE. But I don't
> think it is needed due to I am going to change the rewriting to just use a
> VCE (see below).
> >
> >>
> >>
> >> > +      /* Bit-fields loads don't need a rewrite as the masking
> >> > +        happens for them.  */
> >> > +      && (TREE_CODE (rhs) != COMPONENT_REF
> >> > +         || !DECL_BIT_FIELD (TREE_OPERAND (rhs, 1))))
> >> > +    return true;
> >> >    /* VCE from integral types to a integral types but with
> >> >       a smaller precision need to be changed into casts
> >> >       to be well defined. */
> >> > @@ -10558,6 +10572,104 @@ rewrite_to_defined_unconditional
> (gimple_stmt_iterator *gsi, gimple *stmt,
> >> >        print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> >> >      }
> >> >    gimple_seq stmts = NULL;
> >> > +  tree lhs = gimple_assign_lhs (stmt);
> >> > +
> >> > +  /* Bool loads need to be rewritten to be a load from the same mode
> >> > +     and then a cast to the boolean type so the other bits are
> masked off
> >> > +     correctly since the load was done conditionally. It is similar
> to the VCE
> >> > +     case below.  */
> >> > +  if (gimple_assign_load_p (stmt)
> >> > +      && TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE)
> >> > +    {
> >> > +      tree rhs = gimple_assign_rhs1 (stmt);
> >> > +      /* Direct reads from decl don't need a rewrite. */
> >> > +      gcc_assert (!DECL_P (rhs));
> >> > +      /* Bit-fields loads will do the masking so don't need the
> rewriting.  */
> >> > +      gcc_assert (TREE_CODE (rhs) != COMPONENT_REF
> >> > +                 || !DECL_BIT_FIELD (TREE_OPERAND (rhs, 1)));
> >> > +
> >> > +      location_t loc = gimple_location (stmt);
> >> > +      auto bits = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (TREE_TYPE
> (rhs)));
> >> > +      tree new_type = build_nonstandard_integer_type (bits, true);
> >> > +      poly_int64 offset;
> >> > +      tree base = get_addr_base_and_unit_offset (rhs, &offset);
> >> > +      tree off;
> >> > +      /* Preserve address-space information.  */
> >> > +      addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (rhs));
> >> > +      if (as != TYPE_ADDR_SPACE (new_type))
> >> > +       new_type = build_qualified_type (new_type,
> >> > +                                        TYPE_QUALS (new_type)
> >> > +                                        | ENCODE_QUAL_ADDR_SPACE
> (as));
> >> > +
> >> > +      /* get_addr_base_and_unit_offset returns NULL for references
> with a variable
> >> > +        offset such as array[var_index].  */
> >> > +      if (!base)
> >> > +       {
> >> > +         tree tmp, addr;
> >> > +         gassign *nstmt;
> >> > +
> >> > +         /* Create instead `tmp = &array[index];` which then will be
> deferenced. */
> >> > +         tmp = make_ssa_name (build_pointer_type (TREE_TYPE (rhs)));
> >> > +         addr = build_fold_addr_expr (unshare_expr (rhs));
> >>
> >> Ugh, that will make the base addressable.  How about, for an outermost
> >> ARRAY_REF, V_C_E its base to an array type with adjusted component?
> >> And for any other handled_component_p with constant offset (ugh,
> >> no good idea for the variable offset component-ref case - building a new
> >> fake FIELD_DECL is a bit too ugly?) replace it with
> >> a bit-field-ref?  You need to dive down to the base and replace that
> with
> >> a MEM_REF with adjusted alias-pointer type.
> >
> >
> > So it turns out I can just remove this whole thing and just add a VCE
> around the original reference and it will work correctly. I didn't think of
> that before.
>
> Huh.  So, a V_C_E shouldn't effect how we load the value - but this
> isn't what we are after here - instead it is kind of an rvalue
> re-interpretation
> (yeah, we are sometimes seeing it on the LHS...).  I guess that's exactly
> what
> we want here.
>
> Indeed, I also didn't think of this simple solution.
>

It was only after you mentioned it as referenced for ARRAY_REF that I
thought why not just do it for all.


>
> Note though that you cannot wrap a REAL/IMAG_PART_EXPR or
> a BIT_FIELD_REF inside a VIEW_CONVERT_EXPR (see gimple
> verification where we single out some refs as to be outermost only).
> You possibly won't ever see a BIT_FIELD_REF of a bool with
> non-mode precision since that would be an invalid BIT_FIELD_REF
> unless it accesses a single bit only - which is a case that shouldn't
> need rewriting?  (possibly a GIMPLE testcase could prove this)
> I hope we don't have _Complex bool, but who knows ...
>

So for BFR, the type precision has to equal to size that is referenced from
BFR, at last that is what tree-cfg checks:
 if (INTEGRAL_TYPE_P (TREE_TYPE (expr))
     && maybe_ne (TYPE_PRECISION (TREE_TYPE (expr)), size))
   {
     error ("integral result type precision does not match "
    "field size of %qs", code_name);
     return true;
   }
 So at least for the precision case == 1, it will work correctly. So I
added a check for BFR to avoid the rewriting there.
As other kinds of booleans (precision !=1), The main case where this has
been a problem is unsigned booleans were we do `a | b` and b was from this
load so BFR will always truncate to the size requested and the only case
where `a ? 1 : b` gets optimized to `a | b` is for unsigned booleans with
precision of 1. I have not found another case

As far as Complex boolean types, build_complex_type does not reject them.
Pointers were rejected with
https://gcc.gnu.org/pipermail/gcc-patches/2008-October/250588.html . So
maybe complex bools should also be turned into an ICE too.  I will test
that separately from this.

I will submit a new version of the patch after it finishes testing.

Thanks,
Andrew



> Richard.
>
> >>
> >>
> >> > +         STRIP_USELESS_TYPE_CONVERSION (addr);
> >> > +         nstmt = gimple_build_assign (tmp, addr);
> >> > +         gimple_set_location (nstmt, loc);
> >> > +         if (in_place)
> >> > +           gsi_insert_before (gsi, nstmt, GSI_SAME_STMT);
> >> > +         else
> >> > +           gimple_seq_add_stmt (&stmts, nstmt);
> >> > +         off = build_zero_cst (reference_alias_ptr_type (rhs));
> >> > +         base = tmp;
> >> > +       }
> >> > +      /* For [MEM a + 10]->b change it into MEM[a + 10 +offseof(b)]
> */
> >> > +      else if (TREE_CODE (base) == MEM_REF)
> >> > +       {
> >> > +         off = build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)),
> >> > +                              offset);
> >> > +         off = int_const_binop (PLUS_EXPR, TREE_OPERAND (base, 1),
> off);
> >> > +         base = unshare_expr (TREE_OPERAND (base, 0));
> >> > +       }
> >> > +      /* For `a.b` change it into `MEM[&a + offsetof(b)]` */
> >> > +      else
> >> > +       {
> >> > +         off = build_int_cst (reference_alias_ptr_type (rhs),
> >> > +                              offset);
> >> > +         base = build_fold_addr_expr (unshare_expr (base));
> >> > +       }
> >> > +      /*  */
> >>
> >> ?
> >
> >
> > I was going to add a comment here but I forgot. It is no longer needed
> due to the VCE part.
> > I should have a new patch submitted this evening for this.
> >
> >
> > Thanks,
> > Andrew Pinski
> >
> >>
> >>
> >> > +      if (TYPE_ALIGN (new_type) > TYPE_ALIGN (TREE_TYPE (rhs)))
> >> > +       new_type = build_aligned_type (new_type, TYPE_ALIGN
> (TREE_TYPE (rhs)));
> >> > +      tree mem_ref = fold_build2_loc (loc, MEM_REF, new_type, base,
> off);
> >> > +      if (TREE_THIS_VOLATILE (rhs))
> >> > +       TREE_THIS_VOLATILE (mem_ref) = 1;
> >> > +      if (TREE_SIDE_EFFECTS (rhs))
> >> > +       TREE_SIDE_EFFECTS (mem_ref) = 1;
> >> > +
> >> > +      /* Replace the original load with a new load and a new lhs. */
> >> > +      tree new_lhs = make_ssa_name (new_type);
> >> > +      gimple_assign_set_rhs1 (stmt, mem_ref);
> >> > +      gimple_assign_set_lhs (stmt, new_lhs);
> >> > +
> >> > +      if (in_place)
> >> > +         update_stmt (stmt);
> >> > +      else
> >> > +       {
> >> > +         gimple_set_modified (stmt, true);
> >> > +         gimple_seq_add_stmt (&stmts, stmt);
> >> > +       }
> >> > +
> >> > +      /* Build the conversion statement.  */
> >> > +      gimple *cvt = gimple_build_assign (lhs, NOP_EXPR, new_lhs);
> >> > +      if (in_place)
> >> > +       {
> >> > +         gsi_insert_after (gsi, cvt, GSI_SAME_STMT);
> >> > +         update_stmt (stmt);
> >> > +       }
> >> > +      else
> >> > +       gimple_seq_add_stmt (&stmts, cvt);
> >> > +      return stmts;
> >> > +    }
> >> > +
> >> >    /* VCE from integral types to another integral types but with
> >> >       smaller precisions need to be changed into casts
> >> >       to be well defined. */
> >> > @@ -10579,7 +10691,6 @@ rewrite_to_defined_unconditional
> (gimple_stmt_iterator *gsi, gimple *stmt,
> >> >         }
> >> >        return stmts;
> >> >      }
> >> > -  tree lhs = gimple_assign_lhs (stmt);
> >> >    tree type = unsigned_type_for (TREE_TYPE (lhs));
> >> >    if (gimple_assign_rhs_code (stmt) == ABS_EXPR)
> >> >      gimple_assign_set_rhs_code (stmt, ABSU_EXPR);
> >> > diff --git a/gcc/testsuite/gcc.dg/torture/pr121279-1.c
> b/gcc/testsuite/gcc.dg/torture/pr121279-1.c
> >> > new file mode 100644
> >> > index 00000000000..516b887fe65
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.dg/torture/pr121279-1.c
> >> > @@ -0,0 +1,49 @@
> >> > +/* { dg-do run } */
> >> > +/* PR tree-optimization/121279 */
> >> > +
> >> > +#include <stdbool.h>
> >> > +
> >> > +struct Value {
> >> > +    int type;
> >> > +    union {
> >> > +        bool boolean;
> >> > +        long long t;
> >> > +    };
> >> > +};
> >> > +
> >> > +static struct Value s_item_mem;
> >> > +
> >> > +/* truthy was being miscompiled for the value.type==2 case,
> >> > +   because the bool load from the union is pulled out of the
> >> > +   loop but that was conditional before and now it is not,
> >> > +   so turns it into  `s_item_mem.type == 1 | bool` which is
> >> > +   not valid if `s_item_mem.type == 2` . */
> >> > +static bool truthy(void) __attribute__((noipa));
> >> > +static bool
> >> > +truthy(void)
> >> > +{
> >> > +    bool tt = false;
> >> > +    for(int i = 0; i < 10; i++)
> >> > +    {
> >> > +      if (s_item_mem.type == 0)
> >> > +        tt = tt | 0;
> >> > +      else if (s_item_mem.type == 1)
> >> > +        tt = tt | s_item_mem.boolean;
> >> > +      else
> >> > +        tt = tt |  1;
> >> > +    }
> >> > +    return tt;
> >> > +}
> >> > +
> >> > +int
> >> > +main(void)
> >> > +{
> >> > +    s_item_mem.type = 2;
> >> > +    s_item_mem.t = -1;
> >> > +    bool b1 = !truthy();
> >> > +    s_item_mem.type = 1;
> >> > +    s_item_mem.boolean = b1;
> >> > +    bool b = truthy();
> >> > +    if (b1 != b)  __builtin_abort();
> >> > +    if (b) __builtin_abort();
> >> > +}
> >> > --
> >> > 2.43.0
> >> >
>

Reply via email to