On Wed, Aug 20, 2025 at 1:14 AM Richard Biener <richard.guent...@gmail.com> wrote:
> On Tue, Aug 19, 2025 at 11:24 PM Andrew Pinski > <andrew.pin...@oss.qualcomm.com> wrote: > > > > On Tue, Aug 19, 2025 at 12:26 AM Richard Biener < > richard.guent...@gmail.com> wrote: > >> > >> On Sat, Aug 16, 2025 at 7:34 PM Andrew Pinski > >> <andrew.pin...@oss.qualcomm.com> wrote: > >> > > >> > Just like r16-465-gf2bb7ffe84840d8 but this time > >> > instead of a VCE there is a full on load from a boolean. > >> > This showed up when trying to remove the extra copy > >> > in the testcase from the revision mentioned above (pr120122-1.c). > >> > So when moving loads from a boolean type from being conditional > >> > to non-conditional, the load needs to become a full load and then > >> > casted into a bool so that the upper bits are correct. > >> > > >> > We could use build_ref_for_offset if that supported a similar ability > >> > to add the statements into the sequence. Instead we copy a modified > >> > version that does a similar thing into > rewrite_to_defined_unconditional. > >> > > >> > Bootstrapped and tested on x86_64-linux-gnu. > >> > > >> > PR tree-optimization/121279 > >> > > >> > gcc/ChangeLog: > >> > > >> > * gimple-fold.cc (gimple_needing_rewrite_undefined): Return > >> > true for boolean loads. > >> > (rewrite_to_defined_unconditional): Handle boolean loads. > >> > > >> > gcc/testsuite/ChangeLog: > >> > > >> > * gcc.dg/torture/pr121279-1.c: New test. > >> > > >> > Signed-off-by: Andrew Pinski <andrew.pin...@oss.qualcomm.com> > >> > --- > >> > gcc/gimple-fold.cc | 113 > +++++++++++++++++++++- > >> > gcc/testsuite/gcc.dg/torture/pr121279-1.c | 49 ++++++++++ > >> > 2 files changed, 161 insertions(+), 1 deletion(-) > >> > create mode 100644 gcc/testsuite/gcc.dg/torture/pr121279-1.c > >> > > >> > diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc > >> > index 85319b33492..40707d03946 100644 > >> > --- a/gcc/gimple-fold.cc > >> > +++ b/gcc/gimple-fold.cc > >> > @@ -69,6 +69,7 @@ along with GCC; see the file COPYING3. If not see > >> > #include "varasm.h" > >> > #include "internal-fn.h" > >> > #include "gimple-range.h" > >> > +#include "alias.h" > >> > > >> > enum strlen_range_kind { > >> > /* Compute the exact constant string length. */ > >> > @@ -10520,6 +10521,19 @@ gimple_needing_rewrite_undefined (gimple > *stmt) > >> > && !POINTER_TYPE_P (lhs_type)) > >> > return false; > >> > tree rhs = gimple_assign_rhs1 (stmt); > >> > + /* Boolean loads need special handling as they are treated as a > full MODE load > >> > + and don't mask off the bits for the precision. */ > >> > + if (gimple_assign_load_p (stmt) > >> > + && TREE_CODE (lhs_type) == BOOLEAN_TYPE > >> > >> I think the relevant thing is the code of the RHS type. Also does this > >> not apply to non-bitfield loads of !type_has_mode_precision_p types > >> in general? I am thinking of the fact that we treat a conversion > >> to/from a same precision/signedness enum/integer type as useless, > >> so singling out BOOLEAN_TYPE here looks "dangerous" at least? > >> You are not specifically looking for TYPE_PRECISION == 1 (which is OK > >> I think). > > > > > > I think it does apply to non-bitfields loads of > !type_has_mode_precision_p in general and not just boolean types. > > I will remove the check for BOOLEAN_TYPE. > In the end I added back the check for BOOLEAN_TYPE. The reason is the middle-end always does the correct sign/zero extension for non-mode precision types (this is not a VCE but a "load" [yes there is PR 115405 which I might get around to fixing since it is related to this]) > > > >> > >> > >> > + && !type_has_mode_precision_p (lhs_type) > >> > + /* Loads directly from a decl don't need the > >> > + rewrite as it should always contain a valid value. */ > >> > + && !DECL_P (rhs) > >> > >> What's the reasoning here that doesn't apply to other cases > >> [that are not within a union]? Is it that for a DECL we know > >> the ref is not inside a union? Wouldn't that extent to all refs > >> based on a decl? Note at least with C++ you can placement-new > >> a union { bool b; char c; } onto a 'bool' decl ... > > > > > > It was to work around the issue where lim would create a temporary > variable which is not renamed yet and we would get an ICE. But I don't > think it is needed due to I am going to change the rewriting to just use a > VCE (see below). > > > >> > >> > >> > + /* Bit-fields loads don't need a rewrite as the masking > >> > + happens for them. */ > >> > + && (TREE_CODE (rhs) != COMPONENT_REF > >> > + || !DECL_BIT_FIELD (TREE_OPERAND (rhs, 1)))) > >> > + return true; > >> > /* VCE from integral types to a integral types but with > >> > a smaller precision need to be changed into casts > >> > to be well defined. */ > >> > @@ -10558,6 +10572,104 @@ rewrite_to_defined_unconditional > (gimple_stmt_iterator *gsi, gimple *stmt, > >> > print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); > >> > } > >> > gimple_seq stmts = NULL; > >> > + tree lhs = gimple_assign_lhs (stmt); > >> > + > >> > + /* Bool loads need to be rewritten to be a load from the same mode > >> > + and then a cast to the boolean type so the other bits are > masked off > >> > + correctly since the load was done conditionally. It is similar > to the VCE > >> > + case below. */ > >> > + if (gimple_assign_load_p (stmt) > >> > + && TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE) > >> > + { > >> > + tree rhs = gimple_assign_rhs1 (stmt); > >> > + /* Direct reads from decl don't need a rewrite. */ > >> > + gcc_assert (!DECL_P (rhs)); > >> > + /* Bit-fields loads will do the masking so don't need the > rewriting. */ > >> > + gcc_assert (TREE_CODE (rhs) != COMPONENT_REF > >> > + || !DECL_BIT_FIELD (TREE_OPERAND (rhs, 1))); > >> > + > >> > + location_t loc = gimple_location (stmt); > >> > + auto bits = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (TREE_TYPE > (rhs))); > >> > + tree new_type = build_nonstandard_integer_type (bits, true); > >> > + poly_int64 offset; > >> > + tree base = get_addr_base_and_unit_offset (rhs, &offset); > >> > + tree off; > >> > + /* Preserve address-space information. */ > >> > + addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (rhs)); > >> > + if (as != TYPE_ADDR_SPACE (new_type)) > >> > + new_type = build_qualified_type (new_type, > >> > + TYPE_QUALS (new_type) > >> > + | ENCODE_QUAL_ADDR_SPACE > (as)); > >> > + > >> > + /* get_addr_base_and_unit_offset returns NULL for references > with a variable > >> > + offset such as array[var_index]. */ > >> > + if (!base) > >> > + { > >> > + tree tmp, addr; > >> > + gassign *nstmt; > >> > + > >> > + /* Create instead `tmp = &array[index];` which then will be > deferenced. */ > >> > + tmp = make_ssa_name (build_pointer_type (TREE_TYPE (rhs))); > >> > + addr = build_fold_addr_expr (unshare_expr (rhs)); > >> > >> Ugh, that will make the base addressable. How about, for an outermost > >> ARRAY_REF, V_C_E its base to an array type with adjusted component? > >> And for any other handled_component_p with constant offset (ugh, > >> no good idea for the variable offset component-ref case - building a new > >> fake FIELD_DECL is a bit too ugly?) replace it with > >> a bit-field-ref? You need to dive down to the base and replace that > with > >> a MEM_REF with adjusted alias-pointer type. > > > > > > So it turns out I can just remove this whole thing and just add a VCE > around the original reference and it will work correctly. I didn't think of > that before. > > Huh. So, a V_C_E shouldn't effect how we load the value - but this > isn't what we are after here - instead it is kind of an rvalue > re-interpretation > (yeah, we are sometimes seeing it on the LHS...). I guess that's exactly > what > we want here. > > Indeed, I also didn't think of this simple solution. > It was only after you mentioned it as referenced for ARRAY_REF that I thought why not just do it for all. > > Note though that you cannot wrap a REAL/IMAG_PART_EXPR or > a BIT_FIELD_REF inside a VIEW_CONVERT_EXPR (see gimple > verification where we single out some refs as to be outermost only). > You possibly won't ever see a BIT_FIELD_REF of a bool with > non-mode precision since that would be an invalid BIT_FIELD_REF > unless it accesses a single bit only - which is a case that shouldn't > need rewriting? (possibly a GIMPLE testcase could prove this) > I hope we don't have _Complex bool, but who knows ... > So for BFR, the type precision has to equal to size that is referenced from BFR, at last that is what tree-cfg checks: if (INTEGRAL_TYPE_P (TREE_TYPE (expr)) && maybe_ne (TYPE_PRECISION (TREE_TYPE (expr)), size)) { error ("integral result type precision does not match " "field size of %qs", code_name); return true; } So at least for the precision case == 1, it will work correctly. So I added a check for BFR to avoid the rewriting there. As other kinds of booleans (precision !=1), The main case where this has been a problem is unsigned booleans were we do `a | b` and b was from this load so BFR will always truncate to the size requested and the only case where `a ? 1 : b` gets optimized to `a | b` is for unsigned booleans with precision of 1. I have not found another case As far as Complex boolean types, build_complex_type does not reject them. Pointers were rejected with https://gcc.gnu.org/pipermail/gcc-patches/2008-October/250588.html . So maybe complex bools should also be turned into an ICE too. I will test that separately from this. I will submit a new version of the patch after it finishes testing. Thanks, Andrew > Richard. > > >> > >> > >> > + STRIP_USELESS_TYPE_CONVERSION (addr); > >> > + nstmt = gimple_build_assign (tmp, addr); > >> > + gimple_set_location (nstmt, loc); > >> > + if (in_place) > >> > + gsi_insert_before (gsi, nstmt, GSI_SAME_STMT); > >> > + else > >> > + gimple_seq_add_stmt (&stmts, nstmt); > >> > + off = build_zero_cst (reference_alias_ptr_type (rhs)); > >> > + base = tmp; > >> > + } > >> > + /* For [MEM a + 10]->b change it into MEM[a + 10 +offseof(b)] > */ > >> > + else if (TREE_CODE (base) == MEM_REF) > >> > + { > >> > + off = build_int_cst (TREE_TYPE (TREE_OPERAND (base, 1)), > >> > + offset); > >> > + off = int_const_binop (PLUS_EXPR, TREE_OPERAND (base, 1), > off); > >> > + base = unshare_expr (TREE_OPERAND (base, 0)); > >> > + } > >> > + /* For `a.b` change it into `MEM[&a + offsetof(b)]` */ > >> > + else > >> > + { > >> > + off = build_int_cst (reference_alias_ptr_type (rhs), > >> > + offset); > >> > + base = build_fold_addr_expr (unshare_expr (base)); > >> > + } > >> > + /* */ > >> > >> ? > > > > > > I was going to add a comment here but I forgot. It is no longer needed > due to the VCE part. > > I should have a new patch submitted this evening for this. > > > > > > Thanks, > > Andrew Pinski > > > >> > >> > >> > + if (TYPE_ALIGN (new_type) > TYPE_ALIGN (TREE_TYPE (rhs))) > >> > + new_type = build_aligned_type (new_type, TYPE_ALIGN > (TREE_TYPE (rhs))); > >> > + tree mem_ref = fold_build2_loc (loc, MEM_REF, new_type, base, > off); > >> > + if (TREE_THIS_VOLATILE (rhs)) > >> > + TREE_THIS_VOLATILE (mem_ref) = 1; > >> > + if (TREE_SIDE_EFFECTS (rhs)) > >> > + TREE_SIDE_EFFECTS (mem_ref) = 1; > >> > + > >> > + /* Replace the original load with a new load and a new lhs. */ > >> > + tree new_lhs = make_ssa_name (new_type); > >> > + gimple_assign_set_rhs1 (stmt, mem_ref); > >> > + gimple_assign_set_lhs (stmt, new_lhs); > >> > + > >> > + if (in_place) > >> > + update_stmt (stmt); > >> > + else > >> > + { > >> > + gimple_set_modified (stmt, true); > >> > + gimple_seq_add_stmt (&stmts, stmt); > >> > + } > >> > + > >> > + /* Build the conversion statement. */ > >> > + gimple *cvt = gimple_build_assign (lhs, NOP_EXPR, new_lhs); > >> > + if (in_place) > >> > + { > >> > + gsi_insert_after (gsi, cvt, GSI_SAME_STMT); > >> > + update_stmt (stmt); > >> > + } > >> > + else > >> > + gimple_seq_add_stmt (&stmts, cvt); > >> > + return stmts; > >> > + } > >> > + > >> > /* VCE from integral types to another integral types but with > >> > smaller precisions need to be changed into casts > >> > to be well defined. */ > >> > @@ -10579,7 +10691,6 @@ rewrite_to_defined_unconditional > (gimple_stmt_iterator *gsi, gimple *stmt, > >> > } > >> > return stmts; > >> > } > >> > - tree lhs = gimple_assign_lhs (stmt); > >> > tree type = unsigned_type_for (TREE_TYPE (lhs)); > >> > if (gimple_assign_rhs_code (stmt) == ABS_EXPR) > >> > gimple_assign_set_rhs_code (stmt, ABSU_EXPR); > >> > diff --git a/gcc/testsuite/gcc.dg/torture/pr121279-1.c > b/gcc/testsuite/gcc.dg/torture/pr121279-1.c > >> > new file mode 100644 > >> > index 00000000000..516b887fe65 > >> > --- /dev/null > >> > +++ b/gcc/testsuite/gcc.dg/torture/pr121279-1.c > >> > @@ -0,0 +1,49 @@ > >> > +/* { dg-do run } */ > >> > +/* PR tree-optimization/121279 */ > >> > + > >> > +#include <stdbool.h> > >> > + > >> > +struct Value { > >> > + int type; > >> > + union { > >> > + bool boolean; > >> > + long long t; > >> > + }; > >> > +}; > >> > + > >> > +static struct Value s_item_mem; > >> > + > >> > +/* truthy was being miscompiled for the value.type==2 case, > >> > + because the bool load from the union is pulled out of the > >> > + loop but that was conditional before and now it is not, > >> > + so turns it into `s_item_mem.type == 1 | bool` which is > >> > + not valid if `s_item_mem.type == 2` . */ > >> > +static bool truthy(void) __attribute__((noipa)); > >> > +static bool > >> > +truthy(void) > >> > +{ > >> > + bool tt = false; > >> > + for(int i = 0; i < 10; i++) > >> > + { > >> > + if (s_item_mem.type == 0) > >> > + tt = tt | 0; > >> > + else if (s_item_mem.type == 1) > >> > + tt = tt | s_item_mem.boolean; > >> > + else > >> > + tt = tt | 1; > >> > + } > >> > + return tt; > >> > +} > >> > + > >> > +int > >> > +main(void) > >> > +{ > >> > + s_item_mem.type = 2; > >> > + s_item_mem.t = -1; > >> > + bool b1 = !truthy(); > >> > + s_item_mem.type = 1; > >> > + s_item_mem.boolean = b1; > >> > + bool b = truthy(); > >> > + if (b1 != b) __builtin_abort(); > >> > + if (b) __builtin_abort(); > >> > +} > >> > -- > >> > 2.43.0 > >> > >