Am Montag, dem 25.11.2024 um 10:35 +0100 schrieb Richard Biener:
> On Mon, 25 Nov 2024, Richard Biener wrote:
>
> > On Sat, 23 Nov 2024, Martin Uecker wrote:
> >
> > >
> > > This patch tries fixes the errors we have because of
> > > flexible array members. I am bit unsure about the exception
> > > for the mode.
> > >
> > > Bootstrapped and regression tested on x86_64.
> > >
> > >
> > >
> > > Fix type compatibility for types with flexible array member
> > > [PR113688,PR114014,PR117724]
> > >
> > > verify_type checks the compatibility of TYPE_CANONICAL using
> > > gimple_canonical_types_compatible_p. But it is stricter than what
> > > the
> > > C standard requires and therefor inconsistent with how TYPE_CANONICAL
> > > is set
> > > in the C FE. Here, the logic is changed to ignore array size when
> > > one of the
> > > types is a flexible array member. To not get errors because of
> > > inconsistent
> > > number of members, zero-sized arrays are not ignored anymore when
> > > checking
> > > fields of a struct (which is stricter than what was done before).
> > > Finally, a exception is added that allows the TYPE_MODE of a type with
> > > flexible array member to differ from another compatible type.
> > >
> > > PR c/113688
> > > PR c/114014
> > > PR c/117724
> > >
> > > gcc/ChangeLog:
> > > * tree.cc (gimple_canonical_types_compatible_p): Revise
> > > logic for types with FAM.
> > > (verify_type): Add exception for mode for types with FAM.
> > >
> > > gcc/testsuite/ChangeLog:
> > > * gcc.dg/pr113688.c: New test.
> > > * gcc.dg/pr114014.c: New test.
> > > * gcc.dg/pr117724.c: New test.
> > >
> > > diff --git a/gcc/testsuite/gcc.dg/pr113688.c
> > > b/gcc/testsuite/gcc.dg/pr113688.c
> > > new file mode 100644
> > > index 00000000000..8dee8c86f1b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/pr113688.c
> > > @@ -0,0 +1,8 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-g" } */
> > > +
> > > +struct S{int x,y[1];}*a;
> > > +int main(void){
> > > + struct S{int x,y[];};
> > > +}
> > > +
> > > diff --git a/gcc/testsuite/gcc.dg/pr114014.c
> > > b/gcc/testsuite/gcc.dg/pr114014.c
> > > new file mode 100644
> > > index 00000000000..ab783f4f85d
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/pr114014.c
> > > @@ -0,0 +1,14 @@
> > > +/* PR c/114014
> > > + * { dg-do compile }
> > > + * { dg-options "-std=c23 -g" } */
> > > +
> > > +struct r {
> > > + int a;
> > > + char b[];
> > > +};
> > > +struct r {
> > > + int a;
> > > + char b[0];
> > > +}; /* { dg-error "redefinition" } */
> > > +
> > > +
> > > diff --git a/gcc/testsuite/gcc.dg/pr117724.c
> > > b/gcc/testsuite/gcc.dg/pr117724.c
> > > new file mode 100644
> > > index 00000000000..d631daeb644
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/pr117724.c
> > > @@ -0,0 +1,16 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-g" } */
> > > +
> > > +struct {
> > > + unsigned long len;
> > > + unsigned long size;
> > > + char data[];
> > > +}; /* { dg-warning "unnamed struct" } */
> > > +struct {
> > > + struct {
> > > + unsigned long len;
> > > + unsigned long size;
> > > + char data[6];
> > > + };
> > > +}; /* { dg-warning "unnamed struct" } */
> > > +
> > > diff --git a/gcc/tree.cc b/gcc/tree.cc
> > > index 1da06c7d4e9..dbf6b180496 100644
> > > --- a/gcc/tree.cc
> > > +++ b/gcc/tree.cc
> > > @@ -13900,8 +13900,11 @@ gimple_canonical_types_compatible_p (const_tree
> > > t1, const_tree t2,
> > > || TREE_CODE (t1) == NULLPTR_TYPE)
> > > return true;
> > >
> > > - /* Can't be the same type if they have different mode. */
> > > - if (TYPE_MODE (t1) != TYPE_MODE (t2))
> > > + /* Can't be compatible types if they have different mode. We allow
> > > + mismatching modes for types with flexible array member. */
> > > + if (!flexible_array_type_p (t1)
> > > + && !flexible_array_type_p (t2)
> >
> > These are quite expensive (recursive on UNION, walking TYPE_FIELDs for
> > RECORD) and not handling ARRAY_TYPE itself.
> >
> > I'll also note that same TYPE_CANONICAL implies same TYPE_SIZE given
> > for aggregate assignment we assume we can take the copy size from
> > either side.
This should be fine if there is nothing in the middle end is
creating such assignments.
The C FE should not create such assignments, because it checks
compatibility of types using the stricter non-transitive rule from
ISO C and not the relaxed compatibility that is used for assigning
TYPE_CANONICAL (which is relaxed to be transitive to be able to
form equivalence classes).
> >
> > So I wonder if instead
> >
> > COMPLETE_TYPE_P (t1)
> > && COMPLETE_TYPE_P (t2)
> >
> > should be used here.
> >
I don't think this works, because the structs with FAM are also
complete types (yes, it would make more sense if they were not).
> > I assume struct { int len; char data[1]; } and
> > struct { int len; char data[2]; } are not compatible but both can
> > have TYPE_CANONICAL of struct {int len; char data[]; }?
Yes. At the moment, any of the three types could end up the
TYPE_CANONICAL depending on which is encountered first.
> >
> > > + && (TYPE_MODE (t1) != TYPE_MODE (t2)))
> > > return false;
> > >
> > > /* Non-aggregate types can be handled cheaply. */
> > > @@ -13952,7 +13955,7 @@ gimple_canonical_types_compatible_p (const_tree
> > > t1, const_tree t2,
> > > {
> > > case ARRAY_TYPE:
> > > /* Array types are the same if the element types are the same and
> > > - the number of elements are the same. */
> > > + minimum and maximum index are the same. */
> > > if (!gimple_canonical_types_compatible_p (TREE_TYPE (t1),
> > > TREE_TYPE (t2),
> > > trust_type_canonical)
> > > || TYPE_STRING_FLAG (t1) != TYPE_STRING_FLAG (t2)
> > > @@ -14046,23 +14049,35 @@ gimple_canonical_types_compatible_p (const_tree
> > > t1, const_tree t2,
> > > f1 || f2;
> > > f1 = TREE_CHAIN (f1), f2 = TREE_CHAIN (f2))
> > > {
> > > - /* Skip non-fields and zero-sized fields. */
> > > - while (f1 && (TREE_CODE (f1) != FIELD_DECL
> > > - || (DECL_SIZE (f1)
> > > - && integer_zerop (DECL_SIZE (f1)))))
> > > + /* Skip non-fields. */
> > > + while (f1 && (TREE_CODE (f1) != FIELD_DECL))
> > > f1 = TREE_CHAIN (f1);
> > > - while (f2 && (TREE_CODE (f2) != FIELD_DECL
> > > - || (DECL_SIZE (f2)
> > > - && integer_zerop (DECL_SIZE (f2)))))
> > > + while (f2 && (TREE_CODE (f2) != FIELD_DECL))
> > > f2 = TREE_CHAIN (f2);
> >
> > The above assumes zero-sized fields are last (I guess an OK assumption),
> > can you put a comment after this indicating this fact?
Ok.
> >
> > > if (!f1 || !f2)
> > > break;
> > > +
> > > + tree t1 = TREE_TYPE (f1);
> > > + tree t2 = TREE_TYPE (f2);
> > > +
> > > + /* Special case for flexible array members. */
> > > + if (TREE_CHAIN (f1) == NULL_TREE
> > > + && TREE_CHAIN (f2) == NULL_TREE
> > > + && TREE_CODE (t1) == ARRAY_TYPE
> > > + && TREE_CODE (t2) == ARRAY_TYPE
> > > + && (!DECL_NOT_FLEXARRAY (f1)
> > > + || !DECL_NOT_FLEXARRAY (f2))
> > > + && TYPE_REVERSE_STORAGE_ORDER (t1) ==
> > > TYPE_REVERSE_STORAGE_ORDER (t2)
> > > + && TYPE_NONALIASED_COMPONENT (t1) == TYPE_NONALIASED_COMPONENT
> > > (t2)
> > > + && gimple_canonical_types_compatible_p
> > > + (TREE_TYPE (t1), TREE_TYPE (t2),
> > > + trust_type_canonical))
> > > + ;
> > > /* The fields must have the same name, offset and type. */
> > > - if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2)
> > > + else if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2)
> > > || !gimple_compare_field_offset (f1, f2)
> > > || !gimple_canonical_types_compatible_p
> > > - (TREE_TYPE (f1), TREE_TYPE (f2),
> > > - trust_type_canonical))
> > > + (t1, t2, trust_type_canonical))
> >
> > So what was actually mismatching with the old code? Why can you spare
> > comparing the start of the array member (gimple_compare_field_offset)?
> > Should the special-casing be only to not compare the array type itself
> > but its element type? What about int[][] (not sure if there's such
> > thing in C, an array type of array type with flexible size)?
int[] would be an incomplete array which is not allowed as an
element type for an array.
> >
> > > return false;
> > > }
> > >
> > > @@ -14206,6 +14221,9 @@ verify_type (const_tree t)
> > > }
> > >
> > > if (COMPLETE_TYPE_P (t) && TYPE_CANONICAL (t)
> > > + /* We allow a mismatch for flexible array members. */
> > > + && !flexible_array_type_p (t)
> > > + && !flexible_array_type_p (TYPE_CANONICAL (t))
> >
> > Same as above, COMPLETE_TYPE_P (TYPE_CANONICAL (t))?
>
> One more thing - for LTO we throw away TYPE_CANONICAL and re-compute
> it in a supposedly more conservative way than frontends. This happens
> together with this function and lto/lto-common.cc:
> iterative_hash_canonical_type and gimple_register_canonical_type
> which hashes TYPE_MODE - it would need to stop doing that and it
> would have an issue with non-transitiveness of canonicality since
> it takes the "first" struct as the canonical one, not necessarily
> the !COMPLETE_TYPE_P one (which could be from another TU).
And if we just always ignore the size for an array that comes
last? (and also remove TYPE_MODE from hashing). Then it could
continue to use this function to build equivalence classes. It
might not pick the one with the flex member as canonical (just
like the C FE), but this should not matter anywhere, or?
If this is not completely wrong I would try this first.
>
> So - this might not fly with LTO in the end, unless the C fronted
> would build a flexarray canonical type for each record type with
> an array type at the end and make the flexarray variant canonical
> (in the expectation of another TU containing such).
This would also be possible but is it necessary that the TYPE_CANONICAL
is the one with the flexarray? This would usually require creating
a new type specifically for this.
Martin
>
> Richard.
>
> > > && TYPE_MODE (t) != TYPE_MODE (TYPE_CANONICAL (t)))
> > > {
> > > error ("%<TYPE_MODE%> of %<TYPE_CANONICAL%> is not compatible");
> > >
> > >
> > >
> >
> >
>