Re: Teach gimple_canonical_types_compatible_p about incomplete types

Richard Biener Tue, 26 May 2015 01:20:09 -0700

On Mon, 25 May 2015, Jan Hubicka wrote:

> Richard,
> here is next patch of the series.  It adds all the logic for defining type
> equivalence that globs all complete types together in order to make incomplete
> type equivalent to every complete variant.
> 
> Effect of recursing on pointers
> ===============================
> 
> This is, of course, intended for hashing pointers, but I did not bundle it
> here.  When used in the obvious way to hash pointers (not included in this
> patch), the number of distinct canonical types in Firefox increase by 50%:
> 
> [WPA] GIMPLE canonical type table: size 32749, 21596 elements, 1637039 
> searches, 222326 collisions (ratio: 0.135810)
> [WPA] GIMPLE canonical type pointer-map: 21596 elements, 2271778 searches
> to:
> [WPA] GIMPLE canonical type table: size 65521, 32680 elements, 1637039 
> searches, 475073 collisions (ratio: 0.290203)
> [WPA] GIMPLE canonical type pointer-map: 32680 elements, 2303368 searches     
>   
> 
> I also noticed the increase in collision ratio; it is still not bad and we
> can strenghten the hash little bit incrementally.
> 
> Here is a testcase where we improve:
> 
> truct a
> {
>   int a; int *b;
> };
> struct a *ptr;
> void test (void);
> void linker_error (void);
> 
> struct b
> {
>   int a; short *b;
> };
> struct b *ptr2;
> inline void
> test()
> {
>   ptr2->a=2;
> }
> 
> void
> main()
> {
>   asm ("":"=r"(ptr));
>   asm ("":"=r"(ptr2));
>   while (1) {
>   ptr->a=1;
>   test();
>   if (ptr->a != 1)
>     linker_error ();
>    }
> }
> 
> with -fno-early-inlining we are able to disambiguate ptr->a from ptr2->a
> because we disambiguate struct a from struct b. For some reason we don't do
> that if I move the secon ASM statement to test() that looks like missed
> optimization.
> 
> Now the change does not really translate to great increase of disambiguations
> for Firefox (it seems more in noise). The reason is the pointer_type globbing
> in alias.c.


Yeah, we only get the improvement because of some "hack" in the tree
alias oracle which also uses the base object for TBAA.

> Globbing pointer types in alias.c
> =================================
> 
> This code seemed very odd to me, but after some tought it does make sense for
> C/C++.  It seemed to me that if we want to support alias of two different
> pointers of same type but different qualification, we must make it part of the
> TYPE_CANONICAL construction because that effectively unifies the compound 
> types
> build from them.  This is (fortunately) not really true: in C stores to array
> types do not exist and structures are identified by their names, not by
> structure. This means that this extra pointer equality does not propagate up
> (only compound types in interest are pointers) and globbing them late indeed
> allows to access pointer to A by pointer to B globally without fear of
> disasters.
> 
> C cross-TU aliasing rules
> and why I don't think globing pointers works for LTO of C only programs
> ========================================================================
> 
> In LTO however the situation is different.  We discussed following
> cross-language unification rules
> 
>  1) type names does not matter (C standard is clear on that)
>  2) field names does not matter (in C/C++ they do but we think matching the
>     names between C and other languages may be impractical)
>  3) qualifiers does not matter (they do for C, byt it may be impractical
>     for other languages)
>  4) references and pointers should be considered equal because we want to
>     interface languages with references only to C.
>     I suppose this is important for fortran
> 
> These rules do build structural equality up and thus do need to be part of
> canonical type computation machinery.  
> I looked in detail to what C standard say. It has notion of compatible type
> and composite type.  The relevant clauses of C standard are:
> 
>   1  Two types have compatible type if their types are the same.
>      Additional rules for determining whether two types are compatible are 
> described
>      in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.6 
> for
>      declarators.  Moreover, two structure, union, or enumerated types 
> declared
>      in separate translation units are compatible if their tags and members 
> satisfy
>      the following requirements:  If one is declared with a tag, the other 
> shall be
>      declared with the same tag.  If both are completed anywhere within their
>      respective translation units, then the following additional requirements 
> apply:
>      there shall be a one-to-one correspondence between their members such 
> that each
>      pair of corresponding members are declared with compatible types; if one 
> member
>      of the pair is declared with an alignment specifier, the other is 
> declared with
>      an equivalent alignment specifier; and if one member of the pair is 
> declared
>      with a name, the other is declared with the same name.  For two 
> structures,
>      corresponding members shall be declared in the same order.   For two
>      structures or unions, corresponding bit-fields shall have the same 
> widths.  For
>      two enumerations, corresponding members shall have the same values.
> 
> (tag is "struct a"/"struct b", so C standard actually require names to match
>  for structures/unions/enums to be compatible but by transitivity it needs
>  to match compatible structures without a tag, so indeed we can not use type 
> names
>  and preserve transitivity of our notion of compatibility even considering C 
> alone.
> 
>  We get wrong unions, because we insist on compatibility in order, while
>  standard doesn't.

Yeah, we should fix that.  And in fact, for cross-language LTO I don't
see why

  union { int a; char c; };

and

  union { int a; short s; };

should not be compatible - they have a common member after all.  So
I'd like to glob all unions that have the same size (and as improvement
over that, that have at least one compatible member).  That also get's
rid of the issue that we'd need to sort union members for the comparison
to avoid quadraticness (as long as we don't check for that one compatible
member).

Oh, and are

  union { int a; };

and

  struct { int a; };

not compatible?  They are layout-wise at least.  Likewise the struct
and union { int a; short s; } with the same argument as the two-union
case.

>  We also do not compare alignments. This is probably not important)

Correct - alignment doesn't enter TBAA.

>   2  For two pointer types to be compatible, both shall be identically
>      qualified and both shall be pointers to compatible types.
> 
> (here we skip the qualification for a reason by punting to TREE_CODE 
> (TREE_TYPE (t)),
>  but we get wrong some cases where compatible types have different codes, see
>  bellow)

Yes, using TREE_CODE is likely not correct.

>   6  For two array types to be compatible, both shall have compatible element
>      types, and if both size specifiers are present, and are integer constant
>      expressions, then both size specifiers shall have the same constant 
> value.
>      If the two array types are used in a context which requires them to be
>      compatible, it is undefined behavior if the two size specifiers evaluate
>      to unequal values.
> 
> (ok, here I think variable sized arrays are going to be fun, because such 
> array
>  of unknown size is a complete type and may be compatible with any array.
>  Because we glob away arrays for alias.c purposes, this is less of a problem,
>  but it propagates up to structure.  Strictly speaking struct a {int a[5];}
>  is compatible with struct a {int a[b];} from other unit.

Yeah, fun.

>  Taking this to consequences, by transitivity, we may need to drop array sizes
>  from consideration completely at least when some unit constains a variably 
> sized array
>  of compatible type that we do not really know in advance. Also when matching 
> types
>  of fields of structures and we may need to stop comparing field offsets after
>  tripping over a first field that reucursively contains an array.

Indeed, if we first see struct a { int a[3]; } and struct b { int a[2]; }
and then struct c { int a[b]; } we'd need to glob all structs to a single
TYPE_CANONICAL at that point (which of course doesn't work).

>  So in particular pointer to variable sized array is compatible with pointer
>  to constantly sized array that we get wrong because of the extra loop 
> assigning
>  TYPE_CANONICAL to read in function bodies.  I believe canonical hash need
>  to be persistent for this reason and we need to keep canonical type 
> computation
>  incremental)

Keeping the hash is fine with me (and using it in the function-code).  I
suppose we should incrementally fix that (and the union case above) before
adding any new fancy stuff.

>   15 For two function types to be compatible, both shall specify compatible
>      return types.  Moreover, the parameter type lists, if both are present,
>      shall agree in the number of parameters and in use of the ellipsis
>      terminator; corresponding parameters shall have compatible types.  If one
>      type has a parameter type list and the other type is specified by a
>      function declarator that is not part of a function definition and that
>      contains an empty identifier list, the parameter list shall not have an
>      ellipsis terminator and the type of each parameter shall be compatible
>      with the type that results from the application of the
>      default argument promotions.  If one type has a parameter type list and 
> the
>      other type is specified by a function definition that contains a 
> (possibly
>      empty) identifier list, both shall agree in the number of parameters, and
>      the type of each prototype parameter shall be compatible with the type
>      that results from the application of the default argument promotions to
>      the type of the corresponding identifier.   (In the determination of type
>      compatibility and of a composite type, each parameter declared with
>      function or array type is taken as having the adjusted type and each
>      parameter declared with qualified type is taken as having the unqualified
>      version of its declared type.)
> 
> (we do not match prototype with ellypsis to a declaration with additional
> parameters.  Because function compatibility matters only for warnings and
> for pointer types, where we probably want to use return value only, it is
> not that much of trouble.
> 
> There are couple examples in standard that are interesting read
> EXAMPLE 5    The following are all compatible function prototype declarators.
> double maximum(int n, int m, double a[n][m]);
> double maximum(int n, int m, double a[*][*]);
> double maximum(int n, int m, double a[ ][*]);
> double maximum(int n, int m, double a[ ][m]);
> as are:
> void f(double (* restrict a)[5]);
> void f(double a[restrict][5]);
> void f(double a[restrict 3][5]);
> void f(double a[restrict static 3][5]);)

Not sure why you get into functions here at all ...

>   2  Each enumerated type shall be compatible with char ,  a  signed integer
>      type, or an unsigned integer type. The choice of type is
>      implementation-defined, but  shall be capable of representing the values
>      of all the members of the enumeration.    The enumerated type is
>      incomplete until immediately after the that terminates the list of
>      enumerator declarations, and complete thereafter.
> 
> (we ignore this completely as far as I know, it is easy to fix though, all
>  we need is to make ENUMERATION_TYPE pretend to be INTEGER_TYPE)

Yes, we don't make a distinction between ENUMERAL_TYPE and INTEGER_TYPE.

>   10 For two qualified types to be compatible, both shall have the identically
>      qualified version of a compatible type; the order of type qualifiers
>      within a list of specifiers or qualifiers does not affect the specified 
> type.
> 
> Now I think in order to get C standard type compatiblity to imply
> gimple_canonical_types_compatible we need to implement all the above globbing
> rules as part of canonical type computation, not only punt at pointers in
> alias.c
> 
> My reading is that for example
> 
> struct a {char *a;};
> 
> is compatible with
> 
> struct a {enum *a;};
> 
> defined in other compilation unit.

Yes, as said above the TREE_CODE trick in the pointer-type handing is
wrong.  We can as well just drop it ...

> So in addition to globbing disucssed I suppose:
> 
>  5) we probably want to consider char * same as void * to support K&R and
>     non-K&R C code mixing.
>  6) we want to consider void * equivalent to void ** because GCC is doing
>     type punning here (still?) and we think people generally mix them.

I think void * is equivalent to _any_ other pointer type.  It's certainly
used in that way in C (to provide abstraction).  As said to the alias.c
globbing, the choice was to either glob all pointer types or make
void * (and void **) have alias-set zero.  Experiments showed that doing
the latter is more harmful for code generation.

>  7) pointers to array types should be same as pointers to elements
>     again because C and C++ does not really have accesses to arrays but
>     other languages may.

Yes.

>  8) i think to be correct by C language standard we need to glob enum 
> with char
>     tough I do not quite see how standard conforming program should use 
>     it given that standard does not say if it is char/unsigned 
>     char/signed char.

I think it depends on the actual enum, no?  So for forward declarations
like

 enum Foo;
 struct X { enum Foo *p; };

you face the same issue as with void *.

> Now I think that if we want to support all of above, we need to make it part
> of TYPE_CANONICAL calculation, not part of alias set computation. Otherwise
> we won't.

Yes, strictly speaking we need to implement all alias.c globbing in
TYPE_CANONICAL calculation as well to get composition rules right.

> C++, C++-C interpoperability
> ============================
> For now I do not see why we can't handle C++ types as C types as the type
> equivalence for C++ seems just monotonously more strict.  C++ types with
> linkage specify also language linkage.  C++ standard consider two types
> distinct if they differ by language linkage:
> 
>    The default language linkage of all function types, function names, and
>    variable names is C++ language linkage. Two function types with different
>    language linkages are distinct types even if they are otherwise identical
>    C and C++ linkages are defined by standard.
> 
> I specify how types are compatible in C++ linkage and rest is left to
> implementaiton.
> 
> I think we are safe to enforce C style rules on C++ type as C++ rules
> are just monotonously strict.
> 
> I think in longer term I should extend ODR wanrings to complain about language
> linkage violations (as specified by Section 7.5) and then we could be able to
> use ODR to disambiguate canonical types on everything that is in C++ linkage
> assuming that existing codebases are resonably free of the violations.
> 
> Fortran-C interoperability, universal pointer type, aggregates
> ==============================================================
> 
> I did my excercise and also read Fortran aliasing and C interoperability 
> rules.
> Fortran only aliasing seems just the same as aliasing within a single
> translation unit and thus rather easy given that fortran has only one hack in
> gfc_get_alias_set.
> 
> It has some interesting points WRT C language.  It has type
> that is compatible with all C pointers (contraidcting C standard that
> say they may have different representations), it also has type that is
> compatible with both signed and unsigned char. It states that
> aggregate compatibility only by types of elements (i.e. array and
> structure of 4 ints is the same or single element array and union)
> on those types.
> 
> Fortran explicitly marks C interoperable types and I think the
> issues can be reproduced by fortran only testcases so I think we
> want make Fortran FE to explicitly drop those odd guys to alias set 0.
> I will look into producing testcases for bugzilla, but that will get us
> around need to fix that all at LTO level.
> 
> Overall plan
> ============
> 
> So I propose the following:
>  1) we add the incomplete types mode to gimple_canonical_types_compatible_p
>     as suggested by this patch
>  2) I will add the individual globbing rules to the functions one by one
>  3) once done, we enable recursion on pointer
>  4) we drop the alias.c globing of pointer_type for in_lto_p.

I don't think we want to do 4), especially not only for in_lto_p.  I'm
not sure we should spend time on improving accuracy before fixing
correctness (on the current precision level).

> Independently on that I will send a patch for more fine grained globbing
> within alias.c for non-LTO compilation so we avoid cases where LTO alias
> sets would be actually stronger than what we do without LTO and also
> because this adds a lot of extra disambiguations to Firefox.
> 
> I hope the machinery will become flexible enough so we can add extra globbing
> rules if we find them necessary during the stage1. I would also porpose adding
> C standard compliant mode to gimple_canonical_types_compatible_p and use it to
> produce declaration mismatch warnings in lto-symtab when both decls come from
> C/C++ the same way as we already do ODR.  That is very useful sanity check 
> that
> will warn us when our implementation is wrong and/or programs are often
> breaking it. We can drop the weird testcases to testsuite to be sure we won't
> regress in future.
> 
> Based on experience, we may switch to C/C++ compliant equivalence when we know
> that whole unit is C/C++ and we do not have to deal with odd cases from other
> languages.

I'm not convinced the patch is a step in the right direction.

Can we please first fix pointer handing for TYPE_CANONICAL by dropping
the TREE_CODE handling and fix union handling by only looking at
union size?

Originally I wanted to make LTO type compatibility really layout-based
(thus struct { int i[2]; } and struct { int i; int j; } match for 
example).

I wonder if for non-cross-language-LTO we shouldn't simply stream
TYPE_CANONICAL, at stream-in record a TYPE_CANONICAL vs. uses vector
map and "fix" up canonical types globally (we'd need to stream
"local" canonical types in the global trees then).

Richard.

> Honza
> 
>       * tree.c (gimple_canonical_types_compatible_p): Turn
>       TRUST_TYPE_CANONICAL into FLAGS parameter; support
>       comparsion with MATCH_WITH_INCOMPLETE_TYPE.
> Index: lto/lto.c
> ===================================================================
> --- lto/lto.c (revision 223632)
> +++ lto/lto.c (working copy)
> @@ -295,7 +295,9 @@
>  static unsigned long num_canonical_type_hash_entries;
>  static unsigned long num_canonical_type_hash_queries;
>  
> -static void iterative_hash_canonical_type (tree type, inchash::hash &hstate);
> +static void iterative_hash_canonical_type (tree type, inchash::hash &hstate,
> +                                        unsigned int flags
> +                                          = MATCH_TYPE_CANONICAL);
>  static hashval_t gimple_canonical_type_hash (const void *p);
>  static void gimple_register_canonical_type_1 (tree t, hashval_t hash);
>  
> @@ -302,26 +304,36 @@
>  /* Returning a hash value for gimple type TYPE.
>  
>     The hash value returned is equal for types considered compatible
> -   by gimple_canonical_types_compatible_p.  */
> +   by gimple_canonical_types_compatible_p.  FLAGS values are interpretted
> +   same way as by this function.  */
>  
>  static hashval_t
> -hash_canonical_type (tree type)
> +hash_canonical_type (tree type, unsigned int flags = MATCH_TYPE_CANONICAL)
>  {
>    inchash::hash hstate;
>  
> -  /* We compute alias sets only for types that needs them.
> -     Be sure we do not recurse to something else as we can not hash 
> incomplete
> -     types in a way they would have same hash value as compatible complete
> -     types.  */
> -  gcc_checking_assert (type_with_alias_set_p (type));
> +  /* Check that we encounter incomplete types only with
> +     MATCH_WITH_INCOMPLETE_TYPE.
>  
> +     Also check that no one tries to use hashing in compbination with 
> +     !MATCH_TYPE_CANONICAL.  In this case gimple_canonical_types_compatible_p
> +     is not transitive and thus does not produce equivalence on all types.  
> */
> +  gcc_checking_assert ((flags & MATCH_TYPE_CANONICAL)
> +                    && ((flags & MATCH_WITH_INCOMPLETE_TYPE)
> +                        || type_with_alias_set_p (type)));
> +
>    /* Combine a few common features of types so that types are grouped into
>       smaller sets; when searching for existing matching types to merge,
>       only existing types having the same features as the new type will be
>       checked.  */
>    hstate.add_int (TREE_CODE (type));
> -  hstate.add_int (TYPE_MODE (type));
>  
> +  /* Incomplete arrays and aggregates do not have TYPE_MODE defined.  */
> +  if (!(flags & MATCH_WITH_INCOMPLETE_TYPE)
> +      || (TREE_CODE (type) != ARRAY_TYPE
> +       && !AGGREGATE_TYPE_P (type)))
> +    hstate.add_int (TYPE_MODE (type));
> +
>    /* Incorporate common features of numerical types.  */
>    if (INTEGRAL_TYPE_P (type)
>        || SCALAR_FLOAT_TYPE_P (type)
> @@ -355,7 +367,8 @@
>      hstate.add_int (TYPE_STRING_FLAG (type));
>  
>    /* For array types hash the domain bounds and the string flag.  */
> -  if (TREE_CODE (type) == ARRAY_TYPE && TYPE_DOMAIN (type))
> +  if (TREE_CODE (type) == ARRAY_TYPE && TYPE_DOMAIN (type)
> +      && !(flags & MATCH_WITH_INCOMPLETE_TYPE))
>      {
>        hstate.add_int (TYPE_STRING_FLAG (type));
>        /* OMP lowering can introduce error_mark_node in place of
> @@ -366,30 +379,35 @@
>       inchash::add_expr (TYPE_MAX_VALUE (TYPE_DOMAIN (type)), hstate);
>      }
>  
> -  /* Recurse for aggregates with a single element type.  */
> +  /* Recurse for aggregates with a single element type.
> +     We are safe to drop MATCH_INCOMPLETE.  There is no way to build
> +     an array of incomplete types.   */
>    if (TREE_CODE (type) == ARRAY_TYPE
>        || TREE_CODE (type) == COMPLEX_TYPE
>        || TREE_CODE (type) == VECTOR_TYPE)
> -    iterative_hash_canonical_type (TREE_TYPE (type), hstate);
> +    iterative_hash_canonical_type (TREE_TYPE (type), hstate,
> +                                flags & ~MATCH_WITH_INCOMPLETE_TYPE);
>  
>    /* Incorporate function return and argument types.  */
>    if (TREE_CODE (type) == FUNCTION_TYPE || TREE_CODE (type) == METHOD_TYPE)
>      {
> -      unsigned na;
> -      tree p;
> +      iterative_hash_canonical_type (TREE_TYPE (type), hstate, flags);
>  
> -      iterative_hash_canonical_type (TREE_TYPE (type), hstate);
> -
> -      for (p = TYPE_ARG_TYPES (type), na = 0; p; p = TREE_CHAIN (p))
> +      if (!(flags & MATCH_WITH_INCOMPLETE_TYPE)
> +       || TREE_CODE (type) == METHOD_TYPE)
>       {
> -       iterative_hash_canonical_type (TREE_VALUE (p), hstate);
> -       na++;
> +       unsigned na;
> +       tree p;
> +       for (p = TYPE_ARG_TYPES (type), na = 0; p; p = TREE_CHAIN (p))
> +         {
> +           iterative_hash_canonical_type (TREE_VALUE (p), hstate, flags);
> +           na++;
> +         }
> +       hstate.add_int (na);
>       }
> -
> -      hstate.add_int (na);
>      }
>  
> -  if (RECORD_OR_UNION_TYPE_P (type))
> +  if (RECORD_OR_UNION_TYPE_P (type) && !(flags & MATCH_WITH_INCOMPLETE_TYPE))
>      {
>        unsigned nf;
>        tree f;
> @@ -397,7 +415,7 @@
>        for (f = TYPE_FIELDS (type), nf = 0; f; f = TREE_CHAIN (f))
>       if (TREE_CODE (f) == FIELD_DECL)
>         {
> -         iterative_hash_canonical_type (TREE_TYPE (f), hstate);
> +         iterative_hash_canonical_type (TREE_TYPE (f), hstate, flags);
>           nf++;
>         }
>  
> @@ -407,14 +425,22 @@
>    return hstate.end();
>  }
>  
> -/* Returning a hash value for gimple type TYPE combined with VAL.  */
> +/* Returning a hash value for gimple type TYPE combined with VAL.
> +   FLAGS are the same as for gimple_conanical_types_compatible_p.  */
>  
>  static void
> -iterative_hash_canonical_type (tree type, inchash::hash &hstate)
> +iterative_hash_canonical_type (tree type, inchash::hash &hstate,
> +                            unsigned int flags)
>  {
>    hashval_t v;
> +
> +  /* TYPE_CANONICAL reflects equivalence classes with
> +     !MATCH_WITH_INCOMPLETE_TYPE.   If we are matching incomplete types,
> +     then we need to recurse.  */
> +  if (flags & MATCH_WITH_INCOMPLETE_TYPE)
> +    v = hash_canonical_type (type, flags);
>    /* An already processed type.  */
> -  if (TYPE_CANONICAL (type))
> +  else if (TYPE_CANONICAL (type))
>      {
>        type = TYPE_CANONICAL (type);
>        v = gimple_canonical_type_hash (type);
> @@ -497,6 +523,14 @@
>  {
>    if (TYPE_CANONICAL (t) || !type_with_alias_set_p (t))
>      return;
> +  /* Only types that can be handled in memory need canonical types.
> +     Function and methods are never accessed. Also we do not need canonical
> +     types for incomplete types with exception of arrays - structures may end
> +     with incomplete arrays that may be referenced.  */
> +  if (TREE_CODE (t) == FUNCTION_TYPE || TREE_CODE (t) == METHOD_TYPE
> +      || (!COMPLETE_TYPE_P (t)
> +       && (TREE_CODE (t) != ARRAY_TYPE || !COMPLETE_TYPE_P (TREE_TYPE (t)))))
> +    return;
>  
>    gimple_register_canonical_type_1 (t, hash_canonical_type (t));
>  }
> Index: tree.c
> ===================================================================
> --- tree.c    (revision 223633)
> +++ tree.c    (working copy)
> @@ -12702,12 +12702,24 @@
>  /* Return true iff T1 and T2 are structurally identical for what
>     TBAA is concerned.  
>     This function is used both by lto.c canonical type merging and by the
> -   verifier.  If TRUST_TYPE_CANONICAL we do not look into structure of types
> -   that have TYPE_CANONICAL defined and assume them equivalent.  */
> +   verifier.  
>  
> +   If flags sets MATCH_TYPE_CANONICAL we assume that TYPE_CANONICAL is set 
> in a
> +   way that two types have the same canonical type if and only if
> +   gimple_canonical_types_compatible_p (t1,t2, 0) is true.  This is used
> +   to cut down recursion during LTO canonical type comptuation.  When this 
> flag
> +   is set we also sanity check that we are going to produce equivalence 
> relation
> +   that is needed to drive the hashtable in lto.c.
> +
> +   if MATCH_WITH_INCOMPLETE_TYPE is true, then we do not use any information
> +   from complete types and thus i.e. all RECORD_TYPE are equivlaent to other
> +   RECORD_TYPEs.  This is the only equivalence possible if one require
> +   incomplete type to be in the same equivalence class with all its
> +   completetions.  */
> +
>  bool
>  gimple_canonical_types_compatible_p (const_tree t1, const_tree t2,
> -                                  bool trust_type_canonical)
> +                                  unsigned int flags)
>  {
>    /* Before starting to set up the SCC machinery handle simple cases.  */
>  
> @@ -12719,28 +12731,28 @@
>    if (t1 == NULL_TREE || t2 == NULL_TREE)
>      return false;
>  
> -  /* We consider complete types always compatible with incomplete type.
> -     This does not make sense for canonical type calculation and thus we
> -     need to ensure that we are never called on it.
> +  /* Check that either flags allow incomplete types or both types are 
> complete.
> +     This is necessary to ensure transitivity for canonical type merging.
>  
> -     FIXME: For more correctness the function probably should have three 
> modes
> -     1) mode assuming that types are complete mathcing their structure
> -     2) mode allowing incomplete types but producing equivalence classes
> -        and thus ignoring all info from complete types
> -     3) mode allowing incomplete types to match complete but checking
> -        compatibility between complete types.
> +     FIXME: with !MATCH_TYPE_CANONICAL we probably should allow match between
> +     incomplete type and complete type as defined by language standards.  No
> +     one however rely on it so far.  */
>  
> -     1 and 2 can be used for canonical type calculation. 3 is the real
> -     definition of type compatibility that can be used i.e. for warnings 
> during
> -     declaration merging.  */
> -
> -  gcc_assert (!trust_type_canonical
> +  gcc_assert ((flags & MATCH_WITH_INCOMPLETE_TYPE)
> +           || !(flags & MATCH_TYPE_CANONICAL)
>             || (type_with_alias_set_p (t1) && type_with_alias_set_p (t2)));
>    /* If the types have been previously registered and found equal
>       they still are.  */
>    if (TYPE_CANONICAL (t1) && TYPE_CANONICAL (t2)
> -      && trust_type_canonical)
> -    return TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2);
> +      && (flags & MATCH_TYPE_CANONICAL))
> +    {
> +      if (TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2))
> +     return true;
> +      /* TYPE_CANONICAL is always computed with an assumption that the type
> +      is complete.  */
> +      if (!(flags & MATCH_WITH_INCOMPLETE_TYPE))
> +     return false;
> +    }
>  
>    /* Can't be the same type if the types don't have the same code.  */
>    if (TREE_CODE (t1) != TREE_CODE (t2))
> @@ -12753,8 +12765,12 @@
>        || TREE_CODE (t1) == NULLPTR_TYPE)
>      return true;
>  
> -  /* Can't be the same type if they have different mode.  */
> -  if (TYPE_MODE (t1) != TYPE_MODE (t2))
> +  /* Can't be the same type if they have different mode.
> +     Incomplete arrays and aggregates do not have TYPE_MODE defined.  */
> +  if ((!(flags & MATCH_WITH_INCOMPLETE_TYPE)
> +       || (TREE_CODE (t1) != ARRAY_TYPE
> +           && !AGGREGATE_TYPE_P (t1)))
> +      && TYPE_MODE (t1) != TYPE_MODE (t2))
>      return false;
>  
>    /* Non-aggregate types can be handled cheaply.  */
> @@ -12783,8 +12799,7 @@
>       {
>         if (TYPE_ADDR_SPACE (TREE_TYPE (t1))
>             != TYPE_ADDR_SPACE (TREE_TYPE (t2)))
> -         return false;
> -
> +         return false;
>         if (TREE_CODE (TREE_TYPE (t1)) != TREE_CODE (TREE_TYPE (t2)))
>           return false;
>       }
> @@ -12792,9 +12807,9 @@
>        /* Tail-recurse to components.  */
>        if (TREE_CODE (t1) == VECTOR_TYPE
>         || TREE_CODE (t1) == COMPLEX_TYPE)
> -     return gimple_canonical_types_compatible_p (TREE_TYPE (t1),
> -                                                 TREE_TYPE (t2),
> -                                                 trust_type_canonical);
> +     return gimple_canonical_types_compatible_p
> +              (TREE_TYPE (t1), TREE_TYPE (t2),
> +               flags & ~MATCH_WITH_INCOMPLETE_TYPE);
>  
>        return true;
>      }
> @@ -12804,12 +12819,20 @@
>      {
>      case ARRAY_TYPE:
>        /* Array types are the same if the element types are the same and
> -      the number of elements are the same.  */
> -      if (!gimple_canonical_types_compatible_p (TREE_TYPE (t1), TREE_TYPE 
> (t2),
> -                                             trust_type_canonical)
> +      the number of elements are the same.
> +
> +      When MATCH_WITH_INCOMPLETE_TYPE is set, bypass the check
> +      on number of elements.
> +      When recursing, clear MATCH_WITH_INCOMPLETE_TYPE because there is
> +      no way to make incomplete array of array.  */
> +      if (!gimple_canonical_types_compatible_p
> +          (TREE_TYPE (t1), TREE_TYPE (t2),
> +           flags & ~MATCH_WITH_INCOMPLETE_TYPE)
>         || TYPE_STRING_FLAG (t1) != TYPE_STRING_FLAG (t2)
>         || TYPE_NONALIASED_COMPONENT (t1) != TYPE_NONALIASED_COMPONENT (t2))
>       return false;
> +      else if (flags & MATCH_WITH_INCOMPLETE_TYPE)
> +     return true;
>        else
>       {
>         tree i1 = TYPE_DOMAIN (t1);
> @@ -12848,11 +12871,19 @@
>      case METHOD_TYPE:
>      case FUNCTION_TYPE:
>        /* Function types are the same if the return type and arguments types
> -      are the same.  */
> +      are the same.
> +      It is possible that function pointers have return values and parameters
> +      of incomplete types; permit that by not clearing
> +      MATCH_WITH_INCOMPLETE_TYPE  */
>        if (!gimple_canonical_types_compatible_p (TREE_TYPE (t1), TREE_TYPE 
> (t2),
> -                                             trust_type_canonical))
> +                                             flags))
>       return false;
>  
> +      /* We must permit a match between !prototype_p and prototype_p for
> +      functions; methods are never !prototype_p.  */
> +      if ((flags & MATCH_WITH_INCOMPLETE_TYPE)
> +       && TREE_CODE (t1) == FUNCTION_TYPE)
> +     return true;
>        if (TYPE_ARG_TYPES (t1) == TYPE_ARG_TYPES (t2))
>       return true;
>        else
> @@ -12864,8 +12895,7 @@
>              parms1 = TREE_CHAIN (parms1), parms2 = TREE_CHAIN (parms2))
>           {
>             if (!gimple_canonical_types_compatible_p
> -                  (TREE_VALUE (parms1), TREE_VALUE (parms2),
> -                   trust_type_canonical))
> +                  (TREE_VALUE (parms1), TREE_VALUE (parms2), flags))
>               return false;
>           }
>  
> @@ -12881,6 +12911,15 @@
>        {
>       tree f1, f2;
>  
> +     /* C standrad require incomplete structures and unions to be
> +        considered compatible with complete ones regardless their TYPE_NAME
> +        when they come from different translation units.
> +        We must consider transitive closure here, so 
> +        every structure/union is equivalent to each other.  */
> +        
> +     if (flags & MATCH_WITH_INCOMPLETE_TYPE)
> +       return true;
> +
>       /* For aggregate types, all the fields must be the same.  */
>       for (f1 = TYPE_FIELDS (t1), f2 = TYPE_FIELDS (t2);
>            f1 || f2;
> @@ -12897,8 +12936,7 @@
>           if (DECL_NONADDRESSABLE_P (f1) != DECL_NONADDRESSABLE_P (f2)
>               || !gimple_compare_field_offset (f1, f2)
>               || !gimple_canonical_types_compatible_p
> -                   (TREE_TYPE (f1), TREE_TYPE (f2),
> -                    trust_type_canonical))
> +                   (TREE_TYPE (f1), TREE_TYPE (f2), flags))
>             return false;
>         }
>  
> @@ -12961,7 +12999,7 @@
>             with variably sized arrays because their sizes possibly
>             gimplified to different variables.  */
>          && !variably_modified_type_p (ct, NULL)
> -        && !gimple_canonical_types_compatible_p (t, ct, false))
> +        && !gimple_canonical_types_compatible_p (t, ct, 0))
>      {
>        error ("TYPE_CANONICAL is not compatible");
>        debug_tree (ct);
> Index: tree.h
> ===================================================================
> --- tree.h    (revision 223632)
> +++ tree.h    (working copy)
> @@ -4569,9 +4569,21 @@
>  extern unsigned int tree_map_base_hash (const void *);
>  extern int tree_map_base_marked_p (const void *);
>  extern void DEBUG_FUNCTION verify_type (const_tree t);
> -extern bool gimple_canonical_types_compatible_p (const_tree, const_tree,
> -                                              bool trust_type_canonical = 
> true);
>  
> +/* Flags used by gimple_canonical_types_compatible_p.  */
> +enum gimple_canonical_types_compatible_flags
> +  {
> +    /* Asume that TYPE_CANONICAL is set in a way that two types have
> +       the same canonical type if and only if
> +       gimple_canonical_types_compatible_p (t1,t2, 0) is true.  */
> +    MATCH_TYPE_CANONICAL = 1,
> +    /* Match all types as if they were incomplete.  */
> +    MATCH_WITH_INCOMPLETE_TYPE = 2
> +  };
> +extern bool gimple_canonical_types_compatible_p
> +    (const_tree, const_tree,
> +     unsigned int flags = MATCH_TYPE_CANONICAL);
> +
>  #define tree_map_eq tree_map_base_eq
>  extern unsigned int tree_map_hash (const void *);
>  #define tree_map_marked_p tree_map_base_marked_p
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nuernberg)

Re: Teach gimple_canonical_types_compatible_p about incomplete types

Reply via email to