With a few recent notes by others, I have identified that the
comparison logic I used in interface.c (compare_components,
gfc_compare_derived_types) was faulty in a few ways. I apologize in
advance for the length of this message, but I believe it will help
others (and myself) understand (recall) more thoroughly my internal
implementation of structures, unions, and maps which may have not be
obvious just by staring at hundreds of lines of code. Feel free to
skip to the end where you'll find the small patch which reorganizes
this section of code for clarity and correctness.

<tmi>

The goal of the aforementioned sections of code is to allow
gfc_compare_derived_types (and gfc_compare_union_types) to determine
when two type definitions are equal. For STRUCTUREs, as with derived
types with the SEQUENCE attribute, the type definitions can be
considered the same if their declarations are identical. With derived
types, this is easy enough. For STRUCTUREs however, we have several
constructs which induce the creation of both structure type
definitions and components with no user-defined names. I refer to
these as anonymous structures and anonymous components.

One of the checks originally performed by gfc_compare_derived_types is
to compare the names of both type definitions and components. Clearly,
this is insufficient for anonymous elements. Consider the following
definition of a structure containing a union and an anonymous
("ad-hoc") structure definition:

STRUCTURE /s/
  integer x
  UNION
    MAP
      integer y
    END MAP
  END UNION
  STRUCTURE sub
    real r
  END STUCTURE
END STRUCTURE

Internally, this has to create four separate type definitions. One for
the containing structure ("s"), one for the union, one for the map
within the union, and one for the anonymous sub-structure (in the
sub-structure, the type itself has no name - "sub" is the name of the
component which has the anonymous type). The outer structure, the map,
and the anonymous inner structure all get FL_STRUCT gfc_symbols and
the union gets an FL_UNION gfc_symbol. Furthermore several
gfc_components are created: s contains one for x, one for the union,
and one for sub; the type definition of sub contains one for r; the
union contains one for the map; and the map contains one for y. Since
there is no way for the user to name some of these elements, some
components and thier type definitions are "anonymous", and must be
given unambiguous internal names.

In the case of anonymous structure components, I have chosen to store
the second letter of the type definition symbol as uppercase. As with
regular derived type definitions, the first letter is also uppercase.
In the case of the anonymous union and map components, the component
names themselves start with a lowercase letter and the second letter
is uppercase. These names are unambiguous because Fortran is
case-insensitive and all names from user source are stored as
lowercase.

As a result, even when the above structure definition appears in
multiple places identically in source code, the compiler will generate
unique internal names for each union, map, and anonymous structure
definition because it cannot know immediately upon parsing whether
they match. In the case of derived types, components with different
names are clearly distinct. But in the case of structures, to properly
consider unions and maps we must go deeper and compare all the
sub-components. The same structure above declared in multiple places
should be considered equal, as with derived types with the SEQUENCE
attribute. This is the reason we need to bypass the name check in
gfc_compare_derived_types and compare_components.

One issue fixed by this patch is PR 77327, where I greedily started
checking properties of the components' derived type symbol, even
before I knew it was a derived type. In the following code you can see
the unprotected access to cmp1->ts.u.derived before confirming
cmp1->ts.type:

> compare_components (gfc_component *cmp1, gfc_component *cmp2,
(snip)
>   d1 = cmp1->ts.u.derived;
>   d2 = cmp2->ts.u.derived;
>   if (   (d1 && (d1->attr.flavor == FL_STRUCT || d1->attr.flavor == FL_UNION)
(snip)

The other issue fixed is in the anonymous name checking logic at the
top of gfc_compare_derived_types. The bug is exhibited by the attached
testcase wherein extra errors/warnings are erroneously generated for
identical structure definitions. As I described above, the first and
second characters of anonymous structure types are uppercase, and only
if the symbol is FL_UNION or FL_STRUCT. Unfortunately, for whatever
reason, the current code checks the third character via
derived2->name[2]:

> gfc_compare_derived_types (gfc_symbol *derived1, gfc_symbol *derived2)
(snip)
>   anonymous = (derived1->name[0] == derived2->name[0]
>       && derived1->name[1] && derived2->name[1] && derived2->name[2]
>       && derived1->name[1] == (char) TOUPPER (derived1->name[0])
>       && derived2->name[2] == (char) TOUPPER (derived2->name[0]));
(snip)

With the patch, duplicate structure definitions containing anonymous
elements will appropriately be resolved to use the same backend tree
declarations. You can see the clarified conditions for n name
comparison in the new functions in interface.c is_anonymous_component
and is_anonymous_dt. This also helps abstract the condition, if one
were to change the implementation of anonymous names in the future.

</tmi>

tl;dr: The attached patch fixes PR 77327 and another bug which is
demonstrated in the included testcase. The replacement code is also
easier to understand and more maintainable than the original.
Bootstraps and regtests on x86_64-redhat-linux. Since it is a change
to my previously submitted code I'll aim to commit to trunk over the
weekend if there are no concerns by then.

As a side note, I actually don't know how to exhibit PR 77327 in a
DejaGNU testcase, since it is a normally-silent heap-use-after-free
which was only found with a memory checker. I would appreciate any
advice on this matter so I can include a testcase for it with this
commit.

---
Fritz Reese
diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index 17500c9..ffc36f5 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -387,26 +387,46 @@ gfc_match_end_interface (void)
 }
 
 
+/* Return whether the component was defined anonymously.  */
+
+static bool
+is_anonymous_component (gfc_component *cmp)
+{
+  /* Only UNION and MAP components are anonymous.  In the case of a MAP,
+     the derived type symbol is FL_STRUCT and the component name looks like mM*.
+     This is the only case in which the second character of a component name is
+     uppercase.  */
+  return cmp->ts.type == BT_UNION
+    || (cmp->ts.type == BT_DERIVED
+        && cmp->ts.u.derived->attr.flavor == FL_STRUCT
+        && cmp->name[0] && cmp->name[1] && ISUPPER (cmp->name[1]));
+}
+
+
+/* Return whether the derived type was defined anonymously.  */
+
+static bool
+is_anonymous_dt (gfc_symbol *derived)
+{
+  /* UNION and MAP types are always anonymous. Otherwise, only nested STRUCTURE
+     types can be anonymous.  For anonymous MAP/STRUCTURE, we have FL_STRUCT
+     and the type name looks like XX*.  This is the only case in which the
+     second character of a type name is uppercase.  */
+  return derived->attr.flavor == FL_UNION
+    || (derived->attr.flavor == FL_STRUCT
+        && derived->name[0] && derived->name[1] && ISUPPER (derived->name[1]));
+}
+
+
 /* Compare components according to 4.4.2 of the Fortran standard.  */
 
 static int
 compare_components (gfc_component *cmp1, gfc_component *cmp2,
     gfc_symbol *derived1, gfc_symbol *derived2)
 {
-  gfc_symbol *d1, *d2;
-  bool anonymous = false;
-
-  /* Unions, maps, and anonymous structures all have names like "[xX]X$\d+"
-     which should not be compared.  */
-  d1 = cmp1->ts.u.derived;
-  d2 = cmp2->ts.u.derived;
-  if (   (d1 && (d1->attr.flavor == FL_STRUCT || d1->attr.flavor == FL_UNION)
-          && ISUPPER (cmp1->name[1]))
-      || (d2 && (d2->attr.flavor == FL_STRUCT || d2->attr.flavor == FL_UNION)
-          && ISUPPER (cmp2->name[1])))
-    anonymous = true;
-
-  if (!anonymous && strcmp (cmp1->name, cmp2->name) != 0)
+  /* Compare names, but not for anonymous components such as UNION or MAP.  */
+  if (!is_anonymous_component (cmp1) && !is_anonymous_component (cmp1)
+      && strcmp (cmp1->name, cmp2->name) != 0)
     return 0;
 
   if (cmp1->attr.access != cmp2->attr.access)
@@ -512,22 +532,12 @@ int
 gfc_compare_derived_types (gfc_symbol *derived1, gfc_symbol *derived2)
 {
   gfc_component *cmp1, *cmp2;
-  bool anonymous = false;
 
   if (derived1 == derived2)
     return 1;
 
   gcc_assert (derived1 && derived2);
 
-  /* MAP and anonymous STRUCTURE types have internal names of the form
-     mM* and sS* (we can get away this this because source names are converted
-     to lowerase). Compare anonymous type names specially because each
-     gets a unique name when it is declared. */
-  anonymous = (derived1->name[0] == derived2->name[0]
-      && derived1->name[1] && derived2->name[1] && derived2->name[2]
-      && derived1->name[1] == (char) TOUPPER (derived1->name[0])
-      && derived2->name[2] == (char) TOUPPER (derived2->name[0]));
-
   /* Special case for comparing derived types across namespaces.  If the
      true names and module names are the same and the module name is
      nonnull, then they are equal.  */
@@ -541,7 +551,9 @@ gfc_compare_derived_types (gfc_symbol *derived1, gfc_symbol *derived2)
      because they can be anonymous; therefore two structures with different
      names may be equal.  */
 
-  if (strcmp (derived1->name, derived2->name) != 0 && !anonymous)
+  /* Compare names, but not for anonymous types such as UNION or MAP.  */
+  if (!is_anonymous_dt (derived1) && !is_anonymous_dt (derived2)
+      && strcmp (derived1->name, derived2->name) != 0)
     return 0;
 
   if (derived1->component_access == ACCESS_PRIVATE
diff --git a/gcc/testsuite/gfortran.dg/dec_structure_13.f90 b/gcc/testsuite/gfortran.dg/dec_structure_13.f90
new file mode 100644
index 0000000..209c7da
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/dec_structure_13.f90
@@ -0,0 +1,79 @@
+! { dg-do compile }
+! { dg-options "-fdec-structure" }
+!
+! Verify that the comparisons in gfc_compare_derived_types can correctly
+! match nested anonymous subtypes.
+!
+
+subroutine sub0 (u)
+  structure /t/
+    structure sub
+      integer i
+    end structure
+  endstructure
+  record /t/ u
+  u.sub.i = 0
+end subroutine sub0
+
+subroutine sub1 ()
+  structure /t/
+    structure sub
+      integer i
+    end structure
+  endstructure
+  record /t/ u
+
+  interface
+    subroutine sub0 (u) ! regression: Interface mismatch.*Type mismatch
+      structure /t/
+        structure sub
+          integer i
+        end structure
+      endstructure
+        record /t/ u
+    end subroutine
+  end interface
+
+  call sub0(u) ! regression: Type mismatch in argument
+end subroutine
+
+subroutine sub3(u)
+  structure /tu/
+    union
+      map
+        integer i
+      end map
+      map
+        real r
+      end map
+    end union
+  end structure
+  record /tu/ u
+  u.r = 1.0
+end subroutine
+
+structure /t/
+  structure sub
+    integer i
+  end structure
+endstructure
+
+structure /tu/
+  union
+    map
+      integer i
+    end map
+    map
+      real r
+    end map
+  end union
+end structure
+
+record /t/ u
+record /tu/ u2
+
+call sub0(u) ! regression: Type mismatch in argument
+call sub1()
+call sub2(u2)
+
+end

Reply via email to