Re: [Patch, fortran] PR113363 - ICE on ASSOCIATE and unlimited polymorphic function

2024-05-12 Thread Harald Anlauf

Hi Paul,

this looks all good now, and is OK for mainline as well as backporting!

***

While playing with the testcase, I found 3 remaining smaller issues that
are pre-existing, so they should not delay your present work.  To make
it clear: these are not regressions.

When "maliciously" perturbing the testcase by adding parentheses in the
right places, I see the following:

Replacing

  associate (var => foo ()) ! OK after r14-9489-g3fd46d859cda10

by

  associate (var => (foo ()))

gives an ICE here with 14-branch and 15-mainline.

Similarly replacing

  allocate (y, source = x(1))   ! Gave zero length here

by

  allocate (y, source = (x(1)))

Furthermore, replacing

  allocate(x, source = foo ())
by

  allocate(x, source = (foo ()))

gives a runtime segfault with both 14-branch and 15-mainline.
So this is something for another day...

Thanks for the patch!

Harald


Am 12.05.24 um 13:27 schrieb Paul Richard Thomas:

Hi Harald,

Please find attached my resubmission for pr113363. The changes are as
follows:
(i) The chunk in gfc_conv_procedure_call is new. This was the source of one
of the memory leaks;
(ii) The incorporation of the _len field in trans_class_assignment was done
for the pr84006 patch;
(iii) The source of all the invalid memory accesses and so on was down to
the use of realloc. I tried all sorts of workarounds such as testing the
vptrs and the sizes but only free followed by malloc worked. I have no idea
at all why this is the case; and
(iv) I took account of your remarks about the chunk in trans-array.cc by
removing it and that the chunk in trans-stmt.cc would leak frontend memory.

OK for mainline (and -14 branch after a few-weeks)?

Regards

Paul

Fortran: Fix wrong code in unlimited polymorphic assignment [PR113363]

2024-05-12  Paul Thomas  

gcc/fortran
PR fortran/113363
* trans-array.cc (gfc_array_init_size): Use the expr3 dtype so
that the correct element size is used.
* trans-expr.cc (gfc_conv_procedure_call): Remove restriction
that ss and ss->loop be present for the finalization of class
array function results.
(trans_class_assignment): Use free and malloc, rather than
realloc, for character expressions assigned to unlimited poly
entities.
* trans-stmt.cc (gfc_trans_allocate): Build a correct rhs for
the assignment of an unlimited polymorphic 'source'.

gcc/testsuite/
PR fortran/113363
* gfortran.dg/pr113363.f90: New test.



The first chunk in trans-array.cc ensures that the array dtype is set to
the source dtype. The second chunk ensures that the lhs _len field does

not

default to zero and so is specific to dynamic types of character.



Why the two gfc_copy_ref?  valgrind pointed my to the tail
of gfc_copy_ref which already has:

dest->next = gfc_copy_ref (src->next);

so this looks redundant and leaks frontend memory?

***

Playing with the testcase, I find several invalid writes with
valgrind, or a heap buffer overflow with -fsanitize=address .









[PATCH, committed] Fortran: fix frontend memleak

2024-05-12 Thread Harald Anlauf
Dear all,

the attached obvious patch fixes a frontend memleak that was introduced
recently, and which shows up when checking for inquiry references.
I came across it when working on pr115039.

Committed after regtesting as r15-391-g13b6ac4ebd04f0.

Thanks,
Harald

From 13b6ac4ebd04f0703d92828c9268b0b216890b0d Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Sun, 12 May 2024 21:48:03 +0200
Subject: [PATCH] Fortran: fix frontend memleak

gcc/fortran/ChangeLog:

	* primary.cc (gfc_match_varspec): Replace 'ref' argument to
	is_inquiry_ref() by NULL when the result is not needed to avoid
	a memleak.
---
 gcc/fortran/primary.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/fortran/primary.cc b/gcc/fortran/primary.cc
index 606e84432be..8e7833769a8 100644
--- a/gcc/fortran/primary.cc
+++ b/gcc/fortran/primary.cc
@@ -2250,7 +2250,7 @@ gfc_match_varspec (gfc_expr *primary, int equiv_flag, bool sub_flag,
 	 can be found.  If this was an inquiry reference with the same name
 	 as a derived component and the associate-name type is not derived
 	 or class, this is fixed up in 'gfc_fixup_inferred_type_refs'.  */
-  if (mm == MATCH_YES && is_inquiry_ref (name, )
+  if (mm == MATCH_YES && is_inquiry_ref (name, NULL)
 	  && !(sym->ts.type == BT_UNKNOWN
 		&& gfc_find_derived_types (sym, gfc_current_ns, name)))
 	inquiry = true;
--
2.35.3



Re: [PATCH] fortran: Assume there is no cyclic reference with submodule symbols [PR99798]

2024-05-12 Thread Paul Richard Thomas
Hi Mikael,

That is an ingenious solution. Given the complexity, I think that the
comments are well warranted.

OK for master and, I would suggest, 14-branch after a few weeks.

Thanks!

Paul

On Sun, 12 May 2024 at 14:16, Mikael Morin  wrote:

> Hello,
>
> Here is my final patch to fix the ICE of PR99798.
> It's maybe overly verbose with comments, but the memory management is
> hopefully clarified.
> I tested this with a full fortran regression test on x86_64-linux and a
> manual check with valgrind on the testcase.
> OK for master?
>
> -- 8< --
>
> This prevents a premature release of memory with procedure symbols from
> submodules, causing random compiler crashes.
>
> The problem is a fragile detection of cyclic references, which can match
> with procedures host-associated from a module in submodules, in cases
> where it
> shouldn't.  The formal namespace is released, and with it the dummy
> arguments
> symbols of the procedure.  But there is no cyclic reference, so the
> procedure
> symbol itself is not released and remains, with pointers to its dummy
> arguments
> now dangling.
>
> The fix adds a condition to avoid the case, and refactors to a new
> predicate
> by the way.  Part of the original condition is also removed, for lack of a
> reason to keep it.
>
> PR fortran/99798
>
> gcc/fortran/ChangeLog:
>
> * symbol.cc (gfc_release_symbol): Move the condition guarding
> the handling cyclic references...
> (cyclic_reference_break_needed): ... here as a new predicate.
> Remove superfluous parts.  Add a condition preventing any premature
> release with submodule symbols.
>
> gcc/testsuite/ChangeLog:
>
> * gfortran.dg/submodule_33.f08: New test.
> ---
>  gcc/fortran/symbol.cc  | 54 +-
>  gcc/testsuite/gfortran.dg/submodule_33.f08 | 20 
>  2 files changed, 72 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gfortran.dg/submodule_33.f08
>
> diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
> index 8f7deac1d1e..0a1646def67 100644
> --- a/gcc/fortran/symbol.cc
> +++ b/gcc/fortran/symbol.cc
> @@ -3179,6 +3179,57 @@ gfc_free_symbol (gfc_symbol *)
>  }
>
>
> +/* Returns true if the symbol SYM has, through its FORMAL_NS field, a
> reference
> +   to itself which should be eliminated for the symbol memory to be
> released
> +   via normal reference counting.
> +
> +   The implementation is crucial as it controls the proper release of
> symbols,
> +   especially (contained) procedure symbols, which can represent a lot of
> memory
> +   through the namespace of their body.
> +
> +   We try to avoid freeing too much memory (causing dangling pointers),
> to not
> +   leak too much (wasting memory), and to avoid expensive walks of the
> symbol
> +   tree (which would be the correct way to check for a cycle).  */
> +
> +bool
> +cyclic_reference_break_needed (gfc_symbol *sym)
> +{
> +  /* Normal symbols don't reference themselves.  */
> +  if (sym->formal_ns == nullptr)
> +return false;
> +
> +  /* Procedures at the root of the file do have a self reference, but
> they don't
> + have a reference in a parent namespace preventing the release of the
> + procedure namespace, so they can use the normal reference counting.
> */
> +  if (sym->formal_ns == sym->ns)
> +return false;
> +
> +  /* If sym->refs == 1, we can use normal reference counting.  If
> sym->refs > 2,
> + the symbol won't be freed anyway, with or without cyclic reference.
> */
> +  if (sym->refs != 2)
> +return false;
> +
> +  /* Procedure symbols host-associated from a module in submodules are
> special,
> + because the namespace of the procedure block in the submodule is
> different
> + from the FORMAL_NS namespace generated by host-association.  So
> there are
> + two different namespaces representing the same procedure namespace.
> As
> + FORMAL_NS comes from host-association, which only imports symbols
> visible
> + from the outside (dummy arguments basically), we can assume there is
> no
> + self reference through FORMAL_NS in that case.  */
> +  if (sym->attr.host_assoc && sym->attr.used_in_submodule)
> +return false;
> +
> +  /* We can assume that contained procedures have cyclic references,
> because
> + the symbol of the procedure itself is accessible in the procedure
> body
> + namespace.  So we assume that symbols with a formal namespace
> different
> + from the declaration namespace and two references, one of which is
> about
> + to be removed, are procedures with just the self reference left.  At
> this
> + point, the symbol SYM matches that pattern, so we return true here to
> + permit the release of SYM.  */
> +  return true;
> +}
> +
> +
>  /* Decrease the reference counter and free memory when we reach zero.
> Returns true if the symbol has been freed, false otherwise.  */
>
> @@ -3188,8 +3239,7 @@ gfc_release_symbol (gfc_symbol *)
>

[PATCH] fortran: Assume there is no cyclic reference with submodule symbols [PR99798]

2024-05-12 Thread Mikael Morin
Hello,

Here is my final patch to fix the ICE of PR99798.
It's maybe overly verbose with comments, but the memory management is
hopefully clarified.
I tested this with a full fortran regression test on x86_64-linux and a
manual check with valgrind on the testcase.
OK for master?

-- 8< --

This prevents a premature release of memory with procedure symbols from
submodules, causing random compiler crashes.

The problem is a fragile detection of cyclic references, which can match
with procedures host-associated from a module in submodules, in cases where it
shouldn't.  The formal namespace is released, and with it the dummy arguments
symbols of the procedure.  But there is no cyclic reference, so the procedure
symbol itself is not released and remains, with pointers to its dummy arguments
now dangling.

The fix adds a condition to avoid the case, and refactors to a new predicate
by the way.  Part of the original condition is also removed, for lack of a
reason to keep it.

PR fortran/99798

gcc/fortran/ChangeLog:

* symbol.cc (gfc_release_symbol): Move the condition guarding
the handling cyclic references...
(cyclic_reference_break_needed): ... here as a new predicate.
Remove superfluous parts.  Add a condition preventing any premature
release with submodule symbols.

gcc/testsuite/ChangeLog:

* gfortran.dg/submodule_33.f08: New test.
---
 gcc/fortran/symbol.cc  | 54 +-
 gcc/testsuite/gfortran.dg/submodule_33.f08 | 20 
 2 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/submodule_33.f08

diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
index 8f7deac1d1e..0a1646def67 100644
--- a/gcc/fortran/symbol.cc
+++ b/gcc/fortran/symbol.cc
@@ -3179,6 +3179,57 @@ gfc_free_symbol (gfc_symbol *)
 }
 
 
+/* Returns true if the symbol SYM has, through its FORMAL_NS field, a reference
+   to itself which should be eliminated for the symbol memory to be released
+   via normal reference counting.
+
+   The implementation is crucial as it controls the proper release of symbols,
+   especially (contained) procedure symbols, which can represent a lot of 
memory
+   through the namespace of their body.
+
+   We try to avoid freeing too much memory (causing dangling pointers), to not
+   leak too much (wasting memory), and to avoid expensive walks of the symbol
+   tree (which would be the correct way to check for a cycle).  */
+
+bool
+cyclic_reference_break_needed (gfc_symbol *sym)
+{
+  /* Normal symbols don't reference themselves.  */
+  if (sym->formal_ns == nullptr)
+return false;
+
+  /* Procedures at the root of the file do have a self reference, but they 
don't
+ have a reference in a parent namespace preventing the release of the
+ procedure namespace, so they can use the normal reference counting.  */
+  if (sym->formal_ns == sym->ns)
+return false;
+
+  /* If sym->refs == 1, we can use normal reference counting.  If sym->refs > 
2,
+ the symbol won't be freed anyway, with or without cyclic reference.  */
+  if (sym->refs != 2)
+return false;
+
+  /* Procedure symbols host-associated from a module in submodules are special,
+ because the namespace of the procedure block in the submodule is different
+ from the FORMAL_NS namespace generated by host-association.  So there are
+ two different namespaces representing the same procedure namespace.  As
+ FORMAL_NS comes from host-association, which only imports symbols visible
+ from the outside (dummy arguments basically), we can assume there is no
+ self reference through FORMAL_NS in that case.  */
+  if (sym->attr.host_assoc && sym->attr.used_in_submodule)
+return false;
+
+  /* We can assume that contained procedures have cyclic references, because
+ the symbol of the procedure itself is accessible in the procedure body
+ namespace.  So we assume that symbols with a formal namespace different
+ from the declaration namespace and two references, one of which is about
+ to be removed, are procedures with just the self reference left.  At this
+ point, the symbol SYM matches that pattern, so we return true here to
+ permit the release of SYM.  */
+  return true;
+}
+
+
 /* Decrease the reference counter and free memory when we reach zero.
Returns true if the symbol has been freed, false otherwise.  */
 
@@ -3188,8 +3239,7 @@ gfc_release_symbol (gfc_symbol *)
   if (sym == NULL)
 return false;
 
-  if (sym->formal_ns != NULL && sym->refs == 2 && sym->formal_ns != sym->ns
-  && (!sym->attr.entry || !sym->module))
+  if (cyclic_reference_break_needed (sym))
 {
   /* As formal_ns contains a reference to sym, delete formal_ns just
 before the deletion of sym.  */
diff --git a/gcc/testsuite/gfortran.dg/submodule_33.f08 
b/gcc/testsuite/gfortran.dg/submodule_33.f08
new file mode 100644
index 000..b61d750def1
--- 

Re: [Patch, fortran] PR113363 - ICE on ASSOCIATE and unlimited polymorphic function

2024-05-12 Thread Paul Richard Thomas
Hi Harald,

Please find attached my resubmission for pr113363. The changes are as
follows:
(i) The chunk in gfc_conv_procedure_call is new. This was the source of one
of the memory leaks;
(ii) The incorporation of the _len field in trans_class_assignment was done
for the pr84006 patch;
(iii) The source of all the invalid memory accesses and so on was down to
the use of realloc. I tried all sorts of workarounds such as testing the
vptrs and the sizes but only free followed by malloc worked. I have no idea
at all why this is the case; and
(iv) I took account of your remarks about the chunk in trans-array.cc by
removing it and that the chunk in trans-stmt.cc would leak frontend memory.

OK for mainline (and -14 branch after a few-weeks)?

Regards

Paul

Fortran: Fix wrong code in unlimited polymorphic assignment [PR113363]

2024-05-12  Paul Thomas  

gcc/fortran
PR fortran/113363
* trans-array.cc (gfc_array_init_size): Use the expr3 dtype so
that the correct element size is used.
* trans-expr.cc (gfc_conv_procedure_call): Remove restriction
that ss and ss->loop be present for the finalization of class
array function results.
(trans_class_assignment): Use free and malloc, rather than
realloc, for character expressions assigned to unlimited poly
entities.
* trans-stmt.cc (gfc_trans_allocate): Build a correct rhs for
the assignment of an unlimited polymorphic 'source'.

gcc/testsuite/
PR fortran/113363
* gfortran.dg/pr113363.f90: New test.


> > The first chunk in trans-array.cc ensures that the array dtype is set to
> > the source dtype. The second chunk ensures that the lhs _len field does
> not
> > default to zero and so is specific to dynamic types of character.
> >
>
> Why the two gfc_copy_ref?  valgrind pointed my to the tail
> of gfc_copy_ref which already has:
>
>dest->next = gfc_copy_ref (src->next);
>
> so this looks redundant and leaks frontend memory?
>
> ***
>
> Playing with the testcase, I find several invalid writes with
> valgrind, or a heap buffer overflow with -fsanitize=address .
>
>
>
diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 7ec33fb1598..c5b56f4e273 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -5957,6 +5957,11 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
   tmp = gfc_conv_descriptor_dtype (descriptor);
   gfc_add_modify (pblock, tmp, gfc_get_dtype_rank_type (rank, type));
 }
+  else if (expr3_desc && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (expr3_desc)))
+{
+  tmp = gfc_conv_descriptor_dtype (descriptor);
+  gfc_add_modify (pblock, tmp, gfc_conv_descriptor_dtype (expr3_desc));
+}
   else
 {
   tmp = gfc_conv_descriptor_dtype (descriptor);
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 4590aa6edb4..e315e2d3370 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -8245,8 +8245,7 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	 call the finalization function of the temporary. Note that the
 	 nullification of allocatable components needed by the result
 	 is done in gfc_trans_assignment_1.  */
-  if (expr && ((gfc_is_class_array_function (expr)
-		&& se->ss && se->ss->loop)
+  if (expr && (gfc_is_class_array_function (expr)
 		   || gfc_is_alloc_class_scalar_function (expr))
 	  && se->expr && GFC_CLASS_TYPE_P (TREE_TYPE (se->expr))
 	  && expr->must_finalize)
@@ -12028,18 +12027,25 @@ trans_class_assignment (stmtblock_t *block, gfc_expr *lhs, gfc_expr *rhs,
 
   /* Reallocate if dynamic types are different. */
   gfc_init_block (_alloc);
-  tmp = fold_convert (pvoid_type_node, class_han);
-  re = build_call_expr_loc (input_location,
-builtin_decl_explicit (BUILT_IN_REALLOC), 2,
-tmp, size);
-  re = fold_build2_loc (input_location, MODIFY_EXPR, TREE_TYPE (tmp), tmp,
-			re);
-  tmp = fold_build2_loc (input_location, NE_EXPR,
-			 logical_type_node, rhs_vptr, old_vptr);
-  re = fold_build3_loc (input_location, COND_EXPR, void_type_node,
-			tmp, re, build_empty_stmt (input_location));
-  gfc_add_expr_to_block (_alloc, re);
-
+  if (UNLIMITED_POLY (lhs) && rhs->ts.type == BT_CHARACTER)
+	{
+	  gfc_add_expr_to_block (_alloc, gfc_call_free (class_han));
+	  gfc_allocate_using_malloc (_alloc, class_han, size, NULL_TREE);
+	}
+  else
+	{
+	  tmp = fold_convert (pvoid_type_node, class_han);
+	  re = build_call_expr_loc (input_location,
+builtin_decl_explicit (BUILT_IN_REALLOC),
+2, tmp, size);
+	  re = fold_build2_loc (input_location, MODIFY_EXPR, TREE_TYPE (tmp),
+tmp, re);
+	  tmp = fold_build2_loc (input_location, NE_EXPR,
+ logical_type_node, rhs_vptr, old_vptr);
+	  re = fold_build3_loc (input_location, COND_EXPR, void_type_node,
+tmp, re, build_empty_stmt (input_location));
+	  gfc_add_expr_to_block (_alloc, re);
+	}
   tree realloc_expr = lhs->ts.type == BT_CLASS ?
 	  gfc_finish_block (_alloc) :