On November 16, 2018 10:10:00 PM GMT+01:00, Jakub Jelinek <ja...@redhat.com> 
wrote:
>Hi!
>
>As mentioned in the PR, the PR78363 changes were done too early, as
>some of
>the BLOCKs are moved to the outlined function, it is undesirable to
>generate
>DW_TAG_lexical_block DIEs for them in a different function from which
>they
>actually end up in.
>
>Those changes for parallel aren't necessary since Kevin's r253335
>change,
>so this patch
>1) reverts Richi's early (*debug_hooks->early_global_decl)
>(cfun->decl);
>   additions
>2) makes sure the same thing as for parallel is done also for other
>   OpenMP constructs
>3) adjust_context_and_scope is simplified (the BLOCK lookup must have
>   always turned just current_function_decl
>4) changes it, because if e.g. a parallel (or whatever other OpenMP
>construct that generates an outlined function) is nested in another one
>  that generates an outlined function, IMHO we want as DECL_CONTEXT the
>   immediate parent rather than ultimate parent and OMP expansion works
>   from leaf regions to outer ones
>5) I had to change dwarf2out_early_global_decl, because it didn't
>handle
>   properly the case when a function had non-NULL decl_function_context
> and that context FUNCTION_DECL had non-NULL decl_function_context too,
>   in that case we want to first handle the ultimate context, then the
>   child of that etc.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux.
>
>Richard/Jason, is the dwarf2out.c change ok for trunk (the rest I can
>ack
>myself)?

Can you add a comment why doing it differently for this case is necessary or do 
it the same way in all cases? I guess there might be latent issues for nested 
functions inside nested functions? 

Richard. 

>
>Kevin, did you have some gdb testcases you were trying for the r253335
>change?  Can you try those with added another e.g. parallel around it
>(say
>#pragma omp parallel if (0) or num_threads (1), so that it doesn't
>spawn too
>many children and see if you can check variables not mentioned in the
>inner
>parallel inside of the outer parallel, or even outside of the outer
>parallel?
>
>Martin, I haven't touched the grid stuff, can you check if it is needed
>(such as adjust one of the pr78363* testcases) and fix?
>
>Thanks.
>
>2018-11-16  Jakub Jelinek  <ja...@redhat.com>
>
>       PR debug/87039
>       * omp-expand.c: Don't include debug.h.
>       (adjust_context_and_scope): Add REGION argument.  Find DECL_CONTEXT
>       from innermost outer parallel, task, teams or target that has a
>       child_fn set, or, if there is no such outer region, use
>       current_function_decl.  Do the DECL_CONTEXT adjustment regardless of
>       whether a suitable BLOCK is found or not.
>       (expand_parallel_call, expand_teams_call): Don't call
>       adjust_context_and_scope here.
>       (grid_expand_target_grid_body): Revert 2017-01-25 changes.
>       (expand_omp_taskreg, expand_omp_target): Likewise.  Call
>       adjust_context_and_scope.
>       * dwarf2out.c (dwarf2out_early_global_decl): For
>       decl_function_context recurse instead of calling dwarf2out_decl.
>
>       * g++.dg/gomp/pr78363-4.C: New test.
>       * g++.dg/gomp/pr78363-5.C: New test.
>       * g++.dg/gomp/pr78363-6.C: New test.
>       * g++.dg/gomp/pr78363-7.C: New test.
>
>--- gcc/omp-expand.c.jj        2018-11-15 12:34:36.194806156 +0100
>+++ gcc/omp-expand.c   2018-11-16 11:27:11.722968398 +0100
>@@ -56,7 +56,6 @@ along with GCC; see the file COPYING3.
> #include "gomp-constants.h"
> #include "gimple-pretty-print.h"
> #include "hsa-common.h"
>-#include "debug.h"
> #include "stringpool.h"
> #include "attribs.h"
> 
>@@ -517,27 +516,44 @@ parallel_needs_hsa_kernel_p (struct omp_
>    function will be emitted with the correct lexical scope.  */
> 
> static void
>-adjust_context_and_scope (tree entry_block, tree child_fndecl)
>+adjust_context_and_scope (struct omp_region *region, tree entry_block,
>+                        tree child_fndecl)
> {
>+  tree parent_fndecl = NULL_TREE;
>+  gimple *entry_stmt;
>+  /* OMP expansion expands inner regions before outer ones, so if
>+     we e.g. have explicit task region nested in parallel region, when
>+     expanding the task region current_function_decl will be the
>original
>+     source function, but we actually want to use as context the child
>+     function of the parallel.  */
>+  for (region = region->outer;
>+       region && parent_fndecl == NULL_TREE; region = region->outer)
>+    switch (region->type)
>+      {
>+      case GIMPLE_OMP_PARALLEL:
>+      case GIMPLE_OMP_TASK:
>+      case GIMPLE_OMP_TEAMS:
>+      entry_stmt = last_stmt (region->entry);
>+      parent_fndecl = gimple_omp_taskreg_child_fn (entry_stmt);
>+      break;
>+      case GIMPLE_OMP_TARGET:
>+      entry_stmt = last_stmt (region->entry);
>+      parent_fndecl
>+        = gimple_omp_target_child_fn (as_a <gomp_target *> (entry_stmt));
>+      break;
>+      default:
>+      break;
>+      }
>+
>+  if (parent_fndecl == NULL_TREE)
>+    parent_fndecl = current_function_decl;
>+  DECL_CONTEXT (child_fndecl) = parent_fndecl;
>+
>   if (entry_block != NULL_TREE && TREE_CODE (entry_block) == BLOCK)
>     {
>       tree b = BLOCK_SUPERCONTEXT (entry_block);
>-
>       if (TREE_CODE (b) == BLOCK)
>         {
>-        tree parent_fndecl;
>-
>-        /* Follow supercontext chain until the parent fndecl
>-           is found.  */
>-        for (parent_fndecl = BLOCK_SUPERCONTEXT (b);
>-             TREE_CODE (parent_fndecl) == BLOCK;
>-             parent_fndecl = BLOCK_SUPERCONTEXT (parent_fndecl))
>-          ;
>-
>-        gcc_assert (TREE_CODE (parent_fndecl) == FUNCTION_DECL);
>-
>-        DECL_CONTEXT (child_fndecl) = parent_fndecl;
>-
>         DECL_CHAIN (child_fndecl) = BLOCK_VARS (b);
>         BLOCK_VARS (b) = child_fndecl;
>       }
>@@ -723,8 +739,6 @@ expand_parallel_call (struct omp_region
>   tree child_fndecl = gimple_omp_parallel_child_fn (entry_stmt);
>   t2 = build_fold_addr_expr (child_fndecl);
> 
>-  adjust_context_and_scope (gimple_block (entry_stmt), child_fndecl);
>-
>   vec_alloc (args, 4 + vec_safe_length (ws_args));
>   args->quick_push (t2);
>   args->quick_push (t1);
>@@ -952,8 +966,6 @@ expand_teams_call (basic_block bb, gomp_
>   tree child_fndecl = gimple_omp_teams_child_fn (entry_stmt);
>   tree t2 = build_fold_addr_expr (child_fndecl);
> 
>-  adjust_context_and_scope (gimple_block (entry_stmt), child_fndecl);
>-
>   vec<tree, va_gc> *args;
>   vec_alloc (args, 5);
>   args->quick_push (t2);
>@@ -1412,11 +1424,6 @@ expand_omp_taskreg (struct omp_region *r
>       else
>       block = gimple_block (entry_stmt);
> 
>-      /* Make sure to generate early debug for the function before
>-         outlining anything.  */
>-      if (! gimple_in_ssa_p (cfun))
>-      (*debug_hooks->early_global_decl) (cfun->decl);
>-
>new_bb = move_sese_region_to_fn (child_cfun, entry_bb, exit_bb, block);
>       if (exit_bb)
>       single_succ_edge (new_bb)->flags = EDGE_FALLTHRU;
>@@ -1497,6 +1504,8 @@ expand_omp_taskreg (struct omp_region *r
>       }
>     }
> 
>+  adjust_context_and_scope (region, gimple_block (entry_stmt),
>child_fn);
>+
>   if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
>     expand_parallel_call (region, new_bb,
>                         as_a <gomp_parallel *> (entry_stmt), ws_args);
>@@ -7399,11 +7408,6 @@ expand_omp_target (struct omp_region *re
>         gsi_remove (&gsi, true);
>       }
> 
>-      /* Make sure to generate early debug for the function before
>-         outlining anything.  */
>-      if (! gimple_in_ssa_p (cfun))
>-      (*debug_hooks->early_global_decl) (cfun->decl);
>-
>       /* Move the offloading region into CHILD_CFUN.  */
> 
>       block = gimple_block (entry_stmt);
>@@ -7480,6 +7484,8 @@ expand_omp_target (struct omp_region *re
>         dump_function_header (dump_file, child_fn, dump_flags);
>         dump_function_to_file (child_fn, dump_file, dump_flags);
>       }
>+
>+      adjust_context_and_scope (region, gimple_block (entry_stmt),
>child_fn);
>     }
> 
>   /* Emit a library call to launch the offloading region, or do data
>@@ -7977,11 +7983,6 @@ grid_expand_target_grid_body (struct omp
>   init_tree_ssa (cfun);
>   pop_cfun ();
> 
>-  /* Make sure to generate early debug for the function before
>-     outlining anything.  */
>-  if (! gimple_in_ssa_p (cfun))
>-    (*debug_hooks->early_global_decl) (cfun->decl);
>-
>   tree old_parm_decl = DECL_ARGUMENTS (kern_fndecl);
>   gcc_assert (!DECL_CHAIN (old_parm_decl));
>   tree new_parm_decl = copy_node (DECL_ARGUMENTS (kern_fndecl));
>--- gcc/dwarf2out.c.jj 2018-11-16 10:22:19.130250543 +0100
>+++ gcc/dwarf2out.c    2018-11-16 11:30:04.267118454 +0100
>@@ -26403,7 +26403,7 @@ dwarf2out_early_global_decl (tree decl)
>                enough so that it lands in its own context.  This avoids type
>                pruning issues later on.  */
>             if (context_die == NULL || is_declaration_die (context_die))
>-              dwarf2out_decl (context);
>+              dwarf2out_early_global_decl (context);
>           }
> 
>         /* Emit an abstract origin of a function first.  This happens
>--- gcc/testsuite/g++.dg/gomp/pr78363-4.C.jj   2018-11-16
>11:23:21.862765041 +0100
>+++ gcc/testsuite/g++.dg/gomp/pr78363-4.C      2018-11-16 11:23:47.752337412
>+0100
>@@ -0,0 +1,18 @@
>+// { dg-do compile }
>+// { dg-require-effective-target c++11 }
>+// { dg-additional-options "-g" }
>+
>+int main()
>+{
>+  int n = 0;
>+
>+#pragma omp parallel
>+#pragma omp master
>+#pragma omp parallel
>+#pragma omp master
>+#pragma omp parallel for reduction (+: n)
>+  for (int i = [](){ return 3; }(); i < 10; ++i)
>+    n++;
>+
>+  return n;
>+}
>--- gcc/testsuite/g++.dg/gomp/pr78363-5.C.jj   2018-11-16
>11:24:07.487011455 +0100
>+++ gcc/testsuite/g++.dg/gomp/pr78363-5.C      2018-11-16 11:24:29.705644466
>+0100
>@@ -0,0 +1,17 @@
>+// { dg-do compile }
>+// { dg-require-effective-target c++11 }
>+// { dg-additional-options "-g" }
>+
>+int main()
>+{
>+  int n = 0;
>+#pragma omp task shared(n)
>+#pragma omp target map(tofrom:n)
>+#pragma omp for reduction (+: n)
>+  for (int i = [](){ return 3; }(); i < 10; ++i)
>+    n++;
>+  if (n != 7)
>+    __builtin_abort ();
>+#pragma omp taskwait
>+  return 0;
>+}
>--- gcc/testsuite/g++.dg/gomp/pr78363-6.C.jj   2018-11-16
>11:24:45.195388617 +0100
>+++ gcc/testsuite/g++.dg/gomp/pr78363-6.C      2018-11-16 11:25:02.548102003
>+0100
>@@ -0,0 +1,16 @@
>+// { dg-do compile }
>+// { dg-require-effective-target c++11 }
>+// { dg-additional-options "-g" }
>+
>+int main()
>+{
>+  int n = 0;
>+#pragma omp parallel
>+#pragma omp master
>+#pragma omp task shared (n)
>+  for (int i = [](){ return 3; }(); i < 10; ++i)
>+    n = i;
>+#pragma omp taskwait
>+  if (n != 7)
>+    __builtin_abort ();
>+}
>--- gcc/testsuite/g++.dg/gomp/pr78363-7.C.jj   2018-11-16
>11:25:17.285858571 +0100
>+++ gcc/testsuite/g++.dg/gomp/pr78363-7.C      2018-11-16 11:25:37.943517369
>+0100
>@@ -0,0 +1,15 @@
>+// { dg-do compile }
>+// { dg-require-effective-target c++11 }
>+// { dg-additional-options "-g" }
>+
>+int main()
>+{
>+  int n = 0;
>+
>+#pragma omp target map(tofrom: n)
>+#pragma omp parallel for reduction (+: n)
>+  for (int i = [](){ return 3; }(); i < 10; ++i)
>+    n++;
>+
>+  return n;
>+}
>
>       Jakub

Reply via email to