Hi!

On Mon, 17 Oct 2016 15:38:50 +0200, I wrote:
> On Mon, 17 Oct 2016 14:08:44 +0200, Richard Biener 
> <richard.guent...@gmail.com> wrote:
> > On Mon, Oct 17, 2016 at 1:47 PM, Thomas Schwinge
> > <tho...@codesourcery.com> wrote:
> > > On Mon, 17 Oct 2016 13:22:17 +0200, Richard Biener 
> > > <richard.guent...@gmail.com> wrote:
> > >> On Mon, Oct 17, 2016 at 11:38 AM, Thomas Schwinge
> > >> <tho...@codesourcery.com> wrote:
> > >> > On Fri, 14 Oct 2016 13:06:59 +0200, Richard Biener 
> > >> > <richard.guent...@gmail.com> wrote:
> > >> >> On Fri, Oct 14, 2016 at 1:00 PM, Nathan Sidwell <nat...@acm.org> 
> > >> >> wrote:
> > >> >> > On 10/14/16 05:28, Richard Biener wrote:
> > >> >> >
> > >> >> >> The BB_VISITED flag has indetermined state at the beginning of a 
> > >> >> >> pass.
> > >> >> >> You have to ensure it is cleared yourself.
> > >> >> >
> > >> >> >
> > >> >> > In that case the openacc (&nvptx?) passes should be modified to 
> > >> >> > clear the
> > >> >> > flags at their start, rather than at their end.

This already is a "conceptual acknowledgement" of my patch, so...

> > >> > OK to commit the following?  Is such a test case appropriate (which 
> > >> > would
> > >> > have caught this issue right away), in particular the dg-final
> > >> > scan-tree-dump line?
> > >>
> > >> Ugh.  Not worse to what we do in various dwarf scanning I guess.
> > >
> > > ;-|
> > >
> > >> Doesn't failure lead to a miscompile eventually?  So you could formulate
> > >> this as a dg-do run test with a check for the desired outcome?
> > >
> > > No, unfortunately.  In this case the error is "benign" such that the
> > > OpenACC loop processing machinery will decide to not parallelize loops
> > > that ought to be parallelized.
> > 
> > So you can scan for "loop parallelized" instead?
> 
> The dump would still contain the outer loop's "Loop 0(0)" marker, so I'd
> have to scan for "Head"/"Tail"/"UNIQUE" or similar instead -- but that
> seems likewise fragile (for false negatives), and less useful than
> scanning for the complete pattern.
> 
> > I fear your pattern
> > is quite fragile
> > to maintain over time.
> 
> Agreed -- but then, that's intentional: my idea for this new test case
> has been to have it actually verify the expected OpenACC loop processing,
> so it's clear that this pattern will need to be adjusted if changing the
> OpenACC loop processing.
> 
> > >  This won't generally cause any problem
> > > (apart from performance regression, obviously); it just caused problems
> > > in a few libgomp test cases that actually at run time test for
> > > parallelized execution -- which will/did trigger only with nvptx
> > > offloading enabled, which not too many people are testing.  The test case
> > > I propose below will trigger also for non-offloading configurations.
> 
> On IRC, Segher suggested to 'use {} instead of "" to avoid [all those
> backslashes]' -- thanks, done.

If you don't like the test case as-is (do we need multi-line tree dump
scanning, just like we recently got for compiler diagnostics?), can I at
least commit the OpenACC loops processing fix?  Here is the latest
version, simplified after your r241296 IRA vs. BB_VISITED fixes:

commit 766cf9959b15a17e17e89a50e905b4c546893823
Author: Thomas Schwinge <tho...@codesourcery.com>
Date:   Mon Oct 17 15:33:09 2016 +0200

    Clear basic block flags before using BB_VISITED for OpenACC loops processing
    
        gcc/
        * omp-low.c (oacc_loop_discovery): Call clear_bb_flags before, and
        don't clear BB_VISITED after processing.
    
        gcc/testsuite/
        * gcc.dg/goacc/loop-processing-1.c: New file.
---
 gcc/omp-low.c                                  |  8 +++-----
 gcc/testsuite/gcc.dg/goacc/loop-processing-1.c | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git gcc/omp-low.c gcc/omp-low.c
index 77f89d5..3ef796f 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -19236,7 +19236,9 @@ oacc_loop_sibling_nreverse (oacc_loop *loop)
 static oacc_loop *
 oacc_loop_discovery ()
 {
-  basic_block bb;
+  /* Clear basic block flags, in particular BB_VISITED which we're going to use
+     in the following.  */
+  clear_bb_flags ();
   
   oacc_loop *top = new_oacc_loop_outer (current_function_decl);
   oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun));
@@ -19245,10 +19247,6 @@ oacc_loop_discovery ()
      that diagnostics come out in an unsurprising order.  */
   top = oacc_loop_sibling_nreverse (top);
 
-  /* Reset the visited flags.  */
-  FOR_ALL_BB_FN (bb, cfun)
-    bb->flags &= ~BB_VISITED;
-
   return top;
 }
 
diff --git gcc/testsuite/gcc.dg/goacc/loop-processing-1.c 
gcc/testsuite/gcc.dg/goacc/loop-processing-1.c
new file mode 100644
index 0000000..619576a
--- /dev/null
+++ gcc/testsuite/gcc.dg/goacc/loop-processing-1.c
@@ -0,0 +1,18 @@
+/* Make sure that OpenACC loop processing happens.  */
+/* { dg-additional-options "-O2 -fdump-tree-oaccdevlow" } */
+
+extern int place ();
+
+void vector_1 (int *ary, int size)
+{
+#pragma acc parallel num_workers (32) vector_length(32) copy(ary[0:size]) 
firstprivate (size)
+  {
+#pragma acc loop gang
+    for (int jx = 0; jx < 1; jx++)
+#pragma acc loop auto
+      for (int ix = 0; ix < size; ix++)
+       ary[ix] = place ();
+  }
+}
+
+/* { dg-final { scan-tree-dump {OpenACC loops.*Loop 0\(0\).*Loop 
14\(1\).*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 
20\);.*Head-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 
20\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 
0\);.*Tail-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_TAIL_MARK, 
\.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_JOIN, 
\.data_dep\.[0-9_]+, 0\);.*Loop 6\(4\).*\.data_dep\.[0-9_]+ = UNIQUE 
\(OACC_HEAD_MARK, 0, 1, 6\);.*Head-0:.*\.data_dep\.[0-9_]+ = UNIQUE 
\(OACC_HEAD_MARK, 0, 1, 6\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_FORK, 
\.data_dep\.[0-9_]+, 2\);.*Tail-0:.*\.data_dep\.[0-9_]+ = UNIQUE 
\(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = UNIQUE 
\(OACC_JOIN, \.data_dep\.[0-9_]+, 2\);} "oaccdevlow" } } */


Grüße
 Thomas

Reply via email to