On Tue, Oct 18, 2016 at 9:52 PM, Thomas Schwinge <tho...@codesourcery.com> wrote: > Hi! > > On Mon, 17 Oct 2016 15:38:50 +0200, I wrote: >> On Mon, 17 Oct 2016 14:08:44 +0200, Richard Biener >> <richard.guent...@gmail.com> wrote: >> > On Mon, Oct 17, 2016 at 1:47 PM, Thomas Schwinge >> > <tho...@codesourcery.com> wrote: >> > > On Mon, 17 Oct 2016 13:22:17 +0200, Richard Biener >> > > <richard.guent...@gmail.com> wrote: >> > >> On Mon, Oct 17, 2016 at 11:38 AM, Thomas Schwinge >> > >> <tho...@codesourcery.com> wrote: >> > >> > On Fri, 14 Oct 2016 13:06:59 +0200, Richard Biener >> > >> > <richard.guent...@gmail.com> wrote: >> > >> >> On Fri, Oct 14, 2016 at 1:00 PM, Nathan Sidwell <nat...@acm.org> >> > >> >> wrote: >> > >> >> > On 10/14/16 05:28, Richard Biener wrote: >> > >> >> > >> > >> >> >> The BB_VISITED flag has indetermined state at the beginning of a >> > >> >> >> pass. >> > >> >> >> You have to ensure it is cleared yourself. >> > >> >> > >> > >> >> > >> > >> >> > In that case the openacc (&nvptx?) passes should be modified to >> > >> >> > clear the >> > >> >> > flags at their start, rather than at their end. > > This already is a "conceptual acknowledgement" of my patch, so... > >> > >> > OK to commit the following? Is such a test case appropriate (which >> > >> > would >> > >> > have caught this issue right away), in particular the dg-final >> > >> > scan-tree-dump line? >> > >> >> > >> Ugh. Not worse to what we do in various dwarf scanning I guess. >> > > >> > > ;-| >> > > >> > >> Doesn't failure lead to a miscompile eventually? So you could formulate >> > >> this as a dg-do run test with a check for the desired outcome? >> > > >> > > No, unfortunately. In this case the error is "benign" such that the >> > > OpenACC loop processing machinery will decide to not parallelize loops >> > > that ought to be parallelized. >> > >> > So you can scan for "loop parallelized" instead? >> >> The dump would still contain the outer loop's "Loop 0(0)" marker, so I'd >> have to scan for "Head"/"Tail"/"UNIQUE" or similar instead -- but that >> seems likewise fragile (for false negatives), and less useful than >> scanning for the complete pattern. >> >> > I fear your pattern >> > is quite fragile >> > to maintain over time. >> >> Agreed -- but then, that's intentional: my idea for this new test case >> has been to have it actually verify the expected OpenACC loop processing, >> so it's clear that this pattern will need to be adjusted if changing the >> OpenACC loop processing. >> >> > > This won't generally cause any problem >> > > (apart from performance regression, obviously); it just caused problems >> > > in a few libgomp test cases that actually at run time test for >> > > parallelized execution -- which will/did trigger only with nvptx >> > > offloading enabled, which not too many people are testing. The test case >> > > I propose below will trigger also for non-offloading configurations. >> >> On IRC, Segher suggested to 'use {} instead of "" to avoid [all those >> backslashes]' -- thanks, done. > > If you don't like the test case as-is (do we need multi-line tree dump > scanning, just like we recently got for compiler diagnostics?), can I at > least commit the OpenACC loops processing fix? Here is the latest > version, simplified after your r241296 IRA vs. BB_VISITED fixes:
Sure, I considered that approved already (it's even obvious). Richard. > commit 766cf9959b15a17e17e89a50e905b4c546893823 > Author: Thomas Schwinge <tho...@codesourcery.com> > Date: Mon Oct 17 15:33:09 2016 +0200 > > Clear basic block flags before using BB_VISITED for OpenACC loops > processing > > gcc/ > * omp-low.c (oacc_loop_discovery): Call clear_bb_flags before, and > don't clear BB_VISITED after processing. > > gcc/testsuite/ > * gcc.dg/goacc/loop-processing-1.c: New file. > --- > gcc/omp-low.c | 8 +++----- > gcc/testsuite/gcc.dg/goacc/loop-processing-1.c | 18 ++++++++++++++++++ > 2 files changed, 21 insertions(+), 5 deletions(-) > > diff --git gcc/omp-low.c gcc/omp-low.c > index 77f89d5..3ef796f 100644 > --- gcc/omp-low.c > +++ gcc/omp-low.c > @@ -19236,7 +19236,9 @@ oacc_loop_sibling_nreverse (oacc_loop *loop) > static oacc_loop * > oacc_loop_discovery () > { > - basic_block bb; > + /* Clear basic block flags, in particular BB_VISITED which we're going to > use > + in the following. */ > + clear_bb_flags (); > > oacc_loop *top = new_oacc_loop_outer (current_function_decl); > oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun)); > @@ -19245,10 +19247,6 @@ oacc_loop_discovery () > that diagnostics come out in an unsurprising order. */ > top = oacc_loop_sibling_nreverse (top); > > - /* Reset the visited flags. */ > - FOR_ALL_BB_FN (bb, cfun) > - bb->flags &= ~BB_VISITED; > - > return top; > } > > diff --git gcc/testsuite/gcc.dg/goacc/loop-processing-1.c > gcc/testsuite/gcc.dg/goacc/loop-processing-1.c > new file mode 100644 > index 0000000..619576a > --- /dev/null > +++ gcc/testsuite/gcc.dg/goacc/loop-processing-1.c > @@ -0,0 +1,18 @@ > +/* Make sure that OpenACC loop processing happens. */ > +/* { dg-additional-options "-O2 -fdump-tree-oaccdevlow" } */ > + > +extern int place (); > + > +void vector_1 (int *ary, int size) > +{ > +#pragma acc parallel num_workers (32) vector_length(32) copy(ary[0:size]) > firstprivate (size) > + { > +#pragma acc loop gang > + for (int jx = 0; jx < 1; jx++) > +#pragma acc loop auto > + for (int ix = 0; ix < size; ix++) > + ary[ix] = place (); > + } > +} > + > +/* { dg-final { scan-tree-dump {OpenACC loops.*Loop 0\(0\).*Loop > 14\(1\).*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, > 20\);.*Head-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, > 20\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, > 0\);.*Tail-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_TAIL_MARK, > \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_JOIN, > \.data_dep\.[0-9_]+, 0\);.*Loop 6\(4\).*\.data_dep\.[0-9_]+ = UNIQUE > \(OACC_HEAD_MARK, 0, 1, 6\);.*Head-0:.*\.data_dep\.[0-9_]+ = UNIQUE > \(OACC_HEAD_MARK, 0, 1, 6\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_FORK, > \.data_dep\.[0-9_]+, 2\);.*Tail-0:.*\.data_dep\.[0-9_]+ = UNIQUE > \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = UNIQUE > \(OACC_JOIN, \.data_dep\.[0-9_]+, 2\);} "oaccdevlow" } } */ > > > Grüße > Thomas