Re: [PATCH 1/3] OpenMP: C support for imperfectly-nested loops

Sandra Loosemore Wed, 14 Jun 2023 15:41:56 -0700

High-order bit:  I've just committed OG13 version of these patches that is 
integrated with Frederik's previous loop transformation patches that are 
already on that branch.  The OG13 version incorporates many of the suggestions 
from this initial review plus a few bug fixes.  I've also made corresponding 
fixes to the mainline version but I've still got a lot of unfinished items, 
mostly related to additional tests for corner cases.


On 5/25/23 04:00, Jakub Jelinek wrote:

On Fri, Apr 28, 2023 at 05:22:52PM -0600, Sandra Loosemore wrote:

OpenMP 5.0 removed the restriction that multiple collapsed loops must
be perfectly nested, allowing "intervening code" (including nested
BLOCKs) before or after each nested loop.  In GCC this code is moved
into the inner loop body by the respective front ends.

This patch changes the C front end to use recursive descent parsing
on nested loops within an "omp for" construct, rather than an iterative
approach, in order to preserve proper nesting of compound statements.

gcc/c/ChangeLog
        * c-parser.cc (struct c_parser): Add omp_for_parse_state field.
        (struct omp_for_parse_data): New.
        (c_parser_compound_statement_nostart): Special-case nested
        OMP loops and blocks in intervening code.
        (c_parser_while_statement): Reject in intervening code.
        (c_parser_do_statement): Likewise.
        (c_parser_for_statement): Likewise.
        (c_parser_postfix_expression_after_primary): Reject calls to OMP
        runtime routines in intervening code.
        (c_parser_pragma): Reject OMP pragmas in intervening code.
        (c_parser_omp_loop_nest): New, split from c_parser_omp_for_loop.
        (c_parser_omp_for_loop): Rewrite to use recursive descent and
        generalize handling for intervening code.

gcc/ChangeLog
        * omp-api.h: New file.


Why?  Just add those to omp-general.h.


This is for the Fortran front end, which needs this stuff without everything 
else omp-general.h sucks in.  I remember that I initially did try to put it in 
omp-general.h but split it out when I ran into some trouble with that, and I 
thought it was an abstraction violation in the Fortran front end.  I didn't 
touch this for now; is it important enough that I should spend more time on it?

        * omp-general.cc (omp_runtime_api_procname): New.
        (omp_runtime_api_call): Moved here from omp-low.cc, and make
        non-static.
        * omp-general.h: Include omp-api.h.
        * omp-low.cc (omp_runtime_api_call): Delete this copy.

gcc/testsuite/ChangeLog
        * c-c++-common/goacc/collapse-1.c: Adjust expected error messages.
        * c-c++-common/goacc/tile-2.c: Likewise.
        * c-c++-common/gomp/imperfect1.c: New.
        * c-c++-common/gomp/imperfect2.c: New.
        * c-c++-common/gomp/imperfect3.c: New.
        * c-c++-common/gomp/imperfect4.c: New.
        * c-c++-common/gomp/imperfect5.c: New.
        * gcc.dg/gomp/collapse-1.c: Adjust expected error messages.

libgomp/ChangeLog
        * testsuite/libgomp.c-c++-common/imperfect1.c: New.
        * testsuite/libgomp.c-c++-common/imperfect2.c: New.
        * testsuite/libgomp.c-c++-common/imperfect3.c: New.
        * testsuite/libgomp.c-c++-common/imperfect4.c: New.
        * testsuite/libgomp.c-c++-common/imperfect5.c: New.
        * testsuite/libgomp.c-c++-common/imperfect6.c: New.
        * testsuite/libgomp.c-c++-common/offload-imperfect1.c: New.
        * testsuite/libgomp.c-c++-common/offload-imperfect2.c: New.
        * testsuite/libgomp.c-c++-common/offload-imperfect3.c: New.
        * testsuite/libgomp.c-c++-common/offload-imperfect4.c: New.


If the 3 patches are going to be committed separately (which I think is a
good idea), then the *c-c++-common* tests are a problem, because the tests
will then fail after the C FE part is committed before the C++ FE part is
committed.
For the new tests there are 2 options, one is commit them in the C patch
with /* { dg-do run { target c } } */ instead of just
/* { dg-do run } */ etc. and then in the second patch remove those
" { target c }" parts, or commit them in the second patch only.
For the existing tests with adjustments, do the { target c } vs.
{ target c++ } games and tweak in the second patch.


OK, I've split the new c-c++-common tests into a separate commit, and done the 
other rigamarole to adjust the other test cases incrementally with each part of 
the series.

The offload-imperfect* tests should be called target-imperfect* I think,
for consistency with other tests.


Done.

In the gcc/testsuite/c-c++-common/gomp/ tests I miss some coverage for
the boundary cases what is and isn't intervening code.
Before your changes, we were allowing multiple levels of {}s,
so
#pragma omp for ordered(2)
for (int i = 0; i < 64; i++)
   {
     {
       {
         for (int j = 0; j < 64; j++)
           ;
       }
     }
   }
which is valid in 5.0 (but should be tested in the testsuite), but also
empty statements, which when reading the 5.1/5.2 spec don't actually seem to
be valid.
#pragma omp for ordered(2)
for (int i = 0; i < 64; i++)
   {
     ;
     ;
     ;
     for (int j = 0; j < 64; j++)
       ;
     ;
   }
because even the empty statement is I think intervening code according to
the grammar.


Do you have recommendations for what the behavior should be now?  Reject these 
constructs with an error?  Accept them with a warning?  Or accept them quietly 
for backward compatibility?

Another thing I don't really see covered in the testsuite nor in the code
is if some variable declared in intervening code is then used in the inner
loop's init/cond/incr expressions.  I mean something like:
#pragma omp for collapse(2)
for (int i = 0; i < 64; i++)
   {
     int v = (i + 4) * 2;
     for (int j = v; j < 64; j++)
       ;
   }
That just ICEs with your patch, we should diagnose it as invalid.
In the canonical loop form requirements the standard requires that the
expressions are loop invariant with the exceptions of specific cases
allowed for non-rectangular loops.  So, above I'm sure that is violated.
Another case would be
#pragma omp for collapse(2)
for (int i = 0; i < 64; i++)
   {
     int v = (i + 4);
     for (int j = v; j < 64; j++)
       ;
   }
but here it isn't invariant either, although for (int j = i + 4; j < 64; j++)
would be valid non-rectangular loop I think it requires the exact syntax.
Yet another case is
#pragma omp for collapse(2)
for (int i = 0; i < 64; i++)
   {
     int v = 8;
     for (int j = v; j < 64; j++)
       ;
   }
This actually is loop invariant (the value) and not really sure what would
in the standard prevent it from being valid, but it doesn't feel right and
can't be handled easily, because one should be able to compute the number
of iterations before the loop where v isn't in scope or there could be a
different v in scope.
And/or for C++
#pragma omp for collapse(2)
for (int i = 0; i < 64; i++)
   {
     const int v = 8; // or even constexpr
     for (int j = v; j < 64; j++)
       ;
   }
where already during parsing we won't really know we are actually using v
if it isn't ODR used.  Or unless it is in a template and the const/constexpr
variable is value or type dependent.
I think we want to discuss this in omp-lang.


I haven't tackled this group of corner cases yet.  I am not an OpenMP expert 
and I'd be inclined just to leave FIXMEs for anything that is not entirely 
clear in the spec, though (and just forget about cases where the spec clearly 
says something else).

@@ -1525,6 +1529,23 @@ struct oacc_routine_data {
  /* Used for parsing objc foreach statements.  */
  static tree objc_foreach_break_label, objc_foreach_continue_label;

+/* Used for parsing OMP for loops. See c_parser_omp_for_loop. */

+struct omp_for_parse_data {
+  tree declv, condv, incrv, initv;
+  tree pre_body;
+  tree bindings;
+  int collapse;


I think it is confusing to call this collapse when it isn't collapse
but ordered ? ordered : collapse.
Call it count like in c_parser_omp_for_loop?


Done.

@@ -6123,6 +6145,15 @@ c_parser_compound_statement_nostart (c_parser *parser)
    bool last_label = false;
    bool save_valid_for_pragma = valid_location_for_stdc_pragma_p ();
    location_t label_loc = UNKNOWN_LOCATION;  /* Quiet warning.  */
+  struct omp_for_parse_data *omp_for_parse_state;
+
+  if (parser->omp_for_parse_state
+      && (parser->omp_for_parse_state->depth
+         < parser->omp_for_parse_state->collapse - 1))
+    omp_for_parse_state = parser->omp_for_parse_state;
+  else
+    omp_for_parse_state = NULL;
+
    if (c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
      {
        location_t endloc = c_parser_peek_token (parser)->location;


So, here we skip over various cases that need to be covered in the testcases
and decided what to do about.
The first one is:
#pragma omp for collapse(2)
for (int i = 0; i < 64; ++i)
   {
     __label__ a, b;
     goto a; a:;
     for (int j = 0; j < 64; ++j)
       ;
     goto b; b:;
   }
This is a GNU extension, do we want to allow it?
Conceptually it is like variable declarations which can appear in
intervening code, but we need to make sure it is handled properly
(the code registers those in the current scope using declare_label, so
supposedly we want to move those later to the body scope when moving there
other intervening code.  And probably it should be rejected when intervening
code is not allowed (+ test for that).

Next are C2X standard attributes, case XX:/default: (I think these would
violate the single entry single exit requirements of intervening code),
label: (that can be fine as long as it is only jumped to from within the
same intervening code), so again something for the testsuite:
int k = 0;
#pragma omp for collapse(2)
for (int i = 0; i < 64; ++i)
   {
     a: if (k) goto a;
     for (int j = 0; j < 64; ++j)
       ;
     b: if (k) goto b;
   }
Then __extension__, again something that should be considered whether it
acts as intervening code, or is even allowed as if it wasn't intervening
code + testsuite coverage.
Then pragmas which you handle in c_parser_pragma already (but will need some
tweaks for tile/unroll from Frederik).
else is an error, so nothing needs to be done about it.


I haven't done these testcases + fixes yet either, although the integration 
with Frederik's tile/unroll work is included in the OG13 version of the patches 
I've just committed.

@@ -7138,6 +7223,16 @@ c_parser_while_statement (c_parser *parser, bool ivdep, 
unsigned short unroll,
    gcc_assert (c_parser_next_token_is_keyword (parser, RID_WHILE));
    token_indent_info while_tinfo
      = get_token_indent_info (c_parser_peek_token (parser));
+
+  if (parser->omp_for_parse_state
+      && (parser->omp_for_parse_state->depth
+         < parser->omp_for_parse_state->collapse - 1))
+    {
+      error_at (c_parser_peek_token (parser)->location,
+               "loop not permitted in intervening code in OMP loop body");


Please use OpenMP in diagnostics rather than OMP (multiple times).


Done.

Also, the while and do cases are not covered in the testsuite, they should
be (there is just test for for in there).
Would be nice to also test it nested in other compound statements, like
#pragma omp for collapse(2)
for (int i = 0; i < 64; ++i)
   {
     if (1)
       {
        for (int j = 0; j < 64; ++j) // { dg-error "..." }
          ;
        while (0) ;                     // { dg-error "..." }
        do { } while (0);               // { dg-error "..." }
       }
     for (int k = 0; k < 64; ++k)
       ;
   }
I guess especially the do { ... } while (0); limitation could be quite
harmful, that is something used heavily in various macros.
So perhaps for OpenMP 6.0 we should consider there rejecting just
for statements for C/C++?  And/or also allow for loops if inside of some
substatement of say if/do/while/switch, those are also clearly
distinguishable from the main loop.  On the other side, I've never been
really excited by this imperfectly nested loops mess making it into the
standard.


Like I said, I'm not an OpenMP expert and I don't want to mess with anything 
that isn't already clear in the spec, so my inclination would be not wander off 
proposing/implementing extensions.

+      parser->omp_for_parse_state->fail = true;
+    }
+
    c_parser_consume_token (parser);
    block = c_begin_compound_stmt (flag_isoc99);
    loc = c_parser_peek_token (parser)->location;
@@ -11234,6 +11349,14 @@ c_parser_postfix_expression_after_primary (c_parser 
*parser,
                  && fndecl_built_in_p (expr.value, BUILT_IN_NORMAL)
                  && vec_safe_length (exprlist) == 1)
                warn_for_abs (expr_loc, expr.value, (*exprlist)[0]);
+             if (parser->omp_for_parse_state
+                 && parser->omp_for_parse_state->in_intervening_code
+                 && omp_runtime_api_call (expr.value))
+               {
+                 error_at (expr_loc, "Calls to the OpenMP runtime API are "


s/Calls/calls/, C/C++ FE/middle end diagnostics never start with capital
letter unless it is part of some shorthand/word always spelled with capital
letters.


Done.

+  if (decl == NULL || decl == error_mark_node || init == error_mark_node)
+    omp_for_parse_state->fail = true;
+  else
+    {
+      TREE_VEC_ELT (omp_for_parse_state->declv, omp_for_parse_state->depth)
+       = decl;
+      TREE_VEC_ELT (omp_for_parse_state->initv, omp_for_parse_state->depth)
+       = init;
+      TREE_VEC_ELT (omp_for_parse_state->condv, omp_for_parse_state->depth)
+       = cond;
+      TREE_VEC_ELT (omp_for_parse_state->incrv, omp_for_parse_state->depth)
+       = incr;


This would be more readable using a temporary:
       int depth = omp_for_parse_state->depth;
       TREE_VEC_ELT (omp_for_parse_state->declv, depth) = decl;
       TREE_VEC_ELT (omp_for_parse_state->initv, depth) = init;
       TREE_VEC_ELT (omp_for_parse_state->condv, depth) = cond;
       TREE_VEC_ELT (omp_for_parse_state->incrv, depth) = incr;


Done.

+    }
+
+parse_next:
+  omp_for_parse_state->want_nested_loop = true;
+  moreloops = omp_for_parse_state->depth < omp_for_parse_state->collapse - 1;


Shouldn't omp_for_parse_state->want_nested_loop be initialized to moreloops
here?


Done.

And/or temporarily clear
parser->omp_for_parse_state here and reset it back afterwards?
Then
   if (parser->omp_for_parse_state
       && (parser->omp_for_parse_state->depth
           < parser->omp_for_parse_state->collapse - 1))
     omp_for_parse_state = parser->omp_for_parse_state;
   else
     omp_for_parse_state = NULL;
etc. wouldn't be really needed, if parser->omp_for_parse_state would
be non-NULL, we'd want to use it without further checks.


Done.

+  /* Pop and return the implicit scope surrounding this level of loop.
+     Any iteration variable bound in loop_scope is pulled out and later
+     will be added to the scope surrounding the entire OMP_FOR.  That
+     keeps the gimplifier happy later on, and meanwhile we have already
+     resolved all references to the iteration variable in its true scope.  */
+  add_stmt (body);
+  body = c_end_compound_stmt (loc, loop_scope, true);
+  if (decl && TREE_CODE (body) == BIND_EXPR)


So, this moves just the iteration var if declared in the for loop and nothing 
else?


Yes.  I've updated the comments to try to explain this better.

+#pragma omp for ordered(3)
+  for (i = 0; i < a1; i++)  /* { dg-error "inner loops must be perfectly 
nested" } */
+    {
+      f1 (0, i);
+      for (j = 0; j < a2; j++)
+       {
+         f1 (1, j);
+         for (k = 0; k < a3; k++)
+           {
+             f1 (2, k);
+             f2 (2, k);


Would be good to stick here #pragma omp ordered doacross(source) and sink,
just to make it valid except for the intervening code.


Skipped this for now.

diff --git a/gcc/testsuite/c-c++-common/gomp/imperfect4.c 
b/gcc/testsuite/c-c++-common/gomp/imperfect4.c
new file mode 100644
index 00000000000..e5feff730a9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/imperfect4.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+
+/* This test case is expected to fail due to errors.  */
+
+static int f1count[3], f2count[3];


Don't declare variables you don't use at all in the test (in multiple
tests).


Fixed.

--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/imperfect5.c
@@ -0,0 +1,59 @@
+/* { dg-do compile } */


I think you'd better just copy some existing scan testcase and
just add intervening code to it.


I actually tried to do that originally, but I could not find "some existing scan 
testcase" with more than one loop that I could copy from.  Tobias helped me come up 
with the testcase I've presently got.

As I said at the top of this mail, the OG13 version of the patches includes the cleanups 
I indicated are "done", and I've applied those fixes in parallel to the 
mainline version I'm still working on.  If it would be helpful I could post that as WIP 
in its current state, or just continue adding the extra testcases, fixing the ICE, etc 
before reposting.

-Sandra

Re: [PATCH 1/3] OpenMP: C support for imperfectly-nested loops

Reply via email to