On Wed, Jan 25, 2023 at 03:47:18PM +0100, Tobias Burnus wrote:
> updated patch included, i.e. avoiding 'count' for 'j' when a 'j.0' would
> do (i.e. only local var without the different step calculation). I also
> now reject if there is a non-unit step on the loop using an outer var.
>
> Eventually still to be done: replace the 'sorry' by working code, i.e.
> implement the suggestions to handle some/all non-unit iteration steps as
> proposed in this thread.
>
> On 20.01.23 18:39, Jakub Jelinek wrote:
> > I think instead of non-unity etc. it is better to talk about constant
> > step 1 or -1.
>
> I concur.
>
>
> > The actual problem with non-simple loops for non-rectangular loops is
> > both in case it is an inner loop which uses some outer loop's iterator,
> > or if it is outer loop whose iterator is used, both of those cases
> > will not be handled properly.
>
> I have now added a check for the other case as well.
>
> Just to confirm, the following is fine, isn't it?
>
> !$omp simd collapse(4)
> do i = 1, 10, 2
> do outer_var = 1, 10 ! step = + 1
> do j = 1, 10, 2
> do inner_var = 1, outer_var ! step = 1
>
> i.e. both the inner_var and outer_var have 'step = 1',
> even if other loops in the 'collapse' have step != 1.
> I think it should be fine.
Yes, the loops which don't define outer_var for other loops nor
use outer_var from other loops can be in any form, we can compute
their number of iterations before the whole loop nest for them,
so for the non-rectangular iterations computations we can ignore
those except for multiplication by the pre-computed count.
> OpenMP/Fortran: Partially fix non-rect loop nests [PR107424]
>
> This patch ensures that loop bounds depending on outer loop vars use the
> proper TREE_VEC format. It additionally gives a sorry if such an outer
> var has a non-one/non-minus-one increment as currently a count variable
> is used in this case (see PR).
>
> Finally, it avoids 'count' and just uses a local loop variable if the
> step increment is +/-1.
>
> PR fortran/107424
>
> gcc/fortran/ChangeLog:
>
> * trans-openmp.cc (struct dovar_init_d): Add 'sym' and
> 'non_unit_incr' members.
> (gfc_nonrect_loop_expr): New.
> (gfc_trans_omp_do): Call it; use normal loop bounds
> for unit stride - and only create local loop var.
>
> libgomp/ChangeLog:
>
> * testsuite/libgomp.fortran/non-rectangular-loop-1.f90: New test.
> * testsuite/libgomp.fortran/non-rectangular-loop-1a.f90: New test.
> * testsuite/libgomp.fortran/non-rectangular-loop-2.f90: New test.
> * testsuite/libgomp.fortran/non-rectangular-loop-3.f90: New test.
> * testsuite/libgomp.fortran/non-rectangular-loop-4.f90: New test.
> * testsuite/libgomp.fortran/non-rectangular-loop-5.f90: New test.
>
> gcc/testsuite/ChangeLog:
>
> * gfortran.dg/goacc/privatization-1-compute-loop.f90: Update dg-note.
> * gfortran.dg/goacc/privatization-1-routine_gang-loop.f90: Likewise.
>
> +static bool
> +gfc_nonrect_loop_expr (stmtblock_t *pblock, gfc_se *sep, int loop_n,
> + gfc_code *code, gfc_expr *expr, vec<dovar_init> *inits,
> + int simple, gfc_expr *curr_loop_var)
> +{
> + int i;
> + for (i = 0; i < loop_n; i++)
> + {
> + gcc_assert (code->ext.iterator->var->expr_type == EXPR_VARIABLE);
> + if (gfc_find_sym_in_expr (code->ext.iterator->var->symtree->n.sym,
> expr))
> + break;
> + code = code->block->next;
> + }
> + if (i >= loop_n)
> + return false;
> +
> + /* Canonic format: TREE_VEC with [var, multiplier, offset]. */
I think we use everywhere Canonical rather than Canonic
> + gfc_symbol *var = code->ext.iterator->var->symtree->n.sym;
> +
> + tree tree_var = NULL_TREE;
> + tree a1 = integer_one_node;
> + tree a2 = integer_zero_node;
> +
> + if (!simple)
> + {
> + /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424. */
> + sorry_at (gfc_get_location (&curr_loop_var->where),
> + "non-rectangular loop nest with step other than constant 1 "
> + "or -1 for %qs", curr_loop_var->symtree->n.sym->name);
> + return false;
> + }
> +
> + dovar_init *di;
> + unsigned ix;
> + FOR_EACH_VEC_ELT (*inits, ix, di)
> + if (di->sym == var && !di->non_unit_iter)
> + {
> + tree_var = di->init;
> + gcc_assert (DECL_P (tree_var));
> + break;
> + }
> + else if (di->sym == var)
> + {
> + /* FIXME: Handle non-unit iter steps, cf. PR fortran/107424. */
> + sorry_at (gfc_get_location (&code->loc),
> + "non-rectangular loop nest with step other than constant 1 "
> + "or -1 for %qs", var->name);
> + inform (gfc_get_location (&expr->where), "Used here");
> + return false;
> + }
I think it would be better formatted as
if (di->sym == var)
{
if (!di->non_unit_iter)
{
...
}
else
{
...
}
}
> + if (simple && !DECL_P (dovar))
> + {
> + const char *name = code->ext.iterator->var->symtree->n.sym->name;
> + local_dovar = gfc_create_var (type, name);
> + dovar_init e = {code->ext.iterator->var->symtree->n.sym,
> + dovar, local_dovar, false};
> + inits.safe_push (e);
> + }
For the separate local_dovar case, I'd be worried if we handle lastprivate
right. From quick skimming I see some lastprivate clauses in some of
the tests, so if they verify the right value has been computed (say the
same as one would get with -fno-openmp), then fine.
Jakub