Meinersbur wrote:
> Maybe there is no need to recursively represent all the nested transformation?
I don't see a need to represent the entire loop structure. The only info that I
think is potentially needed is the nest depth for transformations that generate
a loop nest (e.g. tile), and for transformations that generate loop sequences
it is the number of loops in that sequence (e.g. `split` for
index-set-splitting, and by construction also every construct with a
`looprange` clause).
The only use case for 2130117f92e51df73ac8c4b7e37f7f89178a89f2 was to identify
whether -- while being in a non-fully instantiated template context -- there
are additional generated loops or not. Within a dependend context
`getTransformedStmt()` may return null because e.g. arguments of the
`looprange()` clause might be template parameters.
```c++
Stmt *TransformedStmt = Dir->getTransformedStmt();
if (!TransformedStmt) {
unsigned NumGeneratedLoops = Dir->getNumGeneratedLoops();
if (NumGeneratedLoops == 0) {
// May happen if the loop transformation does not result in a
// generated loop (such as full unrolling).
break;
}
if (NumGeneratedLoops > 0) {
// The loop transformation construct has generated loops, but these
// may not have been generated yet due to being in a dependent
// context.
return true;
}
}
```
Outside of a template context `getTransformedStmt()` returns the fully expanded
loops which contain the nesting information that can be analyzed by recursing
into it, so no separate generated loop info should be necessary. Eventually the
template will be used in a instantiated context (or it's completly unused)
where it can be analyzed. Until then, all analysis can be defererred (which `
return true;` does).
Since it is typically easily obtainable (e.g. by the number of items in a
`sizes` clause for tile), the number of generated loops can be used for
additional semantic checking even in a templated (/dependent) context. This is
what the `break` above does: Since we know there are no further loops even in
an instantiated context, we do not need to defer semantic checks until later,
and analyze the loop nesting we already have.
I imagine those numbers could be used for better diagnostic output in general,
e.g. when using `collapse(2)` on a `#pragma omp unroll partial`, hinting the
user that the transformation does not produce a sufficient number of loops,
instead of pointing out that the unrolled loop body is not a valid canonical
loop[^1]. However, I think tracking the entire loop nest for better diagnostic
messages is overblown, and may not even be accurate. For instance:
```c
for (int i = 1; i < 20; i+=1)
#pragma omp tile sizes(i)
for (int j = 1; j < n; j+=1)
```
Just recording the loop nesting of this may result in `gls = [ 3 ]`. Ovever,
the tile size depends on the outer loop which the OpenMP restrictions on
canonical loop nests do not allow. One may track the dependencies through all
the clauses and loop bounds, but I think it is sufficient to let
`checkOpenMPLoop` diagnose all those restrictions once `getTransformedStmt()`
is guarenteed to be available.
[^1]: In the OpenMP spec the unrolled body is never a valid canonical loop,
even with `partial(1)`, nor a loop sequence, a concept that was added after
unroll. I am not sure what the use case would be for considering the unrolled
loop body to be a canonical loop sequence.
https://github.com/llvm/llvm-project/pull/139293
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits