https://github.com/Meinersbur commented:
Please remove "[LoopTiling]" tag from the title, it is not a component of Clang. I would expect more testing in transform/tile, for instance: 1. Multi-dimensional tiling 2. collapse that only affects some, but not all of the intratile-loops of a multi-dimensional tiling 3. loops on user-defined iterators (like iterfor.cpp, but also collapsing some of the intratile-loops) 4. A check whether vectorization takes place on a non-affected innermost intra-tile loop is taking place 5. Composition of multiple loop transformation (tile and others) 6. ... There is no explanation of the overall design of this change which makes it very hard to review. The PR summary as well as a big comments at an central location need to explain it, where other comments that mention "omp-tile body-guard marker" can point to. Someone who encounters it while reading non-OpenMP-related code will be puzzled by what it is about. For instance, it should sketch how the AST structure looks like and what the semantics of OMPInvariantPredicateBoundAttr are. I tried to understand it anyway and here is what I got: 1. If all loops are known-multiples of the tile size, no predicate is emitted (Review comment: could we do it per-loop?) 2. The intratile ForStmt condition is emitted without partial tile check. That is, without additional code, it will overshoot the iteration space 3. The body is surrounded with an IfStmt with an AttributedStmt of an OMPInvariantPredicateBoundAttr checking for the iteration space. That is, without additional code in CGStmt that catches the AttributedStmt, a predicate is emitted. I would have done it the other way around: Make the default AST emit the vectorizable code (so no additional handling in CGStmt is needed; AttributedStmt can just be ignored, no change in output), and make `checkOpenMPLoop` recognize the AttributedStmt: If present, it knows it can convert the non-canonical ForStmt condition into a canonical one by remembering it still has to emit a predicate around the body in TransformedStmt. The main benefit is that this is more localized to the handling of OpenMP loops than dragging its complexity into the general-purpose CodeGen. Downside might be that the predicate needs to be remembered through the composition of multiple loop transformations (altough I think this approach might have the same problem; tests for it are missing). https://github.com/llvm/llvm-project/pull/191114 _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
