Andrew Stubbs wrote:
The -fopenmp-simd option enables the "#pragma omp simd" directive (only) as a
vectorization hint, but it did not work for SIMT vectorization.
This patch enables the feature for backends in which the TARGET_SIMT_VF hook is
present and indicates SIMT is available. Only the NVPTX backend actually does
this, at present.
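
For illustration, a loop of the kind this option affects might look like the
following. This is only a minimal sketch: the function name saxpy and its
parameters are made up for the example and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: scale-and-add over two arrays.  The pragma is a
   vectorization hint; built with e.g. "gcc -O2 -fopenmp-simd" it is
   honoured, and without that flag it is ignored and the loop still
   computes the same result.  */
void
saxpy (size_t n, float a, const float *x, float *y)
{
  assert (x != NULL && y != NULL);
#pragma omp simd
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

No OpenMP runtime is needed for this, since -fopenmp-simd only enables the
simd directive and no other OpenMP constructs.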
...
--- a/gcc/omp-expand.cc
+++ b/gcc/omp-expand.cc
@@ -6588,6 +6588,14 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
   bool is_simt = omp_find_clause (gimple_omp_for_clauses (fd->for_stmt),
                                   OMP_CLAUSE__SIMT_);
+
+  /* Allow compatible targets to use SIMT with -fopenmp-simd. */
+  if (!is_simt
+      && flag_openmp_simd
+      && targetm.simt.vf
+      && targetm.simt.vf () > 1)
+    is_simt = true;
This one feels a bit odd. It seems as if this should be either
  (flag_openmp_simd || flag_openmp)
or
  flag_openmp_simd
  && !flag_openmp
(and possibly even '&& !flag_openacc'?)
Additionally, I wonder whether there should be a
  && optimize.
For the current target(s), with -fopenmp-simd but without -fopenmp and
without -fopenacc, it would only apply when running stand-alone code, and
to my knowledge nvptx has no OpenMP/OpenACC GPU parallelization in that
case, i.e. it is low risk.
Otherwise, for '-fopenmp-simd' (with '|| flag_openmp', or when the user
specified both flags), it would also be active. I am not sure whether that
would be useful in terms of performance, but it would be more widely
visible. I guess for the latter case it would be worthwhile to check how
often it applies, whether there is a performance win, and whether there
is a correctness issue.
Actually, the latter is already true if only enabled for -fopenmp-simd.
(I have not thought deeply about correctness issues yet.)
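
To make the variants discussed above easier to compare, here is a rough
stand-alone model of the three conditions. It is only a sketch: struct
simt_flags and the function names are made up, the fields merely stand in
for flag_openmp_simd, flag_openmp and the value returned by
targetm.simt.vf (), the two flags are assumed to be independent, and the
'!is_simt' part of the posted condition is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant state; these are NOT GCC's real
   globals.  simt_vf stands in for the value of targetm.simt.vf (),
   with 0 meaning that the hook is absent.  */
struct simt_flags
{
  bool openmp_simd;  /* -fopenmp-simd was given.  */
  bool openmp;       /* -fopenmp was given.  */
  int simt_vf;
};

static bool
target_has_simt (struct simt_flags f)
{
  assert (f.simt_vf >= 0);
  return f.simt_vf > 1;
}

/* Condition as posted: flag_openmp_simd alone.  */
bool
as_posted (struct simt_flags f)
{
  return f.openmp_simd && target_has_simt (f);
}

/* Variant: (flag_openmp_simd || flag_openmp).  */
bool
either_flag (struct simt_flags f)
{
  return (f.openmp_simd || f.openmp) && target_has_simt (f);
}

/* Variant: flag_openmp_simd && !flag_openmp.  */
bool
simd_without_openmp (struct simt_flags f)
{
  return f.openmp_simd && !f.openmp && target_has_simt (f);
}
```

In this model the posted condition and the '&& !flag_openmp' variant differ
only when both flags are given, while the '|| flag_openmp' variant is the
only one that also fires for plain -fopenmp.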
Tobias