Andrew Stubbs wrote:
The -fopenmp-simd option enables the "#pragma omp simd" directive (only) as a
vectorization hint, but it did not work for SIMT vectorization.

This patch enables the feature for backends in which the TARGET_SIMT_VF hook is
present and indicates SIMT is available.  Only the NVPTX backend actually does
this, at present.
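
For context, here is a minimal sketch of the kind of loop the option affects. The function name saxpy and its signature are purely illustrative and are not part of the patch. Without -fopenmp-simd (or -fopenmp) the pragma is simply ignored and the loop is compiled as ordinary scalar code.

```c
#include <stddef.h>

/* Illustrative only: compute y[i] = a * x[i] + y[i] for n elements.
   The simd pragma is a vectorization hint that GCC honors when the
   file is compiled with -fopenmp-simd.  */
void
saxpy (size_t n, float a, const float *x, float *y)
{
#pragma omp simd
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

Compiling with something like "gcc -O2 -fopenmp-simd -c saxpy.c" should be enough to have the directive recognized; no OpenMP runtime is needed for this case.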

...

--- a/gcc/omp-expand.cc
+++ b/gcc/omp-expand.cc
@@ -6588,6 +6588,14 @@ expand_omp_simd (struct omp_region *region, struct omp_for_data *fd)
   bool is_simt = omp_find_clause (gimple_omp_for_clauses (fd->for_stmt),
                                  OMP_CLAUSE__SIMT_);
+
+  /* Allow compatible targets to use SIMT with -fopenmp-simd.  */
+  if (!is_simt
+      && flag_openmp_simd
+      && targetm.simt.vf
+      && targetm.simt.vf () > 1)
+    is_simt = true;

This one feels a bit odd. It seems as if this should be either
  (flag_openmp_simd || flag_openmp)
or
     flag_openmp_simd
  && !flag_openmp

(and possibly even '&& !flag_openacc'?)

Additionally, I wonder whether there should be a
  && optimize.

For the current target(s), with -fopenmp-simd but without -fopenmp and
without -fopenacc, it would only apply when compiling stand-alone code,
and to my knowledge stand-alone nvptx has no GPU OpenMP/OpenACC
parallelization, i.e. it is low risk.

Otherwise, for '-fopenmp-simd' (with '|| flag_openmp' or when the user
specified both), it would also be active. I am not sure whether that
would be useful in terms of performance or not, but it would be more
widely visible. I guess that for the latter case it would be worthwhile
to check how often it applies, whether there is a performance win, and
whether there is a correctness issue.

Actually, the correctness question already arises even if this is only
enabled for -fopenmp-simd. (I have not yet thought more deeply about
correctness issues.)

Tobias
