On Thu, May 16, 2019 at 09:53:06AM +0200, Richard Biener wrote: > > + if (nonconst_simd_if) > > + { > > + if (sctx.lane == NULL_TREE) > > + { > > + sctx.idx = create_tmp_var (unsigned_type_node); > > + sctx.lane = create_tmp_var (unsigned_type_node); > > + } > > Does forcing a SIMD_LANE possibly pessimize things? Just looking > whether a separate IFN might be better here or even doing the > versioning in omp expansion/lowering? But see question below...
If the GOMP_SIMD_LANE isn't used for anything else, then it is a matter of having the .GOMP_SIMD_LANE NOVOPS call inside of the loop. I don't think it pessimizes anything compared to having some other IFN in the loop that does it; I've thought about doing the versioning in omp expansion, but then we treat those two loops as unrelated loops, would optimize them differently and wouldn't be able to share the scalar loop that might be needed for some kind of other versioning in the vectorizer with that other scalar loop. > > case IFN_GOMP_SIMD_LANE: > > + if (gimple_call_num_args (stmt) >= 2 > > + && !integer_nonzerop (gimple_call_arg (stmt, 1))) > > + break; > > + /* FALLTHRU */ > > GOMP_SIMD_LANE has ECF_NOVOPS, how do you prevent code motion of it > if the result is unused? Even with VOPs the vectorized loop may be > a reduction not involving any memory operations. First of all, at least in OpenMP standard, both the simdlen and if clauses determine the preferred number of concurrent simd lanes, which doesn't seem like a language that forbids anything else, just that it would be nice to honor what the user asked for if possible. If the user asks for some nonsensical simdlen, we aren't going to support that anyway. With ECF_NOVOPS, rather than ECF_PURE or ECF_CONST, the function has side effects unknown to the compiler, so not sure how we could move it outside of the loop, and the code doesn't care where exactly in the loop it is. > > + if (loop->simduid) > > + { > > + gimple *g = gsi_stmt (si); > > + /* If .GOMP_SIMD_LANE call for the current loop has 2 arguments, > > + the second argument is the #pragma omp simd if (x) condition, > > + when 0, loop shouldn't be vectorized, when non-zero constant, > > + it should be vectorized normally, otherwise versioned with > > + vectorized loop done if the condition is non-zero at > > + runtime. */ > > + if (is_gimple_call (g) > > + && gimple_call_internal_p (g) > > + && gimple_call_internal_fn (g) == IFN_GOMP_SIMD_LANE > > + && gimple_call_num_args (g) >= 2 > > + && TREE_CODE (gimple_call_arg (g, 0)) == SSA_NAME > > + && (loop->simduid > > + == SSA_NAME_VAR (gimple_call_arg (g, 0)))) > > + { > > + tree arg = gimple_call_arg (g, 1); > > + if (integer_zerop (arg)) > > + return opt_result::failure_at (g, > > + "not vectorized: " > > + "simd if(0)\n"); > > + if (TREE_CODE (arg) == SSA_NAME) > > + LOOP_VINFO_SIMD_IF_COND (loop_vinfo) = arg; > > + } > > + } > > This looks like a quite arbitrary place to do this, it wastes > compile-time in case of a zero arg. Can't you do this in I was looking for some spot where we walk over the statements in the loop already, rather than adding yet another IL walk. Sure, it could be guarded with loop->simduid and so not be done unless it is a loop with .GOMP_SIMD_LANE call. I'd think the zero case will be extremely rare there, we handle literal if (0) much earlier, and I think it will be fairly unusual if we propagate a zero into the condition otherwise. > vect_analyze_loop before the loop over vector sizes? Maybe > re-using what note_simd_array_uses computes? note_simd_array_uses indeed does walk the IL and does look at the calls, but I'd need some data structure where to store the argument; we don't have loop_vinfo yet (we don't have it even before the loop over vector sizes), adding another tree to struct loop seems undesirable from memory usage POV, we'd need it just for the duration between note_simd_array_uses and the actual loop_vinfo creation; so would you prefer some extra hash_map for that? > Also I wonder what happens if arg is not SSA name - I guess > you omitted the > > else > gcc_assert (integer_onep (arg)); > > ? Otherwise just dropping the condition looks wrong. Maybe integer_nonzerop (arg) (even though the gimplifier runs gimple_boolify on the argument). Not really sure if some non-INTEGER_CST invariant could make it through if it has been boolified. > Is a zero argument really a "force no vectorization" or should > it merely reset force_vectorize to zero? It states that the user prefers to not vectorize that loop, it isn't a hard requirement (and it is hard to figure it out), but I think we should honor that request if possible and just not vectorize. > > + if (version_simd_if_cond) > > + { > > + gcc_assert (TREE_CODE (version_simd_if_cond) == SSA_NAME); > > + gcc_assert (dom_info_available_p (CDI_DOMINATORS)); > > + if (basic_block bb > > + = gimple_bb (SSA_NAME_DEF_STMT (version_simd_if_cond))) > > + { > > + if (!dominated_by_p (CDI_DOMINATORS, loop->header, bb) > > + || (scalar_loop > > + && !dominated_by_p (CDI_DOMINATORS, scalar_loop->header, > > + bb))) > > + version_simd_if_cond = boolean_false_node; > > How can this ever happen? A loop has a single entry and I hope > omp lowering places the condition computation on that entry. Yes, the gimplification of the condition is before the loop header. I was just trying to be safe, because the .GOMP_SIMD_LANE call is inside of the loop, not before the loop and some optimization might in theory move the def stmt of the argument SSA_NAME from before the loop to inside the loop (not aware of any that does it). Sure, in that case we wouldn't honor what user preferred. > dominators are available since loops are up-to-date so the DOM > assertion is not necessary. Likewise the SSA_NAME assert > is covered by tree checking on the SSA_NAME_DEF_STMT. Ok, will take the == SSA_NAME assert out. > > > + } > > + tree zero = build_int_cst (TREE_TYPE (version_simd_if_cond), 0); > > build_zero_cst (TREE_TYPE (version_simd_if_cond)) Is that better (build_zero_cst is a wrapper that will call build_int_cst with 0)? A lot of code calls build_int_cst directly. Don't care much though. Jakub