Richard Biener <richard.guent...@gmail.com> writes:
> On Fri, May 25, 2018 at 12:12 PM Richard Sandiford <
> richard.sandif...@linaro.org> wrote:
>
>> Richard Biener <richard.guent...@gmail.com> writes:
>> > On Wed, May 16, 2018 at 12:17 PM Richard Sandiford <
>> > richard.sandif...@linaro.org> wrote:
>> >> This patch uses IFN_COND_* to vectorise conditionally-executed,
>> >> potentially-trapping arithmetic, such as most floating-point
>> >> ops with -ftrapping-math.  E.g.:
>> >
>> >>      if (cond) { ... x = a + b; ... }
>> >
>> >> becomes:
>> >
>> >>      ...
>> >>      x = IFN_COND_ADD (cond, a, b);
>> >>      ...
>> >
>> >> When this transformation is done on its own, the value of x for
>> >> !cond isn't important.
>> >
>> >> However, the patch also looks for the equivalent of:
>> >
>> >>      y = cond ? x : a;
>> >
>> > As generated by predicate_all_scalar_phis, right?  So this tries to
> capture
>> >
>> >   if (cond)
>> >     y = a / b;
>> >   else
>> >     y = a;
>> >
>> > But I think it would be more useful to represent the else value
> explicitely
>> > as extra operand given a constant like 0 is sth that will happen in
> practice
>> > and is also sth that is I think supported as a result by some targets.
>> >
>> > So to support the flow of current if-conversion you'd convert to
>> >
>> >   IFN_COND_ADD (cond, a, b, a); // or any other reasonable "default"
> value
>> > (target controlled?)
>> >
>> > and you merge that with a single-use COND_EXPR by replacing that
> operand.
>> >
>> > If a target then cannot support the default value it needs to emit a
> blend
>> > afterwards.
>
>> Yeah, that's definitely better.  I was a bit worried about the extra
>> generality, but it turns out that (on SVE) having the else argument and
>> getting the .md patterns to blend the input can avoid an unnecessary
>> move that would be difficult to remove otherwise.
>
> Btw, I've checked AVX512 and their masked operations either blend
> into the destination or zero the masked elements.  Possibly useful
> for plus reductions.

OK.

>> I've committed the patches to add the else argument and the target hook
>> to select the preferred else value (thanks for the reviews).  The patch
>> below now uses that preferred else value if there's no ?: or if the
>> else value of the ?: isn't available at the point of the conditional
>> operation.
>
>> This version also copes with:
>
>>      if (cond) { ... x = a + b; ... }
>>      y = !cond ? c : x;
>
>> by adding a new utility function inverse_conditions_p.
>
> interpret_as_condition_p is a bit of a layering violation in fold-const.c
> since it looks at SSA_NAME_DEF_STMT.  In gimple.c we have a related
> thing, mainly used by forwprop - canonicalize_cond_expr_cond which is
> used to interpret a GENERIC folded thing as condition.
>
> Do you really need the def stmt lookup?

Yeah, since these internal functions always take a gimple value
as input, not a comparison code.  But I can put the code elsewhere.

> Note that one item on my TODO list is to remove [VEC_]COND_EXPR from
> gimple in favor of using a quaternary assign stmt with tcc_comparison code
> so we'd have
>
>   op0 = op1 < op2 ? op3 : op4;
>
> that means currently valid op0 = op1 < op2; would become
> a more explicit op0 = op1 < op2 ? true : false;
>
> That means you'd for example may need to handle _1 != 0 being
> feeded by _1 = _2 < _3 ? true : false;  -- or simply rely on those
> to be combined.
>
> Sooo - maybe you feel like implementing the above? ;)

So is just requiring a gimple value in COND_EXPR and VEC_COND_EXPR
a non-starter?  That seems neater IMO.  (And probably replacing
them with a single code.)

I think part of the problem with the 4-way stmt is that it becomes
harder to know what canonical form to use.  E.g. should it be
canonicalised so that op4 is a constant where possible?
Or should it be canonicalised based on the condition?  I've seen
this lead to cases in which we end up inserting opposite comparisons
with the current embedded VEC_COND_EXPR stuff, whereas splitting
them out would make it more obvious that the comparison and the
select are logically two separate things.

It would also be a bit inconsistent with IFN_COND_*, since they'd
still take the result of a separate comparison, and also the older
IFN_MASK* functions.

I guess gcond would still be a plain comparison?

>> >> in which the "then" value is the result of the conditionally-executed
>> >> operation and the "else" value is the first operand of that operation.
>> >> This "else" value is the one guaranteed by IFN_COND_* and so we can
>> >> replace y with x.
>> >
>> >> The patch also adds new conditional functions for multiplication
>> >> and division, which previously weren't needed.  This enables an
>> >> extra fully-masked reduction (of dubious value) in
> gcc.dg/vect/pr53773.c.
>> >
>> >> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
>> >> and x86_64-linux-gnu.  OK to install?
>> >
>> > It does look like a clever way of expressing conditional execution.  It
>> > seems
>> > "scaling" this to more operations is a bit awkward and certainly it
> would be
>> > nice to see for example the division case also prototyped for AVX512 to
> see
>> > if it works for anything besides ARM.  Kirill CCed.
>
>> Yeah, I think the new version should be general enough for AVX512.
>> I've added a couple of gcc.dg/vect tests that are keyed off
>> vect_double_cond_arith.
>
>> Tested as before.  OK to install?
>
> In the if-conversion part how can it be that you need
> value_available_p?  Why do you delay propagating out
> redundant SSA names?
>
> It feels somewhat ugly possibly because if-conversion is implemented
> like it is right now?

I guess (although it didn't seem so ugly to me :-)).  The statements
are converted in-place and then the blocks are stitched together,
and since there's no relationship between the "then" and "else"
values of the COND_EXPR, there's no guarantee that the "else" is
computed before the "then" value, or that the "then" value can
be delayed to the point of the COND_EXPR.  Some statement reordering
would be necessary to handle the general case (and that might in itself
not be valid due to aliasing etc.).

The delay in propagating the SSA names is to avoid removing statements
or rewriting uses in the middle of the if-conversion transformation
itself.  E.g. SSA names might still be referenced in data_references
or predicates.

Doing it in ifcvt_local_dce seemed logically cleaner anyway, since it
really is a form of DCE.

Thanks,
Richard

Reply via email to