Re: [PATCH] MATCH: add abs support for half float

Kugan Vivekanandarajah Fri, 02 Aug 2024 02:21:34 -0700


> On 1 Aug 2024, at 10:46 pm, Richard Biener <richard.guent...@gmail.com> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> On Thu, Aug 1, 2024 at 5:31 AM Kugan Vivekanandarajah
> <kvivekana...@nvidia.com> wrote:
>> 
>> 
>> On Mon, Jul 29, 2024 at 10:11 AM Andrew Pinski <pins...@gmail.com> wrote:
>>> 
>>> On Mon, Jul 29, 2024 at 12:57 AM Kugan Vivekanandarajah
>>> <kugan...@gmail.com> wrote:
>>>> 
>>>> On Thu, Jul 25, 2024 at 10:19 PM Richard Biener
>>>> <richard.guent...@gmail.com> wrote:
>>>>> 
>>>>> On Thu, Jul 25, 2024 at 4:42 AM Kugan Vivekanandarajah
>>>>> <kugan...@gmail.com> wrote:
>>>>>> 
>>>>>> On Tue, Jul 23, 2024 at 11:56 PM Richard Biener
>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>> 
>>>>>>> On Tue, Jul 23, 2024 at 10:27 AM Kugan Vivekanandarajah
>>>>>>> <kugan...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> On Tue, Jul 23, 2024 at 10:35 AM Andrew Pinski <pins...@gmail.com> 
>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> On Mon, Jul 22, 2024 at 5:26 PM Kugan Vivekanandarajah
>>>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Revised based on the comment and moved it into existing patterns as.
>>>>>>>>>> 
>>>>>>>>>> gcc/ChangeLog:
>>>>>>>>>> 
>>>>>>>>>> * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
>>>>>>>>>> Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
>>>>>>>>>> 
>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>> 
>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
>>>>>>>>> 
>>>>>>>>> The testcase needs to make sure it runs only for targets that support
>>>>>>>>> float16 so like:
>>>>>>>>> 
>>>>>>>>> /* { dg-require-effective-target float16 } */
>>>>>>>>> /* { dg-add-options float16 } */
>>>>>>>> Added in the attached version.
>>>>>>> 
>>>>>>> + /* (type)A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>  (for cmp (ge gt)
>>>>>>>   (simplify
>>>>>>> -   (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>> -        && bitwise_equal_p (@0, @1))
>>>>>>> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
>>>>>>> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
>>>>>>> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
>>>>>>> +        && ((VECTOR_TYPE_P (type)
>>>>>>> +             && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE (@1)))
>>>>>>> +           || (!VECTOR_TYPE_P (type)
>>>>>>> +               && (TYPE_PRECISION (TREE_TYPE (@1))
>>>>>>> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
>>>>>>> +        && bitwise_equal_p (@1, @2))
>>>>>>> 
>>>>>>> I wonder about the bitwise_equal_p which tests @1 against @2 now
>>>>>>> with the convert still applied to @1 - that looks odd.  You are allowing
>>>>>>> sign-changing conversions but doesn't that change ge/gt behavior?
>>>>>>> Also why are sign/zero-extensions not OK for vector types?
>>>>>> Thanks for the review.
>>>>>> My main motivation here is for _Float16  as below.
>>>>>> 
>>>>>> _Float16 absfloat16 (_Float16 x)
>>>>>> {
>>>>>>  float _1;
>>>>>>  _Float16 _2;
>>>>>>  _Float16 _4;
>>>>>>  <bb 2> [local count: 1073741824]:
>>>>>>  _1 = (float) x_3(D);
>>>>>>  if (_1 < 0.0)
>>>>>>    goto <bb 3>; [41.00%]
>>>>>>  else
>>>>>>    goto <bb 4>; [59.00%]
>>>>>>  <bb 3> [local count: 440234144]:\
>>>>>>  _4 = -x_3(D);
>>>>>>  <bb 4> [local count: 1073741824]:
>>>>>>  # _2 = PHI <_4(3), x_3(D)(2)>
>>>>>>  return _2;
>>>>>> }
>>>>>> 
>>>>>> This is why I added  bitwise_equal_p test of @1 against @2 with
>>>>>> TYPE_PRECISION checks.
>>>>>> I agree that I will have to check for sign-changing conversions.
>>>>>> 
>>>>>> Just to keep it simple, I disallowed vector types. I am not sure if
>>>>>> this would  hit vec types. I am happy to handle this if that is
>>>>>> needed.
>>>>> 
>>>>> I think with __builtin_convertvector you should be able to construct
>>>>> a testcase that does
>>>> Thanks.
>>>> 
>>>> For the pattern,
>>>> ```
>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
>>>> (for cmp (ge gt)
>>>>  (simplify
>>>>   (cnd (cmp @0 zerop) @1 (negate @1))
>>>>    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>         && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>         && bitwise_equal_p (@0, @1))
>>>>     (if (TYPE_UNSIGNED (type))
>>>>      (absu:type @0)
>>>>      (abs @0)))))
>>>> ```
>>>> the vector type doesn't seem right. For example, if we have a 4
>>>> element vector with some negative and positive, I don't think  it
>>>> makes sense. Also, we dont seem to generate  (cmp @0 zerop). Am I
>>>> missing it completely?
>>> 
>>> Looks like I missed adding some vector testcases anyways here is one
>>> to get this, note it is C++ due to the C front-end not support `?:`
>>> for vectors yet (there is a patch).
>>> ```
>>> #define vect8 __attribute__((vector_size(8)))
>>> vect8 int f(vect8 int a)
>>> {
>>>  vect8 int na = -a;
>>>  return (a > 0) ? a : na;
>>> }
>>> ```
>>> At -O2 before forwprop1, we have:
>>> ```
>>>  vector(2) intD.9 a_2(D) = aD.2796;
>>>  vector(2) intD.9 naD.2799;
>>>  vector(2) <signed-boolean:32> _1;
>>>  vector(2) intD.9 _4;
>>> 
>>>  na_3 = -a_2(D);
>>>  _1 = a_2(D) > { 0, 0 };
>>>  _4 = VEC_COND_EXPR <_1, a_2(D), na_3>;
>>> ```
>>> And forwprop using match is able to do:
>>> ```
>>> Applying pattern match.pd:6306, gimple-match-10.cc:19843
>>> gimple_simplified to _4 = ABS_EXPR <a_2(D)>;
>>> Removing dead stmt:_1 = a_2(D) > { 0, 0 };
>>> Removing dead stmt:na_3 = -a_2(D);
>>> ```
>>> (replace int with float and add  -fno-signed-zeros you can get the ABS 
>>> also).
>>> 
>>> Note comparisons with vector types always generate a vector boolean
>>> type. So cond_expr will never show up with a vector comparison; only
>>> vec_cond.
>> 
>> 
>> Thanks for the information. I have revised the patch by adding test case for 
>> the vector type. I will look at removing the VEC_COND_EXPR as a follow up.
> 
> This is likely a major task so do not get you distracted - I just mentioned 
> it.
> 
>> This is still rejected by the type checker.
>> v2hi  absvect2 (v2hi x, int i) {
>>    v2hi neg = -x;
>>    v2si tmp = __builtin_convertvector (x, v2si);
>>    return (tmp > 0) ? x : neg;
>> }
>> as:
>> error: incompatible vector types in conditional expression: '__vector(2) 
>> int', 'v2hi' {aka '__vector(2) short int'} and 'v2hi' {aka '__vector(2) 
>> short int'}
>> 8 |     return (tmp > 0) ? x : neg;
>>    |            ~~~~~~~~~~^~~~~~~~~
>> 
>> Also
>> 
>> -      (absu:type @0)
>> -      (abs @0)))))
>> +      (absu:type @1)
>> +      (abs @1)))))
>> 
>> Changing the @1 to @2  is resulting in failures.  Existing tests like  the 
>> following would be optimized away in this case.
>> ```
>> unsigned abs_with_convert0 (int x)
>> {
>>    unsigned int y = x;
>> 
>>    if (x < 0)
>>        y = -y;
>> 
>>  return y;
>> }
> 
> How so - the TYPE_UNSIGNED tests should make the pattern not trigger here?
> 
>> ```
>> This follows the same intent as the original pattern.
>> 
>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK for 
>> trunk.
> 
> 
> 
> -   (cnd (cmp @0 zerop) @1 (negate @1))
> -    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
> -        && !TYPE_UNSIGNED (TREE_TYPE(@0))
> -        && bitwise_equal_p (@0, @1))
> +   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
> +    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
> +        && !TYPE_UNSIGNED (TREE_TYPE (@1))
> +        && !TYPE_UNSIGNED (TREE_TYPE (@0))
> 
> I think the outer type could be allowed signed if ...
> 
> +        && ((VECTOR_TYPE_P (TREE_TYPE (@0))
> +             && VECTOR_TYPE_P (TREE_TYPE (@1))
> +             && known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0)),
> +                          TYPE_VECTOR_SUBPARTS (TREE_TYPE (@1)))
> 
> this is always true
> 
> +             && element_precision (TREE_TYPE (@1))
> +                 <= element_precision (TREE_TYPE (@0)))
> +           || (!VECTOR_TYPE_P (TREE_TYPE (@0))
> +               && (TYPE_PRECISION (TREE_TYPE (@1))
> +                   <= TYPE_PRECISION (TREE_TYPE (@0)))))
> 
> ... you make the precision strictly larger.  With both unsigned the same
> precision case should have been stripped anyway.
> 
> You can use element_precision for both vector and non-vector so I think this
> should simplify to just checking element_precision.
> 
> +      (absu:type @1)
> +      (abs @1)))))
> 
> I still think this needs to be @2 for type correctness.  @1 is only
> bitwise equal,
> it could have different sign.  I think we should drop bitwise_equal_p with the
> convert now in and instead have a matching constraint.


Thanks, so this should translate to: 

/* (type)A >=/> 0 ? A : -A    same as abs (A) */
 (for cmp (ge gt)
  (simplify
   (cnd (cmp (convert?@0 @1) zerop) @2 (negate @2))
    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE (@1))
         && !TYPE_UNSIGNED (TREE_TYPE (@1))
         && !TYPE_UNSIGNED (TREE_TYPE (@2))
         && ((element_precision (TREE_TYPE (@1))
                < element_precision (TREE_TYPE (@0))
              || operand_equal_p (@1, @0)))
         && bitwise_equal_p (@1, @2))
     (if (TYPE_UNSIGNED (type))
      (absu:type @2)
      (abs @2)))))

However with this, I am seeing: Note that the @2 for these are unsigned.

Tests that now fail, but worked before (4 tests):

gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-not phiopt2 "if "
gcc: gcc.dg/tree-ssa/phi-opt-36.c scan-tree-dump-times phiopt1 "if " 2
gcc: gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-not phiopt1 "if "gcc: 
gcc.dg/tree-ssa/phi-opt-37.c scan-tree-dump-times phiopt1 "ABSU_EXPR <" 2


unsigned f2(int A)
{
  unsigned t = A;
//     A >= 0? A : -A    same as abs (A)
  if (A >= 0)  return t;
  return -t;
}

unsigned abs_with_convert0 (int x)
{
    unsigned int y = x;

    if (x < 0)
        y = -y;

  return y;
}
unsigned abs_with_convert1 (unsigned x)
{
    int y = x;

    if (y < 0)
        x = -x;

  return x;
}

In the original pattern, we have check for  !TYPE_UNSIGNED (TREE_TYPE(@0)) and  
(absu:type @0)

Shouldnt that translate !TYPE_UNSIGNED (TREE_TYPE(@1)) and  (absu:type @1) in 
the new pattern

Thanks,
Kugan

> 
> Same comments apply to the 2nd pattern update.
> 
> Richard.
> 
>> Thanks,
>> Kugan
>> 
>> gcc/ChangeLog:
>> 
>>        * match.pd: Extend A CMP 0 ? A : -A into (type)A CMP 0 ? A : -A.
>>        Extend A CMP 0 ? A : -A into (type) A CMP 0 ? A : -A.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>        * g++.dg/absvect.C: New test.
>>        * gcc.dg/tree-ssa/absfloat16.c: New test.
>> 
>> 
>> signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>> 
>> 
>> Note I think we should be able to merge VEC_COND_EXPR and COND_EXPR
>> codes by instead looking whether the condition is vector or scalar.  Fallout
>> might be a bit too big, but it might be a thing to try.
>> 
>> Richard.
>> 
>>> Thanks,
>>> Andrew Pinski
>>> 
>>>> 
>>>> Thanks,
>>>> Kugan
>>>> 
>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> +      (absu:type @1)
>>>>>>> +      (abs @1)))))
>>>>>>> 
>>>>>>> I think this should use @2 now.
>>>>>> I will change this.
>>>>>> 
>>>>>> Thanks,
>>>>>> Kugan
>>>>>> 
>>>>>>> 
>>>>>>>> Thanks.
>>>>>>>> Kugan
>>>>>>>>> 
>>>>>>>>> (like what is in gcc.dg/c11-floatn-3.c and others).
>>>>>>>>> 
>>>>>>>>> Other than that it looks good but I can't approve it.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Andrew Pinski
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>>>>>>>>>> 
>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK 
>>>>>>>>>> for trunk?
>>>>>>>>>> Thanks,
>>>>>>>>>> Kugan
>>>>>>>>>> 
>>>>>>>>>> ________________________________
>>>>>>>>>> From: Andrew Pinski <pins...@gmail.com>
>>>>>>>>>> Sent: Monday, 15 July 2024 5:30 AM
>>>>>>>>>> To: Kugan Vivekanandarajah <kvivekana...@nvidia.com>
>>>>>>>>>> Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; 
>>>>>>>>>> richard.guent...@gmail.com <richard.guent...@gmail.com>
>>>>>>>>>> Subject: Re: [PATCH] MATCH: add abs support for half float
>>>>>>>>>> 
>>>>>>>>>> External email: Use caution opening links or attachments
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Sun, Jul 14, 2024 at 1:12 AM Kugan Vivekanandarajah
>>>>>>>>>> <kvivekana...@nvidia.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> This patch extends abs detection in matched for half float.
>>>>>>>>>>> 
>>>>>>>>>>> Bootstrapped and regression test on aarch64-linux-gnu. Is this OK 
>>>>>>>>>>> for trunk?
>>>>>>>>>> 
>>>>>>>>>> This is basically this pattern:
>>>>>>>>>> ```
>>>>>>>>>> /* A >=/> 0 ? A : -A    same as abs (A) */
>>>>>>>>>> (for cmp (ge gt)
>>>>>>>>>>  (simplify
>>>>>>>>>>   (cnd (cmp @0 zerop) @1 (negate @1))
>>>>>>>>>>    (if (!HONOR_SIGNED_ZEROS (TREE_TYPE(@0))
>>>>>>>>>>         && !TYPE_UNSIGNED (TREE_TYPE(@0))
>>>>>>>>>>         && bitwise_equal_p (@0, @1))
>>>>>>>>>>     (if (TYPE_UNSIGNED (type))
>>>>>>>>>>      (absu:type @0)
>>>>>>>>>>      (abs @0)))))
>>>>>>>>>> ```
>>>>>>>>>> 
>>>>>>>>>> except extended to handle an optional convert. Why didn't you just
>>>>>>>>>> extend the above pattern to handle the convert instead? Also I think
>>>>>>>>>> you have an issue with unsigned types with the comparison.
>>>>>>>>>> Also you should extend the -abs(A) pattern right below it in a 
>>>>>>>>>> similar fashion.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Andrew Pinski
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> gcc/ChangeLog:
>>>>>>>>>>> 
>>>>>>>>>>> * match.pd: Add pattern to convert (type)A >=/> 0 ? A : -A into abs 
>>>>>>>>>>> (A) for half float.
>>>>>>>>>>> 
>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>> 
>>>>>>>>>>> * gcc.dg/tree-ssa/absfloat16.c: New test.
>>>>>>>>>>> 
>>>>>>>>>>> Signed-off-by: Kugan Vivekanandarajah <kvivekana...@nvidia.com>

Re: [PATCH] MATCH: add abs support for half float

Reply via email to