Re: [PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/06/01 23:20, Max Filippov wrote:
> On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa
>  wrote:
>> More optimized than the default RTL generation.
>>
>> gcc/ChangeLog:
>>
>> * config/xtensa/xtensa.md (adddi3, subdi3):
>> New RTL generation patterns implemented according to the instruc-
>> tion idioms described in the Xtensa ISA reference manual (p. 600).
>> ---
>>  gcc/config/xtensa/xtensa.md | 52 +
>>  1 file changed, 52 insertions(+)
>>
>> diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
>> index eda1353894b..21afa747e89 100644
>> --- a/gcc/config/xtensa/xtensa.md
>> +++ b/gcc/config/xtensa/xtensa.md
>> @@ -190,6 +190,35 @@
>> (set_attr "mode"    "SI")
>> (set_attr "length"  "3")])
>>
>> +(define_expand "adddi3"
>> +  [(set (match_operand:DI 0 "register_operand")
>> +   (plus:DI (match_operand:DI 1 "register_operand")
>> +(match_operand:DI 2 "register_operand")))]
>> +  ""
>> +{
>> +  rtx lo_dest, hi_dest, lo_op0, hi_op0, lo_op1, hi_op1;
>> +  rtx_code_label *label;
>> +  if (rtx_equal_p (operands[0], operands[1])
>> +  || rtx_equal_p (operands[0], operands[2])
> 
>> +  || ! REG_P (operands[1]) || ! REG_P (operands[2]))
> 
> I wonder if these additional conditions are necessary, given that
> the operands have the "register_operand" predicates?
> 

See register_operand() in gcc/recog.cc.

In fact, I've encountered several operands that satisfy the
register_operand predicate but result in REG_P() being false.
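
For illustration only, here is a toy C model of why the two checks can
disagree. It is not GCC source: the toy_* names are hypothetical, and the
real predicate in gcc/recog.cc also checks modes and other details that
this sketch leaves out. The point it tries to show is that
register_operand() also accepts a SUBREG wrapped around a register, while
REG_P() is true only for a bare REG.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for RTL codes; hypothetical names, not GCC's.  */
enum toy_code { TOY_REG, TOY_SUBREG, TOY_MEM, TOY_CONST_INT };

struct toy_rtx
{
  enum toy_code code;
  const struct toy_rtx *inner;  /* Wrapped expression for TOY_SUBREG.  */
};

/* Models REG_P: true only for a bare register.  */
int
toy_reg_p (const struct toy_rtx *x)
{
  assert (x != NULL);
  return x->code == TOY_REG;
}

/* Simplified model of register_operand: a bare register, or a
   subreg wrapped around a register, is accepted.  */
int
toy_register_operand (const struct toy_rtx *x)
{
  assert (x != NULL);
  if (x->code == TOY_SUBREG)
    return x->inner != NULL && x->inner->code == TOY_REG;
  return x->code == TOY_REG;
}
```

In this model a TOY_SUBREG around a TOY_REG passes toy_register_operand()
but fails toy_reg_p(), which is the kind of operand the extra REG_P()
tests in the expander are meant to reject.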


Re: [PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Max Filippov via Gcc-patches
On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa
 wrote:
> More optimized than the default RTL generation.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (adddi3, subdi3):
> New RTL generation patterns implemented according to the instruc-
> tion idioms described in the Xtensa ISA reference manual (p. 600).
> ---
>  gcc/config/xtensa/xtensa.md | 52 +
>  1 file changed, 52 insertions(+)
>
> diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
> index eda1353894b..21afa747e89 100644
> --- a/gcc/config/xtensa/xtensa.md
> +++ b/gcc/config/xtensa/xtensa.md
> @@ -190,6 +190,35 @@
> (set_attr "mode"    "SI")
> (set_attr "length"  "3")])
>
> +(define_expand "adddi3"
> +  [(set (match_operand:DI 0 "register_operand")
> +   (plus:DI (match_operand:DI 1 "register_operand")
> +(match_operand:DI 2 "register_operand")))]
> +  ""
> +{
> +  rtx lo_dest, hi_dest, lo_op0, hi_op0, lo_op1, hi_op1;
> +  rtx_code_label *label;
> +  if (rtx_equal_p (operands[0], operands[1])
> +  || rtx_equal_p (operands[0], operands[2])

> +  || ! REG_P (operands[1]) || ! REG_P (operands[2]))

I wonder if these additional conditions are necessary, given that
the operands have the "register_operand" predicates?

-- 
Thanks.
-- Max


Re: [PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Max Filippov via Gcc-patches
On Wed, May 31, 2023 at 11:01 PM Takayuki 'January June' Suwa
 wrote:
> More optimized than the default RTL generation.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (adddi3, subdi3):
> New RTL generation patterns implemented according to the instruc-
> tion idioms described in the Xtensa ISA reference manual (p. 600).
> ---
>  gcc/config/xtensa/xtensa.md | 52 +
>  1 file changed, 52 insertions(+)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max


[PATCH 2/3 v3] xtensa: Add 'adddi3' and 'subdi3' insn patterns

2023-06-01 Thread Takayuki 'January June' Suwa via Gcc-patches
On 2023/05/31 15:02, Max Filippov wrote:
Hi!

> On Tue, May 30, 2023 at 2:50 AM Takayuki 'January June' Suwa
>  wrote:
>>
>> Resubmitting the correct one due to a mistake in merging order of fixes.
>> ---
>> More optimized than the default RTL generation.
>>
>> gcc/ChangeLog:
>>
>> * config/xtensa/xtensa.md (adddi3, subdi3):
>> New RTL generation patterns implemented according to the instruc-
>> tion idioms described in the Xtensa ISA reference manual (p. 600).
>> ---
>>  gcc/config/xtensa/xtensa.md | 52 +
>>  1 file changed, 52 insertions(+)
>>
>> diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
>> index eda1353894b..6882baaedfd 100644
>> --- a/gcc/config/xtensa/xtensa.md
>> +++ b/gcc/config/xtensa/xtensa.md
>> @@ -190,6 +190,32 @@
>> (set_attr "mode"    "SI")
>> (set_attr "length"  "3")])
>>
>> +(define_expand "adddi3"
>> +  [(set (match_operand:DI 0 "register_operand")
>> +   (plus:DI (match_operand:DI 1 "register_operand")
>> +(match_operand:DI 2 "register_operand")))]
>> +  ""
>> +{
>> +  rtx lo_dest, hi_dest, lo_op0, hi_op0, lo_op1, hi_op1;
>> +  rtx_code_label *label;
>> +  lo_dest = gen_lowpart (SImode, operands[0]);
>> +  hi_dest = gen_highpart (SImode, operands[0]);
>> +  lo_op0 = gen_lowpart (SImode, operands[1]);
>> +  hi_op0 = gen_highpart (SImode, operands[1]);
>> +  lo_op1 = gen_lowpart (SImode, operands[2]);
>> +  hi_op1 = gen_highpart (SImode, operands[2]);
>> +  if (rtx_equal_p (lo_dest, lo_op1))
>> +    FAIL;
> 
> With this condition I see the following source
> 
> unsigned long long foo(unsigned long long a, unsigned long long b)
> {
>    return a + b;
> }
> 
> turns to (expected)
> 
>         .global foo
>         .type   foo, @function
> foo:
>         add.n   a2, a2, a4
>         add.n   a3, a3, a5
>         bgeu    a2, a4, .L2
>         addi.n  a3, a3, 1
> .L2:
>         ret.n
> 
> but
> 
> unsigned long long foo(unsigned long long a, unsigned long long b)
> {
>    return b + a;
> }
> 
> has an extra instruction:
> 
>         .global foo
>         .type   foo, @function
> foo:
>         mov.n   a9, a2
>         add.n   a2, a4, a2
>         add.n   a3, a5, a3
>         bgeu    a2, a9, .L2
>         addi.n  a3, a3, 1
> .L2:
>         ret.n
> 
> I thought that maybe the following would help (plus using
> lo_cmp in the emit_cmp_and_jump_insns below):
> 
>   if (!rtx_equal_p (lo_dest, lo_op0))
>     lo_cmp = lo_op0;
>   else if (!rtx_equal_p (lo_dest, lo_op1))
>     lo_cmp = lo_op1;
>   else
>     FAIL;
> 
> but to my surprise it doesn't.
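
As a side note on why the comparison operand matters at all, here is a
small C sketch under stated assumptions: a toy three-register file, with
the hypothetical helpers toy_carry_seen() and toy_pick_cmp() standing in
for the emitted add/compare pair and for the lo_cmp selection quoted
above. It only models the aliasing hazard, not register allocation.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file: the addends start in registers 0 and 1,
   register 2 is a spare.  Hypothetical model, not GCC code.  */
enum { TOY_NREGS = 3 };

/* Emulates "dest = r0 + r1" followed by an unsigned compare of
   dest against register CMP.  Returns nonzero when the compare
   reports a carry out of the 32-bit addition.  */
int
toy_carry_seen (uint32_t a, uint32_t b, int dest, int cmp)
{
  uint32_t regs[TOY_NREGS] = { a, b, 0 };
  assert (dest >= 0 && dest < TOY_NREGS);
  assert (cmp >= 0 && cmp < TOY_NREGS);
  regs[dest] = regs[0] + regs[1];  /* May overwrite an addend.  */
  return regs[dest] < regs[cmp];
}

/* Picks an addend that the destination does not overwrite, in the
   spirit of the lo_cmp suggestion; -1 stands for FAIL.  */
int
toy_pick_cmp (int dest, int op0, int op1)
{
  if (dest != op0)
    return op0;
  if (dest != op1)
    return op1;
  return -1;
}
```

With a = 0xFFFFFFFF and b = 1, writing the sum into register 0 and then
comparing against register 0 misses the carry, while comparing against
register 1 (what toy_pick_cmp (0, 0, 1) selects) detects it.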

As you may have noticed, at RTL generation time the two cases above are
almost identical (only a and b are swapped). Whether an extra register
is needed is decided at a later stage, so there is very little that can
be done about it in (define_expand).

That was my first thought, but when I looked at the generated RTL again, I
noticed that I could make the decision based on the order of the generated
pseudo-register numbers.
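
For reference, the add.n / add.n / bgeu / addi.n sequence quoted above
can be modelled in plain C. This is only a sketch: the u64_parts type and
the helper names are hypothetical and exist purely for illustration. It
adds the low and high words separately, then bumps the high word when the
low sum wrapped around, which is detected by an unsigned compare against
one of the original low words.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value as two 32-bit words (hypothetical helper type).  */
typedef struct { uint32_t lo, hi; } u64_parts;

static u64_parts
split (uint64_t v)
{
  u64_parts p = { (uint32_t) v, (uint32_t) (v >> 32) };
  return p;
}

static uint64_t
join (u64_parts p)
{
  return ((uint64_t) p.hi << 32) | p.lo;
}

/* Mirrors the emitted sequence: add low words, add high words,
   skip the increment when lo >= operand lo (no carry), otherwise
   add 1 to the high word.  */
uint64_t
add64_by_halves (uint64_t a, uint64_t b)
{
  u64_parts x = split (a), y = split (b), d;
  d.lo = x.lo + y.lo;
  d.hi = x.hi + y.hi;
  if (!(d.lo >= y.lo))  /* bgeu falls through only on carry.  */
    d.hi += 1;
  /* Sanity check against the native 64-bit addition.  */
  assert (join (d) == a + b);
  return join (d);
}
```

Note that the compare reads y.lo after d.lo has been written, so the
model is only valid because d is a separate object; this corresponds to
the requirement that the compared low word must not share a register
with the destination.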

> 
>> +  emit_clobber (operands[0]);
> 
> Why is this clobber needed?

Apparently there is no need for an explicit clobber: even if it is omitted,
one still appears in the generated result.

> 
>> +  emit_insn (gen_addsi3 (lo_dest, lo_op0, lo_op1));
>> +  emit_insn (gen_addsi3 (hi_dest, hi_op0, hi_op1));
>> +  emit_cmp_and_jump_insns (lo_dest, lo_op1, GEU, const0_rtx,
>> +  SImode, true, label = gen_label_rtx ());
>> +  emit_insn (gen_addsi3 (hi_dest, hi_dest, const1_rtx));
>> +  emit_label (label);
>> +  DONE;
>> +})
>> +
>>  (define_insn "addsf3"
>>[(set (match_operand:SF 0 "register_operand" "=f")
>> (plus:SF (match_operand:SF 1 "register_operand" "%f")
>> @@ -237,6 +263,32 @@
>>   (const_int 5)
>>   (const_int 6)))])
>>
>> +(define_expand "subdi3"
>> +  [(set (match_operand:DI 0 "register_operand")
>> +   (minus:DI (match_operand:DI 1 "register_operand")
>> + (match_operand:DI 2 "register_operand")))]
>> +  ""
>> +{
>> +  rtx lo_dest, hi_dest, lo_op0, hi_op0, lo_op1, hi_op1;
>> +  rtx_code_label *label;
>> +  lo_dest = gen_lowpart (SImode, operands[0]);
>> +  hi_dest = gen_highpart (SImode, operands[0]);
>> +  lo_op0 = gen_lowpart (SImode, operands[1]);
>> +  hi_op0 = gen_highpart (SImode, operands[1]);
>> +  lo_op1 = gen_lowpart (SImode, operands[2]);
>> +  hi_op1 = gen_highpart (SImode, operands[2]);
>> +  if (rtx_equal_p (lo_op0, lo_op1))
>> +    FAIL;
> 
> I believe that for the emit_cmp_and_jump_insns below
> the check here should look like this:
> 
> if (rtx_equal_p (lo_dest, lo_op0) || rtx_equal_p (lo_dest, lo_op1))
> 
> But maybe drop this check and use the following instead?
> 
>  emit_insn (gen_subsi3 (hi_dest, hi_op0, hi_op1));
>  emit_cmp_and_jump_insns (lo_op0, lo_op1, GEU, const0_rtx,
>   SImode, true, label = gen_label_rtx ());
>  emit_insn (gen_addsi3 (hi_dest, hi_dest,