Re: [committed] RISC-V: Fix INSN costing and more zicond tests

2023-11-09 Thread Maciej W. Rozycki
On Thu, 9 Nov 2023, Jeff Law wrote:

> >   Can we have the insn costing reverted to correct calculation?
> What needs to happen is that code needs to be extended, not reverted. Many
> codes have to be synthesized based on the condition and the true/false arms.
> That's not currently accounted for.

 How is maintaining zillions of variants of insn counts by hand (IIUC what 
you mean) going to be more efficient (or even practical maintenance-wise) 
than what the middle end did automagically?  What exactly was wrong with 
the previous approach, and then why didn't your change include a proof of 
correctness in the form of testsuite cases verifying branch vs conditional 
move costing stays the same (or gets corrected if applicable) across your 
change?

 I guess I'll post my patch series regardless on the presumption that 
correct insn counting will have been reinstated for GCC 14 one way or 
another (i.e. by reverting commit 44efc743acc0 locally in my tree and then 
getting clean test results across the patch series) and we can take it 
from there.  Also to make sure we're on the same page.

 I do hope it will be considered worthwhile despite this issue making it 
not ready for testsuite verification, as not only does it add new features, 
but it also fixes numerous existing problems, plain bugs, and deficiencies 
which we currently have in conditional move handling.  But it relies on 
correct costing for verification, and I couldn't have expected that to get 
broken again (regardless of your clearly good intentions).  And I'd rather 
we had these test cases, as otherwise costing regressions are easily 
missed (as indicated here).

  Maciej


Re: [committed] RISC-V: Fix INSN costing and more zicond tests

2023-11-09 Thread Jeff Law




On 11/9/23 07:33, Maciej W. Rozycki wrote:

>   Can we have the insn costing reverted to correct calculation?

FYI, I've opened a bug for this issue so it doesn't get lost.  I don't 
think the extensions are terribly hard.  It's really a matter of 
deciding if we can re-use any of the logic from the expander or if we 
just mirror its logic and keep the expander and costing in sync.

Jeff


Re: [committed] RISC-V: Fix INSN costing and more zicond tests

2023-11-09 Thread Jeff Law




On 11/9/23 07:33, Maciej W. Rozycki wrote:

>   Can we have the insn costing reverted to correct calculation?

What needs to happen is that code needs to be extended, not reverted. 
Many codes have to be synthesized based on the condition and the 
true/false arms.  That's not currently accounted for.

jeff


Re: [committed] RISC-V: Fix INSN costing and more zicond tests

2023-11-09 Thread Maciej W. Rozycki
On Fri, 29 Sep 2023, Jeff Law wrote:

> So this ends up looking a lot like the bits that I had to revert several weeks
> ago :-)
> 
> The core issue we have is given an INSN the generic code will cost the SET_SRC
> and SET_DEST and sum them.  But that's far from ideal on a RISC target.
> 
> For a register destination, the cost can be determined by looking at just the
> SET_SRC.  Which is precisely what this patch does.  When the outer code is an
> INSN and we're presented with a SET we take one of two paths.
> 
> If the destination is a register, then we recurse just on the SET_SRC and
> we're done.  Otherwise we fall back to the existing code which sums the cost
> of the SET_SRC and SET_DEST.  That fallback path isn't great and probably
> could be further improved (just costing SET_DEST in that case is probably
> quite reasonable).

 So this actually breaks insn costing for if-conversion, causing all 
conditional-move expansions to count as 1 insn regardless of how many 
there actually are.  This can be easily verified by using various 
`-mbranch-cost=' settings.

 Before your change you had to set the branch cost higher than or equal 
to the replacement insn count for if-conversion to trigger.  Of course 
microarchitecture tuning will hopefully have preset this correctly for 
its needs, so normally you don't need to resort to `-mbranch-cost='.  
With this change in place only setting `-mbranch-cost=1' will prevent 
if-conversion from triggering, which takes the situation back to the GCC 
13 days, where `movMODEcc' patterns were always costed at 1.

 In preparation for an upcoming set of changes I have written numerous 
testsuite cases to verify that this insn costing works correctly, and now 
that I have rebased for the submission they all indicate the costing went 
wrong: `movMODEcc' sequences of up to 6 insns are all now costed at 1 
total.  I was going to post the patch series Fri-Mon, but this seems like 
a showstopper to me, because if-conversion now triggers even when the 
conditional-move (or for that matter conditional-add, as I have it handled 
too) sequence is more expensive than a branched one.

 E.g. the NE operation costs 4 instructions for Zicond:

        sub        a1,a0,a1
        czero.eqz  a2,a2,a1
        czero.nez  a1,a3,a1
        or         a0,a2,a1
        ret

while the branched equivalent costs (branch + 1) instructions:

        beq        a0,a1,.L3
        mv         a0,a2
        ret
.L3:
        mv         a0,a3
        ret

so I'd expect if-conversion only to trigger at `-mbranch-cost=3' or higher 
(just as my test cases verify), but now it triggers at `-mbranch-cost=2' 
already.
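
To make the comparison above concrete, here is a minimal C sketch (not part 
of the patch or of any testsuite case) that models the two Zicond 
instructions and checks that the 4-insn branchless sequence and the 
branched sequence compute the same `a0 != a1 ? a2 : a3' selection.  All 
function names are made up for illustration; the czero.eqz/czero.nez 
semantics assumed here are "result is zero when the condition register is 
zero (respectively nonzero), and rs1 otherwise".

```c
#include <assert.h>
#include <stdint.h>

/* czero.eqz rd,rs1,rs2: rd = (rs2 == 0) ? 0 : rs1.  */
static uint64_t
czero_eqz (uint64_t rs1, uint64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd,rs1,rs2: rd = (rs2 != 0) ? 0 : rs1.  */
static uint64_t
czero_nez (uint64_t rs1, uint64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* The 4-insn Zicond sequence quoted above, one statement per insn.  */
static uint64_t
select_ne_zicond (uint64_t a0, uint64_t a1, uint64_t a2, uint64_t a3)
{
  a1 = a0 - a1;             /* sub        a1,a0,a1 */
  a2 = czero_eqz (a2, a1);  /* czero.eqz  a2,a2,a1 */
  a1 = czero_nez (a3, a1);  /* czero.nez  a1,a3,a1 */
  return a2 | a1;           /* or         a0,a2,a1 */
}

/* The branched equivalent: a branch plus one move on either path.  */
static uint64_t
select_ne_branch (uint64_t a0, uint64_t a1, uint64_t a2, uint64_t a3)
{
  if (a0 == a1)   /* beq a0,a1,.L3 */
    return a3;    /* .L3: mv a0,a3 */
  return a2;      /* mv a0,a2 */
}

/* Hypothetical self-check: both sequences agree on a few corner values.  */
static void
check_select_ne (void)
{
  static const uint64_t v[] = { 0, 1, 2, 42, UINT64_MAX };
  const unsigned n = sizeof v / sizeof v[0];

  for (unsigned i = 0; i < n; i++)
    for (unsigned j = 0; j < n; j++)
      assert (select_ne_zicond (v[i], v[j], 7, 9)
              == select_ne_branch (v[i], v[j], 7, 9));
}
```

Both variants return the same value for every input, so the only thing 
that should decide between them is the cost comparison discussed here: 
four insns on one side, a branch plus one move on the other.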

 Can we have the insn costing reverted to correct calculation?

  Maciej


Re: [committed] RISC-V: Fix INSN costing and more zicond tests

2023-10-12 Thread Hans-Peter Nilsson
> Date: Fri, 29 Sep 2023 16:37:21 -0600
> From: Jeff Law 

> So this ends up looking a lot like the bits that I had to revert several 
> weeks ago :-)
> 
> The core issue we have is given an INSN the generic code will cost the 
> SET_SRC and SET_DEST and sum them.  But that's far from ideal on a RISC 
> target.
> 
> For a register destination, the cost can be determined by looking at 
> just the SET_SRC.  Which is precisely what this patch does.  When the 
> outer code is an INSN and we're presented with a SET we take one of two 
> paths.
> 
> If the destination is a register, then we recurse just on the SET_SRC 
> and we're done.  Otherwise we fall back to the existing code which sums 
> the cost of the SET_SRC and SET_DEST.

Ackchyually...  that "otherwise" happens for calls to
set_rtx_cost (et al), but not calls to insn_cost.

IOW, with that patch, it seems you're mimicking insn_cost
behavior also for set_rtx_cost (et al).  You're likely aware
of this, but when seeing these target cost functions tweaked
for reasons that appear somewhat empirical, I felt compelled
to point out the related rabbit-hole.

While I'm ranting, these slightly different cost APIs,
picked somewhat arbitrarily (or empirically) by callers, are
a problem in themselves.  Not to mention that the default use
of set_rtx_cost means you get hit by another problem: the
default cost of 0 for registers is also a magic number that
makes pattern_cost set the cost to COSTS_N_INSNS (1).

The default insn_cost implementation, which RISC-V uses as
opposed to implementing the TARGET_INSN_COST hook, only
looks at the SET_SRC for calls to insn_cost for single-sets.
See pattern_cost.  I believe that's a bug.  Fixing that was
attempted in 2016 (by Bernd S.), a patch which was later
reverted: cf. commits r7-4866-g334442f282a9d6 and
r7-4930-g03612f25277590.  Hence rabbit-hole.  (And no,
implementing TARGET_INSN_COST doesn't automatically fix
things.  Too much of the gcc middle-end appears tuned to the
default behavior.)
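
As a rough illustration of the magic number mentioned above, the following
C sketch mimics only the final step of pattern_cost: a source cost of 0 is
not kept as "free" but replaced by the cost of one insn.  This is a
simplified model, not the GCC code itself: toy_pattern_cost and the
self-check are made-up names, the real function first extracts the SET
from the pattern, and the one-insn default is assumed to be spelled
COSTS_N_INSNS (1) with the 4-units-per-insn scaling from rtl.h.

```c
#include <assert.h>

/* Assumed to match GCC's rtl.h: one insn is worth 4 cost units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical stand-in for the last step of pattern_cost.  SET_SRC_COST
   is whatever the rtx cost hook reported for the SET_SRC.  A result of 0
   (the default for a bare register) is treated as "no information" and
   bumped to the cost of one insn.  */
static int
toy_pattern_cost (int set_src_cost)
{
  return set_src_cost > 0 ? set_src_cost : COSTS_N_INSNS (1);
}

/* Hypothetical self-check showing the consequence: a source costed at 0
   becomes indistinguishable from one explicitly costed at one insn,
   while larger costs pass through unchanged.  */
static void
toy_pattern_cost_selfcheck (void)
{
  assert (toy_pattern_cost (0) == COSTS_N_INSNS (1));
  assert (toy_pattern_cost (COSTS_N_INSNS (1)) == COSTS_N_INSNS (1));
  assert (toy_pattern_cost (COSTS_N_INSNS (3)) == COSTS_N_INSNS (3));
}
```

So under this model a target hook cannot express "this SET is free" through
the default path; 0 and one insn collapse into the same answer.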

Sorry for the rant; have a nice day and a better week-end.

>  That fallback path isn't great 
> and probably could be further improved (just costing SET_DEST in that 
> case is probably quite reasonable).
> 
> The difference between this version and the bits that slipped through by 
> accident several weeks ago is that old version mis-used the API due to a 
> thinko on my part.
> 
> This tightens up various zicond tests to avoid undesirable matching.
> 
> This has been tested on rv64gc -- the only difference it makes on the 
> testsuite is the new tests (included in this patch) flip from failing to 
> passing.
> 
> Pushed to the trunk.
> 
> Jeff

brgds, H-P


[committed] RISC-V: Fix INSN costing and more zicond tests

2023-09-29 Thread Jeff Law


So this ends up looking a lot like the bits that I had to revert several 
weeks ago :-)


The core issue we have is given an INSN the generic code will cost the 
SET_SRC and SET_DEST and sum them.  But that's far from ideal on a RISC 
target.


For a register destination, the cost can be determined by looking at 
just the SET_SRC.  Which is precisely what this patch does.  When the 
outer code is an INSN and we're presented with a SET we take one of two 
paths.


If the destination is a register, then we recurse just on the SET_SRC 
and we're done.  Otherwise we fall back to the existing code which sums 
the cost of the SET_SRC and SET_DEST.  That fallback path isn't great 
and probably could be further improved (just costing SET_DEST in that 
case is probably quite reasonable).


The difference between this version and the bits that slipped through by 
accident several weeks ago is that old version mis-used the API due to a 
thinko on my part.


This tightens up various zicond tests to avoid undesirable matching.

This has been tested on rv64gc -- the only difference it makes on the 
testsuite is the new tests (included in this patch) flip from failing to 
passing.


Pushed to the trunk.

Jeff
commit 44efc743acc01354b6b9eb1939aedfdcc44e71f3
Author: Xiao Zeng 
Date:   Fri Sep 29 16:29:02 2023 -0600

Fix INSN costing and more zicond tests

So this ends up looking a lot like the bits that I had to revert several
weeks ago :-)

The core issue we have is given an INSN the generic code will cost the
SET_SRC and SET_DEST and sum them.  But that's far from ideal on a RISC
target.

For a register destination, the cost can be determined by looking at just
the SET_SRC.  Which is precisely what this patch does.  When the outer code
is an INSN and we're presented with a SET we take one of two paths.

If the destination is a register, then we recurse just on the SET_SRC and
we're done.  Otherwise we fall back to the existing code which sums the
cost of the SET_SRC and SET_DEST.  That fallback path isn't great and
probably could be further improved (just costing SET_DEST in that case is
probably quite reasonable).

The difference between this version and the bits that slipped through by
accident several weeks ago is that old version mis-used the API due to a
thinko on my part.

This tightens up various zicond tests to avoid undesirable matching.

This has been tested on rv64gc -- the only difference it makes on the
testsuite is the new tests (included in this patch) flip from failing to
passing.

Pushed to the trunk.

gcc/
* config/riscv/riscv.cc (riscv_rtx_costs): Better handle costing
SETs when the outer code is INSN.

gcc/testsuite
* gcc.target/riscv/zicond-primitiveSemantics_compare_imm.c: New test.
* gcc.target/riscv/zicond-primitiveSemantics_compare_imm_return_0_imm.c:
Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_compare_imm_return_imm_imm.c:
Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_compare_imm_return_imm_reg.c:
Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_compare_imm_return_reg_reg.c:
Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_compare_reg.c: Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_compare_reg_return_0_imm.c:
Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_compare_reg_return_imm_imm.c:
Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_compare_reg_return_imm_reg.c:
Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_compare_reg_return_reg_reg.c:
Likewise.
* gcc.target/riscv/zicond-primitiveSemantics.c: Tighten expected regexp.
* gcc.target/riscv/zicond-primitiveSemantics_return_0_imm.c: Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_return_imm_imm.c: Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_return_imm_reg.c: Likewise.
* gcc.target/riscv/zicond-primitiveSemantics_return_reg_reg.c: Likewise.
* gcc.target/riscv/zicond-xor-01.c: Likewise.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 6e7a719e7a0..d5446b63dbf 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2768,6 +2768,19 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
 
   switch (GET_CODE (x))
     {
+    case SET:
+      /* If we are called for an INSN that's a simple set of a register,
+         then cost based on the SET_SRC alone.  */
+      if (outer_code == INSN && REG_P (SET_DEST (x)))
+        {
+          riscv_rtx_costs (SET_SRC (x), mode, outer_code, opno, total, speed);
+          return true;
+        }
+
+      /*