Re: [RFA][RISC-V] Use "uw" forms for constant synthesis

2024-05-06 Thread Jeff Law




On 5/4/24 6:53 PM, Jeff Law wrote:


Tested on rv64gc and rv32gcv.  OK for the trunk assuming it passes CI?

I pushed this after fixing the two over-length lines.

jeff



Re: [RFA][RISC-V] Use "uw" forms for constant synthesis

2024-05-05 Thread Jeff Law




On 5/4/24 6:53 PM, Jeff Law wrote:


Tested on rv64gc and rv32gcv.  OK for the trunk assuming it passes CI?

Assume I'll fix the two overly long lines pointed out by the linter :-)
jeff


[RFA][RISC-V] Use "uw" forms for constant synthesis

2024-05-04 Thread Jeff Law


So another constant synthesis improvement.

In this patch we're looking at cases where we'd like to be able to use 
lui+slli, but can't because lui sign-extends its result on 
TARGET_64BIT.  For example: 0x800110020UL.  The trunk currently 
generates 4 instructions for that constant, when it can be done with 3 
(lui+slli.uw+addi).


When Zba is enabled, we can use lui+slli.uw because slli.uw masks off 
bits 32..63 before shifting, giving us the precise semantics we want.
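
To make the example concrete, here's a small C model of the instructions 
involved.  It's only a sketch, not code from the patch: the helper names 
(sext32, rv_lui, rv_slli, rv_slli_uw, rv_addi) are made up for 
illustration, and the semantics are my reading of RV64I and Zba.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 32 bits of X to 64 bits (hypothetical helper).  */
static uint64_t
sext32 (uint64_t x)
{
  return ((x & 0xffffffffULL) ^ 0x80000000ULL) - 0x80000000ULL;
}

/* lui on RV64: put IMM20 in bits 12..31, then sign-extend from bit 31.  */
static uint64_t
rv_lui (uint32_t imm20)
{
  return sext32 ((uint64_t) imm20 << 12);
}

/* slli: plain 64-bit left shift.  */
static uint64_t
rv_slli (uint64_t rs1, unsigned shamt)
{
  return rs1 << shamt;
}

/* slli.uw (Zba): zero-extend the low 32 bits of RS1, then shift.  */
static uint64_t
rv_slli_uw (uint64_t rs1, unsigned shamt)
{
  return (rs1 & 0xffffffffULL) << shamt;
}

/* addi: add a sign-extended 12-bit immediate.  */
static uint64_t
rv_addi (uint64_t rs1, int imm12)
{
  return rs1 + (uint64_t) (int64_t) imm12;
}

/* lui 0x80011; slli.uw 4; addi 0x20.  */
static uint64_t
synth_with_uw (void)
{
  return rv_addi (rv_slli_uw (rv_lui (0x80011), 4), 0x20);
}

/* The same sequence with a plain slli keeps the sign-extension bits.  */
static uint64_t
synth_without_uw (void)
{
  return rv_addi (rv_slli (rv_lui (0x80011), 4), 0x20);
}
```

In this model synth_with_uw () yields 0x800110020, while 
synth_without_uw () yields 0xfffffff800110020 because the copies of 
bit 31 that lui placed in bits 32..63 survive the shift.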


I strongly suspect we'll want to do the same for a set of constants with 
lui+add.uw, lui+shNadd.uw, so you'll see the beginnings of generalizing 
support for lui followed by a "uw" instruction.
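
For reference, the other "uw" forms mentioned above can be modelled the 
same way.  This is only a sketch of the Zba instruction semantics as I 
understand them; the function names are hypothetical, and nothing here 
reflects how a later patch would actually use them.

```c
#include <assert.h>
#include <stdint.h>

/* add.uw rd, rs1, rs2: rd = rs2 + zero-extended low 32 bits of rs1.  */
static uint64_t
rv_add_uw (uint64_t rs1, uint64_t rs2)
{
  return rs2 + (rs1 & 0xffffffffULL);
}

/* shNadd.uw rd, rs1, rs2 (N is 1, 2 or 3):
   rd = rs2 + (zero-extended low 32 bits of rs1 << N).  */
static uint64_t
rv_shnadd_uw (unsigned n, uint64_t rs1, uint64_t rs2)
{
  return rs2 + ((rs1 & 0xffffffffULL) << n);
}
```

In both cases the first operand can be a sign-extended lui result, since 
bits 32..63 are discarded before the add.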


The new test just covers the set of cases that showed up while exploring 
a particular space of the constant synthesis problem.  It's not meant to 
be exhaustive (for example, it doesn't check for failure to use shadd 
when profitable).


Tested on rv64gc and rv32gcv.  OK for the trunk assuming it passes CI?

Jeff


gcc/

	* config/riscv/riscv.cc (riscv_integer_op): Add field tracking if we
	want to use a "uw" instruction variant.
	(riscv_build_integer_1): Initialize the new field in various places.
	Use lui+slli.uw for some constants.
	(riscv_move_integer): Handle slli.uw.
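
The "Use lui+slli.uw for some constants" change boils down to the check 
sketched below as standalone C.  It's an approximation for illustration 
only: small_operand and lui_operand are simplified stand-ins for GCC's 
SMALL_OPERAND and LUI_OPERAND, struct shift_plan and plan_shift are 
hypothetical names, RV64 is assumed throughout, and the have_zba 
argument stands in for TARGET_ZBA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IMM_BITS 12

/* Sign-extend the low BITS bits of X (1 <= BITS <= 64).  */
static uint64_t
sext (uint64_t x, int bits)
{
  if (bits >= 64)
    return x;
  uint64_t m = 1ULL << (bits - 1);
  x &= (m << 1) - 1;
  return (x ^ m) - m;
}

/* Stand-in for SMALL_OPERAND: fits a signed 12-bit immediate.  */
static bool
small_operand (uint64_t v)
{
  return v + 0x800 < 0x1000;
}

/* Stand-in for LUI_OPERAND: low 12 bits clear and the value is a
   sign-extended 32-bit quantity.  */
static bool
lui_operand (uint64_t v)
{
  return (v & 0xfff) == 0 && sext (v, 32) == v;
}

struct shift_plan
{
  uint64_t x;	/* Constant to build before the shift.  */
  int shift;	/* Shift amount for slli or slli.uw.  */
  bool use_uw;	/* True if the shift must be slli.uw.  */
};

/* VALUE must be nonzero with its low bit clear.  */
static struct shift_plan
plan_shift (uint64_t value, bool have_zba)
{
  struct shift_plan p = { value, 0, false };

  /* Count trailing zeros.  */
  while ((p.x & 1) == 0)
    p.x >>= 1, p.shift++;
  p.x = sext (p.x, 64 - p.shift);

  /* Keep the lower 12 zero bits if lui might apply, either directly or
     once bit 31 is ignored.  */
  if (p.shift > IMM_BITS
      && !small_operand (p.x)
      && (lui_operand (p.x << IMM_BITS)
	  || (have_zba
	      && lui_operand ((p.x << IMM_BITS) & ~0x80000000ULL))))
    p.shift -= IMM_BITS, p.x <<= IMM_BITS;

  /* Not a lui operand by itself, but slli.uw will mask off the
     sign-extension bits.  */
  if (!lui_operand (p.x)
      && have_zba
      && lui_operand (p.x & ~0x80000000ULL))
    {
      p.x = sext (p.x, 32);
      p.use_uw = true;
    }
  return p;
}
```

For 0x800110000 (the example constant with its low 0x20 left for addi), 
the sketch picks x = 0xffffffff80011000, shift = 4 and use_uw = true 
when have_zba is set; without it the result is x = 0x80011, shift = 16.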

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 44945d47fd6..fd81f69e230 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -249,6 +249,7 @@ struct riscv_arg_info {
    where A is an accumulator, each CODE[i] is a binary rtl operation
    and each VALUE[i] is a constant integer.  CODE[0] is undefined.  */
 struct riscv_integer_op {
+  bool use_uw;
   enum rtx_code code;
   unsigned HOST_WIDE_INT value;
 };
@@ -734,6 +735,7 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
       /* Simply ADDI or LUI.  */
       codes[0].code = UNKNOWN;
       codes[0].value = value;
+      codes[0].use_uw = false;
       return 1;
     }
   if (TARGET_ZBS && SINGLE_BIT_MASK_OPERAND (value))
@@ -741,6 +743,7 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
       /* Simply BSETI.  */
       codes[0].code = UNKNOWN;
       codes[0].value = value;
+      codes[0].use_uw = false;
 
       /* RISC-V sign-extends all 32bit values that live in a 32bit
          register.  To avoid paradoxes, we thus need to use the
@@ -769,6 +772,7 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
         {
           alt_codes[alt_cost-1].code = PLUS;
           alt_codes[alt_cost-1].value = low_part;
+          alt_codes[alt_cost-1].use_uw = false;
           memcpy (codes, alt_codes, sizeof (alt_codes));
           cost = alt_cost;
         }
@@ -782,6 +786,7 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
         {
           alt_codes[alt_cost-1].code = XOR;
           alt_codes[alt_cost-1].value = low_part;
+          alt_codes[alt_cost-1].use_uw = false;
           memcpy (codes, alt_codes, sizeof (alt_codes));
           cost = alt_cost;
         }
@@ -792,17 +797,37 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
     {
       int shift = ctz_hwi (value);
       unsigned HOST_WIDE_INT x = value;
+      bool use_uw = false;
       x = sext_hwi (x >> shift, HOST_BITS_PER_WIDE_INT - shift);
 
       /* Don't eliminate the lower 12 bits if LUI might apply.  */
-      if (shift > IMM_BITS && !SMALL_OPERAND (x) && LUI_OPERAND (x << IMM_BITS))
+      if (shift > IMM_BITS
+          && !SMALL_OPERAND (x)
+          && (LUI_OPERAND (x << IMM_BITS)
+              || (TARGET_64BIT
+                  && TARGET_ZBA
+                  && LUI_OPERAND ((x << IMM_BITS)
+                                  & ~HOST_WIDE_INT_C (0x80000000)))))
         shift -= IMM_BITS, x <<= IMM_BITS;
 
+      /* Adjust X if it isn't a LUI operand in isolation, but we can use
+         a subsequent "uw" instruction form to mask off the undesirable
+         bits.  */
+      if (!LUI_OPERAND (x)
+          && TARGET_64BIT
+          && TARGET_ZBA
+          && LUI_OPERAND (x & ~HOST_WIDE_INT_C (0x80000000UL)))
+        {
+          x = sext_hwi (x, 32);
+          use_uw = true;
+        }
+
       alt_cost = 1 + riscv_build_integer_1 (alt_codes, x, mode);
       if (alt_cost < cost)
         {
           alt_codes[alt_cost-1].code = ASHIFT;
           alt_codes[alt_cost-1].value = shift;
+          alt_codes[alt_cost-1].use_uw = use_uw;
           memcpy (codes, alt_codes, sizeof (alt_codes));
           cost = alt_cost;
         }
@@ -823,8 +848,10 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
           /* The sign-bit might be zero, so just rotate to be safe.  */
           codes[0].value = (((unsigned HOST_WIDE_INT) value >> trailing_ones)
                             | (value << (64 - trailing_ones)));
+          codes[0].use_uw = false;
           codes[1].code = ROTATERT;
           codes[1].value = 64 -