Both approaches look weird to me.

Lehua's approach adds a const-0 move pattern that is only used by the widening
reduction, which is not ideal.
I also don't like changing the abs/vcond_mask predicates.

So, IMHO, a more complicated pattern that combines the initial 0 value +
extension + reduction + vmerge may be more reasonable.
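
As a scalar sketch of what such a combined pattern would have to preserve
(plain C++ with hypothetical helper names, not GCC RTL; the int16_t to int32_t
width pair is only an assumed example): extending, merging inactive lanes with
0, and then summing gives the same result as a masked widening sum that skips
the inactive lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Unfused form: sign-extend each element, select 0 for inactive lanes
// (the vmerge step), then sum the widened vector.
int32_t extend_merge_then_sum(const std::vector<int16_t>& a,
                              const std::vector<bool>& mask) {
  assert(a.size() == mask.size());
  std::vector<int32_t> merged(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    merged[i] = mask[i] ? static_cast<int32_t>(a[i]) : 0;
  int32_t sum = 0;
  for (int32_t v : merged)
    sum += v;
  return sum;
}

// Fused form: a masked widening sum that simply skips inactive lanes,
// which is roughly what a masked vwredsum is expected to compute.
int32_t masked_widen_sum(const std::vector<int16_t>& a,
                         const std::vector<bool>& mask) {
  assert(a.size() == mask.size());
  int32_t sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (mask[i])
      sum += static_cast<int32_t>(a[i]);
  return sum;
}
```

The two functions agree for any input because the else value is 0; the
accumulation happens in the wide type, so sums that overflow int16_t are
still represented.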



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2023-09-20 17:14
To: Lehua Ding; gcc-patches
CC: rdapp.gcc; juzhe.zhong; kito.cheng; palmer; jeffreyalaw
Subject: Re: [PATCH V2] RISC-V: Support combine cond extend and reduce sum to 
widen reduce sum
Hi Lehua,
 
I think this is better but still a bit weird :D  Allowing constants
and forcing them into registers unconditionally is slightly dubious as
well, though.  One thing that always sticks out is - how is 0 special?
Wouldn't we want other constants as well?
 
For reductions I think the vectorizer always starts accumulating
with the neutral initial value 0 and adds any other scalar
initial value later.  But that could change?
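
To make the "how is 0 special?" question concrete, here is a minimal scalar
sketch (plain C++, hypothetical helper names, not vectorizer code).  It
assumes a sum reduction: merging inactive lanes with 0 matches a masked sum
because 0 is the additive identity, while any other else value shifts the
result by that value times the number of inactive lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Accumulate from the neutral value 0, then add the scalar initial value.
int32_t reduce_from_neutral(const std::vector<int32_t>& v, int32_t init) {
  int32_t acc = 0;
  for (int32_t x : v)
    acc += x;
  return acc + init;
}

// Sum only the active lanes.
int32_t masked_sum(const std::vector<int32_t>& v,
                   const std::vector<bool>& mask) {
  assert(v.size() == mask.size());
  int32_t acc = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    if (mask[i])
      acc += v[i];
  return acc;
}

// Replace inactive lanes with else_val, then sum every lane.
int32_t merge_and_sum(const std::vector<int32_t>& v,
                      const std::vector<bool>& mask, int32_t else_val) {
  assert(v.size() == mask.size());
  int32_t acc = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    acc += mask[i] ? v[i] : else_val;
  return acc;
}
```

With `else_val == 0` the merge form equals `masked_sum`; with a nonzero
constant it does not, so a masked reduction could not replace it directly.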
 
For reference, attached is what I tried.  This gives me no regressions
and your tests pass.  Your approach is more generic in case we want to
match zero constants in other patterns in the future (which we would
otherwise still need to adjust with force_reg), but the force-reg
approach appears more "natural".
 
All in all, I would prefer the force-reg approach slightly but could also
live with this v2 despite some minor "usability" concerns.  Going to leave
the decision to you, either one is OK.
 
Regards
Robin
 
From 3be4cf4403a584d560c3923207a9c4da8dafee49 Mon Sep 17 00:00:00 2001
From: Robin Dapp <rd...@ventanamicro.com>
Date: Wed, 20 Sep 2023 10:15:36 +0200
Subject: [PATCH] lehua
 
---
gcc/config/riscv/autovec-opt.md | 52 ++++++++++++++++++++++++++++++++-
gcc/config/riscv/autovec.md     |  4 ++-
gcc/config/riscv/riscv-protos.h |  1 +
3 files changed, 55 insertions(+), 2 deletions(-)
 
diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index a97a095691c..8d4ee2ae37f 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -103,12 +103,14 @@ (define_insn_and_split "*cond_abs<mode>"
         (if_then_else:VF
           (match_operand:<VM> 3 "register_operand")
           (abs:VF (match_operand:VF 1 "nonmemory_operand"))
-          (match_operand:VF 2 "register_operand")))]
+          (match_operand:VF 2 "nonmemory_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
   [(const_int 0)]
{
+  if (!REG_P (operands[2]))
+    operands[2] = force_reg (<MODE>mode, operands[2]);
   emit_insn (gen_cond_len_abs<mode> (operands[0], operands[3], operands[1],
     operands[2],
     gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode),
@@ -1176,3 +1178,51 @@ (define_insn_and_split "*n<optab><mode>"
     DONE;
   }
   [(set_attr "type" "vmalu")])
+
+;; Combine mask extend + vredsum to mask vwredsum[u]
+(define_insn_and_split "*cond_widen_reduc_plus_scal_<mode>"
+  [(set (match_operand:<V_DOUBLE_EXTEND_VEL> 0 "register_operand")
+        (unspec:<V_DOUBLE_EXTEND_VEL> [
+          (if_then_else:<V_DOUBLE_EXTEND>
+            (match_operand:<VM> 1 "register_operand")
+            (any_extend:<V_DOUBLE_EXTEND>
+              (match_operand:VI_QHS_NO_M8 2 "register_operand"))
+            (match_operand:<V_DOUBLE_EXTEND> 3 "vector_const_0_operand"))
+        ] UNSPEC_REDUC_SUM))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+               gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode)};
+  riscv_vector::expand_reduction (<WREDUC_UNSPEC>,
+                                  riscv_vector::REDUCE_OP_M,
+                                  ops, CONST0_RTX (<V_DOUBLE_EXTEND_VEL>mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
+
+;; Combine mask extend + vfredsum to mask vfwredusum
+(define_insn_and_split "*cond_widen_reduc_plus_scal_<mode>"
+  [(set (match_operand:<V_DOUBLE_EXTEND_VEL> 0 "register_operand")
+        (unspec:<V_DOUBLE_EXTEND_VEL> [
+          (if_then_else:<V_DOUBLE_EXTEND>
+            (match_operand:<VM> 1 "register_operand")
+            (float_extend:<V_DOUBLE_EXTEND>
+              (match_operand:VF_HS_NO_M8 2 "register_operand"))
+            (match_operand:<V_DOUBLE_EXTEND> 3 "vector_const_0_operand"))
+        ] UNSPEC_REDUC_SUM_UNORDERED))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+               gen_int_mode (GET_MODE_NUNITS (<MODE>mode), Pmode)};
+  riscv_vector::expand_reduction (UNSPEC_WREDUC_SUM_UNORDERED,
+                                  riscv_vector::REDUCE_OP_M_FRM_DYN,
+                                  ops, CONST0_RTX (<V_DOUBLE_EXTEND_VEL>mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 75ed7ae4f2e..1c10e841692 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -550,13 +550,15 @@ (define_insn_and_split "vcond_mask_<mode><vm>"
         (if_then_else:V_VLS
           (match_operand:<VM> 3 "register_operand")
           (match_operand:V_VLS 1 "nonmemory_operand")
-          (match_operand:V_VLS 2 "register_operand")))]
+          (match_operand:V_VLS 2 "nonmemory_operand")))]
   "TARGET_VECTOR && can_create_pseudo_p ()"
   "#"
   "&& 1"
   [(const_int 0)]
   {
     /* The order of vcond_mask is opposite to pred_merge.  */
+    if (!REG_P (operands[2]))
+      operands[2] = force_reg (<MODE>mode, operands[2]);
     std::swap (operands[1], operands[2]);
     riscv_vector::emit_vlmax_insn (code_for_pred_merge (<MODE>mode),
                                    riscv_vector::MERGE_OP, operands);
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 9ea0bcf15d3..a75b0b485b4 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -337,6 +337,7 @@ enum insn_type : unsigned int
   /* For vreduce, no mask policy operand. */
   REDUCE_OP = __NORMAL_OP_TA | BINARY_OP_P | VTYPE_MODE_FROM_OP1_P,
+  REDUCE_OP_M = __MASK_OP_TA | BINARY_OP_P | VTYPE_MODE_FROM_OP1_P,
   REDUCE_OP_FRM_DYN = REDUCE_OP | FRM_DYN_P | VTYPE_MODE_FROM_OP1_P,
   REDUCE_OP_M_FRM_DYN
   = __MASK_OP_TA | BINARY_OP_P | FRM_DYN_P | VTYPE_MODE_FROM_OP1_P,
-- 
2.41.0
 
 
 
