On 2024/8/13 19:52, Richard Henderson wrote:
On 8/13/24 21:34, LIU Zhiwei wrote:
From: TANG Tiancheng<tangtiancheng....@alibaba-inc.com>
When allocating registers for input and output, ensure they match
the available registers to avoid allocating illegal registers.
We should respect RISC-V vector extension's variable-length registers
and LMUL-based register grouping. Coordinate with the initialization of
tcg_target_available_regs in tcg_target_init (later in this series) to
ensure proper handling of vector register constraints.
Note: While mov_vec doesn't have constraints, dup_vec and other IRs do.
We need to strengthen constraints for all IRs except mov_vec, and this
is sufficient.
Signed-off-by: TANG Tiancheng<tangtiancheng....@alibaba-inc.com>
Fixes: 29f5e92502 (tcg: Introduce paired register allocation)
Reviewed-by: Liu Zhiwei<zhiwei_...@linux.alibaba.com>
---
tcg/tcg.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 34e3056380..d26b42534d 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -4722,8 +4722,10 @@ static void tcg_reg_alloc_dup(TCGContext *s, const TCGOp *op)
return;
}
- dup_out_regs = tcg_op_defs[INDEX_op_dup_vec].args_ct[0].regs;
- dup_in_regs = tcg_op_defs[INDEX_op_dup_vec].args_ct[1].regs;
+ dup_out_regs = tcg_op_defs[INDEX_op_dup_vec].args_ct[0].regs &
+ tcg_target_available_regs[ots->type];
+ dup_in_regs = tcg_op_defs[INDEX_op_dup_vec].args_ct[1].regs &
+ tcg_target_available_regs[its->type];
Why would you ever have constraints that resolve to unavailable
registers?
If you don't want to fix this in the backend, then the next best place
is in process_op_defs(), so that we take care of this once at startup,
and never have to think about it again.
Hi Richard,
The constraints provided in process_op_defs() are static and tied to the
IR operations. For example, if we create constraints for add_vec, the
same constraints will apply to all types of add_vec operations
(TCG_TYPE_V64, TCG_TYPE_V128, TCG_TYPE_V256). This means the constraints
don't change based on the specific type of operation being performed.
In contrast, RISC-V's LMUL (Length Multiplier) can change at runtime
depending on the type of IR operation. Different LMUL values affect
which vector registers are available for use in RISC-V. Let's consider
an example where the host's vector register width is 128 bits:
For an add_vec operation on v256 (256-bit vectors), only even-numbered
vector registers like 0, 2, 4 can be used.
However, for an add_vec operation on v128 (128-bit vectors), all vector
registers (0, 1, 2, etc.) are available.
Thus, if we want to be able to use all of the vector registers, we have
to add a dynamic constraint to register allocation based on the IR type.
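
To make the example concrete, here is a minimal sketch in C of the
type-dependent masking. It is not the QEMU code: VecType, aligned_regs(),
available_regs() and usable_regs() are hypothetical names used only for
illustration. It assumes the 32 RISC-V vector registers and takes the host
vector width as a parameter; any other restriction on register use is
ignored.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the vector IR types; not QEMU's TCGType. */
typedef enum { TYPE_V64, TYPE_V128, TYPE_V256, TYPE_COUNT } VecType;

/* Mask of the 32 vector registers whose index is a multiple of lmul. */
static uint32_t aligned_regs(unsigned lmul)
{
    uint32_t mask = 0;
    for (unsigned r = 0; r < 32; r += lmul) {
        mask |= UINT32_C(1) << r;
    }
    return mask;
}

/*
 * Registers usable for an operation of the given type on a host whose
 * vector registers are host_vlen_bits wide. A type wider than the host
 * register needs a group of LMUL registers, so only group-aligned
 * register numbers qualify. Fractional LMUL is treated as 1 here.
 */
static uint32_t available_regs(VecType type, unsigned host_vlen_bits)
{
    static const unsigned type_bits[TYPE_COUNT] = { 64, 128, 256 };
    unsigned bits = type_bits[type];
    unsigned lmul = bits > host_vlen_bits ? bits / host_vlen_bits : 1;
    return aligned_regs(lmul);
}

/* Intersect a static per-opcode constraint with the per-type mask. */
static uint32_t usable_regs(uint32_t static_ct, VecType type,
                            unsigned host_vlen_bits)
{
    return static_ct & available_regs(type, host_vlen_bits);
}
```

With a 128-bit host, usable_regs(0xffffffff, TYPE_V256, 128) leaves only
the even-numbered registers, while the same static constraint with
TYPE_V128 leaves all 32. The static constraint alone cannot express both.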
Thanks,
Zhiwei
r~