================
@@ -3978,14 +3983,21 @@ class sme2_luti6_vector_vg4_base<RegisterOperand zd_ty,
string asm>
}
class sme2_luti6_vector_vg4_consecutive<string asm>
- : sme2_luti6_vector_vg4_base<ZZZZ_h_mul_r, asm> {
+ : sme2_luti6_vector_vg4_base<ZZZZ_h_mul_r, ZZ_Any, asm> {
+ let Inst{15-10} = 0b111101;
+ let Inst{4-2} = Zd;
+ let Inst{1-0} = 0b00;
+}
+
+class sme2_luti6_vector_vg4_consecutive_x3<string asm>
----------------
jthackray wrote:
Agreed. Re-reading Claudio's intention when he proposed the `_u8_x3` ACLE
change: we need to select either the bottom or the top pair of indexes and map
it onto the `luti6` instruction, i.e.
```
imm_idx == 0 -> use index[0], index[1]
imm_idx == 1 -> use index[1], index[2]
```
I've updated `AArch64DAGToDAGISel::SelectMultiVectorLuti6LaneX4()` for both
`_u8_x2` and `_u8_x3` intrinsics. The `luti6` instruction still only takes a
2-register `Zm` operand, so both the `_u8_x2` and `_u8_x3` ACLE forms lower to
`LUTI6_4Z2Z2ZI`.
For the `_u8_x3` form, the operands are:
```
operand 1: table0
operand 2: table1
operand 3: index0
operand 4: index1
operand 5: index2
operand 6: imm
```
so the selector picks operands 3/4 (`imm_idx == 0`) or operands 4/5
(`imm_idx == 1`), with operand 6 as `imm`. For `_u8_x2` it always picks
operands 3/4, and operand 5 is `imm`.
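To make the operand choice concrete, here is a minimal standalone sketch of the mapping described above. It is not the code in `SelectMultiVectorLuti6LaneX4()`: the names `Luti6Form`, `Luti6Operands` and `pickLuti6Operands` are hypothetical, it uses no LLVM types, and it assumes `imm_idx` is the value carried by the `imm` operand and that only 0 and 1 are valid for the `_u8_x3` form.

```cpp
#include <stdexcept>

// Hypothetical model of the operand choice only; not the actual selector code.
enum class Luti6Form { U8X2, U8X3 };

// Operand numbers (as listed above) that feed the single luti6 instruction.
struct Luti6Operands {
  unsigned Index0; // operand number of the first register of the Zm pair
  unsigned Index1; // operand number of the second register of the Zm pair
  unsigned Imm;    // operand number of the immediate
};

// Assumption: ImmIdx is the constant value of the imm operand.
Luti6Operands pickLuti6Operands(Luti6Form Form, unsigned ImmIdx) {
  // _u8_x2: only two index vectors exist, so the pair is always operands 3/4
  // and the immediate is operand 5.
  if (Form == Luti6Form::U8X2)
    return {3, 4, 5};

  // _u8_x3: imm_idx == 0 picks index[0]/index[1] (operands 3/4),
  // imm_idx == 1 picks index[1]/index[2] (operands 4/5); imm is operand 6.
  if (ImmIdx > 1)
    throw std::out_of_range("imm_idx must be 0 or 1 for the _u8_x3 form");
  return {3 + ImmIdx, 4 + ImmIdx, 6};
}
```

With this model, `pickLuti6Operands(Luti6Form::U8X3, 1)` yields operands 4/5 with operand 6 as the immediate, while any `_u8_x2` call yields operands 3/4 with operand 5.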
https://github.com/llvm/llvm-project/pull/187046
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits