================
@@ -3978,14 +3983,21 @@ class sme2_luti6_vector_vg4_base<RegisterOperand zd_ty, string asm>
 }
 
 class sme2_luti6_vector_vg4_consecutive<string asm>
-  : sme2_luti6_vector_vg4_base<ZZZZ_h_mul_r, asm> {
+  : sme2_luti6_vector_vg4_base<ZZZZ_h_mul_r, ZZ_Any, asm> {
+  let Inst{15-10} = 0b111101;
+  let Inst{4-2}   = Zd;
+  let Inst{1-0}   = 0b00;
+}
+
+class sme2_luti6_vector_vg4_consecutive_x3<string asm>
----------------
jthackray wrote:

Agreed. I re-read Claudio's intention when he proposed the `_u8_x3` ACLE 
change: we need to select either the bottom or the top pair of indexes and map 
that pair onto the `luti6` instruction, i.e.
```
  imm_idx == 0 -> use index[0], index[1]
  imm_idx == 1 -> use index[1], index[2]
```
I've updated `AArch64DAGToDAGISel::SelectMultiVectorLuti6LaneX4()` for both 
`_u8_x2` and `_u8_x3` intrinsics. The `luti6` instruction still only takes a 
2-register `Zm` operand, so both the `_u8_x2` and `_u8_x3` ACLE forms lower to 
`LUTI6_4Z2Z2ZI`.

For the `_u8_x3` form, the operands are:
```
  operand 1: table0
  operand 2: table1
  operand 3: index0
  operand 4: index1
  operand 5: index2
  operand 6: imm
```
so the selector picks operands 3/4 (`imm_idx == 0`) or operands 4/5 
(`imm_idx == 1`), with operand 6 as `imm`. For `_u8_x2` it always picks 
operands 3/4, and operand 5 is `imm`.
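
To make the mapping concrete, here is a minimal standalone sketch of the operand choice described above. It is not the code in `AArch64DAGToDAGISel::SelectMultiVectorLuti6LaneX4()`; the names `Luti6Operands` and `pickLuti6Operands` are hypothetical, and it assumes that only `imm_idx` values 0 and 1 are valid for the `_u8_x3` form.
```cpp
#include <cassert>

// Hypothetical illustration only; not the actual LLVM selector code.
// Operand numbers follow the layout listed above:
// 1 = table0, 2 = table1, 3.. = index vectors, last = imm.
struct Luti6Operands {
  unsigned FirstIndexOp;  // operand number of the first index vector for Zm
  unsigned SecondIndexOp; // operand number of the second index vector for Zm
  unsigned ImmOp;         // operand number of the immediate
};

// NumIndexVecs is 2 for the _u8_x2 form and 3 for the _u8_x3 form.
inline Luti6Operands pickLuti6Operands(unsigned NumIndexVecs, unsigned ImmIdx) {
  assert((NumIndexVecs == 2 || NumIndexVecs == 3) &&
         "expected the _u8_x2 or _u8_x3 form");

  const unsigned FirstIndexOperand = 3; // operands 1 and 2 are the tables
  unsigned Offset = 0;                  // _u8_x2 always uses operands 3/4

  if (NumIndexVecs == 3) {
    // Assumption: imm_idx is restricted to 0 or 1 for the _u8_x3 form.
    assert(ImmIdx < 2 && "imm_idx out of range for _u8_x3");
    Offset = ImmIdx; // 0 -> index[0], index[1]; 1 -> index[1], index[2]
  }

  return {FirstIndexOperand + Offset, FirstIndexOperand + Offset + 1,
          FirstIndexOperand + NumIndexVecs};
}
```
In this sketch both forms produce a two-vector index pair, which matches the point that `luti6` only takes a 2-register `Zm` operand.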

https://github.com/llvm/llvm-project/pull/187046
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
