================
@@ -97,3 +157,16 @@ define <4 x double> @shuffle_v4f64(<4 x double> %a) {
   %shuffle = shufflevector <4 x double> %a, <4 x double> poison, <4 x i32> <i32 3, i32 1, i32 2, i32 0>
   ret <4 x double> %shuffle
 }
+
+define <4 x double> @shuffle_v4f64_same_lane(<4 x double> %a) {
+; CHECK-LABEL: shuffle_v4f64_same_lane:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    pcalau12i $a0, %pc_hi20(.LCPI11_0)
+; CHECK-NEXT:    xvld $xr1, $a0, %pc_lo12(.LCPI11_0)
+; CHECK-NEXT:    xvpermi.d $xr0, $xr0, 78
+; CHECK-NEXT:    xvshuf.d $xr1, $xr0, $xr0
+; CHECK-NEXT:    xvori.b $xr0, $xr1, 0
----------------
zhaoqi5 wrote:

Yes, a single `xvpermi.d` is enough for this case.

But currently, when we legalize `vector_shuffle` for LASX, we first call 
`canonicalizeShuffleVectorByLane()` to convert the source vector and avoid 
cross-lane access, which is what generates `xvpermi.d $xr0, $xr0, 78`. The 
converted vector is then matched against the LASX shuffle instruction patterns, 
and `xvshuf` matches.

To avoid this, we may need to change the order of the current processing logic. 
A pattern for `xvpermi.d` should also be implemented.
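
As a rough illustration of what such a pattern would need to compute, here is a 
small Python model (not LLVM code). It assumes that `xvpermi.d` fills result 
element `i` from source element `(imm >> (2 * i)) & 3`; the helper names 
`xvpermi_d_imm` and `apply_xvpermi_d` are hypothetical and exist only for this 
sketch.

```python
def xvpermi_d_imm(mask):
    """Encode a 4-element, single-source shuffle mask as an 8-bit immediate.

    Assumption: result element i reads source element (imm >> (2 * i)) & 3.
    """
    if len(mask) != 4 or any(not 0 <= m < 4 for m in mask):
        raise ValueError("expected four indices in the range 0..3")
    imm = 0
    for i, m in enumerate(mask):
        # Each result element gets a 2-bit selector field.
        imm |= m << (2 * i)
    return imm


def apply_xvpermi_d(src, imm):
    """Model the permutation under the same assumption as above."""
    return [src[(imm >> (2 * i)) & 3] for i in range(4)]


# Swapping the two 128-bit halves corresponds to the mask <2, 3, 0, 1>.
print(xvpermi_d_imm([2, 3, 0, 1]))                    # 78
print(apply_xvpermi_d(["a0", "a1", "a2", "a3"], 78))  # ['a2', 'a3', 'a0', 'a1']
```

Under this model, the immediate 78 (`0b01001110`) in the generated code is the 
lane swap, and any single-source `v4f64`/`v4i64` mask with all indices in 0..3 
maps to one immediate, so it could be matched before the lane canonicalization.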

https://github.com/llvm/llvm-project/pull/151633
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
