================
@@ -1093,24 +1093,34 @@ __m256d test_mm256_hadd_pd(__m256d A, __m256d B) {
   // CHECK: call {{.*}}<4 x double> @llvm.x86.avx.hadd.pd.256(<4 x double> 
%{{.*}}, <4 x double> %{{.*}})
   return _mm256_hadd_pd(A, B);
 }
+TEST_CONSTEXPR(match_m256d(_mm256_hadd_pd((__m256d){1.0, 2.0, 3.0, 4.0}, 
(__m256d){5.0, 6.0, 7.0, 8.0}), 3.0, 7.0, 11.0, 15.0));
----------------
RKSimon wrote:

The result should be 3.0, 11.0, 7.0, 15.0 - hop are performed at the 128-bit 
level (so lhs hops then rhs hops per 128) -(interp__builtin_x86_pack has to do 
something similar):
```
dst[63:0] := a[127:64] + a[63:0]
dst[127:64] := b[127:64] + b[63:0]
dst[191:128] := a[255:192] + a[191:128]
dst[255:192] := b[255:192] + b[191:128]
dst[MAX:256] := 0
```

https://github.com/llvm/llvm-project/pull/156822
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to