Issue 71763
Summary [NEON] Wrong result of NEON intrinsic `vld2q_dup_p16` with the `-march=armv7-a` flag in CLANG-15.
Labels new issue
Assignees
Reporter yyctw
    According to the ARM documentation https://developer.arm.com/architectures/instruction-sets/intrinsics/vld2q_dup_p16, `vld2q_dup_p16` loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers. 
The expected result should be as follows:
```
poly16_t a[2] = {1, 3};
poly16x8x2_t r = vld2q_dup_p16(a);
// The value of r {1, 1, 1, 1, 1, 1, 1, 1,
//                 3, 3, 3, 3, 3, 3, 3, 3}
```
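The documented behavior can be written as a small scalar model, which gives a lane-by-lane reference for comparing the intrinsic's output. This is only a sketch in portable C, not the intrinsic itself: the type `model_u16x8x2` and the function `model_vld2q_dup_u16` are hypothetical names introduced for illustration, and `uint16_t` stands in for `poly16_t`.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for poly16x8x2_t: two vectors of 8 lanes each.
 * uint16_t is used in place of poly16_t (an assumption for portability). */
typedef struct {
    uint16_t val[2][8];
} model_u16x8x2;

/* Scalar model of the documented load-and-replicate behavior:
 * ptr[0] is copied to every lane of val[0], ptr[1] to every lane of val[1]. */
static model_u16x8x2 model_vld2q_dup_u16(const uint16_t *ptr) {
    model_u16x8x2 r;
    for (int lane = 0; lane < 8; ++lane) {
        r.val[0][lane] = ptr[0];
        r.val[1][lane] = ptr[1];
    }
    return r;
}
```

On an ARM target, the two vectors returned by `vld2q_dup_p16` could be stored to arrays and compared against this model lane by lane.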
However, with the flags `-march=armv7-a` and `-mfpu=neon`, `vld2q_dup_p16` in Clang 15 produces the following result:
```
poly16_t a[2] = {1, 3};
poly16x8x2_t r = vld2q_dup_p16(a);
// The value of r {3, 3, 3, 3, 1, 1, 1, 1,
//                 0, 0, 0, 0, 3, 3, 3, 3}
```
This issue also occurs in `vld2q_dup_p8`.

Reproducer: https://godbolt.org/z/6ETT3rjKo
Clang version:
```
Debian clang version 15.0.7
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```
Thank you for reading.
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
