http://llvm.org/bugs/show_bug.cgi?id=17654
Bug ID: 17654
Summary: [AVX] scalarized code generated for zext of <16 x i8> to <16 x i16>
Product: new-bugs
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected]
Classification: Unclassified
With top of tree and "llc -mattr=+avx", this code
define <16 x i16> @bar(<16 x i8> %v) {
  %v16 = zext <16 x i8> %v to <16 x i16>
  ret <16 x i16> %v16
}
generates a long sequence of vpextrb/vpinsrw instructions.
However, if I manually break the vector into two <8 x i8> vectors, do two
zexts, and then reassemble the 16-wide vector, like this:
define <16 x i16> @bat(<16 x i8> %v) {
  %va = shufflevector <16 x i8> %v, <16 x i8> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %vb = shufflevector <16 x i8> %v, <16 x i8> undef, <8 x i32> <i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  %va16 = zext <8 x i8> %va to <8 x i16>
  %vb16 = zext <8 x i8> %vb to <8 x i16>
  %v16 = shufflevector <8 x i16> %va16, <8 x i16> %vb16,
         <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7,
                     i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  ret <16 x i16> %v16
}
then I get the nice output:
    vpunpckhbw %xmm0, %xmm0, %xmm1 # xmm1 = xmm0[8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15]
    vmovdqa .LCPI2_0(%rip), %xmm2
    vpand %xmm2, %xmm1, %xmm1
    vpmovzxbw %xmm0, %xmm0
    vpand %xmm2, %xmm0, %xmm0
    vinsertf128 $1, %xmm1, %ymm0, %ymm0
It'd be nice if the first implementation (@bar) produced this output as well.
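For anyone checking that the two functions really are interchangeable, here is a rough Python model of the lane semantics. It is only a sketch: the helper names are made up for illustration, and the contents of .LCPI2_0 are not shown in the output above, so the 0x00FF-per-word mask is an assumption.

```python
# Rough lane-level model; vectors are Python lists of unsigned byte values.
# All function names here are hypothetical and only mirror the IR above.

def zext_i8_to_i16(v):
    """Model zext <N x i8> to <N x i16>: widen each unsigned byte to 16 bits."""
    assert all(0 <= b <= 0xFF for b in v)
    return [b & 0xFFFF for b in v]

def bar(v):
    """Model @bar: one 16-wide zext."""
    assert len(v) == 16
    return zext_i8_to_i16(v)

def bat(v):
    """Model @bat: split into two 8-wide halves, zext each, concatenate."""
    assert len(v) == 16
    va, vb = v[:8], v[8:]
    return zext_i8_to_i16(va) + zext_i8_to_i16(vb)

def unpack_high_and_mask(v, mask=0x00FF):
    """Model vpunpckhbw %xmm0, %xmm0 followed by vpand on the high half.

    Interleaving the high 8 bytes with themselves gives bytes
    [8,8,9,9,...]; read as little-endian 16-bit words, each word is
    b | (b << 8). ASSUMPTION: the constant-pool mask is 0x00FF per word.
    """
    assert len(v) == 16
    words = [b | (b << 8) for b in v[8:]]
    return [w & mask for w in words]

if __name__ == "__main__":
    v = list(range(240, 256))
    print(bar(v) == bat(v))
    print(unpack_high_and_mask(v) == bar(v)[8:])
```

Under that assumption the masked unpack yields the same words as a zext of the upper eight bytes, which is why the split form is a valid lowering for the 16-wide zext.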