https://bugs.llvm.org/show_bug.cgi?id=35454
Bug ID: 35454
Summary: [PCG] Poor shuffle lane tracking
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedb...@nondot.org
Reporter: konstantin.belocha...@sony.com
CC: llvm-bugs@lists.llvm.org
The compiler produces suboptimal code in the following cases:
///
__m128i laneTest1_0( __m128i v )
{
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) ); // mirror lanes
    v = _mm_add_epi8( v, v );
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) ); // mirror lanes
    return v;
}

    vpshufd $27, %xmm0, %xmm0       # xmm0 = xmm0[3,2,1,0]
    vpaddb  %xmm0, %xmm0, %xmm0
    vpshufd $27, %xmm0, %xmm0       # xmm0 = xmm0[3,2,1,0]
    retq

__m128i laneTest1_1( __m128i v )
{
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) ); // mirror lanes
    v = _mm_add_epi16( v, v );
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) ); // mirror lanes
    return v;
}

    vpshufd $27, %xmm0, %xmm0       # xmm0 = xmm0[3,2,1,0]
    vpaddw  %xmm0, %xmm0, %xmm0
    vpshufd $27, %xmm0, %xmm0       # xmm0 = xmm0[3,2,1,0]
    retq

In both cases, the shuffles could be optimized out: the two mirror shuffles
cancel each other, and the add is element-wise, so it does not depend on
lane order.
Note that this happens only when the shuffle element type doesn't match that
of the arithmetic/binop operation.
For example, in the test cases above, the shuffles operate on packed 32-bit
integer values, while the vector add instructions operate on packed 8/16-bit
values.
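
As a rough check that dropping the shuffles is safe, the sketch below compares the pattern from laneTest1_0 with a plain byte add. The names lane_test_shuffled, lane_test_direct and lanes_agree are made up for this illustration and are not part of the report or of LLVM. It assumes an x86 target with SSE2. The same comparison should hold with _mm_add_epi16 in place of _mm_add_epi8.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides _MM_SHUFFLE */
#include <string.h>     /* memcmp */

/* Pattern from the report: mirror 32-bit lanes, add bytes, mirror back. */
static __m128i lane_test_shuffled(__m128i v)
{
    v = _mm_shuffle_epi32(v, _MM_SHUFFLE(0, 1, 2, 3)); /* mirror lanes */
    v = _mm_add_epi8(v, v);
    v = _mm_shuffle_epi32(v, _MM_SHUFFLE(0, 1, 2, 3)); /* mirror lanes */
    return v;
}

/* Hypothetical simplified form: the same add with both shuffles removed. */
static __m128i lane_test_direct(__m128i v)
{
    return _mm_add_epi8(v, v);
}

/* Returns 1 if both forms produce identical results for 16 input bytes. */
static int lanes_agree(const unsigned char in[16])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    __m128i a = lane_test_shuffled(v);
    __m128i b = lane_test_direct(v);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Each byte stays inside the 32-bit lane it travels with, and the byte add never carries across element boundaries, so the two forms should agree for any input.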