https://bugs.llvm.org/show_bug.cgi?id=35454

            Bug ID: 35454
           Summary: [PCG] Poor shuffle lane tracking
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
          Assignee: unassignedb...@nondot.org
          Reporter: konstantin.belocha...@sony.com
                CC: llvm-bugs@lists.llvm.org

The compiler produces suboptimal code in the following cases:

///
#include <immintrin.h> // declares __m128i, _mm_shuffle_epi32, _mm_add_epi8/16
__m128i laneTest1_0( __m128i v )
{
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) ); // mirror lanes
    v = _mm_add_epi8( v, v );
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) ); // mirror lanes
    return v;
}

    vpshufd    $27, %xmm0, %xmm0       # xmm0 = xmm0[3,2,1,0]
    vpaddb    %xmm0, %xmm0, %xmm0
    vpshufd    $27, %xmm0, %xmm0       # xmm0 = xmm0[3,2,1,0]
    retq


__m128i laneTest1_1( __m128i v )
{
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) ); // mirror lanes
    v = _mm_add_epi16( v, v );
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) ); // mirror lanes
    return v;
}

    vpshufd    $27, %xmm0, %xmm0       # xmm0 = xmm0[3,2,1,0]
    vpaddw    %xmm0, %xmm0, %xmm0
    vpshufd    $27, %xmm0, %xmm0       # xmm0 = xmm0[3,2,1,0]
    retq

In both cases, the shuffles could be optimized out: the two mirror shuffles
cancel each other, and the add in between is element-wise.
Note that this happens only when the shuffle element type doesn't match the
element type of the arithmetic/binop operation.
In the test cases above, for example, the shuffles operate on packed 32-bit
integers, while the vector add instructions operate on packed 8-bit and 16-bit
integers.
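
As a rough illustration of why the shuffles are redundant, the sketch below
compares each reported function against a folded version that performs only
the add. The *_folded functions, same_bits and sample_input are hypothetical
helpers written for this illustration; they are not taken from LLVM or from
the original report. The point is that an add on 8-bit or 16-bit elements
never crosses a 32-bit lane boundary, so it commutes with a permutation of
whole 32-bit lanes, and applying the mirror permutation twice is the identity.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides _MM_SHUFFLE */
#include <string.h>     /* memcmp */

/* Original form from the report: mirror lanes, add bytes, mirror back. */
static __m128i laneTest1_0( __m128i v )
{
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) );
    v = _mm_add_epi8( v, v );
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) );
    return v;
}

/* Same pattern with a 16-bit add. */
static __m128i laneTest1_1( __m128i v )
{
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) );
    v = _mm_add_epi16( v, v );
    v = _mm_shuffle_epi32( v, _MM_SHUFFLE(0,1,2,3) );
    return v;
}

/* Hypothetical folded forms: what the code could reduce to. */
static __m128i laneTest1_0_folded( __m128i v )
{
    return _mm_add_epi8( v, v );
}

static __m128i laneTest1_1_folded( __m128i v )
{
    return _mm_add_epi16( v, v );
}

/* Hypothetical helper: bitwise comparison of two 128-bit vectors. */
static int same_bits( __m128i a, __m128i b )
{
    return memcmp( &a, &b, sizeof a ) == 0;
}

/* Hypothetical helper: an input whose lanes differ and whose adds wrap. */
static __m128i sample_input( void )
{
    return _mm_setr_epi8(   0,   1,   2,   3,
                          100, 101, 102, 103,
                           -1,  -2,  -3,  -4,
                          127, 126, 125, 124 );
}
```

Calling same_bits( laneTest1_0( x ), laneTest1_0_folded( x ) ) with
x = sample_input() is expected to return nonzero, and likewise for the
laneTest1_1 pair. This checks the equivalence on one input only; it is not a
proof, and it says nothing about how the backend should implement the fold.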
