http://llvm.org/bugs/show_bug.cgi?id=11289

             Bug #: 11289
           Summary: Inefficient x86 vector code generation for compare
                    less than 0
           Product: libraries
           Version: trunk
          Platform: Macintosh
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
        AssignedTo: [email protected]
        ReportedBy: [email protected]
                CC: [email protected]
    Classification: Unclassified


The x86 code generated for a vector comparison against zero is inefficient.

For example, the C code below creates a mask in which each result vector
element is equal to -1 if the corresponding input vector element is negative,
and 0 otherwise.

#include <emmintrin.h>

void test(__m128i* p)
{
  *p = _mm_cmplt_epi8(*p, _mm_setzero_si128());
}
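
For reference, the intended per-byte result can be sketched in plain scalar C.
This is only an illustration of the semantics described above, not code from
the report; the function name cmplt_zero_ref is hypothetical.

```c
#include <stdint.h>

/* Hypothetical scalar reference (not part of the original report):
   each output byte is -1 (all bits set) when the corresponding input
   byte is negative, and 0 otherwise. */
void cmplt_zero_ref(const int8_t in[16], int8_t out[16])
{
  for (int i = 0; i < 16; ++i)
    out[i] = (in[i] < 0) ? -1 : 0;
}
```

Calling cmplt_zero_ref on the 16 bytes of a vector should give the same bytes
that the _mm_cmplt_epi8 call in test() stores back through p.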

Using http://llvm.org/demo/index.cgi, the generated LLVM assembly is:

; ModuleID = '/tmp/webcompile/_11830_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"

define void @test(<2 x i64>* nocapture %p) nounwind {
  %1 = load <2 x i64>* %p, align 16, !tbaa !0
  %2 = bitcast <2 x i64> %1 to <16 x i8>
  %.lobit.i.i = ashr <16 x i8> %2, <i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7, i8 7>
  %3 = bitcast <16 x i8> %.lobit.i.i to <2 x i64>
  store <2 x i64> %3, <2 x i64>* %p, align 16, !tbaa !0
  ret void
}

!0 = metadata !{metadata !"omnipotent char", metadata !1}
!1 = metadata !{metadata !"Simple C/C++ TBAA", null}

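Here the comparison < 0 has been replaced by an arithmetic shift right by 7
(>> 7), which copies the sign bit of each byte into all eight bit positions.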
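
The equivalence between the comparison and the shift can be checked
exhaustively for a single byte. The sketch below is an illustration only; the
helper names are hypothetical, and it assumes the C compiler implements >> on
negative signed values as an arithmetic shift (implementation-defined in C,
but the behavior of both GCC and Clang).

```c
#include <stdint.h>

/* Mask computed the way the source code expresses it: compare against 0. */
static int8_t mask_by_compare(int8_t x)
{
  return (x < 0) ? -1 : 0;
}

/* Mask computed the way the optimized IR expresses it: shift right by 7.
   Assumption: >> on a negative signed value is an arithmetic shift. */
static int8_t mask_by_shift(int8_t x)
{
  return (int8_t)(x >> 7);
}

/* Returns how many of the 256 possible byte values disagree. */
int count_mismatches(void)
{
  int mismatches = 0;
  for (int v = -128; v <= 127; ++v)
    if (mask_by_compare((int8_t)v) != mask_by_shift((int8_t)v))
      ++mismatches;
  return mismatches;
}
```

Under the stated assumption, count_mismatches() returns 0, so the two forms
produce the same mask for every byte value.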

Alternatively, the vector comparison could be written as:

define <16 x i8> @test(<16 x i8> %a) {
       %b = icmp sgt <16 x i8> zeroinitializer, %a
       %c = sext <16 x i1> %b to <16 x i8>
       ret <16 x i8> %c
}

In both cases the generated code is convoluted: individual vector elements
are extracted using pextrb instructions, shifted using sarb instructions, and
reinserted using pinsrb instructions.

This was tested using r143475 from trunk.

