http://llvm.org/bugs/show_bug.cgi?id=17090

            Bug ID: 17090
           Summary: Poor codegen for loading and storing arrays
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: normal
          Priority: P
         Component: Common Code Generator Code
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
    Classification: Unclassified

I've noticed that loading and storing an LLVM array value produces quite slow
code: each element of the array is loaded and stored individually (MOVSS).
Copying the array with a memcpy intrinsic, by contrast, uses MOVUPS instead.

Here is the IR:

; ModuleID = 'WinterModule'
target datalayout = "e-p:64:64:64-S128-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f16:16:16-f32:32:32-f64:64:64-f128:128:128-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"

; Function Attrs: nounwind
define void @"main(array<float, 4>, array<float, 4>)"([4 x float]* noalias nocapture sret %ret, [4 x float]* noalias nocapture %a, [4 x float]* noalias nocapture %b, i32* nocapture %hidden) #0 {
entry:
  %0 = load [4 x float]* %a, align 4
  store [4 x float] %0, [4 x float]* %ret, align 4
  ret void
}

attributes #0 = { nounwind }



And here is the produced code:


# After PreEmit passes:
# Machine code for function main(array<float, 4>, array<float, 4>): Post SSA, not tracking liveness
Function Live Ins: %RCX in %vreg0, %RDX in %vreg1

BB#0: derived from LLVM BB %entry
    Live Ins: %RCX %RDX
        %XMM0<def> = MOVSSrm %RDX, 1, %noreg, 0, %noreg; mem:LD4[%a]
        %XMM1<def> = MOVSSrm %RDX, 1, %noreg, 4, %noreg; mem:LD4[%a+4]
        %XMM2<def> = MOVSSrm %RDX, 1, %noreg, 8, %noreg; mem:LD4[%a+8]
        %XMM3<def> = MOVSSrm %RDX<kill>, 1, %noreg, 12, %noreg; mem:LD4[%a+12]
        MOVSSmr %RCX, 1, %noreg, 12, %noreg, %XMM3<kill>; mem:ST4[%ret+12]
        MOVSSmr %RCX, 1, %noreg, 8, %noreg, %XMM2<kill>; mem:ST4[%ret+8]
        MOVSSmr %RCX, 1, %noreg, 4, %noreg, %XMM1<kill>; mem:ST4[%ret+4]
        MOVSSmr %RCX, 1, %noreg, 0, %noreg, %XMM0<kill>; mem:ST4[%ret]
        %RAX<def> = MOV64rr %RCX<kill>
        RET %RAX<kill>

# End machine code for function main(array<float, 4>, array<float, 4>).


This is with LLVM 3.3.
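
For comparison, the memcpy-based copy mentioned above might look roughly like
the sketch below. This is an assumption about the shape of that variant, not
the exact IR that was tested: the function name @copy_with_memcpy is
hypothetical, the unused %b and %hidden parameters are dropped, and the
intrinsic is written with the LLVM 3.3-era signature (explicit i32 alignment
and i1 volatile arguments).

```llvm
; Hypothetical memcpy-based equivalent of the array load/store above.
; Function Attrs: nounwind
define void @copy_with_memcpy([4 x float]* noalias nocapture sret %ret, [4 x float]* noalias nocapture %a) #0 {
entry:
  ; The intrinsic takes i8* operands, so cast both array pointers first.
  %dst = bitcast [4 x float]* %ret to i8*
  %src = bitcast [4 x float]* %a to i8*
  ; Copy 16 bytes (4 x float) with 4-byte alignment, non-volatile.
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 16, i32 4, i1 false)
  ret void
}

declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i32, i1) #0

attributes #0 = { nounwind }
```

Both versions copy the same 16 bytes with the same 4-byte alignment, so one
would expect the array load/store to be lowered to the same unaligned vector
moves as the memcpy.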
