[LLVMbugs] [Bug 21541] poor codegen for unaligned fixed-size memcpy/memmove

bugzilla-daemon Mon, 01 Dec 2014 11:27:05 -0800

http://llvm.org/bugs/show_bug.cgi?id=21541


Sanjay Patel <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #16 from Sanjay Patel <[email protected]> ---
16-byte codegen for btver2 fixed with:
http://llvm.org/viewvc/llvm-project?view=revision&revision=222925

For the original code example in this bug report using clang built from
r223054, we now generate:

$ ./clang -O3 -fomit-frame-pointer -march=btver2 -c 21541.c -S -o -
    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 10
    .globl    _copy32byte
    .align    4, 0x90
_copy32byte:                            ## @copy32byte
    .cfi_startproc
## BB#0:                                ## %entry
    vmovups    (%rsi), %ymm0
    vmovups    %ymm0, (%rdi)
    vzeroupper
    retq

------------------------------------------------------------------------
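
The test file itself (21541.c) is not quoted in this comment. As a rough sketch only, a function of the following shape would be expected to produce output like the listing above. The name copy32byte comes from the assembly label; the signature, the parameter names, and the use of memcpy with a constant size of 32 are assumptions based on the bug title, not the original source.

```c
#include <string.h>

/* Hypothetical reconstruction: the real 21541.c is not shown in this mail.
 * The pointers carry no alignment guarantee, so the compiler has to use
 * unaligned loads and stores for the fixed 32-byte copy. */
void copy32byte(void *dst, const void *src)
{
    memcpy(dst, src, 32);
}
```

Because the size is a compile-time constant, the compiler can expand the call inline instead of calling the library memcpy, which is what allows a single 32-byte load/store pair when targeting btver2.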

Resolving as fixed since we're using 32-byte memops now. 

I've seen some codegen variability between "vmovups" and "vmovdqu" that I can't
explain yet. Based on my testing and the docs, I don't think there will be any
perf difference between those 2 insts for a simple copy, but if there is, we
should open a new bug.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

_______________________________________________
LLVMbugs mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/llvmbugs
