http://llvm.org/bugs/show_bug.cgi?id=21541
Sanjay Patel <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #16 from Sanjay Patel <[email protected]> ---
16-byte codegen for btver2 fixed with:
http://llvm.org/viewvc/llvm-project?view=revision&revision=222925

For the original code example in this bug report using clang built from
r223054, we now generate:

$ ./clang -O3 -fomit-frame-pointer -march=btver2 -c 21541.c -S -o -
    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 10
    .globl    _copy32byte
    .align    4, 0x90
_copy32byte:                            ## @copy32byte
    .cfi_startproc
## BB#0:                                ## %entry
    vmovups    (%rsi), %ymm0
    vmovups    %ymm0, (%rdi)
    vzeroupper
    retq

------------------------------------------------------------------------

Resolving as fixed since we're using 32-byte memops now.

I've seen some codegen variability between "vmovups" and "vmovdqu" that I
can't explain yet. I don't think there will be any perf difference between
those 2 insts for a simple copy based on my testing or the docs, but if there
is, we should open a new bug.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
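
For readers without the bug report at hand: the source file 21541.c is not
quoted in this message, so the following is only a hypothetical stand-in for
it. It assumes the function is a fixed-size 32-byte copy with the destination
as the first argument and the source as the second, which is consistent with
the (%rsi) load and (%rdi) store in the listing above. The actual test case in
the bug report may differ.

```c
#include <string.h>

/* Hypothetical stand-in for copy32byte from 21541.c; the original source
   is not included in this message, so the signature is an assumption. */
void copy32byte(void *dst, const void *src)
{
    /* Fixed-size 32-byte copy; an optimizing compiler may lower this
       call to inline vector loads and stores instead of calling memcpy. */
    memcpy(dst, src, 32);
}
```

Compiling a file like this with the command line shown in the comment is one
way to check which memop width a given clang revision selects for btver2.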
_______________________________________________
LLVMbugs mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/llvmbugs
