https://llvm.org/bugs/show_bug.cgi?id=24678

            Bug ID: 24678
           Summary: are overlapping memory accesses optimal?
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Common Code Generator Code
          Assignee: [email protected]
          Reporter: [email protected]
                CC: [email protected]
    Classification: Unclassified

I'm not sure if this is a performance bug, but I'm filing it for further review
based on the discussion in D12543:
http://reviews.llvm.org/D12543

The code in SelectionDAG's FindOptimalMemOpLowering() can generate overlapping
accesses when the target reports unaligned memops as fast, but it's not clear
whether overlapping is good for performance on all targets.

Example:

$ cat copy13bytes.c 
#include <string.h>
void foo(char *a, char *b) {
    memcpy(a, b, 13);
}

$ clang copy13bytes.c -S -o - -O2
...
    movq    (%rsi), %rax
    movq    5(%rsi), %rcx
    movq    %rcx, 5(%rdi)
    movq    %rax, (%rdi)


$ gcc copy13bytes.c -S -o - -O2
...
    movq    (%rsi), %rax
    movq    %rax, (%rdi)
    movl    8(%rsi), %eax
    movl    %eax, 8(%rdi)
    movzbl    12(%rsi), %eax
    movb    %al, 12(%rdi)

Note that any load/store in either case may be misaligned. In the clang case,
at least one load/store pair is guaranteed to be misaligned, because the two
8-byte accesses sit at offsets 0 and 5 and cannot both be 8-byte aligned. LLVM
chooses overlapping ops to reduce the instruction count.
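
For anyone who wants to benchmark the two strategies directly, here is a rough
C sketch of both lowerings. The function names are made up for illustration
and are not anything in LLVM or gcc. The fixed-size memcpy calls are only a
portable way to express unaligned 8/4-byte loads and stores, on the assumption
that the compiler turns each one into a single mov.

```c
#include <stdint.h>
#include <string.h>

/* Overlapping strategy (hypothetical helper): two 8-byte moves covering
   bytes [0,8) and [5,13). Both loads happen before either store, which
   mirrors the clang output above. Bytes 5..7 are written twice. */
static void copy13_overlapping(char *dst, const char *src) {
    uint64_t lo, hi;
    memcpy(&lo, src, 8);       /* movq (%rsi), %rax  */
    memcpy(&hi, src + 5, 8);   /* movq 5(%rsi), %rcx */
    memcpy(dst + 5, &hi, 8);   /* movq %rcx, 5(%rdi) */
    memcpy(dst, &lo, 8);       /* movq %rax, (%rdi)  */
}

/* Non-overlapping strategy (hypothetical helper): 8 + 4 + 1 bytes,
   which mirrors the gcc output above. */
static void copy13_split(char *dst, const char *src) {
    uint64_t q;
    uint32_t d;
    memcpy(&q, src, 8);
    memcpy(dst, &q, 8);
    memcpy(&d, src + 8, 4);
    memcpy(dst + 8, &d, 4);
    dst[12] = src[12];
}

/* Returns 1 if both strategies produce the same 13 bytes as memcpy. */
static int copy13_strategies_agree(void) {
    const char src[14] = "abcdefghijklm";
    char ref[13], a[13], b[13];
    memcpy(ref, src, 13);
    copy13_overlapping(a, src);
    copy13_split(b, src);
    return memcmp(ref, a, 13) == 0 && memcmp(ref, b, 13) == 0;
}
```

Both versions copy the same 13 bytes, so any timing difference between them
should come down to how a given target handles the overlapping, misaligned
accesses.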

