[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 Martin Liška changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #22 from Martin Liška --- Implemented.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #21 from Martin Liška --- Author: marxin Date: Mon May 20 07:55:00 2019 New Revision: 271400 URL: https://gcc.gnu.org/viewcvs?rev=271400&root=gcc&view=rev Log: Come up with hook libc_has_fast_function (PR middle-end/90263). 2019-05-20 Martin Liska PR middle-end/90263 * builtins.c (expand_builtin_memory_copy_args): When having a target with fast mempcpy implementation do now use memcpy. * config/i386/i386.c (ix86_libc_has_fast_function): New. (TARGET_LIBC_HAS_FAST_FUNCTION): Likewise. * doc/tm.texi: Likewise. * doc/tm.texi.in: Likewise. * target.def: * expr.c (emit_block_move_hints): Add 2 new arguments. * expr.h (emit_block_move_hints): Bail out when libcall to memcpy would be used. 2019-05-20 Martin Liska PR middle-end/90263 * gcc.c-torture/compile/pr90263.c: New test. * lib/target-supports.exp: Add check_effective_target_glibc. Added: trunk/gcc/testsuite/gcc.c-torture/compile/pr90263.c Modified: trunk/gcc/ChangeLog trunk/gcc/builtins.c trunk/gcc/config/i386/i386.c trunk/gcc/doc/tm.texi trunk/gcc/doc/tm.texi.in trunk/gcc/expr.c trunk/gcc/expr.h trunk/gcc/target.def trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/lib/target-supports.exp
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #20 from Wilco --- (In reply to Martin Liška from comment #19) > Created attachment 46265 [details] > Patch candidate v2 > > Update patch that should be fine. Tests on x86_64 work except: > FAIL: gcc.c-torture/execute/builtins/mempcpy.c execution, -O1 > > Where the test expects that mempcpy without LHS should be transformed into > memcpy. Thanks, this version works as expected on AArch64. In principle we could keep mempcpy for tailcalls with -Os, but I'm not sure it's worth the trouble.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 Martin Liška changed: What|Removed |Added Attachment #46262|0 |1 is obsolete|| --- Comment #19 from Martin Liška --- Created attachment 46265 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46265&action=edit Patch candidate v2 Update patch that should be fine. Tests on x86_64 work except: FAIL: gcc.c-torture/execute/builtins/mempcpy.c execution, -O1 Where the test expects that mempcpy without LHS should be transformed into memcpy.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 Martin Liška changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-04-30 Assignee|unassigned at gcc dot gnu.org |marxin at gcc dot gnu.org Target Milestone|--- |10.0 Ever confirmed|0 |1
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #18 from Wilco --- (In reply to Martin Liška from comment #14) > Created attachment 46262 [details] > Patch candidate > > Patch candidate that handles: > > $ cat ~/Programming/testcases/mempcpy.c > int *mempcopy2 (int *p, int *q, long n) > { > return __builtin_mempcpy (p, q, n); > } > > $ ./xgcc -B. -O2 -S ~/Programming/testcases/mempcpy.c -o/dev/stdout > ... > mempcopy2: > .LFB0: > .cfi_startproc > pushq %rbx > .cfi_def_cfa_offset 16 > .cfi_offset 3, -16 > movq%rdx, %rbx > callmemcpy > addq%rbx, %rax > popq%rbx > .cfi_def_cfa_offset 8 > ret > ... > > There's a single failing test: gcc.dg/20050503-1.c On AArch64 it calls mempcpy but memcpy on x86, which is the wrong way around... - if (retmode == RETURN_END && target != const0_rtx) + if (!TARGET_HAS_FAST_MEMPCPY_ROUTINE && retmode == RETURN_END && target != const0_rtx) When I uncomment this, it uses memcpy. Also: +TARGET_HAS_FAST_MEMPCPY_ROUTINE +&& retmode == RETURN_END_MINUS_ONE, +&is_move_done); Should that be RETURN_END? And is_move_done isn't used, is that right?
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #17 from Wilco --- (In reply to Wilco from comment #16) > (In reply to Martin Sebor from comment #15) > > I just noticed I have been misreading mempcpy as memccpy and so making no > > sense. Sorry about that! Please ignore my comments. > > I see, yes we have too many and the names are too similar. Now all we need > is someone to propose strlpcpy or stplcpy... https://github.com/fishilico/shared/blob/master/linux/nolibc/uname.c Aargh...
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #16 from Wilco --- (In reply to Martin Sebor from comment #15) > I just noticed I have been misreading mempcpy as memccpy and so making no > sense. Sorry about that! Please ignore my comments. I see, yes we have too many and the names are too similar. Now all we need is someone to propose strlpcpy or stplcpy...
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #15 from Martin Sebor --- I just noticed I have been misreading mempcpy as memccpy and so making no sense. Sorry about that! Please ignore my comments.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #14 from Martin Liška --- Created attachment 46262 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46262&action=edit Patch candidate Patch candidate that handles: $ cat ~/Programming/testcases/mempcpy.c int *mempcopy2 (int *p, int *q, long n) { return __builtin_mempcpy (p, q, n); } $ ./xgcc -B. -O2 -S ~/Programming/testcases/mempcpy.c -o/dev/stdout ... mempcopy2: .LFB0: .cfi_startproc pushq %rbx .cfi_def_cfa_offset 16 .cfi_offset 3, -16 movq%rdx, %rbx callmemcpy addq%rbx, %rax popq%rbx .cfi_def_cfa_offset 8 ret ... There's a single failing test: gcc.dg/20050503-1.c
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #13 from Martin Sebor --- I mean an equivalent of the following (with suitable symbol linkage): void* memccpy (void *d, const void *s, int c, size_t n) { // efficient memccpy, perhaps in assembly } void* memcpy (void *d, const void *s, size_t n) { memccpy (d, s, 0, n); return d; }
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #12 from Wilco --- (In reply to Martin Sebor from comment #11) > My concern is that transforming memccpy to memcpy would leave little > incentive for libraries like glibc to provide a more optimal implementation. > Would implementing the function simply as memcpy and having the latter > return the result of the former be a viable option? Basically, rename > memcpy to meccpy, parameterizing it on the termination character in the > process, and change memcpy to call memccpy and return the first pointer. I have no idea what you mean - there is no way you can implement memcpy using memccpy. It never makes sense to slow down a performance critical function in order to speed up an infrequently used one.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #11 from Martin Sebor --- My concern is that transforming memccpy to memcpy would leave little incentive for libraries like glibc to provide a more optimal implementation. Would implementing the function simply as memcpy and having the latter return the result of the former be a viable option? Basically, rename memcpy to meccpy, parameterizing it on the termination character in the process, and change memcpy to call memccpy and return the first pointer.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #10 from Martin Liška --- (In reply to Wilco from comment #6) > (In reply to Martin Liška from comment #5) > > The discussion looks familiar to me. Isn't that PR70140, where I was > > suggesting something like: > > > > https://marc.info/?l=gcc-patches&m=150166433909242&w=2 > > > > with a new target macro: TARGET_HAS_FAST_MEMPCPY_ROUTINE ? > > Yes something like that should be fine. Unfortunately that patch doesn't > apply anymore, and when I got it to compile it didn't do anything. I'm testing an updated version of the patch right now..
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #9 from Wilco --- (In reply to Martin Sebor from comment #7) > Rather than unconditionally transforming mempcpy to memcpy I would prefer to > see libc implementations of memccpy optimized. WG14 N2349 discusses a > rationale for this: > http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2349.htm I don't think one precludes the other. memccpy is an odd interface, basically just a shorthand for memchr+memcpy. GLIBC already has an efficient generic implementation doing this - and like mempcpy it's unlikely many targets will add an optimized assembler version. So inlining in GCC would be best. Having so many non-standard string functions just means they are not optimized at all on most targets and never get significant use. I've spent a lot of time optimizing the generic string functions in GLIBC so they are at least not ridiculously slow (a loop doing 1 byte per iteration is unacceptable today).
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 Martin Sebor changed: What|Removed |Added See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=88814 --- Comment #8 from Martin Sebor --- pr88814 has some additional background.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 Martin Sebor changed: What|Removed |Added Keywords||missed-optimization CC||msebor at gcc dot gnu.org --- Comment #7 from Martin Sebor --- Rather than unconditionally transforming mempcpy to memcpy I would prefer to see libc implementations of memccpy optimized. WG14 N2349 discusses a rationale for this: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2349.htm
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #6 from Wilco --- (In reply to Martin Liška from comment #5) > The discussion looks familiar to me. Isn't that PR70140, where I was > suggesting something like: > > https://marc.info/?l=gcc-patches&m=150166433909242&w=2 > > with a new target macro: TARGET_HAS_FAST_MEMPCPY_ROUTINE ? Yes something like that should be fine. Unfortunately that patch doesn't apply anymore, and when I got it to compile it didn't do anything.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 Martin Liška changed: What|Removed |Added CC||marxin at gcc dot gnu.org --- Comment #5 from Martin Liška --- The discussion looks familiar to me. Isn't that PR70140, where I was suggesting something like: https://marc.info/?l=gcc-patches&m=150166433909242&w=2 with a new target macro: TARGET_HAS_FAST_MEMPCPY_ROUTINE ?
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #4 from Wilco --- (In reply to Jakub Jelinek from comment #3) > Because then you penalize properly maintained targets which do have > efficient mempcpy. And even if some targets don't have efficient mempcpy > right now, that doesn't mean they can't have it in the future. On the > caller side, when not expanded inline, calling mempcpy instead of memcpy is > smaller, often requires fewer registers to be live across the call etc. Few targets have an optimized mempcpy, particularly when you look at other libraries besides GLIBC. We could easily have a target hook to emit mempcpy, but I believe the default should be to do what's fastest for most targets.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #3 from Jakub Jelinek --- Because then you penalize properly maintained targets which do have efficient mempcpy. And even if some targets don't have efficient mempcpy right now, that doesn't mean they can't have it in the future. On the caller side, when not expanded inline, calling mempcpy instead of memcpy is smaller, often requires fewer registers to be live across the call etc.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 --- Comment #2 from Wilco --- (In reply to Jakub Jelinek from comment #1) > As stated several times in the past, I strongly disagree. Why? GCC already does this for bzero and bcopy.
[Bug middle-end/90263] Calls to mempcpy should use memcpy
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- As stated several times in the past, I strongly disagree.