[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-05-20 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

Martin Liška  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #22 from Martin Liška  ---
Implemented.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-05-20 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #21 from Martin Liška  ---
Author: marxin
Date: Mon May 20 07:55:00 2019
New Revision: 271400

URL: https://gcc.gnu.org/viewcvs?rev=271400=gcc=rev
Log:
Come up with hook libc_has_fast_function (PR middle-end/90263).

2019-05-20  Martin Liska  

PR middle-end/90263
* builtins.c (expand_builtin_memory_copy_args): When having a
target with fast mempcpy implementation do now use memcpy.
* config/i386/i386.c (ix86_libc_has_fast_function): New.
(TARGET_LIBC_HAS_FAST_FUNCTION): Likewise.
* doc/tm.texi: Likewise.
* doc/tm.texi.in: Likewise.
* target.def:
* expr.c (emit_block_move_hints): Add 2 new arguments.
* expr.h (emit_block_move_hints): Bail out when libcall
to memcpy would be used.
2019-05-20  Martin Liska  

PR middle-end/90263
* gcc.c-torture/compile/pr90263.c: New test.
* lib/target-supports.exp: Add check_effective_target_glibc.

Added:
trunk/gcc/testsuite/gcc.c-torture/compile/pr90263.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/builtins.c
trunk/gcc/config/i386/i386.c
trunk/gcc/doc/tm.texi
trunk/gcc/doc/tm.texi.in
trunk/gcc/expr.c
trunk/gcc/expr.h
trunk/gcc/target.def
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/lib/target-supports.exp

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-05-07 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #20 from Wilco  ---
(In reply to Martin Liška from comment #19)
> Created attachment 46265 [details]
> Patch candidate v2
> 
> Update patch that should be fine. Tests on x86_64 work except:
> FAIL: gcc.c-torture/execute/builtins/mempcpy.c execution,  -O1 
> 
> Where the test expects that mempcpy without LHS should be transformed into
> memcpy.

Thanks, this version works as expected on AArch64. In principle we could keep
mempcpy for tailcalls with -Os, but I'm not sure it's worth the trouble.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-30 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

Martin Liška  changed:

   What|Removed |Added

  Attachment #46262|0   |1
is obsolete||

--- Comment #19 from Martin Liška  ---
Created attachment 46265
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46265=edit
Patch candidate v2

Update patch that should be fine. Tests on x86_64 work except:
FAIL: gcc.c-torture/execute/builtins/mempcpy.c execution,  -O1 

Where the test expects that mempcpy without LHS should be transformed into
memcpy.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-30 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

Martin Liška  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-04-30
   Assignee|unassigned at gcc dot gnu.org  |marxin at gcc dot 
gnu.org
   Target Milestone|--- |10.0
 Ever confirmed|0   |1

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #18 from Wilco  ---
(In reply to Martin Liška from comment #14)
> Created attachment 46262 [details]
> Patch candidate
> 
> Patch candidate that handles:
> 
> $ cat ~/Programming/testcases/mempcpy.c
> int *mempcopy2 (int *p, int *q, long n)
> {
>   return __builtin_mempcpy (p, q, n);
> }
> 
> $ ./xgcc -B. -O2 -S ~/Programming/testcases/mempcpy.c -o/dev/stdout
> ...
> mempcopy2:
> .LFB0:
>   .cfi_startproc
>   pushq   %rbx
>   .cfi_def_cfa_offset 16
>   .cfi_offset 3, -16
>   movq%rdx, %rbx
>   callmemcpy
>   addq%rbx, %rax
>   popq%rbx
>   .cfi_def_cfa_offset 8
>   ret
> ...
> 
> There's a single failing test: gcc.dg/20050503-1.c

On AArch64 it calls mempcpy but memcpy on x86, which is the wrong way around...

-  if (retmode == RETURN_END && target != const0_rtx)
+  if (!TARGET_HAS_FAST_MEMPCPY_ROUTINE && retmode == RETURN_END && target !=
const0_rtx)

When I uncomment this, it uses memcpy. Also:

+TARGET_HAS_FAST_MEMPCPY_ROUTINE
+&& retmode == RETURN_END_MINUS_ONE,
+_move_done);

Should that be RETURN_END? And is_move_done isn't used, is that right?

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #17 from Wilco  ---
(In reply to Wilco from comment #16)
> (In reply to Martin Sebor from comment #15)
> > I just noticed I have been misreading mempcpy as memccpy and so making no
> > sense.  Sorry about that!  Please ignore my comments.
> 
> I see, yes we have too many and the names are too similar. Now all we need
> is someone to propose strlpcpy or stplcpy...

https://github.com/fishilico/shared/blob/master/linux/nolibc/uname.c

Aargh...

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #16 from Wilco  ---
(In reply to Martin Sebor from comment #15)
> I just noticed I have been misreading mempcpy as memccpy and so making no
> sense.  Sorry about that!  Please ignore my comments.

I see, yes we have too many and the names are too similar. Now all we need is
someone to propose strlpcpy or stplcpy...

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #15 from Martin Sebor  ---
I just noticed I have been misreading mempcpy as memccpy and so making no
sense.  Sorry about that!  Please ignore my comments.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #14 from Martin Liška  ---
Created attachment 46262
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46262=edit
Patch candidate

Patch candidate that handles:

$ cat ~/Programming/testcases/mempcpy.c
int *mempcopy2 (int *p, int *q, long n)
{
  return __builtin_mempcpy (p, q, n);
}

$ ./xgcc -B. -O2 -S ~/Programming/testcases/mempcpy.c -o/dev/stdout
...
mempcopy2:
.LFB0:
.cfi_startproc
pushq   %rbx
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
movq%rdx, %rbx
callmemcpy
addq%rbx, %rax
popq%rbx
.cfi_def_cfa_offset 8
ret
...

There's a single failing test: gcc.dg/20050503-1.c

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #13 from Martin Sebor  ---
I mean an equivalent of the following (with suitable symbol linkage):

  void* memccpy (void *d, const void *s, int c, size_t n)
  {
// efficient memccpy, perhaps in assembly
  }

  void* memcpy (void *d, const void *s, size_t n)
  {
memccpy (d, s, 0, n);
return d;
  }

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #12 from Wilco  ---
(In reply to Martin Sebor from comment #11)
> My concern is that transforming memccpy to memcpy would leave little
> incentive for libraries like glibc to provide a more optimal implementation.
> Would implementing the function simply as memcpy and having the latter
> return the result of the former be a viable option?  Basically, rename
> memcpy to meccpy, parameterizing it on the termination character in the
> process, and change memcpy to call memccpy and return the first pointer.

I have no idea what you mean - there is no way you can implement memcpy using
memccpy. It never makes sense to slow down a performance critical function in
order to speed up an infrequently used one.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #11 from Martin Sebor  ---
My concern is that transforming memccpy to memcpy would leave little incentive
for libraries like glibc to provide a more optimal implementation.  Would
implementing the function simply as memcpy and having the latter return the
result of the former be a viable option?  Basically, rename memcpy to meccpy,
parameterizing it on the termination character in the process, and change
memcpy to call memccpy and return the first pointer.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #10 from Martin Liška  ---
(In reply to Wilco from comment #6)
> (In reply to Martin Liška from comment #5)
> > The discussion looks familiar to me. Isn't that PR70140, where I was
> > suggesting something like:
> > 
> > https://marc.info/?l=gcc-patches=150166433909242=2
> > 
> > with a new target macro: TARGET_HAS_FAST_MEMPCPY_ROUTINE ?
> 
> Yes something like that should be fine. Unfortunately that patch doesn't
> apply anymore, and when I got it to compile it didn't do anything.

I'm testing an updated version of the patch right now..

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #9 from Wilco  ---
(In reply to Martin Sebor from comment #7)
> Rather than unconditionally transforming mempcpy to memcpy I would prefer to
> see libc implementations of memccpy optimized.  WG14 N2349 discusses a
> rationale for this:   
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2349.htm

I don't think one precludes the other. memccpy is an odd interface, basically
just a shorthand for memchr+memcpy. GLIBC already has an efficient generic
implementation doing this - and like mempcpy it's unlikely many targets will
add an optimized assembler version. So inlining in GCC would be best.

Having so many non-standard string functions just means they are not optimized
at all on most targets and never get significant use. I've spent a lot of time
optimizing the generic string functions in GLIBC so they are at least not
ridiculously slow (a loop doing 1 byte per iteration is unacceptable today).

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

Martin Sebor  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=88814

--- Comment #8 from Martin Sebor  ---
pr88814 has some additional background.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread msebor at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

Martin Sebor  changed:

   What|Removed |Added

   Keywords||missed-optimization
 CC||msebor at gcc dot gnu.org

--- Comment #7 from Martin Sebor  ---
Rather than unconditionally transforming mempcpy to memcpy I would prefer to
see libc implementations of memccpy optimized.  WG14 N2349 discusses a
rationale for this:   
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2349.htm

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #6 from Wilco  ---
(In reply to Martin Liška from comment #5)
> The discussion looks familiar to me. Isn't that PR70140, where I was
> suggesting something like:
> 
> https://marc.info/?l=gcc-patches=150166433909242=2
> 
> with a new target macro: TARGET_HAS_FAST_MEMPCPY_ROUTINE ?

Yes something like that should be fine. Unfortunately that patch doesn't apply
anymore, and when I got it to compile it didn't do anything.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-29 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

Martin Liška  changed:

   What|Removed |Added

 CC||marxin at gcc dot gnu.org

--- Comment #5 from Martin Liška  ---
The discussion looks familiar to me. Isn't that PR70140, where I was suggesting
something like:

https://marc.info/?l=gcc-patches=150166433909242=2

with a new target macro: TARGET_HAS_FAST_MEMPCPY_ROUTINE ?

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-26 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #4 from Wilco  ---
(In reply to Jakub Jelinek from comment #3)
> Because then you penalize properly maintained targets which do have
> efficient mempcpy.  And even if some targets don't have efficient mempcpy
> right now, that doesn't mean they can't have it in the future.  On the
> caller side, when not expanded inline, calling mempcpy instead of memcpy is
> smaller, often requires fewer registers to be live across the call etc.

Few targets have an optimized mempcpy, particularly when you look at other
libraries besides GLIBC. We could easily have a target hook to emit mempcpy,
but I believe the default should be to do what's fastest for most targets.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #3 from Jakub Jelinek  ---
Because then you penalize properly maintained targets which do have efficient
mempcpy.  And even if some targets don't have efficient mempcpy right now, that
doesn't mean they can't have it in the future.  On the caller side, when not
expanded inline, calling mempcpy instead of memcpy is smaller, often requires
fewer registers to be live across the call etc.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-26 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

--- Comment #2 from Wilco  ---
(In reply to Jakub Jelinek from comment #1)
> As stated several times in the past, I strongly disagree.

Why? GCC already does this for bzero and bcopy.

[Bug middle-end/90263] Calls to mempcpy should use memcpy

2019-04-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
As stated several times in the past, I strongly disagree.