From: Jakub Jelinek <ja...@redhat.com>
> On Thu, Sep 15, 2016 at 02:55:48PM +0000, Wilco Dijkstra wrote:
>> stpcpy is not conceptually the same, but for mempcpy, yes. By default
>> it's converted into memcpy in the GLIBC headers and the generic 
>> implementation.
>> 
>> stpcpy uses strlen and memcpy which is generally the most efficient version
>> (it even beat several assembler implementations).
>
> ??  I certainly see something completely different, at least on the arches
> I've looked at.

glibc/string/string.h contains:

#if defined __USE_GNU && defined __OPTIMIZE__ \
    && defined __extern_always_inline && __GNUC_PREREQ (3,2)
# if !defined _FORCE_INLINES && !defined _HAVE_STRING_ARCH_mempcpy

#define mempcpy(dest, src, n) __mempcpy_inline (dest, src, n)
#define __mempcpy(dest, src, n) __mempcpy_inline (dest, src, n)

__extern_always_inline void *
__mempcpy_inline (void *__restrict __dest,
                  const void *__restrict __src, size_t __n)
{
  return (char *) memcpy (__dest, __src, __n) + __n;
}

That's best done in GCC as a general optimization, since mempcpy is currently
not handled efficiently (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70140),
and it avoids having to repeat this for every C library out there...

glibc/string/mempcpy.c:

void *
MEMPCPY (void *dest, const void *src, size_t len)
{
  return memcpy (dest, src, len) + len;
}

And glibc/string/stpcpy.c:

char *
STPCPY (char *dest, const char *src)
{
  size_t len = strlen (src);
  return memcpy (dest, src, len + 1) + len;
}

This means that without having to write any assembly code, mempcpy, stpcpy
etc. are by default as efficient as possible. memcpy and strlen are optimized
well on all targets; that's not true for mempcpy, stpcpy and similar functions,
and to make matters worse, the generic code used to be very inefficient.
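
For anyone who wants to try the pattern outside glibc, here is a minimal
standalone sketch. The my_ names and the example function are hypothetical
(chosen to avoid clashing with libc symbols); the casts to char * just avoid
relying on GNU void-pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names, not glibc's: illustration only.  */

/* Copy n bytes and return a pointer just past the last byte written.  */
static void *
my_mempcpy (void *dest, const void *src, size_t n)
{
  return (char *) memcpy (dest, src, n) + n;
}

/* Copy the string including its NUL and return a pointer to that NUL
   in dest.  */
static char *
my_stpcpy (char *dest, const char *src)
{
  size_t len = strlen (src);
  return (char *) memcpy (dest, src, len + 1) + len;
}

/* Usage: both return values point at the end of the copied data, so
   calls can be chained to build up a string.  */
static void
example (void)
{
  char buf[16];
  char *end = my_stpcpy (buf, "abc");
  assert (end == buf + 3 && *end == '\0');
  end = my_mempcpy (end, "def", 4);   /* append "def" plus its NUL */
  assert (end == buf + 7 && strcmp (buf, "abcdef") == 0);
}
```

All the real work is done by memcpy and strlen, so any target-specific tuning
of those two carries over without extra code.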

Wilco
