On Fri, 4 May 2018, Warner Losh wrote:

On Fri, May 4, 2018 at 5:12 PM, Mateusz Guzik <[email protected]> wrote:

On Sat, May 5, 2018 at 12:58 AM, Steven Hartland <
[email protected]> wrote:

Can we get the why in commit messages please?

This sort of message doesn't provide anything more than can be obtained
from reading the diff, which just leaves us wondering why.

I'm sure there is a good reason, but without confirmation we're just left
guessing. The knock-on effect is that if some assumption behind the why
changes, anyone looking at this will not be able to make an informed
decision that that was the case.

bcopy is an equivalent of memmove, i.e. it accepts overlapping buffers.
But if we know for a fact they don't overlap (like here), doing this over
memcpy (which does not accept such buffers) only puts avoidable
constraints on the optimizer.
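
To make the distinction concrete, here is a minimal sketch in C. The helper names (ranges_overlap, copy_disjoint, shift_right_one) are hypothetical and do not come from the commit under discussion; only memcpy, memmove and assert are standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if [a, a+len) and [b, b+len) share a byte. */
static int
ranges_overlap(const void *a, const void *b, size_t len)
{
    uintptr_t x = (uintptr_t)a, y = (uintptr_t)b;

    return (x < y ? y - x < len : x - y < len);
}

/*
 * Hypothetical helper: the caller knows the buffers are disjoint, so
 * memcpy is enough and the compiler may assume no aliasing between them.
 */
static void
copy_disjoint(void *dst, const void *src, size_t len)
{
    assert(!ranges_overlap(dst, src, len));
    memcpy(dst, src, len);
}

/*
 * Hypothetical helper: shift len bytes of buf right by one position.
 * Source and destination overlap, so memmove (the bcopy behaviour)
 * is required; memcpy would be undefined here.
 */
static void
shift_right_one(char *buf, size_t len)
{
    memmove(buf + 1, buf, len);
}
```

Calling shift_right_one() on a buffer holding "abcdef" (with room for one more byte) leaves "aabcdef", while copy_disjoint() is only valid when the two ranges share no bytes.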

Indeed, but clang already does adequate optimization for so many cases
(especially amd64), so these small changes are not much more than special
micro-optimizations for gcc on 32-bit arches.  I care about gcc and 32-bit
arches, but you don't.

bcopy, in userland, is memmove. bcopy in the kernel has had a more
complicated history. Today it's more like memmove, but at times in the
history of BSD/Unix it's been more akin to memcpy with a companion ovbcopy
used for overlapping copies. FreeBSD has almost always been more in the

I think (but don't know) that ovbcopy is a SYSVism and bcopy() always
handled overlapping copies in BSD.  It was not well documented that it
did, but with only 1 memory-copying function that function has to handle
overlapping copies or be even better documented to not handle them.

'bcopy is memmove' rather than the 'bcopy is memcpy' though some of the
lower-tier architectures pulled in ovbcopy which we recently GC'd from
NetBSD and/or OpenBSD.

In all of 4.4BSD /sys, ovbcopy is only referenced on 34 lines (almost half
in tags files), mostly to implement it on some arches:
- news3400, hp300, i386, luna68k: alias for bcopy
- sparc64: separate from bcopy.  bcopy seems to be like memcpy and doesn't
  handle overlapping copies.
- vax/inline/machpats.c: separate and too vaxish for me to understand (seems
  to be just a prologue)
- netiso/iso_pcb.c, net/slcompress.c, sparc/pmap.c, netinet/ip_output.c,
  netinet/ip_nroute.c: actually use it

The sparc64 and vax code is an indication that bcopy didn't always handle
overlapping copies in BSD.

Plus there's been an irrational encouragement of
using bcopy over mem* which has led to the current state of affairs.

You mean a rational encouragement.

For the vast majority of uses, it hasn't really mattered much in the past.
It hasn't shown up on radar.

It matters even less now.  Deciding if the copies overlap takes about 1
branch, and with modern branch prediction that often costs about 1 cycle.
The x86 library implementation wastes more like 50 cycles in other ways.
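
As an illustration of the single branch, here is a byte-wise sketch of a bcopy()-style copy. The name sketch_bcopy is hypothetical, and this is not the FreeBSD implementation; it only shows how one unsigned comparison can choose the copy direction. The argument order (src, dst, len) follows bcopy.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: overlap-safe copy with bcopy's argument order. */
static void
sketch_bcopy(const void *src, void *dst, size_t len)
{
    const unsigned char *s = src;
    unsigned char *d = dst;

    /*
     * The one branch: dst lies inside [src, src + len) only if the
     * unsigned difference dst - src is less than len.  When dst is
     * below src the subtraction wraps to a huge value, so that case
     * also takes the forward path, which is safe for it.
     */
    if ((uintptr_t)d - (uintptr_t)s >= len) {
        /* Forward copy: dst does not start inside the source range. */
        while (len-- > 0)
            *d++ = *s++;
    } else {
        /* Backward copy: dst starts inside the source range. */
        s += len;
        d += len;
        while (len-- > 0)
            *--d = *--s;
    }
}
```

A real implementation would copy words or use string instructions rather than bytes; the direction test is the only part that handling overlap adds.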

However, as its optimization has moved into
the compiler I'm guessing that's changed. It's that change of heart I think
that is taking people by surprise.

I blame micro-benchmarks.  Amdahl's law applies and gives a limit of about
1% for the possible improvements from optimizing bcopy(), except in
micro-benchmarks.  That is even though the kernel spends a relatively
large amount of time in bcopy().  Userland might take 80% of the time,
the kernel 20%, and bcopy() 10% of the 20% = 2%.  After optimizing bcopy()
to be twice as fast (which is difficult), you have speeded up applications
by 1% at most.
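
The arithmetic above can be checked with a small sketch. The function names amdahl_speedup and time_saved are hypothetical; the inputs are the illustrative figures from the paragraph above (kernel 20% of run time, bcopy() 10% of that), not measurements.

```c
#include <stddef.h>

/*
 * Amdahl's law: overall speedup when a fraction p of the run time
 * is made s times faster and the rest is unchanged.
 */
static double
amdahl_speedup(double p, double s)
{
    return (1.0 / ((1.0 - p) + p / s));
}

/* Fraction of the total run time saved by the same change. */
static double
time_saved(double p, double s)
{
    return (p - p / s);
}
```

With p = 0.20 * 0.10 = 0.02 and s = 2, time_saved() gives 0.01 (1% of run time) and amdahl_speedup() gives 1/0.99, or about 1.01.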

Bruce
