Theo de Raadt <dera...@openbsd.org> wrote:
 |>> There's got to be a performance cost, not using the .S versions.
 |>
 |>What is the average size of the copy please?
 |
 |Average in what?  In base, or in chrome?
 |
 |>Years ago, I did a whole lot of tests with this. I was so disappointed 
 |>with memcpy that I NEVER use memcpy these days, only 'memmove'. I grew
 |>up with 'bcopy' so I tend to think in terms of safe overlapped copies.
 |
 |Well.... we don't control the entire ecosystem to follow such practices.

But he is right that moving backwards was very much faster on old
x86 hardware, which i (who never read hardware system manuals, let
alone processor-family specifics) never understood, since the
range was available via repz regardless of the direction.  I read

       // on both, my Athlon 1600+ and my old Cyrix 166+, simple backward
       // copying via REPZ MOVSL is as fast, or up to ~5 percent faster, than
       // the perfectly thought through MMX+SSE optimized forward copy is.
       // (which is not available on the Cyrix.  there Move is somewhat 300
       // percent faster than Copy anyway.)

Moving backwards much appreciated.

  ..
 |the point is to make memcpy a strict API.

It turned out not to be too problematic for myself (i hope i have
found all occurrences).  The commit message reads

  Avoid memcpy(3) crash due to strict standard compliance..

Luckily this happened before the release!
But mind you, it is true that i still think it is funny that this
happened on a BSD system, the origin of bcopy(3).  To me memcpy(3)
never has been anything but an optimization for cases where you
know it is safe, so that the tests, the move to the end to start
there etc., can be avoided.  This was at least nine (9) cycles
iirc on the above CPUs that can be saved, and that is almost
sufficient to copy a small string!
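
To make the distinction concrete, here is a minimal sketch in portable
C (not the REPZ MOVSL assembler discussed above).  The names
sketch_copy and sketch_move are hypothetical and are not taken from
any libc; the byte loops only illustrate where the extra overlap test
and the "move to the end" step sit, they say nothing about real
performance.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memcpy-like helper: forward only, no overlap test.
 * The caller must guarantee that the two ranges do not overlap. */
static void *sketch_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n-- > 0)
        *d++ = *s++;
    return dst;
}

/* Hypothetical memmove-like helper: tests for overlap first and,
 * if dst starts inside the source range, moves to the end and
 * copies backwards so no source byte is overwritten before use. */
static void *sketch_move(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s || (uintptr_t)d >= (uintptr_t)s + n) {
        /* dst is below src, or entirely past it: forward is safe */
        while (n-- > 0)
            *d++ = *s++;
    } else {
        /* dst overlaps the tail of src: start at the end */
        d += n;
        s += n;
        while (n-- > 0)
            *--d = *--s;
    }
    return dst;
}
```

With these, sketch_move(buf + 2, buf, 6) on a buffer holding
"abcdefgh" yields "ababcdef", whereas the same call through
sketch_copy would smear the first bytes over the rest.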
Ciao.

--steffen
