> Theo de Raadt <dera...@openbsd.org> wrote:
> |>> There's got to be a performance cost, not using the .S versions.
> |>
> |>What is the average size of the copy please?
> |
> |Average in what? In base, or in chrome?
> |
> |>Years ago, I did a whole lot of tests with this. I was so disappointed
> |>with memcpy that I NEVER use memcpy these days, only 'memmove'. I grew
> |>up with 'bcopy' so I tend to think in terms of safe overlapped copies.
> |
> |Well.... we don't control the entire ecosystem to follow such practices.
>
> But he is right that moving backwards was very much faster on old
> x86 hardware, which i (who never read hardware system manuals, let
> alone processor-family specifics) never understood, since the
> range was available via repz regardless of the direction. I read
>
> // on both, my Athlon 1600+ and my old Cyrix 166+, simple backward
> // copying via REPZ MOVSL is as fast, or up to ~5 percent faster, than
> // the perfectly thought through MMX+SSE optimized forward copy is.
> // (which is not available on the Cyrix. there Move is somewhat 300
> // percent faster than Copy anyway.)
>
> Moving backwards much appreciated.
What is your point? You are not describing how the function is
specified to do its work.

> ..
> |the point is to make memcpy a strict API.
>
> It turned out not to be too problematic for myself (i hope i have
> found all occurrences). The commit message reads
>
> Avoid memcpy(3) crash due to strict standard compliance..

So you are saying strict standard compliance made your program buggy?

> Luckily this happened before the release!
> But mind you, it is true that i still think it is funny that this
> happened on a BSD system, the origin of bcopy(3). To me memcpy(3)
> never has been anything but an optimization for cases where you
> know it is save, so that the tests, the move to the end to start
> there etc., can be avoided. This was at least nine (9) cycles
> iirc on the above CPUs that can be saved, and that almost
> sufficient to copy a small string!

Then why did you use memcpy, if you knew it required strict ordering?
You should have used memmove in the first place, which is bcopy with
the first two arguments swapped.
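
To make the difference concrete, here is a minimal sketch of an
in-place shift where source and destination overlap. The helper names
(shift_right, my_bcopy) are made up for illustration and come from
nobody's tree; my_bcopy is only a stand-in to show the bcopy(3)
argument order, not the libc implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* memmove */

/* Hypothetical stand-in for bcopy(3): same job as memmove, but the
 * source comes first and the destination second. */
void my_bcopy(const void *src, void *dst, size_t len)
{
    memmove(dst, src, len);
}

/* Hypothetical helper: move the first `len` bytes of `buf` right by
 * `shift` bytes, in place. The two ranges overlap whenever
 * shift < len, so memcpy would be undefined behavior here;
 * memmove is specified to handle the overlap. */
void shift_right(char *buf, size_t cap, size_t len, size_t shift)
{
    assert(len + shift <= cap);      /* caller must leave enough room */
    memmove(buf + shift, buf, len);  /* memmove(dst, src, len) */
}
```

Calling my_bcopy(buf, buf + shift, len) performs the same move as
shift_right; only the order of the first two arguments differs.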