On Monday, 11 June 2018 at 08:02:42 UTC, Walter Bright wrote:
On 6/10/2018 9:44 PM, Patrick Schluter wrote:
See what Agner Fog has to say about it:

Thanks. Agner Fog gets the last word on this topic!

Well, Agner is rarely wrong indeed, but there is a limit to how much material a single person can keep up to date.

On newer microarchitectures, `rep movsb` is no slower than `rep movsd`, and often performs similarly to the best SSE2 implementation (using NT stores). See BeeOnRope's answer to this StackOverflow question for an in-depth discussion: https://stackoverflow.com/questions/43343231/enhanced-rep-movsb-for-memcpy
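
For anyone who wants to try this themselves, here is a minimal sketch of a copy routine built on `rep movsb`. It assumes x86-64 and GCC/Clang-style inline assembly, and falls back to `memcpy` everywhere else. The name `copy_rep_movsb` is made up for illustration, and whether it is actually fast depends on the CPU, so measure before relying on it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from src to dst using `rep movsb`.
   Assumes x86-64 with GCC/Clang inline asm; otherwise defers to memcpy.
   The regions must not overlap, as with memcpy. */
static void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    void *d = dst;
    /* RDI = destination, RSI = source, RCX = byte count.
       The ABI guarantees the direction flag is clear on function entry,
       so the copy runs forward. */
    __asm__ volatile("rep movsb"
                     : "+D"(d), "+S"(src), "+c"(n)
                     :
                     : "memory");
    return dst;
#else
    return memcpy(dst, src, n);
#endif
}
```

Swapping in `rep movsd` would need the count divided by four plus a tail copy for the remainder, which is part of why the byte variant is attractive when it is not slower.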

AVX2 seems to offer extra performance beyond that, though, if it is available (for example if runtime feature detection is used). I believe I read a comment by Agner somewhere to that effect as well – a search engine will certainly be able to turn up more.
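
As a rough sketch of what runtime feature detection could look like: the code below checks for AVX2 once and picks a copy routine accordingly. It assumes GCC or Clang on x86 (`__builtin_cpu_supports` and the `target` attribute are compiler extensions), and the names `copy_fn`, `copy_generic`, `copy_wide` and `select_copy` are invented for this example. The 256-bit path is deliberately naive (unaligned loads and stores, no NT stores), so it shows the dispatch mechanism rather than a tuned implementation.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Baseline path: whatever the C library provides. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>

/* 32 bytes per iteration through 256-bit registers. These particular
   load/store intrinsics only need AVX; the function is compiled for AVX2
   because that is the feature being dispatched on. */
__attribute__((target("avx2")))
static void *copy_wide(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n >= 32) {
        __m256i v = _mm256_loadu_si256((const __m256i *)s);
        _mm256_storeu_si256((__m256i *)d, v);
        d += 32;
        s += 32;
        n -= 32;
    }
    memcpy(d, s, n); /* remaining tail of fewer than 32 bytes */
    return dst;
}

static int cpu_has_avx2(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
}
#else
/* Non-x86 or unknown compiler: never select the wide path. */
static void *copy_wide(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

static int cpu_has_avx2(void)
{
    return 0;
}
#endif

/* Choose the routine once, e.g. at startup, and call through the pointer. */
static copy_fn select_copy(void)
{
    return cpu_has_avx2() ? copy_wide : copy_generic;
}
```

Typical use would be to call `select_copy()` once and cache the returned pointer, so the CPU check does not run on every copy.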

 — David
