On Fri, Feb 09, 2018 at 11:17:35AM -0800, Linus Torvalds wrote:
> Yeah, it's only true on the very latest uarchs, and even there it's
> not perfect for small copies.
> 
> On the older machines that are relevant for 32-bit code, it's often
> tens of cycles just for the ucode overhead, I think, and "rep movsb"
> actually does things literally a byte at a time.

Ugh, okay. So I'll switch to movsl, which should at least perform on par
with the chain of 'pushl' instructions I had before.


Thanks,

        Joerg
