On Mon, Feb 5, 2018 at 2:10 PM, Andy Lutomirski <[email protected]> wrote:
> At the risk of over-optimizing a dead horse, what about:
>
> xorl %ebx, %ebx
> movq %rbx, %r10
> xorl %r11d, %r11d
> movq %rbx, %r12
>
> etc.
>
> We'll have a cycle of latency from xor to mov, but I'd be rather
> surprised if the CPU can't hide that.

Ugh. xor really is nice because it breaks all dependencies.

Really, it's much more likely that we can just hide the xors in the
pushes. Small, simple, easy.
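
Roughly like this. Untested sketch, and the register choice and order
here are just an assumption for illustration, not the actual entry code:

	pushq	%r10			/* save the incoming value on the stack */
	xorl	%r10d, %r10d		/* then clear it; 32-bit write zeroes all of %r10 */
	pushq	%r11
	xorl	%r11d, %r11d
	pushq	%r12
	xorl	%r12d, %r12d

Each xor is a zeroing idiom that depends on nothing before it, so
there is no xor-to-mov chain to hide in the first place.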

But I'm not timing it.

               Linus
