On Mon, Feb 5, 2018 at 2:22 PM, Linus Torvalds
<[email protected]> wrote:
>
> But I'm not timing it.

I lied.

I did this:

        for (i = 0; i < 100000; i++)
                asm(".rept 16384\n"
                "subq $128,%rsp\n\t"
                "pushq %rbx\n\t"
                "pushq %r10\n\t"
                "pushq %r11\n\t"
                "pushq %r12\n\t"
                "pushq %r13\n\t"
                "pushq %r14\n\t"
                "pushq %r15\n\t"

                "popq %r15\n\t"
                "popq %r14\n\t"
                "popq %r13\n\t"
                "popq %r12\n\t"
                "popq %r11\n\t"
                "popq %r10\n\t"
                "popq %rbx\n\t"
                "addq $128,%rsp\n\t"
                ".endr");

and then I timed it both like that, and with an "xorq" of the register
added after each "pushq".

And the timings came out the same, to within the accuracy of the (bad)
timing I did.

So I really do think you can just put the xor right next to the push,
and it will be effectively free.
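
For anyone wanting to repeat the comparison, here is a rough sketch of a
harness. It is not the code behind the numbers above, just one way the
two variants could be set up. The names time_push_pop, PLAIN, XORED and
BODY are made up for illustration, and clock_gettime(CLOCK_MONOTONIC) is
an assumption about how to time it. It is x86-64 and GCC/Clang only.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

#if !defined(__x86_64__)
#error "this sketch uses x86-64 inline assembly"
#endif

/* Hypothetical helpers: one push, with or without an xor right after it.
 * "%%" is needed because the asm statements below use the extended form. */
#define PLAIN(r) "pushq %%" r "\n\t"
#define XORED(r) "pushq %%" r "\n\t" "xorq %%" r ",%%" r "\n\t"

/* Same shape as the loop body in the mail: step over the red zone,
 * push seven registers, pop them back in reverse order, restore %rsp. */
#define BODY(P)                                                    \
        ".rept 16384\n\t"                                          \
        "subq $128,%%rsp\n\t"                                      \
        P("rbx") P("r10") P("r11") P("r12") P("r13") P("r14") P("r15") \
        "popq %%r15\n\t"                                           \
        "popq %%r14\n\t"                                           \
        "popq %%r13\n\t"                                           \
        "popq %%r12\n\t"                                           \
        "popq %%r11\n\t"                                           \
        "popq %%r10\n\t"                                           \
        "popq %%rbx\n\t"                                           \
        "addq $128,%%rsp\n\t"                                      \
        ".endr"

static double elapsed_sec(struct timespec a, struct timespec b)
{
        return (double)(b.tv_sec - a.tv_sec) +
               (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

/* Returns wall-clock seconds for `iters` passes over the unrolled block.
 * with_xor != 0 selects the variant that zeroes each register after
 * pushing it. Every register is popped back before the asm ends, so
 * only memory and flags are listed as clobbers. */
double time_push_pop(long iters, int with_xor)
{
        struct timespec t0, t1;
        long i;

        assert(iters > 0);
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (with_xor) {
                for (i = 0; i < iters; i++)
                        __asm__ volatile(BODY(XORED) : : : "memory", "cc");
        } else {
                for (i = 0; i < iters; i++)
                        __asm__ volatile(BODY(PLAIN) : : : "memory", "cc");
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return elapsed_sec(t0, t1);
}
```

Calling time_push_pop(100000, 0) and time_push_pop(100000, 1) from main()
and printing both results gives the two numbers to compare. Expect some
run-to-run noise, so repeat it a few times before reading anything into
a small difference.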

            Linus
