RE: [patch, i386] false dependencies fix

2018-06-28 Thread Nesterovskiy, Alexander
Hello!

> So what I'm confused about is in the original output template operand 
> 0 is duplicated. In the new template operand 1 is duplicated.
>
> Presumably what you're trying to accomplish is avoiding a false read 
> on operand 0 (the destination)?  Can you please confirm?

> Knowing that should also help me evaluate the changes to recp and 
> rsqrt since they're being changed to the same style encoding when 
> operating strictly on registers.

Yes, it's the same for all instructions in the patch: we are not just avoiding
the read, we are also giving the CPU more opportunities to execute speculatively.

After the patch the destination depends only on the source, so (thanks
to register renaming) the CPU can execute this instruction even if an
earlier instruction that writes to the same destination has not yet
finished.
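
To illustrate the point, here is a minimal toy model (not taken from the
patch) of how a renaming CPU schedules instructions by their true read
dependencies only. The latencies, the preceding vdivss instruction and the
register choices are assumptions made up for the example; only the idea
that the old form reads the destination and the new form reads just the
source comes from the discussion above.

```python
# Toy model of instruction issue with register renaming.
# Each instruction is (text, dest, sources, latency). Latencies are
# invented illustrative numbers, not measurements of any real CPU.

def finish_times(program):
    ready = {}   # register -> cycle at which its newest value is available
    result = []
    for text, dest, sources, latency in program:
        # With renaming, only registers that are actually read can delay
        # an instruction; a pending write to the same destination cannot.
        start = max((ready.get(reg, 0) for reg in sources), default=0)
        done = start + latency
        ready[dest] = done
        result.append((text, done))
    return result

# Hypothetical slow instruction that writes xmm0 first.
slow = ("vdivss %xmm3, %xmm2, %xmm0", "xmm0", ["xmm2", "xmm3"], 20)

# Old form: destination duplicated as a source, so xmm0 is read.
old = [slow, ("vsqrtss %xmm1, %xmm0, %xmm0", "xmm0", ["xmm1", "xmm0"], 12)]
# New form: source duplicated instead, so only xmm1 is read.
new = [slow, ("vsqrtss %xmm1, %xmm1, %xmm0", "xmm0", ["xmm1"], 12)]

old_done = finish_times(old)[-1][1]
new_done = finish_times(new)[-1][1]
print(old_done, new_done)
```

In this model the old form has to wait for the pending write to xmm0
before it can start, while the new form can start immediately.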

--
Alexander Nesterovskiy


[PATCH, i386]: AVX false dependencies fix

2018-05-04 Thread Nesterovskiy, Alexander
This is the same patch I posted a few days ago, a bit modified according to 
Uros' recommendation.

The patch fixes false dependencies for the vmovss, vmovsd, vrcpss, vrsqrtss,
vsqrtss and vsqrtsd instructions.
Tested on x86-64/Linux: no new test failures, and some performance gains on
SPEC 2006/2017.

2018-05-04  Alexander Nesterovskiy  

* config/i386/i386.md (*movsf_internal): AVX falsedep fix.
(*movdf_internal): Ditto.
(*rcpsf2_sse): Ditto.
(*rsqrtsf2_sse): Ditto.
(*sqrt<mode>2_sse): Ditto.

--
Alexander Nesterovskiy


avx_falsedep.patch
Description: avx_falsedep.patch


[patch, i386] false dependencies fix

2018-05-02 Thread Nesterovskiy, Alexander
This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss, 
vsqrtss and vsqrtsd instructions.

Tested on x86-64/Linux: no new test failures, and some performance gains on
SPEC 2006/2017.
Please let me know if anything here is wrong and should be changed.

--
Alexander Nesterovskiy


falsedep.patch
Description: falsedep.patch