Re: [patch, i386] false dependencies fix

2018-06-28 Thread Jeff Law
On 06/28/2018 01:16 AM, Uros Bizjak wrote:
> Hello!
> 
>>> --- i386.md (revision 259756)
>>> +++ i386.md (working copy)
>>> @@ -3547,7 +3547,7 @@
>>>   {
>>>   case MODE_DF:
>>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>>> -return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
>>> +return "%vmovsd\t{%d1, %0|%0, %d1}";
>>>return "%vmovsd\t{%1, %0|%0, %1}";
>>>
>>>   case MODE_V4SF:
>>> @@ -3748,7 +3748,7 @@
>>>   {
>>>   case MODE_SF:
>>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>>> -return "vmovss\t{%1, %0, %0|%0, %0, %1}";
>>> +return "%vmovss\t{%d1, %0|%0, %d1}";
>>>return "%vmovss\t{%1, %0|%0, %1}";
>> So what I'm confused about is in the original output template operand 0
>> is duplicated. In the new template operand 1 is duplicated.
>>
>> Presumably what you're trying to accomplish is avoiding a false read on
>> operand 0 (the destination)?  Can you please confirm?
> 
>> Knowing that should also help me evaluate the changes to recp and rsqrt
>> since they're being changed to the same style encoding when operating
>> strictly on registers.
> 
> Please don't change "v" -> "%v" for TARGET_AVX templates. We know that
> in this case, all insn mnemonics are prefixed with "v".
I didn't realize you'd already acked the patch without the v->%v bits
and Sebastian committed the changes about a month ago...

Oh well, at least I know a bit more about the inner workings of the AVX
unit than I did a last week.

Sorry for the noise.

jeff


Re: [patch, i386] false dependencies fix

2018-06-28 Thread Jeff Law
On 06/28/2018 01:16 AM, Uros Bizjak wrote:
> Hello!
> 
>>> --- i386.md (revision 259756)
>>> +++ i386.md (working copy)
>>> @@ -3547,7 +3547,7 @@
>>>   {
>>>   case MODE_DF:
>>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>>> -return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
>>> +return "%vmovsd\t{%d1, %0|%0, %d1}";
>>>return "%vmovsd\t{%1, %0|%0, %1}";
>>>
>>>   case MODE_V4SF:
>>> @@ -3748,7 +3748,7 @@
>>>   {
>>>   case MODE_SF:
>>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>>> -return "vmovss\t{%1, %0, %0|%0, %0, %1}";
>>> +return "%vmovss\t{%d1, %0|%0, %d1}";
>>>return "%vmovss\t{%1, %0|%0, %1}";
>> So what I'm confused about is in the original output template operand 0
>> is duplicated. In the new template operand 1 is duplicated.
>>
>> Presumably what you're trying to accomplish is avoiding a false read on
>> operand 0 (the destination)?  Can you please confirm?
> 
>> Knowing that should also help me evaluate the changes to recp and rsqrt
>> since they're being changed to the same style encoding when operating
>> strictly on registers.
> 
> Please don't change "v" -> "%v" for TARGET_AVX templates. We know that
> in this case, all insn mnemonics are prefixed with "v".
ACK on that Uros -- I'd convinced myself that v->%v for TARGET_AVX
couldn't hurt anything since  we already had the v prefix in place.  I'm
happy to ensure this follows your preferred convention.

I was mostly trying to make sure I understood the other aspects of the
proposed change.

Cheers,
jeff


RE: [patch, i386] false dependencies fix

2018-06-28 Thread Nesterovskiy, Alexander
Hello!

> So what I'm confused about is in the original output template operand 
> 0 is duplicated. In the new template operand 1 is duplicated.
>
> Presumably what you're trying to accomplish is avoiding a false read 
> on operand 0 (the destination)?  Can you please confirm?

> Knowing that should also help me evaluate the changes to recp and 
> rsqrt since they're being changed to the same style encoding when 
> operating strictly on registers.

Yes, it's the same for all instructions in the patch - we're not just avoiding
read but present more possibilities to execute speculatively for CPU here.

The destination depends only on the source after the patch, and (thanks
to CPU register renaming) CPU can successfully execute this instruction
even if some previous instruction with write to the same destination is
not finished currently.

--
Alexander Nesterovskiy


Re: [patch, i386] false dependencies fix

2018-06-28 Thread Uros Bizjak
Hello!

>> --- i386.md (revision 259756)
>> +++ i386.md (working copy)
>> @@ -3547,7 +3547,7 @@
>>   {
>>   case MODE_DF:
>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>> -return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
>> +return "%vmovsd\t{%d1, %0|%0, %d1}";
>>return "%vmovsd\t{%1, %0|%0, %1}";
>>
>>   case MODE_V4SF:
>> @@ -3748,7 +3748,7 @@
>>   {
>>   case MODE_SF:
>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
>> -return "vmovss\t{%1, %0, %0|%0, %0, %1}";
>> +return "%vmovss\t{%d1, %0|%0, %d1}";
>>return "%vmovss\t{%1, %0|%0, %1}";
> So what I'm confused about is in the original output template operand 0
> is duplicated. In the new template operand 1 is duplicated.
>
> Presumably what you're trying to accomplish is avoiding a false read on
> operand 0 (the destination)?  Can you please confirm?

> Knowing that should also help me evaluate the changes to recp and rsqrt
> since they're being changed to the same style encoding when operating
> strictly on registers.

Please don't change "v" -> "%v" for TARGET_AVX templates. We know that
in this case, all insn mnemonics are prefixed with "v".

Uros.


Re: [patch, i386] false dependencies fix

2018-06-27 Thread Jeff Law
On 05/02/2018 03:55 AM, Nesterovskiy, Alexander wrote:
> This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss, 
> vsqrtss and vsqrtsd instructions.
> 
> Tested on x86-64/Linux, no new test fails, some SPEC 2006/2017 performance 
> gains.
> Please let me know if something is wrong here and should be changed.
> 
> --
> Alexander Nesterovskiy
> 
> 
> falsedep.patch
> 
> 
> --- i386.md   (revision 259756)
> +++ i386.md   (working copy)
> @@ -3547,7 +3547,7 @@
>   {
>   case MODE_DF:
> if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
> - return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
> + return "%vmovsd\t{%d1, %0|%0, %d1}";
> return "%vmovsd\t{%1, %0|%0, %1}";
>  
>   case MODE_V4SF:
> @@ -3748,7 +3748,7 @@
>   {
>   case MODE_SF:
> if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
> - return "vmovss\t{%1, %0, %0|%0, %0, %1}";
> + return "%vmovss\t{%d1, %0|%0, %d1}";
> return "%vmovss\t{%1, %0|%0, %1}";
So what I'm confused about is in the original output template operand 0
is duplicated. In the new template operand 1 is duplicated.

Presumably what you're trying to accomplish is avoiding a false read on
operand 0 (the destination)?  Can you please confirm?

Knowing that should also help me evaluate the changes to recp and rsqrt
since they're being changed to the same style encoding when operating
strictly on registers.


THanks,
jeff


Re: [patch, i386] false dependencies fix

2018-05-03 Thread Uros Bizjak
> This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss, 
> vsqrtss and vsqrtsd
> instructions.
>
> Tested on x86-64/Linux, no new test fails, some SPEC 2006/2017 performance 
> gains.
> Please let me know if something is wrong here and should be changed.

Your submission needs a ChangeLog entry. Please see [1]

  case MODE_DF:
   if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1]))
-return "vmovsd\t{%1, %0, %0|%0, %0, %1}";
+return "%vmovsd\t{%d1, %0|%0, %d1}";
   return "%vmovsd\t{%1, %0|%0, %1}";

No need to change "v" to "%v". We are always in AVX domain here.

Otherwise OK. Please resubmit the patch with the ChangeLog entry.

[1] https://gcc.gnu.org/contribute.html

Thanks,
Uros.