Re: [patch, i386] false dependencies fix
On 06/28/2018 01:16 AM, Uros Bizjak wrote: > Hello! > >>> --- i386.md (revision 259756) >>> +++ i386.md (working copy) >>> @@ -3547,7 +3547,7 @@ >>> { >>> case MODE_DF: >>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) >>> -return "vmovsd\t{%1, %0, %0|%0, %0, %1}"; >>> +return "%vmovsd\t{%d1, %0|%0, %d1}"; >>>return "%vmovsd\t{%1, %0|%0, %1}"; >>> >>> case MODE_V4SF: >>> @@ -3748,7 +3748,7 @@ >>> { >>> case MODE_SF: >>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) >>> -return "vmovss\t{%1, %0, %0|%0, %0, %1}"; >>> +return "%vmovss\t{%d1, %0|%0, %d1}"; >>>return "%vmovss\t{%1, %0|%0, %1}"; >> So what I'm confused about is in the original output template operand 0 >> is duplicated. In the new template operand 1 is duplicated. >> >> Presumably what you're trying to accomplish is avoiding a false read on >> operand 0 (the destination)? Can you please confirm? > >> Knowing that should also help me evaluate the changes to recp and rsqrt >> since they're being changed to the same style encoding when operating >> strictly on registers. > > Please don't change "v" -> "%v" for TARGET_AVX templates. We know that > in this case, all insn mnemonics are prefixed with "v". I didn't realize you'd already acked the patch without the v->%v bits and Sebastian committed the changes about a month ago... Oh well, at least I know a bit more about the inner workings of the AVX unit than I did a last week. Sorry for the noise. jeff
Re: [patch, i386] false dependencies fix
On 06/28/2018 01:16 AM, Uros Bizjak wrote: > Hello! > >>> --- i386.md (revision 259756) >>> +++ i386.md (working copy) >>> @@ -3547,7 +3547,7 @@ >>> { >>> case MODE_DF: >>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) >>> -return "vmovsd\t{%1, %0, %0|%0, %0, %1}"; >>> +return "%vmovsd\t{%d1, %0|%0, %d1}"; >>>return "%vmovsd\t{%1, %0|%0, %1}"; >>> >>> case MODE_V4SF: >>> @@ -3748,7 +3748,7 @@ >>> { >>> case MODE_SF: >>>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) >>> -return "vmovss\t{%1, %0, %0|%0, %0, %1}"; >>> +return "%vmovss\t{%d1, %0|%0, %d1}"; >>>return "%vmovss\t{%1, %0|%0, %1}"; >> So what I'm confused about is in the original output template operand 0 >> is duplicated. In the new template operand 1 is duplicated. >> >> Presumably what you're trying to accomplish is avoiding a false read on >> operand 0 (the destination)? Can you please confirm? > >> Knowing that should also help me evaluate the changes to recp and rsqrt >> since they're being changed to the same style encoding when operating >> strictly on registers. > > Please don't change "v" -> "%v" for TARGET_AVX templates. We know that > in this case, all insn mnemonics are prefixed with "v". ACK on that Uros -- I'd convinced myself that v->%v for TARGET_AVX couldn't hurt anything since we already had the v prefix in place. I'm happy to ensure this follows your preferred convention. I was mostly trying to make sure I understood the other aspects of the proposed change. Cheers, jeff
RE: [patch, i386] false dependencies fix
Hello! > So what I'm confused about is in the original output template operand > 0 is duplicated. In the new template operand 1 is duplicated. > > Presumably what you're trying to accomplish is avoiding a false read > on operand 0 (the destination)? Can you please confirm? > Knowing that should also help me evaluate the changes to recp and > rsqrt since they're being changed to the same style encoding when > operating strictly on registers. Yes, it's the same for all instructions in the patch - we're not just avoiding read but present more possibilities to execute speculatively for CPU here. The destination depends only on the source after the patch, and (thanks to CPU register renaming) CPU can successfully execute this instruction even if some previous instruction with write to the same destination is not finished currently. -- Alexander Nesterovskiy
Re: [patch, i386] false dependencies fix
Hello! >> --- i386.md (revision 259756) >> +++ i386.md (working copy) >> @@ -3547,7 +3547,7 @@ >> { >> case MODE_DF: >>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) >> -return "vmovsd\t{%1, %0, %0|%0, %0, %1}"; >> +return "%vmovsd\t{%d1, %0|%0, %d1}"; >>return "%vmovsd\t{%1, %0|%0, %1}"; >> >> case MODE_V4SF: >> @@ -3748,7 +3748,7 @@ >> { >> case MODE_SF: >>if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) >> -return "vmovss\t{%1, %0, %0|%0, %0, %1}"; >> +return "%vmovss\t{%d1, %0|%0, %d1}"; >>return "%vmovss\t{%1, %0|%0, %1}"; > So what I'm confused about is in the original output template operand 0 > is duplicated. In the new template operand 1 is duplicated. > > Presumably what you're trying to accomplish is avoiding a false read on > operand 0 (the destination)? Can you please confirm? > Knowing that should also help me evaluate the changes to recp and rsqrt > since they're being changed to the same style encoding when operating > strictly on registers. Please don't change "v" -> "%v" for TARGET_AVX templates. We know that in this case, all insn mnemonics are prefixed with "v". Uros.
Re: [patch, i386] false dependencies fix
On 05/02/2018 03:55 AM, Nesterovskiy, Alexander wrote: > This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss, > vsqrtss and vsqrtsd instructions. > > Tested on x86-64/Linux, no new test fails, some SPEC 2006/2017 performance > gains. > Please let me know if something is wrong here and should be changed. > > -- > Alexander Nesterovskiy > > > falsedep.patch > > > --- i386.md (revision 259756) > +++ i386.md (working copy) > @@ -3547,7 +3547,7 @@ > { > case MODE_DF: > if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) > - return "vmovsd\t{%1, %0, %0|%0, %0, %1}"; > + return "%vmovsd\t{%d1, %0|%0, %d1}"; > return "%vmovsd\t{%1, %0|%0, %1}"; > > case MODE_V4SF: > @@ -3748,7 +3748,7 @@ > { > case MODE_SF: > if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) > - return "vmovss\t{%1, %0, %0|%0, %0, %1}"; > + return "%vmovss\t{%d1, %0|%0, %d1}"; > return "%vmovss\t{%1, %0|%0, %1}"; So what I'm confused about is in the original output template operand 0 is duplicated. In the new template operand 1 is duplicated. Presumably what you're trying to accomplish is avoiding a false read on operand 0 (the destination)? Can you please confirm? Knowing that should also help me evaluate the changes to recp and rsqrt since they're being changed to the same style encoding when operating strictly on registers. THanks, jeff
Re: [patch, i386] false dependencies fix
> This patch fixes false dependencies for vmovss, vmovsd, vrcpss, vrsqrtss, > vsqrtss and vsqrtsd > instructions. > > Tested on x86-64/Linux, no new test fails, some SPEC 2006/2017 performance > gains. > Please let me know if something is wrong here and should be changed. Your submission needs a ChangeLog entry. Please see [1] case MODE_DF: if (TARGET_AVX && REG_P (operands[0]) && REG_P (operands[1])) -return "vmovsd\t{%1, %0, %0|%0, %0, %1}"; +return "%vmovsd\t{%d1, %0|%0, %d1}"; return "%vmovsd\t{%1, %0|%0, %1}"; No need to change "v" to "%v". We are always in AVX domain here. Otherwise OK. Please resubmit the patch with the ChangeLog entry. [1] https://gcc.gnu.org/contribute.html Thanks, Uros.