Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
On April 2, 2021 10:00:53 AM GMT+02:00, Matthias Klose wrote:
>On 9/30/20 2:27 PM, Florian Weimer wrote:
>> These micro-architecture levels are defined in the x86-64 psABI:
>>
>> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
>>
>> PTA_NO_TUNE is introduced so that the new processor alias table entries
>> do not affect the CPU tuning setting in ix86_tune.
>>
>> The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
>> ("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").
>
>I would like to see this series backported to the gcc-10 branch (already doing
>that for distro builds).  With the backport for PR target/95842 from March 30,
>these can now be backported without changes: That consists of:
>
>i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags
>commit 92e652d8c21bd7e66cbb0f9001542a2f55345af0
>
>Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
>commit 324bec558e95584e8c1997575ae9d75978af59f1
>
>Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
>commit 16664e6e4fb4281be6477c13989740d44c963c77
>
>Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
>commit 552ed3ea761324bdd42c1a40d4bbef91432da29a
>
>i386: Make -march=x86-64-v[234] behave more like other -march= options
>commit 59482fa1e7243bd905c7e27c92ae2b89c79fff87
>
>and
>
>Fix up plugin header install
>9a83366b62e585cce5577309013a832f895ccdbf
>
>the latter one needed anyway on the gcc-10 branch.
>
>I know, it's late, but the PR target/95842 backport only happened two
>days ago.

This can be done after the 10.3 release if really necessary.

Richard.

>Matthias
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
On 9/30/20 2:27 PM, Florian Weimer wrote:
> These micro-architecture levels are defined in the x86-64 psABI:
>
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
>
> PTA_NO_TUNE is introduced so that the new processor alias table entries
> do not affect the CPU tuning setting in ix86_tune.
>
> The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
> ("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

I would like to see this series backported to the gcc-10 branch (already doing
that for distro builds).  With the backport for PR target/95842 from March 30,
these can now be backported without changes: That consists of:

i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags
commit 92e652d8c21bd7e66cbb0f9001542a2f55345af0

Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
commit 324bec558e95584e8c1997575ae9d75978af59f1

Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
commit 16664e6e4fb4281be6477c13989740d44c963c77

Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
commit 552ed3ea761324bdd42c1a40d4bbef91432da29a

i386: Make -march=x86-64-v[234] behave more like other -march= options
commit 59482fa1e7243bd905c7e27c92ae2b89c79fff87

and

Fix up plugin header install
9a83366b62e585cce5577309013a832f895ccdbf

the latter one needed anyway on the gcc-10 branch.

I know, it's late, but the PR target/95842 backport only happened two
days ago.

Matthias
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
On Wed, Sep 30, 2020 at 6:28 PM Jakub Jelinek wrote:
>
> On Wed, Sep 30, 2020 at 06:06:31PM +0200, Florian Weimer wrote:
> > This is what I came up with.  It is not valid to set ix86_arch to
> > PROCESSOR_GENERIC, which is why PTA_NO_TUNE is still needed.
>
> Ok, LGTM, but would prefer Uros to have final voice.

OK from my side.

Thanks,
Uros.
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
On Wed, Sep 30, 2020 at 06:06:31PM +0200, Florian Weimer wrote:
> This is what I came up with.  It is not valid to set ix86_arch to
> PROCESSOR_GENERIC, which is why PTA_NO_TUNE is still needed.

Ok, LGTM, but would prefer Uros to have final voice.

        Jakub
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
* Jakub Jelinek:

> On Wed, Sep 30, 2020 at 04:29:34PM +0200, Florian Weimer wrote:
>> > Thinking about it more, wouldn't it be better to just imply generic tuning
>> > for these -march= options?
>>
>> I think this is what the patch does?  See the x86-64-v3-haswell.c
>> test.
>
> No, I think it will have that behavior solely when the compiler has been
> configured to default to -mtune=generic.
> What I'm suggesting is to not ignore the tuning like you do for PTA_NO_TUNE,
> but instead perhaps use PROCESSOR_GENERIC and special case it in the code
> so that ix86_arch will be set to PROCESSOR_K8 in that case and only
> ix86_tune will be PROCESSOR_GENERIC.

This is what I came up with.  It is not valid to set ix86_arch to
PROCESSOR_GENERIC, which is why PTA_NO_TUNE is still needed.

8<--8<
These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.

The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

gcc/:
        PR target/97250
        * config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
        (PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
        * common/config/i386/i386-common.c (processor_alias_table):
        Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
        * config/i386/i386-options.c (ix86_option_override_internal):
        Handle new PTA_NO_TUNE processor table entries.
        * doc/invoke.texi (x86 Options): Document new -march values.

gcc/testsuite/:
        PR target/97250
        * gcc.target/i386/x86-64-v2.c: New test.
        * gcc.target/i386/x86-64-v3.c: New test.
        * gcc.target/i386/x86-64-v3-haswell.c: New test.
        * gcc.target/i386/x86-64-v3-skylake.c: New test.
        * gcc.target/i386/x86-64-v4.c: New test.

---
 gcc/common/config/i386/i386-common.c              |  10 +-
 gcc/config/i386/i386-options.c                    |  29 +-
 gcc/config/i386/i386.h                            |  11 +-
 gcc/doc/invoke.texi                               |  15 ++-
 gcc/testsuite/gcc.target/i386/x86-64-v2.c         | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v3-haswell.c |  18
 gcc/testsuite/gcc.target/i386/x86-64-v3-skylake.c |  21
 gcc/testsuite/gcc.target/i386/x86-64-v3.c         | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v4.c         | 116 ++
 9 files changed, 442 insertions(+), 10 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 10142149115..62a620b4430 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
     PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
     PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
-  {"x86-64", PROCESSOR_K8, CPU_K8,
-    PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
-    0, P_NONE},
+  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
+  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
+   0, P_NONE},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
     PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
     0, P_NONE},
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 597de533fbd..a59bd703880 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2058,10 +2058,27 @@ ix86_option_override_internal (bool main_args_p,
             return false;
           }
 
+        /* The feature-only micro-architecture levels that use
+           PTA_NO_TUNE are only defined for the x86-64 psABI.  */
+        if ((processor_alias_table[i].flags & PTA_NO_TUNE) != 0
+            && (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
+                || opts->x_ix86_abi != SYSV_ABI))
+          {
+            error (G_("%<%s%> architecture level is only defined"
+                      " for the x86-64 psABI"), opts->x_ix86_arch_string);
+            return false;
+          }
+
         ix86_schedule = processor_alias_table[i].schedule;
         ix86_arch = processor_alias_table[i].processor;
-        /* Default cpu tuning to the architecture.  */
-        ix86_tune = ix86_arch;
+
+        /* Default cpu tuning to the architecture, unless the table
+           entry requests not to do this.  Used by the x86-64 psABI
+           micro-architecture levels.  */
+        if ((processor_alias_table[i].flags & PTA_NO_TUNE) == 0
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
On Wed, Sep 30, 2020 at 04:29:34PM +0200, Florian Weimer wrote:
> > Thinking about it more, wouldn't it be better to just imply generic tuning
> > for these -march= options?
>
> I think this is what the patch does?  See the x86-64-v3-haswell.c
> test.

No, I think it will have that behavior solely when the compiler has been
configured to default to -mtune=generic.
What I'm suggesting is to not ignore the tuning like you do for PTA_NO_TUNE,
but instead perhaps use PROCESSOR_GENERIC and special case it in the code
so that ix86_arch will be set to PROCESSOR_K8 in that case and only
ix86_tune will be PROCESSOR_GENERIC.

        Jakub
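The special-casing suggested above can be pictured with a small standalone
model.  This is only a sketch under stated assumptions, not GCC code: the
enum, struct selection and apply_entry are hypothetical names invented for
illustration, and the real ix86_option_override_internal has many more
concerns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's processor enumeration.  */
enum processor { PROC_GENERIC, PROC_K8, PROC_SKYLAKE };

/* Simplified model of the (ix86_arch, ix86_tune) pair.  */
struct selection { enum processor arch; enum processor tune; };

/* Apply a table entry whose processor field is ENTRY.  A PROC_GENERIC
   entry is special-cased: the architecture falls back to PROC_K8 and
   only the tuning becomes generic.  Any other entry sets both.  */
void apply_entry (struct selection *sel, enum processor entry)
{
  assert (sel != NULL);
  if (entry == PROC_GENERIC)
    {
      sel->arch = PROC_K8;
      sel->tune = PROC_GENERIC;
    }
  else
    {
      sel->arch = entry;
      sel->tune = entry;
    }
}
```

In this model a generic entry always resets the tuning, which is the
difference from an entry that leaves the tuning untouched.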
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
* Jakub Jelinek:

> On Wed, Sep 30, 2020 at 04:05:41PM +0200, Florian Weimer wrote:
>> > I think the documentation should state that these are not valid in -mtune=,
>> > just in -march=, and that using -march=x86-64-v* will not change tuning.
>> > I guess there should be some testsuite coverage for the somewhat unexpected
>> > behavior of
>> > -march=skylake -march=x86-64-v3
>> > actually acting as
>> > -march=x86-64-v3 -mtune=skylake
>> > though perhaps it needs to be skipped if user used explicit -mtune= and
>> > not sure how to actually test that (-fverbose-asm doesn't print -mtune=
>> > when it is not explicit).
>>
>> I think the compiler driver collapses -march=skylake -march=x86-64-v3
>> to -march=x86-64-v3, dropping the tuning.  The cc1 option parser also
>> drops the first -march=.  That's a bit surprising to me.  It means
>> that we can't use multiple tuning/non-tuning -march= switches, and
>> that tuning with (say) -march=x86-64-v3 needs to use -mtune.
>>
>> PTA_NO_TUNE is still needed because we'd define __tune_k8__ otherwise
>> (and switch to K8 tuning internally).
>>
>> Is it okay to simply document this?  Perhaps like this?
>
> Thinking about it more, wouldn't it be better to just imply generic tuning
> for these -march= options?

I think this is what the patch does?  See the x86-64-v3-haswell.c test.

I tried to explain this in the documentation.  I do not think this is
particularly confusing for end users because they do not see the
implementation, which is making this complicated.

I think we should not set generic tuning in processor_alias_table
because it would override tuning for target clones, and I don't think
we want to do that automatically.
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
On Wed, Sep 30, 2020 at 04:05:41PM +0200, Florian Weimer wrote:
> > I think the documentation should state that these are not valid in -mtune=,
> > just in -march=, and that using -march=x86-64-v* will not change tuning.
> > I guess there should be some testsuite coverage for the somewhat unexpected
> > behavior of
> > -march=skylake -march=x86-64-v3
> > actually acting as
> > -march=x86-64-v3 -mtune=skylake
> > though perhaps it needs to be skipped if user used explicit -mtune= and
> > not sure how to actually test that (-fverbose-asm doesn't print -mtune=
> > when it is not explicit).
>
> I think the compiler driver collapses -march=skylake -march=x86-64-v3
> to -march=x86-64-v3, dropping the tuning.  The cc1 option parser also
> drops the first -march=.  That's a bit surprising to me.  It means
> that we can't use multiple tuning/non-tuning -march= switches, and
> that tuning with (say) -march=x86-64-v3 needs to use -mtune.
>
> PTA_NO_TUNE is still needed because we'd define __tune_k8__ otherwise
> (and switch to K8 tuning internally).
>
> Is it okay to simply document this?  Perhaps like this?

Thinking about it more, wouldn't it be better to just imply generic tuning
for these -march= options?

        Jakub
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
* Jakub Jelinek:

> On Wed, Sep 30, 2020 at 02:27:38PM +0200, Florian Weimer wrote:
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -29258,6 +29258,13 @@ of the selected instruction set.
>>  @item x86-64
>>  A generic CPU with 64-bit extensions.
>>
>> +@item x86-64-v2
>> +@itemx x86-64-v3
>> +@itemx x86-64-v4
>> +These choices for @var{cpu-type} select the corresponding
>> +micro-architecture level from the x86-64 psABI.  They are only available
>> +when compiling for a x86-64 target that uses the System V psABI@.
>
> I think the documentation should state that these are not valid in -mtune=,
> just in -march=, and that using -march=x86-64-v* will not change tuning.
> I guess there should be some testsuite coverage for the somewhat unexpected
> behavior of
> -march=skylake -march=x86-64-v3
> actually acting as
> -march=x86-64-v3 -mtune=skylake
> though perhaps it needs to be skipped if user used explicit -mtune= and
> not sure how to actually test that (-fverbose-asm doesn't print -mtune=
> when it is not explicit).

I think the compiler driver collapses -march=skylake -march=x86-64-v3
to -march=x86-64-v3, dropping the tuning.  The cc1 option parser also
drops the first -march=.  That's a bit surprising to me.  It means
that we can't use multiple tuning/non-tuning -march= switches, and
that tuning with (say) -march=x86-64-v3 needs to use -mtune.

PTA_NO_TUNE is still needed because we'd define __tune_k8__ otherwise
(and switch to K8 tuning internally).

Is it okay to simply document this?  Perhaps like this?

8<--8<
These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.

The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

gcc/:
        PR target/97250
        * config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
        (PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
        * common/config/i386/i386-common.c (processor_alias_table):
        Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
        * config/i386/i386-options.c (ix86_option_override_internal):
        Handle new PTA_NO_TUNE processor table entries.
        * doc/invoke.texi (x86 Options): Document new -march values.

gcc/testsuite/:
        PR target/97250
        * gcc.target/i386/x86-64-v2.c: New test.
        * gcc.target/i386/x86-64-v3.c: New test.
        * gcc.target/i386/x86-64-v3-haswell.c: New test.
        * gcc.target/i386/x86-64-v3-skylake.c: New test.
        * gcc.target/i386/x86-64-v4.c: New test.

---
 gcc/common/config/i386/i386-common.c              |  10 +-
 gcc/config/i386/i386-options.c                    |  27 -
 gcc/config/i386/i386.h                            |  11 +-
 gcc/doc/invoke.texi                               |  15 ++-
 gcc/testsuite/gcc.target/i386/x86-64-v2.c         | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v3-haswell.c |  18
 gcc/testsuite/gcc.target/i386/x86-64-v3-skylake.c |  21
 gcc/testsuite/gcc.target/i386/x86-64-v3.c         | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v4.c         | 116 ++
 9 files changed, 440 insertions(+), 10 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 10142149115..62a620b4430 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
     PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
     PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
-  {"x86-64", PROCESSOR_K8, CPU_K8,
-    PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
-    0, P_NONE},
+  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
+  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
+   0, P_NONE},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
     PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
     0, P_NONE},
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 597de533fbd..cf48a911798 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2058,10 +2058,25 @@ ix86_option_override_internal (bool main_args_p,
             return false;
           }
 
+        /* Only the x86-64 psABI defines the feature-only
+           micro-architecture levels that use PTA_NO_TUNE.  */
+        if ((processor_alias_table[i].flags & PTA_NO
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
On Wed, Sep 30, 2020 at 02:27:38PM +0200, Florian Weimer wrote:
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -29258,6 +29258,13 @@ of the selected instruction set.
>  @item x86-64
>  A generic CPU with 64-bit extensions.
>
> +@item x86-64-v2
> +@itemx x86-64-v3
> +@itemx x86-64-v4
> +These choices for @var{cpu-type} select the corresponding
> +micro-architecture level from the x86-64 psABI.  They are only available
> +when compiling for a x86-64 target that uses the System V psABI@.

I think the documentation should state that these are not valid in -mtune=,
just in -march=, and that using -march=x86-64-v* will not change tuning.
I guess there should be some testsuite coverage for the somewhat unexpected
behavior of
-march=skylake -march=x86-64-v3
actually acting as
-march=x86-64-v3 -mtune=skylake
though perhaps it needs to be skipped if user used explicit -mtune= and
not sure how to actually test that (-fverbose-asm doesn't print -mtune=
when it is not explicit).

        Jakub
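One way to observe which tuning a given option combination selects is to
probe the tuning macros GCC predefines.  The following is a minimal sketch,
assuming a GCC-compatible compiler; tuning_macro, has_avx2 and check_probe
are hypothetical helper names, and only three of the many __tune_*__ macros
are probed.

```c
#include <assert.h>
#include <stddef.h>

/* Report which of a few tuning macros the compiler predefined for
   this translation unit.  The result depends entirely on the -march=
   and -mtune= options in effect.  */
const char *tuning_macro (void)
{
#if defined (__tune_k8__)
  return "__tune_k8__";
#elif defined (__tune_haswell__)
  return "__tune_haswell__";
#elif defined (__tune_skylake__)
  return "__tune_skylake__";
#else
  return "none of the probed tuning macros";
#endif
}

/* Report whether the AVX2 ISA macro is predefined, as it should be
   for a level that includes AVX2.  */
int has_avx2 (void)
{
#ifdef __AVX2__
  return 1;
#else
  return 0;
#endif
}

/* Basic sanity check that a test driver can call.  */
void check_probe (void)
{
  assert (tuning_macro () != NULL);
  assert (has_avx2 () == 0 || has_avx2 () == 1);
}
```

Compiling the same file once with -march=x86-64-v3 alone and once with
-march=x86-64-v3 -mtune=skylake, then comparing the results, would show
whether the level option changed the tuning.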
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
* Uros Bizjak:

> On Wed, Sep 30, 2020 at 2:27 PM Florian Weimer wrote:
>>
>> These micro-architecture levels are defined in the x86-64 psABI:
>>
>> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
>>
>> PTA_NO_TUNE is introduced so that the new processor alias table entries
>> do not affect the CPU tuning setting in ix86_tune.
>>
>> The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
>> ("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").
>>
>> gcc/:
>>         PR target/97250
>>         * config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
>>         (PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
>>         * common/config/i386/i386-common.c (processor_alias_table):
>>         Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
>>         * config/i386/i386-options.c (ix86_option_override_internal):
>>         Handle new PTA_NO_TUNE processor table entries.
>>         * doc/invoke.texi (x86 Options): Document new -march values.
>>
>> gcc/testsuite/:
>>         PR target/97250
>>         * gcc.target/i386/x86-64-v2.c: New test.
>>         * gcc.target/i386/x86-64-v3.c: New test.
>>         * gcc.target/i386/x86-64-v4.c: New test.
>
> Perhaps you should also test for the newly introduced __LAHF_SAHF__ define?

Like this?

Thanks.

8<--8<
These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.

The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

gcc/:
        PR target/97250
        * config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
        (PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
        * common/config/i386/i386-common.c (processor_alias_table):
        Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
        * config/i386/i386-options.c (ix86_option_override_internal):
        Handle new PTA_NO_TUNE processor table entries.
        * doc/invoke.texi (x86 Options): Document new -march values.

gcc/testsuite/:
        PR target/97250
        * gcc.target/i386/x86-64-v2.c: New test.
        * gcc.target/i386/x86-64-v3.c: New test.
        * gcc.target/i386/x86-64-v4.c: New test.

---
 gcc/common/config/i386/i386-common.c      |  10 ++-
 gcc/config/i386/i386-options.c            |  27 +--
 gcc/config/i386/i386.h                    |  11 ++-
 gcc/doc/invoke.texi                       |   7 ++
 gcc/testsuite/gcc.target/i386/x86-64-v2.c | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v3.c | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v4.c | 116 ++
 7 files changed, 394 insertions(+), 9 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 10142149115..62a620b4430 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
     PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
     PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
-  {"x86-64", PROCESSOR_K8, CPU_K8,
-    PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
-    0, P_NONE},
+  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
+  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
+   0, P_NONE},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
     PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
     0, P_NONE},
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 597de533fbd..cf48a911798 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2058,10 +2058,25 @@ ix86_option_override_internal (bool main_args_p,
             return false;
           }
 
+        /* Only the x86-64 psABI defines the feature-only
+           micro-architecture levels that use PTA_NO_TUNE.  */
+        if ((processor_alias_table[i].flags & PTA_NO_TUNE) != 0
+            && (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
+                || opts->x_ix86_abi != SYSV_ABI))
+          {
+            error (G_("%<%s%> architecture level is only defined"
+                      " for the x86-64 psABI"), opts->x_ix86_arch_string);
+            return false;
+          }
+
         ix86_schedule = processor_alias_table[i].schedule;
         ix86_arch = processor_alias_table[i].processor;
-        /* Default cpu tuning to the architecture.  */
-        ix86_tune = ix
Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
On Wed, Sep 30, 2020 at 2:27 PM Florian Weimer wrote:
>
> These micro-architecture levels are defined in the x86-64 psABI:
>
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
>
> PTA_NO_TUNE is introduced so that the new processor alias table entries
> do not affect the CPU tuning setting in ix86_tune.
>
> The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
> ("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").
>
> gcc/:
>         PR target/97250
>         * config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
>         (PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
>         * common/config/i386/i386-common.c (processor_alias_table):
>         Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
>         * config/i386/i386-options.c (ix86_option_override_internal):
>         Handle new PTA_NO_TUNE processor table entries.
>         * doc/invoke.texi (x86 Options): Document new -march values.
>
> gcc/testsuite/:
>         PR target/97250
>         * gcc.target/i386/x86-64-v2.c: New test.
>         * gcc.target/i386/x86-64-v3.c: New test.
>         * gcc.target/i386/x86-64-v4.c: New test.

Perhaps you should also test for the newly introduced __LAHF_SAHF__ define?

Uros.

> ---
>
> Notes (not going to be committed):
>
> I struggled a bit to avoid ICEs when I used PROCESSOR_GENERIC
> instead of PROCESSOR_K8 in the new processor alias table entries.  In
> the end, I think not resetting the tuning setting is the correct thing
> to do.
>
> Test results on x86-64 (on Debian buster) look okay-ish to me.  I see
> lots of obviously unrelated FAILs.
>
>  gcc/common/config/i386/i386-common.c      |  10 ++-
>  gcc/config/i386/i386-options.c            |  27 +--
>  gcc/config/i386/i386.h                    |  11 ++-
>  gcc/doc/invoke.texi                       |   7 ++
>  gcc/testsuite/gcc.target/i386/x86-64-v2.c | 113 ++
>  gcc/testsuite/gcc.target/i386/x86-64-v3.c | 113 ++
>  gcc/testsuite/gcc.target/i386/x86-64-v4.c | 113 ++
>  7 files changed, 385 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> index 10142149115..62a620b4430 100644
> --- a/gcc/common/config/i386/i386-common.c
> +++ b/gcc/common/config/i386/i386-common.c
> @@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
>      PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
>    {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
>      PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
> -  {"x86-64", PROCESSOR_K8, CPU_K8,
> -    PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
> -    0, P_NONE},
> +  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
> +  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
> +   0, P_NONE},
> +  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
> +   0, P_NONE},
> +  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
> +   0, P_NONE},
>    {"eden-x2", PROCESSOR_K8, CPU_K8,
>      PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
>      0, P_NONE},
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 597de533fbd..cf48a911798 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -2058,10 +2058,25 @@ ix86_option_override_internal (bool main_args_p,
>              return false;
>            }
>
> +        /* Only the x86-64 psABI defines the feature-only
> +           micro-architecture levels that use PTA_NO_TUNE.  */
> +        if ((processor_alias_table[i].flags & PTA_NO_TUNE) != 0
> +            && (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
> +                || opts->x_ix86_abi != SYSV_ABI))
> +          {
> +            error (G_("%<%s%> architecture level is only defined"
> +                      " for the x86-64 psABI"), opts->x_ix86_arch_string);
> +            return false;
> +          }
> +
>          ix86_schedule = processor_alias_table[i].schedule;
>          ix86_arch = processor_alias_table[i].processor;
> -        /* Default cpu tuning to the architecture.  */
> -        ix86_tune = ix86_arch;
> +
> +        /* Default cpu tuning to the architecture, unless the table
> +           entry requests not to do this.  Used by the x86-64 psABI
> +           micro-architecture levels.  */
> +        if ((processor_alias_table[i].flags & PTA_NO_TUNE) == 0)
> +          ix86_tune = ix86_arch;
>
>          if (((processor_alias_table[i].flags & PTA_MMX) != 0)
>              && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_MMX))
> @@ -2384,7 +2399,8 @@ ix86_option_override_internal (bool main_args_p,
>      ix86_arch_features[i] = !!(initial_ix86_arch_features[i] & ix86_arch_mask);
>
>    for (i = 0; i < pta_size; i++)
> -    if (! strcmp (opts->x_ix86_tune_string, processor_alias_table[i].na
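A check for __LAHF_SAHF__ alongside the other ISA macros could look roughly
like the sketch below.  It is not taken from the gcc.target/i386 tests, whose
contents are not shown in this thread; x86_64_level is a hypothetical helper,
and the macro lists are one reading of the psABI level definitions that may
be incomplete.  On compilers that predate the __LAHF_SAHF__ macro the
function cannot report more than level 1.

```c
#include <assert.h>

/* Return the highest x86-64 psABI level (1 to 4) for which all of the
   probed ISA macros are predefined, or 0 when not compiling for
   x86-64.  Purely a compile-time probe; no CPUID is involved.  */
int x86_64_level (void)
{
  int level = 0;
#if defined (__x86_64__)
  level = 1;
#if defined (__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) \
    && defined (__LAHF_SAHF__) && defined (__POPCNT__) \
    && defined (__SSE3__) && defined (__SSSE3__) \
    && defined (__SSE4_1__) && defined (__SSE4_2__)
  level = 2;
#if defined (__AVX__) && defined (__AVX2__) && defined (__BMI__) \
    && defined (__BMI2__) && defined (__F16C__) && defined (__FMA__) \
    && defined (__LZCNT__) && defined (__MOVBE__)
  level = 3;
#if defined (__AVX512F__) && defined (__AVX512BW__) \
    && defined (__AVX512CD__) && defined (__AVX512DQ__) \
    && defined (__AVX512VL__)
  level = 4;
#endif /* level 4 macros */
#endif /* level 3 macros */
#endif /* level 2 macros */
#endif /* __x86_64__ */
  assert (level >= 0 && level <= 4);
  return level;
}
```

The nesting mirrors the way each level builds on the previous one, so a
missing level-2 macro keeps the result at 1 even when AVX2 is enabled.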
[PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.

The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

gcc/:
        PR target/97250
        * config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
        (PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
        * common/config/i386/i386-common.c (processor_alias_table):
        Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
        * config/i386/i386-options.c (ix86_option_override_internal):
        Handle new PTA_NO_TUNE processor table entries.
        * doc/invoke.texi (x86 Options): Document new -march values.

gcc/testsuite/:
        PR target/97250
        * gcc.target/i386/x86-64-v2.c: New test.
        * gcc.target/i386/x86-64-v3.c: New test.
        * gcc.target/i386/x86-64-v4.c: New test.

---

Notes (not going to be committed):

I struggled a bit to avoid ICEs when I used PROCESSOR_GENERIC
instead of PROCESSOR_K8 in the new processor alias table entries.  In
the end, I think not resetting the tuning setting is the correct thing
to do.

Test results on x86-64 (on Debian buster) look okay-ish to me.  I see
lots of obviously unrelated FAILs.

 gcc/common/config/i386/i386-common.c      |  10 ++-
 gcc/config/i386/i386-options.c            |  27 +--
 gcc/config/i386/i386.h                    |  11 ++-
 gcc/doc/invoke.texi                       |   7 ++
 gcc/testsuite/gcc.target/i386/x86-64-v2.c | 113 ++
 gcc/testsuite/gcc.target/i386/x86-64-v3.c | 113 ++
 gcc/testsuite/gcc.target/i386/x86-64-v4.c | 113 ++
 7 files changed, 385 insertions(+), 9 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 10142149115..62a620b4430 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
     PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
     PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
-  {"x86-64", PROCESSOR_K8, CPU_K8,
-    PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
-    0, P_NONE},
+  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
+  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
+   0, P_NONE},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
     PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
     0, P_NONE},
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 597de533fbd..cf48a911798 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2058,10 +2058,25 @@ ix86_option_override_internal (bool main_args_p,
             return false;
           }
 
+        /* Only the x86-64 psABI defines the feature-only
+           micro-architecture levels that use PTA_NO_TUNE.  */
+        if ((processor_alias_table[i].flags & PTA_NO_TUNE) != 0
+            && (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
+                || opts->x_ix86_abi != SYSV_ABI))
+          {
+            error (G_("%<%s%> architecture level is only defined"
+                      " for the x86-64 psABI"), opts->x_ix86_arch_string);
+            return false;
+          }
+
         ix86_schedule = processor_alias_table[i].schedule;
         ix86_arch = processor_alias_table[i].processor;
-        /* Default cpu tuning to the architecture.  */
-        ix86_tune = ix86_arch;
+
+        /* Default cpu tuning to the architecture, unless the table
+           entry requests not to do this.  Used by the x86-64 psABI
+           micro-architecture levels.  */
+        if ((processor_alias_table[i].flags & PTA_NO_TUNE) == 0)
+          ix86_tune = ix86_arch;
 
         if (((processor_alias_table[i].flags & PTA_MMX) != 0)
             && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_MMX))
@@ -2384,7 +2399,8 @@ ix86_option_override_internal (bool main_args_p,
     ix86_arch_features[i] = !!(initial_ix86_arch_features[i] & ix86_arch_mask);
 
   for (i = 0; i < pta_size; i++)
-    if (! strcmp (opts->x_ix86_tune_string, processor_alias_table[i].name))
+    if (! strcmp (opts->x_ix86_tune_string, processor_alias_table[i].name)
+        && (processor_alias_table[i].flags & PTA_NO_TUNE) == 0)
       {
         ix86_schedule = processor_alias_table[i].schedule;
         ix86_tune = processor_alias_table[i].processor;
@@ -2428,8 +2444,9 @@ ix86_option_override_internal (bool main_args_p,
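The effect of PTA_NO_TUNE in the two hunks above can be modelled outside of
GCC with a tiny table-driven program.  This is a simplified sketch, not the
actual implementation: FLAG_NO_TUNE, struct alias, alias_table, apply_march
and apply_mtune are hypothetical names standing in for PTA_NO_TUNE,
processor_alias_table and the relevant parts of
ix86_option_override_internal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's processor enumeration.  */
enum processor { PROC_GENERIC, PROC_K8, PROC_SKYLAKE };

/* Plays the role of PTA_NO_TUNE: a feature-only table entry.  */
#define FLAG_NO_TUNE 0x1u

struct alias
{
  const char *name;
  enum processor processor;
  unsigned flags;
};

/* A three-entry model of the processor alias table.  */
static const struct alias alias_table[] = {
  { "x86-64",    PROC_K8,      0 },
  { "x86-64-v3", PROC_K8,      FLAG_NO_TUNE },
  { "skylake",   PROC_SKYLAKE, 0 },
};

/* Simplified model of the (ix86_arch, ix86_tune) pair.  */
struct selection { enum processor arch; enum processor tune; };

static const struct alias *find_alias (const char *name)
{
  for (size_t i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++)
    if (strcmp (name, alias_table[i].name) == 0)
      return &alias_table[i];
  return NULL;
}

/* Model of -march=NAME: always sets the architecture, but leaves the
   tuning alone for feature-only entries.  Returns 0 on success and
   -1 for an unknown name.  */
int apply_march (struct selection *sel, const char *name)
{
  const struct alias *a = find_alias (name);
  assert (sel != NULL);
  if (a == NULL)
    return -1;
  sel->arch = a->processor;
  if ((a->flags & FLAG_NO_TUNE) == 0)
    sel->tune = a->processor;
  return 0;
}

/* Model of -mtune=NAME: feature-only entries are skipped during the
   lookup, so they are rejected like unknown names.  */
int apply_mtune (struct selection *sel, const char *name)
{
  const struct alias *a = find_alias (name);
  assert (sel != NULL);
  if (a == NULL || (a->flags & FLAG_NO_TUNE) != 0)
    return -1;
  sel->tune = a->processor;
  return 0;
}
```

Starting from a generic tuning, apply_march with "x86-64-v3" keeps the tuning
generic in this model, whereas apply_march with "x86-64" switches it to K8.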