Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2021-04-02 Thread Richard Biener via Gcc-patches
On April 2, 2021 10:00:53 AM GMT+02:00, Matthias Klose  wrote:
>On 9/30/20 2:27 PM, Florian Weimer wrote:
>> These micro-architecture levels are defined in the x86-64 psABI:
>> 
>> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
>> 
>> PTA_NO_TUNE is introduced so that the new processor alias table entries
>> do not affect the CPU tuning setting in ix86_tune.
>> 
>> The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
>> ("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").
>
>
>I would like to see this series backported to the gcc-10 branch (already doing
>that for distro builds). With the backport for PR target/95842 from March 30,
>these can now be backported without changes. The series consists of:
>
>i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags
>commit 92e652d8c21bd7e66cbb0f9001542a2f55345af0
>
>Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
>commit 324bec558e95584e8c1997575ae9d75978af59f1
>
>Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
>commit 16664e6e4fb4281be6477c13989740d44c963c77
>
>Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
>commit 552ed3ea761324bdd42c1a40d4bbef91432da29a
>
>i386: Make -march=x86-64-v[234] behave more like other -march= options
>commit 59482fa1e7243bd905c7e27c92ae2b89c79fff87
>
>and
>
>Fix up plugin header install
>commit 9a83366b62e585cce5577309013a832f895ccdbf
>
>The latter is needed on the gcc-10 branch anyway.
>
>I know, it's late, but the PR target/95842 backport only happened two
>days ago.

This can be done after the 10.3 release if really necessary. 

Richard. 

>Matthias



Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2021-04-02 Thread Matthias Klose
On 9/30/20 2:27 PM, Florian Weimer wrote:
> These micro-architecture levels are defined in the x86-64 psABI:
> 
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
> 
> PTA_NO_TUNE is introduced so that the new processor alias table entries
> do not affect the CPU tuning setting in ix86_tune.
> 
> The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
> ("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").


I would like to see this series backported to the gcc-10 branch (already doing
that for distro builds). With the backport for PR target/95842 from March 30,
these can now be backported without changes. The series consists of:

i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags
commit 92e652d8c21bd7e66cbb0f9001542a2f55345af0

Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
commit 324bec558e95584e8c1997575ae9d75978af59f1

Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
commit 16664e6e4fb4281be6477c13989740d44c963c77

Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64
commit 552ed3ea761324bdd42c1a40d4bbef91432da29a

i386: Make -march=x86-64-v[234] behave more like other -march= options
commit 59482fa1e7243bd905c7e27c92ae2b89c79fff87

and

Fix up plugin header install
commit 9a83366b62e585cce5577309013a832f895ccdbf

The latter is needed on the gcc-10 branch anyway.

I know, it's late, but the PR target/95842 backport only happened two days ago.

Matthias


Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-10-01 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 30, 2020 at 6:28 PM Jakub Jelinek  wrote:
>
> On Wed, Sep 30, 2020 at 06:06:31PM +0200, Florian Weimer wrote:
> > This is what I came up with.  It is not valid to set ix86_arch to
> > PROCESSOR_GENERIC, which is why PTA_NO_TUNE is still needed.
>
> Ok, LGTM, but would prefer Uros to have final voice.

OK from my side.

Thanks,
Uros.


Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Jakub Jelinek via Gcc-patches
On Wed, Sep 30, 2020 at 06:06:31PM +0200, Florian Weimer wrote:
> This is what I came up with.  It is not valid to set ix86_arch to
> PROCESSOR_GENERIC, which is why PTA_NO_TUNE is still needed.

Ok, LGTM, but would prefer Uros to have final voice.

Jakub



Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Florian Weimer
* Jakub Jelinek:

> On Wed, Sep 30, 2020 at 04:29:34PM +0200, Florian Weimer wrote:
>> > Thinking about it more, wouldn't it be better to just imply generic tuning
>> > for these -march= options?
>> 
>> I think this is what the patch does?  See the x86-64-v3-haswell.c
>> test.
>
> No, I think it will have that behavior solely when the compiler has been
> configured to default to -mtune=generic.
> What I'm suggesting is to not ignore the tuning like you do for PTA_NO_TUNE,
> but instead perhaps use PROCESSOR_GENERIC and special case it in the code
> so that ix86_arch will be set to PROCESSOR_K8 in that case and only
> ix86_tune will be PROCESSOR_GENERIC.

This is what I came up with.  It is not valid to set ix86_arch to
PROCESSOR_GENERIC, which is why PTA_NO_TUNE is still needed.
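To make the arch/tune split discussed above concrete, here is a minimal sketch of one way to read it. It is an assumption, not the code from the patch: the relevant hunk is cut off in this archive, and the names cpu_kind, level_entry, cpu_choice and choose_defaults are hypothetical stand-ins rather than GCC identifiers. The idea is that an entry flagged as "no tune" keeps a real processor for the architecture and falls back to generic tuning only.

```c
/* Hypothetical model of the arch/tune split; not GCC's real types.  */
enum cpu_kind { CPU_KIND_GENERIC, CPU_KIND_K8, CPU_KIND_SKYLAKE };

struct level_entry {
  enum cpu_kind processor;  /* processor named in the alias table entry */
  int no_tune;              /* models the PTA_NO_TUNE flag */
};

struct cpu_choice {
  enum cpu_kind arch;       /* models ix86_arch */
  enum cpu_kind tune;       /* models ix86_tune */
};

/* Pick the defaults implied by one alias table entry.  The architecture
   always comes from the entry, so it is never the generic processor for
   the entries modeled here; only the tuning falls back to generic when
   the entry opts out of arch-based tuning.  */
static struct cpu_choice choose_defaults(struct level_entry e)
{
  struct cpu_choice c;
  c.arch = e.processor;
  c.tune = e.no_tune ? CPU_KIND_GENERIC : e.processor;
  return c;
}
```

With this model, an entry { CPU_KIND_K8, 1 } yields arch K8 and generic tuning, while { CPU_KIND_SKYLAKE, 0 } tunes for the same processor it selects.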

8<--8<
These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.

The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

gcc/:
PR target/97250
* config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
(PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
* common/config/i386/i386-common.c (processor_alias_table):
Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
* config/i386/i386-options.c (ix86_option_override_internal):
Handle new PTA_NO_TUNE processor table entries.
* doc/invoke.texi (x86 Options): Document new -march values.

gcc/testsuite/:
PR target/97250
* gcc.target/i386/x86-64-v2.c: New test.
* gcc.target/i386/x86-64-v3.c: New test.
* gcc.target/i386/x86-64-v3-haswell.c: New test.
* gcc.target/i386/x86-64-v3-skylake.c: New test.
* gcc.target/i386/x86-64-v4.c: New test.

---
 gcc/common/config/i386/i386-common.c              |  10 +-
 gcc/config/i386/i386-options.c                    |  29 +-
 gcc/config/i386/i386.h                            |  11 +-
 gcc/doc/invoke.texi                               |  15 ++-
 gcc/testsuite/gcc.target/i386/x86-64-v2.c         | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v3-haswell.c |  18
 gcc/testsuite/gcc.target/i386/x86-64-v3-skylake.c |  21
 gcc/testsuite/gcc.target/i386/x86-64-v3.c         | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v4.c         | 116 ++
 9 files changed, 442 insertions(+), 10 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 10142149115..62a620b4430 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
-  {"x86-64", PROCESSOR_K8, CPU_K8,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
-0, P_NONE},
+  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
+  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
+   0, P_NONE},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
 0, P_NONE},
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 597de533fbd..a59bd703880 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2058,10 +2058,27 @@ ix86_option_override_internal (bool main_args_p,
return false;
  }
 
+   /* The feature-only micro-architecture levels that use
+  PTA_NO_TUNE are only defined for the x86-64 psABI.  */
+   if ((processor_alias_table[i].flags & PTA_NO_TUNE) != 0
+   && (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
+   || opts->x_ix86_abi != SYSV_ABI))
+ {
+   error (G_("%<%s%> architecture level is only defined"
+ " for the x86-64 psABI"), opts->x_ix86_arch_string);
+   return false;
+ }
+
ix86_schedule = processor_alias_table[i].schedule;
ix86_arch = processor_alias_table[i].processor;
-   /* Default cpu tuning to the architecture.  */
-   ix86_tune = ix86_arch;
+
+   /* Default cpu tuning to the architecture, unless the table
+  entry requests not to do this.  Used by the x86-64 psABI
+  micro-architecture levels.  */
+   if ((processor_alias_table[i].flags & PTA_NO_TUNE) == 0

Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Jakub Jelinek via Gcc-patches
On Wed, Sep 30, 2020 at 04:29:34PM +0200, Florian Weimer wrote:
> > Thinking about it more, wouldn't it be better to just imply generic tuning
> > for these -march= options?
> 
> I think this is what the patch does?  See the x86-64-v3-haswell.c
> test.

No, I think it will have that behavior solely when the compiler has been
configured to default to -mtune=generic.
What I'm suggesting is to not ignore the tuning like you do for PTA_NO_TUNE,
but instead perhaps use PROCESSOR_GENERIC and special case it in the code
so that ix86_arch will be set to PROCESSOR_K8 in that case and only
ix86_tune will be PROCESSOR_GENERIC.

Jakub



Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Florian Weimer
* Jakub Jelinek:

> On Wed, Sep 30, 2020 at 04:05:41PM +0200, Florian Weimer wrote:
>> > I think the documentation should state that these are not valid in -mtune=,
>> > just in -march=, and that using -march=x86-64-v* will not change tuning.
>> > I guess there should be some testsuite coverage for the perhaps unexpected
>> > behavior of
>> > -march=skylake -march=x86-64-v3
>> > actually acting as
>> > -march=x86-64-v3 -mtune=skylake
>> > though perhaps it needs to be skipped if user used explicit -mtune= and
>> > not sure how to actually test that (-fverbose-asm doesn't print -mtune=
>> > when it is not explicit).
>> 
>> I think the compiler driver collapses -march=skylake -march=x86-64-v3
>> to -march=x86-64-v3, dropping the tuning.  The cc1 option parser also
>> drops the first -march=.  That's a bit surprising to me.  It means
>> that we can't use multiple tuning/non-tuning -march= switches, and
>> that tuning with (say) -march=x86-64-v3 needs to use -mtune.
>> 
>> PTA_NO_TUNE is still needed because we'd define __tune_k8__ otherwise
>> (and switch to K8 tuning internally).
>> 
>> Is it okay to simply document this?  Perhaps like this?
>
> Thinking about it more, wouldn't it be better to just imply generic tuning
> for these -march= options?

I think this is what the patch does?  See the x86-64-v3-haswell.c
test.

I tried to explain this in the documentation.  I do not think this is
particularly confusing for end users because they do not see the
implementation, which is making this complicated.

I think we should not set generic tuning in processor_alias_table
because it would override tuning for target clones, and I don't think
we want to do that automatically.


Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Jakub Jelinek via Gcc-patches
On Wed, Sep 30, 2020 at 04:05:41PM +0200, Florian Weimer wrote:
> > I think the documentation should state that these are not valid in -mtune=,
> > just in -march=, and that using -march=x86-64-v* will not change tuning.
> > I guess there should be some testsuite coverage for the perhaps unexpected
> > behavior of
> > -march=skylake -march=x86-64-v3
> > actually acting as
> > -march=x86-64-v3 -mtune=skylake
> > though perhaps it needs to be skipped if user used explicit -mtune= and
> > not sure how to actually test that (-fverbose-asm doesn't print -mtune=
> > when it is not explicit).
> 
> I think the compiler driver collapses -march=skylake -march=x86-64-v3
> to -march=x86-64-v3, dropping the tuning.  The cc1 option parser also
> drops the first -march=.  That's a bit surprising to me.  It means
> that we can't use multiple tuning/non-tuning -march= switches, and
> that tuning with (say) -march=x86-64-v3 needs to use -mtune.
> 
> PTA_NO_TUNE is still needed because we'd define __tune_k8__ otherwise
> (and switch to K8 tuning internally).
> 
> Is it okay to simply document this?  Perhaps like this?

Thinking about it more, wouldn't it be better to just imply generic tuning
for these -march= options?

Jakub



Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Florian Weimer
* Jakub Jelinek:

> On Wed, Sep 30, 2020 at 02:27:38PM +0200, Florian Weimer wrote:
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -29258,6 +29258,13 @@ of the selected instruction set.
>>  @item x86-64
>>  A generic CPU with 64-bit extensions.
>>  
>> +@item x86-64-v2
>> +@itemx x86-64-v3
>> +@itemx x86-64-v4
>> +These choices for @var{cpu-type} select the corresponding
>> +micro-architecture level from the x86-64 psABI.  They are only available
>> +when compiling for a x86-64 target that uses the System V psABI@.
>
> I think the documentation should state that these are not valid in -mtune=,
> just in -march=, and that using -march=x86-64-v* will not change tuning.
> I guess there should be some testsuite coverage for the perhaps unexpected
> behavior of
> -march=skylake -march=x86-64-v3
> actually acting as
> -march=x86-64-v3 -mtune=skylake
> though perhaps it needs to be skipped if user used explicit -mtune= and
> not sure how to actually test that (-fverbose-asm doesn't print -mtune=
> when it is not explicit).

I think the compiler driver collapses -march=skylake -march=x86-64-v3
to -march=x86-64-v3, dropping the tuning.  The cc1 option parser also
drops the first -march=.  That's a bit surprising to me.  It means
that we can't use multiple tuning/non-tuning -march= switches, and
that tuning with (say) -march=x86-64-v3 needs to use -mtune.
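The "last -march= wins" behaviour described above can be modeled in a few lines. This is only an illustration of the observed effect, not driver or cc1 code; effective_march is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Return the value of the last "-march=" argument in ARGV, or NULL if
   there is none.  Earlier -march= arguments are simply forgotten, which
   is why any tuning they would have implied is lost as well.  */
static const char *effective_march(int argc, const char *const *argv)
{
  const char *result = NULL;
  for (int i = 0; i < argc; i++)
    if (strncmp(argv[i], "-march=", 7) == 0)
      result = argv[i] + 7;
  return result;
}
```

Given the arguments -march=skylake -O2 -march=x86-64-v3, the helper returns "x86-64-v3", so Skylake tuning would have to be requested separately with -mtune=skylake.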

PTA_NO_TUNE is still needed because we'd define __tune_k8__ otherwise
(and switch to K8 tuning internally).

Is it okay to simply document this?  Perhaps like this?

8<--8<
These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.

The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

gcc/:
PR target/97250
* config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
(PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
* common/config/i386/i386-common.c (processor_alias_table):
Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
* config/i386/i386-options.c (ix86_option_override_internal):
Handle new PTA_NO_TUNE processor table entries.
* doc/invoke.texi (x86 Options): Document new -march values.

gcc/testsuite/:
PR target/97250
* gcc.target/i386/x86-64-v2.c: New test.
* gcc.target/i386/x86-64-v3.c: New test.
* gcc.target/i386/x86-64-v3-haswell.c: New test.
* gcc.target/i386/x86-64-v3-skylake.c: New test.
* gcc.target/i386/x86-64-v4.c: New test.

---
 gcc/common/config/i386/i386-common.c              |  10 +-
 gcc/config/i386/i386-options.c                    |  27 -
 gcc/config/i386/i386.h                            |  11 +-
 gcc/doc/invoke.texi                               |  15 ++-
 gcc/testsuite/gcc.target/i386/x86-64-v2.c         | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v3-haswell.c |  18
 gcc/testsuite/gcc.target/i386/x86-64-v3-skylake.c |  21
 gcc/testsuite/gcc.target/i386/x86-64-v3.c         | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v4.c         | 116 ++
 9 files changed, 440 insertions(+), 10 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 10142149115..62a620b4430 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
-  {"x86-64", PROCESSOR_K8, CPU_K8,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
-0, P_NONE},
+  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
+  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
+   0, P_NONE},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
 0, P_NONE},
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 597de533fbd..cf48a911798 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2058,10 +2058,25 @@ ix86_option_override_internal (bool main_args_p,
return false;
  }
 
+   /* Only the x86-64 psABI defines the feature-only
+  micro-architecture levels that use PTA_NO_TUNE.  */
+   if ((processor_alias_table[i].flags & PTA_NO

Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Jakub Jelinek via Gcc-patches
On Wed, Sep 30, 2020 at 02:27:38PM +0200, Florian Weimer wrote:
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -29258,6 +29258,13 @@ of the selected instruction set.
>  @item x86-64
>  A generic CPU with 64-bit extensions.
>  
> +@item x86-64-v2
> +@itemx x86-64-v3
> +@itemx x86-64-v4
> +These choices for @var{cpu-type} select the corresponding
> +micro-architecture level from the x86-64 psABI.  They are only available
> +when compiling for a x86-64 target that uses the System V psABI@.

I think the documentation should state that these are not valid in -mtune=,
just in -march=, and that using -march=x86-64-v* will not change tuning.
I guess there should be some testsuite coverage for the perhaps unexpected
behavior of
-march=skylake -march=x86-64-v3
actually acting as
-march=x86-64-v3 -mtune=skylake
though perhaps it needs to be skipped if user used explicit -mtune= and
not sure how to actually test that (-fverbose-asm doesn't print -mtune=
when it is not explicit).
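On the first point, the posted patch appears to reject these names in the -mtune= lookup by skipping table entries flagged PTA_NO_TUNE. A minimal sketch of that kind of filter follows; tune_alias, ALIAS_NO_TUNE, tune_aliases and find_tune are hypothetical names for illustration, and the table contents are made up.

```c
#include <stddef.h>
#include <string.h>

#define ALIAS_NO_TUNE 1u  /* hypothetical stand-in for PTA_NO_TUNE */

struct tune_alias {
  const char *name;
  unsigned flags;
};

/* Illustrative table; the real processor_alias_table is much larger.  */
static const struct tune_alias tune_aliases[] = {
  { "generic",   0 },
  { "skylake",   0 },
  { "x86-64-v3", ALIAS_NO_TUNE },
};

/* Return the table index of NAME if it is acceptable for -mtune=, or -1.
   Entries flagged ALIAS_NO_TUNE are treated as if they did not exist.  */
static int find_tune(const char *name)
{
  size_t n = sizeof tune_aliases / sizeof tune_aliases[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp(name, tune_aliases[i].name) == 0
        && (tune_aliases[i].flags & ALIAS_NO_TUNE) == 0)
      return (int) i;
  return -1;
}
```

Here find_tune("skylake") finds an entry, while find_tune("x86-64-v3") fails the same way an unknown name does.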

Jakub



Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Florian Weimer
* Uros Bizjak:

> On Wed, Sep 30, 2020 at 2:27 PM Florian Weimer  wrote:
>>
>> These micro-architecture levels are defined in the x86-64 psABI:
>>
>> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
>>
>> PTA_NO_TUNE is introduced so that the new processor alias table entries
>> do not affect the CPU tuning setting in ix86_tune.
>>
>> The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
>> ("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").
>>
>> gcc/:
>> PR target/97250
>> * config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
>> (PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
>> * common/config/i386/i386-common.c (processor_alias_table):
>> Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
>> * config/i386/i386-options.c (ix86_option_override_internal):
>> Handle new PTA_NO_TUNE processor table entries.
>> * doc/invoke.texi (x86 Options): Document new -march values.
>>
>> gcc/testsuite/:
>> PR target/97250
>> * gcc.target/i386/x86-64-v2.c: New test.
>> * gcc.target/i386/x86-64-v3.c: New test.
>> * gcc.target/i386/x86-64-v4.c: New test.
>
> Perhaps you should also test for the newly introduced __LAHF_SAHF__ define?

Like this?  Thanks.

8<--8<
These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.

The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

gcc/:
PR target/97250
* config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
(PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
* common/config/i386/i386-common.c (processor_alias_table):
Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
* config/i386/i386-options.c (ix86_option_override_internal):
Handle new PTA_NO_TUNE processor table entries.
* doc/invoke.texi (x86 Options): Document new -march values.

gcc/testsuite/:
PR target/97250
* gcc.target/i386/x86-64-v2.c: New test.
* gcc.target/i386/x86-64-v3.c: New test.
* gcc.target/i386/x86-64-v4.c: New test.

---
 gcc/common/config/i386/i386-common.c      |  10 ++-
 gcc/config/i386/i386-options.c            |  27 +--
 gcc/config/i386/i386.h                    |  11 ++-
 gcc/doc/invoke.texi                       |   7 ++
 gcc/testsuite/gcc.target/i386/x86-64-v2.c | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v3.c | 116 ++
 gcc/testsuite/gcc.target/i386/x86-64-v4.c | 116 ++
 7 files changed, 394 insertions(+), 9 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 10142149115..62a620b4430 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
-  {"x86-64", PROCESSOR_K8, CPU_K8,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
-0, P_NONE},
+  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
+  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
+   0, P_NONE},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
 0, P_NONE},
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 597de533fbd..cf48a911798 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2058,10 +2058,25 @@ ix86_option_override_internal (bool main_args_p,
return false;
  }
 
+   /* Only the x86-64 psABI defines the feature-only
+  micro-architecture levels that use PTA_NO_TUNE.  */
+   if ((processor_alias_table[i].flags & PTA_NO_TUNE) != 0
+   && (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
+   || opts->x_ix86_abi != SYSV_ABI))
+ {
+   error (G_("%<%s%> architecture level is only defined"
+ " for the x86-64 psABI"), opts->x_ix86_arch_string);
+   return false;
+ }
+
ix86_schedule = processor_alias_table[i].schedule;
ix86_arch = processor_alias_table[i].processor;
-   /* Default cpu tuning to the architecture.  */
-   ix86_tune = ix

Re: [PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 30, 2020 at 2:27 PM Florian Weimer  wrote:
>
> These micro-architecture levels are defined in the x86-64 psABI:
>
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
>
> PTA_NO_TUNE is introduced so that the new processor alias table entries
> do not affect the CPU tuning setting in ix86_tune.
>
> The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
> ("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").
>
> gcc/:
> PR target/97250
> * config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
> (PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
> * common/config/i386/i386-common.c (processor_alias_table):
> Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
> * config/i386/i386-options.c (ix86_option_override_internal):
> Handle new PTA_NO_TUNE processor table entries.
> * doc/invoke.texi (x86 Options): Document new -march values.
>
> gcc/testsuite/:
> PR target/97250
> * gcc.target/i386/x86-64-v2.c: New test.
> * gcc.target/i386/x86-64-v3.c: New test.
> * gcc.target/i386/x86-64-v4.c: New test.

Perhaps you should also test for the newly introduced __LAHF_SAHF__ define?

Uros.
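As a rough illustration of what such a check looks at: the macros named in the commit message are ordinary predefined macros, so a test can probe them with the preprocessor. The sketch below only reports whether they are defined; the actual tests in the patch are not shown in this archive and may well be structured differently. The function names are hypothetical.

```c
/* Report whether the compiler predefines __LAHF_SAHF__ for the current
   -march=/-m options.  Returns 1 if defined, 0 otherwise.  */
static int have_lahf_sahf(void)
{
#ifdef __LAHF_SAHF__
  return 1;
#else
  return 0;
#endif
}

/* Same probe for __MOVBE__.  */
static int have_movbe(void)
{
#ifdef __MOVBE__
  return 1;
#else
  return 0;
#endif
}
```

The result depends entirely on the compiler and on the flags used to build the file, so no particular value is assumed here.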

> ---
>
> Notes (not going to be committed):
>
> I struggled a bit to avoid ICEs when I used PROCESSOR_GENERIC
> instead of PROCESSOR_K8 in the new processor alias table entries.  In
> the end, I think not resetting the tuning setting is the correct thing
> to do.
>
> Test results on x86-64 (on Debian buster) look okay-ish to me.  I see
> lots of obviously unrelated FAILs.
>
>  gcc/common/config/i386/i386-common.c      |  10 ++-
>  gcc/config/i386/i386-options.c            |  27 +--
>  gcc/config/i386/i386.h                    |  11 ++-
>  gcc/doc/invoke.texi                       |   7 ++
>  gcc/testsuite/gcc.target/i386/x86-64-v2.c | 113 ++
>  gcc/testsuite/gcc.target/i386/x86-64-v3.c | 113 ++
>  gcc/testsuite/gcc.target/i386/x86-64-v4.c | 113 ++
>  7 files changed, 385 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
> index 10142149115..62a620b4430 100644
> --- a/gcc/common/config/i386/i386-common.c
> +++ b/gcc/common/config/i386/i386-common.c
> @@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
>  PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
>{"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
>  PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
> -  {"x86-64", PROCESSOR_K8, CPU_K8,
> -PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
> -0, P_NONE},
> +  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
> +  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
> +   0, P_NONE},
> +  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
> +   0, P_NONE},
> +  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
> +   0, P_NONE},
>{"eden-x2", PROCESSOR_K8, CPU_K8,
>  PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
>  0, P_NONE},
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 597de533fbd..cf48a911798 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -2058,10 +2058,25 @@ ix86_option_override_internal (bool main_args_p,
> return false;
>   }
>
> +   /* Only the x86-64 psABI defines the feature-only
> +  micro-architecture levels that use PTA_NO_TUNE.  */
> +   if ((processor_alias_table[i].flags & PTA_NO_TUNE) != 0
> +   && (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
> +   || opts->x_ix86_abi != SYSV_ABI))
> + {
> +   error (G_("%<%s%> architecture level is only defined"
> + " for the x86-64 psABI"), opts->x_ix86_arch_string);
> +   return false;
> + }
> +
> ix86_schedule = processor_alias_table[i].schedule;
> ix86_arch = processor_alias_table[i].processor;
> -   /* Default cpu tuning to the architecture.  */
> -   ix86_tune = ix86_arch;
> +
> +   /* Default cpu tuning to the architecture, unless the table
> +  entry requests not to do this.  Used by the x86-64 psABI
> +  micro-architecture levels.  */
> +   if ((processor_alias_table[i].flags & PTA_NO_TUNE) == 0)
> + ix86_tune = ix86_arch;
>
> if (((processor_alias_table[i].flags & PTA_MMX) != 0)
> && !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_MMX))
> @@ -2384,7 +2399,8 @@ ix86_option_override_internal (bool main_args_p,
>  ix86_arch_features[i] = !!(initial_ix86_arch_features[i] & ix86_arch_mask);
>
>for (i = 0; i < pta_size; i++)
> -if (! strcmp (opts->x_ix86_tune_string, processor_alias_table[i].na

[PATCH] PR target/97250: i386: Add support for x86-64-v2, x86-64-v3, x86-64-v4 levels for x86-64

2020-09-30 Thread Florian Weimer
These micro-architecture levels are defined in the x86-64 psABI:

https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9

PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.

The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").

gcc/:
PR target/97250
* config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
(PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
* common/config/i386/i386-common.c (processor_alias_table):
Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
* config/i386/i386-options.c (ix86_option_override_internal):
Handle new PTA_NO_TUNE processor table entries.
* doc/invoke.texi (x86 Options): Document new -march values.

gcc/testsuite/:
PR target/97250
* gcc.target/i386/x86-64-v2.c: New test.
* gcc.target/i386/x86-64-v3.c: New test.
* gcc.target/i386/x86-64-v4.c: New test.

---

Notes (not going to be committed):

I struggled a bit to avoid ICEs when I used PROCESSOR_GENERIC
instead of PROCESSOR_K8 in the new processor alias table entries.  In
the end, I think not resetting the tuning setting is the correct thing
to do.
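A minimal stand-alone model of the behaviour described here, with the psABI check from the patch folded in. It is a simplified sketch, not GCC code: processor, alias_entry, target_state, FLAG_NO_TUNE and apply_march are hypothetical names, and the table holds only three illustrative entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's internal types.  */
enum processor { PROC_GENERIC, PROC_K8, PROC_HASWELL };

#define FLAG_NO_TUNE 1u  /* models PTA_NO_TUNE */

struct alias_entry {
  const char *name;
  enum processor processor;
  unsigned flags;
};

static const struct alias_entry alias_table[] = {
  { "x86-64",    PROC_K8,      0 },
  { "x86-64-v3", PROC_K8,      FLAG_NO_TUNE },
  { "haswell",   PROC_HASWELL, 0 },
};

struct target_state {
  enum processor arch;  /* models ix86_arch */
  enum processor tune;  /* models ix86_tune */
};

/* Apply -march=NAME to ST.  Returns false, leaving ST untouched, if NAME
   is unknown or if it is a feature-only level and the target is not a
   64-bit System V one.  */
static bool apply_march(struct target_state *st, const char *name,
                        bool is_64bit_sysv)
{
  for (size_t i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++) {
    const struct alias_entry *e = &alias_table[i];
    if (strcmp(name, e->name) != 0)
      continue;
    if ((e->flags & FLAG_NO_TUNE) != 0 && !is_64bit_sysv)
      return false;
    st->arch = e->processor;
    /* Default tuning follows the architecture unless the entry opts out,
       in which case the previous tuning is left alone.  */
    if ((e->flags & FLAG_NO_TUNE) == 0)
      st->tune = st->arch;
    return true;
  }
  return false;
}
```

In this model, applying "haswell" sets both fields, whereas applying "x86-64-v3" afterwards changes only arch and leaves the earlier tune value in place.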

Test results on x86-64 (on Debian buster) look okay-ish to me.  I see
lots of obviously unrelated FAILs.

 gcc/common/config/i386/i386-common.c      |  10 ++-
 gcc/config/i386/i386-options.c            |  27 +--
 gcc/config/i386/i386.h                    |  11 ++-
 gcc/doc/invoke.texi                       |   7 ++
 gcc/testsuite/gcc.target/i386/x86-64-v2.c | 113 ++
 gcc/testsuite/gcc.target/i386/x86-64-v3.c | 113 ++
 gcc/testsuite/gcc.target/i386/x86-64-v4.c | 113 ++
 7 files changed, 385 insertions(+), 9 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 10142149115..62a620b4430 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -1795,9 +1795,13 @@ const pta processor_alias_table[] =
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR, 0, P_NONE},
-  {"x86-64", PROCESSOR_K8, CPU_K8,
-PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR,
-0, P_NONE},
+  {"x86-64", PROCESSOR_K8, CPU_K8, PTA_X86_64_BASELINE, 0, P_NONE},
+  {"x86-64-v2", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V2 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v3", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V3 | PTA_NO_TUNE,
+   0, P_NONE},
+  {"x86-64-v4", PROCESSOR_K8, CPU_GENERIC, PTA_X86_64_V4 | PTA_NO_TUNE,
+   0, P_NONE},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_FXSR,
 0, P_NONE},
diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 597de533fbd..cf48a911798 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2058,10 +2058,25 @@ ix86_option_override_internal (bool main_args_p,
return false;
  }
 
+   /* Only the x86-64 psABI defines the feature-only
+  micro-architecture levels that use PTA_NO_TUNE.  */
+   if ((processor_alias_table[i].flags & PTA_NO_TUNE) != 0
+   && (!TARGET_64BIT_P (opts->x_ix86_isa_flags)
+   || opts->x_ix86_abi != SYSV_ABI))
+ {
+   error (G_("%<%s%> architecture level is only defined"
+ " for the x86-64 psABI"), opts->x_ix86_arch_string);
+   return false;
+ }
+
ix86_schedule = processor_alias_table[i].schedule;
ix86_arch = processor_alias_table[i].processor;
-   /* Default cpu tuning to the architecture.  */
-   ix86_tune = ix86_arch;
+
+   /* Default cpu tuning to the architecture, unless the table
+  entry requests not to do this.  Used by the x86-64 psABI
+  micro-architecture levels.  */
+   if ((processor_alias_table[i].flags & PTA_NO_TUNE) == 0)
+ ix86_tune = ix86_arch;
 
if (((processor_alias_table[i].flags & PTA_MMX) != 0)
&& !(opts->x_ix86_isa_flags_explicit & OPTION_MASK_ISA_MMX))
@@ -2384,7 +2399,8 @@ ix86_option_override_internal (bool main_args_p,
 ix86_arch_features[i] = !!(initial_ix86_arch_features[i] & ix86_arch_mask);
 
   for (i = 0; i < pta_size; i++)
-if (! strcmp (opts->x_ix86_tune_string, processor_alias_table[i].name))
+if (! strcmp (opts->x_ix86_tune_string, processor_alias_table[i].name)
+   && (processor_alias_table[i].flags & PTA_NO_TUNE) == 0)
   {
ix86_schedule = processor_alias_table[i].schedule;
ix86_tune = processor_alias_table[i].processor;
@@ -2428,8 +2444,9 @@ ix86_option_override_internal (bool main_args_p,