Re: [PATCH RFC] c++: implement C++17 hardware interference size

2021-09-15 Thread Jason Merrill via Gcc-patches

On 9/15/21 8:31 AM, Martin Liška wrote:

On 9/14/21 09:56, Christophe LYON via Gcc-patches wrote:

So adjustment is needed for both arm and aarch64 targets


Hello.

I noticed the same problem and I've got a patch candidate for it.

What do you think about it?


I've now silenced the warning for internal default values, but they're 
still worth discussion.


For arm, I think it would make sense to use 32 for constructive for the 
generic target, but perhaps we think the older chips aren't worth 
constraining new code for?  In which case, perhaps 
param_l1_cache_line_size should also default to 64 for the generic target.


For aarch64, it seems even more questionable that 
param_l1_cache_line_size is 32 for the generic target; are there any 
aarch64 CPUs with a 32B L1 cache line?


Jason



Re: [PATCH RFC] c++: implement C++17 hardware interference size

2021-09-15 Thread Martin Liška

On 9/14/21 09:56, Christophe LYON via Gcc-patches wrote:

So adjustment is needed for both arm and aarch64 targets


Hello.

I noticed the same problem and I've got a patch candidate for it.

What do you think about it?
MartinFrom 2ecedfe5cc421dd36f5770b04553343ecebd3430 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 15 Sep 2021 14:29:52 +0200
Subject: [PATCH] Fix -Werror=interference-size on ARM and Aarch64.

gcc/ChangeLog:

	* config/aarch64/aarch64.c (aarch64_override_options_internal):
	Use default L1 cache line size from params.
	* config/arm/arm.c (arm_option_override): Likewise.
---
 gcc/config/aarch64/aarch64.c | 2 +-
 gcc/config/arm/arm.c | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 36519ccc5a5..4eb50bdb7d5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -16559,7 +16559,7 @@ aarch64_override_options_internal (struct gcc_options *opts)
 			   256);
   SET_OPTION_IF_UNSET (opts, _options_set,
 			   param_construct_interfere_size,
-			   64);
+			   param_l1_cache_line_size);
 }
 
   if (aarch64_tune_params.prefetch->l2_cache_size >= 0)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6c6e77fab66..5d14e1e5e9c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3688,7 +3688,8 @@ arm_option_override (void)
   SET_OPTION_IF_UNSET (_options, _options_set,
 			   param_destruct_interfere_size, 64);
   SET_OPTION_IF_UNSET (_options, _options_set,
-			   param_construct_interfere_size, 64);
+			   param_construct_interfere_size,
+			   param_l1_cache_line_size);
 }
 
   if (current_tune->prefetch.l1_cache_size >= 0)
-- 
2.33.0



Re: [PATCH RFC] c++: implement C++17 hardware interference size

2021-09-15 Thread Christophe Lyon via Gcc-patches
On Wed, Sep 15, 2021 at 12:25 PM Richard Earnshaw via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

>
>
> On 14/09/2021 08:56, Christophe LYON via Gcc-patches wrote:
> >
> > On 10/09/2021 15:16, Jason Merrill via Gcc-patches wrote:
> >> OK, time to finish this up.  The main change relative to the last
> >> patch I sent
> >> to the list is dropping the -finterference-tune flag and making that
> >> behavior
> >> the default.  Any more comments?
> >>
> >> 
> >>
> >> The last missing piece of the C++17 standard library is the hardware
> >> intereference size constants.  Much of the delay in implementing these
> >> has
> >> been due to uncertainty about what the right values are, and even
> whether
> >> there is a single constant value that is suitable; the destructive
> >> interference size is intended to be used in structure layout, so program
> >> ABIs will depend on it.
> >>
> >> In principle, both of these values should be the same as the target's L1
> >> cache line size.  When compiling for a generic target that is intended
> to
> >> support a range of target CPUs with different cache line sizes, the
> >> constructive size should probably be the minimum size, and the
> >> destructive
> >> size the maximum, unless you are constrained by ABI compatibility with
> >> previous code.
> >>
> >>  From discussion on gcc-patches, I've come to the conclusion that the
> >> solution to the difficulty of choosing stable values is to give up on
> it,
> >> and instead encourage only uses where ABI stability is unimportant: in
> >> particular, uses where the ABI is shared at most between translation
> >> units
> >> built at the same time with the same flags.
> >>
> >> To that end, I've added a warning for any use of the constant value of
> >> std::hardware_destructive_interference_size in a header or module
> export.
> >> Appropriate uses within a project can disable the warning.
> >>
> >> A previous iteration of this patch included an -finterference-tune
> >> flag to
> >> make the value vary with -mtune; this iteration makes that the default
> >> behavior, which should be appropriate for all reasonable uses of the
> >> variable.  The previous default of "stable-ish" seems to me likely to
> >> have
> >> been more of an attractive nuisance; since we can't promise actual
> >> stability, we should instead make proper uses more convenient.
> >>
> >> JF Bastien's implementation proposal is summarized at
> >> https://github.com/itanium-cxx-abi/cxx-abi/issues/74
> >>
> >> I implement this by adding new --params for the two sizes.  Targets can
> >> override these values in targetm.target_option.override() to support a
> >> range
> >> of values for the generic target; otherwise, both will default to the L1
> >> cache line size.
> >>
> >> 64 bytes still seems correct for all x86.
> >>
> >> I'm not sure why he proposed 64/64 for generic 32-bit ARM, since the
> >> Cortex
> >> A9 has a 32-byte cache line, so I'd think 32/64 would make more sense.
> >
> > While this works for an arm-linux-gnueabihf toolchain configured
> > --with-mode=arm, it fails --with-mode=thumb (also using
> > --with-cpu=cortex-a9):
> >
> > : error: '--param constructive-interference-size=64' is
> > greater than '--param l1-cache-line-size=32' [-Werror=interference-size]
> > cc1plus: all warnings being treated as errors
> > make[4]: *** [Makefile:678: alloc_c.lo] Error 1
> >
> >
> >
> >>
> >> He proposed 64/128 for generic AArch64, but since the A64FX now has a
> >> 256B
> >> cache line, I've changed that to 64/256.
> >
> >
> > Similarly, for aarch64 I'm seeing:
> >
> > : error: '--param constructive-interference-size=64' is
> > greater than '--param l1-cache-line-size=32' [-Werror=interference-size]
> > cc1plus: all warnings being treated as errors
> > make[4]: *** [Makefile:678: alloc_c.lo] Error 1
> >
> >
> > So adjustment is needed for both arm and aarch64 targets
> >
> >
>
> FWIW, I'm still in discussion with our architects about the best values
> to use here, but certainly this needs fixing quickly as it seems to be
> breaking hundreds of tests in the C++ testsuite.
>

Indeed. A lot of messages to gcc-testresults are blocked because they are
too large.

Christophe


>
> R.
>
> > Christophe
> >
> >
> >>
> >> With the above choice to reject stability as a goal, getting these
> values
> >> "right" is now just a matter of what we want the default optimization
> >> to be,
> >> and we can feel free to adjust them as CPUs with different cache lines
> >> become more and less common.
> >>
> >> gcc/ChangeLog:
> >>
> >> * params.opt: Add destructive-interference-size and
> >> constructive-interference-size.
> >> * doc/invoke.texi: Document them.
> >> * config/aarch64/aarch64.c (aarch64_override_options_internal):
> >> Set them.
> >> * config/arm/arm.c (arm_option_override): Set them.
> >> * config/i386/i386-options.c (ix86_option_override_internal):
> >> Set them.
> >>
> >> gcc/c-family/ChangeLog:
> >>
> >> * c.opt: Add 

Re: [PATCH RFC] c++: implement C++17 hardware interference size

2021-09-15 Thread Richard Earnshaw via Gcc-patches




On 14/09/2021 08:56, Christophe LYON via Gcc-patches wrote:


On 10/09/2021 15:16, Jason Merrill via Gcc-patches wrote:
OK, time to finish this up.  The main change relative to the last 
patch I sent
to the list is dropping the -finterference-tune flag and making that 
behavior

the default.  Any more comments?



The last missing piece of the C++17 standard library is the hardware
intereference size constants.  Much of the delay in implementing these 
has

been due to uncertainty about what the right values are, and even whether
there is a single constant value that is suitable; the destructive
interference size is intended to be used in structure layout, so program
ABIs will depend on it.

In principle, both of these values should be the same as the target's L1
cache line size.  When compiling for a generic target that is intended to
support a range of target CPUs with different cache line sizes, the
constructive size should probably be the minimum size, and the 
destructive

size the maximum, unless you are constrained by ABI compatibility with
previous code.

 From discussion on gcc-patches, I've come to the conclusion that the
solution to the difficulty of choosing stable values is to give up on it,
and instead encourage only uses where ABI stability is unimportant: in
particular, uses where the ABI is shared at most between translation 
units

built at the same time with the same flags.

To that end, I've added a warning for any use of the constant value of
std::hardware_destructive_interference_size in a header or module export.
Appropriate uses within a project can disable the warning.

A previous iteration of this patch included an -finterference-tune 
flag to

make the value vary with -mtune; this iteration makes that the default
behavior, which should be appropriate for all reasonable uses of the
variable.  The previous default of "stable-ish" seems to me likely to 
have

been more of an attractive nuisance; since we can't promise actual
stability, we should instead make proper uses more convenient.

JF Bastien's implementation proposal is summarized at
https://github.com/itanium-cxx-abi/cxx-abi/issues/74

I implement this by adding new --params for the two sizes.  Targets can
override these values in targetm.target_option.override() to support a 
range

of values for the generic target; otherwise, both will default to the L1
cache line size.

64 bytes still seems correct for all x86.

I'm not sure why he proposed 64/64 for generic 32-bit ARM, since the 
Cortex

A9 has a 32-byte cache line, so I'd think 32/64 would make more sense.


While this works for an arm-linux-gnueabihf toolchain configured 
--with-mode=arm, it fails --with-mode=thumb (also using 
--with-cpu=cortex-a9):


: error: '--param constructive-interference-size=64' is 
greater than '--param l1-cache-line-size=32' [-Werror=interference-size]

cc1plus: all warnings being treated as errors
make[4]: *** [Makefile:678: alloc_c.lo] Error 1





He proposed 64/128 for generic AArch64, but since the A64FX now has a 
256B

cache line, I've changed that to 64/256.



Similarly, for aarch64 I'm seeing:

: error: '--param constructive-interference-size=64' is 
greater than '--param l1-cache-line-size=32' [-Werror=interference-size]

cc1plus: all warnings being treated as errors
make[4]: *** [Makefile:678: alloc_c.lo] Error 1


So adjustment is needed for both arm and aarch64 targets




FWIW, I'm still in discussion with our architects about the best values 
to use here, but certainly this needs fixing quickly as it seems to be 
breaking hundreds of tests in the C++ testsuite.


R.


Christophe




With the above choice to reject stability as a goal, getting these values
"right" is now just a matter of what we want the default optimization 
to be,

and we can feel free to adjust them as CPUs with different cache lines
become more and less common.

gcc/ChangeLog:

* params.opt: Add destructive-interference-size and
constructive-interference-size.
* doc/invoke.texi: Document them.
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Set them.
* config/arm/arm.c (arm_option_override): Set them.
* config/i386/i386-options.c (ix86_option_override_internal):
Set them.

gcc/c-family/ChangeLog:

* c.opt: Add -Winterference-size.
* c-cppbuiltin.c (cpp_atomic_builtins): Add __GCC_DESTRUCTIVE_SIZE
and __GCC_CONSTRUCTIVE_SIZE.

gcc/cp/ChangeLog:

* constexpr.c (maybe_warn_about_constant_value):
Complain about std::hardware_destructive_interference_size.
(cxx_eval_constant_expression): Call it.
* decl.c (cxx_init_decl_processing): Check
--param *-interference-size values.

libstdc++-v3/ChangeLog:

* include/std/version: Define __cpp_lib_hardware_interference_size.
* libsupc++/new: Define hardware interference size variables.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Winterference.h: New file.
* g++.dg/warn/Winterference.C: New test.
* 

Re: [PATCH RFC] c++: implement C++17 hardware interference size

2021-09-14 Thread Christophe LYON via Gcc-patches



On 10/09/2021 15:16, Jason Merrill via Gcc-patches wrote:

OK, time to finish this up.  The main change relative to the last patch I sent
to the list is dropping the -finterference-tune flag and making that behavior
the default.  Any more comments?



The last missing piece of the C++17 standard library is the hardware
intereference size constants.  Much of the delay in implementing these has
been due to uncertainty about what the right values are, and even whether
there is a single constant value that is suitable; the destructive
interference size is intended to be used in structure layout, so program
ABIs will depend on it.

In principle, both of these values should be the same as the target's L1
cache line size.  When compiling for a generic target that is intended to
support a range of target CPUs with different cache line sizes, the
constructive size should probably be the minimum size, and the destructive
size the maximum, unless you are constrained by ABI compatibility with
previous code.

 From discussion on gcc-patches, I've come to the conclusion that the
solution to the difficulty of choosing stable values is to give up on it,
and instead encourage only uses where ABI stability is unimportant: in
particular, uses where the ABI is shared at most between translation units
built at the same time with the same flags.

To that end, I've added a warning for any use of the constant value of
std::hardware_destructive_interference_size in a header or module export.
Appropriate uses within a project can disable the warning.

A previous iteration of this patch included an -finterference-tune flag to
make the value vary with -mtune; this iteration makes that the default
behavior, which should be appropriate for all reasonable uses of the
variable.  The previous default of "stable-ish" seems to me likely to have
been more of an attractive nuisance; since we can't promise actual
stability, we should instead make proper uses more convenient.

JF Bastien's implementation proposal is summarized at
https://github.com/itanium-cxx-abi/cxx-abi/issues/74

I implement this by adding new --params for the two sizes.  Targets can
override these values in targetm.target_option.override() to support a range
of values for the generic target; otherwise, both will default to the L1
cache line size.

64 bytes still seems correct for all x86.

I'm not sure why he proposed 64/64 for generic 32-bit ARM, since the Cortex
A9 has a 32-byte cache line, so I'd think 32/64 would make more sense.


While this works for an arm-linux-gnueabihf toolchain configured 
--with-mode=arm, it fails --with-mode=thumb (also using 
--with-cpu=cortex-a9):


: error: '--param constructive-interference-size=64' is 
greater than '--param l1-cache-line-size=32' [-Werror=interference-size]

cc1plus: all warnings being treated as errors
make[4]: *** [Makefile:678: alloc_c.lo] Error 1





He proposed 64/128 for generic AArch64, but since the A64FX now has a 256B
cache line, I've changed that to 64/256.



Similarly, for aarch64 I'm seeing:

: error: '--param constructive-interference-size=64' is 
greater than '--param l1-cache-line-size=32' [-Werror=interference-size]

cc1plus: all warnings being treated as errors
make[4]: *** [Makefile:678: alloc_c.lo] Error 1


So adjustment is needed for both arm and aarch64 targets


Christophe




With the above choice to reject stability as a goal, getting these values
"right" is now just a matter of what we want the default optimization to be,
and we can feel free to adjust them as CPUs with different cache lines
become more and less common.

gcc/ChangeLog:

* params.opt: Add destructive-interference-size and
constructive-interference-size.
* doc/invoke.texi: Document them.
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Set them.
* config/arm/arm.c (arm_option_override): Set them.
* config/i386/i386-options.c (ix86_option_override_internal):
Set them.

gcc/c-family/ChangeLog:

* c.opt: Add -Winterference-size.
* c-cppbuiltin.c (cpp_atomic_builtins): Add __GCC_DESTRUCTIVE_SIZE
and __GCC_CONSTRUCTIVE_SIZE.

gcc/cp/ChangeLog:

* constexpr.c (maybe_warn_about_constant_value):
Complain about std::hardware_destructive_interference_size.
(cxx_eval_constant_expression): Call it.
* decl.c (cxx_init_decl_processing): Check
--param *-interference-size values.

libstdc++-v3/ChangeLog:

* include/std/version: Define __cpp_lib_hardware_interference_size.
* libsupc++/new: Define hardware interference size variables.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Winterference.h: New file.
* g++.dg/warn/Winterference.C: New test.
* g++.target/aarch64/interference.C: New test.
* g++.target/arm/interference.C: New test.
* g++.target/i386/interference.C: New test.
---
  gcc/doc/invoke.texi   | 65 

[PATCH RFC] c++: implement C++17 hardware interference size

2021-09-10 Thread Jason Merrill via Gcc-patches
OK, time to finish this up.  The main change relative to the last patch I sent
to the list is dropping the -finterference-tune flag and making that behavior
the default.  Any more comments?



The last missing piece of the C++17 standard library is the hardware
intereference size constants.  Much of the delay in implementing these has
been due to uncertainty about what the right values are, and even whether
there is a single constant value that is suitable; the destructive
interference size is intended to be used in structure layout, so program
ABIs will depend on it.

In principle, both of these values should be the same as the target's L1
cache line size.  When compiling for a generic target that is intended to
support a range of target CPUs with different cache line sizes, the
constructive size should probably be the minimum size, and the destructive
size the maximum, unless you are constrained by ABI compatibility with
previous code.

>From discussion on gcc-patches, I've come to the conclusion that the
solution to the difficulty of choosing stable values is to give up on it,
and instead encourage only uses where ABI stability is unimportant: in
particular, uses where the ABI is shared at most between translation units
built at the same time with the same flags.

To that end, I've added a warning for any use of the constant value of
std::hardware_destructive_interference_size in a header or module export.
Appropriate uses within a project can disable the warning.

A previous iteration of this patch included an -finterference-tune flag to
make the value vary with -mtune; this iteration makes that the default
behavior, which should be appropriate for all reasonable uses of the
variable.  The previous default of "stable-ish" seems to me likely to have
been more of an attractive nuisance; since we can't promise actual
stability, we should instead make proper uses more convenient.

JF Bastien's implementation proposal is summarized at
https://github.com/itanium-cxx-abi/cxx-abi/issues/74

I implement this by adding new --params for the two sizes.  Targets can
override these values in targetm.target_option.override() to support a range
of values for the generic target; otherwise, both will default to the L1
cache line size.

64 bytes still seems correct for all x86.

I'm not sure why he proposed 64/64 for generic 32-bit ARM, since the Cortex
A9 has a 32-byte cache line, so I'd think 32/64 would make more sense.

He proposed 64/128 for generic AArch64, but since the A64FX now has a 256B
cache line, I've changed that to 64/256.

With the above choice to reject stability as a goal, getting these values
"right" is now just a matter of what we want the default optimization to be,
and we can feel free to adjust them as CPUs with different cache lines
become more and less common.

gcc/ChangeLog:

* params.opt: Add destructive-interference-size and
constructive-interference-size.
* doc/invoke.texi: Document them.
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Set them.
* config/arm/arm.c (arm_option_override): Set them.
* config/i386/i386-options.c (ix86_option_override_internal):
Set them.

gcc/c-family/ChangeLog:

* c.opt: Add -Winterference-size.
* c-cppbuiltin.c (cpp_atomic_builtins): Add __GCC_DESTRUCTIVE_SIZE
and __GCC_CONSTRUCTIVE_SIZE.

gcc/cp/ChangeLog:

* constexpr.c (maybe_warn_about_constant_value):
Complain about std::hardware_destructive_interference_size.
(cxx_eval_constant_expression): Call it.
* decl.c (cxx_init_decl_processing): Check
--param *-interference-size values.

libstdc++-v3/ChangeLog:

* include/std/version: Define __cpp_lib_hardware_interference_size.
* libsupc++/new: Define hardware interference size variables.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Winterference.h: New file.
* g++.dg/warn/Winterference.C: New test.
* g++.target/aarch64/interference.C: New test.
* g++.target/arm/interference.C: New test.
* g++.target/i386/interference.C: New test.
---
 gcc/doc/invoke.texi   | 65 +++
 gcc/c-family/c.opt|  5 ++
 gcc/params.opt| 16 +
 gcc/c-family/c-cppbuiltin.c   | 12 
 gcc/config/aarch64/aarch64.c  | 22 +++
 gcc/config/arm/arm.c  | 22 +++
 gcc/config/i386/i386-options.c|  6 ++
 gcc/cp/constexpr.c| 33 ++
 gcc/cp/decl.c | 32 +
 gcc/testsuite/g++.dg/warn/Winterference-2.C   | 14 
 gcc/testsuite/g++.dg/warn/Winterference.C |  6 ++
 .../g++.target/aarch64/interference.C |  9 +++
 gcc/testsuite/g++.target/arm/interference.C   |  9 +++
 gcc/testsuite/g++.target/i386/interference.C  |  8 +++