Re: [gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-18 Thread Mike Gilbert
On Tue, Oct 18, 2022 at 12:47 PM Ulrich Mueller wrote:
>
> > On Tue, 18 Oct 2022, David Seifert wrote:
>
> > What if I want to build Gentoo on an old AMD Thunderbird which has
> > neither SSE1 nor the more important SSE2?
>
> The -mfpmath=sse option is a no-op if the target CPU doesn't support
> SSE, i.e. GCC will use 387 arithmetic nevertheless.

I don't really see an "effective" way to deploy this via profiles on x86.

We could add it to the default CFLAGS setting in
profiles/arch/x86/make.defaults. However, we also default to
-march=i686 there, and that doesn't support SSE or SSE2. Also, the
entire CFLAGS variable is likely to be overridden by the CFLAGS
setting in /etc/make.conf.

The CFLAGS_x86 profile variable is only used by the
multilib_toolchain_setup function in multilib.eclass. In other words,
it only affects ebuilds that utilize the multilib eclasses to build
libraries for multiple ABIs. That covers all 32-bit libraries on
amd64, but doesn't cover all packages on x86.



Re: [gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-18 Thread Ulrich Mueller
> On Tue, 18 Oct 2022, David Seifert wrote:

> What if I want to build Gentoo on an old AMD Thunderbird which has
> neither SSE1 nor the more important SSE2?

The -mfpmath=sse option is a no-op if the target CPU doesn't support
SSE, i.e. GCC will use 387 arithmetic nevertheless.



Re: [gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-18 Thread Mike Gilbert
On Tue, Oct 18, 2022 at 5:56 AM David Seifert wrote:
>
> On Tue, 2022-10-18 at 10:14 +0200, Ulrich Mueller wrote:
> > > > > > > On Tue, 18 Oct 2022, Mike Gilbert wrote:
> >
> > > Reference: https://gcc.gnu.org/wiki/x87note
> >
> > Which says:
> >
> > > ... the amount of worst-case error that could possibly happen using
> > > the x87 (with any amount of intermediate rounding) is at worst the
> > > same as true 64 or 32 bit arithmetic, and in practice is almost
> > > > always better.
> >
> > and:
> >
> > > Note, however, that this greater repeatability comes at the cost of
> > > lost precision (i.e. SSE always gets the same precision because it
> > > always takes the equivalent of the x87's worst case: a forced round
> > > down at each step).
> >
> > So, it comes with a price, and I wonder if we shouldn't leave that
> > choice to the user, and go with the upstream GCC default?
> >
> > > -CFLAGS_x86="-m32"
> > > +CFLAGS_x86="-m32 -mfpmath=sse"
>
> -mfpmath=sse is already the default on amd64.

I have amended the first paragraph to make this clearer:

GCC uses x87 floating point instructions when building 32-bit x86
code by default. When building 64-bit code, SSE2 instructions are used
instead.



Re: [gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-18 Thread Mike Gilbert
On Tue, Oct 18, 2022 at 9:37 AM David Seifert wrote:
>
> On Tue, 2022-10-18 at 13:40 +0200, Ulrich Mueller wrote:
> > > > > > > On Tue, 18 Oct 2022, David Seifert wrote:
> >
> > > > > -CFLAGS_x86="-m32"
> > > > > +CFLAGS_x86="-m32 -mfpmath=sse"
> >
> > > -mfpmath=sse is already the default on amd64.
> >
> > I see. This change makes sense then.
> >
> > What about profiles/arch/x86 though? IIUC we'll end up with an
> > inconsistency between x86 and multilib amd64.
> >
> > Ulrich
>
> What if I want to build Gentoo on an old AMD Thunderbird which has
> neither SSE1 nor the more important SSE2?

Right. On amd64, the CPU always supports SSE2, so -mfpmath=sse will
always work there.

On x86, we need to consider a more diverse set of supported instructions.



Re: [gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-18 Thread David Seifert
On Tue, 2022-10-18 at 13:40 +0200, Ulrich Mueller wrote:
> > > > > > On Tue, 18 Oct 2022, David Seifert wrote:
> 
> > > > -CFLAGS_x86="-m32"
> > > > +CFLAGS_x86="-m32 -mfpmath=sse"
> 
> > -mfpmath=sse is already the default on amd64.
> 
> I see. This change makes sense then.
> 
> What about profiles/arch/x86 though? IIUC we'll end up with an
> inconsistency between x86 and multilib amd64.
> 
> Ulrich

What if I want to build Gentoo on an old AMD Thunderbird which has
neither SSE1 nor the more important SSE2?



Re: [gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-18 Thread Ulrich Mueller
> On Tue, 18 Oct 2022, David Seifert wrote:

>> > -CFLAGS_x86="-m32"
>> > +CFLAGS_x86="-m32 -mfpmath=sse"

> -mfpmath=sse is already the default on amd64.

I see. This change makes sense then.

What about profiles/arch/x86 though? IIUC we'll end up with an
inconsistency between x86 and multilib amd64.

Ulrich



Re: [gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-18 Thread David Seifert
On Tue, 2022-10-18 at 10:14 +0200, Ulrich Mueller wrote:
> > > > > > On Tue, 18 Oct 2022, Mike Gilbert wrote:
> 
> > Reference: https://gcc.gnu.org/wiki/x87note
> 
> Which says:
> 
> > ... the amount of worst-case error that could possibly happen using
> > the x87 (with any amount of intermediate rounding) is at worst the
> > same as true 64 or 32 bit arithmetic, and in practice is almost
> > always better.
> 
> and:
> 
> > Note, however, that this greater repeatability comes at the cost of
> > lost precision (i.e. SSE always gets the same precision because it
> > always takes the equivalent of the x87's worst case: a forced round
> > down at each step).
> 
> So, it comes with a price, and I wonder if we shouldn't leave that
> choice to the user, and go with the upstream GCC default?
> 
> > -CFLAGS_x86="-m32"
> > +CFLAGS_x86="-m32 -mfpmath=sse"

-mfpmath=sse is already the default on amd64.

> Also, why add the flag only to CFLAGS_x86 but not to CFLAGS_amd64?
> They should have the same single and double precision arithmetic?
> 
> Ulrich




Re: [gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-18 Thread Ulrich Mueller
> On Tue, 18 Oct 2022, Mike Gilbert wrote:

> Reference: https://gcc.gnu.org/wiki/x87note

Which says:

| ... the amount of worst-case error that could possibly happen using
| the x87 (with any amount of intermediate rounding) is at worst the
| same as true 64 or 32 bit arithmetic, and in practice is almost always
| better.

and:

| Note, however, that this greater repeatability comes at the cost of
| lost precision (i.e. SSE always gets the same precision because it
| always takes the equivalent of the x87's worst case: a forced round
| down at each step).

So, it comes with a price, and I wonder if we shouldn't leave that
choice to the user, and go with the upstream GCC default?
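
For illustration only (this is not from the thread or the GCC wiki), the
sketch below models the trade-off described in the quoted note. It compares
rounding a sum once to a 53-bit significand, as an SSE2 double-precision add
does, with rounding it first to the x87's 64-bit extended significand and
then again to 53 bits when the value is stored as a double. It uses exact
rationals in Python rather than real hardware, ignores exponent range limits,
and the helper name round_to_bits is hypothetical. The operands are picked so
that the two paths disagree in the last bit; for typical inputs they agree.

```python
from fractions import Fraction


def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round x to `bits` significant binary digits, ties to even.

    Simplified model: the exponent range is unbounded, so overflow and
    subnormal numbers are not represented.
    """
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    ulp = Fraction(2) ** (e - bits + 1)
    # round() on a Fraction rounds half to even and returns an int.
    return sign * round(x / ulp) * ulp


# Both operands are exactly representable as IEEE 754 doubles.
a = Fraction(1)
b = Fraction(1, 2**53) + Fraction(1, 2**64)
exact = a + b

# SSE2-style: a single rounding, straight to a 53-bit significand.
sse_result = round_to_bits(exact, 53)

# x87-style: round to the 64-bit extended significand first, then to
# 53 bits when the value is stored to a double in memory.
x87_result = round_to_bits(round_to_bits(exact, 64), 53)

print("single rounding:", float(sse_result).hex())
print("double rounding:", float(x87_result).hex())
print("results differ: ", sse_result != x87_result)
```

In this model the single-rounded sum is 1 + 2^-52, while the twice-rounded
sum is exactly 1. Whether the second rounding happens on real x87 code
depends on when the compiler stores the register to memory, which is one
source of the inconsistency under discussion.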

> -CFLAGS_x86="-m32"
> +CFLAGS_x86="-m32 -mfpmath=sse"

Also, why add the flag only to CFLAGS_x86 but not to CFLAGS_amd64?
They should have the same single and double precision arithmetic?

Ulrich



[gentoo-dev] [PATCH] profiles/arch/amd64: add "-mfpmath=sse" to CFLAGS_x86

2022-10-17 Thread Mike Gilbert
GCC uses x87 floating point instructions when building 32-bit x86
code by default. This is true even for x86-64 multilib.

Using the x87 floating point unit can lead to strange behavior when
calculating intermediate values for single and double precision floats.
It uses 80 bits for all calculations, which is larger than the 32 or 64
bits specified for floats and doubles.

Using the SSE2 instructions available on x86-64 for floating point
arithmetic leads to more consistent behavior, and is usually faster.

Reference: https://gcc.gnu.org/wiki/x87note
Signed-off-by: Mike Gilbert 
---
 profiles/arch/amd64/make.defaults | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/profiles/arch/amd64/make.defaults b/profiles/arch/amd64/make.defaults
index 0c05dec124e..e7e18ff6a91 100644
--- a/profiles/arch/amd64/make.defaults
+++ b/profiles/arch/amd64/make.defaults
@@ -1,4 +1,4 @@
-# Copyright 1999-2021 Gentoo Authors
+# Copyright 1999-2022 Gentoo Authors
 # Distributed under the terms of the GNU General Public License v2
 
 ARCH="amd64"
@@ -28,7 +28,7 @@ LDFLAGS_amd64="-m elf_x86_64"
 CHOST_amd64="x86_64-pc-linux-gnu"
 
 # 32bit specific settings.
-CFLAGS_x86="-m32"
+CFLAGS_x86="-m32 -mfpmath=sse"
 LDFLAGS_x86="-m elf_i386"
 CHOST_x86="i686-pc-linux-gnu"
 
-- 
2.37.3