Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-24 Thread Filipe Laíns via arch-dev-public
On Sat, 2019-05-25 at 10:35 +1000, Allan McRae wrote:
> On 25/5/19 10:17 am, Filipe Laíns via arch-dev-public wrote:
> > Hello,
> > 
> > Currently there are no guidelines stating which x86 extensions (e.g.
> > SSE2, SSE3, SSE4, AVX, etc.) we support. This is a bit problematic
> > since it lets compilers do what they want and possibly generate code
> > that can't run on some systems.
> > 
> > Even though this is an issue, it's not complete anarchy, at least yet!
> > Just kidding :p. The vast majority of our native packages are compiled
> > with GCC and we do default to `-mtune=generic` which is good but not
> > optimal. `-mtune=generic` tells GCC to compile for a generic processor
> > so it's up to GCC to decide which architecture extensions would compose
> > a generic processor. I haven't been able to find any documentation on
> > what x86 extensions are enabled for a "generic" processor but I was
> > able to track them down to MMX, SSE (or KNI) and SSE2. Being
> > undocumented they could change at any time so I don't think we should
> > rely on `-mtune=generic`.
> > 
> 
> I think you need to look at the difference between -march and -mtune.
> We use "-march=x86-64", which defines the instruction sets that can be
> used.  Adding "-mtune=generic" does not allow the inclusion of
> additional instruction sets.
> 
> Look at the output of:
> gcc -march=x86-64 -Q --help=target
> 
> Allan

Yes! My bad. I got confused.

From the `-march=x86-64` documentation:
A generic CPU with 64-bit extensions.

Setting `-mtune` to generic won't add any additional instruction sets
by itself, but it does not prevent instruction sets from being added.
It looks like GCC enables MMX, SSE and SSE2 by default. That isn't
related at all to `-march` like I stated in the email, but it still
presents the same issue.

What do you think?

Filipe Laíns
3DCE 51D6 0930 EBA4 7858 BA41 46F6 33CB B0EB 4BF2



Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-24 Thread Allan McRae via arch-dev-public
On 25/5/19 10:17 am, Filipe Laíns via arch-dev-public wrote:
> Hello,
> 
> Currently there are no guidelines stating which x86 extensions (e.g.
> SSE2, SSE3, SSE4, AVX, etc.) we support. This is a bit problematic
> since it lets compilers do what they want and possibly generate code
> that can't run on some systems.
> 
> Even though this is an issue, it's not complete anarchy, at least yet!
> Just kidding :p. The vast majority of our native packages are compiled
> with GCC and we do default to `-mtune=generic` which is good but not
> optimal. `-mtune=generic` tells GCC to compile for a generic processor
> so it's up to GCC to decide which architecture extensions would compose
> a generic processor. I haven't been able to find any documentation on
> what x86 extensions are enabled for a "generic" processor but I was
> able to track them down to MMX, SSE (or KNI) and SSE2. Being
> undocumented they could change at any time so I don't think we should
> rely on `-mtune=generic`.
> 

I think you need to look at the difference between -march and -mtune.
We use "-march=x86-64", which defines the instruction sets that can be
used.  Adding "-mtune=generic" does not allow the inclusion of
additional instruction sets.

Look at the output of:
gcc -march=x86-64 -Q --help=target

Allan


[arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-24 Thread Filipe Laíns via arch-dev-public
Hello,

Currently there are no guidelines stating which x86 extensions (e.g.
SSE2, SSE3, SSE4, AVX, etc.) we support. This is a bit problematic
since it lets compilers do what they want and possibly generate code
that can't run on some systems.

Even though this is an issue, it's not complete anarchy, at least yet!
Just kidding :p. The vast majority of our native packages are compiled
with GCC and we do default to `-mtune=generic` which is good but not
optimal. `-mtune=generic` tells GCC to compile for a generic processor
so it's up to GCC to decide which architecture extensions would compose
a generic processor. I haven't been able to find any documentation on
what x86 extensions are enabled for a "generic" processor but I was
able to track them down to MMX, SSE (or KNI) and SSE2. Being
undocumented they could change at any time so I don't think we should
rely on `-mtune=generic`.

What I propose is to define a set of x86 extensions to support and make
all compilers default to that if possible. I am fine with MMX, SSE and
SSE2 but they should be *our* choice, not GCC's.

This raises the question: would such an approach be doable? I would ask
the compiler maintainers to please check whether this is possible.

I would also like to explore the idea of adding a "high performance"
architecture which would be able to make use of SSE{,2,3,4,4.1,4.2} and
AVX, which seem to be the standard for newer processors (>=2013). This
would only be available for packages that do high performance computing
(e.g. openblas, sdrangel, etc.). Any thoughts on this?

Thanks,
Filipe Laíns
3DCE 51D6 0930 EBA4 7858 BA41 46F6 33CB B0EB 4BF2

