Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-25 Thread Filipe Laíns via arch-dev-public
On Sat, 2019-05-25 at 21:27 +1000, Allan McRae via arch-dev-public wrote:
> On 25/5/19 9:19 pm, Bruno Pagani via arch-dev-public wrote:
> > Hi,
> > 
> > On 25/05/2019 at 02:17, Filipe Laíns via arch-dev-public wrote:
> > > I would also like to explore the idea of adding a "high performance"
> > > architecture which would be able to make use of SSE{,2,3,4,4.1,4.2} and
> > > AVX, which seem to be the standard for newer processors (>=2013). This
> > > would only be available for packages that do high performance computing
> > > (e.g. openblas, sdrangel, etc.). Any thoughts on this?
> > 
> > As said on IRC, there have been discussions before on having multiple
> > targets and corresponding repos, but the starting point is that we need
> > automated build before going into such a direction, and this in turn has
> > several requirements. I’ve linked to you the pad where we put our ideas
> > together regarding this.
> > 
> > In the meantime, we had the case before of whether we should package
> > e.g. $pkgname-{sse4,avx} in a case where it mattered a lot, but it
> > turned out the software in question (embree) is able to do runtime
> > detection of available ISA. Maybe some other packages are doing this
> > too, else we could discuss whether allowing such flavours as a temporary
> > measure would be acceptable for selected packages.
> 
> glibc detects available instruction sets and uses the best for many
> functions.
> 
> I'd be very, very, very much against providing multiple variants of a
> package in our repos.  Using asp and makepkg is not a hard solution for
> those who really need a few packages rebuilt.
> 
> PS - I rebuilt [core] with -march=haswell recently as a test.  Automated
> building is not an issue.  Unattended package/database signing is the
> major stumbling block.
> 
> Allan

In cases where the instruction set is detected at runtime, a new variant
of the package would not be needed, since we can guarantee the software
isn't going to try to run any unsupported instructions.
What we are discussing really only applies to packages without runtime
SIMD code selection.

Thanks,
Filipe Laíns
3DCE 51D6 0930 EBA4 7858 BA41 46F6 33CB B0EB 4BF2




Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-25 Thread Filipe Laíns via arch-dev-public
On Sat, 2019-05-25 at 13:19 +0200, Bruno Pagani via arch-dev-public wrote:
> Hi,
> 
> On 25/05/2019 at 02:17, Filipe Laíns via arch-dev-public wrote:
> > I would also like to explore the idea of adding a "high performance"
> > architecture which would be able to make use of SSE{,2,3,4,4.1,4.2} and
> > AVX, which seem to be the standard for newer processors (>=2013). This
> > would only be available for packages that do high performance computing
> > (e.g. openblas, sdrangel, etc.). Any thoughts on this?
> 
> As said on IRC, there have been discussions before on having multiple
> targets and corresponding repos, but the starting point is that we need
> automated build before going into such a direction, and this in turn has
> several requirements. I’ve linked to you the pad where we put our ideas
> together regarding this.
> 
> In the meantime, we had the case before of whether we should package
> e.g. $pkgname-{sse4,avx} in a case where it mattered a lot, but it
> turned out the software in question (embree) is able to do runtime
> detection of available ISA. Maybe some other packages are doing this
> too, else we could discuss whether allowing such flavours as a temporary
> measure would be acceptable for selected packages.
> 
> Regards,
> Bruno

This is fine by me. My biggest concern was the fact that C doesn't support
__attribute__(("instruction set here")), but there are of course
workarounds. Creating a new architecture only makes sense if there are
multiple packages needing this, but it seems there are not. I am fine with a
suffix, although I was thinking of something more like -simd, as SSE4, AVX,
etc. are usually available at the same time. In these cases I think we
should add a post_install step that gives a warning if the user's CPU
doesn't support the instruction sets used.
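
A minimal sketch of what such a post_install check could look like, assuming
a Linux system that exposes CPU features on the "flags" line of
/proc/cpuinfo. REQUIRED_FLAGS and missing_cpu_flags are hypothetical names
chosen for illustration, and the flag list is only an example of what a
-simd package might need; none of this is an existing packaging helper.

```shell
#!/bin/bash
# Sketch of a pacman .install scriptlet for a hypothetical SIMD-only package.
# REQUIRED_FLAGS and missing_cpu_flags are illustrative names, not an
# existing API. Flag names follow /proc/cpuinfo spelling (sse4_1, not sse4.1).
REQUIRED_FLAGS="sse4_1 sse4_2 avx"

# Print every required flag that is absent from the "flags" line of the
# given cpuinfo file (defaults to /proc/cpuinfo), one flag per line.
missing_cpu_flags() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    local flags flag
    # Use the flag list of the first CPU only; pad it with spaces so that
    # only whole words match (e.g. "sse" must not match "sse2").
    flags=" $(grep -m1 '^flags' "$cpuinfo" | cut -d: -f2) "
    for flag in $REQUIRED_FLAGS; do
        case "$flags" in
            *" $flag "*) ;;          # flag present, nothing to report
            *) echo "$flag" ;;       # flag missing
        esac
    done
}

# pacman calls post_install after the package files have been installed.
post_install() {
    local missing
    missing="$(missing_cpu_flags | tr '\n' ' ')"
    if [ -n "$missing" ]; then
        echo "warning: this CPU does not report: $missing"
        echo "         the package may fail with an illegal instruction"
    fi
}
```

The check only warns and never aborts the installation, since the package
may be installed on one machine and run on another.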

Thanks,
Filipe Laíns
3DCE 51D6 0930 EBA4 7858 BA41 46F6 33CB B0EB 4BF2




Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-25 Thread Allan McRae via arch-dev-public
On 25/5/19 9:34 pm, Bruno Pagani wrote:
> Out of curiosity, what did your rebuild of [core] lead to?

I had a potentially slightly faster system for a week...  It was mainly
a test to see if I spotted some build issues or test suite failures
beyond what is seen for x86_64.  All was good.

A


Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-25 Thread Bruno Pagani via arch-dev-public
On 25/05/2019 at 13:27, Allan McRae via arch-dev-public wrote:
> On 25/5/19 9:19 pm, Bruno Pagani via arch-dev-public wrote:
>> Hi,
>>
>> On 25/05/2019 at 02:17, Filipe Laíns via arch-dev-public wrote:
>>> I would also like to explore the idea of adding a "high performance"
>>> architecture which would be able to make use of SSE{,2,3,4,4.1,4.2} and
>>> AVX, which seem to be the standard for newer processors (>=2013). This
>>> would only be available for packages that do high performance computing
>>> (e.g. openblas, sdrangel, etc.). Any thoughts on this?
>> As said on IRC, there have been discussions before on having multiple
>> targets and corresponding repos, but the starting point is that we need
>> automated build before going into such a direction, and this in turn has
>> several requirements. I’ve linked to you the pad where we put our ideas
>> together regarding this.
>>
>> In the meantime, we had the case before of whether we should package
>> e.g. $pkgname-{sse4,avx} in a case where it mattered a lot, but it
>> turned out the software in question (embree) is able to do runtime
>> detection of available ISA. Maybe some other packages are doing this
>> too, else we could discuss whether allowing such flavours as a temporary
>> measure would be acceptable for selected packages.
> glibc detects available instruction sets and uses the best for many
> functions.

Great!

> I'd be very, very, very much against providing multiple variants of a
> package in our repos.  Using asp and makepkg is not a hard solution for
> those who really need a few packages rebuilt.

I’m fine with that possibility too.

> PS - I rebuilt [core] with -march=haswell recently as a test.  Automated
> building is not an issue.  Unattended package/database signing is the
> major stumbling block.

Yes, in our discussions it boiled down to “Automated rebuilds” →
“Unattended signing” → “Reproducible builds”.

Out of curiosity, what did your rebuild of [core] lead to?

Bruno






Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-25 Thread Allan McRae via arch-dev-public
On 25/5/19 9:19 pm, Bruno Pagani via arch-dev-public wrote:
> Hi,
> 
> On 25/05/2019 at 02:17, Filipe Laíns via arch-dev-public wrote:
>> I would also like to explore the idea of adding a "high performance"
>> architecture which would be able to make use of SSE{,2,3,4,4.1,4.2} and
>> AVX, which seem to be the standard for newer processors (>=2013). This
>> would only be available for packages that do high performance computing
>> (e.g. openblas, sdrangel, etc.). Any thoughts on this?
> 
> As said on IRC, there have been discussions before on having multiple
> targets and corresponding repos, but the starting point is that we need
> automated build before going into such a direction, and this in turn has
> several requirements. I’ve linked to you the pad where we put our ideas
> together regarding this.
> 
> In the meantime, we had the case before of whether we should package
> e.g. $pkgname-{sse4,avx} in a case where it mattered a lot, but it
> turned out the software in question (embree) is able to do runtime
> detection of available ISA. Maybe some other packages are doing this
> too, else we could discuss whether allowing such flavours as a temporary
> measure would be acceptable for selected packages.

glibc detects available instruction sets and uses the best for many
functions.

I'd be very, very, very much against providing multiple variants of a
package in our repos.  Using asp and makepkg is not a hard solution for
those who really need a few packages rebuilt.

PS - I rebuilt [core] with -march=haswell recently as a test.  Automated
building is not an issue.  Unattended package/database signing is the
major stumbling block.

Allan


Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-25 Thread Allan McRae via arch-dev-public
On 25/5/19 5:22 pm, Lukas Jirkovsky via arch-dev-public wrote:
> On Sat, 25 May 2019 at 04:27, Filipe Laíns via arch-dev-public
>  wrote:
>> Setting `-mtune` to generic won't add any additional instruction sets
>> by itself, but it does not prevent instruction sets from being added.
>> Looks like GCC enables MMX, SSE and SSE2 by default, it isn't related
>> at all to `-march` like I stated in the email but it still presents the
>> same issue.
> 
> As far as I know, MMX, SSE and SSE2 are a mandatory part of the AMD64
> instruction set, so they are not enabled randomly just because someone
> felt like it, but because they are present on every x86_64 CPU.

Correct.  Using the command I gave in my first reply:

$ gcc -march=x86-64 -Q --help=target | grep sse
  -mfpmath=     sse
  -mno-sse4     [enabled]
  -msse         [enabled]
  -msse2        [enabled]
  -msse2avx     [disabled]
  -msse3        [disabled]
  -msse4        [disabled]
...

$ gcc -march=x86-64 -Q --help=target | grep mmx
  -mmmx         [enabled]

-mtune just tunes instructions for a "representative" set of "current"
CPUs that run as x86-64.
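
To see what a different -march value would add on top of this baseline, the
same query can be filtered and compared. The following is a rough sketch
only: enabled_target_flags and extra_flags are hypothetical helper names,
and the exact option list depends on the installed GCC version.

```shell
#!/bin/bash
# List the options that `gcc -Q --help=target` reports as [enabled].
# Reads the gcc output on stdin, so it also works on saved output.
# enabled_target_flags is an illustrative name, not an existing tool.
enabled_target_flags() {
    awk '$2 == "[enabled]" { print $1 }' | sort
}

# Print the options enabled for -march=$2 that are not enabled for
# -march=$1. Requires gcc on PATH.
extra_flags() {
    local base_list other_list
    base_list="$(mktemp)"
    other_list="$(mktemp)"
    gcc -march="$1" -Q --help=target | enabled_target_flags > "$base_list"
    gcc -march="$2" -Q --help=target | enabled_target_flags > "$other_list"
    # comm -13 keeps only the lines unique to the second (sorted) file.
    comm -13 "$base_list" "$other_list"
    rm -f "$base_list" "$other_list"
}
```

For example, `extra_flags x86-64 haswell` would list what a Haswell build
may use beyond the x86-64 baseline.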

Allan


Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-25 Thread Bruno Pagani via arch-dev-public
Hi,

On 25/05/2019 at 02:17, Filipe Laíns via arch-dev-public wrote:
> I would also like to explore the idea of adding a "high performance"
> architecture which would be able to make use of SSE{,2,3,4,4.1,4.2} and
> AVX, which seem to be the standard for newer processors (>=2013). This
> would only be available for packages that do high performance computing
> (e.g. openblas, sdrangel, etc.). Any thoughts on this?

As said on IRC, there have been discussions before on having multiple
targets and corresponding repos, but the starting point is that we need
automated build before going into such a direction, and this in turn has
several requirements. I’ve linked to you the pad where we put our ideas
together regarding this.

In the meantime, we had the case before of whether we should package
e.g. $pkgname-{sse4,avx} in a case where it mattered a lot, but it
turned out the software in question (embree) is able to do runtime
detection of available ISA. Maybe some other packages are doing this
too, else we could discuss whether allowing such flavours as a temporary
measure would be acceptable for selected packages.

Regards,
Bruno







Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-25 Thread Lukas Jirkovsky via arch-dev-public
On Sat, 25 May 2019 at 04:27, Filipe Laíns via arch-dev-public
 wrote:
> Setting `-mtune` to generic won't add any additional instruction sets
> by itself, but it does not prevent instruction sets from being added.
> Looks like GCC enables MMX, SSE and SSE2 by default, it isn't related
> at all to `-march` like I stated in the email but it still presents the
> same issue.

As far as I know, MMX, SSE and SSE2 are a mandatory part of the AMD64
instruction set, so they are not enabled randomly just because someone
felt like it, but because they are present on every x86_64 CPU.


Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-24 Thread Filipe Laíns via arch-dev-public
On Sat, 2019-05-25 at 10:35 +1000, Allan McRae wrote:
> On 25/5/19 10:17 am, Filipe Laíns via arch-dev-public wrote:
> > Hello,
> > 
> > Currently there are no guidelines stating which x86 extensions (e.g.
> > SSE2, SSE3, SSE4, AVX, etc.) we support. This is a bit problematic
> > since it lets compilers do what they want and possibly generate code
> > that can't run on some systems.
> > 
> > Even though this is an issue, it's not complete anarchy, at least yet!
> > Just kidding :p. The vast majority of our native packages are compiled
> > with GCC and we do default to `-mtune=generic` which is good but not
> > optimal. `-mtune=generic` tells GCC to compile for a generic processor
> > so it's up to GCC to decide which architecture extensions would compose
> > a generic processor. I haven't been able to find any documentation on
> > what x86 extensions are enabled for a "generic" processor but I was
> > able to track them down to MMX, SSE (or KNI) and SSE2. Being
> > undocumented they could change at any time so I don't think we should
> > rely on `-mtune=generic`.
> > 
> 
> I think you need to look at the difference between -march and -mtune.
> We use "-march=x86-64", which defines the instruction sets that can be
> used.  Adding "-mtune=generic" does not allow the inclusion of
> additional instruction sets.
> 
> Look at the output of:
> gcc -march=x86-64 -Q --help=target
> 
> Allan

Yes! My bad. I got confused.

From the `-march=x86-64` documentation:
A generic CPU with 64-bit extensions.

Setting `-mtune` to generic won't add any additional instruction sets
by itself, but it does not prevent instruction sets from being added.
Looks like GCC enables MMX, SSE and SSE2 by default, it isn't related
at all to `-march` like I stated in the email but it still presents the
same issue.

What do you think?

Filipe Laíns
3DCE 51D6 0930 EBA4 7858 BA41 46F6 33CB B0EB 4BF2




Re: [arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-24 Thread Allan McRae via arch-dev-public
On 25/5/19 10:17 am, Filipe Laíns via arch-dev-public wrote:
> Hello,
> 
> Currently there are no guidelines stating which x86 extensions (e.g.
> SSE2, SSE3, SSE4, AVX, etc.) we support. This is a bit problematic
> since it lets compilers do what they want and possibly generate code
> that can't run on some systems.
> 
> Even though this is an issue, it's not complete anarchy, at least yet!
> Just kidding :p. The vast majority of our native packages are compiled
> with GCC and we do default to `-mtune=generic` which is good but not
> optimal. `-mtune=generic` tells GCC to compile for a generic processor
> so it's up to GCC to decide which architecture extensions would compose
> a generic processor. I haven't been able to find any documentation on
> what x86 extensions are enabled for a "generic" processor but I was
> able to track them down to MMX, SSE (or KNI) and SSE2. Being
> undocumented they could change at any time so I don't think we should
> rely on `-mtune=generic`.
> 

I think you need to look at the difference between -march and -mtune.
We use "-march=x86-64", which defines the instruction sets that can be
used.  Adding "-mtune=generic" does not allow the inclusion of
additional instruction sets.

Look at the output of:
gcc -march=x86-64 -Q --help=target

Allan


[arch-dev-public] Create guidelines regarding SIMD instructions/x86 extensions

2019-05-24 Thread Filipe Laíns via arch-dev-public
Hello,

Currently there are no guidelines stating which x86 extensions (e.g.
SSE2, SSE3, SSE4, AVX, etc.) we support. This is a bit problematic
since it lets compilers do what they want and possibly generate code
that can't run on some systems.

Even though this is an issue, it's not complete anarchy, at least yet!
Just kidding :p. The vast majority of our native packages are compiled
with GCC and we do default to `-mtune=generic` which is good but not
optimal. `-mtune=generic` tells GCC to compile for a generic processor
so it's up to GCC to decide which architecture extensions would compose
a generic processor. I haven't been able to find any documentation on
what x86 extensions are enabled for a "generic" processor but I was
able to track them down to MMX, SSE (or KNI) and SSE2. Being
undocumented they could change at any time so I don't think we should
rely on `-mtune=generic`.

What I propose is to define a set of x86 extensions to support and make
all compilers default to that if possible. I am fine with MMX, SSE and
SSE2 but they should be *our* choice, not GCC's.
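
As a rough illustration of what pinning the baseline could look like for
GCC-compatible compilers, here is a hypothetical makepkg.conf fragment.
BASELINE_ISA is an illustrative variable that makepkg itself does not
define, and the optimisation flags are placeholders rather than a proposal.

```shell
# Hypothetical makepkg.conf fragment (makepkg.conf is sourced as bash).
# BASELINE_ISA is an illustrative name; it spells out the agreed extensions
# so that the baseline is written down in our own configuration.
BASELINE_ISA="-mmmx -msse -msse2"
CFLAGS="-march=x86-64 -mtune=generic ${BASELINE_ISA} -O2 -pipe"
CXXFLAGS="${CFLAGS}"
```

With GCC, -march=x86-64 already enables these three extensions, so the
explicit options mainly document intent; compilers for other languages would
need their own equivalent.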

This raises the question: would such an approach be doable? So I ask the
compiler maintainers to please check if this would be possible.

I would also like to explore the idea of adding a "high performance"
architecture which would be able to make use of SSE{,2,3,4,4.1,4.2} and
AVX, which seem to be the standard for newer processors (>=2013). This
would only be available for packages that do high performance computing
(e.g. openblas, sdrangel, etc.). Any thoughts on this?

Thanks,
Filipe Laíns
3DCE 51D6 0930 EBA4 7858 BA41 46F6 33CB B0EB 4BF2

