Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Filipe Laíns via arch-dev-public
On Mon, 2020-03-30 at 09:07 +1000, Allan McRae via arch-dev-public wrote:
> SSE4.2 is 2008 for Intel, 2011 for AMD.  Though I guess some processors
> were released without it for some time after that.   AVX was released by
> both in 2011.
> 
> So why is one too far and the other not?

I was looking at some edge cases where the CPUs had AVX but no SSE4.2.
Intel's website is also a bit unreliable for older CPUs; in a lot of
cases it does not list extensions when it should.

> No need for pacman support.  Just add higher instruction set to a new
> repo and set that repo with higher priority.

Right, that works too. Although then we can't have an "auto" value in
the arch definition that automatically identifies and enables the
supported extensions.

> But that involves developers choosing which packages to build with
> higher instruction sets, which requires extra developer time.

Well. The thing about these CPU extensions is that they require very
specific workloads to be useful. They are more efficient when
performing the tasks they were designed to do, but they have a higher
power consumption, which generally results in the CPU lowering clocks.
To make proper use of them, you usually need to build your algorithm
around them.

Just enabling it in GCC might bring some performance improvements but
they are usually *very* minimal.

Upstreams that benefit from these extensions will most of the time
write their own kernels and provide runtime detection or add a switch
in the build system.

I don't believe this is too much of a burden for packagers. 99.9% of the
time, just building a package with SSE/AVX enabled in GCC will not
bring any relevant performance enhancement.

Furthermore, building an extra optimized version makes it much more
reasonable for us to choose AVX2, which will have a much higher real
world impact than just building everything with SSE4.2 instead.

With that said, if you still want to bump the minimum requirements,
could you please consider taking just a little bit of time to run some
benchmarks on packages you think should get an improvement?
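
As a rough idea of what such a comparison could look like, here is a
minimal Python sketch. The helper names (`time_runs`, `speedup`) are made
up for illustration, and the callables passed in stand for whatever
workload of the baseline and the optimized build is being measured.

```python
import statistics
import time


def time_runs(func, repeats=5):
    """Call func() `repeats` times and return the wall-clock time of each call."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


def speedup(baseline_samples, optimized_samples):
    """Median baseline time divided by median optimized time (>1 means faster)."""
    return statistics.median(baseline_samples) / statistics.median(optimized_samples)
```

Using the median rather than the mean keeps a single noisy run from
dominating the result; a speedup close to 1.0 would support the point that
the flags alone do little.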

> Ideally, we would just autobuild for more optimized architectures, but
> this requires auto-signing packages, which has not happened in the last
> decade (but may in this one...).
>
> Picking an instruction set that is ~10 years old and making it the
> default for the distro seems a reasonable approach to me.

The instruction set might be 10 years old, but you have to look at the
discontinuation date for CPU families which don't support it, and maybe
add 1 or 2 years for the bulk of the inventory to be cleared out from
stores.

Cheers,
Filipe Laíns




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Gaetan Bisson via arch-dev-public
[2020-03-29 16:25:48 +0100] Filipe Laíns via arch-dev-public:
> What I would like for us to do is to create an x86-64-avx2, etc. that would
> complement x86-64. We would not add it as a target for all packages,
> just for the ones that make sense.
> 
> For this pacman would have to support architecture priority. We could
> have something like this:
> 
> Architecture = x86-64-avx2 x86-64

I'd like to say "why not", but everything remains to be done here, whereas
pacman and our toolchain have mature support for multiple architectures,
and they have it today.

> My point here is that to me it does not really make sense to drop
> support for older CPUs. We will have little benefit in newer CPUs.

Nothing is being dropped. Every CPU that does not support the new
architecture can keep running the x86_64 packages they currently do.

> Then automate it? Is there any reason why we can't have the tooling
> build all architectures for us? Why not have an `extra-build` helper
> that will call extra-$arch-build for all every architecture?

That would be awesome but the tooling does not yet exist. Personally I
do not consider it terribly bothersome to build packages for multiple
architectures like we did for i686 and x86_64. And I think it would be
preferable to introduce a new architecture tomorrow than wait a few more
months in the hope someone implements your proposed scheme.

Cheers.

-- 
Gaetan




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Allan McRae via arch-dev-public
On 30/3/20 12:39 am, Filipe Laíns wrote:
> On Sun, 2020-03-29 at 23:37 +1000, Allan McRae via arch-dev-public wrote:
>> On 29/3/20 11:17 pm, Filipe Laíns wrote:
>>> I would also like to note that rebuilding everything with forced
>>> support for AVX2 or whatever won't have much effect. Most packages do
>>> not have workloads where it would make use sense to use these CPU
>>> extensions, and as such, GCC would not use them.
>>
>> That assumes we just add AVX2.  Whereas, requiring a CPU supporting AVX2
>> would bring other optimizations that would be used.
> 
> No, it should be true for all extensions.
> 
>> As I replied earlier, AVX2 may be going too far.  But is a good starting
>> point for discussion.  If that is too far, what could we accept?
>> SSE4.2?  AVX?   Surely we can do better than pure x86_64.
> 
> No, SSE4.2 is too far. For me, the minimum should be AVX.


SSE4.2 is 2008 for Intel, 2011 for AMD.  Though I guess some processors
were released without it for some time after that.   AVX was released by
both in 2011.

So why is one too far and the other not?


>> To have a separate architecture would require automated builds, which
>> requires being able to sign packages automatically.  And we have not
>> achieved database signing in 9 years  I'm looking for a boost that
>> could be achieved now.
> 
> No, it would not. Where is this coming from? I already build split
> packages with SIMD instructions, I make the PKGBUILD build for 2
> architectures instead with a minimal patch.
> 
> If pacman is not able to handle parallel architectures, we should fix
> that. I think it's a valid use case.

No need for pacman support.  Just add higher instruction set to a new
repo and set that repo with higher priority.

But that involves developers choosing which packages to build with
higher instruction sets, which requires extra developer time.

Ideally, we would just autobuild for more optimized architectures, but
this requires auto-signing packages, which has not happened in the last
decade (but may in this one...).


Picking an instruction set that is ~10 years old and making it the
default for the distro seems a reasonable approach to me.

A


Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Andreas Radke via arch-dev-public
On Sun, 29 Mar 2020 21:44:38 +1000,
Allan McRae via arch-dev-public wrote:

> We are currently supporting processors from 2003.  We can be better
> than that.
> 
> A

In the very early Linux days many tasks maxed out the CPU
and every CPU optimization was noticeable. This has changed a lot.

Many CPUs, even very old ones, are still fast enough for useful tasks. Do
not force users with such a system to leave Arch. My main workstation
system is still a Sandy Bridge 2600K and I guess it will last another
5-10 years.

I much prefer runtime extension detection, which should be implemented
upstream. I'm strongly against increasing our main architecture
requirements. I'm not sure that adding an additional, more optimized repo
is worth the work.

-Andy





Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Eli Schwartz via arch-dev-public
On 3/29/20 11:25 AM, Filipe Laíns via arch-dev-public wrote:
> I want to clarify what I am proposing.
> 
> It would not be an entirely new architecture in the sense of i686; CPU
> extensions are not different architectures and shouldn't be treated as
> such.
> 
> What I would like for us to do is to create an x86-64-avx2, etc. that would
> complement x86-64. We would not add it as a target for all packages,
> just for the ones that make sense.
> 
> For this pacman would have to support architecture priority. We could
> have something like this:
> 
> Architecture = x86-64-avx2 x86-64
> 
> This means if an x86-64-avx2 package is available, it would be selected
> over the x86-64 one. That way we don't need to rebuild all packages.

Where would you store this package? The pkgname must be unique in each
repository database, so you would need a community-avx2 repository.

Then it is as simple as Santiago said, just have users add the
additional repository if they need it, giving it precedence in pacman.conf.
(Except I will go one step further and say this is the *only* way.)
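
The precedence rule this relies on is that pacman takes a package from the
first repository listed in pacman.conf that provides that name. A small
Python model of that lookup, with hypothetical repository contents (only
the names community-avx2, srslte and liquid-dsp come from this thread):

```python
def resolve(pkgname, repos):
    """Return (repo_name, package) from the first repo that provides pkgname.

    `repos` is an ordered list of (repo_name, {pkgname: package}) pairs,
    in the same order as the repository sections in pacman.conf.
    """
    for repo_name, packages in repos:
        if pkgname in packages:
            return repo_name, packages[pkgname]
    return None


# Hypothetical contents: the optimized repo is listed first, so it wins
# whenever it carries a package of the same name.
REPOS = [
    ("community-avx2", {"srslte": "srslte built with AVX2"}),
    ("community", {"srslte": "srslte baseline x86_64",
                   "liquid-dsp": "liquid-dsp baseline x86_64"}),
]
```

Here `resolve("srslte", REPOS)` picks the community-avx2 build, while
`resolve("liquid-dsp", REPOS)` falls through to community, so only the
packages that benefit need an optimized build.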


-- 
Eli Schwartz
Bug Wrangler and Trusted User





Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Santiago Torres-Arias via arch-dev-public
On Sun, Mar 29, 2020 at 04:25:48PM +0100, Filipe Laíns via arch-dev-public wrote:
> I want to clarify what I am proposing.
> 
> It would not be an entirely new architecture in the sense of i686; CPU
> extensions are not different architectures and shouldn't be treated as
> such.
> 
> What I would like for us to do is to create an x86-64-avx2, etc. that would
> complement x86-64. We would not add it as a target for all packages,
> just for the ones that make sense.
> 
> For this pacman would have to support architecture priority. We could
> have something like this:
> 
> Architecture = x86-64-avx2 x86-64

Couldn't this be as simple as having a package compiled with AVX2 (or
whatever) extensions in a separate repository that takes precedence in
pacman.conf?

Thanks,
-Santiago




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Filipe Laíns via arch-dev-public
On Sun, 2020-03-29 at 16:25 +0100, Filipe Laíns via arch-dev-public wrote:
> I would not be an entirely
  *It would

> What I would for us to do is to create a x86-64-axv2, etc. that would
*would like for us

> let's me to believe (but this may be just me), I would be
 *let me to

Ugh, sorry again. Today I am only catching the errors when obsessing
over sending.




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Filipe Laíns via arch-dev-public
I want to clarify what I am proposing.

It would not be an entirely new architecture in the sense of i686; CPU
extensions are not different architectures and shouldn't be treated as
such.

What I would like for us to do is to create an x86-64-avx2, etc. that would
complement x86-64. We would not add it as a target for all packages,
just for the ones that make sense.

For this, pacman would have to support architecture priority. We could
have something like this:

Architecture = x86-64-avx2 x86-64

This means if an x86-64-avx2 package is available, it would be selected
over the x86-64 one. That way we don't need to rebuild all packages.
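
To make the intended selection rule concrete, here is a small Python
sketch. It models the proposal only, not how pacman currently behaves, and
the function names are invented for illustration; the variant is spelled
x86-64-avx2 here.

```python
def parse_architectures(line):
    """Split an 'Architecture = a b c' line into an ordered list of names."""
    key, _, value = line.partition("=")
    if key.strip() != "Architecture":
        raise ValueError("not an Architecture line")
    return value.split()


def pick_build(available, priority):
    """Return the first architecture in priority order that has a build, else None."""
    for arch in priority:
        if arch in available:
            return arch
    return None


PRIORITY = parse_architectures("Architecture = x86-64-avx2 x86-64")
```

With this ordering, a package that exists for both architectures resolves
to the x86-64-avx2 build, and a package that only exists for x86-64 still
installs normally.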


My point here is that to me it does not really make sense to drop
support for older CPUs. We will have little benefit on newer CPUs.
Projects that need the performance already dynamically choose the CPU
extensions to use at runtime -- they will work on all x86-64 CPUs.

If this did in fact bring a relevant performance improvement, like the
original mail led me to believe (but this may be just me), I would be
all for it, but that's not the case.

On Sun, 2020-03-29 at 16:51 +0200, Morten Linderud via arch-dev-public wrote:
> Well, how do you think we supported two architectures? Why do you think
> `extra-x86_64-build` is named the way it is?
> 
> The "problem" is that we have no intention of building 1 package 4 times and
> keeping things in sync by hand; it was tedious enough with i686, which was part of
> why it was dropped in the first place. Thus we want build servers to do this for
> us.

Then automate it? Is there any reason why we can't have the tooling
build all architectures for us? Why not have an `extra-build` helper
that will call extra-$arch-build for every architecture?
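
Such a helper would not need to be much more than a loop. A hedged Python
sketch follows: `extra-x86_64-build` is the existing script, while any
other architecture name passed in (and therefore the script it maps to) is
hypothetical.

```python
import subprocess


def build_commands(arches, repo="extra"):
    """Return one '<repo>-<arch>-build' command (as an argv list) per architecture."""
    return [[f"{repo}-{arch}-build"] for arch in arches]


def run_all(commands, runner=subprocess.run):
    """Run each build in turn; check=True makes subprocess.run raise on failure."""
    for argv in commands:
        runner(argv, check=True)
```

`runner` is a parameter only so the loop can be exercised without actually
starting a chroot build; this does nothing about signing or keeping the
resulting packages in sync.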

This will have practically the same effect as my SIMD packages do now. The
only difference would be how people consume them. It would just
work out of the box instead of them having to install the -avx2
variant.

> Allan is going to have a hard time arguing that the minimal improvements are
> going to justify the absurd time we'll end up building things by hand; it's the
> crux of the problem, essentially. I'm also sure he knows this.
> 
> Surely we can bikeshed about which architectures to support, what we should
> discuss is how we should accomplish the task in general.
> 
> > Furthermore, if you do indeed wish to move this forward please present
> > us with reasonable data. Take a few packages that would benefit from
> > this, build them with the proposed architecture and show us benchmarks.
> > I think it's gonna be very hard for you to find packages with
> > considerable improvement but I might be wrong, please show me.
> 
> See last paragraph.

Which paragraph are you referring to?

I will be taking a step back now. I will probably wait 1 or 2 days before replying.
Feel free to reach me privately for a direct discussion if you want to.

Cheers,
Filipe Laíns




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Morten Linderud via arch-dev-public
On Sun, Mar 29, 2020 at 03:39:51PM +0100, Filipe Laíns via arch-dev-public wrote:
> > To have a separate architecture would require automated builds, which
> > requires being able to sign packages automatically.  And we have not
> > achieved database signing in 9 years  I'm looking for a boost that
> > could be achieved now.
> 
> No, it would not. Where is this coming from? I already build split
> packages with SIMD instructions, I make the PKGBUILD build for 2
> architectures instead with a minimal patch.
> 
> If pacman is not able to handle parallel architectures, we should fix
> that. I think it's a valid use case.

Well, how do you think we supported two architectures? Why do you think
`extra-x86_64-build` is named the way it is?

The "problem" is that we have no intention of building 1 package 4 times and
keeping things in sync by hand; it was tedious enough with i686, which was part of
why it was dropped in the first place. Thus we want build servers to do this for
us.

Allan is going to have a hard time arguing that the minimal improvements are
going to justify the absurd time we'll end up building things by hand; it's the
crux of the problem, essentially. I'm also sure he knows this.

Sure, we can bikeshed about which architectures to support, but what we should
discuss is how we should accomplish the task in general.

> Furthermore, if you do indeed wish to move this forward please present
> us with reasonable data. Take a few packages that would benefit from
> this, build them with the proposed architecture and show us benchmarks.
> I think it's gonna be very hard for you to find packages with
> considerable improvement but I might be wrong, please show me.

See last paragraph.

-- 
Morten Linderud
PGP: 9C02FF419FECBE16




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Filipe Laíns via arch-dev-public
On Sun, 2020-03-29 at 15:39 +0100, Filipe Laíns via arch-dev-public wrote:
> I make the PKGBUILD build for 2
* I can make

Sorry, I am a little distracted today.
 
Filipe Laíns




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Filipe Laíns via arch-dev-public
On Sun, 2020-03-29 at 23:37 +1000, Allan McRae via arch-dev-public wrote:
> On 29/3/20 11:17 pm, Filipe Laíns wrote:
> > I would also like to note that rebuilding everything with forced
> > support for AVX2 or whatever won't have much effect. Most packages do
> > not have workloads where it would make use sense to use these CPU
> > extensions, and as such, GCC would not use them.
> 
> That assumes we just add AVX2.  Whereas, requiring a CPU supporting AVX2
> would bring other optimizations that would be used.

No, it should be true for all extensions.

> As I replied earlier, AVX2 may be going too far.  But is a good starting
> point for discussion.  If that is too far, what could we accept?
> SSE4.2?  AVX?   Surely we can do better than pure x86_64.

No, SSE4.2 is too far. For me, the minimum should be AVX.

> To have a separate architecture would require automated builds, which
> requires being able to sign packages automatically.  And we have not
> achieved database signing in 9 years  I'm looking for a boost that
> could be achieved now.

No, it would not. Where is this coming from? I already build split
packages with SIMD instructions, I make the PKGBUILD build for 2
architectures instead with a minimal patch.

If pacman is not able to handle parallel architectures, we should fix
that. I think it's a valid use case.

Furthermore, if you do indeed wish to move this forward, please present
us with reasonable data. Take a few packages that would benefit from
this, build them with the proposed architecture and show us benchmarks.
I think it's gonna be very hard for you to find packages with
considerable improvement but I might be wrong, please show me.

Filipe Laíns




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Allan McRae via arch-dev-public
On 29/3/20 11:17 pm, Filipe Laíns wrote:
> I would also like to note that rebuilding everything with forced
> support for AVX2 or whatever won't have much effect. Most packages do
> not have workloads where it would make use sense to use these CPU
> extensions, and as such, GCC would not use them.

That assumes we just add AVX2.  Whereas, requiring a CPU supporting AVX2
would bring other optimizations that would be used.

As I replied earlier, AVX2 may be going too far.  But is a good starting
point for discussion.  If that is too far, what could we accept?
SSE4.2?  AVX?   Surely we can do better than pure x86_64.


To have a separate architecture would require automated builds, which
requires being able to sign packages automatically.  And we have not
achieved database signing in 9 years.  I'm looking for a boost that
could be achieved now.

Allan


Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Filipe Laíns via arch-dev-public
On Sun, 2020-03-29 at 20:26 +1000, Allan McRae via arch-dev-public wrote:
> Remember when Arch Linux was optimized out of the box.   We have the
> blazingly fast i686 port while other distros hung out in i386 land.
> Those were the days where the idea of Arch being fast started.  Now it
> has degraded to stuff of legend.
> 
> Now, x86_64 is old.  We should continue to push forward and add further
> optimization.
> 
> Reasonable optimizations to consider:
> 
> AVX2
> FMA
> SSE4.2
> 
> AVX2 is Intel Haswell and newer or AMD Ryzen and newer.  This CPUs
> released 2013 to 2015.  So 5 - 7 years old.
> 
> Discuss.

Absolutely not!

A huge number of systems do not meet those requirements. This would rule
out all Intel 3rd gen and older CPUs, which would, for example, impact
laptop models such as the ThinkPad X220 and ThinkPad X230.

Instead of adding a requirement for newer CPU extensions let's fix this
the proper way.

The correct approach to this is to push upstreams to support dynamic
detection of CPU extensions. That means the performance critical code
is compiled with a different range of extensions and the binary detects
at runtime what to use. A lot of them do that already, and for the ones
which don't we should push for it (point them to [1]).
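
The pattern itself is simple. Below is a Python analogy of it, not what
upstreams actually ship (they do this in C, for example with the GCC
attribute referenced as [1]). The kernel names are invented, and
`dot_avx2` is only a stand-in for a hand-tuned variant.

```python
def read_cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def dot_generic(a, b):
    """Baseline implementation that works on every x86-64 CPU."""
    return sum(x * y for x, y in zip(a, b))


def dot_avx2(a, b):
    """Stand-in for an AVX2-tuned kernel; same result as the baseline."""
    return sum(x * y for x, y in zip(a, b))


# Ordered from most to least demanding; None means "no requirement".
KERNELS = [("avx2", dot_avx2), (None, dot_generic)]


def select_kernel(flags):
    """Pick the most demanding kernel whose required flag the CPU reports."""
    for required, kernel in KERNELS:
        if required is None or required in flags:
            return kernel
    raise RuntimeError("no usable kernel")
```

The selection happens once at startup, so the same binary runs on every
x86-64 CPU and still uses the faster path where it is available.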

Unfortunately, we will always have upstreams that don't support that.
For this we should define a separate architecture. Right now I am
building 2 variants of the projects (see srslte-avx2 and
liquid-dsp-sse4.1); ideally we would have a different arch for them.

I would also like to note that rebuilding everything with forced
support for AVX2 or whatever won't have much effect. Most packages do
not have workloads where it would make sense to use these CPU
extensions, and as such, GCC would not use them.

There are maybe only a handful of packages in the repos that would
benefit from this.

[1] https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-target_005fclones-function-attribute

Cheers,
Filipe Laíns




Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Sven-Hendrik Haase via arch-dev-public
On Sun, 29 Mar 2020 at 12:26, Allan McRae via arch-dev-public <arch-dev-public@archlinux.org> wrote:

> Remember when Arch Linux was optimized out of the box.   We have the
> blazingly fast i686 port while other distros hung out in i386 land.
> Those were the days where the idea of Arch being fast started.  Now it
> has degraded to stuff of legend.
>
> Now, x86_64 is old.  We should continue to push forward and add further
> optimization.
>
> Reasonable optimizations to consider:
>
> AVX2
> FMA
> SSE4.2
>
> AVX2 is Intel Haswell and newer or AMD Ryzen and newer.  This CPUs
> released 2013 to 2015.  So 5 - 7 years old.
>
> Discuss.
>

I'm definitely all for this. However, I'd strongly prefer it if we used
some heavy automation for building for all the variants. coderobe actually
started an experimental project to explore this. It would also increase our
mirror size requirements quite drastically which I think is likely fine as
our full mirror size is quite small but it should be considered.

I suggest going by processor support generation alone instead of per
feature. For instance, Haswell introduced AVX2 as well as FMA3 so it
doesn't really make much sense to separate those out, I think. Besides, if
you have AVX2 support and care for speed you'll also want to enable FMA3.

Suggested processor-generation based optimization "tiers":

- nehalem (SSE4.2)
- sandybridge (SSE4.2, AVX)
- haswell (SSE4.2, AVX, AVX2, FMA3)
- icelake (SSE4.2, AVX, AVX2, FMA3, AVX-512) (soon-ish)

I know this sounds Intel specific so these names might not be optimal.
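
For illustration, mapping a machine to the highest tier it supports could
look like the Python sketch below. The tier names are the ones listed
above; the flag spellings are those Linux prints in /proc/cpuinfo, and
using `avx512f` as the marker for "AVX-512" is an assumption of this
sketch.

```python
# Most demanding tier first. Each tier lists the /proc/cpuinfo flags it needs.
TIERS = [
    ("icelake", {"sse4_2", "avx", "avx2", "fma", "avx512f"}),
    ("haswell", {"sse4_2", "avx", "avx2", "fma"}),
    ("sandybridge", {"sse4_2", "avx"}),
    ("nehalem", {"sse4_2"}),
]


def best_tier(cpu_flags):
    """Return the highest tier whose required flags are all present."""
    flags = set(cpu_flags)
    for name, required in TIERS:
        if required <= flags:
            return name
    return "x86_64"  # plain baseline, no tier applies
```

Because each tier is a superset of the one below it, a CPU that lacks a
single flag simply drops to the next tier down instead of failing.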

There is quite some work involved in this but I also strongly believe that
we have to keep pushing forward.


Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Allan McRae via arch-dev-public
On 29/3/20 8:52 pm, Evangelos Foutras wrote:
> If I see a SIGILL on my AMD Phenom II X6 1090T then Arch will have failed me. 
> 
> 
> I believe your proposal should only be discussed as co-existing
> optimized port(s) and even then I'm not sure it's worth the trouble.
> Performance-critical applications can and frequently are optimized for
> the running processor (I'm thinking of stuff like glibc and ffmpeg
> here).

AVX2 was a bold choice, and really a place to get a discussion started.

We are currently supporting processors from 2003.  We can be better than
that.

A


Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Allan McRae via arch-dev-public
On 29/3/20 9:27 pm, Amish wrote:
> Also if I am not wrong Arch philosophy talks only about latest software
> and no where there is mention of latest hardware being a compulsion.

It used to.  One of the original selling points was i686 optimization.
Then we got lazy, and stopped innovating.

A


Re: [arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Evangelos Foutras via arch-dev-public
If I see a SIGILL on my AMD Phenom II X6 1090T then Arch will have failed me. 

I believe your proposal should only be discussed as co-existing
optimized port(s) and even then I'm not sure it's worth the trouble.
Performance-critical applications can be, and frequently are, optimized
for the running processor (I'm thinking of stuff like glibc and ffmpeg
here).


[arch-dev-public] Discussion - Increasing our CPU requirements

2020-03-29 Thread Allan McRae via arch-dev-public
Remember when Arch Linux was optimized out of the box?  We had the
blazingly fast i686 port while other distros hung out in i386 land.
Those were the days where the idea of Arch being fast started.  Now it
has degraded to the stuff of legend.

Now, x86_64 is old.  We should continue to push forward and add further
optimization.

Reasonable optimizations to consider:

AVX2
FMA
SSE4.2

AVX2 is Intel Haswell and newer or AMD Ryzen and newer.  These CPUs
were released 2013 to 2015.  So 5 - 7 years old.

Discuss.