Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-08-09 Thread Thiago Macieira
On Sunday, 9 August 2020 15:28:26 PDT René J. V. Bertin wrote:
> > BTW, you may want to also add the -mtune option to either of the two
> > processors. Leaving it at the default (matching -march) may not produce
> > the best code for either.
> 
> Oh? I've always understood that -march implies -mtune (but not vice-versa)?!

-march=X implies -mtune=X, but you can add -mtune=Y to get the instruction 
models for Y. For obvious reasons, Y > X (Y is a newer generation than X), 
since it makes no sense to optimise for a processor generation that couldn't 
possibly run the code generated.
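
To make the -march/-mtune relationship concrete, here is a minimal Python sketch 
that assembles the two options. The helper name arch_flags is made up for this 
example; westmere, sandybridge and silvermont are the -march/-mtune names that 
come up elsewhere in this thread.

```python
def arch_flags(march, mtune=None):
    """Build the option list; -mtune is only emitted when it differs from -march,
    because -march=X already implies -mtune=X."""
    flags = [f"-march={march}"]
    if mtune and mtune != march:
        flags.append(f"-mtune={mtune}")
    return flags


# Same instruction-set baseline, tuned for either of the two machines.
print(" ".join(arch_flags("westmere", "sandybridge")))
print(" ".join(arch_flags("westmere", "silvermont")))
# No explicit tuning: the compiler tunes for the -march value.
print(" ".join(arch_flags("westmere")))
```

Which of the two tunings is the better pick depends on where the binaries spend 
most of their time, so treat this as a starting point rather than a 
recommendation.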

> Is Clear Linux really noticeably faster and because of its build options? I
> can't really tell from the Phoronix comparison between it and Ubuntu 20.04
> how conclusive the difference is in every day life (if that doesn't imply
> one of the benchmarks where the difference is going to be noticeable).

That's "YMMV" for you. Clear Linux wins in enough benchmarks to prove that 
there is something behind it all. But whether that affects your daily life, 
that's another story. Clear Linux isn't optimised for desktop experience, 
aside from a lightning quick boot time. The moment you open Chrome or Firefox 
(both of which you have to download outside the distro, unfortunately), 
you're in a completely different world.

But if your daily life is running servers and containers, even a 1% 
improvement means money saved. Also why a lightning quick boot is important: 
if your workload takes 5 min to run and your distro took 30s to boot instead 
of 2.5s, that's 10% overhead.
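
The arithmetic behind that 10% figure, as a small Python sketch. The function 
name boot_overhead is made up for this example; the numbers are the ones from 
the paragraph above.

```python
def boot_overhead(boot_seconds, workload_seconds):
    """Boot time expressed as a fraction of the workload's run time."""
    return boot_seconds / workload_seconds


workload = 5 * 60  # a 5 minute workload, in seconds

slow = boot_overhead(30, workload)   # distro that takes 30 s to boot
fast = boot_overhead(2.5, workload)  # distro that takes 2.5 s to boot

print(f"30 s boot:  {slow:.1%} overhead")
print(f"2.5 s boot: {fast:.1%} overhead")
```

With a 2.5 s boot the overhead drops to a bit under 1% of the same workload.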
 
> Meanwhile I had settled on an annoyingly long option string that has
> -march=core2 and sets every SSE and MMX version support (except SSE4a).
> R.

You should use -march=westmere instead of core2.

-march=core2 implies:

-msse -msse2 -msse3 -mssse3 -mcx16

Whereas -march=westmere implies:

-msse -msse2 -msse3 -mssse3 -mcx16 -msse4.1 -msse4.2 -mpopcnt -mpclmul

(note: the woboq indexed source is old; it still shows -maes for Westmere, but 
that got changed some time ago with a patch of mine; the definitions are also 
in i386.h now and aren't macros anymore)
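
To see at a glance what -march=westmere adds over -march=core2, the two option 
lists above can be compared mechanically. A small Python sketch, using only the 
flags quoted in this message:

```python
# Option lists exactly as quoted above.
CORE2 = "-msse -msse2 -msse3 -mssse3 -mcx16".split()
WESTMERE = (
    "-msse -msse2 -msse3 -mssse3 -mcx16 "
    "-msse4.1 -msse4.2 -mpopcnt -mpclmul"
).split()

# Flags implied by westmere but not by core2, in their original order.
extra = [flag for flag in WESTMERE if flag not in CORE2]
print(" ".join(extra))  # -msse4.1 -msse4.2 -mpopcnt -mpclmul
```

So a hand-written "core2 plus every SSE version" string collapses to a single 
-march=westmere, with POPCNT and PCLMUL on top.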



-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering



___
Interest mailing list
Interest@qt-project.org
https://lists.qt-project.org/listinfo/interest


Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-08-09 Thread René J . V . Bertin
Thiago Macieira wrote:

Well well, I turn my back a couple of days and there's a whole lot of 
interesting things going on behind it :)

> On Monday, 20 July 2020 22:46:36 PDT Thiago Macieira wrote:
>> So the common denominator of SLM and SNB is the WSM (Westmere).

Indeed, my i7 is a Sandy Bridge as far as I've been able to determine. It has 
the basic AVX instruction set.

>> Not coincidentally, it's the default -march= for GCC and Clang on Clear

Clearly :)

> 
> BTW, you may want to also add the -mtune option to either of the two
> processors. Leaving it at the default (matching -march) may not produce the
> best code for either.

Oh? I've always understood that -march implies -mtune (but not vice-versa)?!

> YMMV.
As always with this sort of thing (as far as "generic" code is concerned), but 
I always tell myself that adding -march=XXX or a bunch of -msseX options isn't 
going to change build times a lot and may just cause binaries to run a little 
faster when the machine starts getting stressed.

Is Clear Linux really noticeably faster and because of its build options? I 
can't really tell from the Phoronix comparison between it and Ubuntu 20.04 how 
conclusive the difference is in every day life (if that doesn't imply one of 
the benchmarks where the difference is going to be noticeable).

Meanwhile I had settled on an annoyingly long option string that has 
-march=core2 and sets every SSE and MMX version support (except SSE4a). 
R.



Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-07-21 Thread Thiago Macieira
On Monday, 20 July 2020 22:46:36 PDT Thiago Macieira wrote:
> So the common denominator of SLM and SNB is the WSM (Westmere).
> 
> Not coincidentally, it's the default -march= for GCC and Clang on Clear
> Linux, as well as what all binaries in /usr/bin and /usr/lib64 are compiled
> towards.

BTW, you may want to also add the -mtune option to either of the two 
processors. Leaving it at the default (matching -march) may not produce the 
best code for either.

YMMV.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-07-20 Thread Thiago Macieira
On Monday, 20 July 2020 21:50:45 PDT Rainer Wiesenfarth wrote:
> René would like to know how to set "march" when he *compiles code on* the
> Celeron (N3150) that should *run on* the (probably first generation) Core
> i7 mobile.

Oh? I had not understood that. But I was wondering why he was asking about 
i7... there's no Atom that is marketed as i7.

Let me reread...

Sorry, I was confused then! René is looking for the greatest common denominator 
between a Silvermont and a Sandy Bridge.

From some work I've been preparing to send to Qt:

# Architecture  Based on    New features
  # Core line
arch=Core2      x86_64      sse3,ssse3,cx16
arch=NHM        Core2       sse4.1,sse4.2,popcnt
arch=WSM        NHM
arch=SNB        WSM         avx
arch=IVB        SNB         f16c,rdrnd,fsgsbase
arch=HSW        IVB         avx2,fma,bmi,bmi2,lzcnt,movbe
arch=BDW        HSW         adx,rdseed
[...]
  # Atom line
arch=SLM        WSM         rdrnd,movbe
arch=GLM        SLM         fsgsbase,rdseed,lzcnt,xsavec,xsaves

So the common denominator of SLM and SNB is the WSM (Westmere).

Not coincidentally, it's the default -march= for GCC and Clang on Clear Linux, 
as well as what all binaries in /usr/bin and /usr/lib64 are compiled towards.

Other combinations may not match an exact CPU, like:

SLM & IVB = WSM + rdrnd
SLM & HSW = SLM
GLM & HSW = SLM + fsgsbase
GLM & BDW = SLM + fsgsbase + rdseed
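
The "common denominator" reasoning can be sketched as a set intersection over 
the table above. This is only an illustration in Python: the names ARCHS, 
lineage, features and common_baseline are made up here, and the data is the 
table as it appears in this message (WSM is listed with no new features, so the 
sketch keeps it that way).

```python
# name -> (based on, new features), transcribed from the table above.
ARCHS = {
    "Core2": ("x86_64", ["sse3", "ssse3", "cx16"]),
    "NHM": ("Core2", ["sse4.1", "sse4.2", "popcnt"]),
    "WSM": ("NHM", []),  # no new features listed in the table as shown
    "SNB": ("WSM", ["avx"]),
    "IVB": ("SNB", ["f16c", "rdrnd", "fsgsbase"]),
    "HSW": ("IVB", ["avx2", "fma", "bmi", "bmi2", "lzcnt", "movbe"]),
    "BDW": ("HSW", ["adx", "rdseed"]),
    "SLM": ("WSM", ["rdrnd", "movbe"]),
    "GLM": ("SLM", ["fsgsbase", "rdseed", "lzcnt", "xsavec", "xsaves"]),
}


def lineage(arch):
    """Chain of named ancestors, oldest first, ending with arch itself."""
    chain = []
    while arch in ARCHS:
        chain.append(arch)
        arch = ARCHS[arch][0]
    return chain[::-1]


def features(arch):
    """Every feature added along the lineage of arch."""
    feats = set()
    for name in lineage(arch):
        feats.update(ARCHS[name][1])
    return feats


def common_baseline(a, b):
    """Return (named arch, extra features) describing what a and b share."""
    shared = features(a) & features(b)
    candidates = [n for n in ARCHS if features(n) <= shared]
    # Richest candidate wins; ties go to the deeper lineage (WSM over NHM).
    best = max(candidates, key=lambda n: (len(features(n)), len(lineage(n))))
    return best, sorted(shared - features(best))


for a, b in [("SLM", "SNB"), ("SLM", "IVB"), ("SLM", "HSW")]:
    best, extra = common_baseline(a, b)
    print(f"{a} & {b} = {' + '.join([best] + extra)}")
```

For the three Silvermont pairs this reproduces the combinations listed above. 
For the Goldmont pairs, the sketch also reports lzcnt, because the table lists 
it for both HSW and GLM; the hand-written list above leaves it out, so check 
that feature separately before relying on it.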

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-07-20 Thread Rainer Wiesenfarth
Addendum: According to https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html,
"-march=nehalem" or "-march=sandybridge" should do the trick. Neither has
RDRND instruction set support, as René already noted.

Cheers, Rainer

On Tue, Jul 21, 2020 at 6:50 AM Rainer Wiesenfarth <
rainer_wiesenfa...@trimble.com> wrote:

> Sorry for jumping into your discussion, but I think you are talking about
> different topics.
>
> René would like to know how to set "march" when he *compiles code on* the
> Celeron (N3150) that should *run on* the (probably first generation) Core
> i7 mobile.
>
> Thiago says that "march=native" produces code that is guaranteed to run on
> the *same architecture.*
>
> My guess would be that René has to set "march" to what the i7 supports,
> using sde64 searching for Nehalem or - eventually - Sandy Bridge. As I am
> on AMD and Windows, it might be Thiago's part to provide the sde64
> information... :-)
>
> Cheers, Rainer
>
> --
> Software Engineer | Trimble Imaging Division
> Rotebühlstraße 81 | 70178 Stuttgart | Germany
> Office +49 711 22881 0 | Fax +49 711 22881 11
> http://www.trimble.com/imaging/ | http://www.inpho.de/
>
> Trimble Germany GmbH, Am Prime Parc 11, 65479 Raunheim
> Eingetragen beim Amtsgericht Darmstadt unter HRB 83893,
> Geschäftsführer: Rob Reeder, Jürgen Kesper
>


-- 
Software Engineer | Trimble Imaging Division
Rotebühlstraße 81 | 70178 Stuttgart | Germany
Office +49 711 22881 0 | Fax +49 711 22881 11
http://www.trimble.com/imaging/ | http://www.inpho.de/

Trimble Germany GmbH, Am Prime Parc 11, 65479 Raunheim
Eingetragen beim Amtsgericht Darmstadt unter HRB 83893,
Geschäftsführer: Rob Reeder, Jürgen Kesper


Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-07-20 Thread Rainer Wiesenfarth
Sorry for jumping into your discussion, but I think you are talking about
different topics.

René would like to know how to set "march" when he *compiles code on* the
Celeron (N3150) that should *run on* the (probably first generation) Core
i7 mobile.

Thiago says that "march=native" produces code that is guaranteed to run on
the *same architecture.*

My guess would be that René has to set "march" to what the i7 supports,
using sde64 searching for Nehalem or, possibly, Sandy Bridge. As I am
on AMD and Windows, it might be Thiago's part to provide the sde64
information... :-)

Cheers, Rainer

-- 
Software Engineer | Trimble Imaging Division
Rotebühlstraße 81 | 70178 Stuttgart | Germany
Office +49 711 22881 0 | Fax +49 711 22881 11
http://www.trimble.com/imaging/ | http://www.inpho.de/

Trimble Germany GmbH, Am Prime Parc 11, 65479 Raunheim
Eingetragen beim Amtsgericht Darmstadt unter HRB 83893,
Geschäftsführer: Rob Reeder, Jürgen Kesper


Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-07-20 Thread Thiago Macieira
On Monday, 20 July 2020 14:56:24 PDT René J.V. Bertin wrote:
> Thanks, will try though
> [https://en.wikichip.org/wiki/intel/microarchitectures/silvermont] suggests
> that Silvermont has the rdrand instruction that my i7 doesn't have.

It's supposed to have:

$ sde64 --help | grep Silvermont
 -slm    Set chip-check and CPUID for Intel(R) Silvermont CPU
$ sde64 -slm -- cpuid -1 --leaf=1 | grep RDRAND
  RDRAND instruction  = true
[http://www.intel.com/software/sde]

But some CPU features relating to crypto are sometimes disabled, either in the 
BIOS or on particular SKUs. So far, I'd only heard about the AES and 
PCLMULQDQ[*] instructions, but I suppose RDRAND and RDSEED fit into that 
category, as will the SHA, VAES and VPCLMULQDQ instructions.

To fix a Qt build, you can always append -mno-rdrnd to the build options. This 
will disable the unconditional use of RDRAND in qrandom.cpp.

GCC's -march=native should use the feature set coming from your CPUID 
instruction. So it should not enable RDRAND if your CPU doesn't report it. 
That means -march=native may result in a feature set that matches none of the 
other -march= options. You can check like this:
  gcc -dM -E -march=native -xc /dev/null | grep RDRND
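
If you want to script that check, a rough Python sketch follows. The function 
names parse_macros and native_macros are made up for this example; 
native_macros runs the same gcc command as above and assumes gcc is on PATH. 
The demo at the bottom uses a made-up two-line dump, so it does not need a 
compiler.

```python
import os
import subprocess


def parse_macros(preprocessor_dump):
    """Collect macro names from the '#define NAME VALUE' lines of -dM -E output."""
    names = set()
    for line in preprocessor_dump.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "#define":
            # Drop the parameter list of function-like macros.
            names.add(parts[1].split("(")[0])
    return names


def native_macros(compiler="gcc"):
    """Run 'gcc -dM -E -march=native -xc /dev/null' and return the macro names."""
    cmd = [compiler, "-dM", "-E", "-march=native", "-xc", os.devnull]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return parse_macros(result.stdout)


# Made-up sample dump, standing in for real compiler output.
sample = "#define __SSE4_2__ 1\n#define __POPCNT__ 1\n"
print("__RDRND__" in parse_macros(sample))  # False
```

On the build machine, '"__RDRND__" in native_macros()' then tells you whether 
-march=native turned RDRAND on.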

That does not apply to Clang. Clang's -march=native tries to find the best 
match to your processor and then enables that particular entry. That means it 
could enable features your processor doesn't have. If that's the case, please 
report a bug.

I had to fix both GCC and Clang on the AES instructions. We may need to do the 
same for RDRAND.

[*] pronounced "pickle-muckle-duck"
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-07-20 Thread René J . V . Bertin
On Saturday July 18 2020 10:20:44 Thiago Macieira wrote:

>That's a Braswell-based Atom:
Or rather a Celeron (whatever the exact difference is)?
https://en.wikichip.org/wiki/intel/celeron/n3150

>So -march=silvermont.

Thanks, will try though 
[https://en.wikichip.org/wiki/intel/microarchitectures/silvermont] suggests 
that Silvermont has the rdrand instruction that my i7 doesn't have.
>
>Anyway, -march=native is supposed to work on any machine that runs the code it 
>compiled. If that fails,

Of course it does, and no, it doesn't.

R


Re: [Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-07-18 Thread Thiago Macieira
On Saturday, 18 July 2020 05:44:04 PDT René J.V. Bertin wrote:
> Is there a good single-term setting for -march= that I can use that still
> enables the most useful SIMD instruction supported by at least my old i7
> and the N3150 (and VirtualBox!), rather than entering a hand-tuned list of
> individual extensions? I usually build with clang but could consider gcc in
> this case :)

https://ark.intel.com/content/www/us/en/ark/products/87258/intel-celeron-processor-n3150-2m-cache-up-to-2-08-ghz.html?wapkw=N3150

That's a Braswell-based Atom:
https://en.wikichip.org/wiki/intel/cores/braswell

So -march=silvermont.

Anyway, code built with -march=native is supposed to run on the machine that 
compiled it. If that fails, please submit a bug report to GCC. After testing 
GCC 11, that is. H.J. has just revamped the -march=native detection to fix 
issues like that.
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products





[Interest] good-compromise compatibility setting for -march=??? option (x86)?

2020-07-18 Thread René J . V . Bertin
Hi,

I know it goes against "suggested wisdom" to tinker with compiler options when 
building Qt code, but in my experience you do need to tell the compiler what 
instruction set extensions it must enable. In any case, my binaries built on a 
recent'ish Celeron (N3150) ran fine in a VM on a 2011 mobile i7 until I started 
building with -march=native (on the Celeron), trusting that the i7 had a larger 
instruction set. Turns out it doesn't have RDRAND (and is different enough that 
__builtin_bswap32() gives an illegal instruction).
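
One way to spot this kind of mismatch up front, assuming both machines run 
Linux, is to compare the "flags" lines of their /proc/cpuinfo. A minimal Python 
sketch; the function names and the two sample texts are made up for 
illustration and are not the real flag lists of either CPU.

```python
def cpu_flags(cpuinfo_text):
    """Feature flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_target(build_cpuinfo, target_cpuinfo):
    """Flags the build host reports that the target machine does not."""
    return sorted(cpu_flags(build_cpuinfo) - cpu_flags(target_cpuinfo))


# Hypothetical excerpts; on real machines read /proc/cpuinfo instead.
build = "model name : build host\nflags : sse4_1 sse4_2 popcnt rdrand movbe\n"
target = "model name : target\nflags : sse4_1 sse4_2 popcnt avx\n"

print(missing_on_target(build, target))  # ['movbe', 'rdrand']
```

Anything in that list is a feature that -march=native on the build host may use 
but the target cannot execute.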

Is there a good single-term setting for -march= that I can use that still 
enables the most useful SIMD instructions supported by at least my old i7 and 
the N3150 (and VirtualBox!), rather than entering a hand-tuned list of 
individual extensions? I usually build with clang but could consider gcc in 
this case :)

Thanks!
R.