As you probably know, the VPP build system compiles selected C files
multiple times with different compiler flags, and there is runtime
infrastructure which detects the CPU type at runtime and selects the
optimal binary.
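
For readers less familiar with the mechanism, here is a minimal sketch of
that kind of dispatch. It is not VPP's actual code: every name in it
(variant_t, select_variant, sum_baseline, ...) is hypothetical, the
priority values are made up, and the GCC/clang target attribute stands in
for compiling the same file several times with different -march flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not VPP's real API. */
typedef uint32_t (sum_fn_t) (const uint32_t *v, size_t n);

/* One source body, compiled once per target below. */
#define SUM_BODY                                                              \
  uint32_t s = 0;                                                             \
  for (size_t i = 0; i < n; i++)                                              \
    s += v[i];                                                                \
  return s;

static uint32_t sum_baseline (const uint32_t *v, size_t n) { SUM_BODY }
static int always_supported (void) { return 1; }

#if defined(__x86_64__) && defined(__GNUC__)
/* The target attribute plays the role of a per-variant -march flag. */
static __attribute__ ((target ("avx2"))) uint32_t
sum_avx2 (const uint32_t *v, size_t n) { SUM_BODY }

static __attribute__ ((target ("avx512f"))) uint32_t
sum_avx512 (const uint32_t *v, size_t n) { SUM_BODY }

static int has_avx2 (void) { return __builtin_cpu_supports ("avx2"); }
static int has_avx512 (void) { return __builtin_cpu_supports ("avx512f"); }
#endif

typedef struct
{
  const char *name;
  int priority;             /* higher wins among supported variants */
  int (*supported) (void);  /* runtime CPU check */
  sum_fn_t *fn;
} variant_t;

static const variant_t variants[] = {
  { "baseline", 0, always_supported, sum_baseline },
#if defined(__x86_64__) && defined(__GNUC__)
  /* Illustrative priorities only: avx512 is ranked below avx2 here. */
  { "avx2", 50, has_avx2, sum_avx2 },
  { "avx512", 20, has_avx512, sum_avx512 },
#endif
};

/* Return the highest-priority variant that the running CPU supports. */
const variant_t *
select_variant (void)
{
  const variant_t *best = NULL;
  for (size_t i = 0; i < sizeof (variants) / sizeof (variants[0]); i++)
    if (variants[i].supported ()
        && (best == NULL || variants[i].priority > best->priority))
      best = &variants[i];
  assert (best != NULL); /* baseline always matches */
  return best;
}
```

A caller would run select_variant () once at startup and then call
through the returned fn pointer.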

On x86 today we have the following variants:
- baseline (-march=corei7)
- avx2 (-march=core-avx2)
- avx512 (-march=skylake-avx512)

Two issues today force us to prefer avx2 even on Skylake/Cascade Lake
server CPUs:

1) There is a bug in the binutils version shipped with Ubuntu 18.04 which
produces broken avx512 code.

2) On Skylake/Cascade Lake, at the first sign of instructions that use
ZMM (512-bit) registers, the CPU must request a power licence and change
frequency. This procedure can take up to 500 microseconds, and during
that time the core operates in degraded mode. The same happens again if
there are no 512-bit instructions for ~2 ms. That means that sparse use
of 512-bit registers will cause more harm than benefit. This is expected
to be significantly improved on Ice Lake CPUs.
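
To put rough numbers on "sparse use", here is a toy model built only from
the two figures quoted above (up to 500 microseconds per transition,
licence dropped after ~2 ms without 512-bit instructions). It is an
assumption-laden sketch, not a measurement: it counts only the transition
triggered by each new request, ignores the lower steady-state frequency,
and the function name is hypothetical.

```c
#include <assert.h>

#define TRANSITION_US 500.0 /* "up to 500 microseconds" */
#define RELAX_US 2000.0     /* "~2 ms" without 512-bit instructions */

/* Worst-case fraction of time spent in degraded mode when 512-bit
   instructions show up once every gap_us microseconds. */
double
worst_case_degraded_fraction (double gap_us)
{
  assert (gap_us > 0.0);
  if (gap_us <= RELAX_US)
    return 0.0; /* licence is never dropped, so no repeated transitions */
  /* Licence dropped between uses: each use pays one transition. */
  return TRANSITION_US / gap_us;
}
```

Under these assumptions, one 512-bit burst every 4 ms could leave the
core degraded for up to 12.5% of the time, which is the kind of cost that
occasional 512-bit code has to win back.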


So I’m planning to make the following change:

Replace the variants above with the following:

- baseline - no changes
- hsw - Haswell / Broadwell - AVX2 instruction set
- skx - Skylake / Cascade Lake server CPUs - AVX512 instruction set
  without use of 512-bit registers
- icl - Ice Lake - AVX512 instruction set with use of 512-bit registers +
  new instructions (AVX512 bit manipulation, VAES)
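
To illustrate what "AVX512 instruction set without use of 512-bit
registers" can mean in practice, here is a small sketch that uses an
AVX-512VL masked load on 256-bit (YMM) registers only. This is one
possible reading, not the planned implementation: the function names are
hypothetical and the target attribute stands in for the per-variant build
flags. For compiler-generated vector code, GCC and clang also offer
-mprefer-vector-width=256, which may be relevant here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable fallback. */
static uint32_t
sum_u32_scalar (const uint32_t *v, size_t n)
{
  uint32_t s = 0;
  for (size_t i = 0; i < n; i++)
    s += v[i];
  return s;
}

#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* AVX-512 features (mask registers), but only YMM-wide vectors. */
static __attribute__ ((target ("avx2,avx512f,avx512vl"))) uint32_t
sum_u32_skx (const uint32_t *v, size_t n)
{
  __m256i acc = _mm256_setzero_si256 ();
  size_t i = 0;

  for (; i + 8 <= n; i += 8)
    acc = _mm256_add_epi32 (acc,
                            _mm256_loadu_si256 ((const __m256i *) (v + i)));

  if (i < n)
    {
      /* Masked load handles the 1..7 element tail without a scalar loop. */
      __mmask8 k = (__mmask8) ((1u << (n - i)) - 1);
      acc = _mm256_add_epi32 (acc, _mm256_maskz_loadu_epi32 (k, v + i));
    }

  /* Horizontal sum of the eight 32-bit lanes. */
  __m128i s = _mm_add_epi32 (_mm256_castsi256_si128 (acc),
                             _mm256_extracti128_si256 (acc, 1));
  s = _mm_add_epi32 (s, _mm_shuffle_epi32 (s, 0x4e));
  s = _mm_add_epi32 (s, _mm_shuffle_epi32 (s, 0xb1));
  return (uint32_t) _mm_cvtsi128_si32 (s);
}
#endif

/* Use the 256-bit AVX-512 path only when the CPU reports support. */
uint32_t
sum_u32 (const uint32_t *v, size_t n)
{
  assert (v != NULL || n == 0);
#if defined(__x86_64__) && defined(__GNUC__)
  if (__builtin_cpu_supports ("avx512f")
      && __builtin_cpu_supports ("avx512vl"))
    return sum_u32_skx (v, n);
#endif
  return sum_u32_scalar (v, n);
}
```

Because no ZMM register is touched, code written this way keeps the
masking features of AVX-512 while staying at 256-bit vector width.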

Any comments, thoughts?

-- 
Damjan