[Public] Hi Bruce,
Snipped > > > > > > > > > > Hi Bruce, we have reviewed this internally and tested the same. > > > > > We would like > > > > your thought for the following. > > > > > > > > > > - Before patch: we were directly setting AVX512 falgs for F, BW, > > > > > DQ, VL > > > > > - new patch: we are setting the flags for `skylake-server` as bare > > > > > minimal. > > > > > - AMD supports AVX512 from `znver4 and higher`. > > > > > > > > > > As per GCC > > > > > `https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html`, the extra > > > > > ISA > > > > supported between skylake-server (super set) and znver4 and znver5 > > > > are `SAHF, FXSR, XSAVE, RDRND, LZCNT, HLE, PREFETCHW, SGX`. > > > > > Currently for DPDK microbenchmarks and examples runs safe as it > > > > > is not using > > > > the `SAHF, FXSR, XSAVE, RDRND, LZCNT, HLE, PREFETCHW, SGX` > > > > instructions. > > > > > > > > > > Question: should we check if target is `AMD EPYC` then apply > > > > > bare minimum as > > > > `-march=znver4`, thus avoid possible unsupported instruction > > > > generation when non `c_args for march` is passed? > > > > > > > > > > > > > Can you clarify why you mean by the "target" here? Is there a > > > > specific value you are thinking of for the "cpu_instruction_set" option? > > > > > > `Target` is target CPU, when generated without any arguments we get code > > > for > `native build`. > > > > > > On AMD target cpu zen4 or zen5; Before patch as per the code ` AVX512 > > > flags > for F, BW, DQ` are used in ` cc_avx512_flags`. > > > > > > With the patch, the cc_avx512_flags is set to `-march=skylake-avx512` > > > (where > compiler optimizations `can add HLE, PREFETCHW, SGX`). > > > > > > > With this patch for zen4 or zen5 the AVX512 code paths should be > > compiled with no additional flags, since the -march=zen* flag should > > include everything necessary. Can you confirm that you see the extra > > -march flag in those cases? > > > > Ran a quick test myself, this is what I see, doing a zen4 build: > > $ meson setup -Dcpu_instruction_set=znver4 build-zen4 > The Meson build system > Version: 1.7.0 > Source dir: /home/bruce/dpdk-github > Build dir: /home/bruce/dpdk-github/build-zen4 > ... > Fetching value of define "__AVX512F__" : 1 > Message: Extra C flags needed for AVX2 output: [] > Message: Extra C flags needed for AVX512 output: [] > ... > > Checking the build.ninja file in the build-zen4 directory, there is no use of > march=skylake. Here is the compilation recipe for i40e AVX-512 code, for > example: > > build > drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o: > c_COMPILER ../drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c > DEPFILE = > drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d > DEPFILE_UNQUOTED = > drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d > ARGS = -Idrivers/librte_net_i40e_avx512_lib.a.p -Idrivers -I../drivers - > Idrivers/net/intel/i40e -I../drivers/net/intel/i40e > -Idrivers/net/intel/i40e/base - > I../drivers/net/intel/i40e/base -Ilib/ethdev -I../lib/ethdev -Ilib/eal/common > - > I../lib/eal/common -I. -I.. -Iconfig -I../config -Ilib/eal/include > -I../lib/eal/include - > Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include > -I../lib/eal/x86/include > -I../kernel/linux -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs > -Ilib/log -I../lib/log - > Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net > -I../lib/net -Ilib/mbuf - > I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring > -Ilib/meter -I../lib/meter - > Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci > -I../lib/pci - > Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/hash -I../lib/hash -Ilib/rcu > -I../lib/rcu - > I/usr/include/x86_64-linux-gnu -fdiagnostics-color=always - > D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -O3 -include > rte_config.h -Wvla -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral - > Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs > -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes > -Wundef > -Wwrite-strings -Wno-packed-not-aligned -Wno-missing-field-initializers - > D_GNU_SOURCE -fPIC -march=znver4 -mrtm -DALLOW_EXPERIMENTAL_API > -DALLOW_INTERNAL_API -Wno-format-truncation -Wno-address-of-packed- > member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e - > DCC_AVX512_SUPPORT I have retried the patch and tested the same on 2 scenarios 1. Scenario-1: znver4 (avx512) with gcc 13.2 2. Scenario-2: znver3 (no avx512) with gcc 13.2 For scenario-2 where CPU is not supporting avx512 but compiler supports avx512 I get the logs as ``` Fetching value of define "__AVX512F__" : Message: found AVX512 Message: ['-march=skylake-avx512'] Compiler for C supports arguments -Wno-overriding-option: NO Message: Extra C flags needed for AVX2 output: [] Message: Extra C flags needed for AVX512 output: ['-march=skylake-avx512'] Message: AVX 512 flag in drivers Message: ['-march=skylake-avx512'] Message: ['-march=native', '-mrtm', '-DALLOW_EXPERIMENTAL_API', '-DALLOW_INTERNAL_API', '-Wno-format-truncation', '-Wno-address-of-packed-member', '-DDLB2_BYPASS_FENCE_ON_PP=0', '-DDLB_HW_CREDITS_CHECKS=1', '-DDLB_SW_CREDITS_CHECKS=1', '-DDLB_TYPE_CHECK=1', '-DRTE_LOG_DEFAULT_LOGTYPE=pmd.event.dlb2', '-DCC_AVX512_SUPPORT'] ``` The above is correct as per my understanding as the test is to compile for avx512 ISA. But When using `ninja -C build -v` and building for native build, I get ``` [1741/3539] cc -Idrivers/librte_net_i40e_avx512_lib.a.p -Idrivers -I../drivers -Idrivers/net/intel/i40e -I../drivers/net/intel/i40e -Idrivers/net/intel/i40e/base -I../drivers/net/intel/i40e/base -Ilib/ethdev -I../lib/ethdev -Ilib/eal/common -I../lib/eal/common -I. -I.. -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include -Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include -I../lib/eal/x86/include -I../kernel/linux -Ilib/eal -I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log -Ilib/metrics -I../lib/metrics -Ilib/telemetry -I../lib/telemetry -Ilib/net -I../lib/net -Ilib/mbuf -I../lib/mbuf -Ilib/mempool -I../lib/mempool -Ilib/ring -I../lib/ring -Ilib/meter -I../lib/meter -Idrivers/bus/pci -I../drivers/bus/pci -I../drivers/bus/pci/linux -Ilib/pci -I../lib/pci -Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/hash -I../lib/hash -Ilib/rcu -I../lib/rcu -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c11 -O3 -include rte_config.h -Wvla -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef -Wwrite-strings -Wno-packed-not-aligned -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC -march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation -Wno-address-of-packed-member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -DCC_AVX512_SUPPORT -march=skylake-avx512 -MD -MQ drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o -MF drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o.d -o drivers/librte_net_i40e_avx512_lib.a.p/net_intel_i40e_i40e_rxtx_vec_avx512.c.o -c ../drivers/net/intel/i40e/i40e_rxtx_vec_avx512.c ``` In above log I get `2 instances of march`; logs `-march=native -mrtm -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation -Wno-address-of-packed-member -DRTE_LOG_DEFAULT_LOGTYPE=pmd.net.i40e -DCC_AVX512_SUPPORT -march=skylake-avx512`. Question-1: I think this is not expected right? The `-march=native` is populated from `cflags` and `-march= skylake-avx512` is populated from ` cc_avx512_flags`. Question-2: if the target is meet minimal ISA why not we use `-march=x86-64-v4`? Note: I am yet to check for cross build. Will update on cross build how this comes out. > > Regards, > /Bruce