> On Apr 10, 2019, at 11:07 PM, Pavan Nikhilesh Bhagavatula
> <pbhagavat...@marvell.com> wrote:
>
> Hi Yongseok,
>
>> -----Original Message-----
>> From: Yongseok Koh <ys...@mellanox.com>
>> Sent: Wednesday, April 10, 2019 11:08 PM
>> To: Pavan Nikhilesh Bhagavatula <pbhagavat...@marvell.com>
>> Cc: Thomas Monjalon <tho...@monjalon.net>; dev <dev@dpdk.org>; Jerin
>> Jacob Kollanukkaran <jer...@marvell.com>; jerinjac...@gmail.com
>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>> machine specific flags
>>
>>> On Apr 10, 2019, at 9:13 AM, jerinjac...@gmail.com wrote:
>>>
>>> From: Pavan Nikhilesh <pbhagavat...@marvell.com>
>>>
>>> Currently, RTE_* flags are set based on the implementer ID but there
>>> might be some micro arch specific differences from the same vendor, e.g.
>>> CACHE_LINESIZE. Add support to set micro arch specific flags.
>>>
>>> Signed-off-by: Pavan Nikhilesh <pbhagavat...@marvell.com>
>>> Signed-off-by: Jerin Jacob <jer...@marvell.com>
>>> ---
>>> config/arm/meson.build | 56 ++++++++++++++++++++++++------------------
>>> 1 file changed, 32 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/config/arm/meson.build b/config/arm/meson.build
>>> index 170a4981a..24bce2b39 100644
>>> --- a/config/arm/meson.build
>>> +++ b/config/arm/meson.build
>>> @@ -7,25 +7,6 @@ march_opt = '-march=@0@'.format(machine)
>>>
>>>  arm_force_native_march = false
>>>
>>> -machine_args_generic = [
>>> -	['default', ['-march=armv8-a+crc+crypto']],
>>> -	['native', ['-march=native']],
>>> -	['0xd03', ['-mcpu=cortex-a53']],
>>> -	['0xd04', ['-mcpu=cortex-a35']],
>>> -	['0xd05', ['-mcpu=cortex-a55']],
>>> -	['0xd07', ['-mcpu=cortex-a57']],
>>> -	['0xd08', ['-mcpu=cortex-a72']],
>>> -	['0xd09', ['-mcpu=cortex-a73']],
>>> -	['0xd0a', ['-mcpu=cortex-a75']],
>>> -	['0xd0b', ['-mcpu=cortex-a76']],
>>> -]
>>> -machine_args_cavium = [
>>> -	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>> -	['native', ['-march=native']],
>>> -	['0xa1', ['-mcpu=thunderxt88']],
>>> -	['0xa2', ['-mcpu=thunderxt81']],
>>> -	['0xa3', ['-mcpu=thunderxt83']]]
>>> -
>>>  flags_common_default = [
>>>  	# Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
>>>  	# to determine the best threshold in code. Refer to notes in source file
>>> @@ -52,12 +33,10 @@ flags_generic = [
>>>  	['RTE_USE_C11_MEM_MODEL', true],
>>>  	['RTE_CACHE_LINE_SIZE', 128]]
>>>  flags_cavium = [
>>> -	['RTE_MACHINE', '"thunderx"'],
>>>  	['RTE_CACHE_LINE_SIZE', 128],
>>>  	['RTE_MAX_NUMA_NODES', 2],
>>>  	['RTE_MAX_LCORE', 96],
>>> -	['RTE_MAX_VFIO_GROUPS', 128],
>>> -	['RTE_USE_C11_MEM_MODEL', false]]
>>> +	['RTE_MAX_VFIO_GROUPS', 128]]
>>>  flags_dpaa = [
>>>  	['RTE_MACHINE', '"dpaa"'],
>>>  	['RTE_USE_C11_MEM_MODEL', true],
>>> @@ -71,6 +50,27 @@ flags_dpaa2 = [
>>>  	['RTE_MAX_NUMA_NODES', 1],
>>>  	['RTE_MAX_LCORE', 16],
>>>  	['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
>>> +flags_default_extra = []
>>> +flags_thunderx_extra = [
>>> +	['RTE_MACHINE', '"thunderx"'],
>>> +	['RTE_USE_C11_MEM_MODEL', false]]
>>> +
>>> +machine_args_generic = [
>>> +	['default', ['-march=armv8-a+crc+crypto']],
>>> +	['native', ['-march=native']],
>>> +	['0xd03', ['-mcpu=cortex-a53']],
>>> +	['0xd04', ['-mcpu=cortex-a35']],
>>> +	['0xd07', ['-mcpu=cortex-a57']],
>>> +	['0xd08', ['-mcpu=cortex-a72']],
>>> +	['0xd09', ['-mcpu=cortex-a73']],
>>> +	['0xd0a', ['-mcpu=cortex-a75']]]
>>> +
>>> +machine_args_cavium = [
>>> +	['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>> +	['native', ['-march=native']],
>>> +	['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
>>> +	['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
>>> +	['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
>>>
>>>  ## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page G7-5321)
>>>  impl_generic = ['Generic armv8', flags_generic, machine_args_generic]
>>> @@ -157,8 +157,16 @@ else
>>>  	endif
>>>  	foreach marg: machine[2]
>>>  		if marg[0] == impl_pn
>>> -			foreach f: marg[1]
>>> -				machine_args += f
>>> +			foreach flag: marg[1]
>>> +				if cc.has_argument(flag)
>>> +					machine_args += flag
>>> +				endif
>>> +			endforeach
>>> +			# Apply any extra machine specific flags.
>>> +			foreach flag: marg.get(2, flags_default_extra)
>>> +				if flag.length() > 0
>>> +					dpdk_conf.set(flag[0], flag[1])
>>> +				endif
>>
>> Let me continue the discussion from v7 here.
>> Seems I wasn't clear enough.
>>
>> Let me take an example. If the host is thunderx2 (0xaf) and the compiler is
>> older than v7, flags_thunderx2_extra isn't set. This means, for example,
>> RTE_CACHE_LINE_SIZE will still be 128. Is that what you want?
>> RTE_CACHE_LINE_SIZE has nothing to do with compiler support and you might
>> want to set it regardless of gcc version. You could skip setting -mcpu while
>> still setting the extra flags.
>>
>
> Thanks for the detailed explanation.
> I think since we have the check to skip the -mcpu flag when cc doesn't
> support it (cc.has_argument(flag)), it will be safe to remove
> `
> # Primary part number based mcpu flags are supported
> # for gcc versions > 7
> if cc.version().version_compare('<7.0') or cmd_output.length() == 0
> 	if not meson.is_cross_build() and arm_force_native_march == true
> 		impl_pn = 'native'
> 	else
> 		impl_pn = 'default'
> 	endif
> endif
> `

+1

> The command output check can also be removed as it is handled when calling
> the command script itself.

+1

> Thoughts?
>
> PS. I think the safest way to set CACHELINE_SIZE is to read the cache type
> register[1], but sadly only a few recent kernels expose it through sysfs
> (/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size)

+1

In summary, +3. LoL

I'll also submit a patch to change the default cacheline size of cortex-a72
with the new flags_*_extra[].

thanks,
Yongseok
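
For reference, the behaviour agreed on in this thread can be sketched outside of Meson. The Python below is only an illustration, not the code from the patch: select_machine_config() and old_compiler() are hypothetical names, and old_compiler() is an assumed stand-in for cc.has_argument() on a toolchain that rejects the part-specific -mcpu values. It shows -mcpu being dropped when the compiler does not accept it while the extra RTE_* flags for the part number are still applied.

```python
def select_machine_config(machine_args, part_number, compiler_supports,
                          default_extra=()):
    """Return (compiler_args, config_flags) for the given part number.

    Mirrors the loop in the patch: each entry is
    (part_number, [compiler args], optional [extra config flags]).
    """
    compiler_args = []
    config_flags = {}
    for entry in machine_args:
        if entry[0] != part_number:
            continue
        # Compiler arguments are kept only if the compiler accepts them.
        for arg in entry[1]:
            if compiler_supports(arg):
                compiler_args.append(arg)
        # Extra config flags do not depend on compiler support.
        extra = entry[2] if len(entry) > 2 else default_extra
        for name, value in extra:
            config_flags[name] = value
    return compiler_args, config_flags


# Values copied from the quoted patch (only one part number shown).
FLAGS_THUNDERX_EXTRA = [
    ("RTE_MACHINE", '"thunderx"'),
    ("RTE_USE_C11_MEM_MODEL", False),
]

MACHINE_ARGS_CAVIUM = [
    ("default", ["-march=armv8-a+crc+crypto", "-mcpu=thunderx"]),
    ("native", ["-march=native"]),
    ("0xa1", ["-mcpu=thunderxt88"], FLAGS_THUNDERX_EXTRA),
]


def old_compiler(arg):
    # Hypothetical stand-in for cc.has_argument(): this pretend compiler
    # rejects the part-specific -mcpu values.
    return not arg.startswith("-mcpu=thunderxt")


args, conf = select_machine_config(MACHINE_ARGS_CAVIUM, "0xa1", old_compiler)
print(args)
print(conf)
```

With the pretend old compiler, the argument list for 0xa1 comes back empty, yet RTE_MACHINE and RTE_USE_C11_MEM_MODEL are still set. That is the point of dropping the gcc version fallback to 'default'.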