> On Apr 10, 2019, at 11:07 PM, Pavan Nikhilesh Bhagavatula 
> <pbhagavat...@marvell.com> wrote:
> 
> Hi Yongseok,
> 
>> -----Original Message-----
>> From: Yongseok Koh <ys...@mellanox.com>
>> Sent: Wednesday, April 10, 2019 11:08 PM
>> To: Pavan Nikhilesh Bhagavatula <pbhagavat...@marvell.com>
>> Cc: Thomas Monjalon <tho...@monjalon.net>; dev <dev@dpdk.org>; Jerin
>> Jacob Kollanukkaran <jer...@marvell.com>; jerinjac...@gmail.com
>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>> machine specific flags
>> 
>>> On Apr 10, 2019, at 9:13 AM, jerinjac...@gmail.com wrote:
>>> 
>>> From: Pavan Nikhilesh <pbhagavat...@marvell.com>
>>> 
>>> Currently, RTE_* flags are set based on the implementer ID but there
>>> might be some micro arch specific differences from the same vendor eg.
>>> CACHE_LINESIZE. Add support to set micro arch specific flags.
>>> 
>>> Signed-off-by: Pavan Nikhilesh <pbhagavat...@marvell.com>
>>> Signed-off-by: Jerin Jacob <jer...@marvell.com>
>>> ---
>>> config/arm/meson.build | 56 ++++++++++++++++++++++++------------------
>>> 1 file changed, 32 insertions(+), 24 deletions(-)
>>> 
>>> diff --git a/config/arm/meson.build b/config/arm/meson.build
>>> index 170a4981a..24bce2b39 100644
>>> --- a/config/arm/meson.build
>>> +++ b/config/arm/meson.build
>>> @@ -7,25 +7,6 @@ march_opt = '-march=@0@'.format(machine)
>>> 
>>> arm_force_native_march = false
>>> 
>>> -machine_args_generic = [
>>> -   ['default', ['-march=armv8-a+crc+crypto']],
>>> -   ['native', ['-march=native']],
>>> -   ['0xd03', ['-mcpu=cortex-a53']],
>>> -   ['0xd04', ['-mcpu=cortex-a35']],
>>> -   ['0xd05', ['-mcpu=cortex-a55']],
>>> -   ['0xd07', ['-mcpu=cortex-a57']],
>>> -   ['0xd08', ['-mcpu=cortex-a72']],
>>> -   ['0xd09', ['-mcpu=cortex-a73']],
>>> -   ['0xd0a', ['-mcpu=cortex-a75']],
>>> -   ['0xd0b', ['-mcpu=cortex-a76']],
>>> -]
>>> -machine_args_cavium = [
>>> -   ['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>> -   ['native', ['-march=native']],
>>> -   ['0xa1', ['-mcpu=thunderxt88']],
>>> -   ['0xa2', ['-mcpu=thunderxt81']],
>>> -   ['0xa3', ['-mcpu=thunderxt83']]]
>>> -
>>> flags_common_default = [
>>>     # Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
>>>     # to determine the best threshold in code. Refer to notes in source file
>>> @@ -52,12 +33,10 @@ flags_generic = [
>>>     ['RTE_USE_C11_MEM_MODEL', true],
>>>     ['RTE_CACHE_LINE_SIZE', 128]]
>>> flags_cavium = [
>>> -   ['RTE_MACHINE', '"thunderx"'],
>>>     ['RTE_CACHE_LINE_SIZE', 128],
>>>     ['RTE_MAX_NUMA_NODES', 2],
>>>     ['RTE_MAX_LCORE', 96],
>>> -   ['RTE_MAX_VFIO_GROUPS', 128],
>>> -   ['RTE_USE_C11_MEM_MODEL', false]]
>>> +   ['RTE_MAX_VFIO_GROUPS', 128]]
>>> flags_dpaa = [
>>>     ['RTE_MACHINE', '"dpaa"'],
>>>     ['RTE_USE_C11_MEM_MODEL', true],
>>> @@ -71,6 +50,27 @@ flags_dpaa2 = [
>>>     ['RTE_MAX_NUMA_NODES', 1],
>>>     ['RTE_MAX_LCORE', 16],
>>>     ['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
>>> +flags_default_extra = []
>>> +flags_thunderx_extra = [
>>> +   ['RTE_MACHINE', '"thunderx"'],
>>> +   ['RTE_USE_C11_MEM_MODEL', false]]
>>> +
>>> +machine_args_generic = [
>>> +   ['default', ['-march=armv8-a+crc+crypto']],
>>> +   ['native', ['-march=native']],
>>> +   ['0xd03', ['-mcpu=cortex-a53']],
>>> +   ['0xd04', ['-mcpu=cortex-a35']],
>>> +   ['0xd07', ['-mcpu=cortex-a57']],
>>> +   ['0xd08', ['-mcpu=cortex-a72']],
>>> +   ['0xd09', ['-mcpu=cortex-a73']],
>>> +   ['0xd0a', ['-mcpu=cortex-a75']]]
>>> +
>>> +machine_args_cavium = [
>>> +   ['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>> +   ['native', ['-march=native']],
>>> +   ['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
>>> +   ['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
>>> +   ['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
>>> 
>>> ## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page G7-5321)
>>> impl_generic = ['Generic armv8', flags_generic, machine_args_generic]
>>> @@ -157,8 +157,16 @@ else
>>>     endif
>>>     foreach marg: machine[2]
>>>             if marg[0] == impl_pn
>>> -                   foreach f: marg[1]
>>> -                           machine_args += f
>>> +                   foreach flag: marg[1]
>>> +                           if cc.has_argument(flag)
>>> +                                   machine_args += flag
>>> +                           endif
>>> +                   endforeach
>>> +                   # Apply any extra machine specific flags.
>>> +                   foreach flag: marg.get(2, flags_default_extra)
>>> +                           if flag.length() > 0
>>> +                                   dpdk_conf.set(flag[0], flag[1])
>>> +                           endif
>> 
>> Let me continue the discussion from v7 here.
>> Seems I wasn't clear enough.
>> 
>> Let me take an example. If the host is thunderx2 (0xaf) and compiler is older
>> than v7, flags_thunderx2_extra isn't set. This means, for example,
>> RTE_CACHE_LINE_SIZE will still be 128. Is that what you want?
>> RTE_CACHE_LINE_SIZE has nothing to do with compiler support and you might
>> want to set it regardless of gcc version. You could skip setting -mcpu while
>> still setting the extra flags.
>> 
> 
> Thanks for the detailed explanation.
> Since we already have the check that skips the -mcpu flag when cc doesn't
> support it (cc.has_argument(flag)), I think it will be safe to remove
> `
>        # Primary part number based mcpu flags are supported
>        # for gcc versions > 7
>        if cc.version().version_compare(
>                        '<7.0') or cmd_output.length() == 0
>                if not meson.is_cross_build() and arm_force_native_march == true
>                        impl_pn = 'native'
>                else
>                        impl_pn = 'default'
>                endif
>        endif
> `

+1

> 
> The command output check can also be removed as it is handled when calling 
> the command script itself.

+1
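
To make the agreed behaviour concrete, here is a rough model of the lookup in Python rather than Meson. It is only a sketch, not DPDK code: `resolve` and `has_argument` are hypothetical names, and the table simply mirrors machine_args_cavium from the patch. The point is that an unsupported -mcpu is dropped while the extra flags are still applied.

```python
# Hypothetical model of the per-part-number lookup in config/arm/meson.build.
# Names below are illustrative; only the table contents come from the patch.
FLAGS_DEFAULT_EXTRA = []
FLAGS_THUNDERX_EXTRA = [
    ("RTE_MACHINE", '"thunderx"'),
    ("RTE_USE_C11_MEM_MODEL", False),
]

MACHINE_ARGS_CAVIUM = [
    ("default", ["-march=armv8-a+crc+crypto", "-mcpu=thunderx"]),
    ("native", ["-march=native"]),
    ("0xa1", ["-mcpu=thunderxt88"], FLAGS_THUNDERX_EXTRA),
    ("0xa2", ["-mcpu=thunderxt81"], FLAGS_THUNDERX_EXTRA),
    ("0xa3", ["-mcpu=thunderxt83"], FLAGS_THUNDERX_EXTRA),
]


def resolve(table, impl_pn, has_argument):
    """Return (machine_args, defines) for the entry matching impl_pn.

    has_argument stands in for cc.has_argument(): a callable that says
    whether the compiler accepts a given flag.
    """
    machine_args = []
    defines = {}
    for entry in table:
        if entry[0] != impl_pn:
            continue
        # Compiler flags are kept only when the compiler accepts them.
        machine_args += [flag for flag in entry[1] if has_argument(flag)]
        # Extra defines are applied regardless of compiler support.
        extra = entry[2] if len(entry) > 2 else FLAGS_DEFAULT_EXTRA
        for name, value in extra:
            defines[name] = value
    return machine_args, defines


# Pretend the compiler is too old to know the part-specific -mcpu values.
old_compiler = lambda flag: not flag.startswith("-mcpu=thunderxt")
args, defines = resolve(MACHINE_ARGS_CAVIUM, "0xa1", old_compiler)
```

With the pretend old compiler, `args` ends up empty but `defines` still carries the thunderx settings, which is the case I was worried about for thunderx2.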

> 
> Thoughts?
> 
> PS. I think the safest way to set CACHELINE_SIZE is to read the cache type
> register[1], but sadly only a few recent kernels expose it through sysfs
> (/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size).

+1
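
For what it's worth, the sysfs lookup could be probed with something like the sketch below. It assumes the path quoted in the PS; the function name and the fallback are hypothetical, and 128 is used only because it matches RTE_CACHE_LINE_SIZE in flags_generic above.

```python
from pathlib import Path

# Path quoted in the PS; only present on kernels that export it.
SYSFS_LINE_SIZE = Path(
    "/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size")


def read_cache_line_size(path=SYSFS_LINE_SIZE, fallback=128):
    """Return the cache line size in bytes, or `fallback` if unavailable.

    The fallback covers kernels without the sysfs entry and any content
    that does not parse as an integer.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return fallback


if __name__ == "__main__":
    print(read_cache_line_size())
```

This only helps native builds, of course; a cross build would still need the value from the flags_*_extra[] tables.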

In summary, +3. LoL

I'll also submit a patch to change the default cacheline size of cortex-a72
using the new flags_*_extra[].


thanks,
Yongseok
