On 11/26/2018 8:55 PM, Asaf Sinai wrote:
> +CC Ilia & Sasha.
>
> -----Original Message-----
> From: Burakov, Anatoly <anatoly.bura...@intel.com>
> Sent: Monday, November 26, 2018 04:57 PM
> To: Ilya Maximets <i.maxim...@samsung.com>; Asaf Sinai <asa...@radware.com>; 
> dev@dpdk.org; Thomas Monjalon <tho...@monjalon.net>
> Subject: Re: [dpdk-dev] CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES: no difference in 
> memory pool allocations, when enabling/disabling this configuration
>
> On 26-Nov-18 2:32 PM, Ilya Maximets wrote:
>> On 26.11.2018 17:21, Burakov, Anatoly wrote:
>>> On 26-Nov-18 2:10 PM, Ilya Maximets wrote:
>>>> On 26.11.2018 16:42, Burakov, Anatoly wrote:
>>>>> On 26-Nov-18 1:20 PM, Ilya Maximets wrote:
>>>>>> On 26.11.2018 16:16, Ilya Maximets wrote:
>>>>>>> On 26.11.2018 15:50, Burakov, Anatoly wrote:
>>>>>>>> On 26-Nov-18 11:43 AM, Burakov, Anatoly wrote:
>>>>>>>>> On 26-Nov-18 11:33 AM, Asaf Sinai wrote:
>>>>>>>>>> Hi Anatoly,
>>>>>>>>>>
>>>>>>>>>> We did not check it with "testpmd", only with our application.
>>>>>>>>>> From the beginning, we did not enable this configuration (look 
>>>>>>>>>> at attached files), and everything works fine.
>>>>>>>>>> Of course we rebuild DPDK, when we change configuration.
>>>>>>>>>> Please note that we use DPDK 17.11.3, maybe this is why it works 
>>>>>>>>>> fine?
>>>>>>>>> Just tested with DPDK 17.11, and yes, it does work the way you are 
>>>>>>>>> describing. This is not intended behavior. I will look into it.
>>>>>>>>>
>>>>>>>> +CC author of commit introducing CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES.
>>>>>>>>
>>>>>>>> Looking at the code, i think this config option needs to be reworked 
>>>>>>>> and we should clarify what we mean by this option. It appears that 
>>>>>>>> i've misunderstood what this option actually intended to do, and i 
>>>>>>>> also think its naming could be improved because it's confusing and 
>>>>>>>> misleading.
>>>>>>>>
>>>>>>>> In 17.11, this option does *not* prevent EAL from using NUMA - it 
>>>>>>>> merely disables using libnuma to perform memory allocation. This looks 
>>>>>>>> like intended (if counter-intuitive) behavior - disabling this option 
>>>>>>>> will simply revert DPDK to working as it did before this option was 
>>>>>>>> introduced (i.e. best-effort allocation). This is why your code still 
>>>>>>>> works - because EAL still does allocate memory on socket 1, and 
>>>>>>>> *knows* that it's socket 1 memory. It still supports NUMA.
>>>>>>>>
>>>>>>>> The commit message for these changes states that the actual purpose of 
>>>>>>>> this option is to enable "balanced" hugepage allocation. In case of 
>>>>>>>> cgroups limitations, previously, DPDK would've exhausted all hugepages 
>>>>>>>> on master core's socket before attempting to allocate from other 
>>>>>>>> sockets, but by the time we've reached cgroups limits on numbers of 
>>>>>>>> hugepages, we might not have reached socket 1 and thus missed out on 
>>>>>>>> the pages we could've allocated, but didn't. Using libnuma solves this 
>>>>>>>> issue, because now we can allocate pages on sockets we want, instead 
>>>>>>>> of hoping we won't run out of hugepages before we get the memory we 
>>>>>>>> need.
>>>>>>>>
>>>>>>>> In 18.05 onwards, this option works differently (and arguably wrong). 
>>>>>>>> More specifically, it disallows allocations on sockets other than 0, 
>>>>>>>> and it also makes it so that EAL does not check which socket the 
>>>>>>>> memory *actually* came from. So, not only is allocating memory from 
>>>>>>>> socket 1 disabled, but allocating from socket 0 may even get you 
>>>>>>>> memory from socket 1!
>>>>>>> I'd consider this as a bug.
>>>>>>>
>>>>>>>> +CC Thomas
>>>>>>>>
>>>>>>>> The CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES option is a misnomer, because 
>>>>>>>> it makes it seem like this option disables NUMA support, which is not 
>>>>>>>> the case.
>>>>>>>>
>>>>>>>> I would also argue that it is not relevant to 18.05+ memory subsystem, 
>>>>>>>> and should only work in legacy mode, because it is *impossible* to 
>>>>>>>> make it work right in the new memory subsystem, and here's why:
>>>>>>>>
>>>>>>>> Without libnuma, we have no way of "asking" the kernel to allocate a 
>>>>>>>> hugepage on a specific socket - instead, any allocation will most 
>>>>>>>> likely happen on the socket the allocation request came from. For 
>>>>>>>> example, if user program's lcore is on socket 1, allocation on socket 
>>>>>>>> 0 will actually allocate a page on socket 1.
>>>>>>>>
>>>>>>>> If we don't check for page's NUMA node affinity (which is what 
>>>>>>>> currently happens) - we get performance degradation because we may 
>>>>>>>> unintentionally allocate memory on wrong NUMA node. If we do check for 
>>>>>>>> this - then allocation of memory on socket 1 from lcore on socket 0 
>>>>>>>> will almost never succeed, because kernel will always give us pages on 
>>>>>>>> socket 0.
>>>>>>>>
>>>>>>>> Put simply, there is no sane way to make this option work for the 
>>>>>>>> new memory subsystem - IMO it should be dropped, and libnuma should be 
>>>>>>>> made a hard dependency on Linux.
>>>>>>> I agree that the new memory model cannot work without libnuma,
>>>>>>> i.e. it will lead to unpredictable memory allocations with no
>>>>>>> respect for the requested socket_ids. I also agree that
>>>>>>> CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES is only sane for a legacy memory 
>>>>>>> model.
>>>>>>> It looks like we have no other choice than just drop the option
>>>>>>> and make the code unconditional, i.e. have hard dependency on libnuma.
>>>>>>>
>>>>>> We, probably, could compile this code and have hard dependency
>>>>>> only for platforms with 'RTE_MAX_NUMA_NODES > 1'.
>>>>> Well, as long as legacy mode stays supported, we have to keep the option. 
>>>>> The "drop" part was referring to supporting it under the new memory 
>>>>> system, not a literal drop from config files.
>>>> The option was introduced because we didn't want to introduce the
>>>> new hard dependency. Since we'll have it anyway, I'm not sure if
>>>> keeping the option for legacy mode makes any sense.
>>> Oh yes, you're right. Drop it is!
>>>
>>>>> As for using RTE_MAX_NUMA_NODES, i don't think it's merited. 
>>>>> Distributions cannot deliver different DPDK versions based on the number 
>>>>> of sockets on a particular machine - so it would have to be a hard 
>>>>> dependency for distributions anyway (does any distribution ship DPDK 
>>>>> without libnuma?).
>>>> At least ARMv7 builds commonly do not ship a libnuma package.
>>> Do you mean libnuma builds for ARMv7 are not available? Or do you mean the 
>>> libnuma package is not installed by default?
>>>
>>> If it's the latter, then i believe it's not installed by default anywhere, 
>>> but if using distribution version of DPDK, libnuma will be taken care of 
>>> via package manager. Presumably building from source can be taken care of 
>>> with pkg-config/meson.
>>>
>>> Or do you mean ARMv7 does not have libnuma for their arch at all, in any 
>>> distro?
>> libnuma builds for ARMv7 are not available in most of the distros. I
>> didn't check all, but here are the results for Ubuntu:
>>       
>> https://packages.ubuntu.com/search?suite=bionic&arch=armhf&searchon=names&keywords=libnuma
>>
>> You may see that Ubuntu 18.04 (bionic) has no libnuma package for
>> 'armhf' and also 'powerpc' platforms.
>>
> That's a difficulty. Do these platforms support NUMA? In other words, could 
> we replace this flag with just outright disabling NUMA support?

Many platforms don't support NUMA, so they don't really need libnuma.

Mandating libnuma will also break several things:

   - Cross-building for ARM on x86, which is among the preferred build 
methods for many in the ARM community.

   - Many embedded SoCs have no NUMA support and use a small 
rootfs (e.g. Yocto). Adding libnuma there would be a burden.


>
>>>>> For those compiling from source - are there any supported
>>>>> distributions which don't package libnuma? I don't see much sense
>>>>> in keeping libnuma optional, IMO. This is of course up to the tech
>>>>> board to decide, but the "without libnuma it's basically
>>>>> broken" argument is very strong in my opinion :)
>>>>>
>>>
>
> --
> Thanks,
> Anatoly
