Matias Elo(matiaselo) replied on github web page:
platform/linux-generic/odp_pool.c
@@ -232,19 +237,40 @@ static void init_buffers(pool_t *pool)
uint32_t mask;
int type;
uint32_t seg_size;
+ uint64_t page_size = 0;
+ int skipped_blocks = 0;
+
+ if (odp_shm_info(pool->shm, &shm_info) == 0)
+ page_size = shm_info.page_size;
Comment:
OK, I'll add an abort branch.
> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
> This is happening within `odp_pool_create()`, which is creating the
> `odp_pool_t`, so other threads don't have that handle with which to call
> `odp_pool_destroy()`.
>
> The only way `odp_shm_info()` can fail is if it's passed an unknown
> `odp_shm_t` handle, which is not the case here.
>
> I have no objection to adding an `else ODP_ABORT()` here, however.
>> muvarov wrote:
>> Actually it can: it can return -1 for several reasons. Also, calling
>> pool_create() and pool_destroy() from different threads at the same time
>> would make the shm invalid before init_buffers(). But that is a corner
>> case and a programming error :)
>>
>> But it works. My point was a little different: in my understanding,
>> assuming that an ODP API function will fail is not correct in places like
>> this. It reads like a workaround for what to do if the call fails.
>> Instead, the platform should be forced to implement this API call
>> correctly, especially in the reference implementation. I would prefer
>> ODP_ABORT() if the call fails, since failure is not expected here.
>>> Matias Elo(matiaselo) wrote:
>>> As Bill noted, even if odp_shm_info() could somehow fail, nothing would
>>> break.
>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>> @muvarov In the unlikely (impossible in the case of the current
>>>> linux-generic implementation) event of this call failing, `page_size` will
>>>> be left at the initialized default 0 value, which means we won't skip any
>>>> buffers since 0 is less than `FIRST_HP_SIZE`. So no error here.
>>>>> muvarov wrote:
>>>>> If the call fails, shouldn't that be treated as an error? Isn't the
>>>>> result unpredictable in that case?
>>>>>> Matias Elo(matiaselo) wrote:
>>>>>> As you noted, there can be at most N crossings with the original pool
>>>>>> size ('num'). Now, to compensate for the skipped buffers, the pool
>>>>>> size is in practice increased by 'num_extra'. When the original pool
>>>>>> size is large enough, the extra buffers may again cross a page
>>>>>> boundary.
>>>>>>
>>>>>> Currently, a single buffer is 8KB, so a single 2MB page can hold at
>>>>>> most 256 buffers. With a pool size of 64K or larger, the extra
>>>>>> buffers will cross a page boundary again.
>>>>>>
>>>>>> Using 2x multiplication here is overkill. I'll update the patch and
>>>>>> calculate the exact number of required extra buffers.
>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>> The dpdk drivers only support huge page memory, so packets crossing
>>>>>>> 4K page boundaries aren't a problem.
>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>> The blocks are contiguous so you're computing how many of them can fit
>>>>>>>> in a 2M huge page. Since blocks aren't themselves larger than a huge
>>>>>>>> page I don't see what's wrong with my analysis.
>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>> Fair enough. I agree these sorts of considerations should be
>>>>>>>>> transparent to the application. This will complicate the sort of
>>>>>>>>> index-juggling that VPP does (see PR #200). We'll have to consider
>>>>>>>>> this in the discussions planned for Monday's ARCH call.
>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>> I'd call it huge pages if that's the problem you're trying to
>>>>>>>>>> address. Any jumbo packet is going to cross a 4K boundary if it
>>>>>>>>>> isn't segmented. I assume that isn't a problem?
>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>> Yes, I'm referring to all page sizes >= 2MB, so the same operation
>>>>>>>>>>> is also done with 1GB pages. I was struggling to figure out a good
>>>>>>>>>>> name for this define, so all ideas are welcome.
>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>> Currently this is required for the dpdk pmd drivers only, but
>>>>>>>>>>>> if/when we are adding support for more drivers this issue is
>>>>>>>>>>>> likely to re-emerge. IMHO the applications shouldn't have to know
>>>>>>>>>>>> about this, so I wouldn't add a new parameter.
>>>>>>>>>>>>> Matias Elo(matiaselo) wrote:
>>>>>>>>>>>>> Also the extra buffers can cross page boundaries.
>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>> Why the `2 *` factor here? The number of block size "page"
>>>>>>>>>>>>>> crossings is going to be limited to the rounded up number of
>>>>>>>>>>>>>> pages:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>> <--------------------------> "n" pages in size
>>>>>>>>>>>>>> ---+--------+--- ... ---+---- There can be at most n crossings,
>>>>>>>>>>>>>> (n - 1) if block happily starts on
>>>>>>>>>>>>>> a page boundary.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ```
>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>> Shouldn't this processing be an `odp_pool_param_t` option?
>>>>>>>>>>>>>>> Doing this unconditionally because "some NICs" may have
>>>>>>>>>>>>>>> problems seems unnecessary.
>>>>>>>>>>>>>>>> Bill Fischofer(Bill-Fischofer-Linaro) wrote:
>>>>>>>>>>>>>>>> This is somewhat confusing terminology. Pages are typically
>>>>>>>>>>>>>>>> 4K in size. It seems you're referring to huge pages here.
>>>>>>>>>>>>>>>> What about systems configured with 1GB huge pages?
https://github.com/Linaro/odp/pull/223#discussion_r143709558
updated_at 2017-10-10 12:39:44