On 8/31/2024 2:38 PM, Mihai Brodschi wrote:
> Hi Ferruh,
>
> Apologies for the late response.
> I've run some performance tests for the two proposed solutions.
> In the tables below, the rte_memcpy results correspond to this patch.
> The 2xpktmbuf_alloc results correspond to the other proposed solution.
>
> bash commands:
> server# ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single -l<SERVER_CORES> --file=test1 -- --nb-cores <NB_CORES> --txq <NB_CORES> --rxq <NB_CORES> --burst <BURST> -i
> client# ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single -l<CLIENT_CORES> --file=test2 -- --nb-cores <NB_CORES> --txq <NB_CORES> --rxq <NB_CORES> --burst <BURST> -i
>
> testpmd commands:
> client:
> testpmd> start
> server:
> testpmd> start tx_first
>
>
> CPU: AMD EPYC 7713P
> RAM: DDR4-3200
> OS: Debian 12
> DPDK: 22.11.1
> SERVER_CORES=72,8,9,10,11
> CLIENT_CORES=76,12,13,14,15
>
> Results:
> ==================================================================
> |                          | 1 CORE     | 2 CORES    | 4 CORES   |
> ==================================================================
> | unpatched burst=32       | 9.95 Gbps  | 19.24 Gbps | 36.4 Gbps |
> ------------------------------------------------------------------
> | 2xpktmbuf_alloc burst=32 | 9.86 Gbps  | 18.88 Gbps | 36.6 Gbps |
> ------------------------------------------------------------------
> | 2xpktmbuf_alloc burst=31 | 9.17 Gbps  | 18.69 Gbps | 35.1 Gbps |
> ------------------------------------------------------------------
> | rte_memcpy burst=32      | 9.54 Gbps  | 19.10 Gbps | 36.6 Gbps |
> ------------------------------------------------------------------
> | rte_memcpy burst=31      | 9.39 Gbps  | 18.53 Gbps | 35.5 Gbps |
> ==================================================================
>
>
> CPU: Intel Core i7-14700HX
> RAM: DDR5-5600
> OS: Ubuntu 24.04.1
> DPDK: 23.11.1
> SERVER_CORES=0,1,3,5,7
> CLIENT_CORES=8,9,11,13,15
>
> Results:
> ==================================================================
> |                          | 1 CORE     | 2 CORES    | 4 CORES   |
> ==================================================================
> | unpatched burst=32       | 15.52 Gbps | 27.35 Gbps | 46.8 Gbps |
> ------------------------------------------------------------------
> | 2xpktmbuf_alloc burst=32 | 15.49 Gbps | 27.68 Gbps | 46.4 Gbps |
> ------------------------------------------------------------------
> | 2xpktmbuf_alloc burst=31 | 14.98 Gbps | 26.75 Gbps | 45.2 Gbps |
> ------------------------------------------------------------------
> | rte_memcpy burst=32      | 15.99 Gbps | 28.44 Gbps | 49.3 Gbps |
> ------------------------------------------------------------------
> | rte_memcpy burst=31      | 14.85 Gbps | 27.32 Gbps | 46.3 Gbps |
> ==================================================================
>
Hi Mihai,

Thank you for the extensive testing.

The problematic case is "burst=31": between the '2xpktmbuf_alloc' and
'rte_memcpy' methods there is only a small difference, and neither is
consistently better than the other.
In this case I will proceed with the current patch.

>
> On 19/07/2024 12:03, Ferruh Yigit wrote:
>> On 7/8/2024 12:45 PM, Ferruh Yigit wrote:
>>> On 7/8/2024 4:39 AM, Mihai Brodschi wrote:
>>>>
>>>>
>>>> On 07/07/2024 21:46, Mihai Brodschi wrote:
>>>>>
>>>>>
>>>>> On 07/07/2024 18:18, Mihai Brodschi wrote:
>>>>>>
>>>>>>
>>>>>> On 07/07/2024 17:05, Ferruh Yigit wrote:
>>>>>>>
>>>>>>> My expectation is that the numbers should be like the following:
>>>>>>>
>>>>>>> Initially:
>>>>>>> size = 256
>>>>>>> head = 0
>>>>>>> tail = 0
>>>>>>>
>>>>>>> In the first refill:
>>>>>>> n_slots = 256
>>>>>>> head = 256
>>>>>>> tail = 0
>>>>>>>
>>>>>>> Subsequent run where 32 slots are used:
>>>>>>> head = 256
>>>>>>> tail = 32
>>>>>>> n_slots = 32
>>>>>>> rte_pktmbuf_alloc_bulk(mq, buf[head & mask], n_slots);
>>>>>>> head & mask = 0
>>>>>>> // So it fills the first 32 elements of the buffer, which is in bounds
>>>>>>>
>>>>>>> This will continue as above; the combination of filling only the gap
>>>>>>> and masking the head with 'mask' provides the wrapping required.
>>>>>>
>>>>>> If I understand correctly, this works only if eth_memif_rx_zc always
>>>>>> processes a number of packets which is a power of 2, so that the
>>>>>> ring's head always wraps around at the end of a refill loop, never in
>>>>>> the middle of it.
>>>>>> Is there any reason this should be the case?
>>>>>> Maybe the tests don't trigger the crash because this condition holds
>>>>>> true for them?
>>>>>
>>>>> Here's how to reproduce the crash on DPDK stable 23.11.1, using testpmd:
>>>>>
>>>>> Server:
>>>>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=server,bsize=1024,rsize=8 --single-file-segments -l2,3 --file-prefix test1 -- -i
>>>>>
>>>>> Client:
>>>>> # ./dpdk-testpmd --vdev=net_memif0,id=1,role=client,bsize=1024,rsize=8,zero-copy=yes --single-file-segments -l4,5 --file-prefix test2 -- -i
>>>>> testpmd> start
>>>>>
>>>>> Server:
>>>>> testpmd> start tx_first
>>>>> testpmd> set burst 15
>>>>>
>>>>> At this point, the client crashes with a segmentation fault.
>>>>> Before the burst is set to 15, its default value is 32.
>>>>> If the receiver processes packets in bursts of size 2^N, the crash
>>>>> does not occur.
>>>>> Setting the burst size to any power of 2 works; anything else crashes.
>>>>> After applying this patch, the crashes are completely gone.
>>>>
>>>> Sorry, this might not crash with a segmentation fault. To confirm the
>>>> mempool is corrupted, please compile DPDK with debug=true and the
>>>> c_args -DRTE_LIBRTE_MEMPOOL_DEBUG.
>>>> You should see the client panic when changing the burst size to not be
>>>> a power of 2.
>>>> This also works on the latest main branch.
>>>>
>>>
>>> Hi Mihai,
>>>
>>> Right, if the buffer size is not a multiple of the burst size, the
>>> issue is valid.
>>> And as there is a requirement to have the buffer size be a power of
>>> two, the burst should have the same.
>>> I assume this issue was not caught before because the default burst
>>> size is 32.
>>>
>>> Can you please share the performance impact of the change, with the two
>>> possible solutions we discussed above?
>>>
>>> The other option is to add this as a limitation of memif zero copy, but
>>> this won't be good for usability.
>>>
>>> We can decide based on the performance numbers.
>>>
>>>
>>
>> Hi Jakub,
>>
>> Do you have any comment on this?
>>
>> I think we should either document this as a limitation of the driver,
>> or fix it, and if so we need to decide on the fix.
>>
>
>
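
For readers following the thread, here is a minimal sketch of the refill
logic under discussion. It is illustrative only (the names RING_SIZE,
refill_single_bulk and 'buffers' are invented for this sketch, not the
actual eth_memif_rx_zc code) and it assumes a power-of-two ring size, as
memif requires. It shows why a single rte_pktmbuf_alloc_bulk() call
starting at 'head & mask' can write past the end of the buffer array when
the burst size is not a power of two:

#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

#define RING_SIZE 256                        /* power of two, as memif requires */
#define RING_MASK (RING_SIZE - 1)

static struct rte_mbuf *buffers[RING_SIZE];  /* mbuf pointers backing the ring */

/* Refill as described in the quoted walkthrough: one bulk allocation
 * starting at the masked head index.  'head' and 'tail' are free-running
 * counters that only ever increase. */
static int
refill_single_bulk(struct rte_mempool *mp, uint16_t head, uint16_t tail)
{
	uint16_t n_slots = RING_SIZE - (uint16_t)(head - tail); /* free slots */
	uint16_t idx = head & RING_MASK;

	/*
	 * If idx + n_slots > RING_SIZE, this call writes mbuf pointers past
	 * the end of 'buffers'.  With burst = 32 the head only ever stops at
	 * multiples of 32, so every refill ends exactly at the array
	 * boundary and the overrun never triggers; with burst = 31 or 15 the
	 * head eventually stops mid-array and the refill overruns.
	 */
	return rte_pktmbuf_alloc_bulk(mp, &buffers[idx], n_slots);
}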

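Along the same lines, a rough sketch of the two refill strategies compared
in the benchmarks above. Again, this is only an illustration of the idea,
not the submitted patch; the helper names are invented and error handling
is simplified:

#include <stdint.h>
#include <rte_common.h>
#include <rte_memcpy.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/* Option A ("2xpktmbuf_alloc"): split the refill at the wrap point and do
 * up to two bulk allocations directly into the ring's buffer array. */
static int
refill_split(struct rte_mempool *mp, struct rte_mbuf **buffers,
	     uint16_t size, uint16_t head, uint16_t n_slots)
{
	uint16_t idx = head & (size - 1);
	uint16_t first = RTE_MIN(n_slots, (uint16_t)(size - idx));
	int ret;

	ret = rte_pktmbuf_alloc_bulk(mp, &buffers[idx], first);
	if (ret < 0 || first == n_slots)
		return ret;
	/* The second segment restarts at index 0 after the wrap.  A failure
	 * here would have to release the first segment; omitted for brevity. */
	return rte_pktmbuf_alloc_bulk(mp, &buffers[0], n_slots - first);
}

/* Option B ("rte_memcpy"): allocate into a temporary linear array, then
 * copy the pointers into the ring in at most two segments. */
static int
refill_copy(struct rte_mempool *mp, struct rte_mbuf **buffers,
	    uint16_t size, uint16_t head, uint16_t n_slots)
{
	struct rte_mbuf *tmp[size];          /* VLA just to keep the sketch short */
	uint16_t idx = head & (size - 1);
	uint16_t first = RTE_MIN(n_slots, (uint16_t)(size - idx));

	if (rte_pktmbuf_alloc_bulk(mp, tmp, n_slots) < 0)
		return -1;
	rte_memcpy(&buffers[idx], tmp, first * sizeof(struct rte_mbuf *));
	if (first < n_slots)
		rte_memcpy(&buffers[0], &tmp[first],
			   (n_slots - first) * sizeof(struct rte_mbuf *));
	return 0;
}

The trade-off in these sketches: Option A writes into the ring array
directly but may need two mempool calls per refill, while Option B always
makes a single mempool call at the cost of an extra pointer copy.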
