On 9/20/22 15:01, Mengxin Liu wrote:
> We tested ovs from both the 2.15 and 2.17 branches. The leak is not that
> large: about 26 KB of leaked memory detected by memleak-bpfcc when
> creating and deleting 800 ports.
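
For scale, 26 KB over 800 create/delete cycles is roughly 33 bytes per
port, which is consistent with one small per-port bookkeeping node being
left behind for each deleted port. A minimal sketch of a standalone check
that could quantify this outside the full kube-ovn stack (the bridge name,
port count, and use of internal ports are illustrative assumptions, not
the setup from the report):

# Throwaway bridge; internal ports avoid the need for tap devices.
ovs-vsctl --may-exist add-br br-leaktest
PID=$(pidof ovs-vswitchd)
grep VmRSS /proc/$PID/status        # resident size before the churn

for i in $(seq 1 800); do
    ovs-vsctl add-port br-leaktest p$i -- set Interface p$i type=internal
    ovs-vsctl del-port br-leaktest p$i
done

grep VmRSS /proc/$PID/status        # resident size after the churn
ovs-vsctl del-br br-leaktest

Note that a ~26 KB delta can be below allocator noise in the RSS numbers,
so memleak-bpfcc remains the more reliable signal.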
> 
> Our environment is complex, with ovs, ovn, kubernetes, kube-ovn, etc.
> I will try to reproduce it in a minimal setup.
> 
> 
> On Tue, 20 Sept 2022 at 20:40, Aaron Conole <[email protected]> wrote:
>>
>> Mengxin Liu <[email protected]> writes:
>>
>>> Sorry for the incomplete information. This patch resolved the problem
>>> in my environment, but I'm not familiar with the recycling logic or
>>> whether this change affects it. Here is a description of how we found
>>> and resolved the leak; I hope it helps you analyze this issue further.
>>>
>>> The leak was found in a Kubernetes cluster that runs thousands of CI
>>> jobs every day, with kube-ovn as the CNI.
>>> We found that the vswitchd memory keeps increasing even after the jobs
>>> have finished and all related ovs ports have been deleted.
>>
>> Strange - I have a test case going now - adding and deleting ports; so
>> far, no memory issues.  Can you generate a small reproducer?  For
>> reference, I'm doing something like:
>>
>> for I in $(seq 0 1000); do
>>     ip tuntap add mode tap p$I
>>     sleep 1
>>     ovs-vsctl add-port br-int p$I
>>     sleep 5
>>     ovs-vsctl del-port p$I
>>     ip tuntap del mode tap p$I
>> done
>>
>> And this doesn't seem to show any change to virt/res memory on my
>> system.  I will switch to master branch as well (I'm on 2.14 branch for
>> the moment) once the test completes.
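
One way to make any growth visible while a loop like the one above runs is
to sample the daemon's virtual and resident sizes from a second terminal,
e.g. (standard tooling only, nothing OVS-specific):

watch -n 5 'ps -o vsz=,rss= -p $(pidof ovs-vswitchd)'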
>>
>>> Then we used memleak-bpfcc to look for the potential memory leak with the
>>> command "memleak-bpfcc -o 300000 -p 2099300 3600" and kept creating and
>>> deleting ovs ports during the detection period. After the detection it
>>> showed that the ofport_usage objects were not released, and the number of
>>> outstanding objects equaled the number of ports we created and deleted.
>>>
>>> With this patch applied, memleak-bpfcc no longer finds any leak and the
>>> memory no longer keeps increasing.
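
For reference, memleak-bpfcc prints the call stacks of the outstanding
allocations it reports, which is the kind of trace asked for later in this
thread. A sketch of a shorter capture against the live daemon (the -o and
interval values here are illustrative, not the ones from the original run):

# Report allocations older than 60 s, once per minute, and keep the output.
memleak-bpfcc -p $(pidof ovs-vswitchd) -o 60000 60 | tee ovs-memleak.log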
>>
>> What version of OVS are you using?
>>
>>> On Mon, 19 Sept 2022 at 21:57, Aaron Conole <[email protected]> wrote:
>>>>
>>>> Mengxin Liu <[email protected]> writes:
>>>>
>>>>> Signed-off-by: Mengxin Liu <[email protected]>
>>>>> ---
>>>>>  ofproto/ofproto.c | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
>>>>> index 3a527683c..7c6cb1c56 100644
>>>>> --- a/ofproto/ofproto.c
>>>>> +++ b/ofproto/ofproto.c
>>>>> @@ -2429,7 +2429,7 @@ static void
>>>>>  dealloc_ofp_port(struct ofproto *ofproto, ofp_port_t ofp_port)
>>>>>  {
>>>>>      if (ofp_to_u16(ofp_port) < ofproto->max_ports) {
>>>>> -        ofport_set_usage(ofproto, ofp_port, time_msec());
>>>>> +        ofport_remove_usage(ofproto, ofp_port);

This change defeats the port re-use logic and will cause OpenFlow rules
to be applied to incorrect ports in case of port churn in a system.
So, unfortunately, the patch in its current form cannot be accepted.
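
The re-use behaviour can be observed from the shell. A quick sketch, using
the same br-int bridge as the reproducer above (adjust to an existing
bridge; the port names are illustrative): note the OpenFlow port number a
port gets, delete it, add a new port, and compare the numbers.

ovs-vsctl add-port br-int p1 -- set Interface p1 type=internal
ovs-vsctl get Interface p1 ofport   # note the assigned OpenFlow port number
ovs-vsctl del-port br-int p1
ovs-vsctl add-port br-int p2 -- set Interface p2 type=internal
ovs-vsctl get Interface p2 ofport   # per the discussion above, the usage
                                    # record keeps p1's old number from
                                    # being handed out again immediately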

And I do not see at the moment how the usage node can be leaked
without leaking the whole ofproto structure.

If you have more detailed logs for the memory leak, e.g. the call
trace that actually leads to the leak, we should look at that.

Best regards, Ilya Maximets.


>>>>>      }
>>>>>  }
>>>>>
>>>>> --
>>>>
>>>> This interferes with the ofp port recycling code.  Can you describe the
>>>> leak?  I don't really understand where / what the leak is, and the
>>>> commit message is very sparse here.  Please give details about when the
>>>> memory leaks, how you found it, and whether/how this changes the
>>>> recycling code.
>>>>
>>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
