On Sun, May 26, 2024 at 04:08:18PM +0800, Wilson Peng via dev wrote:
> From: Wilson Peng <[email protected]>
>
> While deploying Tanzu Kubernetes (an Antrea-based solution) at a
> Broadcom customer, it is sometimes found that the kernel thread
> OvsConntrackEntryCleaner is not started after the Windows node is
> rebooted under unexpected conditions. A similar issue can also be
> observed in a local Antrea setup via Clean-AntreaNetwork.ps1, which
> runs Remove-VMSwitch and re-creates the switch on the Windows node.
>
> Checking the local conntrack dump shows that OVS does not remove
> connection entries even though their timeouts have expired; the dump
> contains entries created several hours earlier that are still in a
> state (TIME_WAIT) that should have been deleted long before. At that
> point the number of entries in the OVS conntrack zone, roughly 19k,
> is far below the upper limit. Flushing conntrack with
> "ovs-dpctl.exe flush-conntrack" removes all the entries.
>
> This patch adds a complete sanity check when starting the
> OvsConntrackEntryCleaner thread. If anything abnormal happens, the
> thread creation is rolled back.
>
> The Antrea team helped run regression tests with a build that
> includes this patch, and the tests pass; the conntrack entries no
> longer fail to time out under the same reproduction conditions.
>
> It would be good to apply the fix to main and backport it as far
> back as 2.17.
>
> Signed-off-by: Wilson Peng <[email protected]>
Hi Wilson,
Thanks for your patch.
Some mechanical feedback on the patch submission:
* Please avoid the term "sanity check", as per the
  Inclusive Naming Word List v1.0, which has been adopted by OVS.
* Please line-wrap the patch description to 75 characters,
  except where that doesn't make sense, such as Fixes: tags.
Some high-level feedback on the patch:

This patch seems to take a defensive programming approach,
where various state is kept so that OvsCleanupConntrack() is idempotent.
But this does lead to extra code, and I would suggest it is error-prone.

All in all, I would prefer to see some root-cause analysis of the problem,
pinpointing which resources aren't being released and why. Ideally
this would lead to a targeted fix for the problem at hand.
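
For illustration, a hypothetical user-space sketch of the pattern this
patch introduces (not the driver code; the flags below only mimic the
style of OvsNatInitDone and ovsCtZoneLock_release): each resource gains
a companion flag, cleanup must consult every flag, and every init path
must keep the flags in sync:

    #include <stdlib.h>

    /* Hypothetical flags, named only to echo the style used in the
     * patch. */
    static void *table;
    static int table_alloced;
    static int second_step_done;

    static void
    cleanup(void)
    {
        /* Must be safe to call from any point, so every flag is
         * consulted; a flag that falls out of step with its resource
         * means a silent leak or a double free. */
        if (second_step_done) {
            second_step_done = 0;
        }
        if (table_alloced) {
            free(table);
            table = NULL;
            table_alloced = 0;
        }
    }

    static int
    init(void)
    {
        table = malloc(1024);
        if (table == NULL) {
            cleanup();
            return -1;
        }
        table_alloced = 1;
        second_step_done = 1;   /* forgetting to set or clear a flag
                                 * is exactly the kind of bug this
                                 * style invites */
        return 0;
    }

    int
    main(void)
    {
        if (init() == 0) {
            cleanup();
        }
        return 0;
    }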
> ---
> datapath-windows/ovsext/Conntrack.c | 86 ++++++++++++++++++++---------
> 1 file changed, 61 insertions(+), 25 deletions(-)
>
> diff --git a/datapath-windows/ovsext/Conntrack.c b/datapath-windows/ovsext/Conntrack.c
> index 39ba5cc10..fbd036418 100644
> --- a/datapath-windows/ovsext/Conntrack.c
> +++ b/datapath-windows/ovsext/Conntrack.c
> @@ -36,6 +36,9 @@ static PLIST_ENTRY ovsConntrackTable;
> static OVS_CT_THREAD_CTX ctThreadCtx;
> static PNDIS_RW_LOCK_EX *ovsCtBucketLock = NULL;
> static NDIS_SPIN_LOCK ovsCtZoneLock;
> +static BOOLEAN ovsCtZoneLock_release = FALSE;
> +static BOOLEAN OvsNatInitDone = FALSE;
> +static BOOLEAN OvsConntrackCleanThreadExist = FALSE;
> static POVS_CT_ZONE_INFO zoneInfo = NULL;
> extern POVS_SWITCH_CONTEXT gOvsSwitchContext;
> static ULONG ctTotalEntries;
> @@ -95,32 +98,44 @@ OvsInitConntrack(POVS_SWITCH_CONTEXT context)
> goto freeBucketLock;
> }
>
> - ObReferenceObjectByHandle(threadHandle, SYNCHRONIZE, NULL, KernelMode,
> - &ctThreadCtx.threadObject, NULL);
> + OvsConntrackCleanThreadExist = FALSE;
> + ctThreadCtx.exit = 0;
> +    status = ObReferenceObjectByHandle(threadHandle, SYNCHRONIZE, NULL, KernelMode,
> +                                       &ctThreadCtx.threadObject, NULL);
> ZwClose(threadHandle);
> threadHandle = NULL;
> + if (!NT_SUCCESS(status)) {
> + ctThreadCtx.exit = 1;
> + KeSetEvent(&ctThreadCtx.event, 0, FALSE);
> + KeWaitForSingleObject(ctThreadCtx.threadObject, Executive,
> + KernelMode, FALSE, NULL);
> + goto cleanupConntrack;
> + }
>
> zoneInfo = OvsAllocateMemoryWithTag(sizeof(OVS_CT_ZONE_INFO) *
> CT_MAX_ZONE, OVS_CT_POOL_TAG);
> if (zoneInfo == NULL) {
> status = STATUS_INSUFFICIENT_RESOURCES;
> - goto freeBucketLock;
> + goto cleanupConntrack;
> }
>
> NdisAllocateSpinLock(&ovsCtZoneLock);
> + ovsCtZoneLock_release = FALSE;
> defaultCtLimit = CT_MAX_ENTRIES;
> for (UINT32 i = 0; i < CT_MAX_ZONE; i++) {
> zoneInfo[i].entries = 0;
> zoneInfo[i].limit = defaultCtLimit;
> }
>
> - status = OvsNatInit();
> -
> - if (status != STATUS_SUCCESS) {
> - OvsCleanupConntrack();
> + if (OvsNatInitDone == FALSE) {
> + status = OvsNatInit();
> + if (status != STATUS_SUCCESS) {
> + goto cleanupConntrack;
> + }
> + OvsNatInitDone = TRUE;
> }
> + OvsConntrackCleanThreadExist = TRUE;
> return STATUS_SUCCESS;
> -
> freeBucketLock:
> for (UINT32 i = 0; i < numBucketLocks; i++) {
> if (ovsCtBucketLock[i] != NULL) {
> @@ -132,6 +147,9 @@ freeBucketLock:
> freeTable:
> OvsFreeMemoryWithTag(ovsConntrackTable, OVS_CT_POOL_TAG);
> ovsConntrackTable = NULL;
> +cleanupConntrack:
> + OvsCleanupConntrack();
In some error paths the code above the 'cleanupConntrack' label will
execute, and in others it won't. This doesn't seem consistent.
I would prefer a situation where the code looked more like it did
before this patch:
    status = alloc_A();
    if (...)
        goto unwind_A;

    status = alloc_B();
    if (...)
        goto unwind_B;

    ...

    return STATUS_SUCCESS;

unwind_B:
    release_B();
unwind_A:
    release_A();

    return status;
And for OvsCleanupConntrack() to only handle the case where initialisation
succeeded.
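
For what it's worth, here is a minimal user-space sketch of that shape
(illustrative only; malloc/free and the names below stand in for the
driver's allocation and release calls, and each release tolerates a
resource that was never allocated, as the existing freeBucketLock /
freeTable labels already do):

    #include <stdio.h>
    #include <stdlib.h>

    static int
    init(char **a, char **b)
    {
        *a = NULL;
        *b = NULL;

        *a = malloc(64);                /* alloc_A() */
        if (*a == NULL) {
            goto unwind_a;
        }

        *b = malloc(64);                /* alloc_B() */
        if (*b == NULL) {
            goto unwind_b;
        }

        return 0;                       /* STATUS_SUCCESS analogue */

    unwind_b:
        free(*b);                       /* release_B(); NULL-safe */
        *b = NULL;
    unwind_a:
        free(*a);                       /* release_A(); NULL-safe */
        *a = NULL;

        return -1;
    }

    int
    main(void)
    {
        char *a, *b;

        if (init(&a, &b) != 0) {
            return 1;
        }

        /* Teardown only has to handle the fully-initialised case,
         * mirroring the suggestion that OvsCleanupConntrack() should
         * not need to cope with partial initialisation. */
        free(b);
        free(a);
        printf("ok\n");
        return 0;
    }

The important property is that each step is unwound at most once, in
reverse order, and only on the paths where it could actually have been
taken.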
> +
> return status;
> }
>
...
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev