On 10/11/2020 at 18:12, Maciej Zdeb wrote:
Hi,

I'm so happy you're able to replicate it! :)

With the patch that disables pool_flush, I can still reproduce it on my R&D server and in production; the crash just happens in different places:


Hi Maciej,

Could you test the following patch, please? For now, I don't know whether it fully fixes the bug, but it is a step forward. I must do a deeper review to be sure it covers all cases.

Thanks!

--
Christopher Faulet
From 7b1996335f8bd33fc3180003dfb57c4d55fa6a60 Mon Sep 17 00:00:00 2001
From: Christopher Faulet <cfau...@haproxy.com>
Date: Tue, 10 Nov 2020 18:45:34 +0100
Subject: [PATCH] BUG/MEDIUM: spoe: Be sure to remove all references on a
 released spoe applet

When a SPOE applet is used to send a frame, a reference on this applet is saved
in the spoe context of the offloaded stream. But, if the applet is released
before receiving the corresponding ack, we must be sure to remove this
reference. This was performed for fragmented frames only. But it must also be
performed for the spoe contexts in the applet waiting_queue and in the thread
waiting queue (used in async mode).

This patch must be backported to all versions where the spoe is supported (>=
1.7).
---
 src/flt_spoe.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/flt_spoe.c b/src/flt_spoe.c
index a91906e105..33b312688e 100644
--- a/src/flt_spoe.c
+++ b/src/flt_spoe.c
@@ -1253,6 +1253,7 @@ spoe_release_appctx(struct appctx *appctx)
 		LIST_INIT(&ctx->list);
 		_HA_ATOMIC_SUB(&agent->counters.nb_waiting, 1);
 		spoe_update_stat_time(&ctx->stats.tv_wait, &ctx->stats.t_waiting);
+		ctx->spoe_appctx = NULL;
 		ctx->state = SPOE_CTX_ST_ERROR;
 		ctx->status_code = (spoe_appctx->status_code + 0x100);
 		TEST_STRM(ctx->strm);
@@ -1270,8 +1271,13 @@ spoe_release_appctx(struct appctx *appctx)
 		task_wakeup(ctx->strm->task, TASK_WOKEN_MSG);
 	}
 
-	if (!LIST_ISEMPTY(&agent->rt[tid].applets))
+	if (!LIST_ISEMPTY(&agent->rt[tid].applets)) {
+		list_for_each_entry_safe(ctx, back, &agent->rt[tid].waiting_queue, list) {
+			if (ctx->spoe_appctx == spoe_appctx)
+				ctx->spoe_appctx = NULL;
+		}
 		goto end;
+	}
 
 	/* If this was the last running applet, notify all waiting streams */
 	list_for_each_entry_safe(ctx, back, &agent->rt[tid].sending_queue, list) {
@@ -1279,6 +1285,7 @@ spoe_release_appctx(struct appctx *appctx)
 		LIST_INIT(&ctx->list);
 		_HA_ATOMIC_SUB(&agent->counters.nb_sending, 1);
 		spoe_update_stat_time(&ctx->stats.tv_queue, &ctx->stats.t_queue);
+		ctx->spoe_appctx = NULL;
 		ctx->state = SPOE_CTX_ST_ERROR;
 		ctx->status_code = (spoe_appctx->status_code + 0x100);
 		TEST_STRM(ctx->strm);
@@ -1289,6 +1296,7 @@ spoe_release_appctx(struct appctx *appctx)
 		LIST_INIT(&ctx->list);
 		_HA_ATOMIC_SUB(&agent->counters.nb_waiting, 1);
 		spoe_update_stat_time(&ctx->stats.tv_wait, &ctx->stats.t_waiting);
+		ctx->spoe_appctx = NULL;
 		ctx->state = SPOE_CTX_ST_ERROR;
 		ctx->status_code = (spoe_appctx->status_code + 0x100);
 		TEST_STRM(ctx->strm);
-- 
2.26.2
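
For readers following the thread, the idea behind the patch can be shown in isolation: when an object is released, every context that still holds a pointer to it must have that pointer cleared, otherwise a later access dereferences freed memory. The sketch below is only an illustration under simplifying assumptions. The types `struct applet` and `struct context`, the singly linked queue, and the functions `detach_applet()` and `demo_detach()` are hypothetical stand-ins, not HAProxy's real structures or API; only the compare-and-clear step mirrors what the patch does with `ctx->spoe_appctx`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a SPOE applet (not HAProxy's real type). */
struct applet {
    int status_code;
};

/* Hypothetical stand-in for a SPOE context waiting for an ack. */
struct context {
    struct applet  *owner;  /* back-reference, plays the role of ctx->spoe_appctx */
    struct context *next;   /* next context in a simplified waiting queue */
};

/* Clear every reference to `app` held by the contexts in `queue`.
 * Contexts that point at another applet are left untouched.
 * Returns the number of references that were cleared. */
static int detach_applet(struct context *queue, const struct applet *app)
{
    int cleared = 0;

    assert(app != NULL);
    for (struct context *ctx = queue; ctx != NULL; ctx = ctx->next) {
        if (ctx->owner == app) {
            ctx->owner = NULL;
            cleared++;
        }
    }
    return cleared;
}

/* Build a three-entry queue where two contexts reference applet `a`
 * and one references applet `b`, then release `a`.
 * Returns 1 if only the references to `a` were cleared, 0 otherwise. */
static int demo_detach(void)
{
    struct applet a = { 0 };
    struct applet b = { 0 };
    struct context c3 = { &a, NULL };
    struct context c2 = { &b, &c3 };
    struct context c1 = { &a, &c2 };

    int cleared = detach_applet(&c1, &a);

    return cleared == 2 &&
           c1.owner == NULL &&
           c2.owner == &b &&
           c3.owner == NULL;
}
```

Calling `demo_detach()` from a `main()` is enough to exercise the sketch. The comparison against the released applet matters in the shared (per-thread) queue, where contexts may belong to other applets that are still running; in the applet's own queues the patch clears the reference unconditionally.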
