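This patch resets both counters (n_success and n_failures) regardless of
the svc state, and still does so only when the counter for the current
state has reached its configured count (n_success >= success_count or
n_failures >= failure_count).
Health checks against a Windows Server backend of a load balancer expose
an issue with the counters, e.g.: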
ovn-controller --> win_srv | SYN
ovn-controller <-- win_srv | SYN_ACK <- increase n_success
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase n_success
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase n_success
n_success == success_count => status = online;
n_success = 0
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | RST (win_srv doesn't receive an ACK and sends RST)
increase n_failures (count = 1)
After wait_time:
ovn-controller --> win_srv | SYN
ovn-controller <-- win_srv | SYN_ACK <- increase n_success
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase n_success
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase n_success
n_success == success_count => status = online;
n_success = 0
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | RST (win_srv doesn't receive an ACK and sends RST)
increase n_failures (count = 2)
After wait_time:
ovn-controller --> win_srv | SYN
ovn-controller <-- win_srv | SYN_ACK <- increase n_success
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase n_success
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase n_success
n_success == success_count => status = online;
n_success = 0
ovn-controller --> win_srv | RST_ACK
ovn-controller <-- win_srv | RST (win_srv doesn't receive an ACK and sends RST)
increase n_failures (count = 3)
n_failures == failure_count => status = offline;
n_failures = 0
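So, the main point is that svc resets only the counter for the current
state: n_success is reset when n_success >= success_count, but if a
backend of the lb sends RST to ovn-controller for some reason, n_failures
is just incremented and is reset only when the state is offline and
n_failures >= failure_count. The same applies to SYN_ACK and n_success.
As a result, failures from separate health check rounds accumulate until
a backend that keeps answering SYN_ACK is marked offline.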
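
The accumulation described above can be reproduced with a small
standalone model. This is only a sketch and not the pinctrl.c code: the
sketch_* names are hypothetical, the threshold check is folded into the
packet handlers instead of living in svc_monitors_run(), and both
thresholds are assumed to be 3 as in the trace.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the service monitor state. */
enum sketch_status { SKETCH_UNKNOWN, SKETCH_ONLINE, SKETCH_OFFLINE };

struct sketch_mon {
    int n_success;
    int n_failures;
    int success_count;   /* assumed to be 3, as in the trace */
    int failure_count;   /* assumed to be 3, as in the trace */
    enum sketch_status status;
    bool reset_both;     /* true models the behavior with this patch */
};

/* A SYN_ACK (including a retransmission) counts as one success. */
static void
sketch_on_syn_ack(struct sketch_mon *m)
{
    m->n_success++;
    if (m->n_success >= m->success_count) {
        m->status = SKETCH_ONLINE;
        m->n_success = 0;
        if (m->reset_both) {
            m->n_failures = 0;
        }
    }
}

/* An RST from the backend counts as one failure. */
static void
sketch_on_rst(struct sketch_mon *m)
{
    m->n_failures++;
    if (m->n_failures >= m->failure_count) {
        m->status = SKETCH_OFFLINE;
        m->n_failures = 0;
        if (m->reset_both) {
            m->n_success = 0;
        }
    }
}

/* One round = SYN_ACK plus two retransmissions, then a final RST. */
static enum sketch_status
sketch_run_rounds(bool reset_both, int rounds)
{
    struct sketch_mon m = {
        .success_count = 3,
        .failure_count = 3,
        .status = SKETCH_UNKNOWN,
        .reset_both = reset_both,
    };

    for (int i = 0; i < rounds; i++) {
        sketch_on_syn_ack(&m);
        sketch_on_syn_ack(&m);
        sketch_on_syn_ack(&m);
        sketch_on_rst(&m);
    }
    return m.status;
}
```

In this model, sketch_run_rounds(false, 3) ends in SKETCH_OFFLINE because
one RST per round accumulates in n_failures, while
sketch_run_rounds(true, 3) stays SKETCH_ONLINE because each transition to
online clears n_failures.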
Signed-off-by: Evgenii Kovalev <[email protected]>
---
controller/pinctrl.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index f98f6d70c..158e0ee75 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -7761,7 +7761,9 @@ svc_monitors_run(struct rconn *swconn,
if (svc_mon->n_success >= svc_mon->success_count) {
svc_mon->status = SVC_MON_ST_ONLINE;
svc_mon->n_success = 0;
+ svc_mon->n_failures = 0;
}
+
if (current_time >= svc_mon->next_send_time) {
svc_monitor_send_health_check(swconn, svc_mon);
next_run_time = svc_mon->wait_time;
@@ -7773,6 +7775,7 @@ svc_monitors_run(struct rconn *swconn,
case SVC_MON_S_OFFLINE:
if (svc_mon->n_failures >= svc_mon->failure_count) {
svc_mon->status = SVC_MON_ST_OFFLINE;
+ svc_mon->n_success = 0;
svc_mon->n_failures = 0;
}