On 11/8/23 18:55, Mark Michelson wrote: > Thank you for the detailed explanation. It makes perfect sense to me. > > Acked-by: Mark Michelson <[email protected]> >
Thanks, applied to main and backported to all stable branches down to 22.03. > On 11/7/23 09:39, Evgenii Kovalev wrote: >> This patch adds reset n_count regardless of svc state and does only >> if n_count >= configured_count. >> >> The behavior hc with windows server as backend in load balancers show >> some >> issue with counters, e.g.: >> >> ovn-controller --> win_srv | SYN >> ovn-controller <-- win_srv | SYN_ACK <- increase n_success >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase >> n_success >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase >> n_success >> n_success == success_count => status = >> online; n_success = 0 >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | RST (win_srv don't recive ACK and sent RST) >> increase n_failures (count = 1) >> After wait_time: >> ovn-controller --> win_srv | SYN >> ovn-controller <-- win_srv | SYN_ACK <- increase n_success >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase >> n_success >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase >> n_success >> n_success == success_count => status = >> online; n_success = 0 >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | RST (win_srv don't recive ACK and sent RST) >> increase n_failures (count = 2) >> After wait_time: >> ovn-controller --> win_srv | SYN >> ovn-controller <-- win_srv | SYN_ACK <- increase n_success >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase >> n_success >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | SYN_ACK (TCP Retransmission) <- increase >> n_success >> n_success == success_count => status = >> online; n_success = 0 >> ovn-controller --> win_srv | RST_ACK >> ovn-controller <-- win_srv | RST (win_srv don't recive ACK and sent RST) >> increase n_failures (count = 3) >> n_failures == failure_count => status = >> offline; n_failures = 0 >> >> So, the main point is svc reset the counters only for current state if >> n_success >= success_count, but if a backend of lb for some reason >> sent RST to ovn-controller it's just increase count and will reset >> only then >> state is offline and greater or equal than failure_count. The same for >> SYN_ACK. >> >> Signed-off-by: Evgenii Kovalev <[email protected]> >> --- >> controller/pinctrl.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/controller/pinctrl.c b/controller/pinctrl.c >> index f98f6d70c..158e0ee75 100644 >> --- a/controller/pinctrl.c >> +++ b/controller/pinctrl.c >> @@ -7761,7 +7761,9 @@ svc_monitors_run(struct rconn *swconn, >> if (svc_mon->n_success >= svc_mon->success_count) { >> svc_mon->status = SVC_MON_ST_ONLINE; >> svc_mon->n_success = 0; >> + svc_mon->n_failures = 0; >> } >> + >> if (current_time >= svc_mon->next_send_time) { >> svc_monitor_send_health_check(swconn, svc_mon); >> next_run_time = svc_mon->wait_time; >> @@ -7773,6 +7775,7 @@ svc_monitors_run(struct rconn *swconn, >> case SVC_MON_S_OFFLINE: >> if (svc_mon->n_failures >= svc_mon->failure_count) { >> svc_mon->status = SVC_MON_ST_OFFLINE; >> + svc_mon->n_success = 0; >> svc_mon->n_failures = 0; >> } >> _______________________________________________ >> dev mailing list >> [email protected] >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
