On 10/26/23 14:56, Evgenii Kovalev wrote:
>
> On 18.10.2023 19:49, Evgenii Kovalev wrote:
>> Hi All,
>>
Hi Evgenii,

>> We ran into unexpected health-check behavior in the load balancer with
>> a Windows server as the backend, for example:
>>
>> ovn-controller --> win_srv SYN
>> ovn-controller <-- win_srv SYN_ACK
>> ovn-controller --> win_srv RST_ACK
>> ovn-controller <-- win_srv SYN_ACK (TCP retransmission)
>> ovn-controller --> win_srv RST_ACK
>> ovn-controller <-- win_srv SYN_ACK (TCP retransmission)
>> ovn-controller --> win_srv RST_ACK
>> ovn-controller <-- win_srv RST
>>
>> Dump, where 172.20.2.2 is ovn-controller and 172.20.2.49 is the
>> Windows server backend:
>>
>> 17:18:20.324464 IP 172.20.2.2.29636 > 172.20.2.49.7045: Flags [S], seq 2289025863, win 65160, length 0
>> 17:18:20.324572 IP 172.20.2.49.7045 > 172.20.2.2.29636: Flags [S.], seq 2447380613, ack 2289025864, win 8192, options [mss 1460], length 0
>> 17:18:20.325233 IP 172.20.2.2.29636 > 172.20.2.49.7045: Flags [R.], seq 2, ack 1, win 65160, length 0
>> 17:18:23.336091 IP 172.20.2.49.7045 > 172.20.2.2.29636: Flags [S.], seq 2447380613, ack 2289025864, win 8192, options [mss 1460], length 0
>> 17:18:23.336559 IP 172.20.2.2.29636 > 172.20.2.49.7045: Flags [R.], seq 2, ack 1, win 65160, length 0
>> 17:18:29.335992 IP 172.20.2.49.7045 > 172.20.2.2.29636: Flags [S.], seq 2447380613, ack 2289025864, win 8192, options [mss 1460], length 0
>> 17:18:29.336423 IP 172.20.2.2.29636 > 172.20.2.49.7045: Flags [R.], seq 2, ack 1, win 65160, length 0
>> 17:18:41.335919 IP 172.20.2.49.7045 > 172.20.2.2.29636: Flags [R], seq 2447380614, win 0, length 0
>>
>> Linux backends are not affected, because Linux just ignores the
>> RST_ACK and does not retransmit the SYN_ACK from its side; it looks
>> like:
>>
>> ovn-controller --> linux_srv SYN
>> ovn-controller <-- linux_srv SYN_ACK
>> ovn-controller --> linux_srv RST_ACK
>>
>> The main issue:
>> The status in svc_mon flaps between online and offline because of this
>> Windows server backend behavior.
>> Every svc_mon status change triggers ovn-northd and pushes
>> ovn-northd's CPU usage to 100%.
>>
>> I checked this on a Windows server with a simple HTTP server as the
>> backend, and the behavior reproduces; it looks like all Windows
>> backends are affected.
>>
>> I want to make a patch to fix this behavior, but I don't know which
>> approach is preferred:
>>
>> 1) Add a new boolean field in struct svc_mon. After svc_monitor_run()
>> is called and handles the init/online/offline cases, set this field in
>> svc_monitor_send_health_check(). When process_packet_in() handles the
>> ACTION_OPCODE_HANDLE_SVC_CHECK case, we update the boolean field and
>> send an RST_ACK if the packet_in carries SYN_ACK, or, if the packet_in
>> carries RST, we use the new boolean field to decide whether or not to
>> change svc_mon->state.
>> The function where the decision is made:
>> https://github.com/ovn-org/ovn/blob/main/controller/pinctrl.c#L7810-L7858

Personally, I'd try to avoid another configuration knob if we can.

>> 2) Implement an established TCP connection, for example:
>>
>> ovn-controller --> win_srv SYN
>> ovn-controller <-- win_srv SYN_ACK
>> ovn-controller --> win_srv ACK
>> ovn-controller --> win_srv FIN_ACK
>> ovn-controller <-- win_srv FIN_ACK
>> ovn-controller --> win_srv ACK

Sounds easy, but I'm quite sure we'll end up with quite a lot of code
because of all the potential cases in the TCP establish/close
handshakes.

>> 3) Send RST instead of RST_ACK

Does this work in all cases on Linux too? If so, +1 from me.

>> It would be great if you could advise which method is better, or
>> whether I should fix it another way.
>> Thanks.
>>
>
> Gentle ping.
> I'd appreciate your advice.

Sorry for the delay.

Regards,
Dumitru
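For context on option 3: RFC 793's reset-generation rule says that a
reset sent in reply to a segment that carries an ACK (such as the
backend's SYN_ACK) takes its sequence number from that ACK field and
does not set the ACK flag itself, i.e. a bare RST rather than RST_ACK.
A rough, self-contained sketch of that rule (a hypothetical helper, not
the existing pinctrl.c code):

```c
#include <stdint.h>

#define TCP_SYN 0x02
#define TCP_RST 0x04
#define TCP_ACK 0x10

struct tcp_seg {
    uint32_t seq;
    uint32_t ack;
    uint8_t  flags;
    uint32_t len;    /* payload length */
};

/* Build a reset for an unwanted incoming segment, following the
 * reset-generation rule from RFC 793: if the incoming segment carries
 * an ACK, the RST takes its sequence number from that ACK field and
 * does not set ACK itself; otherwise the RST has seq 0, sets ACK, and
 * acknowledges everything the segment occupied. */
static struct tcp_seg
make_reset(const struct tcp_seg *in)
{
    struct tcp_seg rst = { 0, 0, 0, 0 };
    if (in->flags & TCP_ACK) {
        rst.flags = TCP_RST;
        rst.seq = in->ack;
    } else {
        rst.flags = TCP_RST | TCP_ACK;
        rst.seq = 0;
        /* A SYN occupies one sequence number. */
        rst.ack = in->seq + in->len + ((in->flags & TCP_SYN) ? 1 : 0);
    }
    return rst;
}
```

Fed the SYN_ACK from the dump above (ack 2289025864), this produces an
RST with seq 2289025864 and no ACK flag, which is what option 3 would
send.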
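And to gauge the extra code option 2 implies: even the happy path
needs per-probe connection state, roughly like the sketch below. The
names are hypothetical (this is not the existing svc_mon code), and
retransmits, unexpected RSTs, timeouts, and simultaneous close are
deliberately not modeled; a real implementation would have to handle
all of them.

```c
/* Per-probe TCP state for a full open plus graceful close, matching
 * the exchange shown for option 2. Happy path only. */
enum probe_state {
    PROBE_SYN_SENT,  /* SYN sent, waiting for SYN_ACK */
    PROBE_FIN_SENT,  /* ACK + FIN_ACK sent, waiting for backend FIN_ACK */
    PROBE_CLOSED,    /* final ACK sent, probe finished */
};

enum probe_event {
    EV_RCV_SYN_ACK,
    EV_RCV_FIN_ACK,
};

static enum probe_state
probe_step(enum probe_state s, enum probe_event ev)
{
    if (s == PROBE_SYN_SENT && ev == EV_RCV_SYN_ACK) {
        /* Backend is online: send ACK, then FIN_ACK to close. */
        return PROBE_FIN_SENT;
    }
    if (s == PROBE_FIN_SENT && ev == EV_RCV_FIN_ACK) {
        /* Send the final ACK and finish the probe. */
        return PROBE_CLOSED;
    }
    return s; /* Unexpected segment for this state: stay put. */
}
```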
