Hi, we’ve found incorrect behaviour of the service_monitor record status for a health-checked Load Balancer. Its status can stay online indefinitely even if the virtual machine is stopped. This leads to load-balanced traffic being sent to a dead backend.
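For context, the mismatch is easiest to see by comparing the backend's Service_Monitor row with its Port_Binding row in the southbound DB. The two commands below are only an illustrative check, assuming the standard southbound table and column names and the lsp1 port from the reproduction script that follows:

# after the binding is released, the Port_Binding row loses its chassis,
# while the Service_Monitor row still reports status=online:
ovn-sbctl --columns=logical_port,chassis list Port_Binding lsp1
ovn-sbctl --columns=logical_port,ip,port,status list Service_Monitor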
Below is a script to reproduce the issue, since I'm not sure about the correct place for a possible fix (my guess is that it should be fixed in controller/binding.c, in the binding_lport_set_down() function, but I'm not sure how such a change would affect VM live migration…):

# cat ./repro.sh
#!/bin/bash -x

ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 lsp1 -- \
    lsp-set-addresses lsp1 "00:00:00:00:00:01 192.168.0.10"
ovn-nbctl lb-add lb1 192.168.0.100:80 192.168.0.10:80
ovn-nbctl set Load_Balancer lb1 ip_port_mappings:192.168.0.10=lsp1:192.168.0.8
ovn-nbctl --id=@id create Load_Balancer_Health_Check vip='"192.168.0.100:80"' -- \
    set Load_Balancer lb1 health_check=@id
ovn-nbctl ls-lb-add ls1 lb1

ovs-vsctl add-port br-int test-lb -- \
    set interface test-lb type=internal external_ids:iface-id=lsp1
ip li set test-lb addr 00:00:00:00:00:01
ip a add 192.168.0.10/24 dev test-lb
ip li set test-lb up

# check service_monitor
ovn-sbctl list service_mon

# ensure the state became offline
sleep 4
ovn-sbctl list service_mon

# start listening on :80 with netcat
ncat -k -l 192.168.0.10 80 &

# ensure the state turned to online
sleep 4
ovn-sbctl list service_mon

# trigger binding release
ovs-vsctl remove interface test-lb external_ids iface-id

# ensure the state (incorrectly) remains online
sleep 10
ovn-sbctl list service_mon

# ensure the backend is still in the OVS group's bucket
ovs-ofctl dump-groups br-int | grep 192.168.0.10

Looking forward to hearing any thoughts on this.

PS. Don't forget to kill ncat ;)

Regards,
Vladislav Odintsov