Re: [ovs-discuss] OVN: Delay in handling unixctl commands in ovsdb-server
On Wed, Feb 12, 2020 at 11:27:18PM +0530, Numan Siddique wrote:
> Hi Ben/All,
>
> In an OVN deployment, with the OVN DBs deployed as active/standby using
> pacemaker, we are seeing delays in the response to the unixctl command
> ovsdb-server/sync-status.
>
> Pacemaker periodically calls the OVN pacemaker OCF script to get the
> status, and this script internally invokes: ovs-appctl -t
> /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/sync-status. In a large
> deployment with lots of OVN resources, we see that ovsdb-server takes a
> long time (sometimes > 60 seconds) to respond to this command. This
> causes pacemaker to stop the service on that node and move the master
> to another node, which causes a lot of disruption.
>
> One approach to solving this issue is to handle unixctl commands in a
> separate thread. Commands like sync-status, get-**, etc. can easily be
> handled in that thread. However, many commands, such as
> ovsdb-server/set-active-ovsdb-server and ovsdb-server/compact (which
> change state), need to be synchronized between the main ovsdb-server
> thread and the newly added thread using a mutex.
>
> Does this approach make sense? I started working on it, but I wanted
> to check with the community before putting more effort into it.

It seems reasonable to me to support unixctl commands in multiple
threads. The details of how you implement it will determine how usable
it is. I suggest making the current case easy and common.
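For concreteness, a minimal sketch of the kind of monitor probe that
trips here. The 10-second timeout is illustrative only (the real value
comes from the pacemaker resource configuration); the socket path is
the one quoted above. If ovsdb-server's single main loop is busy, the
unixctl request blocks until the loop gets around to it:

    #!/bin/sh
    # Probe the standby's sync status the way the OCF script does.
    if ! timeout 10 ovs-appctl -t /var/run/openvswitch/ovnsb_db.ctl \
            ovsdb-server/sync-status; then
        echo "sync-status probe timed out; pacemaker would fail the monitor" >&2
        exit 1
    fi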
Re: [ovs-discuss] Re: [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?
On Thu, Feb 13, 2020 at 03:07:33PM +0100, Ilya Maximets wrote:
> On 2/13/20 2:52 PM, Yi Yang (杨燚)-云服务集团 wrote:
>> Thanks Ilya. iperf3 UDP should be single-direction; the source and
>> destination IP addresses are the two VMs' IPs, and the UDP bandwidth
>> would be 0 if they were wrong. But the UDP loss rate is clearly 0, so
>> it isn't the case you're describing. Do we have a way to disable MAC
>> learning or MAC broadcast?
>
> The NORMAL action acts like an L2 learning switch. If you don't want
> to use MAC learning, remove the flow with the NORMAL action and add
> direct forwarding flows like output:<port>. But I don't think that
> you want to do that in an OpenStack setup.

Also, iperf3 establishes a control connection, which uses TCP in both
directions. So, in theory, the FDB should be updated.

>> Is the NORMAL action or MAC learning a slow-path process? If so, the
>> ovs-vswitchd daemon should show high CPU utilization.
>
> It's not a slow path, so there will be no CPU usage by the
> ovs-vswitchd userspace process. To confirm that you're flooding
> packets, you may dump the installed datapath flows with the following
> command:
>
>     ovs-appctl dpctl/dump-flows
>
> In case of flooding, you will see a datapath flow with a big number of
> output ports, like this:
>
>     <...> actions:<port1>,<port2>,...

I'd suggest looking at the FDB:

    ovs-appctl fdb/show

and at the port stats, to see whether there is traffic moving as well.
Maybe it's not your UDP test packet, but other unrelated traffic in the
network.

HTH,
fbl

>> -----Original Message-----
>> From: Ilya Maximets [mailto:i.maxim...@ovn.org]
>> Sent: February 13, 2020, 21:23
>> To: Flavio Leitner ; Yi Yang (杨燚)-云服务集团
>> Cc: ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; Ilya Maximets
>> Subject: Re: [ovs-dev] OVS performance issue: why small udp packet pps
>> performance between VMs is highly related with number of ovs ports and
>> number of VMs?
>>
>> On 2/13/20 12:48 PM, Flavio Leitner wrote:
>>> On Thu, Feb 13, 2020 at 09:18:38AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
>>>> Hi, all
>>>>
>>>> We find OVS has a serious performance issue. We launch only one VM
>>>> per compute node and run an iperf small-UDP-packet pps performance
>>>> test between these two VMs. We see about 18 pps (packets per
>>>> second, -l 16), but:
>>>>
>>>> 1) If we add 100 veth ports to the br-int bridge, the pps
>>>> performance drops to about 5 pps.
>>>> 2) If we launch one more VM on every compute node, but don't run
>>>> any workload, the pps performance drops to about 9 pps. (Note: none
>>>> of the above veth ports in this test.)
>>>> 3) If we launch two more VMs on every compute node (3 VMs in total
>>>> per compute node), but don't run any workload, the pps performance
>>>> drops to about 5 pps. (Note: none of the above veth ports in this
>>>> test.)
>>>>
>>>> Can anybody help explain why this is so? Is there any known way to
>>>> optimize this? I really think OVS performance is bad (we can draw
>>>> such a conclusion from our test results at least); I don't want to
>>>> defame OVS ☺
>>>>
>>>> BTW, we used the OVS kernel datapath and vhost. We can see that
>>>> every port has a vhost kernel thread, which runs at 100% CPU
>>>> utilization if we run iperf in the VM; but for the idle VMs, the
>>>> corresponding vhost threads still have about 30% CPU utilization. I
>>>> don't understand why.
>>>>
>>>> In addition, we find UDP performance is also very bad for small UDP
>>>> packets on the physical NIC. But it can reach 26 pps for -l 80,
>>>> which is enough to cover the VXLAN header (8 bytes) + inner
>>>> Ethernet header (14) + IP/UDP header (28) + 16 = 66 bytes. If we
>>>> consider the performance overhead the OVS bridge introduces, pps
>>>> performance between VMs should be able to reach at least 20 pps;
>>>> other VMs and ports shouldn't hurt it so badly, because they are
>>>> idle, with no workload at all.
>>>
>>> What do you have in the flow table? It sounds like the traffic is
>>> being broadcast to all ports. Check the FDB to see if OvS is
>>> learning the MAC addresses.
>>>
>>> It's been a while since I last ran performance tests with the kernel
>>> datapath, but it should be no different from a Linux bridge with
>>> just the NORMAL action in the flow table.
>>
>> I agree that if your performance heavily depends on the number of
>> ports, then you're most likely just flooding all the packets to all
>> the ports. Since you're using UDP traffic, please be sure that you're
>> sending some packets in the backward direction, so OVS and all other
>> switches (if any) will learn/not forget to which port packets should
>> be sent. Also, check that your IP addresses are correct. If for some
>> reason it's not possible for OVS to learn MAC addresses correctly,
>> avoid using the NORMAL action.
>>
>> Best regards, Ilya Maximets.

--
fbl
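A sketch of the checks suggested above, assuming the integration bridge
is named br-int as elsewhere in this thread:

    # Learned MAC table; the two VMs' MACs should appear here.
    ovs-appctl fdb/show br-int

    # Installed datapath flows; a flooded flow lists many output ports.
    ovs-appctl dpctl/dump-flows

    # Per-port rx/tx counters; rising counters on idle ports indicate
    # flooding or unrelated background traffic.
    ovs-ofctl dump-ports br-int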
Re: [ovs-discuss] Re: [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?
On 2/13/20 2:52 PM, Yi Yang (杨燚)-云服务集团 wrote:
> Thanks Ilya. iperf3 UDP should be single-direction; the source and
> destination IP addresses are the two VMs' IPs, and the UDP bandwidth
> would be 0 if they were wrong. But the UDP loss rate is clearly 0, so
> it isn't the case you're describing. Do we have a way to disable MAC
> learning or MAC broadcast?

The NORMAL action acts like an L2 learning switch. If you don't want to
use MAC learning, remove the flow with the NORMAL action and add direct
forwarding flows like output:<port>. But I don't think that you want to
do that in an OpenStack setup.

> Is the NORMAL action or MAC learning a slow-path process? If so, the
> ovs-vswitchd daemon should show high CPU utilization.

It's not a slow path, so there will be no CPU usage by the ovs-vswitchd
userspace process. To confirm that you're flooding packets, you may dump
the installed datapath flows with the following command:

    ovs-appctl dpctl/dump-flows

In case of flooding, you will see a datapath flow with a big number of
output ports, like this:

    <...> actions:<port1>,<port2>,...

> -----Original Message-----
> From: Ilya Maximets [mailto:i.maxim...@ovn.org]
> Sent: February 13, 2020, 21:23
> To: Flavio Leitner ; Yi Yang (杨燚)-云服务集团
> Cc: ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; Ilya Maximets
> Subject: Re: [ovs-dev] OVS performance issue: why small udp packet pps
> performance between VMs is highly related with number of ovs ports and
> number of VMs?
>
> On 2/13/20 12:48 PM, Flavio Leitner wrote:
>> On Thu, Feb 13, 2020 at 09:18:38AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
>>> Hi, all
>>>
>>> We find OVS has a serious performance issue. We launch only one VM
>>> per compute node and run an iperf small-UDP-packet pps performance
>>> test between these two VMs. We see about 18 pps (packets per second,
>>> -l 16), but:
>>>
>>> 1) If we add 100 veth ports to the br-int bridge, the pps
>>> performance drops to about 5 pps.
>>> 2) If we launch one more VM on every compute node, but don't run any
>>> workload, the pps performance drops to about 9 pps. (Note: none of
>>> the above veth ports in this test.)
>>> 3) If we launch two more VMs on every compute node (3 VMs in total
>>> per compute node), but don't run any workload, the pps performance
>>> drops to about 5 pps. (Note: none of the above veth ports in this
>>> test.)
>>>
>>> Can anybody help explain why this is so? Is there any known way to
>>> optimize this? I really think OVS performance is bad (we can draw
>>> such a conclusion from our test results at least); I don't want to
>>> defame OVS ☺
>>>
>>> BTW, we used the OVS kernel datapath and vhost. We can see that
>>> every port has a vhost kernel thread, which runs at 100% CPU
>>> utilization if we run iperf in the VM; but for the idle VMs, the
>>> corresponding vhost threads still have about 30% CPU utilization. I
>>> don't understand why.
>>>
>>> In addition, we find UDP performance is also very bad for small UDP
>>> packets on the physical NIC. But it can reach 26 pps for -l 80,
>>> which is enough to cover the VXLAN header (8 bytes) + inner Ethernet
>>> header (14) + IP/UDP header (28) + 16 = 66 bytes. If we consider the
>>> performance overhead the OVS bridge introduces, pps performance
>>> between VMs should be able to reach at least 20 pps; other VMs and
>>> ports shouldn't hurt it so badly, because they are idle, with no
>>> workload at all.
>>
>> What do you have in the flow table? It sounds like the traffic is
>> being broadcast to all ports. Check the FDB to see if OvS is learning
>> the MAC addresses.
>>
>> It's been a while since I last ran performance tests with the kernel
>> datapath, but it should be no different from a Linux bridge with just
>> the NORMAL action in the flow table.
>
> I agree that if your performance heavily depends on the number of
> ports, then you're most likely just flooding all the packets to all
> the ports. Since you're using UDP traffic, please be sure that you're
> sending some packets in the backward direction, so OVS and all other
> switches (if any) will learn/not forget to which port packets should
> be sent. Also, check that your IP addresses are correct. If for some
> reason it's not possible for OVS to learn MAC addresses correctly,
> avoid using the NORMAL action.
>
> Best regards, Ilya Maximets.
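For illustration, a sketch of the direct-forwarding setup Ilya
describes. The MAC addresses and port numbers are hypothetical, and as
he notes, doing this on an OpenStack node would conflict with the flows
Neutron manages:

    # Remove the default NORMAL flow, then pin each destination MAC
    # to its port so no learning or flooding is involved.
    ovs-ofctl del-flows br-int
    ovs-ofctl add-flow br-int "priority=10,dl_dst=52:54:00:00:00:01,actions=output:1"
    ovs-ofctl add-flow br-int "priority=10,dl_dst=52:54:00:00:00:02,actions=output:2"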
[ovs-discuss] Re: [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?
Thanks Ilya. iperf3 UDP should be single-direction; the source and
destination IP addresses are the two VMs' IPs, and the UDP bandwidth
would be 0 if they were wrong. But the UDP loss rate is clearly 0, so it
isn't the case you're describing. Do we have a way to disable MAC
learning or MAC broadcast?

Is the NORMAL action or MAC learning a slow-path process? If so, the
ovs-vswitchd daemon should show high CPU utilization.

-----Original Message-----
From: Ilya Maximets [mailto:i.maxim...@ovn.org]
Sent: February 13, 2020, 21:23
To: Flavio Leitner ; Yi Yang (杨燚)-云服务集团
Cc: ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; Ilya Maximets
Subject: Re: [ovs-dev] OVS performance issue: why small udp packet pps
performance between VMs is highly related with number of ovs ports and
number of VMs?

On 2/13/20 12:48 PM, Flavio Leitner wrote:
> On Thu, Feb 13, 2020 at 09:18:38AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
>> Hi, all
>>
>> We find OVS has a serious performance issue. We launch only one VM
>> per compute node and run an iperf small-UDP-packet pps performance
>> test between these two VMs. We see about 18 pps (packets per second,
>> -l 16), but:
>>
>> 1) If we add 100 veth ports to the br-int bridge, the pps performance
>> drops to about 5 pps.
>> 2) If we launch one more VM on every compute node, but don't run any
>> workload, the pps performance drops to about 9 pps. (Note: none of
>> the above veth ports in this test.)
>> 3) If we launch two more VMs on every compute node (3 VMs in total
>> per compute node), but don't run any workload, the pps performance
>> drops to about 5 pps. (Note: none of the above veth ports in this
>> test.)
>>
>> Can anybody help explain why this is so? Is there any known way to
>> optimize this? I really think OVS performance is bad (we can draw
>> such a conclusion from our test results at least); I don't want to
>> defame OVS ☺
>>
>> BTW, we used the OVS kernel datapath and vhost. We can see that every
>> port has a vhost kernel thread, which runs at 100% CPU utilization if
>> we run iperf in the VM; but for the idle VMs, the corresponding vhost
>> threads still have about 30% CPU utilization. I don't understand why.
>>
>> In addition, we find UDP performance is also very bad for small UDP
>> packets on the physical NIC. But it can reach 26 pps for -l 80, which
>> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
>> (14) + IP/UDP header (28) + 16 = 66 bytes. If we consider the
>> performance overhead the OVS bridge introduces, pps performance
>> between VMs should be able to reach at least 20 pps; other VMs and
>> ports shouldn't hurt it so badly, because they are idle, with no
>> workload at all.
>
> What do you have in the flow table? It sounds like the traffic is
> being broadcast to all ports. Check the FDB to see if OvS is learning
> the mac addresses.
>
> It's been a while since I last ran performance tests with the kernel
> datapath, but it should be no different from a Linux bridge with just
> the NORMAL action in the flow table.

I agree that if your performance heavily depends on the number of ports,
then you're most likely just flooding all the packets to all the ports.
Since you're using UDP traffic, please be sure that you're sending some
packets in the backward direction, so OVS and all other switches (if
any) will learn/not forget to which port packets should be sent. Also,
check that your IP addresses are correct. If for some reason it's not
possible for OVS to learn MAC addresses correctly, avoid using the
NORMAL action.

Best regards, Ilya Maximets.
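One hedge against one-way learning is to also run the test in the
reverse direction; a sketch assuming iperf3 and a hypothetical receiver
address of 10.0.0.2:

    # On the receiver VM:
    iperf3 -s

    # On the sender VM: normal direction, then reverse (-R) so the
    # receiver also sources packets and its MAC stays learned.
    iperf3 -c 10.0.0.2 -u -l 16 -b 0
    iperf3 -c 10.0.0.2 -u -l 16 -b 0 -R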
[ovs-discuss] Re: [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?
Flavio, this is an OpenStack environment; all the flows are added by
Neutron. The NORMAL action is the default flow present before Neutron
adds any flows; it is the OVS default flow.

-----Original Message-----
From: Flavio Leitner [mailto:f...@sysclose.org]
Sent: February 13, 2020, 19:48
To: Yi Yang (杨燚)-云服务集团
Cc: ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; i.maxim...@ovn.org
Subject: Re: [ovs-dev] OVS performance issue: why small udp packet pps
performance between VMs is highly related with number of ovs ports and
number of VMs?

On Thu, Feb 13, 2020 at 09:18:38AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, all
>
> We find OVS has a serious performance issue. We launch only one VM per
> compute node and run an iperf small-UDP-packet pps performance test
> between these two VMs. We see about 18 pps (packets per second,
> -l 16), but:
>
> 1) If we add 100 veth ports to the br-int bridge, the pps performance
> drops to about 5 pps.
> 2) If we launch one more VM on every compute node, but don't run any
> workload, the pps performance drops to about 9 pps. (Note: none of the
> above veth ports in this test.)
> 3) If we launch two more VMs on every compute node (3 VMs in total per
> compute node), but don't run any workload, the pps performance drops
> to about 5 pps. (Note: none of the above veth ports in this test.)
>
> Can anybody help explain why this is so? Is there any known way to
> optimize this? I really think OVS performance is bad (we can draw such
> a conclusion from our test results at least); I don't want to defame
> OVS ☺
>
> BTW, we used the OVS kernel datapath and vhost. We can see that every
> port has a vhost kernel thread, which runs at 100% CPU utilization if
> we run iperf in the VM; but for the idle VMs, the corresponding vhost
> threads still have about 30% CPU utilization. I don't understand why.
>
> In addition, we find UDP performance is also very bad for small UDP
> packets on the physical NIC. But it can reach 26 pps for -l 80, which
> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
> (14) + IP/UDP header (28) + 16 = 66 bytes. If we consider the
> performance overhead the OVS bridge introduces, pps performance
> between VMs should be able to reach at least 20 pps; other VMs and
> ports shouldn't hurt it so badly, because they are idle, with no
> workload at all.

What do you have in the flow table? It sounds like the traffic is being
broadcast to all ports. Check the FDB to see if OvS is learning the mac
addresses.

It's been a while since I last ran performance tests with the kernel
datapath, but it should be no different from a Linux bridge with just
the NORMAL action in the flow table.

--
fbl
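To confirm which flow the test traffic actually hits on a
Neutron-managed node, the flow table's hit counters can be checked; a
sketch, again assuming br-int:

    # Show OpenFlow entries with packet counters; if the NORMAL entry's
    # n_packets keeps rising during the test, traffic is being switched
    # by MAC learning rather than by Neutron's more specific flows.
    ovs-ofctl dump-flows br-int | grep "actions=NORMAL"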
Re: [ovs-discuss] [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?
On 2/13/20 12:48 PM, Flavio Leitner wrote:
> On Thu, Feb 13, 2020 at 09:18:38AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
>> Hi, all
>>
>> We find OVS has a serious performance issue. We launch only one VM
>> per compute node and run an iperf small-UDP-packet pps performance
>> test between these two VMs. We see about 18 pps (packets per second,
>> -l 16), but:
>>
>> 1) If we add 100 veth ports to the br-int bridge, the pps performance
>> drops to about 5 pps.
>> 2) If we launch one more VM on every compute node, but don't run any
>> workload, the pps performance drops to about 9 pps. (Note: none of
>> the above veth ports in this test.)
>> 3) If we launch two more VMs on every compute node (3 VMs in total
>> per compute node), but don't run any workload, the pps performance
>> drops to about 5 pps. (Note: none of the above veth ports in this
>> test.)
>>
>> Can anybody help explain why this is so? Is there any known way to
>> optimize this? I really think OVS performance is bad (we can draw
>> such a conclusion from our test results at least); I don't want to
>> defame OVS ☺
>>
>> BTW, we used the OVS kernel datapath and vhost. We can see that every
>> port has a vhost kernel thread, which runs at 100% CPU utilization if
>> we run iperf in the VM; but for the idle VMs, the corresponding vhost
>> threads still have about 30% CPU utilization. I don't understand why.
>>
>> In addition, we find UDP performance is also very bad for small UDP
>> packets on the physical NIC. But it can reach 26 pps for -l 80, which
>> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
>> (14) + IP/UDP header (28) + 16 = 66 bytes. If we consider the
>> performance overhead the OVS bridge introduces, pps performance
>> between VMs should be able to reach at least 20 pps; other VMs and
>> ports shouldn't hurt it so badly, because they are idle, with no
>> workload at all.
>
> What do you have in the flow table? It sounds like the traffic is
> being broadcast to all ports. Check the FDB to see if OvS is learning
> the mac addresses.
>
> It's been a while since I last ran performance tests with the kernel
> datapath, but it should be no different from a Linux bridge with just
> the NORMAL action in the flow table.

I agree that if your performance heavily depends on the number of ports,
then you're most likely just flooding all the packets to all the ports.
Since you're using UDP traffic, please be sure that you're sending some
packets in the backward direction, so OVS and all other switches (if
any) will learn/not forget to which port packets should be sent. Also,
check that your IP addresses are correct. If for some reason it's not
possible for OVS to learn MAC addresses correctly, avoid using the
NORMAL action.

Best regards, Ilya Maximets.
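A trivial way to apply the backward-direction advice without changing
the benchmark itself: generate a little reverse traffic from the
receiver so its FDB entry never ages out. A sketch with a hypothetical
sender address of 10.0.0.1:

    # From the receiver VM, refresh the FDB entry for its own MAC every
    # few seconds (well inside OVS's default 300 s MAC aging time).
    while sleep 5; do ping -c 1 -W 1 10.0.0.1 >/dev/null; done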
Re: [ovs-discuss] [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?
On Thu, Feb 13, 2020 at 09:18:38AM +0000, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, all
>
> We find OVS has a serious performance issue. We launch only one VM per
> compute node and run an iperf small-UDP-packet pps performance test
> between these two VMs. We see about 18 pps (packets per second,
> -l 16), but:
>
> 1) If we add 100 veth ports to the br-int bridge, the pps performance
> drops to about 5 pps.
> 2) If we launch one more VM on every compute node, but don't run any
> workload, the pps performance drops to about 9 pps. (Note: none of the
> above veth ports in this test.)
> 3) If we launch two more VMs on every compute node (3 VMs in total per
> compute node), but don't run any workload, the pps performance drops
> to about 5 pps. (Note: none of the above veth ports in this test.)
>
> Can anybody help explain why this is so? Is there any known way to
> optimize this? I really think OVS performance is bad (we can draw such
> a conclusion from our test results at least); I don't want to defame
> OVS ☺
>
> BTW, we used the OVS kernel datapath and vhost. We can see that every
> port has a vhost kernel thread, which runs at 100% CPU utilization if
> we run iperf in the VM; but for the idle VMs, the corresponding vhost
> threads still have about 30% CPU utilization. I don't understand why.
>
> In addition, we find UDP performance is also very bad for small UDP
> packets on the physical NIC. But it can reach 26 pps for -l 80, which
> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
> (14) + IP/UDP header (28) + 16 = 66 bytes. If we consider the
> performance overhead the OVS bridge introduces, pps performance
> between VMs should be able to reach at least 20 pps; other VMs and
> ports shouldn't hurt it so badly, because they are idle, with no
> workload at all.

What do you have in the flow table? It sounds like the traffic is being
broadcast to all ports. Check the FDB to see if OvS is learning the mac
addresses.

It's been a while since I last ran performance tests with the kernel
datapath, but it should be no different from a Linux bridge with just
the NORMAL action in the flow table.

--
fbl
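As a baseline for the Linux bridge comparison above, the same two taps
can be bridged directly; a sketch with hypothetical tap device names:

    # Build a plain Linux bridge and enslave the two VM tap devices,
    # then rerun the same iperf test over it.
    ip link add name br0 type bridge
    ip link set dev br0 up
    ip link set dev tap-vm1 master br0
    ip link set dev tap-vm2 master br0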
[ovs-discuss] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?
Hi, all

We find OVS has a serious performance issue. We launch only one VM per
compute node and run an iperf small-UDP-packet pps performance test
between these two VMs. We see about 18 pps (packets per second, -l 16),
but:

1) If we add 100 veth ports to the br-int bridge, the pps performance
drops to about 5 pps.
2) If we launch one more VM on every compute node, but don't run any
workload, the pps performance drops to about 9 pps. (Note: none of the
above veth ports in this test.)
3) If we launch two more VMs on every compute node (3 VMs in total per
compute node), but don't run any workload, the pps performance drops to
about 5 pps. (Note: none of the above veth ports in this test.)

Can anybody help explain why this is so? Is there any known way to
optimize this? I really think OVS performance is bad (we can draw such a
conclusion from our test results at least); I don't want to defame OVS ☺

BTW, we used the OVS kernel datapath and vhost. We can see that every
port has a vhost kernel thread, which runs at 100% CPU utilization if we
run iperf in the VM; but for the idle VMs, the corresponding vhost
threads still have about 30% CPU utilization. I don't understand why.

In addition, we find UDP performance is also very bad for small UDP
packets on the physical NIC. But it can reach 26 pps for -l 80, which is
enough to cover the VXLAN header (8 bytes) + inner Ethernet header (14)
+ IP/UDP header (28) + 16 = 66 bytes. If we consider the performance
overhead the OVS bridge introduces, pps performance between VMs should
be able to reach at least 20 pps; other VMs and ports shouldn't hurt it
so badly, because they are idle, with no workload at all.
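For reference, a presumed shape of the test described above. The exact
iperf command line is not given in the mail; -l 16 sets a 16-byte UDP
payload, and the receiver address 10.0.0.2 is hypothetical:

    # On the receiver VM:
    iperf3 -s

    # On the sender VM: 16-byte UDP payloads, unthrottled (-b 0).
    iperf3 -c 10.0.0.2 -u -l 16 -b 0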
Re: [ovs-discuss] OVN: Delay in handling unixctl commands in ovsdb-server
Hi all,

On Thu, Feb 13, 2020 at 8:09 AM Han Zhou wrote:
>
> On Wed, Feb 12, 2020 at 9:57 AM Numan Siddique wrote:
> >
> > Hi Ben/All,
> >
> > In an OVN deployment, with the OVN DBs deployed as active/standby
> > using pacemaker, we are seeing delays in the response to the unixctl
> > command ovsdb-server/sync-status.
> >
> > Pacemaker periodically calls the OVN pacemaker OCF script to get the
> > status, and this script internally invokes: ovs-appctl -t
> > /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/sync-status. In a
> > large deployment with lots of OVN resources, we see that ovsdb-server
> > takes a long time (sometimes > 60 seconds) to respond to this
> > command. This causes pacemaker to stop the service on that node and
> > move the master to another node, which causes a lot of disruption.
> >
> > One approach to solving this issue is to handle unixctl commands in a
> > separate thread. Commands like sync-status, get-**, etc. can easily
> > be handled in that thread. However, many commands, such as
> > ovsdb-server/set-active-ovsdb-server and ovsdb-server/compact (which
> > change state), need to be synchronized between the main ovsdb-server
> > thread and the newly added thread using a mutex.
> >
> > Does this approach make sense? I started working on it, but I wanted
> > to check with the community before putting more effort into it.
> >
> > Are there better ways to solve this issue?
> >
> > Thanks
> > Numan
>
> Hi Numan,
>
> It seems reasonable to me. Multi-threading would add a little
> complexity, but in this case it should be straightforward. It merely
> requires mutexes to synchronize between the threads for *writes*, and
> also for *reads* of non-atomic data.
> The only side effect is that *if* the thread that does the DB job is
> really stuck because of a bug and is not handling jobs at all, the
> unixctl thread's ovsdb-server/sync-status command wouldn't detect it,
> so it could result in pacemaker reporting a *happy* status without
> detecting problems. First of all, this is unlikely to happen. But if
> we really think it is a problem, we can still solve it by incrementing
> a counter in the main loop and adding a new command (read-only,
> without a mutex) that checks whether this counter is increasing, to
> tell whether the server is really working.

I'd be more inclined to do what Han suggests here: have every thread
contribute to the health status with a read-only counter. Whatever gets
implemented here can perhaps be reused in ovn-controller to monitor the
main & pinctrl threads. A similar scenario, but with maybe worse
consequences since it affects the dataplane, is that the "health" thread
reports good status while the pinctrl thread is stuck, and therefore the
DHCP service is down and instances can't fetch an IP.

> Thanks,
> Han
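A sketch of how a monitor could consume the counter Han describes. The
command name ovsdb-server/loop-counter is hypothetical; no such unixctl
command exists today:

    #!/bin/sh
    # Hypothetical liveness probe: read a main-loop iteration counter
    # twice and require it to advance, proving the DB thread is making
    # progress even though a separate unixctl thread answers promptly.
    CTL=/var/run/openvswitch/ovnsb_db.ctl
    c1=$(ovs-appctl -t "$CTL" ovsdb-server/loop-counter)
    sleep 1
    c2=$(ovs-appctl -t "$CTL" ovsdb-server/loop-counter)
    if [ "$c2" -gt "$c1" ]; then
        echo "main loop alive"
    else
        echo "main loop stuck" >&2
        exit 1
    fi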