Re: [ovs-discuss] OVN: Delay in handling unixctl commands in ovsdb-server

2020-02-13 Thread Ben Pfaff
On Wed, Feb 12, 2020 at 11:27:18PM +0530, Numan Siddique wrote:
> Hi Ben/All,
> 
> In an OVN deployment with the OVN DBs deployed as active/standby using
> pacemaker, we are seeing delays in the response to the unixctl command
> ovsdb-server/sync-status.
> 
> Pacemaker periodically calls the OVN pacemaker OCF script to get the
> status, and this script internally invokes: ovs-appctl -t
> /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/sync-status. In a large
> deployment with lots of OVN resources, we see that ovsdb-server takes a
> long time (sometimes > 60 seconds) to respond to this command. This
> causes pacemaker to stop the service on that node and move the master
> to another node, which causes a lot of disruption.
> 
> One approach to solving this issue is to handle unixctl commands in a
> separate thread. Commands like sync-status, get-**, etc. can easily be
> handled in that thread. Still, there are many commands, like
> ovsdb-server/set-active-ovsdb-server and ovsdb-server/compact, that
> change state and therefore need to be synchronized between the main
> ovsdb-server thread and the newly added thread using a mutex.
> 
> Does this approach make sense? I started working on it, but I wanted
> to check with the community before putting in more effort.

It seems reasonable to me to support unixctl commands in multiple
threads.  The details of how you implement it will determine how usable
it is.  I suggest making the current case easy and common.
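To make the proposal concrete, below is a minimal pthreads sketch of the
idea, assuming the status data can simply be copied under a mutex.  It is
not ovsdb-server's actual code; all names (sync_state, status_thread_main,
etc.) are made up for illustration:

/* Illustrative sketch only; not ovsdb-server's actual code.  All names
 * here (sync_state, status_thread_main, ...) are hypothetical. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

struct sync_state {
    pthread_mutex_t mutex;        /* Protects the fields below. */
    char role[16];                /* "active" or "backup", set by the main loop. */
    char active_server[128];      /* Set by set-active-ovsdb-server. */
};

static struct sync_state state = { PTHREAD_MUTEX_INITIALIZER, "backup", "" };

/* Runs in the unixctl thread.  Only a short critical section is needed to
 * copy the state, so the reply stays fast even when the main loop is busy
 * with a large transaction. */
static void
handle_sync_status(char *reply, size_t len)
{
    pthread_mutex_lock(&state.mutex);
    snprintf(reply, len, "state: %s\nactive server: %s\n",
             state.role, state.active_server);
    pthread_mutex_unlock(&state.mutex);
}

/* State-changing commands (set-active-ovsdb-server, compact, ...) either
 * take the same mutex or are handed back to the main thread to execute. */
static void
set_active_server(const char *target)
{
    pthread_mutex_lock(&state.mutex);
    snprintf(state.active_server, sizeof state.active_server, "%s", target);
    pthread_mutex_unlock(&state.mutex);
}

static void *
status_thread_main(void *arg)
{
    (void) arg;
    for (;;) {                    /* In reality: wait for a unixctl request. */
        char reply[256];
        handle_sync_status(reply, sizeof reply);
        sleep(1);
    }
    return NULL;
}

int
main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, status_thread_main, NULL);
    set_active_server("tcp:192.0.2.10:6642");   /* The main loop mutates state. */
    pause();                                    /* Keep the sketch running. */
}

With something along these lines, ovsdb-server/sync-status could answer
immediately even while the main thread is applying a large snapshot, while
state-changing commands would still be serialized with the main loop.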


Re: [ovs-discuss] Reply: [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?

2020-02-13 Thread Flavio Leitner
On Thu, Feb 13, 2020 at 03:07:33PM +0100, Ilya Maximets wrote:
> On 2/13/20 2:52 PM, Yi Yang (杨燚)-云服务集团 wrote:
> > Thanks Ilya. iperf3 UDP should be single-direction; the source IP address and
> > destination IP address are the two VMs' IPs, and UDP bandwidth would be 0 if
> > they were wrong, but the UDP loss rate is obviously 0, so it isn't the case
> > you're describing. Do we have a way to disable MAC learning or MAC broadcast?
> 
> The NORMAL action acts like an L2 learning switch.  If you don't
> want to use MAC learning, remove the flow with the NORMAL action and
> add direct forwarding flows like output:<port>.  But
> I don't think you want to do that in an OpenStack setup.

Also, iperf3 establishes a control connection, which uses TCP in
both directions, so in theory the FDB should be updated.

> > Is the NORMAL action or MAC learning a slow-path process? If so, the
> > ovs-vswitchd daemon should have high CPU utilization.
> 
> It's not a slow path, so there will be no CPU usage by the ovs-vswitchd
> userspace process.  To confirm that you're flooding packets, you
> may dump the installed datapath flows with the following command:
> 
> ovs-appctl dpctl/dump-flows
> 
> In case of flooding, you will see a datapath flow with a large number of
> output ports, like this:
> 
> <...>  actions:<port1>,<port2>,...

I'd suggest looking at the FDB (ovs-appctl fdb/show <bridge>)
and at the port stats to see whether traffic is moving as well.
Maybe it's not your UDP test packets, but other unrelated
traffic in the network.

HTH,
fbl


> 
> > 
> > -----Original Message-----
> > From: Ilya Maximets [mailto:i.maxim...@ovn.org]
> > Sent: February 13, 2020 21:23
> > To: Flavio Leitner ; Yi Yang (杨燚)-云服务集团
> > 
> > Cc: ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; Ilya Maximets
> > 
> > Subject: Re: [ovs-dev] OVS performance issue: why small udp packet pps
> > performance between VMs is highly related with number of ovs ports and
> > number of VMs?
> > 
> > On 2/13/20 12:48 PM, Flavio Leitner wrote:
> >> On Thu, Feb 13, 2020 at 09:18:38AM +, Yi Yang (杨燚)-云服务集团 wrote:
> >>> Hi, all
> >>>
> >>> We find OVS has a serious performance issue. We launch only one VM on
> >>> each compute node and run an iperf small-UDP-packet pps performance
> >>> test between these two VMs; we see about 18 pps (packets per second,
> >>> -l 16), but
> >>>
> >>> 1) If we add 100 veth ports to the br-int bridge, the pps
> >>> performance will be about 5 pps.
> >>> 2) If we launch one more VM on every compute node, but don't run any
> >>> workload on it, the pps performance will be about 9 pps. (Note: no
> >>> extra veth ports in this test.)
> >>> 3) If we launch two more VMs on every compute node (3 VMs per
> >>> compute node in total), but don't run any workload on them, the pps
> >>> performance will be about 5 pps. (Note: no extra veth ports in
> >>> this test.)
> >>>
> >>> Can anybody help explain why this is so? Is there any known way to
> >>> optimize this? I really think OVS performance is bad (at least we can
> >>> draw that conclusion from our test results); I don't want to
> >>> defame OVS ☺
> >>>
> >>> BTW, we use the OVS kernel datapath and vhost. We can see that every
> >>> port has a vhost kernel thread, which runs at 100% CPU utilization when
> >>> we run iperf in a VM; but for the idle VMs, the corresponding vhost
> >>> threads still show about 30% CPU utilization, and I don't understand why.
> >>>
> >>> In addition, we find that UDP performance is also very bad for small UDP
> >>> packets on the physical NIC. But it can reach 26 pps for -l 80, which
> >>> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
> >>> (14) + IP/UDP headers (28) + 16 = 66. Even considering the overhead the
> >>> OVS bridge introduces, pps performance between VMs should be able to
> >>> reach at least 20 pps; the other VMs and ports shouldn't hurt it so
> >>> much, because they are idle with no workload at all.
> >>
> >> What do you have in the flow table?  It sounds like the traffic is
> >> being broadcast to all ports. Check the FDB to see if OvS is learning
> >> the MAC addresses.
> >>
> >> It's been a while since I last ran performance tests with the kernel
> >> datapath, but it should be no different from a Linux bridge with just
> >> the NORMAL action in the flow table.
> >>
> > 
> > I agree that if your performance depends heavily on the number of ports,
> > then you're most likely just flooding all the packets to all the ports.
> > Since you're using UDP traffic, please be sure that you're sending some
> > packets in the backward direction, so OVS and all other switches (if any)
> > will learn (and not forget) which port packets should be sent to.  Also,
> > check that your IP addresses are correct.  If for some reason it's not
> > possible for OVS to learn MAC addresses correctly, avoid using action:NORMAL.
> > 
> > Best regards, Ilya Maximets.
> > 
> 

-- 
fbl


Re: [ovs-discuss] Reply: [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?

2020-02-13 Thread Ilya Maximets
On 2/13/20 2:52 PM, Yi Yang (杨燚)-云服务集团 wrote:
> Thanks Ilya. iperf3 UDP should be single-direction; the source IP address and
> destination IP address are the two VMs' IPs, and UDP bandwidth would be 0 if
> they were wrong, but the UDP loss rate is obviously 0, so it isn't the case
> you're describing. Do we have a way to disable MAC learning or MAC broadcast?

The NORMAL action acts like an L2 learning switch.  If you don't
want to use MAC learning, remove the flow with the NORMAL action and
add direct forwarding flows like output:<port>.  But
I don't think you want to do that in an OpenStack setup.

> 
> Is the NORMAL action or MAC learning a slow-path process? If so, the
> ovs-vswitchd daemon should have high CPU utilization.

It's not a slow path, so there will be no CPU usage by the ovs-vswitchd
userspace process.  To confirm that you're flooding packets, you
may dump the installed datapath flows with the following command:

ovs-appctl dpctl/dump-flows

In case of flooding, you will see a datapath flow with a large number of
output ports, like this:

<...>  actions:<port1>,<port2>,...

> 
> -----Original Message-----
> From: Ilya Maximets [mailto:i.maxim...@ovn.org]
> Sent: February 13, 2020 21:23
> To: Flavio Leitner ; Yi Yang (杨燚)-云服务集团
> 
> Cc: ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; Ilya Maximets
> 
> Subject: Re: [ovs-dev] OVS performance issue: why small udp packet pps performance
> between VMs is highly related with number of ovs ports and number of VMs?
> 
> On 2/13/20 12:48 PM, Flavio Leitner wrote:
>> On Thu, Feb 13, 2020 at 09:18:38AM +, Yi Yang (杨燚)-云服务集团 wrote:
>>> Hi, all
>>>
>>> We find OVS has a serious performance issue. We launch only one VM on
>>> each compute node and run an iperf small-UDP-packet pps performance
>>> test between these two VMs; we see about 18 pps (packets per second,
>>> -l 16), but
>>>
>>> 1) If we add 100 veth ports to the br-int bridge, the pps
>>> performance will be about 5 pps.
>>> 2) If we launch one more VM on every compute node, but don't run any
>>> workload on it, the pps performance will be about 9 pps. (Note: no
>>> extra veth ports in this test.)
>>> 3) If we launch two more VMs on every compute node (3 VMs per
>>> compute node in total), but don't run any workload on them, the pps
>>> performance will be about 5 pps. (Note: no extra veth ports in
>>> this test.)
>>>
>>> Can anybody help explain why this is so? Is there any known way to
>>> optimize this? I really think OVS performance is bad (at least we can
>>> draw that conclusion from our test results); I don't want to
>>> defame OVS ☺
>>>
>>> BTW, we use the OVS kernel datapath and vhost. We can see that every
>>> port has a vhost kernel thread, which runs at 100% CPU utilization when
>>> we run iperf in a VM; but for the idle VMs, the corresponding vhost
>>> threads still show about 30% CPU utilization, and I don't understand why.
>>>
>>> In addition, we find that UDP performance is also very bad for small UDP
>>> packets on the physical NIC. But it can reach 26 pps for -l 80, which
>>> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
>>> (14) + IP/UDP headers (28) + 16 = 66. Even considering the overhead the
>>> OVS bridge introduces, pps performance between VMs should be able to
>>> reach at least 20 pps; the other VMs and ports shouldn't hurt it so
>>> much, because they are idle with no workload at all.
>>
>> What do you have in the flow table?  It sounds like the traffic is
>> being broadcast to all ports. Check the FDB to see if OvS is learning
>> the MAC addresses.
>>
>> It's been a while since I last ran performance tests with the kernel
>> datapath, but it should be no different from a Linux bridge with just
>> the NORMAL action in the flow table.
>>
> 
> I agree that if your performance depends heavily on the number of ports,
> then you're most likely just flooding all the packets to all the ports.
> Since you're using UDP traffic, please be sure that you're sending some
> packets in the backward direction, so OVS and all other switches (if any)
> will learn (and not forget) which port packets should be sent to.  Also,
> check that your IP addresses are correct.  If for some reason it's not
> possible for OVS to learn MAC addresses correctly, avoid using action:NORMAL.
> 
> Best regards, Ilya Maximets.
> 



[ovs-discuss] Reply: [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?

2020-02-13 Thread 杨燚
Thanks Ilya. iperf3 UDP should be single-direction; the source IP address and
destination IP address are the two VMs' IPs, and UDP bandwidth would be 0 if
they were wrong, but the UDP loss rate is obviously 0, so it isn't the case
you're describing. Do we have a way to disable MAC learning or MAC broadcast?

Is the NORMAL action or MAC learning a slow-path process? If so, the
ovs-vswitchd daemon should have high CPU utilization.

-----Original Message-----
From: Ilya Maximets [mailto:i.maxim...@ovn.org]
Sent: February 13, 2020 21:23
To: Flavio Leitner ; Yi Yang (杨燚)-云服务集团

Cc: ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; Ilya Maximets

Subject: Re: [ovs-dev] OVS performance issue: why small udp packet pps performance
between VMs is highly related with number of ovs ports and number of VMs?

On 2/13/20 12:48 PM, Flavio Leitner wrote:
> On Thu, Feb 13, 2020 at 09:18:38AM +, Yi Yang (杨燚)-云服务集团 wrote:
>> Hi, all
>>
>> We find OVS has a serious performance issue. We launch only one VM on
>> each compute node and run an iperf small-UDP-packet pps performance
>> test between these two VMs; we see about 18 pps (packets per second,
>> -l 16), but
>>
>> 1) If we add 100 veth ports to the br-int bridge, the pps
>> performance will be about 5 pps.
>> 2) If we launch one more VM on every compute node, but don't run any
>> workload on it, the pps performance will be about 9 pps. (Note: no
>> extra veth ports in this test.)
>> 3) If we launch two more VMs on every compute node (3 VMs per
>> compute node in total), but don't run any workload on them, the pps
>> performance will be about 5 pps. (Note: no extra veth ports in
>> this test.)
>>
>> Can anybody help explain why this is so? Is there any known way to
>> optimize this? I really think OVS performance is bad (at least we can
>> draw that conclusion from our test results); I don't want to
>> defame OVS ☺
>>
>> BTW, we use the OVS kernel datapath and vhost. We can see that every
>> port has a vhost kernel thread, which runs at 100% CPU utilization when
>> we run iperf in a VM; but for the idle VMs, the corresponding vhost
>> threads still show about 30% CPU utilization, and I don't understand why.
>>
>> In addition, we find that UDP performance is also very bad for small UDP
>> packets on the physical NIC. But it can reach 26 pps for -l 80, which
>> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
>> (14) + IP/UDP headers (28) + 16 = 66. Even considering the overhead the
>> OVS bridge introduces, pps performance between VMs should be able to
>> reach at least 20 pps; the other VMs and ports shouldn't hurt it so
>> much, because they are idle with no workload at all.
> 
> What do you have in the flow table?  It sounds like the traffic is
> being broadcast to all ports. Check the FDB to see if OvS is learning
> the MAC addresses.
> 
> It's been a while since I last ran performance tests with the kernel
> datapath, but it should be no different from a Linux bridge with just
> the NORMAL action in the flow table.
> 

I agree that if your performance depends heavily on the number of ports,
then you're most likely just flooding all the packets to all the ports.
Since you're using UDP traffic, please be sure that you're sending some
packets in the backward direction, so OVS and all other switches (if any)
will learn (and not forget) which port packets should be sent to.  Also,
check that your IP addresses are correct.  If for some reason it's not
possible for OVS to learn MAC addresses correctly, avoid using action:NORMAL.

Best regards, Ilya Maximets.




[ovs-discuss] Reply: [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?

2020-02-13 Thread 杨燚
Flavio, this is an OpenStack environment; all the flows are added by Neutron.
The NORMAL action is the default flow that is present before Neutron adds any
flows; it is OVS's default flow.

-----Original Message-----
From: Flavio Leitner [mailto:f...@sysclose.org]
Sent: February 13, 2020 19:48
To: Yi Yang (杨燚)-云服务集团
Cc: ovs-discuss@openvswitch.org; ovs-...@openvswitch.org; i.maxim...@ovn.org
Subject: Re: [ovs-dev] OVS performance issue: why small udp packet pps performance
between VMs is highly related with number of ovs ports and number of VMs?

On Thu, Feb 13, 2020 at 09:18:38AM +, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, all
> 
> We find OVS has a serious performance issue. We launch only one VM on
> each compute node and run an iperf small-UDP-packet pps performance
> test between these two VMs; we see about 18 pps (packets per second,
> -l 16), but
> 
> 1) If we add 100 veth ports to the br-int bridge, the pps
> performance will be about 5 pps.
> 2) If we launch one more VM on every compute node, but don't run any
> workload on it, the pps performance will be about 9 pps. (Note: no
> extra veth ports in this test.)
> 3) If we launch two more VMs on every compute node (3 VMs per
> compute node in total), but don't run any workload on them, the pps
> performance will be about 5 pps. (Note: no extra veth ports in
> this test.)
> 
> Can anybody help explain why this is so? Is there any known way to
> optimize this? I really think OVS performance is bad (at least we can
> draw that conclusion from our test results); I don't want to
> defame OVS ☺
> 
> BTW, we use the OVS kernel datapath and vhost. We can see that every
> port has a vhost kernel thread, which runs at 100% CPU utilization when
> we run iperf in a VM; but for the idle VMs, the corresponding vhost
> threads still show about 30% CPU utilization, and I don't understand why.
> 
> In addition, we find that UDP performance is also very bad for small UDP
> packets on the physical NIC. But it can reach 26 pps for -l 80, which
> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
> (14) + IP/UDP headers (28) + 16 = 66. Even considering the overhead the
> OVS bridge introduces, pps performance between VMs should be able to
> reach at least 20 pps; the other VMs and ports shouldn't hurt it so
> much, because they are idle with no workload at all.

What do you have in the flow table?  It sounds like the traffic is
being broadcast to all ports. Check the FDB to see if OvS is learning
the MAC addresses.

It's been a while since I last ran performance tests with the kernel
datapath, but it should be no different from a Linux bridge with just
the NORMAL action in the flow table.

--
fbl




Re: [ovs-discuss] [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?

2020-02-13 Thread Ilya Maximets
On 2/13/20 12:48 PM, Flavio Leitner wrote:
> On Thu, Feb 13, 2020 at 09:18:38AM +, Yi Yang (杨燚)-云服务集团 wrote:
>> Hi, all
>>
>> We find OVS has a serious performance issue. We launch only one VM on
>> each compute node and run an iperf small-UDP-packet pps performance
>> test between these two VMs; we see about 18 pps (packets per second,
>> -l 16), but
>>
>> 1) If we add 100 veth ports to the br-int bridge, the pps
>> performance will be about 5 pps.
>> 2) If we launch one more VM on every compute node, but don't run any
>> workload on it, the pps performance will be about 9 pps. (Note: no
>> extra veth ports in this test.)
>> 3) If we launch two more VMs on every compute node (3 VMs per
>> compute node in total), but don't run any workload on them, the pps
>> performance will be about 5 pps. (Note: no extra veth ports in
>> this test.)
>>
>> Can anybody help explain why this is so? Is there any known way to
>> optimize this? I really think OVS performance is bad (at least we can
>> draw that conclusion from our test results); I don't want to
>> defame OVS ☺
>>
>> BTW, we use the OVS kernel datapath and vhost. We can see that every
>> port has a vhost kernel thread, which runs at 100% CPU utilization when
>> we run iperf in a VM; but for the idle VMs, the corresponding vhost
>> threads still show about 30% CPU utilization, and I don't understand why.
>>
>> In addition, we find that UDP performance is also very bad for small UDP
>> packets on the physical NIC. But it can reach 26 pps for -l 80, which
>> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
>> (14) + IP/UDP headers (28) + 16 = 66. Even considering the overhead the
>> OVS bridge introduces, pps performance between VMs should be able to
>> reach at least 20 pps; the other VMs and ports shouldn't hurt it so
>> much, because they are idle with no workload at all.
> 
> What do you have in the flow table?  It sounds like the traffic is
> being broadcast to all ports. Check the FDB to see if OvS is
> learning the MAC addresses.
> 
> It's been a while since I last ran performance tests with the kernel
> datapath, but it should be no different from a Linux bridge with
> just the NORMAL action in the flow table.
> 

I agree that if your performance depends heavily on the number of
ports, then you're most likely just flooding all the packets to
all the ports.  Since you're using UDP traffic, please be sure
that you're sending some packets in the backward direction, so OVS
and all other switches (if any) will learn (and not forget) which
port packets should be sent to.  Also, check that your IP addresses
are correct.  If for some reason it's not possible for OVS to
learn MAC addresses correctly, avoid using action:NORMAL.

Best regards, Ilya Maximets.


Re: [ovs-discuss] [ovs-dev] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?

2020-02-13 Thread Flavio Leitner
On Thu, Feb 13, 2020 at 09:18:38AM +, Yi Yang (杨燚)-云服务集团 wrote:
> Hi, all
> 
> We find OVS has a serious performance issue. We launch only one VM on
> each compute node and run an iperf small-UDP-packet pps performance
> test between these two VMs; we see about 18 pps (packets per second,
> -l 16), but
> 
> 1) If we add 100 veth ports to the br-int bridge, the pps
> performance will be about 5 pps.
> 2) If we launch one more VM on every compute node, but don't run any
> workload on it, the pps performance will be about 9 pps. (Note: no
> extra veth ports in this test.)
> 3) If we launch two more VMs on every compute node (3 VMs per
> compute node in total), but don't run any workload on them, the pps
> performance will be about 5 pps. (Note: no extra veth ports in
> this test.)
> 
> Can anybody help explain why this is so? Is there any known way to
> optimize this? I really think OVS performance is bad (at least we can
> draw that conclusion from our test results); I don't want to
> defame OVS ☺
> 
> BTW, we use the OVS kernel datapath and vhost. We can see that every
> port has a vhost kernel thread, which runs at 100% CPU utilization when
> we run iperf in a VM; but for the idle VMs, the corresponding vhost
> threads still show about 30% CPU utilization, and I don't understand why.
> 
> In addition, we find that UDP performance is also very bad for small UDP
> packets on the physical NIC. But it can reach 26 pps for -l 80, which
> is enough to cover the VXLAN header (8 bytes) + inner Ethernet header
> (14) + IP/UDP headers (28) + 16 = 66. Even considering the overhead the
> OVS bridge introduces, pps performance between VMs should be able to
> reach at least 20 pps; the other VMs and ports shouldn't hurt it so
> much, because they are idle with no workload at all.

What do you have in the flow table?  It sounds like the traffic is
being broadcast to all ports. Check the FDB to see if OvS is
learning the MAC addresses.

It's been a while since I last ran performance tests with the kernel
datapath, but it should be no different from a Linux bridge with
just the NORMAL action in the flow table.

-- 
fbl


[ovs-discuss] OVS performance issue: why small udp packet pps performance between VMs is highly related with number of ovs ports and number of VMs?

2020-02-13 Thread 杨燚
Hi, all

We find OVS has a serious performance issue. We launch only one VM on each
compute node and run an iperf small-UDP-packet pps performance test between
these two VMs; we see about 18 pps (packets per second, -l 16), but

1) If we add 100 veth ports to the br-int bridge, the pps performance will
be about 5 pps.
2) If we launch one more VM on every compute node, but don't run any
workload on it, the pps performance will be about 9 pps. (Note: no extra
veth ports in this test.)
3) If we launch two more VMs on every compute node (3 VMs per compute node
in total), but don't run any workload on them, the pps performance will be
about 5 pps. (Note: no extra veth ports in this test.)

Can anybody help explain why this is so? Is there any known way to optimize
this? I really think OVS performance is bad (at least we can draw that
conclusion from our test results); I don't want to defame OVS ☺

BTW, we use the OVS kernel datapath and vhost. We can see that every port
has a vhost kernel thread, which runs at 100% CPU utilization when we run
iperf in a VM; but for the idle VMs, the corresponding vhost threads still
show about 30% CPU utilization, and I don't understand why.

In addition, we find that UDP performance is also very bad for small UDP
packets on the physical NIC. But it can reach 26 pps for -l 80, which is
enough to cover the VXLAN header (8 bytes) + inner Ethernet header (14) +
IP/UDP headers (28) + 16 = 66. Even considering the overhead the OVS bridge
introduces, pps performance between VMs should be able to reach at least
20 pps; the other VMs and ports shouldn't hurt it so much, because they
are idle with no workload at all.




Re: [ovs-discuss] OVN: Delay in handling unixctl commands in ovsdb-server

2020-02-13 Thread Daniel Alvarez Sanchez
Hi all,

On Thu, Feb 13, 2020 at 8:09 AM Han Zhou  wrote:

>
>
> On Wed, Feb 12, 2020 at 9:57 AM Numan Siddique 
> wrote:
> >
> > Hi Ben/All,
> >
> > In an OVN deployment with the OVN DBs deployed as active/standby using
> > pacemaker, we are seeing delays in the response to the unixctl command
> > ovsdb-server/sync-status.
> >
> > Pacemaker periodically calls the OVN pacemaker OCF script to get the
> > status, and this script internally invokes: ovs-appctl -t
> > /var/run/openvswitch/ovnsb_db.ctl ovsdb-server/sync-status. In a large
> > deployment with lots of OVN resources, we see that ovsdb-server takes a
> > long time (sometimes > 60 seconds) to respond to this command. This
> > causes pacemaker to stop the service on that node and move the master
> > to another node, which causes a lot of disruption.
> >
> > One approach to solving this issue is to handle unixctl commands in a
> > separate thread. Commands like sync-status, get-**, etc. can easily be
> > handled in that thread. Still, there are many commands, like
> > ovsdb-server/set-active-ovsdb-server and ovsdb-server/compact, that
> > change state and therefore need to be synchronized between the main
> > ovsdb-server thread and the newly added thread using a mutex.
> >
> > Does this approach make sense? I started working on it, but I wanted
> > to check with the community before putting in more effort.
> >
> > Are there better ways to solve this issue?
> >
> > Thanks
> > Numan
> >
> Hi Numan,
>
> It seems reasonable to me. Multi-threading would add a little complexity,
> but in this case it should be straightforward. It merely requires mutexes
> to synchronize between the threads for *writes*, and also for *reads* of
> non-atomic data.
> The only side effect is that *if* the thread that does the DB work really
> got stuck because of a bug and stopped handling jobs altogether, the
> ovsdb-server/sync-status command served by the unixctl thread wouldn't
> detect it, so pacemaker could end up reporting a *happy* status without
> detecting the problem. First of all, this is unlikely to happen. But if we
> really think it is a problem, we can still solve it by incrementing a
> counter in the main loop and adding a new command (read-only, without a
> mutex) that checks whether this counter is increasing, to tell whether the
> server is really working.
>

I'd be more inclined to do what Han suggests here, with every thread
contributing to the health status through a read-only counter.

Whatever gets implemented here can perhaps be re-used in ovn-controller to
monitor the main & pinctrl threads.
A similar scenario, but with arguably worse consequences since it affects
the dataplane, would be the "health" thread reporting good status while the
pinctrl thread is stuck, so the DHCP service is down and instances can't
fetch an IP.
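
As a rough sketch of the heartbeat counter Han describes above (hypothetical
names such as main_loop_ticks and command_is_alive; a C11 atomic is used here
purely for illustration, a mutex-protected counter would do just as well):

/* Illustrative sketch of an "is the main loop alive?" check; all names are
 * hypothetical, not existing OVS/OVN code. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static atomic_ulong main_loop_ticks;    /* Bumped once per main-loop iteration. */

/* Called by the main (DB) thread at the top of every iteration. */
static void
main_loop_heartbeat(void)
{
    atomic_fetch_add_explicit(&main_loop_ticks, 1, memory_order_relaxed);
}

/* Read-only check served from the unixctl/status thread: sample the counter
 * twice; if it moved, the main loop is making progress.  A monitoring script
 * could equally compare the value against the one from its previous probe. */
static bool
command_is_alive(void)
{
    unsigned long before = atomic_load(&main_loop_ticks);
    usleep(100 * 1000);                 /* Give the main loop a chance to run. */
    return atomic_load(&main_loop_ticks) != before;
}

/* Stand-in for the real server main loop. */
static void *
fake_main_loop(void *arg)
{
    (void) arg;
    for (;;) {
        main_loop_heartbeat();
        usleep(10 * 1000);              /* One "iteration" every 10 ms. */
    }
    return NULL;
}

int
main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, fake_main_loop, NULL);
    printf("main loop alive: %s\n", command_is_alive() ? "yes" : "no");
}

The same counter-per-thread idea could then also cover the pinctrl thread in
ovn-controller, as suggested above.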


> Thanks,
> Han