Re: [ovs-discuss] Discussion on the logical rationality of flow-limit

2021-05-05 Thread taoyunupt
On 2021-05-06 03:26:46, "Ben Pfaff" wrote:

>On Fri, Apr 30, 2021 at 06:10:43PM +0800, taoyunupt wrote:
>> 
>> 
>> 
>> At 2021-04-29 06:39:11, "Ben Pfaff"  wrote:
>> >On Wed, Apr 28, 2021 at 08:12:06PM +0800, taoyunupt wrote:
>> >> Hi,
>> >>  Recently I encountered a TCP connection performance problem; the 
>> >> test tool is Apache Benchmark.
>> >>  The OVS in my environment is set up for a hardware offload solution. 
>> >> The "Requests per second" is about 6000/s, close to the non-offload 
>> >> solution.
>> >> 
>> >> 
>> >>   "flow-limit" is dynamically balanced in udpif_revalidator; it is 
>> >> adjusted based on OVS load (tied to the revalidation "duration").  In the 
>> >> revalidate function, when the number of flows is greater than twice the 
>> >> "flow-limit", all flows are deleted at once; when the number of flows is 
>> >> greater than the "flow-limit", the aging time is reduced to 0.1 s, slowly 
>> >> deleting flows.
>> >> 
>> >> 
>> >>  
>> >>  I found that the reason for the poor performance is that when the 
>> >> number of flows in the datapath increases and the processing power of OVS 
>> >> decreases, a large number of flow deletions are generated. 
>> >>  As we know, in the hardware offloading scenario, although there are 
>> >> a lot of flows, apart from the first packet there is no need to 
>> >> process subsequent packets. 
>> >>  In my opinion, the dynamic balance mechanism is very necessary, but 
>> >> we need to increase the value of "duration", or provide new switches 
>> >> for high-performance scenarios such as hardware offloading.
>> >>  Do we still need to restrict the number of flows so strictly? By the 
>> >> way, do you have another solution to resolve this?
>> >
>> >It's been a long time since I worked on this, but I recall two reasons
>> >for the flow limit.  First, each flow takes up memory.  Second, each
>> >flow must be revalidated periodically, meaning that it uses CPU as
>> >well.
>> >
>> >I don't, off-hand, remember the real reasons why the logic for deleting
>> >flows works as it does.  It might be in the comments or the commit
>> >messages.  But, I suspect, it is because above the flow-limit we want to
>> >try to reduce the amount of memory and CPU time dedicated to the cache
>> >and, if we arrive at twice the flow limit, we conclude that that try
>> >failed and that we must have a large number of very short flows so that
>> >caching is not very valuable anyhow.
>> >
>> >In a hardware offload scenario, we get rid of some costs (the cost of
>> >processing and forwarding packets and perhaps the memory cost in the
>> >datapath) but we still have the cost of revalidating them.  When there
>> >are many flows, we add the extra cost of balancing flows between
>> >software and the offload hardware.
>> >
>> >Because of the remaining cost and the added ones when there is hardware
>> >offload, it's not obvious to me that we can stop limiting the number of
>> >flows.  I think that experimentation and measurements would be needed.
>> >Perhaps this would be an adjustment to the dynamic algorithm, rather
>> >than a removal of it.
>> 
>> 
>> I think we can increase the initial `flow_limit` in udpif_create; 1 is a 
>> small number for current servers and OSes, and if 'duration' is small we 
>> should increase faster, by a larger number than `flow_limit += 1000;`.
>> I have no better idea for this situation. Do you have any suggestions? I am 
>> very glad to make this change.
>
>What kind of number are you thinking about?  I'd like to come up with a
>rationale for choosing it.  It might be even better to come up with an
>algorithm or a heuristic for choosing it.
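For reference, the dynamic balance described earlier in the thread can be sketched roughly as follows. This is a simplified model of the behavior discussed (grow by 1000 when revalidation is fast, shrink when it is slow, flush at twice the limit), not the actual udpif_revalidator code; the exact thresholds in ofproto/ofproto-dpif-upcall.c may differ.

```python
def adjust_flow_limit(flow_limit, duration_ms, max_limit=200000):
    """Adjust the flow limit based on how long the last revalidation took.
    Thresholds are illustrative, mirroring the behavior discussed above."""
    if duration_ms > 2000:
        # Revalidation is badly behind: shrink proportionally to the overrun.
        flow_limit //= max(duration_ms // 1000, 1)
    elif duration_ms > 1300:
        # Somewhat behind: back off gently.
        flow_limit = flow_limit * 3 // 4
    elif duration_ms < 1000:
        # Healthy: grow by a fixed step (the `flow_limit += 1000` in question).
        flow_limit += 1000
    # Clamp to a sane range.
    return max(1000, min(flow_limit, max_limit))

def flow_policy(n_flows, flow_limit):
    """Deletion policy described in the thread."""
    if n_flows > 2 * flow_limit:
        return "flush-all"    # delete all datapath flows
    if n_flows > flow_limit:
        return "idle-100ms"   # age flows out aggressively (0.1 s idle time)
    return "idle-normal"      # default idle timeout
```

With a model like this it is easy to experiment with a larger step or a higher ceiling before touching the C code.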


I think we could set the initial value to 200,000 and adjust the increase to 
20,000 each time.  Can you describe the algorithm or rationale you mentioned 
in more detail?
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] EXTERNAL: Re: Unable to add flows - Operation not permitted

2021-05-05 Thread Seshadri, Usha
Thanks for your response, Ben. Since the kernel datapath requires root, would 
using DPDK solve this problem? Can OVS-DPDK run as non-root?

Thanks,
Usha


-Original Message-
From: Ben Pfaff  
Sent: Wednesday, May 5, 2021 7:25 PM
To: Seshadri, Usha (US) 
Cc: ovs-discuss@openvswitch.org
Subject: EXTERNAL: Re: [ovs-discuss] Unable to add flows - Operation not 
permitted

On Wed, May 05, 2021 at 07:38:45PM +, Seshadri, Usha wrote:
>   1.  I am trying to add flows by executing the following command on the CLI 
> as a non-root user, but I see 'Operation not permitted' errors in the log 
> file as provided below:

[...]

> 2021-05-05T16:05:15.278Z|00012|ofproto_dpif|ERR|failed to open 
> datapath of type system: Operation not permitted 
> 2021-05-05T16:05:15.278Z|00013|ofproto|ERR|failed to open datapath 
> br0: Operation not permitted 
> 2021-05-05T16:05:15.278Z|00014|bridge|ERR|failed to create bridge br0: 
> Operation not permitted

I guess that you are using the OVS datapath that uses the Linux kernel module.  
Ordinarily, this does require root.  People who work with containers a lot (not 
me) might know some workaround.

>   1.  Running the command again says the bridge already exists.
> 
> ovs-vsctl add-br br0
> ovs-vsctl: cannot create a bridge named br0 because a bridge named br0 
> already exists

Yes.  ovs-vsctl just modifies the database, which already has an entry for the 
bridge.  OVS tries to configure the system to look like the database, but it 
doesn't succeed because it doesn't have the right permissions.

> It appears I may be running into a permissions issue. The owner + group 
> permissions are identical, owned by root. The user in OpenShift belongs to 
> the root group. Does OVS need to run as root? Any help with this is greatly 
> appreciated.

I can't help with this part, but maybe someone else can.


Re: [ovs-discuss] Unable to add flows - Operation not permitted

2021-05-05 Thread Ben Pfaff
On Wed, May 05, 2021 at 07:38:45PM +, Seshadri, Usha wrote:
>   1.  I am trying to add flows by executing the following command on the CLI 
> as a non-root user, but I see 'Operation not permitted' errors in the log 
> file as provided below:

[...]

> 2021-05-05T16:05:15.278Z|00012|ofproto_dpif|ERR|failed to open datapath of 
> type system: Operation not permitted
> 2021-05-05T16:05:15.278Z|00013|ofproto|ERR|failed to open datapath br0: 
> Operation not permitted
> 2021-05-05T16:05:15.278Z|00014|bridge|ERR|failed to create bridge br0: 
> Operation not permitted

I guess that you are using the OVS datapath that uses the Linux kernel
module.  Ordinarily, this does require root.  People who work with
containers a lot (not me) might know some workaround.
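For what it's worth, one common approach is to grant the container just the capabilities the kernel datapath needs, rather than full root. This is a sketch, not a verified OpenShift recipe; `my-ovs-image` is a placeholder, and your platform's syntax for capabilities may differ:

```shell
# NET_ADMIN covers the netlink/datapath operations that fail above;
# SYS_NICE addresses the "nice: cannot set niceness" warnings.
docker run --cap-add=NET_ADMIN --cap-add=SYS_NICE my-ovs-image
```

On OpenShift the equivalent would be a SecurityContextConstraints allowing those capabilities for the pod.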

>   1.  Running the command again says the bridge already exists.
> 
> ovs-vsctl add-br br0
> ovs-vsctl: cannot create a bridge named br0 because a bridge named br0 
> already exists

Yes.  ovs-vsctl just modifies the database, which already has an entry
for the bridge.  OVS tries to configure the system to look like the
database, but it doesn't succeed because it doesn't have the right
permissions.

> It appears I may be running into a permissions issue. The owner + group 
> permissions are identical, owned by root. The user in OpenShift belongs to 
> the root group. Does OVS need to run as root? Any help with this is greatly 
> appreciated.

I can't help with this part, but maybe someone else can.


Re: [ovs-discuss] Cannot create gtp tunnel

2021-05-05 Thread Ben Pfaff
On Tue, Apr 27, 2021 at 10:28:10AM +0430, Ash Ash wrote:
> The release docs says
> GTPU tunnel is only supported on userspace datapath. Does it mean that I
> have to use netdev as datapath_type and ovs-vsctl add-br br0 won't work?

The userspace datapath uses datapath_type netdev, yes.

"add-br" is still a valid way to add such a datapath; you just have to
set the datapath_type afterward.
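For example (a sketch; `br0` is illustrative), the two steps can even be combined into one ovs-vsctl transaction:

```shell
# Create the bridge and switch it to the userspace (netdev) datapath
# in a single transaction:
ovs-vsctl add-br br0 -- set Bridge br0 datapath_type=netdev
```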


[ovs-discuss] Unable to add flows - Operation not permitted

2021-05-05 Thread Seshadri, Usha
Hello,

I am a newbie to OVS. I am trying to explore adding flows on the command line 
and running into 'Operation not permitted' errors.
Setup:

  1.  Docker image: Base CentOS 8 image + openvswitch binaries via dnf install
  2.  Image from step 1 deployed on OpenShift.
  3.  Start OVS via ovs-ctl as a non-root user using the 'ovs-ctl start' command 
on the CLI and the output from the command is as given below. I can see 
ovsdb-server and ovs-vswitchd are successfully running via the 'ps' command.
ovs-ctl start

/etc/openvswitch/conf.db does not exist ... (warning).
Creating empty database /etc/openvswitch/conf.db [  OK  ]
nice: cannot set niceness: Permission denied
Starting ovsdb-server [  OK  ]
system ID not configured, please use --system-id ... failed!
Configuring Open vSwitch system IDs [  OK  ]
nice: cannot set niceness: Permission denied
Starting ovs-vswitchd [  OK  ]
Enabling remote OVSDB managers [  OK  ]


  1.  I am trying to add flows by executing the following command on the CLI as 
a non-root user, but I see 'Operation not permitted' errors in the log file as 
provided below:
ovs-vsctl add-br br0
ovs-vsctl: Error detected while setting up 'br0'.  See ovs-vswitchd log for 
details.
ovs-vsctl: The default log directory is "/var/log/openvswitch".

cat /var/log/openvswitch/ovs-vswitchd.log
2021-05-05T14:44:19.191Z|1|vlog|INFO|opened log file 
/var/log/openvswitch/ovs-vswitchd.log
2021-05-05T14:44:19.192Z|2|vswitchd|ERR|mlockall failed: Cannot allocate 
memory
2021-05-05T14:44:19.193Z|3|ovs_numa|INFO|Discovered 8 CPU cores on NUMA 
node 0
2021-05-05T14:44:19.193Z|4|ovs_numa|INFO|Discovered 1 NUMA nodes and 8 CPU 
cores
2021-05-05T14:44:19.194Z|5|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
 connecting...
2021-05-05T14:44:19.195Z|6|netlink_socket|INFO|netlink: could not enable 
listening to all nsid (Operation not permitted)
2021-05-05T14:44:19.196Z|7|reconnect|INFO|unix:/var/run/openvswitch/db.sock:
 connected
2021-05-05T14:44:19.199Z|8|dpif_netlink|INFO|The kernel module does not 
support meters.
2021-05-05T14:44:19.201Z|9|bridge|INFO|ovs-vswitchd (Open vSwitch) 2.12.0
2021-05-05T16:05:15.276Z|00010|memory|INFO|2964 kB peak resident set size after 
4856.1 seconds
2021-05-05T16:05:15.277Z|00011|dpif|WARN|failed to create datapath ovs-system: 
Operation not permitted
2021-05-05T16:05:15.278Z|00012|ofproto_dpif|ERR|failed to open datapath of type 
system: Operation not permitted
2021-05-05T16:05:15.278Z|00013|ofproto|ERR|failed to open datapath br0: 
Operation not permitted
2021-05-05T16:05:15.278Z|00014|bridge|ERR|failed to create bridge br0: 
Operation not permitted


  1.  Running the command again says the bridge already exists.

ovs-vsctl add-br br0
ovs-vsctl: cannot create a bridge named br0 because a bridge named br0 already 
exists

It appears I may be running into a permissions issue. The owner + group 
permissions are identical, owned by root. The user in OpenShift belongs to the 
root group. Does OVS need to run as root? Any help with this is greatly 
appreciated.


Thanks,
Usha



Re: [ovs-discuss] Discussion on the logical rationality of flow-limit

2021-05-05 Thread Ben Pfaff
On Fri, Apr 30, 2021 at 06:10:43PM +0800, taoyunupt wrote:
> 
> 
> 
> At 2021-04-29 06:39:11, "Ben Pfaff"  wrote:
> >On Wed, Apr 28, 2021 at 08:12:06PM +0800, taoyunupt wrote:
> >> Hi,
> >>  Recently I encountered a TCP connection performance problem; the test 
> >> tool is Apache Benchmark.
> >>  The OVS in my environment is set up for a hardware offload solution. 
> >> The "Requests per second" is about 6000/s, close to the non-offload 
> >> solution.
> >> 
> >> 
> >>   "flow-limit" is dynamically balanced in udpif_revalidator; it is 
> >> adjusted based on OVS load (tied to the revalidation "duration").  In the 
> >> revalidate function, when the number of flows is greater than twice the 
> >> "flow-limit", all flows are deleted at once; when the number of flows is 
> >> greater than the "flow-limit", the aging time is reduced to 0.1 s, slowly 
> >> deleting flows.
> >> 
> >> 
> >>  
> >>  I found that the reason for the poor performance is that when the 
> >> number of flows in the datapath increases and the processing power of OVS 
> >> decreases, a large number of flow deletions are generated. 
> >>  As we know, in the hardware offloading scenario, although there are a 
> >> lot of flows, apart from the first packet there is no need to 
> >> process subsequent packets. 
> >>  In my opinion, the dynamic balance mechanism is very necessary, but 
> >> we need to increase the value of "duration", or provide new switches 
> >> for high-performance scenarios such as hardware offloading.
> >>  Do we still need to restrict the number of flows so strictly? By the 
> >> way, do you have another solution to resolve this?
> >
> >It's been a long time since I worked on this, but I recall two reasons
> >for the flow limit.  First, each flow takes up memory.  Second, each
> >flow must be revalidated periodically, meaning that it uses CPU as
> >well.
> >
> >I don't, off-hand, remember the real reasons why the logic for deleting
> >flows works as it does.  It might be in the comments or the commit
> >messages.  But, I suspect, it is because above the flow-limit we want to
> >try to reduce the amount of memory and CPU time dedicated to the cache
> >and, if we arrive at twice the flow limit, we conclude that that try
> >failed and that we must have a large number of very short flows so that
> >caching is not very valuable anyhow.
> >
> >In a hardware offload scenario, we get rid of some costs (the cost of
> >processing and forwarding packets and perhaps the memory cost in the
> >datapath) but we still have the cost of revalidating them.  When there
> >are many flows, we add the extra cost of balancing flows between
> >software and the offload hardware.
> >
> >Because of the remaining cost and the added ones when there is hardware
> >offload, it's not obvious to me that we can stop limiting the number of
> >flows.  I think that experimentation and measurements would be needed.
> >Perhaps this would be an adjustment to the dynamic algorithm, rather
> >than a removal of it.
> 
> 
> I think we can increase the initial `flow_limit` in udpif_create; 1 is a 
> small number for current servers and OSes, and if 'duration' is small we 
> should increase faster, by a larger number than `flow_limit += 1000;`.
> I have no better idea for this situation. Do you have any suggestions? I am 
> very glad to make this change.

What kind of number are you thinking about?  I'd like to come up with a
rationale for choosing it.  It might be even better to come up with an
algorithm or a heuristic for choosing it.
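As an aside for anyone experimenting with this: if I recall correctly, the ceiling on the flow limit is already configurable through the database, which makes it easy to measure the effect of a larger value without patching the code (200000 below is just the figure proposed in this thread):

```shell
# Raise the datapath flow cache ceiling and confirm the setting:
ovs-vsctl set Open_vSwitch . other_config:flow-limit=200000
ovs-vsctl get Open_vSwitch . other_config:flow-limit
```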


[ovs-discuss] OVN does not work with vlans when CX5 does UDP tx checksum offload on OEL 7.7 (RHEL 7.7 based) / OEL 7.9 (RHEL 7.9) based

2021-05-05 Thread Brendan Doyle

Folks,

I had posted a question to this alias a while back with the subject:
 "TCP tunnel traffic stops working when move from RHEL 7.7 to 7.9"

I finally got to the bottom of this and discovered that the issue is 
with UDP checksum offload when the underlay is in a VLAN, which seems 
to break OVN. Is this a known issue?


To cut to the chase I got things working with the following command on 
each chassis:


*ethtool --offload genev_sys_6081 rx on tx off*

When I looked at the tcpdumps on the underlay NIC I noticed that on the 
old, working OS (OEL 7.7, RHEL 7.7 based) the outer UDP packet always had 
"[udp sum ok]", meaning that the OS was doing the checksum, whereas on the 
new, broken OS (OEL 7.9, RHEL 7.9 based) the first few packets had 
"[bad udp cksum"; those packets got through, but the next few had 
"[udp sum ok]" and those did not get through to the other chassis across 
the tunnel. Oddly, when I removed the VLAN, things worked with no ethtool 
changes; it only broke when there was a VLAN in the mix. Then, after much 
trial and error with ethtool settings on the NIC, the VIF, ovs-system, 

and finally *genev_sys_6081*, I got it to work.
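For anyone hitting the same thing, the offload state can be inspected and changed per device (`-k` shows the current settings; `-K` and `--offload` are equivalent set forms; the interface name is the geneve device from above):

```shell
# Show current checksum-offload settings on the tunnel interface:
ethtool -k genev_sys_6081 | grep checksum
# Disable tx checksum offload while keeping rx on (the workaround found above):
ethtool -K genev_sys_6081 rx on tx off
```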

Seems like a bit of a performance limitation that OVN does not work with 
NIC checksum offload?


Brendan


On 29/04/2021 10:54, Brendan Doyle wrote:

Hi Folks,

In a very basic OVN config, where I have two VMs on different chassis:

switch 7b89d593-05f3-41a7-a246-8dade975df48 (ls_vcn1)
    port a6a358c5-5db4-49c7-b68a-3a7429161ab4
    addresses: ["52:54:00:71:ad:a0 192.16.1.5"]
    port b6c5ef1a-acd9-4053-9986-88e1a6a12b81
    addresses: ["52:54:00:40:8f:dc 192.16.1.6"]

When I upgrade the chassis from OEL 7.7 (RHEL 7.7 based) to OEL 7.9 
(RHEL 7.9) based, then
TCP traffic stops working, ping and UDP are fine. When I look at 
tcpdump of the traffic on both
chassis, I see the initial handshake encapsulated traffic being sent 
and revived on both nodes.
The initial TCP handshake seems to get through on the sender and it 
sends the first data packet
but the receive side does  not get the data packets and keeps sending 
the initial handshake ack

(see traces below).

I think it's something to do with TCP checksum or some other NIC offload? 
The NICs are CX5s.

Just wondering has anyone come across this?

Thanks

Brendan


Sender
-
98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 
132: (tos 0x0, ttl 64, id 29694, offset 0, flags [DF], proto UDP (17), 
length 118)
    253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xfc99 -> 
0xa576!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
[class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
00010002]
    52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 
(0x0800), length 74: (tos 0x0, ttl 64, id 61068, offset 0, flags [DF], 
proto TCP (6), length 60)
    192.16.1.6.38900 > 192.16.1.5.22: Flags [S], cksum 0x0a2b 
(correct), seq 3225335796, win 27200, options [mss 1360,sackOK,TS val 
1242625918 ecr 0,nop,wscale 7], length 0


98:03:9b:89:21:5a > 98:03:9b:89:21:e2, ethertype IPv4 (0x0800), length 
132: (tos 0x0, ttl 64, id 5167, offset 0, flags [DF], proto UDP (17), 
length 118)
    253.255.0.18.28454 > 253.255.0.21.6081: [udp sum ok] Geneve, Flags 
[C], vni 0x1, proto TEB (0x6558), options [class Open Virtual 
Networking (OVN) (0x102) type 0x80(C) len 8 data 00020001]
    52:54:00:71:ad:a0 > 52:54:00:40:8f:dc, ethertype IPv4 
(0x0800), length 74: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], 
proto TCP (6), length 60)
    192.16.1.5.22 > 192.16.1.6.38900: Flags [S.], cksum 0xb82f 
(correct), seq 3217262113, ack 3225335797, win 26960, options [mss 
1360,sackOK,TS val 3343009202 ecr 1242625918,nop,wscale 7], length 0


98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 
124: (tos 0x0, ttl 64, id 29695, offset 0, flags [DF], proto UDP (17), 
length 110)
    253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa57e -> 
0x723d!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
[class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
00010002]
    52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 
(0x0800), length 66: (tos 0x0, ttl 64, id 61069, offset 0, flags [DF], 
proto TCP (6), length 52)
    192.16.1.6.38900 > 192.16.1.5.22: Flags [.], cksum 0x8252 
(incorrect -> 0x4f11), seq 1, ack 1, win 213, options [nop,nop,TS val 
1242625920 ecr 3343009202], length 0


98:03:9b:89:21:e2 > 98:03:9b:89:21:5a, ethertype IPv4 (0x0800), length 
145: (tos 0x0, ttl 64, id 29696, offset 0, flags [DF], proto UDP (17), 
length 131)
    253.255.0.21.62384 > 253.255.0.18.6081: [bad udp cksum 0xa569 -> 
0xae4d!] Geneve, Flags [C], vni 0x1, proto TEB (0x6558), options 
[class Open Virtual Networking (OVN) (0x102) type 0x80(C) len 8 data 
00010002]
    52:54:00:40:8f:dc > 52:54:00:71:ad:a0, ethertype IPv4 
(0x0800), length 87: (tos 0x0, ttl 64, id 61070, offset 0, flags [DF], 
proto TCP (6), length 73)