You're right that my patch was a bit of a short-term hack. It addressed a
particular case where we can easily discard flows that we know could never
match (because of the logical port included in the match).

Use of conjunctive matches is a better, more general solution to the
explosion of flows when using address sets, but that hasn't been worked on
yet.
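To make the conjunctive-match idea concrete: an ACL that pairs N sources
with M destinations expands today into N*M OpenFlow flows, while
conjunction() needs only N+M+1. A minimal hand-rolled sketch (the bridge
name br0, the addresses, and the drop verdict are made up for the example;
in OVN, ovn-controller would have to generate these itself):

  # Dimension 1 of 2: one flow per source address.
  ovs-ofctl add-flow br0 'priority=100,ip,nw_src=10.0.0.1,actions=conjunction(1,1/2)'
  ovs-ofctl add-flow br0 'priority=100,ip,nw_src=10.0.0.2,actions=conjunction(1,1/2)'
  # Dimension 2 of 2: one flow per destination address.
  ovs-ofctl add-flow br0 'priority=100,ip,nw_dst=10.0.1.1,actions=conjunction(1,2/2)'
  ovs-ofctl add-flow br0 'priority=100,ip,nw_dst=10.0.1.2,actions=conjunction(1,2/2)'
  # Fires only when a packet matched one flow in every dimension.
  ovs-ofctl add-flow br0 'priority=100,conj_id=1,actions=drop'

The missing piece is teaching ovn-controller's expression-to-flows code to
emit conjunction() when it expands address sets, instead of the cross
product.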
On Wed, Dec 13, 2017 at 5:00 PM, Kevin Lin <ke...@kelda.io> wrote:
> Thanks for the replies!
>
> We're using v2.8.1.
>
> I don't completely understand Russell's patch, but I don't think our
> ACLs were taking advantage of it. Do the ACLs need to be "tagged" with
> port information in order for it to be useful?
>
> Before, our ACLs were in terms of L3 and above. I brought the number of
> flows down from 252981 to 16224 by modifying the ACLs to include the
> logical port. For example:
>
>   from-lport: (inport == "1" || inport == "2" || inport == "3") && (ip4.dst == $addressSet)
>   to-lport:   (ip4.src == $addressSet) && (outport == "1" || outport == "2" || outport == "3")
>
> Is that the right way to take advantage of Russell's patch? It explodes
> the address set on one side of the connection, but it does decrease the
> number of flows installed on each vswitchd.
>
> Even with the decrease in flows in vswitchd, I'm still seeing the log
> messages. Do 16224 flows per vswitchd instance and 559 flows in ovn-sb
> sound reasonable?
>
> Thanks,
> —Kevin
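(For concreteness, ACLs of that shape could be created along these lines;
the switch name "sw0", the priority, and the "allow-related" verdict are
placeholders for this sketch. Note the single quotes, so the shell doesn't
expand $addressSet:

  ovn-nbctl acl-add sw0 from-lport 1002 \
      '(inport == "1" || inport == "2" || inport == "3") && ip4.dst == $addressSet' allow-related
  ovn-nbctl acl-add sw0 to-lport 1002 \
      'ip4.src == $addressSet && (outport == "1" || outport == "2" || outport == "3")' allow-related

Matching on inport/outport this way is what gives my patch something to
work with: a flow whose logical port can never be bound on a given chassis
is simply not generated there.)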
> On Dec 13, 2017, at 7:55 AM, Mark Michelson <mmich...@redhat.com> wrote:
>
>> On 12/12/2017 03:26 PM, Kevin Lin wrote:
>>> Hi again,
>>>
>>> We're trying to scale up our OVN deployment and we're seeing some
>>> worrying log messages.
>>>
>>> The topology is 32 containers connected to another 32 containers on 10
>>> different ports. This is running on 17 machines (one machine runs
>>> ovn-northd and ovsdb-server; the other 16 run ovn-controller,
>>> ovs-vswitchd, and ovsdb-server). We're using an address set for the
>>> source group, but not the destination group. We're also creating a
>>> different ACL for each port. So the ACLs look like:
>>>
>>>   One address set for { container1, container2, … container32 }
>>>
>>>   addressSet -> container1 on port 80
>>>   addressSet -> container1 on port 81
>>>   …
>>>   addressSet -> container1 on port 90
>>>   addressSet -> container2 on port 80
>>>   …
>>>   addressSet -> container32 on port 90
>>>
>>> The ovn-controller log:
>>>
>>> 2017-12-12T20:14:49Z|11878|timeval|WARN|Unreasonably long 1843ms poll interval (1840ms user, 0ms system)
>>> 2017-12-12T20:14:49Z|11879|timeval|WARN|disk: 0 reads, 16 writes
>>> 2017-12-12T20:14:49Z|11880|timeval|WARN|context switches: 0 voluntary, 21 involuntary
>>> 2017-12-12T20:14:49Z|11881|poll_loop|DBG|wakeup due to [POLLIN] on fd 9 (172.31.11.193:48460<->172.31.2.181:6640) at lib/stream-fd.c:157 (36% CPU usage)
>>> 2017-12-12T20:14:49Z|11882|poll_loop|DBG|wakeup due to [POLLIN] on fd 12 (<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:157 (36% CPU usage)
>>> 2017-12-12T20:14:49Z|11883|jsonrpc|DBG|tcp:172.31.2.181:6640: received reply, result=[], id="echo"
>>> 2017-12-12T20:14:49Z|11884|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:36, type=38(family-defined), flags=9[REQUEST][ECHO], seq=b11, pid=2268452876
>>> 2017-12-12T20:14:49Z|11885|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:136, type=36(family-defined), flags=0, seq=b11, pid=2268452876
>>> 2017-12-12T20:14:49Z|11886|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt: received: OFPT_ECHO_REQUEST (OF1.3) (xid=0x0): 0 bytes of payload
>>> 2017-12-12T20:14:49Z|11887|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt: sent (Success): OFPT_ECHO_REPLY (OF1.3) (xid=0x0): 0 bytes of payload
>>> 2017-12-12T20:14:51Z|11888|timeval|WARN|Unreasonably long 1851ms poll interval (1844ms user, 8ms system)
>>> 2017-12-12T20:14:51Z|11889|timeval|WARN|context switches: 0 voluntary, 11 involuntary
>>> 2017-12-12T20:14:52Z|11890|poll_loop|DBG|wakeup due to [POLLIN] on fd 9 (172.31.11.193:48460<->172.31.2.181:6640) at lib/stream-fd.c:157 (73% CPU usage)
>>> 2017-12-12T20:14:52Z|11891|jsonrpc|DBG|tcp:172.31.2.181:6640: received request, method="echo", params=[], id="echo"
>>> 2017-12-12T20:14:52Z|11892|jsonrpc|DBG|tcp:172.31.2.181:6640: send reply, result=[], id="echo"
>>> 2017-12-12T20:14:52Z|11893|netlink_socket|DBG|nl_sock_transact_multiple__ (Success): nl(len:36, type=38(family-defined), flags=9[REQUEST][ECHO], seq=b12, pid=2268452876
>>> 2017-12-12T20:14:52Z|11894|netlink_socket|DBG|nl_sock_recv__ (Success): nl(len:136, type=36(family-defined), flags=0, seq=b12, pid=2268452876
>>> 2017-12-12T20:14:52Z|11895|netdev_linux|DBG|Dropped 18 log messages in last 56 seconds (most recently, 3 seconds ago) due to excessive rate
>>> 2017-12-12T20:14:52Z|11896|netdev_linux|DBG|unknown qdisc "mq"
>>> 2017-12-12T20:14:54Z|11897|hmap|DBG|Dropped 15511 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate
>>> 2017-12-12T20:14:54Z|11898|hmap|DBG|ovn/lib/expr.c:2644: 6 nodes in bucket (128 nodes, 64 buckets)
>>> 2017-12-12T20:14:54Z|11899|timeval|WARN|Unreasonably long 1831ms poll interval (1828ms user, 4ms system)
>>> 2017-12-12T20:14:54Z|11900|timeval|WARN|context switches: 0 voluntary, 12 involuntary
>>>
>>> The log messages show up continuously. The logs appear even when the
>>> network isn't being used.
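For scale: 32 sources x 32 destinations x 10 ports is roughly 10,240
logical (source, destination, port) combinations before any further
per-stage expansion, so a six-digit OpenFlow flow count isn't surprising
here. A rough way to gauge it on a chassis (bridge name br-int and the
southbound address tcp:172.31.2.181:6640 taken from your logs; the grep is
approximate):

  ovn-sbctl --db=tcp:172.31.2.181:6640 lflow-list | grep -c 'table='   # logical flows in the SB DB
  ovs-ofctl dump-flows br-int | wc -l                                  # OpenFlow flows actually installed

Those "Unreasonably long ... poll interval" warnings mean single iterations
of ovn-controller's main loop are taking close to two seconds each.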
>>> I poked around with Ethan Jackson, and he noted that the hmap counters
>>> seem unusually high:
>>>
>>> root@ip-172-31-11-193:/# ovs-appctl -t ovn-controller coverage/show
>>> Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=d6ee5804:
>>> hmap_pathological      2323.6/sec    2662.467/sec     2514.0069/sec   total: 9407536
>>> hmap_expand            3596.8/sec    4121.283/sec     3890.8833/sec   total: 14604479
>>> txn_unchanged             0.8/sec       0.917/sec        0.8658/sec   total: 5659
>>> txn_incomplete            0.0/sec       0.000/sec        0.0008/sec   total: 33
>>> txn_success               0.0/sec       0.000/sec        0.0006/sec   total: 24
>>> poll_create_node          2.4/sec       2.750/sec        2.5986/sec   total: 18218
>>> poll_zero_timeout         0.0/sec       0.000/sec        0.0100/sec   total: 71
>>> rconn_queued              0.0/sec       0.050/sec        0.0531/sec   total: 252570
>>> rconn_sent                0.0/sec       0.050/sec        0.0531/sec   total: 252570
>>> seq_change                1.2/sec       1.383/sec        1.2992/sec   total: 8500
>>> pstream_open              0.0/sec       0.000/sec        0.0000/sec   total: 1
>>> stream_open               0.0/sec       0.000/sec        0.0000/sec   total: 6
>>> unixctl_received          0.0/sec       0.000/sec        0.0019/sec   total: 7
>>> unixctl_replied           0.0/sec       0.000/sec        0.0019/sec   total: 7
>>> util_xalloc         2731550.2/sec 3129900.483/sec   569276.5414/sec   total: 11057381035
>>> vconn_open                0.0/sec       0.000/sec        0.0000/sec   total: 4
>>> vconn_received            0.0/sec       0.050/sec        0.0444/sec   total: 201
>>> vconn_sent                0.0/sec       0.000/sec        0.0144/sec   total: 253535
>>> netdev_get_ifindex        0.4/sec       0.467/sec        0.4328/sec   total: 2822
>>> netlink_received          0.4/sec       0.467/sec        0.4328/sec   total: 2822
>>> netlink_sent              0.4/sec       0.467/sec        0.4328/sec   total: 2822
>>> cmap_expand               0.0/sec       0.000/sec        0.0000/sec   total: 2
>>> 47 events never hit
>>>
>>> I've also attached the output of ovs-bugtool run from the machine
>>> running ovn-northd, and from one of the machines running ovn-controller
>>> and ovs-vswitchd.
>>>
>>> Thanks,
>>> —Kevin
>>
>> Hi Kevin,
>>
>> In short, I agree with Ethan's assessment that the hmap numbers appear
>> large. The ACLs, combined with ovn-controller's algorithms, are causing
>> that: the ACLs result in ~250000 OpenFlow flows being generated, and
>> ovn-controller re-creates this flow table from scratch each time it runs
>> its main loop.
>>
>> These are as yet unsolved problems in ovn-controller. The two high-level
>> thoughts on the matter are:
>> 1) See if we can reduce the number of generated flows when ACLs are used.
>> 2) See if we can perform incremental changes on the flow table rather
>>    than re-creating it from scratch every time.
>>
>> There are two commits that may be of interest to you.
>>
>> * As Miguel mentioned in his response, Russell Bryant submitted a change
>>   [1] within the past couple of months that helps a bit with (1). The
>>   idea is that if we can detect that a flow will never be needed on a
>>   specific machine, we don't generate it there. It's not going to cover
>>   everything, but it could potentially help you. That change is present
>>   in 2.7.3 and 2.8.1.
>>
>> * A change [2] by Han Zhou fixed an issue where we were generating the
>>   flow table and then just throwing it away because vswitchd was not
>>   ready for us to submit a new flow table to it. This change is present
>>   in 2.7.3 and 2.8.0.
>>
>> These two commits aren't silver bullets, but if you're running a version
>> of OVS that doesn't already include them, you may see some improvement
>> from an upgrade.
>>
>> Mark
>>
>> [1] https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338847.html
>> [2] https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331286.html
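One more cheap sanity check: confirm that the binaries actually running
contain those two commits, since packaged builds don't always match the
version you think you installed. Per Mark's notes, [1] is in 2.7.3 and
2.8.1, and [2] is in 2.7.3 and 2.8.0:

  ovn-controller --version
  ovs-vswitchd --version

Since you report v2.8.1, both should already be present, in which case the
remaining cost is most likely the non-incremental recomputation Mark
described.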
--
Russell Bryant

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss