On 12/12/2017 03:26 PM, Kevin Lin wrote:
Hi again,
We’re trying to scale up our OVN deployment and we’re seeing some
worrying log messages.
The topology is 32 containers connected to another 32 containers on 10
different ports. This is running on 17 machines (one machine runs
ovn-northd and ovsdb-server, the other 16 run ovn-controller,
ovs-vswitchd, and ovsdb-server). We’re using an address set for the
source group, but not the destination group. We’re also creating a
different ACL for each port. So the ACLs look like:
One address set for { container1, container2, … container32 }
addressSet -> container1 on port 80
addressSet -> container1 on port 81
…
addressSet -> container1 on port 90
addressSet -> container2 on port 80
…
addressSet -> container32 on port 90
The ovn-controller log:
2017-12-12T20:14:49Z|11878|timeval|WARN|Unreasonably long 1843ms poll
interval (1840ms user, 0ms system)
2017-12-12T20:14:49Z|11879|timeval|WARN|disk: 0 reads, 16 writes
2017-12-12T20:14:49Z|11880|timeval|WARN|context switches: 0 voluntary,
21 involuntary
2017-12-12T20:14:49Z|11881|poll_loop|DBG|wakeup due to [POLLIN] on fd 9
(172.31.11.193:48460<->172.31.2.181:6640) at lib/stream-fd.c:157 (36%
CPU usage)
2017-12-12T20:14:49Z|11882|poll_loop|DBG|wakeup due to [POLLIN] on fd 12
(<->/var/run/openvswitch/br-int.mgmt) at lib/stream-fd.c:157 (36% CPU usage)
2017-12-12T20:14:49Z|11883|jsonrpc|DBG|tcp:172.31.2.181:6640: received
reply, result=[], id="echo"
2017-12-12T20:14:49Z|11884|netlink_socket|DBG|nl_sock_transact_multiple__ (Success):
nl(len:36, type=38(family-defined), flags=9[REQUEST][ECHO], seq=b11,
pid=2268452876
2017-12-12T20:14:49Z|11885|netlink_socket|DBG|nl_sock_recv__ (Success):
nl(len:136, type=36(family-defined), flags=0, seq=b11, pid=2268452876
2017-12-12T20:14:49Z|11886|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt:
received: OFPT_ECHO_REQUEST (OF1.3) (xid=0x0): 0 bytes of payload
2017-12-12T20:14:49Z|11887|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt:
sent (Success): OFPT_ECHO_REPLY (OF1.3) (xid=0x0): 0 bytes of payload
2017-12-12T20:14:51Z|11888|timeval|WARN|Unreasonably long 1851ms poll
interval (1844ms user, 8ms system)
2017-12-12T20:14:51Z|11889|timeval|WARN|context switches: 0 voluntary,
11 involuntary
2017-12-12T20:14:52Z|11890|poll_loop|DBG|wakeup due to [POLLIN] on fd 9
(172.31.11.193:48460<->172.31.2.181:6640) at lib/stream-fd.c:157 (73%
CPU usage)
2017-12-12T20:14:52Z|11891|jsonrpc|DBG|tcp:172.31.2.181:6640: received
request, method="echo", params=[], id="echo"
2017-12-12T20:14:52Z|11892|jsonrpc|DBG|tcp:172.31.2.181:6640: send
reply, result=[], id="echo"
2017-12-12T20:14:52Z|11893|netlink_socket|DBG|nl_sock_transact_multiple__ (Success):
nl(len:36, type=38(family-defined), flags=9[REQUEST][ECHO], seq=b12,
pid=2268452876
2017-12-12T20:14:52Z|11894|netlink_socket|DBG|nl_sock_recv__ (Success):
nl(len:136, type=36(family-defined), flags=0, seq=b12, pid=2268452876
2017-12-12T20:14:52Z|11895|netdev_linux|DBG|Dropped 18 log messages in
last 56 seconds (most recently, 3 seconds ago) due to excessive rate
2017-12-12T20:14:52Z|11896|netdev_linux|DBG|unknown qdisc "mq"
2017-12-12T20:14:54Z|11897|hmap|DBG|Dropped 15511 log messages in last 6
seconds (most recently, 0 seconds ago) due to excessive rate
2017-12-12T20:14:54Z|11898|hmap|DBG|ovn/lib/expr.c:2644: 6 nodes in
bucket (128 nodes, 64 buckets)
2017-12-12T20:14:54Z|11899|timeval|WARN|Unreasonably long 1831ms poll
interval (1828ms user, 4ms system)
2017-12-12T20:14:54Z|11900|timeval|WARN|context switches: 0 voluntary,
12 involuntary
The log messages show up continuously. The logs appear even when the
network isn’t being used.
I poked around with Ethan Jackson and he noted that the hmap counters
seem unusually high:
root@ip-172-31-11-193:/# ovs-appctl -t ovn-controller coverage/show
Event coverage, avg rate over last: 5 seconds, last minute, last hour,
hash=d6ee5804:
hmap_pathological 2323.6/sec 2662.467/sec 2514.0069/sec
total: 9407536
hmap_expand 3596.8/sec 4121.283/sec 3890.8833/sec
total: 14604479
txn_unchanged 0.8/sec 0.917/sec 0.8658/sec
total: 5659
txn_incomplete 0.0/sec 0.000/sec 0.0008/sec
total: 33
txn_success 0.0/sec 0.000/sec 0.0006/sec
total: 24
poll_create_node 2.4/sec 2.750/sec 2.5986/sec
total: 18218
poll_zero_timeout 0.0/sec 0.000/sec 0.0100/sec
total: 71
rconn_queued 0.0/sec 0.050/sec 0.0531/sec
total: 252570
rconn_sent 0.0/sec 0.050/sec 0.0531/sec
total: 252570
seq_change 1.2/sec 1.383/sec 1.2992/sec
total: 8500
pstream_open 0.0/sec 0.000/sec 0.0000/sec
total: 1
stream_open 0.0/sec 0.000/sec 0.0000/sec
total: 6
unixctl_received 0.0/sec 0.000/sec 0.0019/sec
total: 7
unixctl_replied 0.0/sec 0.000/sec 0.0019/sec
total: 7
util_xalloc 2731550.2/sec 3129900.483/sec 569276.5414/sec
total: 11057381035
vconn_open 0.0/sec 0.000/sec 0.0000/sec
total: 4
vconn_received 0.0/sec 0.050/sec 0.0444/sec
total: 201
vconn_sent 0.0/sec 0.000/sec 0.0144/sec
total: 253535
netdev_get_ifindex 0.4/sec 0.467/sec 0.4328/sec
total: 2822
netlink_received 0.4/sec 0.467/sec 0.4328/sec
total: 2822
netlink_sent 0.4/sec 0.467/sec 0.4328/sec
total: 2822
cmap_expand 0.0/sec 0.000/sec 0.0000/sec
total: 2
47 events never hit
I’ve also attached the output of ovs-bugtool run from the machine
running ovn-northd, and one of the machines running ovn-controller and
ovs-vswitchd.
Thanks,
—Kevin
Hi Kevin,
In short, I agree with Ethan's assessment that the hmap numbers appear
large. The ACLs, combined with ovn-controller's algorithms, are causing
that. The ACLs result in ~250000 openflow flows being generated.
ovn-controller re-creates this flow table from scratch each time it runs
its main loop.
These are unsolved problems as yet in ovn-controller. The two high-level
thoughts on the matter are:
1) See if we can reduce the number of generated flows when ACLs are used.
2) See if we can perform incremental changes on the flow table rather
than re-creating it from scratch every time.
There are two commits that may be of interest to you.
* As Miguel mentioned in his response, Russell Bryant submitted a change
[1] within the past couple of months to help towards (1) a bit. The idea
was that if we can detect that a flow will never be needed on a specific
machine, then don't generate it. It's not going to cover everything, but
it could potentially help you. That change is present in 2.7.3 and 2.8.1
* A change [2] by Han Zhou fixed an issue where we were generating the
flow table and then just throwing it away since vswitchd was not ready
for us to submit a new flow table to it. This change is present in 2.7.3
and 2.8.0.
These two commits aren't silver bullets but if you're running a version
of OVS that does not have these commits in them already, you may see
some improvement with an upgrade.
Mark
[1]
https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338847.html
[2] https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/331286.html
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss