When doing scale testing for OVN (using https://github.com/openvswitch/ovn-scale-test), we had some interesting findings, and need some help here.
We ran the test "create and bind lports" against branch 2.9 and branch 2.6, and we found that 2.6 was must faster. With some analysis, we found out the reason is not because of OVN gets slower in 2.9, but because the bottleneck of this test in branch 2.9 is ovs-vswitchd. The testing was run in an environment with 20 farm nodes, each has 50 sandbox HVs (I will just mention them as HVs in short). Before the test, there are already 9500 lports bound in 950 HVs on 19 farm nodes. The test run against the last farm node to bind the lport on the 50 HVs there. The steps in the test scenario are: 1. Create 5 new LSs in NB (so that the LSs will not be shared with any of HVs on other farm nodes) 2. create 100 lports in NB on a LS 3. bind these lports on HVs, 2 for each HV. They are bound sequentially on each HV, and for each HV the 2 ports are bound using one command together: ovs-vsctl add-port <port 1> -- set Interface external-ids:... -- add-port <port 2> -- set Interface external-ids:... (the script didn't set type to internal, but I hope it is not an issue for this test). 4. wait the port stated changed to up in NB for all the 100 lports (with a single ovn-nbctl command) These steps are repeated for 5 times, one for each LS. So in the end we got 500 more lports created and bound (the total scale is then 1k HVs and 10k lports). When running with 2.6, the ovn-controllers were taking most of the CPU time. However, with 2.9, the CPU of ovn-controllers spikes but there is always ovs-vswitchd on the top with 100% CPU. It means the ovs-vswitchd is the bottleneck in this testing. There is only one ovs-vswitchd with 100% at the same time and different ovs-vswitchd will spike one after another, since the ports are bound sequentially on each HV. From the rally log, each 2 ports binding takes around 4 - 5 seconds. This is just the ovs-vsctl command execution time. The 100% CPU of ovs-vswitchd explains the slowness. So, based on this result, we can not using the total time to evaluate the efficiency of OVN, instead we can evaluate by CPU cost of ovn-controller processes. In fact, 2.9 ovn-controller costs around 70% less CPU than 2.6, which I think is due to some optimization we made earlier. (With my work-in-progress patch it saves much more, and I will post later as RFC). However, I cannot explain why ovs-vswitchd is getting slower than 2.6 when doing port-binding. We need expert suggestions here, for what could be the possible reason of this slowness. We can do more testing with different versions between 2.6 and 2.9 to find out related change, but with some pointers it might save some effort. Below are some logs of ovs-vswitchd when port binding is happening: ====== 2018-02-07T00:12:54.558Z|01767|bridge|INFO|bridge br-int: added interface lport_bc65cd_QFOU3v on port 1028 2018-02-07T00:12:55.629Z|01768|timeval|WARN|Unreasonably long 1112ms poll interval (1016ms user, 4ms system) 2018-02-07T00:12:55.629Z|01769|timeval|WARN|faults: 336 minor, 0 major 2018-02-07T00:12:55.629Z|01770|timeval|WARN|context switches: 0 voluntary, 13 involuntary 2018-02-07T00:12:55.629Z|01771|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=b256889c: 2018-02-07T00:12:55.629Z|01772|coverage|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0056/sec total: 29 2018-02-07T00:12:55.629Z|01773|coverage|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:55.629Z|01774|coverage|INFO|ofproto_packet_out 0.0/sec 0.000/sec 0.0111/sec total: 90 2018-02-07T00:12:55.629Z|01775|coverage|INFO|ofproto_recv_openflow 0.2/sec 0.033/sec 0.4858/sec total: 6673 2018-02-07T00:12:55.629Z|01776|coverage|INFO|ofproto_update_port 0.0/sec 0.000/sec 5.5883/sec total: 28258 2018-02-07T00:12:55.629Z|01777|coverage|INFO|rev_reconfigure 0.0/sec 0.000/sec 0.0056/sec total: 32 2018-02-07T00:12:55.629Z|01778|coverage|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0011/sec total: 6 2018-02-07T00:12:55.629Z|01779|coverage|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0078/sec total: 124 2018-02-07T00:12:55.629Z|01780|coverage|INFO|rev_mac_learning 0.0/sec 0.000/sec 0.0036/sec total: 13 2018-02-07T00:12:55.629Z|01781|coverage|INFO|xlate_actions 0.0/sec 0.000/sec 0.0111/sec total: 90 2018-02-07T00:12:55.629Z|01782|coverage|INFO|ccmap_expand 0.0/sec 0.000/sec 0.0050/sec total: 36 2018-02-07T00:12:55.629Z|01783|coverage|INFO|ccmap_shrink 0.0/sec 0.000/sec 0.0000/sec total: 49 2018-02-07T00:12:55.629Z|01784|coverage|INFO|cmap_expand 0.0/sec 0.000/sec 0.0217/sec total: 257 2018-02-07T00:12:55.629Z|01785|coverage|INFO|cmap_shrink 0.0/sec 0.000/sec 0.0000/sec total: 141 2018-02-07T00:12:55.629Z|01786|coverage|INFO|dpif_port_add 0.0/sec 0.000/sec 0.0022/sec total: 1020 2018-02-07T00:12:55.629Z|01787|coverage|INFO|dpif_port_del 0.0/sec 0.000/sec 0.0000/sec total: 40 2018-02-07T00:12:55.629Z|01788|coverage|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0000/sec total: 2 2018-02-07T00:12:55.629Z|01789|coverage|INFO|dpif_flow_get 0.0/sec 0.000/sec 0.0000/sec total: 20 2018-02-07T00:12:55.629Z|01790|coverage|INFO|dpif_flow_put 0.0/sec 0.000/sec 0.0000/sec total: 21 2018-02-07T00:12:55.629Z|01791|coverage|INFO|dpif_flow_del 0.0/sec 0.000/sec 0.0000/sec total: 20 2018-02-07T00:12:55.629Z|01792|coverage|INFO|dpif_execute 0.0/sec 0.000/sec 0.0100/sec total: 71 2018-02-07T00:12:55.629Z|01793|coverage|INFO|flow_extract 0.0/sec 0.000/sec 0.0000/sec total: 4 2018-02-07T00:12:55.629Z|01794|coverage|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.4361/sec total: 6435 2018-02-07T00:12:55.629Z|01795|coverage|INFO|hindex_pathological 0.0/sec 0.000/sec 0.0000/sec total: 17 2018-02-07T00:12:55.629Z|01796|coverage|INFO|hindex_expand 0.0/sec 0.000/sec 0.0000/sec total: 9 2018-02-07T00:12:55.629Z|01797|coverage|INFO|hmap_pathological 0.0/sec 0.000/sec 0.8131/sec total: 13752 2018-02-07T00:12:55.630Z|01798|coverage|INFO|hmap_expand 21.0/sec 18.333/sec 11253.0167/sec total: 55905271 2018-02-07T00:12:55.630Z|01799|coverage|INFO|hmap_shrink 0.0/sec 0.000/sec 0.0022/sec total: 1029 2018-02-07T00:12:55.630Z|01800|coverage|INFO|mac_learning_learned 0.0/sec 0.000/sec 0.0022/sec total: 18 2018-02-07T00:12:55.630Z|01801|coverage|INFO|mac_learning_expired 0.0/sec 0.000/sec 0.0017/sec total: 6 2018-02-07T00:12:55.630Z|01802|coverage|INFO|netdev_sent 0.0/sec 0.000/sec 0.0400/sec total: 350 2018-02-07T00:12:55.630Z|01803|coverage|INFO|netdev_get_stats 203.4/sec 203.400/sec 198.4978/sec total: 798522 2018-02-07T00:12:55.630Z|01804|coverage|INFO|txn_unchanged 0.2/sec 0.200/sec 0.1911/sec total: 776 2018-02-07T00:12:55.630Z|01805|coverage|INFO|txn_incomplete 0.0/sec 0.000/sec 0.0167/sec total: 106 2018-02-07T00:12:55.630Z|01806|coverage|INFO|txn_success 0.0/sec 0.000/sec 0.0111/sec total: 48 2018-02-07T00:12:55.630Z|01807|coverage|INFO|poll_create_node 162.6/sec 157.267/sec 156.3664/sec total: 636465 2018-02-07T00:12:55.630Z|01808|coverage|INFO|poll_zero_timeout 0.2/sec 0.250/sec 0.4564/sec total: 1936 2018-02-07T00:12:55.630Z|01809|coverage|INFO|rconn_queued 0.2/sec 0.033/sec 0.0453/sec total: 1214 2018-02-07T00:12:55.630Z|01810|coverage|INFO|rconn_sent 0.2/sec 0.033/sec 0.0453/sec total: 1214 2018-02-07T00:12:55.630Z|01811|coverage|INFO|seq_change 1098.8/sec 1099.167/sec 1097.4869/sec total: 4468823 2018-02-07T00:12:55.630Z|01812|coverage|INFO|pstream_open 0.0/sec 0.000/sec 0.0000/sec total: 5 2018-02-07T00:12:55.630Z|01813|coverage|INFO|stream_open 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:55.630Z|01814|coverage|INFO|util_xalloc 7223.8/sec 6527.317/sec 51569.9900/sec total: 251999059 2018-02-07T00:12:55.630Z|01815|coverage|INFO|vconn_received 0.2/sec 0.033/sec 0.4858/sec total: 6675 2018-02-07T00:12:55.630Z|01816|coverage|INFO|vconn_sent 0.2/sec 0.033/sec 0.0453/sec total: 1216 2018-02-07T00:12:55.630Z|01817|coverage|INFO|netlink_received 0.0/sec 0.000/sec 0.0000/sec total: 3 2018-02-07T00:12:55.630Z|01818|coverage|INFO|netlink_sent 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:55.630Z|01819|coverage|INFO|57 events never hit 2018-02-07T00:12:56.761Z|01820|timeval|WARN|Unreasonably long 1112ms poll interval (948ms user, 4ms system) 2018-02-07T00:12:56.761Z|01821|timeval|WARN|faults: 7 minor, 0 major 2018-02-07T00:12:56.761Z|01822|timeval|WARN|context switches: 2 voluntary, 17 involuntary 2018-02-07T00:12:56.761Z|01823|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=fcc57021: 2018-02-07T00:12:56.761Z|01824|coverage|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0056/sec total: 30 2018-02-07T00:12:56.761Z|01825|coverage|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:56.761Z|01826|coverage|INFO|ofproto_packet_out 0.0/sec 0.000/sec 0.0111/sec total: 90 2018-02-07T00:12:56.761Z|01827|coverage|INFO|ofproto_recv_openflow 0.2/sec 0.033/sec 0.4858/sec total: 6683 2018-02-07T00:12:56.761Z|01828|coverage|INFO|ofproto_update_port 0.0/sec 0.000/sec 5.5883/sec total: 29268 2018-02-07T00:12:56.761Z|01829|coverage|INFO|rev_reconfigure 0.0/sec 0.000/sec 0.0056/sec total: 33 2018-02-07T00:12:56.761Z|01830|coverage|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0011/sec total: 6 2018-02-07T00:12:56.761Z|01831|coverage|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0078/sec total: 124 2018-02-07T00:12:56.761Z|01832|coverage|INFO|rev_mac_learning 0.0/sec 0.000/sec 0.0036/sec total: 13 2018-02-07T00:12:56.761Z|01833|coverage|INFO|xlate_actions 0.0/sec 0.000/sec 0.0111/sec total: 90 2018-02-07T00:12:56.761Z|01834|coverage|INFO|ccmap_expand 0.0/sec 0.000/sec 0.0050/sec total: 36 2018-02-07T00:12:56.761Z|01835|coverage|INFO|ccmap_shrink 0.0/sec 0.000/sec 0.0000/sec total: 49 2018-02-07T00:12:56.761Z|01836|coverage|INFO|cmap_expand 0.0/sec 0.000/sec 0.0217/sec total: 257 2018-02-07T00:12:56.761Z|01837|coverage|INFO|cmap_shrink 0.0/sec 0.000/sec 0.0000/sec total: 141 2018-02-07T00:12:56.761Z|01838|coverage|INFO|dpif_port_add 0.0/sec 0.000/sec 0.0022/sec total: 1020 2018-02-07T00:12:56.761Z|01839|coverage|INFO|dpif_port_del 0.0/sec 0.000/sec 0.0000/sec total: 40 2018-02-07T00:12:56.761Z|01840|coverage|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0000/sec total: 2 2018-02-07T00:12:56.761Z|01841|coverage|INFO|dpif_flow_get 0.0/sec 0.000/sec 0.0000/sec total: 20 2018-02-07T00:12:56.761Z|01842|coverage|INFO|dpif_flow_put 0.0/sec 0.000/sec 0.0000/sec total: 21 2018-02-07T00:12:56.761Z|01843|coverage|INFO|dpif_flow_del 0.0/sec 0.000/sec 0.0000/sec total: 20 2018-02-07T00:12:56.761Z|01844|coverage|INFO|dpif_execute 0.0/sec 0.000/sec 0.0100/sec total: 71 2018-02-07T00:12:56.761Z|01845|coverage|INFO|flow_extract 0.0/sec 0.000/sec 0.0000/sec total: 4 2018-02-07T00:12:56.761Z|01846|coverage|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.4361/sec total: 6440 2018-02-07T00:12:56.761Z|01847|coverage|INFO|hindex_pathological 0.0/sec 0.000/sec 0.0000/sec total: 17 2018-02-07T00:12:56.761Z|01848|coverage|INFO|hindex_expand 0.0/sec 0.000/sec 0.0000/sec total: 9 2018-02-07T00:12:56.761Z|01849|coverage|INFO|hmap_pathological 0.0/sec 0.000/sec 0.8131/sec total: 13825 2018-02-07T00:12:56.761Z|01850|coverage|INFO|hmap_expand 21.0/sec 18.333/sec 11253.0167/sec total: 57949607 2018-02-07T00:12:56.761Z|01851|coverage|INFO|hmap_shrink 0.0/sec 0.000/sec 0.0022/sec total: 1029 2018-02-07T00:12:56.761Z|01852|coverage|INFO|mac_learning_learned 0.0/sec 0.000/sec 0.0022/sec total: 18 2018-02-07T00:12:56.761Z|01853|coverage|INFO|mac_learning_expired 0.0/sec 0.000/sec 0.0017/sec total: 6 2018-02-07T00:12:56.761Z|01854|coverage|INFO|netdev_sent 0.0/sec 0.000/sec 0.0400/sec total: 350 2018-02-07T00:12:56.761Z|01855|coverage|INFO|netdev_get_stats 203.4/sec 203.400/sec 198.4978/sec total: 799540 2018-02-07T00:12:56.761Z|01856|coverage|INFO|txn_unchanged 0.2/sec 0.200/sec 0.1911/sec total: 777 2018-02-07T00:12:56.761Z|01857|coverage|INFO|txn_incomplete 0.0/sec 0.000/sec 0.0167/sec total: 108 2018-02-07T00:12:56.761Z|01858|coverage|INFO|txn_success 0.0/sec 0.000/sec 0.0111/sec total: 49 2018-02-07T00:12:56.761Z|01859|coverage|INFO|poll_create_node 162.6/sec 157.267/sec 156.3664/sec total: 636681 2018-02-07T00:12:56.761Z|01860|coverage|INFO|poll_zero_timeout 0.2/sec 0.250/sec 0.4564/sec total: 1937 2018-02-07T00:12:56.761Z|01861|coverage|INFO|rconn_queued 0.2/sec 0.033/sec 0.0453/sec total: 1216 2018-02-07T00:12:56.761Z|01862|coverage|INFO|rconn_sent 0.2/sec 0.033/sec 0.0453/sec total: 1216 2018-02-07T00:12:56.761Z|01863|coverage|INFO|seq_change 1098.8/sec 1099.167/sec 1097.4869/sec total: 4471434 2018-02-07T00:12:56.761Z|01864|coverage|INFO|pstream_open 0.0/sec 0.000/sec 0.0000/sec total: 5 2018-02-07T00:12:56.761Z|01865|coverage|INFO|stream_open 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:56.761Z|01866|coverage|INFO|util_xalloc 7223.8/sec 6527.317/sec 51569.9900/sec total: 260206006 2018-02-07T00:12:56.761Z|01867|coverage|INFO|vconn_received 0.2/sec 0.033/sec 0.4858/sec total: 6685 2018-02-07T00:12:56.761Z|01868|coverage|INFO|vconn_sent 0.2/sec 0.033/sec 0.0453/sec total: 1218 2018-02-07T00:12:56.761Z|01869|coverage|INFO|netlink_received 0.0/sec 0.000/sec 0.0000/sec total: 3 2018-02-07T00:12:56.761Z|01870|coverage|INFO|netlink_sent 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:56.761Z|01871|coverage|INFO|57 events never hit 2018-02-07T00:12:57.054Z|01872|bridge|INFO|bridge br-int: added interface lport_bc65cd_lykjnm on port 1029 2018-02-07T00:12:58.202Z|01873|timeval|WARN|Unreasonably long 1179ms poll interval (1032ms user, 0ms system) 2018-02-07T00:12:58.202Z|01874|timeval|WARN|faults: 41 minor, 0 major 2018-02-07T00:12:58.202Z|01875|timeval|WARN|context switches: 0 voluntary, 19 involuntary 2018-02-07T00:12:58.203Z|01876|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=668121b0: 2018-02-07T00:12:58.203Z|01877|coverage|INFO|bridge_reconfigure 0.0/sec 0.000/sec 0.0056/sec total: 31 2018-02-07T00:12:58.203Z|01878|coverage|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:58.203Z|01879|coverage|INFO|ofproto_packet_out 0.0/sec 0.000/sec 0.0111/sec total: 90 2018-02-07T00:12:58.203Z|01880|coverage|INFO|ofproto_recv_openflow 0.2/sec 0.033/sec 0.4858/sec total: 6733 2018-02-07T00:12:58.203Z|01881|coverage|INFO|ofproto_update_port 0.0/sec 0.000/sec 5.5883/sec total: 30279 2018-02-07T00:12:58.203Z|01882|coverage|INFO|rev_reconfigure 0.0/sec 0.000/sec 0.0056/sec total: 34 2018-02-07T00:12:58.203Z|01883|coverage|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0011/sec total: 6 2018-02-07T00:12:58.203Z|01884|coverage|INFO|rev_flow_table 0.0/sec 0.000/sec 0.0078/sec total: 125 2018-02-07T00:12:58.203Z|01885|coverage|INFO|rev_mac_learning 0.0/sec 0.000/sec 0.0036/sec total: 13 2018-02-07T00:12:58.203Z|01886|coverage|INFO|xlate_actions 0.0/sec 0.000/sec 0.0111/sec total: 90 2018-02-07T00:12:58.203Z|01887|coverage|INFO|ccmap_expand 0.0/sec 0.000/sec 0.0050/sec total: 36 2018-02-07T00:12:58.203Z|01888|coverage|INFO|ccmap_shrink 0.0/sec 0.000/sec 0.0000/sec total: 49 2018-02-07T00:12:58.203Z|01889|coverage|INFO|cmap_expand 0.0/sec 0.000/sec 0.0217/sec total: 260 2018-02-07T00:12:58.203Z|01890|coverage|INFO|cmap_shrink 0.0/sec 0.000/sec 0.0000/sec total: 141 2018-02-07T00:12:58.203Z|01891|coverage|INFO|dpif_port_add 0.0/sec 0.000/sec 0.0022/sec total: 1021 2018-02-07T00:12:58.203Z|01892|coverage|INFO|dpif_port_del 0.0/sec 0.000/sec 0.0000/sec total: 40 2018-02-07T00:12:58.203Z|01893|coverage|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0000/sec total: 2 2018-02-07T00:12:58.203Z|01894|coverage|INFO|dpif_flow_get 0.0/sec 0.000/sec 0.0000/sec total: 20 2018-02-07T00:12:58.203Z|01895|coverage|INFO|dpif_flow_put 0.0/sec 0.000/sec 0.0000/sec total: 21 2018-02-07T00:12:58.203Z|01896|coverage|INFO|dpif_flow_del 0.0/sec 0.000/sec 0.0000/sec total: 20 2018-02-07T00:12:58.203Z|01897|coverage|INFO|dpif_execute 0.0/sec 0.000/sec 0.0100/sec total: 71 2018-02-07T00:12:58.203Z|01898|coverage|INFO|flow_extract 0.0/sec 0.000/sec 0.0000/sec total: 4 2018-02-07T00:12:58.203Z|01899|coverage|INFO|miniflow_malloc 0.0/sec 0.000/sec 0.4361/sec total: 6489 2018-02-07T00:12:58.203Z|01900|coverage|INFO|hindex_pathological 0.0/sec 0.000/sec 0.0000/sec total: 17 2018-02-07T00:12:58.203Z|01901|coverage|INFO|hindex_expand 0.0/sec 0.000/sec 0.0000/sec total: 9 2018-02-07T00:12:58.203Z|01902|coverage|INFO|hmap_pathological 0.0/sec 0.000/sec 0.8131/sec total: 13945 2018-02-07T00:12:58.203Z|01903|coverage|INFO|hmap_expand 21.0/sec 18.333/sec 11253.0167/sec total: 59998044 2018-02-07T00:12:58.203Z|01904|coverage|INFO|hmap_shrink 0.0/sec 0.000/sec 0.0022/sec total: 1030 2018-02-07T00:12:58.203Z|01905|coverage|INFO|mac_learning_learned 0.0/sec 0.000/sec 0.0022/sec total: 18 2018-02-07T00:12:58.203Z|01906|coverage|INFO|mac_learning_expired 0.0/sec 0.000/sec 0.0017/sec total: 6 2018-02-07T00:12:58.203Z|01907|coverage|INFO|netdev_sent 0.0/sec 0.000/sec 0.0400/sec total: 350 2018-02-07T00:12:58.203Z|01908|coverage|INFO|netdev_get_stats 203.4/sec 203.400/sec 198.4978/sec total: 799541 2018-02-07T00:12:58.203Z|01909|coverage|INFO|txn_unchanged 0.2/sec 0.200/sec 0.1911/sec total: 777 2018-02-07T00:12:58.203Z|01910|coverage|INFO|txn_incomplete 0.0/sec 0.000/sec 0.0167/sec total: 110 2018-02-07T00:12:58.203Z|01911|coverage|INFO|txn_success 0.0/sec 0.000/sec 0.0111/sec total: 50 2018-02-07T00:12:58.203Z|01912|coverage|INFO|poll_create_node 162.6/sec 157.267/sec 156.3664/sec total: 636974 2018-02-07T00:12:58.203Z|01913|coverage|INFO|poll_zero_timeout 0.2/sec 0.250/sec 0.4564/sec total: 1939 2018-02-07T00:12:58.203Z|01914|coverage|INFO|rconn_queued 0.2/sec 0.033/sec 0.0453/sec total: 1217 2018-02-07T00:12:58.203Z|01915|coverage|INFO|rconn_sent 0.2/sec 0.033/sec 0.0453/sec total: 1217 2018-02-07T00:12:58.203Z|01916|coverage|INFO|seq_change 1098.8/sec 1099.167/sec 1097.4869/sec total: 4474605 2018-02-07T00:12:58.203Z|01917|coverage|INFO|pstream_open 0.0/sec 0.000/sec 0.0000/sec total: 5 2018-02-07T00:12:58.203Z|01918|coverage|INFO|stream_open 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:58.203Z|01919|coverage|INFO|util_xalloc 7223.8/sec 6527.317/sec 51569.9900/sec total: 268435993 2018-02-07T00:12:58.203Z|01920|coverage|INFO|vconn_received 0.2/sec 0.033/sec 0.4858/sec total: 6735 2018-02-07T00:12:58.203Z|01921|coverage|INFO|vconn_sent 0.2/sec 0.033/sec 0.0453/sec total: 1219 2018-02-07T00:12:58.203Z|01922|coverage|INFO|netlink_received 0.0/sec 0.000/sec 0.0000/sec total: 3 2018-02-07T00:12:58.203Z|01923|coverage|INFO|netlink_sent 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:12:58.203Z|01924|coverage|INFO|57 events never hit 2018-02-07T00:12:58.221Z|01925|bridge|INFO|bridge br-int: added interface patch-br-int-to-provnet-lswitch_bc65cd_Y7sqEw on port 1030 2018-02-07T00:12:58.221Z|01926|bridge|INFO|bridge br0: added interface patch-provnet-lswitch_bc65cd_Y7sqEw-to-br-int on port 15 2018-02-07T00:12:59.313Z|01927|timeval|WARN|Unreasonably long 1111ms poll interval (1092ms user, 0ms system) 2018-02-07T00:12:59.313Z|01928|timeval|WARN|faults: 45 minor, 0 major 2018-02-07T00:12:59.313Z|01929|timeval|WARN|context switches: 6 voluntary, 59 involuntary 2018-02-07T00:12:59.313Z|01930|coverage|INFO|Skipping details of duplicate event coverage for hash=668121b0 2018-02-07T00:13:00.430Z|01931|timeval|WARN|Unreasonably long 1117ms poll interval (1112ms user, 0ms system) 2018-02-07T00:13:00.430Z|01932|timeval|WARN|faults: 34 minor, 0 major 2018-02-07T00:13:00.430Z|01933|timeval|WARN|context switches: 2 voluntary, 11 involuntary 2018-02-07T00:13:00.431Z|01934|coverage|INFO|Event coverage, avg rate over last: 5 seconds, last minute, last hour, hash=a612a6e0: 2018-02-07T00:13:00.431Z|01935|coverage|INFO|bridge_reconfigure 0.8/sec 0.067/sec 0.0067/sec total: 33 2018-02-07T00:13:00.431Z|01936|coverage|INFO|ofproto_flush 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:13:00.431Z|01937|coverage|INFO|ofproto_packet_out 0.2/sec 0.017/sec 0.0114/sec total: 92 2018-02-07T00:13:00.431Z|01938|coverage|INFO|ofproto_recv_openflow 32.2/sec 2.717/sec 0.5306/sec total: 6935 2018-02-07T00:13:00.431Z|01939|coverage|INFO|ofproto_update_port 808.8/sec 67.400/sec 6.7117/sec total: 32303 2018-02-07T00:13:00.431Z|01940|coverage|INFO|rev_reconfigure 0.8/sec 0.067/sec 0.0067/sec total: 36 2018-02-07T00:13:00.431Z|01941|coverage|INFO|rev_port_toggled 0.0/sec 0.000/sec 0.0011/sec total: 7 2018-02-07T00:13:00.431Z|01942|coverage|INFO|rev_flow_table 0.4/sec 0.033/sec 0.0083/sec total: 126 2018-02-07T00:13:00.431Z|01943|coverage|INFO|rev_mac_learning 0.0/sec 0.000/sec 0.0036/sec total: 13 2018-02-07T00:13:00.431Z|01944|coverage|INFO|xlate_actions 0.2/sec 0.017/sec 0.0114/sec total: 92 2018-02-07T00:13:00.431Z|01945|coverage|INFO|ccmap_expand 0.0/sec 0.000/sec 0.0050/sec total: 36 2018-02-07T00:13:00.431Z|01946|coverage|INFO|ccmap_shrink 0.0/sec 0.000/sec 0.0000/sec total: 49 2018-02-07T00:13:00.431Z|01947|coverage|INFO|cmap_expand 2.0/sec 0.167/sec 0.0244/sec total: 271 2018-02-07T00:13:00.431Z|01948|coverage|INFO|cmap_shrink 0.0/sec 0.000/sec 0.0000/sec total: 141 2018-02-07T00:13:00.431Z|01949|coverage|INFO|dpif_port_add 0.4/sec 0.033/sec 0.0028/sec total: 1021 2018-02-07T00:13:00.431Z|01950|coverage|INFO|dpif_port_del 0.0/sec 0.000/sec 0.0000/sec total: 40 2018-02-07T00:13:00.431Z|01951|coverage|INFO|dpif_flow_flush 0.0/sec 0.000/sec 0.0000/sec total: 2 2018-02-07T00:13:00.431Z|01952|coverage|INFO|dpif_flow_get 0.0/sec 0.000/sec 0.0000/sec total: 20 2018-02-07T00:13:00.431Z|01953|coverage|INFO|dpif_flow_put 0.0/sec 0.000/sec 0.0000/sec total: 21 2018-02-07T00:13:00.431Z|01954|coverage|INFO|dpif_flow_del 0.0/sec 0.000/sec 0.0000/sec total: 20 2018-02-07T00:13:00.431Z|01955|coverage|INFO|dpif_execute 0.0/sec 0.000/sec 0.0100/sec total: 72 2018-02-07T00:13:00.431Z|01956|coverage|INFO|flow_extract 0.0/sec 0.000/sec 0.0000/sec total: 4 2018-02-07T00:13:00.431Z|01957|coverage|INFO|miniflow_malloc 30.8/sec 2.567/sec 0.4789/sec total: 6689 2018-02-07T00:13:00.431Z|01958|coverage|INFO|hindex_pathological 0.0/sec 0.000/sec 0.0000/sec total: 17 2018-02-07T00:13:00.431Z|01959|coverage|INFO|hindex_expand 0.0/sec 0.000/sec 0.0000/sec total: 9 2018-02-07T00:13:00.431Z|01960|coverage|INFO|hmap_pathological 76.8/sec 6.400/sec 0.9197/sec total: 14149 2018-02-07T00:13:00.431Z|01961|coverage|INFO|hmap_expand 1637509.8/sec 136476.267/sec 13527.3358/sec total: 64098893 2018-02-07T00:13:00.431Z|01962|coverage|INFO|hmap_shrink 0.4/sec 0.033/sec 0.0028/sec total: 1030 2018-02-07T00:13:00.431Z|01963|coverage|INFO|mac_learning_learned 0.0/sec 0.000/sec 0.0022/sec total: 19 2018-02-07T00:13:00.431Z|01964|coverage|INFO|mac_learning_expired 0.0/sec 0.000/sec 0.0017/sec total: 6 2018-02-07T00:13:00.431Z|01965|coverage|INFO|netdev_sent 0.0/sec 0.000/sec 0.0400/sec total: 359 2018-02-07T00:13:00.431Z|01966|coverage|INFO|netdev_get_stats 204.4/sec 203.483/sec 198.7817/sec total: 799543 2018-02-07T00:13:00.431Z|01967|coverage|INFO|txn_unchanged 0.2/sec 0.200/sec 0.1914/sec total: 777 2018-02-07T00:13:00.431Z|01968|coverage|INFO|txn_incomplete 1.4/sec 0.117/sec 0.0186/sec total: 113 2018-02-07T00:13:00.431Z|01969|coverage|INFO|txn_success 0.6/sec 0.050/sec 0.0119/sec total: 51 2018-02-07T00:13:00.431Z|01970|coverage|INFO|poll_create_node 241.8/sec 164.833/sec 156.7022/sec total: 637792 2018-02-07T00:13:00.431Z|01971|coverage|INFO|poll_zero_timeout 1.8/sec 0.367/sec 0.4589/sec total: 1944 2018-02-07T00:13:00.431Z|01972|coverage|INFO|rconn_queued 1.2/sec 0.133/sec 0.0469/sec total: 1219 2018-02-07T00:13:00.431Z|01973|coverage|INFO|rconn_sent 1.2/sec 0.133/sec 0.0469/sec total: 1219 2018-02-07T00:13:00.431Z|01974|coverage|INFO|seq_change 2679.2/sec 1230.183/sec 1101.2081/sec total: 4481090 2018-02-07T00:13:00.431Z|01975|coverage|INFO|pstream_open 0.0/sec 0.000/sec 0.0000/sec total: 5 2018-02-07T00:13:00.431Z|01976|coverage|INFO|stream_open 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:13:00.431Z|01977|coverage|INFO|util_xalloc 6573732.0/sec 553875.117/sec 60700.1733/sec total: 284890464 2018-02-07T00:13:00.431Z|01978|coverage|INFO|vconn_received 32.2/sec 2.717/sec 0.5306/sec total: 6937 2018-02-07T00:13:00.431Z|01979|coverage|INFO|vconn_sent 1.2/sec 0.133/sec 0.0469/sec total: 1221 2018-02-07T00:13:00.431Z|01980|coverage|INFO|netlink_received 0.0/sec 0.000/sec 0.0000/sec total: 3 2018-02-07T00:13:00.431Z|01981|coverage|INFO|netlink_sent 0.0/sec 0.000/sec 0.0000/sec total: 1 2018-02-07T00:13:00.431Z|01982|coverage|INFO|57 events never hit 2018-02-07T00:13:00.431Z|01983|poll_loop|INFO|Dropped 6 log messages in last 217 seconds (most recently, 216 seconds ago) due to excessive rate 2018-02-07T00:13:00.431Z|01984|poll_loop|INFO|wakeup due to [POLLIN] on fd 112 (/root/sandbox-192.168.64.0/br-int.mgmt<->) at ../lib/stream-fd.c:157 (90% CPU usage) I managed to capture a perf record when a ovs-vswitchd is spiking, not sure if it helps: Samples: 6K of event 'cycles:pp', Event count (approx.): 118132588562209 Children Self Command Shared Object Symbol ◆ + 66.01% 0.00% ovs-vswitchd ovs-vswitchd [.] _start ▒ + 66.01% 0.00% ovs-vswitchd libc-2.23.so [.] __libc_start_main ▒ + 66.01% 0.00% ovs-vswitchd ovs-vswitchd [.] main ▒ + 66.01% 0.00% ovs-vswitchd ovs-vswitchd [.] bridge_run ▒ + 66.01% 0.00% ovs-vswitchd ovs-vswitchd [.] bridge_run__ ▒ + 41.88% 0.00% ovs-vswitchd ovs-vswitchd [.] ofproto_type_run ▒ + 41.88% 0.00% ovs-vswitchd ovs-vswitchd [.] type_run ▒ + 24.13% 0.00% ovs-vswitchd ovs-vswitchd [.] bridge_reconfigure ▒ + 24.13% 0.00% ovs-vswitchd ovs-vswitchd [.] ofproto_run ▒ + 24.13% 0.00% ovs-vswitchd ovs-vswitchd [.] update_port ▒ + 23.31% 0.00% ovs-vswitchd ovs-vswitchd [.] update_mtu ▒ + 23.31% 0.00% ovs-vswitchd ovs-vswitchd [.] update_mtu_ofproto ▒ + 23.31% 0.00% ovs-vswitchd ovs-vswitchd [.] ofport_is_mtu_overridden ▒ + 23.27% 0.00% ovs-vswitchd ovs-vswitchd [.] xlate_txn_start ▒ + 23.27% 0.00% ovs-vswitchd ovs-vswitchd [.] xlate_xport_copy ▒ + 23.27% 0.00% ovs-vswitchd libjemalloc.so.1 [.] calloc ▒ + 23.27% 0.00% ovs-vswitchd ovs-vswitchd [.] xcalloc ▒ + 23.27% 23.27% ovs-vswitchd libc-2.23.so [.] 0x000000000008f294 ▒ + 22.38% 0.00% ovs-vswitchd ovs-vswitchd [.] ofproto_class_find__ ▒ + 22.38% 0.00% ovs-vswitchd ovs-vswitchd [.] ofproto_port_open_type ▒ + 20.96% 0.00% ovs-vswitchd ovs-vswitchd [.] dp_enumerate_types ▒ + 20.14% 20.14% ovs-vswitchd libjemalloc.so.1 [.] malloc ▒ + 19.43% 0.81% ovs-vswitchd ovs-vswitchd [.] ovs_mutex_lock_at ▒ + 19.33% 0.00% ovs-vswitchd ovs-vswitchd [.] sset_add__ ▒ + 19.33% 0.00% ovs-vswitchd ovs-vswitchd [.] xmalloc ▒ + 18.62% 18.62% ovs-vswitchd libpthread-2.23.so [.] pthread_mutex_lock ▒ + 18.61% 0.00% ovs-vswitchd ovs-vswitchd [.] dpif_netdev_run ▒ + 18.61% 0.00% ovs-vswitchd ovs-vswitchd [.] dp_netdev_process_rxq_port ▒ + 18.61% 0.00% ovs-vswitchd ovs-vswitchd [.] netdev_rxq_recv ▒ + 18.61% 0.00% ovs-vswitchd ovs-vswitchd [.] netdev_dummy_rxq_recv ▒ + 13.03% 0.00% revalidator66 [kernel.kallsyms] [k] entry_SYSCALL_64_fastpath ▒ + 13.03% 0.00% revalidator66 libc-2.23.so [.] __clone ▒ + 13.03% 0.00% revalidator66 libpthread-2.23.so [.] start_thread ▒ Thanks, Han
_______________________________________________ discuss mailing list disc...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-discuss