On 2/1/22 23:15, Vladislav Odintsov wrote:
> Hi OVN and OVS folks,
> 
> during an OVN upgrade from version 20.06 to 21.12 I’ve hit quite a long 
> datapath downtime (48-50 seconds).
> I suppose this downtime was triggered by ovn-controller installing new 
> OpenFlow rules into OVS.
> There are relatively few flows (13-14k).
> The Open vSwitch version is 2.13.4.
> ovn-openflow-probe-interval is configured to 30.
> 
> The upgrade procedure was:
> 1. Set the external_ids option on OVS that forces matching of the ovn-northd 
> internal version (ovn-match-northd-version=true); a rough sketch of the 
> command is shown right after this list.
> 2. Upgrade ovn-controller (to first get an ovn-controller that supports the 
> version match between controller and northd, so the datapath is not broken 
> after upgrading ovn-northd)
> 3. Upgrade ovn-northd cluster
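> 
> For reference, a rough sketch of the command from step 1 (assuming the 
> setting goes into external_ids of the Open_vSwitch table on each chassis):
> 
>   # require matching ovn-controller/ovn-northd internal versions
>   ovs-vsctl set open_vswitch . external_ids:ovn-match-northd-version=true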
> 
> The network outage timeframe matches the ovs-vswitchd WARN logs. Please see below:
> 
> 2022-02-01T19:44:32.577Z|00035|ovs_rcu(urcu7)|WARN|blocked 1000 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:33.577Z|00036|ovs_rcu(urcu7)|WARN|blocked 2000 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:35.577Z|00037|ovs_rcu(urcu7)|WARN|blocked 4000 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:39.577Z|00038|ovs_rcu(urcu7)|WARN|blocked 8000 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:44.396Z|04063|connmgr|INFO|br-int<->unix#272674: 37 
> flow_mods 13 s ago (36 adds, 1 deletes)
> 2022-02-01T19:44:44.441Z|04064|timeval|WARN|Unreasonably long 12864ms poll 
> interval (0ms user, 12833ms system)
> 2022-02-01T19:44:44.441Z|04065|timeval|WARN|faults: 287 minor, 0 major
> 2022-02-01T19:44:44.441Z|04066|timeval|WARN|context switches: 1 voluntary, 18 
> involuntary
> 2022-02-01T19:44:44.441Z|04067|coverage|INFO|Event coverage, avg rate over 
> last: 5 seconds, last minute, last hour,  hash=9105faf6:
> 2022-02-01T19:44:44.441Z|04068|coverage|INFO|ccmap_expand               
> 0.0/sec     0.000/sec        0.0000/sec   total: 1129
> 2022-02-01T19:44:44.441Z|04069|coverage|INFO|ccmap_shrink               
> 0.0/sec    11.933/sec        0.1989/sec   total: 2363
> 2022-02-01T19:44:44.441Z|04070|coverage|INFO|cmap_expand                
> 0.0/sec     0.833/sec        0.8811/sec   total: 11218023
> 2022-02-01T19:44:44.441Z|04071|coverage|INFO|cmap_shrink                
> 8.2/sec    11.950/sec        1.0736/sec   total: 11217475
> 2022-02-01T19:44:44.441Z|04072|coverage|INFO|dpif_port_add              
> 0.0/sec     0.000/sec        0.0000/sec   total: 229
> 2022-02-01T19:44:44.441Z|04073|coverage|INFO|dpif_port_del              
> 0.0/sec     0.000/sec        0.0000/sec   total: 144
> 2022-02-01T19:44:44.441Z|04074|coverage|INFO|dpif_flow_flush            
> 0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2022-02-01T19:44:44.441Z|04075|coverage|INFO|dpif_flow_get              
> 0.0/sec     0.000/sec        0.0000/sec   total: 23
> 2022-02-01T19:44:44.441Z|04076|coverage|INFO|dpif_flow_put             
> 83.0/sec    97.900/sec       80.8306/sec   total: 849051748
> 2022-02-01T19:44:44.441Z|04077|coverage|INFO|dpif_flow_del            
> 147.8/sec    91.733/sec       78.2797/sec   total: 810797137
> 2022-02-01T19:44:44.441Z|04078|coverage|INFO|dpif_execute              
> 66.6/sec    83.767/sec       87.8397/sec   total: 958697586
> 2022-02-01T19:44:44.441Z|04079|coverage|INFO|dpif_execute_with_help     
> 0.0/sec     4.250/sec        4.9892/sec   total: 70713908
> 2022-02-01T19:44:44.441Z|04080|coverage|INFO|flow_extract             
> 169.0/sec   247.017/sec      333.1194/sec   total: 7990486004
> 2022-02-01T19:44:44.441Z|04081|coverage|INFO|miniflow_malloc            
> 0.0/sec     0.183/sec        0.7983/sec   total: 5871853
> 2022-02-01T19:44:44.441Z|04082|coverage|INFO|hindex_pathological        
> 0.0/sec     0.000/sec        0.0000/sec   total: 78
> 2022-02-01T19:44:44.441Z|04083|coverage|INFO|hindex_expand              
> 0.0/sec     0.000/sec        0.0000/sec   total: 11
> 2022-02-01T19:44:44.441Z|04084|coverage|INFO|hmap_pathological          
> 0.0/sec     0.633/sec        0.9331/sec   total: 16638117
> 2022-02-01T19:44:44.441Z|04085|coverage|INFO|hmap_expand                
> 0.0/sec   114.867/sec      139.8006/sec   total: 2184263485
> 2022-02-01T19:44:44.441Z|04086|coverage|INFO|mac_learning_learned       
> 0.0/sec     0.000/sec        0.0003/sec   total: 299
> 2022-02-01T19:44:44.441Z|04087|coverage|INFO|mac_learning_expired       
> 0.0/sec     0.000/sec        0.0006/sec   total: 117
> 2022-02-01T19:44:44.441Z|04088|coverage|INFO|mac_learning_moved         
> 0.0/sec     0.000/sec        0.0000/sec   total: 87
> 2022-02-01T19:44:44.441Z|04089|coverage|INFO|netdev_get_stats           
> 0.0/sec    36.850/sec       43.3306/sec   total: 690673778
> 2022-02-01T19:44:44.441Z|04090|coverage|INFO|txn_unchanged              
> 0.0/sec     0.100/sec        0.1614/sec   total: 2705864
> 2022-02-01T19:44:44.441Z|04091|coverage|INFO|txn_incomplete             
> 0.0/sec     0.183/sec        0.2239/sec   total: 3688261
> 2022-02-01T19:44:44.441Z|04092|coverage|INFO|txn_success                
> 0.0/sec     0.167/sec        0.1953/sec   total: 3236818
> 2022-02-01T19:44:44.441Z|04093|coverage|INFO|poll_create_node         
> 3930.6/sec  3941.350/sec     5012.2572/sec  total: 115959738162
> 2022-02-01T19:44:44.441Z|04094|coverage|INFO|poll_zero_timeout        
> 159.4/sec   111.417/sec      103.4078/sec   total: 1181551227
> 2022-02-01T19:44:44.441Z|04095|coverage|INFO|rconn_queued               
> 0.0/sec     0.050/sec        0.0817/sec   total: 1279558
> 2022-02-01T19:44:44.441Z|04096|coverage|INFO|rconn_sent                 
> 0.0/sec     0.050/sec        0.0817/sec   total: 1279558
> 2022-02-01T19:44:44.441Z|04097|coverage|INFO|seq_change               
> 2205.2/sec  2502.783/sec     2874.6258/sec  total: 57335390998
> 2022-02-01T19:44:44.441Z|04098|coverage|INFO|pstream_open               
> 0.0/sec     0.000/sec        0.0000/sec   total: 5
> 2022-02-01T19:44:44.441Z|04099|coverage|INFO|stream_open                
> 0.0/sec     0.000/sec        0.0000/sec   total: 1
> 2022-02-01T19:44:44.441Z|04100|coverage|INFO|unixctl_received           
> 0.0/sec     0.000/sec        0.0000/sec   total: 189
> 2022-02-01T19:44:44.441Z|04101|coverage|INFO|unixctl_replied            
> 0.0/sec     0.000/sec        0.0000/sec   total: 189
> 2022-02-01T19:44:44.441Z|04102|coverage|INFO|util_xalloc              
> 7935.8/sec 15234.350/sec    18372.4347/sec  total: 345308271208
> 2022-02-01T19:44:44.441Z|04103|coverage|INFO|vconn_received             
> 0.0/sec     0.083/sec        0.3556/sec   total: 2018930
> 2022-02-01T19:44:44.441Z|04104|coverage|INFO|vconn_sent                 
> 0.0/sec     0.067/sec        0.0983/sec   total: 1552067
> 2022-02-01T19:44:44.441Z|04105|coverage|INFO|netdev_set_policing        
> 0.0/sec     0.000/sec        0.0000/sec   total: 623
> 2022-02-01T19:44:44.441Z|04106|coverage|INFO|netdev_get_ifindex         
> 0.0/sec     0.100/sec        0.1636/sec   total: 2713928
> 2022-02-01T19:44:44.441Z|04107|coverage|INFO|netdev_set_hwaddr          
> 0.0/sec     0.000/sec        0.0000/sec   total: 2
> 2022-02-01T19:44:44.441Z|04108|coverage|INFO|netdev_get_ethtool         
> 0.0/sec     0.000/sec        0.0000/sec   total: 1073
> 2022-02-01T19:44:44.441Z|04109|coverage|INFO|netdev_set_ethtool         
> 0.0/sec     0.000/sec        0.0000/sec   total: 225
> 2022-02-01T19:44:44.441Z|04110|coverage|INFO|nln_changed                
> 0.0/sec     0.050/sec        0.0492/sec   total: 819614
> 2022-02-01T19:44:44.441Z|04111|coverage|INFO|netlink_received         
> 346.6/sec   472.133/sec      575.4597/sec   total: 11288088779
> 2022-02-01T19:44:44.441Z|04112|coverage|INFO|netlink_recv_jumbo        
> 18.4/sec   113.250/sec      175.1189/sec   total: 3935643810
> 2022-02-01T19:44:44.441Z|04113|coverage|INFO|netlink_sent             
> 308.4/sec   378.133/sec      380.4267/sec   total: 4753114969
> 2022-02-01T19:44:44.441Z|04114|coverage|INFO|ofproto_flush              
> 0.0/sec     0.000/sec        0.0000/sec   total: 1
> 2022-02-01T19:44:44.441Z|04115|coverage|INFO|ofproto_recv_openflow      
> 0.0/sec     0.067/sec        0.3389/sec   total: 1746449
> 2022-02-01T19:44:44.441Z|04116|coverage|INFO|ofproto_update_port        
> 0.0/sec     0.000/sec        0.0000/sec   total: 814
> 2022-02-01T19:44:44.441Z|04117|coverage|INFO|packet_in_overflow         
> 0.0/sec     0.000/sec        0.0000/sec   total: 282520
> 2022-02-01T19:44:44.441Z|04118|coverage|INFO|rev_reconfigure            
> 0.0/sec     0.017/sec        0.0325/sec   total: 539904
> 2022-02-01T19:44:44.441Z|04119|coverage|INFO|rev_port_toggled           
> 0.0/sec     0.000/sec        0.0000/sec   total: 27
> 2022-02-01T19:44:44.441Z|04120|coverage|INFO|rev_flow_table             
> 0.0/sec     0.000/sec        0.0106/sec   total: 10876
> 2022-02-01T19:44:44.441Z|04121|coverage|INFO|rev_mac_learning           
> 0.0/sec     0.000/sec        0.0006/sec   total: 312
> 2022-02-01T19:44:44.441Z|04122|coverage|INFO|dumped_duplicate_flow      
> 0.0/sec     0.000/sec        0.0000/sec   total: 39204
> 2022-02-01T19:44:44.441Z|04123|coverage|INFO|dumped_new_flow            
> 0.0/sec     0.000/sec        0.0000/sec   total: 7177
> 2022-02-01T19:44:44.441Z|04124|coverage|INFO|handler_duplicate_upcall   
> 0.4/sec     1.833/sec        2.2664/sec   total: 34762798
> 2022-02-01T19:44:44.441Z|04125|coverage|INFO|upcall_ukey_contention     
> 0.0/sec     0.000/sec        0.0000/sec   total: 9
> 2022-02-01T19:44:44.441Z|04126|coverage|INFO|upcall_ukey_replace        
> 0.2/sec     0.050/sec        0.0447/sec   total: 7927221
> 2022-02-01T19:44:44.441Z|04127|coverage|INFO|revalidate_missed_dp_flow   
> 0.0/sec     0.000/sec        0.0000/sec   total: 1118
> 2022-02-01T19:44:44.441Z|04128|coverage|INFO|xlate_actions            
> 123.6/sec   273.183/sec      233.8275/sec   total: 2110727223
> 2022-02-01T19:44:44.441Z|04129|coverage|INFO|bridge_reconfigure         
> 0.0/sec     0.017/sec        0.0325/sec   total: 539902
> 2022-02-01T19:44:44.441Z|04130|coverage|INFO|74 events never hit
> 2022-02-01T19:44:44.441Z|04131|poll_loop|INFO|Dropped 3 log messages in last 
> 1058229 seconds (most recently, 1058228 seconds ago) due to excessive rate
> 2022-02-01T19:44:44.441Z|04132|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (84% CPU 
> usage)
> 2022-02-01T19:44:45.441Z|00039|ovs_rcu(urcu7)|WARN|blocked 1000 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:46.441Z|00040|ovs_rcu(urcu7)|WARN|blocked 2000 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:48.441Z|00041|ovs_rcu(urcu7)|WARN|blocked 4001 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:52.441Z|00042|ovs_rcu(urcu7)|WARN|blocked 8000 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:55.109Z|04133|timeval|WARN|Unreasonably long 10668ms poll 
> interval (0ms user, 10643ms system)
> 2022-02-01T19:44:55.109Z|04134|timeval|WARN|faults: 70 minor, 0 major
> 2022-02-01T19:44:55.109Z|04135|timeval|WARN|context switches: 0 voluntary, 4 
> involuntary
> 2022-02-01T19:44:55.109Z|04136|coverage|INFO|Skipping details of duplicate 
> event coverage for hash=9105faf6
> 2022-02-01T19:44:55.109Z|04137|poll_loop|INFO|wakeup due to [POLLIN] on fd 15 
> (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (100% CPU usage)
> 2022-02-01T19:44:55.109Z|04138|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:44:56.112Z|00043|ovs_rcu(urcu7)|WARN|blocked 1001 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:57.112Z|00044|ovs_rcu(urcu7)|WARN|blocked 2000 ms waiting 
> for main to quiesce
> 2022-02-01T19:44:59.112Z|00045|ovs_rcu(urcu7)|WARN|blocked 4000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:03.112Z|00046|ovs_rcu(urcu7)|WARN|blocked 8000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:05.249Z|04139|timeval|WARN|Unreasonably long 10140ms poll 
> interval (0ms user, 10115ms system)
> 2022-02-01T19:45:05.249Z|04140|timeval|WARN|faults: 248 minor, 0 major
> 2022-02-01T19:45:05.249Z|04141|timeval|WARN|context switches: 1 voluntary, 27 
> involuntary
> 2022-02-01T19:45:05.249Z|04142|coverage|INFO|Skipping details of duplicate 
> event coverage for hash=9105faf6
> 2022-02-01T19:45:05.249Z|04143|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 
> (NETLINK_ROUTE<->NETLINK_ROUTE) at lib/netlink-socket.c:1401 (100% CPU usage)
> 2022-02-01T19:45:05.249Z|04144|poll_loop|INFO|wakeup due to [POLLOUT] on fd 
> 235 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:153 (100% CPU 
> usage)
> 2022-02-01T19:45:05.249Z|04145|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:06.250Z|00047|ovs_rcu(urcu7)|WARN|blocked 1000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:07.250Z|00048|ovs_rcu(urcu7)|WARN|blocked 2000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:09.250Z|00049|ovs_rcu(urcu7)|WARN|blocked 4000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:13.250Z|00050|ovs_rcu(urcu7)|WARN|blocked 8000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:15.704Z|04146|rconn|WARN|br-int<->unix#272675: connection 
> dropped (Broken pipe)
> 2022-02-01T19:45:16.714Z|00051|ovs_rcu(urcu7)|WARN|blocked 1000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:17.713Z|00052|ovs_rcu(urcu7)|WARN|blocked 2000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:19.714Z|00053|ovs_rcu(urcu7)|WARN|blocked 4000 ms waiting 
> for main to quiesce
> 2022-02-01T19:45:23.014Z|04147|timeval|WARN|Unreasonably long 17764ms poll 
> interval (0ms user, 17720ms system)
> 2022-02-01T19:45:23.014Z|04148|timeval|WARN|faults: 63 minor, 0 major
> 2022-02-01T19:45:23.014Z|04149|timeval|WARN|context switches: 2 voluntary, 24 
> involuntary
> 2022-02-01T19:45:23.014Z|04150|poll_loop|INFO|wakeup due to [POLLIN] on fd 15 
> (<->/var/run/openvswitch/db.sock) at lib/stream-fd.c:157 (100% CPU usage)
> 2022-02-01T19:45:23.014Z|04151|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:23.021Z|04152|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:23.022Z|04153|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:23.022Z|04154|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:23.023Z|04155|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:23.023Z|04156|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:23.023Z|04157|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:23.024Z|04158|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:23.024Z|04159|poll_loop|INFO|wakeup due to [POLLIN] on fd 
> 230 (/var/run/openvswitch/br-int.mgmt<->) at lib/stream-fd.c:157 (100% CPU 
> usage)
> 2022-02-01T19:45:41.274Z|04160|connmgr|INFO|br-int<->unix#272674: 13589 
> flow_mods 18 s ago (13509 adds, 80 modifications)
> 2022-02-01T19:46:31.285Z|04161|rconn|ERR|br-int<->unix#272673: no response to 
> inactivity probe after 60 seconds, disconnecting
> 
> 
> In ovn-controller log:
> 
> 2022-02-01T19:44:31.267Z|00077|features|INFO|unix:/run/openvswitch/br-int.mgmt:
>  connecting to switch
> 2022-02-01T19:44:31.267Z|00078|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: 
> connecting...
> 2022-02-01T19:44:31.267Z|00079|ofctrl|INFO|unix:/run/openvswitch/br-int.mgmt: 
> connecting to switch
> 2022-02-01T19:44:31.267Z|00080|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: 
> connecting...
> 2022-02-01T19:44:31.268Z|00081|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: 
> connected
> 2022-02-01T19:44:31.269Z|00082|rconn|INFO|unix:/run/openvswitch/br-int.mgmt: 
> connected
> 2022-02-01T19:44:31.284Z|00083|main|INFO|OVS feature set changed, force 
> recompute.
> 2022-02-01T19:44:34.074Z|00084|main|INFO|OVNSB commit failed, force recompute 
> next time.
> 2022-02-01T19:44:34.621Z|00085|reconnect|INFO|ssl:<IP>:6642: 
> connection closed by peer
> 2022-02-01T19:44:35.622Z|00086|reconnect|INFO|ssl:<IP>:6642: connecting...
> 2022-02-01T19:44:35.672Z|00087|reconnect|INFO|ssl:<IP>:6642: connected
> 2022-02-01T19:45:01.470Z|00088|lflow_cache|INFO|Detected cache inactivity 
> (last active 30001 ms ago): trimming cache
> 2022-02-01T19:45:23.351Z|00089|binding|INFO|Setting lport <lport> 
> ovn-installed in OVS
> 2022-02-01T19:45:23.351Z|00090|binding|INFO|Setting lport <lport> up in 
> Southbound
> <there are 80 port bindings...>
> 2022-02-01T19:45:23.352Z|00245|binding|INFO|Setting lport <lportN> 
> ovn-installed in OVS
> 2022-02-01T19:45:23.352Z|00246|binding|INFO|Setting lport <lportN> up in 
> Southbound
> 2022-02-01T19:45:53.323Z|00247|lflow_cache|INFO|Detected cache inactivity 
> (last active 30004 ms ago): trimming cache
> 
> 
> I’ve got some questions:
> 1. What is OVS doing while `waiting for main to quiesce`? I noticed this 
> message appears each time the downtime occurs.

'Unreasonably long 17764ms poll interval (0ms user, 17720ms system)'
is very concerning.  It means that the main thread spent almost 18
seconds inside the kernel executing some syscall, maybe several syscalls.
The main thread is not responsible for flow installation into the kernel,
IIRC, so flow updates should not be the cause here.  You need to figure
out what these syscalls are and why they take so long to execute inside
the kernel.

Maybe you have some suspicious kernel logs at the same time?
You may also try 'perf' or some other tracing tools to figure out
what is going on.
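
For example, something along these lines (a rough sketch; adjust PIDs and
the sampling time to your setup):

  # which syscalls is ovs-vswitchd spending its time in?
  strace -f -c -p $(pidof ovs-vswitchd)     # per-syscall time summary
  perf trace -p $(pidof ovs-vswitchd)       # live syscall trace with durations

  # where inside the kernel is that time spent?
  perf record -g -p $(pidof ovs-vswitchd) -- sleep 30
  perf report

The interesting part is to catch the main thread during one of those long
poll intervals and see which syscall(s) account for the ~18 seconds of
system time.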

> 2. Are there any hints for minimizing such downtime while OF rules are reinstalled?

Installation of rules is an unlikely root cause here.

> 3. Could there be a situation where ovn-northd, after the upgrade, is still 
> generating new logical flows, so ovn-controller constantly gets updates from 
> the SB and triggers OF updates in a loop before it gets the final lflows 
> "view"? That is, could it start installing new OF rules while the lflows it 
> has in memory are not consistent at that moment?

northd generates all updates in a single transaction, so the SB is updated
atomically.  The SB will send this update as a single update message, so that
is also atomic.  ovn-controller will install all OF rules as an OF bundle,
so that is atomic as well.
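
Just to illustrate the bundle mechanism itself (the same OpenFlow bundle
feature, exposed here through ovs-ofctl; not literally what ovn-controller
does internally):

  # all flow mods in the file are applied atomically, or none of them are
  ovs-ofctl --bundle add-flows br-int flows.txt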

> 4. What does the `br-int<->unix#272675: connection dropped (Broken pipe)` error 
> mean? What type of connection is broken? Is it normal?

This looks like a controller connection, i.e. the connection between
ovn-controller and ovs-vswitchd.  It doesn't look very good, but the
disconnection seems to be caused by inactivity due to overload on one
of the sides of the connection.
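
If the drops turn out to be purely probe timeouts caused by this overload,
the probe interval on the ovn-controller side (the one you already set to 30)
can be raised as a mitigation, e.g. (a sketch):

  ovs-vsctl set open_vswitch . external_ids:ovn-openflow-probe-interval=60

but that only hides the symptom; the long system-time stalls above are the
real problem.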

> 5. Do you see anything else abnormal in the logs?

'0ms user, 17720ms system' is the most abnormal, AFAICT.

Best regards, Ilya Maximets.
