Hi,
I think my firewall provisioning is ready for production, but I still see
temporarily high load from ovs-vswitchd.
Does anybody have a clue what's going on there?
--- 8-< ---
May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844633|poll_loop|WARN|Dropped 771 log
messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844634|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(101% CPU usage)
May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844635|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(101% CPU usage)
May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844636|timeval|WARN|105 ms poll
interval (56 ms user, 44 ms system) is over 152 times the weighted mean
interval 1 ms (342116319 samples)
May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844637|timeval|WARN|context switches:
0 voluntary, 2 involuntary
May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844638|coverage|INFO|Skipping details
of duplicate event coverage for hash=959f79a0 in epoch 342116319
May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844639|poll_loop|WARN|Dropped 880 log
messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
--- 8-< ---
and ovs-dpctl show reports:
system@vmbr1:
lookups: hit:269430948 missed:1076470 lost:1
flows: 6
port 0: vmbr1 (internal)
port 1: eth1
port 4: vlan10 (internal)
port 5: tap822i1d0
port 6: tap822i1d1
port 7: tap410i1d0
port 9: tap1113i1d0
port 13: tap433i1d0
port 15: tap377i1d0
port 16: tap416i1d0
port 18: tap287i1d0
port 19: tap451i1d0
port 23: tap160i1d0
port 24: tap376i1d0
port 27: tap1084i1d0
port 28: tap1085i1d0
port 30: tap760i1d0
port 31: tap339i1d0
system@vmbr0:
lookups: hit:15321943230 missed:8565995663 lost:201094006
flows: 15216
port 0: vmbr0 (internal)
port 1: vlan146 (internal)
port 2: eth0
.
.
.
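For what it's worth, my own back-of-envelope on the vmbr0 counters above (not from any OVS tool, just arithmetic on the hit/missed/lost numbers): roughly a third of all lookups miss the kernel flow table and go to userspace, and a couple of percent of those upcalls get lost outright, which would fit the constant ovs-vswitchd wakeups:

```python
# Back-of-envelope on the ovs-dpctl counters for system@vmbr0 quoted above.
hit, missed, lost = 15321943230, 8565995663, 201094006

lookups = hit + missed
miss_rate = missed / lookups  # share of packets upcalled to userspace
loss_rate = lost / missed     # share of those upcalls that were dropped

print(f"miss rate: {miss_rate:.1%}")  # ~35.9% of lookups miss the kernel flows
print(f"loss rate: {loss_rate:.1%}")  # ~2.3% of missed packets were lost
```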
Hints welcome, because my fear is that if I push a couple of rules to each of
some 120 VMs, there could be performance issues whenever load gets high anyway.
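In case it helps anyone reproduce this, here is what I'd run to double-check the current state (a sketch, assuming the standard OVS 1.4 tooling; bridge/datapath names as above):

```shell
# Verify the eviction threshold actually stuck on the busy bridge:
ovs-vsctl get bridge vmbr0 other-config:flow-eviction-threshold

# Re-check the hit/missed/lost counters and count datapath flows:
ovs-dpctl show
ovs-dpctl dump-flows vmbr0 | wc -l

# Poll the coverage counters that keep showing up in syslog:
ovs-appctl coverage/show
```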
Kind regards,
Oliver.
On 10.05.2012, at 10:25, Oliver Francke wrote:
> Hi *,
>
> as suggested by Ben, I upgraded to 1.4.1 and configured it with the following command:
>
> > ovs-vsctl set bridge vmbr0
> > other-config:flow-eviction-threshold=10000
>
> on all 4 nodes, which worked seamlessly. And now to the section with the "but"
> in it:
>
> But: still high load, polluting syslog with the following (not constantly,
> but every so often):
>
> --- 8-< ---
>
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139643|poll_loop|WARN|wakeup due to
> [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
> (52% CPU usage)
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139644|poll_loop|WARN|wakeup due to
> [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
> (52% CPU usage)
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139645|poll_loop|WARN|wakeup due to
> [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
> (52% CPU usage)
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139646|poll_loop|WARN|wakeup due to
> [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
> (52% CPU usage)
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139647|poll_loop|WARN|wakeup due to
> [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
> (52% CPU usage)
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139648|timeval|WARN|52 ms poll
> interval (10 ms user, 10 ms system) is over 19 times the weighted mean
> interval 3 ms (31293432 samples)
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139649|timeval|WARN|context switches:
> 0 voluntary, 136 involuntary
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139650|coverage|INFO|Event coverage
> (epoch 31293432/entire run), hash=0a8403eb:
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139651|coverage|INFO|ofproto_dpif_xlate 30 / 418241133
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139652|coverage|INFO|flow_extract
> 15 / 119894932
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139653|coverage|INFO|hmap_pathological 6 / 187853714
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139654|coverage|INFO|hmap_expand
> 275 / 1030432349
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139655|coverage|INFO|netdev_get_stats
> 117 / 1275136
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139656|coverage|INFO|poll_fd_wait
> 15 / 469400883
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139657|coverage|INFO|util_xalloc
> 21740 / 85658737570
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139658|coverage|INFO|netdev_ethtool
> 234 / 2550522
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139659|coverage|INFO|netlink_received
> 486 / 444888366
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139660|coverage|INFO|netlink_sent
> 264 / 361913061
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139661|coverage|INFO|bridge_reconfigure 0 / 6
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139662|coverage|INFO|ofproto_flush
> 0 / 2
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139663|coverage|INFO|ofproto_update_port 0 / 131
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139664|coverage|INFO|facet_revalidate
> 0 / 157765
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139665|coverage|INFO|facet_unexpected
> 0 / 1
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139666|coverage|INFO|dpif_port_add
> 0 / 2
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139667|coverage|INFO|dpif_port_del
> 0 / 2
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139668|coverage|INFO|dpif_flow_flush
> 0 / 4
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139669|coverage|INFO|dpif_flow_put
> 0 / 445
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139670|coverage|INFO|dpif_flow_del
> 0 / 119661491
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139671|coverage|INFO|dpif_purge
> 0 / 2
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139672|coverage|INFO|mac_learning_learned 0 / 6111
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139673|coverage|INFO|mac_learning_expired 0 / 5598
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139674|coverage|INFO|poll_zero_timeout 0 / 6190
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139675|coverage|INFO|pstream_open
> 0 / 4
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139676|coverage|INFO|stream_open
> 0 / 1
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139677|coverage|INFO|netdev_set_policing 0 / 706
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139678|coverage|INFO|netdev_get_ifindex 0 / 123
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139679|coverage|INFO|netdev_get_hwaddr 0 / 125
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139680|coverage|INFO|nln_changed
> 0 / 137
> May 8 06:30:15 fcmsnode0 ovs-vswitchd:
> 139681|coverage|INFO|netlink_recv_jumbo 0 / 16397628
> May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139682|coverage|INFO|47 events never
> hit
> May 8 06:30:16 fcmsnode0 ovs-vswitchd: 139683|poll_loop|WARN|Dropped 216 log
> messages in last 1 seconds (most recently, 1 seconds ago) due to excessive
> rate
> May 8 06:30:16 fcmsnode0 ovs-vswitchd: 139684|poll_loop|WARN|wakeup due to
> [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
> (52% CPU usage)
> May 8 06:30:16 fcmsnode0 ovs-vswitchd: 139685|poll_loop|WARN|wakeup due to
> [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
> (52% CPU usage)
> May 8 06:30:17 fcmsnode0 ovs-vswitchd: 139686|poll_loop|WARN|Dropped 480 log
> messages in last 1 seconds (most recently, 1 seconds ago) due to excessive
> rate
>
> --- 8-< ---
>
> Perhaps there is already a hint in the stats... If not, how can I dig into
> it further?
>
> Kind regards,
>
> Oliver.
>
> --
>
> Oliver Francke
>
> filoo GmbH
> Moltkestraße 25a
> 33330 Gütersloh
> HRB4355 AG Gütersloh
>
> Managing Directors (Geschäftsführer): S.Grewing | J.Rehpöhler | C.Kunz
>
> Follow us on Twitter: http://twitter.com/filoogmbh
>
> _______________________________________________
> discuss mailing list
> [email protected]
> http://openvswitch.org/mailman/listinfo/discuss