Does this problem exist in 1.3.0?
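
For the record, a quick way to check which version each node is actually
running; the first command reports the installed CLI, the second asks the
running daemon itself (assuming the default control socket):

    ovs-vsctl --version
    ovs-appctl version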
On Thu, 24 May 2012 14:33:44 +0200, Oliver Francke writes:
> Hi Volkan,
>
> thanks for your attention.
>
> On 05/24/2012 12:21 PM, Volkan YAZICI wrote:
>> Hi Oliver!
>>
>> I have a triangle-shaped network topology of three OVS 1.4.1 switches.
>> When I plug a host into the network, I immediately start to observe
>> the same log pollution and excessive CPU usage as yours. I either need
>> to unplug the machine or break the triangle into a straight line by
>> removing one of the connections between the switches. Have you made
>> any progress on the problem on your side? Any clues?
>
> No progress and no reply. I restarted ovs-vswitchd with the log file
> enabled to see some more, but there is too much JSON output.
> And Ben told me it should be far below 10%, instead of my 40% average
> with spikes of 100% for several minutes.
>
> Perhaps I'll go through the logfile if time permits...
>
> Thanks and regards,
>
> Oliver
>
>> Best.
>>
>> On Thu, 17 May 2012 13:58:00 +0200, Oliver Francke writes:
>>> Hi,
>>>
>>> uhm, I think I have my firewall provisioning ready for production,
>>> but still temporarily high load from ovs-vswitchd.
>>>
>>> Anybody with a clue of what's going on there?
>>>
>>> --- 8-< ---
>>>
>>> May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844633|poll_loop|WARN|Dropped 771 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
>>> May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844634|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (101% CPU usage)
>>> May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844635|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (101% CPU usage)
>>> May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844636|timeval|WARN|105 ms poll interval (56 ms user, 44 ms system) is over 152 times the weighted mean interval 1 ms (342116319 samples)
>>> May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844637|timeval|WARN|context switches: 0 voluntary, 2 involuntary
>>> May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844638|coverage|INFO|Skipping details of duplicate event coverage for hash=959f79a0 in epoch 342116319
>>> May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844639|poll_loop|WARN|Dropped 880 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
>>>
>>> --- 8-< ---
>>>
>>> and ovs-dpctl shows:
>>>
>>> system@vmbr1:
>>>     lookups: hit:269430948 missed:1076470 lost:1
>>>     flows: 6
>>>     port 0: vmbr1 (internal)
>>>     port 1: eth1
>>>     port 4: vlan10 (internal)
>>>     port 5: tap822i1d0
>>>     port 6: tap822i1d1
>>>     port 7: tap410i1d0
>>>     port 9: tap1113i1d0
>>>     port 13: tap433i1d0
>>>     port 15: tap377i1d0
>>>     port 16: tap416i1d0
>>>     port 18: tap287i1d0
>>>     port 19: tap451i1d0
>>>     port 23: tap160i1d0
>>>     port 24: tap376i1d0
>>>     port 27: tap1084i1d0
>>>     port 28: tap1085i1d0
>>>     port 30: tap760i1d0
>>>     port 31: tap339i1d0
>>> system@vmbr0:
>>>     lookups: hit:15321943230 missed:8565995663 lost:201094006
>>>     flows: 15216
>>>     port 0: vmbr0 (internal)
>>>     port 1: vlan146 (internal)
>>>     port 2: eth0
>>>     .
>>>     .
>>>     .
>>>
>>> Hints welcome, because my fear is that if I put a couple of rules on
>>> some 120 VMs, there could be performance issues when load is high
>>> anyway.
>>>
>>> Kind regards,
>>>
>>> Oliver.
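
A side note on the ovs-dpctl numbers above: vmbr0 reports roughly 8.6
billion misses against 15.3 billion hits, plus about 201 million lost
packets, so a large share of its traffic is being punted from the kernel
fast path up to userspace, which by itself would explain ovs-vswitchd
burning CPU. And if there is a forwarding loop in the topology, as in
Volkan's triangle of three switches, flooded packets can circulate
indefinitely and keep the miss rate high. A minimal sketch of the usual
workaround, assuming the looped bridges are named vmbr0 on all three
nodes and the OVS build has spanning tree support:

    # Enable STP on every bridge that is part of the loop, so that one
    # of the redundant links goes into blocking state.
    ovs-vsctl set bridge vmbr0 stp_enable=true

    # Then watch whether the wakeup storm and the datapath miss rate
    # calm down.
    ovs-dpctl show

This is only a guess from the stats, not a confirmed diagnosis.
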
>>> On 10.05.2012 at 10:25, Oliver Francke wrote:
>>>
>>>> Hi *,
>>>>
>>>> as suggested by Ben, I upgraded to 1.4.1 and configured all 4 nodes
>>>> with the following command, which worked seamlessly:
>>>>
>>>>> ovs-vsctl set bridge vmbr0 other-config:flow-eviction-threshold=10000
>>>>
>>>> And now to the section with the "but" in it:
>>>>
>>>> But: still high load, polluting syslog (not constantly, but every so
>>>> often) with:
>>>>
>>>> --- 8-< ---
>>>>
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139643|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (52% CPU usage)
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139644|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (52% CPU usage)
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139645|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (52% CPU usage)
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139646|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (52% CPU usage)
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139647|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (52% CPU usage)
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139648|timeval|WARN|52 ms poll interval (10 ms user, 10 ms system) is over 19 times the weighted mean interval 3 ms (31293432 samples)
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139649|timeval|WARN|context switches: 0 voluntary, 136 involuntary
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139650|coverage|INFO|Event coverage (epoch 31293432/entire run), hash=0a8403eb:
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139651|coverage|INFO|ofproto_dpif_xlate 30 / 418241133
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139652|coverage|INFO|flow_extract 15 / 119894932
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139653|coverage|INFO|hmap_pathological 6 / 187853714
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139654|coverage|INFO|hmap_expand 275 / 1030432349
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139655|coverage|INFO|netdev_get_stats 117 / 1275136
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139656|coverage|INFO|poll_fd_wait 15 / 469400883
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139657|coverage|INFO|util_xalloc 21740 / 85658737570
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139658|coverage|INFO|netdev_ethtool 234 / 2550522
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139659|coverage|INFO|netlink_received 486 / 444888366
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139660|coverage|INFO|netlink_sent 264 / 361913061
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139661|coverage|INFO|bridge_reconfigure 0 / 6
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139662|coverage|INFO|ofproto_flush 0 / 2
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139663|coverage|INFO|ofproto_update_port 0 / 131
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139664|coverage|INFO|facet_revalidate 0 / 157765
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139665|coverage|INFO|facet_unexpected 0 / 1
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139666|coverage|INFO|dpif_port_add 0 / 2
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139667|coverage|INFO|dpif_port_del 0 / 2
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139668|coverage|INFO|dpif_flow_flush 0 / 4
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139669|coverage|INFO|dpif_flow_put 0 / 445
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139670|coverage|INFO|dpif_flow_del 0 / 119661491
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139671|coverage|INFO|dpif_purge 0 / 2
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139672|coverage|INFO|mac_learning_learned 0 / 6111
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139673|coverage|INFO|mac_learning_expired 0 / 5598
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139674|coverage|INFO|poll_zero_timeout 0 / 6190
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139675|coverage|INFO|pstream_open 0 / 4
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139676|coverage|INFO|stream_open 0 / 1
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139677|coverage|INFO|netdev_set_policing 0 / 706
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139678|coverage|INFO|netdev_get_ifindex 0 / 123
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139679|coverage|INFO|netdev_get_hwaddr 0 / 125
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139680|coverage|INFO|nln_changed 0 / 137
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139681|coverage|INFO|netlink_recv_jumbo 0 / 16397628
>>>> May  8 06:30:15 fcmsnode0 ovs-vswitchd: 139682|coverage|INFO|47 events never hit
>>>> May  8 06:30:16 fcmsnode0 ovs-vswitchd: 139683|poll_loop|WARN|Dropped 216 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
>>>> May  8 06:30:16 fcmsnode0 ovs-vswitchd: 139684|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (52% CPU usage)
>>>> May  8 06:30:16 fcmsnode0 ovs-vswitchd: 139685|poll_loop|WARN|wakeup due to [POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197 (52% CPU usage)
>>>> May  8 06:30:17 fcmsnode0 ovs-vswitchd: 139686|poll_loop|WARN|Dropped 480 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
>>>>
>>>> --- 8-< ---
>>>>
>>>> Perhaps there is already a hint in the stats... If not, how do I dig
>>>> into it further?
>>>>
>>>> Kind regards,
>>>>
>>>> Oliver.
>>>>
>>>> --
>>>>
>>>> Oliver Francke
>>>>
>>>> filoo GmbH
>>>> Moltkestraße 25a
>>>> 33330 Gütersloh
>>>> HRB4355 AG Gütersloh
>>>>
>>>> Managing directors: S.Grewing | J.Rehpöhler | C.Kunz
>>>>
>>>> Follow us on Twitter: http://twitter.com/filoogmbh
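
One more thing that may help while digging further: the syslog pollution
itself can be tamed at runtime, since vlog levels are adjustable per
module and per facility without restarting the daemon. A minimal sketch,
assuming the default ovs-vswitchd control socket:

    # List the logging modules and their current levels.
    ovs-appctl vlog/list

    # Quiet the two chatty modules from the excerpts above on syslog
    # only, while leaving the log file at full verbosity.
    ovs-appctl vlog/set poll_loop:syslog:err
    ovs-appctl vlog/set timeval:syslog:err

That only hides the symptom, of course; the wakeup storm itself still
needs an explanation.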
