Well...
On 05/24/2012 02:34 PM, Volkan YAZICI wrote:
Does this problem exist in 1.3.0?
We started late, with 1.4.0... and it shows up in production only (a
traffic-load issue, in any case not seen in the test lab).
Oliver.
On Thu, 24 May 2012 14:33:44 +0200, Oliver Francke writes:
Hi Volkan,
thanks for your attention,
On 05/24/2012 12:21 PM, Volkan YAZICI wrote:
Hi Oliver!
I have a triangle-shaped network topology of three OVS 1.4.1
switches. When I plug a host into the network, I immediately
start to observe the same log pollution and excessive CPU usage
that you describe. I either need to unplug the machine or break
the triangle into a straight line by removing one of the
connections between the switches. Have you made any progress on
your side with the problem? Any clues?
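As a stop-gap, I am also considering breaking the loop logically instead
of physically; a minimal sketch of what I have in mind, assuming STP is
acceptable here and with br0 as a placeholder for my real bridge names:
--- 8-< ---
# enable spanning tree on each of the three bridges, so that one of the
# triangle's links gets blocked instead of forwarding in a loop
ovs-vsctl set bridge br0 stp_enable=true
--- 8-< ---
I have not verified whether this also tames the CPU usage, though.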
No progress and no reply so far. I restarted ovs-vswitchd with a log file
enabled to see a bit more, but there is too much JSON output.
And Ben told me it should be far below 10%, instead of the 40% average
and the spikes of 100% for several minutes that I see.
Perhaps I'll go into the logfile if time permits...
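One thing I might try to make the logfile manageable: log levels can be
tuned per module at runtime, so the flood from poll_loop could be
trimmed without a restart; a minimal sketch, assuming the default
control socket is in place:
--- 8-< ---
# list the current per-module log levels
ovs-appctl vlog/list
# keep only errors from poll_loop in the file log
ovs-appctl vlog/set poll_loop:file:err
--- 8-< ---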
Thanks and regards,
Oliver
Best.
On Thu, 17 May 2012 13:58:00 +0200, Oliver Francke writes:
Hi,
uhm, I think I have my firewall provisioning ready for production, but I
still see temporarily high load from ovs-vswitchd.
Anybody with a clue as to what's going on there?
--- 8-< ---
May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844633|poll_loop|WARN|Dropped 771 log
messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844634|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(101% CPU usage)
May 17 13:54:07 fcmsnode10 ovs-vswitchd: 1844635|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(101% CPU usage)
May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844636|timeval|WARN|105 ms poll
interval (56 ms user, 44 ms system) is over 152 times the weighted mean
interval 1 ms (342116319 samples)
May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844637|timeval|WARN|context switches:
0 voluntary, 2 involuntary
May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844638|coverage|INFO|Skipping details
of duplicate event coverage for hash=959f79a0 in epoch 342116319
May 17 13:54:08 fcmsnode10 ovs-vswitchd: 1844639|poll_loop|WARN|Dropped 880 log
messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
--- 8-< ---
and ovs-dpctl shows:
system@vmbr1:
lookups: hit:269430948 missed:1076470 lost:1
flows: 6
port 0: vmbr1 (internal)
port 1: eth1
port 4: vlan10 (internal)
port 5: tap822i1d0
port 6: tap822i1d1
port 7: tap410i1d0
port 9: tap1113i1d0
port 13: tap433i1d0
port 15: tap377i1d0
port 16: tap416i1d0
port 18: tap287i1d0
port 19: tap451i1d0
port 23: tap160i1d0
port 24: tap376i1d0
port 27: tap1084i1d0
port 28: tap1085i1d0
port 30: tap760i1d0
port 31: tap339i1d0
system@vmbr0:
lookups: hit:15321943230 missed:8565995663 lost:201094006
flows: 15216
port 0: vmbr0 (internal)
port 1: vlan146 (internal)
port 2: eth0
.
.
.
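For comparison: on vmbr1 almost every lookup is a hit, while on vmbr0
roughly a third of all lookups missed (i.e. went up to ovs-vswitchd) and
about 200 million were lost outright, which fits the poll-loop churn in
the log above. To watch how many datapath flows are actually installed
at a given moment, something like this should work (a sketch; the
datapath name is taken from the output above):
--- 8-< ---
# count the flows currently installed in the vmbr0 datapath
ovs-dpctl dump-flows vmbr0 | wc -l
--- 8-< ---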
Hints welcome, because my fear is that if I push a couple of rules to
each of some 120 VMs, there could be performance issues when load is
high anyway.
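If it comes to that, would it help to push the per-VM rules in one batch
instead of one ovs-ofctl invocation per rule? A sketch of what I have in
mind, where flows.txt is a hypothetical file holding one rule per line,
e.g. "in_port=5,dl_src=52:54:00:12:34:56,actions=NORMAL":
--- 8-< ---
# install all rules in a single OpenFlow session
ovs-ofctl add-flows vmbr0 flows.txt
--- 8-< ---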
Kind regards,
Oliver.
On 10.05.2012 at 10:25, Oliver Francke wrote:
Hi *,
as suggested by Ben, I upgraded to 1.4.1 and configured all 4 nodes with
the following command, which worked seamlessly:
ovs-vsctl set bridge vmbr0 other-config:flow-eviction-threshold=10000
And now to the section with the "but" in it: still high load, with
syslog being polluted by the following (not constantly, but ever so
often):
--- 8-< ---
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139643|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(52% CPU usage)
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139644|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(52% CPU usage)
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139645|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(52% CPU usage)
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139646|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(52% CPU usage)
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139647|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(52% CPU usage)
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139648|timeval|WARN|52 ms poll interval
(10 ms user, 10 ms system) is over 19 times the weighted mean interval 3 ms
(31293432 samples)
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139649|timeval|WARN|context switches: 0
voluntary, 136 involuntary
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139650|coverage|INFO|Event coverage
(epoch 31293432/entire run), hash=0a8403eb:
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139651|coverage|INFO|ofproto_dpif_xlate
30 / 418241133
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139652|coverage|INFO|flow_extract
15 / 119894932
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139653|coverage|INFO|hmap_pathological
6 / 187853714
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139654|coverage|INFO|hmap_expand
275 / 1030432349
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139655|coverage|INFO|netdev_get_stats
117 / 1275136
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139656|coverage|INFO|poll_fd_wait
15 / 469400883
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139657|coverage|INFO|util_xalloc
21740 / 85658737570
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139658|coverage|INFO|netdev_ethtool
234 / 2550522
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139659|coverage|INFO|netlink_received
486 / 444888366
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139660|coverage|INFO|netlink_sent
264 / 361913061
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139661|coverage|INFO|bridge_reconfigure
0 / 6
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139662|coverage|INFO|ofproto_flush
0 / 2
May 8 06:30:15 fcmsnode0 ovs-vswitchd:
139663|coverage|INFO|ofproto_update_port 0 / 131
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139664|coverage|INFO|facet_revalidate
0 / 157765
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139665|coverage|INFO|facet_unexpected
0 / 1
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139666|coverage|INFO|dpif_port_add
0 / 2
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139667|coverage|INFO|dpif_port_del
0 / 2
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139668|coverage|INFO|dpif_flow_flush
0 / 4
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139669|coverage|INFO|dpif_flow_put
0 / 445
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139670|coverage|INFO|dpif_flow_del
0 / 119661491
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139671|coverage|INFO|dpif_purge
0 / 2
May 8 06:30:15 fcmsnode0 ovs-vswitchd:
139672|coverage|INFO|mac_learning_learned 0 / 6111
May 8 06:30:15 fcmsnode0 ovs-vswitchd:
139673|coverage|INFO|mac_learning_expired 0 / 5598
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139674|coverage|INFO|poll_zero_timeout
0 / 6190
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139675|coverage|INFO|pstream_open
0 / 4
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139676|coverage|INFO|stream_open
0 / 1
May 8 06:30:15 fcmsnode0 ovs-vswitchd:
139677|coverage|INFO|netdev_set_policing 0 / 706
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139678|coverage|INFO|netdev_get_ifindex
0 / 123
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139679|coverage|INFO|netdev_get_hwaddr
0 / 125
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139680|coverage|INFO|nln_changed
0 / 137
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139681|coverage|INFO|netlink_recv_jumbo
0 / 16397628
May 8 06:30:15 fcmsnode0 ovs-vswitchd: 139682|coverage|INFO|47 events never hit
May 8 06:30:16 fcmsnode0 ovs-vswitchd: 139683|poll_loop|WARN|Dropped 216 log
messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
May 8 06:30:16 fcmsnode0 ovs-vswitchd: 139684|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(52% CPU usage)
May 8 06:30:16 fcmsnode0 ovs-vswitchd: 139685|poll_loop|WARN|wakeup due to
[POLLIN] on fd 36 (unknown anon_inode:[eventpoll]) at lib/dpif-linux.c:1197
(52% CPU usage)
May 8 06:30:17 fcmsnode0 ovs-vswitchd: 139686|poll_loop|WARN|Dropped 480 log
messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
--- 8-< ---
Perhaps there is already a hint in the stats... If not, how do I dig
into it further?
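I guess both sets of numbers could also be pulled on demand instead of
scraped out of syslog; a minimal sketch of what I would try:
--- 8-< ---
# the same counters as the coverage INFO lines above, on demand
ovs-appctl coverage/show
# per-datapath hit/missed/lost counters and current flow count
ovs-dpctl show
--- 8-< ---
Is that the right place to start?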
Kind regards,
Oliver.
--
Oliver Francke
filoo GmbH
Moltkestraße 25a
33330 Gütersloh
HRB4355 AG Gütersloh
Managing Directors: S.Grewing | J.Rehpöhler | C.Kunz
Follow us on Twitter: http://twitter.com/filoogmbh
_______________________________________________
discuss mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/discuss