[ovs-discuss] No buffer space available warning
Hi,

Following are the platform specifications:

OVS version: 1.4.0
traffic type: MTU size UDP packets @ 7000 packets per second, generating 50,000 flows by changing UDP src ports.
OS: Fedora release 12, Linux 2.6.31.5-127.fc12.i686.PAE
CPU: Intel Core i3 @ 2100 GHz
Controller: NOX, simple IP based switching application.

If I generate 50,000 flows in two chunks, for example by first generating traffic with src ports from 1 to 20,000 for a few seconds and then changing the src port range to 20,001 to 50,000, I get the following warnings:

Jun 11 16:41:06|00019|dpif|WARN|system@br0: recv failed (No buffer space available)
Jun 11 16:41:19|00020|dpif|WARN|system@br0: recv failed (No buffer space available)
Jun 11 16:41:22|00021|dpif|WARN|system@br0: recv failed (No buffer space available)
Jun 11 16:41:24|00022|dpif|WARN|system@br0: recv failed (No buffer space available)

Following is the output of ovs-dpctl show br0:

system@br0:
        lookups: hit:491818 missed:603323 lost:116749
        flows: 28559

Moreover, if I generate 50,000 flows in one go, without any pause, I do not get any error.

How can I diagnose the problem? Has anyone else experienced this problem with reactive flow writing?
___
discuss mailing list
discuss@openvswitch.org
http://openvswitch.org/mailman/listinfo/discuss
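The counters in the ovs-dpctl output above can be turned into a drop rate. The following is only a rough sketch in C. It assumes that "lost" counts packets that missed the kernel flow table and were then dropped on the way to userspace; the function name upcall_loss_ratio is made up for this illustration and is not part of OVS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: fraction of flow-table misses that never reached
 * userspace.  `missed` and `lost` are the counters printed by
 * "ovs-dpctl show". */
double
upcall_loss_ratio(uint64_t missed, uint64_t lost)
{
    assert(missed > 0);   /* the ratio is undefined when nothing missed */
    return (double) lost / (double) missed;
}
```

With the values reported above (missed 603323, lost 116749) the ratio comes out at about 0.19, that is, roughly one in five missed packets was dropped before ovs-vswitchd could see it.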
Re: [ovs-discuss] Flow miss/Packet order question
Are you planning to solve this problem in the near future, or do you have any suggestions to mitigate it?

On Thu, Apr 26, 2012 at 2:37 AM, Ben Pfaff b...@nicira.com wrote:
> On Wed, Apr 25, 2012 at 01:33:56PM -0700, Joji Mtt wrote:
>> I am trying to figure out if there would be a packet order issue with the current version of OVS. Consider a case where a controller has added a forwarding rule for a specific flow (Flow A) and this rule is not yet installed in the DP. In this scenario, it is conceivable that certain (bursty) traffic patterns on Flow A can result in the packets being sent out of order. E.g. consider an initial burst of 5 packets that miss the kernel flow table, followed by several subsequent packets arriving at random intervals. As soon as the userspace processes the flow miss, it will install a rule in the kernel. Depending on the relative timing of the rule installation, any of these subsequent packets could get switched directly by the kernel before the previous packets that took the slow path could get forwarded.
>>
>> I couldn't find any special handling to cover this case. Most likely it is already handled and I am just missing the part where it is done. Could anyone clarify this for me?
>
> Yes, it's possible to get out-of-order packets for this reason.
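The reordering scenario described above can be illustrated with a toy timing model. This is a sketch under simplifying assumptions, not OVS code: every packet that misses the kernel flow table is assumed to take the same fixed slow-path delay, every packet that hits takes a fixed fast-path delay, and the name count_overtaking and all parameters are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not OVS code.  Packets of one flow arrive at arrival[i]
 * (non-decreasing, arbitrary time units).  A packet that arrives before
 * install_time misses the kernel flow table and leaves slow_delay later;
 * a packet that arrives at or after install_time is switched by the kernel
 * and leaves fast_delay later.  Returns how many packets leave the switch
 * before some earlier packet of the same flow has left. */
size_t
count_overtaking(const long *arrival, size_t n, long install_time,
                 long slow_delay, long fast_delay)
{
    size_t overtaking = 0;
    long latest_departure = 0;

    assert(fast_delay >= 0 && slow_delay >= fast_delay);

    for (size_t i = 0; i < n; i++) {
        long delay = arrival[i] < install_time ? slow_delay : fast_delay;
        long departure = arrival[i] + delay;

        if (i > 0 && departure < latest_departure) {
            /* An earlier packet is still in the slow path. */
            overtaking++;
        } else {
            latest_departure = departure;
        }
    }
    return overtaking;
}
```

For example, with a burst arriving at times 0 to 4, two later packets at times 10 and 11, the rule installed at time 9, a slow-path delay of 8 and a fast-path delay of 0, the two later packets leave before the last packets of the burst, so the function reports 2.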
Re: [ovs-discuss] warning messages regarding buffer space and unknown buffer
The patch has solved my problem. Thanks a lot.

On Wed, Apr 25, 2012 at 10:52 AM, junaid khalid junaid.kha...@seecs.nust.edu.pk wrote:
> Thanks a lot. I will update you after testing it
>
> On Wed, Apr 25, 2012 at 10:45 AM, Ben Pfaff b...@nicira.com wrote:
>> On Wed, Apr 25, 2012 at 10:38:45AM +0500, junaid khalid wrote:
>>> we are talking about the recv failed problem (system@br0: recv failed (No buffer space available)).
>>
>> Try applying the following patch from master to your tree. It may help.
>>
>> --8<--------------------------cut here-------------------------->8--
>> From: Ben Pfaff b...@nicira.com
>> Date: Thu, 15 Mar 2012 21:15:38 -0700
>> Subject: [PATCH] netlink-socket: Increase Netlink socket receive buffer size.
>>
>> Open vSwitch userspace can set up flows at a high rate, but it is somewhat bursty in opportunities to set up flows, by which I mean that OVS sets up a batch of flows, then goes off and does some other work for a while, then sets up another batch of flows, and so on. The result is that, if a large number of packets that need flow setups come in all at once, then some of them can overflow the relatively small kernel-to-user buffers.
>>
>> This commit increases the kernel-to-user buffers from the default of approximately 120 kB each to 1 MB each. In one somewhat synthetic test case that I ran based on an hping3 that generated a load of about 20,000 new flows per second (including both requests and replies), this reduced the packets dropped at the kernel-to-user interface from about 30% to none. I expect that it will similarly improve packet loss in workloads where flow arrival is not easily predictable.
>>
>> (This has little effect on workloads generated by ovs-benchmark rate because that benchmark is effectively self-clocking, that is, a new flow is triggered only by a reply to a request made earlier, which means that the number of buffered packets at any given has a known, constant upper limit.)
>>
>> Bug #10210.
>> Signed-off-by: Ben Pfaff b...@nicira.com
>> ---
>>  include/sparse/sys/socket.h |    5 +++--
>>  lib/netlink-socket.c        |   10 +++++++++-
>>  2 files changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/sparse/sys/socket.h b/include/sparse/sys/socket.h
>> index 89e3c2d..1ed195b 100644
>> --- a/include/sparse/sys/socket.h
>> +++ b/include/sparse/sys/socket.h
>> @@ -1,5 +1,5 @@
>>  /*
>> - * Copyright (c) 2011 Nicira Networks.
>> + * Copyright (c) 2011, 2012 Nicira Networks.
>>   *
>>   * Licensed under the Apache License, Version 2.0 (the "License");
>>   * you may not use this file except in compliance with the License.
>> @@ -74,7 +74,8 @@ enum {
>>      SO_SNDBUF,
>>      SO_SNDLOWAT,
>>      SO_SNDTIMEO,
>> -    SO_TYPE
>> +    SO_TYPE,
>> +    SO_RCVBUFFORCE
>>  };
>>
>>  enum {
>> diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c
>> index bc46235..df6f1d8 100644
>> --- a/lib/netlink-socket.c
>> +++ b/lib/netlink-socket.c
>> @@ -1,5 +1,5 @@
>>  /*
>> - * Copyright (c) 2008, 2009, 2010, 2011 Nicira Networks.
>> + * Copyright (c) 2008, 2009, 2010, 2011, 2012 Nicira Networks.
>>   *
>>   * Licensed under the Apache License, Version 2.0 (the "License");
>>   * you may not use this file except in compliance with the License.
>> @@ -89,6 +89,7 @@ nl_sock_create(int protocol, struct nl_sock **sockp)
>>      struct nl_sock *sock;
>>      struct sockaddr_nl local, remote;
>>      socklen_t local_size;
>> +    int rcvbuf;
>>      int retval = 0;
>>
>>      if (!max_iovs) {
>> @@ -122,6 +123,13 @@ nl_sock_create(int protocol, struct nl_sock **sockp)
>>      sock->protocol = protocol;
>>      sock->dump = NULL;
>>
>> +    rcvbuf = 1024 * 1024;
>> +    if (setsockopt(sock->fd, SOL_SOCKET, SO_RCVBUFFORCE,
>> +                   &rcvbuf, sizeof rcvbuf)) {
>> +        VLOG_WARN_RL(&rl, "setting %d-byte socket receive buffer failed (%s)",
>> +                     rcvbuf, strerror(errno));
>> +    }
>> +
>>      retval = get_socket_rcvbuf(sock->fd);
>>      if (retval < 0) {
>>          retval = -retval;
>> --
>> 1.7.2.5
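The core of the patch, asking the kernel for a larger socket receive buffer, can be tried outside OVS. The following is a minimal standalone sketch in C for Linux, not the OVS implementation. The helper name raise_rcvbuf is hypothetical, and the fallback to plain SO_RCVBUF is an addition of this sketch that the patch itself does not make.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of OVS): ask for a receive buffer of
 * `bytes` bytes on `fd` and return the size the kernel reports afterwards,
 * or a negative errno value if the size cannot be read back. */
int
raise_rcvbuf(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof actual;
    int error = -1;

    assert(bytes > 0);

#ifdef SO_RCVBUFFORCE
    /* SO_RCVBUFFORCE may exceed the net.core.rmem_max limit but requires
     * CAP_NET_ADMIN, so it normally fails for unprivileged processes. */
    error = setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE, &bytes, sizeof bytes);
#endif
    if (error) {
        /* Fallback (assumption of this sketch, not in the patch): a plain
         * SO_RCVBUF request, which the kernel caps at rmem_max. */
        if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes)) {
            fprintf(stderr, "setting %d-byte receive buffer failed (%s)\n",
                    bytes, strerror(errno));
        }
    }

    /* Linux reports twice the requested value here because it reserves
     * space for bookkeeping overhead. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len)) {
        return -errno;
    }
    return actual;
}
```

Calling raise_rcvbuf(fd, 1024 * 1024) on a freshly created socket and printing the return value shows whether the 1 MB request was honored or capped on a given system.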
Re: [ovs-discuss] warning messages regarding buffer space and unknown buffer
Hi,

Do you mean the setup of flows in the cached flow table in the kernel module? We are sending packets after setting up the flows. Therefore, this problem should be in the fast path, or in other words, between ovs-vswitchd and the kernel module.

As for the traffic passing through the switch: MTU size UDP packets are received on one interface of the NIC and sent out on the other.

On Tue, Apr 24, 2012 at 2:55 AM, Ben Pfaff b...@nicira.com wrote:
> On Mon, Apr 23, 2012 at 03:54:09PM +0600, junaid khalid wrote:
>> 1) Apr 23 13:19:27|00019|dpif|WARN|system@br0: recv failed (No buffer space available)
>
> Packets are arriving at your interfaces faster than ovs-vswitchd can set up flows. What traffic is going through the switch?
>
>> 2) Apr 23 13:19:37|00033|pktbuf|WARN|cookie mismatch: 000001fa != 000002fa
>
> OVS is sending buffered packets to your OpenFlow controller. The OpenFlow controller is sending back replies to use those buffers, but the replies are arriving slowly enough that by the time that they arrive the switch has already discarded those buffers.
>
>> Apr 23 13:19:37|00034|connmgr|INFO|br0-tcp:127.0.0.1:6633: sending type OFPET_BAD_REQUEST, code OFPBRC_BUFFER_UNKNOWN error reply to OFPT_FLOW_MOD message
>
> Also a consequence of the above.
>
>> traffic type: MTU size UDP packets @7.3 Gbps, generating 100,000 flows by changing src IPs.
>
> NOX isn't going to be able to keep up with that rate. OVS 1.4.0 isn't, either, but OVS 1.6.90 from the tip of master should be able to handle it without a controller. Throwing in an OpenFlow controller that sees every packet will probably bog things down a lot.
>
>> Apr 23 13:19:16|00010|ofp_util|WARN|received Nicira extension message of unknown type 8
>
> That message has been obsolete for ages, why are you using it?
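The "cookie mismatch" explanation above can be made concrete with a small model of a fixed-size packet buffer pool. This is a simplified sketch in C, not the OVS pktbuf code. The pool size of 256 and the buffer id layout (slot index in the low 8 bits, a per-slot reuse counter above it) are assumptions chosen to match the shape of the logged values, where the low byte (fa) is the same on both sides and only the upper part differs. All names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_BITS   8
#define SLOT_COUNT  (1u << SLOT_BITS)      /* assumed pool size: 256 */
#define SLOT_MASK   (SLOT_COUNT - 1)
#define COOKIE_MASK 0xffffffu              /* cookie fits above the slot bits */

struct slot {
    uint32_t cookie;    /* bumped every time the slot is reused */
    bool in_use;
};

struct pktbuf_model {
    struct slot slots[SLOT_COUNT];
    uint32_t next;      /* next slot to (over)write, round robin */
};

/* Buffers one packet and returns the id that would be sent to the
 * controller.  The oldest slot is silently overwritten when the pool
 * is full. */
uint32_t
model_save(struct pktbuf_model *pb)
{
    assert(pb != NULL);

    uint32_t idx = pb->next++ & SLOT_MASK;
    struct slot *s = &pb->slots[idx];

    s->cookie = (s->cookie + 1) & COOKIE_MASK;
    s->in_use = true;
    return idx | (s->cookie << SLOT_BITS);
}

/* Returns true if `id` still names a live buffer.  Returns false when the
 * slot has been reused since the id was handed out, which is the situation
 * reported as "cookie mismatch". */
bool
model_retrieve(struct pktbuf_model *pb, uint32_t id)
{
    assert(pb != NULL);

    struct slot *s = &pb->slots[id & SLOT_MASK];

    if (!s->in_use || s->cookie != (id >> SLOT_BITS)) {
        return false;
    }
    s->in_use = false;
    return true;
}
```

In this model, if the controller answers for a buffer only after 256 or more newer packets have been buffered, the slot has been recycled and the lookup fails, which corresponds to the OFPBRC_BUFFER_UNKNOWN error in the log.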
Re: [ovs-discuss] warning messages regarding buffer space and unknown buffer
On Tue, Apr 24, 2012 at 9:06 PM, Ben Pfaff b...@nicira.com wrote:
> On Tue, Apr 24, 2012 at 07:02:37PM +0600, junaid khalid wrote:
>> Do you mean the setup of flows in cached flow table in kernel module? We are sending packets after setting up the flows. Therefore, this problem should be in the fast path or in other words, between the ovs-vswitchd and kernel module.
>
> You said you have 100,000 flows. How many packets in each flow?

We are generating packets in a round robin fashion, approx. 6 packets per flow per sec.
Re: [ovs-discuss] warning messages regarding buffer space and unknown buffer
On Wed, Apr 25, 2012 at 10:14 AM, Ben Pfaff b...@nicira.com wrote:
> On Wed, Apr 25, 2012 at 09:43:50AM +0500, junaid khalid wrote:
>> On Tue, Apr 24, 2012 at 9:06 PM, Ben Pfaff b...@nicira.com wrote:
>>> On Tue, Apr 24, 2012 at 07:02:37PM +0600, junaid khalid wrote:
>>>> Do you mean the setup of flows in cached flow table in kernel module? We are sending packets after setting up the flows. Therefore, this problem should be in the fast path or in other words, between the ovs-vswitchd and kernel module.
>>>
>>> You said you have 100,000 flows. How many packets in each flow?
>>
>> we are generating packets in a round robin fashion, approx. 6 packets per flow per sec.
>
> You might want to increase the flow eviction threshold to 100000 then. See the documentation in ovs-vswitchd.conf.db(5).

We have also tried that. We set the flow eviction threshold to 100,000 and added a periodic print in the kernel module to check the number of entries in its flow table. We noticed that, although the flow table size in userspace is 100,000 (from the dump-aggregate command), the flow table in the kernel module grows gradually, and the error keeps coming up until the kernel table is completely filled with 100000 entries.
Re: [ovs-discuss] warning messages regarding buffer space and unknown buffer
Thanks a lot. I will update you after testing it

On Wed, Apr 25, 2012 at 10:45 AM, Ben Pfaff b...@nicira.com wrote:
> On Wed, Apr 25, 2012 at 10:38:45AM +0500, junaid khalid wrote:
>> we are talking about the recv failed problem (system@br0: recv failed (No buffer space available)).
>
> Try applying the following patch from master to your tree. It may help.
>
> --8<--------------------------cut here-------------------------->8--
> From: Ben Pfaff b...@nicira.com
> Date: Thu, 15 Mar 2012 21:15:38 -0700
> Subject: [PATCH] netlink-socket: Increase Netlink socket receive buffer size.
>
> [...]
[ovs-discuss] warning messages regarding buffer space and unknown buffer
Hi,

We are getting the following warnings while using OVS.

1) Apr 23 13:19:27|00019|dpif|WARN|system@br0: recv failed (No buffer space available)

2) Apr 23 13:19:37|00033|pktbuf|WARN|cookie mismatch: 000001fa != 000002fa
Apr 23 13:19:37|00034|connmgr|INFO|br0-tcp:127.0.0.1:6633: sending type OFPET_BAD_REQUEST, code OFPBRC_BUFFER_UNKNOWN error reply to OFPT_FLOW_MOD message

We get the warnings after a few seconds of traffic switching. The first warning is much less frequent than the second. We have not done any special configuration while running OVS. We are receiving packets on one Ethernet port and sending them out on the other port after matching the five tuple (src/dst IP, src/dst port, protocol). A log of the error messages is attached in a file.

Following are the platform specifications:

OVS version: 1.4.0
traffic type: MTU size UDP packets @7.3 Gbps, generating 100,000 flows by changing src IPs.
OS: Fedora release 12, Linux 2.6.31.5-127.fc12.i686.PAE
CPU: Intel Core i3 @ 2100 GHz
Controller: NOX, simple IP based switching application. We also tried the Beacon controller, but the problem persists.

Does this have any effect on the performance or functionality of OVS? How should we resolve this problem? Any suggestions or pointers are highly appreciated.

thanks and regards,
Junaid Khalid

Apr 23 13:19:16|00001|reconnect|INFO|unix:./var/run/openvswitch/db.sock: connecting...
Apr 23 13:19:16|00002|reconnect|INFO|unix:./var/run/openvswitch/db.sock: connected
Apr 23 13:19:16|00003|bridge|INFO|created port eth3 on bridge br0
Apr 23 13:19:16|00004|bridge|INFO|created port br0 on bridge br0
Apr 23 13:19:16|00005|bridge|INFO|created port eth4 on bridge br0
Apr 23 13:19:16|00006|ofproto|INFO|using datapath ID 0000002320cc711c
Apr 23 13:19:16|00007|ofproto|INFO|datapath ID changed to 0000001b2173ed48
Apr 23 13:19:16|00008|rconn|INFO|br0-tcp:127.0.0.1:6633: connecting...
Apr 23 13:19:16|00009|rconn|INFO|br0-tcp:127.0.0.1:6633: connected
Apr 23 13:19:16|00010|ofp_util|WARN|received Nicira extension message of unknown type 8
Apr 23 13:19:16|00011|ofp_util|WARN|received Nicira extension message of unknown type 8
Apr 23 13:19:16|00012|connmgr|INFO|br0-tcp:127.0.0.1:6633: sending type OFPET_BAD_REQUEST, code OFPBRC_BAD_SUBTYPE error reply to invalid message
Apr 23 13:19:24|00013|rconn|INFO|br0-tcp:127.0.0.1:6633: connection closed by peer
Apr 23 13:19:25|00014|rconn|INFO|br0-tcp:127.0.0.1:6633: connecting...
Apr 23 13:19:25|00015|rconn|INFO|br0-tcp:127.0.0.1:6633: connected
Apr 23 13:19:25|00016|ofp_util|WARN|received Nicira extension message of unknown type 8
Apr 23 13:19:25|00017|ofp_util|WARN|received Nicira extension message of unknown type 8
Apr 23 13:19:25|00018|connmgr|INFO|br0-tcp:127.0.0.1:6633: sending type OFPET_BAD_REQUEST, code OFPBRC_BAD_SUBTYPE error reply to invalid message
Apr 23 13:19:27|00019|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:28|00020|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:29|00021|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:30|00022|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:31|00023|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:31|00024|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:32|00025|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:32|00026|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:33|00027|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:34|00028|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:35|00029|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:35|00030|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:36|00031|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:36|00032|dpif|WARN|system@br0: recv failed (No buffer space available)
Apr 23 13:19:37|00033|pktbuf|WARN|cookie mismatch: 000001fa != 000002fa
Apr 23 13:19:37|00034|connmgr|INFO|br0-tcp:127.0.0.1:6633: sending type OFPET_BAD_REQUEST, code OFPBRC_BUFFER_UNKNOWN error reply to OFPT_FLOW_MOD message
Apr 23 13:19:37|00035|pktbuf|WARN|cookie mismatch: 000001fb != 000002fb
Apr 23 13:19:37|00036|connmgr|INFO|br0-tcp:127.0.0.1:6633: sending type OFPET_BAD_REQUEST, code OFPBRC_BUFFER_UNKNOWN error reply to OFPT_FLOW_MOD message
Apr 23 13:19:37|00037|pktbuf|WARN|cookie mismatch: 000001fc != 000002fc
Apr 23 13:19:37|00038|connmgr|INFO|br0-tcp:127.0.0.1:6633: sending type OFPET_BAD_REQUEST, code OFPBRC_BUFFER_UNKNOWN error reply to OFPT_FLOW_MOD message
Apr 23 13:19:37|00039|pktbuf|WARN|cookie mismatch: 000001fd != 000002fd
Apr 23 13:19:37|00040|connmgr|INFO|br0-tcp:127.0.0.1:6633: sending type OFPET_BAD_REQUEST, code OFPBRC_BUFFER_UNKNOWN error reply to OFPT_FLOW_MOD message
Apr 23 13:19:37|00041|pktbuf
[ovs-discuss] OVS disconnecting from controller when under heavy load.
Hi,

I was running OVS (ovs-1.1.2 in kernel mode, on a Core i7 machine with 4 GB of RAM) under several Gbps of load. I noticed some unexpected behavior: OVS would disconnect from the controller (NOX 0.9 zaku) under heavy load and re-establish the connection when the load was reduced. The behavior did not follow a particular pattern but seemed to become rare below 2 Gbps. The reason for the disconnection was a lack of echo-request/reply packets being exchanged between the controller and OVS.

Each time the experiment was run, OVS had about 10k flows in its flow table (added proactively), and the controller was not being invoked for packet-in events (all the relevant flows had already been written).

I was unable to find any relevant posts on the mailing lists regarding a similar issue (controller disconnecting when OVS is under gigabits per second of traffic). I have no idea what could be causing this behavior, and hope that someone can enlighten me.

Regards,
junaid
Re: [ovs-discuss] Query regarding arp packet
On Mon, Oct 31, 2011 at 11:07 AM, Ben Pfaff b...@nicira.com wrote:
> On Sat, Oct 29, 2011 at 01:36:33AM -0700, junaid khalid wrote:
>> I have an issue regarding the ARP packet. I am generating an ARP request from my controller with the source MAC address the same as that of my NIC. I didn't receive its reply on the controller even though my machine receives the reply packets of my requests, which are destined to the MAC of my NIC. But when I use some other source MAC address, I receive its reply on the controller. Is it a possible bug, or is there something wrong with my configuration?
>
> Can you be more specific about what you're doing? When you say that you are generating an ARP request, do you mean that the controller is sending it directly from its NIC, or that the controller is sending it via a packet_out request sent to some OpenFlow switch? What's the destination of the ARP? Is in-band control configured? Is just one OpenFlow switch involved or multiple? Do you see the ARP being processed at its destination and the reply being sent from that destination, i.e. where does the reply get dropped if anywhere? etc.

I am using a single OpenFlow switch; both the controller and the switch are on the same machine. I am sending an ARP request to the next hop router (which is not an OpenFlow router) via packet_out. I didn't receive its reply on the controller even though my machine receives it. I can see the reply packet (destined to the MAC of my NIC) in Wireshark.

--
Regards,
-junaid
[ovs-discuss] Query regarding arp packet
Hi,

I have an issue regarding the ARP packet. I am generating an ARP request from my controller with the source MAC address the same as that of my NIC. I didn't receive its reply on the controller even though my machine receives the reply packets of my requests, which are destined to the MAC of my NIC. But when I use some other source MAC address, I receive its reply on the controller. Is it a possible bug, or is there something wrong with my configuration?

P.S. I am using Open vSwitch 1.2.2.

--
Regards,
-junaid khalid