Thanks for the information. We've seen that OVS can handle over 10Gbps. The problem that you're seeing is related to flow setups. In releases prior to 1.7, the flow setup rate was roughly 40,000 flows per second. The changes in 1.7 increase that number to 120,000.

As we discussed, the bulk of your flows appear to be short-lived, and each packet is roughly 100 bytes. By my calculations, 20-30Mbps translates to roughly 25,000 to 37,500 flow setups per second if every packet missed. 320Mbps would require 400,000 flow setups per second. These are very rough numbers, since even many of the short-lived flows consist of a few packets, but it gives us ballpark numbers for your type of traffic.
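
For reference, the arithmetic behind those figures is just bits per second divided by 8 (bits per byte) and then by 100 (bytes per packet), assuming the worst case of one flow setup for every packet:

$ echo $((20000000 / 8 / 100)) $((30000000 / 8 / 100)) $((320000000 / 8 / 100))
25000 37500 400000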
We've not found your situation to be typical. I'll write you off-list, since I have some questions about your setup and the type of traffic you're experiencing. With additional information, we may be able to figure out ways to optimize for traffic such as yours.

--Justin

On Jun 6, 2012, at 12:44 AM, Kaushal Shubhank wrote:

> The packets lost in the previous case came about in the couple of minutes we put the whole 350Mbps of load on. Under the lighter load, we did not see packet loss.
>
> I am running traffic at the same rate, 20-30Mbps, and the CPU load is also the same. I think as soon as we add the full load, the high CPU load will lead to packet losses (which is what happened last time).
>
> In any case, we'll let this setup run for at least one day. What do you think should be our next step? Note that adding the full load involves getting permission from a lot of people, so unless we are sure the CPU load will not shoot up, we would not be allowed to wire in the full load.
>
> - Kaushal
>
> On Wed, Jun 6, 2012 at 12:58 PM, Justin Pettit <[email protected]> wrote:
> Okay, great. The big change here is that we're intentionally setting up fewer of those kernel flows, which forces their packets to userspace. Here's the commit message that describes the change:
>
>     ofproto-dpif: Implement "flow setup governor" to speed up many short flows.
>
>     The cost of creating and initializing facets and subfacets and installing, tracking, and uninstalling kernel flows is significant. When most flows have only one or a few packets, this overhead is higher than the cost of handling each packet individually. This commit introduces heuristics that cheaply count (approximately) the number of packets seen in a flow and skips most of this expensive bookkeeping until the packet count exceeds a threshold (currently 5 packets).
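>
> (A rough way to see how much of your traffic sits at or below that threshold is to count the datapath flows with five or fewer packets, e.g.:
>
>     $ sudo ovs-dpctl dump-flows br0 | grep -cE 'packets:[0-5],'
>
> The absolute count will bounce around from moment to moment; the proportion of the total is what matters.)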
>
> So the fact that more packets are going to userspace is not bad and is, in fact, expected. (There were also changes to make packet processing faster in userspace.) It appears that you're not losing packets:
>
>     lookups: hit:117426873 missed:87741549 lost:0
>
> In your previous message you were seeing a lot:
>
>     lookups: hit:3105457869 missed:792488043 lost:903955
>
> That packet loss is bad, since it means that the packet was lost between the kernel and userspace, and it's gone forever. Obviously, there were a lot more packets in that previous run, so we should monitor that counter over your full-day run. You said that you detected packet loss in your previous run. Are you still seeing that?
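>
> (A low-effort way to keep an eye on it during that run is just to log the datapath counters periodically and check whether "lost" grows, e.g. something like:
>
>     $ while true; do date; sudo ovs-dpctl show | grep lookups; sleep 300; done >> dpstats.log
>
> The exact interval and log file name don't matter much.)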
>
> At what rate are you running traffic? Are you still seeing 30% CPU for 20Mbps?
>
> --Justin
>
>
> On Jun 6, 2012, at 12:09 AM, Kaushal Shubhank wrote:
>
> > Hi Justin, Oliver,
> >
> > So I switched to the newer version around 11 hrs ago. Here are some observations:
> >
> > 1. The number of flows has come down to a couple of thousand (from 12-15k). However, we might wait for the setup to run for one whole day, to see it through peak and lean times, and then count the flows again.
> >
> > 2. A smaller percentage of the flows have low packet counts. I have attached a dump for reference.
> >
> > 3. The CPU usage is still around the same, which means we still see misses in the kernel flow table.
> >
> > $ sudo ovs-dpctl dump-flows br0 | grep -e "packets:[0123]," | wc -l
> > 764
> > $ sudo ovs-dpctl show
> > system@br0:
> >     lookups: hit:117426873 missed:87741549 lost:0
> >     flows: 2145
> >     port 0: br0 (internal)
> >     port 1: eth3
> >     port 2: eth4
> >
> > - Kaushal
> >
> > On Tue, Jun 5, 2012 at 12:49 PM, Kaushal Shubhank <[email protected]> wrote:
> > Surely we will try the 1.7.0 version. Considering this is production, we will only be able to try it in off-peak hours. We will update you with the results as soon as possible.
> >
> > Thanks a lot, and looking forward to contributing to the project in any way possible.
> >
> > Kaushal
> >
> > On Tue, Jun 5, 2012 at 12:36 PM, Justin Pettit <[email protected]> wrote:
> > Of your nearly 12,000 flows, over 10,000 had fewer than four packets:
> >
> > [jpettit@timber-2 Desktop] grep -e "packets:[0123]," live_flows_20120604 | wc -l
> > 10143
> >
> > Short-lived flows are really difficult for OVS, since there's a lot of overhead in setting up and maintaining the kernel flow table. We made *substantial* improvements for handling just this scenario in the forthcoming 1.7.0 release. The code should be stable, but it hasn't gone through a full QA regression. However, if you're willing to give it a shot, you can download a snapshot of the tip of the 1.7 branch:
> >
> > http://openvswitch.org/cgi-bin/gitweb.cgi?p=openvswitch;a=snapshot;h=04a67c083458784d1fed689bcb7ed904026d2352;sf=tgz
> >
> > We've only been able to test it with generated traffic, so seeing how much it improves performance with real traffic would be invaluable. If you're able to give it a try and let us know, we'd really appreciate it.
> >
> > --Justin
> >
> > On Jun 4, 2012, at 11:39 PM, Kaushal Shubhank wrote:
> >
> > > Hi Justin,
> > >
> > > This is how the connections are made, so I guess eth3 and eth4 are not in the same network segment:
> > > Router--->eth4==eth3--->switch
> > >
> > > We tried with an eviction threshold of 10000, but were seeing high packet losses. I am pasting a few kernel flows (ovs-dpctl dump-flows) here, and attaching the whole dump (11k flows). I don't see any pattern. The port 80 filtering flows were around 800 of the 11k flows, which means the other flows were just non-port-80 packets that we simply forward from eth3 to eth4 or vice versa.
> > >
> > > If there is any way to reduce those (11k - 800) flows, we could reduce CPU usage.
> > >
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=203.188.231.195,dst=1.2.138.199,proto=17,tos=0,ttl=127,frag=no),udp(src=62294,dst=16464), packets:1, bytes:60, used:3.170s, actions:2
> > > in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=94.194.158.115,dst=110.172.18.250,proto=6,tos=0,ttl=22,frag=no),tcp(src=62760,dst=47868), packets:0, bytes:0, used:never, actions:1
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=203.188.231.134,dst=209.85.148.139,proto=6,tos=0,ttl=126,frag=no),tcp(src=64741,dst=80), packets:1, bytes:60, used:2.850s, actions:set(eth(src=00:15:17:44:03:6e,dst=00:e0:ed:15:24:4a)),0
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=110.172.18.137,dst=219.90.100.27,proto=6,tos=0,ttl=127,frag=no),tcp(src=49504,dst=12758), packets:67603, bytes:4060369, used:0.360s, actions:2
> > > in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=189.63.179.72,dst=203.188.231.195,proto=17,tos=0,ttl=110,frag=no),udp(src=60414,dst=16464), packets:1, bytes:60, used:0.620s, actions:1
> > > in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=213.57.230.226,dst=110.172.18.8,proto=17,tos=0,ttl=101,frag=no),udp(src=59274,dst=24844), packets:0, bytes:0, used:never, actions:1
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=195.35.128.105,dst=110.172.18.250,proto=6,tos=0,ttl=15,frag=no),tcp(src=54303,dst=47868), packets:3, bytes:222, used:5.300s, actions:2
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=110.172.18.154,dst=76.186.139.105,proto=6,tos=0,ttl=126,frag=no),tcp(src=10369,dst=61585), packets:1, bytes:60, used:0.290s, actions:2
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=78.92.118.9,dst=110.172.18.80,proto=17,tos=0,ttl=23,frag=no),udp(src=44779,dst=59357), packets:0, bytes:0, used:never, actions:2
> > > in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=89.216.130.134,dst=203.188.231.206,proto=17,tos=0,ttl=33,frag=no),udp(src=52342,dst=30291), packets:0, bytes:0, used:never, actions:1
> > > in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=76.226.72.157,dst=110.172.18.250,proto=6,tos=0,ttl=36,frag=no),tcp(src=46637,dst=47868), packets:2, bytes:148, used:2.730s, actions:1
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=89.211.162.95,dst=110.172.18.80,proto=17,tos=0,ttl=92,frag=no),udp(src=19442,dst=59357), packets:0, bytes:0, used:never, actions:2
> > > in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=86.179.231.157,dst=110.172.18.11,proto=17,tos=0,ttl=109,frag=no),udp(src=58240,dst=23813), packets:7, bytes:1181, used:1.700s, actions:1
> > > in_port(2),eth(src=e8:b7:48:42:5b:09,dst=00:15:17:44:03:6e),eth_type(0x0800),ipv4(src=72.201.71.66,dst=203.188.231.195,proto=17,tos=0,ttl=115,frag=no),udp(src=1025,dst=16464), packets:1, bytes:60, used:2.620s, actions:1
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=95.165.107.21,dst=110.172.18.80,proto=17,tos=0,ttl=96,frag=no),udp(src=49400,dst=59357), packets:1, bytes:72, used:3.360s, actions:2
> > > in_port(1),eth(src=00:15:17:44:03:6e,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=110.172.18.203,dst=212.96.161.246,proto=6,tos=0,ttl=127,frag=no),tcp(src=49172,dst=80), packets:2, bytes:735, used:0.240s, actions:set(eth(src=00:15:17:44:03:6e,dst=00:e0:ed:15:24:4a)),0
> > > in_port(0),eth(src=00:e0:ed:15:24:4a,dst=e8:b7:48:42:5b:09),eth_type(0x0800),ipv4(src=203.188.231.54,dst=111.119.15.31,proto=6,tos=0,ttl=64,frag=no),tcp(src=47463,dst=80), packets:6, bytes:928, used:4.440s, actions:2
> > >
> > > Thanks,
> > > Kaushal
> > >
> > > On Tue, Jun 5, 2012 at 11:29 AM, Justin Pettit <[email protected]> wrote:
> > > Are eth3 and eth4 on the same network segment? If so, I'd guess you've introduced a loop.
> > >
> > > I wouldn't recommend setting your eviction threshold so high, since OVS is going to have to do a lot of work to maintain so many kernel flows. I wouldn't go above 10s of thousands of flows.
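> > >
> > > (If you lower it again: as far as I recall, in this version flow-eviction-threshold is a per-bridge other_config key, so something along these lines should work, but double-check the key name in the ovs-vswitchd.conf.db man page for your release:
> > >
> > >     $ ovs-vsctl set bridge br0 other_config:flow-eviction-threshold=10000
> > >
> > > As above, I'd keep the value in the low tens of thousands at most.)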
> > >
> > > What do your kernel flows look like? You have too many to post here, but maybe you can provide a sampling of a couple hundred. Do you see any patterns?
> > >
> > > --Justin
> > >
> > > On Jun 4, 2012, at 10:40 PM, Kaushal Shubhank wrote:
> > >
> > > > Hello,
> > > >
> > > > We have a simple setup in which a server running a transparent proxy needs to intercept HTTP port 80 traffic. We have installed Open vSwitch (1.4.1) on the same server (running Ubuntu natty, 2.6.38-12-server, 64-bit) to feed the proxy with that type of packet while bridging all other packets. The functionality is working properly, but the CPU usage is quite high (~30% for 20Mbps of traffic). The total load we need to deploy on is around 350Mbps, and as soon as we plug that in, the CPU usage shoots up to 100% (on a quad-core Intel(R) Xeon(R) CPU E5420 @ 2.50GHz), even when we only allow all the packets to flow through br0. Packet loss also starts to occur.
> > > >
> > > > After reading similar discussions in previous threads, I made my bridge stp-enabled and increased the flow-eviction-threshold to "1000000". Still, the CPU load is high due to misses in the kernel flow table. I have defined only the following flows:
> > > >
> > > > $ ovs-ofctl dump-flows br0
> > > >
> > > > NXST_FLOW reply (xid=0x4):
> > > >  cookie=0x0, duration=80105.621s, table=0, n_packets=61978784, n_bytes=7438892513, priority=100,tcp,in_port=1,tp_dst=80 actions=mod_dl_dst:00:e0:ed:15:24:4a,LOCAL
> > > >  cookie=0x0, duration=80105.501s, table=0, n_packets=49343241, n_bytes=113922939324, priority=100,tcp,dl_src=00:e0:ed:15:24:4a,tp_src=80 actions=output:1
> > > >  cookie=0x0, duration=518332.577s, table=0, n_packets=3052099665, n_bytes=2041603012562, priority=0 actions=NORMAL
> > > >  cookie=0x0, duration=80105.586s, table=0, n_packets=46209782, n_bytes=109671221356, priority=100,tcp,in_port=2,tp_src=80 actions=mod_dl_dst:00:e0:ed:15:24:4a,LOCAL
> > > >  cookie=0x0, duration=80105.601s, table=0, n_packets=40389137, n_bytes=5660094662, priority=100,tcp,dl_src=00:e0:ed:15:24:4a,tp_dst=80 actions=output:2
> > > >
> > > > where 00:e0:ed:15:24:4a is br0's MAC address.
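> > > >
> > > > (For reference, the first rule above corresponds to an ovs-ofctl add-flow command along these lines, and the other port-80 rules follow the same pattern:
> > > >
> > > >     $ ovs-ofctl add-flow br0 "priority=100,tcp,in_port=1,tp_dst=80,actions=mod_dl_dst:00:e0:ed:15:24:4a,LOCAL"
> > > >
> > > > so the redirect is just a destination-MAC rewrite to br0's own address followed by output to the LOCAL (br0) port, where the proxy picks the packets up.)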
> > > >
> > > > $ ovs-dpctl show
> > > >
> > > > system@br0:
> > > >     lookups: hit:3105457869 missed:792488043 lost:903955 {these lost packets came with the 350Mbps load and do not change with 20Mbps}
> > > >     flows: 12251
> > > >     port 0: br0 (internal)
> > > >     port 1: eth3
> > > >     port 2: eth4
> > > >
> > > > As far as we could understand, the missed packets here cause a context switch to user mode and increase CPU usage. Let me know if any other detail about the setup is required.
> > > >
> > > > Is there anything else we can do to reduce CPU usage?
> > > > Can the flows above be improved in some way?
> > > > Is there any other configuration for deployment in production that we missed?
> > > >
> > > > Regards,
> > > > Kaushal
> > > > _______________________________________________
> > > > discuss mailing list
> > > > [email protected]
> > > > http://openvswitch.org/mailman/listinfo/discuss
> > >
> > > <flows.tgz>
> >
> > <live_flows_20120606_1206>

_______________________________________________
discuss mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/discuss
