Jesse,

To add to Juan's points, we are not power users of OVS. All we need is DMAC-based switching of packets that pays attention to VLAN IDs. Short-lived vs. long-lived flows are an L4 distinction that we don't want OVS to pay attention to. Is this possible? Is it possible to run OVS in bridge mode, or some such basic equivalent?
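Concretely, something like the following is all we are after. A minimal sketch, with br0, eth0, and vif1.0 standing in as illustrative names; our understanding is that a fresh OVS bridge with no controller defaults to a single NORMAL flow, i.e. MAC-learning, VLAN-aware switching:

    # create a bridge and attach the physical uplink
    ovs-vsctl add-br br0
    ovs-vsctl add-port br0 eth0
    # attach a guest interface as an access port on VLAN 10
    ovs-vsctl add-port br0 vif1.0 tag=10
    # with no controller, the flow table should show only the
    # default NORMAL (learning-switch) action
    ovs-ofctl dump-flows br0

If that default already gives us DMAC/VLAN switching, the remaining question is whether the per-connection kernel flow setups (the cached kernel flows are exact-match and include L4 fields) are what hurts us under many short flows.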
Thanks,
-vijay

On Fri, Feb 3, 2012 at 4:29 PM, Juan Tellez <[email protected]> wrote:
> Jesse,
>
> > I don't really see anything in the information that you've given that
> > indicates OVS is the one dropping packets
>
> We do not see the same problem with the Linux Bridge, and we want to use
> the vswitch.
>
> Is it possible that OVS disrupts long-lived TCP connections in the
> presence of many short flows? The long-lived connections experience long
> 10-15s delays caused by excessive packet drops after running for about 16
> hours.
>
> Again, I'm looking for the possibility that this is an already-fixed bug
> in 1.2, 1.3 or 1.4.
>
> Thanks,
>
> Juan
>
> -----Original Message-----
> From: Jesse Gross [mailto:[email protected]]
> Sent: Friday, February 03, 2012 9:20 AM
> To: Juan Tellez
> Cc: [email protected]; Vijay Chander
> Subject: Re: [ovs-discuss] xenServer and openVswitch 1.0.99
>
> On Thu, Feb 2, 2012 at 10:06 PM, Juan Tellez <[email protected]> wrote:
> > Jesse,
> >
> > > What's the traffic mixture like when you have this problem with vlans
> > > (i.e. single flow vs. many connections)? If you run a single stream,
> > > what is the ratio of hits to misses on the relevant datapath?
> >
> > Our traffic is varied; some flows are very short, others are
> > long-lasting TCP connections.
>
> If you have many short flows, it's possible that the CPU load you see
> is simply the result of normal processing.
>
> > We are mostly concerned about the long flows dropping lots of packets.
> > When we see messages such as the above ones, can we expect that the
> > vswitch has dropped packets?
>
> I don't really see anything in the information that you've given that
> indicates OVS is the one dropping packets.
>
> > I think the relevant traffic is in the vif*.0 below, which is 1.1%.
> > Can you explain the hit/miss/lost statistics below?
>
> Hits are packets processed entirely in the kernel, misses are sent to
> userspace for flow setup, lost are packets that were queued to
> userspace but exceeded the queue length.
>
> > Kern.log traces are interesting; they do seem to correlate to some of
> > the failures we see:
> >
> > /var/log/kern.log.9.gz:Oct 2 20:55:58 localhost kernel: vif122.0: draining TX queue
> > /var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif117.0: draining TX queue
> > /var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif121.0: draining TX queue
> > /var/log/kern.log.9.gz:Oct 2 20:56:02 localhost kernel: vif112.0: draining TX queue
> > /var/log/kern.log.9.gz:Oct 2 20:56:05 localhost kernel: vif113.0: draining TX queue
> >
> > Is draining occurring on a regular interval?
>
> Those messages are coming from netback, not OVS. Combined with the
> fact that you see dropped counts going up on the interface itself, it
> seems that's the likely cause of the problem. Probably something on
> the guest side is not keeping up, but you should talk to the Xen guys.
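For reference, the hit/miss/lost counters Jesse describes can be read directly on the host. A minimal sketch, with br0 and dp0 standing in for the actual bridge and datapath names (the first command lists the real ones):

    # per-datapath lookup counters: hit (handled in the kernel),
    # missed (sent to userspace for flow setup), lost (queue overflow)
    ovs-dpctl show
    # per-port rx/tx/drop counters on the bridge
    ovs-ofctl dump-ports br0
    # currently cached kernel flows; a large, fast-churning table points
    # at many short flows rather than drops inside OVS
    ovs-dpctl dump-flows dp0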
