Jesse,
>>> What's the traffic mixture like when you have this problem with vlans (i.e.
>>> single flow vs. many connections)? If you run a single stream, what is the
>>> ratio of hits to misses on the relevant datapath?
Our traffic is varied: some flows are very short, others are long-lasting TCP
connections. We are mostly concerned about the long flows dropping lots of
packets. When we see messages like the ones above, can we expect that the
vswitch has dropped packets?
I think the relevant traffic is on the vif*.0 ports below, where the miss
ratio is about 1.1%. Can you explain the hit/missed/lost statistics below?
system@xenbr2:
lookups: frags:0, hit:110495900, missed:1284600, lost:34
port 0: xenbr2 (internal)
port 1: eth2
port 2: xapi21 (internal)
port 3: vif455.1
port 4: vif462.0
port 5: vif456.0
port 6: vif461.0
port 7: vif457.0
port 8: vif458.0
port 9: vif465.0
port 10: vif459.0
port 11: vif467.0
port 12: vif460.0
port 13: vif463.0
port 14: vif464.0
port 16: vif466.0
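
For reference, the ratio quoted above can be recomputed from the lookups line.
This is only a minimal sketch: it assumes the counters always appear as
name:value pairs as in this output, and the function names parse_lookups and
miss_ratio are made up for illustration, not part of any OVS tool.

```python
import re


def parse_lookups(line):
    """Parse a 'lookups:' line from `ovs-dpctl show` into a dict of counters.

    Assumes every counter is printed as name:value, as in the output above.
    """
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", line)}


def miss_ratio(stats):
    """Fraction of looked-up packets that missed the datapath flow table."""
    total = stats["hit"] + stats["missed"]
    return stats["missed"] / total if total else 0.0


line = "lookups: frags:0, hit:110495900, missed:1284600, lost:34"
stats = parse_lookups(line)
print(f"miss ratio: {miss_ratio(stats):.2%}")
print(f"lost: {stats['lost']}")
```

Note that these counters are per datapath (xenbr2 here), not per port, so the
ratio covers all the vifs listed above together.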
We are attempting to push the host by sending lots of small flows, in order to
see if we can reproduce the problem in our system more easily.
>>> Is there anything interesting in the ovs-vswitchd log?
No. They are empty for the times we are interested in.
The kern.log traces are interesting, though. They do seem to correlate with
some of the failures we see:
/var/log/kern.log.9.gz:Oct 2 20:55:58 localhost kernel: vif122.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif117.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif121.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:02 localhost kernel: vif112.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:05 localhost kernel: vif113.0: draining TX queue
Is draining occurring on a regular interval?
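
One way to check this from the logs is to pull out the timestamps of the
"draining TX queue" messages and look at the gaps between them. The following
is a rough sketch only: the function name drain_gaps is hypothetical, the
regular expression is written against the five lines quoted above, and the
year is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Oct 2 20:55:58 localhost kernel: vif122.0: draining TX queue
# \s+ before "queue" also tolerates the message being wrapped by a mailer.
EVENT = re.compile(
    r"(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: (\S+): draining TX\s+queue"
)


def drain_gaps(text, year=2011):
    """Return the drain events and the gaps (seconds) between consecutive ones.

    `year` is an assumption; syslog lines do not include it.
    """
    events = [
        (m.group(2), datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S"))
        for m in EVENT.finditer(text)
    ]
    times = sorted(t for _, t in events)
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return events, gaps


sample = """\
/var/log/kern.log.9.gz:Oct 2 20:55:58 localhost kernel: vif122.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif117.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:00 localhost kernel: vif121.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:02 localhost kernel: vif112.0: draining TX queue
/var/log/kern.log.9.gz:Oct 2 20:56:05 localhost kernel: vif113.0: draining TX queue
"""

events, gaps = drain_gaps(sample)
print(gaps)
```

The five lines above each come from a different vif, so these gaps only show
how the messages cluster across interfaces. Grouping the events by interface
name over a longer log would be needed to see a per-vif interval.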
Thanks,
Juan Tellez
-----Original Message-----
From: Jesse Gross [mailto:[email protected]]
Sent: Thursday, February 02, 2012 6:36 PM
To: Juan Tellez
Cc: [email protected]; Vijay Chander
Subject: Re: [ovs-discuss] xenServer and openVswitch 1.0.99
On Wed, Feb 1, 2012 at 6:07 PM, Juan Tellez <[email protected]> wrote:
> Jesse,
>
> Dmesg hasn't changed for a while .. and sadly it is not time-stamped. Below
> is the tail:
>
> ..
> device vif467.1 entered promiscuous mode
> device tap467.0 entered promiscuous mode
> device tap467.1 entered promiscuous mode
> /local/domain/465/device/vif/0: Connected
> /local/domain/465/device/vif/1: Connected
> /local/domain/466/device/vif/0: Connected
> /local/domain/466/device/vif/1: Connected
> /local/domain/467/device/vif/0: Connected
> /local/domain/467/device/vif/1: Connected
> vif458.2: draining TX queue
> vif456.2: draining TX queue
> vif457.2: draining TX queue
> vif459.2: draining TX queue
>
>> What are the outputs of dmesg and ovs-dpctl show?
What's the traffic mixture like when you have this problem with vlans
(i.e. single flow vs. many connections)? If you run a single stream,
what is the ratio of hits to misses on the relevant datapath?
Is there anything interesting in the ovs-vswitchd log?