Hello,
A while back we experienced packet loss during a traffic peak on a XenServer
6.2 host, which uses Open vSwitch v1.4.6.
The following message appeared multiple times during this period of packet
loss:
  Apr 26 11:11:03 hostname ovs-vswitchd: 6221017|dpif|WARN|system@xenbr0: recv failed (No buffer space available)
We found the following information about this message in the archives of the
OVS mailing lists:
- [1] [ovs-discuss] warning messages regarding buffer space
  Mentions flow-eviction-threshold and includes a patch (included in 1.4.6)
  to increase some buffer size.
  We also found an article from Citrix discussing the flow-eviction-threshold
  at length [2].
- [3] [ovs-discuss] No buffer space available warning
  Suggests upgrading OVS to 1.7.x.
- [4] [ovs-dev] [OVS 1.4.6] UNIX domain socket problems
  "Those messages mean that the kernel is queuing packets to userspace
  faster than userspace can process them. It can indicate that packets
  are arriving very quickly, but it can also indicate that userspace
  isn't getting any CPU time for a long time so that packets can pile
  up. The prior evidence points toward the latter."
What I would like to find out is:
- Under what circumstances is this message logged?
  My understanding is that there is a buffer somewhere between the physical
  interface and the OVS bridge that is not emptied fast enough because the
  OVS userspace process cannot keep up with processing all the packets.
  Once this buffer is full, newly arriving packets are dropped and this
  message appears.
- What exactly is this buffer?
- How (if at all) is it related to the flow-eviction-threshold? (Asking
  because this keeps coming up.)
- Is there a way to monitor this or to get the current buffer usage?
- What limit did we hit? Can this limit be increased, and would you
  recommend doing so?
I know this is an ancient version of OVS, but since this is occurring on
XenServer, upgrading is not really an option.
We're also investigating the source of the traffic peak and did find some
unicast flooding on our network, which we're trying to eliminate right now.
We also found other XenServer hosts that sporadically log the same message,
which worries us.
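One way to compare how often the warning shows up on each host would be to
count the occurrences per minute in syslog. Below is a minimal sketch of that
idea in Python. The timestamp layout is assumed from the single log line
quoted above, count_per_minute is only an illustrative helper name (it is not
part of any OVS tooling), and the second sample line is made up for the
example.

```python
import re
from collections import Counter

# Matches the ovs-vswitchd warning quoted above. The syslog layout
# ("Apr 26 11:11:03 hostname ovs-vswitchd: ...") is an assumption based on
# that one sample line and may differ on other hosts.
PATTERN = re.compile(
    r"^(?P<minute>\w{3}\s+\d{1,2} \d{2}:\d{2}):\d{2} \S+ ovs-vswitchd: "
    r".*\|dpif\|WARN\|.*recv failed \(No buffer space available\)"
)

def count_per_minute(lines):
    """Return a Counter mapping 'Mon DD HH:MM' to the number of warnings."""
    counts = Counter()
    for line in lines:
        match = PATTERN.match(line)
        if match:
            counts[match.group("minute")] += 1
    return counts

# Sample input: the first line is the one from our log, the second is a
# made-up repeat one second later, the third is an unrelated line.
sample = [
    "Apr 26 11:11:03 hostname ovs-vswitchd: 6221017|dpif|WARN|"
    "system@xenbr0: recv failed (No buffer space available)",
    "Apr 26 11:11:04 hostname ovs-vswitchd: 6221018|dpif|WARN|"
    "system@xenbr0: recv failed (No buffer space available)",
    "Apr 26 11:12:00 hostname sshd: unrelated line",
]

for minute, n in count_per_minute(sample).items():
    print(minute, n)
```

In practice the function would be fed the lines of the host's syslog file
instead of the sample list. This only shows how frequently the warning is
logged; it does not show the actual buffer usage.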
In order to make sure that this does not happen again, we would like to fully
understand what circumstances cause this.
Any details or pointers to more info would be appreciated.
Kind regards
- Niels
[1] https://marc.info/?t=145964532500005&r=1&w=2
[2] https://support.citrix.com/article/CTX140764
[3] https://marc.info/?t=145964533900027&r=1&w=2
[4] https://mail.openvswitch.org/pipermail/ovs-dev/2014-May/284249.html