Hi all,
I’m wondering if anyone here has seen this issue before. I’ve spent the last
couple of days troubleshooting it:
Platform:
Host: XenServer 7.0 running on 2 x Xeon E5-2660 v4, 256GB RAM
Server VM: FreeBSD 11 (tested on 11.0-p15 and 11.1-p6), 2GB RAM (also tested
with 32GB RAM), 1x50GB HDD, 1 x NIC, 2 or more vCPUs in any combination (2
sockets x 1 core, 1 socket x 2 cores, …)
Client VM: FreeBSD 11, any configuration of vCPUs, RAM and HDD.
Behaviour:
Sporadic interruption of TCP sessions when utilising the above machine as a
“server” with “clients” connecting. Looking into the communication with
pcap/Wireshark, you see a TCP Dup Ack sent from both ends, followed by the
client sending an RST packet, terminating the TCP session. We have also seen
evidence of the client sending a Keepalive packet, which is ACK’d by the server
before the RST is sent from the client end.
To recreate:
On the above VM, perform a vanilla install of nginx:
pkg install nginx
service nginx onestart
Then on a client VM (currently only tested with FreeBSD), run the following (or
similar) in bash, since the {1..10000} brace expansion is not supported by
FreeBSD's /bin/sh:
for i in {1..10000}; do
  code=$(curl -s -o /dev/null -w "%{http_code}" http://10.2.122.71)
  if [ "$code" != 200 ]; then echo "error"; fi
done
When vCPUs=1 on the server, I get no errors; when vCPUs>1, I get errors
reported. The frequency of errors *seems* to be proportional to the number of
vCPUs, but they are sporadic with no clear periodicity or pattern, so that is
just anecdotal. Also, the problem seems by far the most prevalent when
communicating between two VMs on the same host, in the same VLAN. Xen still
sends packets via the switch rather than bridging internally between the
interfaces.
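To put numbers on the error frequency rather than relying on impressions, the
test loop could be extended to count failures and timestamp each one. The
following is only a minimal sketch: the function name run_probe and the output
format are made up for illustration, and it assumes curl is installed on the
client. It sticks to POSIX sh so it should run under FreeBSD's /bin/sh as well
as bash.

```shell
#!/bin/sh
# Sketch: count failed requests so the error rate can be compared across
# vCPU configurations. run_probe is a hypothetical helper, not an existing tool.

run_probe() {
    url=$1      # server URL to request
    total=$2    # number of requests to send
    failed=0
    i=1
    while [ "$i" -le "$total" ]; do
        rc=0
        # Capture the HTTP status; curl typically reports 000 when it
        # receives no HTTP response at all (for example after a reset).
        code=$(curl -s -o /dev/null -w '%{http_code}' "$url") || rc=$?
        if [ "$code" != 200 ]; then
            failed=$((failed + 1))
            # Timestamp each failure on stderr so it can be matched
            # against the packet capture afterwards.
            echo "$(date '+%H:%M:%S') request=$i http_code=$code curl_exit=$rc" >&2
        fi
        i=$((i + 1))
    done
    # Summary line on stdout: failures out of total requests.
    echo "failed=$failed total=$total"
}

# Example invocation (address taken from the test above):
# run_probe http://10.2.122.71 10000
```

Running this once per vCPU configuration and comparing the "failed=" figures
would show whether the apparent proportionality holds up, and the timestamps
give a starting point for finding the matching RST in the capture.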
Note that we have not had a chance to investigate the effect of different
numbers of CPUs on the *client* end; however, the problem does seem to be
governed entirely by the server end.
We cannot recreate this issue with the same FreeBSD image and the same
configuration when using KVM as the hypervisor.
Has anyone met this before?
Thanks,
Laurence
