I have two separate networks, which I'll call "home" and "rack". At my rack, I have a couple of physical machines that each have a whole bunch of VMs running on them using libvirt/KVM. One physical machine is running CentOS 5 (C5), the other is running CentOS 6 (C6). Most of my VMs are running C5, while two are running C6. I have one VM (running C5) which runs OpenVPN to connect the rack to my home.
I discovered yesterday that while I can ping the C6 VMs and physical host, I cannot do UDP or TCP sessions. I used tcpdump to capture packets at all hops along the way. For UDP (DNS queries), I would send out a packet, and the reply would return all the way to my machine, and then be discarded before the application saw it. For TCP, I would send out a SYN, successfully receive a SYN+ACK, and send an ACK. The remote side would then send a packet (the SSH "welcome" string, in this case), which my ssh app would never see.

As far as I could tell initially, everything seemed to be fine with the received packets. The IP header checksum was correct, and the DNS transaction ID matched. The TCP counts were correct. There was a TCP checksum, but I didn't notice initially that Wireshark wasn't validating it, so it wasn't flagging it either. Eventually I noticed that and turned TCP checksum validation on, and sure enough the checksum was wrong. I assume the same was true for UDP, although I never actually checked.

So why was the checksum wrong? It turns out that the virtio driver in C6 has some optimizations in it. It waits until a packet actually hits the wire to calculate the checksum. Internally to the physical host, it never bothers: since the packet is never going on the wire, the checksum is neither calculated nor checked. The problem was that OpenVPN was sending those bad packets to my home network without fixing the checksum, since they never hit the physical machine's NIC.

Temporary solution? Move the OpenVPN machine back to the C5 physical host, so the packets are forced to go through the NIC. If this isn't fixed soon in C6, I'll eventually need to move OpenVPN to a physical machine that doesn't host VMs, as I'll be upgrading my machines to C6.

Fun times! :)

Steve Meyers
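
P.S. For anyone who wants to check a captured TCP segment by hand instead of relying on Wireshark's validation setting, here is a minimal sketch of the standard check (the ones' complement sum from RFC 1071 over the IPv4 pseudo-header plus the TCP segment). The function names, the 192.0.2.x addresses, and the hand-built demo segment are all made up for illustration; this is not the tooling I used for the debugging above, and it only covers IPv4.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: ones' complement of the ones' complement sum
    of the data taken as 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_length: int) -> bytes:
    """IPv4 pseudo-header that is covered by the TCP checksum."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_length))


def tcp_checksum_ok(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """True if the checksum field inside `segment` (TCP header plus
    payload) is valid: summing everything, checksum included, gives 0."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return internet_checksum(data) == 0


def fill_tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> bytes:
    """Return a copy of `segment` with a freshly computed checksum
    written into bytes 16-17 of the TCP header."""
    zeroed = segment[:16] + b"\x00\x00" + segment[18:]
    csum = internet_checksum(
        pseudo_header(src_ip, dst_ip, len(zeroed)) + zeroed)
    return zeroed[:16] + struct.pack("!H", csum) + zeroed[18:]


# Hypothetical demo segment: 20-byte TCP header (ports 22 -> 40000,
# PSH+ACK flags, checksum left as 0) followed by an SSH-style banner.
header = struct.pack("!HHIIHHHH", 22, 40000, 1, 1,
                     (5 << 12) | 0x18, 65535, 0, 0)
demo_segment = header + b"SSH-2.0-demo\r\n"
good_segment = fill_tcp_checksum("192.0.2.10", "192.0.2.20", demo_segment)
```

Calling tcp_checksum_ok("192.0.2.10", "192.0.2.20", good_segment) returns True, and it returns False as soon as a byte in the segment is altered without the checksum being recomputed, which is the kind of packet my home machine was quietly dropping.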
