[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
The VM has now had the problem recur, with the monitoring showing that nothing appears odd or excessive with the network usage just before it stops functioning. As such, I may have mis-titled my bug, as I had assumed it was heavy load causing the issue.

One thing which struck me as slightly odd was that, while it was not responding, it was still showing a small number of packets being received but nothing going out:

170613-210207: TX ens3: 1578 pkts/s RX ens3: 2690 pkts/s
170613-210217: TX ens3: 1090 pkts/s RX ens3: 1324 pkts/s
170613-210227: TX ens3: 0 pkts/s RX ens3: 17 pkts/s
170613-210237: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210247: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210257: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210307: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210317: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210327: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210337: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210347: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210357: TX ens3: 0 pkts/s RX ens3: 10 pkts/s
170613-210407: TX ens3: 0 pkts/s RX ens3: 11 pkts/s
170613-210417: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210427: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210437: TX ens3: 0 pkts/s RX ens3: 11 pkts/s

I will have to set up monitoring on the host side as well to see if it reports the same.

-- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1694625 Title: KVM Guest loses network connectivity under heavy load To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/1694625/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
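For anyone wanting to find the moment of failure in a log like the one quoted in this message, the following is a minimal sketch rather than the reporter's actual tooling. It assumes every sample has the exact form `YYMMDD-HHMMSS: TX <iface>: N pkts/s RX <iface>: N pkts/s` as in the excerpt; the function names `parse_samples` and `first_stall` and the `min_run` threshold are hypothetical choices made for illustration.

```python
import re
from typing import List, Optional, Tuple

# One sample, e.g. "170613-210227: TX ens3: 0 pkts/s RX ens3: 17 pkts/s".
# The same interface name must appear on the TX and RX side.
SAMPLE_RE = re.compile(
    r"(?P<ts>\d{6}-\d{6}): TX (?P<iface>\S+): (?P<tx>\d+) pkts/s "
    r"RX (?P=iface): (?P<rx>\d+) pkts/s"
)


def parse_samples(text: str) -> List[Tuple[str, int, int]]:
    """Return (timestamp, tx_pps, rx_pps) for every sample found in text."""
    return [
        (m.group("ts"), int(m.group("tx")), int(m.group("rx")))
        for m in SAMPLE_RE.finditer(text)
    ]


def first_stall(samples: List[Tuple[str, int, int]],
                min_run: int = 3) -> Optional[str]:
    """Return the timestamp that starts the first run of at least min_run
    consecutive samples where nothing is sent but packets still arrive."""
    run_start: Optional[str] = None
    run_len = 0
    for ts, tx, rx in samples:
        if tx == 0 and rx > 0:
            if run_len == 0:
                run_start = ts
            run_len += 1
            if run_len >= min_run:
                return run_start
        else:
            run_len = 0  # traffic flowed again, so reset the run
    return None
```

Applied to the excerpt above, `first_stall(parse_samples(log_text))` should point at the 170613-210227 sample, the first one with zero transmitted packets. Requiring several consecutive samples is only a guard against a single idle interval being mistaken for the failure.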
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
Since I put monitoring in place, it would appear that I haven't had the issue again in 3 days. This is rather frustrating as it was happening every night. I'm going to dial back some of the monitoring in case this was prompting it to stay alive unexpectedly (as I don't think the option of "run this script that keeps pinging stuff" is really a viable long-term solution to the issue).

I don't currently know a way of monitoring the packets/bytes per second of a VM from the host, so I'm going to move the ping/reboot script off the guest and onto the host to see if this makes any difference. It's possible that if it was keeping it alive when sending pings from the guest, I may end up keeping it alive when sending to the guest as well. If this turns out to be the case, I'll kill the pinging off and just leave the stats scripts running in the hope that it'll die at some stage and I'll be able to work out when it did and what throughput was happening at the time.

If there are any other suggestions on how I could do monitoring on this I'd love to hear them. I may also try swapping to the e1000 at some point, if I can get it to stop working again, to prove that it's a virtio issue rather than something more generic.
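On the question of watching a guest's traffic from the host: one possible approach, sketched here under assumptions, is to read the Linux per-interface counters under `/sys/class/net/<iface>/statistics/` for the guest's tap device. The interface name `vnet0` is a hypothetical example (the real name depends on the libvirt/QEMU configuration), and the helper names are invented for illustration. Note that directions are mirrored on a tap device: what the host transmits on the tap is what the guest receives. `virsh domifstat` reports similar counters and may be a simpler option where libvirt is in use.

```python
from pathlib import Path
from typing import Dict

STAT_NAMES = ("rx_packets", "tx_packets", "rx_bytes", "tx_bytes")


def read_iface_stats(iface: str, sys_net: str = "/sys/class/net") -> Dict[str, int]:
    """Read cumulative counters for one interface from sysfs.

    sys_net is a parameter only so the function can be pointed at a test
    directory; on a real host the default should be correct.
    """
    stats_dir = Path(sys_net) / iface / "statistics"
    return {name: int((stats_dir / name).read_text().strip())
            for name in STAT_NAMES}


def guest_view(host_stats: Dict[str, int]) -> Dict[str, int]:
    """Swap RX and TX so the counters read from the guest's point of view."""
    return {
        "rx_packets": host_stats["tx_packets"],
        "tx_packets": host_stats["rx_packets"],
        "rx_bytes": host_stats["tx_bytes"],
        "tx_bytes": host_stats["rx_bytes"],
    }
```

Sampling `guest_view(read_iface_stats("vnet0"))` every few seconds and dividing the differences by the interval gives per-second rates without running anything inside the guest, so it should not affect whether the guest's NIC stays alive.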
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
I didn't manage to force the network to drop out by simply running iperf3. I had it running most of the day, pushing approximately 2-3 Gbps of traffic constantly. The issue is still persisting though, so there must be something going on.

I have now set up some simple scripts which monitor packets per second and bytes per second, logging them to a file, and also something which monitors pings to both the local network and the wider internet. The ping monitor is set to log when the network goes down and restart the machine to get things back up and running. I am hoping that I'll find something in the logs at the times the network dies/the server restarts which will give an indication of what is going on.

I will post another update when I have more info.
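The scripts themselves were not posted. As a rough illustration of what a packets-per-second and bytes-per-second logger can look like on Linux, here is a minimal sketch that samples `/proc/net/dev` at a fixed interval. All names (`Counters`, `parse_proc_net_dev`, `rates`, `log_rates`), the 10-second default and the log line layout are assumptions made for this example, not the reporter's code.

```python
import time
from typing import Dict, NamedTuple


class Counters(NamedTuple):
    rx_bytes: int
    rx_packets: int
    tx_bytes: int
    tx_packets: int


def parse_proc_net_dev(text: str) -> Dict[str, Counters]:
    """Parse the contents of /proc/net/dev into per-interface counters."""
    result: Dict[str, Counters] = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 10:
            continue
        # Receive columns come first (bytes, packets, ...); transmit bytes
        # and packets are the 9th and 10th columns.
        result[name.strip()] = Counters(
            int(fields[0]), int(fields[1]), int(fields[8]), int(fields[9]))
    return result


def rates(before: Counters, after: Counters, seconds: float) -> Dict[str, float]:
    """Convert two cumulative samples into per-second rates."""
    return {
        "rx_pps": (after.rx_packets - before.rx_packets) / seconds,
        "tx_pps": (after.tx_packets - before.tx_packets) / seconds,
        "rx_bytes_per_s": (after.rx_bytes - before.rx_bytes) / seconds,
        "tx_bytes_per_s": (after.tx_bytes - before.tx_bytes) / seconds,
    }


def log_rates(iface: str, logfile: str, interval: float = 10.0) -> None:
    """Append one line of rates per interval to logfile until interrupted."""
    with open("/proc/net/dev") as f:
        prev = parse_proc_net_dev(f.read())[iface]
    while True:
        time.sleep(interval)
        with open("/proc/net/dev") as f:
            cur = parse_proc_net_dev(f.read())[iface]
        r = rates(prev, cur, interval)
        stamp = time.strftime("%y%m%d-%H%M%S")
        with open(logfile, "a") as out:
            out.write(
                f"{stamp}: TX {iface}: {r['tx_pps']:.0f} pkts/s "
                f"RX {iface}: {r['rx_pps']:.0f} pkts/s "
                f"TX {r['tx_bytes_per_s']:.0f} B/s "
                f"RX {r['rx_bytes_per_s']:.0f} B/s\n")
        prev = cur
```

Calling `log_rates("ens3", "/var/log/netrate.log")` (the path is just an example) would run until killed. Opening the log file afresh for each write keeps every sample on disk even if the machine is restarted abruptly by a watchdog.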
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
1) It seems to happen every night (which is when the Ceph cluster is most used). I'm not sure if it's at a specific time or whether anything else could be interacting with it - I'll see if I can narrow this down. I'm trying to run some tests to see whether I can force it to happen by throwing a lot of traffic at it (using iperf3 - if you have better suggestions on how to generate load please let me know).

2) As above, I'm trying a network benchmark tool to see if it triggers the problem. If not, I'll specifically push load through Ceph to see if I can reliably re-trigger the failure.

3) I thought the initial report was about virtio network, with the suggestion that e1000 might solve it but would bring its own issues with it. I am currently on virtio, but may change over to e1000 to see if the problem persists. Ideally I wouldn't want this as a long-term solution as I have previously found e1000 to be slower.

4) I couldn't see anything in the logs on either the host or the guest suggesting that there was anything going on. The guest doesn't seem to be aware that the NIC isn't functional, but traffic just doesn't seem to flow over it.
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
I am not experiencing this issue on my other virtualised Ceph instance. The guest of the instance is 17.04 much like the other one, but the host is 16.04.2 LTS on 4.4.0-75-generic. No idea if it is just coincidence that it's not failing.
[Bug 1694625] [NEW] KVM Guest loses network connectivity under heavy load
Public bug reported:

Much as described in bug 1325560, I am experiencing issues with a KVM guest losing network connectivity when under load.

I have a KVM host running Ubuntu 17.04 with linux-image-4.10.0-22-generic (currently in zesty-proposed) installed. On top of this I have a guest also running 17.04, also with the proposed kernel (there was an issue with machine lockups on one of my other 17.04 machines with the standard linux-image-4.10.0-21-generic kernel regarding swap, and so I've updated all my 17.04 machines to -22).

The guest is running as a Ceph OSD host with 2 OSD processes. The Ceph cluster shares out RBD images to other hosts. When one of the hosts causes a lot of activity on the Ceph disks by, for example, running btrfs scrub, the network card on the system will frequently stop. The machine can see itself, but no other parts of the local network or wider networks. No hosts on the local network can see the machine in question.

Simply restarting the machine fixes the problem temporarily (i.e. logging into the guest and issuing 'shutdown -r now' - the actual qemu process remains the same).

** Affects: qemu-kvm (Ubuntu)
     Importance: Undecided
         Status: New
[Bug 1325560] Re: kvm virtio netdevs lose network connectivity under "enough" load
Getting the same here on 17.04 host with 17.04 guest. -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1325560 Title: kvm virtio netdevs lose network connectivity under "enough" load To manage notifications about this bug go to: https://bugs.launchpad.net/libvirt/+bug/1325560/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1630642] Re: bridge mode network no longer functions
I just performed much the same upgrade on my second KVM host and the networking issue didn't happen again. ** Changed in: qemu (Ubuntu) Status: New => Invalid -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1630642 Title: bridge mode network no longer functions To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1630642/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1630642] Re: bridge mode network no longer functions
OK, so all appears to be working now. I have no idea which of the many actions I performed whilst trying to get the server back up and running actually fixed it, but I'm now back at 10.5 after having tried to downgrade to 10.4 (where it still didn't seem to be working).

I have no idea if I rebooted after the initial upgrade. I would have checked whether the system said it needed a reboot and would have rebooted if it had, so I know that it didn't say it needed one - maybe this needs checking in case a reboot is required and it's not being flagged.
[Bug 1630642] Re: bridge mode network no longer functions
A range of packages was upgraded at the same time - I'm assuming it's the QEMU/KVM ones which have caused the issue, but for the sake of completeness the upgrades which happened were as follows:

snap-confine:amd64 (1.0.38-0ubuntu0.16.04.8, 1.0.38-0ubuntu0.16.04.10),
qemu-system-x86:amd64 (1:2.5+dfsg-5ubuntu10.4, 1:2.5+dfsg-5ubuntu10.5),
libklibc:amd64 (2.0.4-8ubuntu1.16.04.1, 2.0.4-8ubuntu1.16.04.2),
libvirt-bin:amd64 (1.3.1-1ubuntu10.2, 1.3.1-1ubuntu10.3),
qemu-utils:amd64 (1:2.5+dfsg-5ubuntu10.4, 1:2.5+dfsg-5ubuntu10.5),
ubuntu-core-launcher:amd64 (1.0.38-0ubuntu0.16.04.8, 1.0.38-0ubuntu0.16.04.10),
qemu-kvm:amd64 (1:2.5+dfsg-5ubuntu10.4, 1:2.5+dfsg-5ubuntu10.5),
libvirt0:amd64 (1.3.1-1ubuntu10.2, 1.3.1-1ubuntu10.3),
qemu-block-extra:amd64 (1:2.5+dfsg-5ubuntu10.4, 1:2.5+dfsg-5ubuntu10.5),
klibc-utils:amd64 (2.0.4-8ubuntu1.16.04.1, 2.0.4-8ubuntu1.16.04.2),
qemu-system-common:amd64 (1:2.5+dfsg-5ubuntu10.4, 1:2.5+dfsg-5ubuntu10.5)
[Bug 1630642] [NEW] bridge mode network no longer functions
Public bug reported:

Since upgrading from qemu 1:2.5+dfsg-5ubuntu10.4 to 1:2.5+dfsg-5ubuntu10.5 the bridged network adapter no longer functions.

I built a fresh Server 2012 R2 VM on 10.5. I could see the adapter and set its IP etc., but no traffic passed through it. Migrating the machine to a different host which was still on 10.4 resulted in the network working.

Migrating an Ubuntu 16.04.1 VM from the 10.4 host to the 10.5 host resulted in the network ceasing to function. The network resumed functioning when I migrated it back.

** Affects: qemu (Ubuntu)
     Importance: Undecided
         Status: New

** Summary changed:
- bridge mode network no longer function
+ bridge mode network no longer functions