[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
[Expired for qemu-kvm (Ubuntu) because there has been no activity for 60 days.] ** Changed in: qemu-kvm (Ubuntu) Status: Incomplete => Expired -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1694625 Title: KVM Guest loses network connectivity under heavy load To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/1694625/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
The VM has now had the problem recur, and the monitoring shows nothing odd or excessive in the network usage just before it stops functioning. As such, I may have mis-titled my bug, as I had assumed heavy load was causing the issue.

One thing which struck me as slightly odd was that, during the time it was not responding, it shows a small number of packets being received but nothing going out:

170613-210207: TX ens3: 1578 pkts/s RX ens3: 2690 pkts/s
170613-210217: TX ens3: 1090 pkts/s RX ens3: 1324 pkts/s
170613-210227: TX ens3: 0 pkts/s RX ens3: 17 pkts/s
170613-210237: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210247: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210257: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210307: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210317: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210327: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210337: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210347: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210357: TX ens3: 0 pkts/s RX ens3: 10 pkts/s
170613-210407: TX ens3: 0 pkts/s RX ens3: 11 pkts/s
170613-210417: TX ens3: 0 pkts/s RX ens3: 12 pkts/s
170613-210427: TX ens3: 0 pkts/s RX ens3: 13 pkts/s
170613-210437: TX ens3: 0 pkts/s RX ens3: 11 pkts/s

I will have to set up monitoring on the host side as well to see if it reports the same.
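For anyone wanting to find the moment of failure in a longer log of this kind, the following is a minimal sketch, not the reporter's tooling. It assumes the line format shown in the samples above and assumes the timestamp is YYMMDD-HHMMSS; the function names and the `min_run` threshold are made up for illustration.

```python
import re
from datetime import datetime

# Matches lines such as "170613-210227: TX ens3: 0 pkts/s RX ens3: 17 pkts/s".
SAMPLE_RE = re.compile(
    r"(\d{6}-\d{6}): TX (\S+): (\d+) pkts/s RX (\S+): (\d+) pkts/s"
)


def parse_samples(text):
    """Return a list of (timestamp, tx_pps, rx_pps) tuples found in text."""
    samples = []
    for match in SAMPLE_RE.finditer(text):
        # Assumption: the timestamp layout is YYMMDD-HHMMSS.
        stamp = datetime.strptime(match.group(1), "%y%m%d-%H%M%S")
        samples.append((stamp, int(match.group(3)), int(match.group(5))))
    return samples


def first_tx_stall(samples, min_run=3):
    """Return the start of the first run of at least min_run consecutive
    samples with TX at 0 while RX is still nonzero, or None if there is none."""
    run_start = None
    run_len = 0
    for stamp, tx, rx in samples:
        if tx == 0 and rx > 0:
            if run_len == 0:
                run_start = stamp
            run_len += 1
            if run_len >= min_run:
                return run_start
        else:
            run_len = 0
    return None
```

Feeding the whole log file's text to `parse_samples` and passing the result to `first_tx_stall` gives a timestamp that can then be lined up against host-side logs.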
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
Hi Barry,
to check from the host, out of band, whether "anything" is moving you could use domifstat, like
$ virsh domifstat
Get the device via
$ virsh domiflist
If you have steady traffic you can check whether there was no change at all. If you are not sure about the traffic, you can at least log that through the time you assume it is broken and see if qemu thinks packets are still being moved (or whether errors even show up there).
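As a rough sketch of how that logging could be automated on the host: the script below polls `virsh domifstat` and prints how far the packet counters moved between polls. It assumes the usual `domifstat` output of one `<interface> <counter> <value>` triple per line; the domain and interface names are taken from the command line, and the helper names and the 10 second interval are illustrative assumptions only.

```python
import subprocess
import sys
import time


def parse_domifstat(output):
    """Parse lines of the assumed form '<iface> <counter> <value>' into a dict."""
    stats = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2].lstrip("-").isdigit():
            stats[parts[1]] = int(parts[2])
    return stats


def read_domifstat(domain, iface):
    """Run 'virsh domifstat <domain> <iface>' and return the parsed counters."""
    result = subprocess.run(
        ["virsh", "domifstat", domain, iface],
        check=True, capture_output=True, text=True,
    )
    return parse_domifstat(result.stdout)


def counters_moved(before, after, keys=("rx_packets", "tx_packets")):
    """Return the per-counter difference between two snapshots."""
    return {key: after.get(key, 0) - before.get(key, 0) for key in keys}


# Only poll when a domain and an interface are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    domain, iface = sys.argv[1], sys.argv[2]
    interval = 10  # seconds between polls; an arbitrary choice
    previous = read_domifstat(domain, iface)
    while True:
        time.sleep(interval)
        current = read_domifstat(domain, iface)
        stamp = time.strftime("%y%m%d-%H%M%S")
        print(stamp, counters_moved(previous, current), flush=True)
        previous = current
```

A delta of zero on both counters across several polls, while the guest is believed to be sending, would suggest that qemu is no longer moving packets for that interface.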
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
Since I put monitoring in place, it would appear that I haven't had the issue again in 3 days. This is rather frustrating as it was happening every night. I'm going to dial back on some of the monitoring in case this was prompting it to stay alive unexpectedly (as I don't think the option of "run this script that keeps pinging stuff" is really a viable long term solution to the issue).

I don't currently know a way of monitoring the packets/bytes per second of a VM from the host, so I'm going to move the ping/reboot script off the guest and onto the host to see if this makes any difference. It's possible that if it was keeping it alive when sending pings from the guest, I may end up keeping it alive when sending to the guest as well. If this turns out to be the case I'll kill the pinging off and just leave the stats scripts running in the hope that it'll die at some stage and I'll be able to work out when it did and what throughput was happening at the time.

If there are any other suggestions on how I could do monitoring on this I'd love to hear them. I may also try swapping to the e1000 at some point, if I can get it to stop working again, to prove that it's a virtio issue rather than something more generic.
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
On #3: oh yeah, virtio is better than e1000 in every dimension, yet e1000 is sometimes worth a try to sort things out.

Thanks for the update and your testing. I was running a similar load, but maybe not long enough. Once you have more, please also attach the tooling that you use to force it. That way I can try to reproduce over here with just the same.
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
I didn't manage to force the network to drop out by simply running iperf3. I had it running most of the day doing approx 2-3 Gbps of traffic constantly. The issue is still persisting though, so there must be something going on.

I have now set up some simple scripts which monitor packets per second and bytes per second, logging them to a file, and also something which monitors pings to both the local network and the wider internet. The ping monitor is set to log when the network goes down and restart the machine to get things back up and running. I am hoping that I'll find something in the logs at the times the network dies/the server restarts which will give an indication of what is going on.

I will post another update when I have more info.
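The scripts themselves were not attached, so the following is only a guess at what a packets-per-second logger of this kind might look like. It reads the kernel's per-interface counters under /sys/class/net and prints lines in the same shape as the samples posted elsewhere in this thread. The function names, the 10 second interval, and taking the interface name from the command line are all assumptions for illustration.

```python
import sys
import time
from pathlib import Path


def read_packets(iface, root="/sys/class/net"):
    """Return the cumulative (tx_packets, rx_packets) counters for iface."""
    base = Path(root) / iface / "statistics"
    tx = int((base / "tx_packets").read_text())
    rx = int((base / "rx_packets").read_text())
    return tx, rx


def format_rates(stamp, iface, before, after, seconds):
    """Turn two (tx, rx) counter snapshots into one packets-per-second log line."""
    tx = (after[0] - before[0]) // seconds
    rx = (after[1] - before[1]) // seconds
    return f"{stamp}: TX {iface}: {tx} pkts/s RX {iface}: {rx} pkts/s"


# Only start logging when an interface name is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    iface = sys.argv[1]
    interval = 10  # seconds between samples; an arbitrary choice
    previous = read_packets(iface)
    while True:
        time.sleep(interval)
        current = read_packets(iface)
        stamp = time.strftime("%y%m%d-%H%M%S")
        print(format_rates(stamp, iface, previous, current, interval), flush=True)
        previous = current
```

Redirecting the output to a file gives a log that survives a reboot of the guest. A bytes-per-second variant would read `tx_bytes` and `rx_bytes` from the same directory.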
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
1) It seems to happen every night (which is when the Ceph cluster is most used). Not sure if it's at a specific time or whether anything else could be interacting with it - I'll see if I can narrow this down. I'm trying to run some tests to see whether I can force it to happen by throwing a lot of traffic at it (using iperf3 - if you have better suggestions on how to generate load please let me know).

2) As above, I am trying a network benchmark tool to see if it triggers. If not, I'll specifically push load through Ceph to see if I can reliably re-trigger the failure.

3) I thought the initial report was about virtio network, with the suggestion that e1000 might solve it but would bring its own issues with it. I am currently on virtio, but may change over to e1000 to see if the problem persists. Ideally I wouldn't want this as a long term solution, as I have previously found e1000 to be slower.

4) I couldn't see anything in the logs on either the host or the guest suggesting that there was anything going on. The guest doesn't seem to be aware that the NIC isn't functional, but traffic just doesn't seem to flow over it.
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
Hi Barry,
thanks for your report. Several questions to help us analyze this:

1. How (fast) reproducible is this?

2. You say heavy network load through to ceph; did you ever reproduce it without ceph (with any network benchmark tool)? That would ease reproducing this. Also, even if it is ceph only, do you have any numbers on #packets, sizes, and throughput along with that? I don't know if iptraf still works, but it might be worth a try to gather some info on what kind of load it is and how much.

3. The referred bug report initially was about e1000 devices in the guest; what do you have (virtio as usual)?

4. I bet you'd have said so, but just in case: is there anything in the logs (var/lib/libvirt/qemu/.log)?

** Changed in: qemu-kvm (Ubuntu)
Status: New => Incomplete
[Bug 1694625] Re: KVM Guest loses network connectivity under heavy load
I am not experiencing this issue on my other virtualised Ceph instance. The guest of the instance is 17.04 much like the other one, but the host is 16.04.2 LTS on 4.4.0-75-generic. No idea if it is just coincidence that it's not failing.