Re: [vfio-users] Host hard lockups
> It took longer than expected, but a definite crash happened yesterday.
> Sadly, it seems that MSI was not a fix for the in-use crashes.
>
> At this point I'm worried that it's some sort of weird hardware-specific
> interaction that is unlikely to be fixed. If anybody experiences similar
> symptoms or can suggest any debugging techniques, I'd greatly appreciate
> any suggestions.

I have been experiencing something similar since June. I get about 1-2 crashes per month, and the symptoms are very similar to yours.

After the last crash I went ahead and set up netconsole logging. That way all kernel messages are sent to another machine and are preserved after the crash.
https://www.kernel.org/doc/Documentation/networking/netconsole.txt
It's easy to set up using the "Dynamic reconfiguration" method, but you'll need another machine to log the messages.

Today I finally got another crash, and it looks identical to this:
https://lkml.org/lkml/2016/9/14/527
It's a problem with fuse that's only triggered under memory pressure. I always assumed the crashes were related to kvm because they usually happen soon after starting a VM, but perhaps the VM only introduced the memory pressure needed to trigger the fuse crash. Do you also use fuse?

The patches to fix it are marked [3.15+], but so far only 4.8.0 and above have received the fix. I upgraded to 4.8.2 and hopefully that'll fix the crashes for me.

After some googling I even found this:
https://github.com/trapexit/mergerfs#mergerfs-under-heavy-load-and-memory-preasure-leads-to-kernel-panic
mergerfs is what I use fuse for.

___
vfio-users mailing list
vfio-users@redhat.com
https://www.redhat.com/mailman/listinfo/vfio-users
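For anyone who wants to try the netconsole approach described above, the receiving side can be as simple as a UDP listener on the second machine. This is only a minimal sketch, not the setup used in the message above: it assumes the crashing host sends to netconsole's default target port 6666, and the log file name `netconsole.log` and the helper names `format_record` and `serve` are made up for illustration.

```python
import socket
import time


def format_record(payload: bytes, sender: str, now: float) -> str:
    """Turn one received UDP datagram into a timestamped log line."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    # Kernel messages are normally ASCII; replace anything undecodable
    # instead of crashing the logger.
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return f"{stamp} {sender} {text}\n"


def serve(port: int = 6666, path: str = "netconsole.log") -> None:
    """Listen for netconsole datagrams and append them to a file forever.

    The port and path defaults are assumptions; use whatever target port
    was configured on the sending host.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    with open(path, "a", encoding="utf-8") as log:
        while True:
            payload, (host, _src_port) = sock.recvfrom(65535)
            log.write(format_record(payload, host, time.time()))
            # Flush after every message so the last lines before a
            # lockup are already in the file.
            log.flush()
```

Calling `serve()` on the logging machine blocks and keeps appending until interrupted. Since netconsole uses plain UDP, messages can be lost on a busy network, so an empty log after a crash does not prove that the kernel printed nothing.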
Re: [vfio-users] Host hard lockups
Hi,

I also run a Rampage IV. I don't get freezes during normal work, but sometimes I have performance issues during some media operations with sound. I very often get similar freezes when I'm shutting down the VM. Unfortunately, I haven't had much time to troubleshoot this.

I have a question for you. I have a problem with sound in the VM: do you get sound from your sound card, or are you using the passed-through HDMI? I can't get this to work.

Best Regards
Tomasz Strzelecki

2016-08-22 22:18 GMT+02:00 vfio:
> On 08/16/2016 12:51 PM, vfio wrote:
> > I noticed that the symptoms of the freeze are very similar to the
> > freezes I sometimes get when shutting down the guest. After searching
> > the archives, I decided to try enabling MSI for the Titan X in the Win10
> > guest. This did indeed stop the freezes at shutdown time. I've been
> > using the guest for 10 days since my original post. During this time I
> > experienced only one freeze, but I was not nearby at the time so it's
> > hard to say if the guest caused it.
>
> It took longer than expected, but a definite crash happened yesterday.
> Sadly, it seems that MSI was not a fix for the in-use crashes.
>
> At this point I'm worried that it's some sort of weird hardware-specific
> interaction that is unlikely to be fixed. If anybody experiences similar
> symptoms or can suggest any debugging techniques, I'd greatly appreciate
> any suggestions.
>
> Thanks!
Re: [vfio-users] Host hard lockups
On 08/16/2016 12:51 PM, vfio wrote:
> I noticed that the symptoms of the freeze are very similar to the
> freezes I sometimes get when shutting down the guest. After searching
> the archives, I decided to try enabling MSI for the Titan X in the Win10
> guest. This did indeed stop the freezes at shutdown time. I've been
> using the guest for 10 days since my original post. During this time I
> experienced only one freeze, but I was not nearby at the time so it's
> hard to say if the guest caused it.

It took longer than expected, but a definite crash happened yesterday. Sadly, it seems that MSI was not a fix for the in-use crashes.

At this point I'm worried that it's some sort of weird hardware-specific interaction that is unlikely to be fixed. If anybody experiences similar symptoms or can suggest any debugging techniques, I'd greatly appreciate any suggestions.

Thanks!
Re: [vfio-users] Host hard lockups
Thanks for the suggestions. I'm definitely not running out of RAM; the host has 32GB and I've lowered the guest assignment to 8GB. I also have a constant RAM use monitor that never shows anything near capacity (even in the cases where the guest has frozen but the adjacent CPU monitor on the host has not yet frozen).

I have a minor update on the issue. I noticed that the symptoms of the freeze are very similar to the freezes I sometimes get when shutting down the guest. After searching the archives, I decided to try enabling MSI for the Titan X in the Win10 guest. This did indeed stop the freezes at shutdown time. I've been using the guest for 10 days since my original post. During this time I experienced only one freeze, but I was not nearby at the time so it's hard to say if the guest caused it.

I still sometimes get similar freezes when starting up the guest after it has already been shut down once during the host session. In these cases, I've noticed that lspci shows my USB3 PCIe card with all of its details missing and displays an error. I have not yet saved this error message, since this particular freeze happens so infrequently in my use that I don't care too much about it.

Random question: does it matter that my host OS boots using compatibility mode (i.e., not UEFI)? The guest machine is using OVMF.

One of the problems with debugging this is that the odd random distribution of the freezes makes it hard to tell whether the problem has been fixed. I will provide more updates if the situation changes.

Thanks.
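An on-screen RAM monitor like the one mentioned above shows nothing once the machine has locked up, so it can help to also keep a record on disk. The following is a rough sketch under stated assumptions, not the monitor actually used here: it reads `/proc/meminfo` on a Linux host and appends the available fraction to a file. The file name `meminfo.log` and the function names are hypothetical.

```python
import os
import time
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo text into {field: number as reported}.

    Most fields are in kB; a few (e.g. huge page counts) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_fraction(info: Dict[str, int]) -> float:
    """Fraction of total RAM the kernel estimates is still available."""
    return info["MemAvailable"] / info["MemTotal"]


def log_once(path: str = "meminfo.log", source: str = "/proc/meminfo") -> None:
    """Append one timestamped sample and force it out to disk."""
    with open(source, encoding="ascii") as f:
        info = parse_meminfo(f.read())
    with open(path, "a", encoding="ascii") as log:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        log.write(f"{stamp} available={available_fraction(info):.3f}\n")
        log.flush()
        # fsync so the sample is more likely to survive a hard lockup.
        os.fsync(log.fileno())
```

Calling `log_once()` every few seconds, for example from a loop with `time.sleep`, leaves a trail showing whether available memory was dropping just before a freeze. A hard lockup can still lose the most recent samples.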
Re: [vfio-users] Host hard lockups
On 05/08/2016 22:26, Bronek Kozicki wrote:
> I had something similar too, it was happening few times per month. And
> then it stopped, but I do not remember what changed back then :( Could
> be hardware change since I switched around that time from AMD to nVidia
> Quadro, or could be software change as I upgraded to qemu 2.5 (but I see
> you are running 4.6, so probably not this).

I meant to say I also upgraded my kernel to 4.1, then removed this part but stupidly left the parentheses in; obviously irrelevant.

B.
Re: [vfio-users] Host hard lockups
On 05/08/2016 22:11, vfio wrote:
> Hello everyone,
>
> I've been running VGA passthrough with a Debian unstable host and a
> Windows 10 guest for months now. Everything works perfectly, except that
> the entire machine randomly freezes when the guest is running.
>
> When a freeze happens, the guest immediately locks up. Sometimes, if
> audio was playing, it goes into a short loop. Strangely, the host does
> not usually freeze immediately; it takes a few seconds after the guest
> has frozen. For example, my CPU monitor on the host will usually perform
> a few more measurements before completely freezing, and the mouse cursor
> on the host machine will continue working for a few seconds as well.
>
> When the host freezes, not even the physical reset button on the machine
> works. It requires a hard reset by holding the power button. There have
> been a few times where the reset button worked. However, in one of these
> instances, the host refused to boot after a reset, claiming to be unable
> to initialize one of the USB buses.
>
> Sometimes the issue does not happen for several days with multiple-hour
> sessions. Sometimes it happens multiple times per day, possibly a few
> minutes after booting the guest. No freeze leaves any traces in syslog.
>
> These issues are very similar to those reported by Colin Godsey on this
> list in May. While the conclusion of that thread seemed to be BIOS
> firmware problems or "the Skylake freeze", I am using a Core i7-4960X
> (Ivy Bridge-E) and an ASUS Rampage IV Extreme with the final BIOS
> revision (4901 from 2014-06-18).

I had something similar too; it was happening a few times per month. And then it stopped, but I do not remember what changed back then :( It could be a hardware change, since I switched around that time from AMD to nVidia Quadro, or it could be a software change, as I upgraded to qemu 2.5 (but I see you are running 4.6, so probably not this).

I also made changes in my system configuration to ensure it does not run out of RAM (it is also running ZFS, which is very greedy).

B.