Re: [vfio-users] Host hard lockups

2016-10-16 Thread Thomas Lindroth
> It took longer than expected, but a definite crash happened yesterday.
> Sadly, it seems that MSI was not a fix for the in-use crashes.
> 
> At this point I'm worried that it's some sort of weird hardware-specific
> interaction that is unlikely to be fixed. If anybody experiences similar
> symptoms or can suggest any debugging techniques, I'd greatly appreciate
> any suggestions.

I have been experiencing something similar since June. I get about 1-2
crashes per month and the symptoms are very similar to yours. After the last
crash I went ahead and set up netconsole logging. That way all kernel messages
are sent to another machine and are preserved after the crash.

https://www.kernel.org/doc/Documentation/networking/netconsole.txt
It's easy to set up using the "Dynamic reconfiguration" method, but you'll
need another machine to log the messages.
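
For the receiving machine, a minimal sketch of a listener in Python. It
assumes the sender uses netconsole's documented default target port (6666)
and simply appends each UDP datagram to a file. The names `open_listener`
and `log_netconsole` are made up for this example; netcat or a syslog
daemon would do the same job.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket; 6666 is netconsole's default target port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def log_netconsole(sock, log_path, max_messages=None):
    """Append every UDP datagram received on sock to log_path.

    max_messages exists only so the loop can be stopped in a test;
    leave it as None to keep logging until interrupted.
    """
    received = 0
    with open(log_path, "ab") as log:
        while max_messages is None or received < max_messages:
            data, _addr = sock.recvfrom(65535)
            log.write(data)
            # Flush after each datagram so the file is always up to date.
            log.flush()
            received += 1
    return received
```

Calling `log_netconsole(open_listener(), "netconsole.log")` on the second
machine would then collect whatever the crashing host manages to send.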

Today I finally got another crash and it looks identical to this:
https://lkml.org/lkml/2016/9/14/527

It's a problem with fuse that's only triggered under memory pressure. I
always assumed the crashes were related to kvm because they usually happen soon
after starting a VM, but perhaps the VM only introduced the memory pressure
needed to trigger the fuse crash. Do you also use fuse?

The patch that fixes it is marked [3.15+], but so far
only 4.8.0 and above got the fix. I upgraded to 4.8.2 and hopefully that'll
fix the crashes for me.
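
To check whether a host already runs a kernel at or past that version, a
small sketch that compares a release string against 4.8.0. The helper
names are hypothetical, the code assumes the release string starts with
dotted numbers (as `uname -r` normally prints), and it cannot detect a fix
that a distribution has backported to an older kernel.

```python
import platform
import re

FIRST_FIXED = (4, 8, 0)  # per the message above: only 4.8.0 and later got the fix


def parse_release(release):
    """Turn a string such as '4.8.2-custom' into (4, 8, 2)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    # A missing third component (e.g. '4.7-rc1') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def has_fuse_fix(release=None):
    """True if the release is at least 4.8.0 (backports are not detected)."""
    if release is None:
        release = platform.release()  # the same string `uname -r` prints
    return parse_release(release) >= FIRST_FIXED
```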

After some googling I even found this:
https://github.com/trapexit/mergerfs#mergerfs-under-heavy-load-and-memory-preasure-leads-to-kernel-panic
mergerfs is what I use fuse for.

___
vfio-users mailing list
vfio-users@redhat.com
https://www.redhat.com/mailman/listinfo/vfio-users


Re: [vfio-users] Host hard lockups

2016-09-17 Thread fenix23 .
Hi,

I also run on a Rampage IV. I don't have freezes during work, but sometimes I
have performance issues during some media operations with sound. I very
often have similar freezes when I'm shutting down the VM. Unfortunately, I
haven't had much time to troubleshoot this.

I have a question for you. I have a problem with sound in the VM: do you get
sound from your sound card, or are you using HDMI passed through?
I can't get this to work.

Best Regards
Tomasz Strzelecki

2016-08-22 22:18 GMT+02:00 vfio:

> On 08/16/2016 12:51 PM, vfio wrote:
> > I noticed that the symptoms of the freeze are very similar to the
> > freezes I sometimes get when shutting down the guest. After searching
> > the archives, I decided to try enabling MSI for the Titan X in the Win10
> > guest. This did indeed stop the freezes at shutdown time. I've been
> > using the guest for 10 days since my original post. During this time I
> > experienced only one freeze, but I was not nearby at the time so it's
> > hard to say if the guest caused it.
>
> It took longer than expected, but a definite crash happened yesterday.
> Sadly, it seems that MSI was not a fix for the in-use crashes.
>
> At this point I'm worried that it's some sort of weird hardware-specific
> interaction that is unlikely to be fixed. If anybody experiences similar
> symptoms or can suggest any debugging techniques, I'd greatly appreciate
> any suggestions.
>
> Thanks!
>


Re: [vfio-users] Host hard lockups

2016-08-22 Thread vfio
On 08/16/2016 12:51 PM, vfio wrote:
> I noticed that the symptoms of the freeze are very similar to the
> freezes I sometimes get when shutting down the guest. After searching
> the archives, I decided to try enabling MSI for the Titan X in the Win10
> guest. This did indeed stop the freezes at shutdown time. I've been
> using the guest for 10 days since my original post. During this time I
> experienced only one freeze, but I was not nearby at the time so it's
> hard to say if the guest caused it.

It took longer than expected, but a definite crash happened yesterday.
Sadly, it seems that MSI was not a fix for the in-use crashes.

At this point I'm worried that it's some sort of weird hardware-specific
interaction that is unlikely to be fixed. If anybody experiences similar
symptoms or can suggest any debugging techniques, I'd greatly appreciate
any suggestions.

Thanks!



Re: [vfio-users] Host hard lockups

2016-08-16 Thread vfio
Thanks for the suggestions. I'm definitely not running out of RAM; the
host has 32GB and I've lowered the guest assignment to 8GB. I also have
a constant RAM use monitor that never shows anything near capacity (even
in the cases where the guest has frozen but the adjacent CPU monitor on
the host has not yet frozen).
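
An on-screen monitor leaves no record once the machine locks up. As a
sketch only, the following assumes a Linux host whose /proc/meminfo
reports the MemTotal and MemAvailable fields, and produces a timestamped
line that could be written to a log at regular intervals. The function
names are invented for this example.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: value}; sizes are in kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def used_fraction(meminfo):
    """Fraction of RAM that is not available for new allocations."""
    return 1.0 - meminfo["MemAvailable"] / meminfo["MemTotal"]


def memory_log_line(path="/proc/meminfo"):
    """Return one timestamped line describing current RAM use."""
    with open(path) as handle:
        info = parse_meminfo(handle.read())
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return "%s ram_used=%.1f%%" % (stamp, 100.0 * used_fraction(info))
```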

I have a minor update on the issue.

I noticed that the symptoms of the freeze are very similar to the
freezes I sometimes get when shutting down the guest. After searching
the archives, I decided to try enabling MSI for the Titan X in the Win10
guest. This did indeed stop the freezes at shutdown time. I've been
using the guest for 10 days since my original post. During this time I
experienced only one freeze, but I was not nearby at the time so it's
hard to say if the guest caused it.

I still sometimes get similar freezes when starting up the machine after
it has already been shut down once during the host session. In these
cases, I've noticed that lspci shows that my USB3 PCIe card is missing
all of its details and displays an error. I have not yet saved this
error message since it happens so infrequently in my use that I don't
care too much about this particular freeze.
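
Since the error appears rarely, one way to catch it is to save the
command output to a timestamped file whenever the guest is started again.
A minimal sketch, with a made-up helper name; the only lspci option used
is `-vv` (very verbose), and nothing is assumed about what the error
text looks like.

```python
import os
import subprocess
import time


def save_command_output(command, directory="."):
    """Run command and write its stdout and stderr to a timestamped file."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error messages in the same file
        universal_newlines=True,
    )
    stamp = time.strftime("%Y%m%d-%H%M%S")
    name = "%s-%s.txt" % (os.path.basename(command[0]), stamp)
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(result.stdout)
    return path
```

For example, `save_command_output(["lspci", "-vv"])` would write the full
device listing to a file in the current directory.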

Random question: does it matter that my host OS boots using
compatibility mode (i.e., not UEFI)? The guest machine is using OVMF.

One of the problems with debugging this is that the freezes are distributed
so randomly that it's hard to tell whether a given change fixed the problem.
I will provide more updates if the situation changes.

Thanks.



Re: [vfio-users] Host hard lockups

2016-08-05 Thread Bronek Kozicki

On 05/08/2016 22:26, Bronek Kozicki wrote:

> I had something similar too; it was happening a few times per month. And
> then it stopped, but I do not remember what changed back then :( Could
> be a hardware change, since I switched around that time from AMD to nVidia
> Quadro, or could be a software change, as I upgraded to qemu 2.5 (but I see
> you are running 4.6, so probably not this).


I meant to say I also upgraded my kernel to 4.1, then removed this part
but stupidly left the parentheses. Obviously irrelevant.



B.



Re: [vfio-users] Host hard lockups

2016-08-05 Thread Bronek Kozicki

On 05/08/2016 22:11, vfio wrote:

> Hello everyone,
>
> I've been running VGA passthrough with a Debian unstable host and a
> Windows 10 guest for months now. Everything works perfectly, except that
> the entire machine randomly freezes when the guest is running.
>
> When a freeze happens, the guest immediately locks up. Sometimes, if
> audio was playing, it goes into a short loop. Strangely, the host does
> not usually freeze immediately; it takes a few seconds after the guest
> has frozen. For example, my CPU monitor on the host will usually perform
> a few more measurements before completely freezing, and the mouse cursor
> on the host machine will continue working for a few seconds as well.
>
> When the host freezes, not even the physical reset button on the machine
> works. It requires a hard reset by holding the power button. There have
> been a few times where the reset button worked. However, in one of these
> instances, the host refused to boot after a reset, claiming to be unable
> to initialize one of the USB buses. Sometimes the issue does not happen
> for several days with multiple-hour sessions. Sometimes it happens
> multiple times per day, possibly a few minutes after booting the guest.
>
> No freeze leaves any traces in syslog.
>
> These issues are very similar to those reported by Colin Godsey on this
> list in May. While the conclusion of that thread seemed to be BIOS
> firmware problems or "the Skylake freeze", I am using a Core i7-4960X
> (Ivy Bridge-E) and an ASUS Rampage IV Extreme with the final BIOS
> revision (4901 from 2014-06-18).



I had something similar too; it was happening a few times per month. And
then it stopped, but I do not remember what changed back then :( Could
be a hardware change, since I switched around that time from AMD to nVidia
Quadro, or could be a software change, as I upgraded to qemu 2.5 (but I see
you are running 4.6, so probably not this). I also made changes in my
system configuration to ensure it does not run out of RAM (it is also
running ZFS, which is very greedy).



B.
