On 10/14/2014 02:02 AM, Micky wrote:
> There are new developments since the last time it was reported.
> User xmerlin did some extensive testing and found it to be a regression. I
> also reproduced it.
>
> https://github.com/joyent/smartos-live/issues/346

I've discussed this with xmerlin on IRC. Unfortunately, we don't yet have a clear picture of what's going on or, if it is a regression, which change introduced it. What we do see is that, on rare occasions, the Linux kernel fails to initialize its APIC.

We discussed the next steps for investigating it. In particular, we asked for help in understanding why this happens and in getting to the root cause, which means investigating the Linux guest itself: understanding which bit of Linux code is executing and why it fails to initialize. We used Linux here because a Windows guest is really quite hard to understand. From a KVM perspective there are often still runnable threads in a Windows guest, so it's hard to tell why the failure is happening.

The next step for debugging this on a Linux guest is to modify or instrument the Linux kernel so that it prints extra information showing why it is failing these conditions, specifically in the code around its APIC initialization, which hasn't changed much in that range.

Identifying the commit that we believe induced the regression might help. However, given how little changed in the KVM and QEMU code bases during that time, it's currently rather hard to see how those changes would lead to this behavior, though nothing is ever 100% impossible.

Robert
