Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/02/2015 06:54 PM, Markus Trippelsdorf wrote: On 2015.09.02 at 18:27 +0800, Xiao Guangrong wrote: On 09/02/2015 05:38 PM, Markus Trippelsdorf wrote: On 2015.09.02 at 17:17 +0800, Xiao Guangrong wrote: No. PAT is of course enabled and booting is successful sometimes even with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. Thanks for your confirmation. markus@x4 linux % cat .config | grep X86_PAT CONFIG_X86_PAT=y markus@x4 linux % dmesg | grep PAT [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT Strange, BP processor has already set WC to PAT1, however KVM does not read it out from PAT MSR on its local CPU. Hmm... PAT default values do not include WC, it seems initing PAT on SP has not finished after module_init()? Could please apply this diff and test it again? (Your patch was malformed.) [2.138098] kvm: Nested Virtualization enabled [2.138153] kvm: Nested Paging enabled [2.138204] KVM PAT: 0x7040600070406. So the PAT is the value after CPU reset, it's likely PAT is not initialized on the local CPU. Could it be a simple AMD/INTEL difference. I'm running a AMD CPU and I see many !use_intel() in if statements in arch/x86/kernel/cpu/mtrr/main.c... #define use_intel() (mtrr_if && mtrr_if->use_intel_if == 1) And i checked your CPU supports "mtrr" /proc/info, so it should use generic_mtrr_ops and generic_mtrr_ops.use_intel_if = 1. That means AMD CPU also use "intel" way. :) Please refer to the initiation of "mtrr_if" in mtrr_bp_init(). -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.02 at 18:27 +0800, Xiao Guangrong wrote: > > > On 09/02/2015 05:38 PM, Markus Trippelsdorf wrote: > > On 2015.09.02 at 17:17 +0800, Xiao Guangrong wrote: > >>> > >>> No. PAT is of course enabled and booting is successful sometimes even > >>> with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. > >> > >> Thanks for your confirmation. > >> > >>> > >>> markus@x4 linux % cat .config | grep X86_PAT > >>> CONFIG_X86_PAT=y > >>> markus@x4 linux % dmesg | grep PAT > >>> [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- > >>> WT > >> > >> Strange, BP processor has already set WC to PAT1, however KVM does not > >> read it out > >> from PAT MSR on its local CPU. > >> > >> Hmm... PAT default values do not include WC, it seems initing PAT on SP > >> has not > >> finished after module_init()? > >> > >> Could please apply this diff and test it again? > > > > (Your patch was malformed.) > > > > [2.138098] kvm: Nested Virtualization enabled > > [2.138153] kvm: Nested Paging enabled > > [2.138204] KVM PAT: 0x7040600070406. > > So the PAT is the value after CPU reset, it's likely PAT is not initialized on > the local CPU. Could it be a simple AMD/INTEL difference. I'm running a AMD CPU and I see many !use_intel() in if statements in arch/x86/kernel/cpu/mtrr/main.c... -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/02/2015 05:38 PM, Markus Trippelsdorf wrote: On 2015.09.02 at 17:17 +0800, Xiao Guangrong wrote: No. PAT is of course enabled and booting is successful sometimes even with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. Thanks for your confirmation. markus@x4 linux % cat .config | grep X86_PAT CONFIG_X86_PAT=y markus@x4 linux % dmesg | grep PAT [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT Strange, BP processor has already set WC to PAT1, however KVM does not read it out from PAT MSR on its local CPU. Hmm... PAT default values do not include WC, it seems initing PAT on SP has not finished after module_init()? Could please apply this diff and test it again? (Your patch was malformed.) [2.138098] kvm: Nested Virtualization enabled [2.138153] kvm: Nested Paging enabled [2.138204] KVM PAT: 0x7040600070406. So the PAT is the value after CPU reset, it's likely PAT is not initialized on the local CPU. Maybe something is escaped from stop_machine() called in native_smp_cpus_done(), Ingo? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.02 at 17:17 +0800, Xiao Guangrong wrote: > > > > No. PAT is of course enabled and booting is successful sometimes even > > with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. > > Thanks for your confirmation. > > > > > markus@x4 linux % cat .config | grep X86_PAT > > CONFIG_X86_PAT=y > > markus@x4 linux % dmesg | grep PAT > > [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT > > Strange, BP processor has already set WC to PAT1, however KVM does not read > it out > from PAT MSR on its local CPU. > > Hmm... PAT default values do not include WC, it seems initing PAT on SP has > not > finished after module_init()? > > Could please apply this diff and test it again? (Your patch was malformed.) [2.138098] kvm: Nested Virtualization enabled [2.138153] kvm: Nested Paging enabled [2.138204] KVM PAT: 0x7040600070406. [2.138255] mtrr2protval[0]:18. [2.138306] mtrr2protval[1]:ff. [2.138356] mtrr2protval[2]:0. [2.138408] mtrr2protval[3]:0. [2.138459] mtrr2protval[4]:8. [2.138510] mtrr2protval[5]:ff. [2.138561] mtrr2protval[6]:0. [2.138612] mtrr2protval[7]:10. [2.138662] BUG in fallback_mtrr_type, mtrr = 1. -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/02/2015 11:50 AM, Markus Trippelsdorf wrote: On 2015.09.02 at 06:31 +0800, Xiao Guangrong wrote: On 09/01/2015 09:56 PM, Markus Trippelsdorf wrote: On 2015.09.01 at 21:00 +0800, Xiao Guangrong wrote: Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? If yes, could you please print the actual value out? It is the BUG() in fallback_mtrr_type(). I changed it to a printk and it prints 1 for the value of mtrr. MTRR_TYPE_WRCOMB 1 Then I suspect pat is not enabled in your box, could you please check CONFIG_X86_PAT is selected in your .config file, pat is shown in /proc/cpuid, "nopat" kernel parameter is used, and dmesg | grep PAT. No. PAT is of course enabled and booting is successful sometimes even with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. Thanks for your confirmation. markus@x4 linux % cat .config | grep X86_PAT CONFIG_X86_PAT=y markus@x4 linux % dmesg | grep PAT [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT Strange, BP processor has already set WC to PAT1, however KVM does not read it out from PAT MSR on its local CPU. Hmm... PAT default values do not include WC, it seems initing PAT on SP has not finished after module_init()? Could please apply this diff and test it again? diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 189e464..d9d3a30 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -884,6 +884,7 @@ static u8 fallback_mtrr_type(int mtrr) case MTRR_TYPE_WRPROT: return MTRR_TYPE_UC_MINUS; default: + printk("BUG in %s, mtrr = %d.\n", __FUNCTION__, mtrr); BUG(); } } @@ -907,6 +908,8 @@ static void build_mtrr2protval(void) * guest. */ rdmsrl(MSR_IA32_CR_PAT, pat); + printk("KVM PAT: 0x%llx.\n", pat); + for (i = 0; i < 8; i++) { u8 mtrr = pat >> (8 * i); @@ -914,10 +917,17 @@ static void build_mtrr2protval(void) mtrr2protval[mtrr] = __cm_idx2pte(i); } + for (i = 0; i < 8; i++) + printk("mtrr2protval[%d]:%x.\n", i, mtrr2protval[i]); + + for (i = 0; i < 8; i++) { if (mtrr2protval[i] == MTRR2PROTVAL_INVALID) { u8 fallback = fallback_mtrr_type(i); mtrr2protval[i] = mtrr2protval[fallback]; + if (mtrr2protval[i] == MTRR2PROTVAL_INVALID) + printk("BUG in %s, mtrr2protval[%d] = %x.\n", __FUNCTION__, i, mtrr2protval[i]); + BUG_ON(mtrr2protval[i] == MTRR2PROTVAL_INVALID); } } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/02/2015 11:50 AM, Markus Trippelsdorf wrote: On 2015.09.02 at 06:31 +0800, Xiao Guangrong wrote: On 09/01/2015 09:56 PM, Markus Trippelsdorf wrote: On 2015.09.01 at 21:00 +0800, Xiao Guangrong wrote: Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? If yes, could you please print the actual value out? It is the BUG() in fallback_mtrr_type(). I changed it to a printk and it prints 1 for the value of mtrr. MTRR_TYPE_WRCOMB 1 Then I suspect pat is not enabled in your box, could you please check CONFIG_X86_PAT is selected in your .config file, pat is shown in /proc/cpuid, "nopat" kernel parameter is used, and dmesg | grep PAT. No. PAT is of course enabled and booting is successful sometimes even with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. Thanks for your confirmation. markus@x4 linux % cat .config | grep X86_PAT CONFIG_X86_PAT=y markus@x4 linux % dmesg | grep PAT [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT Strange, BP processor has already set WC to PAT1, however KVM does not read it out from PAT MSR on its local CPU. Hmm... PAT default values do not include WC, it seems initing PAT on SP has not finished after module_init()? Could please apply this diff and test it again? diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 189e464..d9d3a30 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -884,6 +884,7 @@ static u8 fallback_mtrr_type(int mtrr) case MTRR_TYPE_WRPROT: return MTRR_TYPE_UC_MINUS; default: + printk("BUG in %s, mtrr = %d.\n", __FUNCTION__, mtrr); BUG(); } } @@ -907,6 +908,8 @@ static void build_mtrr2protval(void) * guest. */ rdmsrl(MSR_IA32_CR_PAT, pat); + printk("KVM PAT: 0x%llx.\n", pat); + for (i = 0; i < 8; i++) { u8 mtrr = pat >> (8 * i); @@ -914,10 +917,17 @@ static void build_mtrr2protval(void) mtrr2protval[mtrr] = __cm_idx2pte(i); } + for (i = 0; i < 8; i++) + printk("mtrr2protval[%d]:%x.\n", i, mtrr2protval[i]); + + for (i = 0; i < 8; i++) { if (mtrr2protval[i] == MTRR2PROTVAL_INVALID) { u8 fallback = fallback_mtrr_type(i); mtrr2protval[i] = mtrr2protval[fallback]; + if (mtrr2protval[i] == MTRR2PROTVAL_INVALID) + printk("BUG in %s, mtrr2protval[%d] = %x.\n", __FUNCTION__, i, mtrr2protval[i]); + BUG_ON(mtrr2protval[i] == MTRR2PROTVAL_INVALID); } } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.02 at 17:17 +0800, Xiao Guangrong wrote: > > > > No. PAT is of course enabled and booting is successful sometimes even > > with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. > > Thanks for your confirmation. > > > > > markus@x4 linux % cat .config | grep X86_PAT > > CONFIG_X86_PAT=y > > markus@x4 linux % dmesg | grep PAT > > [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT > > Strange, BP processor has already set WC to PAT1, however KVM does not read > it out > from PAT MSR on its local CPU. > > Hmm... PAT default values do not include WC, it seems initing PAT on SP has > not > finished after module_init()? > > Could please apply this diff and test it again? (Your patch was malformed.) [2.138098] kvm: Nested Virtualization enabled [2.138153] kvm: Nested Paging enabled [2.138204] KVM PAT: 0x7040600070406. [2.138255] mtrr2protval[0]:18. [2.138306] mtrr2protval[1]:ff. [2.138356] mtrr2protval[2]:0. [2.138408] mtrr2protval[3]:0. [2.138459] mtrr2protval[4]:8. [2.138510] mtrr2protval[5]:ff. [2.138561] mtrr2protval[6]:0. [2.138612] mtrr2protval[7]:10. [2.138662] BUG in fallback_mtrr_type, mtrr = 1. -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/02/2015 06:54 PM, Markus Trippelsdorf wrote: On 2015.09.02 at 18:27 +0800, Xiao Guangrong wrote: On 09/02/2015 05:38 PM, Markus Trippelsdorf wrote: On 2015.09.02 at 17:17 +0800, Xiao Guangrong wrote: No. PAT is of course enabled and booting is successful sometimes even with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. Thanks for your confirmation. markus@x4 linux % cat .config | grep X86_PAT CONFIG_X86_PAT=y markus@x4 linux % dmesg | grep PAT [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT Strange, BP processor has already set WC to PAT1, however KVM does not read it out from PAT MSR on its local CPU. Hmm... PAT default values do not include WC, it seems initing PAT on SP has not finished after module_init()? Could please apply this diff and test it again? (Your patch was malformed.) [2.138098] kvm: Nested Virtualization enabled [2.138153] kvm: Nested Paging enabled [2.138204] KVM PAT: 0x7040600070406. So the PAT is the value after CPU reset, it's likely PAT is not initialized on the local CPU. Could it be a simple AMD/INTEL difference. I'm running a AMD CPU and I see many !use_intel() in if statements in arch/x86/kernel/cpu/mtrr/main.c... #define use_intel() (mtrr_if && mtrr_if->use_intel_if == 1) And i checked your CPU supports "mtrr" /proc/info, so it should use generic_mtrr_ops and generic_mtrr_ops.use_intel_if = 1. That means AMD CPU also use "intel" way. :) Please refer to the initiation of "mtrr_if" in mtrr_bp_init(). -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/02/2015 05:38 PM, Markus Trippelsdorf wrote: On 2015.09.02 at 17:17 +0800, Xiao Guangrong wrote: No. PAT is of course enabled and booting is successful sometimes even with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. Thanks for your confirmation. markus@x4 linux % cat .config | grep X86_PAT CONFIG_X86_PAT=y markus@x4 linux % dmesg | grep PAT [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT Strange, BP processor has already set WC to PAT1, however KVM does not read it out from PAT MSR on its local CPU. Hmm... PAT default values do not include WC, it seems initing PAT on SP has not finished after module_init()? Could please apply this diff and test it again? (Your patch was malformed.) [2.138098] kvm: Nested Virtualization enabled [2.138153] kvm: Nested Paging enabled [2.138204] KVM PAT: 0x7040600070406. So the PAT is the value after CPU reset, it's likely PAT is not initialized on the local CPU. Maybe something is escaped from stop_machine() called in native_smp_cpus_done(), Ingo? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.02 at 18:27 +0800, Xiao Guangrong wrote: > > > On 09/02/2015 05:38 PM, Markus Trippelsdorf wrote: > > On 2015.09.02 at 17:17 +0800, Xiao Guangrong wrote: > >>> > >>> No. PAT is of course enabled and booting is successful sometimes even > >>> with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. > >> > >> Thanks for your confirmation. > >> > >>> > >>> markus@x4 linux % cat .config | grep X86_PAT > >>> CONFIG_X86_PAT=y > >>> markus@x4 linux % dmesg | grep PAT > >>> [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- > >>> WT > >> > >> Strange, BP processor has already set WC to PAT1, however KVM does not > >> read it out > >> from PAT MSR on its local CPU. > >> > >> Hmm... PAT default values do not include WC, it seems initing PAT on SP > >> has not > >> finished after module_init()? > >> > >> Could please apply this diff and test it again? > > > > (Your patch was malformed.) > > > > [2.138098] kvm: Nested Virtualization enabled > > [2.138153] kvm: Nested Paging enabled > > [2.138204] KVM PAT: 0x7040600070406. > > So the PAT is the value after CPU reset, it's likely PAT is not initialized on > the local CPU. Could it be a simple AMD/INTEL difference. I'm running a AMD CPU and I see many !use_intel() in if statements in arch/x86/kernel/cpu/mtrr/main.c... -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.02 at 06:31 +0800, Xiao Guangrong wrote: > > > On 09/01/2015 09:56 PM, Markus Trippelsdorf wrote: > > On 2015.09.01 at 21:00 +0800, Xiao Guangrong wrote: > >> > >> Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? > >> If yes, could you please print the actual value out? > > > > It is the BUG() in fallback_mtrr_type(). I changed it to a printk and > > it prints 1 for the value of mtrr. > > > > MTRR_TYPE_WRCOMB 1 > > > > Then I suspect pat is not enabled in your box, could you please check > CONFIG_X86_PAT is selected in your .config file, pat is shown in > /proc/cpuid, "nopat" kernel parameter is used, and dmesg | grep PAT. No. PAT is of course enabled and booting is successful sometimes even with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. markus@x4 linux % cat .config | grep X86_PAT CONFIG_X86_PAT=y markus@x4 linux % dmesg | grep PAT [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT markus@x4 linux % cat /proc/cpuinfo| grep pat flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate npt lbrv svm_lock nrip_save vmmcall ... -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/01/2015 09:56 PM, Markus Trippelsdorf wrote: On 2015.09.01 at 21:00 +0800, Xiao Guangrong wrote: Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? If yes, could you please print the actual value out? It is the BUG() in fallback_mtrr_type(). I changed it to a printk and it prints 1 for the value of mtrr. MTRR_TYPE_WRCOMB 1 Then I suspect pat is not enabled in your box, could you please check CONFIG_X86_PAT is selected in your .config file, pat is shown in /proc/cpuid, "nopat" kernel parameter is used, and dmesg | grep PAT. I will post a fix if the suspect is right. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.01 at 21:00 +0800, Xiao Guangrong wrote: > > Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? > If yes, could you please print the actual value out? It is the BUG() in fallback_mtrr_type(). I changed it to a printk and it prints 1 for the value of mtrr. MTRR_TYPE_WRCOMB 1 -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/01/2015 06:04 PM, Markus Trippelsdorf wrote: On 2015.09.01 at 10:56 +0200, Ingo Molnar wrote: * Markus Trippelsdorf wrote: As I wrote in my other reply. The boot failure is nondeterministic (boot succeeds roughly every sixth time). So the bisection and the patch is just bogus (,but the boot failure is real). Sorry. No problem. Please let us know if any of these commits does turn out to be the culprit. (Which is always a possibility.) I'm pretty sure commit 3c2e7f7de3 is the culprit. commit 3c2e7f7de3240216042b61073803b61b9b3cfb22 Author: Paolo Bonzini Date: Tue Jul 7 14:32:17 2015 +0200 KVM: SVM: use NPT page attributes I've booted ten times in a row successfully with the following patch: diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 74d825716f4f..3190173a575f 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -989,7 +989,7 @@ static __init int svm_hardware_setup(void) } else kvm_disable_tdp(); - build_mtrr2protval(); +// build_mtrr2protval(); return 0; err: Paolo, your commit causes nondeterministic boot failure on my machine. It sometimes crashes early with the following backtrace: Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? If yes, could you please print the actual value out? BTW, you may change BUG() to WARN() to get the print info more easier. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.01 at 10:56 +0200, Ingo Molnar wrote: > > * Markus Trippelsdorf wrote: > > As I wrote in my other reply. The boot failure is nondeterministic (boot > > succeeds roughly every sixth time). So the bisection and the patch is > > just bogus (,but the boot failure is real). > > > > Sorry. > > No problem. Please let us know if any of these commits does turn out to be > the > culprit. (Which is always a possibility.) I'm pretty sure commit 3c2e7f7de3 is the culprit. commit 3c2e7f7de3240216042b61073803b61b9b3cfb22 Author: Paolo Bonzini Date: Tue Jul 7 14:32:17 2015 +0200 KVM: SVM: use NPT page attributes I've booted ten times in a row successfully with the following patch: diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 74d825716f4f..3190173a575f 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -989,7 +989,7 @@ static __init int svm_hardware_setup(void) } else kvm_disable_tdp(); - build_mtrr2protval(); +// build_mtrr2protval(); return 0; err: Paolo, your commit causes nondeterministic boot failure on my machine. It sometimes crashes early with the following backtrace: map_vsyscall kvm_arch_hardware_setup map_vsyscall kvm_init map_vsyscall do_one_initcall kernel_init_freeable rest_init kernel_init ret_from_fork rest_init RIP: svm_hardware_setup -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/01/2015 09:56 PM, Markus Trippelsdorf wrote: On 2015.09.01 at 21:00 +0800, Xiao Guangrong wrote: Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? If yes, could you please print the actual value out? It is the BUG() in fallback_mtrr_type(). I changed it to a printk and it prints 1 for the value of mtrr. MTRR_TYPE_WRCOMB 1 Then I suspect pat is not enabled in your box, could you please check CONFIG_X86_PAT is selected in your .config file, pat is shown in /proc/cpuid, "nopat" kernel parameter is used, and dmesg | grep PAT. I will post a fix if the suspect is right. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.01 at 10:56 +0200, Ingo Molnar wrote: > > * Markus Trippelsdorfwrote: > > As I wrote in my other reply. The boot failure is nondeterministic (boot > > succeeds roughly every sixth time). So the bisection and the patch is > > just bogus (,but the boot failure is real). > > > > Sorry. > > No problem. Please let us know if any of these commits does turn out to be > the > culprit. (Which is always a possibility.) I'm pretty sure commit 3c2e7f7de3 is the culprit. commit 3c2e7f7de3240216042b61073803b61b9b3cfb22 Author: Paolo Bonzini Date: Tue Jul 7 14:32:17 2015 +0200 KVM: SVM: use NPT page attributes I've booted ten times in a row successfully with the following patch: diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 74d825716f4f..3190173a575f 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -989,7 +989,7 @@ static __init int svm_hardware_setup(void) } else kvm_disable_tdp(); - build_mtrr2protval(); +// build_mtrr2protval(); return 0; err: Paolo, your commit causes nondeterministic boot failure on my machine. It sometimes crashes early with the following backtrace: map_vsyscall kvm_arch_hardware_setup map_vsyscall kvm_init map_vsyscall do_one_initcall kernel_init_freeable rest_init kernel_init ret_from_fork rest_init RIP: svm_hardware_setup -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.02 at 06:31 +0800, Xiao Guangrong wrote: > > > On 09/01/2015 09:56 PM, Markus Trippelsdorf wrote: > > On 2015.09.01 at 21:00 +0800, Xiao Guangrong wrote: > >> > >> Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? > >> If yes, could you please print the actual value out? > > > > It is the BUG() in fallback_mtrr_type(). I changed it to a printk and > > it prints 1 for the value of mtrr. > > > > MTRR_TYPE_WRCOMB 1 > > > > Then I suspect pat is not enabled in your box, could you please check > CONFIG_X86_PAT is selected in your .config file, pat is shown in > /proc/cpuid, "nopat" kernel parameter is used, and dmesg | grep PAT. No. PAT is of course enabled and booting is successful sometimes even with the BUG() in allback_mtrr_type(). I suspect a setup (timing) issue. markus@x4 linux % cat .config | grep X86_PAT CONFIG_X86_PAT=y markus@x4 linux % dmesg | grep PAT [0.00] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WC UC- WT markus@x4 linux % cat /proc/cpuinfo| grep pat flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate npt lbrv svm_lock nrip_save vmmcall ... -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 09/01/2015 06:04 PM, Markus Trippelsdorf wrote: On 2015.09.01 at 10:56 +0200, Ingo Molnar wrote: * Markus Trippelsdorfwrote: As I wrote in my other reply. The boot failure is nondeterministic (boot succeeds roughly every sixth time). So the bisection and the patch is just bogus (,but the boot failure is real). Sorry. No problem. Please let us know if any of these commits does turn out to be the culprit. (Which is always a possibility.) I'm pretty sure commit 3c2e7f7de3 is the culprit. commit 3c2e7f7de3240216042b61073803b61b9b3cfb22 Author: Paolo Bonzini Date: Tue Jul 7 14:32:17 2015 +0200 KVM: SVM: use NPT page attributes I've booted ten times in a row successfully with the following patch: diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 74d825716f4f..3190173a575f 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -989,7 +989,7 @@ static __init int svm_hardware_setup(void) } else kvm_disable_tdp(); - build_mtrr2protval(); +// build_mtrr2protval(); return 0; err: Paolo, your commit causes nondeterministic boot failure on my machine. It sometimes crashes early with the following backtrace: Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? If yes, could you please print the actual value out? BTW, you may change BUG() to WARN() to get the print info more easier. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: commit 3c2e7f7de3 (KVM use NPT page attributes) causes boot failures
On 2015.09.01 at 21:00 +0800, Xiao Guangrong wrote: > > Did it trigger the BUG()/BUG_ON() in mtrr2protval()/fallback_mtrr_type()? > If yes, could you please print the actual value out? It is the BUG() in fallback_mtrr_type(). I changed it to a printk and it prints 1 for the value of mtrr. MTRR_TYPE_WRCOMB 1 -- Markus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/