Re: [kvm-devel] [PATCH] KVM: VMX: Enable Virtual Processor Identification (VPID)
On Friday 25 January 2008 15:03:28 Avi Kivity wrote:
> Yang, Sheng wrote:
> > I think it's OK, since there is a check in __invvpid() to see whether the machine has the capability (and whether VPID is allowed to be used). :)
>
> Oh right, I missed that. We can remove it now, since it will only be called if vpid != 0, and that happens only on VPID-enabled machines, no?

Yes. I think we can replace vpid_sync_all() with vpid_sync_vcpu_all(); then there would be no remaining reference to vpid_sync_all()... BTW, vpid_sync_all() can be used right after VMXON, but just in case, we may remove it for now.

--
Thanks
Yang, Sheng

- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
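The guard being discussed can be sketched as a tiny standalone model (all names hypothetical, not the actual kvm/vmx.c source): once a vcpu is only ever assigned a non-zero vpid on VPID-capable hardware, the capability re-check inside the invalidation helper becomes redundant, and the one guard the callers still need is vpid != 0.

```c
/* Toy model of the guard logic from the thread above. cpu_has_vpid and
 * invvpid_calls are stand-ins for the hardware capability and the INVVPID
 * instruction; none of this is the real kvm code. */
#include <assert.h>
#include <stdbool.h>

static bool cpu_has_vpid;   /* set once at hardware-setup time */
static int invvpid_calls;   /* counts hardware invalidations issued */

static int alloc_vpid(void)
{
    return cpu_has_vpid ? 1 : 0;   /* vpid 0 means "feature unused" */
}

static void vpid_sync_vcpu_all(int vpid)
{
    if (vpid == 0)          /* the only guard callers still need */
        return;
    invvpid_calls++;        /* stands in for executing INVVPID */
}
```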
Re: [kvm-devel] ia64 kernel patches?
Jes Sorensen wrote:
> Zhang, Xiantao wrote:
> > Hi, Jes. Yes, Anthony and I are working with the linux-ia64 and kvm communities to push the patches. Since the kernel has to export some interfaces for kvm to use, we have to wait for a response from linux-ia64, but it should be picked up in the near future. :) Compared with the last push, we have added SMP guest support and reached a more stable state now.
> > Thanks, Xiantao
>
> Hi Xiantao,
> If you could put the patches up somewhere, I could help you clean them up and push them. I would prefer not to wait until they appear in Linus' tree, if possible.

Hi, Jes. You don't need to wait that long. We will push the patches to Avi's tree first, in the near future, once the ia64 kernel side is ready.

Thanks, Xiantao

-----Original Message-----
From: Chris Wright [mailto:[EMAIL PROTECTED]]
Sent: January 25, 2008 2:53
To: Jes Sorensen
Cc: Zhang, Xiantao; kvm-devel@lists.sourceforge.net
Subject: Re: [kvm-devel] ia64 kernel patches?

* Jes Sorensen ([EMAIL PROTECTED]) wrote:
> Trying to browse the list archives, but both gmane and sourceforge's interfaces are really painful to deal with. So, any chance someone could point me at the current ia64 KVM kernel patches? I notice they are not yet in Avi's tree.

The last full round of patches I recall is from December. Maybe better to get updated patches from Xiantao?

thanks,
-chris
Re: [kvm-devel] [patch 0/4] [RFC] MMU Notifiers V1
On Thu, Jan 24, 2008 at 09:56:06PM -0800, Christoph Lameter wrote:
> Andrea's mmu_notifier #4 - RFC V1
> - Merge the subsystem-rmap-based and the Linux-rmap-based approaches
> - Move the Linux-rmap-based notifiers out of the macro
> - Try to account for which locks are held while the notifiers are called
> - Develop a patch sequence that separates out the different types of hooks, so that their use is easier to review
> - Avoid adding an #include to linux/mm_types.h
> - Integrate the RCU logic suggested by Peter

I'm glad you're converging on something a bit saner and much, much closer to my code, and perfectly usable by KVM's optimal rmap design too. I would have preferred it if you had sent me patches for review and merging, as Peter did; that would have made review easier. Anyway, I'm used to that on lkml, so it's ok; I just need this patch to be included in mainline, everything else is irrelevant to me. On technical merit this still partially makes me sick, and I think it's the last issue to debate:

@@ -971,6 +974,9 @@ int try_to_unmap(struct page *page, int
 	else
 		ret = try_to_unmap_file(page, migration);
 
+	if (unlikely(PageExternalRmap(page)))
+		mmu_rmap_notifier(invalidate_page, page);
+
 	if (!page_mapped(page))
 		ret = SWAP_SUCCESS;
 	return ret;

I find the above hard to accept, because the moment you work with physical pages and not mm+address, I think you can't possibly care whether page_mapped is true or false, and the above notifier should be called _outside_ try_to_unmap. In fact I'd call mmu_rmap_notifier(invalidate_page, page) only if page_mapped is false and the linux pte is gone already (practically just before the page_count == 2 check, and after try_to_unmap). I also think it's still worth debating whether the rmap should be indexed by virtual or physical address. By supporting both secondary-rmap designs at the same time, you seem to agree that the current lightweight KVM rmap implementation is the superior design, at least for KVM. But by insisting on a physical-based rmap for your own usage, you're implicitly telling us that it is the superior design for you. Yet we know very little about why you can't build your rmap on virtual addresses the way KVM does! (Especially now that you've implicitly admitted the KVM rmap design is superior at least for KVM, it would be interesting to know why you can't do the same.) You said something about that, but I certainly don't have a clear picture of why it can't work or why it would be less efficient. Like you said by PM, I'd also like comments from Hugh, Nick and others on this issue. Nevertheless, I'm very glad we have already fully converged on set_page_dirty, invalidate-page after ptep_clear_flush/young, etc., and furthermore that you only had to make very minor modifications to my code to add a pair of hooks for the page-based rmap notifiers on top of my patch. So from a functionality POV this is already 100% workable from the KVM side! Thanks!
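Andrea's suggested placement can be modelled in a few lines (all names illustrative, not mm/ code): run the Linux rmap teardown first, and only once the page is actually unmapped notify the secondary MMU, outside try_to_unmap() itself.

```c
/* Toy model of the hook-placement argument above: the external-rmap
 * notifier fires after try_to_unmap(), and only for pages that both lost
 * their Linux ptes and have an external (secondary-MMU) rmap. */
#include <assert.h>
#include <stdbool.h>

struct page { bool mapped; bool external_rmap; };

static int notifier_fired;

static void mmu_rmap_notifier_invalidate(struct page *p)
{
    (void)p;
    notifier_fired++;       /* stands in for invalidating the secondary MMU */
}

static void try_to_unmap(struct page *p)
{
    p->mapped = false;      /* Linux rmap teardown */
}

static void shrink_page(struct page *p)
{
    try_to_unmap(p);                     /* Linux ptes go first */
    if (!p->mapped && p->external_rmap)  /* then, and only then, ... */
        mmu_rmap_notifier_invalidate(p); /* ...tell the secondary MMU */
}
```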
Re: [kvm-devel] kvm-60 crash
Hello --

Running kvm-60 under Ubuntu (2.6.22.14-generic, x86_64), using http://plan9.bell-labs.com/plan9/download/plan9.iso.bz2 as the guest, crashes as shown immediately after selecting install from the Plan 9 boot menu. It does not crash with the -no-kvm switch.

-- rec --

---
/usr/local/kvm/bin/qemu-system-x86_64 -hda ../KVM/plan9.img -cdrom ../ISO/plan9.iso -boot d -m 1024

exception 13 (0)
rax 00010010 rbx 0001 rcx f0012000 rdx 00a1 rsi f0101000 rdi f0009000 rsp 7bfc rbp f0001320 r8 r9 r10 r11 r12 r13 r14 r15
rip 00100239 rflags 00033002
cs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) ds (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) es (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) ss (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) fs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) gs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
tr (fffbd000/2088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0) ldt (/ p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt 14000/4f idt 0/3ff
cr0 10010 cr2 0 cr3 12000 cr4 d0 cr8 0 efer 0
code: 10 00 48 00 66 0f 20 c0 66 31 db 66 43 66 31 d8 66 0f 22 c0 -- ea 3e 02 00 08 b8 00 00 8e d0 bd 00 7c 8b 86 2c 00 8e d8 8b 86 28 00 8e c0 66 8b be 00 00
Aborted (core dumped)

On Jan 21, 2008 12:07 PM, Roger Critchlow [EMAIL PROTECTED] wrote:
> Hi all -- I have kvm 1:28-4ubuntu2 installed under ubuntu 7.10 running 2.6.22-14-generic. Attempting to install plan9 I get the following crash:
>
> $ kvm -nographic --cdrom ISO/plan9.iso -boot d KVM/plan9-qcow.img
> (qemu) apm ax=f000 cx=f000 dx=f000 di=fff0 ebx=9e42 esi=-f0010
> initial probe, to find plan9.ini...dev A0 port 1F0 config 0040 capabilities 0B00 mwdma 0007 udma 203F LLBA sectors 4194304 dev A0 port 170 config 85C0 capabilities 0300 mwdma 0007 udma 203F
> found partition sdD0!cdboot; 50782+1440
> using sdD0!cdboot!plan9.ini
> .
> Plan 9 Startup Menu:
> 1. Install Plan 9 from this CD
> 2. Boot Plan 9 from this CD
> 3. Boot Plan 9 from this CD and debug 9load
> Selection: 1
> 1
> booting sdD0!cdboot!9pcflop.gz
> found 9pcflop.gz .gz..
> 1208343 = 792836+1045244+130816=1968896
> entry: f0100020
> exception 13 (0)
> rax 00010010 rbx 0001 rcx f0012000 rdx 00a1 rsi f0101000 rdi f0009000 rsp 7bfc rbp f0001320 r8 r9 r10 r11 r12 r13 r14 r15
> rip 00100239 rflags 00033002
> cs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) ds (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) es (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) ss (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) fs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0) gs (/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
> tr (0885/2088 p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0) ldt (/ p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
> gdt 14000/4f idt 0/3ff
> cr0 10010 cr2 0 cr3 12000 cr4 d0 cr8 0 efer 0
> code: ea 3e 02 00 08 b8 00 00 8e d0 bd 00 7c 8b 86 2c 00 8e d8 8b 86 28 00 8e c0 66 8b be 00 00
> Aborted (core dumped)
>
> The install succeeds if I use qemu and leave it running all night. But booting the installed image using kvm in the morning gets the identical crash, except for the rbp value:
>
> rsi f0101000 rdi f0009000 rsp 7bfc rbp f0001320
> ---
> rsi f0101000 rdi f0009000 rsp 7bfc rbp f0001314
>
> -- rec --
[kvm-devel] [PATCH 5/8] MMU: make the __nonpaging_map function generic
The mapping function for the nonpaging case in the softmmu does basically the same as is required for Nested Paging. Make this function generic so it can be used for both.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c | 7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 635e70c..dfbcf5e 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -956,10 +956,9 @@ static void nonpaging_new_cr3(struct kvm_vcpu *vcpu)
 {
 }
 
-static int __nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write,
-			   gfn_t gfn, struct page *page)
+static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
+			gfn_t gfn, struct page *page, int level)
 {
-	int level = PT32E_ROOT_LEVEL;
 	hpa_t table_addr = vcpu->arch.mmu.root_hpa;
 	int pt_write = 0;
 
@@ -1017,7 +1016,7 @@ static int nonpaging_map(struct kvm_vcpu *vcpu, gva_t v, int write, gfn_t gfn)
 
 	spin_lock(&vcpu->kvm->mmu_lock);
 	kvm_mmu_free_some_pages(vcpu);
-	r = __nonpaging_map(vcpu, v, write, gfn, page);
+	r = __direct_map(vcpu, v, write, gfn, page, PT32E_ROOT_LEVEL);
 	spin_unlock(&vcpu->kvm->mmu_lock);
 	up_read(&current->mm->mmap_sem);
-- 
1.5.3.7
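The point of the patch is that the page-table walk itself is identical for the nonpaging softmmu and for Nested Paging; only the starting level differs, so it becomes a parameter. A deliberately simplified standalone model (the constants mirror the kernel's PT32E_ROOT_LEVEL/PT64_ROOT_LEVEL names, but the walk is a toy, not kvm's shadow-page code):

```c
/* Toy model: a direct-map walk parameterised by root level, as
 * __direct_map() is after this patch. */
#include <assert.h>

#define PT32E_ROOT_LEVEL 3   /* nonpaging softmmu caller */
#define PT64_ROOT_LEVEL  4   /* what an NPT-style caller could pass */

/* returns how many table levels were traversed before the leaf pte */
static int direct_map(int level)
{
    int walked = 0;
    while (level > 1) {   /* descend until the last-level table */
        walked++;         /* in kvm this allocates/looks up a shadow page */
        level--;
    }
    return walked;        /* the leaf pte is written at level 1 */
}
```

The same loop body serves both callers; only the argument changes.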
[kvm-devel] [PATCH 4/8] X86: export information about NPT to generic x86 code
The generic x86 code has to know whether the specific implementation uses Nested Paging. In the generic code Nested Paging is called Hardware Assisted Paging (HAP), to avoid confusion with (future) HAP implementations of other vendors. This patch exports the availability of HAP to the generic x86 code.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c         | 7 +++++++
 arch/x86/kvm/vmx.c         | 7 +++++++
 include/asm-x86/kvm_host.h | 2 ++
 3 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 2e718ff..d0bfdd8 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1678,6 +1678,11 @@ static bool svm_cpu_has_accelerated_tpr(void)
 	return false;
 }
 
+static bool svm_hap_enabled(void)
+{
+	return npt_enabled;
+}
+
 static struct kvm_x86_ops svm_x86_ops = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -1734,6 +1739,8 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.inject_pending_vectors = do_interrupt_requests,
 
 	.set_tss_addr = svm_set_tss_addr,
+
+	.hap_enabled = svm_hap_enabled,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 00a00e4..8feb775 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2631,6 +2631,11 @@ static void __init vmx_check_processor_compat(void *rtn)
 	}
 }
 
+static bool vmx_hap_enabled(void)
+{
+	return false;
+}
+
 static struct kvm_x86_ops vmx_x86_ops = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
@@ -2688,6 +2693,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.inject_pending_vectors = do_interrupt_requests,
 
 	.set_tss_addr = vmx_set_tss_addr,
+
+	.hap_enabled = vmx_hap_enabled,
 };
 
 static int __init vmx_init(void)
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 67ae307..45a9d05 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -392,6 +392,8 @@ struct kvm_x86_ops {
 			       struct kvm_run *run);
 
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
+
+	bool (*hap_enabled)(void);
 };
 
 extern struct kvm_x86_ops *kvm_x86_ops;
-- 
1.5.3.7
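How generic code would consume the new hook can be sketched with a standalone function-pointer model (this mirrors the kvm_x86_ops idiom the patch extends, but the struct and helper here are illustrative, not the kernel's):

```c
/* Minimal model: each vendor module fills in hap_enabled, and common
 * code asks the ops table whether hardware-assisted paging is active. */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct x86_ops {
    bool (*hap_enabled)(void);
};

static bool svm_hap_enabled(void) { return true;  }  /* npt_enabled stand-in */
static bool vmx_hap_enabled(void) { return false; }  /* EPT not wired up yet */

/* generic code: use the soft mmu unless the vendor module reports HAP */
static const char *choose_mmu(const struct x86_ops *ops)
{
    return ops->hap_enabled() ? "hap" : "softmmu";
}
```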
[kvm-devel] [PATCH 2/8] SVM: add detection of Nested Paging feature
Let SVM detect whether the Nested Paging feature is available on the hardware. It stays disabled here to keep this patch series bisectable.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c | 8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 0c58527..49bb57a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -47,6 +47,8 @@ MODULE_LICENSE("GPL");
 #define SVM_FEATURE_LBRV (1 << 1)
 #define SVM_DEATURE_SVML (1 << 2)
 
+static bool npt_enabled = false;
+
 static void kvm_reput_irq(struct vcpu_svm *svm);
 
 static inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
@@ -410,6 +412,12 @@ static __init int svm_hardware_setup(void)
 
 	svm_features = cpuid_edx(SVM_CPUID_FUNC);
 
+	if (!svm_has(SVM_FEATURE_NPT))
+		npt_enabled = false;
+
+	if (npt_enabled)
+		printk(KERN_INFO "kvm: Nested Paging enabled\n");
+
 	return 0;
 
 err_2:
-- 
1.5.3.7
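The detection pattern here is a plain bit test on a CPUID-reported feature mask. A standalone sketch (the bit position and the canned svm_features value are assumptions for illustration; the real SVM_FEATURE_NPT bit comes from the AMD manual, and svm_features would be filled from cpuid_edx(SVM_CPUID_FUNC)):

```c
/* Sketch of SVM feature detection: features arrive as a bit mask in
 * CPUID edx; NPT availability is one bit of that mask. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SVM_FEATURE_NPT (1u << 0)   /* assumed bit position for this sketch */

static uint32_t svm_features;       /* would be cpuid_edx(SVM_CPUID_FUNC) */

static bool svm_has(uint32_t feat)
{
    return (svm_features & feat) != 0;
}
```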
[kvm-devel] [PATCH 1/8] SVM: move feature detection to hardware setup code
By moving the SVM feature detection from the per-cpu enable code to the hardware setup code, it runs only once. As an additional advantage, the feature check is now available earlier in the module setup process.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c | 4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 7bdbe16..0c58527 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -302,7 +302,6 @@ static void svm_hardware_enable(void *garbage)
 	svm_data->asid_generation = 1;
 	svm_data->max_asid = cpuid_ebx(SVM_CPUID_FUNC) - 1;
 	svm_data->next_asid = svm_data->max_asid + 1;
-	svm_features = cpuid_edx(SVM_CPUID_FUNC);
 
 	asm volatile ("sgdt %0" : "=m"(gdt_descr));
 	gdt = (struct desc_struct *)gdt_descr.address;
@@ -408,6 +407,9 @@ static __init int svm_hardware_setup(void)
 		if (r)
 			goto err_2;
 	}
+
+	svm_features = cpuid_edx(SVM_CPUID_FUNC);
+
 	return 0;
 
 err_2:
-- 
1.5.3.7
[kvm-devel] attached is kvm crash log with a kernel BUG at /usr/src/modules/kvm/mmu.c:307!
kernel BUG at /usr/src/modules/kvm/mmu.c:307! invalid opcode: [1] SMP CPU 1

The problem is not there when running without the kvm and kvm_amd modules loaded. This happens after adding Microsoft Service Pack 4 to a Win2k install. The package was running with -no-acpi. Reproducible; -win2k-hack makes no difference. It doesn't happen with earlier versions of win2k, except when it was installing the new KDE packages of windows on a pristine win2k service pack 2 install. Kvm works fine on other OSes, except it does crash with the OS used for Norton Ghost on an e-machines XP image restore. It worked ok for image restore running on freedos. The same images run fine in pure qemu mode.

OS is debian testing; kernel is: Linux miro 2.6.24-rc5 #1 SMP Wed Dec 26 00:53:14 CST 2007 x86_64 GNU/Linux

cat /proc/cpuinfo
processor: 0 vendor_id: AuthenticAMD cpu family: 15 model: 75 model name: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping: 2 cpu MHz: 2009.246 cache size: 512 KB physical id: 0 siblings: 2 core id: 0 cpu cores: 2 fpu: yes fpu_exception: yes cpuid level: 1 wp: yes flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy bogomips: 4021.07 TLB size: 1024 4K pages clflush size: 64 cache_alignment: 64 address sizes: 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc
processor: 1 vendor_id: AuthenticAMD cpu family: 15 model: 75 model name: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping: 2 cpu MHz: 2009.246 cache size: 512 KB physical id: 0 siblings: 2 core id: 1 cpu cores: 2 fpu: yes fpu_exception: yes cpuid level: 1 wp: yes flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy bogomips: 4018.93 TLB size: 1024 4K pages clflush size: 64
cache_alignment: 64 address sizes: 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc df Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/miro-root 1368927678698619861 53% / tmpfs 997808 0997808 0% /lib/init/rw udev 10240 112 10128 2% /dev tmpfs 997808 0997808 0% /dev/shm /dev/sda1 241116111562117106 49% /boot /dev/mapper/miro-home2 309313552 71451664 222149632 25% /home /dev/mapper/miro-tmp388741 10319358352 3% /tmp /dev/mapper/miro-usr 44044120 14213296 27600220 34% /usr /dev/mapper/miro-var 11820976 2117964 9103264 19% /var KVM version:/sbin/modinfo kvm filename: /lib/modules/2.6.24-rc5/misc/kvm.ko license:GPL author: Qumranet version:kvm-48 srcversion: 63B0F92A3F1152C05FE5A8F depends: vermagic: 2.6.24-rc5 SMP mod_unload /sbin/modinfo kvm_amd filename: /lib/modules/2.6.24-rc5/misc/kvm-amd.ko license:GPL author: Qumranet version:kvm-48 srcversion: 22F744921D178E88E9B84A7 depends:kvm vermagic: 2.6.24-rc5 SMP mod_unload host arch: x86_64 guest that crashed: win2k srv pack 4 start cmd line: kvm -hda /home/watermod/KVM/win2k_srvpk4_.img -m 512 -no-acpi no-kvm - works fine. Jan 23 01:35:50 miro kernel: [ cut here ] Jan 23 01:35:50 miro kernel: kernel BUG at /usr/src/modules/kvm/mmu.c:307! 
Jan 23 01:35:50 miro kernel: invalid opcode: [1] SMP Jan 23 01:35:50 miro kernel: CPU 1 Jan 23 01:35:50 miro kernel: Modules linked in: nls_iso8859_1 cifs kvm_amd kvm nvidia(P) binfmt_misc ppdev ipv6 fuse tun loop snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 snd_seq_dummy snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi firmware_class snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep psmouse parport_pc parport snd pcspkr serio_raw emu10k1_gp k8temp soundcore gameport i2c_nforce2 i2c_core button evdev ext3 jbd mbcache dm_mirror dm_snapshot dm_mod sg usbhid sr_mod cdrom sd_mod sata_nv pata_amd libata r8169 scsi_mod ehci_hcd ohci_hcd thermal processor fan Jan 23 01:35:50 miro kernel: Pid: 17315, comm: kvm Tainted: P2.6.24-rc5 #1 Jan 23 01:35:50 miro kernel: RIP: 0010:[8894c43d] [8894c43d] :kvm:mmu_memory_cache_alloc+0xd/0x2a Jan 23 01:35:50 miro kernel: RSP: 0018:81005ebad9e8 EFLAGS: 00010246 Jan 23 01:35:50 miro kernel: RAX: RBX: c20004a02428 RCX:
Re: [kvm-devel] [PATCH] QEMU support for virtio balloon driver
On Saturday 26 January 2008 03:08:57 Marcelo Tosatti wrote:
> On Thu, Jan 24, 2008 at 04:29:51PM -0600, Anthony Liguori wrote:
> > I'm inclined to think that we should have a capability check for MM notifiers and just not do madvise if they aren't present. I don't think the ioctl approach that Marcelo took is sufficient, as a malicious guest could possibly hose the host.
>
> How's that? The ioctl damage is contained to the guest (other than CPU processing time, which the guest can cause in other ways). Anyway, I don't see the need for back compat with older hosts. Having the guest allocate and not touch memory means that it should

[Cut last 523 lines of email, all of which was quoting]

Look, I realize that Anthony is unable to control his massive quoting diarrhea, but I expect better from you, Marcelo. It wastes everyone else's time scanning it for new content, and this habit seems to be spreading. If this keeps up, I'll have no choice but to start sending spams with such a massive quote ratio just so everyone's filters start dropping them...

Rusty.
Re: [kvm-devel] [PATCH][RFC] SVM: Add Support for Nested Paging in AMD Fam16 CPUs
Joerg Roedel wrote:
> Hi, here is the first release of patches for KVM to support the Nested Paging (NPT) feature of AMD QuadCore CPUs, for comments and public testing. This feature improves guest performance significantly; I measured an improvement of around 17% using kernbench in my first tests. This patch series has been tested primarily with Linux guests (32 bit legacy paging, 32 bit PAE paging and 64 bit Long Mode), and also with Windows Vista 32 bit and 64 bit. All these guests ran successfully with these patches. The patch series only enables NPT for 64 bit Linux hosts at the moment. Please give these patches a good and deep testing. I hope we have this patchset ready for merging soon.

Good. We also ported the EPT patch for Xen to KVM, which we submitted last year, and we've been cleaning the patch up with Avi. We are working on live migration support now, and we'll submit the patch once it's done. So please stay tuned.

Jun
---
Intel Open Source Technology Center
Re: [kvm-devel] [PATCH] QEMU support for virtio balloon driver
Dor Laor wrote:
> On Thu, 2008-01-24 at 16:29 -0600, Anthony Liguori wrote:
> > Anthony Liguori wrote:
> > > This patch adds support to QEMU for Rusty's recently introduced virtio balloon driver. The user-facing portions of this are the introduction of a balloon and info balloon command in the monitor. I think using madvise unconditionally is okay but I am not sure.
> >
> > Looks like it's not. I just hung my host system after doing a bunch of ballooning with a kernel that doesn't have MM notifiers. I'm inclined to think that we should have a capability check for MM notifiers and just not do madvise if they aren't present. I don't think the ioctl approach that Marcelo took is sufficient, as a malicious guest could possibly hose the host.
>
> The ioctl to zap the shadow pages is needed in order to free memory fast. Without it the balloon will evacuate memory too slowly for common mgmt applications (running additional VMs).

I think that assertion needs some performance numbers to back it up. Linux will write unused pages to swap, such that when it does need to obtain memory it can simply reclaim pages without doing any disk IO. The real advantage of using madvise() is that it doesn't use any swap space (at least, on Linux).

> This ioctl (on older kernels only) can hose the host, but so can malicious guests that do dummy cr3 switching and other hackery.

What do you mean by that? The guest really shouldn't be able to hose the host regardless of what it puts in cr3. If it can, then that's a very serious bug.

> If one really insists, he can always add a timer to this ioctl to slow potential malicious guests.

The issue is the atomicity of removing something from the shadow MMU cache and then madvise()'ing (since madvise is incapable of evicting from the shadow MMU cache without MMU notifiers). The only real solution I know of would be to introduce an ioctl that is essentially madvise-and-remove-from-shadow-MMU.

Regards,

Anthony Liguori

> > Having the guest allocate and not touch memory means that it should eventually be removed from the shadow page cache and eventually swapped out, so ballooning isn't totally useless in the absence of MM notifiers.
> >
> > Regards,
> > Anthony Liguori
Re: [kvm-devel] [PATCH 3/8] SVM: add module parameter to disable NestedPaging
On Fri, Jan 25, 2008 at 05:47:11PM -0800, Nakajima, Jun wrote:
> Joerg Roedel wrote:
> > To disable the use of the Nested Paging feature even if it is available in hardware, this patch adds a module parameter. Nested Paging can be disabled by passing npt=off to the kvm_amd module.
>
> I think it's better to use a (common) parameter to qemu. That way you can control on/off for each VM.

Generally I see no problem with it. But at least for NPT I don't see a reason why someone would want to disable it on a per-VM basis (as long as it works stably). Avi, what do you think?

Joerg
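The two control points under discussion compose naturally: a module-wide switch (what the patch adds via npt=off) and a hypothetical per-VM override (what Jun suggests exposing through qemu). A standalone model of that policy, with all names invented for the sketch and no relation to the actual kvm_amd parameter handling:

```c
/* Toy policy model: the module-wide setting always wins when it is off;
 * otherwise a per-VM preference can opt out. */
#include <assert.h>
#include <stdbool.h>

static bool npt_module_enabled = true;   /* kvm_amd npt=off would clear this */

/* per_vm: -1 = no preference, 0 = force off, 1 = request on */
static bool vm_uses_npt(int per_vm)
{
    if (!npt_module_enabled)
        return false;                    /* module/hardware setting wins */
    return per_vm != 0;                  /* default on unless the VM opts out */
}
```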
Re: [kvm-devel] [PATCH 3/8] SVM: add module parameter to disable NestedPaging
Joerg Roedel wrote:
> To disable the use of the Nested Paging feature even if it is available in hardware, this patch adds a module parameter. Nested Paging can be disabled by passing npt=off to the kvm_amd module.

I think it's better to use a (common) parameter to qemu. That way you can control on/off for each VM.

Jun
---
Intel Open Source Technology Center
Re: [kvm-devel] [PATCH 4/8] X86: export information about NPT to generic x86 code
Joerg Roedel wrote: The generic x86 code has to know if the specific implementation uses Nested Paging. In the generic code Nested Paging is called Hardware Assisted Paging (HAP) to avoid confusion with (future) HAP implementations of other vendors. This patch exports the availability of HAP to the generic x86 code. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/svm.c |7 +++ arch/x86/kvm/vmx.c |7 +++ include/asm-x86/kvm_host.h |2 ++ 3 files changed, 16 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 2e718ff..d0bfdd8 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1678,6 +1678,11 @@ static bool svm_cpu_has_accelerated_tpr(void) return false; } +static bool svm_hap_enabled(void) +{ + return npt_enabled; +} + To help with bisecting, you should probably return false here until the patch that actually implements NPT support. Otherwise, the 7th patch in this series breaks KVM for SVM. Regards, Anthony Liguori static struct kvm_x86_ops svm_x86_ops = { .cpu_has_kvm_support = has_svm, .disabled_by_bios = is_disabled, @@ -1734,6 +1739,8 @@ static struct kvm_x86_ops svm_x86_ops = { .inject_pending_vectors = do_interrupt_requests, .set_tss_addr = svm_set_tss_addr, + + .hap_enabled = svm_hap_enabled, }; static int __init svm_init(void) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 00a00e4..8feb775 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2631,6 +2631,11 @@ static void __init vmx_check_processor_compat(void *rtn) } } +static bool vmx_hap_enabled(void) +{ + return false; +} + static struct kvm_x86_ops vmx_x86_ops = { .cpu_has_kvm_support = cpu_has_kvm_support, .disabled_by_bios = vmx_disabled_by_bios, @@ -2688,6 +2693,8 @@ static struct kvm_x86_ops vmx_x86_ops = { .inject_pending_vectors = do_interrupt_requests, .set_tss_addr = vmx_set_tss_addr, + + .hap_enabled = vmx_hap_enabled, }; static int __init vmx_init(void) diff --git a/include/asm-x86/kvm_host.h 
b/include/asm-x86/kvm_host.h index 67ae307..45a9d05 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -392,6 +392,8 @@ struct kvm_x86_ops { struct kvm_run *run); int (*set_tss_addr)(struct kvm *kvm, unsigned int addr); + + bool (*hap_enabled)(void); }; extern struct kvm_x86_ops *kvm_x86_ops;
Re: [kvm-devel] [patch 0/4] [RFC] MMU Notifiers V1
On Sat, 26 Jan 2008, Benjamin Herrenschmidt wrote: Also, wouldn't there be a problem with something trying to use that interface to keep in sync a secondary device MMU such as the DRM or other accelerators, which might need virtual address based invalidation ? Yes, just doing the rmap based solution would have required DRM etc. to maintain their own rmaps. So it looks like we need to go with both variants. Note that secondary device MMUs that need to run code outside of atomic context may still need to create their own rmaps.
Re: [kvm-devel] [patch 0/4] [RFC] MMU Notifiers V1
On Fri, 2008-01-25 at 12:42 +0100, Andrea Arcangeli wrote: On Thu, Jan 24, 2008 at 09:56:06PM -0800, Christoph Lameter wrote: Andrea's mmu_notifier #4 - RFC V1 - Merge subsystem rmap based with Linux rmap based approach - Move Linux rmap based notifiers out of macro - Try to account for what locks are held while the notifiers are called. - Develop a patch sequence that separates out the different types of hooks so that it is easier to review their use. - Avoid adding #include to linux/mm_types.h - Integrate RCU logic suggested by Peter. I'm glad you're converging on something a bit saner and much much closer to my code, plus perfectly usable by KVM optimal rmap design too. I would have preferred if you would have sent me patches like Peter did for review and merging etc... that would have made review especially easier. Anyway I'm used to that on lkml so it's ok, I just need this patch to be included in mainline, everything else is irrelevant to me. Also, wouldn't there be a problem with something trying to use that interface to keep in sync a secondary device MMU such as the DRM or other accelerators, which might need virtual address based invalidation ? Ben.
Re: [kvm-devel] [patch 1/4] mmu_notifier: Core code
Diff so far against V1 - Improve RCU support. (There is now a synchronize_rcu in mmu_release which is bad.) - Clean compile for !MMU_NOTIFIER - Use mmap_sem for serializing additions to the mmu_notifier list in the mm_struct (but still global spinlock for mmu_rmap_notifier. The registration function is called only a couple of times) - --- include/linux/list.h | 14 ++ include/linux/mm_types.h |2 -- include/linux/mmu_notifier.h | 39 --- mm/mmu_notifier.c| 28 +++- 4 files changed, 69 insertions(+), 14 deletions(-) Index: linux-2.6/mm/mmu_notifier.c === --- linux-2.6.orig/mm/mmu_notifier.c 2008-01-25 12:14:49.0 -0800 +++ linux-2.6/mm/mmu_notifier.c 2008-01-25 12:14:49.0 -0800 @@ -15,17 +15,18 @@ void mmu_notifier_release(struct mm_struct *mm) { struct mmu_notifier *mn; - struct hlist_node *n; + struct hlist_node *n, *t; if (unlikely(!hlist_empty(&mm->mmu_notifier.head))) { rcu_read_lock(); - hlist_for_each_entry_rcu(mn, n, + hlist_for_each_entry_safe_rcu(mn, n, t, &mm->mmu_notifier.head, hlist) { if (mn->ops->release) mn->ops->release(mn, mm); hlist_del(&mn->hlist); } rcu_read_unlock(); + synchronize_rcu(); } } @@ -53,24 +54,33 @@ int mmu_notifier_age_page(struct mm_stru return young; } -static DEFINE_SPINLOCK(mmu_notifier_list_lock); +/* + * Note that all notifiers use RCU. The updates are only guaranteed to be + * visible to other processes after a RCU quiescent period!
+ */ +void __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm) +{ + hlist_add_head_rcu(&mn->hlist, &mm->mmu_notifier.head); +} +EXPORT_SYMBOL_GPL(__mmu_notifier_register); void mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm) { - spin_lock(&mmu_notifier_list_lock); - hlist_add_head(&mn->hlist, &mm->mmu_notifier.head); - spin_unlock(&mmu_notifier_list_lock); + down_write(&mm->mmap_sem); + __mmu_notifier_register(mn, mm); + up_write(&mm->mmap_sem); } EXPORT_SYMBOL_GPL(mmu_notifier_register); void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm) { - spin_lock(&mmu_notifier_list_lock); - hlist_del(&mn->hlist); - spin_unlock(&mmu_notifier_list_lock); + down_write(&mm->mmap_sem); + hlist_del_rcu(&mn->hlist); + up_write(&mm->mmap_sem); } EXPORT_SYMBOL_GPL(mmu_notifier_unregister); +static DEFINE_SPINLOCK(mmu_notifier_list_lock); HLIST_HEAD(mmu_rmap_notifier_list); void mmu_rmap_notifier_register(struct mmu_rmap_notifier *mrn) Index: linux-2.6/include/linux/list.h === --- linux-2.6.orig/include/linux/list.h 2008-01-25 12:14:47.0 -0800 +++ linux-2.6/include/linux/list.h 2008-01-25 12:14:49.0 -0800 @@ -991,6 +991,20 @@ static inline void hlist_add_after_rcu(s ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \ pos = pos->next) +/** + * hlist_for_each_entry_safe_rcu - iterate over list of given type + * @tpos: the type * to use as a loop cursor. + * @pos: the struct hlist_node to use as a loop cursor. + * @n: temporary pointer + * @head: the head for your list. + * @member: the name of the hlist_node within the struct.
+ */ +#define hlist_for_each_entry_safe_rcu(tpos, pos, n, head, member) \ + for (pos = (head)->first; \ + rcu_dereference(pos) && ({ n = pos->next; 1;}) && \ + ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1;}); \ + pos = n) + #else #warning don't include kernel headers in userspace #endif /* __KERNEL__ */ Index: linux-2.6/include/linux/mm_types.h === --- linux-2.6.orig/include/linux/mm_types.h 2008-01-25 12:14:49.0 -0800 +++ linux-2.6/include/linux/mm_types.h 2008-01-25 12:14:49.0 -0800 @@ -224,9 +224,7 @@ struct mm_struct { rwlock_t ioctx_list_lock; struct kioctx *ioctx_list; -#ifdef CONFIG_MMU_NOTIFIER struct mmu_notifier_head mmu_notifier; /* MMU notifier list */ -#endif }; #endif /* _LINUX_MM_TYPES_H */ Index: linux-2.6/include/linux/mmu_notifier.h === --- linux-2.6.orig/include/linux/mmu_notifier.h 2008-01-25 12:14:49.0 -0800 +++ linux-2.6/include/linux/mmu_notifier.h 2008-01-25 13:07:54.0 -0800 @@ -46,6 +46,10 @@ struct mmu_notifier { }; struct mmu_notifier_ops { + /* +
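The pattern behind the new `hlist_for_each_entry_safe_rcu` macro — capture the next pointer in a temporary before the loop body is allowed to unlink or free the current node — can be sketched in plain userspace C. This is an illustrative standalone analogue, not kernel code; the names (`node_push`, `list_remove_all`) are invented for the sketch:

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal singly-linked list mirroring the "safe" iteration idiom:
 * cache the next pointer before the loop body may free the node. */
struct node {
    int val;
    struct node *next;
};

struct node *node_push(struct node *head, int val)
{
    struct node *n = malloc(sizeof(*n));
    n->val = val;
    n->next = head;
    return n;
}

/* Remove and free every node matching `val`; the saved `next`
 * pointer keeps the traversal valid after free(). */
struct node *list_remove_all(struct node *head, int val)
{
    struct node **pp = &head;
    struct node *cur, *next;   /* `next` plays the role of the macro's `n` cursor */

    for (cur = *pp; cur; cur = next) {
        next = cur->next;      /* grab before the potential free() below */
        if (cur->val == val) {
            *pp = next;        /* unlink */
            free(cur);
        } else {
            pp = &cur->next;
        }
    }
    return head;
}

int list_length(const struct node *head)
{
    int len = 0;
    for (; head; head = head->next)
        len++;
    return len;
}
```

Without the saved `next` cursor, evaluating `cur = cur->next` after `free(cur)` would read freed memory — exactly the hazard the `_safe` iteration variants guard against.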
[kvm-devel] [PATCH 8/8] SVM: add support for Nested Paging
This patch contains the SVM architecture dependent changes for KVM to enable support for the Nested Paging feature of AMD Barcelona and Phenom processors. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/svm.c | 67 --- 1 files changed, 63 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index d0bfdd8..578d8ec 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -47,7 +47,12 @@ MODULE_LICENSE("GPL"); #define SVM_FEATURE_LBRV (1 << 1) #define SVM_DEATURE_SVML (1 << 2) +#ifdef CONFIG_X86_64 +static bool npt_enabled = true; +#else static bool npt_enabled = false; +#endif + static char *npt = "on"; module_param(npt, charp, S_IRUGO); @@ -187,7 +192,7 @@ static inline void flush_guest_tlb(struct kvm_vcpu *vcpu) static void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer) { - if (!(efer & EFER_LMA)) + if (!npt_enabled && !(efer & EFER_LMA)) efer &= ~EFER_LME; to_svm(vcpu)->vmcb->save.efer = efer | MSR_EFER_SVME_MASK; @@ -568,6 +573,24 @@ static void init_vmcb(struct vmcb *vmcb) save->cr0 = 0x0010 | X86_CR0_PG | X86_CR0_WP; save->cr4 = X86_CR4_PAE; /* rdx = ??
*/ + + if (npt_enabled) { + /* Setup VMCB for Nested Paging */ + control->nested_ctl = 1; + control->intercept_exceptions &= ~(1 << PF_VECTOR); + control->intercept_cr_read &= ~(INTERCEPT_CR0_MASK| + INTERCEPT_CR3_MASK| + INTERCEPT_CR4_MASK); + control->intercept_cr_write &= ~(INTERCEPT_CR0_MASK| + INTERCEPT_CR3_MASK| + INTERCEPT_CR4_MASK); + save->g_pat = 0x0007040600070406ULL; + /* enable caching because the QEMU Bios doesn't enable it */ + save->cr0 = X86_CR0_ET; + save->cr3 = 0; + save->cr4 = 0; + } + } static int svm_vcpu_reset(struct kvm_vcpu *vcpu) @@ -789,6 +812,15 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) { struct vcpu_svm *svm = to_svm(vcpu); + if (npt_enabled) { + /* + * re-enable caching here because the QEMU bios + * does not do it - this results in some delay at + * reboot + */ + cr0 &= ~(X86_CR0_CD | X86_CR0_NW); + goto set; + } #ifdef CONFIG_X86_64 if (vcpu->arch.shadow_efer & EFER_LME) { if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) { @@ -812,13 +844,16 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0) cr0 &= ~(X86_CR0_CD | X86_CR0_NW); if (!vcpu->fpu_active) cr0 |= X86_CR0_TS; +set: svm->vmcb->save.cr0 = cr0; } static void svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { vcpu->arch.cr4 = cr4; - to_svm(vcpu)->vmcb->save.cr4 = cr4 | X86_CR4_PAE; + if (!npt_enabled) + cr4 |= X86_CR4_PAE; + to_svm(vcpu)->vmcb->save.cr4 = cr4; } static void svm_set_segment(struct kvm_vcpu *vcpu, @@ -1284,14 +1319,31 @@ static int (*svm_exit_handlers[])(struct vcpu_svm *svm, [SVM_EXIT_WBINVD] = emulate_on_interception, [SVM_EXIT_MONITOR] = invalid_op_interception, [SVM_EXIT_MWAIT] = invalid_op_interception, + [SVM_EXIT_NPF] = pf_interception, }; - static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); u32 exit_code = svm->vmcb->control.exit_code; + if (npt_enabled) { + int mmu_reload = 0; + if (((vcpu->arch.cr0 ^ svm->vmcb->save.cr0) & X86_CR0_PG) + || ((vcpu->arch.cr4 ^ svm->vmcb->save.cr4) +
& (X86_CR4_PGE|X86_CR4_PAE))) + mmu_reload = 1; + vcpu->arch.cr0 = svm->vmcb->save.cr0; + vcpu->arch.cr4 = svm->vmcb->save.cr4; + vcpu->arch.cr3 = svm->vmcb->save.cr3; + if (mmu_reload) { + kvm_mmu_reset_context(vcpu); + kvm_mmu_load(vcpu); + } + if (is_pae(vcpu) && !is_long_mode(vcpu)) + load_pdptrs(vcpu, vcpu->arch.cr3); + } + kvm_reput_irq(svm); if (svm->vmcb->control.exit_code == SVM_EXIT_ERR) { @@ -1302,7 +1354,8 @@ static int handle_exit(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu) } if (is_external_interrupt(svm->vmcb->control.exit_int_info) && - exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR) + exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR && + exit_code != SVM_EXIT_NPF) printk(KERN_ERR "%s: unexpected exit_ini_info
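The exit handler above reloads the MMU context only when paging-related control-register bits changed while the guest ran. The XOR-and-mask test it uses can be isolated as a tiny standalone sketch (bit positions per the x86 architecture; the helper name is invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_PG  (1ul << 31)   /* paging enable */
#define X86_CR4_PAE (1ul << 5)    /* physical address extension */
#define X86_CR4_PGE (1ul << 7)    /* global pages */

/* Return true when bits that affect the paging mode differ between
 * the cached control registers and the values the guest just ran
 * with; XOR exposes changed bits, the mask selects relevant ones. */
static bool mmu_reload_needed(uint64_t old_cr0, uint64_t new_cr0,
                              uint64_t old_cr4, uint64_t new_cr4)
{
    return ((old_cr0 ^ new_cr0) & X86_CR0_PG) ||
           ((old_cr4 ^ new_cr4) & (X86_CR4_PGE | X86_CR4_PAE));
}
```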
[kvm-devel] [PATCH 3/8] SVM: add module parameter to disable Nested Paging
To disable the use of the Nested Paging feature even if it is available in hardware this patch adds a module parameter. Nested Paging can be disabled by passing npt=off to the kvm_amd module. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/svm.c |8 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 49bb57a..2e718ff 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -48,6 +48,9 @@ MODULE_LICENSE("GPL"); #define SVM_DEATURE_SVML (1 << 2) static bool npt_enabled = false; +static char *npt = "on"; + +module_param(npt, charp, S_IRUGO); static void kvm_reput_irq(struct vcpu_svm *svm); @@ -415,6 +418,11 @@ static __init int svm_hardware_setup(void) if (!svm_has(SVM_FEATURE_NPT)) npt_enabled = false; + if (npt_enabled && strncmp(npt, "off", 3) == 0) { + printk(KERN_INFO "kvm: Nested Paging disabled\n"); + npt_enabled = false; + } + if (npt_enabled) printk(KERN_INFO "kvm: Nested Paging enabled\n"); -- 1.5.3.7
[kvm-devel] [PATCH 6/8] X86: export the load_pdptrs() function to modules
The load_pdptrs() function is required in the SVM module for NPT support. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/x86.c |1 + include/asm-x86/kvm_host.h |2 ++ 2 files changed, 3 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 8f94a0b..31cdf09 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -202,6 +202,7 @@ out: return ret; } +EXPORT_SYMBOL_GPL(load_pdptrs); static bool pdptrs_changed(struct kvm_vcpu *vcpu) { diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index 45a9d05..b55a7a6 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -412,6 +412,8 @@ void kvm_mmu_zap_all(struct kvm *kvm); unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm); void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages); +int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3); + enum emulation_result { EMULATE_DONE, /* no further processing */ EMULATE_DO_MMIO, /* kvm_run filled with mmio request */ -- 1.5.3.7
Re: [kvm-devel] KVM net bug
José Antonio wrote: Hi, First I'm sorry for my English. I'm using kvm on host: Debian GNU/Linux sid host kernel: Linux enol 2.6.23-1-amd64 #1 SMP Fri Dec 21 12:00:17 UTC 2007 x86_64 GNU/Linux kvm version 58+dfsg-2 (almost on Debian) guest: Debian etch and Windows XP, although I think the problem occurs on any guest command: kvm -hda disk.img -net nic,macaddr=valid mac -net tap I describe the problem on Debian: When I execute the kvm command the guest works but the network does not, because device eth0 doesn't exist; instead eth1 exists, then eth2 after I shut down and boot the guest again. Because of this my local net doesn't work. I'm having a hard time understanding your problem. Do you have an appropriate qemu-ifup script? Regards, Anthony Liguori Regards,
Re: [kvm-devel] [patch 1/4] mmu_notifier: Core code
On Fri, Jan 25, 2008 at 11:03:07AM -0800, Christoph Lameter wrote: Shouldn't this really be protected by the down_write(mmap_sem)? Maybe: Ok. We could switch this to mmap_sem protection for the mm_struct but the rmap notifier is not associated with an mm_struct. So we would need to keep it there. Since we already have a spinlock: Just use it for both to avoid further complications. But now you are putting a global lock in where it is inappropriate. The lock is only used during register and unregister. Very low level usage. Seems to me that is the same argument used for lock_kernel. I am saying we have a perfectly reasonable way to seperate the protections down to their smallest. For the things hanging off the mm, mmap_sem, for the other list, a list specific lock. Keep in mind that on a 2048p SSI MPI job starting up, we have 2048 ranks doing this at the same time 6 times withing their address range. That seems like a lock which could get hot fairly quickly. It may be for a short period during startup and shutdown, but it is there. XPMEM, would also benefit from a call early. We could make all the segments as being torn down and start the recalls. We already have this code in and working (have since it was first written 6 years ago). In this case, all segments are torn down with a single message to each of the importing partitions. In contrast, the teardown code which would happen now would be one set of messages for each vma. So we need an additional global teardown call? Then we'd need to switch off the vma based invalidate_range()? No, EXACTLY what I originally was asking for, either move this call site up, introduce an additional mmu_notifier op, or place this one in two locations with a flag indicating which call is being made. Add a new invalidate_all() call? Then on exit we do 1. invalidate_all() That will be fine as long as we can unregister the ops notifier and free the structure. Otherwise, we end up being called needlessly. 2. 
invalidate_range() for each vma 3. release() We cannot simply move the call up because there will be future range callbacks on vma invalidation. I am not sure what this means. Right now, if you were to notify XPMEM the process is exiting, we would take care of all the recalling of pages exported by this process, clearing those pages' cache lines from cache, and raising memory protections. I would assume that moving the callout earlier would expect the same of every driver. Thanks, Robin
[kvm-devel] Merging KVM QEMU changes upstream
Hi, As most probably know, the KVM project has been maintaining a QEMU tree for some time now. Beyond support for the KVM kernel interface, the tree also contains a number of useful features like live migration, virtio, and extboot. Some of these things have been posted to qemu-devel already but were not included. I would like to work on merging the KVM changes into upstream QEMU but before I started that work, I wanted to get a read on how difficult it would be. A lot of these things were designed specifically for KVM on x86. Only now are other architectures starting to be considered. Certainly, cross-architecture emulation hasn't really been considered. I wouldn't expect anything to be merged that caused a regression for cross-architecture emulation, but I don't really have the time to get a lot of the new features working for the cross-architecture case. I would expect, though, that if these things were merged, it would make it relatively easy for someone else to do that. Is this a reasonable merge strategy? We won't introduce regressions but I can't guarantee these new things will work cross-architecture. Regards, Anthony Liguori
Re: [kvm-devel] [PATCH] QEMU support for virtio balloon driver
Marcelo Tosatti wrote: On Thu, Jan 24, 2008 at 04:29:51PM -0600, Anthony Liguori wrote: Anthony Liguori wrote: This patch adds support to QEMU for Rusty's recently introduced virtio balloon driver. The user-facing portions of this are the introduction of balloon and info balloon commands in the monitor. I think using madvise unconditionally is okay but I am not sure. Looks like it's not. I just hung my host system after doing a bunch of ballooning with a kernel that doesn't have MM notifiers. That's strange, lack of MMU notifiers will crash the guest (by use of a stale shadow entry), but not the host. What are the symptoms? It's only happened to me once because I stopped testing with madvise afterward. The guest spontaneously restarted, and a few seconds later, the machine hung. It shouldn't be hard to reproduce by just repeatedly ballooning up and down a guest. Do others expect KVM to just cope with the virtual mapping being changed out from underneath of it? I'm inclined to think that we should have a capability check for MM notifiers and just not do madvise if they aren't present. I don't think the ioctl approach that Marcelo took is sufficient as a malicious guest could possibly hose the host. How's that? The ioctl damage is contained to the guest (other than CPU processing time, which the guest can cause in other ways). Anyway, don't see the need for back compat with older hosts. Well, I'm unsure if this is a bug or expected behavior. If it's the latter, then the ioctl approach just introduces a race condition. If the guest can fault in a page after the ioctl but before the madvise(), then it can trigger the bug. Regards, Anthony Liguori Having the guest allocate and not touch memory means that it should eventually be removed from the shadow page cache and eventually swapped out, so ballooning isn't totally useless in the absence of MM notifiers.
Regards, Anthony Liguori If madvise is called on memory that is essentially locked (which is what pre-MM notifiers is like) then it should just be a nop right? Signed-off-by: Anthony Liguori [EMAIL PROTECTED] diff --git a/qemu/Makefile.target b/qemu/Makefile.target index bb7be0f..d6b4f46 100644 --- a/qemu/Makefile.target +++ b/qemu/Makefile.target @@ -464,7 +464,7 @@ VL_OBJS += rtl8139.o VL_OBJS+= hypercall.o # virtio devices -VL_OBJS += virtio.o virtio-net.o virtio-blk.o +VL_OBJS += virtio.o virtio-net.o virtio-blk.o virtio-balloon.o ifeq ($(TARGET_BASE_ARCH), i386) # Hardware support diff --git a/qemu/balloon.h b/qemu/balloon.h new file mode 100644 index 000..ffce1fa --- /dev/null +++ b/qemu/balloon.h @@ -0,0 +1,14 @@ +#ifndef _QEMU_BALLOON_H +#define _QEMU_BALLOON_H + +#include cpu-defs.h + +typedef ram_addr_t (QEMUBalloonEvent)(void *opaque, ram_addr_t target); + +void qemu_add_balloon_handler(QEMUBalloonEvent *func, void *opaque); + +void qemu_balloon(ram_addr_t target); + +ram_addr_t qemu_balloon_status(void); + +#endif diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c index 652b263..6c8d907 100644 --- a/qemu/hw/pc.c +++ b/qemu/hw/pc.c @@ -1122,6 +1122,9 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size, } } +if (pci_enabled) + virtio_balloon_init(pci_bus, 0x1AF4, 0x1002); + if (extboot_drive != -1) { DriveInfo *info = drives_table[extboot_drive]; int cyls, heads, secs; diff --git a/qemu/hw/pc.h b/qemu/hw/pc.h index f640395..1899c11 100644 --- a/qemu/hw/pc.h +++ b/qemu/hw/pc.h @@ -152,6 +152,9 @@ void virtio_net_poll(void); void *virtio_blk_init(PCIBus *bus, uint16_t vendor, uint16_t device, BlockDriverState *bs); +/* virtio-balloon.h */ +void *virtio_balloon_init(PCIBus *bus, uint16_t vendor, uint16_t device); + /* extboot.c */ void extboot_init(BlockDriverState *bs, int cmd); diff --git a/qemu/hw/virtio-balloon.c b/qemu/hw/virtio-balloon.c new file mode 100644 index 000..1b5a689 --- /dev/null +++ b/qemu/hw/virtio-balloon.c @@ -0,0 +1,142 @@ +/* + * 
Virtio Block Device + * + * Copyright IBM, Corp. 2008 + * + * Authors: + * Anthony Liguori [EMAIL PROTECTED] + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + */ + +#include virtio.h +#include block.h +#include pc.h +#include balloon.h +#include sysemu.h + +#include sys/mman.h + +/* from Linux's linux/virtio_blk.h */ + +/* The ID for virtio_balloon */ +#define VIRTIO_ID_BALLOON 5 + +/* The feature bitmap for virtio balloon */ +#define VIRTIO_BALLOON_F_MUST_TELL_HOST0 /* Tell before reclaiming pages */ + +struct virtio_balloon_config +{ +/* Number of pages host wants Guest to give up. */ +uint32_t num_pages; +/* Number of pages we've actually got in balloon. */ +uint32_t actual; +}; + +typedef struct VirtIOBalloon +{ +
Re: [kvm-devel] [patch 0/4] [RFC] MMU Notifiers V1
On Fri, Jan 25, 2008 at 12:42:29PM +0100, Andrea Arcangeli wrote: On a technical merit this still partially makes me sick and I think it's the last issue to debate. @@ -971,6 +974,9 @@ int try_to_unmap(struct page *page, int else ret = try_to_unmap_file(page, migration); + if (unlikely(PageExternalRmap(page))) + mmu_rmap_notifier(invalidate_page, page); + if (!page_mapped(page)) ret = SWAP_SUCCESS; return ret; I find the above hard to accept, because the moment you work with physical pages and not mm+address I think you couldn't possibly care if page_mapped is true or false, and I think the above notifier should be called _outside_ try_to_unmap. In fact I'd call mmu_rmap_notifier(invalidate_page, page); only if page_unmapped is false and the linux pte is gone already (practically just before the page_count == 2 check and after try_to_unmap). How does the called process sleep or how does it coordinate async work with try_to_unmap? We need to sleep. On a separate note, I think the page flag needs to be set by the process when it is acquiring the page for export. But since the same page could be acquired by multiple export mechanisms, we should not clear it in the exporting driver, but rather here after all exporters have been called to invalidate_page. That led me to believe we should add a flag to get_user_pages() that indicates this is an export with external rmap. We could then set the page flag in get_user_pages. Thanks, Robin
Re: [kvm-devel] ia64 kernel patches?
Jes Sorensen wrote: Zhang, Xiantao wrote: Jes Sorensen wrote: Hi Xiantao, If you could put up the patches somewhere, I could help you clean them up and push them. I would prefer not to wait until they appear in Linus' tree if possible. Hi, Jes You don't need to wait so long. We will push it to Avi's tree first in the near future once the ia64 kernel is ready. Thanks Xiantao Hi Xiantao, Please, put it in a public tree somewhere, either on kernel.org or a different public server. This is how everybody else does Open Source development - it is much easier in the long run as others will be able to send you fixes instead of waiting 4 months. Hi, Jes Thanks for your suggestion! Yes, we need to create a public tree and work together for development. OK, we will set it up ASAP :) Thanks Xiantao
Re: [kvm-devel] [PATCH 4/8] X86: export information about NPT to generic x86 code
Anthony Liguori wrote: Joerg Roedel wrote: The generic x86 code has to know if the specific implementation uses Nested Paging. In the generic code Nested Paging is called Hardware Assisted Paging (HAP) to avoid confusion with (future) HAP implementations of other vendors. This patch exports the availability of HAP to the generic x86 code. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/svm.c |7 +++ arch/x86/kvm/vmx.c |7 +++ include/asm-x86/kvm_host.h |2 ++ 3 files changed, 16 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 2e718ff..d0bfdd8 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -1678,6 +1678,11 @@ static bool svm_cpu_has_accelerated_tpr(void) return false; } +static bool svm_hap_enabled(void) +{ +return npt_enabled; +} + To help with bisecting, you should probably return false here until the patch that actually implements NPT support. Otherwise, the 7th patch in this series breaks KVM for SVM. Ignore this, you're already doing the right thing :-) Regards, Anthony Liguori Regards, Anthony Liguori static struct kvm_x86_ops svm_x86_ops = { .cpu_has_kvm_support = has_svm, .disabled_by_bios = is_disabled, @@ -1734,6 +1739,8 @@ static struct kvm_x86_ops svm_x86_ops = { .inject_pending_vectors = do_interrupt_requests, .set_tss_addr = svm_set_tss_addr, + +.hap_enabled = svm_hap_enabled, }; static int __init svm_init(void) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 00a00e4..8feb775 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -2631,6 +2631,11 @@ static void __init vmx_check_processor_compat(void *rtn) } } +static bool vmx_hap_enabled(void) +{ +return false; +} + static struct kvm_x86_ops vmx_x86_ops = { .cpu_has_kvm_support = cpu_has_kvm_support, .disabled_by_bios = vmx_disabled_by_bios, @@ -2688,6 +2693,8 @@ static struct kvm_x86_ops vmx_x86_ops = { .inject_pending_vectors = do_interrupt_requests, .set_tss_addr = vmx_set_tss_addr, + +.hap_enabled = vmx_hap_enabled, 
}; static int __init vmx_init(void) diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h index 67ae307..45a9d05 100644 --- a/include/asm-x86/kvm_host.h +++ b/include/asm-x86/kvm_host.h @@ -392,6 +392,8 @@ struct kvm_x86_ops { struct kvm_run *run); int (*set_tss_addr)(struct kvm *kvm, unsigned int addr); + +bool (*hap_enabled)(void); }; extern struct kvm_x86_ops *kvm_x86_ops;
Re: [kvm-devel] [PATCH 3/8] SVM: add module parameter to disable Nested Paging
Joerg Roedel wrote: To disable the use of the Nested Paging feature even if it is available in hardware this patch adds a module parameter. Nested Paging can be disabled by passing npt=off to the kvm_amd module. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/svm.c |8 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 49bb57a..2e718ff 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -48,6 +48,9 @@ MODULE_LICENSE("GPL"); #define SVM_DEATURE_SVML (1 << 2) static bool npt_enabled = false; +static char *npt = "on"; + +module_param(npt, charp, S_IRUGO); This would probably be better as an integer. Then we don't have to do nasty things like implicitly cast a literal to a char *. Regards, Anthony Liguori static void kvm_reput_irq(struct vcpu_svm *svm); @@ -415,6 +418,11 @@ static __init int svm_hardware_setup(void) if (!svm_has(SVM_FEATURE_NPT)) npt_enabled = false; + if (npt_enabled && strncmp(npt, "off", 3) == 0) { + printk(KERN_INFO "kvm: Nested Paging disabled\n"); + npt_enabled = false; + } + if (npt_enabled) printk(KERN_INFO "kvm: Nested Paging enabled\n");
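A side note on the review above: besides the charp awkwardness, `strncmp(npt, "off", 3) == 0` is a prefix match, so any value beginning with "off" would disable NPT. A standalone sketch contrasting it with an exact `strcmp` (function names are invented for illustration, not part of the patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Parse a hypothetical "npt" option string the way the patch does
 * (prefix match via strncmp) versus an exact match via strcmp. */
static bool npt_disabled_prefix(const char *npt)
{
    return strncmp(npt, "off", 3) == 0;   /* matches "off" but also "offline" */
}

static bool npt_disabled_exact(const char *npt)
{
    return strcmp(npt, "off") == 0;       /* matches "off" only */
}
```

An integer parameter, as Anthony suggests, sidesteps the string comparison entirely.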
Re: [kvm-devel] [PATCH][RFC] SVM: Add Support for Nested Paging in AMD Fam16 CPUs
Joerg Roedel wrote:
> Hi,
>
> here is the first release of patches for KVM to support the Nested
> Paging (NPT) feature of AMD QuadCore CPUs for comments and public
> testing. This feature improves the guest performance significantly.
> I measured an improvement of around 17% using kernbench in my first
> tests.
>
> This patch series is basically tested with Linux guests (32 bit legacy
> paging, 32 bit PAE paging and 64 bit Long Mode). Also tested with
> Windows Vista 32 bit and 64 bit. All these guests ran successfully with
> these patches. The patch series only enables NPT for 64 bit Linux hosts
> at the moment.
>
> Please give these patches a good and deep testing. I hope we have this
> patchset ready for merging soon.

A quick sniff test and things look pretty good.  I was able to start
running the install CDs for 32-bit and 64-bit Ubuntu, 32-bit OpenSuSE,
64-bit Fedora, and 32-bit Win2k8.  I'll do a more thorough run of
kvm-test on Monday when I have a better connection to my machine.

Nice work!

Regards,

Anthony Liguori

> Joerg
>
> Here is the diffstat:
>
>  arch/x86/kvm/mmu.c         |   81 +++---
>  arch/x86/kvm/mmu.h         |    6 +++
>  arch/x86/kvm/svm.c         |   94 +--
>  arch/x86/kvm/vmx.c         |    7 +++
>  arch/x86/kvm/x86.c         |    1 +
>  include/asm-x86/kvm_host.h |    4 ++
>  6 files changed, 182 insertions(+), 11 deletions(-)
[kvm-devel] [PATCH 7/8] MMU: add HAP support to the KVM MMU
This patch contains the changes to the KVM MMU necessary for support of
the Nested Paging feature in AMD Barcelona and Phenom Processors.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c |   74 ++-
 arch/x86/kvm/mmu.h |    6 ++++++
 2 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index dfbcf5e..bce27ca 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1072,6 +1072,7 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 	int i;
 	gfn_t root_gfn;
 	struct kvm_mmu_page *sp;
+	int metaphysical = 0;

 	root_gfn = vcpu->arch.cr3 >> PAGE_SHIFT;

@@ -1080,8 +1081,11 @@ static void mmu_alloc_roots(struct kvm_vcpu *vcpu)
 		hpa_t root = vcpu->arch.mmu.root_hpa;

 		ASSERT(!VALID_PAGE(root));
+		if (kvm_x86_ops->hap_enabled())
+			metaphysical = 1;
 		sp = kvm_mmu_get_page(vcpu, root_gfn, 0,
-				      PT64_ROOT_LEVEL, 0, ACC_ALL, NULL, NULL);
+				      PT64_ROOT_LEVEL, metaphysical,
+				      ACC_ALL, NULL, NULL);
 		root = __pa(sp->spt);
 		++sp->root_count;
 		vcpu->arch.mmu.root_hpa = root;
@@ -1135,6 +1139,36 @@ static int nonpaging_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
 			     error_code & PFERR_WRITE_MASK, gfn);
 }

+static int hap_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
+			  u32 error_code)
+{
+	struct page *page;
+	int r;
+
+	ASSERT(vcpu);
+	ASSERT(VALID_PAGE(vcpu->arch.mmu.root_hpa));
+
+	r = mmu_topup_memory_caches(vcpu);
+	if (r)
+		return r;
+
+	down_read(&current->mm->mmap_sem);
+	page = gfn_to_page(vcpu->kvm, gpa >> PAGE_SHIFT);
+	if (is_error_page(page)) {
+		kvm_release_page_clean(page);
+		up_read(&current->mm->mmap_sem);
+		return 1;
+	}
+	spin_lock(&vcpu->kvm->mmu_lock);
+	kvm_mmu_free_some_pages(vcpu);
+	r = __direct_map(vcpu, gpa, error_code & PFERR_WRITE_MASK,
+			 gpa >> PAGE_SHIFT, page, HAP_ROOT_LEVEL);
+	spin_unlock(&vcpu->kvm->mmu_lock);
+	up_read(&current->mm->mmap_sem);
+
+	return r;
+}
+
 static void nonpaging_free(struct kvm_vcpu *vcpu)
 {
 	mmu_free_roots(vcpu);
@@ -1228,7 +1262,35 @@ static int paging32E_init_context(struct kvm_vcpu *vcpu)
 	return paging64_init_context_common(vcpu, PT32E_ROOT_LEVEL);
 }

-static int init_kvm_mmu(struct kvm_vcpu *vcpu)
+static int init_kvm_hap_mmu(struct kvm_vcpu *vcpu)
+{
+	struct kvm_mmu *context = &vcpu->arch.mmu;
+
+	context->new_cr3 = nonpaging_new_cr3;
+	context->page_fault = hap_page_fault;
+	context->free = nonpaging_free;
+	context->prefetch_page = nonpaging_prefetch_page;
+	context->shadow_root_level = HAP_ROOT_LEVEL;
+	context->root_hpa = INVALID_PAGE;
+
+	if (!is_paging(vcpu)) {
+		context->gva_to_gpa = nonpaging_gva_to_gpa;
+		context->root_level = 0;
+	} else if (is_long_mode(vcpu)) {
+		context->gva_to_gpa = paging64_gva_to_gpa;
+		context->root_level = PT64_ROOT_LEVEL;
+	} else if (is_pae(vcpu)) {
+		context->gva_to_gpa = paging64_gva_to_gpa;
+		context->root_level = PT32E_ROOT_LEVEL;
+	} else {
+		context->gva_to_gpa = paging32_gva_to_gpa;
+		context->root_level = PT32_ROOT_LEVEL;
+	}
+
+	return 0;
+}
+
+static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 {
 	ASSERT(vcpu);
 	ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa));
@@ -1243,6 +1305,14 @@ static int init_kvm_mmu(struct kvm_vcpu *vcpu)
 		return paging32_init_context(vcpu);
 }

+static int init_kvm_mmu(struct kvm_vcpu *vcpu)
+{
+	if (kvm_x86_ops->hap_enabled())
+		return init_kvm_hap_mmu(vcpu);
+	else
+		return init_kvm_softmmu(vcpu);
+}
+
 static void destroy_kvm_mmu(struct kvm_vcpu *vcpu)
 {
 	ASSERT(vcpu);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 1fce19e..ef880a5 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -3,6 +3,12 @@

 #include <linux/kvm_host.h>

+#ifdef CONFIG_X86_64
+#define HAP_ROOT_LEVEL PT64_ROOT_LEVEL
+#else
+#define HAP_ROOT_LEVEL PT32E_ROOT_LEVEL
+#endif
+
 static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
 {
 	if (unlikely(vcpu->kvm->arch.n_free_mmu_pages < KVM_MIN_FREE_MMU_PAGES))
--
1.5.3.7
[kvm-devel] [PATCH] svm last branch recording MSR emulation
Hi,

this patch adds support for the lbr MSR emulation; it also enables
support for Windows XP 64bit guests.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
Signed-off-by: Markus Rechberger [EMAIL PROTECTED]

--- svm.c.orig	2008-01-23 10:04:14.0 +0100
+++ svm.c	2008-01-25 12:52:11.0 +0100
@@ -1099,6 +1100,21 @@
 	case MSR_IA32_SYSENTER_ESP:
 		*data = svm->vmcb->save.sysenter_esp;
 		break;
+	case MSR_IA32_DEBUGCTLMSR:
+		*data = svm->vmcb->save.dbgctl;
+		break;
+	case MSR_IA32_LASTBRANCHFROMIP:
+		*data = svm->vmcb->save.br_from;
+		break;
+	case MSR_IA32_LASTBRANCHTOIP:
+		*data = svm->vmcb->save.br_to;
+		break;
+	case MSR_IA32_LASTINTFROMIP:
+		*data = svm->vmcb->save.last_excp_from;
+		break;
+	case MSR_IA32_LASTINTTOIP:
+		*data = svm->vmcb->save.last_excp_to;
+		break;
 	default:
 		return kvm_get_msr_common(vcpu, ecx, data);
 	}
@@ -1171,6 +1187,19 @@
 		if (data != 0)
 			goto unhandled;
 		break;
+	case MSR_IA32_DEBUGCTLMSR:
+		svm->vmcb->save.dbgctl = data;
+		if (svm->vmcb->save.dbgctl && svm_has(SVM_FEATURE_LBRV)) {
+			void *msrpm_va;
+			svm->vmcb->control.lbr_ctl = 1;
+			msrpm_va = page_address(pfn_to_page(msrpm_base >> PAGE_SHIFT));
+			set_msr_interception(msrpm_va, MSR_IA32_DEBUGCTLMSR, 0, 0);
+			set_msr_interception(msrpm_va, MSR_IA32_LASTBRANCHFROMIP, 0, 0);
+			set_msr_interception(msrpm_va, MSR_IA32_LASTBRANCHTOIP, 0, 0);
+			set_msr_interception(msrpm_va, MSR_IA32_LASTINTFROMIP, 0, 0);
+			set_msr_interception(msrpm_va, MSR_IA32_LASTINTTOIP, 0, 0);
+		}
+		break;
 	default:
 	unhandled:
 		return kvm_set_msr_common(vcpu, ecx, data);
Re: [kvm-devel] [patch 1/4] mmu_notifier: Core code
On Fri, 25 Jan 2008, Robin Holt wrote:

> Keep in mind that on a 2048p SSI MPI job starting up, we have 2048 ranks
> doing this at the same time 6 times within their address range. That
> seems like a lock which could get hot fairly quickly. It may be for a
> short period during startup and shutdown, but it is there.

Ok. I guess we need to have a __register_mmu_notifier that expects the
mmap_sem to be held then?

> > 1. invalidate_all()
>
> That will be fine as long as we can unregister the ops notifier and free
> the structure. Otherwise, we end up being called needlessly.

No you cannot do that because there are still callbacks that come later.
The invalidate_all may lead to invalidate_range() doing nothing for this
mm. The ops notifier and the freeing of the structure has to wait until
release().

> > 2. invalidate_range() for each vma
> > 3. release()
> >
> > We cannot simply move the call up because there will be future range
> > callbacks on vma invalidation.
>
> I am not sure what this means. Right now, if you were to notify XPMEM
> the process is exiting, we would take care of all the recalling of pages
> exported by this process, clearing those pages cache lines from cache,
> and raising memory protections. I would assume that moving the callout
> earlier would expect the same of every driver.

That does not sync with the current scheme of the invalidate_range()
hooks. We would have to do a global invalidate early and then place the
other invalidate_range hooks in such a way that none is called later in
process exit handling.
[kvm-devel] KVM net bug
Hi,

First, I'm sorry for my English.

I'm using kvm on:
host: Debian GNU/Linux sid
host kernel: Linux enol 2.6.23-1-amd64 #1 SMP Fri Dec 21 12:00:17 UTC 2007 x86_64 GNU/Linux
kvm version: 58+dfsg-2 (as packaged in Debian)
guests: Debian etch and Windows XP, although I think the problem occurs with any guest
command: kvm -hda disk.img -net nic,macaddr=<valid mac> -net tap

I'll describe the problem on Debian: when I run the kvm command the
guest boots, but the network does not work, because the device eth0
does not exist; instead eth1 exists, and then eth2 after I shut down
and boot the guest again. Because of this my local network does not
work.

Regards,
--
José A. Caro Leiva
Ingeniero Técnico de Sistemas
[EMAIL PROTECTED]
Re: [kvm-devel] [patch 1/4] mmu_notifier: Core code
> +#define mmu_notifier(function, mm, args...)				\
...
> +			if (__mn->ops->function)			\
> +				__mn->ops->function(__mn,		\
> +						    mm,			\
> +						    args);		\

	__mn->ops->function(__mn, mm, args);				\

I realize it is a minor nit, but since we put the continuation in
column 81 in the next define, can we do the same here and make this
more readable?

> +			rcu_read_unlock();				\
...
> +#define mmu_rmap_notifier(function, args...)				\
> +	do {								\
> +		struct mmu_rmap_notifier *__mrn;			\
> +		struct hlist_node *__n;					\
> +								\

> +void mmu_notifier_release(struct mm_struct *mm)
> +{
> +	struct mmu_notifier *mn;
> +	struct hlist_node *n;
> +
> +	if (unlikely(!hlist_empty(&mm->mmu_notifier.head))) {
> +		rcu_read_lock();
> +		hlist_for_each_entry_rcu(mn, n,
> +					 &mm->mmu_notifier.head, hlist) {
> +			if (mn->ops->release)
> +				mn->ops->release(mn, mm);
> +			hlist_del(&mn->hlist);

I think the hlist_del needs to be before the function callout so we can
free the structure without a use-after-free issue.

	hlist_for_each_entry_rcu(mn, n, &mm->mmu_notifier.head, hlist) {
		hlist_del_rcu(&mn->hlist);
		if (mn->ops->release)
			mn->ops->release(mn, mm);

> +static DEFINE_SPINLOCK(mmu_notifier_list_lock);

Remove.

> +void mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
> +{
> +	spin_lock(&mmu_notifier_list_lock);

Shouldn't this really be protected by the down_write(mmap_sem)? Maybe:

	BUG_ON(!rwsem_is_write_locked(&mm->mmap_sem));

> +	hlist_add_head(&mn->hlist, &mm->mmu_notifier.head);

	hlist_add_head_rcu(&mn->hlist, &mm->mmu_notifier.head);

> +	spin_unlock(&mmu_notifier_list_lock);
> +}
> +EXPORT_SYMBOL_GPL(mmu_notifier_register);
> +
> +void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm)
> +{
> +	spin_lock(&mmu_notifier_list_lock);
> +	hlist_del(&mn->hlist);

hlist_del_rcu? Ditto on the lock.

> +	spin_unlock(&mmu_notifier_list_lock);
> +}
> +EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
> +
> +HLIST_HEAD(mmu_rmap_notifier_list);

	static DEFINE_SPINLOCK(mmu_rmap_notifier_list_lock);

> +
> +void mmu_rmap_notifier_register(struct mmu_rmap_notifier *mrn)
> +{
> +	spin_lock(&mmu_notifier_list_lock);
> +	hlist_add_head_rcu(&mrn->hlist, &mmu_rmap_notifier_list);
> +	spin_unlock(&mmu_notifier_list_lock);

	spin_lock(&mmu_rmap_notifier_list_lock);
	hlist_add_head_rcu(&mrn->hlist, &mmu_rmap_notifier_list);
	spin_unlock(&mmu_rmap_notifier_list_lock);

> +}
> +EXPORT_SYMBOL(mmu_rmap_notifier_register);
> +
> +void mmu_rmap_notifier_unregister(struct mmu_rmap_notifier *mrn)
> +{
> +	spin_lock(&mmu_notifier_list_lock);
> +	hlist_del_rcu(&mrn->hlist);
> +	spin_unlock(&mmu_notifier_list_lock);

	spin_lock(&mmu_rmap_notifier_list_lock);
	hlist_del_rcu(&mrn->hlist);
	spin_unlock(&mmu_rmap_notifier_list_lock);

> @@ -2043,6 +2044,7 @@ void exit_mmap(struct mm_struct *mm)
>  	vm_unacct_memory(nr_accounted);
>  	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, 0);
>  	tlb_finish_mmu(tlb, 0, end);
> +	mmu_notifier_release(mm);

Can we consider moving this notifier, or introducing an additional
notifier in the release, or a flag to this one indicating early/late?
The GRU that Jack is concerned with would benefit from the early call
in that it could just invalidate the GRU context and immediately all
GRU TLB entries are invalid. I believe Jack would also like to be able
to remove his entry from the mmu_notifier list in an effort to avoid
the page and range callouts.

XPMEM would also benefit from an early call. We could mark all the
segments as being torn down and start the recalls. We already have this
code in and working (have since it was first written 6 years ago). In
this case, all segments are torn down with a single message to each of
the importing partitions. In contrast, the teardown code which would
happen now would be one set of messages for each vma.

Thanks,
Robin
Re: [kvm-devel] [patch 1/4] mmu_notifier: Core code
On Fri, Jan 25, 2008 at 10:47:04AM -0800, Christoph Lameter wrote:
> On Fri, 25 Jan 2008, Robin Holt wrote:
> > I realize it is a minor nit, but since we put the continuation in
> > column 81 in the next define, can we do the same here and make this
> > more readable?
>
> We need to fix the next define to not use column 81. Found a couple
> more 80 column infractions. Will be fixed in next release.
>
> > > +void mmu_notifier_release(struct mm_struct *mm)
> > > +{
> > > +	struct mmu_notifier *mn;
> > > +	struct hlist_node *n;
> > > +
> > > +	if (unlikely(!hlist_empty(&mm->mmu_notifier.head))) {
> > > +		rcu_read_lock();
> > > +		hlist_for_each_entry_rcu(mn, n,
> > > +					 &mm->mmu_notifier.head, hlist) {
> > > +			if (mn->ops->release)
> > > +				mn->ops->release(mn, mm);
> > > +			hlist_del(&mn->hlist);
> >
> > I think the hlist_del needs to be before the function callout so we
> > can free the structure without a use-after-free issue.
>
> The list head is in the mm_struct. This will be freed later.

I meant the structure pointed to by mn. I assume it is intended that
structure be kmalloc'd as part of a larger structure. The driver is the
entity which created that structure and should be the one to free it.

> > > +void mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
> > > +{
> > > +	spin_lock(&mmu_notifier_list_lock);
> >
> > Shouldn't this really be protected by the down_write(mmap_sem)? Maybe:
>
> Ok. We could switch this to mmap_sem protection for the mm_struct but
> the rmap notifier is not associated with an mm_struct. So we would need
> to keep it there. Since we already have a spinlock: just use it for
> both to avoid further complications.

But now you are putting a global lock in where it is inappropriate.

> > > +	spin_lock(&mmu_notifier_list_lock);
> > > +	hlist_del(&mn->hlist);
> >
> > hlist_del_rcu? Ditto on the lock.
>
> Peter already mentioned that and I have posted patches that address
> this issue.

> > > @@ -2043,6 +2044,7 @@ void exit_mmap(struct mm_struct *mm)
> > >  	vm_unacct_memory(nr_accounted);
> > >  	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, 0);
> > >  	tlb_finish_mmu(tlb, 0, end);
> > > +	mmu_notifier_release(mm);
> >
> > Can we consider moving this notifier or introducing an additional
> > notifier in the release or a flag to this one indicating early/late.
>
> There is only one call right now?
>
> > The GRU that Jack is concerned with would benefit from the early call
> > in that it could just invalidate the GRU context and immediately all
> > GRU TLB entries are invalid. I believe Jack would also like to be
> > able to remove his entry from the mmu_notifier list in an effort to
> > avoid the page and range callouts.
>
> The TLB entries are removed by earlier invalidate_range calls. I would
> think that no TLBs are left at this point. It's simply a matter of
> releasing any still allocated resources through this callback.

What I was asking for is a way to avoid those numerous callouts for
drivers that can do early cleanup.

> > XPMEM would also benefit from an early call. We could mark all the
> > segments as being torn down and start the recalls. We already have
> > this code in and working (have since it was first written 6 years
> > ago). In this case, all segments are torn down with a single message
> > to each of the importing partitions. In contrast, the teardown code
> > which would happen now would be one set of messages for each vma.
>
> So we need an additional global teardown call? Then we'd need to
> switch off the vma based invalidate_range()?

No, EXACTLY what I originally was asking for: either move this call site
up, introduce an additional mmu_notifier op, or place this one in two
locations with a flag indicating which call is being made.

Thanks,
Robin
Re: [kvm-devel] [Qemu-devel] Merging KVM QEMU changes upstream
Paul Brook wrote:
> > Is this a reasonable merge strategy? We won't introduce regressions
> > but I can't guarantee these new things will work cross-architecture.
>
> I think it depends to some extent whether things will need rewriting to
> be made cross-architecture. In particular if this requires interface
> changes. This means either breaking existing guests, or having to
> support both interfaces.

That's a reasonable stance to take.  I don't think anything in the tree
right now presents that problem.  I'll start sending out some patches
and if you have specific concerns, we can talk about them 1-by-1.

> e.g. the extboot stuff seems like something that should be usable by
> all targets, except that the current interface looks like it's
> inherently x86 specific.

Well with extboot in particular, the only interface is between the
extboot option ROM and QEMU and I don't think that breaking that
interface will matter much in practice.

Regards,

Anthony Liguori
Re: [kvm-devel] [patch 0/4] [RFC] MMU Notifiers V1
On Fri, 25 Jan 2008, Andrea Arcangeli wrote:

> On a technical merit this still partially makes me sick and I think
> it's the last issue to debate.
>
> @@ -971,6 +974,9 @@ int try_to_unmap(struct page *page, int migration)
>  	else
>  		ret = try_to_unmap_file(page, migration);
>
> +	if (unlikely(PageExternalRmap(page)))
> +		mmu_rmap_notifier(invalidate_page, page);
> +
>  	if (!page_mapped(page))
>  		ret = SWAP_SUCCESS;
>
>  	return ret;
>
> I find the above hard to accept, because the moment you work with
> physical pages and not mm+address I think you couldn't possibly care if
> page_mapped is true or false, and I think the above notifier should be
> called _outside_ try_to_unmap. In fact I'd call
> mmu_rmap_notifier(invalidate_page, page); only if page_unmapped is
> false and the linux pte is gone already (practically just before the
> page_count == 2 check and after try_to_unmap).

try_to_unmap is called from multiple places. The placement here also
covers f.e. page migration. We also need to do this in the page_mkclean
case because the permissions on an external pte are restricted there. So
we need a refault to update the pte.

> I also think it's still worth to debate the rmap based on virtual or
> physical index. By supporting both secondary-rmap designs at the same
> time you seem to agree the current KVM lightweight rmap implementation
> is a superior design at least for KVM. But by insisting on your rmap
> based on physical for your usage, you're implicitly telling us that is
> a superior design for you. But we know very little of why you can't

We actually need both versions. We have hardware that has a driver
without rmap that does not sleep. On the other hand XPMEM has rmap
capability and needs to sleep for its notifications.

> Nevertheless I'm very glad we already fully converged on the
> set_page_dirty, invalidate-page after ptep_clear_flush/young, etc...
> and furthermore that you only made very minor modification to my code
> to add a pair of hooks for the page-based rmap notifiers on top of my
> patch. So from a functionality POV this is 100% workable already from
> KVM side!

Well we still have to review this stuff more and I have a vague feeling
that not all the multiple hooks that came about because I took the
mmu_notifier(invalidate_page, ...) out of the macro need to be kept,
because some of them are already covered by the range operations.
[kvm-devel] kvm-59 crash
Hello,

I'm trying to use kvm with a Windows XP SP2 guest. All runs well, but
after a short delay (~5 minutes), kvm crashes with:

  kvm_run: Unknown error 524
  emulation failed (mmio) rip 7cb3d000 ff ff 8d 85
  Fail to handle apic access vmexit! Offset is 0xf0

Some information:

Host: Linux version 2.6.23.9, Gentoo, 64 bits, Intel quad core 2.4 GHz, 5 GB memory
Guest: Windows XP Pro SP2
Command line: qemu-system-x86_64 -no-acpi -snapshot -full-screen -std-vga -vnc :3 -hda /tmp/vm_XP_gov_10G.qcow2 -k fr -localtime -m 384 -usb -usbdevice tablet

And to finish: how does one debug / hack the kvm code? I'm working with
ddd and gcc, but how do I do this with kvm?

Regards,
Nicolas Prochazka.
Re: [kvm-devel] ia64 kernel patches?
Zhang, Xiantao wrote:
> Jes Sorensen wrote:
> > Hi Xiantao,
> >
> > If you could put up the patches somewhere, I could help you clean
> > them up and push them. I would prefer not to wait until they appear
> > in Linus' tree if possible.
>
> Hi, Jes
> You don't need to wait so long. We will push it to Avi's tree first in
> the near future once the ia64 kernel part is ready.
> Thanks
> Xiantao

Hi Xiantao,

Please, put it in a public tree somewhere, either on kernel.org or a
different public server. This is how everybody else does Open Source
development - it is much easier in the long run as others will be able
to send you fixes instead of waiting 4 months.

Thanks,
Jes
Re: [kvm-devel] Migration problems
On Thu, Jan 24, 2008 at 01:52:16PM +0200, Uri Lublin wrote:
> Sometimes after loadvm the guest network works better with
> '-no-kvm-irqchip'. I am trying to use bisection to find the
> problematic patch.

Another migration problem that pops up with kvm 60: doing

  migrate exec:dd of=/tmp/bla

works in the qemu monitor; however it doesn't when you start kvm with
-monitor pty and pipe the same command into /dev/pts/X (which is what
libvirt does). Kvm segfaults immediately. I'll debug this further but
maybe someone else has seen this already?

Cheers,
-- Guido
Re: [kvm-devel] [PATCH] QEMU support for virtio balloon driver
On Thu, Jan 24, 2008 at 04:29:51PM -0600, Anthony Liguori wrote:
> Anthony Liguori wrote:
> > This patch adds support to QEMU for Rusty's recently introduced
> > virtio balloon driver. The user-facing portions of this are the
> > introduction of a balloon and info balloon command in the monitor.
> >
> > I think using madvise unconditionally is okay but I am not sure.
>
> Looks like it's not. I just hung my host system after doing a bunch of
> ballooning with a kernel that doesn't have MM notifiers.

That's strange: lack of MMU notifiers will crash the guest (by use of a
stale shadow entry), but not the host. What are the symptoms?

> I'm inclined to think that we should have a capability check for MM
> notifiers and just not do madvise if they aren't present. I don't
> think the ioctl approach that Marcelo took is sufficient as a
> malicious guest could possibly hose the host.

How's that? The ioctl damage is contained to the guest (other than CPU
processing time, which the guest can cause in other ways). Anyway, I
don't see the need for back compat with older hosts.

> Having the guest allocate and not touch memory means that it should
> eventually be removed from the shadow page cache and eventually
> swapped out, so ballooning isn't totally useless in the absence of MM
> notifiers.

If madvise is called on memory that is essentially locked (which is
what pre-MM notifiers is like) then it should just be a nop, right?
> Signed-off-by: Anthony Liguori [EMAIL PROTECTED]
>
> diff --git a/qemu/Makefile.target b/qemu/Makefile.target
> index bb7be0f..d6b4f46 100644
> --- a/qemu/Makefile.target
> +++ b/qemu/Makefile.target
> @@ -464,7 +464,7 @@ VL_OBJS += rtl8139.o
>  VL_OBJS+= hypercall.o
>
>  # virtio devices
> -VL_OBJS += virtio.o virtio-net.o virtio-blk.o
> +VL_OBJS += virtio.o virtio-net.o virtio-blk.o virtio-balloon.o
>
>  ifeq ($(TARGET_BASE_ARCH), i386)
>  # Hardware support
> diff --git a/qemu/balloon.h b/qemu/balloon.h
> new file mode 100644
> index 000..ffce1fa
> --- /dev/null
> +++ b/qemu/balloon.h
> @@ -0,0 +1,14 @@
> +#ifndef _QEMU_BALLOON_H
> +#define _QEMU_BALLOON_H
> +
> +#include "cpu-defs.h"
> +
> +typedef ram_addr_t (QEMUBalloonEvent)(void *opaque, ram_addr_t target);
> +
> +void qemu_add_balloon_handler(QEMUBalloonEvent *func, void *opaque);
> +
> +void qemu_balloon(ram_addr_t target);
> +
> +ram_addr_t qemu_balloon_status(void);
> +
> +#endif
> diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
> index 652b263..6c8d907 100644
> --- a/qemu/hw/pc.c
> +++ b/qemu/hw/pc.c
> @@ -1122,6 +1122,9 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
>          }
>      }
>
> +    if (pci_enabled)
> +        virtio_balloon_init(pci_bus, 0x1AF4, 0x1002);
> +
>      if (extboot_drive != -1) {
>          DriveInfo *info = &drives_table[extboot_drive];
>          int cyls, heads, secs;
> diff --git a/qemu/hw/pc.h b/qemu/hw/pc.h
> index f640395..1899c11 100644
> --- a/qemu/hw/pc.h
> +++ b/qemu/hw/pc.h
> @@ -152,6 +152,9 @@ void virtio_net_poll(void);
>  void *virtio_blk_init(PCIBus *bus, uint16_t vendor, uint16_t device,
>                        BlockDriverState *bs);
>
> +/* virtio-balloon.h */
> +void *virtio_balloon_init(PCIBus *bus, uint16_t vendor, uint16_t device);
> +
>  /* extboot.c */
>  void extboot_init(BlockDriverState *bs, int cmd);
> diff --git a/qemu/hw/virtio-balloon.c b/qemu/hw/virtio-balloon.c
> new file mode 100644
> index 000..1b5a689
> --- /dev/null
> +++ b/qemu/hw/virtio-balloon.c
> @@ -0,0 +1,142 @@
> +/*
> + * Virtio Block Device
> + *
> + * Copyright IBM, Corp. 2008
> + *
> + * Authors:
> + *  Anthony Liguori [EMAIL PROTECTED]
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "virtio.h"
> +#include "block.h"
> +#include "pc.h"
> +#include "balloon.h"
> +#include "sysemu.h"
> +
> +#include <sys/mman.h>
> +
> +/* from Linux's linux/virtio_blk.h */
> +
> +/* The ID for virtio_balloon */
> +#define VIRTIO_ID_BALLOON 5
> +
> +/* The feature bitmap for virtio balloon */
> +#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
> +
> +struct virtio_balloon_config
> +{
> +    /* Number of pages host wants Guest to give up. */
> +    uint32_t num_pages;
> +    /* Number of pages we've actually got in balloon. */
> +    uint32_t actual;
> +};
> +
> +typedef struct VirtIOBalloon
> +{
> +    VirtIODevice vdev;
> +    VirtQueue *ivq, *dvq;
> +    uint32_t num_pages;
> +    uint32_t actual;
> +} VirtIOBalloon;
> +
> +static VirtIOBalloon *to_virtio_balloon(VirtIODevice *vdev)
> +{
> +    return (VirtIOBalloon *)vdev;
> +}
> +
> +static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
> +{
> +    VirtIOBalloon *s = to_virtio_balloon(vdev);
> +    VirtQueueElement elem;
> +    unsigned int count;
> +
> +    while ((count = virtqueue_pop(vq, &elem)) != 0) {
> +        int i;
> +        unsigned int wlen = 0;
> +
> +        for (i = 0; i < elem.out_num; i++) {
> +            int flags;
> +            uint32_t *pfns = elem.out_sg[i].iov_base;
> +            unsigned int n_pfns =
Re: [kvm-devel] [patch 1/4] mmu_notifier: Core code
On Fri, 25 Jan 2008, Robin Holt wrote:

> I realize it is a minor nit, but since we put the continuation in
> column 81 in the next define, can we do the same here and make this
> more readable?

We need to fix the next define to not use column 81. Found a couple
more 80 column infractions. Will be fixed in next release.

> > +void mmu_notifier_release(struct mm_struct *mm)
> > +{
> > +	struct mmu_notifier *mn;
> > +	struct hlist_node *n;
> > +
> > +	if (unlikely(!hlist_empty(&mm->mmu_notifier.head))) {
> > +		rcu_read_lock();
> > +		hlist_for_each_entry_rcu(mn, n,
> > +					 &mm->mmu_notifier.head, hlist) {
> > +			if (mn->ops->release)
> > +				mn->ops->release(mn, mm);
> > +			hlist_del(&mn->hlist);
>
> I think the hlist_del needs to be before the function callout so we can
> free the structure without a use-after-free issue.

The list head is in the mm_struct. This will be freed later.

> > +void mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
> > +{
> > +	spin_lock(&mmu_notifier_list_lock);
>
> Shouldn't this really be protected by the down_write(mmap_sem)? Maybe:

Ok. We could switch this to mmap_sem protection for the mm_struct but
the rmap notifier is not associated with an mm_struct. So we would need
to keep it there. Since we already have a spinlock: just use it for
both to avoid further complications.

> > +	spin_lock(&mmu_notifier_list_lock);
> > +	hlist_del(&mn->hlist);
>
> hlist_del_rcu? Ditto on the lock.

Peter already mentioned that and I have posted patches that address
this issue.

> > @@ -2043,6 +2044,7 @@ void exit_mmap(struct mm_struct *mm)
> >  	vm_unacct_memory(nr_accounted);
> >  	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, 0);
> >  	tlb_finish_mmu(tlb, 0, end);
> > +	mmu_notifier_release(mm);
>
> Can we consider moving this notifier or introducing an additional
> notifier in the release or a flag to this one indicating early/late.

There is only one call right now?

> The GRU that Jack is concerned with would benefit from the early call
> in that it could just invalidate the GRU context and immediately all
> GRU TLB entries are invalid. I believe Jack would also like to be able
> to remove his entry from the mmu_notifier list in an effort to avoid
> the page and range callouts.

The TLB entries are removed by earlier invalidate_range calls. I would
think that no TLBs are left at this point. It's simply a matter of
releasing any still allocated resources through this callback.

> XPMEM would also benefit from an early call. We could mark all the
> segments as being torn down and start the recalls. We already have this
> code in and working (have since it was first written 6 years ago). In
> this case, all segments are torn down with a single message to each of
> the importing partitions. In contrast, the teardown code which would
> happen now would be one set of messages for each vma.

So we need an additional global teardown call? Then we'd need to switch
off the vma based invalidate_range()?
Re: [kvm-devel] [patch 1/4] mmu_notifier: Core code
On Fri, 25 Jan 2008, Robin Holt wrote:

+void mmu_notifier_release(struct mm_struct *mm)
+{
+	struct mmu_notifier *mn;
+	struct hlist_node *n;
+
+	if (unlikely(!hlist_empty(&mm->mmu_notifier.head))) {
+		rcu_read_lock();
+		hlist_for_each_entry_rcu(mn, n,
+					 &mm->mmu_notifier.head, hlist) {
+			if (mn->ops->release)
+				mn->ops->release(mn, mm);
+			hlist_del(&mn->hlist);

I think the hlist_del needs to be before the function callout so we can free the structure without a use-after-free issue.

The list head is in the mm_struct. This will be freed later.

I meant the structure pointed to by mn. I assume it is intended that the structure be kmalloc'd as part of a larger structure. The driver is the entity which created that structure and should be the one to free it.

mn will be pointing to the list entries in the mm_struct one after the other. You mean the ops structure?

+void mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
+{
+	spin_lock(&mmu_notifier_list_lock);

Shouldn't this really be protected by the down_write(mmap_sem)? Maybe:

Ok. We could switch this to mmap_sem protection for the mm_struct, but the rmap notifier is not associated with an mm_struct, so we would need to keep the spinlock there. Since we already have a spinlock: just use it for both to avoid further complications.

But now you are putting a global lock in where it is inappropriate.

The lock is only used during register and unregister. Very low-level usage.

XPMEM would also benefit from an early call. We could mark all the segments as being torn down and start the recalls. We already have this code in and working (have since it was first written 6 years ago). In this case, all segments are torn down with a single message to each of the importing partitions. In contrast, the teardown code which would happen now would be one set of messages for each vma.

So we need an additional global teardown call? Then we'd need to switch off the vma-based invalidate_range()?
No, EXACTLY what I originally was asking for: either move this call site up, introduce an additional mmu_notifier op, or place this one in two locations with a flag indicating which call is being made.

Add a new invalidate_all() call? Then on exit we do:

1. invalidate_all()
2. invalidate_range() for each vma
3. release()

We cannot simply move the call up, because there will be future range callbacks on vma invalidation.
[kvm-devel] [PATCH][RFC] SVM: Add Support for Nested Paging in AMD Fam16 CPUs
Hi, here is the first release of patches for KVM to support the Nested Paging (NPT) feature of AMD QuadCore CPUs, for comments and public testing. This feature improves guest performance significantly. I measured an improvement of around 17% using kernbench in my first tests.

This patch series is basically tested with Linux guests (32 bit legacy paging, 32 bit PAE paging and 64 bit Long Mode). Also tested with Windows Vista 32 bit and 64 bit. All these guests ran successfully with these patches. The patch series only enables NPT for 64 bit Linux hosts at the moment.

Please give these patches good and thorough testing. I hope we have this patchset ready for merging soon.

Joerg

Here is the diffstat:

 arch/x86/kvm/mmu.c         |   81 +++---
 arch/x86/kvm/mmu.h         |    6 +++
 arch/x86/kvm/svm.c         |   94 +--
 arch/x86/kvm/vmx.c         |    7 +++
 arch/x86/kvm/x86.c         |    1 +
 include/asm-x86/kvm_host.h |    4 ++
 6 files changed, 182 insertions(+), 11 deletions(-)
Re: [kvm-devel] KVM net bug
Hello, when you have started the debian guest, take a look if this file exists: /etc/udev/rules.d/z25_persistent-net.rules. If so, delete it and reboot your guest. Keep the MAC address always the same for that guest. Now you should have an eth0 interface.

I'm starting my debian guests like this:

kvm -hda apache.img -hdb apache_storage.img -m 1024 -boot c -smp 2 -net nic,vlan=0,macaddr=00:16:3e:00:00:01,model=rtl8139 -net tap -nographic -daemonize

For Windows I use:

kvm -no-acpi -m 1024 -boot c windows.img -net nic,vlan=0,macaddr=00:16:3e:00:00:41,model=rtl8139 -net tap -nographic -daemonize

Hope this helps, Regards Mike

Anthony Liguori schrieb: José Antonio wrote: Hi, first I'm sorry for my English. I'm using kvm on host: Debian GNU/Linux sid, host kernel: Linux enol 2.6.23-1-amd64 #1 SMP Fri Dec 21 12:00:17 UTC 2007 x86_64 GNU/Linux, kvm version 58+dfsg-2 (almost on Debian), guest: Debian etch and Windows XP, although I think the problem occurs on any guest. Command: kvm -hda disk.img -net nic,macaddr=valid mac -net tap

I describe the problem on Debian: when I execute the kvm command the guest works, but the network does not, because the device eth0 doesn't exist; instead eth1 exists (and then eth2 after I shut down and later boot the guest again). Because of this my local net doesn't work.

I'm having a hard time understanding your problem. Do you have an appropriate qemu-ifup script?

Regards, Anthony Liguori
Re: [kvm-devel] [PATCH 3/8] SVM: add module parameter to disable Nested Paging
On Fri, Jan 25, 2008 at 03:35:23PM -0600, Anthony Liguori wrote: Joerg Roedel wrote:

To disable the use of the Nested Paging feature even if it is available in hardware, this patch adds a module parameter. Nested Paging can be disabled by passing npt=off to the kvm_amd module.

Signed-off-by: Joerg Roedel [EMAIL PROTECTED]
---
 arch/x86/kvm/svm.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 49bb57a..2e718ff 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -48,6 +48,9 @@ MODULE_LICENSE("GPL");
 #define SVM_DEATURE_SVML (1 << 2)
 
 static bool npt_enabled = false;
+static char *npt = "on";
+
+module_param(npt, charp, S_IRUGO);

This would probably be better as an integer. Then we don't have to do nasty things like implicitly cast a literal to a char *.

Hmm, I used int for that first, but typing npt=off seemed more user-friendly to me than npt=0. So I used char* for it.

Joerg

--
| AMD Saxony Limited Liability Company & Co. KG
Operating | Wilschdorfer Landstr. 101, 01109 Dresden, Germany
System    | Register Court Dresden: HRA 4896
Research  | General Partner authorized to represent:
Center    | AMD Saxony LLC (Wilmington, Delaware, US)
          | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy
[kvm-devel] Benchmarking warning
Hi, I get the following warning in the kernel messages:

kvm: emulating preempt notifiers; do not benchmark on this machine

Is this an accurate warning? Should the machine not be used for benchmarking KVM, or does it mean don't benchmark other applications on the host?

Thanks, Cam
Re: [kvm-devel] [PATCH][RFC] SVM: Add Support for Nested Paging in AMD Fam16 CPUs
On Fri, Jan 25, 2008 at 03:32:57PM -0600, Anthony Liguori wrote: A quick sniff test and things look pretty good. I was able to start running the install CDs for 32-bit and 64-bit Ubuntu, 32-bit OpenSuSE, 64-bit Fedora, and 32-bit Win2k8. I'll do a more thorough run of kvm-test on Monday when I have a better connection to my machine.

Great. We will do more tests too next week. Live migration is completely untested for now. SMP guests also worked fine with these patches.

Nice work!

Thanks :-)

Joerg
Re: [kvm-devel] [PATCH] QEMU support for virtio balloon driver
On Thu, 2008-01-24 at 16:29 -0600, Anthony Liguori wrote: Anthony Liguori wrote: This patch adds support to QEMU for Rusty's recently introduced virtio balloon driver. The user-facing portions of this are the introduction of a balloon and info balloon command in the monitor. I think using madvise unconditionally is okay but I am not sure.

Looks like it's not. I just hung my host system after doing a bunch of ballooning with a kernel that doesn't have MM notifiers. I'm inclined to think that we should have a capability check for MM notifiers and just not do madvise if they aren't present. I don't think the ioctl approach that Marcelo took is sufficient, as a malicious guest could possibly hose the host.

The ioctl to zap the shadow pages is needed in order to free memory fast. Without it the balloon will evacuate memory too slowly for common management applications (running additional VMs). This ioctl (on older kernels only) can hose the host, but so can malicious guests that do dummy cr3 switching and other hackery. If one really insists, he can always add a timer to this ioctl to slow down potential malicious guests.

Having the guest allocate and not touch memory means that it should eventually be removed from the shadow page cache and eventually swapped out, so ballooning isn't totally useless in the absence of MM notifiers.

Regards, Anthony Liguori
Re: [kvm-devel] Benchmarking warning
On Fri, 2008-01-25 at 15:20 -0700, Cam Macdonell wrote: Hi, I get the following warning in the kernel messages: kvm: emulating preempt notifiers; do not benchmark on this machine Is this an accurate warning, should the machine not be used for

It's accurate.

benchmarking KVM? or does it mean don't benchmark other applications on the host?

It applies when running kvm (it traps every process scheduling event). On newer hosts (2.6.23 and up) the preempt notifiers are integrated and there is no need to tap the scheduler.

Thanks, Cam