Re: [PATCH] KVM: PPC: fix ONE_REG AltiVec support
Happy new year! On Wed, 16 Dec 2015 18:24:03 +0100 Greg Kurz wrote: > The get and set operations got exchanged by mistake when moving the > code from book3s.c to powerpc.c. > > Fixes: 3840edc8033ad5b86deee309c1c321ca54257452 > Signed-off-by: Greg Kurz > --- > Ping ? > It's been there for over a year but I guess we want that in 4.4, even > if it doesn't break the host kernel. > > arch/powerpc/kvm/powerpc.c | 20 ++-- > 1 file changed, 10 insertions(+), 10 deletions(-) > > diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c > index 6fd2405c7f4a..a3b182dcb823 100644 > --- a/arch/powerpc/kvm/powerpc.c > +++ b/arch/powerpc/kvm/powerpc.c > @@ -919,21 +919,17 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, > struct kvm_one_reg *reg) > r = -ENXIO; > break; > } > - vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval; > + val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0]; > break; > case KVM_REG_PPC_VSCR: > if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { > r = -ENXIO; > break; > } > - vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val); > + val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]); > break; > case KVM_REG_PPC_VRSAVE: > - if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { > - r = -ENXIO; > - break; > - } > - vcpu->arch.vrsave = set_reg_val(reg->id, val); > + val = get_reg_val(reg->id, vcpu->arch.vrsave); > break; > #endif /* CONFIG_ALTIVEC */ > default: > @@ -974,17 +970,21 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, > struct kvm_one_reg *reg) > r = -ENXIO; > break; > } > - val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0]; > + vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval; > break; > case KVM_REG_PPC_VSCR: > if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { > r = -ENXIO; > break; > } > - val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]); > + vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val); > break; > case KVM_REG_PPC_VRSAVE: > - if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { > - r = 
-ENXIO; > + break; > + } > + vcpu->arch.vrsave = set_reg_val(reg->id, val); > break; > #endif /* CONFIG_ALTIVEC */ > default:
[PATCH v1] kvm: Make vcpu->requests as 64 bit bitmap
Currently the x86 arch already has 32 requests defined, so newer request bits can't be placed inside vcpu->requests (an unsigned long) on 32-bit x86 systems. But we are going to add a new request in the x86 arch for Hyper-V TSC page support. To solve the problem, the patch replaces vcpu->requests with a 64-bit bitmap and uses the bitmap API. The patch consists of: * introduce kvm_vcpu_has_requests() to check whether the vcpu has requests * introduce kvm_vcpu_requests() to get the vcpu requests pointer * introduce kvm_clear_request() to clear a particular vcpu request * replace if (vcpu->requests) with if (kvm_vcpu_has_requests(vcpu)) * replace clear_bit(req, vcpu->requests) with kvm_clear_request(req, vcpu) Signed-off-by: Andrey Smetanin CC: Paolo Bonzini CC: Gleb Natapov CC: James Hogan CC: Paolo Bonzini CC: Paul Burton CC: Ralf Baechle CC: Alexander Graf CC: Christian Borntraeger CC: Cornelia Huck CC: linux-m...@linux-mips.org CC: kvm-ppc@vger.kernel.org CC: linux-s...@vger.kernel.org CC: Roman Kagan CC: Denis V. Lunev CC: qemu-de...@nongnu.org --- arch/mips/kvm/emulate.c | 4 +--- arch/powerpc/kvm/book3s_pr.c | 2 +- arch/powerpc/kvm/book3s_pr_papr.c | 2 +- arch/powerpc/kvm/booke.c | 6 +++--- arch/powerpc/kvm/powerpc.c| 6 +++--- arch/powerpc/kvm/trace.h | 2 +- arch/s390/kvm/kvm-s390.c | 4 ++-- arch/x86/kvm/vmx.c| 2 +- arch/x86/kvm/x86.c| 14 +++--- include/linux/kvm_host.h | 27 ++- 10 files changed, 42 insertions(+), 27 deletions(-) diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c index 41b1b09..14aebe8 100644 --- a/arch/mips/kvm/emulate.c +++ b/arch/mips/kvm/emulate.c @@ -774,10 +774,8 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu *vcpu) * We we are runnable, then definitely go off to user space to * check if any I/O interrupts are pending. 
*/ - if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) { - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); + if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN; - } } return EMULATE_DONE; diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index 64891b0..e975279 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -349,7 +349,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr) if (msr & MSR_POW) { if (!vcpu->arch.pending_exceptions) { kvm_vcpu_block(vcpu); - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); + kvm_clear_request(KVM_REQ_UNHALT, vcpu); vcpu->stat.halt_wakeup++; /* Unset POW bit after we woke up */ diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c index f2c75a1..60cf393 100644 --- a/arch/powerpc/kvm/book3s_pr_papr.c +++ b/arch/powerpc/kvm/book3s_pr_papr.c @@ -309,7 +309,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd) case H_CEDE: kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE); kvm_vcpu_block(vcpu); - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); + kvm_clear_request(KVM_REQ_UNHALT, vcpu); vcpu->stat.halt_wakeup++; return EMULATE_DONE; case H_LOGICAL_CI_LOAD: diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index fd58751..6bed382 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -574,7 +574,7 @@ static void arm_next_watchdog(struct kvm_vcpu *vcpu) * userspace, so clear the KVM_REQ_WATCHDOG request. 
*/ if ((vcpu->arch.tsr & (TSR_ENW | TSR_WIS)) != (TSR_ENW | TSR_WIS)) - clear_bit(KVM_REQ_WATCHDOG, &vcpu->requests); + kvm_clear_request(KVM_REQ_WATCHDOG, vcpu); spin_lock_irqsave(&vcpu->arch.wdt_lock, flags); nr_jiffies = watchdog_next_timeout(vcpu); @@ -677,7 +677,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) kvmppc_core_check_exceptions(vcpu); - if (vcpu->requests) { + if (kvm_vcpu_has_requests(vcpu)) { /* Exception delivery raised request; start over */ return 1; } @@ -685,7 +685,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) if (vcpu->arch.shared->msr & MSR_WE) { local_irq_enable(); kvm_vcpu_block(vcpu); - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); +
Re: [PATCH v1] kvm: Make vcpu->requests as 64 bit bitmap
Hi Andrey, On Thu, Dec 24, 2015 at 12:30:26PM +0300, Andrey Smetanin wrote: > Currently on x86 arch we has already 32 requests defined > so the newer request bits can't be placed inside > vcpu->requests(unsigned long) inside x86 32 bit system. > But we are going to add a new request in x86 arch > for Hyper-V tsc page support. > > To solve the problem the patch replaces vcpu->requests by > bitmap with 64 bit length and uses bitmap API. > > The patch consists of: > * announce kvm_vcpu_has_requests() to check whether vcpu has > requests > * announce kvm_vcpu_requests() to get vcpu requests pointer > * announce kvm_clear_request() to clear particular vcpu request > * replace if (vcpu->requests) by if (kvm_vcpu_has_requests(vcpu)) > * replace clear_bit(req, vcpu->requests) by > kvm_clear_request(req, vcpu) > > Signed-off-by: Andrey Smetanin> CC: Paolo Bonzini > CC: Gleb Natapov > CC: James Hogan > CC: Paolo Bonzini > CC: Paul Burton > CC: Ralf Baechle > CC: Alexander Graf > CC: Christian Borntraeger > CC: Cornelia Huck > CC: linux-m...@linux-mips.org > CC: kvm-ppc@vger.kernel.org > CC: linux-s...@vger.kernel.org > CC: Roman Kagan > CC: Denis V. 
Lunev > CC: qemu-de...@nongnu.org For MIPS KVM bit: Acked-by: James Hogan Thanks James > > --- > arch/mips/kvm/emulate.c | 4 +--- > arch/powerpc/kvm/book3s_pr.c | 2 +- > arch/powerpc/kvm/book3s_pr_papr.c | 2 +- > arch/powerpc/kvm/booke.c | 6 +++--- > arch/powerpc/kvm/powerpc.c| 6 +++--- > arch/powerpc/kvm/trace.h | 2 +- > arch/s390/kvm/kvm-s390.c | 4 ++-- > arch/x86/kvm/vmx.c| 2 +- > arch/x86/kvm/x86.c| 14 +++--- > include/linux/kvm_host.h | 27 ++- > 10 files changed, 42 insertions(+), 27 deletions(-) > > diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c > index 41b1b09..14aebe8 100644 > --- a/arch/mips/kvm/emulate.c > +++ b/arch/mips/kvm/emulate.c > @@ -774,10 +774,8 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu > *vcpu) >* We we are runnable, then definitely go off to user space to >* check if any I/O interrupts are pending. >*/ > - if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) { > - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); > + if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) > vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN; > - } > } > > return EMULATE_DONE; > diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c > index 64891b0..e975279 100644 --- a/arch/powerpc/kvm/book3s_pr.c > +++ b/arch/powerpc/kvm/book3s_pr.c > @@ -349,7 +349,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 > msr) > if (msr & MSR_POW) { > if (!vcpu->arch.pending_exceptions) { > kvm_vcpu_block(vcpu); > - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); > + kvm_clear_request(KVM_REQ_UNHALT, vcpu); > vcpu->stat.halt_wakeup++; > > /* Unset POW bit after we woke up */ > diff --git a/arch/powerpc/kvm/book3s_pr_papr.c > b/arch/powerpc/kvm/book3s_pr_papr.c > index f2c75a1..60cf393 100644 > --- a/arch/powerpc/kvm/book3s_pr_papr.c > +++ b/arch/powerpc/kvm/book3s_pr_papr.c > @@ -309,7 +309,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd) > case H_CEDE: > kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE); > kvm_vcpu_block(vcpu); > - 
clear_bit(KVM_REQ_UNHALT, &vcpu->requests); > + kvm_clear_request(KVM_REQ_UNHALT, vcpu); > vcpu->stat.halt_wakeup++; > return EMULATE_DONE; > case H_LOGICAL_CI_LOAD: > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c > index fd58751..6bed382 100644 > --- a/arch/powerpc/kvm/booke.c > +++ b/arch/powerpc/kvm/booke.c > @@ -574,7 +574,7 @@ static void arm_next_watchdog(struct kvm_vcpu *vcpu) >* userspace, so clear the KVM_REQ_WATCHDOG request. >*/ > if ((vcpu->arch.tsr & (TSR_ENW | TSR_WIS)) != (TSR_ENW | TSR_WIS)) > - clear_bit(KVM_REQ_WATCHDOG, &vcpu->requests); > + kvm_clear_request(KVM_REQ_WATCHDOG, vcpu); > > spin_lock_irqsave(&vcpu->arch.wdt_lock, flags); > nr_jiffies = watchdog_next_timeout(vcpu); > @@ -677,7 +677,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) > > kvmppc_core_check_exceptions(vcpu); > > - if (vcpu->requests) { > + if (kvm_vcpu_has_requests(vcpu)) { > /* Exception delivery raised request; start
Re: [PATCH v1] kvm: Make vcpu->requests as 64 bit bitmap
On 12/24/2015 02:14 PM, Roman Kagan wrote: On Thu, Dec 24, 2015 at 12:30:26PM +0300, Andrey Smetanin wrote: Currently the x86 arch already has 32 requests defined, so newer request bits can't be placed inside vcpu->requests (an unsigned long) on 32-bit x86 systems. But we are going to add a new request in the x86 arch for Hyper-V TSC page support. To solve the problem, the patch replaces vcpu->requests with a 64-bit bitmap and uses the bitmap API. The patch consists of: * introduce kvm_vcpu_has_requests() to check whether the vcpu has requests * introduce kvm_vcpu_requests() to get the vcpu requests pointer I think that if abstracting out the implementation of the request container is what you're after, you'd better not define this function; accesses to the request map should be through your accessor functions. Good idea! I'll rework this patch and will send V2. * introduce kvm_clear_request() to clear a particular vcpu request * replace if (vcpu->requests) with if (kvm_vcpu_has_requests(vcpu)) * replace clear_bit(req, vcpu->requests) with kvm_clear_request(req, vcpu) Apparently one accessor is missing: test the presence of a request without clearing it from the bitmap (because kvm_check_request() clears it). Agree. diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h index 2e0e67e..4015b88 100644 --- a/arch/powerpc/kvm/trace.h +++ b/arch/powerpc/kvm/trace.h @@ -109,7 +109,7 @@ TRACE_EVENT(kvm_check_requests, TP_fast_assign( __entry->cpu_nr = vcpu->vcpu_id; - __entry->requests = vcpu->requests; + __entry->requests = (__u32)kvm_vcpu_requests(vcpu)[0]; ), It doesn't make sense to expose only a subset of the requests. BTW I don't see this event in the Linus tree, nor in linux-next, I found it here: arch/powerpc/kvm/powerpc.c:104: trace_kvm_check_requests(vcpu); > so I'm not quite sure why it's formed this way; I guess the interesting part is the request number and the return value (i.e. whether it's set), not the whole bitmap. 
No, the code actually dumps part of the first 32 bits of the bitmap: TP_printk("vcpu=%x requests=%x", __entry->cpu_nr, __entry->requests) So, for this corner case we can make __entry->requests an unsigned long variable and add an API helper to get the first unsigned long from the vcpu->requests bitmap. Then we print that part of the bitmap this way: TP_printk("vcpu=0x%x requests=0x%lx", __entry->cpu_nr, __entry->requests) (as unsigned long). POWERPC folks, what do you think about such an approach? Or could we even drop trace_kvm_check_requests() altogether? Is trace_kvm_check_requests() still useful for you? --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -5957,7 +5957,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu) if (intr_window_requested && vmx_interrupt_allowed(vcpu)) return handle_interrupt_window(&vmx->vcpu); - if (test_bit(KVM_REQ_EVENT, &vcpu->requests)) + if (test_bit(KVM_REQ_EVENT, kvm_vcpu_requests(vcpu))) Here you'd rather want that function to test for the request without clearing it. Agree, I'll provide such a function. --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2117,7 +2117,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if (ka->boot_vcpu_runs_old_kvmclock != tmp) set_bit(KVM_REQ_MASTERCLOCK_UPDATE, - &vcpu->requests); + kvm_vcpu_requests(vcpu)); This should have been kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu); Agree. @@ -8048,7 +8048,7 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu) if (atomic_read(&vcpu->arch.nmi_queued)) return true; - if (test_bit(KVM_REQ_SMI, &vcpu->requests)) + if (test_bit(KVM_REQ_SMI, kvm_vcpu_requests(vcpu))) Again the test-only accessor is due here. 
agree --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -146,6 +146,8 @@ static inline bool is_error_page(struct page *page) #define KVM_REQ_HV_EXIT 30 #define KVM_REQ_HV_STIMER 31 +#define KVM_REQ_MAX 64 + #define KVM_USERSPACE_IRQ_SOURCE_ID 0 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1 @@ -233,7 +235,7 @@ struct kvm_vcpu { int vcpu_id; int srcu_idx; int mode; - unsigned long requests; + DECLARE_BITMAP(requests, KVM_REQ_MAX); unsigned long guest_debug; int pre_pcpu; @@ -286,6 +288,16 @@ struct kvm_vcpu { struct kvm_vcpu_arch arch; }; +static inline bool kvm_vcpu_has_requests(struct kvm_vcpu *vcpu) +{ + return !bitmap_empty(vcpu->requests, KVM_REQ_MAX); +} + +static inline ulong *kvm_vcpu_requests(struct kvm_vcpu *vcpu) +{ + return (ulong *)vcpu->requests; +} As I wrote above, I believe this function doesn't belong in the API. agree,
Re: [PATCH v1] kvm: Make vcpu->requests as 64 bit bitmap
On Thu, Dec 24, 2015 at 12:30:26PM +0300, Andrey Smetanin wrote: > Currently on x86 arch we has already 32 requests defined > so the newer request bits can't be placed inside > vcpu->requests(unsigned long) inside x86 32 bit system. > But we are going to add a new request in x86 arch > for Hyper-V tsc page support. > > To solve the problem the patch replaces vcpu->requests by > bitmap with 64 bit length and uses bitmap API. > > The patch consists of: > * announce kvm_vcpu_has_requests() to check whether vcpu has > requests > * announce kvm_vcpu_requests() to get vcpu requests pointer I think that if abstracting out the implementation of the request container is what you're after, you'd better not define this function; accesses to the request map should be through your accessor functions. > * announce kvm_clear_request() to clear particular vcpu request > * replace if (vcpu->requests) by if (kvm_vcpu_has_requests(vcpu)) > * replace clear_bit(req, vcpu->requests) by > kvm_clear_request(req, vcpu) Apparently one accessor is missing: test the presence of a request without clearing it from the bitmap (because kvm_check_request() clears it). > diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h > index 2e0e67e..4015b88 100644 > --- a/arch/powerpc/kvm/trace.h > +++ b/arch/powerpc/kvm/trace.h > @@ -109,7 +109,7 @@ TRACE_EVENT(kvm_check_requests, > > TP_fast_assign( > __entry->cpu_nr = vcpu->vcpu_id; > - __entry->requests = vcpu->requests; > + __entry->requests = (__u32)kvm_vcpu_requests(vcpu)[0]; > ), This doesn't make sense, to expose only subset of the requests. BTW I don't see this event in Linus tree, nor in linux-next, so I'm not quite sure why it's formed this way; I guess the interesting part is the request number and the return value (i.e. whether it's set), not the whole bitmap. 
> --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -5957,7 +5957,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu > *vcpu) > if (intr_window_requested && vmx_interrupt_allowed(vcpu)) > return handle_interrupt_window(&vmx->vcpu); > > - if (test_bit(KVM_REQ_EVENT, &vcpu->requests)) > + if (test_bit(KVM_REQ_EVENT, kvm_vcpu_requests(vcpu))) Here you'd rather want that function to test for the request without clearing it. > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2117,7 +2117,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct > msr_data *msr_info) > > if (ka->boot_vcpu_runs_old_kvmclock != tmp) > set_bit(KVM_REQ_MASTERCLOCK_UPDATE, > - &vcpu->requests); > + kvm_vcpu_requests(vcpu)); This should have been kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu); > @@ -8048,7 +8048,7 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu > *vcpu) > if (atomic_read(&vcpu->arch.nmi_queued)) > return true; > > - if (test_bit(KVM_REQ_SMI, &vcpu->requests)) > + if (test_bit(KVM_REQ_SMI, kvm_vcpu_requests(vcpu))) Again the test-only accessor is due here. > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -146,6 +146,8 @@ static inline bool is_error_page(struct page *page) > #define KVM_REQ_HV_EXIT 30 > #define KVM_REQ_HV_STIMER 31 > > +#define KVM_REQ_MAX 64 > + > #define KVM_USERSPACE_IRQ_SOURCE_ID 0 > #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1 > > @@ -233,7 +235,7 @@ struct kvm_vcpu { > int vcpu_id; > int srcu_idx; > int mode; > - unsigned long requests; > + DECLARE_BITMAP(requests, KVM_REQ_MAX); > unsigned long guest_debug; > > int pre_pcpu; > @@ -286,6 +288,16 @@ struct kvm_vcpu { > struct kvm_vcpu_arch arch; > }; > > +static inline bool kvm_vcpu_has_requests(struct kvm_vcpu *vcpu) > +{ > + return !bitmap_empty(vcpu->requests, KVM_REQ_MAX); > +} > + > +static inline ulong *kvm_vcpu_requests(struct kvm_vcpu *vcpu) > +{ > + return (ulong *)vcpu->requests; > +} As I wrote above, I believe this function doesn't belong in the API. 
> + > static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu) > { > return cmpxchg(&vcpu->mode, IN_GUEST_MODE, EXITING_GUEST_MODE); > @@ -1002,7 +1014,7 @@ static inline bool kvm_is_error_gpa(struct kvm *kvm, > gpa_t gpa) > > static inline void kvm_migrate_timers(struct kvm_vcpu *vcpu) > { > - set_bit(KVM_REQ_MIGRATE_TIMER, &vcpu->requests); > + set_bit(KVM_REQ_MIGRATE_TIMER, kvm_vcpu_requests(vcpu)); kvm_make_request(KVM_REQ_MIGRATE_TIMER, vcpu); > @@ -1118,19 +1130,24 @@ static inline bool kvm_vcpu_compatible(struct > kvm_vcpu *vcpu) { return true; } > > static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu) > { > - set_bit(req, &vcpu->requests); > + set_bit(req, kvm_vcpu_requests(vcpu)); > } > > static inline
[PATCH v2] kvm: Make vcpu->requests as 64 bit bitmap
Currently the x86 arch already has 32 requests defined, so newer request bits can't be placed inside vcpu->requests (an unsigned long) on 32-bit x86 systems. But we are going to add a new request in the x86 arch for Hyper-V TSC page support. To solve the problem, the patch replaces vcpu->requests with a 64-bit bitmap and uses the bitmap API. The patch consists of: * introduce kvm_has_requests() to check whether the vcpu has requests * introduce kvm_clear_request() to clear a particular vcpu request * introduce kvm_test_request() to test a particular vcpu request * replace if (vcpu->requests) with if (kvm_has_requests(vcpu)) * replace clear_bit(req, vcpu->requests) with kvm_clear_request(req, vcpu) Changes v2: * hide internals of the vcpu requests bitmap behind the interface in all places * replace test_bit(req, vcpu->requests) with kvm_test_request() * POWERPC: trace the vcpu requests bitmap via __bitmask, __assign_bitmap, __get_bitmask Signed-off-by: Andrey Smetanin Acked-by: James Hogan CC: Paolo Bonzini CC: Gleb Natapov CC: James Hogan CC: Paul Burton CC: Ralf Baechle CC: Alexander Graf CC: Christian Borntraeger CC: Cornelia Huck CC: linux-m...@linux-mips.org CC: kvm-ppc@vger.kernel.org CC: linux-s...@vger.kernel.org CC: Roman Kagan CC: Denis V. 
Lunev CC: qemu-de...@nongnu.org --- arch/mips/kvm/emulate.c | 4 +--- arch/powerpc/kvm/book3s_pr.c | 2 +- arch/powerpc/kvm/book3s_pr_papr.c | 2 +- arch/powerpc/kvm/booke.c | 6 +++--- arch/powerpc/kvm/powerpc.c| 6 +++--- arch/powerpc/kvm/trace.h | 9 + arch/s390/kvm/kvm-s390.c | 4 ++-- arch/x86/kvm/vmx.c| 2 +- arch/x86/kvm/x86.c| 16 include/linux/kvm_host.h | 38 +- 10 files changed, 50 insertions(+), 39 deletions(-) diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c index 41b1b09..14aebe8 100644 --- a/arch/mips/kvm/emulate.c +++ b/arch/mips/kvm/emulate.c @@ -774,10 +774,8 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu *vcpu) * We we are runnable, then definitely go off to user space to * check if any I/O interrupts are pending. */ - if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) { - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); + if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN; - } } return EMULATE_DONE; diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index 64891b0..e975279 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -349,7 +349,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr) if (msr & MSR_POW) { if (!vcpu->arch.pending_exceptions) { kvm_vcpu_block(vcpu); - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); + kvm_clear_request(KVM_REQ_UNHALT, vcpu); vcpu->stat.halt_wakeup++; /* Unset POW bit after we woke up */ diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c index f2c75a1..60cf393 100644 --- a/arch/powerpc/kvm/book3s_pr_papr.c +++ b/arch/powerpc/kvm/book3s_pr_papr.c @@ -309,7 +309,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd) case H_CEDE: kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE); kvm_vcpu_block(vcpu); - clear_bit(KVM_REQ_UNHALT, &vcpu->requests); + kvm_clear_request(KVM_REQ_UNHALT, vcpu); vcpu->stat.halt_wakeup++; return EMULATE_DONE; case H_LOGICAL_CI_LOAD: diff --git 
a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index fd58751..b2e8643 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -574,7 +574,7 @@ static void arm_next_watchdog(struct kvm_vcpu *vcpu) * userspace, so clear the KVM_REQ_WATCHDOG request. */ if ((vcpu->arch.tsr & (TSR_ENW | TSR_WIS)) != (TSR_ENW | TSR_WIS)) - clear_bit(KVM_REQ_WATCHDOG, &vcpu->requests); + kvm_clear_request(KVM_REQ_WATCHDOG, vcpu); spin_lock_irqsave(&vcpu->arch.wdt_lock, flags); nr_jiffies = watchdog_next_timeout(vcpu); @@ -677,7 +677,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu) kvmppc_core_check_exceptions(vcpu); - if (vcpu->requests) { + if (kvm_has_requests(vcpu)) { /* Exception delivery raised request; start over */ return 1; } @@ -685,7 +685,7 @@ int
Re: [PATCH v2] kvm: Make vcpu->requests as 64 bit bitmap
On Thu, Dec 24, 2015 at 04:29:21PM +0300, Andrey Smetanin wrote: > Currently on x86 arch we has already 32 requests defined > so the newer request bits can't be placed inside > vcpu->requests(unsigned long) inside x86 32 bit system. > But we are going to add a new request in x86 arch > for Hyper-V tsc page support. > > To solve the problem the patch replaces vcpu->requests by > bitmap with 64 bit length and uses bitmap API. > > The patch consists of: > * announce kvm_has_requests() to check whether vcpu has > requests > * announce kvm_clear_request() to clear particular vcpu request > * announce kvm_test_request() to test particular vcpu request > * replace if (vcpu->requests) by if (kvm_has_requests(vcpu)) > * replace clear_bit(req, vcpu->requests) by > kvm_clear_request(req, vcpu) > > Changes v2: > * hide internals of vcpu requests bitmap > by interface usage in all places > * replace test_bit(req, vcpu->requests) by > kvm_test_request() > * POWERPC: trace vcpu requests bitmap by > __bitmask, __assign_bitmap, __get_bitmask > > Signed-off-by: Andrey Smetanin> Acked-by: James Hogan > CC: Paolo Bonzini > CC: Gleb Natapov > CC: James Hogan > CC: Paul Burton > CC: Ralf Baechle > CC: Alexander Graf > CC: Christian Borntraeger > CC: Cornelia Huck > CC: linux-m...@linux-mips.org > CC: kvm-ppc@vger.kernel.org > CC: linux-s...@vger.kernel.org > CC: Roman Kagan > CC: Denis V. 
Lunev > CC: qemu-de...@nongnu.org > --- > arch/mips/kvm/emulate.c | 4 +--- > arch/powerpc/kvm/book3s_pr.c | 2 +- > arch/powerpc/kvm/book3s_pr_papr.c | 2 +- > arch/powerpc/kvm/booke.c | 6 +++--- > arch/powerpc/kvm/powerpc.c| 6 +++--- > arch/powerpc/kvm/trace.h | 9 + > arch/s390/kvm/kvm-s390.c | 4 ++-- > arch/x86/kvm/vmx.c| 2 +- > arch/x86/kvm/x86.c| 16 > include/linux/kvm_host.h | 38 +- > 10 files changed, 50 insertions(+), 39 deletions(-) Reviewed-by: Roman Kagan
Re: [PATCH kernel 7/9] KVM: PPC: Move reusable bits of H_PUT_TCE handler to helpers
On 12/08/2015 04:27 PM, David Gibson wrote: On Tue, Sep 15, 2015 at 08:49:37PM +1000, Alexey Kardashevskiy wrote: Upcoming multi-tce support (H_PUT_TCE_INDIRECT/H_STUFF_TCE hypercalls) will validate TCE (not to have unexpected bits) and IO address (to be within the DMA window boundaries). This introduces helpers to validate TCE and IO address. Signed-off-by: Alexey Kardashevskiy--- arch/powerpc/include/asm/kvm_ppc.h | 4 ++ arch/powerpc/kvm/book3s_64_vio_hv.c | 89 - 2 files changed, 83 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index c6ef05b..fcde896 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -166,6 +166,10 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu); extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, struct kvm_create_spapr_tce *args); +extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, + unsigned long ioba, unsigned long npages); +extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt, + unsigned long tce); extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn, unsigned long ioba, unsigned long tce); extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn, diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c b/arch/powerpc/kvm/book3s_64_vio_hv.c index 6cf1ab3..f0fd84c 100644 --- a/arch/powerpc/kvm/book3s_64_vio_hv.c +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c @@ -36,6 +36,7 @@ #include #include #include +#include #define TCES_PER_PAGE (PAGE_SIZE / sizeof(u64)) @@ -64,7 +65,7 @@ static struct kvmppc_spapr_tce_table *kvmppc_find_table(struct kvm_vcpu *vcpu, * WARNING: This will be called in real-mode on HV KVM and virtual * mode on PR KVM */ -static long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, +long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, unsigned long ioba, unsigned long npages) { unsigned long mask = (1ULL << IOMMU_PAGE_SHIFT_4K) 
- 1; @@ -76,6 +77,79 @@ static long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, return H_SUCCESS; } +EXPORT_SYMBOL_GPL(kvmppc_ioba_validate); Why does it need to be exported - the new users will still be in the KVM module, won't they? book3s_64_vio_hv.c contains realmode code and is always compiled into vmlinux, and the helper is meant to be called from book3s_64_vio.c, which may be compiled as a module. + +/* + * Validates TCE address. + * At the moment flags and page mask are validated. + * As the host kernel does not access those addresses (just puts them + * to the table and user space is supposed to process them), we can skip + * checking other things (such as TCE is a guest RAM address or the page + * was actually allocated). Hmm. These comments apply given that the only current user of this is the kvm acceleration of userspace TCE tables, but the name suggests it would validate any TCE, including in-kernel ones for which this would be unsafe. The function has the "kvmppc_" prefix and the file lives in arch/powerpc/kvm, so to my taste it is self-explanatory that it only handles TCEs from KVM guests (not even from a random user-space tool), no? + * WARNING: This will be called in real-mode on HV KVM and virtual + * mode on PR KVM + */ +long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *stt, unsigned long tce) +{ + unsigned long mask = ((1ULL << IOMMU_PAGE_SHIFT_4K) - 1) & + ~(TCE_PCI_WRITE | TCE_PCI_READ); + + if (tce & mask) + return H_PARAMETER; + + return H_SUCCESS; +} +EXPORT_SYMBOL_GPL(kvmppc_tce_validate); + +/* Note on the use of page_address() in real mode, + * + * It is safe to use page_address() in real mode on ppc64 because + * page_address() is always defined as lowmem_page_address() + * which returns __va(PFN_PHYS(page_to_pfn(page))) which is an arithmetic + * operation and does not access the page struct. 
+ * + * Theoretically page_address() could be defined differently, + * but then either WANT_PAGE_VIRTUAL or HASHED_PAGE_VIRTUAL + * should be enabled. + * WANT_PAGE_VIRTUAL is never enabled on ppc32/ppc64, + * HASHED_PAGE_VIRTUAL could be enabled for ppc32 only and only + * if CONFIG_HIGHMEM is defined. As CONFIG_SPARSEMEM_VMEMMAP + * is not expected to be enabled on ppc32, page_address() + * is safe for ppc32 as well. + * + * WARNING: This will be called in real-mode on HV KVM and virtual + * mode on PR KVM + */ +static u64 *kvmppc_page_address(struct page *page) +{ +#if defined(HASHED_PAGE_VIRTUAL) || defined(WANT_PAGE_VIRTUAL) +#error TODO: fix to avoid page_address() here +#endif + return (u64 *) page_address(page); +} + +/* + * Handles TCE requests for emulated devices. + * Puts guest
Re: [PATCH kernel 9/9] KVM: PPC: Add support for multiple-TCE hcalls
On 12/08/2015 04:48 PM, David Gibson wrote: On Tue, Sep 15, 2015 at 08:49:39PM +1000, Alexey Kardashevskiy wrote: This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO devices or emulated PCI. These calls allow adding multiple entries (up to 512) into the TCE table in one call, which saves time on the transition between kernel and user space. This implements the KVM_CAP_PPC_MULTITCE capability. When present, the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE. If they cannot be handled by the kernel, they are passed on to user space. User space still has to have an implementation for these. Both HV and PR-style KVM are supported. Signed-off-by: Alexey Kardashevskiy --- Documentation/virtual/kvm/api.txt | 25 ++ arch/powerpc/include/asm/kvm_ppc.h | 12 +++ arch/powerpc/kvm/book3s_64_vio.c | 111 +++- arch/powerpc/kvm/book3s_64_vio_hv.c | 145 ++-- arch/powerpc/kvm/book3s_hv.c | 26 +- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 6 +- arch/powerpc/kvm/book3s_pr_papr.c | 35 arch/powerpc/kvm/powerpc.c | 3 + 8 files changed, 350 insertions(+), 13 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index d86d831..593c62a 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -3019,6 +3019,31 @@ Returns: 0 on success, -1 on error Queues an SMI on the thread's vcpu. +4.97 KVM_CAP_PPC_MULTITCE + +Capability: KVM_CAP_PPC_MULTITCE +Architectures: ppc +Type: vm + +This capability means the kernel is capable of handling hypercalls +H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user +space. This significantly accelerates DMA operations for PPC KVM guests. +User space should expect that its handlers for these hypercalls +are not going to be called if user space previously registered LIOBN +in KVM (via KVM_CREATE_SPAPR_TCE or similar calls). 
+ +In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest, +user space might have to advertise it for the guest. For example, +IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is +present in the "ibm,hypertas-functions" device-tree property. + +The hypercalls mentioned above may or may not be processed successfully +in the kernel based fast path. If they can not be handled by the kernel, +they will get passed on to user space. So user space still has to have +an implementation for these despite the in kernel acceleration. + +This capability is always enabled. + 5. The kvm_run structure diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index fcde896..e5b968e 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -166,12 +166,24 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu); extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, struct kvm_create_spapr_tce *args); +extern struct kvmppc_spapr_tce_table *kvmppc_find_table( + struct kvm_vcpu *vcpu, unsigned long liobn); extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, unsigned long ioba, unsigned long npages); extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt, unsigned long tce); +extern long kvmppc_gpa_to_ua(struct kvm *kvm, unsigned long gpa, + unsigned long *ua, unsigned long **prmap); +extern void kvmppc_tce_put(struct kvmppc_spapr_tce_table *tt, + unsigned long idx, unsigned long tce); extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn, unsigned long ioba, unsigned long tce); +extern long kvmppc_h_put_tce_indirect(struct kvm_vcpu *vcpu, + unsigned long liobn, unsigned long ioba, + unsigned long tce_list, unsigned long npages); +extern long kvmppc_h_stuff_tce(struct kvm_vcpu *vcpu, + unsigned long liobn, unsigned long ioba, + unsigned long tce_value, unsigned long npages); extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long 
liobn, unsigned long ioba); extern struct page *kvm_alloc_hpt(unsigned long nr_pages); diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index e347856..d3fc732 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -14,6 +14,7 @@ * * Copyright 2010 Paul Mackerras, IBM Corp. * Copyright 2011 David Gibson, IBM Corporation + * Copyright 2013 Alexey Kardashevskiy, IBM Corporation */ #include @@ -37,8 +38,7 @@ #include #include #include - -#define TCES_PER_PAGE
[PATCH] KVM: PPC: fix ONE_REG AltiVec support
The get and set operations got exchanged by mistake when moving the code from book3s.c to powerpc.c. Fixes: 3840edc8033ad5b86deee309c1c321ca54257452 Signed-off-by: Greg Kurz --- It's been there for over a year but I guess we want that in 4.4, even if it doesn't break the host kernel. arch/powerpc/kvm/powerpc.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 6fd2405c7f4a..a3b182dcb823 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -919,21 +919,17 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg) r = -ENXIO; break; } - vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval; + val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0]; break; case KVM_REG_PPC_VSCR: if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { r = -ENXIO; break; } - vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val); + val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]); break; case KVM_REG_PPC_VRSAVE: - if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { - r = -ENXIO; - break; - } - vcpu->arch.vrsave = set_reg_val(reg->id, val); + val = get_reg_val(reg->id, vcpu->arch.vrsave); break; #endif /* CONFIG_ALTIVEC */ default: @@ -974,17 +970,21 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg) r = -ENXIO; break; } - val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0]; + vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval; break; case KVM_REG_PPC_VSCR: if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { r = -ENXIO; break; } - val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]); + vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val); break; case KVM_REG_PPC_VRSAVE: - val = get_reg_val(reg->id, vcpu->arch.vrsave); + if (!cpu_has_feature(CPU_FTR_ALTIVEC)) { + r = -ENXIO; + break; + } + vcpu->arch.vrsave = set_reg_val(reg->id, val); break; #endif /* CONFIG_ALTIVEC */ default:
Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception
On Thursday 17 December 2015 08:02 AM, David Gibson wrote: > On Wed, Dec 16, 2015 at 11:26:12AM +0530, Aravinda Prasad wrote: >> This patch modifies KVM to cause a guest exit with >> KVM_EXIT_NMI instead of immediately delivering a 0x200 >> interrupt to guest upon machine check exception in >> guest address. Exiting the guest enables QEMU to build >> error log and deliver machine check exception to guest >> OS (either via guest OS registered machine check >> handler or via 0x200 guest OS interrupt vector). >> >> This approach simplifies the delivering of machine >> check exception to guest OS compared to the earlier approach >> of KVM directly invoking 0x200 guest interrupt vector. >> In the earlier approach QEMU patched the 0x200 interrupt >> vector during boot. The patched code at 0x200 issued a >> private hcall to pass the control to QEMU to build the >> error log. >> >> This design/approach is based on the feedback for the >> QEMU patches to handle machine check exception. Details >> of earlier approach of handling machine check exception >> in QEMU and related discussions can be found at: >> >> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html >> >> This patch also introduces a new KVM capability to >> control how KVM behaves on machine check exception. >> Without this capability, KVM redirects machine check >> exceptions to guest's 0x200 vector if the address in >> error belongs to guest. With this capability KVM >> causes a guest exit with NMI exit reason. >> >> This is required to avoid problems if a new kernel/KVM >> is used with an old QEMU for guests that don't issue >> "ibm,nmi-register". As old QEMU does not understand the >> NMI exit type, it treats it as a fatal error. However, >> the guest could have handled the machine check error >> if the exception was delivered to guest's 0x200 interrupt >> vector instead of NMI exit in case of old QEMU. 
>> >> Change Log v2: >> - Added KVM capability >> >> Signed-off-by: Aravinda Prasad>> --- >> arch/powerpc/include/asm/kvm_host.h |1 + >> arch/powerpc/kernel/asm-offsets.c |1 + >> arch/powerpc/kvm/book3s_hv.c| 12 +++--- >> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 37 >> +++ >> arch/powerpc/kvm/powerpc.c |7 ++ >> include/uapi/linux/kvm.h|1 + >> 6 files changed, 31 insertions(+), 28 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/kvm_host.h >> b/arch/powerpc/include/asm/kvm_host.h >> index 827a38d..8a652ba 100644 >> --- a/arch/powerpc/include/asm/kvm_host.h >> +++ b/arch/powerpc/include/asm/kvm_host.h >> @@ -243,6 +243,7 @@ struct kvm_arch { >> int hpt_cma_alloc; >> struct dentry *debugfs_dir; >> struct dentry *htab_dentry; >> +u8 fwnmi_enabled; >> #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */ >> #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE >> struct mutex hpt_mutex; >> diff --git a/arch/powerpc/kernel/asm-offsets.c >> b/arch/powerpc/kernel/asm-offsets.c >> index 221d584..6a4e81a 100644 >> --- a/arch/powerpc/kernel/asm-offsets.c >> +++ b/arch/powerpc/kernel/asm-offsets.c >> @@ -506,6 +506,7 @@ int main(void) >> DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls)); >> DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr)); >> DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v)); >> +DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled)); >> DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr)); >> DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar)); >> DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr)); >> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >> index 2280497..1b1dff0 100644 >> --- a/arch/powerpc/kvm/book3s_hv.c >> +++ b/arch/powerpc/kvm/book3s_hv.c >> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, >> struct kvm_vcpu *vcpu, >> r = RESUME_GUEST; >> break; >> case BOOK3S_INTERRUPT_MACHINE_CHECK: >> -/* >> - * Deliver a machine check interrupt to the 
guest. >> - * We have to do this, even if the host has handled the >> - * machine check, because machine checks use SRR0/1 and >> - * the interrupt might have trashed guest state in them. >> - */ >> -kvmppc_book3s_queue_irqprio(vcpu, >> -BOOK3S_INTERRUPT_MACHINE_CHECK); >> -r = RESUME_GUEST; >> +/* Exit to guest with KVM_EXIT_NMI as exit reason */ >> +run->exit_reason = KVM_EXIT_NMI; >> +r = RESUME_HOST; >> break; >> case BOOK3S_INTERRUPT_PROGRAM: >> { >> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> index b98889e..f43c124 100644 >> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> +++
Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception
On Wed, Dec 16, 2015 at 11:26:12AM +0530, Aravinda Prasad wrote: > This patch modifies KVM to cause a guest exit with > KVM_EXIT_NMI instead of immediately delivering a 0x200 > interrupt to guest upon machine check exception in > guest address. Exiting the guest enables QEMU to build > error log and deliver machine check exception to guest > OS (either via guest OS registered machine check > handler or via 0x200 guest OS interrupt vector). > > This approach simplifies the delivering of machine > check exception to guest OS compared to the earlier approach > of KVM directly invoking 0x200 guest interrupt vector. > In the earlier approach QEMU patched the 0x200 interrupt > vector during boot. The patched code at 0x200 issued a > private hcall to pass the control to QEMU to build the > error log. > > This design/approach is based on the feedback for the > QEMU patches to handle machine check exception. Details > of earlier approach of handling machine check exception > in QEMU and related discussions can be found at: > > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html > > This patch also introduces a new KVM capability to > control how KVM behaves on machine check exception. > Without this capability, KVM redirects machine check > exceptions to guest's 0x200 vector if the address in > error belongs to guest. With this capability KVM > causes a guest exit with NMI exit reason. > > This is required to avoid problems if a new kernel/KVM > is used with an old QEMU for guests that don't issue > "ibm,nmi-register". As old QEMU does not understand the > NMI exit type, it treats it as a fatal error. However, > the guest could have handled the machine check error > if the exception was delivered to guest's 0x200 interrupt > vector instead of NMI exit in case of old QEMU. 
> > Change Log v2: > - Added KVM capability > > Signed-off-by: Aravinda Prasad> --- > arch/powerpc/include/asm/kvm_host.h |1 + > arch/powerpc/kernel/asm-offsets.c |1 + > arch/powerpc/kvm/book3s_hv.c| 12 +++--- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 37 > +++ > arch/powerpc/kvm/powerpc.c |7 ++ > include/uapi/linux/kvm.h|1 + > 6 files changed, 31 insertions(+), 28 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_host.h > b/arch/powerpc/include/asm/kvm_host.h > index 827a38d..8a652ba 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -243,6 +243,7 @@ struct kvm_arch { > int hpt_cma_alloc; > struct dentry *debugfs_dir; > struct dentry *htab_dentry; > + u8 fwnmi_enabled; > #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */ > #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE > struct mutex hpt_mutex; > diff --git a/arch/powerpc/kernel/asm-offsets.c > b/arch/powerpc/kernel/asm-offsets.c > index 221d584..6a4e81a 100644 > --- a/arch/powerpc/kernel/asm-offsets.c > +++ b/arch/powerpc/kernel/asm-offsets.c > @@ -506,6 +506,7 @@ int main(void) > DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls)); > DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr)); > DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v)); > + DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled)); > DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr)); > DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar)); > DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr)); > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 2280497..1b1dff0 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, > struct kvm_vcpu *vcpu, > r = RESUME_GUEST; > break; > case BOOK3S_INTERRUPT_MACHINE_CHECK: > - /* > - * Deliver a machine check interrupt to the guest. 
> - * We have to do this, even if the host has handled the > - * machine check, because machine checks use SRR0/1 and > - * the interrupt might have trashed guest state in them. > - */ > - kvmppc_book3s_queue_irqprio(vcpu, > - BOOK3S_INTERRUPT_MACHINE_CHECK); > - r = RESUME_GUEST; > + /* Exit to guest with KVM_EXIT_NMI as exit reason */ > + run->exit_reason = KVM_EXIT_NMI; > + r = RESUME_HOST; > break; > case BOOK3S_INTERRUPT_PROGRAM: > { > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b98889e..f43c124 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > addir1, r1, 112
Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception
On Wednesday 16 December 2015 03:10 PM, Thomas Huth wrote: > On 16/12/15 06:56, Aravinda Prasad wrote: >> This patch modifies KVM to cause a guest exit with >> KVM_EXIT_NMI instead of immediately delivering a 0x200 >> interrupt to guest upon machine check exception in >> guest address. Exiting the guest enables QEMU to build >> error log and deliver machine check exception to guest >> OS (either via guest OS registered machine check >> handler or via 0x200 guest OS interrupt vector). >> >> This approach simplifies the delivering of machine >> check exception to guest OS compared to the earlier approach >> of KVM directly invoking 0x200 guest interrupt vector. >> In the earlier approach QEMU patched the 0x200 interrupt >> vector during boot. The patched code at 0x200 issued a >> private hcall to pass the control to QEMU to build the >> error log. >> >> This design/approach is based on the feedback for the >> QEMU patches to handle machine check exception. Details >> of earlier approach of handling machine check exception >> in QEMU and related discussions can be found at: >> >> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html >> >> This patch also introduces a new KVM capability to >> control how KVM behaves on machine check exception. >> Without this capability, KVM redirects machine check >> exceptions to guest's 0x200 vector if the address in >> error belongs to guest. With this capability KVM >> causes a guest exit with NMI exit reason. >> >> This is required to avoid problems if a new kernel/KVM >> is used with an old QEMU for guests that don't issue >> "ibm,nmi-register". As old QEMU does not understand the >> NMI exit type, it treats it as a fatal error. However, >> the guest could have handled the machine check error >> if the exception was delivered to guest's 0x200 interrupt >> vector instead of NMI exit in case of old QEMU. 
>> >> Change Log v2: >> - Added KVM capability >> >> Signed-off-by: Aravinda Prasad>> --- >> arch/powerpc/include/asm/kvm_host.h |1 + >> arch/powerpc/kernel/asm-offsets.c |1 + >> arch/powerpc/kvm/book3s_hv.c| 12 +++--- >> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 37 >> +++ >> arch/powerpc/kvm/powerpc.c |7 ++ >> include/uapi/linux/kvm.h|1 + >> 6 files changed, 31 insertions(+), 28 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/kvm_host.h >> b/arch/powerpc/include/asm/kvm_host.h >> index 827a38d..8a652ba 100644 >> --- a/arch/powerpc/include/asm/kvm_host.h >> +++ b/arch/powerpc/include/asm/kvm_host.h >> @@ -243,6 +243,7 @@ struct kvm_arch { >> int hpt_cma_alloc; >> struct dentry *debugfs_dir; >> struct dentry *htab_dentry; >> +u8 fwnmi_enabled; > > Here you declare the variable as 8-bits ... > >> #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */ >> #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE >> struct mutex hpt_mutex; >> diff --git a/arch/powerpc/kernel/asm-offsets.c >> b/arch/powerpc/kernel/asm-offsets.c >> index 221d584..6a4e81a 100644 >> --- a/arch/powerpc/kernel/asm-offsets.c >> +++ b/arch/powerpc/kernel/asm-offsets.c >> @@ -506,6 +506,7 @@ int main(void) >> DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls)); >> DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr)); >> DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v)); >> +DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled)); > > ... then define an asm-offset for it ... > >> DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr)); >> DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar)); >> DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr)); >> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> index b98889e..f43c124 100644 >> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > [...] 
>> @@ -2381,24 +2377,27 @@ machine_check_realmode: >> ld r9, HSTATE_KVM_VCPU(r13) >> li r12, BOOK3S_INTERRUPT_MACHINE_CHECK >> /* >> - * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through >> - * machine check interrupt (set HSRR0 to 0x200). And for handled >> - * errors (no-fatal), just go back to guest execution with current >> - * HSRR0 instead of exiting guest. This new approach will inject >> - * machine check to guest for fatal error causing guest to crash. >> - * >> - * The old code used to return to host for unhandled errors which >> - * was causing guest to hang with soft lockups inside guest and >> - * makes it difficult to recover guest instance. >> + * Deliver unhandled/fatal (e.g. UE) MCE errors to guest >> + * by exiting the guest with KVM_EXIT_NMI exit reason (exit >> + * reason set later based on trap). For handled errors >> + * (no-fatal), go back to guest
Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception
On 16/12/15 06:56, Aravinda Prasad wrote: > This patch modifies KVM to cause a guest exit with > KVM_EXIT_NMI instead of immediately delivering a 0x200 > interrupt to guest upon machine check exception in > guest address. Exiting the guest enables QEMU to build > error log and deliver machine check exception to guest > OS (either via guest OS registered machine check > handler or via 0x200 guest OS interrupt vector). > > This approach simplifies the delivering of machine > check exception to guest OS compared to the earlier approach > of KVM directly invoking 0x200 guest interrupt vector. > In the earlier approach QEMU patched the 0x200 interrupt > vector during boot. The patched code at 0x200 issued a > private hcall to pass the control to QEMU to build the > error log. > > This design/approach is based on the feedback for the > QEMU patches to handle machine check exception. Details > of earlier approach of handling machine check exception > in QEMU and related discussions can be found at: > > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html > > This patch also introduces a new KVM capability to > control how KVM behaves on machine check exception. > Without this capability, KVM redirects machine check > exceptions to guest's 0x200 vector if the address in > error belongs to guest. With this capability KVM > causes a guest exit with NMI exit reason. > > This is required to avoid problems if a new kernel/KVM > is used with an old QEMU for guests that don't issue > "ibm,nmi-register". As old QEMU does not understand the > NMI exit type, it treats it as a fatal error. However, > the guest could have handled the machine check error > if the exception was delivered to guest's 0x200 interrupt > vector instead of NMI exit in case of old QEMU. 
> > Change Log v2: > - Added KVM capability > > Signed-off-by: Aravinda Prasad> --- > arch/powerpc/include/asm/kvm_host.h |1 + > arch/powerpc/kernel/asm-offsets.c |1 + > arch/powerpc/kvm/book3s_hv.c| 12 +++--- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 37 > +++ > arch/powerpc/kvm/powerpc.c |7 ++ > include/uapi/linux/kvm.h|1 + > 6 files changed, 31 insertions(+), 28 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_host.h > b/arch/powerpc/include/asm/kvm_host.h > index 827a38d..8a652ba 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -243,6 +243,7 @@ struct kvm_arch { > int hpt_cma_alloc; > struct dentry *debugfs_dir; > struct dentry *htab_dentry; > + u8 fwnmi_enabled; Here you declare the variable as 8-bits ... > #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */ > #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE > struct mutex hpt_mutex; > diff --git a/arch/powerpc/kernel/asm-offsets.c > b/arch/powerpc/kernel/asm-offsets.c > index 221d584..6a4e81a 100644 > --- a/arch/powerpc/kernel/asm-offsets.c > +++ b/arch/powerpc/kernel/asm-offsets.c > @@ -506,6 +506,7 @@ int main(void) > DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls)); > DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr)); > DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v)); > + DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled)); ... then define an asm-offset for it ... > DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr)); > DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar)); > DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr)); > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b98889e..f43c124 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S [...] 
> @@ -2381,24 +2377,27 @@ machine_check_realmode: > ld r9, HSTATE_KVM_VCPU(r13) > li r12, BOOK3S_INTERRUPT_MACHINE_CHECK > /* > - * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through > - * machine check interrupt (set HSRR0 to 0x200). And for handled > - * errors (no-fatal), just go back to guest execution with current > - * HSRR0 instead of exiting guest. This new approach will inject > - * machine check to guest for fatal error causing guest to crash. > - * > - * The old code used to return to host for unhandled errors which > - * was causing guest to hang with soft lockups inside guest and > - * makes it difficult to recover guest instance. > + * Deliver unhandled/fatal (e.g. UE) MCE errors to guest > + * by exiting the guest with KVM_EXIT_NMI exit reason (exit > + * reason set later based on trap). For handled errors > + * (no-fatal), go back to guest execution with current HSRR0 > + * instead of exiting the guest. This approach will cause > + * the guest to exit in case of fatal
[PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception
This patch modifies KVM to cause a guest exit with KVM_EXIT_NMI instead of immediately delivering a 0x200 interrupt to the guest upon a machine check exception in a guest address. Exiting the guest enables QEMU to build an error log and deliver the machine check exception to the guest OS (either via the guest OS's registered machine check handler or via the guest's 0x200 interrupt vector). This approach simplifies delivering the machine check exception to the guest OS, compared to the earlier approach of KVM directly invoking the guest's 0x200 interrupt vector. In the earlier approach, QEMU patched the 0x200 interrupt vector during boot. The patched code at 0x200 issued a private hcall to pass control to QEMU to build the error log. This design/approach is based on the feedback for the QEMU patches to handle machine check exceptions. Details of the earlier approach of handling machine check exceptions in QEMU and related discussions can be found at: https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html This patch also introduces a new KVM capability to control how KVM behaves on a machine check exception. Without this capability, KVM redirects machine check exceptions to the guest's 0x200 vector if the address in error belongs to the guest. With this capability, KVM causes a guest exit with an NMI exit reason. This is required to avoid problems if a new kernel/KVM is used with an old QEMU for guests that don't issue "ibm,nmi-register". As old QEMU does not understand the NMI exit type, it treats it as a fatal error. However, with an old QEMU the guest could have handled the machine check error if the exception had been delivered to the guest's 0x200 interrupt vector instead of causing an NMI exit. 
Change Log v2: - Added KVM capability Signed-off-by: Aravinda Prasad--- arch/powerpc/include/asm/kvm_host.h |1 + arch/powerpc/kernel/asm-offsets.c |1 + arch/powerpc/kvm/book3s_hv.c| 12 +++--- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 37 +++ arch/powerpc/kvm/powerpc.c |7 ++ include/uapi/linux/kvm.h|1 + 6 files changed, 31 insertions(+), 28 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 827a38d..8a652ba 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -243,6 +243,7 @@ struct kvm_arch { int hpt_cma_alloc; struct dentry *debugfs_dir; struct dentry *htab_dentry; + u8 fwnmi_enabled; #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */ #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE struct mutex hpt_mutex; diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 221d584..6a4e81a 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -506,6 +506,7 @@ int main(void) DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls)); DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr)); DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v)); + DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled)); DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr)); DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar)); DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr)); diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 2280497..1b1dff0 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, r = RESUME_GUEST; break; case BOOK3S_INTERRUPT_MACHINE_CHECK: - /* -* Deliver a machine check interrupt to the guest. 
-* We have to do this, even if the host has handled the -* machine check, because machine checks use SRR0/1 and -* the interrupt might have trashed guest state in them. -*/ - kvmppc_book3s_queue_irqprio(vcpu, - BOOK3S_INTERRUPT_MACHINE_CHECK); - r = RESUME_GUEST; + /* Exit to guest with KVM_EXIT_NMI as exit reason */ + run->exit_reason = KVM_EXIT_NMI; + r = RESUME_HOST; break; case BOOK3S_INTERRUPT_PROGRAM: { diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index b98889e..f43c124 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) addir1, r1, 112 ld r7, HSTATE_HOST_MSR(r13) - cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL beq 11f cmpwi cr2, r12, BOOK3S_INTERRUPT_HMI @@
Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception
Hi, > This patch also introduces a new KVM capability to > control how KVM behaves on machine check exception. > Without this capability, KVM redirects machine check > exceptions to guest's 0x200 vector if the address in > error belongs to guest. With this capability KVM > causes a guest exit with NMI exit reason. > > This is required to avoid problems if a new kernel/KVM > is used with an old QEMU for guests that don't issue > "ibm,nmi-register". As old QEMU does not understand the > NMI exit type, it treats it as a fatal error. However, > the guest could have handled the machine check error > if the exception was delivered to guest's 0x200 interrupt > vector instead of NMI exit in case of old QEMU. > > Change Log v2: > - Added KVM capability I'm not really qualified to review the contents of this patch, but I'm happy that the changes in v2 address the concern I had for version 1: thank you. Regards, Daniel > > Signed-off-by: Aravinda Prasad> --- > arch/powerpc/include/asm/kvm_host.h |1 + > arch/powerpc/kernel/asm-offsets.c |1 + > arch/powerpc/kvm/book3s_hv.c| 12 +++--- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 37 > +++ > arch/powerpc/kvm/powerpc.c |7 ++ > include/uapi/linux/kvm.h|1 + > 6 files changed, 31 insertions(+), 28 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_host.h > b/arch/powerpc/include/asm/kvm_host.h > index 827a38d..8a652ba 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -243,6 +243,7 @@ struct kvm_arch { > int hpt_cma_alloc; > struct dentry *debugfs_dir; > struct dentry *htab_dentry; > + u8 fwnmi_enabled; > #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */ > #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE > struct mutex hpt_mutex; > diff --git a/arch/powerpc/kernel/asm-offsets.c > b/arch/powerpc/kernel/asm-offsets.c > index 221d584..6a4e81a 100644 > --- a/arch/powerpc/kernel/asm-offsets.c > +++ b/arch/powerpc/kernel/asm-offsets.c > @@ -506,6 +506,7 @@ int main(void) > DEFINE(KVM_ENABLED_HCALLS, 
offsetof(struct kvm, arch.enabled_hcalls)); > DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr)); > DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v)); > + DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled)); > DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr)); > DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar)); > DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr)); > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 2280497..1b1dff0 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, > struct kvm_vcpu *vcpu, > r = RESUME_GUEST; > break; > case BOOK3S_INTERRUPT_MACHINE_CHECK: > - /* > - * Deliver a machine check interrupt to the guest. > - * We have to do this, even if the host has handled the > - * machine check, because machine checks use SRR0/1 and > - * the interrupt might have trashed guest state in them. 
> - */ > - kvmppc_book3s_queue_irqprio(vcpu, > - BOOK3S_INTERRUPT_MACHINE_CHECK); > - r = RESUME_GUEST; > + /* Exit to guest with KVM_EXIT_NMI as exit reason */ > + run->exit_reason = KVM_EXIT_NMI; > + r = RESUME_HOST; > break; > case BOOK3S_INTERRUPT_PROGRAM: > { > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b98889e..f43c124 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > addir1, r1, 112 > ld r7, HSTATE_HOST_MSR(r13) > > - cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK > cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL > beq 11f > cmpwi cr2, r12, BOOK3S_INTERRUPT_HMI > @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > mtmsrd r6, 1 /* Clear RI in MSR */ > mtsrr0 r8 > mtsrr1 r7 > - beq cr1, 13f/* machine check */ > RFI > > /* On POWER7, we have external interrupts set to use HSRR0/1 */ > @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > mtspr SPRN_HSRR1, r7 > ba 0x500 > > -13: b machine_check_fwnmi > - > 14: mtspr SPRN_HSRR0, r8 > mtspr SPRN_HSRR1, r7 > b hmi_exception_after_realmode > @@ -2381,24 +2377,27 @@ machine_check_realmode: > ld r9, HSTATE_KVM_VCPU(r13) > li r12, BOOK3S_INTERRUPT_MACHINE_CHECK >
Re: [GIT PULL] Please pull my kvm-ppc-fixes branch
On 10/12/2015 04:12, Paul Mackerras wrote: > git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git > kvm-ppc-fixes Pulled, thanks. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/1] KVM: PPC: Increase memslots to 320
On 09/12/15 04:28, Paul Mackerras wrote: > On Wed, Nov 04, 2015 at 10:03:48AM +0100, Thomas Huth wrote: >> Only using 32 memslots for KVM on powerpc is way too low, you can >> nowadays hit this limit quite fast by adding a couple of PCI devices >> and/or pluggable memory DIMMs to the guest. >> x86 already increased the limit to 512 in total, to satisfy 256 >> pluggable DIMM slots, 3 private slots and 253 slots for other things >> like PCI devices. On powerpc, we only have 32 pluggable DIMMs in > > I agree with increasing the limit. Is there a reason we have only 32 > pluggable DIMMs in QEMU on powerpc, not more? Should we be increasing > that limit too? If so, maybe we should increase the number of memory > slots to 512? Hmm ... the comment before the #define in QEMU reads: /* * This defines the maximum number of DIMM slots we can have for sPAPR * guest. This is not defined by sPAPR but we are defining it to 32 slots * based on default number of slots provided by PowerPC kernel. */ #define SPAPR_MAX_RAM_SLOTS 32 So as far as I can see, there's indeed a possibility that we'll increase this value once the kernel can handle more slots! So I guess you're right and we should play safe and use more slots right from the start. I'll send a new patch with 512 instead. >> QEMU, not 256, so we likely do not as much slots as on x86. Thus > > "so we likely do not need as many slots as on x86" would be better > English. Oops, I'm sure that was my original intention ... anyway, thanks for pointing it out, it's always good to get some feedback as a non-native English speaker. Thomas -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: PPC: Increase memslots to 512
Only using 32 memslots for KVM on powerpc is way too low, you can nowadays hit this limit quite fast by adding a couple of PCI devices and/or pluggable memory DIMMs to the guest. x86 already increased the KVM_USER_MEM_SLOTS to 509, to satisfy 256 pluggable DIMM slots, 3 private slots and 253 slots for other things like PCI devices (i.e. resulting in 256 + 3 + 253 = 512 slots in total). We should do something similar for powerpc, and since we do not use private slots here, we can set the value to 512 directly. While we're at it, also remove the KVM_MEM_SLOTS_NUM definition from the powerpc-specific header since this gets defined in the generic kvm_host.h header anyway. Signed-off-by: Thomas Huth
--- arch/powerpc/include/asm/kvm_host.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 887c259..2c96a72 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -38,8 +38,7 @@ #define KVM_MAX_VCPUS NR_CPUS #define KVM_MAX_VCORES NR_CPUS -#define KVM_USER_MEM_SLOTS 32 -#define KVM_MEM_SLOTS_NUM KVM_USER_MEM_SLOTS +#define KVM_USER_MEM_SLOTS 512 #ifdef CONFIG_KVM_MMIO #define KVM_COALESCED_MMIO_PAGE_OFFSET 1 -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8
On Fri, Nov 20, 2015 at 09:11:45AM +0100, Thomas Huth wrote: > In the old DABR register, the BT (Breakpoint Translation) bit > is bit number 61. In the new DAWRX register, the WT (Watchpoint > Translation) bit is bit number 59. So to move the DABR-BT bit > into the position of the DAWRX-WT bit, it has to be shifted by > two, not only by one. This fixes hardware watchpoints in gdb of > older guests that only use the H_SET_DABR/X interface instead > of the new H_SET_MODE interface. > > Signed-off-by: Thomas Huth

Thanks, applied to my kvm-ppc-next branch, with cc: sta...@vger.kernel.org. Paul. -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: remove unused variable 'vcpu_book3s'
On Tue, Dec 01, 2015 at 08:42:10PM -0300, Geyslan G. Bem wrote: > The vcpu_book3s struct is assigned but never used. So remove it. > > Signed-off-by: Geyslan G. Bem

Thanks, applied to my kvm-ppc-next branch. Paul. -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: PPC: Increase memslots to 512
On Wed, Dec 09, 2015 at 11:34:07AM +0100, Thomas Huth wrote: > Only using 32 memslots for KVM on powerpc is way too low, you can > nowadays hit this limit quite fast by adding a couple of PCI devices > and/or pluggable memory DIMMs to the guest. > > x86 already increased the KVM_USER_MEM_SLOTS to 509, to satisfy 256 > pluggable DIMM slots, 3 private slots and 253 slots for other things > like PCI devices (i.e. resulting in 256 + 3 + 253 = 512 slots in > total). We should do something similar for powerpc, and since we do > not use private slots here, we can set the value to 512 directly. > > While we're at it, also remove the KVM_MEM_SLOTS_NUM definition > from the powerpc-specific header since this gets defined in the > generic kvm_host.h header anyway. > > Signed-off-by: Thomas Huth

Thanks, applied to my kvm-ppc-next branch. Paul. -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[GIT PULL] Please pull my kvm-ppc-fixes branch
Hi Paolo, I have a small patch that I would like to get into 4.4 because it fixes a bug which for certain kernel configs allows userspace to crash the kernel. The configs are those for which KVM_BOOK3S_64_HV is set (y or m) and KVM_BOOK3S_64_PR is not. Fortunately most distros that enable KVM_BOOK3S_64_HV also enable KVM_BOOK3S_64_PR, as far as I can tell. Thanks, Paul. The following changes since commit 09922076003ad66de41ea14d2f8c3b4a16ec7774: Merge tag 'kvm-arm-for-v4.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master (2015-12-04 18:32:32 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git kvm-ppc-fixes for you to fetch changes up to c20875a3e638e4a03e099b343ec798edd1af5cc6: KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR (2015-12-10 11:34:27 +1100) Paul Mackerras (1): KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR arch/powerpc/kvm/book3s_hv.c | 6 ++ 1 file changed, 6 insertions(+) -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH kernel 5/9] KVM: PPC: Account TCE-containing pages in locked_vm
On Tue, Sep 15, 2015 at 08:49:35PM +1000, Alexey Kardashevskiy wrote: > At the moment pages used for TCE tables (in addition to pages addressed > by TCEs) are not counted in locked_vm counter so a malicious userspace > tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as RLIMIT_NOFILE and > lock a lot of memory. > > This adds counting for pages used for TCE tables. > > This counts the number of pages required for a table plus pages for > the kvmppc_spapr_tce_table struct (TCE table descriptor) itself. Hmm. Does it make sense to account for the descriptor struct itself? I mean there are lots of little structures the kernel will allocate on a process's behalf, and I don't think most of them get accounted against locked vm. > This does not change the amount of (de)allocated memory. > > Signed-off-by: Alexey Kardashevskiy> --- > arch/powerpc/kvm/book3s_64_vio.c | 51 > +++- > 1 file changed, 50 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_64_vio.c > b/arch/powerpc/kvm/book3s_64_vio.c > index 9526c34..b70787d 100644 > --- a/arch/powerpc/kvm/book3s_64_vio.c > +++ b/arch/powerpc/kvm/book3s_64_vio.c > @@ -45,13 +45,56 @@ static long kvmppc_stt_npages(unsigned long window_size) >* sizeof(u64), PAGE_SIZE) / PAGE_SIZE; > } > > +static long kvmppc_account_memlimit(long npages, bool inc) > +{ > + long ret = 0; > + const long bytes = sizeof(struct kvmppc_spapr_tce_table) + > + (abs(npages) * sizeof(struct page *)); > + const long stt_pages = ALIGN(bytes, PAGE_SIZE) / PAGE_SIZE; Overflow checks might be useful here, I'm not sure. 
> + > + if (!current || !current->mm) > + return ret; /* process exited */ > + > + npages += stt_pages; > + > + down_write(&current->mm->mmap_sem); > + > + if (inc) { > + long locked, lock_limit; > + > + locked = current->mm->locked_vm + npages; > + lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT; > + if (locked > lock_limit && !capable(CAP_IPC_LOCK)) > + ret = -ENOMEM; > + else > + current->mm->locked_vm += npages; > + } else { > + if (npages > current->mm->locked_vm) Should this be a WARN_ON? It means something has gone wrong previously in the accounting, doesn't it? > + npages = current->mm->locked_vm; > + > + current->mm->locked_vm -= npages; > + } > + > + pr_debug("[%d] RLIMIT_MEMLOCK KVM %c%ld %ld/%ld%s\n", current->pid, > + inc ? '+' : '-', > + npages << PAGE_SHIFT, > + current->mm->locked_vm << PAGE_SHIFT, > + rlimit(RLIMIT_MEMLOCK), > + ret ? " - exceeded" : ""); > + > + up_write(&current->mm->mmap_sem); > + > + return ret; > +} > + > static void release_spapr_tce_table(struct rcu_head *head) > { > struct kvmppc_spapr_tce_table *stt = container_of(head, > struct kvmppc_spapr_tce_table, rcu); > int i; > + long npages = kvmppc_stt_npages(stt->window_size); > > - for (i = 0; i < kvmppc_stt_npages(stt->window_size); i++) > + for (i = 0; i < npages; i++) > __free_page(stt->pages[i]); > > kfree(stt); > @@ -89,6 +132,7 @@ static int kvm_spapr_tce_release(struct inode *inode, > struct file *filp) > > kvm_put_kvm(stt->kvm); > > + kvmppc_account_memlimit(kvmppc_stt_npages(stt->window_size), false); > call_rcu(&stt->rcu, release_spapr_tce_table); > > return 0; > @@ -114,6 +158,11 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, > } > > npages = kvmppc_stt_npages(args->window_size); > + ret = kvmppc_account_memlimit(npages, true); > + if (ret) { > + stt = NULL; > + goto fail; > + } > > stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *), > GFP_KERNEL); -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. 
NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Re: [PATCH kernel 7/9] KVM: PPC: Move reusable bits of H_PUT_TCE handler to helpers
On Tue, Sep 15, 2015 at 08:49:37PM +1000, Alexey Kardashevskiy wrote: > Upcoming multi-tce support (H_PUT_TCE_INDIRECT/H_STUFF_TCE hypercalls) > will validate TCE (not to have unexpected bits) and IO address > (to be within the DMA window boundaries). > > This introduces helpers to validate TCE and IO address. > > Signed-off-by: Alexey Kardashevskiy> --- > arch/powerpc/include/asm/kvm_ppc.h | 4 ++ > arch/powerpc/kvm/book3s_64_vio_hv.c | 89 > - > 2 files changed, 83 insertions(+), 10 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_ppc.h > b/arch/powerpc/include/asm/kvm_ppc.h > index c6ef05b..fcde896 100644 > --- a/arch/powerpc/include/asm/kvm_ppc.h > +++ b/arch/powerpc/include/asm/kvm_ppc.h > @@ -166,6 +166,10 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu > *vcpu); > > extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, > struct kvm_create_spapr_tce *args); > +extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, > + unsigned long ioba, unsigned long npages); > +extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt, > + unsigned long tce); > extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn, >unsigned long ioba, unsigned long tce); > extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn, > diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c > b/arch/powerpc/kvm/book3s_64_vio_hv.c > index 6cf1ab3..f0fd84c 100644 > --- a/arch/powerpc/kvm/book3s_64_vio_hv.c > +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c > @@ -36,6 +36,7 @@ > #include > #include > #include > +#include > > #define TCES_PER_PAGE(PAGE_SIZE / sizeof(u64)) > > @@ -64,7 +65,7 @@ static struct kvmppc_spapr_tce_table > *kvmppc_find_table(struct kvm_vcpu *vcpu, > * WARNING: This will be called in real-mode on HV KVM and virtual > * mode on PR KVM > */ > -static long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, > +long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, > unsigned long ioba, unsigned long 
npages) > { > unsigned long mask = (1ULL << IOMMU_PAGE_SHIFT_4K) - 1; > @@ -76,6 +77,79 @@ static long kvmppc_ioba_validate(struct > kvmppc_spapr_tce_table *stt, > > return H_SUCCESS; > } > +EXPORT_SYMBOL_GPL(kvmppc_ioba_validate); Why does it need to be exported - the new users will still be in the KVM module, won't they? > + > +/* > + * Validates TCE address. > + * At the moment flags and page mask are validated. > + * As the host kernel does not access those addresses (just puts them > + * to the table and user space is supposed to process them), we can skip > + * checking other things (such as TCE is a guest RAM address or the page > + * was actually allocated). Hmm. These comments apply given that the only current user of this is the kvm acceleration of userspace TCE tables, but the name suggests it would validate any TCE, including in kernel ones for which this would be unsafe. > + * WARNING: This will be called in real-mode on HV KVM and virtual > + * mode on PR KVM > + */ > +long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *stt, unsigned long > tce) > +{ > + unsigned long mask = ((1ULL << IOMMU_PAGE_SHIFT_4K) - 1) & > + ~(TCE_PCI_WRITE | TCE_PCI_READ); > + > + if (tce & mask) > + return H_PARAMETER; > + > + return H_SUCCESS; > +} > +EXPORT_SYMBOL_GPL(kvmppc_tce_validate); > + > +/* Note on the use of page_address() in real mode, > + * > + * It is safe to use page_address() in real mode on ppc64 because > + * page_address() is always defined as lowmem_page_address() > + * which returns __va(PFN_PHYS(page_to_pfn(page))) which is arithmetial > + * operation and does not access page struct. > + * > + * Theoretically page_address() could be defined different > + * but either WANT_PAGE_VIRTUAL or HASHED_PAGE_VIRTUAL > + * should be enabled. > + * WANT_PAGE_VIRTUAL is never enabled on ppc32/ppc64, > + * HASHED_PAGE_VIRTUAL could be enabled for ppc32 only and only > + * if CONFIG_HIGHMEM is defined. 
As CONFIG_SPARSEMEM_VMEMMAP > + * is not expected to be enabled on ppc32, page_address() > + * is safe for ppc32 as well. > + * > + * WARNING: This will be called in real-mode on HV KVM and virtual > + * mode on PR KVM > + */ > +static u64 *kvmppc_page_address(struct page *page) > +{ > +#if defined(HASHED_PAGE_VIRTUAL) || defined(WANT_PAGE_VIRTUAL) > +#error TODO: fix to avoid page_address() here > +#endif > + return (u64 *) page_address(page); > +} > + > +/* > + * Handles TCE requests for emulated devices. > + * Puts guest TCE values to the table and expects user space to convert them. > + * Called in both real and virtual modes. > + * Cannot fail so kvmppc_tce_validate must be called before it. > + * > + * WARNING: This will be called in real-mode on HV KVM and
Re: [PATCH kernel 6/9] KVM: PPC: Replace SPAPR_TCE_SHIFT with IOMMU_PAGE_SHIFT_4K
On Tue, Sep 15, 2015 at 08:49:36PM +1000, Alexey Kardashevskiy wrote: > SPAPR_TCE_SHIFT is used in few places only and since IOMMU_PAGE_SHIFT_4K > can be easily used instead, remove SPAPR_TCE_SHIFT. > > Signed-off-by: Alexey KardashevskiyReviewed-by: David Gibson > --- > arch/powerpc/include/asm/kvm_book3s_64.h | 2 -- > arch/powerpc/kvm/book3s_64_vio.c | 3 ++- > arch/powerpc/kvm/book3s_64_vio_hv.c | 4 ++-- > 3 files changed, 4 insertions(+), 5 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h > b/arch/powerpc/include/asm/kvm_book3s_64.h > index 2aa79c8..7529aab 100644 > --- a/arch/powerpc/include/asm/kvm_book3s_64.h > +++ b/arch/powerpc/include/asm/kvm_book3s_64.h > @@ -33,8 +33,6 @@ static inline void svcpu_put(struct > kvmppc_book3s_shadow_vcpu *svcpu) > } > #endif > > -#define SPAPR_TCE_SHIFT 12 > - > #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE > #define KVM_DEFAULT_HPT_ORDER24 /* 16MB HPT by default */ > #endif > diff --git a/arch/powerpc/kvm/book3s_64_vio.c > b/arch/powerpc/kvm/book3s_64_vio.c > index b70787d..e347856 100644 > --- a/arch/powerpc/kvm/book3s_64_vio.c > +++ b/arch/powerpc/kvm/book3s_64_vio.c > @@ -36,12 +36,13 @@ > #include > #include > #include > +#include > > #define TCES_PER_PAGE(PAGE_SIZE / sizeof(u64)) > > static long kvmppc_stt_npages(unsigned long window_size) > { > - return ALIGN((window_size >> SPAPR_TCE_SHIFT) > + return ALIGN((window_size >> IOMMU_PAGE_SHIFT_4K) >* sizeof(u64), PAGE_SIZE) / PAGE_SIZE; > } > > diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c > b/arch/powerpc/kvm/book3s_64_vio_hv.c > index 8ae12ac..6cf1ab3 100644 > --- a/arch/powerpc/kvm/book3s_64_vio_hv.c > +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c > @@ -99,7 +99,7 @@ long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long > liobn, > if (ret) > return ret; > > - idx = ioba >> SPAPR_TCE_SHIFT; > + idx = ioba >> IOMMU_PAGE_SHIFT_4K; > page = stt->pages[idx / TCES_PER_PAGE]; > tbl = (u64 *)page_address(page); > > @@ -121,7 +121,7 @@ long 
kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned > long liobn, > if (stt) { > ret = kvmppc_ioba_validate(stt, ioba, 1); > if (!ret) { > - unsigned long idx = ioba >> SPAPR_TCE_SHIFT; > + unsigned long idx = ioba >> IOMMU_PAGE_SHIFT_4K; > struct page *page = stt->pages[idx / TCES_PER_PAGE]; > u64 *tbl = (u64 *)page_address(page); > -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH kernel 9/9] KVM: PPC: Add support for multiple-TCE hcalls
On Tue, Sep 15, 2015 at 08:49:39PM +1000, Alexey Kardashevskiy wrote: > This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and > H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO > devices or emulated PCI. These calls allow adding multiple entries > (up to 512) into the TCE table in one call which saves time on > transition between kernel and user space. > > This implements the KVM_CAP_PPC_MULTITCE capability. When present, > the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE. > If they can not be handled by the kernel, they are passed on to > the user space. The user space still has to have an implementation > for these. > > Both HV and PR-syle KVM are supported. > > Signed-off-by: Alexey Kardashevskiy> --- > Documentation/virtual/kvm/api.txt | 25 ++ > arch/powerpc/include/asm/kvm_ppc.h | 12 +++ > arch/powerpc/kvm/book3s_64_vio.c| 111 +++- > arch/powerpc/kvm/book3s_64_vio_hv.c | 145 > ++-- > arch/powerpc/kvm/book3s_hv.c| 26 +- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 6 +- > arch/powerpc/kvm/book3s_pr_papr.c | 35 > arch/powerpc/kvm/powerpc.c | 3 + > 8 files changed, 350 insertions(+), 13 deletions(-) > > diff --git a/Documentation/virtual/kvm/api.txt > b/Documentation/virtual/kvm/api.txt > index d86d831..593c62a 100644 > --- a/Documentation/virtual/kvm/api.txt > +++ b/Documentation/virtual/kvm/api.txt > @@ -3019,6 +3019,31 @@ Returns: 0 on success, -1 on error > > Queues an SMI on the thread's vcpu. > > +4.97 KVM_CAP_PPC_MULTITCE > + > +Capability: KVM_CAP_PPC_MULTITCE > +Architectures: ppc > +Type: vm > + > +This capability means the kernel is capable of handling hypercalls > +H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user > +space. This significantly accelerates DMA operations for PPC KVM guests. > +User space should expect that its handlers for these hypercalls > +are not going to be called if user space previously registered LIOBN > +in KVM (via KVM_CREATE_SPAPR_TCE or similar calls). 
> + > +In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest, > +user space might have to advertise it for the guest. For example, > +IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is > +present in the "ibm,hypertas-functions" device-tree property. > + > +The hypercalls mentioned above may or may not be processed successfully > +in the kernel based fast path. If they can not be handled by the kernel, > +they will get passed on to user space. So user space still has to have > +an implementation for these despite the in kernel acceleration. > + > +This capability is always enabled. > + > 5. The kvm_run structure > > > diff --git a/arch/powerpc/include/asm/kvm_ppc.h > b/arch/powerpc/include/asm/kvm_ppc.h > index fcde896..e5b968e 100644 > --- a/arch/powerpc/include/asm/kvm_ppc.h > +++ b/arch/powerpc/include/asm/kvm_ppc.h > @@ -166,12 +166,24 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu > *vcpu); > > extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm, > struct kvm_create_spapr_tce *args); > +extern struct kvmppc_spapr_tce_table *kvmppc_find_table( > + struct kvm_vcpu *vcpu, unsigned long liobn); > extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, > unsigned long ioba, unsigned long npages); > extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt, > unsigned long tce); > +extern long kvmppc_gpa_to_ua(struct kvm *kvm, unsigned long gpa, > + unsigned long *ua, unsigned long **prmap); > +extern void kvmppc_tce_put(struct kvmppc_spapr_tce_table *tt, > + unsigned long idx, unsigned long tce); > extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn, >unsigned long ioba, unsigned long tce); > +extern long kvmppc_h_put_tce_indirect(struct kvm_vcpu *vcpu, > + unsigned long liobn, unsigned long ioba, > + unsigned long tce_list, unsigned long npages); > +extern long kvmppc_h_stuff_tce(struct kvm_vcpu *vcpu, > + unsigned long liobn, unsigned long ioba, > + unsigned long tce_value, 
unsigned long npages); > extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn, >unsigned long ioba); > extern struct page *kvm_alloc_hpt(unsigned long nr_pages); > diff --git a/arch/powerpc/kvm/book3s_64_vio.c > b/arch/powerpc/kvm/book3s_64_vio.c > index e347856..d3fc732 100644 > --- a/arch/powerpc/kvm/book3s_64_vio.c > +++ b/arch/powerpc/kvm/book3s_64_vio.c > @@ -14,6 +14,7 @@ > * > * Copyright 2010 Paul Mackerras, IBM Corp. > * Copyright 2011 David Gibson, IBM Corporation > + * Copyright 2013 Alexey Kardashevskiy, IBM
Re: [PATCH 1/1] KVM: PPC: Increase memslots to 320
On Wed, Nov 04, 2015 at 10:03:48AM +0100, Thomas Huth wrote: > Only using 32 memslots for KVM on powerpc is way too low, you can > nowadays hit this limit quite fast by adding a couple of PCI devices > and/or pluggable memory DIMMs to the guest. > x86 already increased the limit to 512 in total, to satisfy 256 > pluggable DIMM slots, 3 private slots and 253 slots for other things > like PCI devices. On powerpc, we only have 32 pluggable DIMMs in I agree with increasing the limit. Is there a reason we have only 32 pluggable DIMMs in QEMU on powerpc, not more? Should we be increasing that limit too? If so, maybe we should increase the number of memory slots to 512? > QEMU, not 256, so we likely do not as much slots as on x86. Thus "so we likely do not need as many slots as on x86" would be better English. > setting the slot limit to 320 sounds like a good value for the > time being (until we have some code in the future to resize the > memslot array dynamically). > And while we're at it, also remove the KVM_MEM_SLOTS_NUM definition > from the powerpc-specific header since this gets defined in the > generic kvm_host.h header anyway. > > Signed-off-by: Thomas Huth

Paul. -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: powerpc/64: Include KVM guest test in all interrupt vectors
On Thu, 2015-12-11 at 05:44:42 UTC, Paul Mackerras wrote: > Currently, if HV KVM is configured but PR KVM isn't, we don't include > a test to see whether we were interrupted in KVM guest context for the > set of interrupts which get delivered directly to the guest by hardware > if they occur in the guest. This includes things like program > interrupts. > > However, the recent bug where userspace could set the MSR for a VCPU > to have an illegal value in the TS field, and thus cause a TM Bad Thing > type of program interrupt on the hrfid that enters the guest, showed that > we can never be completely sure that these interrupts can never occur > in the guest entry/exit code. If one of these interrupts does happen > and we have HV KVM configured but not PR KVM, then we end up trying to > run the handler in the host with the MMU set to the guest MMU context, > which generally ends badly. > > Thus, for robustness it is better to have the test in every interrupt > vector, so that if some way is found to trigger some interrupt in the > guest entry/exit path, we can handle it without immediately crashing > the host. > > This means that the distinction between KVMTEST and KVMTEST_PR goes > away. Thus we delete KVMTEST_PR and associated macros and use KVMTEST > everywhere that we previously used either KVMTEST_PR or KVMTEST. It > also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR, > so we deleted SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead. > > Signed-off-by: Paul Mackerras

Applied to powerpc next, thanks. https://git.kernel.org/powerpc/c/31a40e2b052c0f2b80df7b56 cheers -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8
On 20/11/15 09:11, Thomas Huth wrote: > In the old DABR register, the BT (Breakpoint Translation) bit > is bit number 61. In the new DAWRX register, the WT (Watchpoint > Translation) bit is bit number 59. So to move the DABR-BT bit > into the position of the DAWRX-WT bit, it has to be shifted by > two, not only by one. This fixes hardware watchpoints in gdb of > older guests that only use the H_SET_DABR/X interface instead > of the new H_SET_MODE interface. > > Signed-off-by: Thomas Huth> --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b98889e..3983b87 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -2143,7 +2143,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > > /* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */ > 2: rlwimi r5, r4, 5, DAWRX_DR | DAWRX_DW > - rlwimi r5, r4, 1, DAWRX_WT > + rlwimi r5, r4, 2, DAWRX_WT > clrrdi r4, r4, 3 > std r4, VCPU_DAWR(r3) > std r5, VCPU_DAWRX(r3) Ping? -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH kernel 3/9] KVM: PPC: Rework H_PUT_TCE/H_GET_TCE handlers
On Tue, Sep 15, 2015 at 08:49:33PM +1000, Alexey Kardashevskiy wrote: > This reworks the existing H_PUT_TCE/H_GET_TCE handlers to have one > exit path. This allows next patch to add locks nicely. I don't see a problem with the actual code, but it doesn't seem to match this description: I still see multiple return statements for h_put_tce at least. > This moves the ioba boundaries check to a helper and adds a check for > least bits which have to be zeros. > > The patch is pretty mechanical (only check for least ioba bits is added) > so no change in behaviour is expected. > > Signed-off-by: Alexey Kardashevskiy> --- > arch/powerpc/kvm/book3s_64_vio_hv.c | 102 > +++- > 1 file changed, 66 insertions(+), 36 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c > b/arch/powerpc/kvm/book3s_64_vio_hv.c > index 89e96b3..8ae12ac 100644 > --- a/arch/powerpc/kvm/book3s_64_vio_hv.c > +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c > @@ -35,71 +35,101 @@ > #include > #include > #include > +#include > > #define TCES_PER_PAGE(PAGE_SIZE / sizeof(u64)) > > +/* > + * Finds a TCE table descriptor by LIOBN. > + * > + * WARNING: This will be called in real or virtual mode on HV KVM and virtual > + * mode on PR KVM > + */ > +static struct kvmppc_spapr_tce_table *kvmppc_find_table(struct kvm_vcpu > *vcpu, > + unsigned long liobn) > +{ > + struct kvm *kvm = vcpu->kvm; > + struct kvmppc_spapr_tce_table *stt; > + > + list_for_each_entry_rcu_notrace(stt, >arch.spapr_tce_tables, list) > + if (stt->liobn == liobn) > + return stt; > + > + return NULL; > +} > + > +/* > + * Validates IO address. 
> + * > + * WARNING: This will be called in real-mode on HV KVM and virtual > + * mode on PR KVM > + */ > +static long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt, > + unsigned long ioba, unsigned long npages) > +{ > + unsigned long mask = (1ULL << IOMMU_PAGE_SHIFT_4K) - 1; > + unsigned long idx = ioba >> IOMMU_PAGE_SHIFT_4K; > + unsigned long size = stt->window_size >> IOMMU_PAGE_SHIFT_4K; > + > + if ((ioba & mask) || (size + npages <= idx)) > + return H_PARAMETER; Not sure if it's worth a check for overflow in (size+npages) there. > + > + return H_SUCCESS; > +} > + > /* WARNING: This will be called in real-mode on HV KVM and virtual > * mode on PR KVM > */ > long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn, > unsigned long ioba, unsigned long tce) > { > - struct kvm *kvm = vcpu->kvm; > - struct kvmppc_spapr_tce_table *stt; > + struct kvmppc_spapr_tce_table *stt = kvmppc_find_table(vcpu, liobn); > + long ret = H_TOO_HARD; > + unsigned long idx; > + struct page *page; > + u64 *tbl; > > /* udbg_printf("H_PUT_TCE(): liobn=0x%lx ioba=0x%lx, tce=0x%lx\n", */ > /* liobn, ioba, tce); */ > > - list_for_each_entry(stt, >arch.spapr_tce_tables, list) { > - if (stt->liobn == liobn) { > - unsigned long idx = ioba >> SPAPR_TCE_SHIFT; > - struct page *page; > - u64 *tbl; > + if (!stt) > + return ret; > > - /* udbg_printf("H_PUT_TCE: liobn 0x%lx => stt=%p > window_size=0x%x\n", */ > - /* liobn, stt, stt->window_size); */ > - if (ioba >= stt->window_size) > - return H_PARAMETER; > + ret = kvmppc_ioba_validate(stt, ioba, 1); > + if (ret) > + return ret; > > - page = stt->pages[idx / TCES_PER_PAGE]; > - tbl = (u64 *)page_address(page); > + idx = ioba >> SPAPR_TCE_SHIFT; > + page = stt->pages[idx / TCES_PER_PAGE]; > + tbl = (u64 *)page_address(page); > > - /* FIXME: Need to validate the TCE itself */ > - /* udbg_printf("tce @ %p\n", [idx % > TCES_PER_PAGE]); */ > - tbl[idx % TCES_PER_PAGE] = tce; > - return H_SUCCESS; > - } > - } > + /* FIXME: Need to 
validate the TCE itself */ > + /* udbg_printf("tce @ %p\n", [idx % TCES_PER_PAGE]); */ > + tbl[idx % TCES_PER_PAGE] = tce; > > - /* Didn't find the liobn, punt it to userspace */ > - return H_TOO_HARD; > + return ret; So, this relies on the fact that kvmppc_ioba_validate() would have returned H_SUCCESS some distance above. This seems rather fragile if you insert anything else which alters ret in between. Since this is the success path, I think it's clearer to explicitly return H_SUCCESS. > } > EXPORT_SYMBOL_GPL(kvmppc_h_put_tce); > > long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn, > unsigned long ioba) > { > - struct kvm *kvm = vcpu->kvm; > - struct
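David's last point — that falling through with `ret` relies on a distant `H_SUCCESS` assignment — can be shown with a stripped-down, user-space model of the handler. The names, the constant values, and the index-based ioba below are simplified stand-ins, not the kernel code:

```c
#include <assert.h>
#include <stddef.h>

#define H_SUCCESS   0L
#define H_PARAMETER (-4L)
#define H_TOO_HARD  (-64L)

/* simplified stand-in for kvmppc_ioba_validate() */
long ioba_validate(unsigned long idx, unsigned long nents)
{
    return (idx >= nents) ? H_PARAMETER : H_SUCCESS;
}

long put_tce(unsigned long *tbl, unsigned long nents,
             unsigned long idx, unsigned long tce)
{
    long ret;

    if (!tbl)                    /* no table for this LIOBN: punt */
        return H_TOO_HARD;

    ret = ioba_validate(idx, nents);
    if (ret != H_SUCCESS)
        return ret;

    tbl[idx] = tce;
    return H_SUCCESS;            /* explicit success, per the review */
}
```

Returning `H_SUCCESS` explicitly on the success path means a later patch that reuses `ret` between the validation and the final return cannot silently change the hypercall's result.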
Re: [PATCH kernel 1/9] rcu: Define notrace version of list_for_each_entry_rcu
On Tue, Sep 15, 2015 at 08:49:31PM +1000, Alexey Kardashevskiy wrote: > This defines list_for_each_entry_rcu_notrace and list_entry_rcu_notrace > which use rcu_dereference_raw_notrace instead of rcu_dereference_raw. > This allows using list_for_each_entry_rcu_notrace in real mode (MMU is off). > > Signed-off-by: Alexey KardashevskiyReviewed-by: David Gibson > --- > include/linux/rculist.h | 38 ++ > 1 file changed, 38 insertions(+) > > diff --git a/include/linux/rculist.h b/include/linux/rculist.h > index 17c6b1f..439c4d7 100644 > --- a/include/linux/rculist.h > +++ b/include/linux/rculist.h > @@ -253,6 +253,25 @@ static inline void list_splice_init_rcu(struct list_head > *list, > }) > > /** > + * list_entry_rcu_notrace - get the struct for this entry > + * @ptr:the list_head pointer. > + * @type: the type of the struct this is embedded in. > + * @member: the name of the list_struct within the struct. > + * > + * This primitive may safely run concurrently with the _rcu list-mutation > + * primitives such as list_add_rcu() as long as it's guarded by > rcu_read_lock(). > + * > + * This is the same as list_entry_rcu() except that it does > + * not do any RCU debugging or tracing. > + */ > +#define list_entry_rcu_notrace(ptr, type, member) \ > +({ \ > + typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \ > + container_of((typeof(ptr))rcu_dereference_raw_notrace(__ptr), \ > + type, member); \ > +}) > + > +/** > * Where are list_empty_rcu() and list_first_entry_rcu()? > * > * Implementing those functions following their counterparts list_empty() and > @@ -308,6 +327,25 @@ static inline void list_splice_init_rcu(struct list_head > *list, > pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) > > /** > + * list_for_each_entry_rcu_notrace - iterate over rcu list of given type > + * @pos: the type * to use as a loop cursor. > + * @head:the head for your list. > + * @member: the name of the list_struct within the struct. 
> + * > + * This list-traversal primitive may safely run concurrently with > + * the _rcu list-mutation primitives such as list_add_rcu() > + * as long as the traversal is guarded by rcu_read_lock(). > + * > + * This is the same as list_for_each_entry_rcu() except that it does > + * not do any RCU debugging or tracing. > + */ > +#define list_for_each_entry_rcu_notrace(pos, head, member) \ > + for (pos = list_entry_rcu_notrace((head)->next, typeof(*pos), member); \ > + >member != (head); \ > + pos = list_entry_rcu_notrace(pos->member.next, typeof(*pos), \ > + member)) > + > +/** > * list_for_each_entry_continue_rcu - continue iteration over list of given > type > * @pos: the type * to use as a loop cursor. > * @head:the head for your list. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
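For readers unfamiliar with the intrusive-list style used above: once the RCU debug/tracing hooks are stripped away (which is the whole point of the `_notrace` variants), the traversal reduces to pointer chasing plus `container_of`. A user-space analogue — singly linked for brevity, and with none of the real RCU memory-ordering semantics — looks roughly like this:

```c
#include <assert.h>
#include <stddef.h>

/* user-space analogue of the kernel's intrusive list node */
struct list_head { struct list_head *next; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* hypothetical entry type, standing in for kvmppc_spapr_tce_table */
struct tce_table {
    unsigned long liobn;
    struct list_head list;
};

struct tce_table *find_table(struct list_head *head, unsigned long liobn)
{
    struct list_head *pos;

    /* what list_for_each_entry_rcu_notrace() expands to, minus RCU */
    for (pos = head->next; pos != head; pos = pos->next) {
        struct tce_table *stt = container_of(pos, struct tce_table, list);
        if (stt->liobn == liobn)
            return stt;
    }
    return NULL;
}
```

The `_notrace` variants matter in real mode because the tracing hooks in `rcu_dereference_raw()` may touch vmalloc'd data that is not safe to access with the MMU off.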
Re: [PATCH kernel 4/9] KVM: PPC: Use RCU for arch.spapr_tce_tables
On Tue, Sep 15, 2015 at 08:49:34PM +1000, Alexey Kardashevskiy wrote: > At the moment spapr_tce_tables is not protected against races. This makes > use of RCU-variants of list helpers. As some bits are executed in real > mode, this makes use of just introduced list_for_each_entry_rcu_notrace(). > > This converts release_spapr_tce_table() to a RCU scheduled handler. > > Signed-off-by: Alexey KardashevskiyLooks correct on my limited knowledge of RCU Reviewed-by: David Gibson -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH kernel 2/9] KVM: PPC: Make real_vmalloc_addr() public
On Tue, Sep 15, 2015 at 08:49:32PM +1000, Alexey Kardashevskiy wrote: > This helper translates vmalloc'd addresses to linear addresses. > It is only used by the KVM MMU code now and resides in the HV KVM code. > We will need it further in the TCE code and the DMA memory preregistration > code called in real mode. > > This makes real_vmalloc_addr() public and moves it to the powerpc code as > it does not do anything special for KVM. > > Signed-off-by: Alexey KardashevskiyHmm, I have a couple of small concerns. First, I'm not sure if the name is clear enough for a public function. Second, I'm not sure if mmu-hash64.h is the right place for it. This is still a function with very specific and limited usage, I wonder if we should have somewhere else for such special real mode helpers. Paulus, thoughts? -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [PATCH] kvm: remove unused variable 'vcpu_book3s'
"Geyslan G. Bem"writes: > The vcpu_book3s struct is assigned but never used. So remove it. Just out of interest, how did you find this? Compiler warning? Static analysis? Manual inspection? Thanks in advance! Regards, Daniel > > Signed-off-by: Geyslan G. Bem > --- > arch/powerpc/kvm/book3s_64_mmu.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_64_mmu.c > b/arch/powerpc/kvm/book3s_64_mmu.c > index 774a253..9bf7031 100644 > --- a/arch/powerpc/kvm/book3s_64_mmu.c > +++ b/arch/powerpc/kvm/book3s_64_mmu.c > @@ -377,15 +377,12 @@ no_seg_found: > > static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 > rb) > { > - struct kvmppc_vcpu_book3s *vcpu_book3s; > u64 esid, esid_1t; > int slb_nr; > struct kvmppc_slb *slbe; > > dprintk("KVM MMU: slbmte(0x%llx, 0x%llx)\n", rs, rb); > > - vcpu_book3s = to_book3s(vcpu); > - > esid = GET_ESID(rb); > esid_1t = GET_ESID_1T(rb); > slb_nr = rb & 0xfff; > -- > 2.6.2 > > ___ > Linuxppc-dev mailing list > linuxppc-...@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev signature.asc Description: PGP signature
[PATCH] kvm: remove unused variable 'vcpu_book3s'
The vcpu_book3s struct is assigned but never used. So remove it. Signed-off-by: Geyslan G. Bem--- arch/powerpc/kvm/book3s_64_mmu.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c index 774a253..9bf7031 100644 --- a/arch/powerpc/kvm/book3s_64_mmu.c +++ b/arch/powerpc/kvm/book3s_64_mmu.c @@ -377,15 +377,12 @@ no_seg_found: static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 rb) { - struct kvmppc_vcpu_book3s *vcpu_book3s; u64 esid, esid_1t; int slb_nr; struct kvmppc_slb *slbe; dprintk("KVM MMU: slbmte(0x%llx, 0x%llx)\n", rs, rb); - vcpu_book3s = to_book3s(vcpu); - esid = GET_ESID(rb); esid_1t = GET_ESID_1T(rb); slb_nr = rb & 0xfff; -- 2.6.2 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: remove unused variable 'vcpu_book3s'
2015-12-01 21:34 GMT-03:00 Daniel Axtens: > "Geyslan G. Bem" writes: > >> The vcpu_book3s struct is assigned but never used. So remove it. > > Just out of interest, how did you find this? Compiler warning? Static > analysis? Manual inspection? Sorry, I should have done the patch self contained. I caught it through static analysis (cppcheck). > > Thanks in advance! You're welcome. > > Regards, > Daniel > >> >> Signed-off-by: Geyslan G. Bem >> --- >> arch/powerpc/kvm/book3s_64_mmu.c | 3 --- >> 1 file changed, 3 deletions(-) >> >> diff --git a/arch/powerpc/kvm/book3s_64_mmu.c >> b/arch/powerpc/kvm/book3s_64_mmu.c >> index 774a253..9bf7031 100644 >> --- a/arch/powerpc/kvm/book3s_64_mmu.c >> +++ b/arch/powerpc/kvm/book3s_64_mmu.c >> @@ -377,15 +377,12 @@ no_seg_found: >> >> static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 >> rb) >> { >> - struct kvmppc_vcpu_book3s *vcpu_book3s; >> u64 esid, esid_1t; >> int slb_nr; >> struct kvmppc_slb *slbe; >> >> dprintk("KVM MMU: slbmte(0x%llx, 0x%llx)\n", rs, rb); >> >> - vcpu_book3s = to_book3s(vcpu); >> - >> esid = GET_ESID(rb); >> esid_1t = GET_ESID_1T(rb); >> slb_nr = rb & 0xfff; >> -- >> 2.6.2 >> >> ___ >> Linuxppc-dev mailing list >> linuxppc-...@lists.ozlabs.org >> https://lists.ozlabs.org/listinfo/linuxppc-dev -- Regards, Geyslan G. Bem hackingbits.com -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC] kvm - possible out of bounds
Hello, I have found a possible out-of-bounds read in arch/powerpc/kvm/book3s_64_mmu.c (the kvmppc_mmu_book3s_64_xlate function). The pteg[] array could be accessed twice using the i variable after the for loop has finished. What happens is that in the last iteration the i index is incremented to 16, the condition (i < 16) is checked, and the loop exits.

277:    for (i=0; i<16; i+=2) { ...

Later the last elements of pteg are read, but again using the already-incremented i (16):

303:    v = be64_to_cpu(pteg[i]);   /* pteg[16] */
304:    r = be64_to_cpu(pteg[i+1]); /* pteg[17] */

I really don't know if the for loop will somehow iterate until i is 16; in any case I think the final reads should use a defined maximum length/index or some other clearer method. E.g.:

v = be64_to_cpu(pteg[PTEG_LEN - 2]);
r = be64_to_cpu(pteg[PTEG_LEN - 1]);

Or just:

v = be64_to_cpu(pteg[14]);
r = be64_to_cpu(pteg[15]);

I also found a variable in the same file that is assigned but never used:

380:    struct kvmppc_vcpu_book3s *vcpu_book3s;
...
387:    vcpu_book3s = to_book3s(vcpu);

A question: is the kvmppc_mmu_book3s_64_init function called in some unconventional way? I have not found any call to it. If anything I wrote here is correct, please tell me whether I could send a patch. -- Regards, Geyslan G. Bem hackingbits.com
Re: [RFC] kvm - possible out of bounds
On Sun, Nov 29, 2015 at 05:14:03PM -0300, Geyslan Gregório Bem wrote: > Hello, > > I have found a possible out of bounds reading in > arch/powerpc/kvm/book3s_64_mmu.c (kvmppc_mmu_book3s_64_xlate > function). pteg[] array could be accessed twice using the i variable > after the for iteration. What happens is that in the last iteration > the i index is incremented to 16, checked (i<16) then confirmed > exiting the loop. > > 277for (i=0; i<16; i+=2) { ... > > Later there are reading attempts to the pteg last elements, but using > again the already incremented i (16). > > 303v = be64_to_cpu(pteg[i]); /* pteg[16] */ > 304r = be64_to_cpu(pteg[i+1]); /* pteg[17] */ Was it some automated tool that came up with this? There is actually no problem because the accesses outside the loop are only done if the 'found' variable is true; 'found' is initialized to false and only ever set to true inside the loop just before a break statement. Thus there is a correlation between the value of 'i' and the value of 'found' -- if 'found' is true then we know 'i' is less than 16. > I really don't know if the for lace will somehow iterate until i is > 16, anyway I think that the last readings must be using a defined max > len/index or another more clear method. I think it's perfectly clear to a human programmer, though some tools (such as gcc) struggle with this kind of correlation between variables. That's why I asked whether your report was based on the output from some tool. > Eg. > > v = be64_to_cpu(pteg[PTEG_LEN - 2]); > r = be64_to_cpu(pteg[PTEG_LEN - 1]); > > Or just. > > v = be64_to_cpu(pteg[14]); > r = be64_to_cpu(pteg[15]); Either of those options would cause the code to malfunction. > I found in the same file a variable that is not used. > > 380struct kvmppc_vcpu_book3s *vcpu_book3s; > ... > 387vcpu_book3s = to_book3s(vcpu); True. It could be removed. > A question, the kvmppc_mmu_book3s_64_init function is accessed by > unconventional way? 
Because I have not found any calling to it. Try arch/powerpc/kvm/book3s_pr.c line 410: kvmppc_mmu_book3s_64_init(vcpu); Grep (or git grep) is your friend. Paul. -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
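Paul's argument — that the post-loop reads are guarded by `found`, so `i` is provably less than 16 whenever they execute — can be captured in a minimal model of the loop (the real function reads two PTE doublewords; this hypothetical version just returns the index):

```c
#include <assert.h>
#include <stdbool.h>

/* minimal model of the loop in kvmppc_mmu_book3s_64_xlate(): any use
 * of i after the loop is guarded by 'found', so i <= 14 on that path */
int find_even_match(const unsigned long pteg[16], unsigned long key)
{
    bool found = false;
    int i;

    for (i = 0; i < 16; i += 2) {
        if (pteg[i] == key) {
            found = true;
            break;          /* i is still in range here */
        }
    }
    if (!found)
        return -1;          /* the only path on which i == 16 */
    return i;               /* guarded use: never out of bounds */
}
```

This correlation between `found` and `i` is exactly what trips up tools like cppcheck, which reason about each variable's range independently.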
Re: [RFC] kvm - possible out of bounds
2015-11-29 18:33 GMT-03:00 Paul Mackerras: > On Sun, Nov 29, 2015 at 05:14:03PM -0300, Geyslan Gregório Bem wrote: >> Hello, >> >> I have found a possible out of bounds reading in >> arch/powerpc/kvm/book3s_64_mmu.c (kvmppc_mmu_book3s_64_xlate >> function). pteg[] array could be accessed twice using the i variable >> after the for iteration. What happens is that in the last iteration >> the i index is incremented to 16, checked (i<16) then confirmed >> exiting the loop. >> >> 277for (i=0; i<16; i+=2) { ... >> >> Later there are reading attempts to the pteg last elements, but using >> again the already incremented i (16). >> >> 303v = be64_to_cpu(pteg[i]); /* pteg[16] */ >> 304r = be64_to_cpu(pteg[i+1]); /* pteg[17] */ > > Was it some automated tool that came up with this? Yep, cppcheck. As I'm still not engaged to a specific area in the kernel, just trying to help with automated catches. > > There is actually no problem because the accesses outside the loop are > only done if the 'found' variable is true; 'found' is initialized to > false and only ever set to true inside the loop just before a break > statement. Thus there is a correlation between the value of 'i' and > the value of 'found' -- if 'found' is true then we know 'i' is less > than 16. I figured it out after your explanation. > >> I really don't know if the for lace will somehow iterate until i is >> 16, anyway I think that the last readings must be using a defined max >> len/index or another more clear method. > > I think it's perfectly clear to a human programmer, though some tools > (such as gcc) struggle with this kind of correlation between > variables. That's why I asked whether your report was based on the > output from some tool. > >> Eg. >> >> v = be64_to_cpu(pteg[PTEG_LEN - 2]); >> r = be64_to_cpu(pteg[PTEG_LEN - 1]); >> >> Or just. >> >> v = be64_to_cpu(pteg[14]); >> r = be64_to_cpu(pteg[15]); > > Either of those options would cause the code to malfunction. 
Yep, I understand now that v and r get the found ones. So i is needed. > >> I found in the same file a variable that is not used. >> >> 380struct kvmppc_vcpu_book3s *vcpu_book3s; >> ... >> 387vcpu_book3s = to_book3s(vcpu); > > True. It could be removed. I'll make a patch for that. > >> A question, the kvmppc_mmu_book3s_64_init function is accessed by >> unconventional way? Because I have not found any calling to it. > > Try arch/powerpc/kvm/book3s_pr.c line 410: > > kvmppc_mmu_book3s_64_init(vcpu); > > Grep (or git grep) is your friend. Hmm, indeed. > > Paul. Thank you, Paul. If you have some other changes in progress let me know. -- Regards, Geyslan G. Bem hackingbits.com -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH kernel 5/9] KVM: PPC: Account TCE-containing pages in locked_vm
On Tue, Sep 15, 2015 at 08:49:35PM +1000, Alexey Kardashevskiy wrote: > At the moment pages used for TCE tables (in addition to pages addressed > by TCEs) are not counted in locked_vm counter so a malicious userspace > tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as RLIMIT_NOFILE and > lock a lot of memory. > > This adds counting for pages used for TCE tables. > > This counts the number of pages required for a table plus pages for > the kvmppc_spapr_tce_table struct (TCE table descriptor) itself. > > This does not change the amount of (de)allocated memory. > > Signed-off-by: Alexey Kardashevskiy> --- > arch/powerpc/kvm/book3s_64_vio.c | 51 > +++- > 1 file changed, 50 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_64_vio.c > b/arch/powerpc/kvm/book3s_64_vio.c > index 9526c34..b70787d 100644 > --- a/arch/powerpc/kvm/book3s_64_vio.c > +++ b/arch/powerpc/kvm/book3s_64_vio.c > @@ -45,13 +45,56 @@ static long kvmppc_stt_npages(unsigned long window_size) >* sizeof(u64), PAGE_SIZE) / PAGE_SIZE; > } > > +static long kvmppc_account_memlimit(long npages, bool inc) > +{ > + long ret = 0; > + const long bytes = sizeof(struct kvmppc_spapr_tce_table) + > + (abs(npages) * sizeof(struct page *)); Why abs(npages)? Can npages be negative? If so, what does that mean? Paul. -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH kernel 5/9] KVM: PPC: Account TCE-containing pages in locked_vm
On 11/30/2015 01:06 PM, Paul Mackerras wrote: On Tue, Sep 15, 2015 at 08:49:35PM +1000, Alexey Kardashevskiy wrote: At the moment pages used for TCE tables (in addition to pages addressed by TCEs) are not counted in locked_vm counter so a malicious userspace tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as RLIMIT_NOFILE and lock a lot of memory. This adds counting for pages used for TCE tables. This counts the number of pages required for a table plus pages for the kvmppc_spapr_tce_table struct (TCE table descriptor) itself. This does not change the amount of (de)allocated memory. Signed-off-by: Alexey Kardashevskiy--- arch/powerpc/kvm/book3s_64_vio.c | 51 +++- 1 file changed, 50 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c index 9526c34..b70787d 100644 --- a/arch/powerpc/kvm/book3s_64_vio.c +++ b/arch/powerpc/kvm/book3s_64_vio.c @@ -45,13 +45,56 @@ static long kvmppc_stt_npages(unsigned long window_size) * sizeof(u64), PAGE_SIZE) / PAGE_SIZE; } +static long kvmppc_account_memlimit(long npages, bool inc) +{ + long ret = 0; + const long bytes = sizeof(struct kvmppc_spapr_tce_table) + + (abs(npages) * sizeof(struct page *)); Why abs(npages)? Can npages be negative? If so, what does that mean? Leftover from older versions when there was one shared account_memlimit(long npages). It does not make sense here, I need to remove it. -- Alexey -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
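With the `abs()` dropped as agreed above, the size computation being accounted against locked_vm comes down to something like the following user-space sketch (the struct layout is a stand-in, not the kernel's `kvmppc_spapr_tce_table`):

```c
#include <assert.h>
#include <stddef.h>

struct page;    /* opaque; only the pointer size matters here */

/* stand-in for the TCE table descriptor */
struct tce_table_model {
    void *pages;
    long window_size;
};

/* bytes to charge: descriptor plus the page-pointer array;
 * npages is unsigned, so no abs() is needed */
unsigned long tce_table_bytes(unsigned long npages)
{
    return sizeof(struct tce_table_model) +
           npages * sizeof(struct page *);
}
```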
Re: [PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8
On Fri, 20 Nov 2015 09:11:45 +0100 Thomas Huth wrote: > In the old DABR register, the BT (Breakpoint Translation) bit > is bit number 61. In the new DAWRX register, the WT (Watchpoint > Translation) bit is bit number 59. So to move the DABR-BT bit > into the position of the DAWRX-WT bit, it has to be shifted by > two, not only by one. This fixes hardware watchpoints in gdb of > older guests that only use the H_SET_DABR/X interface instead > of the new H_SET_MODE interface. > > Signed-off-by: Thomas Huth Reviewed-by: David Gibson > --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b98889e..3983b87 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -2143,7 +2143,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > > /* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */ > 2: rlwimi r5, r4, 5, DAWRX_DR | DAWRX_DW > - rlwimi r5, r4, 1, DAWRX_WT > + rlwimi r5, r4, 2, DAWRX_WT > clrrdi r4, r4, 3 > std r4, VCPU_DAWR(r3) > std r5, VCPU_DAWRX(r3) > -- > 1.8.3.1 > -- David Gibson Senior Software Engineer, Virtualization, Red Hat
Re: [PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8
On 20/11/2015 09:11, Thomas Huth wrote: > In the old DABR register, the BT (Breakpoint Translation) bit > is bit number 61. In the new DAWRX register, the WT (Watchpoint > Translation) bit is bit number 59. So to move the DABR-BT bit > into the position of the DAWRX-WT bit, it has to be shifted by > two, not only by one. This fixes hardware watchpoints in gdb of > older guests that only use the H_SET_DABR/X interface instead > of the new H_SET_MODE interface. > > Signed-off-by: Thomas Huth> --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b98889e..3983b87 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -2143,7 +2143,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > > /* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */ > 2: rlwimi r5, r4, 5, DAWRX_DR | DAWRX_DW > - rlwimi r5, r4, 1, DAWRX_WT > + rlwimi r5, r4, 2, DAWRX_WT > clrrdi r4, r4, 3 > std r4, VCPU_DAWR(r3) > std r5, VCPU_DAWRX(r3) > Nice catch. Reviewed-by: Laurent Vivier -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8
In the old DABR register, the BT (Breakpoint Translation) bit is bit number 61. In the new DAWRX register, the WT (Watchpoint Translation) bit is bit number 59. So to move the DABR-BT bit into the position of the DAWRX-WT bit, it has to be shifted by two, not only by one. This fixes hardware watchpoints in gdb of older guests that only use the H_SET_DABR/X interface instead of the new H_SET_MODE interface. Signed-off-by: Thomas Huth--- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index b98889e..3983b87 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -2143,7 +2143,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) /* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */ 2: rlwimi r5, r4, 5, DAWRX_DR | DAWRX_DW - rlwimi r5, r4, 1, DAWRX_WT + rlwimi r5, r4, 2, DAWRX_WT clrrdi r4, r4, 3 std r4, VCPU_DAWR(r3) std r5, VCPU_DAWRX(r3) -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
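The arithmetic behind this one-line fix can be checked with a small model using IBM bit numbering (bit 0 is the most significant bit of the 64-bit register). This isolates just the BT-to-WT move and ignores the other DAWRX bits that the real `rlwimi` sequence also sets:

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit n counts down from the MSB */
#define BIT64(n)  (1ULL << (63 - (n)))

#define DABR_BT   BIT64(61)   /* Breakpoint Translation, old register */
#define DAWRX_WT  BIT64(59)   /* Watchpoint Translation, new register */

/* moving a flag from IBM bit 61 to IBM bit 59 is a left shift by 2 */
uint64_t dabr_bt_to_dawrx_wt(uint64_t dabr)
{
    return (dabr & DABR_BT) << 2;
}
```

A shift by 1, as in the buggy code, would have landed the flag on IBM bit 60 instead of the WT position.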
[PATCH 4/4] KVM: Use common function for VCPU lookup by id
From: David HildenbrandLet's reuse the new common function for VPCU lookup by id. Reviewed-by: Christian Borntraeger Reviewed-by: Dominik Dingel Signed-off-by: David Hildenbrand Signed-off-by: Christian Borntraeger [split out the new function into a separate patch] --- arch/powerpc/kvm/book3s_hv.c | 10 ++ arch/s390/kvm/diag.c | 11 +++ virt/kvm/kvm_main.c | 12 +--- 3 files changed, 10 insertions(+), 23 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 54b45b7..904b3b0 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -308,16 +308,10 @@ static void kvmppc_dump_regs(struct kvm_vcpu *vcpu) static struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id) { - int r; - struct kvm_vcpu *v, *ret = NULL; + struct kvm_vcpu *ret; mutex_lock(>lock); - kvm_for_each_vcpu(r, v, kvm) { - if (v->vcpu_id == id) { - ret = v; - break; - } - } + ret = kvm_lookup_vcpu(kvm, id); mutex_unlock(>lock); return ret; } diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c index 5fbfb88..aaa7cc0 100644 --- a/arch/s390/kvm/diag.c +++ b/arch/s390/kvm/diag.c @@ -155,10 +155,8 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu) static int __diag_time_slice_end_directed(struct kvm_vcpu *vcpu) { - struct kvm *kvm = vcpu->kvm; struct kvm_vcpu *tcpu; int tid; - int i; tid = vcpu->run->s.regs.gprs[(vcpu->arch.sie_block->ipa & 0xf0) >> 4]; vcpu->stat.diagnose_9c++; @@ -167,12 +165,9 @@ static int __diag_time_slice_end_directed(struct kvm_vcpu *vcpu) if (tid == vcpu->vcpu_id) return 0; - kvm_for_each_vcpu(i, tcpu, kvm) - if (tcpu->vcpu_id == tid) { - kvm_vcpu_yield_to(tcpu); - break; - } - + tcpu = kvm_lookup_vcpu(vcpu->kvm, tid); + if (tcpu) + kvm_vcpu_yield_to(tcpu); return 0; } diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 484079e..244c9b4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2257,7 +2257,7 @@ static int create_vcpu_fd(struct kvm_vcpu *vcpu) static int 
kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) { int r; - struct kvm_vcpu *vcpu, *v; + struct kvm_vcpu *vcpu; if (id >= KVM_MAX_VCPUS) return -EINVAL; @@ -2281,12 +2281,10 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) r = -EINVAL; goto unlock_vcpu_destroy; } - - kvm_for_each_vcpu(r, v, kvm) - if (v->vcpu_id == id) { - r = -EEXIST; - goto unlock_vcpu_destroy; - } + if (kvm_lookup_vcpu(kvm, id)) { + r = -EEXIST; + goto unlock_vcpu_destroy; + } BUG_ON(kvm->vcpus[atomic_read(>online_vcpus)]); -- 2.3.0 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/4] KVM: s390: fix wrong lookup of VCPUs by array index
From: David HildenbrandFor now, VCPUs were always created sequentially with incrementing VCPU ids. Therefore, the index in the VCPUs array matched the id. As sequential creation might change with cpu hotplug, let's use the correct lookup function to find a VCPU by id, not array index. Reviewed-by: Christian Borntraeger Signed-off-by: David Hildenbrand Signed-off-by: Christian Borntraeger [split stable/non-stable parts] Cc: sta...@vger.kernel.org # c3853a8: KVM: Provide function for VCPU lookup by id --- arch/s390/kvm/sigp.c | 8 ++-- 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c index da690b6..081dbd0 100644 --- a/arch/s390/kvm/sigp.c +++ b/arch/s390/kvm/sigp.c @@ -291,12 +291,8 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code, u16 cpu_addr, u32 parameter, u64 *status_reg) { int rc; - struct kvm_vcpu *dst_vcpu; + struct kvm_vcpu *dst_vcpu = kvm_lookup_vcpu(vcpu->kvm, cpu_addr); - if (cpu_addr >= KVM_MAX_VCPUS) - return SIGP_CC_NOT_OPERATIONAL; - - dst_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr); if (!dst_vcpu) return SIGP_CC_NOT_OPERATIONAL; @@ -478,7 +474,7 @@ int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu) trace_kvm_s390_handle_sigp_pei(vcpu, order_code, cpu_addr); if (order_code == SIGP_EXTERNAL_CALL) { - dest_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr); + dest_vcpu = kvm_lookup_vcpu(vcpu->kvm, cpu_addr); BUG_ON(dest_vcpu == NULL); kvm_s390_vcpu_wakeup(dest_vcpu); -- 2.3.0 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/4] KVM: Provide function for VCPU lookup by id
From: David HildenbrandLet's provide a function to lookup a VCPU by id. Reviewed-by: Christian Borntraeger Reviewed-by: Dominik Dingel Signed-off-by: David Hildenbrand Signed-off-by: Christian Borntraeger [split patch from refactoring patch] --- include/linux/kvm_host.h | 11 +++ 1 file changed, 11 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 5706a21..b9f345f 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -460,6 +460,17 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i) (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \ idx++) +static inline struct kvm_vcpu *kvm_lookup_vcpu(struct kvm *kvm, int id) +{ + struct kvm_vcpu *vcpu; + int i; + + kvm_for_each_vcpu(i, vcpu, kvm) + if (vcpu->vcpu_id == id) + return vcpu; + return NULL; +} + #define kvm_for_each_memslot(memslot, slots) \ for (memslot = >memslots[0]; \ memslot < slots->memslots + KVM_MEM_SLOTS_NUM && memslot->npages;\ -- 2.3.0 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/4] Preview: vcpu lookup by id changes
Paolo, some more patches for review before I add them in my next pull request. Can you review the common code changes and Ack/Nack? Alex, Paul, can you review the power changes and Ack/Nack? As I want to have patch 2 in 4.4, I split up David's patch into smaller chunks. David, can you double check that I did not mess up? So I want to send patch 1 and patch 2 for 4.4. Patch 3 and Patch 4 are for kvm/next (4.5), but need patch 1 as well. So I will probably also add all patches into kvm/next by setting up the git branches in a way that during the 4.5 merge window kvm/next will have my branch that will go via kvm/master as common ancestor. Ok? Christian David Hildenbrand (4): KVM: Provide function for VCPU lookup by id KVM: s390: fix wrong lookup of VCPUs by array index KVM: use heuristic for fast VCPU lookup by id KVM: Use common function for VCPU lookup by id arch/powerpc/kvm/book3s_hv.c | 10 ++ arch/s390/kvm/diag.c | 11 +++ arch/s390/kvm/sigp.c | 8 ++-- include/linux/kvm_host.h | 16 virt/kvm/kvm_main.c | 12 +--- 5 files changed, 28 insertions(+), 29 deletions(-) -- 2.3.0
[PATCH 3/4] KVM: use heuristic for fast VCPU lookup by id
From: David HildenbrandUsually, VCPU ids match the array index. So let's try a fast lookup first before falling back to the slow iteration. Suggested-by: Christian Borntraeger Reviewed-by: Dominik Dingel Reviewed-by: Christian Borntraeger Signed-off-by: David Hildenbrand Signed-off-by: Christian Borntraeger --- include/linux/kvm_host.h | 5 + 1 file changed, 5 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index b9f345f..7821137 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -465,6 +465,11 @@ static inline struct kvm_vcpu *kvm_lookup_vcpu(struct kvm *kvm, int id) struct kvm_vcpu *vcpu; int i; + if (id < 0 || id >= KVM_MAX_VCPUS) + return NULL; + vcpu = kvm_get_vcpu(kvm, id); + if (vcpu && vcpu->vcpu_id == id) + return vcpu; kvm_for_each_vcpu(i, vcpu, kvm) if (vcpu->vcpu_id == id) return vcpu; -- 2.3.0 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
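The heuristic is easy to model in user space: try the array slot matching the id first, and fall back to a full scan only when (e.g. after hotplug) the ids no longer line up with the indices. The types below are hypothetical stand-ins, not the kernel's `kvm_get_vcpu`/`kvm_for_each_vcpu` machinery:

```c
#include <assert.h>
#include <stddef.h>

struct vcpu_model { int vcpu_id; };

struct vcpu_model *lookup_vcpu(struct vcpu_model **vcpus, int n, int id)
{
    int i;

    /* fast path: ids usually match the array index */
    if (id >= 0 && id < n && vcpus[id] && vcpus[id]->vcpu_id == id)
        return vcpus[id];

    /* slow path: scan every slot */
    for (i = 0; i < n; i++)
        if (vcpus[i] && vcpus[i]->vcpu_id == id)
            return vcpus[i];
    return NULL;
}
```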
Re: [PATCH 0/4] Preview: vcpu lookup by id changes
On 19/11/2015 09:37, Christian Borntraeger wrote: > > some more patches for review before I add them in my next pull request. > Can you review the common code changes and Ack/Nack? > > Alex, Paul, > can you review the power changes and Ack/Nack? > > As I want to have patch 2 in 4.4, I splitted up Davids patch into smaller > chunks. David, can you double check that I did not mess up? > So I want to send patch 1 and patch 2 for 4.4. Patch 3 and Patch 4 > are for kvm/next (4.5), but need patch 1 as well. > So I will probably also add all patches into kvm/next by setting up > the git branches in a way that during 4.5 merge window kvm/next will > have my branch that will go via kvm/master as common ancestor. > > Ok? Patches and plan both look good! I can apply patch 1 and 2 now to kvm/master. By the time kvm/next forks from Linus's tree (sometime next week, since kvm/queue is already largish) they will be in. Thanks,
Re: [PATCH 0/4] Preview: vcpu lookup by id changes
On 11/19/2015 10:38 AM, Paolo Bonzini wrote: > > > On 19/11/2015 09:37, Christian Borntraeger wrote: >> >> some more patches for review before I add them in my next pull request. >> Can you review the common code changes and Ack/Nack? >> >> Alex, Paul, >> can you review the power changes and Ack/Nack? >> >> As I want to have patch 2 in 4.4, I split up David's patch into smaller >> chunks. David, can you double check that I did not mess up? >> So I want to send patch 1 and patch 2 for 4.4. Patch 3 and Patch 4 >> are for kvm/next (4.5), but need patch 1 as well. >> So I will probably also add all patches into kvm/next by setting up >> the git branches in a way that during 4.5 merge window kvm/next will >> have my branch that will go via kvm/master as common ancestor. >> >> Ok? > > Patches and plan both look good! > > I can apply patch 1 and 2 now to kvm/master. By the time kvm/next forks > from Linus's tree (sometime next week, since kvm/queue is already > largish) they will be in. I have 2 or 3 more patches for 4.4 and I will prepare a pull request today. So either take them or wait for my request. Whatever works best for you.
Re: [PATCH 0/4] Preview: vcpu lookup by id changes
On 19/11/2015 10:51, Christian Borntraeger wrote: > > I can apply patch 1 and 2 now to kvm/master. By the time kvm/next forks > > from Linus's tree (sometime next week, since kvm/queue is already > > largish) they will be in. > > I have 2 or 3 more patches for 4.4 and I will prepare a pull request today. So > either take them or wait for my request. Whatever works best for you. I'll wait then. ARM folks also have a couple patches for me. :) Paolo
Re: [PATCH 2/4] KVM: s390: fix wrong lookup of VCPUs by array index
Sigh. Seems that my mail script got confused by the # after stable. So please strip down the cc list. On 11/19/2015 09:37 AM, Christian Borntraeger wrote: > From: David Hildenbrand> > For now, VCPUs were always created sequentially with incrementing > VCPU ids. Therefore, the index in the VCPUs array matched the id. > > As sequential creation might change with cpu hotplug, let's use > the correct lookup function to find a VCPU by id, not array index. > > Reviewed-by: Christian Borntraeger > Signed-off-by: David Hildenbrand > Signed-off-by: Christian Borntraeger > [split stable/non-stable parts] > Cc: sta...@vger.kernel.org # c3853a8: KVM: Provide function for VCPU lookup > by id > --- > arch/s390/kvm/sigp.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c > index da690b6..081dbd0 100644 > --- a/arch/s390/kvm/sigp.c > +++ b/arch/s390/kvm/sigp.c > @@ -291,12 +291,8 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 > order_code, > u16 cpu_addr, u32 parameter, u64 *status_reg) > { > int rc; > - struct kvm_vcpu *dst_vcpu; > + struct kvm_vcpu *dst_vcpu = kvm_lookup_vcpu(vcpu->kvm, cpu_addr); > > - if (cpu_addr >= KVM_MAX_VCPUS) > - return SIGP_CC_NOT_OPERATIONAL; > - > - dst_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr); > if (!dst_vcpu) > return SIGP_CC_NOT_OPERATIONAL; > > @@ -478,7 +474,7 @@ int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu) > trace_kvm_s390_handle_sigp_pei(vcpu, order_code, cpu_addr); > > if (order_code == SIGP_EXTERNAL_CALL) { > - dest_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr); > + dest_vcpu = kvm_lookup_vcpu(vcpu->kvm, cpu_addr); > BUG_ON(dest_vcpu == NULL); > > kvm_s390_vcpu_wakeup(dest_vcpu); >
Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id
On 19/11/15 09:37, Christian Borntraeger wrote: > From: David Hildenbrand> > Let's provide a function to lookup a VCPU by id. > > Reviewed-by: Christian Borntraeger > Reviewed-by: Dominik Dingel > Signed-off-by: David Hildenbrand > Signed-off-by: Christian Borntraeger > [split patch from refactoring patch] > --- > include/linux/kvm_host.h | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h > index 5706a21..b9f345f 100644 > --- a/include/linux/kvm_host.h > +++ b/include/linux/kvm_host.h > @@ -460,6 +460,17 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm > *kvm, int i) >(vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \ >idx++) > > +static inline struct kvm_vcpu *kvm_lookup_vcpu(struct kvm *kvm, int id) > +{ > + struct kvm_vcpu *vcpu; > + int i; > + > + kvm_for_each_vcpu(i, vcpu, kvm) > + if (vcpu->vcpu_id == id) > + return vcpu; > + return NULL; > +} > + Any chance that you name this function differently? Otherwise we've got two functions that sound very similar and also have similar prototypes: - kvm_get_vcpu(struct kvm *kvm, int i) - kvm_lookup_vcpu(struct kvm *kvm, int id) I'm pretty sure this will cause confusion in the future! ==> Could you maybe name the new function something like "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead? Also a short comment before the function would be nice to make the reader aware of the difference (e.g. when hot-plugging) between the vcpu-id and array-id. Otherwise: +1 for finally having a proper common function for this! Thomas
Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id
> > Any chance that you name this function differently? Otherwise we've got > two functions that sound very similar and also have similar prototypes: > > - kvm_get_vcpu(struct kvm *kvm, int i) > - kvm_lookup_vcpu(struct kvm *kvm, int id) > > I'm pretty sure this will cause confusion in the future! > ==> Could you maybe name the new function something like > "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead? Had that in a previous version but decided to name it that way ... hmm ... other opinions? > > Also a short comment before the function would be nice to make the > reader aware of the difference (e.g. when hot-plugging) between the > vcpu-id and array-id. Also not sure as the header directly includes the (for my eyes ;) ) easy code. I would agree for a pure prototype. Other opinions? > > Otherwise: +1 for finally having a proper common function for this! > Thanks for having a look Thomas! David
Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id
On 19/11/2015 13:55, David Hildenbrand wrote: >> > >> > I'm pretty sure this will cause confusion in the future! >> > ==> Could you maybe name the new function something like >> > "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead? > Had that in a previous version but decided to name it that way ... hmm ... > other opinions? Having _by_id at the end of the name makes sense indeed. Paolo
Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id
On 11/19/2015 02:13 PM, Paolo Bonzini wrote: > > > On 19/11/2015 13:55, David Hildenbrand wrote: I'm pretty sure this will cause confusion in the future! ==> Could you maybe name the new function something like "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead? >> Had that in a previous version but decided to name it that way ... hmm ... >> other opinions? > > Having _by_id at the end of the name makes sense indeed. David, I will fix up the function name myself to kvm_get_vcpu_by_id. Ok?
Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id
> On 11/19/2015 02:13 PM, Paolo Bonzini wrote: > > > > > > On 19/11/2015 13:55, David Hildenbrand wrote: > > I'm pretty sure this will cause confusion in the future! > ==> Could you maybe name the new function something like > "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead? > >> Had that in a previous version but decided to name it that way ... hmm ... > >> other opinions? > > > > Having _by_id at the end of the name makes sense indeed. > > David, > I will fix up the function name myself to kvm_get_vcpu_by_id. Ok? Sure, thanks a lot! David
Re: [PATCH 0/7] kvmtool: Cleanup kernel loading
On Wed, Nov 18, 2015 at 10:29:30AM +, Andre Przywara wrote: > On 02/11/15 14:58, Will Deacon wrote: > > On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote: > >> this series cleans up kvmtool's kernel loading functionality a bit. > >> It has been broken out of a previous series I sent [1] and contains > >> just the cleanup and bug fix parts, which should be less controversial > >> and thus easier to merge ;-) > >> I will resend the pipe loading part later on as a separate series. > >> > >> The first patch properly abstracts kernel loading to move > >> responsibility into each architecture's code. It removes quite some > >> ugly code from the generic kvm.c file. > >> The later patches address the naive usage of read(2) to, well, read > >> data from files. Doing this without coping with the subtleties of > >> the UNIX read semantics (returning with less or no data read is not > >> an error) can provoke hard to debug failures. > >> So these patches make use of the existing and one new wrapper function > >> to make sure we read everything we actually wanted to. > >> The last patch moves the ARM kernel loading code into the proper > >> location to be in line with the other architectures. > >> > >> Please have a look and give some comments! > > > > Looks good to me, but I'd like to see some comments from some mips/ppc/x86 > > people on the changes you're making over there. > > Sounds reasonable, but no answers yet. > > Can you take at least patch 1 and 2 meanwhile, preferably 6 and 7 (the > ARM parts) also if you are OK with it? > I have other patches that depend on 1/7 and 2/7, so having them upstream > would help me and reduce further dependency churn. > I am happy to resend the remaining patches for further discussion later. We let them sit on the list for a while with no comments, so I just pushed out your series. If a bug report shows up, then we can always revert the offending patch if there's no quick fix. 
Will
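The read(2) pitfall this series fixes — a short read, or a read interrupted by a signal, is not an error — is conventionally handled with a retry loop. Below is a minimal sketch of such a wrapper under the usual idiom; the name and exact behavior of kvmtool's actual helper may differ:

```c
#include <errno.h>
#include <unistd.h>

/* Keep calling read(2) until 'count' bytes have arrived, the stream hits
 * EOF, or a real error occurs.  Returns the number of bytes read (which
 * is less than 'count' only on EOF), or -1 on error.  This is a sketch
 * of the common idiom, not kvmtool's exact implementation. */
static ssize_t read_in_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t total = 0;

    while (total < count) {
        ssize_t n = read(fd, p + total, count - total);

        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* genuine I/O error */
        }
        if (n == 0)
            break;          /* EOF: caller sees a short count */
        total += n;
    }
    return total;
}
```

Callers can then treat any return value other than the requested count (or the expected file size) as a hard failure, instead of silently loading a truncated kernel image.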
Re: [PATCH] KVM: powerpc: kvmppc_visible_gpa can be boolean
On 16/11/2015 04:10, Yaowei Bai wrote: > In another patch kvm_is_visible_gfn is made to return bool, since this > function only returns zero or one as its return value; let's also make > kvmppc_visible_gpa return bool to be consistent. > > No functional change. > > Signed-off-by: Yaowei Bai> --- > arch/powerpc/kvm/book3s_pr.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c > index 64891b0..70fb08d 100644 > --- a/arch/powerpc/kvm/book3s_pr.c > +++ b/arch/powerpc/kvm/book3s_pr.c > @@ -512,7 +512,7 @@ static void kvmppc_patch_dcbz(struct kvm_vcpu *vcpu, > struct kvmppc_pte *pte) > put_page(hpage); > } > > -static int kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa) > +static bool kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa) > { > ulong mp_pa = vcpu->arch.magic_page_pa; > > @@ -521,7 +521,7 @@ static int kvmppc_visible_gpa(struct kvm_vcpu *vcpu, > gpa_t gpa) > { > gpa &= ~0xFFFULL; > if (unlikely(mp_pa) && unlikely((mp_pa & KVM_PAM) == (gpa & KVM_PAM))) { > - return 1; > + return true; > } > > return kvm_is_visible_gfn(vcpu->kvm, gpa >> PAGE_SHIFT); > Applied all three patches to kvm/queue, thanks. Paolo
Re: [PATCH 0/7] kvmtool: Cleanup kernel loading
Hi Will, On 02/11/15 14:58, Will Deacon wrote: > On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote: >> Hi, > > Hello Andre, > >> this series cleans up kvmtool's kernel loading functionality a bit. >> It has been broken out of a previous series I sent [1] and contains >> just the cleanup and bug fix parts, which should be less controversial >> and thus easier to merge ;-) >> I will resend the pipe loading part later on as a separate series. >> >> The first patch properly abstracts kernel loading to move >> responsibility into each architecture's code. It removes quite some >> ugly code from the generic kvm.c file. >> The later patches address the naive usage of read(2) to, well, read >> data from files. Doing this without coping with the subtleties of >> the UNIX read semantics (returning with less or no data read is not >> an error) can provoke hard to debug failures. >> So these patches make use of the existing and one new wrapper function >> to make sure we read everything we actually wanted to. >> The last patch moves the ARM kernel loading code into the proper >> location to be in line with the other architectures. >> >> Please have a look and give some comments! > > Looks good to me, but I'd like to see some comments from some mips/ppc/x86 > people on the changes you're making over there. Sounds reasonable, but no answers yet. Can you take at least patch 1 and 2 meanwhile, preferably 6 and 7 (the ARM parts) also if you are OK with it? I have other patches that depend on 1/7 and 2/7, so having them upstream would help me and reduce further dependency churn. I am happy to resend the remaining patches for further discussion later. Cheers, Andre.
[PATCH] KVM: powerpc: kvmppc_visible_gpa can be boolean
In another patch kvm_is_visible_gfn is made to return bool, since this function only returns zero or one as its return value; let's also make kvmppc_visible_gpa return bool to be consistent. No functional change. Signed-off-by: Yaowei Bai--- arch/powerpc/kvm/book3s_pr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index 64891b0..70fb08d 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -512,7 +512,7 @@ static void kvmppc_patch_dcbz(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte) put_page(hpage); } -static int kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa) +static bool kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa) { ulong mp_pa = vcpu->arch.magic_page_pa; @@ -521,7 +521,7 @@ static int kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa) gpa &= ~0xFFFULL; if (unlikely(mp_pa) && unlikely((mp_pa & KVM_PAM) == (gpa & KVM_PAM))) { - return 1; + return true; } return kvm_is_visible_gfn(vcpu->kvm, gpa >> PAGE_SHIFT); -- 1.9.1
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Friday 13 November 2015 01:08 PM, Thomas Huth wrote: > On 13/11/15 07:26, Aravinda Prasad wrote: >> >> On Friday 13 November 2015 07:20 AM, David Gibson wrote: >>> On Thu, Nov 12, 2015 at 11:22:29PM +0530, Aravinda Prasad wrote: > [...] So thinking whether qemu should explicitly enable the new NMI behavior. >>> >>> So, I think the reasoning above tends towards having qemu control the >>> MC behaviour. If qemu does nothing, MCs are delivered direct to >>> 0x200, if it enables the new handling, they cause a KVM exit and qemu >>> will deliver the MC. >> >> This essentially requires qemu to control how KVM behaves as KVM does >> the actual redirection of MC either to guest's 0x200 vector or to exit >> guest. So, if we are running new qemu, then KVM should exit guest and if >> we are running old qemu, KVM should redirect MC to 0x200. Is there any >> way to communicate this to KVM? ioctl? > > Simply introduce a KVM capability that can be enabled by userspace. > See kvm_vcpu_ioctl_enable_cap() in arch/powerpc/kvm/powerpc.c. Thanks. I will look at it. Regards, Aravinda > > Thomas > -- Regards, Aravinda
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
Aravinda Prasadwrites: >> Yeah, it would be good not to break this. > > I am not familiar with CAPI. Does this affect CAPI? When a CAPI card experiences an EEH event, any cache lines it holds are filled with SUEs (Special UEs, interpreted by the kernel the same as regular UEs). When these are read, we get an MCE. Currently CAPI does not support virtualisation, but that is actively being worked on. There's a _very_ good chance it will then be backported to various distros, which could have old qemu. Therefore, I'd ideally like to make sure UEs in KVM guests work properly and continue to do so in all combinations of kernel and qemu. I'm following up the email from Mahesh that you linked: I'm not sure I quite follow his logic so I'll try to make sense of that and then get back to you. Regards, Daniel
Re: [GIT PULL] Please pull my kvm-ppc-fixes branch
On 12/11/2015 06:35, Paul Mackerras wrote: > Paolo, > > I have two fixes for HV KVM which I would like to have included in > v4.4-rc1. The first one is a fix for a bug identified by Red Hat > which causes occasional guest crashes. The second one fixes a bug > which causes host stalls and timeouts under certain circumstances when > the host is configured for static 2-way micro-threading mode. > > Thanks, > Paul. > > The following changes since commit a3eaa8649e4c6a6afdafaa04b9114fb230617bb1: > > KVM: VMX: Fix commit which broke PML (2015-11-05 11:34:11 +0100) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git > kvm-ppc-fixes > > for you to fetch changes up to f74f2e2e26199f695ca3df94f29e9ab7cb707ea4: > > KVM: PPC: Book3S HV: Don't dynamically split core when already split > (2015-11-06 16:02:59 +1100) > > > Paul Mackerras (2): > KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails > KVM: PPC: Book3S HV: Don't dynamically split core when already split > > arch/powerpc/kvm/book3s_hv.c| 2 +- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 > 2 files changed, 13 insertions(+), 9 deletions(-) > Pulled, thanks. Paolo
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Thu, Nov 12, 2015 at 11:22:29PM +0530, Aravinda Prasad wrote: > > > On Thursday 12 November 2015 10:13 AM, David Gibson wrote: > > On Thu, Nov 12, 2015 at 10:02:10AM +0530, Aravinda Prasad wrote: > >> > >> > >> On Thursday 12 November 2015 09:08 AM, David Gibson wrote: > >>> On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote: > Aravinda Prasadwrites: > > > This patch modifies KVM to cause a guest exit with > > KVM_EXIT_NMI instead of immediately delivering a 0x200 > > interrupt to guest upon machine check exception in > > guest address. Exiting the guest enables QEMU to build > > error log and deliver machine check exception to guest > > OS (either via guest OS registered machine check > > handler or via 0x200 guest OS interrupt vector). > > > > This approach simplifies the delivering of machine > > check exception to guest OS compared to the earlier approach > > of KVM directly invoking 0x200 guest interrupt vector. > > In the earlier approach QEMU patched the 0x200 interrupt > > vector during boot. The patched code at 0x200 issued a > > private hcall to pass the control to QEMU to build the > > error log. > > > > This design/approach is based on the feedback for the > > QEMU patches to handle machine check exception. Details > > of earlier approach of handling machine check exception > > in QEMU and related discussions can be found at: > > > > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html > > I've poked at the MCE code, but not the KVM MCE code, so I may be > mistaken here, but I'm not clear on how this handles errors that the > guest can recover without terminating. > > For example, a Linux guest can handle a UE in guest userspace by killing > the guest process. A hypothetical non-Linux guest with a microkernel > could even survive UEs in drivers. > > It sounds from your patch like you're changing this behaviour. Is this > right? > >>> > >>> So, IIUC. 
Once the qemu pieces are in place as well it shouldn't > >>> change this behaviour: KVM will exit to qemu, qemu will log the error > >>> information (new), then reinject the MC to the guest which can still > >>> handle it as you describe above. > >> > >> Yes. With KVM and QEMU both in place this will not change the behavior. > >> QEMU will inject the UE to guest and the guest handles the UE based on > >> where it occurred. For example if an UE happens in a guest process > >> address space, that process will be killed. > >> > >>> > >>> But, there could be a problem if you have a new kernel with an old > >>> qemu, in that case qemu might not understand the new exit type and > >>> treat it as a fatal error, even though the guest could actually cope > >>> with it. > >> > >> In case of new kernel and old QEMU, the guest terminates as old QEMU > >> does not understand the NMI exit reason. However, this is the case with > >> old kernel and old QEMU as they do not handle UE belonging to guest. The > >> difference is that the guest kernel terminates with different error > >> code. > > > > Ok.. assuming the guest has code to handle the UE in 0x200, why would > > the guest terminate with old kernel and old qemu? I haven't quite > > followed the logic. > > I overlooked it. I think I need to take into consideration whether guest > issued "ibm, nmi-register". If the guest has issued "ibm, nmi-register" > then we should not jump to 0x200 upon UE. With the old kernel and old > QEMU this is broken as we always jump to 0x200. > > However, if the guest has not issued "ibm, nmi-register" then upon UE we > should jump to 0x200. If new kernel is used with old QEMU this > functionality breaks as it causes guest to terminate with unhandled NMI > exit. > > So thinking whether qemu should explicitly enable the new NMI > behavior. So, I think the reasoning above tends towards having qemu control the MC behaviour. 
If qemu does nothing, MCs are delivered direct to 0x200, if it enables the new handling, they cause a KVM exit and qemu will deliver the MC. Then I'd expect qemu to switch on the new-style handling from ibm,nmi-register. > > Regards, > Aravinda > > > >> > >> old kernel and old QEMU -> guest panics [1] irrespective of where UE > >>happened in guest address space. > >> old kernel and new QEMU -> guest panics. same as above. > >> new kernel and old QEMU -> guest terminates with unhandled NMI error > >>irrespective of where UE happened in guest > >> new kernel and new QEMU -> guest handles UEs in process address space > >>by killing the process. guest terminates > >>for UEs in guest kernel address space. > >> > >> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118329.html > >>> > >>> Aravinda, do we need to change
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Friday 13 November 2015 07:20 AM, David Gibson wrote: > On Thu, Nov 12, 2015 at 11:22:29PM +0530, Aravinda Prasad wrote: [...] >> >> I overlooked it. I think I need to take into consideration whether guest >> issued "ibm, nmi-register". If the guest has issued "ibm, nmi-register" >> then we should not jump to 0x200 upon UE. With the old kernel and old >> QEMU this is broken as we always jump to 0x200. >> >> However, if the guest has not issued "ibm, nmi-register" then upon UE we >> should jump to 0x200. If new kernel is used with old QEMU this >> functionality breaks as it causes guest to terminate with unhandled NMI >> exit. >> >> So thinking whether qemu should explicitly enable the new NMI >> behavior. > > So, I think the reasoning above tends towards having qemu control the > MC behaviour. If qemu does nothing, MCs are delivered direct to > 0x200, if it enables the new handling, they cause a KVM exit and qemu > will deliver the MC. This essentially requires qemu to control how KVM behaves as KVM does the actual redirection of MC either to guest's 0x200 vector or to exit guest. So, if we are running new qemu, then KVM should exit guest and if we are running old qemu, KVM should redirect MC to 0x200. Is there any way to communicate this to KVM? ioctl? However, this should not be a problem (in terms of breaking existing functionality) with old KVM as it always redirects MC to 0x200 irrespective of old or new qemu. > > Then I'd expect qemu to switch on the new-style handling from > ibm,nmi-register. > >> -- Regards, Aravinda
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On 13/11/15 07:26, Aravinda Prasad wrote: > > On Friday 13 November 2015 07:20 AM, David Gibson wrote: >> On Thu, Nov 12, 2015 at 11:22:29PM +0530, Aravinda Prasad wrote: [...] >>> So thinking whether qemu should explicitly enable the new NMI >>> behavior. >> >> So, I think the reasoning above tends towards having qemu control the >> MC behaviour. If qemu does nothing, MCs are delivered direct to >> 0x200, if it enables the new handling, they cause a KVM exit and qemu >> will deliver the MC. > > This essentially requires qemu to control how KVM behaves as KVM does > the actual redirection of MC either to guest's 0x200 vector or to exit > guest. So, if we are running new qemu, then KVM should exit guest and if > we are running old qemu, KVM should redirect MC to 0x200. Is there any > way to communicate this to KVM? ioctl? Simply introduce a KVM capability that can be enabled by userspace. See kvm_vcpu_ioctl_enable_cap() in arch/powerpc/kvm/powerpc.c. Thomas
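The opt-in mechanism Thomas suggests boils down to a flag that userspace flips through an ENABLE_CAP-style ioctl, after which the kernel changes its delivery behavior. The sketch below models that dispatch in plain C with invented names (MY_KVM_CAP_NMI_EXIT, struct my_vcpu); it is an illustration of the pattern, not the actual kvm_vcpu_ioctl_enable_cap() code or a real KVM capability number:

```c
#include <stdbool.h>

/* Hypothetical capability number; real KVM capabilities are defined in
 * the UAPI headers and negotiated via KVM_CHECK_EXTENSION/KVM_ENABLE_CAP. */
#define MY_KVM_CAP_NMI_EXIT 1

struct my_vcpu {
    bool nmi_exit_enabled;   /* set once new-style userspace opts in */
};

/* Model of an enable-cap handler: unknown caps are rejected, known caps
 * flip a flag that later changes machine-check delivery. */
static int my_enable_cap(struct my_vcpu *vcpu, int cap)
{
    switch (cap) {
    case MY_KVM_CAP_NMI_EXIT:
        vcpu->nmi_exit_enabled = true;
        return 0;
    default:
        return -1;           /* stands in for -EINVAL */
    }
}

/* On a machine check: legacy behavior (0) jumps to the guest's 0x200
 * vector; opted-in behavior (1) exits to userspace so QEMU can build
 * the error log before injecting the MC itself. */
static int deliver_machine_check(const struct my_vcpu *vcpu)
{
    return vcpu->nmi_exit_enabled ? 1 : 0;
}
```

Because the flag defaults to off, an old QEMU that never issues the enable-cap keeps the legacy direct-to-0x200 delivery, which is exactly the backward-compatibility property the thread is after.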
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Friday 13 November 2015 03:07 AM, Daniel Axtens wrote: > Aravinda Prasadwrites: > >>> Yeah, it would be good not to break this. >> >> I am not familiar with CAPI. Does this affect CAPI? > > When a CAPI card experiences an EEH event, any cache lines it holds are > filled with SUEs (Special UEs, interpreted by the kernel the same as > regular UEs). When these are read, we get an MCE. Currently CAPI does > not support virtualisation, but that is actively being worked > on. There's a _very_ good chance it will then be backported to various > distros, which could have old qemu. > > Therefore, I'd ideally like to make sure UEs in KVM guests work properly > and continue to do so in all combinations of kernel and qemu. I'm > following up the email from Mahesh that you linked: I'm not sure I quite > follow his logic so I'll try to make sense of that and then get back to > you. sure. Thanks. Regards, Aravinda > > Regards, > Daniel > -- Regards, Aravinda
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Thursday 12 November 2015 10:13 AM, David Gibson wrote: > On Thu, Nov 12, 2015 at 10:02:10AM +0530, Aravinda Prasad wrote: >> >> >> On Thursday 12 November 2015 09:08 AM, David Gibson wrote: >>> On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote: Aravinda Prasadwrites: > This patch modifies KVM to cause a guest exit with > KVM_EXIT_NMI instead of immediately delivering a 0x200 > interrupt to guest upon machine check exception in > guest address. Exiting the guest enables QEMU to build > error log and deliver machine check exception to guest > OS (either via guest OS registered machine check > handler or via 0x200 guest OS interrupt vector). > > This approach simplifies the delivering of machine > check exception to guest OS compared to the earlier approach > of KVM directly invoking 0x200 guest interrupt vector. > In the earlier approach QEMU patched the 0x200 interrupt > vector during boot. The patched code at 0x200 issued a > private hcall to pass the control to QEMU to build the > error log. > > This design/approach is based on the feedback for the > QEMU patches to handle machine check exception. Details > of earlier approach of handling machine check exception > in QEMU and related discussions can be found at: > > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html I've poked at the MCE code, but not the KVM MCE code, so I may be mistaken here, but I'm not clear on how this handles errors that the guest can recover without terminating. For example, a Linux guest can handle a UE in guest userspace by killing the guest process. A hypothetical non-Linux guest with a microkernel could even survive UEs in drivers. It sounds from your patch like you're changing this behaviour. Is this right? >>> >>> So, IIUC. 
Once the qemu pieces are in place as well it shouldn't >>> change this behaviour: KVM will exit to qemu, qemu will log the error >>> information (new), then reinject the MC to the guest which can still >>> handle it as you describe above. >> >> Yes. With KVM and QEMU both in place this will not change the behavior. >> QEMU will inject the UE to guest and the guest handles the UE based on >> where it occurred. For example if an UE happens in a guest process >> address space, that process will be killed. >> >>> >>> But, there could be a problem if you have a new kernel with an old >>> qemu, in that case qemu might not understand the new exit type and >>> treat it as a fatal error, even though the guest could actually cope >>> with it. >> >> In case of new kernel and old QEMU, the guest terminates as old QEMU >> does not understand the NMI exit reason. However, this is the case with >> old kernel and old QEMU as they do not handle UE belonging to guest. The >> difference is that the guest kernel terminates with different error >> code. > > Ok.. assuming the guest has code to handle the UE in 0x200, why would > the guest terminate with old kernel and old qemu? I haven't quite > followed the logic. I overlooked it. I think I need to take into consideration whether guest issued "ibm, nmi-register". If the guest has issued "ibm, nmi-register" then we should not jump to 0x200 upon UE. With the old kernel and old QEMU this is broken as we always jump to 0x200. However, if the guest has not issued "ibm, nmi-register" then upon UE we should jump to 0x200. If new kernel is used with old QEMU this functionality breaks as it causes guest to terminate with unhandled NMI exit. So thinking whether qemu should explicitly enable the new NMI behavior. Regards, Aravinda > >> >> old kernel and old QEMU -> guest panics [1] irrespective of where UE >>happened in guest address space. >> old kernel and new QEMU -> guest panics. same as above. 
>> new kernel and old QEMU -> guest terminates with unhanded NMI error >>irrespective of where UE happened in guest >> new kernel and new QEMU -> guest handles UEs in process address space >>by killing the process. guest terminates >>for UEs in guest kernel address space. >> >> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118329.html >> >>> >>> Aravinda, do we need to change this so that qemu has to explicitly >>> enable the new NMI behaviour? Or have I missed something that will >>> make that case work already. >> >> I think we don't need to explicitly enable the new behavior. With new >> kernel and new QEMU this should just work. As mentioned above this is >> already broken for old kernel/QEMU. Any thoughts? >> >> Regards, >> Aravinda >> >>> >>> >>> >>> ___ >>> Linuxppc-dev mailing list >>> linuxppc-...@lists.ozlabs.org >>> https://lists.ozlabs.org/listinfo/linuxppc-dev >>> >> > -- Regards,
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Thursday 12 November 2015 10:28 AM, Daniel Axtens wrote: > >> So, IIUC. Once the qemu pieces are in place as well it shouldn't >> change this behaviour: KVM will exit to qemu, qemu will log the error >> information (new), then reinject the MC to the guest which can still >> handle it as you describe above. > > Ah, that makes *much* more sense now! Thanks for the explanation: I > don't really follow qemu development. > >> >> But, there could be a problem if you have a new kernel with an old >> qemu, in that case qemu might not understand the new exit type and >> treat it as a fatal error, even though the guest could actually cope >> with it. >> >> Aravinda, do we need to change this so that qemu has to explicitly >> enable the new NMI behaviour? Or have I missed something that will >> make that case work already. > > Yeah, it would be good not to break this. I am not familiar with CAPI. Does this affect CAPI? Regards, Aravinda > > Regards, > Daniel > > >> -- >> David Gibson | I'll have my music baroque, and my code >> david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ >> _other_ >> | _way_ _around_! >> http://www.ozlabs.org/~dgibson > ___ > Linuxppc-dev mailing list > linuxppc-...@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev > -- Regards, Aravinda -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] KVM: PPC: Exit guest upon fatal machine check exception
This patch modifies KVM to cause a guest exit with KVM_EXIT_NMI instead of immediately delivering a 0x200 interrupt to guest upon machine check exception in guest address. Exiting the guest enables QEMU to build error log and deliver machine check exception to guest OS (either via guest OS registered machine check handler or via 0x200 guest OS interrupt vector). This approach simplifies the delivering of machine check exception to guest OS compared to the earlier approach of KVM directly invoking 0x200 guest interrupt vector. In the earlier approach QEMU patched the 0x200 interrupt vector during boot. The patched code at 0x200 issued a private hcall to pass the control to QEMU to build the error log. This design/approach is based on the feedback for the QEMU patches to handle machine check exception. Details of earlier approach of handling machine check exception in QEMU and related discussions can be found at: https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html Signed-off-by: Aravinda Prasad--- arch/powerpc/kvm/book3s_hv.c| 12 +++- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 31 +++ 2 files changed, 14 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 2280497..1b1dff0 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, r = RESUME_GUEST; break; case BOOK3S_INTERRUPT_MACHINE_CHECK: - /* -* Deliver a machine check interrupt to the guest. -* We have to do this, even if the host has handled the -* machine check, because machine checks use SRR0/1 and -* the interrupt might have trashed guest state in them. 
-*/ - kvmppc_book3s_queue_irqprio(vcpu, - BOOK3S_INTERRUPT_MACHINE_CHECK); - r = RESUME_GUEST; + /* Exit to guest with KVM_EXIT_NMI as exit reason */ + run->exit_reason = KVM_EXIT_NMI; + r = RESUME_HOST; break; case BOOK3S_INTERRUPT_PROGRAM: { diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index b98889e..672b4f6 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) addir1, r1, 112 ld r7, HSTATE_HOST_MSR(r13) - cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL beq 11f cmpwi cr2, r12, BOOK3S_INTERRUPT_HMI @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) mtmsrd r6, 1 /* Clear RI in MSR */ mtsrr0 r8 mtsrr1 r7 - beq cr1, 13f/* machine check */ RFI /* On POWER7, we have external interrupts set to use HSRR0/1 */ @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) mtspr SPRN_HSRR1, r7 ba 0x500 -13:b machine_check_fwnmi - 14:mtspr SPRN_HSRR0, r8 mtspr SPRN_HSRR1, r7 b hmi_exception_after_realmode @@ -2381,24 +2377,19 @@ machine_check_realmode: ld r9, HSTATE_KVM_VCPU(r13) li r12, BOOK3S_INTERRUPT_MACHINE_CHECK /* -* Deliver unhandled/fatal (e.g. UE) MCE errors to guest through -* machine check interrupt (set HSRR0 to 0x200). And for handled -* errors (no-fatal), just go back to guest execution with current -* HSRR0 instead of exiting guest. This new approach will inject -* machine check to guest for fatal error causing guest to crash. -* -* The old code used to return to host for unhandled errors which -* was causing guest to hang with soft lockups inside guest and -* makes it difficult to recover guest instance. +* Deliver unhandled/fatal (e.g. UE) MCE errors to guest +* by exiting the guest with KVM_EXIT_NMI exit reason (exit +* reason set later based on trap). 
For handled errors +* (no-fatal), go back to guest execution with current HSRR0 +* instead of exiting the guest. This approach will cause +* the guest to exit in case of fatal machine check error. */ - ld r10, VCPU_PC(r9) + bne 2f /* Continue guest execution? */ + /* If not, exit the guest. SRR0/1 are already set */ + b mc_cont +2: ld r10, VCPU_PC(r9) ld r11, VCPU_MSR(r9) - bne 2f /* Continue guest execution. */ - /* If not, deliver a machine check. SRR0/1 are already set */ - li r10, BOOK3S_INTERRUPT_MACHINE_CHECK -
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
> So, IIUC. Once the qemu pieces are in place as well it shouldn't > change this behaviour: KVM will exit to qemu, qemu will log the error > information (new), then reinject the MC to the guest which can still > handle it as you describe above. Ah, that makes *much* more sense now! Thanks for the explanation: I don't really follow qemu development. > > But, there could be a problem if you have a new kernel with an old > qemu, in that case qemu might not understand the new exit type and > treat it as a fatal error, even though the guest could actually cope > with it. > > Aravinda, do we need to change this so that qemu has to explicitly > enable the new NMI behaviour? Or have I missed something that will > make that case work already. Yeah, it would be good not to break this. Regards, Daniel > -- > David Gibson | I'll have my music baroque, and my code > david AT gibson.dropbear.id.au| minimalist, thank you. NOT _the_ > _other_ > | _way_ _around_! > http://www.ozlabs.org/~dgibson
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Thu, Nov 12, 2015 at 10:02:10AM +0530, Aravinda Prasad wrote: > > > On Thursday 12 November 2015 09:08 AM, David Gibson wrote: > > On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote: > >> Aravinda Prasadwrites: > >> > >>> This patch modifies KVM to cause a guest exit with > >>> KVM_EXIT_NMI instead of immediately delivering a 0x200 > >>> interrupt to guest upon machine check exception in > >>> guest address. Exiting the guest enables QEMU to build > >>> error log and deliver machine check exception to guest > >>> OS (either via guest OS registered machine check > >>> handler or via 0x200 guest OS interrupt vector). > >>> > >>> This approach simplifies the delivering of machine > >>> check exception to guest OS compared to the earlier approach > >>> of KVM directly invoking 0x200 guest interrupt vector. > >>> In the earlier approach QEMU patched the 0x200 interrupt > >>> vector during boot. The patched code at 0x200 issued a > >>> private hcall to pass the control to QEMU to build the > >>> error log. > >>> > >>> This design/approach is based on the feedback for the > >>> QEMU patches to handle machine check exception. Details > >>> of earlier approach of handling machine check exception > >>> in QEMU and related discussions can be found at: > >>> > >>> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html > >> > >> I've poked at the MCE code, but not the KVM MCE code, so I may be > >> mistaken here, but I'm not clear on how this handles errors that the > >> guest can recover without terminating. > >> > >> For example, a Linux guest can handle a UE in guest userspace by killing > >> the guest process. A hypthetical non-linux guest with a microkernel > >> could even survive UEs in drivers. > >> > >> It sounds from your patch like you're changing this behaviour. Is this > >> right? > > > > So, IIUC. 
Once the qemu pieces are in place as well it shouldn't > > change this behaviour: KVM will exit to qemu, qemu will log the error > > information (new), then reinject the MC to the guest which can still > > handle it as you describe above. > > Yes. With KVM and QEMU both in place this will not change the behavior. > QEMU will inject the UE to guest and the guest handles the UE based on > where it occurred. For example if an UE happens in a guest process > address space, that process will be killed. > > > > > But, there could be a problem if you have a new kernel with an old > > qemu, in that case qemu might not understand the new exit type and > > treat it as a fatal error, even though the guest could actually cope > > with it. > > In case of new kernel and old QEMU, the guest terminates as old QEMU > does not understand the NMI exit reason. However, this is the case with > old kernel and old QEMU as they do not handle UE belonging to guest. The > difference is that the guest kernel terminates with different error > code. Ok.. assuming the guest has code to handle the UE in 0x200, why would the guest terminate with old kernel and old qemu? I haven't quite followed the logic. > > old kernel and old QEMU -> guest panics [1] irrespective of where UE >happened in guest address space. > old kernel and new QEMU -> guest panics. same as above. > new kernel and old QEMU -> guest terminates with unhanded NMI error >irrespective of where UE happened in guest > new kernel and new QEMU -> guest handles UEs in process address space >by killing the process. guest terminates >for UEs in guest kernel address space. > > [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118329.html > > > > > Aravinda, do we need to change this so that qemu has to explicitly > > enable the new NMI behaviour? Or have I missed something that will > > make that case work already. > > I think we don't need to explicitly enable the new behavior. 
With new > kernel and new QEMU this should just work. As mentioned above this is > already broken for old kernel/QEMU. Any thoughts? > > Regards, > Aravinda > > > > > > > > > ___ > > Linuxppc-dev mailing list > > linuxppc-...@lists.ozlabs.org > > https://lists.ozlabs.org/listinfo/linuxppc-dev > > > -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
[PATCH 1/2] KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR
Currently it is possible for userspace (e.g. QEMU) to set a value for the MSR for a guest VCPU which has both of the TS bits set, which is an illegal combination. The result of this is that when we execute a hrfid (hypervisor return from interrupt doubleword) instruction to enter the guest, the CPU will take a TM Bad Thing type of program interrupt (vector 0x700). Now, if PR KVM is configured in the kernel along with HV KVM, we actually handle this without crashing the host or giving hypervisor privilege to the guest; instead what happens is that we deliver a program interrupt to the guest, with SRR0 reflecting the address of the hrfid instruction and SRR1 containing the MSR value at that point. If PR KVM is not configured in the kernel, then we try to run the host's program interrupt handler with the MMU set to the guest context, which almost certainly causes a host crash. This closes the hole by making kvmppc_set_msr_hv() check for the illegal combination and force the TS field to a safe value (00, meaning non-transactional). Signed-off-by: Paul Mackerras--- arch/powerpc/kvm/book3s_hv.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index becad3a..f668712 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -231,6 +231,12 @@ static void kvmppc_core_vcpu_put_hv(struct kvm_vcpu *vcpu) static void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr) { + /* +* Check for illegal transactional state bit combination +* and if we find it, force the TS field to a safe state. +*/ + if ((msr & MSR_TS_MASK) == MSR_TS_MASK) + msr &= ~MSR_TS_MASK; vcpu->arch.shregs.msr = msr; kvmppc_end_cede(vcpu); } -- 2.1.4
[PATCH] powerpc/64: Include KVM guest test in all interrupt vectors
Currently, if HV KVM is configured but PR KVM isn't, we don't include a test to see whether we were interrupted in KVM guest context for the set of interrupts which get delivered directly to the guest by hardware if they occur in the guest. This includes things like program interrupts. However, the recent bug where userspace could set the MSR for a VCPU to have an illegal value in the TS field, and thus cause a TM Bad Thing type of program interrupt on the hrfid that enters the guest, showed that we can never be completely sure that these interrupts can never occur in the guest entry/exit code. If one of these interrupts does happen and we have HV KVM configured but not PR KVM, then we end up trying to run the handler in the host with the MMU set to the guest MMU context, which generally ends badly. Thus, for robustness it is better to have the test in every interrupt vector, so that if some way is found to trigger some interrupt in the guest entry/exit path, we can handle it without immediately crashing the host. This means that the distinction between KVMTEST and KVMTEST_PR goes away. Thus we delete KVMTEST_PR and associated macros and use KVMTEST everywhere that we previously used either KVMTEST_PR or KVMTEST. It also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR, so we delete SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead.
Signed-off-by: Paul Mackerras--- arch/powerpc/include/asm/exception-64s.h | 21 +++- arch/powerpc/kernel/exceptions-64s.S | 34 2 files changed, 20 insertions(+), 35 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 77f52b2..9ee1078 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -263,17 +263,6 @@ do_kvm_##n: \ #define KVM_HANDLER_SKIP(area, h, n) #endif -#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE -#define KVMTEST_PR(n) __KVMTEST(n) -#define KVM_HANDLER_PR(area, h, n) __KVM_HANDLER(area, h, n) -#define KVM_HANDLER_PR_SKIP(area, h, n)__KVM_HANDLER_SKIP(area, h, n) - -#else -#define KVMTEST_PR(n) -#define KVM_HANDLER_PR(area, h, n) -#define KVM_HANDLER_PR_SKIP(area, h, n) -#endif - #define NOTEST(n) /* @@ -360,13 +349,13 @@ label##_pSeries: \ HMT_MEDIUM_PPR_DISCARD; \ SET_SCRATCH0(r13); /* save r13 */ \ EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common,\ -EXC_STD, KVMTEST_PR, vec) +EXC_STD, KVMTEST, vec) /* Version of above for when we have to branch out-of-line */ #define STD_EXCEPTION_PSERIES_OOL(vec, label) \ .globl label##_pSeries; \ label##_pSeries: \ - EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, vec);\ + EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST, vec); \ EXCEPTION_PROLOG_PSERIES_1(label##_common, EXC_STD) #define STD_EXCEPTION_HV(loc, vec, label) \ @@ -436,17 +425,13 @@ label##_relon_hv: \ #define _SOFTEN_TEST(h, vec) __SOFTEN_TEST(h, vec) #define SOFTEN_TEST_PR(vec)\ - KVMTEST_PR(vec);\ + KVMTEST(vec); \ _SOFTEN_TEST(EXC_STD, vec) #define SOFTEN_TEST_HV(vec)\ KVMTEST(vec); \ _SOFTEN_TEST(EXC_HV, vec) -#define SOFTEN_TEST_HV_201(vec) \ - KVMTEST(vec); \ - _SOFTEN_TEST(EXC_STD, vec) - #define SOFTEN_NOTEST_PR(vec) _SOFTEN_TEST(EXC_STD, vec) #define SOFTEN_NOTEST_HV(vec) _SOFTEN_TEST(EXC_HV, vec) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 0a0399c2..1a03142 100644 --- 
a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -242,7 +242,7 @@ instruction_access_slb_pSeries: HMT_MEDIUM_PPR_DISCARD SET_SCRATCH0(r13) EXCEPTION_PROLOG_0(PACA_EXSLB) - EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x480) + EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST, 0x480) std r3,PACA_EXSLB+EX_R3(r13) mfspr r3,SPRN_SRR0/* SRR0 is faulting address */ #ifdef __DISABLED__ @@ -276,18 +276,18 @@ hardware_interrupt_hv: KVM_HANDLER(PACA_EXGEN, EXC_HV, 0x502) FTR_SECTION_ELSE _MASKABLE_EXCEPTION_PSERIES(0x500, hardware_interrupt, -
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Wed, Nov 11, 2015 at 10:28:46PM +0530, Aravinda Prasad wrote: > This patch modifies KVM to cause a guest exit with > KVM_EXIT_NMI instead of immediately delivering a 0x200 > interrupt to guest upon machine check exception in > guest address. Exiting the guest enables QEMU to build > error log and deliver machine check exception to guest > OS (either via guest OS registered machine check > handler or via 0x200 guest OS interrupt vector). > > This approach simplifies the delivering of machine > check exception to guest OS compared to the earlier approach > of KVM directly invoking 0x200 guest interrupt vector. > In the earlier approach QEMU patched the 0x200 interrupt > vector during boot. The patched code at 0x200 issued a > private hcall to pass the control to QEMU to build the > error log. > > This design/approach is based on the feedback for the > QEMU patches to handle machine check exception. Details > of earlier approach of handling machine check exception > in QEMU and related discussions can be found at: > > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html > > Signed-off-by: Aravinda Prasad> --- > arch/powerpc/kvm/book3s_hv.c| 12 +++- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 31 > +++ > 2 files changed, 14 insertions(+), 29 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 2280497..1b1dff0 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, > struct kvm_vcpu *vcpu, > r = RESUME_GUEST; > break; > case BOOK3S_INTERRUPT_MACHINE_CHECK: > - /* > - * Deliver a machine check interrupt to the guest. > - * We have to do this, even if the host has handled the > - * machine check, because machine checks use SRR0/1 and > - * the interrupt might have trashed guest state in them. 
> - */ > - kvmppc_book3s_queue_irqprio(vcpu, > - BOOK3S_INTERRUPT_MACHINE_CHECK); > - r = RESUME_GUEST; > + /* Exit to guest with KVM_EXIT_NMI as exit reason */ > + run->exit_reason = KVM_EXIT_NMI; > + r = RESUME_HOST; > break; > case BOOK3S_INTERRUPT_PROGRAM: > { > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b98889e..672b4f6 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > addir1, r1, 112 > ld r7, HSTATE_HOST_MSR(r13) > > - cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK > cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL > beq 11f > cmpwi cr2, r12, BOOK3S_INTERRUPT_HMI > @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > mtmsrd r6, 1 /* Clear RI in MSR */ > mtsrr0 r8 > mtsrr1 r7 > - beq cr1, 13f/* machine check */ > RFI > > /* On POWER7, we have external interrupts set to use HSRR0/1 */ > @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > mtspr SPRN_HSRR1, r7 > ba 0x500 > > -13: b machine_check_fwnmi > - I don't quite understand the 3 hunks above. The rest seems to change the delivery of MCs as I'd expect, but the above seems to change their detection and the reason for that isn't obvious to me. > 14: mtspr SPRN_HSRR0, r8 > mtspr SPRN_HSRR1, r7 > b hmi_exception_after_realmode > @@ -2381,24 +2377,19 @@ machine_check_realmode: > ld r9, HSTATE_KVM_VCPU(r13) > li r12, BOOK3S_INTERRUPT_MACHINE_CHECK > /* > - * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through > - * machine check interrupt (set HSRR0 to 0x200). And for handled > - * errors (no-fatal), just go back to guest execution with current > - * HSRR0 instead of exiting guest. This new approach will inject > - * machine check to guest for fatal error causing guest to crash. 
> - * > - * The old code used to return to host for unhandled errors which > - * was causing guest to hang with soft lockups inside guest and > - * makes it difficult to recover guest instance. > + * Deliver unhandled/fatal (e.g. UE) MCE errors to guest > + * by exiting the guest with KVM_EXIT_NMI exit reason (exit > + * reason set later based on trap). For handled errors > + * (no-fatal), go back to guest execution with current HSRR0 > + * instead of exiting the guest. This approach will cause > + * the guest to exit in case of fatal machine check error. >*/ > - ld r10, VCPU_PC(r9) > + bne 2f /*
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote: > Aravinda Prasadwrites: > > > This patch modifies KVM to cause a guest exit with > > KVM_EXIT_NMI instead of immediately delivering a 0x200 > > interrupt to guest upon machine check exception in > > guest address. Exiting the guest enables QEMU to build > > error log and deliver machine check exception to guest > > OS (either via guest OS registered machine check > > handler or via 0x200 guest OS interrupt vector). > > > > This approach simplifies the delivering of machine > > check exception to guest OS compared to the earlier approach > > of KVM directly invoking 0x200 guest interrupt vector. > > In the earlier approach QEMU patched the 0x200 interrupt > > vector during boot. The patched code at 0x200 issued a > > private hcall to pass the control to QEMU to build the > > error log. > > > > This design/approach is based on the feedback for the > > QEMU patches to handle machine check exception. Details > > of earlier approach of handling machine check exception > > in QEMU and related discussions can be found at: > > > > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html > > I've poked at the MCE code, but not the KVM MCE code, so I may be > mistaken here, but I'm not clear on how this handles errors that the > guest can recover without terminating. > > For example, a Linux guest can handle a UE in guest userspace by killing > the guest process. A hypthetical non-linux guest with a microkernel > could even survive UEs in drivers. > > It sounds from your patch like you're changing this behaviour. Is this > right? So, IIUC. Once the qemu pieces are in place as well it shouldn't change this behaviour: KVM will exit to qemu, qemu will log the error information (new), then reinject the MC to the guest which can still handle it as you describe above. 
But, there could be a problem if you have a new kernel with an old qemu, in that case qemu might not understand the new exit type and treat it as a fatal error, even though the guest could actually cope with it. Aravinda, do we need to change this so that qemu has to explicitly enable the new NMI behaviour? Or have I missed something that will make that case work already. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
[GIT PULL] Please pull my kvm-ppc-fixes branch
Paolo, I have two fixes for HV KVM which I would like to have included in v4.4-rc1. The first one is a fix for a bug identified by Red Hat which causes occasional guest crashes. The second one fixes a bug which causes host stalls and timeouts under certain circumstances when the host is configured for static 2-way micro-threading mode. Thanks, Paul. The following changes since commit a3eaa8649e4c6a6afdafaa04b9114fb230617bb1: KVM: VMX: Fix commit which broke PML (2015-11-05 11:34:11 +0100) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git kvm-ppc-fixes for you to fetch changes up to f74f2e2e26199f695ca3df94f29e9ab7cb707ea4: KVM: PPC: Book3S HV: Don't dynamically split core when already split (2015-11-06 16:02:59 +1100) Paul Mackerras (2): KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails KVM: PPC: Book3S HV: Don't dynamically split core when already split arch/powerpc/kvm/book3s_hv.c| 2 +- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 2 files changed, 13 insertions(+), 9 deletions(-)
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
Aravinda Prasad writes: > This patch modifies KVM to cause a guest exit with > KVM_EXIT_NMI instead of immediately delivering a 0x200 > interrupt to guest upon machine check exception in > guest address. Exiting the guest enables QEMU to build > error log and deliver machine check exception to guest > OS (either via guest OS registered machine check > handler or via 0x200 guest OS interrupt vector). > > This approach simplifies the delivering of machine > check exception to guest OS compared to the earlier approach > of KVM directly invoking 0x200 guest interrupt vector. > In the earlier approach QEMU patched the 0x200 interrupt > vector during boot. The patched code at 0x200 issued a > private hcall to pass the control to QEMU to build the > error log. > > This design/approach is based on the feedback for the > QEMU patches to handle machine check exception. Details > of earlier approach of handling machine check exception > in QEMU and related discussions can be found at: > > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html I've poked at the MCE code, but not the KVM MCE code, so I may be mistaken here, but I'm not clear on how this handles errors that the guest can recover from without terminating. For example, a Linux guest can handle a UE in guest userspace by killing the guest process. A hypothetical non-Linux guest with a microkernel could even survive UEs in drivers. It sounds from your patch like you're changing this behaviour. Is this right?
Regards, Daniel > > Signed-off-by: Aravinda Prasad > --- > arch/powerpc/kvm/book3s_hv.c| 12 +++- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 31 > +++ > 2 files changed, 14 insertions(+), 29 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c > index 2280497..1b1dff0 100644 > --- a/arch/powerpc/kvm/book3s_hv.c > +++ b/arch/powerpc/kvm/book3s_hv.c > @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, > struct kvm_vcpu *vcpu, > r = RESUME_GUEST; > break; > case BOOK3S_INTERRUPT_MACHINE_CHECK: > - /* > - * Deliver a machine check interrupt to the guest. > - * We have to do this, even if the host has handled the > - * machine check, because machine checks use SRR0/1 and > - * the interrupt might have trashed guest state in them. > - */ > - kvmppc_book3s_queue_irqprio(vcpu, > - BOOK3S_INTERRUPT_MACHINE_CHECK); > - r = RESUME_GUEST; > + /* Exit to guest with KVM_EXIT_NMI as exit reason */ > + run->exit_reason = KVM_EXIT_NMI; > + r = RESUME_HOST; > break; > case BOOK3S_INTERRUPT_PROGRAM: > { > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b98889e..672b4f6 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > addir1, r1, 112 > ld r7, HSTATE_HOST_MSR(r13) > > - cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK > cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL > beq 11f > cmpwi cr2, r12, BOOK3S_INTERRUPT_HMI > @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > mtmsrd r6, 1 /* Clear RI in MSR */ > mtsrr0 r8 > mtsrr1 r7 > - beq cr1, 13f/* machine check */ > RFI > > /* On POWER7, we have external interrupts set to use HSRR0/1 */ > @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) > mtspr SPRN_HSRR1, r7 > ba 0x500 > > -13: b machine_check_fwnmi > - > 14: mtspr SPRN_HSRR0, r8 > mtspr SPRN_HSRR1, r7 > b hmi_exception_after_realmode > @@ 
-2381,24 +2377,19 @@ machine_check_realmode: > ld r9, HSTATE_KVM_VCPU(r13) > li r12, BOOK3S_INTERRUPT_MACHINE_CHECK > /* > - * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through > - * machine check interrupt (set HSRR0 to 0x200). And for handled > - * errors (no-fatal), just go back to guest execution with current > - * HSRR0 instead of exiting guest. This new approach will inject > - * machine check to guest for fatal error causing guest to crash. > - * > - * The old code used to return to host for unhandled errors which > - * was causing guest to hang with soft lockups inside guest and > - * makes it difficult to recover guest instance. > + * Deliver unhandled/fatal (e.g. UE) MCE errors to guest > + * by exiting the guest with KVM_EXIT_NMI exit reason (exit > + * reason set later based on trap). For handled errors > + * (no-fatal), go back to guest
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Thursday 12 November 2015 09:04 AM, David Gibson wrote: > On Wed, Nov 11, 2015 at 10:28:46PM +0530, Aravinda Prasad wrote: >> This patch modifies KVM to cause a guest exit with >> KVM_EXIT_NMI instead of immediately delivering a 0x200 >> interrupt to guest upon machine check exception in >> guest address. Exiting the guest enables QEMU to build >> error log and deliver machine check exception to guest >> OS (either via guest OS registered machine check >> handler or via 0x200 guest OS interrupt vector). >> >> This approach simplifies the delivering of machine >> check exception to guest OS compared to the earlier approach >> of KVM directly invoking 0x200 guest interrupt vector. >> In the earlier approach QEMU patched the 0x200 interrupt >> vector during boot. The patched code at 0x200 issued a >> private hcall to pass the control to QEMU to build the >> error log. >> >> This design/approach is based on the feedback for the >> QEMU patches to handle machine check exception. Details >> of earlier approach of handling machine check exception >> in QEMU and related discussions can be found at: >> >> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html >> >> Signed-off-by: Aravinda Prasad>> --- >> arch/powerpc/kvm/book3s_hv.c| 12 +++- >> arch/powerpc/kvm/book3s_hv_rmhandlers.S | 31 >> +++ >> 2 files changed, 14 insertions(+), 29 deletions(-) >> >> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c >> index 2280497..1b1dff0 100644 >> --- a/arch/powerpc/kvm/book3s_hv.c >> +++ b/arch/powerpc/kvm/book3s_hv.c >> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, >> struct kvm_vcpu *vcpu, >> r = RESUME_GUEST; >> break; >> case BOOK3S_INTERRUPT_MACHINE_CHECK: >> -/* >> - * Deliver a machine check interrupt to the guest. >> - * We have to do this, even if the host has handled the >> - * machine check, because machine checks use SRR0/1 and >> - * the interrupt might have trashed guest state in them. 
>> - */ >> -kvmppc_book3s_queue_irqprio(vcpu, >> -BOOK3S_INTERRUPT_MACHINE_CHECK); >> -r = RESUME_GUEST; >> +/* Exit to guest with KVM_EXIT_NMI as exit reason */ >> +run->exit_reason = KVM_EXIT_NMI; >> +r = RESUME_HOST; >> break; >> case BOOK3S_INTERRUPT_PROGRAM: >> { >> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> index b98889e..672b4f6 100644 >> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S >> @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) >> addir1, r1, 112 >> ld r7, HSTATE_HOST_MSR(r13) >> >> -cmpwi cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK >> cmpwi r12, BOOK3S_INTERRUPT_EXTERNAL >> beq 11f >> cmpwi cr2, r12, BOOK3S_INTERRUPT_HMI >> @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) >> mtmsrd r6, 1 /* Clear RI in MSR */ >> mtsrr0 r8 >> mtsrr1 r7 >> -beq cr1, 13f/* machine check */ >> RFI >> >> /* On POWER7, we have external interrupts set to use HSRR0/1 */ >> @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S) >> mtspr SPRN_HSRR1, r7 >> ba 0x500 >> >> -13: b machine_check_fwnmi >> - > > I don't quite understand the 3 hunks above. The rest seems to change > the delivery of MCs as I'd expect, but the above seems to change their > detection and the reason for that isn't obvious to me. We need to do guest exit here and hence we continue with RFI or else if we branch to machine_check_fwnmi then the control will flow to opal_recover_mce routine without causing the guest to exit with NMI exit code. And I think, opal_recover_mce() is used to recover from UEs in host user/kernel space and does not have a check for UEs belonging to guest. Hence if we branch to machine_check_fwnmi we end up running into panic() in opal_machine_check() if UE belonged to guest. 
Regards, Aravinda > > >> 14: mtspr SPRN_HSRR0, r8 >> mtspr SPRN_HSRR1, r7 >> b hmi_exception_after_realmode >> @@ -2381,24 +2377,19 @@ machine_check_realmode: >> ld r9, HSTATE_KVM_VCPU(r13) >> li r12, BOOK3S_INTERRUPT_MACHINE_CHECK >> /* >> - * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through >> - * machine check interrupt (set HSRR0 to 0x200). And for handled >> - * errors (no-fatal), just go back to guest execution with current >> - * HSRR0 instead of exiting guest. This new approach will inject >> - * machine check to guest for fatal error causing guest to crash. >> - * >> - * The old code used to return to host for
[PATCH 2/2] KVM: PPC: Book3S HV: Handle unexpected traps in guest entry/exit code better
As we saw with the TM Bad Thing type of program interrupt occurring on the hrfid that enters the guest, it is not completely impossible to have a trap occurring in the guest entry/exit code, despite the fact that the code has been written to avoid taking any traps. This adds a check in the kvmppc_handle_exit_hv() function to detect the case when a trap has occurred in the hypervisor-mode code, and instead of treating it just like a trap in guest code, we now print a message and return to userspace with a KVM_EXIT_INTERNAL_ERROR exit reason. Of the various interrupts that get handled in the assembly code in the guest exit path and that can return directly to the guest, the only one that can occur when MSR.HV=1 and MSR.EE=0 is machine check (other than system call, which we can avoid just by not doing a sc instruction). Therefore this adds code to the machine check path to ensure that if the MCE occurred in hypervisor mode, we exit to the host rather than trying to continue the guest. Signed-off-by: Paul Mackerras--- arch/powerpc/kvm/book3s_hv.c| 18 ++ arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 ++ 2 files changed, 20 insertions(+) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index f668712..d6baf0a 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -846,6 +846,24 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu, vcpu->stat.sum_exits++; + /* +* This can happen if an interrupt occurs in the last stages +* of guest entry or the first stages of guest exit (i.e. after +* setting paca->kvm_hstate.in_guest to KVM_GUEST_MODE_GUEST_HV +* and before setting it to KVM_GUEST_MODE_HOST_HV). +* That can happen due to a bug, or due to a machine check +* occurring at just the wrong time. 
+*/ + if (vcpu->arch.shregs.msr & MSR_HV) { + printk(KERN_EMERG "KVM trap in HV mode!\n"); + printk(KERN_EMERG "trap=0x%x | pc=0x%lx | msr=0x%llx\n", + vcpu->arch.trap, kvmppc_get_pc(vcpu), + vcpu->arch.shregs.msr); + kvmppc_dump_regs(vcpu); + run->exit_reason = KVM_EXIT_INTERNAL_ERROR; + run->hw.hardware_exit_reason = vcpu->arch.trap; + return RESUME_HOST; + } run->exit_reason = KVM_EXIT_UNKNOWN; run->ready_for_interrupt_injection = 1; switch (vcpu->arch.trap) { diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index 3c6badc..b3ce8ff 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -2404,6 +2404,8 @@ machine_check_realmode: * guest as machine check causing guest to crash. */ ld r11, VCPU_MSR(r9) + rldicl. r0, r11, 64-MSR_HV_LG, 63 /* check if it happened in HV mode */ + bne mc_cont /* if so, exit to host */ andi. r10, r11, MSR_RI/* check for unrecoverable exception */ beq 1f /* Deliver a machine check to guest */ ld r10, VCPU_PC(r9) -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception
On Thursday 12 November 2015 09:08 AM, David Gibson wrote: > On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote: >> Aravinda Prasadwrites: >> >>> This patch modifies KVM to cause a guest exit with >>> KVM_EXIT_NMI instead of immediately delivering a 0x200 >>> interrupt to guest upon machine check exception in >>> guest address. Exiting the guest enables QEMU to build >>> error log and deliver machine check exception to guest >>> OS (either via guest OS registered machine check >>> handler or via 0x200 guest OS interrupt vector). >>> >>> This approach simplifies the delivering of machine >>> check exception to guest OS compared to the earlier approach >>> of KVM directly invoking 0x200 guest interrupt vector. >>> In the earlier approach QEMU patched the 0x200 interrupt >>> vector during boot. The patched code at 0x200 issued a >>> private hcall to pass the control to QEMU to build the >>> error log. >>> >>> This design/approach is based on the feedback for the >>> QEMU patches to handle machine check exception. Details >>> of earlier approach of handling machine check exception >>> in QEMU and related discussions can be found at: >>> >>> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html >> >> I've poked at the MCE code, but not the KVM MCE code, so I may be >> mistaken here, but I'm not clear on how this handles errors that the >> guest can recover without terminating. >> >> For example, a Linux guest can handle a UE in guest userspace by killing >> the guest process. A hypthetical non-linux guest with a microkernel >> could even survive UEs in drivers. >> >> It sounds from your patch like you're changing this behaviour. Is this >> right? > > So, IIUC. Once the qemu pieces are in place as well it shouldn't > change this behaviour: KVM will exit to qemu, qemu will log the error > information (new), then reinject the MC to the guest which can still > handle it as you describe above. Yes. 
With both the KVM and QEMU pieces in place this will not change the behaviour: QEMU will inject the UE into the guest, and the guest handles the UE based on where it occurred. For example, if a UE happens in a guest process address space, that process will be killed.

> > But, there could be a problem if you have a new kernel with an old
> qemu, in that case qemu might not understand the new exit type and
> treat it as a fatal error, even though the guest could actually cope
> with it.

With a new kernel and old QEMU, the guest terminates because old QEMU does not understand the NMI exit reason. However, this is also the case with an old kernel and old QEMU, as they do not handle UEs belonging to the guest either; the difference is only the error code the guest terminates with:

old kernel + old QEMU -> guest panics [1], irrespective of where in the guest address space the UE happened.
old kernel + new QEMU -> guest panics, same as above.
new kernel + old QEMU -> guest terminates with an unhandled NMI error, irrespective of where in the guest the UE happened.
new kernel + new QEMU -> guest handles UEs in a process address space by killing the process; the guest terminates for UEs in guest kernel address space.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118329.html

> > Aravinda, do we need to change this so that qemu has to explicitly
> enable the new NMI behaviour? Or have I missed something that will
> make that case work already.

I think we don't need to explicitly enable the new behaviour: with a new kernel and new QEMU this should just work, and as mentioned above the old combinations are already broken. Any thoughts?

Regards,
Aravinda
[PATCH] KVM: PPC: Book3S HV: Don't dynamically split core when already split
In static micro-threading modes, the dynamic micro-threading code is supposed to be disabled, because subcores can't make independent decisions about what micro-threading mode to put the core in - there is only one micro-threading mode for the whole core. The code that implements dynamic micro-threading checks for this, except that the check was missed in one case. This means that it is possible for a subcore in static 2-way micro-threading mode to try to put the core into 4-way micro-threading mode, which usually leads to stuck CPUs, spinlock lockups, and other stalls in the host. The problem was in the can_split_piggybacked_subcores() function, which should always return false if the system is in a static micro-threading mode. This fixes the problem by making can_split_piggybacked_subcores() use subcore_config_ok() for its checks, as subcore_config_ok() includes the necessary check for the static micro-threading modes. Credit to Gautham Shenoy for working out that the reason for the hangs and stalls we were seeing was that we were trying to do dynamic 4-way micro-threading while we were in static 2-way mode. Fixes: b4deba5c41e9 Cc: v...@stable.kernel.org # v4.3 Signed-off-by: Paul Mackerras--- arch/powerpc/kvm/book3s_hv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 2280497..becad3a 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2060,7 +2060,7 @@ static bool can_split_piggybacked_subcores(struct core_info *cip) return false; n_subcores += (cip->subcore_threads[sub] - 1) >> 1; } - if (n_subcores > 3 || large_sub < 0) + if (large_sub < 0 || !subcore_config_ok(n_subcores + 1, 2)) return false; /* -- 2.6.2 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/1] KVM: PPC: Increase memslots to 320
Only using 32 memslots for KVM on powerpc is way too low; nowadays you can hit this limit quite fast by adding a couple of PCI devices and/or pluggable memory DIMMs to the guest. x86 already increased the limit to 512 in total, to satisfy 256 pluggable DIMM slots, 3 private slots and 253 slots for other things like PCI devices. On powerpc, we only have 32 pluggable DIMMs in QEMU, not 256, so we likely do not need as many slots as on x86. Thus setting the slot limit to 320 sounds like a good value for the time being (until we have some code in the future to resize the memslot array dynamically). And while we're at it, also remove the KVM_MEM_SLOTS_NUM definition from the powerpc-specific header since this gets defined in the generic kvm_host.h header anyway. Signed-off-by: Thomas Huth --- arch/powerpc/include/asm/kvm_host.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h index 887c259..89d387a 100644 --- a/arch/powerpc/include/asm/kvm_host.h +++ b/arch/powerpc/include/asm/kvm_host.h @@ -38,8 +38,7 @@ #define KVM_MAX_VCPUS NR_CPUS #define KVM_MAX_VCORES NR_CPUS -#define KVM_USER_MEM_SLOTS 32 -#define KVM_MEM_SLOTS_NUM KVM_USER_MEM_SLOTS +#define KVM_USER_MEM_SLOTS 320 #ifdef CONFIG_KVM_MMIO #define KVM_COALESCED_MMIO_PAGE_OFFSET 1 -- 1.8.3.1
[PATCH 0/1] KVM: PPC: Increase memslots
Now that the patch from Nikunj to support the KVM_CAP_NR_MEMSLOTS capability on powerpc has been merged to kvm/next, we can/should increase the number of memslots on ppc, too. Since nobody else has sent a patch yet (as far as I can see), I'd like to suggest increasing the slot count to 320 now. Why 320? Well, x86 uses 509 (+3 internal slots), to give enough space for 256 pluggable DIMMs and 253 other slots (for PCI etc.). On powerpc, QEMU only supports 32 pluggable DIMMs, so I think we should be fine with something around 253 + 32 slots + a few extra ... so 320 sounds like a good value to me for the time being (but in the long run, we should really make this dynamic instead, I think). Thomas Huth (1): KVM: PPC: Increase memslots to 320 arch/powerpc/include/asm/kvm_host.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) -- 1.8.3.1
Re: [kvm-unit-tests PATCH 00/14] ppc64: initial drop
On 03/11/2015 08:08, Thomas Huth wrote: > On 03/08/15 16:41, Andrew Jones wrote: >> > This series is the first series of a series of series that will >> > bring support to kvm-unit-tests for ppc64, and eventually ppc64le. > Hi Andrew, > > may I ask about the current state of ppc64 support in the > kvm-unit-tests? Is there a newer version available than the one you > posted three months ago? I've been a slob with all the kvm-unit-tests patches. Andrew, can you send a single submission of all the patches, so that I can review them and apply them? Paolo
Re: [PATCH] KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails
On Tue, Oct 27, 2015 at 04:13:56PM +1100, Paul Mackerras wrote: > When handling a hypervisor data or instruction storage interrupt (HDSI > or HISI), we look up the SLB entry for the address being accessed in > order to translate the effective address to a virtual address which can > be looked up in the guest HPT. This lookup can occasionally fail due > to the guest replacing an SLB entry without invalidating the evicted > SLB entry. In this situation an ERAT (effective to real address > translation cache) entry can persist and be used by the hardware even > though there is no longer a corresponding SLB entry. > > Previously we would just deliver a data or instruction storage interrupt > (DSI or ISI) to the guest in this case. However, this is not correct > and has been observed to cause guests to crash, typically with a > data storage protection interrupt on a store to the vmemmap area. > > Instead, what we do now is to synthesize a data or instruction segment > interrupt. That should cause the guest to reload an appropriate entry > into the SLB and retry the faulting instruction. If it still faults, > we should find an appropriate SLB entry next time and be able to handle > the fault. > > Signed-off-by: Paul Mackerras Reviewed-by: David Gibson -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
Re: [kvm-unit-tests PATCH 00/14] ppc64: initial drop
On Tue, Nov 03, 2015 at 10:40:18AM +0100, Paolo Bonzini wrote: > > > On 03/11/2015 08:08, Thomas Huth wrote: > > On 03/08/15 16:41, Andrew Jones wrote: > >> > This series is the first series of a series of series that will > >> > bring support to kvm-unit-tests for ppc64, and eventually ppc64le. > > Hi Andrew, > > > > may I ask about the current state of ppc64 support in the > > kvm-unit-tests? Is there a newer version available than the one you > > posted three months ago? > Hi Thomas, I haven't gotten around to preparing the v2 yet :-( I do have it on my TODO list, and I'm looking forward to working on it. Now that I know you're looking for it, I'll try to bump it up in priority. Thanks for the interest! > I've been a slob with all the kvm-unit-tests patches. Andrew, can you > send a single submission of all the patches, so that I can review them > and apply them? Hi Paolo, I've got several patches on my staging branch that I believe are ready. I plan to send those as a big "pull" series for your review soon. Thanks, drew > > Paolo > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/7] kvmtool: Cleanup kernel loading
On 2 November 2015 at 14:58, Will Deacon wrote: > On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote: >> Hi, > > Hello Andre, > >> this series cleans up kvmtool's kernel loading functionality a bit. >> It has been broken out of a previous series I sent [1] and contains >> just the cleanup and bug fix parts, which should be less controversial >> and thus easier to merge ;-) >> I will resend the pipe loading part later on as a separate series. >> >> The first patch properly abstracts kernel loading to move >> responsibility into each architecture's code. It removes quite some >> ugly code from the generic kvm.c file. >> The later patches address the naive usage of read(2) to, well, read >> data from files. Doing this without coping with the subtleties of >> the UNIX read semantics (returning with less or none data read is not >> an error) can provoke hard to debug failures. >> So these patches make use of the existing and one new wrapper function >> to make sure we read everything we actually wanted to. >> The last patch moves the ARM kernel loading code into the proper >> location to be in line with the other architectures. >> >> Please have a look and give some comments! > > Looks good to me, but I'd like to see some comments from some mips/ppc/x86 > people on the changes you're making over there. Looks mostly good to me, as one of the kvmtool downstreams. Over at https://github.com/clearlinux/kvmtool we have some patches to tweak the x86 boot flow (which will need rebasing/retweaking), specifically this commit here - https://github.com/clearlinux/kvmtool/commit/a8dee709f85735d16739d0eda0cc00d3c1b17477 -- Regards, Dimitri. 53 sleeps till Christmas, or less https://clearlinux.org Open Source Technology Center Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ.
Re: [PATCH 0/7] kvmtool: Cleanup kernel loading
On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote: > Hi, Hello Andre, > this series cleans up kvmtool's kernel loading functionality a bit. > It has been broken out of a previous series I sent [1] and contains > just the cleanup and bug fix parts, which should be less controversial > and thus easier to merge ;-) > I will resend the pipe loading part later on as a separate series. > > The first patch properly abstracts kernel loading to move > responsibility into each architecture's code. It removes quite some > ugly code from the generic kvm.c file. > The later patches address the naive usage of read(2) to, well, read > data from files. Doing this without coping with the subtleties of > the UNIX read semantics (returning with less or none data read is not > an error) can provoke hard to debug failures. > So these patches make use of the existing and one new wrapper function > to make sure we read everything we actually wanted to. > The last patch moves the ARM kernel loading code into the proper > location to be in line with the other architectures. > > Please have a look and give some comments! Looks good to me, but I'd like to see some comments from some mips/ppc/x86 people on the changes you're making over there. Will -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [kvm-unit-tests PATCH 00/14] ppc64: initial drop
On 03/08/15 16:41, Andrew Jones wrote: > This series is the first series of a series of series that will > bring support to kvm-unit-tests for ppc64, and eventually ppc64le. Hi Andrew, may I ask about the current state of ppc64 support in the kvm-unit-tests? Is there a newer version available than the one you posted three months ago? Thanks, Thomas
Re: [PATCH 0/7] kvmtool: Cleanup kernel loading
Hi Dimitri, On 02/11/15 15:17, Dimitri John Ledkov wrote: > On 2 November 2015 at 14:58, Will Deacon wrote: >> On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote: >>> Hi, >> >> Hello Andre, >> >>> this series cleans up kvmtool's kernel loading functionality a bit. >>> It has been broken out of a previous series I sent [1] and contains >>> just the cleanup and bug fix parts, which should be less controversial >>> and thus easier to merge ;-) >>> I will resend the pipe loading part later on as a separate series. >>> >>> The first patch properly abstracts kernel loading to move >>> responsibility into each architecture's code. It removes quite some >>> ugly code from the generic kvm.c file. >>> The later patches address the naive usage of read(2) to, well, read >>> data from files. Doing this without coping with the subtleties of >>> the UNIX read semantics (returning with less or none data read is not >>> an error) can provoke hard to debug failures. >>> So these patches make use of the existing and one new wrapper function >>> to make sure we read everything we actually wanted to. >>> The last patch moves the ARM kernel loading code into the proper >>> location to be in line with the other architectures. >>> >>> Please have a look and give some comments! >> >> Looks good to me, but I'd like to see some comments from some mips/ppc/x86 >> people on the changes you're making over there. > > Looks mostly good to me, as one of the kvmtool downstreams. Over at > https://github.com/clearlinux/kvmtool we have some patches to tweak > the x86 boot flow (which will need rebasing/retweaking), specifically > this commit here - > https://github.com/clearlinux/kvmtool/commit/a8dee709f85735d16739d0eda0cc00d3c1b17477 Awesome - I was actually thinking about coding something like this! In the last week I moved the MIPS ELF loading out of mips/ into /elf.c to be able to load kvm-unit-tests' tests (which are Multiboot/ELF).
As multiboot requires entering protected mode, I was thinking about changing kvmtool to support entering a guest directly in protected mode - seems like this is mostly what you've done here already. Looks like we should both post our patches to merge them somehow ;-) Cheers, Andre.
Re: [PATCH] KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails
On 27/10/15 06:13, Paul Mackerras wrote: > When handling a hypervisor data or instruction storage interrupt (HDSI > or HISI), we look up the SLB entry for the address being accessed in > order to translate the effective address to a virtual address which can > be looked up in the guest HPT. This lookup can occasionally fail due > to the guest replacing an SLB entry without invalidating the evicted > SLB entry. In this situation an ERAT (effective to real address > translation cache) entry can persist and be used by the hardware even > though there is no longer a corresponding SLB entry. > > Previously we would just deliver a data or instruction storage interrupt > (DSI or ISI) to the guest in this case. However, this is not correct > and has been observed to cause guests to crash, typically with a > data storage protection interrupt on a store to the vmemmap area. > > Instead, what we do now is to synthesize a data or instruction segment > interrupt. That should cause the guest to reload an appropriate entry > into the SLB and retry the faulting instruction. If it still faults, > we should find an appropriate SLB entry next time and be able to handle > the fault. 
> > Signed-off-by: Paul Mackerras> --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b1dab8d..3c6badc 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -1749,7 +1749,8 @@ kvmppc_hdsi: > beq 3f > clrrdi r0, r4, 28 > PPC_SLBFEE_DOT(R5, R0) /* if so, look up SLB */ > - bne 1f /* if no SLB entry found */ > + li r0, BOOK3S_INTERRUPT_DATA_SEGMENT > + bne 7f /* if no SLB entry found */ > 4: std r4, VCPU_FAULT_DAR(r9) > stw r6, VCPU_FAULT_DSISR(r9) > > @@ -1768,14 +1769,15 @@ kvmppc_hdsi: > cmpdi r3, -2 /* MMIO emulation; need instr word */ > beq 2f > > - /* Synthesize a DSI for the guest */ > + /* Synthesize a DSI (or DSegI) for the guest */ > ld r4, VCPU_FAULT_DAR(r9) > mr r6, r3 > -1: mtspr SPRN_DAR, r4 > +1: li r0, BOOK3S_INTERRUPT_DATA_STORAGE > mtspr SPRN_DSISR, r6 > +7: mtspr SPRN_DAR, r4 I first though "hey, you're not setting DSISR for the DsegI now" ... but then I saw in the PowerISA that DSISR is "undefined" for the DSegI, so this should be fine, I think. > mtspr SPRN_SRR0, r10 > mtspr SPRN_SRR1, r11 > - li r10, BOOK3S_INTERRUPT_DATA_STORAGE > + mr r10, r0 > bl kvmppc_msr_interrupt > fast_interrupt_c_return: > 6: ld r7, VCPU_CTR(r9) > @@ -1823,7 +1825,8 @@ kvmppc_hisi: > beq 3f > clrrdi r0, r10, 28 > PPC_SLBFEE_DOT(R5, R0) /* if so, look up SLB */ > - bne 1f /* if no SLB entry found */ > + li r0, BOOK3S_INTERRUPT_INST_SEGMENT > + bne 7f /* if no SLB entry found */ > 4: > /* Search the hash table. 
*/ > mr r3, r9 /* vcpu pointer */ > @@ -1840,11 +1843,12 @@ kvmppc_hisi: > cmpdi r3, -1 /* handle in kernel mode */ > beq guest_exit_cont > > - /* Synthesize an ISI for the guest */ > + /* Synthesize an ISI (or ISegI) for the guest */ > mr r11, r3 > -1: mtspr SPRN_SRR0, r10 > +1: li r0, BOOK3S_INTERRUPT_INST_STORAGE > +7: mtspr SPRN_SRR0, r10 > mtspr SPRN_SRR1, r11 > - li r10, BOOK3S_INTERRUPT_INST_STORAGE > + mr r10, r0 > bl kvmppc_msr_interrupt > b fast_interrupt_c_return As far as I can see (but I'm really not an expert here), the patch seems to be fine. And we have a test running with it for almost 6 days now and did not see any further guest crashes, so it seems to me like this is the right fix for this issue! So if you like, you can add my "Reviewed-by" or "Tested-by" to this patch. Thanks a lot for fixing this! Thomas -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [GIT PULL] Please pull my kvm-ppc-next branch
On 26/10/2015 05:17, Paul Mackerras wrote: > Paolo, > > Here is my current patch queue for KVM on PPC. There's nothing much > in the way of new features this time; it's mostly bug fixes, plus > Nikunj has implemented support for KVM_CAP_NR_MEMSLOTS. These are > intended for the "next" branch of the KVM tree. Please pull. > > Thanks, > Paul. Pulled, thanks. Paolo > The following changes since commit 9ffecb10283508260936b96022d4ee43a7798b4c: > > Linux 4.3-rc3 (2015-09-27 07:50:08 -0400) > > are available in the git repository at: > > git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git > kvm-ppc-next > > for you to fetch changes up to 70aa3961a196ac32baf54032b2051bac9a941118: > > KVM: PPC: Book3S HV: Handle H_DOORBELL on the guest exit path (2015-10-21 > 16:31:52 +1100) > > > Andrzej Hajda (1): > KVM: PPC: e500: fix handling local_sid_lookup result > > Gautham R. Shenoy (1): > KVM: PPC: Book3S HV: Handle H_DOORBELL on the guest exit path > > Mahesh Salgaonkar (1): > KVM: PPC: Book3S HV: Deliver machine check with MSR(RI=0) to guest as > MCE > > Nikunj A Dadhania (1): > KVM: PPC: Implement extension to report number of memslots > > Paul Mackerras (2): > KVM: PPC: Book3S HV: Don't fall back to smaller HPT size in allocation > ioctl > KVM: PPC: Book3S HV: Make H_REMOVE return correct HPTE value for absent > HPTEs > > Tudor Laurentiu (3): > powerpc/e6500: add TMCFG0 register definition > KVM: PPC: e500: Emulate TMCFG0 TMRN register > KVM: PPC: e500: fix couple of shift operations on 64 bits > > arch/powerpc/include/asm/disassemble.h | 5 + > arch/powerpc/include/asm/reg_booke.h| 6 ++ > arch/powerpc/kvm/book3s_64_mmu_hv.c | 3 ++- > arch/powerpc/kvm/book3s_hv_rm_mmu.c | 2 ++ > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 29 ++--- > arch/powerpc/kvm/e500.c | 3 ++- > arch/powerpc/kvm/e500_emulate.c | 19 +++ > arch/powerpc/kvm/e500_mmu_host.c| 4 ++-- > arch/powerpc/kvm/powerpc.c | 3 +++ > 9 files changed, 63 insertions(+), 11 deletions(-) > -- To unsubscribe from this 
[PATCH 6/7] arm/arm64: use read_file() in kernel and initrd loading
Use the new read_file() wrapper in our arm/arm64 kernel image loading function instead of the private implementation. Signed-off-by: Andre Przywara--- arm/fdt.c | 40 ++-- 1 file changed, 18 insertions(+), 22 deletions(-) diff --git a/arm/fdt.c b/arm/fdt.c index ec7453f..19d7ed9 100644 --- a/arm/fdt.c +++ b/arm/fdt.c @@ -224,19 +224,6 @@ static int setup_fdt(struct kvm *kvm) } late_init(setup_fdt); -static int read_image(int fd, void **pos, void *limit) -{ - int count; - - while (((count = xread(fd, *pos, SZ_64K)) > 0) && *pos <= limit) - *pos += count; - - if (pos < 0) - die_perror("xread"); - - return *pos < limit ? 0 : -ENOMEM; -} - #define FDT_ALIGN SZ_2M #define INITRD_ALIGN 4 bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd, @@ -244,6 +231,7 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd, { void *pos, *kernel_end, *limit; unsigned long guest_addr; + ssize_t file_size; if (lseek(fd_kernel, 0, SEEK_SET) < 0) die_perror("lseek"); @@ -256,13 +244,16 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd, pos = kvm->ram_start + ARM_KERN_OFFSET(kvm); kvm->arch.kern_guest_start = host_to_guest_flat(kvm, pos); - if (read_image(fd_kernel, , limit) == -ENOMEM) - die("kernel image too big to contain in guest memory."); + file_size = read_file(fd_kernel, pos, limit - pos); + if (file_size < 0) { + if (errno == ENOMEM) + die("kernel image too big to contain in guest memory."); - kernel_end = pos; - pr_info("Loaded kernel to 0x%llx (%llu bytes)", - kvm->arch.kern_guest_start, - host_to_guest_flat(kvm, pos) - kvm->arch.kern_guest_start); + die_perror("kernel read"); + } + kernel_end = pos + file_size; + pr_info("Loaded kernel to 0x%llx (%zd bytes)", + kvm->arch.kern_guest_start, file_size); /* * Now load backwards from the end of memory so the kernel @@ -300,11 +291,16 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd, die("initrd overlaps with kernel 
image."); initrd_start = guest_addr; - if (read_image(fd_initrd, , limit) == -ENOMEM) - die("initrd too big to contain in guest memory."); + file_size = read_file(fd_initrd, pos, limit - pos); + if (file_size == -1) { + if (errno == ENOMEM) + die("initrd too big to contain in guest memory."); + + die_perror("initrd read"); + } kvm->arch.initrd_guest_start = initrd_start; - kvm->arch.initrd_size = host_to_guest_flat(kvm, pos) - initrd_start; + kvm->arch.initrd_size = file_size; pr_info("Loaded initrd to 0x%llx (%llu bytes)", kvm->arch.initrd_guest_start, kvm->arch.initrd_size); -- 2.5.1 -- To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html