
Re: [PATCH] KVM: PPC: fix ONE_REG AltiVec support

2016-01-04 Thread Greg Kurz
Happy new year!

On Wed, 16 Dec 2015 18:24:03 +0100
Greg Kurz  wrote:

> The get and set operations got exchanged by mistake when moving the
> code from book3s.c to powerpc.c.
> 
> Fixes: 3840edc8033ad5b86deee309c1c321ca54257452
> Signed-off-by: Greg Kurz 
> ---
> 

Ping ?

> It's been there for over a year but I guess we want that in 4.4, even
> if it doesn't break the host kernel.
> 
>  arch/powerpc/kvm/powerpc.c |   20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 6fd2405c7f4a..a3b182dcb823 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -919,21 +919,17 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, 
> struct kvm_one_reg *reg)
>   r = -ENXIO;
>   break;
>   }
> - vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval;
> + val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0];
>   break;
>   case KVM_REG_PPC_VSCR:
>   if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
>   r = -ENXIO;
>   break;
>   }
> - vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val);
> + val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]);
>   break;
>   case KVM_REG_PPC_VRSAVE:
> - if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
> - r = -ENXIO;
> - break;
> - }
> - vcpu->arch.vrsave = set_reg_val(reg->id, val);
> + val = get_reg_val(reg->id, vcpu->arch.vrsave);
>   break;
>  #endif /* CONFIG_ALTIVEC */
>   default:
> @@ -974,17 +970,21 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, 
> struct kvm_one_reg *reg)
>   r = -ENXIO;
>   break;
>   }
> - val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0];
> + vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval;
>   break;
>   case KVM_REG_PPC_VSCR:
>   if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
>   r = -ENXIO;
>   break;
>   }
> - val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]);
> + vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val);
>   break;
>   case KVM_REG_PPC_VRSAVE:
> - val = get_reg_val(reg->id, vcpu->arch.vrsave);
> + if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
> + r = -ENXIO;
> + break;
> + }
> + vcpu->arch.vrsave = set_reg_val(reg->id, val);
>   break;
>  #endif /* CONFIG_ALTIVEC */
>   default:
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[PATCH v1] kvm: Make vcpu->requests as 64 bit bitmap

2015-12-24 Thread Andrey Smetanin
Currently the x86 arch already has 32 requests defined, so newer
request bits can't fit inside vcpu->requests (an unsigned long)
on 32-bit x86 systems. But we are going to add a new request in
the x86 arch for Hyper-V TSC page support.

To solve the problem, this patch replaces vcpu->requests with a
64-bit bitmap and uses the bitmap API.

The patch consists of:
* introducing kvm_vcpu_has_requests() to check whether a vcpu has
pending requests
* introducing kvm_vcpu_requests() to get a pointer to the vcpu requests
* introducing kvm_clear_request() to clear a particular vcpu request
* replacing if (vcpu->requests) with if (kvm_vcpu_has_requests(vcpu))
* replacing clear_bit(req, vcpu->requests) with
 kvm_clear_request(req, vcpu)

Signed-off-by: Andrey Smetanin 
CC: Paolo Bonzini 
CC: Gleb Natapov 
CC: James Hogan 
CC: Paolo Bonzini 
CC: Paul Burton 
CC: Ralf Baechle 
CC: Alexander Graf 
CC: Christian Borntraeger 
CC: Cornelia Huck 
CC: linux-m...@linux-mips.org
CC: kvm-ppc@vger.kernel.org
CC: linux-s...@vger.kernel.org
CC: Roman Kagan 
CC: Denis V. Lunev 
CC: qemu-de...@nongnu.org

---
 arch/mips/kvm/emulate.c   |  4 +---
 arch/powerpc/kvm/book3s_pr.c  |  2 +-
 arch/powerpc/kvm/book3s_pr_papr.c |  2 +-
 arch/powerpc/kvm/booke.c  |  6 +++---
 arch/powerpc/kvm/powerpc.c|  6 +++---
 arch/powerpc/kvm/trace.h  |  2 +-
 arch/s390/kvm/kvm-s390.c  |  4 ++--
 arch/x86/kvm/vmx.c|  2 +-
 arch/x86/kvm/x86.c| 14 +++---
 include/linux/kvm_host.h  | 27 ++-
 10 files changed, 42 insertions(+), 27 deletions(-)

diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
index 41b1b09..14aebe8 100644
--- a/arch/mips/kvm/emulate.c
+++ b/arch/mips/kvm/emulate.c
@@ -774,10 +774,8 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu 
*vcpu)
 * We we are runnable, then definitely go off to user space to
 * check if any I/O interrupts are pending.
 */
-   if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) {
-   clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
+   if (kvm_check_request(KVM_REQ_UNHALT, vcpu))
vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
-   }
}
 
return EMULATE_DONE;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 64891b0..e975279 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -349,7 +349,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 
msr)
if (msr & MSR_POW) {
if (!vcpu->arch.pending_exceptions) {
kvm_vcpu_block(vcpu);
-   clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
+   kvm_clear_request(KVM_REQ_UNHALT, vcpu);
vcpu->stat.halt_wakeup++;
 
/* Unset POW bit after we woke up */
diff --git a/arch/powerpc/kvm/book3s_pr_papr.c 
b/arch/powerpc/kvm/book3s_pr_papr.c
index f2c75a1..60cf393 100644
--- a/arch/powerpc/kvm/book3s_pr_papr.c
+++ b/arch/powerpc/kvm/book3s_pr_papr.c
@@ -309,7 +309,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
case H_CEDE:
kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE);
kvm_vcpu_block(vcpu);
-   clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
+   kvm_clear_request(KVM_REQ_UNHALT, vcpu);
vcpu->stat.halt_wakeup++;
return EMULATE_DONE;
case H_LOGICAL_CI_LOAD:
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index fd58751..6bed382 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -574,7 +574,7 @@ static void arm_next_watchdog(struct kvm_vcpu *vcpu)
 * userspace, so clear the KVM_REQ_WATCHDOG request.
 */
if ((vcpu->arch.tsr & (TSR_ENW | TSR_WIS)) != (TSR_ENW | TSR_WIS))
-   clear_bit(KVM_REQ_WATCHDOG, &vcpu->requests);
+   kvm_clear_request(KVM_REQ_WATCHDOG, vcpu);
 
spin_lock_irqsave(&vcpu->arch.wdt_lock, flags);
nr_jiffies = watchdog_next_timeout(vcpu);
@@ -677,7 +677,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 
kvmppc_core_check_exceptions(vcpu);
 
-   if (vcpu->requests) {
+   if (kvm_vcpu_has_requests(vcpu)) {
/* Exception delivery raised request; start over */
return 1;
}
@@ -685,7 +685,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
if (vcpu->arch.shared->msr & MSR_WE) {
local_irq_enable();
kvm_vcpu_block(vcpu);
-   clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
+   

Re: [PATCH v1] kvm: Make vcpu->requests as 64 bit bitmap

2015-12-24 Thread James Hogan
Hi Andrey,

On Thu, Dec 24, 2015 at 12:30:26PM +0300, Andrey Smetanin wrote:
> Currently on x86 arch we has already 32 requests defined
> so the newer request bits can't be placed inside
> vcpu->requests(unsigned long) inside x86 32 bit system.
> But we are going to add a new request in x86 arch
> for Hyper-V tsc page support.
> 
> To solve the problem the patch replaces vcpu->requests by
> bitmap with 64 bit length and uses bitmap API.
> 
> The patch consists of:
> * announce kvm_vcpu_has_requests() to check whether vcpu has
> requests
> * announce kvm_vcpu_requests() to get vcpu requests pointer
> * announce kvm_clear_request() to clear particular vcpu request
> * replace if (vcpu->requests) by if (kvm_vcpu_has_requests(vcpu))
> * replace clear_bit(req, vcpu->requests) by
>  kvm_clear_request(req, vcpu)
> 
> Signed-off-by: Andrey Smetanin 
> CC: Paolo Bonzini 
> CC: Gleb Natapov 
> CC: James Hogan 
> CC: Paolo Bonzini 
> CC: Paul Burton 
> CC: Ralf Baechle 
> CC: Alexander Graf 
> CC: Christian Borntraeger 
> CC: Cornelia Huck 
> CC: linux-m...@linux-mips.org
> CC: kvm-ppc@vger.kernel.org
> CC: linux-s...@vger.kernel.org
> CC: Roman Kagan 
> CC: Denis V. Lunev 
> CC: qemu-de...@nongnu.org

For MIPS KVM bit:
Acked-by: James Hogan 

Thanks
James

> 
> ---
>  arch/mips/kvm/emulate.c   |  4 +---
>  arch/powerpc/kvm/book3s_pr.c  |  2 +-
>  arch/powerpc/kvm/book3s_pr_papr.c |  2 +-
>  arch/powerpc/kvm/booke.c  |  6 +++---
>  arch/powerpc/kvm/powerpc.c|  6 +++---
>  arch/powerpc/kvm/trace.h  |  2 +-
>  arch/s390/kvm/kvm-s390.c  |  4 ++--
>  arch/x86/kvm/vmx.c|  2 +-
>  arch/x86/kvm/x86.c| 14 +++---
>  include/linux/kvm_host.h  | 27 ++-
>  10 files changed, 42 insertions(+), 27 deletions(-)
> 
> diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
> index 41b1b09..14aebe8 100644
> --- a/arch/mips/kvm/emulate.c
> +++ b/arch/mips/kvm/emulate.c
> @@ -774,10 +774,8 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu 
> *vcpu)
>* We we are runnable, then definitely go off to user space to
>* check if any I/O interrupts are pending.
>*/
> - if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) {
> - clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
> + if (kvm_check_request(KVM_REQ_UNHALT, vcpu))
>   vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
> - }
>   }
>  
>   return EMULATE_DONE;
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index 64891b0..e975279 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -349,7 +349,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 
> msr)
>   if (msr & MSR_POW) {
>   if (!vcpu->arch.pending_exceptions) {
>   kvm_vcpu_block(vcpu);
> - clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
> + kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>   vcpu->stat.halt_wakeup++;
>  
>   /* Unset POW bit after we woke up */
> diff --git a/arch/powerpc/kvm/book3s_pr_papr.c 
> b/arch/powerpc/kvm/book3s_pr_papr.c
> index f2c75a1..60cf393 100644
> --- a/arch/powerpc/kvm/book3s_pr_papr.c
> +++ b/arch/powerpc/kvm/book3s_pr_papr.c
> @@ -309,7 +309,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
>   case H_CEDE:
>   kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE);
>   kvm_vcpu_block(vcpu);
> - clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
> + kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>   vcpu->stat.halt_wakeup++;
>   return EMULATE_DONE;
>   case H_LOGICAL_CI_LOAD:
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index fd58751..6bed382 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -574,7 +574,7 @@ static void arm_next_watchdog(struct kvm_vcpu *vcpu)
>* userspace, so clear the KVM_REQ_WATCHDOG request.
>*/
>   if ((vcpu->arch.tsr & (TSR_ENW | TSR_WIS)) != (TSR_ENW | TSR_WIS))
> - clear_bit(KVM_REQ_WATCHDOG, &vcpu->requests);
> + kvm_clear_request(KVM_REQ_WATCHDOG, vcpu);
>  
>   spin_lock_irqsave(&vcpu->arch.wdt_lock, flags);
>   nr_jiffies = watchdog_next_timeout(vcpu);
> @@ -677,7 +677,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  
>   kvmppc_core_check_exceptions(vcpu);
>  
> - if (vcpu->requests) {
> + if (kvm_vcpu_has_requests(vcpu)) {
>   /* Exception delivery raised request; start 

Re: [PATCH v1] kvm: Make vcpu->requests as 64 bit bitmap

2015-12-24 Thread Andrey Smetanin



On 12/24/2015 02:14 PM, Roman Kagan wrote:

On Thu, Dec 24, 2015 at 12:30:26PM +0300, Andrey Smetanin wrote:

Currently on x86 arch we has already 32 requests defined
so the newer request bits can't be placed inside
vcpu->requests(unsigned long) inside x86 32 bit system.
But we are going to add a new request in x86 arch
for Hyper-V tsc page support.

To solve the problem the patch replaces vcpu->requests by
bitmap with 64 bit length and uses bitmap API.

The patch consists of:
* announce kvm_vcpu_has_requests() to check whether vcpu has
requests
* announce kvm_vcpu_requests() to get vcpu requests pointer


I think that if abstracting out the implementation of the request
container is what you're after, you'd better not define this function;
accesses to the request map should be through your accessor functions.


Good idea! I'll rework this patch and will send V2.



* announce kvm_clear_request() to clear particular vcpu request
* replace if (vcpu->requests) by if (kvm_vcpu_has_requests(vcpu))
* replace clear_bit(req, vcpu->requests) by
  kvm_clear_request(req, vcpu)


Apparently one accessor is missing: test the presence of a request
without clearing it from the bitmap (because kvm_check_request() clears
it).

agree



diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h
index 2e0e67e..4015b88 100644
--- a/arch/powerpc/kvm/trace.h
+++ b/arch/powerpc/kvm/trace.h
@@ -109,7 +109,7 @@ TRACE_EVENT(kvm_check_requests,

TP_fast_assign(
__entry->cpu_nr  = vcpu->vcpu_id;
-   __entry->requests= vcpu->requests;
+   __entry->requests= (__u32)kvm_vcpu_requests(vcpu)[0];
),


This doesn't make sense, to expose only subset of the requests.  BTW I
don't see this event in Linus tree, nor in linux-next,

I found it here:
arch/powerpc/kvm/powerpc.c:104: trace_kvm_check_requests(vcpu);
>so I'm not quite

sure why it's formed this way; I guess the interesting part is the
request number and the return value (i.e. whether it's set), not the
whole bitmap.

No, the code actually dumps the first 32 bits of the bitmap:
TP_printk("vcpu=%x requests=%x",
__entry->cpu_nr, __entry->requests)

So, for this corner case we could make __entry->requests an unsigned long
and add an API helper that returns the first unsigned long of the
vcpu->requests bitmap, then print that part of the bitmap this way:

 TP_printk("vcpu=0x%x requests=0x%lx",
__entry->cpu_nr, __entry->requests)

POWERPC guys, what do you think about such an approach? Or could
we even drop trace_kvm_check_requests() altogether? Is
trace_kvm_check_requests() still useful for you?



--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5957,7 +5957,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu 
*vcpu)
if (intr_window_requested && vmx_interrupt_allowed(vcpu))
return handle_interrupt_window(&vmx->vcpu);

-   if (test_bit(KVM_REQ_EVENT, &vcpu->requests))
+   if (test_bit(KVM_REQ_EVENT, kvm_vcpu_requests(vcpu)))


Here you'd rather want that function to test for the request without
clearing it.

agree, i'll provide such function.



--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2117,7 +2117,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)

if (ka->boot_vcpu_runs_old_kvmclock != tmp)
set_bit(KVM_REQ_MASTERCLOCK_UPDATE,
-   &vcpu->requests);
+   kvm_vcpu_requests(vcpu));


This should have been kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);


agree

@@ -8048,7 +8048,7 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu 
*vcpu)
if (atomic_read(&vcpu->arch.nmi_queued))
return true;

-   if (test_bit(KVM_REQ_SMI, &vcpu->requests))
+   if (test_bit(KVM_REQ_SMI, kvm_vcpu_requests(vcpu)))


Again the test-only accessor is due here.

agree



--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -146,6 +146,8 @@ static inline bool is_error_page(struct page *page)
  #define KVM_REQ_HV_EXIT   30
  #define KVM_REQ_HV_STIMER 31

+#define KVM_REQ_MAX   64
+
  #define KVM_USERSPACE_IRQ_SOURCE_ID   0
  #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID  1

@@ -233,7 +235,7 @@ struct kvm_vcpu {
int vcpu_id;
int srcu_idx;
int mode;
-   unsigned long requests;
+   DECLARE_BITMAP(requests, KVM_REQ_MAX);
unsigned long guest_debug;

int pre_pcpu;
@@ -286,6 +288,16 @@ struct kvm_vcpu {
struct kvm_vcpu_arch arch;
  };

+static inline bool kvm_vcpu_has_requests(struct kvm_vcpu *vcpu)
+{
+   return !bitmap_empty(vcpu->requests, KVM_REQ_MAX);
+}
+
+static inline ulong *kvm_vcpu_requests(struct kvm_vcpu *vcpu)
+{
+   return (ulong *)vcpu->requests;
+}


As I wrote above, I believe this function doesn't belong in the API.

agree, 

Re: [PATCH v1] kvm: Make vcpu->requests as 64 bit bitmap

2015-12-24 Thread Roman Kagan
On Thu, Dec 24, 2015 at 12:30:26PM +0300, Andrey Smetanin wrote:
> Currently on x86 arch we has already 32 requests defined
> so the newer request bits can't be placed inside
> vcpu->requests(unsigned long) inside x86 32 bit system.
> But we are going to add a new request in x86 arch
> for Hyper-V tsc page support.
> 
> To solve the problem the patch replaces vcpu->requests by
> bitmap with 64 bit length and uses bitmap API.
> 
> The patch consists of:
> * announce kvm_vcpu_has_requests() to check whether vcpu has
> requests
> * announce kvm_vcpu_requests() to get vcpu requests pointer

I think that if abstracting out the implementation of the request
container is what you're after, you'd better not define this function;
accesses to the request map should be through your accessor functions.

> * announce kvm_clear_request() to clear particular vcpu request
> * replace if (vcpu->requests) by if (kvm_vcpu_has_requests(vcpu))
> * replace clear_bit(req, vcpu->requests) by
>  kvm_clear_request(req, vcpu)

Apparently one accessor is missing: test the presence of a request
without clearing it from the bitmap (because kvm_check_request() clears
it).

> diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h
> index 2e0e67e..4015b88 100644
> --- a/arch/powerpc/kvm/trace.h
> +++ b/arch/powerpc/kvm/trace.h
> @@ -109,7 +109,7 @@ TRACE_EVENT(kvm_check_requests,
>  
>   TP_fast_assign(
>   __entry->cpu_nr = vcpu->vcpu_id;
> - __entry->requests   = vcpu->requests;
> + __entry->requests   = (__u32)kvm_vcpu_requests(vcpu)[0];
>   ),

This doesn't make sense, to expose only subset of the requests.  BTW I
don't see this event in Linus tree, nor in linux-next, so I'm not quite
sure why it's formed this way; I guess the interesting part is the
request number and the return value (i.e. whether it's set), not the
whole bitmap.

> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -5957,7 +5957,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu 
> *vcpu)
>   if (intr_window_requested && vmx_interrupt_allowed(vcpu))
>   return handle_interrupt_window(&vmx->vcpu);
>  
> - if (test_bit(KVM_REQ_EVENT, &vcpu->requests))
> + if (test_bit(KVM_REQ_EVENT, kvm_vcpu_requests(vcpu)))

Here you'd rather want that function to test for the request without
clearing it.

> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2117,7 +2117,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct 
> msr_data *msr_info)
>  
>   if (ka->boot_vcpu_runs_old_kvmclock != tmp)
>   set_bit(KVM_REQ_MASTERCLOCK_UPDATE,
> - &vcpu->requests);
> + kvm_vcpu_requests(vcpu));

This should have been kvm_make_request(KVM_REQ_MASTERCLOCK_UPDATE, vcpu);

> @@ -8048,7 +8048,7 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu 
> *vcpu)
>   if (atomic_read(&vcpu->arch.nmi_queued))
>   return true;
>  
> - if (test_bit(KVM_REQ_SMI, &vcpu->requests))
> + if (test_bit(KVM_REQ_SMI, kvm_vcpu_requests(vcpu)))

Again the test-only accessor is due here.

> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -146,6 +146,8 @@ static inline bool is_error_page(struct page *page)
>  #define KVM_REQ_HV_EXIT   30
>  #define KVM_REQ_HV_STIMER 31
>  
> +#define KVM_REQ_MAX   64
> +
>  #define KVM_USERSPACE_IRQ_SOURCE_ID  0
>  #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1
>  
> @@ -233,7 +235,7 @@ struct kvm_vcpu {
>   int vcpu_id;
>   int srcu_idx;
>   int mode;
> - unsigned long requests;
> + DECLARE_BITMAP(requests, KVM_REQ_MAX);
>   unsigned long guest_debug;
>  
>   int pre_pcpu;
> @@ -286,6 +288,16 @@ struct kvm_vcpu {
>   struct kvm_vcpu_arch arch;
>  };
>  
> +static inline bool kvm_vcpu_has_requests(struct kvm_vcpu *vcpu)
> +{
> + return !bitmap_empty(vcpu->requests, KVM_REQ_MAX);
> +}
> +
> +static inline ulong *kvm_vcpu_requests(struct kvm_vcpu *vcpu)
> +{
> + return (ulong *)vcpu->requests;
> +}

As I wrote above, I believe this function doesn't belong in the API.

> +
>  static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
>  {
>   return cmpxchg(&vcpu->mode, IN_GUEST_MODE, EXITING_GUEST_MODE);
> @@ -1002,7 +1014,7 @@ static inline bool kvm_is_error_gpa(struct kvm *kvm, 
> gpa_t gpa)
>  
>  static inline void kvm_migrate_timers(struct kvm_vcpu *vcpu)
>  {
> - set_bit(KVM_REQ_MIGRATE_TIMER, &vcpu->requests);
> + set_bit(KVM_REQ_MIGRATE_TIMER, kvm_vcpu_requests(vcpu));

kvm_make_request(KVM_REQ_MIGRATE_TIMER, vcpu);

> @@ -1118,19 +1130,24 @@ static inline bool kvm_vcpu_compatible(struct 
> kvm_vcpu *vcpu) { return true; }
>  
>  static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu)
>  {
> - set_bit(req, &vcpu->requests);
> + set_bit(req, kvm_vcpu_requests(vcpu));
>  }
>  
>  static inline 

[PATCH v2] kvm: Make vcpu->requests as 64 bit bitmap

2015-12-24 Thread Andrey Smetanin
Currently the x86 arch already has 32 requests defined, so newer
request bits can't fit inside vcpu->requests (an unsigned long)
on 32-bit x86 systems. But we are going to add a new request in
the x86 arch for Hyper-V TSC page support.

To solve the problem, this patch replaces vcpu->requests with a
64-bit bitmap and uses the bitmap API.

The patch consists of:
* introducing kvm_has_requests() to check whether a vcpu has
pending requests
* introducing kvm_clear_request() to clear a particular vcpu request
* introducing kvm_test_request() to test a particular vcpu request
* replacing if (vcpu->requests) with if (kvm_has_requests(vcpu))
* replacing clear_bit(req, vcpu->requests) with
 kvm_clear_request(req, vcpu)

Changes in v2:
* hide the internals of the vcpu requests bitmap behind accessor
functions everywhere
* replace test_bit(req, vcpu->requests) with
 kvm_test_request()
* POWERPC: trace the vcpu requests bitmap via
__bitmask, __assign_bitmap, __get_bitmask

Signed-off-by: Andrey Smetanin 
Acked-by: James Hogan 
CC: Paolo Bonzini 
CC: Gleb Natapov 
CC: James Hogan 
CC: Paul Burton 
CC: Ralf Baechle 
CC: Alexander Graf 
CC: Christian Borntraeger 
CC: Cornelia Huck 
CC: linux-m...@linux-mips.org
CC: kvm-ppc@vger.kernel.org
CC: linux-s...@vger.kernel.org
CC: Roman Kagan 
CC: Denis V. Lunev 
CC: qemu-de...@nongnu.org
---
 arch/mips/kvm/emulate.c   |  4 +---
 arch/powerpc/kvm/book3s_pr.c  |  2 +-
 arch/powerpc/kvm/book3s_pr_papr.c |  2 +-
 arch/powerpc/kvm/booke.c  |  6 +++---
 arch/powerpc/kvm/powerpc.c|  6 +++---
 arch/powerpc/kvm/trace.h  |  9 +
 arch/s390/kvm/kvm-s390.c  |  4 ++--
 arch/x86/kvm/vmx.c|  2 +-
 arch/x86/kvm/x86.c| 16 
 include/linux/kvm_host.h  | 38 +-
 10 files changed, 50 insertions(+), 39 deletions(-)

diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
index 41b1b09..14aebe8 100644
--- a/arch/mips/kvm/emulate.c
+++ b/arch/mips/kvm/emulate.c
@@ -774,10 +774,8 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu 
*vcpu)
 * We we are runnable, then definitely go off to user space to
 * check if any I/O interrupts are pending.
 */
-   if (kvm_check_request(KVM_REQ_UNHALT, vcpu)) {
-   clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
+   if (kvm_check_request(KVM_REQ_UNHALT, vcpu))
vcpu->run->exit_reason = KVM_EXIT_IRQ_WINDOW_OPEN;
-   }
}
 
return EMULATE_DONE;
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 64891b0..e975279 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -349,7 +349,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 
msr)
if (msr & MSR_POW) {
if (!vcpu->arch.pending_exceptions) {
kvm_vcpu_block(vcpu);
-   clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
+   kvm_clear_request(KVM_REQ_UNHALT, vcpu);
vcpu->stat.halt_wakeup++;
 
/* Unset POW bit after we woke up */
diff --git a/arch/powerpc/kvm/book3s_pr_papr.c 
b/arch/powerpc/kvm/book3s_pr_papr.c
index f2c75a1..60cf393 100644
--- a/arch/powerpc/kvm/book3s_pr_papr.c
+++ b/arch/powerpc/kvm/book3s_pr_papr.c
@@ -309,7 +309,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
case H_CEDE:
kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE);
kvm_vcpu_block(vcpu);
-   clear_bit(KVM_REQ_UNHALT, &vcpu->requests);
+   kvm_clear_request(KVM_REQ_UNHALT, vcpu);
vcpu->stat.halt_wakeup++;
return EMULATE_DONE;
case H_LOGICAL_CI_LOAD:
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index fd58751..b2e8643 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -574,7 +574,7 @@ static void arm_next_watchdog(struct kvm_vcpu *vcpu)
 * userspace, so clear the KVM_REQ_WATCHDOG request.
 */
if ((vcpu->arch.tsr & (TSR_ENW | TSR_WIS)) != (TSR_ENW | TSR_WIS))
-   clear_bit(KVM_REQ_WATCHDOG, &vcpu->requests);
+   kvm_clear_request(KVM_REQ_WATCHDOG, vcpu);
 
spin_lock_irqsave(>arch.wdt_lock, flags);
nr_jiffies = watchdog_next_timeout(vcpu);
@@ -677,7 +677,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 
kvmppc_core_check_exceptions(vcpu);
 
-   if (vcpu->requests) {
+   if (kvm_has_requests(vcpu)) {
/* Exception delivery raised request; start over */
return 1;
}
@@ -685,7 +685,7 @@ int 

Re: [PATCH v2] kvm: Make vcpu->requests as 64 bit bitmap

2015-12-24 Thread Roman Kagan
On Thu, Dec 24, 2015 at 04:29:21PM +0300, Andrey Smetanin wrote:
> Currently on x86 arch we has already 32 requests defined
> so the newer request bits can't be placed inside
> vcpu->requests(unsigned long) inside x86 32 bit system.
> But we are going to add a new request in x86 arch
> for Hyper-V tsc page support.
> 
> To solve the problem the patch replaces vcpu->requests by
> bitmap with 64 bit length and uses bitmap API.
> 
> The patch consists of:
> * announce kvm_has_requests() to check whether vcpu has
> requests
> * announce kvm_clear_request() to clear particular vcpu request
> * announce kvm_test_request() to test particular vcpu request
> * replace if (vcpu->requests) by if (kvm_has_requests(vcpu))
> * replace clear_bit(req, vcpu->requests) by
>  kvm_clear_request(req, vcpu)
> 
> Changes v2:
> * hide internals of vcpu requests bitmap
> by interface usage in all places
> * replace test_bit(req, vcpu->requests) by
>  kvm_test_request()
> * POWERPC: trace vcpu requests bitmap by
> __bitmask, __assign_bitmap, __get_bitmask
> 
> Signed-off-by: Andrey Smetanin 
> Acked-by: James Hogan 
> CC: Paolo Bonzini 
> CC: Gleb Natapov 
> CC: James Hogan 
> CC: Paul Burton 
> CC: Ralf Baechle 
> CC: Alexander Graf 
> CC: Christian Borntraeger 
> CC: Cornelia Huck 
> CC: linux-m...@linux-mips.org
> CC: kvm-ppc@vger.kernel.org
> CC: linux-s...@vger.kernel.org
> CC: Roman Kagan 
> CC: Denis V. Lunev 
> CC: qemu-de...@nongnu.org
> ---
>  arch/mips/kvm/emulate.c   |  4 +---
>  arch/powerpc/kvm/book3s_pr.c  |  2 +-
>  arch/powerpc/kvm/book3s_pr_papr.c |  2 +-
>  arch/powerpc/kvm/booke.c  |  6 +++---
>  arch/powerpc/kvm/powerpc.c|  6 +++---
>  arch/powerpc/kvm/trace.h  |  9 +
>  arch/s390/kvm/kvm-s390.c  |  4 ++--
>  arch/x86/kvm/vmx.c|  2 +-
>  arch/x86/kvm/x86.c| 16 
>  include/linux/kvm_host.h  | 38 +-
>  10 files changed, 50 insertions(+), 39 deletions(-)

Reviewed-by: Roman Kagan 


Re: [PATCH kernel 7/9] KVM: PPC: Move reusable bits of H_PUT_TCE handler to helpers

2015-12-21 Thread Alexey Kardashevskiy

On 12/08/2015 04:27 PM, David Gibson wrote:

On Tue, Sep 15, 2015 at 08:49:37PM +1000, Alexey Kardashevskiy wrote:

Upcoming multi-TCE support (the H_PUT_TCE_INDIRECT/H_STUFF_TCE hypercalls)
will validate the TCE (checking it has no unexpected bits) and the IO
address (checking it is within the DMA window boundaries).

This introduces helpers to validate TCE and IO address.

Signed-off-by: Alexey Kardashevskiy 
---
  arch/powerpc/include/asm/kvm_ppc.h  |  4 ++
  arch/powerpc/kvm/book3s_64_vio_hv.c | 89 -
  2 files changed, 83 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index c6ef05b..fcde896 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -166,6 +166,10 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu);

  extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
struct kvm_create_spapr_tce *args);
+extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
+   unsigned long ioba, unsigned long npages);
+extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt,
+   unsigned long tce);
  extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
 unsigned long ioba, unsigned long tce);
  extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
b/arch/powerpc/kvm/book3s_64_vio_hv.c
index 6cf1ab3..f0fd84c 100644
--- a/arch/powerpc/kvm/book3s_64_vio_hv.c
+++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
@@ -36,6 +36,7 @@
  #include 
  #include 
  #include 
+#include 

  #define TCES_PER_PAGE (PAGE_SIZE / sizeof(u64))

@@ -64,7 +65,7 @@ static struct kvmppc_spapr_tce_table 
*kvmppc_find_table(struct kvm_vcpu *vcpu,
   * WARNING: This will be called in real-mode on HV KVM and virtual
   *  mode on PR KVM
   */
-static long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
+long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
unsigned long ioba, unsigned long npages)
  {
unsigned long mask = (1ULL << IOMMU_PAGE_SHIFT_4K) - 1;
@@ -76,6 +77,79 @@ static long kvmppc_ioba_validate(struct 
kvmppc_spapr_tce_table *stt,

return H_SUCCESS;
  }
+EXPORT_SYMBOL_GPL(kvmppc_ioba_validate);


Why does it need to be exported - the new users will still be in the
KVM module, won't they?



book3s_64_vio_hv.c contains real-mode code and is always compiled into 
vmlinux, while the helper is meant to be called from book3s_64_vio.c, 
which may be compiled as a module.






+
+/*
+ * Validates TCE address.
+ * At the moment flags and page mask are validated.
+ * As the host kernel does not access those addresses (just puts them
+ * to the table and user space is supposed to process them), we can skip
+ * checking other things (such as TCE is a guest RAM address or the page
+ * was actually allocated).


Hmm.  These comments apply given that the only current user of this is
the kvm acceleration of userspace TCE tables, but the name suggests it
would validate any TCE, including in kernel ones for which this would
be unsafe.



The function has the "kvmppc_" prefix and the file lives in 
arch/powerpc/kvm, so to my taste it is self-explanatory that it only 
handles TCEs from KVM guests (not even from a random user-space tool), no?





+ * WARNING: This will be called in real-mode on HV KVM and virtual
+ *  mode on PR KVM
+ */
+long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *stt, unsigned long tce)
+{
+   unsigned long mask = ((1ULL << IOMMU_PAGE_SHIFT_4K) - 1) &
+   ~(TCE_PCI_WRITE | TCE_PCI_READ);
+
+   if (tce & mask)
+   return H_PARAMETER;
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_tce_validate);
+
+/* Note on the use of page_address() in real mode,
+ *
+ * It is safe to use page_address() in real mode on ppc64 because
+ * page_address() is always defined as lowmem_page_address()
+ * which returns __va(PFN_PHYS(page_to_pfn(page))), which is an arithmetic
+ * operation and does not access the page struct.
+ *
+ * Theoretically page_address() could be defined differently,
+ * but then either WANT_PAGE_VIRTUAL or HASHED_PAGE_VIRTUAL
+ * would have to be enabled.
+ * WANT_PAGE_VIRTUAL is never enabled on ppc32/ppc64,
+ * HASHED_PAGE_VIRTUAL could be enabled for ppc32 only and only
+ * if CONFIG_HIGHMEM is defined. As CONFIG_SPARSEMEM_VMEMMAP
+ * is not expected to be enabled on ppc32, page_address()
+ * is safe for ppc32 as well.
+ *
+ * WARNING: This will be called in real-mode on HV KVM and virtual
+ *  mode on PR KVM
+ */
+static u64 *kvmppc_page_address(struct page *page)
+{
+#if defined(HASHED_PAGE_VIRTUAL) || defined(WANT_PAGE_VIRTUAL)
+#error TODO: fix to avoid page_address() here
+#endif
+   return (u64 *) page_address(page);
+}
+
+/*
+ * Handles TCE requests for emulated devices.
+ * Puts guest 

Re: [PATCH kernel 9/9] KVM: PPC: Add support for multiple-TCE hcalls

2015-12-21 Thread Alexey Kardashevskiy

On 12/08/2015 04:48 PM, David Gibson wrote:

On Tue, Sep 15, 2015 at 08:49:39PM +1000, Alexey Kardashevskiy wrote:

This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and
H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO
devices or emulated PCI.  These calls allow adding multiple entries
(up to 512) into the TCE table in one call, which saves time on
transitions between kernel and user space.

This implements the KVM_CAP_PPC_MULTITCE capability. When present,
the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE.
If they cannot be handled by the kernel, they are passed on to
user space. User space still has to have an implementation
for these.

Both HV and PR-style KVM are supported.

Signed-off-by: Alexey Kardashevskiy 
---
  Documentation/virtual/kvm/api.txt   |  25 ++
  arch/powerpc/include/asm/kvm_ppc.h  |  12 +++
  arch/powerpc/kvm/book3s_64_vio.c| 111 +++-
  arch/powerpc/kvm/book3s_64_vio_hv.c | 145 ++--
  arch/powerpc/kvm/book3s_hv.c|  26 +-
  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   6 +-
  arch/powerpc/kvm/book3s_pr_papr.c   |  35 
  arch/powerpc/kvm/powerpc.c  |   3 +
  8 files changed, 350 insertions(+), 13 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index d86d831..593c62a 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3019,6 +3019,31 @@ Returns: 0 on success, -1 on error

  Queues an SMI on the thread's vcpu.

+4.97 KVM_CAP_PPC_MULTITCE
+
+Capability: KVM_CAP_PPC_MULTITCE
+Architectures: ppc
+Type: vm
+
+This capability means the kernel is capable of handling hypercalls
+H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
+space. This significantly accelerates DMA operations for PPC KVM guests.
+User space should expect that its handlers for these hypercalls
+are not going to be called if user space previously registered LIOBN
+in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
+
+In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
+user space might have to advertise it for the guest. For example,
+IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
+present in the "ibm,hypertas-functions" device-tree property.
+
+The hypercalls mentioned above may or may not be processed successfully
+in the kernel based fast path. If they can not be handled by the kernel,
+they will get passed on to user space. So user space still has to have
+an implementation for these despite the in kernel acceleration.
+
+This capability is always enabled.
+
  5. The kvm_run structure
  

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index fcde896..e5b968e 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -166,12 +166,24 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu);

  extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
struct kvm_create_spapr_tce *args);
+extern struct kvmppc_spapr_tce_table *kvmppc_find_table(
+   struct kvm_vcpu *vcpu, unsigned long liobn);
  extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
unsigned long ioba, unsigned long npages);
  extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt,
unsigned long tce);
+extern long kvmppc_gpa_to_ua(struct kvm *kvm, unsigned long gpa,
+   unsigned long *ua, unsigned long **prmap);
+extern void kvmppc_tce_put(struct kvmppc_spapr_tce_table *tt,
+   unsigned long idx, unsigned long tce);
  extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
 unsigned long ioba, unsigned long tce);
+extern long kvmppc_h_put_tce_indirect(struct kvm_vcpu *vcpu,
+   unsigned long liobn, unsigned long ioba,
+   unsigned long tce_list, unsigned long npages);
+extern long kvmppc_h_stuff_tce(struct kvm_vcpu *vcpu,
+   unsigned long liobn, unsigned long ioba,
+   unsigned long tce_value, unsigned long npages);
  extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
 unsigned long ioba);
  extern struct page *kvm_alloc_hpt(unsigned long nr_pages);
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index e347856..d3fc732 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -14,6 +14,7 @@
   *
   * Copyright 2010 Paul Mackerras, IBM Corp. 
   * Copyright 2011 David Gibson, IBM Corporation 
+ * Copyright 2013 Alexey Kardashevskiy, IBM Corporation 
   */

  #include 
@@ -37,8 +38,7 @@
  #include 
  #include 
  #include 
-
-#define TCES_PER_PAGE  



[PATCH] KVM: PPC: fix ONE_REG AltiVec support

2015-12-16 Thread Greg Kurz
The get and set operations got exchanged by mistake when moving the
code from book3s.c to powerpc.c.

Fixes: 3840edc8033ad5b86deee309c1c321ca54257452
Signed-off-by: Greg Kurz 
---

It's been there for over a year but I guess we want that in 4.4, even
if it doesn't break the host kernel.

 arch/powerpc/kvm/powerpc.c |   20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 6fd2405c7f4a..a3b182dcb823 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -919,21 +919,17 @@ int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
r = -ENXIO;
break;
}
-   vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval;
+   val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0];
break;
case KVM_REG_PPC_VSCR:
if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
r = -ENXIO;
break;
}
-   vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val);
+   val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]);
break;
case KVM_REG_PPC_VRSAVE:
-   if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
-   r = -ENXIO;
-   break;
-   }
-   vcpu->arch.vrsave = set_reg_val(reg->id, val);
+   val = get_reg_val(reg->id, vcpu->arch.vrsave);
break;
 #endif /* CONFIG_ALTIVEC */
default:
@@ -974,17 +970,21 @@ int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, struct kvm_one_reg *reg)
r = -ENXIO;
break;
}
-   val.vval = vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0];
+   vcpu->arch.vr.vr[reg->id - KVM_REG_PPC_VR0] = val.vval;
break;
case KVM_REG_PPC_VSCR:
if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
r = -ENXIO;
break;
}
-   val = get_reg_val(reg->id, vcpu->arch.vr.vscr.u[3]);
+   vcpu->arch.vr.vscr.u[3] = set_reg_val(reg->id, val);
break;
case KVM_REG_PPC_VRSAVE:
-   val = get_reg_val(reg->id, vcpu->arch.vrsave);
+   if (!cpu_has_feature(CPU_FTR_ALTIVEC)) {
+   r = -ENXIO;
+   break;
+   }
+   vcpu->arch.vrsave = set_reg_val(reg->id, val);
break;
 #endif /* CONFIG_ALTIVEC */
default:

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception

2015-12-16 Thread Aravinda Prasad


On Thursday 17 December 2015 08:02 AM, David Gibson wrote:
> On Wed, Dec 16, 2015 at 11:26:12AM +0530, Aravinda Prasad wrote:
>> This patch modifies KVM to cause a guest exit with
>> KVM_EXIT_NMI instead of immediately delivering a 0x200
>> interrupt to guest upon machine check exception in
>> guest address. Exiting the guest enables QEMU to build
>> error log and deliver machine check exception to guest
>> OS (either via guest OS registered machine check
>> handler or via 0x200 guest OS interrupt vector).
>>
>> This approach simplifies the delivering of machine
>> check exception to guest OS compared to the earlier approach
>> of KVM directly invoking 0x200 guest interrupt vector.
>> In the earlier approach QEMU patched the 0x200 interrupt
>> vector during boot. The patched code at 0x200 issued a
>> private hcall to pass the control to QEMU to build the
>> error log.
>>
>> This design/approach is based on the feedback for the
>> QEMU patches to handle machine check exception. Details
>> of earlier approach of handling machine check exception
>> in QEMU and related discussions can be found at:
>>
>> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
>>
>> This patch also introduces a new KVM capability to
>> control how KVM behaves on machine check exception.
>> Without this capability, KVM redirects machine check
>> exceptions to guest's 0x200 vector if the address in
>> error belongs to guest. With this capability KVM
>> causes a guest exit with NMI exit reason.
>>
>> This is required to avoid problems if a new kernel/KVM
>> is used with an old QEMU for guests that don't issue
>> "ibm,nmi-register". As old QEMU does not understand the
>> NMI exit type, it treats it as a fatal error. However,
>> the guest could have handled the machine check error
>> if the exception was delivered to guest's 0x200 interrupt
>> vector instead of NMI exit in case of old QEMU.
>>
>> Change Log v2:
>>   - Added KVM capability
>>
>> Signed-off-by: Aravinda Prasad 
>> ---
>>  arch/powerpc/include/asm/kvm_host.h |1 +
>>  arch/powerpc/kernel/asm-offsets.c   |1 +
>>  arch/powerpc/kvm/book3s_hv.c|   12 +++---
>>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   37 
>> +++
>>  arch/powerpc/kvm/powerpc.c  |7 ++
>>  include/uapi/linux/kvm.h|1 +
>>  6 files changed, 31 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/kvm_host.h 
>> b/arch/powerpc/include/asm/kvm_host.h
>> index 827a38d..8a652ba 100644
>> --- a/arch/powerpc/include/asm/kvm_host.h
>> +++ b/arch/powerpc/include/asm/kvm_host.h
>> @@ -243,6 +243,7 @@ struct kvm_arch {
>>  int hpt_cma_alloc;
>>  struct dentry *debugfs_dir;
>>  struct dentry *htab_dentry;
>> +u8 fwnmi_enabled;
>>  #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
>>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>>  struct mutex hpt_mutex;
>> diff --git a/arch/powerpc/kernel/asm-offsets.c 
>> b/arch/powerpc/kernel/asm-offsets.c
>> index 221d584..6a4e81a 100644
>> --- a/arch/powerpc/kernel/asm-offsets.c
>> +++ b/arch/powerpc/kernel/asm-offsets.c
>> @@ -506,6 +506,7 @@ int main(void)
>>  DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
>>  DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
>>  DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
>> +DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled));
>>  DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
>>  DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
>>  DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 2280497..1b1dff0 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
>> struct kvm_vcpu *vcpu,
>>  r = RESUME_GUEST;
>>  break;
>>  case BOOK3S_INTERRUPT_MACHINE_CHECK:
>> -/*
>> - * Deliver a machine check interrupt to the guest.
>> - * We have to do this, even if the host has handled the
>> - * machine check, because machine checks use SRR0/1 and
>> - * the interrupt might have trashed guest state in them.
>> - */
>> -kvmppc_book3s_queue_irqprio(vcpu,
>> -BOOK3S_INTERRUPT_MACHINE_CHECK);
>> -r = RESUME_GUEST;
>> +/* Exit to guest with KVM_EXIT_NMI as exit reason */
>> +run->exit_reason = KVM_EXIT_NMI;
>> +r = RESUME_HOST;
>>  break;
>>  case BOOK3S_INTERRUPT_PROGRAM:
>>  {
>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
>> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> index b98889e..f43c124 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ 

Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception

2015-12-16 Thread David Gibson
On Wed, Dec 16, 2015 at 11:26:12AM +0530, Aravinda Prasad wrote:
> This patch modifies KVM to cause a guest exit with
> KVM_EXIT_NMI instead of immediately delivering a 0x200
> interrupt to guest upon machine check exception in
> guest address. Exiting the guest enables QEMU to build
> error log and deliver machine check exception to guest
> OS (either via guest OS registered machine check
> handler or via 0x200 guest OS interrupt vector).
> 
> This approach simplifies the delivering of machine
> check exception to guest OS compared to the earlier approach
> of KVM directly invoking 0x200 guest interrupt vector.
> In the earlier approach QEMU patched the 0x200 interrupt
> vector during boot. The patched code at 0x200 issued a
> private hcall to pass the control to QEMU to build the
> error log.
> 
> This design/approach is based on the feedback for the
> QEMU patches to handle machine check exception. Details
> of earlier approach of handling machine check exception
> in QEMU and related discussions can be found at:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> 
> This patch also introduces a new KVM capability to
> control how KVM behaves on machine check exception.
> Without this capability, KVM redirects machine check
> exceptions to guest's 0x200 vector if the address in
> error belongs to guest. With this capability KVM
> causes a guest exit with NMI exit reason.
> 
> This is required to avoid problems if a new kernel/KVM
> is used with an old QEMU for guests that don't issue
> "ibm,nmi-register". As old QEMU does not understand the
> NMI exit type, it treats it as a fatal error. However,
> the guest could have handled the machine check error
> if the exception was delivered to guest's 0x200 interrupt
> vector instead of NMI exit in case of old QEMU.
> 
> Change Log v2:
>   - Added KVM capability
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  arch/powerpc/include/asm/kvm_host.h |1 +
>  arch/powerpc/kernel/asm-offsets.c   |1 +
>  arch/powerpc/kvm/book3s_hv.c|   12 +++---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   37 
> +++
>  arch/powerpc/kvm/powerpc.c  |7 ++
>  include/uapi/linux/kvm.h|1 +
>  6 files changed, 31 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index 827a38d..8a652ba 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -243,6 +243,7 @@ struct kvm_arch {
>   int hpt_cma_alloc;
>   struct dentry *debugfs_dir;
>   struct dentry *htab_dentry;
> + u8 fwnmi_enabled;
>  #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>   struct mutex hpt_mutex;
> diff --git a/arch/powerpc/kernel/asm-offsets.c 
> b/arch/powerpc/kernel/asm-offsets.c
> index 221d584..6a4e81a 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -506,6 +506,7 @@ int main(void)
>   DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
>   DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
>   DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
> + DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled));
>   DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
>   DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
>   DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 2280497..1b1dff0 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
> struct kvm_vcpu *vcpu,
>   r = RESUME_GUEST;
>   break;
>   case BOOK3S_INTERRUPT_MACHINE_CHECK:
> - /*
> -  * Deliver a machine check interrupt to the guest.
> -  * We have to do this, even if the host has handled the
> -  * machine check, because machine checks use SRR0/1 and
> -  * the interrupt might have trashed guest state in them.
> -  */
> - kvmppc_book3s_queue_irqprio(vcpu,
> - BOOK3S_INTERRUPT_MACHINE_CHECK);
> - r = RESUME_GUEST;
> + /* Exit to guest with KVM_EXIT_NMI as exit reason */
> + run->exit_reason = KVM_EXIT_NMI;
> + r = RESUME_HOST;
>   break;
>   case BOOK3S_INTERRUPT_PROGRAM:
>   {
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..f43c124 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   addir1, r1, 112

Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception

2015-12-16 Thread Aravinda Prasad


On Wednesday 16 December 2015 03:10 PM, Thomas Huth wrote:
> On 16/12/15 06:56, Aravinda Prasad wrote:
>> This patch modifies KVM to cause a guest exit with
>> KVM_EXIT_NMI instead of immediately delivering a 0x200
>> interrupt to guest upon machine check exception in
>> guest address. Exiting the guest enables QEMU to build
>> error log and deliver machine check exception to guest
>> OS (either via guest OS registered machine check
>> handler or via 0x200 guest OS interrupt vector).
>>
>> This approach simplifies the delivering of machine
>> check exception to guest OS compared to the earlier approach
>> of KVM directly invoking 0x200 guest interrupt vector.
>> In the earlier approach QEMU patched the 0x200 interrupt
>> vector during boot. The patched code at 0x200 issued a
>> private hcall to pass the control to QEMU to build the
>> error log.
>>
>> This design/approach is based on the feedback for the
>> QEMU patches to handle machine check exception. Details
>> of earlier approach of handling machine check exception
>> in QEMU and related discussions can be found at:
>>
>> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
>>
>> This patch also introduces a new KVM capability to
>> control how KVM behaves on machine check exception.
>> Without this capability, KVM redirects machine check
>> exceptions to guest's 0x200 vector if the address in
>> error belongs to guest. With this capability KVM
>> causes a guest exit with NMI exit reason.
>>
>> This is required to avoid problems if a new kernel/KVM
>> is used with an old QEMU for guests that don't issue
>> "ibm,nmi-register". As old QEMU does not understand the
>> NMI exit type, it treats it as a fatal error. However,
>> the guest could have handled the machine check error
>> if the exception was delivered to guest's 0x200 interrupt
>> vector instead of NMI exit in case of old QEMU.
>>
>> Change Log v2:
>>   - Added KVM capability
>>
>> Signed-off-by: Aravinda Prasad 
>> ---
>>  arch/powerpc/include/asm/kvm_host.h |1 +
>>  arch/powerpc/kernel/asm-offsets.c   |1 +
>>  arch/powerpc/kvm/book3s_hv.c|   12 +++---
>>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   37 
>> +++
>>  arch/powerpc/kvm/powerpc.c  |7 ++
>>  include/uapi/linux/kvm.h|1 +
>>  6 files changed, 31 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/kvm_host.h 
>> b/arch/powerpc/include/asm/kvm_host.h
>> index 827a38d..8a652ba 100644
>> --- a/arch/powerpc/include/asm/kvm_host.h
>> +++ b/arch/powerpc/include/asm/kvm_host.h
>> @@ -243,6 +243,7 @@ struct kvm_arch {
>>  int hpt_cma_alloc;
>>  struct dentry *debugfs_dir;
>>  struct dentry *htab_dentry;
>> +u8 fwnmi_enabled;
> 
> Here you declare the variable as 8-bits ...
> 
>>  #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
>>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>>  struct mutex hpt_mutex;
>> diff --git a/arch/powerpc/kernel/asm-offsets.c 
>> b/arch/powerpc/kernel/asm-offsets.c
>> index 221d584..6a4e81a 100644
>> --- a/arch/powerpc/kernel/asm-offsets.c
>> +++ b/arch/powerpc/kernel/asm-offsets.c
>> @@ -506,6 +506,7 @@ int main(void)
>>  DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
>>  DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
>>  DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
>> +DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled));
> 
> ... then define an asm-offset for it ...
> 
>>  DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
>>  DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
>>  DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
>> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> index b98889e..f43c124 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> [...]
>> @@ -2381,24 +2377,27 @@ machine_check_realmode:
>>  ld  r9, HSTATE_KVM_VCPU(r13)
>>  li  r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>>  /*
>> - * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through
>> - * machine check interrupt (set HSRR0 to 0x200). And for handled
>> - * errors (no-fatal), just go back to guest execution with current
>> - * HSRR0 instead of exiting guest. This new approach will inject
>> - * machine check to guest for fatal error causing guest to crash.
>> - *
>> - * The old code used to return to host for unhandled errors which
>> - * was causing guest to hang with soft lockups inside guest and
>> - * makes it difficult to recover guest instance.
>> + * Deliver unhandled/fatal (e.g. UE) MCE errors to guest
>> + * by exiting the guest with KVM_EXIT_NMI exit reason (exit
>> + * reason set later based on trap). For handled errors
>> + * (no-fatal), go back to guest 

Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception

2015-12-16 Thread Thomas Huth
On 16/12/15 06:56, Aravinda Prasad wrote:
> This patch modifies KVM to cause a guest exit with
> KVM_EXIT_NMI instead of immediately delivering a 0x200
> interrupt to guest upon machine check exception in
> guest address. Exiting the guest enables QEMU to build
> error log and deliver machine check exception to guest
> OS (either via guest OS registered machine check
> handler or via 0x200 guest OS interrupt vector).
> 
> This approach simplifies the delivering of machine
> check exception to guest OS compared to the earlier approach
> of KVM directly invoking 0x200 guest interrupt vector.
> In the earlier approach QEMU patched the 0x200 interrupt
> vector during boot. The patched code at 0x200 issued a
> private hcall to pass the control to QEMU to build the
> error log.
> 
> This design/approach is based on the feedback for the
> QEMU patches to handle machine check exception. Details
> of earlier approach of handling machine check exception
> in QEMU and related discussions can be found at:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> 
> This patch also introduces a new KVM capability to
> control how KVM behaves on machine check exception.
> Without this capability, KVM redirects machine check
> exceptions to guest's 0x200 vector if the address in
> error belongs to guest. With this capability KVM
> causes a guest exit with NMI exit reason.
> 
> This is required to avoid problems if a new kernel/KVM
> is used with an old QEMU for guests that don't issue
> "ibm,nmi-register". As old QEMU does not understand the
> NMI exit type, it treats it as a fatal error. However,
> the guest could have handled the machine check error
> if the exception was delivered to guest's 0x200 interrupt
> vector instead of NMI exit in case of old QEMU.
> 
> Change Log v2:
>   - Added KVM capability
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  arch/powerpc/include/asm/kvm_host.h |1 +
>  arch/powerpc/kernel/asm-offsets.c   |1 +
>  arch/powerpc/kvm/book3s_hv.c|   12 +++---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   37 
> +++
>  arch/powerpc/kvm/powerpc.c  |7 ++
>  include/uapi/linux/kvm.h|1 +
>  6 files changed, 31 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index 827a38d..8a652ba 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -243,6 +243,7 @@ struct kvm_arch {
>   int hpt_cma_alloc;
>   struct dentry *debugfs_dir;
>   struct dentry *htab_dentry;
> + u8 fwnmi_enabled;

Here you declare the variable as 8-bits ...

>  #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>   struct mutex hpt_mutex;
> diff --git a/arch/powerpc/kernel/asm-offsets.c 
> b/arch/powerpc/kernel/asm-offsets.c
> index 221d584..6a4e81a 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -506,6 +506,7 @@ int main(void)
>   DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
>   DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
>   DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
> + DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled));

... then define an asm-offset for it ...

>   DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
>   DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
>   DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..f43c124 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
[...]
> @@ -2381,24 +2377,27 @@ machine_check_realmode:
>   ld  r9, HSTATE_KVM_VCPU(r13)
>   li  r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>   /*
> -  * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through
> -  * machine check interrupt (set HSRR0 to 0x200). And for handled
> -  * errors (no-fatal), just go back to guest execution with current
> -  * HSRR0 instead of exiting guest. This new approach will inject
> -  * machine check to guest for fatal error causing guest to crash.
> -  *
> -  * The old code used to return to host for unhandled errors which
> -  * was causing guest to hang with soft lockups inside guest and
> -  * makes it difficult to recover guest instance.
> +  * Deliver unhandled/fatal (e.g. UE) MCE errors to guest
> +  * by exiting the guest with KVM_EXIT_NMI exit reason (exit
> +  * reason set later based on trap). For handled errors
> +  * (no-fatal), go back to guest execution with current HSRR0
> +  * instead of exiting the guest. This approach will cause
> +  * the guest to exit in case of fatal 

[PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception

2015-12-15 Thread Aravinda Prasad
This patch modifies KVM to cause a guest exit with
KVM_EXIT_NMI instead of immediately delivering a 0x200
interrupt to the guest upon a machine check exception at a
guest address. Exiting the guest enables QEMU to build the
error log and deliver the machine check exception to the guest
OS (either via a machine check handler registered by the guest
OS or via the guest's 0x200 interrupt vector).

This approach simplifies the delivery of the machine
check exception to the guest OS compared to the earlier approach
of KVM directly invoking the guest's 0x200 interrupt vector.
In the earlier approach QEMU patched the 0x200 interrupt
vector during boot; the patched code at 0x200 issued a
private hcall to pass control to QEMU to build the
error log.

This design/approach is based on the feedback on the
QEMU patches to handle machine check exceptions. Details
of the earlier approach of handling machine check exceptions
in QEMU and related discussions can be found at:

https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html

This patch also introduces a new KVM capability to
control how KVM behaves on a machine check exception.
Without this capability, KVM redirects machine check
exceptions to the guest's 0x200 vector if the address in
error belongs to the guest. With this capability, KVM
causes a guest exit with the NMI exit reason.

This is required to avoid problems if a new kernel/KVM
is used with an old QEMU for guests that don't issue
"ibm,nmi-register". As old QEMU does not understand the
NMI exit type, it treats it as a fatal error. However,
the guest could have handled the machine check error
if the exception had been delivered to the guest's 0x200
interrupt vector instead of causing an NMI exit with old QEMU.

Change Log v2:
  - Added KVM capability

Signed-off-by: Aravinda Prasad 
---
 arch/powerpc/include/asm/kvm_host.h |1 +
 arch/powerpc/kernel/asm-offsets.c   |1 +
 arch/powerpc/kvm/book3s_hv.c|   12 +++---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   37 +++
 arch/powerpc/kvm/powerpc.c  |7 ++
 include/uapi/linux/kvm.h|1 +
 6 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 827a38d..8a652ba 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -243,6 +243,7 @@ struct kvm_arch {
int hpt_cma_alloc;
struct dentry *debugfs_dir;
struct dentry *htab_dentry;
+   u8 fwnmi_enabled;
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
struct mutex hpt_mutex;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 221d584..6a4e81a 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -506,6 +506,7 @@ int main(void)
DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
+   DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled));
DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 2280497..1b1dff0 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
r = RESUME_GUEST;
break;
case BOOK3S_INTERRUPT_MACHINE_CHECK:
-   /*
-* Deliver a machine check interrupt to the guest.
-* We have to do this, even if the host has handled the
-* machine check, because machine checks use SRR0/1 and
-* the interrupt might have trashed guest state in them.
-*/
-   kvmppc_book3s_queue_irqprio(vcpu,
-   BOOK3S_INTERRUPT_MACHINE_CHECK);
-   r = RESUME_GUEST;
+   /* Exit to guest with KVM_EXIT_NMI as exit reason */
+   run->exit_reason = KVM_EXIT_NMI;
+   r = RESUME_HOST;
break;
case BOOK3S_INTERRUPT_PROGRAM:
{
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b98889e..f43c124 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
addir1, r1, 112
ld  r7, HSTATE_HOST_MSR(r13)
 
-   cmpwi   cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK
cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
beq 11f
cmpwi   cr2, r12, BOOK3S_INTERRUPT_HMI
@@ 

Re: [PATCH v2] KVM: PPC: Exit guest upon fatal machine check exception

2015-12-15 Thread Daniel Axtens
Hi,


> This patch also introduces a new KVM capability to
> control how KVM behaves on machine check exception.
> Without this capability, KVM redirects machine check
> exceptions to guest's 0x200 vector if the address in
> error belongs to guest. With this capability KVM
> causes a guest exit with NMI exit reason.
>
> This is required to avoid problems if a new kernel/KVM
> is used with an old QEMU for guests that don't issue
> "ibm,nmi-register". As old QEMU does not understand the
> NMI exit type, it treats it as a fatal error. However,
> the guest could have handled the machine check error
> if the exception was delivered to guest's 0x200 interrupt
> vector instead of NMI exit in case of old QEMU.
>
> Change Log v2:
>   - Added KVM capability

I'm not really qualified to review the contents of this patch, but I'm
happy that the changes in v2 address the concern I had for version 1:
thank you.

Regards,
Daniel

>
> Signed-off-by: Aravinda Prasad 
> ---
>  arch/powerpc/include/asm/kvm_host.h |1 +
>  arch/powerpc/kernel/asm-offsets.c   |1 +
>  arch/powerpc/kvm/book3s_hv.c|   12 +++---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   37 
> +++
>  arch/powerpc/kvm/powerpc.c  |7 ++
>  include/uapi/linux/kvm.h|1 +
>  6 files changed, 31 insertions(+), 28 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index 827a38d..8a652ba 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -243,6 +243,7 @@ struct kvm_arch {
>   int hpt_cma_alloc;
>   struct dentry *debugfs_dir;
>   struct dentry *htab_dentry;
> + u8 fwnmi_enabled;
>  #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
>  #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
>   struct mutex hpt_mutex;
> diff --git a/arch/powerpc/kernel/asm-offsets.c 
> b/arch/powerpc/kernel/asm-offsets.c
> index 221d584..6a4e81a 100644
> --- a/arch/powerpc/kernel/asm-offsets.c
> +++ b/arch/powerpc/kernel/asm-offsets.c
> @@ -506,6 +506,7 @@ int main(void)
>   DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
>   DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
>   DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
> + DEFINE(KVM_FWNMI, offsetof(struct kvm, arch.fwnmi_enabled));
>   DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
>   DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
>   DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 2280497..1b1dff0 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
> struct kvm_vcpu *vcpu,
>   r = RESUME_GUEST;
>   break;
>   case BOOK3S_INTERRUPT_MACHINE_CHECK:
> - /*
> -  * Deliver a machine check interrupt to the guest.
> -  * We have to do this, even if the host has handled the
> -  * machine check, because machine checks use SRR0/1 and
> -  * the interrupt might have trashed guest state in them.
> -  */
> - kvmppc_book3s_queue_irqprio(vcpu,
> - BOOK3S_INTERRUPT_MACHINE_CHECK);
> - r = RESUME_GUEST;
> + /* Exit to guest with KVM_EXIT_NMI as exit reason */
> + run->exit_reason = KVM_EXIT_NMI;
> + r = RESUME_HOST;
>   break;
>   case BOOK3S_INTERRUPT_PROGRAM:
>   {
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..f43c124 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   addir1, r1, 112
>   ld  r7, HSTATE_HOST_MSR(r13)
>  
> - cmpwi   cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>   cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
>   beq 11f
>   cmpwi   cr2, r12, BOOK3S_INTERRUPT_HMI
> @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   mtmsrd  r6, 1   /* Clear RI in MSR */
>   mtsrr0  r8
>   mtsrr1  r7
> - beq cr1, 13f/* machine check */
>   RFI
>  
>   /* On POWER7, we have external interrupts set to use HSRR0/1 */
> @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   mtspr   SPRN_HSRR1, r7
>   ba  0x500
>  
> -13:  b   machine_check_fwnmi
> -
>  14:  mtspr   SPRN_HSRR0, r8
>   mtspr   SPRN_HSRR1, r7
>   b   hmi_exception_after_realmode
> @@ -2381,24 +2377,27 @@ machine_check_realmode:
>   ld  r9, HSTATE_KVM_VCPU(r13)
>   li  r12, BOOK3S_INTERRUPT_MACHINE_CHECK
> 

Re: [GIT PULL] Please pull my kvm-ppc-fixes branch

2015-12-11 Thread Paolo Bonzini


On 10/12/2015 04:12, Paul Mackerras wrote:
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git 
> kvm-ppc-fixes

Pulled, thanks.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] KVM: PPC: Increase memslots to 320

2015-12-09 Thread Thomas Huth
On 09/12/15 04:28, Paul Mackerras wrote:
> On Wed, Nov 04, 2015 at 10:03:48AM +0100, Thomas Huth wrote:
>> Only using 32 memslots for KVM on powerpc is way too low, you can
>> nowadays hit this limit quite fast by adding a couple of PCI devices
>> and/or pluggable memory DIMMs to the guest.
>> x86 already increased the limit to 512 in total, to satisfy 256
>> pluggable DIMM slots, 3 private slots and 253 slots for other things
>> like PCI devices. On powerpc, we only have 32 pluggable DIMMs in
> 
> I agree with increasing the limit.  Is there a reason we have only 32
> pluggable DIMMs in QEMU on powerpc, not more?  Should we be increasing
> that limit too?  If so, maybe we should increase the number of memory
> slots to 512?

Hmmm ... the comment before the #define in QEMU reads:

/*
 * This defines the maximum number of DIMM slots we can have for sPAPR
 * guest. This is not defined by sPAPR but we are defining it to 32 slots
 * based on default number of slots provided by PowerPC kernel.
 */
#define SPAPR_MAX_RAM_SLOTS 32

So as far as I can see, there's indeed a possibility that we'll
increase this value once the kernel can handle more slots!

So I guess you're right and we should play it safe and use more slots
right from the start. I'll send a new patch with 512 instead.

>> QEMU, not 256, so we likely do not as much slots as on x86. Thus
> 
> "so we likely do not need as many slots as on x86" would be better
> English.

Oops, I'm sure that was my original intention ... anyway, thanks for
pointing it out, it's always good to get some feedback as a non-native
English speaker.

 Thomas



[PATCH] KVM: PPC: Increase memslots to 512

2015-12-09 Thread Thomas Huth
Only using 32 memslots for KVM on powerpc is way too low, you can
nowadays hit this limit quite fast by adding a couple of PCI devices
and/or pluggable memory DIMMs to the guest.

x86 already increased the KVM_USER_MEM_SLOTS to 509, to satisfy 256
pluggable DIMM slots, 3 private slots and 253 slots for other things
like PCI devices (i.e. resulting in 256 + 3 + 253 = 512 slots in
total). We should do something similar for powerpc, and since we do
not use private slots here, we can set the value to 512 directly.

While we're at it, also remove the KVM_MEM_SLOTS_NUM definition
from the powerpc-specific header since this gets defined in the
generic kvm_host.h header anyway.

Signed-off-by: Thomas Huth 
---
 arch/powerpc/include/asm/kvm_host.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 887c259..2c96a72 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -38,8 +38,7 @@
 
 #define KVM_MAX_VCPUS  NR_CPUS
 #define KVM_MAX_VCORES NR_CPUS
-#define KVM_USER_MEM_SLOTS 32
-#define KVM_MEM_SLOTS_NUM KVM_USER_MEM_SLOTS
+#define KVM_USER_MEM_SLOTS 512
 
 #ifdef CONFIG_KVM_MMIO
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
-- 
1.8.3.1



Re: [PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8

2015-12-09 Thread Paul Mackerras
On Fri, Nov 20, 2015 at 09:11:45AM +0100, Thomas Huth wrote:
> In the old DABR register, the BT (Breakpoint Translation) bit
> is bit number 61. In the new DAWRX register, the WT (Watchpoint
> Translation) bit is bit number 59. So to move the DABR-BT bit
> into the position of the DAWRX-WT bit, it has to be shifted by
> two, not only by one. This fixes hardware watchpoints in gdb of
> older guests that only use the H_SET_DABR/X interface instead
> of the new H_SET_MODE interface.
> 
> Signed-off-by: Thomas Huth 

Thanks, applied to my kvm-ppc-next branch, with cc: sta...@vger.kernel.org.

Paul.


Re: [PATCH] kvm: remove unused variable 'vcpu_book3s'

2015-12-09 Thread Paul Mackerras
On Tue, Dec 01, 2015 at 08:42:10PM -0300, Geyslan G. Bem wrote:
> The vcpu_book3s struct is assigned but never used. So remove it.
> 
> Signed-off-by: Geyslan G. Bem 

Thanks, applied to my kvm-ppc-next branch.

Paul.


Re: [PATCH] KVM: PPC: Increase memslots to 512

2015-12-09 Thread Paul Mackerras
On Wed, Dec 09, 2015 at 11:34:07AM +0100, Thomas Huth wrote:
> Only using 32 memslots for KVM on powerpc is way too low, you can
> nowadays hit this limit quite fast by adding a couple of PCI devices
> and/or pluggable memory DIMMs to the guest.
> 
> x86 already increased the KVM_USER_MEM_SLOTS to 509, to satisfy 256
> pluggable DIMM slots, 3 private slots and 253 slots for other things
> like PCI devices (i.e. resulting in 256 + 3 + 253 = 512 slots in
> total). We should do something similar for powerpc, and since we do
> not use private slots here, we can set the value to 512 directly.
> 
> While we're at it, also remove the KVM_MEM_SLOTS_NUM definition
> from the powerpc-specific header since this gets defined in the
> generic kvm_host.h header anyway.
> 
> Signed-off-by: Thomas Huth 

Thanks, applied to my kvm-ppc-next branch.

Paul.


[GIT PULL] Please pull my kvm-ppc-fixes branch

2015-12-09 Thread Paul Mackerras
Hi Paolo,

I have a small patch that I would like to get into 4.4 because it
fixes a bug which for certain kernel configs allows userspace to crash
the kernel.  The configs are those for which KVM_BOOK3S_64_HV is set
(y or m) and KVM_BOOK3S_64_PR is not.  Fortunately most distros that
enable KVM_BOOK3S_64_HV also enable KVM_BOOK3S_64_PR, as far as I can
tell.

Thanks,
Paul.

The following changes since commit 09922076003ad66de41ea14d2f8c3b4a16ec7774:

  Merge tag 'kvm-arm-for-v4.4-rc4' of 
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master 
(2015-12-04 18:32:32 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git kvm-ppc-fixes

for you to fetch changes up to c20875a3e638e4a03e099b343ec798edd1af5cc6:

  KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR 
(2015-12-10 11:34:27 +1100)


Paul Mackerras (1):
  KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR

 arch/powerpc/kvm/book3s_hv.c | 6 ++
 1 file changed, 6 insertions(+)


Re: [PATCH kernel 5/9] KVM: PPC: Account TCE-containing pages in locked_vm

2015-12-08 Thread David Gibson
On Tue, Sep 15, 2015 at 08:49:35PM +1000, Alexey Kardashevskiy wrote:
> At the moment pages used for TCE tables (in addition to pages addressed
> by TCEs) are not counted in locked_vm counter so a malicious userspace
> tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as RLIMIT_NOFILE and
> lock a lot of memory.
> 
> This adds counting for pages used for TCE tables.
> 
> This counts the number of pages required for a table plus pages for
> the kvmppc_spapr_tce_table struct (TCE table descriptor) itself.

Hmm.  Does it make sense to account for the descriptor struct itself?
I mean there are lots of little structures the kernel will allocate on
a process's behalf, and I don't think most of them get accounted
against locked vm.

> This does not change the amount of (de)allocated memory.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/kvm/book3s_64_vio.c | 51 
> +++-
>  1 file changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> b/arch/powerpc/kvm/book3s_64_vio.c
> index 9526c34..b70787d 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -45,13 +45,56 @@ static long kvmppc_stt_npages(unsigned long window_size)
>* sizeof(u64), PAGE_SIZE) / PAGE_SIZE;
>  }
>  
> +static long kvmppc_account_memlimit(long npages, bool inc)
> +{
> + long ret = 0;
> + const long bytes = sizeof(struct kvmppc_spapr_tce_table) +
> + (abs(npages) * sizeof(struct page *));
> + const long stt_pages = ALIGN(bytes, PAGE_SIZE) / PAGE_SIZE;

Overflow checks might be useful here, I'm not sure.

> +
> + if (!current || !current->mm)
> + return ret; /* process exited */
> +
> + npages += stt_pages;
> +
> + down_write(&current->mm->mmap_sem);
> +
> + if (inc) {
> + long locked, lock_limit;
> +
> + locked = current->mm->locked_vm + npages;
> + lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> + if (locked > lock_limit && !capable(CAP_IPC_LOCK))
> + ret = -ENOMEM;
> + else
> + current->mm->locked_vm += npages;
> + } else {
> + if (npages > current->mm->locked_vm)

Should this be a WARN_ON?  It means something has gone wrong
previously in the accounting, doesn't it?

> + npages = current->mm->locked_vm;
> +
> + current->mm->locked_vm -= npages;
> + }
> +
> + pr_debug("[%d] RLIMIT_MEMLOCK KVM %c%ld %ld/%ld%s\n", current->pid,
> + inc ? '+' : '-',
> + npages << PAGE_SHIFT,
> + current->mm->locked_vm << PAGE_SHIFT,
> + rlimit(RLIMIT_MEMLOCK),
> + ret ? " - exceeded" : "");
> +
> + up_write(&current->mm->mmap_sem);
> +
> + return ret;
> +}
> +
>  static void release_spapr_tce_table(struct rcu_head *head)
>  {
>   struct kvmppc_spapr_tce_table *stt = container_of(head,
>   struct kvmppc_spapr_tce_table, rcu);
>   int i;
> + long npages = kvmppc_stt_npages(stt->window_size);
>  
> - for (i = 0; i < kvmppc_stt_npages(stt->window_size); i++)
> + for (i = 0; i < npages; i++)
>   __free_page(stt->pages[i]);
>  
>   kfree(stt);
> @@ -89,6 +132,7 @@ static int kvm_spapr_tce_release(struct inode *inode, 
> struct file *filp)
>  
>   kvm_put_kvm(stt->kvm);
>  
> + kvmppc_account_memlimit(kvmppc_stt_npages(stt->window_size), false);
> + call_rcu(&stt->rcu, release_spapr_tce_table);
>  
>   return 0;
> @@ -114,6 +158,11 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
>   }
>  
>   npages = kvmppc_stt_npages(args->window_size);
> + ret = kvmppc_account_memlimit(npages, true);
> + if (ret) {
> + stt = NULL;
> + goto fail;
> + }
>  
>   stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *),
> GFP_KERNEL);

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH kernel 7/9] KVM: PPC: Move reusable bits of H_PUT_TCE handler to helpers

2015-12-08 Thread David Gibson
On Tue, Sep 15, 2015 at 08:49:37PM +1000, Alexey Kardashevskiy wrote:
> Upcoming multi-tce support (H_PUT_TCE_INDIRECT/H_STUFF_TCE hypercalls)
> will validate TCE (not to have unexpected bits) and IO address
> (to be within the DMA window boundaries).
> 
> This introduces helpers to validate TCE and IO address.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/include/asm/kvm_ppc.h  |  4 ++
>  arch/powerpc/kvm/book3s_64_vio_hv.c | 89 
> -
>  2 files changed, 83 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> b/arch/powerpc/include/asm/kvm_ppc.h
> index c6ef05b..fcde896 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -166,6 +166,10 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu 
> *vcpu);
>  
>  extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
>   struct kvm_create_spapr_tce *args);
> +extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
> + unsigned long ioba, unsigned long npages);
> +extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt,
> + unsigned long tce);
>  extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
>unsigned long ioba, unsigned long tce);
>  extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
> diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
> b/arch/powerpc/kvm/book3s_64_vio_hv.c
> index 6cf1ab3..f0fd84c 100644
> --- a/arch/powerpc/kvm/book3s_64_vio_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
> @@ -36,6 +36,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define TCES_PER_PAGE(PAGE_SIZE / sizeof(u64))
>  
> @@ -64,7 +65,7 @@ static struct kvmppc_spapr_tce_table 
> *kvmppc_find_table(struct kvm_vcpu *vcpu,
>   * WARNING: This will be called in real-mode on HV KVM and virtual
>   *  mode on PR KVM
>   */
> -static long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
> +long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
>   unsigned long ioba, unsigned long npages)
>  {
>   unsigned long mask = (1ULL << IOMMU_PAGE_SHIFT_4K) - 1;
> @@ -76,6 +77,79 @@ static long kvmppc_ioba_validate(struct 
> kvmppc_spapr_tce_table *stt,
>  
>   return H_SUCCESS;
>  }
> +EXPORT_SYMBOL_GPL(kvmppc_ioba_validate);

Why does it need to be exported - the new users will still be in the
KVM module, won't they?

> +
> +/*
> + * Validates TCE address.
> + * At the moment flags and page mask are validated.
> + * As the host kernel does not access those addresses (just puts them
> + * to the table and user space is supposed to process them), we can skip
> + * checking other things (such as TCE is a guest RAM address or the page
> + * was actually allocated).

Hmm.  These comments apply given that the only current user of this is
the kvm acceleration of userspace TCE tables, but the name suggests it
would validate any TCE, including in kernel ones for which this would
be unsafe.

> + * WARNING: This will be called in real-mode on HV KVM and virtual
> + *  mode on PR KVM
> + */
> +long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *stt, unsigned long 
> tce)
> +{
> + unsigned long mask = ((1ULL << IOMMU_PAGE_SHIFT_4K) - 1) &
> + ~(TCE_PCI_WRITE | TCE_PCI_READ);
> +
> + if (tce & mask)
> + return H_PARAMETER;
> +
> + return H_SUCCESS;
> +}
> +EXPORT_SYMBOL_GPL(kvmppc_tce_validate);
> +
> +/* Note on the use of page_address() in real mode,
> + *
> + * It is safe to use page_address() in real mode on ppc64 because
> + * page_address() is always defined as lowmem_page_address()
> + * which returns __va(PFN_PHYS(page_to_pfn(page))) which is arithmetial
> + * operation and does not access page struct.
> + *
> + * Theoretically page_address() could be defined different
> + * but either WANT_PAGE_VIRTUAL or HASHED_PAGE_VIRTUAL
> + * should be enabled.
> + * WANT_PAGE_VIRTUAL is never enabled on ppc32/ppc64,
> + * HASHED_PAGE_VIRTUAL could be enabled for ppc32 only and only
> + * if CONFIG_HIGHMEM is defined. As CONFIG_SPARSEMEM_VMEMMAP
> + * is not expected to be enabled on ppc32, page_address()
> + * is safe for ppc32 as well.
> + *
> + * WARNING: This will be called in real-mode on HV KVM and virtual
> + *  mode on PR KVM
> + */
> +static u64 *kvmppc_page_address(struct page *page)
> +{
> +#if defined(HASHED_PAGE_VIRTUAL) || defined(WANT_PAGE_VIRTUAL)
> +#error TODO: fix to avoid page_address() here
> +#endif
> + return (u64 *) page_address(page);
> +}
> +
> +/*
> + * Handles TCE requests for emulated devices.
> + * Puts guest TCE values to the table and expects user space to convert them.
> + * Called in both real and virtual modes.
> + * Cannot fail so kvmppc_tce_validate must be called before it.
> + *
> + * WARNING: This will be called in real-mode on HV KVM and 

Re: [PATCH kernel 6/9] KVM: PPC: Replace SPAPR_TCE_SHIFT with IOMMU_PAGE_SHIFT_4K

2015-12-08 Thread David Gibson
On Tue, Sep 15, 2015 at 08:49:36PM +1000, Alexey Kardashevskiy wrote:
> SPAPR_TCE_SHIFT is used in few places only and since IOMMU_PAGE_SHIFT_4K
> can be easily used instead, remove SPAPR_TCE_SHIFT.
> 
> Signed-off-by: Alexey Kardashevskiy 

Reviewed-by: David Gibson 

> ---
>  arch/powerpc/include/asm/kvm_book3s_64.h | 2 --
>  arch/powerpc/kvm/book3s_64_vio.c | 3 ++-
>  arch/powerpc/kvm/book3s_64_vio_hv.c  | 4 ++--
>  3 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
> b/arch/powerpc/include/asm/kvm_book3s_64.h
> index 2aa79c8..7529aab 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_64.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_64.h
> @@ -33,8 +33,6 @@ static inline void svcpu_put(struct 
> kvmppc_book3s_shadow_vcpu *svcpu)
>  }
>  #endif
>  
> -#define SPAPR_TCE_SHIFT  12
> -
>  #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
>  #define KVM_DEFAULT_HPT_ORDER24  /* 16MB HPT by default */
>  #endif
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> b/arch/powerpc/kvm/book3s_64_vio.c
> index b70787d..e347856 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -36,12 +36,13 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define TCES_PER_PAGE(PAGE_SIZE / sizeof(u64))
>  
>  static long kvmppc_stt_npages(unsigned long window_size)
>  {
> - return ALIGN((window_size >> SPAPR_TCE_SHIFT)
> + return ALIGN((window_size >> IOMMU_PAGE_SHIFT_4K)
>* sizeof(u64), PAGE_SIZE) / PAGE_SIZE;
>  }
>  
> diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
> b/arch/powerpc/kvm/book3s_64_vio_hv.c
> index 8ae12ac..6cf1ab3 100644
> --- a/arch/powerpc/kvm/book3s_64_vio_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
> @@ -99,7 +99,7 @@ long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long 
> liobn,
>   if (ret)
>   return ret;
>  
> - idx = ioba >> SPAPR_TCE_SHIFT;
> + idx = ioba >> IOMMU_PAGE_SHIFT_4K;
>   page = stt->pages[idx / TCES_PER_PAGE];
>   tbl = (u64 *)page_address(page);
>  
> @@ -121,7 +121,7 @@ long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned 
> long liobn,
>   if (stt) {
>   ret = kvmppc_ioba_validate(stt, ioba, 1);
>   if (!ret) {
> - unsigned long idx = ioba >> SPAPR_TCE_SHIFT;
> + unsigned long idx = ioba >> IOMMU_PAGE_SHIFT_4K;
>   struct page *page = stt->pages[idx / TCES_PER_PAGE];
>   u64 *tbl = (u64 *)page_address(page);
>  

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH kernel 9/9] KVM: PPC: Add support for multiple-TCE hcalls

2015-12-08 Thread David Gibson
On Tue, Sep 15, 2015 at 08:49:39PM +1000, Alexey Kardashevskiy wrote:
> This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and
> H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO
> devices or emulated PCI.  These calls allow adding multiple entries
> (up to 512) into the TCE table in one call which saves time on
> transition between kernel and user space.
> 
> This implements the KVM_CAP_PPC_MULTITCE capability. When present,
> the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE.
> If they can not be handled by the kernel, they are passed on to
> the user space. The user space still has to have an implementation
> for these.
> 
> Both HV and PR-style KVM are supported.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  Documentation/virtual/kvm/api.txt   |  25 ++
>  arch/powerpc/include/asm/kvm_ppc.h  |  12 +++
>  arch/powerpc/kvm/book3s_64_vio.c| 111 +++-
>  arch/powerpc/kvm/book3s_64_vio_hv.c | 145 
> ++--
>  arch/powerpc/kvm/book3s_hv.c|  26 +-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   6 +-
>  arch/powerpc/kvm/book3s_pr_papr.c   |  35 
>  arch/powerpc/kvm/powerpc.c  |   3 +
>  8 files changed, 350 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt 
> b/Documentation/virtual/kvm/api.txt
> index d86d831..593c62a 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -3019,6 +3019,31 @@ Returns: 0 on success, -1 on error
>  
>  Queues an SMI on the thread's vcpu.
>  
> +4.97 KVM_CAP_PPC_MULTITCE
> +
> +Capability: KVM_CAP_PPC_MULTITCE
> +Architectures: ppc
> +Type: vm
> +
> +This capability means the kernel is capable of handling hypercalls
> +H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
> +space. This significantly accelerates DMA operations for PPC KVM guests.
> +User space should expect that its handlers for these hypercalls
> +are not going to be called if user space previously registered LIOBN
> +in KVM (via KVM_CREATE_SPAPR_TCE or similar calls).
> +
> +In order to enable H_PUT_TCE_INDIRECT and H_STUFF_TCE use in the guest,
> +user space might have to advertise it for the guest. For example,
> +IBM pSeries (sPAPR) guest starts using them if "hcall-multi-tce" is
> +present in the "ibm,hypertas-functions" device-tree property.
> +
> +The hypercalls mentioned above may or may not be processed successfully
> +in the kernel based fast path. If they can not be handled by the kernel,
> +they will get passed on to user space. So user space still has to have
> +an implementation for these despite the in kernel acceleration.
> +
> +This capability is always enabled.
> +
>  5. The kvm_run structure
>  
>  
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> b/arch/powerpc/include/asm/kvm_ppc.h
> index fcde896..e5b968e 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -166,12 +166,24 @@ extern int kvmppc_pseries_do_hcall(struct kvm_vcpu 
> *vcpu);
>  
>  extern long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
>   struct kvm_create_spapr_tce *args);
> +extern struct kvmppc_spapr_tce_table *kvmppc_find_table(
> + struct kvm_vcpu *vcpu, unsigned long liobn);
>  extern long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
>   unsigned long ioba, unsigned long npages);
>  extern long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *tt,
>   unsigned long tce);
> +extern long kvmppc_gpa_to_ua(struct kvm *kvm, unsigned long gpa,
> + unsigned long *ua, unsigned long **prmap);
> +extern void kvmppc_tce_put(struct kvmppc_spapr_tce_table *tt,
> + unsigned long idx, unsigned long tce);
>  extern long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
>unsigned long ioba, unsigned long tce);
> +extern long kvmppc_h_put_tce_indirect(struct kvm_vcpu *vcpu,
> + unsigned long liobn, unsigned long ioba,
> + unsigned long tce_list, unsigned long npages);
> +extern long kvmppc_h_stuff_tce(struct kvm_vcpu *vcpu,
> + unsigned long liobn, unsigned long ioba,
> + unsigned long tce_value, unsigned long npages);
>  extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
>unsigned long ioba);
>  extern struct page *kvm_alloc_hpt(unsigned long nr_pages);
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> b/arch/powerpc/kvm/book3s_64_vio.c
> index e347856..d3fc732 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -14,6 +14,7 @@
>   *
>   * Copyright 2010 Paul Mackerras, IBM Corp. 
>   * Copyright 2011 David Gibson, IBM Corporation 
> + * Copyright 2013 Alexey Kardashevskiy, IBM 

Re: [PATCH 1/1] KVM: PPC: Increase memslots to 320

2015-12-08 Thread Paul Mackerras
On Wed, Nov 04, 2015 at 10:03:48AM +0100, Thomas Huth wrote:
> Only using 32 memslots for KVM on powerpc is way too low, you can
> nowadays hit this limit quite fast by adding a couple of PCI devices
> and/or pluggable memory DIMMs to the guest.
> x86 already increased the limit to 512 in total, to satisfy 256
> pluggable DIMM slots, 3 private slots and 253 slots for other things
> like PCI devices. On powerpc, we only have 32 pluggable DIMMs in

I agree with increasing the limit.  Is there a reason we have only 32
pluggable DIMMs in QEMU on powerpc, not more?  Should we be increasing
that limit too?  If so, maybe we should increase the number of memory
slots to 512?

> QEMU, not 256, so we likely do not as much slots as on x86. Thus

"so we likely do not need as many slots as on x86" would be better
English.

> setting the slot limit to 320 sounds like a good value for the
> time being (until we have some code in the future to resize the
> memslot array dynamically).
> And while we're at it, also remove the KVM_MEM_SLOTS_NUM definition
> from the powerpc-specific header since this gets defined in the
> generic kvm_host.h header anyway.
> 
> Signed-off-by: Thomas Huth 

Paul.


Re: powerpc/64: Include KVM guest test in all interrupt vectors

2015-12-07 Thread Michael Ellerman
On Thu, 2015-12-11 at 05:44:42 UTC, Paul Mackerras wrote:
> Currently, if HV KVM is configured but PR KVM isn't, we don't include
> a test to see whether we were interrupted in KVM guest context for the
> set of interrupts which get delivered directly to the guest by hardware
> if they occur in the guest.  This includes things like program
> interrupts.
> 
> However, the recent bug where userspace could set the MSR for a VCPU
> to have an illegal value in the TS field, and thus cause a TM Bad Thing
> type of program interrupt on the hrfid that enters the guest, showed that
> we can never be completely sure that these interrupts can never occur
> in the guest entry/exit code.  If one of these interrupts does happen
> and we have HV KVM configured but not PR KVM, then we end up trying to
> run the handler in the host with the MMU set to the guest MMU context,
> which generally ends badly.
> 
> Thus, for robustness it is better to have the test in every interrupt
> vector, so that if some way is found to trigger some interrupt in the
> guest entry/exit path, we can handle it without immediately crashing
> the host.
> 
> This means that the distinction between KVMTEST and KVMTEST_PR goes
> away.  Thus we delete KVMTEST_PR and associated macros and use KVMTEST
> everywhere that we previously used either KVMTEST_PR or KVMTEST.  It
> also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR,
> so we deleted SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead.
> 
> Signed-off-by: Paul Mackerras 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/31a40e2b052c0f2b80df7b56

cheers


Re: [PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8

2015-12-07 Thread Thomas Huth
On 20/11/15 09:11, Thomas Huth wrote:
> In the old DABR register, the BT (Breakpoint Translation) bit
> is bit number 61. In the new DAWRX register, the WT (Watchpoint
> Translation) bit is bit number 59. So to move the DABR-BT bit
> into the position of the DAWRX-WT bit, it has to be shifted by
> two, not only by one. This fixes hardware watchpoints in gdb of
> older guests that only use the H_SET_DABR/X interface instead
> of the new H_SET_MODE interface.
> 
> Signed-off-by: Thomas Huth 
> ---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..3983b87 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -2143,7 +2143,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>  
>   /* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */
>  2:   rlwimi  r5, r4, 5, DAWRX_DR | DAWRX_DW
> - rlwimi  r5, r4, 1, DAWRX_WT
> + rlwimi  r5, r4, 2, DAWRX_WT
>   clrrdi  r4, r4, 3
>   std r4, VCPU_DAWR(r3)
>   std r5, VCPU_DAWRX(r3)

Ping?



Re: [PATCH kernel 3/9] KVM: PPC: Rework H_PUT_TCE/H_GET_TCE handlers

2015-12-07 Thread David Gibson
On Tue, Sep 15, 2015 at 08:49:33PM +1000, Alexey Kardashevskiy wrote:
> This reworks the existing H_PUT_TCE/H_GET_TCE handlers to have one
> exit path. This allows next patch to add locks nicely.

I don't see a problem with the actual code, but it doesn't seem to
match this description: I still see multiple return statements for
h_put_tce at least.

> This moves the ioba boundaries check to a helper and adds a check for
> least bits which have to be zeros.
> 
> The patch is pretty mechanical (only check for least ioba bits is added)
> so no change in behaviour is expected.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/kvm/book3s_64_vio_hv.c | 102 
> +++-
>  1 file changed, 66 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_vio_hv.c 
> b/arch/powerpc/kvm/book3s_64_vio_hv.c
> index 89e96b3..8ae12ac 100644
> --- a/arch/powerpc/kvm/book3s_64_vio_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_vio_hv.c
> @@ -35,71 +35,101 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define TCES_PER_PAGE	(PAGE_SIZE / sizeof(u64))
>  
> +/*
> + * Finds a TCE table descriptor by LIOBN.
> + *
> + * WARNING: This will be called in real or virtual mode on HV KVM and virtual
> + *  mode on PR KVM
> + */
> +static struct kvmppc_spapr_tce_table *kvmppc_find_table(struct kvm_vcpu 
> *vcpu,
> + unsigned long liobn)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct kvmppc_spapr_tce_table *stt;
> +
> + list_for_each_entry_rcu_notrace(stt, &kvm->arch.spapr_tce_tables, list)
> + if (stt->liobn == liobn)
> + return stt;
> +
> + return NULL;
> +}
> +
> +/*
> + * Validates IO address.
> + *
> + * WARNING: This will be called in real-mode on HV KVM and virtual
> + *  mode on PR KVM
> + */
> +static long kvmppc_ioba_validate(struct kvmppc_spapr_tce_table *stt,
> + unsigned long ioba, unsigned long npages)
> +{
> + unsigned long mask = (1ULL << IOMMU_PAGE_SHIFT_4K) - 1;
> + unsigned long idx = ioba >> IOMMU_PAGE_SHIFT_4K;
> + unsigned long size = stt->window_size >> IOMMU_PAGE_SHIFT_4K;
> +
> + if ((ioba & mask) || (size + npages <= idx))
> + return H_PARAMETER;

Not sure if it's worth a check for overflow in (size+npages) there.
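
[For illustration, an overflow-robust variant of this bounds check could look
like the following user-space sketch. The struct and function names here are
stand-ins, not the patch's; the point is that rejecting npages > size first
lets the final comparison use (size - npages), which cannot wrap:]

```c
#include <assert.h>
#include <stdbool.h>

#define IOMMU_PAGE_SHIFT_4K 12

/* Hypothetical stand-in for struct kvmppc_spapr_tce_table: only the
 * window_size field matters for this check. */
struct tce_table {
	unsigned long window_size;
};

/* Validate the I/O address range without computing (size + npages):
 * reject npages > size first, then compare idx against (size - npages),
 * which cannot wrap. */
static bool ioba_range_ok(const struct tce_table *stt,
			  unsigned long ioba, unsigned long npages)
{
	unsigned long mask = (1UL << IOMMU_PAGE_SHIFT_4K) - 1;
	unsigned long idx = ioba >> IOMMU_PAGE_SHIFT_4K;
	unsigned long size = stt->window_size >> IOMMU_PAGE_SHIFT_4K;

	if (ioba & mask)		/* ioba must be IOMMU-page aligned */
		return false;
	if (npages == 0 || npages > size)
		return false;		/* empty, or longer than the window */
	return idx <= size - npages;	/* no overflow possible here */
}
```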

> +
> + return H_SUCCESS;
> +}
> +
>  /* WARNING: This will be called in real-mode on HV KVM and virtual
>   *  mode on PR KVM
>   */
>  long kvmppc_h_put_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
> unsigned long ioba, unsigned long tce)
>  {
> - struct kvm *kvm = vcpu->kvm;
> - struct kvmppc_spapr_tce_table *stt;
> + struct kvmppc_spapr_tce_table *stt = kvmppc_find_table(vcpu, liobn);
> + long ret = H_TOO_HARD;
> + unsigned long idx;
> + struct page *page;
> + u64 *tbl;
>  
>   /* udbg_printf("H_PUT_TCE(): liobn=0x%lx ioba=0x%lx, tce=0x%lx\n", */
>   /*  liobn, ioba, tce); */
>  
> - list_for_each_entry(stt, &kvm->arch.spapr_tce_tables, list) {
> - if (stt->liobn == liobn) {
> - unsigned long idx = ioba >> SPAPR_TCE_SHIFT;
> - struct page *page;
> - u64 *tbl;
> + if (!stt)
> + return ret;
>  
> - /* udbg_printf("H_PUT_TCE: liobn 0x%lx => stt=%p  
> window_size=0x%x\n", */
> - /*  liobn, stt, stt->window_size); */
> - if (ioba >= stt->window_size)
> - return H_PARAMETER;
> + ret = kvmppc_ioba_validate(stt, ioba, 1);
> + if (ret)
> + return ret;
>  
> - page = stt->pages[idx / TCES_PER_PAGE];
> - tbl = (u64 *)page_address(page);
> + idx = ioba >> SPAPR_TCE_SHIFT;
> + page = stt->pages[idx / TCES_PER_PAGE];
> + tbl = (u64 *)page_address(page);
>  
> - /* FIXME: Need to validate the TCE itself */
> - /* udbg_printf("tce @ %p\n", &tbl[idx % TCES_PER_PAGE]); */
> - tbl[idx % TCES_PER_PAGE] = tce;
> - return H_SUCCESS;
> - }
> - }
> + /* FIXME: Need to validate the TCE itself */
> + /* udbg_printf("tce @ %p\n", &tbl[idx % TCES_PER_PAGE]); */
> + tbl[idx % TCES_PER_PAGE] = tce;
>  
> - /* Didn't find the liobn, punt it to userspace */
> - return H_TOO_HARD;
> + return ret;

So, this relies on the fact that kvmppc_ioba_validate() would have
returned H_SUCCESS some distance above.  This seems rather fragile if
you insert anything else which alters ret in between.  Since this is
the success path, I think it's clearer to explicitly return H_SUCCESS.

>  }
>  EXPORT_SYMBOL_GPL(kvmppc_h_put_tce);
>  
>  long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
> unsigned long ioba)
>  {
> - struct kvm *kvm = vcpu->kvm;
> - struct 

Re: [PATCH kernel 1/9] rcu: Define notrace version of list_for_each_entry_rcu

2015-12-07 Thread David Gibson
On Tue, Sep 15, 2015 at 08:49:31PM +1000, Alexey Kardashevskiy wrote:
> This defines list_for_each_entry_rcu_notrace and list_entry_rcu_notrace
> which use rcu_dereference_raw_notrace instead of rcu_dereference_raw.
> This allows using list_for_each_entry_rcu_notrace in real mode (MMU is off).
> 
> Signed-off-by: Alexey Kardashevskiy 

Reviewed-by: David Gibson 

> ---
>  include/linux/rculist.h | 38 ++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index 17c6b1f..439c4d7 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -253,6 +253,25 @@ static inline void list_splice_init_rcu(struct list_head 
> *list,
>  })
>  
>  /**
> + * list_entry_rcu_notrace - get the struct for this entry
> + * @ptr:the  list_head pointer.
> + * @type:   the type of the struct this is embedded in.
> + * @member: the name of the list_struct within the struct.
> + *
> + * This primitive may safely run concurrently with the _rcu list-mutation
> + * primitives such as list_add_rcu() as long as it's guarded by 
> rcu_read_lock().
> + *
> + * This is the same as list_entry_rcu() except that it does
> + * not do any RCU debugging or tracing.
> + */
> +#define list_entry_rcu_notrace(ptr, type, member) \
> +({ \
> + typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \
> + container_of((typeof(ptr))rcu_dereference_raw_notrace(__ptr), \
> + type, member); \
> +})
> +
> +/**
>   * Where are list_empty_rcu() and list_first_entry_rcu()?
>   *
>   * Implementing those functions following their counterparts list_empty() and
> @@ -308,6 +327,25 @@ static inline void list_splice_init_rcu(struct list_head 
> *list,
>   pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
>  
>  /**
> + * list_for_each_entry_rcu_notrace - iterate over rcu list of given type
> + * @pos: the type * to use as a loop cursor.
> + * @head:the head for your list.
> + * @member:  the name of the list_struct within the struct.
> + *
> + * This list-traversal primitive may safely run concurrently with
> + * the _rcu list-mutation primitives such as list_add_rcu()
> + * as long as the traversal is guarded by rcu_read_lock().
> + *
> + * This is the same as list_for_each_entry_rcu() except that it does
> + * not do any RCU debugging or tracing.
> + */
> +#define list_for_each_entry_rcu_notrace(pos, head, member) \
> + for (pos = list_entry_rcu_notrace((head)->next, typeof(*pos), member); \
> + &pos->member != (head); \
> + pos = list_entry_rcu_notrace(pos->member.next, typeof(*pos), \
> + member))
> +
> +/**
>   * list_for_each_entry_continue_rcu - continue iteration over list of given 
> type
>   * @pos: the type * to use as a loop cursor.
>   * @head:the head for your list.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH kernel 4/9] KVM: PPC: Use RCU for arch.spapr_tce_tables

2015-12-07 Thread David Gibson
On Tue, Sep 15, 2015 at 08:49:34PM +1000, Alexey Kardashevskiy wrote:
> At the moment spapr_tce_tables is not protected against races. This makes
> use of RCU-variants of list helpers. As some bits are executed in real
> mode, this makes use of just introduced list_for_each_entry_rcu_notrace().
> 
> This converts release_spapr_tce_table() to a RCU scheduled handler.
> 
> Signed-off-by: Alexey Kardashevskiy 

Looks correct, to the best of my limited knowledge of RCU.

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH kernel 2/9] KVM: PPC: Make real_vmalloc_addr() public

2015-12-07 Thread David Gibson
On Tue, Sep 15, 2015 at 08:49:32PM +1000, Alexey Kardashevskiy wrote:
> This helper translates vmalloc'd addresses to linear addresses.
> It is only used by the KVM MMU code now and resides in the HV KVM code.
> We will need it further in the TCE code and the DMA memory preregistration
> code called in real mode.
> 
> This makes real_vmalloc_addr() public and moves it to the powerpc code as
> it does not do anything special for KVM.
> 
> Signed-off-by: Alexey Kardashevskiy 

Hmm, I have a couple of small concerns.

First, I'm not sure if the name is clear enough for a public function.

Second, I'm not sure if mmu-hash64.h is the right place for it.  This
is still a function with very specific and limited usage, I wonder if
we should have somewhere else for such special real mode helpers.

Paulus, thoughts?

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH] kvm: remove unused variable 'vcpu_book3s'

2015-12-01 Thread Daniel Axtens
"Geyslan G. Bem"  writes:

> The vcpu_book3s struct is assigned but never used. So remove it.

Just out of interest, how did you find this? Compiler warning? Static
analysis? Manual inspection?

Thanks in advance!

Regards,
Daniel

>
> Signed-off-by: Geyslan G. Bem 
> ---
>  arch/powerpc/kvm/book3s_64_mmu.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_64_mmu.c 
> b/arch/powerpc/kvm/book3s_64_mmu.c
> index 774a253..9bf7031 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu.c
> @@ -377,15 +377,12 @@ no_seg_found:
>  
>  static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 
> rb)
>  {
> - struct kvmppc_vcpu_book3s *vcpu_book3s;
>   u64 esid, esid_1t;
>   int slb_nr;
>   struct kvmppc_slb *slbe;
>  
>   dprintk("KVM MMU: slbmte(0x%llx, 0x%llx)\n", rs, rb);
>  
> - vcpu_book3s = to_book3s(vcpu);
> -
>   esid = GET_ESID(rb);
>   esid_1t = GET_ESID_1T(rb);
>   slb_nr = rb & 0xfff;
> -- 
> 2.6.2
>
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev




[PATCH] kvm: remove unused variable 'vcpu_book3s'

2015-12-01 Thread Geyslan G. Bem
The vcpu_book3s struct is assigned but never used. So remove it.

Signed-off-by: Geyslan G. Bem 
---
 arch/powerpc/kvm/book3s_64_mmu.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index 774a253..9bf7031 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -377,15 +377,12 @@ no_seg_found:
 
 static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 rb)
 {
-   struct kvmppc_vcpu_book3s *vcpu_book3s;
u64 esid, esid_1t;
int slb_nr;
struct kvmppc_slb *slbe;
 
dprintk("KVM MMU: slbmte(0x%llx, 0x%llx)\n", rs, rb);
 
-   vcpu_book3s = to_book3s(vcpu);
-
esid = GET_ESID(rb);
esid_1t = GET_ESID_1T(rb);
slb_nr = rb & 0xfff;
-- 
2.6.2



Re: [PATCH] kvm: remove unused variable 'vcpu_book3s'

2015-12-01 Thread Geyslan G. Bem
2015-12-01 21:34 GMT-03:00 Daniel Axtens :
> "Geyslan G. Bem"  writes:
>
>> The vcpu_book3s struct is assigned but never used. So remove it.
>
> Just out of interest, how did you find this? Compiler warning? Static
> analysis? Manual inspection?

Sorry, I should have made the patch self-contained. I caught it
through static analysis (cppcheck).

>
> Thanks in advance!

You're welcome.

>
> Regards,
> Daniel
>
>>
>> Signed-off-by: Geyslan G. Bem 
>> ---
>>  arch/powerpc/kvm/book3s_64_mmu.c | 3 ---
>>  1 file changed, 3 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_64_mmu.c 
>> b/arch/powerpc/kvm/book3s_64_mmu.c
>> index 774a253..9bf7031 100644
>> --- a/arch/powerpc/kvm/book3s_64_mmu.c
>> +++ b/arch/powerpc/kvm/book3s_64_mmu.c
>> @@ -377,15 +377,12 @@ no_seg_found:
>>
>>  static void kvmppc_mmu_book3s_64_slbmte(struct kvm_vcpu *vcpu, u64 rs, u64 
>> rb)
>>  {
>> - struct kvmppc_vcpu_book3s *vcpu_book3s;
>>   u64 esid, esid_1t;
>>   int slb_nr;
>>   struct kvmppc_slb *slbe;
>>
>>   dprintk("KVM MMU: slbmte(0x%llx, 0x%llx)\n", rs, rb);
>>
>> - vcpu_book3s = to_book3s(vcpu);
>> -
>>   esid = GET_ESID(rb);
>>   esid_1t = GET_ESID_1T(rb);
>>   slb_nr = rb & 0xfff;
>> --
>> 2.6.2
>>
>> ___
>> Linuxppc-dev mailing list
>> linuxppc-...@lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/linuxppc-dev



-- 
Regards,

Geyslan G. Bem
hackingbits.com


[RFC] kvm - possible out of bounds

2015-11-29 Thread Geyslan Gregório Bem
Hello,

I have found a possible out-of-bounds read in
arch/powerpc/kvm/book3s_64_mmu.c (the kvmppc_mmu_book3s_64_xlate
function). The pteg[] array could be accessed twice using the i variable
after the for iteration. What happens is that in the last iteration
the i index is incremented to 16, the check (i < 16) fails, and the
loop exits.

277for (i=0; i<16; i+=2) { ...

Later there are attempts to read the last elements of pteg, but again
using the already-incremented i (16).

303v = be64_to_cpu(pteg[i]);  /* pteg[16] */
304r = be64_to_cpu(pteg[i+1]); /* pteg[17] */

I really don't know if the for loop can somehow iterate until i is
16; anyway, I think that the last reads should use a defined max
len/index or another clearer method.

Eg.

v = be64_to_cpu(pteg[PTEG_LEN - 2]);
r = be64_to_cpu(pteg[PTEG_LEN - 1]);

Or just.

v = be64_to_cpu(pteg[14]);
r = be64_to_cpu(pteg[15]);



I found in the same file a variable that is not used.

380struct kvmppc_vcpu_book3s *vcpu_book3s;
...
387vcpu_book3s = to_book3s(vcpu);

-

A question: is the kvmppc_mmu_book3s_64_init function accessed in some
unconventional way? I have not found any call to it.

If anything I wrote is correct, please tell me whether I should send a patch.

-- 
Regards,

Geyslan G. Bem
hackingbits.com


Re: [RFC] kvm - possible out of bounds

2015-11-29 Thread Paul Mackerras
On Sun, Nov 29, 2015 at 05:14:03PM -0300, Geyslan Gregório Bem wrote:
> Hello,
> 
> I have found a possible out-of-bounds read in
> arch/powerpc/kvm/book3s_64_mmu.c (the kvmppc_mmu_book3s_64_xlate
> function). The pteg[] array could be accessed twice using the i variable
> after the for iteration. What happens is that in the last iteration
> the i index is incremented to 16, the check (i < 16) fails, and the
> loop exits.
> 
> 277for (i=0; i<16; i+=2) { ...
> 
> Later there are attempts to read the last elements of pteg, but again
> using the already-incremented i (16).
> 
> 303v = be64_to_cpu(pteg[i]);  /* pteg[16] */
> 304r = be64_to_cpu(pteg[i+1]); /* pteg[17] */

Was it some automated tool that came up with this?

There is actually no problem because the accesses outside the loop are
only done if the 'found' variable is true; 'found' is initialized to
false and only ever set to true inside the loop just before a break
statement.  Thus there is a correlation between the value of 'i' and
the value of 'found' -- if 'found' is true then we know 'i' is less
than 16.
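
[The correlation Paul describes can be seen in miniature in this stand-alone
sketch — not the kernel code, just an illustration of the found-flag pattern:
'i' is only used after the loop when 'found' is true, and 'found' is only set
right before the break, so 'found' implies i < n.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The pattern from kvmppc_mmu_book3s_64_xlate in miniature. */
static int find_first_even(const int *arr, size_t n, size_t *out_idx)
{
	size_t i;
	bool found = false;

	for (i = 0; i < n; i++) {
		if ((arr[i] & 1) == 0) {
			found = true;
			break;		/* leaves the loop with i < n */
		}
	}

	if (!found)
		return -1;	/* here i == n, but it is never used as an index */

	*out_idx = i;		/* safe: found implies i < n */
	return arr[i];
}
```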

> I really don't know if the for loop can somehow iterate until i is
> 16; anyway, I think that the last reads should use a defined max
> len/index or another clearer method.

I think it's perfectly clear to a human programmer, though some tools
(such as gcc) struggle with this kind of correlation between
variables.  That's why I asked whether your report was based on the
output from some tool.

> Eg.
> 
> v = be64_to_cpu(pteg[PTEG_LEN - 2]);
> r = be64_to_cpu(pteg[PTEG_LEN - 1]);
> 
> Or just.
> 
> v = be64_to_cpu(pteg[14]);
> r = be64_to_cpu(pteg[15]);

Either of those options would cause the code to malfunction.

> I found in the same file a variable that is not used.
> 
> 380struct kvmppc_vcpu_book3s *vcpu_book3s;
> ...
> 387vcpu_book3s = to_book3s(vcpu);

True.  It could be removed.

> A question: is the kvmppc_mmu_book3s_64_init function accessed in some
> unconventional way? I have not found any call to it.

Try arch/powerpc/kvm/book3s_pr.c line 410:

kvmppc_mmu_book3s_64_init(vcpu);

Grep (or git grep) is your friend.

Paul.


Re: [RFC] kvm - possible out of bounds

2015-11-29 Thread Geyslan Gregório Bem
2015-11-29 18:33 GMT-03:00 Paul Mackerras :
> On Sun, Nov 29, 2015 at 05:14:03PM -0300, Geyslan Gregório Bem wrote:
>> Hello,
>>
>> I have found a possible out-of-bounds read in
>> arch/powerpc/kvm/book3s_64_mmu.c (the kvmppc_mmu_book3s_64_xlate
>> function). The pteg[] array could be accessed twice using the i variable
>> after the for iteration. What happens is that in the last iteration
>> the i index is incremented to 16, the check (i < 16) fails, and the
>> loop exits.
>>
>> 277for (i=0; i<16; i+=2) { ...
>>
>> Later there are attempts to read the last elements of pteg, but again
>> using the already-incremented i (16).
>>
>> 303v = be64_to_cpu(pteg[i]);  /* pteg[16] */
>> 304r = be64_to_cpu(pteg[i+1]); /* pteg[17] */
>
> Was it some automated tool that came up with this?

Yep, cppcheck. As I'm not yet settled on a specific area of the
kernel, I'm just trying to help with automated catches.

>
> There is actually no problem because the accesses outside the loop are
> only done if the 'found' variable is true; 'found' is initialized to
> false and only ever set to true inside the loop just before a break
> statement.  Thus there is a correlation between the value of 'i' and
> the value of 'found' -- if 'found' is true then we know 'i' is less
> than 16.

I figured it out after your explanation.

>
>> I really don't know if the for loop can somehow iterate until i is
>> 16; anyway, I think that the last reads should use a defined max
>> len/index or another clearer method.
>
> I think it's perfectly clear to a human programmer, though some tools
> (such as gcc) struggle with this kind of correlation between
> variables.  That's why I asked whether your report was based on the
> output from some tool.
>
>> Eg.
>>
>> v = be64_to_cpu(pteg[PTEG_LEN - 2]);
>> r = be64_to_cpu(pteg[PTEG_LEN - 1]);
>>
>> Or just.
>>
>> v = be64_to_cpu(pteg[14]);
>> r = be64_to_cpu(pteg[15]);
>
> Either of those options would cause the code to malfunction.

Yep, I understand now that v and r are read from the found entry, so i is needed.

>
>> I found in the same file a variable that is not used.
>>
>> 380struct kvmppc_vcpu_book3s *vcpu_book3s;
>> ...
>> 387vcpu_book3s = to_book3s(vcpu);
>
> True.  It could be removed.

I'll make a patch for that.

>
>> A question: is the kvmppc_mmu_book3s_64_init function accessed in some
>> unconventional way? I have not found any call to it.
>
> Try arch/powerpc/kvm/book3s_pr.c line 410:
>
> kvmppc_mmu_book3s_64_init(vcpu);
>
> Grep (or git grep) is your friend.

Hmm, indeed.

>
> Paul.

Thank you, Paul. If you have some other changes in progress let me know.

-- 
Regards,

Geyslan G. Bem
hackingbits.com


Re: [PATCH kernel 5/9] KVM: PPC: Account TCE-containing pages in locked_vm

2015-11-29 Thread Paul Mackerras
On Tue, Sep 15, 2015 at 08:49:35PM +1000, Alexey Kardashevskiy wrote:
> At the moment pages used for TCE tables (in addition to pages addressed
> by TCEs) are not counted in locked_vm counter so a malicious userspace
> tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as RLIMIT_NOFILE and
> lock a lot of memory.
> 
> This adds counting for pages used for TCE tables.
> 
> This counts the number of pages required for a table plus pages for
> the kvmppc_spapr_tce_table struct (TCE table descriptor) itself.
> 
> This does not change the amount of (de)allocated memory.
> 
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/kvm/book3s_64_vio.c | 51 
> +++-
>  1 file changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c 
> b/arch/powerpc/kvm/book3s_64_vio.c
> index 9526c34..b70787d 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -45,13 +45,56 @@ static long kvmppc_stt_npages(unsigned long window_size)
>* sizeof(u64), PAGE_SIZE) / PAGE_SIZE;
>  }
>  
> +static long kvmppc_account_memlimit(long npages, bool inc)
> +{
> + long ret = 0;
> + const long bytes = sizeof(struct kvmppc_spapr_tce_table) +
> + (abs(npages) * sizeof(struct page *));

Why abs(npages)?  Can npages be negative?  If so, what does that mean?

Paul.


Re: [PATCH kernel 5/9] KVM: PPC: Account TCE-containing pages in locked_vm

2015-11-29 Thread Alexey Kardashevskiy

On 11/30/2015 01:06 PM, Paul Mackerras wrote:

On Tue, Sep 15, 2015 at 08:49:35PM +1000, Alexey Kardashevskiy wrote:

At the moment pages used for TCE tables (in addition to pages addressed
by TCEs) are not counted in locked_vm counter so a malicious userspace
tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as RLIMIT_NOFILE and
lock a lot of memory.

This adds counting for pages used for TCE tables.

This counts the number of pages required for a table plus pages for
the kvmppc_spapr_tce_table struct (TCE table descriptor) itself.

This does not change the amount of (de)allocated memory.

Signed-off-by: Alexey Kardashevskiy 
---
  arch/powerpc/kvm/book3s_64_vio.c | 51 +++-
  1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 9526c34..b70787d 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -45,13 +45,56 @@ static long kvmppc_stt_npages(unsigned long window_size)
 * sizeof(u64), PAGE_SIZE) / PAGE_SIZE;
  }

+static long kvmppc_account_memlimit(long npages, bool inc)
+{
+   long ret = 0;
+   const long bytes = sizeof(struct kvmppc_spapr_tce_table) +
+   (abs(npages) * sizeof(struct page *));


Why abs(npages)?  Can npages be negative?  If so, what does that mean?



Leftover from older versions when there was one shared 
account_memlimit(long npages). It does not make sense here, I need to 
remove it.



--
Alexey


Re: [PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8

2015-11-22 Thread David Gibson
On Fri, 20 Nov 2015 09:11:45 +0100
Thomas Huth  wrote:

> In the old DABR register, the BT (Breakpoint Translation) bit
> is bit number 61. In the new DAWRX register, the WT (Watchpoint
> Translation) bit is bit number 59. So to move the DABR-BT bit
> into the position of the DAWRX-WT bit, it has to be shifted by
> two, not only by one. This fixes hardware watchpoints in gdb of
> older guests that only use the H_SET_DABR/X interface instead
> of the new H_SET_MODE interface.
> 
> Signed-off-by: Thomas Huth 

Reviewed-by: David Gibson 

> ---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..3983b87 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -2143,7 +2143,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>  
>   /* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */
>  2:   rlwimi  r5, r4, 5, DAWRX_DR | DAWRX_DW
> - rlwimi  r5, r4, 1, DAWRX_WT
> + rlwimi  r5, r4, 2, DAWRX_WT
>   clrrdi  r4, r4, 3
>   std r4, VCPU_DAWR(r3)
>   std r5, VCPU_DAWRX(r3)
> -- 
> 1.8.3.1
> 


-- 
David Gibson 
Senior Software Engineer, Virtualization, Red Hat




Re: [PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8

2015-11-20 Thread Laurent Vivier


On 20/11/2015 09:11, Thomas Huth wrote:
> In the old DABR register, the BT (Breakpoint Translation) bit
> is bit number 61. In the new DAWRX register, the WT (Watchpoint
> Translation) bit is bit number 59. So to move the DABR-BT bit
> into the position of the DAWRX-WT bit, it has to be shifted by
> two, not only by one. This fixes hardware watchpoints in gdb of
> older guests that only use the H_SET_DABR/X interface instead
> of the new H_SET_MODE interface.
> 
> Signed-off-by: Thomas Huth 
> ---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..3983b87 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -2143,7 +2143,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>  
>   /* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */
>  2:   rlwimi  r5, r4, 5, DAWRX_DR | DAWRX_DW
> - rlwimi  r5, r4, 1, DAWRX_WT
> + rlwimi  r5, r4, 2, DAWRX_WT
>   clrrdi  r4, r4, 3
>   std r4, VCPU_DAWR(r3)
>   std r5, VCPU_DAWRX(r3)
> 

Nice catch.

Reviewed-by: Laurent Vivier 


[PATCH] KVM: PPC: Fix emulation of H_SET_DABR/X on POWER8

2015-11-20 Thread Thomas Huth
In the old DABR register, the BT (Breakpoint Translation) bit
is bit number 61. In the new DAWRX register, the WT (Watchpoint
Translation) bit is bit number 59. So to move the DABR-BT bit
into the position of the DAWRX-WT bit, it has to be shifted by
two, not only by one. This fixes hardware watchpoints in gdb of
older guests that only use the H_SET_DABR/X interface instead
of the new H_SET_MODE interface.

Signed-off-by: Thomas Huth 
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b98889e..3983b87 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2143,7 +2143,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 
/* Emulate H_SET_DABR/X on P8 for the sake of compat mode guests */
 2: rlwimi  r5, r4, 5, DAWRX_DR | DAWRX_DW
-   rlwimi  r5, r4, 1, DAWRX_WT
+   rlwimi  r5, r4, 2, DAWRX_WT
clrrdi  r4, r4, 3
std r4, VCPU_DAWR(r3)
std r5, VCPU_DAWRX(r3)
-- 
1.8.3.1
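
[The shift count can be sanity-checked in user space. PPC_BIT below mirrors
the kernel's IBM bit-numbering macro (MSB is bit 0); the helper name is
invented for illustration and is not kernel code:]

```c
#include <assert.h>
#include <stdint.h>

/* Power ISA (IBM) bit numbering: bit 0 is the MSB, so bit n of a
 * 64-bit register has value 1ULL << (63 - n).  This mirrors the
 * kernel's PPC_BIT() macro. */
#define PPC_BIT(n)	(1ULL << (63 - (n)))

#define DABR_BT		PPC_BIT(61)	/* Breakpoint Translation */
#define DAWRX_WT	PPC_BIT(59)	/* Watchpoint Translation */

/* Moving IBM bit 61 into IBM bit 59 is a *left* shift by two in
 * value terms, matching the rlwimi shift count of 2 in the fix. */
static uint64_t dabr_bt_to_dawrx_wt(uint64_t dabr)
{
	return (dabr & DABR_BT) << 2;
}
```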



[PATCH 4/4] KVM: Use common function for VCPU lookup by id

2015-11-19 Thread Christian Borntraeger
From: David Hildenbrand 

Let's reuse the new common function for VPCU lookup by id.

Reviewed-by: Christian Borntraeger 
Reviewed-by: Dominik Dingel 
Signed-off-by: David Hildenbrand 
Signed-off-by: Christian Borntraeger 
[split out the new function into a separate patch]
---
 arch/powerpc/kvm/book3s_hv.c | 10 ++
 arch/s390/kvm/diag.c | 11 +++
 virt/kvm/kvm_main.c  | 12 +---
 3 files changed, 10 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 54b45b7..904b3b0 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -308,16 +308,10 @@ static void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
 
 static struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
 {
-   int r;
-   struct kvm_vcpu *v, *ret = NULL;
+   struct kvm_vcpu *ret;
 
mutex_lock(&kvm->lock);
-   kvm_for_each_vcpu(r, v, kvm) {
-   if (v->vcpu_id == id) {
-   ret = v;
-   break;
-   }
-   }
+   ret = kvm_lookup_vcpu(kvm, id);
mutex_unlock(&kvm->lock);
return ret;
 }
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index 5fbfb88..aaa7cc0 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -155,10 +155,8 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
 
 static int __diag_time_slice_end_directed(struct kvm_vcpu *vcpu)
 {
-   struct kvm *kvm = vcpu->kvm;
struct kvm_vcpu *tcpu;
int tid;
-   int i;
 
tid = vcpu->run->s.regs.gprs[(vcpu->arch.sie_block->ipa & 0xf0) >> 4];
vcpu->stat.diagnose_9c++;
@@ -167,12 +165,9 @@ static int __diag_time_slice_end_directed(struct kvm_vcpu 
*vcpu)
if (tid == vcpu->vcpu_id)
return 0;
 
-   kvm_for_each_vcpu(i, tcpu, kvm)
-   if (tcpu->vcpu_id == tid) {
-   kvm_vcpu_yield_to(tcpu);
-   break;
-   }
-
+   tcpu = kvm_lookup_vcpu(vcpu->kvm, tid);
+   if (tcpu)
+   kvm_vcpu_yield_to(tcpu);
return 0;
 }
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 484079e..244c9b4 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2257,7 +2257,7 @@ static int create_vcpu_fd(struct kvm_vcpu *vcpu)
 static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 {
int r;
-   struct kvm_vcpu *vcpu, *v;
+   struct kvm_vcpu *vcpu;
 
if (id >= KVM_MAX_VCPUS)
return -EINVAL;
@@ -2281,12 +2281,10 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
r = -EINVAL;
goto unlock_vcpu_destroy;
}
-
-   kvm_for_each_vcpu(r, v, kvm)
-   if (v->vcpu_id == id) {
-   r = -EEXIST;
-   goto unlock_vcpu_destroy;
-   }
+   if (kvm_lookup_vcpu(kvm, id)) {
+   r = -EEXIST;
+   goto unlock_vcpu_destroy;
+   }
 
	BUG_ON(kvm->vcpus[atomic_read(&kvm->online_vcpus)]);
 
-- 
2.3.0

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/4] KVM: s390: fix wrong lookup of VCPUs by array index

2015-11-19 Thread Christian Borntraeger
From: David Hildenbrand 

For now, VCPUs were always created sequentially with incrementing
VCPU ids. Therefore, the index in the VCPUs array matched the id.

As sequential creation might change with cpu hotplug, let's use
the correct lookup function to find a VCPU by id, not array index.

Reviewed-by: Christian Borntraeger 
Signed-off-by: David Hildenbrand 
Signed-off-by: Christian Borntraeger 
[split stable/non-stable parts]
Cc: sta...@vger.kernel.org # c3853a8: KVM: Provide function for VCPU lookup by id
---
 arch/s390/kvm/sigp.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
index da690b6..081dbd0 100644
--- a/arch/s390/kvm/sigp.c
+++ b/arch/s390/kvm/sigp.c
@@ -291,12 +291,8 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code,
   u16 cpu_addr, u32 parameter, u64 *status_reg)
 {
int rc;
-   struct kvm_vcpu *dst_vcpu;
+   struct kvm_vcpu *dst_vcpu = kvm_lookup_vcpu(vcpu->kvm, cpu_addr);
 
-   if (cpu_addr >= KVM_MAX_VCPUS)
-   return SIGP_CC_NOT_OPERATIONAL;
-
-   dst_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr);
if (!dst_vcpu)
return SIGP_CC_NOT_OPERATIONAL;
 
@@ -478,7 +474,7 @@ int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu)
trace_kvm_s390_handle_sigp_pei(vcpu, order_code, cpu_addr);
 
if (order_code == SIGP_EXTERNAL_CALL) {
-   dest_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr);
+   dest_vcpu = kvm_lookup_vcpu(vcpu->kvm, cpu_addr);
BUG_ON(dest_vcpu == NULL);
 
kvm_s390_vcpu_wakeup(dest_vcpu);
-- 
2.3.0
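The bug class this patch fixes is easy to reproduce outside the kernel. The sketch below is a userspace toy model (all names hypothetical, not kernel code): once VCPU ids stop matching array slots — as can happen with CPU hotplug — an index-based accessor returns the wrong VCPU, while an id-based scan stays correct.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 4

/* Toy model: vcpu ids need not match array slots. */
struct vcpu { int vcpu_id; };
struct vm { struct vcpu *vcpus[MAX_VCPUS]; int online; };

/* Index-based accessor, in the spirit of kvm_get_vcpu(): returns slot i. */
static struct vcpu *get_vcpu(struct vm *vm, int i)
{
	return (i >= 0 && i < vm->online) ? vm->vcpus[i] : NULL;
}

/* Id-based lookup, in the spirit of the new helper: scans for vcpu_id. */
static struct vcpu *lookup_vcpu(struct vm *vm, int id)
{
	for (int i = 0; i < vm->online; i++)
		if (vm->vcpus[i]->vcpu_id == id)
			return vm->vcpus[i];
	return NULL;
}
```

With a sequentially created VM both functions agree; the moment slot 0 holds, say, id 2, `get_vcpu(vm, 0)` silently hands back the wrong VCPU, which is exactly why the sigp code must use the id-based lookup.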



[PATCH 1/4] KVM: Provide function for VCPU lookup by id

2015-11-19 Thread Christian Borntraeger
From: David Hildenbrand 

Let's provide a function to lookup a VCPU by id.

Reviewed-by: Christian Borntraeger 
Reviewed-by: Dominik Dingel 
Signed-off-by: David Hildenbrand 
Signed-off-by: Christian Borntraeger 
[split patch from refactoring patch]
---
 include/linux/kvm_host.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 5706a21..b9f345f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -460,6 +460,17 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
 (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
 idx++)
 
+static inline struct kvm_vcpu *kvm_lookup_vcpu(struct kvm *kvm, int id)
+{
+   struct kvm_vcpu *vcpu;
+   int i;
+
+   kvm_for_each_vcpu(i, vcpu, kvm)
+   if (vcpu->vcpu_id == id)
+   return vcpu;
+   return NULL;
+}
+
 #define kvm_for_each_memslot(memslot, slots)   \
	for (memslot = &slots->memslots[0]; \
  memslot < slots->memslots + KVM_MEM_SLOTS_NUM && memslot->npages;\
-- 
2.3.0



[PATCH 0/4] Preview: vcpu lookup by id changes

2015-11-19 Thread Christian Borntraeger
Paolo,

some more patches for review before I add them in my next pull request.
Can you review the common code changes and Ack/Nack?

Alex, Paul,
can you review the power changes and Ack/Nack? 

As I want to have patch 2 in 4.4, I split up David's patch into smaller
chunks. David, can you double check that I did not mess up?
So I want to send patch 1 and patch 2 for 4.4. Patch 3 and Patch 4
are for kvm/next (4.5), but need patch 1 as well. 
So I will probably also add all patches to kvm/next, by setting up
the git branches so that, during the 4.5 merge window, kvm/next will
have my branch (which will go via kvm/master) as a common ancestor.

Ok?

Christian

David Hildenbrand (4):
  KVM: Provide function for VCPU lookup by id
  KVM: s390: fix wrong lookup of VCPUs by array index
  KVM: use heuristic for fast VCPU lookup by id
  KVM: Use common function for VCPU lookup by id

 arch/powerpc/kvm/book3s_hv.c | 10 ++
 arch/s390/kvm/diag.c | 11 +++
 arch/s390/kvm/sigp.c |  8 ++--
 include/linux/kvm_host.h | 16 
 virt/kvm/kvm_main.c  | 12 +---
 5 files changed, 28 insertions(+), 29 deletions(-)

-- 
2.3.0



[PATCH 3/4] KVM: use heuristic for fast VCPU lookup by id

2015-11-19 Thread Christian Borntraeger
From: David Hildenbrand 

Usually, VCPU ids match the array index. So let's try a fast
lookup first before falling back to the slow iteration.

Suggested-by: Christian Borntraeger 
Reviewed-by: Dominik Dingel 
Reviewed-by: Christian Borntraeger 
Signed-off-by: David Hildenbrand 
Signed-off-by: Christian Borntraeger 
---
 include/linux/kvm_host.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b9f345f..7821137 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -465,6 +465,11 @@ static inline struct kvm_vcpu *kvm_lookup_vcpu(struct kvm *kvm, int id)
struct kvm_vcpu *vcpu;
int i;
 
+   if (id < 0 || id >= KVM_MAX_VCPUS)
+   return NULL;
+   vcpu = kvm_get_vcpu(kvm, id);
+   if (vcpu && vcpu->vcpu_id == id)
+   return vcpu;
kvm_for_each_vcpu(i, vcpu, kvm)
if (vcpu->vcpu_id == id)
return vcpu;
-- 
2.3.0
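The heuristic is: try the array slot whose index equals the requested id (the common, sequentially created case) and only fall back to the linear scan on a mismatch. A minimal userspace sketch of that shape (names hypothetical, not the kernel implementation):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VCPUS 4

struct vcpu { int vcpu_id; };
struct vm { struct vcpu *vcpus[MAX_VCPUS]; int online; };

/* Heuristic lookup: probe slot [id] first, since ids usually equal the
 * array index; fall back to a full scan when they differ. */
static struct vcpu *lookup_vcpu_by_id(struct vm *vm, int id)
{
	struct vcpu *v;

	if (id < 0 || id >= MAX_VCPUS)
		return NULL;
	if (id < vm->online) {
		v = vm->vcpus[id];
		if (v && v->vcpu_id == id)
			return v;		/* fast path hit */
	}
	for (int i = 0; i < vm->online; i++)	/* slow path */
		if (vm->vcpus[i]->vcpu_id == id)
			return vm->vcpus[i];
	return NULL;
}
```

The fast path costs one extra comparison in the worst case and turns the common lookup into O(1), which is why it lives in front of the existing loop rather than replacing it.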



Re: [PATCH 0/4] Preview: vcpu lookup by id changes

2015-11-19 Thread Paolo Bonzini


On 19/11/2015 09:37, Christian Borntraeger wrote:
> 
> some more patches for review before I add them in my next pull request.
> Can you review the common code changes and Ack/Nack?
> 
> Alex, Paul,
> can you review the power changes and Ack/Nack? 
> 
> As I want to have patch 2 in 4.4, I split up David's patch into smaller
> chunks. David, can you double check that I did not mess up?
> So I want to send patch 1 and patch 2 for 4.4. Patch 3 and Patch 4
> are for kvm/next (4.5), but need patch 1 as well. 
> So I will probably also add all patches into kvm/next by setting up
> the git branches in a way that during 4.5 merge window kvm/next will
> have my branch that will go via kvm/master as common ancestor.
> 
> Ok?

Patches and plan both look good!

I can apply patch 1 and 2 now to kvm/master.  By the time kvm/next forks
from Linus's tree (sometime next week, since kvm/queue is already
largish) they will be in.

Thanks,


Re: [PATCH 0/4] Preview: vcpu lookup by id changes

2015-11-19 Thread Christian Borntraeger
On 11/19/2015 10:38 AM, Paolo Bonzini wrote:
> 
> 
> On 19/11/2015 09:37, Christian Borntraeger wrote:
>>
>> some more patches for review before I add them in my next pull request.
>> Can you review the common code changes and Ack/Nack?
>>
>> Alex, Paul,
>> can you review the power changes and Ack/Nack? 
>>
>> As I want to have patch 2 in 4.4, I split up David's patch into smaller
>> chunks. David, can you double check that I did not mess up?
>> So I want to send patch 1 and patch 2 for 4.4. Patch 3 and Patch 4
>> are for kvm/next (4.5), but need patch 1 as well. 
>> So I will probably also add all patches into kvm/next by setting up
>> the git branches in a way that during 4.5 merge window kvm/next will
>> have my branch that will go via kvm/master as common ancestor.
>>
>> Ok?
> 
> Patches and plan both look good!
> 
> I can apply patch 1 and 2 now to kvm/master.  By the time kvm/next forks
> from Linus's tree (sometime next week, since kvm/queue is already
> largish) they will be in.

I have 2 or 3 more patches for 4.4 and I will prepare a pull request today. So
either take them or wait for my request. Whatever works best for you.





Re: [PATCH 0/4] Preview: vcpu lookup by id changes

2015-11-19 Thread Paolo Bonzini


On 19/11/2015 10:51, Christian Borntraeger wrote:
> > I can apply patch 1 and 2 now to kvm/master.  By the time kvm/next forks
> > from Linus's tree (sometime next week, since kvm/queue is already
> > largish) they will be in.
> 
> I have 2 or 3 more patches for 4.4 and I will prepare a pull request today. So
> either take them or wait for my request. Whatever works best for you.

I'll wait then.  ARM folks also have a couple patches for me. :)

Paolo


Re: [PATCH 2/4] KVM: s390: fix wrong lookup of VCPUs by array index

2015-11-19 Thread Christian Borntraeger
Sigh.
Seems that my mail script got confused by the # after stable. So please
strip down the cc list.


On 11/19/2015 09:37 AM, Christian Borntraeger wrote:
> From: David Hildenbrand 
> 
> For now, VCPUs were always created sequentially with incrementing
> VCPU ids. Therefore, the index in the VCPUs array matched the id.
> 
> As sequential creation might change with cpu hotplug, let's use
> the correct lookup function to find a VCPU by id, not array index.
> 
> Reviewed-by: Christian Borntraeger 
> Signed-off-by: David Hildenbrand 
> Signed-off-by: Christian Borntraeger 
> [split stable/non-stable parts]
> Cc: sta...@vger.kernel.org # c3853a8: KVM: Provide function for VCPU lookup by id
> ---
>  arch/s390/kvm/sigp.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/s390/kvm/sigp.c b/arch/s390/kvm/sigp.c
> index da690b6..081dbd0 100644
> --- a/arch/s390/kvm/sigp.c
> +++ b/arch/s390/kvm/sigp.c
> @@ -291,12 +291,8 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code,
>  u16 cpu_addr, u32 parameter, u64 *status_reg)
>  {
>   int rc;
> - struct kvm_vcpu *dst_vcpu;
> + struct kvm_vcpu *dst_vcpu = kvm_lookup_vcpu(vcpu->kvm, cpu_addr);
> 
> - if (cpu_addr >= KVM_MAX_VCPUS)
> - return SIGP_CC_NOT_OPERATIONAL;
> -
> - dst_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr);
>   if (!dst_vcpu)
>   return SIGP_CC_NOT_OPERATIONAL;
> 
> @@ -478,7 +474,7 @@ int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu)
>   trace_kvm_s390_handle_sigp_pei(vcpu, order_code, cpu_addr);
> 
>   if (order_code == SIGP_EXTERNAL_CALL) {
> - dest_vcpu = kvm_get_vcpu(vcpu->kvm, cpu_addr);
> + dest_vcpu = kvm_lookup_vcpu(vcpu->kvm, cpu_addr);
>   BUG_ON(dest_vcpu == NULL);
> 
>   kvm_s390_vcpu_wakeup(dest_vcpu);
> 



Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id

2015-11-19 Thread Thomas Huth
On 19/11/15 09:37, Christian Borntraeger wrote:
> From: David Hildenbrand 
> 
> Let's provide a function to lookup a VCPU by id.
> 
> Reviewed-by: Christian Borntraeger 
> Reviewed-by: Dominik Dingel 
> Signed-off-by: David Hildenbrand 
> Signed-off-by: Christian Borntraeger 
> [split patch from refactoring patch]
> ---
>  include/linux/kvm_host.h | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 5706a21..b9f345f 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -460,6 +460,17 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
>(vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
>idx++)
>  
> +static inline struct kvm_vcpu *kvm_lookup_vcpu(struct kvm *kvm, int id)
> +{
> + struct kvm_vcpu *vcpu;
> + int i;
> +
> + kvm_for_each_vcpu(i, vcpu, kvm)
> + if (vcpu->vcpu_id == id)
> + return vcpu;
> + return NULL;
> +}
> +

Any chance that you name this function differently? Otherwise we've got
two functions that sound very similar and also have similar prototypes:

- kvm_get_vcpu(struct kvm *kvm, int i)
- kvm_lookup_vcpu(struct kvm *kvm, int id)

I'm pretty sure this will cause confusion in the future!
==> Could you maybe name the new function something like
"kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead?

Also a short comment before the function would be nice to make the
reader aware of the difference (e.g. when hot-plugging) between the
vcpu-id and array-id.

Otherwise: +1 for finally having a proper common function for this!

 Thomas



Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id

2015-11-19 Thread David Hildenbrand
> 
> Any chance that you name this function differently? Otherwise we've got
> two functions that sound very similar and also have similar prototypes:
> 
> - kvm_get_vcpu(struct kvm *kvm, int i)
> - kvm_lookup_vcpu(struct kvm *kvm, int id)
> 
> I'm pretty sure this will cause confusion in the future!
> ==> Could you maybe name the new function something like
> "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead?

Had that in a previous version but decided to name it that way ... hmm ...
other opinions?

> 
> Also a short comment before the function would be nice to make the
> reader aware of the difference (e.g. when hot-plugging) between the
> vcpu-id and array-id.

Also not sure, as the header directly contains the (to my eyes ;) ) simple code.
I would agree for a pure prototype. Other opinions?

> 
> Otherwise: +1 for finally having a proper common function for this!
> 

Thanks for having a look Thomas!

David



Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id

2015-11-19 Thread Paolo Bonzini


On 19/11/2015 13:55, David Hildenbrand wrote:
>> > 
>> > I'm pretty sure this will cause confusion in the future!
>> > ==> Could you maybe name the new function something like
>> > "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead?
> Had that in a previous version but decided to name it that way ... hmm ...
> other opinions?

Having _by_id at the end of the name makes sense indeed.

Paolo


Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id

2015-11-19 Thread Christian Borntraeger
On 11/19/2015 02:13 PM, Paolo Bonzini wrote:
> 
> 
> On 19/11/2015 13:55, David Hildenbrand wrote:

 I'm pretty sure this will cause confusion in the future!
 ==> Could you maybe name the new function something like
 "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead?
>> Had that in a previous version but decided to name it that way ... hmm ...
>> other opinions?
> 
> Having _by_id at the end of the name makes sense indeed.

David,
I will fix up the function name myself to kvm_get_vcpu_by_id. Ok?







Re: [PATCH 1/4] KVM: Provide function for VCPU lookup by id

2015-11-19 Thread David Hildenbrand
> On 11/19/2015 02:13 PM, Paolo Bonzini wrote:
> > 
> > 
> > On 19/11/2015 13:55, David Hildenbrand wrote:
> 
>  I'm pretty sure this will cause confusion in the future!
>  ==> Could you maybe name the new function something like
>  "kvm_lookup_vcpu_by_id" or "kvm_get_vcpu_by_id" instead?
> >> Had that in a previous version but decided to name it that way ... hmm ...
> >> other opinions?
> > 
> > Having _by_id at the end of the name makes sense indeed.
> 
> David,
> I will fix up the function name myself to kvm_get_vcpu_by_id. Ok?

Sure, thanks a lot!

David



Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-18 Thread Will Deacon
On Wed, Nov 18, 2015 at 10:29:30AM +, Andre Przywara wrote:
> On 02/11/15 14:58, Will Deacon wrote:
> > On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
> >> this series cleans up kvmtool's kernel loading functionality a bit.
> >> It has been broken out of a previous series I sent [1] and contains
> >> just the cleanup and bug fix parts, which should be less controversial
> >> and thus easier to merge ;-)
> >> I will resend the pipe loading part later on as a separate series.
> >>
> >> The first patch properly abstracts kernel loading to move
> >> responsibility into each architecture's code. It removes quite some
> >> ugly code from the generic kvm.c file.
> >> The later patches address the naive usage of read(2) to, well, read
> >> data from files. Doing this without coping with the subtleties of
> >> the UNIX read semantics (returning with less data than requested, or none
> >> at all, is not an error) can provoke hard-to-debug failures.
> >> So these patches make use of the existing and one new wrapper function
> >> to make sure we read everything we actually wanted to.
> >> The last patch moves the ARM kernel loading code into the proper
> >> location to be in line with the other architectures.
> >>
> >> Please have a look and give some comments!
> > 
> > Looks good to me, but I'd like to see some comments from some mips/ppc/x86
> > people on the changes you're making over there.
> 
> Sounds reasonable, but no answers yet.
> 
> Can you take at least patch 1 and 2 meanwhile, preferably 6 and 7 (the
> ARM parts) also if you are OK with it?
> I have other patches that depend on 1/7 and 2/7, so having them upstream
> would help me and reduce further dependency churn.
> I am happy to resend the remaining patches for further discussion later.

We let them sit on the list for a while with no comments, so I just pushed
out your series. If a bug report shows up, then we can always revert the
offending patch if there's no quick fix.

Will
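The short-read subtlety the series addresses is worth spelling out: read(2) may legitimately return fewer bytes than requested, or -1 with errno set to EINTR, and neither is an error. A read-in-full wrapper of the kind the cover letter describes looks roughly like this (a hedged sketch, not kvmtool's actual helper):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly `count` bytes unless EOF arrives first.  A short return
 * from read(2) is not an error, so keep looping; retry on EINTR.
 * Returns the number of bytes actually read, or -1 on a real error. */
static ssize_t read_in_full(int fd, void *buf, size_t count)
{
	char *p = buf;
	size_t total = 0;

	while (total < count) {
		ssize_t n = read(fd, p + total, count - total);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: just retry */
			return -1;
		}
		if (n == 0)
			break;			/* EOF */
		total += n;
	}
	return (ssize_t)total;
}
```

A caller loading a kernel image would then treat any return shorter than the expected size as a truncated file rather than silently using a half-filled buffer.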


Re: [PATCH] KVM: powerpc: kvmppc_visible_gpa can be boolean

2015-11-18 Thread Paolo Bonzini


On 16/11/2015 04:10, Yaowei Bai wrote:
> In another patch kvm_is_visible_gfn is made to return bool, since this
> function only returns zero or one as its return value; let's also make
> kvmppc_visible_gpa return bool to keep things consistent.
> 
> No functional change.
> 
> Signed-off-by: Yaowei Bai 
> ---
>  arch/powerpc/kvm/book3s_pr.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index 64891b0..70fb08d 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -512,7 +512,7 @@ static void kvmppc_patch_dcbz(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
>   put_page(hpage);
>  }
>  
> -static int kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
> +static bool kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
>  {
>   ulong mp_pa = vcpu->arch.magic_page_pa;
>  
> @@ -521,7 +521,7 @@ static int kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
>  
>   gpa &= ~0xFFFULL;
>   if (unlikely(mp_pa) && unlikely((mp_pa & KVM_PAM) == (gpa & KVM_PAM))) {
> - return 1;
> + return true;
>   }
>  
>   return kvm_is_visible_gfn(vcpu->kvm, gpa >> PAGE_SHIFT);
> 

Applied all three patches to kvm/queue, thanks.

Paolo


Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-18 Thread Andre Przywara
Hi Will,

On 02/11/15 14:58, Will Deacon wrote:
> On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
>> Hi,
> 
> Hello Andre,
> 
>> this series cleans up kvmtool's kernel loading functionality a bit.
>> It has been broken out of a previous series I sent [1] and contains
>> just the cleanup and bug fix parts, which should be less controversial
>> and thus easier to merge ;-)
>> I will resend the pipe loading part later on as a separate series.
>>
>> The first patch properly abstracts kernel loading to move
>> responsibility into each architecture's code. It removes quite some
>> ugly code from the generic kvm.c file.
>> The later patches address the naive usage of read(2) to, well, read
>> data from files. Doing this without coping with the subtleties of
>> the UNIX read semantics (returning with less data than requested, or none
>> at all, is not an error) can provoke hard-to-debug failures.
>> So these patches make use of the existing and one new wrapper function
>> to make sure we read everything we actually wanted to.
>> The last patch moves the ARM kernel loading code into the proper
>> location to be in line with the other architectures.
>>
>> Please have a look and give some comments!
> 
> Looks good to me, but I'd like to see some comments from some mips/ppc/x86
> people on the changes you're making over there.

Sounds reasonable, but no answers yet.

Can you take at least patch 1 and 2 meanwhile, preferably 6 and 7 (the
ARM parts) also if you are OK with it?
I have other patches that depend on 1/7 and 2/7, so having them upstream
would help me and reduce further dependency churn.
I am happy to resend the remaining patches for further discussion later.

Cheers,
Andre.


[PATCH] KVM: powerpc: kvmppc_visible_gpa can be boolean

2015-11-15 Thread Yaowei Bai
In another patch kvm_is_visible_gfn is made to return bool, since this
function only returns zero or one as its return value; let's also make
kvmppc_visible_gpa return bool to keep things consistent.

No functional change.

Signed-off-by: Yaowei Bai 
---
 arch/powerpc/kvm/book3s_pr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 64891b0..70fb08d 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -512,7 +512,7 @@ static void kvmppc_patch_dcbz(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte)
put_page(hpage);
 }
 
-static int kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
+static bool kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
 {
ulong mp_pa = vcpu->arch.magic_page_pa;
 
@@ -521,7 +521,7 @@ static int kvmppc_visible_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
 
gpa &= ~0xFFFULL;
if (unlikely(mp_pa) && unlikely((mp_pa & KVM_PAM) == (gpa & KVM_PAM))) {
-   return 1;
+   return true;
}
 
return kvm_is_visible_gfn(vcpu->kvm, gpa >> PAGE_SHIFT);
-- 
1.9.1





Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-13 Thread Aravinda Prasad


On Friday 13 November 2015 01:08 PM, Thomas Huth wrote:
> On 13/11/15 07:26, Aravinda Prasad wrote:
>>
>> On Friday 13 November 2015 07:20 AM, David Gibson wrote:
>>> On Thu, Nov 12, 2015 at 11:22:29PM +0530, Aravinda Prasad wrote:
> [...]
 So thinking whether qemu should explicitly enable the new NMI
 behavior.
>>>
>>> So, I think the reasoning above tends towards having qemu control the
>>> MC behaviour.  If qemu does nothing, MCs are delivered direct to
>>> 0x200, if it enables the new handling, they cause a KVM exit and qemu
>>> will deliver the MC.
>>
>> This essentially requires qemu to control how KVM behaves as KVM does
>> the actual redirection of MC either to guest's 0x200 vector or to exit
>> guest. So, if we are running new qemu, then KVM should exit guest and if
>> we are running old qemu, KVM should redirect MC to 0x200. Is there any
>> way to communicate this to KVM? ioctl?
> 
> Simply introduce a KVM capability that can be enabled by userspace.
> See kvm_vcpu_ioctl_enable_cap() in arch/powerpc/kvm/powerpc.c.

Thanks. I will look at it.

Regards,
Aravinda

> 
>  Thomas
> 

-- 
Regards,
Aravinda



Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-12 Thread Daniel Axtens
Aravinda Prasad  writes:

>> Yeah, it would be good not to break this.
>
> I am not familiar with CAPI. Does this affect CAPI?

When a CAPI card experiences an EEH event, any cache lines it holds are
filled with SUEs (Special UEs, interpreted by the kernel the same as
regular UEs). When these are read, we get an MCE. Currently CAPI does
not support virtualisation, but that is actively being worked
on. There's a _very_ good chance it will then be backported to various
distros, which could have old qemu.

Therefore, I'd ideally like to make sure UEs in KVM guests work properly
and continue to do so in all combinations of kernel and qemu. I'm
following up the email from Mahesh that you linked: I'm not sure I quite
follow his logic so I'll try to make sense of that and then get back to
you.

Regards,
Daniel





Re: [GIT PULL] Please pull my kvm-ppc-fixes branch

2015-11-12 Thread Paolo Bonzini


On 12/11/2015 06:35, Paul Mackerras wrote:
> Paolo,
> 
> I have two fixes for HV KVM which I would like to have included in
> v4.4-rc1.  The first one is a fix for a bug identified by Red Hat
> which causes occasional guest crashes.  The second one fixes a bug
> which causes host stalls and timeouts under certain circumstances when
> the host is configured for static 2-way micro-threading mode.
> 
> Thanks,
> Paul.
> 
> The following changes since commit a3eaa8649e4c6a6afdafaa04b9114fb230617bb1:
> 
>   KVM: VMX: Fix commit which broke PML (2015-11-05 11:34:11 +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git kvm-ppc-fixes
> 
> for you to fetch changes up to f74f2e2e26199f695ca3df94f29e9ab7cb707ea4:
> 
>   KVM: PPC: Book3S HV: Don't dynamically split core when already split (2015-11-06 16:02:59 +1100)
> 
> 
> Paul Mackerras (2):
>   KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails
>   KVM: PPC: Book3S HV: Don't dynamically split core when already split
> 
>  arch/powerpc/kvm/book3s_hv.c|  2 +-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 
>  2 files changed, 13 insertions(+), 9 deletions(-)
> 

Pulled, thanks.

Paolo


Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-12 Thread David Gibson
On Thu, Nov 12, 2015 at 11:22:29PM +0530, Aravinda Prasad wrote:
> 
> 
> On Thursday 12 November 2015 10:13 AM, David Gibson wrote:
> > On Thu, Nov 12, 2015 at 10:02:10AM +0530, Aravinda Prasad wrote:
> >>
> >>
> >> On Thursday 12 November 2015 09:08 AM, David Gibson wrote:
> >>> On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote:
>  Aravinda Prasad  writes:
> 
> > This patch modifies KVM to cause a guest exit with
> > KVM_EXIT_NMI instead of immediately delivering a 0x200
> > interrupt to guest upon machine check exception in
> > guest address. Exiting the guest enables QEMU to build
> > error log and deliver machine check exception to guest
> > OS (either via guest OS registered machine check
> > handler or via 0x200 guest OS interrupt vector).
> >
> > This approach simplifies the delivering of machine
> > check exception to guest OS compared to the earlier approach
> > of KVM directly invoking 0x200 guest interrupt vector.
> > In the earlier approach QEMU patched the 0x200 interrupt
> > vector during boot. The patched code at 0x200 issued a
> > private hcall to pass the control to QEMU to build the
> > error log.
> >
> > This design/approach is based on the feedback for the
> > QEMU patches to handle machine check exception. Details
> > of earlier approach of handling machine check exception
> > in QEMU and related discussions can be found at:
> >
> > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> 
>  I've poked at the MCE code, but not the KVM MCE code, so I may be
>  mistaken here, but I'm not clear on how this handles errors that the
>  guest can recover without terminating.
> 
>  For example, a Linux guest can handle a UE in guest userspace by killing
>  the guest process. A hypothetical non-Linux guest with a microkernel
>  could even survive UEs in drivers.
> 
>  It sounds from your patch like you're changing this behaviour. Is this
>  right?
> >>>
> >>> So, IIUC.  Once the qemu pieces are in place as well it shouldn't
> >>> change this behaviour: KVM will exit to qemu, qemu will log the error
> >>> information (new), then reinject the MC to the guest which can still
> >>> handle it as you describe above.
> >>
> >> Yes. With KVM and QEMU both in place this will not change the behavior.
> >> QEMU will inject the UE to guest and the guest handles the UE based on
> >> where it occurred. For example if an UE happens in a guest process
> >> address space, that process will be killed.
> >>
> >>>
> >>> But, there could be a problem if you have a new kernel with an old
> >>> qemu, in that case qemu might not understand the new exit type and
> >>> treat it as a fatal error, even though the guest could actually cope
> >>> with it.
> >>
> >> In case of new kernel and old QEMU, the guest terminates as old QEMU
> >> does not understand the NMI exit reason. However, this is the case with
> >> old kernel and old QEMU as they do not handle UE belonging to guest. The
> >> difference is that the guest kernel terminates with different error
> >> code.
> > 
> > Ok.. assuming the guest has code to handle the UE in 0x200, why would
> > the guest terminate with old kernel and old qemu?  I haven't quite
> > followed the logic.
> 
> I overlooked it. I think I need to take into consideration whether the guest
> issued "ibm,nmi-register". If the guest has issued "ibm,nmi-register"
> then we should not jump to 0x200 upon UE. With the old kernel and old
> QEMU this is broken as we always jump to 0x200.
> 
> However, if the guest has not issued "ibm,nmi-register" then upon UE we
> should jump to 0x200. If new kernel is used with old QEMU this
> functionality breaks as it causes guest to terminate with unhandled NMI
> exit.
> 
> So I am thinking about whether qemu should explicitly enable the new NMI
> behavior.

So, I think the reasoning above tends towards having qemu control the
MC behaviour.  If qemu does nothing, MCs are delivered direct to
0x200, if it enables the new handling, they cause a KVM exit and qemu
will deliver the MC.

Then I'd expect qemu to switch on the new-style handling from
ibm,nmi-register.
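David's suggestion above can be sketched as a tiny decision helper on the QEMU side. This is purely illustrative: the struct and function names are invented for the sketch and are not actual QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical QEMU-side state deciding where a machine check goes.
 * Names (spapr_mc_*) are illustrative, not the real QEMU API. */
struct spapr_mc_state {
    bool kernel_has_nmi_exit;   /* new kernel can exit with KVM_EXIT_NMI */
    bool guest_nmi_registered;  /* guest issued ibm,nmi-register */
};

/* Returns true when the MC should take the new path (KVM exit, QEMU
 * builds the error log and reinjects); false when it should be
 * delivered directly to the guest's 0x200 vector. */
static bool spapr_mc_use_new_path(const struct spapr_mc_state *s)
{
    return s->kernel_has_nmi_exit && s->guest_nmi_registered;
}
```

With this shape, an old guest that never calls ibm,nmi-register keeps the legacy 0x200 delivery even on a new kernel/QEMU pair.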

> 
> Regards,
> Aravinda
> 
> > 
> >>
> >> old kernel and old QEMU -> guest panics [1] irrespective of where UE
> >>happened in guest address space.
> >> old kernel and new QEMU -> guest panics. same as above.
> >> new kernel and old QEMU -> guest terminates with unhandled NMI error
> >>irrespective of where UE happened in guest
> >> new kernel and new QEMU -> guest handles UEs in process address space
> >>by killing the process. guest terminates
> >>for UEs in guest kernel address space.
> >>
> >> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118329.html
> >>
> >>>
> >>> Aravinda, do we need to change 

Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-12 Thread Aravinda Prasad


On Friday 13 November 2015 07:20 AM, David Gibson wrote:
> On Thu, Nov 12, 2015 at 11:22:29PM +0530, Aravinda Prasad wrote:

[...]

>>
>> I overlooked it. I think I need to take into consideration whether the guest
>> issued "ibm,nmi-register". If the guest has issued "ibm,nmi-register"
>> then we should not jump to 0x200 upon UE. With the old kernel and old
>> QEMU this is broken as we always jump to 0x200.
>>
>> However, if the guest has not issued "ibm,nmi-register" then upon UE we
>> should jump to 0x200. If new kernel is used with old QEMU this
>> functionality breaks as it causes guest to terminate with unhandled NMI
>> exit.
>>
>> So I am thinking about whether qemu should explicitly enable the new NMI
>> behavior.
> 
> So, I think the reasoning above tends towards having qemu control the
> MC behaviour.  If qemu does nothing, MCs are delivered direct to
> 0x200, if it enables the new handling, they cause a KVM exit and qemu
> will deliver the MC.

This essentially requires qemu to control how KVM behaves as KVM does
the actual redirection of MC either to guest's 0x200 vector or to exit
guest. So, if we are running new qemu, then KVM should exit guest and if
we are running old qemu, KVM should redirect MC to 0x200. Is there any
way to communicate this to KVM? ioctl?

However, this should not be a problem (in terms of breaking existing
functionality) with old KVM as it always redirects MC to 0x200
irrespective of old or new qemu.

> 
> Then I'd expect qemu to switch on the new-style handling from
> ibm,nmi-register.
> 
>>

-- 
Regards,
Aravinda

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-12 Thread Thomas Huth
On 13/11/15 07:26, Aravinda Prasad wrote:
> 
> On Friday 13 November 2015 07:20 AM, David Gibson wrote:
>> On Thu, Nov 12, 2015 at 11:22:29PM +0530, Aravinda Prasad wrote:
[...]
>>> So thinking whether qemu should explicitly enable the new NMI
>>> behavior.
>>
>> So, I think the reasoning above tends towards having qemu control the
>> MC behaviour.  If qemu does nothing, MCs are delivered direct to
>> 0x200, if it enables the new handling, they cause a KVM exit and qemu
>> will deliver the MC.
> 
> This essentially requires qemu to control how KVM behaves as KVM does
> the actual redirection of MC either to guest's 0x200 vector or to exit
> guest. So, if we are running new qemu, then KVM should exit guest and if
> we are running old qemu, KVM should redirect MC to 0x200. Is there any
> way to communicate this to KVM? ioctl?

Simply introduce a KVM capability that can be enabled by userspace.
See kvm_vcpu_ioctl_enable_cap() in arch/powerpc/kvm/powerpc.c.

 Thomas
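A minimal sketch of the pattern Thomas describes: a capability number (a placeholder here, not a real KVM capability) that userspace enables once via an ENABLE_CAP-style ioctl, flipping a per-vcpu flag that the machine-check exit path would consult. Everything below is invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define KVM_CAP_PPC_FWNMI_SKETCH 9999   /* placeholder, not a real capability */

struct vcpu_sketch {
    bool fwnmi_enabled;   /* deliver MCs as KVM_EXIT_NMI when true */
};

/* Modeled loosely on the switch in kvm_vcpu_ioctl_enable_cap():
 * unknown capabilities are rejected, known ones set a flag. */
static int enable_cap_sketch(struct vcpu_sketch *vcpu, unsigned int cap)
{
    switch (cap) {
    case KVM_CAP_PPC_FWNMI_SKETCH:
        vcpu->fwnmi_enabled = true;   /* old QEMU never calls this, so
                                         MCs keep going to 0x200 */
        return 0;
    default:
        return -EINVAL;
    }
}
```

Because an old QEMU never issues the enable, the kernel default stays backward compatible.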



Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-12 Thread Aravinda Prasad


On Friday 13 November 2015 03:07 AM, Daniel Axtens wrote:
> Aravinda Prasad  writes:
> 
>>> Yeah, it would be good not to break this.
>>
>> I am not familiar with CAPI. Does this affect CAPI?
> 
> When a CAPI card experiences an EEH event, any cache lines it holds are
> filled with SUEs (Special UEs, interpreted by the kernel the same as
> regular UEs). When these are read, we get an MCE. Currently CAPI does
> not support virtualisation, but that is actively being worked
> on. There's a _very_ good chance it will then be backported to various
> distros, which could have old qemu.
> 
> Therefore, I'd ideally like to make sure UEs in KVM guests work properly
> and continue to do so in all combinations of kernel and qemu. I'm
> following up the email from Mahesh that you linked: I'm not sure I quite
> follow his logic so I'll try to make sense of that and then get back to
> you.

sure. Thanks.

Regards,
Aravinda

> 
> Regards,
> Daniel
> 

-- 
Regards,
Aravinda



Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-12 Thread Aravinda Prasad


On Thursday 12 November 2015 10:13 AM, David Gibson wrote:
> On Thu, Nov 12, 2015 at 10:02:10AM +0530, Aravinda Prasad wrote:
>>
>>
>> On Thursday 12 November 2015 09:08 AM, David Gibson wrote:
>>> On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote:
 Aravinda Prasad  writes:

> This patch modifies KVM to cause a guest exit with
> KVM_EXIT_NMI instead of immediately delivering a 0x200
> interrupt to guest upon machine check exception in
> guest address. Exiting the guest enables QEMU to build
> error log and deliver machine check exception to guest
> OS (either via guest OS registered machine check
> handler or via 0x200 guest OS interrupt vector).
>
> This approach simplifies the delivering of machine
> check exception to guest OS compared to the earlier approach
> of KVM directly invoking 0x200 guest interrupt vector.
> In the earlier approach QEMU patched the 0x200 interrupt
> vector during boot. The patched code at 0x200 issued a
> private hcall to pass the control to QEMU to build the
> error log.
>
> This design/approach is based on the feedback for the
> QEMU patches to handle machine check exception. Details
> of earlier approach of handling machine check exception
> in QEMU and related discussions can be found at:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html

 I've poked at the MCE code, but not the KVM MCE code, so I may be
 mistaken here, but I'm not clear on how this handles errors that the
 guest can recover without terminating.

 For example, a Linux guest can handle a UE in guest userspace by killing
 the guest process. A hypothetical non-Linux guest with a microkernel
 could even survive UEs in drivers.

 It sounds from your patch like you're changing this behaviour. Is this
 right?
>>>
>>> So, IIUC.  Once the qemu pieces are in place as well it shouldn't
>>> change this behaviour: KVM will exit to qemu, qemu will log the error
>>> information (new), then reinject the MC to the guest which can still
>>> handle it as you describe above.
>>
>> Yes. With KVM and QEMU both in place this will not change the behavior.
>> QEMU will inject the UE to guest and the guest handles the UE based on
>> where it occurred. For example, if a UE happens in a guest process
>> address space, that process will be killed.
>>
>>>
>>> But, there could be a problem if you have a new kernel with an old
>>> qemu, in that case qemu might not understand the new exit type and
>>> treat it as a fatal error, even though the guest could actually cope
>>> with it.
>>
>> In the case of a new kernel and old QEMU, the guest terminates as old
>> QEMU does not understand the NMI exit reason. However, this is also the
>> case with an old kernel and old QEMU, as they do not handle UEs belonging
>> to the guest. The difference is that the guest kernel terminates with a
>> different error code.
> 
> Ok.. assuming the guest has code to handle the UE in 0x200, why would
> the guest terminate with old kernel and old qemu?  I haven't quite
> followed the logic.

I overlooked it. I think I need to take into consideration whether the guest
issued "ibm,nmi-register". If the guest has issued "ibm,nmi-register"
then we should not jump to 0x200 upon UE. With the old kernel and old
QEMU this is broken as we always jump to 0x200.

However, if the guest has not issued "ibm,nmi-register" then upon UE we
should jump to 0x200. If new kernel is used with old QEMU this
functionality breaks as it causes guest to terminate with unhandled NMI
exit.

So I am thinking about whether qemu should explicitly enable the new NMI behavior.

Regards,
Aravinda

> 
>>
>> old kernel and old QEMU -> guest panics [1] irrespective of where UE
>>happened in guest address space.
>> old kernel and new QEMU -> guest panics. same as above.
>> new kernel and old QEMU -> guest terminates with unhandled NMI error
>>irrespective of where UE happened in guest
>> new kernel and new QEMU -> guest handles UEs in process address space
>>by killing the process. guest terminates
>>for UEs in guest kernel address space.
>>
>> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118329.html
>>
>>>
>>> Aravinda, do we need to change this so that qemu has to explicitly
>>> enable the new NMI behaviour?  Or have I missed something that will
>>> make that case work already.
>>
>> I think we don't need to explicitly enable the new behavior. With new
>> kernel and new QEMU this should just work. As mentioned above this is
>> already broken for old kernel/QEMU. Any thoughts?
>>
>> Regards,
>> Aravinda
>>
>>>
>>>
>>>
>>> ___
>>> Linuxppc-dev mailing list
>>> linuxppc-...@lists.ozlabs.org
>>> https://lists.ozlabs.org/listinfo/linuxppc-dev
>>>
>>
> 

-- 
Regards,

Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-12 Thread Aravinda Prasad


On Thursday 12 November 2015 10:28 AM, Daniel Axtens wrote:
> 
>> So, IIUC.  Once the qemu pieces are in place as well it shouldn't
>> change this behaviour: KVM will exit to qemu, qemu will log the error
>> information (new), then reinject the MC to the guest which can still
>> handle it as you describe above.
> 
> Ah, that makes *much* more sense now! Thanks for the explanation: I
> don't really follow qemu development.
> 
>>
>> But, there could be a problem if you have a new kernel with an old
>> qemu, in that case qemu might not understand the new exit type and
>> treat it as a fatal error, even though the guest could actually cope
>> with it.
>>
>> Aravinda, do we need to change this so that qemu has to explicitly
>> enable the new NMI behaviour?  Or have I missed something that will
>> make that case work already.
> 
> Yeah, it would be good not to break this.

I am not familiar with CAPI. Does this affect CAPI?

Regards,
Aravinda

> 
> Regards,
> Daniel
> 
> 
>> -- 
>> David Gibson | I'll have my music baroque, and my code
>> david AT gibson.dropbear.id.au   | minimalist, thank you.  NOT _the_ _other_
>>  | _way_ _around_!
>> http://www.ozlabs.org/~dgibson
> 

-- 
Regards,
Aravinda



[PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-11 Thread Aravinda Prasad
This patch modifies KVM to cause a guest exit with
KVM_EXIT_NMI instead of immediately delivering a 0x200
interrupt to guest upon machine check exception in
guest address. Exiting the guest enables QEMU to build
error log and deliver machine check exception to guest
OS (either via guest OS registered machine check
handler or via 0x200 guest OS interrupt vector).

This approach simplifies the delivering of machine
check exception to guest OS compared to the earlier approach
of KVM directly invoking 0x200 guest interrupt vector.
In the earlier approach QEMU patched the 0x200 interrupt
vector during boot. The patched code at 0x200 issued a
private hcall to pass the control to QEMU to build the
error log.

This design/approach is based on the feedback for the
QEMU patches to handle machine check exception. Details
of earlier approach of handling machine check exception
in QEMU and related discussions can be found at:

https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
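For context, the userspace counterpart this patch assumes would dispatch on the new exit reason roughly as below. This is an illustrative sketch only: the handler names and the `qemu_knows_nmi` flag are invented; only the KVM_EXIT_NMI constant mirrors <linux/kvm.h>.

```c
#include <assert.h>

#define KVM_EXIT_NMI 16   /* value as defined in <linux/kvm.h> */

enum mc_action { MC_INJECT_TO_GUEST, MC_UNHANDLED };

/* New QEMU builds the RTAS error log and reinjects the MC to the guest
 * (its registered handler or the 0x200 vector).  Old QEMU does not know
 * this exit reason, falls through to the default, and terminates the
 * guest -- the compatibility concern discussed in this thread. */
static enum mc_action handle_kvm_exit(int exit_reason, int qemu_knows_nmi)
{
    switch (exit_reason) {
    case KVM_EXIT_NMI:
        return qemu_knows_nmi ? MC_INJECT_TO_GUEST : MC_UNHANDLED;
    default:
        return MC_UNHANDLED;
    }
}
```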

Signed-off-by: Aravinda Prasad 
---
 arch/powerpc/kvm/book3s_hv.c|   12 +++-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   31 +++
 2 files changed, 14 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 2280497..1b1dff0 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
r = RESUME_GUEST;
break;
case BOOK3S_INTERRUPT_MACHINE_CHECK:
-   /*
-* Deliver a machine check interrupt to the guest.
-* We have to do this, even if the host has handled the
-* machine check, because machine checks use SRR0/1 and
-* the interrupt might have trashed guest state in them.
-*/
-   kvmppc_book3s_queue_irqprio(vcpu,
-   BOOK3S_INTERRUPT_MACHINE_CHECK);
-   r = RESUME_GUEST;
+   /* Exit the guest with KVM_EXIT_NMI as exit reason */
+   run->exit_reason = KVM_EXIT_NMI;
+   r = RESUME_HOST;
break;
case BOOK3S_INTERRUPT_PROGRAM:
{
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b98889e..672b4f6 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
addir1, r1, 112
ld  r7, HSTATE_HOST_MSR(r13)
 
-   cmpwi   cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK
cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
beq 11f
cmpwi   cr2, r12, BOOK3S_INTERRUPT_HMI
@@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
mtmsrd  r6, 1   /* Clear RI in MSR */
mtsrr0  r8
mtsrr1  r7
-   beq cr1, 13f/* machine check */
RFI
 
/* On POWER7, we have external interrupts set to use HSRR0/1 */
@@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
mtspr   SPRN_HSRR1, r7
ba  0x500
 
-13:b   machine_check_fwnmi
-
 14:mtspr   SPRN_HSRR0, r8
mtspr   SPRN_HSRR1, r7
b   hmi_exception_after_realmode
@@ -2381,24 +2377,19 @@ machine_check_realmode:
ld  r9, HSTATE_KVM_VCPU(r13)
li  r12, BOOK3S_INTERRUPT_MACHINE_CHECK
/*
-* Deliver unhandled/fatal (e.g. UE) MCE errors to guest through
-* machine check interrupt (set HSRR0 to 0x200). And for handled
-* errors (no-fatal), just go back to guest execution with current
-* HSRR0 instead of exiting guest. This new approach will inject
-* machine check to guest for fatal error causing guest to crash.
-*
-* The old code used to return to host for unhandled errors which
-* was causing guest to hang with soft lockups inside guest and
-* makes it difficult to recover guest instance.
+* Deliver unhandled/fatal (e.g. UE) MCE errors to guest
+* by exiting the guest with KVM_EXIT_NMI exit reason (exit
+* reason set later based on trap). For handled errors
+* (non-fatal), go back to guest execution with current HSRR0
+* instead of exiting the guest. This approach will cause
+* the guest to exit in case of fatal machine check error.
 */
-   ld  r10, VCPU_PC(r9)
+   bne 2f  /* Continue guest execution? */
+   /* If not, exit the guest. SRR0/1 are already set */
+   b   mc_cont
+2: ld  r10, VCPU_PC(r9)
ld  r11, VCPU_MSR(r9)
-   bne 2f  /* Continue guest execution. */
-   /* If not, deliver a machine check.  SRR0/1 are already set */
-   li  r10, BOOK3S_INTERRUPT_MACHINE_CHECK
- 

Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-11 Thread Daniel Axtens

> So, IIUC.  Once the qemu pieces are in place as well it shouldn't
> change this behaviour: KVM will exit to qemu, qemu will log the error
> information (new), then reinject the MC to the guest which can still
> handle it as you describe above.

Ah, that makes *much* more sense now! Thanks for the explanation: I
don't really follow qemu development.

>
> But, there could be a problem if you have a new kernel with an old
> qemu, in that case qemu might not understand the new exit type and
> treat it as a fatal error, even though the guest could actually cope
> with it.
>
> Aravinda, do we need to change this so that qemu has to explicitly
> enable the new NMI behaviour?  Or have I missed something that will
> make that case work already.

Yeah, it would be good not to break this.

Regards,
Daniel


> -- 
> David Gibson  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au| minimalist, thank you.  NOT _the_ _other_
>   | _way_ _around_!
> http://www.ozlabs.org/~dgibson


Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-11 Thread David Gibson
On Thu, Nov 12, 2015 at 10:02:10AM +0530, Aravinda Prasad wrote:
> 
> 
> On Thursday 12 November 2015 09:08 AM, David Gibson wrote:
> > On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote:
> >> Aravinda Prasad  writes:
> >>
> >>> This patch modifies KVM to cause a guest exit with
> >>> KVM_EXIT_NMI instead of immediately delivering a 0x200
> >>> interrupt to guest upon machine check exception in
> >>> guest address. Exiting the guest enables QEMU to build
> >>> error log and deliver machine check exception to guest
> >>> OS (either via guest OS registered machine check
> >>> handler or via 0x200 guest OS interrupt vector).
> >>>
> >>> This approach simplifies the delivering of machine
> >>> check exception to guest OS compared to the earlier approach
> >>> of KVM directly invoking 0x200 guest interrupt vector.
> >>> In the earlier approach QEMU patched the 0x200 interrupt
> >>> vector during boot. The patched code at 0x200 issued a
> >>> private hcall to pass the control to QEMU to build the
> >>> error log.
> >>>
> >>> This design/approach is based on the feedback for the
> >>> QEMU patches to handle machine check exception. Details
> >>> of earlier approach of handling machine check exception
> >>> in QEMU and related discussions can be found at:
> >>>
> >>> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> >>
> >> I've poked at the MCE code, but not the KVM MCE code, so I may be
> >> mistaken here, but I'm not clear on how this handles errors that the
> >> guest can recover without terminating.
> >>
> >> For example, a Linux guest can handle a UE in guest userspace by killing
> >> the guest process. A hypothetical non-Linux guest with a microkernel
> >> could even survive UEs in drivers.
> >>
> >> It sounds from your patch like you're changing this behaviour. Is this
> >> right?
> > 
> > So, IIUC.  Once the qemu pieces are in place as well it shouldn't
> > change this behaviour: KVM will exit to qemu, qemu will log the error
> > information (new), then reinject the MC to the guest which can still
> > handle it as you describe above.
> 
> Yes. With KVM and QEMU both in place this will not change the behavior.
> QEMU will inject the UE to guest and the guest handles the UE based on
> where it occurred. For example, if a UE happens in a guest process
> address space, that process will be killed.
> 
> > 
> > But, there could be a problem if you have a new kernel with an old
> > qemu, in that case qemu might not understand the new exit type and
> > treat it as a fatal error, even though the guest could actually cope
> > with it.
> 
> In the case of a new kernel and old QEMU, the guest terminates as old
> QEMU does not understand the NMI exit reason. However, this is also the
> case with an old kernel and old QEMU, as they do not handle UEs belonging
> to the guest. The difference is that the guest kernel terminates with a
> different error code.

Ok.. assuming the guest has code to handle the UE in 0x200, why would
the guest terminate with old kernel and old qemu?  I haven't quite
followed the logic.

> 
> old kernel and old QEMU -> guest panics [1] irrespective of where UE
>happened in guest address space.
> old kernel and new QEMU -> guest panics. same as above.
> new kernel and old QEMU -> guest terminates with unhandled NMI error
>irrespective of where UE happened in guest
> new kernel and new QEMU -> guest handles UEs in process address space
>by killing the process. guest terminates
>for UEs in guest kernel address space.
> 
> [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118329.html
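The four kernel/QEMU combinations above can be encoded as a small decision function. This is illustrative only, not code from KVM or QEMU; it just captures the matrix as stated.

```c
#include <assert.h>
#include <stdbool.h>

enum outcome {
    GUEST_PANICS,          /* MC lands at 0x200 unconditionally; guest panics */
    UNHANDLED_NMI_EXIT,    /* QEMU does not recognize KVM_EXIT_NMI */
    UE_HANDLED_BY_QEMU,    /* QEMU builds the error log and reinjects */
};

/* Encodes the compatibility matrix from the mail above. */
static enum outcome ue_outcome(bool new_kernel, bool new_qemu)
{
    if (!new_kernel)
        return GUEST_PANICS;   /* old kernel: QEMU version is irrelevant */
    return new_qemu ? UE_HANDLED_BY_QEMU : UNHANDLED_NMI_EXIT;
}
```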
> 
> > 
> > Aravinda, do we need to change this so that qemu has to explicitly
> > enable the new NMI behaviour?  Or have I missed something that will
> > make that case work already.
> 
> I think we don't need to explicitly enable the new behavior. With new
> kernel and new QEMU this should just work. As mentioned above this is
> already broken for old kernel/QEMU. Any thoughts?
> 
> Regards,
> Aravinda
> 
> > 
> > 
> > 
> > 
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




[PATCH 1/2] KVM: PPC: Book3S HV: Prohibit setting illegal transaction state in MSR

2015-11-11 Thread Paul Mackerras
Currently it is possible for userspace (e.g. QEMU) to set a value
for the MSR for a guest VCPU which has both of the TS bits set,
which is an illegal combination.  The result of this is that when
we execute a hrfid (hypervisor return from interrupt doubleword)
instruction to enter the guest, the CPU will take a TM Bad Thing
type of program interrupt (vector 0x700).

Now, if PR KVM is configured in the kernel along with HV KVM, we
actually handle this without crashing the host or giving hypervisor
privilege to the guest; instead what happens is that we deliver a
program interrupt to the guest, with SRR0 reflecting the address
of the hrfid instruction and SRR1 containing the MSR value at that
point.  If PR KVM is not configured in the kernel, then we try to
run the host's program interrupt handler with the MMU set to the
guest context, which almost certainly causes a host crash.

This closes the hole by making kvmppc_set_msr_hv() check for the
illegal combination and force the TS field to a safe value (00,
meaning non-transactional).

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index becad3a..f668712 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -231,6 +231,12 @@ static void kvmppc_core_vcpu_put_hv(struct kvm_vcpu *vcpu)
 
 static void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr)
 {
+   /*
+* Check for illegal transactional state bit combination
+* and if we find it, force the TS field to a safe state.
+*/
+   if ((msr & MSR_TS_MASK) == MSR_TS_MASK)
+   msr &= ~MSR_TS_MASK;
vcpu->arch.shregs.msr = msr;
kvmppc_end_cede(vcpu);
 }
-- 
2.1.4
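The check added above can be exercised in isolation. The sketch below re-implements the same masking logic in plain C, with the TS bit positions written out (MSR bits 33 and 34 on Book3S, per the kernel's MSR_TS_S_LG/MSR_TS_T_LG definitions); it is a standalone illustration, not the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_TS_S    (1ULL << 33)          /* transaction suspended */
#define MSR_TS_T    (1ULL << 34)          /* transactional */
#define MSR_TS_MASK (MSR_TS_S | MSR_TS_T)

/* Both TS bits set is an architecturally illegal combination; force
 * the field to 00 (non-transactional), as kvmppc_set_msr_hv() now
 * does before storing the MSR into vcpu->arch.shregs.msr. */
static uint64_t sanitize_msr_ts(uint64_t msr)
{
    if ((msr & MSR_TS_MASK) == MSR_TS_MASK)
        msr &= ~MSR_TS_MASK;
    return msr;
}
```

Legal values (00, suspended, transactional) pass through untouched; only the illegal 11 encoding is rewritten.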



[PATCH] powerpc/64: Include KVM guest test in all interrupt vectors

2015-11-11 Thread Paul Mackerras
Currently, if HV KVM is configured but PR KVM isn't, we don't include
a test to see whether we were interrupted in KVM guest context for the
set of interrupts which get delivered directly to the guest by hardware
if they occur in the guest.  This includes things like program
interrupts.

However, the recent bug where userspace could set the MSR for a VCPU
to have an illegal value in the TS field, and thus cause a TM Bad Thing
type of program interrupt on the hrfid that enters the guest, showed that
we can never be completely sure that these interrupts can never occur
in the guest entry/exit code.  If one of these interrupts does happen
and we have HV KVM configured but not PR KVM, then we end up trying to
run the handler in the host with the MMU set to the guest MMU context,
which generally ends badly.

Thus, for robustness it is better to have the test in every interrupt
vector, so that if some way is found to trigger some interrupt in the
guest entry/exit path, we can handle it without immediately crashing
the host.

This means that the distinction between KVMTEST and KVMTEST_PR goes
away.  Thus we delete KVMTEST_PR and associated macros and use KVMTEST
everywhere that we previously used either KVMTEST_PR or KVMTEST.  It
also means that SOFTEN_TEST_HV_201 becomes the same as SOFTEN_TEST_PR,
so we delete SOFTEN_TEST_HV_201 and use SOFTEN_TEST_PR instead.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/exception-64s.h | 21 +++-
 arch/powerpc/kernel/exceptions-64s.S | 34 
 2 files changed, 20 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 77f52b2..9ee1078 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -263,17 +263,6 @@ do_kvm_##n: \
 #define KVM_HANDLER_SKIP(area, h, n)
 #endif
 
-#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
-#define KVMTEST_PR(n)  __KVMTEST(n)
-#define KVM_HANDLER_PR(area, h, n) __KVM_HANDLER(area, h, n)
-#define KVM_HANDLER_PR_SKIP(area, h, n)__KVM_HANDLER_SKIP(area, h, n)
-
-#else
-#define KVMTEST_PR(n)
-#define KVM_HANDLER_PR(area, h, n)
-#define KVM_HANDLER_PR_SKIP(area, h, n)
-#endif
-
 #define NOTEST(n)
 
 /*
@@ -360,13 +349,13 @@ label##_pSeries:  \
HMT_MEDIUM_PPR_DISCARD; \
SET_SCRATCH0(r13);  /* save r13 */  \
EXCEPTION_PROLOG_PSERIES(PACA_EXGEN, label##_common,\
-EXC_STD, KVMTEST_PR, vec)
+EXC_STD, KVMTEST, vec)
 
 /* Version of above for when we have to branch out-of-line */
 #define STD_EXCEPTION_PSERIES_OOL(vec, label)  \
.globl label##_pSeries; \
 label##_pSeries:   \
-   EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST_PR, vec);\
+   EXCEPTION_PROLOG_1(PACA_EXGEN, KVMTEST, vec);   \
EXCEPTION_PROLOG_PSERIES_1(label##_common, EXC_STD)
 
 #define STD_EXCEPTION_HV(loc, vec, label)  \
@@ -436,17 +425,13 @@ label##_relon_hv: \
 #define _SOFTEN_TEST(h, vec)   __SOFTEN_TEST(h, vec)
 
 #define SOFTEN_TEST_PR(vec)\
-   KVMTEST_PR(vec);\
+   KVMTEST(vec);   \
_SOFTEN_TEST(EXC_STD, vec)
 
 #define SOFTEN_TEST_HV(vec)\
KVMTEST(vec);   \
_SOFTEN_TEST(EXC_HV, vec)
 
-#define SOFTEN_TEST_HV_201(vec) \
-   KVMTEST(vec);   \
-   _SOFTEN_TEST(EXC_STD, vec)
-
 #define SOFTEN_NOTEST_PR(vec)  _SOFTEN_TEST(EXC_STD, vec)
 #define SOFTEN_NOTEST_HV(vec)  _SOFTEN_TEST(EXC_HV, vec)
 
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0a0399c2..1a03142 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -242,7 +242,7 @@ instruction_access_slb_pSeries:
HMT_MEDIUM_PPR_DISCARD
SET_SCRATCH0(r13)
EXCEPTION_PROLOG_0(PACA_EXSLB)
-   EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST_PR, 0x480)
+   EXCEPTION_PROLOG_1(PACA_EXSLB, KVMTEST, 0x480)
std r3,PACA_EXSLB+EX_R3(r13)
mfspr   r3,SPRN_SRR0/* SRR0 is faulting address */
 #ifdef __DISABLED__
@@ -276,18 +276,18 @@ hardware_interrupt_hv:
KVM_HANDLER(PACA_EXGEN, EXC_HV, 0x502)
FTR_SECTION_ELSE
_MASKABLE_EXCEPTION_PSERIES(0x500, hardware_interrupt,
- 

Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-11 Thread David Gibson
On Wed, Nov 11, 2015 at 10:28:46PM +0530, Aravinda Prasad wrote:
> This patch modifies KVM to cause a guest exit with
> KVM_EXIT_NMI instead of immediately delivering a 0x200
> interrupt to guest upon machine check exception in
> guest address. Exiting the guest enables QEMU to build
> error log and deliver machine check exception to guest
> OS (either via guest OS registered machine check
> handler or via 0x200 guest OS interrupt vector).
> 
> This approach simplifies the delivering of machine
> check exception to guest OS compared to the earlier approach
> of KVM directly invoking 0x200 guest interrupt vector.
> In the earlier approach QEMU patched the 0x200 interrupt
> vector during boot. The patched code at 0x200 issued a
> private hcall to pass the control to QEMU to build the
> error log.
> 
> This design/approach is based on the feedback for the
> QEMU patches to handle machine check exception. Details
> of earlier approach of handling machine check exception
> in QEMU and related discussions can be found at:
> 
> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> 
> Signed-off-by: Aravinda Prasad 
> ---
>  arch/powerpc/kvm/book3s_hv.c|   12 +++-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   31 +++
>  2 files changed, 14 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 2280497..1b1dff0 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, struct kvm_vcpu *vcpu,
>   r = RESUME_GUEST;
>   break;
>   case BOOK3S_INTERRUPT_MACHINE_CHECK:
> - /*
> -  * Deliver a machine check interrupt to the guest.
> -  * We have to do this, even if the host has handled the
> -  * machine check, because machine checks use SRR0/1 and
> -  * the interrupt might have trashed guest state in them.
> -  */
> - kvmppc_book3s_queue_irqprio(vcpu,
> - BOOK3S_INTERRUPT_MACHINE_CHECK);
> - r = RESUME_GUEST;
> + /* Exit the guest with KVM_EXIT_NMI as exit reason */
> + run->exit_reason = KVM_EXIT_NMI;
> + r = RESUME_HOST;
>   break;
>   case BOOK3S_INTERRUPT_PROGRAM:
>   {
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..672b4f6 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   addir1, r1, 112
>   ld  r7, HSTATE_HOST_MSR(r13)
>  
> - cmpwi   cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>   cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
>   beq 11f
>   cmpwi   cr2, r12, BOOK3S_INTERRUPT_HMI
> @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   mtmsrd  r6, 1   /* Clear RI in MSR */
>   mtsrr0  r8
>   mtsrr1  r7
> - beq cr1, 13f/* machine check */
>   RFI
>  
>   /* On POWER7, we have external interrupts set to use HSRR0/1 */
> @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   mtspr   SPRN_HSRR1, r7
>   ba  0x500
>  
> -13:  b   machine_check_fwnmi
> -

I don't quite understand the 3 hunks above.  The rest seems to change
the delivery of MCs as I'd expect, but the above seems to change their
detection and the reason for that isn't obvious to me.


>  14:  mtspr   SPRN_HSRR0, r8
>   mtspr   SPRN_HSRR1, r7
>   b   hmi_exception_after_realmode
> @@ -2381,24 +2377,19 @@ machine_check_realmode:
>   ld  r9, HSTATE_KVM_VCPU(r13)
>   li  r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>   /*
> -  * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through
> -  * machine check interrupt (set HSRR0 to 0x200). And for handled
> -  * errors (no-fatal), just go back to guest execution with current
> -  * HSRR0 instead of exiting guest. This new approach will inject
> -  * machine check to guest for fatal error causing guest to crash.
> -  *
> -  * The old code used to return to host for unhandled errors which
> -  * was causing guest to hang with soft lockups inside guest and
> -  * makes it difficult to recover guest instance.
> +  * Deliver unhandled/fatal (e.g. UE) MCE errors to guest
> +  * by exiting the guest with KVM_EXIT_NMI exit reason (exit
> +  * reason set later based on trap). For handled errors
> +  * (no-fatal), go back to guest execution with current HSRR0
> +  * instead of exiting the guest. This approach will cause
> +  * the guest to exit in case of fatal machine check error.
>*/
> - ld  r10, VCPU_PC(r9)
> + bne 2f  /* 

Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-11 Thread David Gibson
On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote:
> Aravinda Prasad  writes:
> 
> > This patch modifies KVM to cause a guest exit with
> > KVM_EXIT_NMI instead of immediately delivering a 0x200
> > interrupt to guest upon machine check exception in
> > guest address. Exiting the guest enables QEMU to build
> > error log and deliver machine check exception to guest
> > OS (either via guest OS registered machine check
> > handler or via 0x200 guest OS interrupt vector).
> >
> > This approach simplifies the delivering of machine
> > check exception to guest OS compared to the earlier approach
> > of KVM directly invoking 0x200 guest interrupt vector.
> > In the earlier approach QEMU patched the 0x200 interrupt
> > vector during boot. The patched code at 0x200 issued a
> > private hcall to pass the control to QEMU to build the
> > error log.
> >
> > This design/approach is based on the feedback for the
> > QEMU patches to handle machine check exception. Details
> > of earlier approach of handling machine check exception
> > in QEMU and related discussions can be found at:
> >
> > https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
> 
> I've poked at the MCE code, but not the KVM MCE code, so I may be
> mistaken here, but I'm not clear on how this handles errors that the
> guest can recover from without terminating.
> 
> For example, a Linux guest can handle a UE in guest userspace by killing
> the guest process. A hypothetical non-Linux guest with a microkernel
> could even survive UEs in drivers.
> 
> It sounds from your patch like you're changing this behaviour. Is this
> right?

So, IIUC.  Once the qemu pieces are in place as well it shouldn't
change this behaviour: KVM will exit to qemu, qemu will log the error
information (new), then reinject the MC to the guest which can still
handle it as you describe above.

But, there could be a problem if you have a new kernel with an old
qemu, in that case qemu might not understand the new exit type and
treat it as a fatal error, even though the guest could actually cope
with it.

Aravinda, do we need to change this so that qemu has to explicitly
enable the new NMI behaviour?  Or have I missed something that will
make that case work already.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


[GIT PULL] Please pull my kvm-ppc-fixes branch

2015-11-11 Thread Paul Mackerras
Paolo,

I have two fixes for HV KVM which I would like to have included in
v4.4-rc1.  The first one is a fix for a bug identified by Red Hat
which causes occasional guest crashes.  The second one fixes a bug
which causes host stalls and timeouts under certain circumstances when
the host is configured for static 2-way micro-threading mode.

Thanks,
Paul.

The following changes since commit a3eaa8649e4c6a6afdafaa04b9114fb230617bb1:

  KVM: VMX: Fix commit which broke PML (2015-11-05 11:34:11 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git kvm-ppc-fixes

for you to fetch changes up to f74f2e2e26199f695ca3df94f29e9ab7cb707ea4:

  KVM: PPC: Book3S HV: Don't dynamically split core when already split 
(2015-11-06 16:02:59 +1100)


Paul Mackerras (2):
  KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails
  KVM: PPC: Book3S HV: Don't dynamically split core when already split

 arch/powerpc/kvm/book3s_hv.c|  2 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 
 2 files changed, 13 insertions(+), 9 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-11 Thread Daniel Axtens
Aravinda Prasad  writes:

> This patch modifies KVM to cause a guest exit with
> KVM_EXIT_NMI instead of immediately delivering a 0x200
> interrupt to guest upon machine check exception in
> guest address. Exiting the guest enables QEMU to build
> error log and deliver machine check exception to guest
> OS (either via guest OS registered machine check
> handler or via 0x200 guest OS interrupt vector).
>
> This approach simplifies the delivering of machine
> check exception to guest OS compared to the earlier approach
> of KVM directly invoking 0x200 guest interrupt vector.
> In the earlier approach QEMU patched the 0x200 interrupt
> vector during boot. The patched code at 0x200 issued a
> private hcall to pass the control to QEMU to build the
> error log.
>
> This design/approach is based on the feedback for the
> QEMU patches to handle machine check exception. Details
> of earlier approach of handling machine check exception
> in QEMU and related discussions can be found at:
>
> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html

I've poked at the MCE code, but not the KVM MCE code, so I may be
mistaken here, but I'm not clear on how this handles errors that the
guest can recover from without terminating.

For example, a Linux guest can handle a UE in guest userspace by killing
the guest process. A hypothetical non-Linux guest with a microkernel
could even survive UEs in drivers.

It sounds from your patch like you're changing this behaviour. Is this
right?

Regards,
Daniel

>
> Signed-off-by: Aravinda Prasad 
> ---
>  arch/powerpc/kvm/book3s_hv.c|   12 +++-
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   31 
> +++
>  2 files changed, 14 insertions(+), 29 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 2280497..1b1dff0 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
> struct kvm_vcpu *vcpu,
>   r = RESUME_GUEST;
>   break;
>   case BOOK3S_INTERRUPT_MACHINE_CHECK:
> - /*
> -  * Deliver a machine check interrupt to the guest.
> -  * We have to do this, even if the host has handled the
> -  * machine check, because machine checks use SRR0/1 and
> -  * the interrupt might have trashed guest state in them.
> -  */
> - kvmppc_book3s_queue_irqprio(vcpu,
> - BOOK3S_INTERRUPT_MACHINE_CHECK);
> - r = RESUME_GUEST;
> + /* Exit to guest with KVM_EXIT_NMI as exit reason */
> + run->exit_reason = KVM_EXIT_NMI;
> + r = RESUME_HOST;
>   break;
>   case BOOK3S_INTERRUPT_PROGRAM:
>   {
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b98889e..672b4f6 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   addir1, r1, 112
>   ld  r7, HSTATE_HOST_MSR(r13)
>  
> - cmpwi   cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>   cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
>   beq 11f
>   cmpwi   cr2, r12, BOOK3S_INTERRUPT_HMI
> @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   mtmsrd  r6, 1   /* Clear RI in MSR */
>   mtsrr0  r8
>   mtsrr1  r7
> - beq cr1, 13f/* machine check */
>   RFI
>  
>   /* On POWER7, we have external interrupts set to use HSRR0/1 */
> @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>   mtspr   SPRN_HSRR1, r7
>   ba  0x500
>  
> -13:  b   machine_check_fwnmi
> -
>  14:  mtspr   SPRN_HSRR0, r8
>   mtspr   SPRN_HSRR1, r7
>   b   hmi_exception_after_realmode
> @@ -2381,24 +2377,19 @@ machine_check_realmode:
>   ld  r9, HSTATE_KVM_VCPU(r13)
>   li  r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>   /*
> -  * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through
> -  * machine check interrupt (set HSRR0 to 0x200). And for handled
> -  * errors (no-fatal), just go back to guest execution with current
> -  * HSRR0 instead of exiting guest. This new approach will inject
> -  * machine check to guest for fatal error causing guest to crash.
> -  *
> -  * The old code used to return to host for unhandled errors which
> -  * was causing guest to hang with soft lockups inside guest and
> -  * makes it difficult to recover guest instance.
> +  * Deliver unhandled/fatal (e.g. UE) MCE errors to guest
> +  * by exiting the guest with KVM_EXIT_NMI exit reason (exit
> +  * reason set later based on trap). For handled errors
> +  * (no-fatal), go back to guest 

Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-11 Thread Aravinda Prasad


On Thursday 12 November 2015 09:04 AM, David Gibson wrote:
> On Wed, Nov 11, 2015 at 10:28:46PM +0530, Aravinda Prasad wrote:
>> This patch modifies KVM to cause a guest exit with
>> KVM_EXIT_NMI instead of immediately delivering a 0x200
>> interrupt to guest upon machine check exception in
>> guest address. Exiting the guest enables QEMU to build
>> error log and deliver machine check exception to guest
>> OS (either via guest OS registered machine check
>> handler or via 0x200 guest OS interrupt vector).
>>
>> This approach simplifies the delivering of machine
>> check exception to guest OS compared to the earlier approach
>> of KVM directly invoking 0x200 guest interrupt vector.
>> In the earlier approach QEMU patched the 0x200 interrupt
>> vector during boot. The patched code at 0x200 issued a
>> private hcall to pass the control to QEMU to build the
>> error log.
>>
>> This design/approach is based on the feedback for the
>> QEMU patches to handle machine check exception. Details
>> of earlier approach of handling machine check exception
>> in QEMU and related discussions can be found at:
>>
>> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
>>
>> Signed-off-by: Aravinda Prasad 
>> ---
>>  arch/powerpc/kvm/book3s_hv.c|   12 +++-
>>  arch/powerpc/kvm/book3s_hv_rmhandlers.S |   31 
>> +++
>>  2 files changed, 14 insertions(+), 29 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 2280497..1b1dff0 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -859,15 +859,9 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
>> struct kvm_vcpu *vcpu,
>>  r = RESUME_GUEST;
>>  break;
>>  case BOOK3S_INTERRUPT_MACHINE_CHECK:
>> -/*
>> - * Deliver a machine check interrupt to the guest.
>> - * We have to do this, even if the host has handled the
>> - * machine check, because machine checks use SRR0/1 and
>> - * the interrupt might have trashed guest state in them.
>> - */
>> -kvmppc_book3s_queue_irqprio(vcpu,
>> -BOOK3S_INTERRUPT_MACHINE_CHECK);
>> -r = RESUME_GUEST;
>> +/* Exit to guest with KVM_EXIT_NMI as exit reason */
>> +run->exit_reason = KVM_EXIT_NMI;
>> +r = RESUME_HOST;
>>  break;
>>  case BOOK3S_INTERRUPT_PROGRAM:
>>  {
>> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
>> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> index b98889e..672b4f6 100644
>> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
>> @@ -147,7 +147,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>>  addir1, r1, 112
>>  ld  r7, HSTATE_HOST_MSR(r13)
>>  
>> -cmpwi   cr1, r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>>  cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
>>  beq 11f
>>  cmpwi   cr2, r12, BOOK3S_INTERRUPT_HMI
>> @@ -160,7 +159,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>>  mtmsrd  r6, 1   /* Clear RI in MSR */
>>  mtsrr0  r8
>>  mtsrr1  r7
>> -beq cr1, 13f/* machine check */
>>  RFI
>>  
>>  /* On POWER7, we have external interrupts set to use HSRR0/1 */
>> @@ -168,8 +166,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
>>  mtspr   SPRN_HSRR1, r7
>>  ba  0x500
>>  
>> -13: b   machine_check_fwnmi
>> -
> 
> I don't quite understand the 3 hunks above.  The rest seems to change
> the delivery of MCs as I'd expect, but the above seems to change their
> detection and the reason for that isn't obvious to me.

We need to exit the guest here, hence we continue with RFI; if we
branched to machine_check_fwnmi instead, control would flow into the
opal_recover_mce routine without the guest exiting with the NMI exit
code. And I think opal_recover_mce() is used to recover from UEs in
host user/kernel space and has no check for UEs belonging to the
guest, so branching to machine_check_fwnmi would end up in panic() in
opal_machine_check() when the UE belonged to the guest.

Regards,
Aravinda

> 
> 
>>  14: mtspr   SPRN_HSRR0, r8
>>  mtspr   SPRN_HSRR1, r7
>>  b   hmi_exception_after_realmode
>> @@ -2381,24 +2377,19 @@ machine_check_realmode:
>>  ld  r9, HSTATE_KVM_VCPU(r13)
>>  li  r12, BOOK3S_INTERRUPT_MACHINE_CHECK
>>  /*
>> - * Deliver unhandled/fatal (e.g. UE) MCE errors to guest through
>> - * machine check interrupt (set HSRR0 to 0x200). And for handled
>> - * errors (no-fatal), just go back to guest execution with current
>> - * HSRR0 instead of exiting guest. This new approach will inject
>> - * machine check to guest for fatal error causing guest to crash.
>> - *
>> - * The old code used to return to host for 

[PATCH 2/2] KVM: PPC: Book3S HV: Handle unexpected traps in guest entry/exit code better

2015-11-11 Thread Paul Mackerras
As we saw with the TM Bad Thing type of program interrupt occurring
on the hrfid that enters the guest, it is not completely impossible
to have a trap occurring in the guest entry/exit code, despite the
fact that the code has been written to avoid taking any traps.

This adds a check in the kvmppc_handle_exit_hv() function to detect
the case when a trap has occurred in the hypervisor-mode code, and
instead of treating it just like a trap in guest code, we now print
a message and return to userspace with a KVM_EXIT_INTERNAL_ERROR
exit reason.

Of the various interrupts that get handled in the assembly code in
the guest exit path and that can return directly to the guest, the
only one that can occur when MSR.HV=1 and MSR.EE=0 is machine check
(other than system call, which we can avoid just by not doing a sc
instruction).  Therefore this adds code to the machine check path to
ensure that if the MCE occurred in hypervisor mode, we exit to the
host rather than trying to continue the guest.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv.c| 18 ++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index f668712..d6baf0a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -846,6 +846,24 @@ static int kvmppc_handle_exit_hv(struct kvm_run *run, 
struct kvm_vcpu *vcpu,
 
vcpu->stat.sum_exits++;
 
+   /*
+* This can happen if an interrupt occurs in the last stages
+* of guest entry or the first stages of guest exit (i.e. after
+* setting paca->kvm_hstate.in_guest to KVM_GUEST_MODE_GUEST_HV
+* and before setting it to KVM_GUEST_MODE_HOST_HV).
+* That can happen due to a bug, or due to a machine check
+* occurring at just the wrong time.
+*/
+   if (vcpu->arch.shregs.msr & MSR_HV) {
+   printk(KERN_EMERG "KVM trap in HV mode!\n");
+   printk(KERN_EMERG "trap=0x%x | pc=0x%lx | msr=0x%llx\n",
+   vcpu->arch.trap, kvmppc_get_pc(vcpu),
+   vcpu->arch.shregs.msr);
+   kvmppc_dump_regs(vcpu);
+   run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+   run->hw.hardware_exit_reason = vcpu->arch.trap;
+   return RESUME_HOST;
+   }
run->exit_reason = KVM_EXIT_UNKNOWN;
run->ready_for_interrupt_injection = 1;
switch (vcpu->arch.trap) {
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 3c6badc..b3ce8ff 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -2404,6 +2404,8 @@ machine_check_realmode:
 * guest as machine check causing guest to crash.
 */
ld  r11, VCPU_MSR(r9)
+   rldicl. r0, r11, 64-MSR_HV_LG, 63 /* check if it happened in HV mode */
+   bne mc_cont /* if so, exit to host */
andi.   r10, r11, MSR_RI/* check for unrecoverable exception */
beq 1f  /* Deliver a machine check to guest */
ld  r10, VCPU_PC(r9)
-- 
2.1.4



Re: [PATCH] KVM: PPC: Exit guest upon fatal machine check exception

2015-11-11 Thread Aravinda Prasad


On Thursday 12 November 2015 09:08 AM, David Gibson wrote:
> On Thu, Nov 12, 2015 at 01:24:19PM +1100, Daniel Axtens wrote:
>> Aravinda Prasad  writes:
>>
>>> This patch modifies KVM to cause a guest exit with
>>> KVM_EXIT_NMI instead of immediately delivering a 0x200
>>> interrupt to guest upon machine check exception in
>>> guest address. Exiting the guest enables QEMU to build
>>> error log and deliver machine check exception to guest
>>> OS (either via guest OS registered machine check
>>> handler or via 0x200 guest OS interrupt vector).
>>>
>>> This approach simplifies the delivering of machine
>>> check exception to guest OS compared to the earlier approach
>>> of KVM directly invoking 0x200 guest interrupt vector.
>>> In the earlier approach QEMU patched the 0x200 interrupt
>>> vector during boot. The patched code at 0x200 issued a
>>> private hcall to pass the control to QEMU to build the
>>> error log.
>>>
>>> This design/approach is based on the feedback for the
>>> QEMU patches to handle machine check exception. Details
>>> of earlier approach of handling machine check exception
>>> in QEMU and related discussions can be found at:
>>>
>>> https://lists.nongnu.org/archive/html/qemu-devel/2014-11/msg00813.html
>>
>> I've poked at the MCE code, but not the KVM MCE code, so I may be
>> mistaken here, but I'm not clear on how this handles errors that the
>> guest can recover from without terminating.
>>
>> For example, a Linux guest can handle a UE in guest userspace by killing
>> the guest process. A hypothetical non-Linux guest with a microkernel
>> could even survive UEs in drivers.
>>
>> It sounds from your patch like you're changing this behaviour. Is this
>> right?
> 
> So, IIUC.  Once the qemu pieces are in place as well it shouldn't
> change this behaviour: KVM will exit to qemu, qemu will log the error
> information (new), then reinject the MC to the guest which can still
> handle it as you describe above.

Yes. With KVM and QEMU both in place this will not change the behavior.
QEMU will inject the UE into the guest and the guest handles the UE based
on where it occurred. For example, if a UE happens in a guest process
address space, that process will be killed.

> 
> But, there could be a problem if you have a new kernel with an old
> qemu, in that case qemu might not understand the new exit type and
> treat it as a fatal error, even though the guest could actually cope
> with it.

In case of a new kernel and old QEMU, the guest terminates as old QEMU
does not understand the NMI exit reason. However, this was already the
case with an old kernel and old QEMU, as they do not handle UEs belonging
to the guest. The difference is only that the guest kernel terminates
with a different error code.

old kernel and old QEMU -> guest panics [1] irrespective of where UE
   happened in guest address space.
old kernel and new QEMU -> guest panics. same as above.
new kernel and old QEMU -> guest terminates with unhandled NMI error
   irrespective of where UE happened in guest
new kernel and new QEMU -> guest handles UEs in process address space
   by killing the process. guest terminates
   for UEs in guest kernel address space.

[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-June/118329.html

> 
> Aravinda, do we need to change this so that qemu has to explicitly
> enable the new NMI behaviour?  Or have I missed something that will
> make that case work already.

I think we don't need to explicitly enable the new behavior. With new
kernel and new QEMU this should just work. As mentioned above this is
already broken for old kernel/QEMU. Any thoughts?

Regards,
Aravinda

> 
> 
> 
> ___
> Linuxppc-dev mailing list
> linuxppc-...@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

-- 
Regards,
Aravinda



[PATCH] KVM: PPC: Book3S HV: Don't dynamically split core when already split

2015-11-05 Thread Paul Mackerras
In static micro-threading modes, the dynamic micro-threading code
is supposed to be disabled, because subcores can't make independent
decisions about what micro-threading mode to put the core in - there is
only one micro-threading mode for the whole core.  The code that
implements dynamic micro-threading checks for this, except that the
check was missed in one case.  This means that it is possible for a
subcore in static 2-way micro-threading mode to try to put the core
into 4-way micro-threading mode, which usually leads to stuck CPUs,
spinlock lockups, and other stalls in the host.

The problem was in the can_split_piggybacked_subcores() function, which
should always return false if the system is in a static micro-threading
mode.  This fixes the problem by making can_split_piggybacked_subcores()
use subcore_config_ok() for its checks, as subcore_config_ok() includes
the necessary check for the static micro-threading modes.

Credit to Gautham Shenoy for working out that the reason for the hangs
and stalls we were seeing was that we were trying to do dynamic 4-way
micro-threading while we were in static 2-way mode.

Fixes: b4deba5c41e9
Cc: v...@stable.kernel.org # v4.3
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/kvm/book3s_hv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 2280497..becad3a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2060,7 +2060,7 @@ static bool can_split_piggybacked_subcores(struct 
core_info *cip)
return false;
n_subcores += (cip->subcore_threads[sub] - 1) >> 1;
}
-   if (n_subcores > 3 || large_sub < 0)
+   if (large_sub < 0 || !subcore_config_ok(n_subcores + 1, 2))
return false;
 
/*
-- 
2.6.2



[PATCH 1/1] KVM: PPC: Increase memslots to 320

2015-11-04 Thread Thomas Huth
Only using 32 memslots for KVM on powerpc is way too low; you can
nowadays hit this limit quite fast by adding a couple of PCI devices
and/or pluggable memory DIMMs to the guest.
x86 already increased the limit to 512 in total, to satisfy 256
pluggable DIMM slots, 3 private slots and 253 slots for other things
like PCI devices. On powerpc, we only have 32 pluggable DIMMs in
QEMU, not 256, so we likely do not need as many slots as on x86. Thus
setting the slot limit to 320 sounds like a good value for the
time being (until we have some code in the future to resize the
memslot array dynamically).
And while we're at it, also remove the KVM_MEM_SLOTS_NUM definition
from the powerpc-specific header since this gets defined in the
generic kvm_host.h header anyway.

Signed-off-by: Thomas Huth 
---
 arch/powerpc/include/asm/kvm_host.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 887c259..89d387a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -38,8 +38,7 @@
 
 #define KVM_MAX_VCPUS  NR_CPUS
 #define KVM_MAX_VCORES NR_CPUS
-#define KVM_USER_MEM_SLOTS 32
-#define KVM_MEM_SLOTS_NUM KVM_USER_MEM_SLOTS
+#define KVM_USER_MEM_SLOTS 320
 
 #ifdef CONFIG_KVM_MMIO
 #define KVM_COALESCED_MMIO_PAGE_OFFSET 1
-- 
1.8.3.1



[PATCH 0/1] KVM: PPC: Increase memslots

2015-11-04 Thread Thomas Huth
Now that the patch from Nikunj to support the KVM_CAP_NR_MEMSLOTS
capability on powerpc has been merged to kvm/next, we can/should
increase the amount of memslots on ppc, too.
Since nobody else sent a patch yet (as far as I can see), I'd like
to suggest to increase the slot number to 320 now. Why 320? Well,
x86 uses 509 (+3 internal slots), to give enough space for
256 pluggable DIMMs and 253 other slots (for PCI etc.).
On powerpc, QEMU only supports 32 pluggable DIMMs, so I think we
should be fine by using something around 253 + 32 slots + a few
extra ... so 320 sounds like a good value to me for the time
being (but in the long run, we should really make this dynamic
instead, I think).

Thomas Huth (1):
  KVM: PPC: Increase memslots to 320

 arch/powerpc/include/asm/kvm_host.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

-- 
1.8.3.1



Re: [kvm-unit-tests PATCH 00/14] ppc64: initial drop

2015-11-03 Thread Paolo Bonzini


On 03/11/2015 08:08, Thomas Huth wrote:
> On 03/08/15 16:41, Andrew Jones wrote:
>> > This series is the first series of a series of series that will
>> > bring support to kvm-unit-tests for ppc64, and eventually ppc64le.
>  Hi Andrew,
> 
> may I ask about the current state of ppc64 support in the
> kvm-unit-tests? Is there a newer version available than the one you
> posted three months ago?

I've been a slob with all the kvm-unit-tests patches.  Andrew, can you
send a single submission of all the patches, so that I can review them
and apply them?

Paolo


Re: [PATCH] KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails

2015-11-03 Thread David Gibson
On Tue, Oct 27, 2015 at 04:13:56PM +1100, Paul Mackerras wrote:
> When handling a hypervisor data or instruction storage interrupt (HDSI
> or HISI), we look up the SLB entry for the address being accessed in
> order to translate the effective address to a virtual address which can
> be looked up in the guest HPT.  This lookup can occasionally fail due
> to the guest replacing an SLB entry without invalidating the evicted
> SLB entry.  In this situation an ERAT (effective to real address
> translation cache) entry can persist and be used by the hardware even
> though there is no longer a corresponding SLB entry.
> 
> Previously we would just deliver a data or instruction storage interrupt
> (DSI or ISI) to the guest in this case.  However, this is not correct
> and has been observed to cause guests to crash, typically with a
> data storage protection interrupt on a store to the vmemmap area.
> 
> Instead, what we do now is to synthesize a data or instruction segment
> interrupt.  That should cause the guest to reload an appropriate entry
> into the SLB and retry the faulting instruction.  If it still faults,
> we should find an appropriate SLB entry next time and be able to handle
> the fault.
> 
> Signed-off-by: Paul Mackerras 

Reviewed-by: David Gibson 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [kvm-unit-tests PATCH 00/14] ppc64: initial drop

2015-11-03 Thread Andrew Jones
On Tue, Nov 03, 2015 at 10:40:18AM +0100, Paolo Bonzini wrote:
> 
> 
> On 03/11/2015 08:08, Thomas Huth wrote:
> > On 03/08/15 16:41, Andrew Jones wrote:
> >> > This series is the first series of a series of series that will
> >> > bring support to kvm-unit-tests for ppc64, and eventually ppc64le.
> >  Hi Andrew,
> > 
> > may I ask about the current state of ppc64 support in the
> > kvm-unit-tests? Is there a newer version available than the one you
> > posted three months ago?
> 

Hi Thomas,

I haven't gotten around to preparing the v2 yet :-(   I do have it on
my TODO list, and I'm looking forward to working on it. Now that I know
you're looking for it, I'll try to bump it up in priority. Thanks for
the interest!

> I've been a slob with all the kvm-unit-tests patches.  Andrew, can you
> send a single submission of all the patches, so that I can review them
> and apply them?

Hi Paolo,

I've got several patches on my staging branch that I believe are ready.
I plan to send those as a big "pull" series for your review soon.

Thanks,
drew

> 
> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-02 Thread Dimitri John Ledkov
On 2 November 2015 at 14:58, Will Deacon  wrote:
> On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
>> Hi,
>
> Hello Andre,
>
>> this series cleans up kvmtool's kernel loading functionality a bit.
>> It has been broken out of a previous series I sent [1] and contains
>> just the cleanup and bug fix parts, which should be less controversial
>> and thus easier to merge ;-)
>> I will resend the pipe loading part later on as a separate series.
>>
>> The first patch properly abstracts kernel loading to move
>> responsibility into each architecture's code. It removes quite some
>> ugly code from the generic kvm.c file.
>> The later patches address the naive usage of read(2) to, well, read
>> data from files. Doing this without coping with the subtleties of
>> the UNIX read semantics (returning with less or no data read is not
>> an error) can provoke hard-to-debug failures.
>> So these patches make use of the existing and one new wrapper function
>> to make sure we read everything we actually wanted to.
>> The last patch moves the ARM kernel loading code into the proper
>> location to be in line with the other architectures.
>>
>> Please have a look and give some comments!
>
> Looks good to me, but I'd like to see some comments from some mips/ppc/x86
> people on the changes you're making over there.

Looks mostly good to me, as one of the kvmtool downstreams. Over at
https://github.com/clearlinux/kvmtool we have some patches to tweak
the x86 boot flow, which will need rebasing/retweaking; specifically
this commit here -
https://github.com/clearlinux/kvmtool/commit/a8dee709f85735d16739d0eda0cc00d3c1b17477

-- 
Regards,

Dimitri.
53 sleeps till Christmas, or less

https://clearlinux.org
Open Source Technology Center
Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ.


Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-02 Thread Will Deacon
On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
> Hi,

Hello Andre,

> this series cleans up kvmtool's kernel loading functionality a bit.
> It has been broken out of a previous series I sent [1] and contains
> just the cleanup and bug fix parts, which should be less controversial
> and thus easier to merge ;-)
> I will resend the pipe loading part later on as a separate series.
> 
> The first patch properly abstracts kernel loading to move
> responsibility into each architecture's code. It removes quite some
> ugly code from the generic kvm.c file.
> The later patches address the naive usage of read(2) to, well, read
> data from files. Doing this without coping with the subtleties of
> the UNIX read semantics (returning with less or no data read is not
> an error) can provoke hard-to-debug failures.
> So these patches make use of the existing and one new wrapper function
> to make sure we read everything we actually wanted to.
> The last patch moves the ARM kernel loading code into the proper
> location to be in line with the other architectures.
> 
> Please have a look and give some comments!

Looks good to me, but I'd like to see some comments from some mips/ppc/x86
people on the changes you're making over there.

Will
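[Editorial note: the "cope with short reads" idiom the cover letter describes can be sketched in a few lines of C. This is an illustrative sketch only — `read_full()` is an assumed name, not the actual kvmtool helper; the real series uses its own `xread()`/`read_file()` wrappers.]

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative "read it all" loop (read_full() is an assumed name,
 * not the real kvmtool helper): retry on EINTR and keep reading
 * after a short read until `count` bytes arrive or EOF is hit.
 * Returns the number of bytes actually read, or -1 on I/O error.
 */
static ssize_t read_full(int fd, void *buf, size_t count)
{
	size_t done = 0;

	while (done < count) {
		ssize_t r = read(fd, (char *)buf + done, count - done);

		if (r < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;		/* genuine error */
		}
		if (r == 0)
			break;			/* EOF before count bytes */
		done += r;
	}
	return done;
}
```

A naive single `read(2)` call can legally return fewer bytes than requested (pipes, NFS, signals), which is exactly the failure mode the series guards against.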


Re: [kvm-unit-tests PATCH 00/14] ppc64: initial drop

2015-11-02 Thread Thomas Huth
On 03/08/15 16:41, Andrew Jones wrote:
> This series is the first series of a series of series that will
> bring support to kvm-unit-tests for ppc64, and eventually ppc64le.

 Hi Andrew,

may I ask about the current state of ppc64 support in the
kvm-unit-tests? Is there a newer version available than the one you
posted three months ago?

 Thanks,
  Thomas




Re: [PATCH 0/7] kvmtool: Cleanup kernel loading

2015-11-02 Thread Andre Przywara
Hi Dimitri,

On 02/11/15 15:17, Dimitri John Ledkov wrote:
> On 2 November 2015 at 14:58, Will Deacon  wrote:
>> On Fri, Oct 30, 2015 at 06:26:53PM +, Andre Przywara wrote:
>>> Hi,
>>
>> Hello Andre,
>>
>>> this series cleans up kvmtool's kernel loading functionality a bit.
>>> It has been broken out of a previous series I sent [1] and contains
>>> just the cleanup and bug fix parts, which should be less controversial
>>> and thus easier to merge ;-)
>>> I will resend the pipe loading part later on as a separate series.
>>>
>>> The first patch properly abstracts kernel loading to move
>>> responsibility into each architecture's code. It removes quite some
>>> ugly code from the generic kvm.c file.
>>> The later patches address the naive usage of read(2) to, well, read
>>> data from files. Doing this without coping with the subtleties of
>>> the UNIX read semantics (returning with less or no data read is not
>>> an error) can provoke hard-to-debug failures.
>>> So these patches make use of the existing and one new wrapper function
>>> to make sure we read everything we actually wanted to.
>>> The last patch moves the ARM kernel loading code into the proper
>>> location to be in line with the other architectures.
>>>
>>> Please have a look and give some comments!
>>
>> Looks good to me, but I'd like to see some comments from some mips/ppc/x86
>> people on the changes you're making over there.
> 
> Looks mostly good to me, as one of the kvmtool downstreams. Over at
> https://github.com/clearlinux/kvmtool we have some patches to tweak
> the x86 boot flow, which will need rebasing/retweaking; specifically
> this commit here -
> https://github.com/clearlinux/kvmtool/commit/a8dee709f85735d16739d0eda0cc00d3c1b17477

Awesome - I was actually thinking about coding something like this!
In the last week I moved the MIPS ELF loading out of mips/ into /elf.c to
be able to load kvm-unit-tests' tests (which are Multiboot/ELF). As
multiboot requires entering in protected mode, I was thinking about
changing kvmtool to support entering a guest directly in protected mode
- seems like this is mostly what you've done here already.

Looks like we should both post our patches to merge them somehow ;-)

Cheers,
Andre.


Re: [PATCH] KVM: PPC: Book3S HV: Synthesize segment fault if SLB lookup fails

2015-11-02 Thread Thomas Huth
On 27/10/15 06:13, Paul Mackerras wrote:
> When handling a hypervisor data or instruction storage interrupt (HDSI
> or HISI), we look up the SLB entry for the address being accessed in
> order to translate the effective address to a virtual address which can
> be looked up in the guest HPT.  This lookup can occasionally fail due
> to the guest replacing an SLB entry without invalidating the evicted
> SLB entry.  In this situation an ERAT (effective to real address
> translation cache) entry can persist and be used by the hardware even
> though there is no longer a corresponding SLB entry.
> 
> Previously we would just deliver a data or instruction storage interrupt
> (DSI or ISI) to the guest in this case.  However, this is not correct
> and has been observed to cause guests to crash, typically with a
> data storage protection interrupt on a store to the vmemmap area.
> 
> Instead, what we do now is to synthesize a data or instruction segment
> interrupt.  That should cause the guest to reload an appropriate entry
> into the SLB and retry the faulting instruction.  If it still faults,
> we should find an appropriate SLB entry next time and be able to handle
> the fault.
> 
> Signed-off-by: Paul Mackerras 
> ---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 20 
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
> b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b1dab8d..3c6badc 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -1749,7 +1749,8 @@ kvmppc_hdsi:
>   beq 3f
>   clrrdi  r0, r4, 28
>   PPC_SLBFEE_DOT(R5, R0)  /* if so, look up SLB */
> - bne 1f  /* if no SLB entry found */
> + li  r0, BOOK3S_INTERRUPT_DATA_SEGMENT
> + bne 7f  /* if no SLB entry found */
>  4:   std r4, VCPU_FAULT_DAR(r9)
>   stw r6, VCPU_FAULT_DSISR(r9)
>  
> @@ -1768,14 +1769,15 @@ kvmppc_hdsi:
>   cmpdi   r3, -2  /* MMIO emulation; need instr word */
>   beq 2f
>  
> - /* Synthesize a DSI for the guest */
> + /* Synthesize a DSI (or DSegI) for the guest */
>   ld  r4, VCPU_FAULT_DAR(r9)
>   mr  r6, r3
> -1:   mtspr   SPRN_DAR, r4
> +1:   li  r0, BOOK3S_INTERRUPT_DATA_STORAGE
>   mtspr   SPRN_DSISR, r6
> +7:   mtspr   SPRN_DAR, r4

I first thought "hey, you're not setting DSISR for the DSegI now" ... but
then I saw in the PowerISA that DSISR is "undefined" for the DSegI, so
this should be fine, I think.

>   mtspr   SPRN_SRR0, r10
>   mtspr   SPRN_SRR1, r11
> - li  r10, BOOK3S_INTERRUPT_DATA_STORAGE
> + mr  r10, r0
>   bl  kvmppc_msr_interrupt
>  fast_interrupt_c_return:
>  6:   ld  r7, VCPU_CTR(r9)
> @@ -1823,7 +1825,8 @@ kvmppc_hisi:
>   beq 3f
>   clrrdi  r0, r10, 28
>   PPC_SLBFEE_DOT(R5, R0)  /* if so, look up SLB */
> - bne 1f  /* if no SLB entry found */
> + li  r0, BOOK3S_INTERRUPT_INST_SEGMENT
> + bne 7f  /* if no SLB entry found */
>  4:
>   /* Search the hash table. */
>   mr  r3, r9  /* vcpu pointer */
> @@ -1840,11 +1843,12 @@ kvmppc_hisi:
>   cmpdi   r3, -1  /* handle in kernel mode */
>   beq guest_exit_cont
>  
> - /* Synthesize an ISI for the guest */
> + /* Synthesize an ISI (or ISegI) for the guest */
>   mr  r11, r3
> -1:   mtspr   SPRN_SRR0, r10
> +1:   li  r0, BOOK3S_INTERRUPT_INST_STORAGE
> +7:   mtspr   SPRN_SRR0, r10
>   mtspr   SPRN_SRR1, r11
> - li  r10, BOOK3S_INTERRUPT_INST_STORAGE
> + mr  r10, r0
>   bl  kvmppc_msr_interrupt
>   b   fast_interrupt_c_return

As far as I can see (but I'm really not an expert here), the patch seems
to be fine. And we have a test running with it for almost 6 days now and
did not see any further guest crashes, so it seems to me like this is
the right fix for this issue! So if you like, you can add my
"Reviewed-by" or "Tested-by" to this patch.

 Thanks a lot for fixing this!
  Thomas
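[Editorial note: the decision the assembly patch wires in can be modelled in a few lines of C. A sketch for illustration only — the vector offsets follow the Book3S convention (0x300 for DSI, 0x380 for DSegI), matching the BOOK3S_INTERRUPT_* constants in arch/powerpc/include/asm/kvm_asm.h; the function name is hypothetical.]

```c
#include <stdbool.h>

/* Vector offsets per the Book3S convention, as in the kernel's
 * BOOK3S_INTERRUPT_* constants (arch/powerpc/include/asm/kvm_asm.h). */
#define BOOK3S_INTERRUPT_DATA_STORAGE	0x300
#define BOOK3S_INTERRUPT_DATA_SEGMENT	0x380

/*
 * Illustrative model of the branch the patch adds to kvmppc_hdsi:
 * if slbfee. found an SLB entry for the faulting address, deliver a
 * normal DSI; if the lookup missed (a stale ERAT translation was
 * used), deliver a DSegI so the guest reloads the SLB and retries.
 */
static int pick_data_interrupt(bool slb_entry_found)
{
	return slb_entry_found ? BOOK3S_INTERRUPT_DATA_STORAGE
			       : BOOK3S_INTERRUPT_DATA_SEGMENT;
}
```

The instruction-side path (kvmppc_hisi) makes the same choice between 0x400 (ISI) and 0x480 (ISegI).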



Re: [GIT PULL] Please pull my kvm-ppc-next branch

2015-11-02 Thread Paolo Bonzini
On 26/10/2015 05:17, Paul Mackerras wrote:
> Paolo,
> 
> Here is my current patch queue for KVM on PPC.  There's nothing much
> in the way of new features this time; it's mostly bug fixes, plus
> Nikunj has implemented support for KVM_CAP_NR_MEMSLOTS.  These are
> intended for the "next" branch of the KVM tree.  Please pull.
> 
> Thanks,
> Paul.

Pulled, thanks.

Paolo

> The following changes since commit 9ffecb10283508260936b96022d4ee43a7798b4c:
> 
>   Linux 4.3-rc3 (2015-09-27 07:50:08 -0400)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc.git 
> kvm-ppc-next
> 
> for you to fetch changes up to 70aa3961a196ac32baf54032b2051bac9a941118:
> 
>   KVM: PPC: Book3S HV: Handle H_DOORBELL on the guest exit path (2015-10-21 
> 16:31:52 +1100)
> 
> 
> Andrzej Hajda (1):
>   KVM: PPC: e500: fix handling local_sid_lookup result
> 
> Gautham R. Shenoy (1):
>   KVM: PPC: Book3S HV: Handle H_DOORBELL on the guest exit path
> 
> Mahesh Salgaonkar (1):
>   KVM: PPC: Book3S HV: Deliver machine check with MSR(RI=0) to guest as 
> MCE
> 
> Nikunj A Dadhania (1):
>   KVM: PPC: Implement extension to report number of memslots
> 
> Paul Mackerras (2):
>   KVM: PPC: Book3S HV: Don't fall back to smaller HPT size in allocation 
> ioctl
>   KVM: PPC: Book3S HV: Make H_REMOVE return correct HPTE value for absent 
> HPTEs
> 
> Tudor Laurentiu (3):
>   powerpc/e6500: add TMCFG0 register definition
>   KVM: PPC: e500: Emulate TMCFG0 TMRN register
>   KVM: PPC: e500: fix couple of shift operations on 64 bits
> 
>  arch/powerpc/include/asm/disassemble.h  |  5 +
>  arch/powerpc/include/asm/reg_booke.h|  6 ++
>  arch/powerpc/kvm/book3s_64_mmu_hv.c |  3 ++-
>  arch/powerpc/kvm/book3s_hv_rm_mmu.c |  2 ++
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 29 ++---
>  arch/powerpc/kvm/e500.c |  3 ++-
>  arch/powerpc/kvm/e500_emulate.c | 19 +++
>  arch/powerpc/kvm/e500_mmu_host.c|  4 ++--
>  arch/powerpc/kvm/powerpc.c  |  3 +++
>  9 files changed, 63 insertions(+), 11 deletions(-)
> 


[PATCH 6/7] arm/arm64: use read_file() in kernel and initrd loading

2015-10-30 Thread Andre Przywara
Use the new read_file() wrapper in our arm/arm64 kernel image loading
function instead of the private implementation.

Signed-off-by: Andre Przywara 
---
 arm/fdt.c | 40 ++--
 1 file changed, 18 insertions(+), 22 deletions(-)

diff --git a/arm/fdt.c b/arm/fdt.c
index ec7453f..19d7ed9 100644
--- a/arm/fdt.c
+++ b/arm/fdt.c
@@ -224,19 +224,6 @@ static int setup_fdt(struct kvm *kvm)
 }
 late_init(setup_fdt);
 
-static int read_image(int fd, void **pos, void *limit)
-{
-   int count;
-
-   while (((count = xread(fd, *pos, SZ_64K)) > 0) && *pos <= limit)
-   *pos += count;
-
-   if (pos < 0)
-   die_perror("xread");
-
-   return *pos < limit ? 0 : -ENOMEM;
-}
-
 #define FDT_ALIGN  SZ_2M
 #define INITRD_ALIGN   4
 bool kvm__arch_load_kernel_image(struct kvm *kvm, int fd_kernel, int fd_initrd,
@@ -244,6 +231,7 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int 
fd_kernel, int fd_initrd,
 {
void *pos, *kernel_end, *limit;
unsigned long guest_addr;
+   ssize_t file_size;
 
if (lseek(fd_kernel, 0, SEEK_SET) < 0)
die_perror("lseek");
@@ -256,13 +244,16 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int 
fd_kernel, int fd_initrd,
 
pos = kvm->ram_start + ARM_KERN_OFFSET(kvm);
kvm->arch.kern_guest_start = host_to_guest_flat(kvm, pos);
-   if (read_image(fd_kernel, &pos, limit) == -ENOMEM)
-   die("kernel image too big to contain in guest memory.");
+   file_size = read_file(fd_kernel, pos, limit - pos);
+   if (file_size < 0) {
+   if (errno == ENOMEM)
+   die("kernel image too big to contain in guest memory.");
 
-   kernel_end = pos;
-   pr_info("Loaded kernel to 0x%llx (%llu bytes)",
-   kvm->arch.kern_guest_start,
-   host_to_guest_flat(kvm, pos) - kvm->arch.kern_guest_start);
+   die_perror("kernel read");
+   }
+   kernel_end = pos + file_size;
+   pr_info("Loaded kernel to 0x%llx (%zd bytes)",
+   kvm->arch.kern_guest_start, file_size);
 
/*
 * Now load backwards from the end of memory so the kernel
@@ -300,11 +291,16 @@ bool kvm__arch_load_kernel_image(struct kvm *kvm, int 
fd_kernel, int fd_initrd,
die("initrd overlaps with kernel image.");
 
initrd_start = guest_addr;
-   if (read_image(fd_initrd, &pos, limit) == -ENOMEM)
-   die("initrd too big to contain in guest memory.");
+   file_size = read_file(fd_initrd, pos, limit - pos);
+   if (file_size == -1) {
+   if (errno == ENOMEM)
+   die("initrd too big to contain in guest 
memory.");
+
+   die_perror("initrd read");
+   }
 
kvm->arch.initrd_guest_start = initrd_start;
-   kvm->arch.initrd_size = host_to_guest_flat(kvm, pos) - 
initrd_start;
+   kvm->arch.initrd_size = file_size;
pr_info("Loaded initrd to 0x%llx (%llu bytes)",
kvm->arch.initrd_guest_start,
kvm->arch.initrd_size);
-- 
2.5.1
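[Editorial note: the patch relies on `read_file()` failing with errno set to ENOMEM when the image does not fit between `pos` and `limit`. The following is a minimal sketch of such a contract — an assumption about the helper's semantics for illustration, not the actual kvmtool util/ code.]

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative sketch (not the actual kvmtool helper) of the
 * read_file() contract the patch depends on: read until EOF into a
 * buffer of at most max_size bytes.  Returns the byte count on
 * success, or -1 with errno set; errno == ENOMEM means the file
 * did not fit in the buffer.
 */
static ssize_t read_file(int fd, void *buf, size_t max_size)
{
	size_t done = 0;

	for (;;) {
		ssize_t r;

		if (done == max_size) {
			char c;

			/* Buffer full: probe whether more data follows. */
			r = read(fd, &c, 1);
			if (r > 0) {
				errno = ENOMEM;	/* file too big */
				return -1;
			}
			if (r == 0)
				return done;	/* exact fit */
			if (errno == EINTR)
				continue;
			return -1;
		}
		r = read(fd, (char *)buf + done, max_size - done);
		if (r < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (r == 0)
			return done;		/* EOF */
		done += r;
	}
}
```

With this contract, the caller can distinguish "image too big" (ENOMEM) from genuine I/O errors, which is exactly what the patched kvm__arch_load_kernel_image() does.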


