Re: [GIT PULL 3/3] KVM: s390: use simple switch statement as multiplexer

2015-10-29 Thread Alexander Graf

> On 29.10.2015 at 16:08, Christian Borntraeger wrote:
> 
> We currently do some magic shifting (by exploiting that exit codes
> are always a multiple of 4) and a table lookup to jump into the
> exit handlers. This causes some calculations and checks, just to
> do a potentially expensive function call.
> 
> Changing that to a switch statement gives the compiler the chance
> to inline and dynamically decide between jump tables or inline
> compare and branches. In addition it makes the code more readable.
> 
> bloat-o-meter gives me a small reduction in code size:
> 
> add/remove: 0/7 grow/shrink: 1/1 up/down: 986/-1334 (-348)
> function                      old     new   delta
> kvm_handle_sie_intercept       72    1058    +986
> handle_prog                   704     696      -8
> handle_noop                    54       -     -54
> handle_partial_execution       60       -     -60
> intercept_funcs               120       -    -120
> handle_instruction            198       -    -198
> handle_validity               210       -    -210
> handle_stop                   316       -    -316
> handle_external_interrupt     368       -    -368
> 
> Right now my gcc does conditional branches instead of jump tables.
> The inlining seems to gain us a few cycles, as some micro-benchmarking
> shows minimal improvements, though still within the noise.

Awesome. I ended up with the same conclusions on switch vs table lookups in the 
ppc code back in the day.

> 
> Signed-off-by: Christian Borntraeger 
> Reviewed-by: Cornelia Huck 
> ---
> arch/s390/kvm/intercept.c | 42 +-
> 1 file changed, 21 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
> index 7365e8a..b4a5aa1 100644
> --- a/arch/s390/kvm/intercept.c
> +++ b/arch/s390/kvm/intercept.c
> @@ -336,28 +336,28 @@ static int handle_partial_execution(struct kvm_vcpu *vcpu)
>return -EOPNOTSUPP;
> }
> 
> -static const intercept_handler_t intercept_funcs[] = {
> -[0x00 >> 2] = handle_noop,
> -[0x04 >> 2] = handle_instruction,
> -[0x08 >> 2] = handle_prog,
> -[0x10 >> 2] = handle_noop,
> -[0x14 >> 2] = handle_external_interrupt,
> -[0x18 >> 2] = handle_noop,
> -[0x1C >> 2] = kvm_s390_handle_wait,
> -[0x20 >> 2] = handle_validity,
> -[0x28 >> 2] = handle_stop,
> -[0x38 >> 2] = handle_partial_execution,
> -};
> -
> int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu)
> {
> -intercept_handler_t func;
> -u8 code = vcpu->arch.sie_block->icptcode;
> -
> -if (code & 3 || (code >> 2) >= ARRAY_SIZE(intercept_funcs))
> +switch (vcpu->arch.sie_block->icptcode) {
> +case 0x00:
> +case 0x10:
> +case 0x18:

... if you could convert these magic numbers to something more telling, however, 
I think readability would improve even more! That can easily be a follow-up 
patch though.
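
For readers following along, the shape of the change as a stand-alone
sketch (handlers, codes and table size are invented here, not the real
arch/s390/kvm/intercept.c code):

  #include <stdio.h>

  typedef int (*handler_t)(void);

  static int handle_a(void) { return 1; }
  static int handle_b(void) { return 2; }

  /* Old style: exploit that codes are multiples of 4 and index a table. */
  static const handler_t table[] = {
      [0x00 >> 2] = handle_a,
      [0x08 >> 2] = handle_b,
  };

  static int dispatch_table(unsigned char code)
  {
      handler_t func;

      if (code & 3 || (code >> 2) >= sizeof(table) / sizeof(table[0]))
          return -1;
      func = table[code >> 2];
      return func ? func() : -1;
  }

  /* New style: the compiler decides between a jump table and inline
   * compare-and-branch, and can inline the handlers. */
  static int dispatch_switch(unsigned char code)
  {
      switch (code) {
      case 0x00:
          return handle_a();
      case 0x08:
          return handle_b();
      default:
          return -1;
      }
  }

  int main(void)
  {
      printf("%d %d\n", dispatch_table(0x08), dispatch_switch(0x08));
      return 0;
  }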


Alex


Re: [PATCH] powerpc/e500: move qemu machine spec together with the rest

2015-09-14 Thread Alexander Graf


> On 14.09.2015 at 15:17, Laurentiu Tudor wrote:
> 
>> On 09/10/2015 02:01 AM, Scott Wood wrote:
>>> On Fri, 2015-09-04 at 15:46 +0300, Laurentiu Tudor wrote:
>>> This way we get rid of an entire file with mostly
>>> duplicated code, plus a Kconfig option that you always
>>> had to remember to check in order for kvm to work.
>>> 
>>> Signed-off-by: Laurentiu Tudor 
>>> ---
>>> arch/powerpc/platforms/85xx/Kconfig   | 15 -
>>> arch/powerpc/platforms/85xx/Makefile  |  1 -
>>> arch/powerpc/platforms/85xx/corenet_generic.c |  1 +
>>> arch/powerpc/platforms/85xx/qemu_e500.c   | 85 
>> 
>> 
>> qemu_e500 is not only for corenet chips.  
> 
> That's too bad. :-(
> I remember discussions on dropping the e500v2 support at some point in time?
> 
>> We can add it to the defconfig (in fact I've been meaning to do so).
> 
> Or maybe just drop the Kconfig option and
> wrap the file in an #ifdef CONFIG_KVM or something along these lines?

CONFIG_KVM is for host support though. This is for the guest kernel.

Alex


Re: [PATCH 3/3] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD

2015-09-12 Thread Alexander Graf


> On 12.09.2015 at 18:47, Nathan Whitehorn wrote:
> 
>> On 09/06/15 16:52, Paul Mackerras wrote:
>>> On Sun, Sep 06, 2015 at 12:47:12PM -0700, Nathan Whitehorn wrote:
>>> Anything I can do to help move these along? It's a big performance
>>> improvement for FreeBSD guests.
>> These patches are in Paolo's kvm-ppc-next branch and should go into
>> Linus' tree in the next couple of days.
>> 
>> Paul.
> 
> One additional question. What is your preferred way to enable these? Since 
> these are part of the mandatory part of the PAPR spec, I think there's an 
> argument to add them to the default_hcall_list? Otherwise, they should be 
> enabled by default in QEMU (I can take care of sending that patch if you 
> prefer this route).

The default hcall list just describes which hcalls were implicitly enabled at 
the point in time we made them enableable by user space. IMHO no new hcalls 
should get added there.

So yes, please send a patch to qemu :).


Alex


Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Alexander Graf


On 04.09.15 11:59, Christian Borntraeger wrote:
> On 04.09.2015 at 11:35, Thomas Huth wrote:
>>
>>  Hi all,
>>
>> now that we get memory hotplugging for the spapr machine on qemu-ppc,
>> too, it seems like we can easily hit the limit of KVM-internal memory
>> slots now ("#define KVM_USER_MEM_SLOTS 32" in
>> arch/powerpc/include/asm/kvm_host.h). For example, start
>> qemu-system-ppc64 with a couple of "-device secondary-vga" and "-m
>> 4G,slots=32,maxmem=40G" and then try to hot-plug all 32 DIMMs ... and
>> you'll see that it aborts way earlier already.
>>
>> The x86 code already increased the amount of KVM_USER_MEM_SLOTS to 509
>> (+3 internal slots = 512) ... maybe we should now increase the
>> amount of slots on powerpc, too? Since we don't use internal slots on
>> POWER, would 512 be a good value? Or would less be sufficient, too?
> 
> When you are at it, the s390 value should also be increased I guess.

That constant defines the size of the memslot array embedded in struct
kvm, which in turn gets allocated by kzalloc, so it's pinned kernel
memory that is physically contiguous. Big allocations like that can
turn into a problem at runtime.

So maybe there is another way? Can we extend the memslot array size
dynamically somehow? Allocate it separately? How much memory does the
memslot array use up with 512 entries?
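
A back-of-the-envelope answer to that last question, as a sketch (the
struct below is a stand-in with an assumed layout, not the real struct
kvm_memory_slot, which differs per arch and kernel version):

  #include <stdio.h>

  struct memslot {                    /* hypothetical stand-in */
      unsigned long base_gfn;
      unsigned long npages;
      unsigned long *dirty_bitmap;
      void *arch_private;
      unsigned int id;
      unsigned int flags;
  };

  int main(void)
  {
      /* On LP64 this is roughly 40 bytes per slot. */
      printf("32 slots:  %zu bytes\n",  32 * sizeof(struct memslot));
      printf("512 slots: %zu bytes\n", 512 * sizeof(struct memslot));
      return 0;
  }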


Alex


Re: [PATCH] KVM: ppc: Fix size of the PSPB register

2015-09-02 Thread Alexander Graf


> On 02.09.2015 at 09:26, Thomas Huth wrote:
> 
>> On 02/09/15 00:55, Benjamin Herrenschmidt wrote:
>>> On Wed, 2015-09-02 at 08:45 +1000, Paul Mackerras wrote:
>>> On Wed, Sep 02, 2015 at 08:25:05AM +1000, Benjamin Herrenschmidt
>>> wrote:
>>>> On Tue, 2015-09-01 at 23:41 +0200, Thomas Huth wrote:
>>>>> The size of the Problem State Priority Boost Register is only
>>>>> 32 bits, so let's change the type of the corresponding variable
>>>>> accordingly to avoid future trouble.
>>>> 
>>>> It's not future trouble, it's broken today for LE and this should
>>>> fix
>>>> it BUT 
>>> 
>>> No, it's broken today for BE hosts, which will always see 0 for the
>>> PSPB register value.  LE hosts are fine.
> 
> Right ... I just meant that nobody really experienced trouble with this
> today yet, but the bug is already present now, of course.

Sounds like a great candidate for kvm-unit-tests then, no? ;)
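
A minimal sketch of the failure mode under discussion, assuming the
register value lives in a 64-bit field but is read back through a
32-bit view (an illustration, not the actual KVM accessor code):

  #include <stdio.h>
  #include <stdint.h>
  #include <string.h>

  int main(void)
  {
      uint64_t pspb = 0x1234;   /* 32-bit register stored in a u64 */
      uint32_t seen;

      /* Copy the first 4 bytes of the u64: on a little-endian host
       * this is the low half (0x1234), on a big-endian host it is
       * the zeroed high half, i.e. BE reads back 0. */
      memcpy(&seen, &pspb, sizeof(seen));
      printf("seen: 0x%x\n", seen);
      return 0;
  }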


Alex


Re: Build regressions/improvements in v4.2-rc8

2015-08-26 Thread Alexander Graf


On 24.08.15 10:36, Geert Uytterhoeven wrote:
> On Mon, Aug 24, 2015 at 10:34 AM, Geert Uytterhoeven
> ge...@linux-m68k.org wrote:
>> JFYI, when comparing v4.2-rc8[1] to v4.2-rc7[3], the summaries are:
>>   - build errors: +4/-7
>
> 4 regressions:
>   + /home/kisskb/slave/src/include/linux/kvm_host.h: error: array
> subscript is above array bounds [-Werror=array-bounds]:  => 430:19
> (arch/powerpc/kvm/book3s_64_mmu.c: In function 'kvmppc_mmu
> _book3s_64_tlbie':)
>
> powerpc-randconfig (seen before in a v3.15-rc1 build?)

I'm not quite sure what's going wrong here. The code in question does

  kvm_for_each_vcpu(i, v, vcpu->kvm)
    kvmppc_mmu_pte_vflush(v, va >> 12, mask);

and IIUC the thing we're potentially running over on would be
kvm->vcpus[i]. But that one is bound by the kvm_for_each_vcpu loop, no?
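
For reference, a simplified userspace model of the loop shape (the real
kernel macro additionally reads online_vcpus atomically; everything
here is a sketch):

  #include <stdio.h>

  #define MAX_VCPUS 4

  struct vcpu { int id; };

  struct kvm {
      int online_vcpus;
      struct vcpu *vcpus[MAX_VCPUS];
  };

  /* i is bounded by online_vcpus, so vcpus[i] stays in range as long
   * as online_vcpus never exceeds the array size. */
  #define for_each_vcpu(i, v, kvm) \
      for ((i) = 0; (i) < (kvm)->online_vcpus && \
                    ((v) = (kvm)->vcpus[(i)]) != NULL; (i)++)

  int main(void)
  {
      struct vcpu a = { 0 }, b = { 1 };
      struct kvm kvm = { 2, { &a, &b } };
      struct vcpu *v;
      int i;

      for_each_vcpu(i, v, &kvm)
          printf("vcpu %d\n", v->id);
      return 0;
  }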


Alex


Re: [PATCH] vfio: Enable VFIO device for powerpc

2015-08-26 Thread Alexander Graf


On 13.08.15 03:15, David Gibson wrote:
> ec53500f "kvm: Add VFIO device" added a special KVM pseudo-device which is
> used to handle any necessary interactions between KVM and VFIO.
> 
> Currently that device is built on x86 and ARM, but not powerpc, although
> powerpc does support both KVM and VFIO.  This makes things awkward in
> userspace.
> 
> Currently qemu prints an alarming error message if you attempt to use VFIO
> and it can't initialize the KVM VFIO device.  We don't want to remove the
> warning, because lack of the KVM VFIO device could mean coherency problems
> on x86.  On powerpc, however, the error is harmless but looks disturbing,
> and a test based on host architecture in qemu would be ugly, and break if
> we do need the KVM VFIO device for something important in future.
> 
> There's nothing preventing the KVM VFIO device from being built for
> powerpc, so this patch turns it on.  It won't actually do anything, since
> we don't define any of the arch_*() hooks, but it will make qemu happy and
> we can extend it in future if we need to.
> 
> Signed-off-by: David Gibson da...@gibson.dropbear.id.au
> Reviewed-by: Eric Auger eric.au...@linaro.org

Paul is going to take care of the kvm-ppc tree for 4.3. Also, ppc kvm
patches should get CC on the kvm-ppc@vger mailing list ;).

Paul, could you please pick this one up?


Thanks!

Alex


Re: [PULL 00/12] ppc patch queue 2015-08-22

2015-08-23 Thread Alexander Graf


On 22.08.15 15:32, Paolo Bonzini wrote:
> 
> 
> On 22/08/2015 02:21, Alexander Graf wrote:
>> Hi Paolo,
>>
>> This is my current patch queue for ppc.  Please pull.
> 
> Done, but this queue has not been in linux-next.  Please push to
> kvm-ppc-next on your github Linux tree as well; please keep an eye on

Ah, sorry. I pushed to kvm-ppc-next in parallel to sending the request.

> Stephen Rothwell's messages in the next few days, and I'll send the pull
> request sometime next week via webmail if everything goes fine.

Nothing exciting came in so far, so I hope we're good :).


Alex


[PULL 07/12] KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE

2015-08-22 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

The reference (R) and change (C) bits in a HPT entry can be set by
hardware at any time up until the HPTE is invalidated and the TLB
invalidation sequence has completed.  This means that when removing
a HPTE, we need to read the HPTE after the invalidation sequence has
completed in order to obtain reliable values of R and C.  The code
in kvmppc_do_h_remove() used to do this.  However, commit 6f22bd3265fb
("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the
read after invalidation as a side effect of other changes.  This
restores the read of the HPTE after invalidation.

The user-visible effect of this bug would be that when migrating a
guest, there is a small probability that a page modified by the guest
and then unmapped by the guest might not get re-transmitted and thus
the destination might end up with a stale copy of the page.

Fixes: 6f22bd3265fb
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index b027a89..c6d601c 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -421,14 +421,20 @@ long kvmppc_do_h_remove(struct kvm *kvm, unsigned long flags,
rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
v = pte & ~HPTE_V_HVLOCK;
if (v & HPTE_V_VALID) {
-   u64 pte1;
-
-   pte1 = be64_to_cpu(hpte[1]);
hpte[0] &= ~cpu_to_be64(HPTE_V_VALID);
-   rb = compute_tlbie_rb(v, pte1, pte_index);
+   rb = compute_tlbie_rb(v, be64_to_cpu(hpte[1]), pte_index);
do_tlbies(kvm, &rb, 1, global_invalidates(kvm, flags), true);
-   /* Read PTE low word after tlbie to get final R/C values */
-   remove_revmap_chain(kvm, pte_index, rev, v, pte1);
+   /*
+* The reference (R) and change (C) bits in a HPT
+* entry can be set by hardware at any time up until
+* the HPTE is invalidated and the TLB invalidation
+* sequence has completed.  This means that when
+* removing a HPTE, we need to re-read the HPTE after
+* the invalidation sequence has completed in order to
+* obtain reliable values of R and C.
+*/
+   remove_revmap_chain(kvm, pte_index, rev, v,
+   be64_to_cpu(hpte[1]));
}
r = rev->guest_rpte & ~HPTE_GR_RESERVED;
note_hpte_modification(kvm, rev);
-- 
1.8.1.4


[PULL 00/12] ppc patch queue 2015-08-22

2015-08-22 Thread Alexander Graf
Hi Paolo,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit 4d283ec908e617fa28bcb06bce310206f0655d67:

  x86/kvm: Rename VMX's segment access rights defines (2015-08-15 00:47:13 +0200)

are available in the git repository at:

  git://github.com/agraf/linux-2.6.git tags/signed-kvm-ppc-next

for you to fetch changes up to c63517c2e3810071359af926f621c1f784388c3f:

  KVM: PPC: Book3S: correct width in XER handling (2015-08-22 11:16:19 +0200)


Patch queue for ppc - 2015-08-22

Highlights for KVM PPC this time around:

  - Book3S: A few bug fixes
  - Book3S: Allow micro-threading on POWER8


Paul Mackerras (7):
  KVM: PPC: Book3S HV: Make use of unused threads when running guests
  KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8
  KVM: PPC: Book3S HV: Fix race in reading change bit when removing HPTE
  KVM: PPC: Book3S HV: Fix bug in dirty page tracking
  KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD
  KVM: PPC: Book3S HV: Fix preempted vcore list locking
  KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation

Sam bobroff (1):
  KVM: PPC: Book3S: correct width in XER handling

Thomas Huth (2):
  KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig
  KVM: PPC: Fix warnings from sparse

Tudor Laurentiu (2):
  KVM: PPC: fix suspicious use of conditional operator
  KVM: PPC: add missing pt_regs initialization

 arch/powerpc/include/asm/kvm_book3s.h |   5 +-
 arch/powerpc/include/asm/kvm_book3s_asm.h |  22 +-
 arch/powerpc/include/asm/kvm_booke.h  |   4 +-
 arch/powerpc/include/asm/kvm_host.h   |  24 +-
 arch/powerpc/include/asm/ppc-opcode.h |   2 +-
 arch/powerpc/kernel/asm-offsets.c |   9 +
 arch/powerpc/kvm/Kconfig  |   8 +-
 arch/powerpc/kvm/book3s.c |   3 +-
 arch/powerpc/kvm/book3s_32_mmu_host.c |   1 +
 arch/powerpc/kvm/book3s_64_mmu_host.c |   1 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |   8 +-
 arch/powerpc/kvm/book3s_emulate.c |   1 +
 arch/powerpc/kvm/book3s_hv.c  | 660 ++
 arch/powerpc/kvm/book3s_hv_builtin.c  |  32 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   | 161 +++-
 arch/powerpc/kvm/book3s_hv_rm_xics.c  |   4 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 128 +-
 arch/powerpc/kvm/book3s_paired_singles.c  |   2 +-
 arch/powerpc/kvm/book3s_segment.S |   4 +-
 arch/powerpc/kvm/booke.c  |   1 +
 arch/powerpc/kvm/e500_mmu.c   |   2 +-
 arch/powerpc/kvm/powerpc.c|   2 +-
 22 files changed, 938 insertions(+), 146 deletions(-)


[PULL 04/12] KVM: PPC: add missing pt_regs initialization

2015-08-22 Thread Alexander Graf
From: Tudor Laurentiu b10...@freescale.com

On this switch branch the regs initialization
doesn't happen, so add it.
This was found with the help of a static
code analysis tool.

Signed-off-by: Laurentiu Tudor laurentiu.tu...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index cc58426..ae458f0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -933,6 +933,7 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
 #endif
break;
case BOOKE_INTERRUPT_CRITICAL:
+   kvmppc_fill_pt_regs(&regs);
unknown_exception(&regs);
break;
case BOOKE_INTERRUPT_DEBUG:
-- 
1.8.1.4


[PULL 03/12] KVM: PPC: Fix warnings from sparse

2015-08-22 Thread Alexander Graf
From: Thomas Huth th...@redhat.com

When compiling the KVM code for POWER with "make C=1", sparse
complains about functions missing proper prototypes and a 64-bit
constant missing the ULL suffix. Let's fix this by making the
functions static or by including the proper header with the
prototypes, and by appending a ULL suffix to the constant
PPC_MPPE_ADDRESS_MASK.
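
The constant fix in miniature (a sketch; only the suffix matters here):

  #include <stdio.h>

  /* Without a suffix the literal's type silently widens (int -> long ->
   * long long depending on the platform), which is what sparse flags;
   * the ULL suffix pins the 64-bit intent down explicitly. */
  #define MASK_ULL 0xffffffffc000ULL

  int main(void)
  {
      printf("0x%llx\n", (unsigned long long)MASK_ULL);
      return 0;
  }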

Signed-off-by: Thomas Huth th...@redhat.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/ppc-opcode.h    | 2 +-
 arch/powerpc/kvm/book3s.c                | 3 ++-
 arch/powerpc/kvm/book3s_32_mmu_host.c    | 1 +
 arch/powerpc/kvm/book3s_64_mmu_host.c    | 1 +
 arch/powerpc/kvm/book3s_emulate.c        | 1 +
 arch/powerpc/kvm/book3s_hv.c             | 8 
 arch/powerpc/kvm/book3s_paired_singles.c | 2 +-
 arch/powerpc/kvm/powerpc.c               | 2 +-
 8 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 8452335..790f5d1 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -287,7 +287,7 @@
 
 /* POWER8 Micro Partition Prefetch (MPP) parameters */
 /* Address mask is common for LOGMPP instruction and MPPR SPR */
-#define PPC_MPPE_ADDRESS_MASK 0xffffffffc000
+#define PPC_MPPE_ADDRESS_MASK 0xffffffffc000ULL
 
 /* Bits 60 and 61 of MPP SPR should be set to one of the following */
 /* Aborting the fetch is indeed setting 00 in the table size bits */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 05ea8fc..53285d5 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -240,7 +240,8 @@ void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu, ulong flags)
kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE);
 }
 
-int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
+static int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu,
+unsigned int priority)
 {
int deliver = 1;
int vec = 0;
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 2035d16..d5c9bfe 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -26,6 +26,7 @@
 #include <asm/machdep.h>
 #include <asm/mmu_context.h>
 #include <asm/hw_irq.h>
+#include "book3s.h"
 
 /* #define DEBUG_MMU */
 /* #define DEBUG_SR */
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index b982d92..79ad35a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -28,6 +28,7 @@
 #include <asm/mmu_context.h>
 #include <asm/hw_irq.h>
 #include "trace_pr.h"
+#include "book3s.h"
 
 #define PTE_SIZE 12
 
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 5a2bc4b..2afdb9c 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -23,6 +23,7 @@
 #include <asm/reg.h>
 #include <asm/switch_to.h>
 #include <asm/time.h>
+#include "book3s.h"
 
 #define OP_19_XOP_RFID 18
 #define OP_19_XOP_RFI  50
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 68d067a..6e588ac 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -214,12 +214,12 @@ static void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr)
kvmppc_end_cede(vcpu);
 }
 
-void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 pvr)
+static void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 pvr)
 {
vcpu->arch.pvr = pvr;
 }
 
-int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
+static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
 {
unsigned long pcr = 0;
struct kvmppc_vcore *vc = vcpu->arch.vcore;
@@ -259,7 +259,7 @@ int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
return 0;
 }
 
-void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
+static void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
 {
int r;
 
@@ -292,7 +292,7 @@ void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
   vcpu->arch.last_inst);
 }
 
-struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
+static struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
 {
int r;
struct kvm_vcpu *v, *ret = NULL;
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
index bd6ab16..a759d9a 100644
--- a/arch/powerpc/kvm/book3s_paired_singles.c
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -352,7 +352,7 @@ static inline u32 inst_get_field(u32 inst, int msb, int lsb)
return kvmppc_get_field(inst, msb + 32, lsb + 32);
 }
 
-bool kvmppc_inst_is_paired_single(struct kvm_vcpu *vcpu, u32 inst)
+static bool kvmppc_inst_is_paired_single(struct kvm_vcpu *vcpu, u32 inst)
 {
if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE))
return false;
diff --git a/arch

[PULL 01/12] KVM: PPC: fix suspicious use of conditional operator

2015-08-22 Thread Alexander Graf
From: Tudor Laurentiu b10...@freescale.com

This was signaled by a static code analysis tool.

Signed-off-by: Laurentiu Tudor laurentiu.tu...@freescale.com
Reviewed-by: Scott Wood scottw...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/e500_mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/e500_mmu.c b/arch/powerpc/kvm/e500_mmu.c
index 50860e9..29911a0 100644
--- a/arch/powerpc/kvm/e500_mmu.c
+++ b/arch/powerpc/kvm/e500_mmu.c
@@ -377,7 +377,7 @@ int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, gva_t ea)
	| MAS0_NV(vcpu_e500->gtlb_nv[tlbsel]);
vcpu->arch.shared->mas1 =
	  (vcpu->arch.shared->mas6 & MAS6_SPID0)
-	| (vcpu->arch.shared->mas6 & (MAS6_SAS ? MAS1_TS : 0))
+	| ((vcpu->arch.shared->mas6 & MAS6_SAS) ? MAS1_TS : 0)
	| (vcpu->arch.shared->mas4 & MAS4_TSIZED(~0));
vcpu->arch.shared->mas2 &= MAS2_EPN;
vcpu->arch.shared->mas2 |= vcpu->arch.shared->mas4 &
-- 
1.8.1.4
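
The precedence slip is easier to see out of context; a tiny sketch with
illustrative bit values (not the real register definitions):

  #include <stdio.h>

  #define MAS6_SAS 0x1
  #define MAS1_TS  0x1000

  int main(void)
  {
      unsigned int mas6 = 0x1000;   /* SAS bit clear, 0x1000 bit set */

      /* Buggy: MAS6_SAS is non-zero, so the ?: always yields MAS1_TS
       * and the expression degenerates to (mas6 & MAS1_TS), which
       * tests the wrong bit. */
      unsigned int buggy = mas6 & (MAS6_SAS ? MAS1_TS : 0);

      /* Fixed: test the SAS bit first, then select MAS1_TS. */
      unsigned int fixed = (mas6 & MAS6_SAS) ? MAS1_TS : 0;

      printf("buggy=0x%x fixed=0x%x\n", buggy, fixed); /* 0x1000 vs 0x0 */
      return 0;
  }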


[PULL 05/12] KVM: PPC: Book3S HV: Make use of unused threads when running guests

2015-08-22 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

When running a virtual core of a guest that is configured with fewer
threads per core than the physical cores have, the extra physical
threads are currently unused.  This makes it possible to use them to
run one or more other virtual cores from the same guest when certain
conditions are met.  This applies on POWER7, and on POWER8 to guests
with one thread per virtual core.  (It doesn't apply to POWER8 guests
with multiple threads per vcore because they require a 1-1 virtual to
physical thread mapping in order to be able to use msgsndp and the
TIR.)

The idea is that we maintain a list of preempted vcores for each
physical cpu (i.e. each core, since the host runs single-threaded).
Then, when a vcore is about to run, it checks to see if there are
any vcores on the list for its physical cpu that could be
piggybacked onto this vcore's execution.  If so, those additional
vcores are put into state VCORE_PIGGYBACK and their runnable VCPU
threads are started as well as the original vcore, which is called
the master vcore.

After the vcores have exited the guest, the extra ones are put back
onto the preempted list if any of their VCPUs are still runnable and
not idle.

This means that vcpu->arch.ptid is no longer necessarily the same as
the physical thread that the vcpu runs on.  In order to make it easier
for code that wants to send an IPI to know which CPU to target, we
now store that in a new field in struct vcpu_arch, called thread_cpu.

Reviewed-by: David Gibson da...@gibson.dropbear.id.au
Tested-by: Laurent Vivier lviv...@redhat.com
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h     |  19 +-
 arch/powerpc/kernel/asm-offsets.c       |   2 +
 arch/powerpc/kvm/book3s_hv.c            | 333 ++--
 arch/powerpc/kvm/book3s_hv_builtin.c    |   7 +-
 arch/powerpc/kvm/book3s_hv_rm_xics.c    |   4 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   5 +
 6 files changed, 298 insertions(+), 72 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index d91f65b..2b74490 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -278,7 +278,9 @@ struct kvmppc_vcore {
u16 last_cpu;
u8 vcore_state;
u8 in_guest;
+   struct kvmppc_vcore *master_vcore;
struct list_head runnable_threads;
+   struct list_head preempt_list;
spinlock_t lock;
wait_queue_head_t wq;
spinlock_t stoltb_lock; /* protects stolen_tb and preempt_tb */
@@ -300,12 +302,18 @@ struct kvmppc_vcore {
 #define VCORE_EXIT_MAP(vc) ((vc)->entry_exit_map >> 8)
 #define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
-/* Values for vcore_state */
+/*
+ * Values for vcore_state.
+ * Note that these are arranged such that lower values
+ * (< VCORE_SLEEPING) don't require stolen time accounting
+ * on load/unload, and higher values do.
+ */
 #define VCORE_INACTIVE 0
-#define VCORE_SLEEPING 1
-#define VCORE_PREEMPT  2
-#define VCORE_RUNNING  3
-#define VCORE_EXITING  4
+#define VCORE_PREEMPT  1
+#define VCORE_PIGGYBACK        2
+#define VCORE_SLEEPING 3
+#define VCORE_RUNNING  4
+#define VCORE_EXITING  5
 
 /*
  * Struct used to manage memory for a virtual processor area
@@ -619,6 +627,7 @@ struct kvm_vcpu_arch {
int trap;
int state;
int ptid;
+   int thread_cpu;
bool timer_running;
wait_queue_head_t cpu_run;
 
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9823057..a78cdbf 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -512,6 +512,8 @@ int main(void)
DEFINE(VCPU_VPA, offsetof(struct kvm_vcpu, arch.vpa.pinned_addr));
DEFINE(VCPU_VPA_DIRTY, offsetof(struct kvm_vcpu, arch.vpa.dirty));
DEFINE(VCPU_HEIR, offsetof(struct kvm_vcpu, arch.emul_inst));
+   DEFINE(VCPU_CPU, offsetof(struct kvm_vcpu, cpu));
+   DEFINE(VCPU_THREAD_CPU, offsetof(struct kvm_vcpu, arch.thread_cpu));
 #endif
 #ifdef CONFIG_PPC_BOOK3S
DEFINE(VCPU_VCPUID, offsetof(struct kvm_vcpu, vcpu_id));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6e588ac..0173ce2 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -81,6 +81,9 @@ static DECLARE_BITMAP(default_enabled_hcalls, MAX_HCALL_OPCODE/4 + 1);
 #define MPP_BUFFER_ORDER   3
 #endif
 
+static int target_smt_mode;
+module_param(target_smt_mode, int, S_IRUGO | S_IWUSR);
+MODULE_PARM_DESC(target_smt_mode, "Target threads per core (0 = max)");
 
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
@@ -114,7 +117,7 @@ static bool kvmppc_ipi_thread(int cpu)
 
 static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
 {
-   int cpu = vcpu->cpu;
+   int cpu
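
The point of the reordered VCORE_* values in this patch is that the
stolen-time decision becomes a single comparison; a sketch of that
check (illustrative, not the kernel code):

  #include <stdio.h>

  /* Values as reordered by the patch: states below VCORE_SLEEPING
   * skip stolen-time accounting on load/unload. */
  enum vcore_state {
      VCORE_INACTIVE,
      VCORE_PREEMPT,
      VCORE_PIGGYBACK,
      VCORE_SLEEPING,
      VCORE_RUNNING,
      VCORE_EXITING,
  };

  static int needs_stolen_time_accounting(enum vcore_state s)
  {
      return s >= VCORE_SLEEPING;
  }

  int main(void)
  {
      printf("PIGGYBACK: %d, RUNNING: %d\n",
             needs_stolen_time_accounting(VCORE_PIGGYBACK),
             needs_stolen_time_accounting(VCORE_RUNNING));
      return 0;
  }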


[PULL 06/12] KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8

2015-08-22 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This builds on the ability to run more than one vcore on a physical
core by using the micro-threading (split-core) modes of the POWER8
chip.  Previously, only vcores from the same VM could be run together,
and (on POWER8) only if they had just one thread per core.  With the
ability to split the core on guest entry and unsplit it on guest exit,
we can run up to 8 vcpu threads from up to 4 different VMs, and we can
run multiple vcores with 2 or 4 vcpus per vcore.

Dynamic micro-threading is only available if the static configuration
of the cores is whole-core mode (unsplit), and only on POWER8.

To manage this, we introduce a new kvm_split_mode struct which is
shared across all of the subcores in the core, with a pointer in the
paca on each thread.  In addition we extend the core_info struct to
have information on each subcore.  When deciding whether to add a
vcore to the set already on the core, we now have two possibilities:
(a) piggyback the vcore onto an existing subcore, or (b) start a new
subcore.

Currently, when any vcpu needs to exit the guest and switch to host
virtual mode, we interrupt all the threads in all subcores and switch
the core back to whole-core mode.  It may be possible in future to
allow some of the subcores to keep executing in the guest while
subcore 0 switches to the host, but that is not implemented in this
patch.

This adds a module parameter called dynamic_mt_modes which controls
which micro-threading (split-core) modes the code will consider, as a
bitmap.  In other words, if it is 0, no micro-threading mode is
considered; if it is 2, only 2-way micro-threading is considered; if
it is 4, only 4-way, and if it is 6, both 2-way and 4-way
micro-threading mode will be considered.  The default is 6.
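
As a quick illustration of the bitmap semantics just described (a
sketch, not kernel code):

  #include <stdio.h>

  int main(void)
  {
      int dynamic_mt_modes = 6;  /* default: 2-way and 4-way allowed */

      printf("2-way considered: %s\n", (dynamic_mt_modes & 2) ? "yes" : "no");
      printf("4-way considered: %s\n", (dynamic_mt_modes & 4) ? "yes" : "no");
      return 0;
  }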

With this, we now have secondary threads which are the primary thread
for their subcore and therefore need to do the MMU switch.  These
threads will need to be started even if they have no vcpu to run, so
we use the vcore pointer in the PACA rather than the vcpu pointer to
trigger them.

It is now possible for thread 0 to find that an exit has been
requested before it gets to switch the subcore state to the guest.  In
that case we haven't added the guest's timebase offset to the
timebase, so we need to be careful not to subtract the offset in the
guest exit path.  In fact we just skip the whole path that switches
back to host context, since we haven't switched to the guest context.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_asm.h |  20 ++
 arch/powerpc/include/asm/kvm_host.h   |   3 +
 arch/powerpc/kernel/asm-offsets.c |   7 +
 arch/powerpc/kvm/book3s_hv.c  | 367 ++
 arch/powerpc/kvm/book3s_hv_builtin.c  |  25 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 113 +++--
 6 files changed, 473 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 5bdfb5d..57d5dfe 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -25,6 +25,12 @@
 #define XICS_MFRR  0xc
 #define XICS_IPI   2   /* interrupt source # for IPIs */
 
+/* Maximum number of threads per physical core */
+#define MAX_SMT_THREADS        8
+
+/* Maximum number of subcores per physical core */
+#define MAX_SUBCORES   4
+
 #ifdef __ASSEMBLY__
 
 #ifdef CONFIG_KVM_BOOK3S_HANDLER
@@ -65,6 +71,19 @@ kvmppc_resume_\intno:
 
 #else  /*__ASSEMBLY__ */
 
+struct kvmppc_vcore;
+
+/* Struct used for coordinating micro-threading (split-core) mode changes */
+struct kvm_split_mode {
+   unsigned long   rpr;
+   unsigned long   pmmar;
+   unsigned long   ldbar;
+   u8  subcore_size;
+   u8  do_nap;
+   u8  napped[MAX_SMT_THREADS];
+   struct kvmppc_vcore *master_vcs[MAX_SUBCORES];
+};
+
 /*
  * This struct goes in the PACA on 64-bit processors.  It is used
  * to store host state that needs to be saved when we enter a guest
@@ -100,6 +119,7 @@ struct kvmppc_host_state {
u64 host_spurr;
u64 host_dscr;
u64 dec_expires;
+   struct kvm_split_mode *kvm_split_mode;
 #endif
 #ifdef CONFIG_PPC_BOOK3S_64
u64 cfar;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 2b74490..80eb29a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -302,6 +302,9 @@ struct kvmppc_vcore {
 #define VCORE_EXIT_MAP(vc) ((vc)->entry_exit_map >> 8)
 #define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
+/* This bit is used when a vcore exit is triggered from outside the vcore */
+#define VCORE_EXIT_REQ 0x1
+
 /*
  * Values for vcore_state.
  * Note that these are arranged such that lower values
diff --git a/arch

[PULL 02/12] KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig

2015-08-22 Thread Alexander Graf
From: Thomas Huth th...@redhat.com

Since the PPC970 support has been removed from the kvm-hv kernel
module recently, we should also reflect this change in the help
text of the corresponding Kconfig option.

Signed-off-by: Thomas Huth th...@redhat.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Kconfig | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 3caec2c..c2024ac 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -74,14 +74,14 @@ config KVM_BOOK3S_64
  If unsure, say N.
 
 config KVM_BOOK3S_64_HV
-   tristate "KVM support for POWER7 and PPC970 using hypervisor mode in host"
+   tristate "KVM for POWER7 and later using hypervisor mode in host"
depends on KVM_BOOK3S_64 && PPC_POWERNV
select KVM_BOOK3S_HV_POSSIBLE
select MMU_NOTIFIER
select CMA
---help---
  Support running unmodified book3s_64 guest kernels in
- virtual machines on POWER7 and PPC970 processors that have
+ virtual machines on POWER7 and newer processors that have
  hypervisor mode available to the host.
 
  If you say Y here, KVM will use the hardware virtualization
@@ -89,8 +89,8 @@ config KVM_BOOK3S_64_HV
  guest operating systems will run at full hardware speed
  using supervisor and user modes.  However, this also means
  that KVM is not usable under PowerVM (pHyp), is only usable
- on POWER7 (or later) processors and PPC970-family processors,
- and cannot emulate a different processor from the host processor.
+ on POWER7 or later processors, and cannot emulate a
+ different processor from the host processor.
 
  If unsure, say N.
 
-- 
1.8.1.4


[PULL 08/12] KVM: PPC: Book3S HV: Fix bug in dirty page tracking

2015-08-22 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This fixes a bug in the tracking of pages that get modified by the
guest.  If the guest creates a large-page HPTE, writes to memory
somewhere within the large page, and then removes the HPTE, we only
record the modified state for the first normal page within the large
page, when in fact the guest might have modified some other normal
page within the large page.

To fix this we use some unused bits in the rmap entry to record the
order (log base 2) of the size of the page that was modified, when
removing an HPTE.  Then in kvm_test_clear_dirty_npages() we use that
order to return the correct number of modified pages.

The same thing could in principle happen when removing a HPTE at the
host's request, i.e. when paging out a page, except that we never
page out large pages, and the guest can only create large-page HPTEs
if the guest RAM is backed by large pages.  However, we also fix
this case for the sake of future-proofing.

The reference bit is also subject to the same loss of information.  We
don't make the same fix here for the reference bit because there isn't
an interface for userspace to find out which pages the guest has
referenced, whereas there is one for userspace to find out which pages
the guest has modified.  Because of this loss of information, the
kvm_age_hva_hv() and kvm_test_age_hva_hv() functions might incorrectly
say that a page has not been referenced when it has, but that doesn't
matter greatly because we never page or swap out large pages.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |  1 +
 arch/powerpc/include/asm/kvm_host.h   |  2 ++
 arch/powerpc/kvm/book3s_64_mmu_hv.c   |  8 +++-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c   | 17 +
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index b91e74a..e6b2534 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -158,6 +158,7 @@ extern pfn_t kvmppc_gpa_to_pfn(struct kvm_vcpu *vcpu, gpa_t gpa, bool writing,
			bool *writable);
 extern void kvmppc_add_revmap_chain(struct kvm *kvm, struct revmap_entry *rev,
			unsigned long *rmap, long pte_index, int realmode);
+extern void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long psize);
 extern void kvmppc_invalidate_hpte(struct kvm *kvm, __be64 *hptep,
unsigned long pte_index);
 void kvmppc_clear_ref_hpte(struct kvm *kvm, __be64 *hptep,
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 80eb29a..e187b6a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -205,8 +205,10 @@ struct revmap_entry {
  */
 #define KVMPPC_RMAP_LOCK_BIT   63
 #define KVMPPC_RMAP_RC_SHIFT   32
+#define KVMPPC_RMAP_CHG_SHIFT  48
 #define KVMPPC_RMAP_REFERENCED (HPTE_R_R << KVMPPC_RMAP_RC_SHIFT)
 #define KVMPPC_RMAP_CHANGED    (HPTE_R_C << KVMPPC_RMAP_RC_SHIFT)
+#define KVMPPC_RMAP_CHG_ORDER  (0x3ful << KVMPPC_RMAP_CHG_SHIFT)
 #define KVMPPC_RMAP_PRESENT    0x1ul
 #define KVMPPC_RMAP_INDEX      0xfffffffful
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index dab68b7..1f9c0a1 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -761,6 +761,8 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
/* Harvest R and C */
rcbits = be64_to_cpu(hptep[1]) & (HPTE_R_R | HPTE_R_C);
*rmapp |= rcbits << KVMPPC_RMAP_RC_SHIFT;
+   if (rcbits & HPTE_R_C)
+       kvmppc_update_rmap_change(rmapp, psize);
if (rcbits & ~rev[i].guest_rpte) {
rev[i].guest_rpte = ptel | rcbits;
note_hpte_modification(kvm, rev[i]);
@@ -927,8 +929,12 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, unsigned long *rmapp)
  retry:
lock_rmap(rmapp);
if (*rmapp & KVMPPC_RMAP_CHANGED) {
-   *rmapp &= ~KVMPPC_RMAP_CHANGED;
+   long change_order = (*rmapp & KVMPPC_RMAP_CHG_ORDER)
+        >> KVMPPC_RMAP_CHG_SHIFT;
+   *rmapp &= ~(KVMPPC_RMAP_CHANGED | KVMPPC_RMAP_CHG_ORDER);
npages_dirty = 1;
+   if (change_order > PAGE_SHIFT)
+       npages_dirty = 1ul << (change_order - PAGE_SHIFT);
}
if (!(*rmapp & KVMPPC_RMAP_PRESENT)) {
unlock_rmap(rmapp);
diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index c6d601c..c7a3ab2 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -12,6 +12,7 @@
 #include linux/kvm_host.h
 #include

[PULL 03/12] KVM: PPC: Fix warnings from sparse

2015-08-22 Thread Alexander Graf
From: Thomas Huth th...@redhat.com

When compiling the KVM code for POWER with "make C=1", sparse
complains about functions missing proper prototypes and a 64-bit
constant missing the ULL suffix. Let's fix this by making the
functions static or by including the proper header with the
prototypes, and by appending a ULL suffix to the constant
PPC_MPPE_ADDRESS_MASK.
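
As a self-contained illustration (not part of the patch) of why the
suffix matters:

#include <stdio.h>

/* Without a suffix, the type of a hex constant is the first of int,
 * unsigned int, long, unsigned long, ... that can represent it, so it
 * is a plain (signed) long on 64-bit targets and a long long on 32-bit
 * ones; sparse warns that such a constant "is so big it is long".
 * An explicit ULL pins the intended unsigned 64-bit type everywhere.
 */
#define MASK_PLAIN	0xffffffffc000		/* type varies by target */
#define MASK_ULL	0xffffffffc000ULL	/* always unsigned long long */

int main(void)
{
	/* 0 of each constant's type, minus 1: negative iff the type is signed */
	printf("plain signed: %d, ULL signed: %d\n",
	       (MASK_PLAIN & 0) - 1 < 0, (MASK_ULL & 0) - 1 < 0);
	return 0;
}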

Signed-off-by: Thomas Huth th...@redhat.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/ppc-opcode.h    | 2 +-
 arch/powerpc/kvm/book3s.c                | 3 ++-
 arch/powerpc/kvm/book3s_32_mmu_host.c    | 1 +
 arch/powerpc/kvm/book3s_64_mmu_host.c    | 1 +
 arch/powerpc/kvm/book3s_emulate.c        | 1 +
 arch/powerpc/kvm/book3s_hv.c             | 8 ++++----
 arch/powerpc/kvm/book3s_paired_singles.c | 2 +-
 arch/powerpc/kvm/powerpc.c               | 2 +-
 8 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h
index 8452335..790f5d1 100644
--- a/arch/powerpc/include/asm/ppc-opcode.h
+++ b/arch/powerpc/include/asm/ppc-opcode.h
@@ -287,7 +287,7 @@
 
 /* POWER8 Micro Partition Prefetch (MPP) parameters */
 /* Address mask is common for LOGMPP instruction and MPPR SPR */
-#define PPC_MPPE_ADDRESS_MASK 0xffffffffc000
+#define PPC_MPPE_ADDRESS_MASK 0xffffffffc000ULL
 
 /* Bits 60 and 61 of MPP SPR should be set to one of the following */
 /* Aborting the fetch is indeed setting 00 in the table size bits */
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index 05ea8fc..53285d5 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -240,7 +240,8 @@ void kvmppc_core_queue_inst_storage(struct kvm_vcpu *vcpu, ulong flags)
 	kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE);
 }
 
-int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu, unsigned int priority)
+static int kvmppc_book3s_irqprio_deliver(struct kvm_vcpu *vcpu,
+					 unsigned int priority)
 {
 	int deliver = 1;
 	int vec = 0;
diff --git a/arch/powerpc/kvm/book3s_32_mmu_host.c b/arch/powerpc/kvm/book3s_32_mmu_host.c
index 2035d16..d5c9bfe 100644
--- a/arch/powerpc/kvm/book3s_32_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_32_mmu_host.c
@@ -26,6 +26,7 @@
 #include <asm/machdep.h>
 #include <asm/mmu_context.h>
 #include <asm/hw_irq.h>
+#include "book3s.h"
 
 /* #define DEBUG_MMU */
 /* #define DEBUG_SR */
diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
index b982d92..79ad35a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_host.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -28,6 +28,7 @@
 #include <asm/mmu_context.h>
 #include <asm/hw_irq.h>
 #include "trace_pr.h"
+#include "book3s.h"
 
 #define PTE_SIZE 12
 
diff --git a/arch/powerpc/kvm/book3s_emulate.c b/arch/powerpc/kvm/book3s_emulate.c
index 5a2bc4b..2afdb9c 100644
--- a/arch/powerpc/kvm/book3s_emulate.c
+++ b/arch/powerpc/kvm/book3s_emulate.c
@@ -23,6 +23,7 @@
 #include <asm/reg.h>
 #include <asm/switch_to.h>
 #include <asm/time.h>
+#include "book3s.h"
 
 #define OP_19_XOP_RFID		18
 #define OP_19_XOP_RFI		50
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 68d067a..6e588ac 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -214,12 +214,12 @@ static void kvmppc_set_msr_hv(struct kvm_vcpu *vcpu, u64 msr)
 	kvmppc_end_cede(vcpu);
 }
 
-void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 pvr)
+static void kvmppc_set_pvr_hv(struct kvm_vcpu *vcpu, u32 pvr)
 {
 	vcpu->arch.pvr = pvr;
 }
 
-int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
+static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
 {
 	unsigned long pcr = 0;
 	struct kvmppc_vcore *vc = vcpu->arch.vcore;
@@ -259,7 +259,7 @@ int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
 	return 0;
 }
 
-void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
+static void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
 {
 	int r;
 
@@ -292,7 +292,7 @@ void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
 	       vcpu->arch.last_inst);
 }
 
-struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
+static struct kvm_vcpu *kvmppc_find_vcpu(struct kvm *kvm, int id)
 {
 	int r;
 	struct kvm_vcpu *v, *ret = NULL;
diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c
index bd6ab16..a759d9a 100644
--- a/arch/powerpc/kvm/book3s_paired_singles.c
+++ b/arch/powerpc/kvm/book3s_paired_singles.c
@@ -352,7 +352,7 @@ static inline u32 inst_get_field(u32 inst, int msb, int lsb)
 	return kvmppc_get_field(inst, msb + 32, lsb + 32);
 }
 
-bool kvmppc_inst_is_paired_single(struct kvm_vcpu *vcpu, u32 inst)
+static bool kvmppc_inst_is_paired_single(struct kvm_vcpu *vcpu, u32 inst)
 {
 	if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE))
 		return false;
diff --git a/arch

[PULL 10/12] KVM: PPC: Book3S HV: Fix preempted vcore list locking

2015-08-22 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

When a vcore gets preempted, we put it on the preempted vcore list for
the current CPU.  The runner task then calls schedule() and comes back
some time later and takes itself off the list.  We need to be careful
to lock the list that it was put onto, which may not be the list for the
current CPU since the runner task may have moved to another CPU.
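
The locking rule generalizes beyond KVM; here is a minimal userspace
sketch, with illustrative names, of taking the lock of the list an item
was actually put on rather than the current CPU's list:

#include <pthread.h>

#define NR_CPUS 4

struct node {
	struct node *next;
	int cpu;			/* list this node was enqueued on */
};

struct locked_list {
	pthread_mutex_t lock;
	struct node *head;
};

static struct locked_list lists[NR_CPUS];

/* WRONG: &lists[current_cpu()], since the remover may have migrated.
 * RIGHT: lock the list recorded at enqueue time, as the fix does with
 * &per_cpu(preempted_vcores, vc->pcpu).
 */
static void list_remove(struct node *n)
{
	struct locked_list *lp = &lists[n->cpu];
	struct node **pp;

	pthread_mutex_lock(&lp->lock);
	for (pp = &lp->head; *pp; pp = &(*pp)->next)
		if (*pp == n) {
			*pp = n->next;
			break;
		}
	pthread_mutex_unlock(&lp->lock);
}

int main(void)
{
	struct node n = { .next = 0, .cpu = 2 };

	pthread_mutex_init(&lists[2].lock, 0);
	lists[2].head = &n;
	list_remove(&n);	/* safe even if we now run on another CPU */
	return lists[2].head != 0;
}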

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6e3ef30..3d02276 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1962,10 +1962,11 @@ static void kvmppc_vcore_preempt(struct kvmppc_vcore *vc)
 
 static void kvmppc_vcore_end_preempt(struct kvmppc_vcore *vc)
 {
-	struct preempted_vcore_list *lp = this_cpu_ptr(&preempted_vcores);
+	struct preempted_vcore_list *lp;
 
 	kvmppc_core_end_stolen(vc);
 	if (!list_empty(&vc->preempt_list)) {
+		lp = &per_cpu(preempted_vcores, vc->pcpu);
 		spin_lock(&lp->lock);
 		list_del_init(&vc->preempt_list);
 		spin_unlock(&lp->lock);
-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PULL 12/12] KVM: PPC: Book3S: correct width in XER handling

2015-08-22 Thread Alexander Graf
From: Sam bobroff sam.bobr...@au1.ibm.com

In 64 bit kernels, the Fixed Point Exception Register (XER) is a 64
bit field (e.g. in kvm_regs and kvm_vcpu_arch) and in most places it is
accessed as such.

This patch corrects places where it is accessed as a 32 bit field by a
64 bit kernel.  In some cases this is via a 32 bit load or store
instruction which, depending on endianness, will cause either the
lower or upper 32 bits to be missed.  In another case it is cast as a
u32, causing the upper 32 bits to be cleared.

This patch corrects those places by extending the access methods to
64 bits.
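
A short self-contained illustration of the truncation (not from the
patch itself):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t xer = 0xdeadbeef00000001ULL;	/* 64-bit register image */

	/* A 32-bit access -- a u32 cast in C, or lwz/stw in assembly --
	 * keeps only one half; which half depends on endianness. */
	uint32_t narrow = (uint32_t)xer;

	printf("full = %016llx\n", (unsigned long long)xer);
	printf("u32  = %08x (upper 32 bits lost)\n", narrow);
	return 0;
}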

Signed-off-by: Sam Bobroff sam.bobr...@au1.ibm.com
Reviewed-by: Laurent Vivier lviv...@redhat.com
Reviewed-by: Thomas Huth th...@redhat.com
Tested-by: Thomas Huth th...@redhat.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h | 4 ++--
 arch/powerpc/include/asm/kvm_book3s_asm.h | 2 +-
 arch/powerpc/include/asm/kvm_booke.h  | 4 ++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 6 +++---
 arch/powerpc/kvm/book3s_segment.S | 4 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
index e6b2534..9fac01c 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -226,12 +226,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
 	return vcpu->arch.cr;
 }
 
-static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
+static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
 {
 	vcpu->arch.xer = val;
 }
 
-static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
+static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.xer;
 }
diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 57d5dfe..72b6225 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -132,7 +132,7 @@ struct kvmppc_book3s_shadow_vcpu {
 	bool in_use;
 	ulong gpr[14];
 	u32 cr;
-	u32 xer;
+	ulong xer;
 	ulong ctr;
 	ulong lr;
 	ulong pc;
diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h
index 3286f0d..bc6e29e 100644
--- a/arch/powerpc/include/asm/kvm_booke.h
+++ b/arch/powerpc/include/asm/kvm_booke.h
@@ -54,12 +54,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
 	return vcpu->arch.cr;
 }
 
-static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
+static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
 {
 	vcpu->arch.xer = val;
 }
 
-static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
+static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.xer;
 }
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index e347766..472680f 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -944,7 +944,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 	blt	hdec_soon
 
 	ld	r6, VCPU_CTR(r4)
-	lwz	r7, VCPU_XER(r4)
+	ld	r7, VCPU_XER(r4)
 
 	mtctr	r6
 	mtxer	r7
@@ -1181,7 +1181,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	mfctr	r3
 	mfxer	r4
 	std	r3, VCPU_CTR(r9)
-	stw	r4, VCPU_XER(r9)
+	std	r4, VCPU_XER(r9)
 
 	/* If this is a page table miss then see if it's theirs or ours */
 	cmpwi	r12, BOOK3S_INTERRUPT_H_DATA_STORAGE
@@ -1763,7 +1763,7 @@ kvmppc_hdsi:
 	bl	kvmppc_msr_interrupt
 fast_interrupt_c_return:
 6:	ld	r7, VCPU_CTR(r9)
-	lwz	r8, VCPU_XER(r9)
+	ld	r8, VCPU_XER(r9)
 	mtctr	r7
 	mtxer	r8
 	mr	r4, r9
diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
index acee37c..ca8f174 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -123,7 +123,7 @@ no_dcbz32_on:
 	PPC_LL	r8, SVCPU_CTR(r3)
 	PPC_LL	r9, SVCPU_LR(r3)
 	lwz	r10, SVCPU_CR(r3)
-	lwz	r11, SVCPU_XER(r3)
+	PPC_LL	r11, SVCPU_XER(r3)
 
 	mtctr	r8
 	mtlr	r9
@@ -237,7 +237,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
 	mfctr	r8
 	mflr	r9
 
-	stw	r5, SVCPU_XER(r13)
+	PPC_STL	r5, SVCPU_XER(r13)
 	PPC_STL	r6, SVCPU_FAULT_DAR(r13)
 	stw	r7, SVCPU_FAULT_DSISR(r13)
 	PPC_STL	r8, SVCPU_CTR(r13)
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PULL 09/12] KVM: PPC: Book3S HV: Implement H_CLEAR_REF and H_CLEAR_MOD

2015-08-22 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This adds implementations for the H_CLEAR_REF (test and clear reference
bit) and H_CLEAR_MOD (test and clear changed bit) hypercalls.

When clearing the reference or change bit in the guest view of the HPTE,
we also have to clear it in the real HPTE so that we can detect future
references or changes.  When we do so, we transfer the R or C bit value
to the rmap entry for the underlying host page so that kvm_age_hva_hv(),
kvm_test_age_hva_hv() and kvmppc_hv_get_dirty_log() know that the page
has been referenced and/or changed.

These hypercalls are not used by Linux guests.  These implementations
have been tested using a FreeBSD guest.
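
The test-and-clear contract of these hypercalls, reduced to a
self-contained sketch; bit values and names here are illustrative:

#include <stdint.h>

#define R_BIT 0x100ul	/* illustrative stand-in for HPTE_R_R */

/* H_CLEAR_REF in miniature: return the old reference bit to the
 * caller, clear it so the next reference is detectable again, and
 * transfer the harvested information to host-side tracking so it is
 * not lost (the real code sets KVMPPC_RMAP_REFERENCED in the rmap). */
static unsigned long clear_ref(uint64_t *pte_r, uint64_t *host_referenced)
{
	unsigned long old = *pte_r & R_BIT;

	*pte_r &= ~R_BIT;
	if (old)
		*host_referenced = 1;
	return old;		/* the guest sees this in GPR4 */
}

int main(void)
{
	uint64_t r = R_BIT, ref = 0;

	return !(clear_ref(&r, &ref) && ref == 1 && !(r & R_BIT));
}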

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c | 126 ++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S |   4 +-
 2 files changed, 121 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
index c7a3ab2..c1df9bb 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
@@ -112,25 +112,38 @@ void kvmppc_update_rmap_change(unsigned long *rmap, unsigned long psize)
 }
 EXPORT_SYMBOL_GPL(kvmppc_update_rmap_change);
 
+/* Returns a pointer to the revmap entry for the page mapped by a HPTE */
+static unsigned long *revmap_for_hpte(struct kvm *kvm, unsigned long hpte_v,
+				      unsigned long hpte_gr)
+{
+	struct kvm_memory_slot *memslot;
+	unsigned long *rmap;
+	unsigned long gfn;
+
+	gfn = hpte_rpn(hpte_gr, hpte_page_size(hpte_v, hpte_gr));
+	memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
+	if (!memslot)
+		return NULL;
+
+	rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
+	return rmap;
+}
+
 /* Remove this HPTE from the chain for a real page */
 static void remove_revmap_chain(struct kvm *kvm, long pte_index,
 				struct revmap_entry *rev,
 				unsigned long hpte_v, unsigned long hpte_r)
 {
 	struct revmap_entry *next, *prev;
-	unsigned long gfn, ptel, head;
-	struct kvm_memory_slot *memslot;
+	unsigned long ptel, head;
 	unsigned long *rmap;
 	unsigned long rcbits;
 
 	rcbits = hpte_r & (HPTE_R_R | HPTE_R_C);
 	ptel = rev->guest_rpte |= rcbits;
-	gfn = hpte_rpn(ptel, hpte_page_size(hpte_v, ptel));
-	memslot = __gfn_to_memslot(kvm_memslots_raw(kvm), gfn);
-	if (!memslot)
+	rmap = revmap_for_hpte(kvm, hpte_v, ptel);
+	if (!rmap)
 		return;
-
-	rmap = real_vmalloc_addr(&memslot->arch.rmap[gfn - memslot->base_gfn]);
 	lock_rmap(rmap);
 
 	head = *rmap & KVMPPC_RMAP_INDEX;
@@ -678,6 +691,105 @@ long kvmppc_h_read(struct kvm_vcpu *vcpu, unsigned long flags,
 	return H_SUCCESS;
 }
 
+long kvmppc_h_clear_ref(struct kvm_vcpu *vcpu, unsigned long flags,
+			unsigned long pte_index)
+{
+	struct kvm *kvm = vcpu->kvm;
+	__be64 *hpte;
+	unsigned long v, r, gr;
+	struct revmap_entry *rev;
+	unsigned long *rmap;
+	long ret = H_NOT_FOUND;
+
+	if (pte_index >= kvm->arch.hpt_npte)
+		return H_PARAMETER;
+
+	rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
+	hpte = (__be64 *)(kvm->arch.hpt_virt + (pte_index << 4));
+	while (!try_lock_hpte(hpte, HPTE_V_HVLOCK))
+		cpu_relax();
+	v = be64_to_cpu(hpte[0]);
+	r = be64_to_cpu(hpte[1]);
+	if (!(v & (HPTE_V_VALID | HPTE_V_ABSENT)))
+		goto out;
+
+	gr = rev->guest_rpte;
+	if (rev->guest_rpte & HPTE_R_R) {
+		rev->guest_rpte &= ~HPTE_R_R;
+		note_hpte_modification(kvm, rev);
+	}
+	if (v & HPTE_V_VALID) {
+		gr |= r & (HPTE_R_R | HPTE_R_C);
+		if (r & HPTE_R_R) {
+			kvmppc_clear_ref_hpte(kvm, hpte, pte_index);
+			rmap = revmap_for_hpte(kvm, v, gr);
+			if (rmap) {
+				lock_rmap(rmap);
+				*rmap |= KVMPPC_RMAP_REFERENCED;
+				unlock_rmap(rmap);
+			}
+		}
+	}
+	vcpu->arch.gpr[4] = gr;
+	ret = H_SUCCESS;
+ out:
+	unlock_hpte(hpte, v & ~HPTE_V_HVLOCK);
+	return ret;
+}
+
+long kvmppc_h_clear_mod(struct kvm_vcpu *vcpu, unsigned long flags,
+			unsigned long pte_index)
+{
+	struct kvm *kvm = vcpu->kvm;
+	__be64 *hpte;
+	unsigned long v, r, gr;
+	struct revmap_entry *rev;
+	unsigned long *rmap;
+	long ret = H_NOT_FOUND;
+
+	if (pte_index >= kvm->arch.hpt_npte)
+		return H_PARAMETER;
+
+	rev = real_vmalloc_addr(&kvm->arch.revmap[pte_index]);
+	hpte = (__be64

[PULL 11/12] KVM: PPC: Book3S HV: Fix preempted vcore stolen time calculation

2015-08-22 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

Whenever a vcore state is VCORE_PREEMPT we need to be counting stolen
time for it.  This currently isn't the case when we have a vcore that
no longer has any runnable threads in it but still has a runner task,
so we do an explicit call to kvmppc_core_start_stolen() in that case.
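
The invariant the fix restores, sketched in miniature: every transition
into VCORE_PREEMPT must start the stolen-time clock. The state names
match the patch; the helpers below are illustrative stubs:

enum vcore_state { VCORE_INACTIVE, VCORE_PREEMPT, VCORE_RUNNING };

struct vcore {
	enum vcore_state state;
	unsigned long preempt_tb;	/* when stolen-time counting began */
};

static unsigned long read_timebase(void) { return 0; /* stub */ }

/* Funnel all transitions to VCORE_PREEMPT through one helper so the
 * stolen-time clock can never be forgotten; this is what the explicit
 * kvmppc_core_start_stolen() call achieves in the patch. */
static void vcore_set_preempted(struct vcore *vc)
{
	vc->state = VCORE_PREEMPT;
	vc->preempt_tb = read_timebase();
}

int main(void)
{
	struct vcore vc = { VCORE_RUNNING, 0 };

	vcore_set_preempted(&vc);
	return vc.state != VCORE_PREEMPT;
}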

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 3d02276..fad52f2 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2283,9 +2283,14 @@ static void post_guest_process(struct kvmppc_vcore *vc, bool is_master)
 	}
 	list_del_init(&vc->preempt_list);
 	if (!is_master) {
-		vc->vcore_state = vc->runner ? VCORE_PREEMPT : VCORE_INACTIVE;
-		if (still_running > 0)
+		if (still_running > 0) {
 			kvmppc_vcore_preempt(vc);
+		} else if (vc->runner) {
+			vc->vcore_state = VCORE_PREEMPT;
+			kvmppc_core_start_stolen(vc);
+		} else {
+			vc->vcore_state = VCORE_INACTIVE;
+		}
 		if (vc->n_runnable > 0 && vc->runner == NULL) {
 			/* make sure there's a candidate runner awake */
 			vcpu = list_first_entry(&vc->runnable_threads,
-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PULL 06/12] KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8

2015-08-22 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This builds on the ability to run more than one vcore on a physical
core by using the micro-threading (split-core) modes of the POWER8
chip.  Previously, only vcores from the same VM could be run together,
and (on POWER8) only if they had just one thread per core.  With the
ability to split the core on guest entry and unsplit it on guest exit,
we can run up to 8 vcpu threads from up to 4 different VMs, and we can
run multiple vcores with 2 or 4 vcpus per vcore.

Dynamic micro-threading is only available if the static configuration
of the cores is whole-core mode (unsplit), and only on POWER8.

To manage this, we introduce a new kvm_split_mode struct which is
shared across all of the subcores in the core, with a pointer in the
paca on each thread.  In addition we extend the core_info struct to
have information on each subcore.  When deciding whether to add a
vcore to the set already on the core, we now have two possibilities:
(a) piggyback the vcore onto an existing subcore, or (b) start a new
subcore.

Currently, when any vcpu needs to exit the guest and switch to host
virtual mode, we interrupt all the threads in all subcores and switch
the core back to whole-core mode.  It may be possible in future to
allow some of the subcores to keep executing in the guest while
subcore 0 switches to the host, but that is not implemented in this
patch.

This adds a module parameter called dynamic_mt_modes which controls
which micro-threading (split-core) modes the code will consider, as a
bitmap.  In other words, if it is 0, no micro-threading mode is
considered; if it is 2, only 2-way micro-threading is considered; if
it is 4, only 4-way, and if it is 6, both 2-way and 4-way
micro-threading mode will be considered.  The default is 6.
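
A self-contained sketch of how such a bitmap-valued parameter gates the
modes considered; the parameter name comes from the text above, the
helper is illustrative:

#include <stdio.h>
#include <stdbool.h>

static int dynamic_mt_modes = 6;	/* default: 2-way and 4-way allowed */

static bool mt_mode_allowed(int n_subcores)	/* n_subcores is 2 or 4 */
{
	/* bit 2 enables 2-way micro-threading, bit 4 enables 4-way */
	return dynamic_mt_modes & n_subcores;
}

int main(void)
{
	printf("2-way: %d, 4-way: %d\n",
	       mt_mode_allowed(2), mt_mode_allowed(4));
	return 0;
}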

With this, we now have secondary threads which are the primary thread
for their subcore and therefore need to do the MMU switch.  These
threads will need to be started even if they have no vcpu to run, so
we use the vcore pointer in the PACA rather than the vcpu pointer to
trigger them.

It is now possible for thread 0 to find that an exit has been
requested before it gets to switch the subcore state to the guest.  In
that case we haven't added the guest's timebase offset to the
timebase, so we need to be careful not to subtract the offset in the
guest exit path.  In fact we just skip the whole path that switches
back to host context, since we haven't switched to the guest context.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_asm.h |  20 ++
 arch/powerpc/include/asm/kvm_host.h   |   3 +
 arch/powerpc/kernel/asm-offsets.c |   7 +
 arch/powerpc/kvm/book3s_hv.c  | 367 ++
 arch/powerpc/kvm/book3s_hv_builtin.c  |  25 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 113 +++--
 6 files changed, 473 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 5bdfb5d..57d5dfe 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -25,6 +25,12 @@
 #define XICS_MFRR  0xc
 #define XICS_IPI   2   /* interrupt source # for IPIs */
 
+/* Maximum number of threads per physical core */
+#define MAX_SMT_THREADS	8
+
+/* Maximum number of subcores per physical core */
+#define MAX_SUBCORES   4
+
 #ifdef __ASSEMBLY__
 
 #ifdef CONFIG_KVM_BOOK3S_HANDLER
@@ -65,6 +71,19 @@ kvmppc_resume_\intno:
 
 #else  /*__ASSEMBLY__ */
 
+struct kvmppc_vcore;
+
+/* Struct used for coordinating micro-threading (split-core) mode changes */
+struct kvm_split_mode {
+   unsigned long   rpr;
+   unsigned long   pmmar;
+   unsigned long   ldbar;
+   u8  subcore_size;
+   u8  do_nap;
+   u8  napped[MAX_SMT_THREADS];
+   struct kvmppc_vcore *master_vcs[MAX_SUBCORES];
+};
+
 /*
  * This struct goes in the PACA on 64-bit processors.  It is used
  * to store host state that needs to be saved when we enter a guest
@@ -100,6 +119,7 @@ struct kvmppc_host_state {
u64 host_spurr;
u64 host_dscr;
u64 dec_expires;
+   struct kvm_split_mode *kvm_split_mode;
 #endif
 #ifdef CONFIG_PPC_BOOK3S_64
u64 cfar;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 2b74490..80eb29a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -302,6 +302,9 @@ struct kvmppc_vcore {
 #define VCORE_EXIT_MAP(vc)	((vc)->entry_exit_map >> 8)
 #define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
+/* This bit is used when a vcore exit is triggered from outside the vcore */
+#define VCORE_EXIT_REQ 0x1
+
 /*
  * Values for vcore_state.
  * Note that these are arranged such that lower values
diff --git a/arch

[PULL 04/12] KVM: PPC: add missing pt_regs initialization

2015-08-22 Thread Alexander Graf
From: Tudor Laurentiu b10...@freescale.com

On this switch branch the regs initialization
doesn't happen, so add it.
This was found with the help of a static
code analysis tool.

Signed-off-by: Laurentiu Tudor laurentiu.tu...@freescale.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index cc58426..ae458f0 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -933,6 +933,7 @@ static void kvmppc_restart_interrupt(struct kvm_vcpu *vcpu,
 #endif
break;
case BOOKE_INTERRUPT_CRITICAL:
+		kvmppc_fill_pt_regs(&regs);
 		unknown_exception(&regs);
break;
case BOOKE_INTERRUPT_DEBUG:
-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PULL 02/12] KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig

2015-08-22 Thread Alexander Graf
From: Thomas Huth th...@redhat.com

Since the PPC970 support has been removed from the kvm-hv kernel
module recently, we should also reflect this change in the help
text of the corresponding Kconfig option.

Signed-off-by: Thomas Huth th...@redhat.com
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Kconfig | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index 3caec2c..c2024ac 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -74,14 +74,14 @@ config KVM_BOOK3S_64
  If unsure, say N.
 
 config KVM_BOOK3S_64_HV
-	tristate "KVM support for POWER7 and PPC970 using hypervisor mode in host"
+	tristate "KVM for POWER7 and later using hypervisor mode in host"
 	depends on KVM_BOOK3S_64 && PPC_POWERNV
select KVM_BOOK3S_HV_POSSIBLE
select MMU_NOTIFIER
select CMA
---help---
  Support running unmodified book3s_64 guest kernels in
- virtual machines on POWER7 and PPC970 processors that have
+ virtual machines on POWER7 and newer processors that have
  hypervisor mode available to the host.
 
  If you say Y here, KVM will use the hardware virtualization
@@ -89,8 +89,8 @@ config KVM_BOOK3S_64_HV
  guest operating systems will run at full hardware speed
  using supervisor and user modes.  However, this also means
  that KVM is not usable under PowerVM (pHyp), is only usable
- on POWER7 (or later) processors and PPC970-family processors,
- and cannot emulate a different processor from the host processor.
+ on POWER7 or later processors, and cannot emulate a
+ different processor from the host processor.
 
  If unsure, say N.
 
-- 
1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/1] KVM: PPC: Book3S: correct width in XER handling

2015-08-12 Thread Alexander Graf


On 06.08.15 12:16, Laurent Vivier wrote:
> Hi,
> 
> I'd also like to see this patch in the mainstream as it fixes a bug
> appearing when we switch from vCPU context to hypervisor context (guest
> crash).

Thanks, applied to kvm-ppc-queue.


Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing

2015-08-12 Thread Alexander Graf


On 07.08.15 17:54, Nicholas Krause wrote:
> This fixes the incorrect return statement in the function
> mpic_set_default_irq_routing from always returning zero
> to signal success to this function's caller to instead
> return the return value of kvm_set_irq_routing as this
> function can fail and we need to correctly signal the
> caller of mpic_set_default_irq_routing when the call
> to this particular function has failed.
> 
> Signed-off-by: Nicholas Krause xerofo...@gmail.com
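
The patch itself never reached the list (see the reply below); from the
description, the intended change would presumably be along these lines.
This is a hypothetical sketch, not the posted diff:

 static int mpic_set_default_irq_routing(struct openpic *opp)
 {
 	struct kvm_irq_routing_entry *routing;
+	int ret;
 
 	routing = kzalloc(sizeof(*routing), GFP_KERNEL);
 	if (!routing)
 		return -ENOMEM;
 
-	kvm_set_irq_routing(opp->kvm, routing, 0, 0);
+	ret = kvm_set_irq_routing(opp->kvm, routing, 0, 0);
 	kfree(routing);
-	return 0;
+	return ret;
 }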

I like the patch, but I don't see it on the kvm-ppc mailing list. It
doesn't show up on patchwork or spinics. Did something go wrong while
sending it out?


Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm:powerpc:Fix return statements for wrapper functions in the file book3s_64_mmu_hv.c

2015-08-12 Thread Alexander Graf


On 10.08.15 17:27, Nicholas Krause wrote:
> This fixes the wrapper functions kvm_unmap_hva_hv and the function
> kvm_unmap_hva_range_hv to return the return value of the function
> kvm_handle_hva or kvm_handle_hva_range that they are wrapped to
> call internally rather than always making the caller of these
> wrapper functions think they always run successfully by returning
> the value of zero directly.
> 
> Signed-off-by: Nicholas Krause xerofo...@gmail.com

Paul, could you please take on this one?

Thanks,

Alex

> ---
>  arch/powerpc/kvm/book3s_64_mmu_hv.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> index dab68b7..0905c8f 100644
> --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
> +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
> @@ -774,14 +774,12 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp,
>  
>  int kvm_unmap_hva_hv(struct kvm *kvm, unsigned long hva)
>  {
> -	kvm_handle_hva(kvm, hva, kvm_unmap_rmapp);
> -	return 0;
> +	return kvm_handle_hva(kvm, hva, kvm_unmap_rmapp);
>  }
>  
>  int kvm_unmap_hva_range_hv(struct kvm *kvm, unsigned long start, unsigned long end)
>  {
> -	kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp);
> -	return 0;
> +	return kvm_handle_hva_range(kvm, start, end, kvm_unmap_rmapp);
>  }
>  
>  void kvmppc_core_flush_memslot_hv(struct kvm *kvm,
> 
 
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kvm:powerpc:Fix incorrect return statement in the function mpic_set_default_irq_routing

2015-08-12 Thread Alexander Graf


On 12.08.15 21:06, nick wrote:
 
 
 On 2015-08-12 03:05 PM, Alexander Graf wrote:


 On 07.08.15 17:54, Nicholas Krause wrote:
 This fixes the incorrect return statement in the function
 mpic_set_default_irq_routing from always returning zero
 to signal success to this function's caller to instead
 return the return value of kvm_set_irq_routing as this
 function can fail and we need to correctly signal the
 caller of mpic_set_default_irq_routing when the call
 to this particular function has failed.

 Signed-off-by: Nicholas Krause xerofo...@gmail.com

 I like the patch, but I don't see it on the kvm-ppc mailing list. It
 doesn't show up on patchwork or spinics. Did something go wrong while
 sending it out?


 Alex

 Alex,
 Ask Paolo about it as he would be able to explain it better then I.

Well, whatever the reason, I can only apply patches that actually
appeared on the public mailing list. Otherwise people may not get the
chance to review them ;).


Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [kvm-unit-tests PATCH 11/14] powerpc/ppc64: add rtas_power_off

2015-08-03 Thread Alexander Graf


On 03.08.15 19:02, Andrew Jones wrote:
> On Mon, Aug 03, 2015 at 07:08:17PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 03/08/2015 16:41, Andrew Jones wrote:
>>> Add enough RTAS support to support power-off, and apply it to
>>> exit().
>>>
>>> Signed-off-by: Andrew Jones drjo...@redhat.com
>>
>> Why not use virtio-mmio + testdev on ppc as well?  Similar to how we're
>> not using PSCI on ARM or ACPI on x86.
> 
> I have some longer term plans to add minimal virtio-pci support to
> kvm-unit-tests, and then we could plug virtio-serial+chr-testdev into
> that. I didn't think I could use virtio-mmio directly with spapr, but
> maybe I can? Actually, I sort of like this approach more in some

You would need to add support for the dynamic sysbus device allocation
in the spapr machine, but then I don't see why it wouldn't work.

PCI however is the more natural choice on sPAPR if you want to do virtio.

That said, if all you need is a chr transport, IIRC there should be a
way to get you additional channels on the existing serial port - which
really is just a simple hypercall interface. But David is the best
person to guide you to the best path forward here.


Alex

> respects though, as it doesn't require a special testdev or virtio
> support, keeping the unit test extra minimal. In fact, I was even
> thinking about posting patches (which I've already written) that
> allow chr-testdev to be optional for ARM too, now that it could use
> the exitcode snooper.
> 
> Thanks,
> drew
> 
>>
>> Paolo
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] Two fixes for dynamic micro-threading

2015-07-23 Thread Alexander Graf


On 20.07.15 08:49, David Gibson wrote:
> On Thu, Jul 16, 2015 at 05:11:12PM +1000, Paul Mackerras wrote:
>> This series contains two fixes for the new dynamic micro-threading
>> code that was added recently for HV-mode KVM on Power servers.
>> The patches are against Alex Graf's kvm-ppc-queue branch.  Please
>> apply.
> 
> agraf,
> 
> Any word on these?  These appear to fix a really nasty host crash in
> current upstream.  I'd really like to see them merged ASAP.

Thanks, applied to kvm-ppc-queue.

The host crash should only occur with dynamic micro-threading enabled,
which is not in Linus' tree, correct?


Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/5] PPC: Current patch queue for HV KVM

2015-07-01 Thread Alexander Graf


On 24.06.15 13:18, Paul Mackerras wrote:
> This is my current queue of patches for HV KVM.  This series is based
> on the kvm next branch.  They have all been posted 6 weeks ago or
> more, though I have just added a 3-line fix to patch 2/5 to fix a bug
> that we found in testing migration, and I expanded a comment (no code
> change) in patch 3/5 following a suggestion by Aneesh.
> 
> I'd like to see these go into 4.2 if possible.

Thanks, applied all to kvm-ppc-queue.


Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/5] KVM: PPC: Book3S HV: Implement dynamic micro-threading on POWER8

2015-06-30 Thread Alexander Graf

On 06/24/15 13:18, Paul Mackerras wrote:

This builds on the ability to run more than one vcore on a physical
core by using the micro-threading (split-core) modes of the POWER8
chip.  Previously, only vcores from the same VM could be run together,
and (on POWER8) only if they had just one thread per core.  With the
ability to split the core on guest entry and unsplit it on guest exit,
we can run up to 8 vcpu threads from up to 4 different VMs, and we can
run multiple vcores with 2 or 4 vcpus per vcore.

Dynamic micro-threading is only available if the static configuration
of the cores is whole-core mode (unsplit), and only on POWER8.

To manage this, we introduce a new kvm_split_mode struct which is
shared across all of the subcores in the core, with a pointer in the
paca on each thread.  In addition we extend the core_info struct to
have information on each subcore.  When deciding whether to add a
vcore to the set already on the core, we now have two possibilities:
(a) piggyback the vcore onto an existing subcore, or (b) start a new
subcore.

Currently, when any vcpu needs to exit the guest and switch to host
virtual mode, we interrupt all the threads in all subcores and switch
the core back to whole-core mode.  It may be possible in future to
allow some of the subcores to keep executing in the guest while
subcore 0 switches to the host, but that is not implemented in this
patch.

This adds a module parameter called dynamic_mt_modes which controls
which micro-threading (split-core) modes the code will consider, as a
bitmap.  In other words, if it is 0, no micro-threading mode is
considered; if it is 2, only 2-way micro-threading is considered; if
it is 4, only 4-way, and if it is 6, both 2-way and 4-way
micro-threading mode will be considered.  The default is 6.
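
As a sketch (names invented here, this is not the kernel code itself), the
mode check boils down to a bit test against the wanted subcore count:

	static int dynamic_mt_modes = 6;	/* the module parameter, default 6 */

	/* n_subcores is 2 for 2-way and 4 for 4-way micro-threading */
	static bool mt_mode_allowed(int n_subcores)
	{
		return (dynamic_mt_modes & n_subcores) != 0;
	}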

With this, we now have secondary threads which are the primary thread
for their subcore and therefore need to do the MMU switch.  These
threads will need to be started even if they have no vcpu to run, so
we use the vcore pointer in the PACA rather than the vcpu pointer to
trigger them.

It is now possible for thread 0 to find that an exit has been
requested before it gets to switch the subcore state to the guest.  In
that case we haven't added the guest's timebase offset to the
timebase, so we need to be careful not to subtract the offset in the
guest exit path.  In fact we just skip the whole path that switches
back to host context, since we haven't switched to the guest context.

Signed-off-by: Paul Mackerras pau...@samba.org
---
  arch/powerpc/include/asm/kvm_book3s_asm.h |  20 ++
  arch/powerpc/include/asm/kvm_host.h   |   3 +
  arch/powerpc/kernel/asm-offsets.c |   7 +
  arch/powerpc/kvm/book3s_hv.c  | 369 ++
  arch/powerpc/kvm/book3s_hv_builtin.c  |  25 +-
  arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 113 +++--
  6 files changed, 475 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 5bdfb5d..4024d24 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -25,6 +25,12 @@
  #define XICS_MFRR 0xc
  #define XICS_IPI  2   /* interrupt source # for IPIs */
  
+/* Maximum number of threads per physical core */

+#define MAX_THREADS	8
+
+/* Maximum number of subcores per physical core */
+#define MAX_SUBCORES   4
+
  #ifdef __ASSEMBLY__
  
  #ifdef CONFIG_KVM_BOOK3S_HANDLER

@@ -65,6 +71,19 @@ kvmppc_resume_\intno:
  
  #else  /*__ASSEMBLY__ */
  
+struct kvmppc_vcore;

+
+/* Struct used for coordinating micro-threading (split-core) mode changes */
+struct kvm_split_mode {
+   unsigned long   rpr;
+   unsigned long   pmmar;
+   unsigned long   ldbar;
+   u8  subcore_size;
+   u8  do_nap;
+   u8  napped[MAX_THREADS];
+   struct kvmppc_vcore *master_vcs[MAX_SUBCORES];
+};
+
  /*
   * This struct goes in the PACA on 64-bit processors.  It is used
   * to store host state that needs to be saved when we enter a guest
@@ -100,6 +119,7 @@ struct kvmppc_host_state {
u64 host_spurr;
u64 host_dscr;
u64 dec_expires;
+   struct kvm_split_mode *kvm_split_mode;
  #endif
  #ifdef CONFIG_PPC_BOOK3S_64
u64 cfar;
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 2b74490..80eb29a 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -302,6 +302,9 @@ struct kvmppc_vcore {
  #define VCORE_EXIT_MAP(vc)	((vc)->entry_exit_map >> 8)
  #define VCORE_IS_EXITING(vc)  (VCORE_EXIT_MAP(vc) != 0)
  
+/* This bit is used when a vcore exit is triggered from outside the vcore */

+#define VCORE_EXIT_REQ 0x1
+
  /*
   * Values for vcore_state.
   * Note that these are arranged such that lower values
diff --git a/arch/powerpc/kernel/asm-offsets.c 

Re: [PATCH 1/3] powerpc: implement barrier primitives

2015-06-17 Thread Alexander Graf


On 17.06.15 12:15, Will Deacon wrote:
 On Wed, Jun 17, 2015 at 10:43:48AM +0100, Andre Przywara wrote:
 Instead of referring to the Linux header including the barrier
 macros, copy over the rather simple implementation for the PowerPC
 barrier instructions kvmtool uses. This fixes build for powerpc.

 Signed-off-by: Andre Przywara andre.przyw...@arm.com
 ---
 Hi,

 I just took what kvmtool seems to have used before, I actually have
 no idea if sync is the right instruction or lwsync would do.
 Would be nice if some people with PowerPC knowledge could comment.
 
 I *think* we can use lwsync for rmb and wmb, but would want confirmation
 from a ppc guy before making that change!

Also I'd prefer to play safe for now :)
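
For reference, the conservative variant kept here is a handful of one-liners
(a sketch of the copied macros; whether rmb/wmb could be relaxed to lwsync is
exactly the open question above):

	/* PowerPC barriers for kvmtool, playing safe: full sync everywhere */
	#define mb()	asm volatile ("sync" : : : "memory")
	#define rmb()	asm volatile ("sync" : : : "memory")
	#define wmb()	asm volatile ("sync" : : : "memory")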


Alex


Re: [PATCH] treewide: Fix typo compatability - compatibility

2015-06-01 Thread Alexander Graf


On 27.05.15 14:05, Laurent Pinchart wrote:
 Even though 'compatability' has a dedicated entry in the Wiktionary,
 it's listed as 'Mispelling of compatibility'. Fix it.
 
 Signed-off-by: Laurent Pinchart laurent.pinch...@ideasonboard.com
 ---
  arch/metag/include/asm/elf.h | 2 +-


  arch/powerpc/kvm/book3s.c| 2 +-

Acked-by: Alexander Graf ag...@suse.de

for the PPC KVM bit.


Alex


Re: [PATCH v2 1/1] KVM: PPC: Book3S: correct width in XER handling

2015-05-26 Thread Alexander Graf


On 26.05.15 02:27, Sam Bobroff wrote:
 In 64 bit kernels, the Fixed Point Exception Register (XER) is a 64
 bit field (e.g. in kvm_regs and kvm_vcpu_arch) and in most places it is
 accessed as such.
 
 This patch corrects places where it is accessed as a 32 bit field by a
 64 bit kernel.  In some cases this is via a 32 bit load or store
 instruction which, depending on endianness, will cause either the
 lower or upper 32 bits to be missed.  In another case it is cast as a
 u32, causing the upper 32 bits to be cleared.
 
 This patch corrects those places by extending the access methods to
 64 bits.
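
As a minimal illustration of the truncation (hypothetical value, not taken
from the patch):

	ulong xer = 0x100000001UL;	/* 64-bit XER with an upper-half bit set */
	u32 bad = (u32)xer;		/* == 0x1, upper 32 bits silently cleared */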
 
 Signed-off-by: Sam Bobroff sam.bobr...@au1.ibm.com
 ---
 
 v2:
 
 Also extend kvmppc_book3s_shadow_vcpu.xer to 64 bit.
 
  arch/powerpc/include/asm/kvm_book3s.h |4 ++--
  arch/powerpc/include/asm/kvm_book3s_asm.h |2 +-
  arch/powerpc/kvm/book3s_hv_rmhandlers.S   |6 +++---
  arch/powerpc/kvm/book3s_segment.S |4 ++--
  4 files changed, 8 insertions(+), 8 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
 b/arch/powerpc/include/asm/kvm_book3s.h
 index b91e74a..05a875a 100644
 --- a/arch/powerpc/include/asm/kvm_book3s.h
 +++ b/arch/powerpc/include/asm/kvm_book3s.h
 @@ -225,12 +225,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
	return vcpu->arch.cr;
  }
  
 -static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
 +static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)

Now we have book3s and booke files with different prototypes on the same
inline function names. That's really ugly. Please keep them in sync ;).


Alex

  {
	vcpu->arch.xer = val;
  }
  
 -static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
 +static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
  {
	return vcpu->arch.xer;
  }
 diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h 
 b/arch/powerpc/include/asm/kvm_book3s_asm.h
 index 5bdfb5d..c4ccd2d 100644
 --- a/arch/powerpc/include/asm/kvm_book3s_asm.h
 +++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
 @@ -112,7 +112,7 @@ struct kvmppc_book3s_shadow_vcpu {
   bool in_use;
   ulong gpr[14];
   u32 cr;
 - u32 xer;
 + ulong xer;
   ulong ctr;
   ulong lr;
   ulong pc;
 diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
 b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 index 4d70df2..d75be59 100644
 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 @@ -870,7 +870,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
   blt hdec_soon
  
   ld  r6, VCPU_CTR(r4)
 - lwz r7, VCPU_XER(r4)
 + ld  r7, VCPU_XER(r4)
  
   mtctr   r6
   mtxer   r7
 @@ -1103,7 +1103,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
   mfctr   r3
   mfxer   r4
   std r3, VCPU_CTR(r9)
 - stw r4, VCPU_XER(r9)
 + std r4, VCPU_XER(r9)
  
   /* If this is a page table miss then see if it's theirs or ours */
   cmpwi   r12, BOOK3S_INTERRUPT_H_DATA_STORAGE
 @@ -1675,7 +1675,7 @@ kvmppc_hdsi:
   bl  kvmppc_msr_interrupt
  fast_interrupt_c_return:
  6:   ld  r7, VCPU_CTR(r9)
 - lwz r8, VCPU_XER(r9)
 + ld  r8, VCPU_XER(r9)
   mtctr   r7
   mtxer   r8
   mr  r4, r9
 diff --git a/arch/powerpc/kvm/book3s_segment.S 
 b/arch/powerpc/kvm/book3s_segment.S
 index acee37c..ca8f174 100644
 --- a/arch/powerpc/kvm/book3s_segment.S
 +++ b/arch/powerpc/kvm/book3s_segment.S
 @@ -123,7 +123,7 @@ no_dcbz32_on:
   PPC_LL  r8, SVCPU_CTR(r3)
   PPC_LL  r9, SVCPU_LR(r3)
   lwz r10, SVCPU_CR(r3)
 - lwz r11, SVCPU_XER(r3)
 + PPC_LL  r11, SVCPU_XER(r3)
  
   mtctr   r8
   mtlrr9
 @@ -237,7 +237,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
   mfctr   r8
   mflrr9
  
 - stw r5, SVCPU_XER(r13)
 + PPC_STL r5, SVCPU_XER(r13)
   PPC_STL r6, SVCPU_FAULT_DAR(r13)
   stw r7, SVCPU_FAULT_DSISR(r13)
   PPC_STL r8, SVCPU_CTR(r13)
 


Re: [PATCH] KVM: PPC: fix suspicious use of conditional operator

2015-05-25 Thread Alexander Graf


On 25.05.15 10:48, Laurentiu Tudor wrote:
 This was signaled by a static code analysis tool.
 
 Signed-off-by: Laurentiu Tudor laurentiu.tu...@freescale.com
 Reviewed-by: Scott Wood scottw...@freescale.com

Thanks, applied to kvm-ppc-queue.


Alex


Re: [PATCH] KVM: PPC: Remove PPC970 from KVM_BOOK3S_64_HV text in Kconfig

2015-05-25 Thread Alexander Graf


On 22.05.15 11:41, Thomas Huth wrote:
 Since the PPC970 support has been removed from the kvm-hv kernel
 module recently, we should also reflect this change in the help
 text of the corresponding Kconfig option.
 
 Signed-off-by: Thomas Huth th...@redhat.com

Thanks, applied to kvm-ppc-queue.


Alex


Re: [PATCH] KVM: PPC: Fix warnings from sparse

2015-05-25 Thread Alexander Graf


On 22.05.15 09:25, Thomas Huth wrote:
 When compiling the KVM code for POWER with make C=1, sparse
 complains about functions missing proper prototypes and a 64-bit
 constant missing the ULL prefix. Let's fix this by making the
 functions static or by including the proper header with the
 prototypes, and by appending a ULL prefix to the constant
 PPC_MPPE_ADDRESS_MASK.
 
 Signed-off-by: Thomas Huth th...@redhat.com

Thanks, applied to kvm-ppc-queue.

Alex


Re: [PATCH] KVM: PPC: check for lookup_linux_ptep() returning NULL

2015-05-25 Thread Alexander Graf


On 21.05.15 21:37, Scott Wood wrote:
 On Thu, 2015-05-21 at 16:26 +0300, Laurentiu Tudor wrote:
 If passed a larger page size lookup_linux_ptep()
 may fail, so add a check for that and bail out
 if that's the case.
 This was found with the help of a static
 code analysis tool.

 Signed-off-by: Mihai Caraman mihai.cara...@freescale.com
 Signed-off-by: Laurentiu Tudor laurentiu.tu...@freescale.com
 Cc: Scott Wood scottw...@freescale.com
 ---
 based on https://github.com/agraf/linux-2.6.git kvm-ppc-next

  arch/powerpc/kvm/e500_mmu_host.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
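
The shape of the guard being added is roughly (a sketch; the exact context
and error path in e500_mmu_host.c differ):

	ptep = lookup_linux_ptep(pgdir, hva, &tsize_pages);
	if (!ptep)
		return -EINVAL;	/* bail out instead of dereferencing NULL */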
 
 Reviewed-by: Scott Wood scottw...@freescale.com

Thanks, applied to kvm-ppc-queue.


Alex


Re: [PATCH 1/1] KVM: PPC: Book3S: correct width in XER handling

2015-05-25 Thread Alexander Graf


On 26.05.15 02:14, Sam Bobroff wrote:
 On Mon, May 25, 2015 at 11:08:08PM +0200, Alexander Graf wrote:


 On 20.05.15 07:26, Sam Bobroff wrote:
 In 64 bit kernels, the Fixed Point Exception Register (XER) is a 64
 bit field (e.g. in kvm_regs and kvm_vcpu_arch) and in most places it is
 accessed as such.

 This patch corrects places where it is accessed as a 32 bit field by a
 64 bit kernel.  In some cases this is via a 32 bit load or store
 instruction which, depending on endianness, will cause either the
 lower or upper 32 bits to be missed.  In another case it is cast as a
 u32, causing the upper 32 bits to be cleared.

 This patch corrects those places by extending the access methods to
 64 bits.

 Signed-off-by: Sam Bobroff sam.bobr...@au1.ibm.com
 ---

  arch/powerpc/include/asm/kvm_book3s.h   |4 ++--
  arch/powerpc/kvm/book3s_hv_rmhandlers.S |6 +++---
  arch/powerpc/kvm/book3s_segment.S   |4 ++--
  3 files changed, 7 insertions(+), 7 deletions(-)

 diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
 b/arch/powerpc/include/asm/kvm_book3s.h
 index b91e74a..05a875a 100644
 --- a/arch/powerpc/include/asm/kvm_book3s.h
 +++ b/arch/powerpc/include/asm/kvm_book3s.h
 @@ -225,12 +225,12 @@ static inline u32 kvmppc_get_cr(struct kvm_vcpu *vcpu)
	return vcpu->arch.cr;
  }
  
 -static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, u32 val)
 +static inline void kvmppc_set_xer(struct kvm_vcpu *vcpu, ulong val)
  {
	vcpu->arch.xer = val;
  }
  
 -static inline u32 kvmppc_get_xer(struct kvm_vcpu *vcpu)
 +static inline ulong kvmppc_get_xer(struct kvm_vcpu *vcpu)
  {
	return vcpu->arch.xer;
  }
 diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
 b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 index 4d70df2..d75be59 100644
 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
 @@ -870,7 +870,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
 blt hdec_soon
  
 ld  r6, VCPU_CTR(r4)
 -   lwz r7, VCPU_XER(r4)
 +   ld  r7, VCPU_XER(r4)
  
 mtctr   r6
 mtxer   r7
 @@ -1103,7 +1103,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 mfctr   r3
 mfxer   r4
 std r3, VCPU_CTR(r9)
 -   stw r4, VCPU_XER(r9)
 +   std r4, VCPU_XER(r9)
  
 /* If this is a page table miss then see if it's theirs or ours */
 cmpwi   r12, BOOK3S_INTERRUPT_H_DATA_STORAGE
 @@ -1675,7 +1675,7 @@ kvmppc_hdsi:
 bl  kvmppc_msr_interrupt
  fast_interrupt_c_return:
  6: ld  r7, VCPU_CTR(r9)
 -   lwz r8, VCPU_XER(r9)
 +   ld  r8, VCPU_XER(r9)
 mtctr   r7
 mtxer   r8
 mr  r4, r9
 diff --git a/arch/powerpc/kvm/book3s_segment.S 
 b/arch/powerpc/kvm/book3s_segment.S
 index acee37c..ca8f174 100644
 --- a/arch/powerpc/kvm/book3s_segment.S
 +++ b/arch/powerpc/kvm/book3s_segment.S
 @@ -123,7 +123,7 @@ no_dcbz32_on:
 PPC_LL  r8, SVCPU_CTR(r3)
 PPC_LL  r9, SVCPU_LR(r3)
 lwz r10, SVCPU_CR(r3)
 -   lwz r11, SVCPU_XER(r3)
 +   PPC_LL  r11, SVCPU_XER(r3)

 struct kvmppc_book3s_shadow_vcpu {
 bool in_use;
 ulong gpr[14];
 u32 cr;
 u32 xer;
 [...]

 so at least this change looks wrong. Please double-check all fields in
 your patch again.


 Alex
 
 Thanks for the review and the catch!
 
 The xer field in kvm_vcpu_arch is already ulong, so it looks like the one in
 kvmppc_book3s_shadow_vcpu is the only other case. I'll fix that and repost.

I guess, given that the one in pt_regs is also ulong, going ulong rather
than u32 is the better choice, yes.

While at it, could you please just do a grep -i xer across all kvm (.c
and .h) files and just sanity check that we're staying in sync?


Thanks!

Alex


Re: [PATCH] KVM: PPC: add missing pt_regs initialization

2015-05-25 Thread Alexander Graf


On 18.05.15 14:44, Laurentiu Tudor wrote:
 On this switch branch the regs initialization
 doesn't happen so add it.
 This was found with the help of a static
 code analysis tool.
 
 Signed-off-by: Laurentiu Tudor laurentiu.tu...@freescale.com
 Cc: Scott Wood scottw...@freescale.com
 Cc: Mihai Caraman mihai.cara...@freescale.com

Thanks, applied to kvm-ppc-queue.


Alex


Re: [PATCH] KVM: PPC: Book3S HV: Fix list traversal in error case

2015-05-09 Thread Alexander Graf


On 29.04.15 06:49, Paul Mackerras wrote:
 This fixes a regression introduced in commit 25fedfca94cf, KVM: PPC:
 Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu, which
 leads to a user-triggerable oops.
 
 In the case where we try to run a vcore on a physical core that is
 not in single-threaded mode, or the vcore has too many threads for
 the physical core, we iterate the list of runnable vcpus to make
 each one return an EBUSY error to userspace.  Since this involves
 taking each vcpu off the runnable_threads list for the vcore, we
 need to use list_for_each_entry_safe rather than list_for_each_entry
 to traverse the list.  Otherwise the kernel will crash with an oops
 message like this:
 
 Unable to handle kernel paging request for data at address 0x000fff88
 Faulting instruction address: 0xd0001e635dc8
 Oops: Kernel access of bad area, sig: 11 [#2]
 SMP NR_CPUS=1024 NUMA PowerNV
 ...
 CPU: 48 PID: 91256 Comm: qemu-system-ppc Tainted: G  D3.18.0 #1
 task: c0274e507500 ti: c027d1924000 task.ti: c027d1924000
 NIP: d0001e635dc8 LR: d0001e635df8 CTR: c011ba50
 REGS: c027d19275b0 TRAP: 0300   Tainted: G  D (3.18.0)
 MSR: 90009033 SF,HV,EE,ME,IR,DR,RI,LE  CR: 22002824  XER: 
 CFAR: c0008468 DAR: 000fff88 DSISR: 4000 SOFTE: 1
 GPR00: d0001e635df8 c027d1927830 d0001e64c850 0001
 GPR04: 0001 0001  
 GPR08: 00200200   d0001e63e588
 GPR12: 2200 c7dbc800 c00fc780 000a
 GPR16: fffc c00fd5439690 c00fc7801c98 0001
 GPR20: 0003 c027d1927aa8 c00fd543b348 c00fd543b350
 GPR24:  c00fa57f 0030 
 GPR28: fff0 c00fd543b328 000fe468 c00fd543b300
 NIP [d0001e635dc8] kvmppc_run_core+0x198/0x17c0 [kvm_hv]
 LR [d0001e635df8] kvmppc_run_core+0x1c8/0x17c0 [kvm_hv]
 Call Trace:
 [c027d1927830] [d0001e635df8] kvmppc_run_core+0x1c8/0x17c0 [kvm_hv] 
 (unreliable)
 [c027d1927a30] [d0001e638350] kvmppc_vcpu_run_hv+0x5b0/0xdd0 [kvm_hv]
 [c027d1927b70] [d0001e510504] kvmppc_vcpu_run+0x44/0x60 [kvm]
 [c027d1927ba0] [d0001e50d4a4] kvm_arch_vcpu_ioctl_run+0x64/0x170 [kvm]
 [c027d1927be0] [d0001e504be8] kvm_vcpu_ioctl+0x5e8/0x7a0 [kvm]
 [c027d1927d40] [c02d6720] do_vfs_ioctl+0x490/0x780
 [c027d1927de0] [c02d6ae4] SyS_ioctl+0xd4/0xf0
 [c027d1927e30] [c0009358] syscall_exit+0x0/0x98
 Instruction dump:
 6000 6042 387e1b30 3883 38a1 38c0 480087d9 e8410018
 ebde1c98 7fbdf040 3bdee368 419e0048 813e1b20 939e1b18 2f890001 409effcc
 ---[ end trace 8cdf50251cca6680 ]---
 
 Fixes: 25fedfca94cf
 Signed-off-by: Paul Mackerras pau...@samba.org
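
The fix boils down to the safe-iteration pattern (a sketch, assuming the
runnable_threads / arch.run_list names used in book3s_hv.c):

	struct kvm_vcpu *vcpu, *vnext;

	list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
				 arch.run_list) {
		/* removal unlinks vcpu, so a plain list_for_each_entry
		 * would chase a poisoned next pointer afterwards */
		kvmppc_remove_runnable(vc, vcpu);
		vcpu->arch.ret = -EBUSY;
	}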

Reviewed-by: Alexander Graf ag...@suse.de

Paolo, can you please take this patch into 4.1 directly?


Thanks a lot,

Alex


Re: [PATCH] KVM: s390: remove delayed reallocation of page tables for KVM

2015-04-27 Thread Alexander Graf

On 04/27/2015 03:57 PM, Martin Schwidefsky wrote:

On Mon, 27 Apr 2015 15:48:42 +0200
Alexander Graf ag...@suse.de wrote:


On 04/23/2015 02:13 PM, Martin Schwidefsky wrote:

On Thu, 23 Apr 2015 14:01:23 +0200
Alexander Graf ag...@suse.de wrote:


As far as alternative approaches go, I don't have a great idea otoh.
We could have an elf flag indicating that this process needs 4k page
tables to limit the impact to a single process. In fact, could we
maybe still limit the scope to non-global? A personality may work
as well. Or ulimit?

I tried the ELF flag approach, does not work. The trouble is that
allocate_mm() has to create the page tables with 4K tables if you
want to change the page table layout later on. We have learned the
hard way that the direction 2K to 4K does not work due to races
in the mm.

Now there are two major cases: 1) fork + execve and 2) fork only.
The ELF flag can be used to reduce from 4K to 2K for 1) but not 2).
2) is required for apps that use lots of forking, e.g. database or
web servers. Same goes for the approach with a personality flag or
ulimit.

We would have to distinguish the two cases for allocate_mm(),
if the new mm is allocated for a fork the current mm decides
2K vs. 4K. If the new mm is allocated by binfmt_elf, then start
with 4K and do the downgrade after the ELF flag has been evaluated.

Well, you could also make it a personality flag for example, no? Then
every new process below a certain one always gets 4k page tables until
they drop the personality, at which point each child would only get 2k
page tables again.

I'm mostly concerned that people will end up mixing VMs and other
workloads on the same LPAR, so I don't think there's a one-shoe-fits-all
solution.

If I add an argument to mm_init() to indicate if this context
is for fork() or execve() then the ELF header flag approach works.


So you don't need the sysctl?


Alex


Re: [PATCH] KVM: s390: remove delayed reallocation of page tables for KVM

2015-04-27 Thread Alexander Graf

On 04/23/2015 02:13 PM, Martin Schwidefsky wrote:

On Thu, 23 Apr 2015 14:01:23 +0200
Alexander Graf ag...@suse.de wrote:


As far as alternative approaches go, I don't have a great idea otoh.
We could have an elf flag indicating that this process needs 4k page
tables to limit the impact to a single process. In fact, could we
maybe still limit the scope to non-global? A personality may work
as well. Or ulimit?

I tried the ELF flag approach, does not work. The trouble is that
allocate_mm() has to create the page tables with 4K tables if you
want to change the page table layout later on. We have learned the
hard way that the direction 2K to 4K does not work due to races
in the mm.

Now there are two major cases: 1) fork + execve and 2) fork only.
The ELF flag can be used to reduce from 4K to 2K for 1) but not 2).
2) is required for apps that use lots of forking, e.g. database or
web servers. Same goes for the approach with a personality flag or
ulimit.

We would have to distinguish the two cases for allocate_mm(),
if the new mm is allocated for a fork the current mm decides
2K vs. 4K. If the new mm is allocated by binfmt_elf, then start
with 4K and do the downgrade after the ELF flag has been evaluated.


Well, you could also make it a personality flag for example, no? Then 
every new process below a certain one always gets 4k page tables until 
they drop the personality, at which point each child would only get 2k 
page tables again.


I'm mostly concerned that people will end up mixing VMs and other 
workloads on the same LPAR, so I don't think there's a one-shoe-fits-all 
solution.



Alex



Re: [PATCH] KVM: s390: remove delayed reallocation of page tables for KVM

2015-04-27 Thread Alexander Graf

On 04/23/2015 02:08 PM, Christian Borntraeger wrote:

Am 23.04.2015 um 14:01 schrieb Alexander Graf:



Am 23.04.2015 um 13:43 schrieb Christian Borntraeger borntrae...@de.ibm.com:


Am 23.04.2015 um 13:37 schrieb Alexander Graf:



Am 23.04.2015 um 13:08 schrieb Christian Borntraeger borntrae...@de.ibm.com:

From: Martin Schwidefsky schwidef...@de.ibm.com

Replacing a 2K page table with a 4K page table while a VMA is active
for the affected memory region is fundamentally broken. Rip out the
page table reallocation code and replace it with a simple system
control 'vm.allocate_pgste'. If the system control is set the page
tables for all processes are allocated as full 4K pages, even for
processes that do not need it.

Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com

Couldn't you make this a hidden kconfig option that gets automatically selected 
when kvm is enabled? Or is there a non-kvm case that needs it too?

For things like RHEV the default could certainly be enabled, but for normal
distros like SLES/RHEL, the idea was to NOT enable that by default, as the 
non-KVM
case is more common and might suffer from the additional memory consumption of
the page tables. (big databases come to mind)

We could think about having rpms like kvm to provide a sysctl file that sets it 
if we
want to minimize the impact. Other ideas?

Oh, I'm sorry, I misread the ifdef. I don't think it makes sense to have a 
config option for the default value then, just rely only on sysctl.conf for 
changed defaults.

As far as mechanisms to change it go, every distribution has their own ways of dealing 
with this. RH has a profile thing, we don't really have anything central, but 
individual sysctl.d files for example that a kvm package could provide.
Either way, the default choosing shouldn't happen in .config ;).

So you vote for getting rid of the Kconfig?

Also, please add some helpful error message in qemu to guide users to the 
sysctl.

Yes, we will provide a qemu patch (cc stable) after this hits the kernel.


As far as alternative approaches go, I don't have a great idea otoh. We could 
have an elf flag indicating that this process needs 4k page tables to limit the 
impact to a single process.

This approach was actually Martin's first fix. The problem is that the decision 
takes place on execve,
but we need an answer at fork time. So we always started with 4k page tables 
and freed the 2nd half on
execve. Now this did not work for processes that only fork (without execve).


In fact, could we maybe still limit the scope to non-global? A personality may 
work as well. Or ulimit?

I think we will go for now with the sysctl and see if we can come up with some 
automatic way as additional
patch later on.


Sounds perfectly reasonable to me. You can for example also just set the 
sysctl bit in libvirtd :).
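
Such a drop-in could be as small as this (hypothetical path and file name):

	# /usr/lib/sysctl.d/30-kvm-s390.conf, shipped by a kvm package
	vm.allocate_pgste = 1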



Alex



Re: [PATCH] KVM: s390: remove delayed reallocation of page tables for KVM

2015-04-23 Thread Alexander Graf


 Am 23.04.2015 um 13:43 schrieb Christian Borntraeger borntrae...@de.ibm.com:
 
 Am 23.04.2015 um 13:37 schrieb Alexander Graf:
 
 
 Am 23.04.2015 um 13:08 schrieb Christian Borntraeger 
 borntrae...@de.ibm.com:
 
 From: Martin Schwidefsky schwidef...@de.ibm.com
 
 Replacing a 2K page table with a 4K page table while a VMA is active
 for the affected memory region is fundamentally broken. Rip out the
 page table reallocation code and replace it with a simple system
 control 'vm.allocate_pgste'. If the system control is set the page
 tables for all processes are allocated as full 4K pages, even for
 processes that do not need it.
 
 Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
 Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
 
 Couldn't you make this a hidden kconfig option that gets automatically 
 selected when kvm is enabled? Or is there a non-kvm case that needs it too?
 
 For things like RHEV the default could certainly be enabled, but for normal
 distros like SLES/RHEL, the idea was to NOT enable that by default, as the 
 non-KVM
 case is more common and might suffer from the additional memory consumption of
 the page tables. (big databases come to mind)
 
 We could think about having rpms like kvm to provide a sysctl file that sets 
 it if we
 want to minimize the impact. Other ideas?

Oh, I'm sorry, I misread the ifdef. I don't think it makes sense to have a 
config option for the default value then, just rely only on sysctl.conf for 
changed defaults.

As far as mechanisms to change it go, every distribution has their own ways of 
dealing with this. RH has a profile thing, we don't really have anything 
central, but individual sysctl.d files for example that a kvm package could 
provide.

Either way, the default choosing shouldn't happen in .config ;). Also, please 
add some helpful error message in qemu to guide users to the sysctl.

As far as alternative approaches go, I don't have a great idea otoh. We could 
have an elf flag indicating that this process needs 4k page tables to limit the 
impact to a single process. In fact, could we maybe still limit the scope to 
non-global? A personality may work as well. Or ulimit?


Alex



Re: [PATCH] KVM: s390: remove delayed reallocation of page tables for KVM

2015-04-23 Thread Alexander Graf


 Am 23.04.2015 um 13:08 schrieb Christian Borntraeger borntrae...@de.ibm.com:
 
 From: Martin Schwidefsky schwidef...@de.ibm.com
 
 Replacing a 2K page table with a 4K page table while a VMA is active
 for the affected memory region is fundamentally broken. Rip out the
 page table reallocation code and replace it with a simple system
 control 'vm.allocate_pgste'. If the system control is set the page
 tables for all processes are allocated as full 4K pages, even for
 processes that do not need it.
 
 Signed-off-by: Martin Schwidefsky schwidef...@de.ibm.com
 Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com

Couldn't you make this a hidden kconfig option that gets automatically selected 
when kvm is enabled? Or is there a non-kvm case that needs it too?


Alex



[PULL 02/21] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-04-21 Thread Alexander Graf
From: David Gibson da...@gibson.dropbear.id.au

On POWER, storage caching is usually configured via the MMU - attributes
such as cache-inhibited are stored in the TLB and the hashed page table.

This makes correctly performing cache inhibited IO accesses awkward when
the MMU is turned off (real mode).  Some CPU models provide special
registers to control the cache attributes of real mode load and stores but
this is not at all consistent.  This is a problem in particular for SLOF,
the firmware used on KVM guests, which runs entirely in real mode, but
which needs to do IO to load the kernel.

To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
a logical address (aka guest physical address).  SLOF uses these for IO.

However, because these are implemented within qemu, not the host kernel,
these bypass any IO devices emulated within KVM itself.  The simplest way
to see this problem is to attempt to boot a KVM guest from a virtio-blk
device with iothread / dataplane enabled.  The iothread code relies on an
in kernel implementation of the virtio queue notification, which is not
triggered by the IO hcalls, and so the guest will stall in SLOF unable to
load the guest OS.

This patch addresses this by providing in-kernel implementations of the
2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
address not handled by the KVM IO bus will cause a VM exit, hitting the
qemu implementation as before.

Note that a userspace change is also required, in order to enable these
new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
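
From userspace this amounts to one KVM_ENABLE_CAP ioctl per hcall on the VM
fd, roughly (a sketch, error handling omitted; vm_fd is assumed to be an
already-created VM file descriptor):

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_PPC_ENABLE_HCALL,
		.args = { H_LOGICAL_CI_LOAD, 1 },	/* hcall number, enable */
	};

	ioctl(vm_fd, KVM_ENABLE_CAP, &cap);	/* likewise for H_LOGICAL_CI_STORE */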

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
[agraf: fix compilation]
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |  3 ++
 arch/powerpc/kvm/book3s.c | 76 +++
 arch/powerpc/kvm/book3s_hv.c  | 12 ++
 arch/powerpc/kvm/book3s_pr_papr.c | 28 +
 4 files changed, 119 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 942c7b1..578e550 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -292,6 +292,9 @@ static inline bool kvmppc_supports_magic_page(struct 
kvm_vcpu *vcpu)
	return !is_kvmppc_hv_enabled(vcpu->kvm);
 }
 
+extern int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu);
+extern int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu);
+
 /* Magic register values loaded into r3 and r4 before the 'sc' assembly
  * instruction for the OSI hypercalls */
 #define OSI_SC_MAGIC_R3	0x113724FA
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index cfbcdc6..453a8a4 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -821,6 +821,82 @@ void kvmppc_core_destroy_vm(struct kvm *kvm)
 #endif
 }
 
+int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu)
+{
+   unsigned long size = kvmppc_get_gpr(vcpu, 4);
+   unsigned long addr = kvmppc_get_gpr(vcpu, 5);
+   u64 buf;
+   int ret;
+
+	if (!is_power_of_2(size) || (size > sizeof(buf)))
+		return H_TOO_HARD;
+
+	ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, addr, size, &buf);
+   if (ret != 0)
+   return H_TOO_HARD;
+
+   switch (size) {
+   case 1:
+		kvmppc_set_gpr(vcpu, 4, *(u8 *)&buf);
+   break;
+
+   case 2:
+		kvmppc_set_gpr(vcpu, 4, be16_to_cpu(*(__be16 *)&buf));
+   break;
+
+   case 4:
+		kvmppc_set_gpr(vcpu, 4, be32_to_cpu(*(__be32 *)&buf));
+   break;
+
+   case 8:
+		kvmppc_set_gpr(vcpu, 4, be64_to_cpu(*(__be64 *)&buf));
+   break;
+
+   default:
+   BUG();
+   }
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_load);
+
+int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu)
+{
+   unsigned long size = kvmppc_get_gpr(vcpu, 4);
+   unsigned long addr = kvmppc_get_gpr(vcpu, 5);
+   unsigned long val = kvmppc_get_gpr(vcpu, 6);
+   u64 buf;
+   int ret;
+
+   switch (size) {
+   case 1:
+		*(u8 *)&buf = val;
+   break;
+
+   case 2:
+		*(__be16 *)&buf = cpu_to_be16(val);
+   break;
+
+   case 4:
+		*(__be32 *)&buf = cpu_to_be32(val);
+   break;
+
+   case 8:
+		*(__be64 *)&buf = cpu_to_be64(val);
+   break;
+
+   default:
+   return H_TOO_HARD;
+   }
+
+	ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, addr, size, &buf);
+   if (ret != 0)
+   return H_TOO_HARD;
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_store);
+
 int kvmppc_core_check_processor_compat(void)
 {
/*
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index de74756

[PULL 11/21] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This reads the timebase at various points in the real-mode guest
entry/exit code and uses that to accumulate total, minimum and
maximum time spent in those parts of the code.  Currently these
times are accumulated per vcpu in 5 parts of the code:

* rm_entry - time taken from the start of kvmppc_hv_entry() until
  just before entering the guest.
* rm_intr - time from when we take a hypervisor interrupt in the
  guest until we either re-enter the guest or decide to exit to the
  host.  This includes time spent handling hcalls in real mode.
* rm_exit - time from when we decide to exit the guest until the
  return from kvmppc_hv_entry().
* guest - time spend in the guest
* cede - time spent napping in real mode due to an H_CEDE hcall
  while other threads in the same vcore are active.

These times are exposed in debugfs in a directory per vcpu that
contains a file called timings.  This file contains one line for
each of the 5 timings above, with the name followed by a colon and
4 numbers, which are the count (number of times the code has been
executed), the total time, the minimum time, and the maximum time,
all in nanoseconds.
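
A timings file would therefore read something like this (hypothetical
numbers):

	rm_entry: 1008969 3418398942 228 86600
	rm_intr: 3600051 5976084660 12 553050
	rm_exit: 1009402 4604387023 12 3477452
	guest: 1009402 921283742190 1100 52000000
	cede: 210310 118298374652 2400 31000000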

The overhead of the extra code amounts to about 30ns for an hcall that
is handled in real mode (e.g. H_SET_DABR), which is about 25%.  Since
production environments may not wish to incur this overhead, the new
code is conditional on a new config symbol,
CONFIG_KVM_BOOK3S_HV_EXIT_TIMING.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  21 +
 arch/powerpc/include/asm/time.h |   3 +
 arch/powerpc/kernel/asm-offsets.c   |  13 +++
 arch/powerpc/kernel/time.c  |   6 ++
 arch/powerpc/kvm/Kconfig|  14 +++
 arch/powerpc/kvm/book3s_hv.c| 150 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 141 +-
 7 files changed, 346 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index f1d0bbc..d2068bb 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -369,6 +369,14 @@ struct kvmppc_slb {
u8 base_page_size;  /* MMU_PAGE_xxx */
 };
 
+/* Struct used to accumulate timing information in HV real mode code */
+struct kvmhv_tb_accumulator {
+   u64 seqcount;   /* used to synchronize access, also count * 2 */
+   u64 tb_total;   /* total time in timebase ticks */
+   u64 tb_min; /* min time */
+   u64 tb_max; /* max time */
+};
+
 # ifdef CONFIG_PPC_FSL_BOOK3E
 #define KVMPPC_BOOKE_IAC_NUM   2
 #define KVMPPC_BOOKE_DAC_NUM   2
@@ -657,6 +665,19 @@ struct kvm_vcpu_arch {
 
u32 emul_inst;
 #endif
+
+#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
+   struct kvmhv_tb_accumulator *cur_activity;  /* What we're timing */
+   u64 cur_tb_start;   /* when it started */
+   struct kvmhv_tb_accumulator rm_entry;   /* real-mode entry code */
+   struct kvmhv_tb_accumulator rm_intr;/* real-mode intr handling */
+   struct kvmhv_tb_accumulator rm_exit;/* real-mode exit code */
+   struct kvmhv_tb_accumulator guest_time; /* guest execution */
+   struct kvmhv_tb_accumulator cede_time;  /* time napping inside guest */
+
+   struct dentry *debugfs_dir;
+   struct dentry *debugfs_timings;
+#endif /* CONFIG_KVM_BOOK3S_HV_EXIT_TIMING */
 };
 
 #define VCPU_FPR(vcpu, i)	(vcpu)->arch.fp.fpr[i][TS_FPROFFSET]
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 03cbada..10fc784 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -211,5 +211,8 @@ extern void secondary_cpu_time_init(void);
 
 DECLARE_PER_CPU(u64, decrementers_next_tb);
 
+/* Convert timebase ticks to nanoseconds */
+unsigned long long tb_to_ns(unsigned long long tb_ticks);
+
 #endif /* __KERNEL__ */
 #endif /* __POWERPC_TIME_H */
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 4717859..3fea721 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -459,6 +459,19 @@ int main(void)
DEFINE(VCPU_SPRG2, offsetof(struct kvm_vcpu, arch.shregs.sprg2));
DEFINE(VCPU_SPRG3, offsetof(struct kvm_vcpu, arch.shregs.sprg3));
 #endif
+#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
+   DEFINE(VCPU_TB_RMENTRY, offsetof(struct kvm_vcpu, arch.rm_entry));
+   DEFINE(VCPU_TB_RMINTR, offsetof(struct kvm_vcpu, arch.rm_intr));
+   DEFINE(VCPU_TB_RMEXIT, offsetof(struct kvm_vcpu, arch.rm_exit));
+   DEFINE(VCPU_TB_GUEST, offsetof(struct kvm_vcpu, arch.guest_time));
+   DEFINE(VCPU_TB_CEDE, offsetof(struct kvm_vcpu, arch.cede_time));
+   DEFINE(VCPU_CUR_ACTIVITY, offsetof(struct kvm_vcpu, arch.cur_activity));
+   DEFINE

[PULL 21/21] KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This uses msgsnd where possible for signalling other threads within
the same core on POWER8 systems, rather than IPIs through the XICS
interrupt controller.  This includes waking secondary threads to run
the guest, the interrupts generated by the virtual XICS, and the
interrupts to bring the other threads out of the guest when exiting.

Aggregated statistics from debugfs across vcpus for a guest with 32
vcpus, 8 threads/vcore, running on a POWER8, show this before the
change:

 rm_entry: 3387.6ns (228 - 86600, 1008969 samples)
  rm_exit: 4561.5ns (12 - 3477452, 1009402 samples)
  rm_intr: 1660.0ns (12 - 553050, 3600051 samples)

and this after the change:

 rm_entry: 3060.1ns (212 - 65138, 953873 samples)
  rm_exit: 4244.1ns (12 - 9693408, 954331 samples)
  rm_intr: 1342.3ns (12 - 1104718, 3405326 samples)

for a test of booting Fedora 20 big-endian to the login prompt.

The time taken for a H_PROD hcall (which is handled in the host
kernel) went down from about 35 microseconds to about 16 microseconds
with this change.

The noinline added to kvmppc_run_core turned out to be necessary for
good performance, at least with gcc 4.9.2 as packaged with Fedora 21
and a little-endian POWER8 host.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kernel/asm-offsets.c   |  3 ++
 arch/powerpc/kvm/book3s_hv.c| 51 ++---
 arch/powerpc/kvm/book3s_hv_builtin.c| 16 +--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 22 --
 4 files changed, 70 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 0d07efb..0034b6b 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -37,6 +37,7 @@
 #include <asm/thread_info.h>
 #include <asm/rtas.h>
 #include <asm/vdso_datapage.h>
+#include <asm/dbell.h>
 #ifdef CONFIG_PPC64
 #include <asm/paca.h>
 #include <asm/lppaca.h>
@@ -759,5 +760,7 @@ int main(void)
offsetof(struct paca_struct, subcore_sibling_mask));
 #endif
 
+   DEFINE(PPC_DBELL_SERVER, PPC_DBELL_SERVER);
+
return 0;
 }
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index ea1600f..48d3c5d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -51,6 +51,7 @@
 #include <asm/hvcall.h>
 #include <asm/switch_to.h>
 #include <asm/smp.h>
+#include <asm/dbell.h>
 #include <linux/gfp.h>
 #include <linux/vmalloc.h>
 #include <linux/highmem.h>
@@ -84,9 +85,35 @@ static DECLARE_BITMAP(default_enabled_hcalls, 
MAX_HCALL_OPCODE/4 + 1);
 static void kvmppc_end_cede(struct kvm_vcpu *vcpu);
 static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
 
+static bool kvmppc_ipi_thread(int cpu)
+{
+   /* On POWER8 for IPIs to threads in the same core, use msgsnd */
+   if (cpu_has_feature(CPU_FTR_ARCH_207S)) {
+   preempt_disable();
+   if (cpu_first_thread_sibling(cpu) ==
+   cpu_first_thread_sibling(smp_processor_id())) {
+   unsigned long msg = PPC_DBELL_TYPE(PPC_DBELL_SERVER);
+   msg |= cpu_thread_in_core(cpu);
+   smp_mb();
+			__asm__ __volatile__ (PPC_MSGSND(%0) : : "r" (msg));
+   preempt_enable();
+   return true;
+   }
+   preempt_enable();
+   }
+
+#if defined(CONFIG_PPC_ICP_NATIVE)  defined(CONFIG_SMP)
+	if (cpu >= 0 && cpu < nr_cpu_ids && paca[cpu].kvm_hstate.xics_phys) {
+   xics_wake_cpu(cpu);
+   return true;
+   }
+#endif
+
+   return false;
+}
+
 static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
 {
-   int me;
	int cpu = vcpu->cpu;
wait_queue_head_t *wqp;
 
@@ -96,20 +123,12 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu)
	++vcpu->stat.halt_wakeup;
}
 
-   me = get_cpu();
+	if (kvmppc_ipi_thread(cpu + vcpu->arch.ptid))
+   return;
 
/* CPU points to the first thread of the core */
-	if (cpu != me && cpu >= 0 && cpu < nr_cpu_ids) {
-#ifdef CONFIG_PPC_ICP_NATIVE
-		int real_cpu = cpu + vcpu->arch.ptid;
-   if (paca[real_cpu].kvm_hstate.xics_phys)
-   xics_wake_cpu(real_cpu);
-   else
-#endif
-   if (cpu_online(cpu))
-   smp_send_reschedule(cpu);
-   }
-   put_cpu();
+	if (cpu >= 0 && cpu < nr_cpu_ids && cpu_online(cpu))
+   smp_send_reschedule(cpu);
 }
 
 /*
@@ -1781,10 +1800,8 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu)
/* Order stores to hstate.kvm_vcore etc. before store to kvm_vcpu */
smp_wmb();
	tpaca->kvm_hstate.kvm_vcpu = vcpu;
-#if defined(CONFIG_PPC_ICP_NATIVE)  defined(CONFIG_SMP)
if (cpu != smp_processor_id

[PULL 00/21] ppc patch queue 2015-04-21 for 4.1

2015-04-21 Thread Alexander Graf
Hi Paolo / Marcelo,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit b79013b2449c23f1f505bdf39c5a6c330338b244:

  Merge tag 'staging-4.1-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging (2015-04-13 
17:37:33 -0700)

are available in the git repository at:


  git://github.com/agraf/linux-2.6.git tags/signed-kvm-ppc-queue

for you to fetch changes up to 66feed61cdf6ee65fd551d3460b1efba6bee55b8:

  KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8 (2015-04-21 
15:21:34 +0200)


Patch queue for ppc - 2015-04-21

This is the latest queue for KVM on PowerPC changes. Highlights this
time around:

  - Book3S HV: Debugging aids
  - Book3S HV: Minor performance improvements
  - Book3S HV: Cleanups


Aneesh Kumar K.V (2):
  KVM: PPC: Book3S HV: Remove RMA-related variables from code
  KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte

David Gibson (1):
  kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

Michael Ellerman (1):
  KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.

Paul Mackerras (12):
  KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT
  KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
  KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update
  KVM: PPC: Book3S HV: Minor cleanups
  KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu
  KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken
  KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI
  KVM: PPC: Book3S HV: Use decrementer to wake napping threads
  KVM: PPC: Book3S HV: Use bitmap of active threads rather than count
  KVM: PPC: Book3S HV: Streamline guest entry and exit
  KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C
  KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8

Suresh E. Warrier (2):
  powerpc: Export __spin_yield
  KVM: PPC: Book3S HV: Add guest-host real mode completion counters

Suresh Warrier (3):
  KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock
  KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode
  KVM: PPC: Book3S HV: Add ICP real mode counters

 Documentation/virtual/kvm/api.txt|  17 +
 arch/powerpc/include/asm/archrandom.h|  11 +-
 arch/powerpc/include/asm/kvm_book3s.h|   3 +
 arch/powerpc/include/asm/kvm_book3s_64.h |  18 +
 arch/powerpc/include/asm/kvm_host.h  |  47 ++-
 arch/powerpc/include/asm/kvm_ppc.h   |   2 +
 arch/powerpc/include/asm/time.h  |   3 +
 arch/powerpc/kernel/asm-offsets.c|  20 +-
 arch/powerpc/kernel/time.c   |   6 +
 arch/powerpc/kvm/Kconfig |  14 +
 arch/powerpc/kvm/book3s.c|  76 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 189 +--
 arch/powerpc/kvm/book3s_hv.c | 435 ++--
 arch/powerpc/kvm/book3s_hv_builtin.c | 100 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  |  25 +-
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 238 +++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 559 +++
 arch/powerpc/kvm/book3s_pr_papr.c|  28 ++
 arch/powerpc/kvm/book3s_xics.c   | 105 --
 arch/powerpc/kvm/book3s_xics.h   |  13 +-
 arch/powerpc/kvm/powerpc.c   |   3 +
 arch/powerpc/lib/locks.c |   1 +
 arch/powerpc/platforms/powernv/rng.c |  29 ++
 include/uapi/linux/kvm.h |   1 +
 virt/kvm/kvm_main.c  |   1 +
 25 files changed, 1580 insertions(+), 364 deletions(-)


[PULL 08/21] KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode

2015-04-21 Thread Alexander Graf
From: Suresh Warrier warr...@linux.vnet.ibm.com

Interrupt-based hypercalls return H_TOO_HARD to inform KVM that it needs
to switch to the host to complete the rest of hypercall function in
virtual mode. This patch ports the virtual mode ICS/ICP reject and resend
functions to be runnable in hypervisor real mode, thus avoiding the need
to switch to the host to execute these functions in virtual mode. However,
the hypercalls continue to return H_TOO_HARD for vcpu_wakeup and notify
events - these events cannot be done in real mode and they will still need
a switch to host virtual mode.

There are sufficient differences between the real mode code and the
virtual mode code for the ICS/ICP resend and reject functions that
for now the code has been duplicated instead of sharing common code.
In the future, we can look at creating common functions.

Signed-off-by: Suresh Warrier warr...@linux.vnet.ibm.com
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 225 ---
 1 file changed, 211 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c 
b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 7c22997..73bbe92 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -23,12 +23,39 @@
 
 #define DEBUG_PASSUP
 
+static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp 
*icp,
+   u32 new_irq);
+
 static inline void rm_writeb(unsigned long paddr, u8 val)
 {
	__asm__ __volatile__("sync; stbcix %0,0,%1"
		: : "r" (val), "r" (paddr) : "memory");
 }
 
+/* -- ICS routines -- */
+static void ics_rm_check_resend(struct kvmppc_xics *xics,
+   struct kvmppc_ics *ics, struct kvmppc_icp *icp)
+{
+   int i;
+
+	arch_spin_lock(&ics->lock);
+
+	for (i = 0; i < KVMPPC_XICS_IRQ_PER_ICS; i++) {
+		struct ics_irq_state *state = &ics->irq_state[i];
+
+		if (!state->resend)
+			continue;
+
+		arch_spin_unlock(&ics->lock);
+		icp_rm_deliver_irq(xics, icp, state->number);
+		arch_spin_lock(&ics->lock);
+	}
+
+	arch_spin_unlock(&ics->lock);
+}
+
+/* -- ICP routines -- */
+
 static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
struct kvm_vcpu *this_vcpu)
 {
@@ -116,6 +143,178 @@ static inline int check_too_hard(struct kvmppc_xics *xics,
	return (xics->real_mode_dbg || icp->rm_action) ? H_TOO_HARD : H_SUCCESS;
 }
 
+static void icp_rm_check_resend(struct kvmppc_xics *xics,
+struct kvmppc_icp *icp)
+{
+   u32 icsid;
+
+   /* Order this load with the test for need_resend in the caller */
+   smp_rmb();
+	for_each_set_bit(icsid, icp->resend_map, xics->max_icsid + 1) {
+		struct kvmppc_ics *ics = xics->ics[icsid];
+
+		if (!test_and_clear_bit(icsid, icp->resend_map))
+   continue;
+   if (!ics)
+   continue;
+   ics_rm_check_resend(xics, ics, icp);
+   }
+}
+
+static bool icp_rm_try_to_deliver(struct kvmppc_icp *icp, u32 irq, u8 priority,
+  u32 *reject)
+{
+   union kvmppc_icp_state old_state, new_state;
+   bool success;
+
+   do {
+		old_state = new_state = READ_ONCE(icp->state);
+
+   *reject = 0;
+
+   /* See if we can deliver */
+   success = new_state.cppr > priority &&
+   new_state.mfrr > priority &&
+   new_state.pending_pri > priority;
+
+   /*
+* If we can, check for a rejection and perform the
+* delivery
+*/
+   if (success) {
+   *reject = new_state.xisr;
+   new_state.xisr = irq;
+   new_state.pending_pri = priority;
+   } else {
+   /*
+* If we failed to deliver we set need_resend
+* so a subsequent CPPR state change causes us
+* to try a new delivery.
+*/
+   new_state.need_resend = true;
+   }
+
+   } while (!icp_rm_try_update(icp, old_state, new_state));
+
+   return success;
+}
+
+static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp 
*icp,
+   u32 new_irq)
+{
+   struct ics_irq_state *state;
+   struct kvmppc_ics *ics;
+   u32 reject;
+   u16 src;
+
+   /*
+* This is used both for initial delivery of an interrupt and
+* for subsequent rejection.
+*
+* Rejection can be racy vs. resends. We have evaluated the
+* rejection in an atomic ICP transaction which is now complete,
+* so

[PULL 05/21] KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte

2015-04-21 Thread Alexander Graf
From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com

This adds helper routines for locking and unlocking HPTEs, and uses
them in the rest of the code.  We don't change any locking rules in
this patch.

Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_64.h | 14 ++
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 25 ++---
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  | 25 +
 3 files changed, 33 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 2d81e20..0789a0f 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -85,6 +85,20 @@ static inline long try_lock_hpte(__be64 *hpte, unsigned long 
bits)
return old == 0;
 }
 
+static inline void unlock_hpte(__be64 *hpte, unsigned long hpte_v)
+{
+   hpte_v &= ~HPTE_V_HVLOCK;
+   asm volatile(PPC_RELEASE_BARRIER "" : : : "memory");
+   hpte[0] = cpu_to_be64(hpte_v);
+}
+
+/* Without barrier */
+static inline void __unlock_hpte(__be64 *hpte, unsigned long hpte_v)
+{
+   hpte_v &= ~HPTE_V_HVLOCK;
+   hpte[0] = cpu_to_be64(hpte_v);
+}
+
 static inline int __hpte_actual_psize(unsigned int lp, int psize)
 {
int i, shift;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index dbf1271..6c6825a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -338,9 +338,7 @@ static int kvmppc_mmu_book3s_64_hv_xlate(struct kvm_vcpu 
*vcpu, gva_t eaddr,
	v = be64_to_cpu(hptep[0]) & ~HPTE_V_HVLOCK;
	gr = kvm->arch.revmap[index].guest_rpte;
 
-   /* Unlock the HPTE */
-   asm volatile("lwsync" : : : "memory");
-   hptep[0] = cpu_to_be64(v);
+   unlock_hpte(hptep, v);
preempt_enable();
 
	gpte->eaddr = eaddr;
@@ -469,8 +467,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
	hpte[0] = be64_to_cpu(hptep[0]) & ~HPTE_V_HVLOCK;
	hpte[1] = be64_to_cpu(hptep[1]);
	hpte[2] = r = rev->guest_rpte;
-   asm volatile("lwsync" : : : "memory");
-   hptep[0] = cpu_to_be64(hpte[0]);
+   unlock_hpte(hptep, hpte[0]);
preempt_enable();
 
	if (hpte[0] != vcpu->arch.pgfault_hpte[0] ||
@@ -621,7 +618,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
 
hptep[1] = cpu_to_be64(r);
eieio();
-   hptep[0] = cpu_to_be64(hpte[0]);
+   __unlock_hpte(hptep, hpte[0]);
asm volatile(ptesync : : : memory);
preempt_enable();
if (page  hpte_is_writable(r))
@@ -642,7 +639,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
return ret;
 
  out_unlock:
-   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
+   __unlock_hpte(hptep, be64_to_cpu(hptep[0]));
preempt_enable();
goto out_put;
 }
@@ -771,7 +768,7 @@ static int kvm_unmap_rmapp(struct kvm *kvm, unsigned long 
*rmapp,
}
}
unlock_rmap(rmapp);
-   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
+   __unlock_hpte(hptep, be64_to_cpu(hptep[0]));
}
return 0;
 }
@@ -857,7 +854,7 @@ static int kvm_age_rmapp(struct kvm *kvm, unsigned long 
*rmapp,
}
ret = 1;
}
-   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
+   __unlock_hpte(hptep, be64_to_cpu(hptep[0]));
} while ((i = j) != head);
 
unlock_rmap(rmapp);
@@ -974,8 +971,7 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, 
unsigned long *rmapp)
 
/* Now check and modify the HPTE */
	if (!(hptep[0] & cpu_to_be64(HPTE_V_VALID))) {
-   /* unlock and continue */
-   hptep[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);
+   __unlock_hpte(hptep, be64_to_cpu(hptep[0]));
continue;
}
 
@@ -996,9 +992,9 @@ static int kvm_test_clear_dirty_npages(struct kvm *kvm, 
unsigned long *rmapp)
npages_dirty = n;
eieio();
}
-   v &= ~(HPTE_V_ABSENT | HPTE_V_HVLOCK);
+   v &= ~HPTE_V_ABSENT;
v |= HPTE_V_VALID;
-   hptep[0] = cpu_to_be64(v);
+   __unlock_hpte(hptep, v);
} while ((i = j) != head);
 
unlock_rmap(rmapp);
@@ -1218,8 +1214,7 @@ static long record_hpte(unsigned long flags, __be64 *hptp,
	r &= ~HPTE_GR_MODIFIED;
	revp->guest_rpte = r;
	}
-   asm volatile(PPC_RELEASE_BARRIER "" : : : "memory");
-   hptp[0] &= ~cpu_to_be64(HPTE_V_HVLOCK);

[PULL 10/21] KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This creates a debugfs directory for each HV guest (assuming debugfs
is enabled in the kernel config), and within that directory, a file
by which the contents of the guest's HPT (hashed page table) can be
read.  The directory is named vm<pid>, where <pid> is the PID of the
process that created the guest.  The file is named htab.  This is
intended to help in debugging problems in the host's management
of guest memory.

The contents of the file consist of a series of lines like this:

  3f48 4000d032bf003505 000bd7ff1196 0003b5c71196

The first field is the index of the entry in the HPT, the second and
third are the HPT entry, so the third entry contains the real page
number that is mapped by the entry if the entry's valid bit is set.
The fourth field is the guest's view of the second doubleword of the
entry, so it contains the guest physical address.  (The format of the
second through fourth fields are described in the Power ISA and also
in arch/powerpc/include/asm/mmu-hash64.h.)
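
For illustration, a minimal userspace reader for this file might look like
the sketch below (assumptions: debugfs is mounted at /sys/kernel/debug and
the per-VM directory lands under the kvm debugfs root; pass the creating
process's PID as the argument):

#include <stdio.h>

int main(int argc, char **argv)
{
	char path[128], line[256];
	FILE *f;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <vm-pid>\n", argv[0]);
		return 1;
	}
	snprintf(path, sizeof(path),
		 "/sys/kernel/debug/kvm/vm%s/htab", argv[1]);
	f = fopen(path, "r");
	if (!f) {
		perror(path);
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);	/* index, HPTE dwords, guest view */
	fclose(f);
	return 0;
}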

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_64.h |   2 +
 arch/powerpc/include/asm/kvm_host.h  |   2 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 136 +++
 arch/powerpc/kvm/book3s_hv.c |  12 +++
 virt/kvm/kvm_main.c  |   1 +
 5 files changed, 153 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 0789a0f..869c53f 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -436,6 +436,8 @@ static inline struct kvm_memslots *kvm_memslots_raw(struct 
kvm *kvm)
	return rcu_dereference_raw_notrace(kvm->memslots);
 }
 
+extern void kvmppc_mmu_debugfs_init(struct kvm *kvm);
+
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 
 #endif /* __ASM_KVM_BOOK3S_64_H__ */
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 015773f..f1d0bbc 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -238,6 +238,8 @@ struct kvm_arch {
atomic_t hpte_mod_interest;
cpumask_t need_tlb_flush;
int hpt_cma_alloc;
+   struct dentry *debugfs_dir;
+   struct dentry *htab_dentry;
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
struct mutex hpt_mutex;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 6c6825a..d6fe308 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -27,6 +27,7 @@
#include <linux/srcu.h>
#include <linux/anon_inodes.h>
#include <linux/file.h>
+#include <linux/debugfs.h>

#include <asm/tlbflush.h>
#include <asm/kvm_ppc.h>
@@ -1490,6 +1491,141 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct 
kvm_get_htab_fd *ghf)
return ret;
 }
 
+struct debugfs_htab_state {
+   struct kvm  *kvm;
+   struct mutexmutex;
+   unsigned long   hpt_index;
+   int chars_left;
+   int buf_index;
+   charbuf[64];
+};
+
+static int debugfs_htab_open(struct inode *inode, struct file *file)
+{
+   struct kvm *kvm = inode->i_private;
+   struct debugfs_htab_state *p;
+
+   p = kzalloc(sizeof(*p), GFP_KERNEL);
+   if (!p)
+   return -ENOMEM;
+
+   kvm_get_kvm(kvm);
+   p->kvm = kvm;
+   mutex_init(&p->mutex);
+   file->private_data = p;
+
+   return nonseekable_open(inode, file);
+}
+
+static int debugfs_htab_release(struct inode *inode, struct file *file)
+{
+   struct debugfs_htab_state *p = file->private_data;
+
+   kvm_put_kvm(p->kvm);
+   kfree(p);
+   return 0;
+}
+
+static ssize_t debugfs_htab_read(struct file *file, char __user *buf,
+size_t len, loff_t *ppos)
+{
+   struct debugfs_htab_state *p = file->private_data;
+   ssize_t ret, r;
+   unsigned long i, n;
+   unsigned long v, hr, gr;
+   struct kvm *kvm;
+   __be64 *hptp;
+
+   ret = mutex_lock_interruptible(&p->mutex);
+   if (ret)
+   return ret;
+
+   if (p->chars_left) {
+   n = p->chars_left;
+   if (n > len)
+   n = len;
+   r = copy_to_user(buf, p->buf + p->buf_index, n);
+   n -= r;
+   p->chars_left -= n;
+   p->buf_index += n;
+   buf += n;
+   len -= n;
+   ret = n;
+   if (r) {
+   if (!n)
+   ret = -EFAULT;
+   goto out;
+   }
+   }
+
+   kvm = p->kvm;
+   i = p->hpt_index;
+   hptp = (__be64 *)(kvm->arch.hpt_virt + (i * HPTE_SIZE));
+   for (; len != 0 && i < kvm->arch.hpt_npte; ++i, hptp += 2) {
+   if (!(be64_to_cpu

[PULL 20/21] KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This replaces the assembler code for kvmhv_commence_exit() with C code
in book3s_hv_builtin.c.  It also moves the IPI sending code that was
in book3s_hv_rm_xics.c into a new kvmhv_rm_send_ipi() function so it
can be used by kvmhv_commence_exit() as well as icp_rm_set_vcpu_irq().

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_64.h |  2 +
 arch/powerpc/kvm/book3s_hv_builtin.c | 63 ++
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 12 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 66 
 4 files changed, 75 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 869c53f..2b84e48 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -438,6 +438,8 @@ static inline struct kvm_memslots *kvm_memslots_raw(struct 
kvm *kvm)
 
 extern void kvmppc_mmu_debugfs_init(struct kvm *kvm);
 
+extern void kvmhv_rm_send_ipi(int cpu);
+
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 
 #endif /* __ASM_KVM_BOOK3S_64_H__ */
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 2754251..c42aa55 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -22,6 +22,7 @@
 #include asm/kvm_ppc.h
 #include asm/kvm_book3s.h
 #include asm/archrandom.h
+#include <asm/xics.h>
 
 #define KVM_CMA_CHUNK_ORDER18
 
@@ -184,3 +185,65 @@ long kvmppc_h_random(struct kvm_vcpu *vcpu)
 
return H_HARDWARE;
 }
+
+static inline void rm_writeb(unsigned long paddr, u8 val)
+{
+   __asm__ __volatile__("stbcix %0,0,%1"
+   : : "r" (val), "r" (paddr) : "memory");
+}
+
+/*
+ * Send an interrupt to another CPU.
+ * This can only be called in real mode.
+ * The caller needs to include any barrier needed to order writes
+ * to memory vs. the IPI/message.
+ */
+void kvmhv_rm_send_ipi(int cpu)
+{
+   unsigned long xics_phys;
+
+   /* Poke the target */
+   xics_phys = paca[cpu].kvm_hstate.xics_phys;
+   rm_writeb(xics_phys + XICS_MFRR, IPI_PRIORITY);
+}
+
+/*
+ * The following functions are called from the assembly code
+ * in book3s_hv_rmhandlers.S.
+ */
+static void kvmhv_interrupt_vcore(struct kvmppc_vcore *vc, int active)
+{
+   int cpu = vc->pcpu;
+
+   /* Order setting of exit map vs. msgsnd/IPI */
+   smp_mb();
+   for (; active; active >>= 1, ++cpu)
+   if (active & 1)
+   kvmhv_rm_send_ipi(cpu);
+}
+
+void kvmhv_commence_exit(int trap)
+{
+   struct kvmppc_vcore *vc = local_paca->kvm_hstate.kvm_vcore;
+   int ptid = local_paca->kvm_hstate.ptid;
+   int me, ee;
+
+   /* Set our bit in the threads-exiting-guest map in the 0xff00
+  bits of vcore->entry_exit_map */
+   me = 0x100 << ptid;
+   do {
+   ee = vc->entry_exit_map;
+   } while (cmpxchg(&vc->entry_exit_map, ee, ee | me) != ee);
+
+   /* Are we the first here? */
+   if ((ee >> 8) != 0)
+   return;
+
+   /*
+* Trigger the other threads in this vcore to exit the guest.
+* If this is a hypervisor decrementer interrupt then they
+* will be already on their way out of the guest.
+*/
+   if (trap != BOOK3S_INTERRUPT_HV_DECREMENTER)
+   kvmhv_interrupt_vcore(vc, ee & ~(1 << ptid));
+}
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c 
b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 6dded8c..00e45b6 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -26,12 +26,6 @@
 static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp 
*icp,
u32 new_irq);
 
-static inline void rm_writeb(unsigned long paddr, u8 val)
-{
-   __asm__ __volatile__("sync; stbcix %0,0,%1"
-   : : "r" (val), "r" (paddr) : "memory");
-}
-
 /* -- ICS routines -- */
 static void ics_rm_check_resend(struct kvmppc_xics *xics,
struct kvmppc_ics *ics, struct kvmppc_icp *icp)
@@ -60,7 +54,6 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
struct kvm_vcpu *this_vcpu)
 {
	struct kvmppc_icp *this_icp = this_vcpu->arch.icp;
-   unsigned long xics_phys;
int cpu;
 
/* Mark the target VCPU as having an interrupt pending */
@@ -83,9 +76,8 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
/* In SMT cpu will always point to thread 0, we adjust it */
	cpu += vcpu->arch.ptid;
 
-   /* Not too hard, then poke the target */
-   xics_phys = paca[cpu].kvm_hstate.xics_phys;
-   rm_writeb(xics_phys + XICS_MFRR, IPI_PRIORITY);
+   smp_mb();
+   kvmhv_rm_send_ipi(cpu);
 }
 
 static void icp_rm_clr_vcpu_irq(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm

[PULL 16/21] KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

When running a multi-threaded guest and vcpu 0 in a virtual core
is not running in the guest (i.e. it is busy elsewhere in the host),
thread 0 of the physical core will switch the MMU to the guest and
then go to nap mode in the code at kvm_do_nap.  If the guest sends
an IPI to thread 0 using the msgsndp instruction, that will wake
up thread 0 and cause all the threads in the guest to exit to the
host unnecessarily.  To avoid the unnecessary exit, this arranges
for the PECEDP bit to be cleared in this situation.  When napping
due to a H_CEDE from the guest, we still set PECEDP so that the
thread will wake up on an IPI sent using msgsndp.
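
In C terms, the rlwimi used below acts as a masked bit-insert: it merges the
caller-supplied wake flag into LPCR instead of unconditionally setting it.
A sketch of the equivalent operation (the LPCR_PECEDP value here is assumed
purely for illustration; the real define lives in the kernel headers):

#define LPCR_PECEDP	0x20000UL	/* assumed bit, for illustration */

static unsigned long nap_lpcr(unsigned long lpcr, unsigned long wake_flags)
{
	/* rlwimi r5, r3, 0, LPCR_PECEDP: replace only the PECEDP field */
	return (lpcr & ~LPCR_PECEDP) | (wake_flags & LPCR_PECEDP);
}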

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6716db3..12d7e4c 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -191,6 +191,7 @@ kvmppc_primary_no_guest:
li  r3, NAPPING_NOVCPU
stb r3, HSTATE_NAPPING(r13)
 
+   li  r3, 0   /* Don't wake on privileged (OS) doorbell */
b   kvm_do_nap
 
 kvm_novcpu_wakeup:
@@ -2129,10 +2130,13 @@ _GLOBAL(kvmppc_h_cede)  /* r3 = vcpu pointer, 
r11 = msr, r13 = paca */
bl  kvmhv_accumulate_time
 #endif
 
+   lis r3, LPCR_PECEDP@h   /* Do wake on privileged doorbell */
+
/*
 * Take a nap until a decrementer or external or doorbell interrupt
-* occurs, with PECE1, PECE0 and PECEDP set in LPCR. Also clear the
-* runlatch bit before napping.
+* occurs, with PECE1 and PECE0 set in LPCR.
+* On POWER8, if we are ceding, also set PECEDP.
+* Also clear the runlatch bit before napping.
 */
 kvm_do_nap:
mfspr   r0, SPRN_CTRLF
@@ -2144,7 +2148,7 @@ kvm_do_nap:
mfspr   r5,SPRN_LPCR
ori r5,r5,LPCR_PECE0 | LPCR_PECE1
 BEGIN_FTR_SECTION
-   orisr5,r5,LPCR_PECEDP@h
+   rlwimi  r5, r3, 0, LPCR_PECEDP
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
mtspr   SPRN_LPCR,r5
isync
-- 
1.8.1.4



[PULL 18/21] KVM: PPC: Book3S HV: Use bitmap of active threads rather than count

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

Currently, the entry_exit_count field in the kvmppc_vcore struct
contains two 8-bit counts, one of the threads that have started entering
the guest, and one of the threads that have started exiting the guest.
This changes it to an entry_exit_map field which contains two bitmaps
of 8 bits each.  The advantage of doing this is that it gives us a
bitmap of which threads need to be signalled when exiting the guest.
That means that we no longer need to use the trick of setting the
HDEC to 0 to pull the other threads out of the guest, which led in
some cases to a spurious HDEC interrupt on the next guest entry.
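
For illustration, the new scheme boils down to two 8-bit maps packed in one
word. A userspace sketch of the entry rule and of how the exit path derives
its signal set (plain C with GCC builtins, simplified from the kernel code;
not the kernel's implementation):

#include <stdio.h>

static int entry_exit_map;	/* entry map: bits 0-7, exit map: bits 8-15 */

/* Enter only if no thread is already exiting, mirroring the 0xff00 test. */
static int try_enter(int ptid)
{
	int old, new;

	do {
		old = __atomic_load_n(&entry_exit_map, __ATOMIC_RELAXED);
		if (old >> 8)			/* somebody is exiting */
			return 0;
		new = old | (1 << ptid);	/* set our entry bit */
	} while (!__sync_bool_compare_and_swap(&entry_exit_map, old, new));
	return 1;
}

/* On exit, the entered threads minus ourselves are the ones to signal. */
static int threads_to_signal(int my_ptid)
{
	return (entry_exit_map & 0xff) & ~(1 << my_ptid);
}

int main(void)
{
	try_enter(0);
	try_enter(2);
	printf("signal mask for thread 0: %#x\n", threads_to_signal(0));
	return 0;
}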

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h | 15 
 arch/powerpc/kernel/asm-offsets.c   |  2 +-
 arch/powerpc/kvm/book3s_hv.c|  5 ++-
 arch/powerpc/kvm/book3s_hv_builtin.c| 10 +++---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 61 +++--
 5 files changed, 44 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 1517faa..d67a838 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -263,15 +263,15 @@ struct kvm_arch {
 
 /*
  * Struct for a virtual core.
- * Note: entry_exit_count combines an entry count in the bottom 8 bits
- * and an exit count in the next 8 bits.  This is so that we can
- * atomically increment the entry count iff the exit count is 0
- * without taking the lock.
+ * Note: entry_exit_map combines a bitmap of threads that have entered
+ * in the bottom 8 bits and a bitmap of threads that have exited in the
+ * next 8 bits.  This is so that we can atomically set the entry bit
+ * iff the exit map is 0 without taking a lock.
  */
 struct kvmppc_vcore {
int n_runnable;
int num_threads;
-   int entry_exit_count;
+   int entry_exit_map;
int napping_threads;
int first_vcpuid;
u16 pcpu;
@@ -296,8 +296,9 @@ struct kvmppc_vcore {
ulong conferring_threads;
 };
 
-#define VCORE_ENTRY_COUNT(vc)  ((vc)->entry_exit_count & 0xff)
-#define VCORE_EXIT_COUNT(vc)   ((vc)->entry_exit_count >> 8)
+#define VCORE_ENTRY_MAP(vc)((vc)->entry_exit_map & 0xff)
+#define VCORE_EXIT_MAP(vc) ((vc)->entry_exit_map >> 8)
+#define VCORE_IS_EXITING(vc)   (VCORE_EXIT_MAP(vc) != 0)
 
 /* Values for vcore_state */
 #define VCORE_INACTIVE 0
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 8aa8246..0d07efb 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -562,7 +562,7 @@ int main(void)
DEFINE(VCPU_ACOP, offsetof(struct kvm_vcpu, arch.acop));
DEFINE(VCPU_WORT, offsetof(struct kvm_vcpu, arch.wort));
DEFINE(VCPU_SHADOW_SRR1, offsetof(struct kvm_vcpu, arch.shadow_srr1));
-   DEFINE(VCORE_ENTRY_EXIT, offsetof(struct kvmppc_vcore, 
entry_exit_count));
+   DEFINE(VCORE_ENTRY_EXIT, offsetof(struct kvmppc_vcore, entry_exit_map));
DEFINE(VCORE_IN_GUEST, offsetof(struct kvmppc_vcore, in_guest));
DEFINE(VCORE_NAPPING_THREADS, offsetof(struct kvmppc_vcore, 
napping_threads));
DEFINE(VCORE_KVM, offsetof(struct kvmppc_vcore, kvm));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 7c1335d..ea1600f 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1952,7 +1952,7 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
/*
 * Initialize *vc.
 */
-   vc->entry_exit_count = 0;
+   vc->entry_exit_map = 0;
	vc->preempt_tb = TB_NIL;
	vc->in_guest = 0;
	vc->napping_threads = 0;
@@ -2119,8 +2119,7 @@ static int kvmppc_run_vcpu(struct kvm_run *kvm_run, 
struct kvm_vcpu *vcpu)
 * this thread straight away and have it join in.
 */
if (!signal_pending(current)) {
-   if (vc->vcore_state == VCORE_RUNNING &&
-   VCORE_EXIT_COUNT(vc) == 0) {
+   if (vc->vcore_state == VCORE_RUNNING && !VCORE_IS_EXITING(vc)) {
kvmppc_create_dtl_entry(vcpu, vc);
kvmppc_start_thread(vcpu);
trace_kvm_guest_enter(vcpu);
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 1954a1c..2754251 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -115,11 +115,11 @@ long int kvmppc_rm_h_confer(struct kvm_vcpu *vcpu, int 
target,
	int rv = H_SUCCESS; /* => don't yield */

	set_bit(vcpu->arch.ptid, &vc->conferring_threads);
-   while ((get_tb() < stop) && (VCORE_EXIT_COUNT(vc) == 0)) {
-   threads_running = VCORE_ENTRY_COUNT(vc);
-   threads_ceded = hweight32(vc->napping_threads);
-   threads_conferring = hweight32(vc->conferring_threads);

[PULL 09/21] KVM: PPC: Book3S HV: Add ICP real mode counters

2015-04-21 Thread Alexander Graf
From: Suresh Warrier warr...@linux.vnet.ibm.com

Add two counters to count how often we generate real-mode ICS resend
and reject events. The counters provide some performance statistics
that could be used in the future to consider if the real mode functions
need further optimizing. The counters are displayed as part of ICS and
ICP state provided by /sys/kernel/debug/powerpc/kvm* for each VM.

Also added two counters that count (approximately) how many times we
don't find an ICP or ICS we're looking for. These are not currently
exposed through sysfs, but can be useful when debugging crashes.

Signed-off-by: Suresh Warrier warr...@linux.vnet.ibm.com
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rm_xics.c |  7 +++
 arch/powerpc/kvm/book3s_xics.c   | 10 --
 arch/powerpc/kvm/book3s_xics.h   |  5 +
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c 
b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 73bbe92..6dded8c 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -227,6 +227,7 @@ static void icp_rm_deliver_irq(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
	ics = kvmppc_xics_find_ics(xics, new_irq, &src);
	if (!ics) {
		/* Unsafe increment, but this does not need to be accurate */
+   xics->err_noics++;
		return;
	}
	state = &ics->irq_state[src];
@@ -239,6 +240,7 @@ static void icp_rm_deliver_irq(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
	icp = kvmppc_xics_find_server(xics->kvm, state->server);
	if (!icp) {
		/* Unsafe increment again*/
+   xics->err_noicp++;
goto out;
}
}
@@ -383,6 +385,7 @@ static void icp_rm_down_cppr(struct kvmppc_xics *xics, 
struct kvmppc_icp *icp,
 * separately here as well.
 */
if (resend) {
+   icp->n_check_resend++;
icp_rm_check_resend(xics, icp);
}
 }
@@ -500,11 +503,13 @@ int kvmppc_rm_h_ipi(struct kvm_vcpu *vcpu, unsigned long 
server,
 
/* Handle reject in real mode */
if (reject  reject != XICS_IPI) {
+   this_icp->n_reject++;
icp_rm_deliver_irq(xics, icp, reject);
}
 
/* Handle resends in real mode */
if (resend) {
+   this_icp->n_check_resend++;
icp_rm_check_resend(xics, icp);
}
 
@@ -566,6 +571,7 @@ int kvmppc_rm_h_cppr(struct kvm_vcpu *vcpu, unsigned long 
cppr)
 * attempt (see comments in icp_rm_deliver_irq).
 */
if (reject  reject != XICS_IPI) {
+   icp->n_reject++;
icp_rm_deliver_irq(xics, icp, reject);
}
  bail:
@@ -616,6 +622,7 @@ int kvmppc_rm_h_eoi(struct kvm_vcpu *vcpu, unsigned long 
xirr)
 
/* Still asserted, resend it */
	if (state->asserted) {
+   icp->n_reject++;
icp_rm_deliver_irq(xics, icp, irq);
}
 
diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
index 5f7beebd..8f3e6cc 100644
--- a/arch/powerpc/kvm/book3s_xics.c
+++ b/arch/powerpc/kvm/book3s_xics.c
@@ -901,6 +901,7 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
unsigned long flags;
unsigned long t_rm_kick_vcpu, t_rm_check_resend;
unsigned long t_rm_reject, t_rm_notify_eoi;
+   unsigned long t_reject, t_check_resend;
 
if (!kvm)
return 0;
@@ -909,6 +910,8 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
t_rm_notify_eoi = 0;
t_rm_check_resend = 0;
t_rm_reject = 0;
+   t_check_resend = 0;
+   t_reject = 0;
 
	seq_printf(m, "=========\nICP state\n=========\n");
 
@@ -928,12 +931,15 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
	t_rm_notify_eoi += icp->n_rm_notify_eoi;
	t_rm_check_resend += icp->n_rm_check_resend;
	t_rm_reject += icp->n_rm_reject;
+   t_check_resend += icp->n_check_resend;
+   t_reject += icp->n_reject;
}
 
-   seq_puts(m, "ICP Guest Real Mode exit totals: ");
-   seq_printf(m, "\tkick_vcpu=%lu check_resend=%lu reject=%lu notify_eoi=%lu\n",
+   seq_printf(m, "ICP Guest->Host totals: kick_vcpu=%lu check_resend=%lu reject=%lu notify_eoi=%lu\n",
	t_rm_kick_vcpu, t_rm_check_resend,
	t_rm_reject, t_rm_notify_eoi);
+   seq_printf(m, "ICP Real Mode totals: check_resend=%lu resend=%lu\n",
+   t_check_resend, t_reject);
	for (icsid = 0; icsid <= KVMPPC_XICS_MAX_ICS_ID; icsid++) {
		struct kvmppc_ics *ics = xics->ics[icsid];
 
diff --git a/arch/powerpc/kvm/book3s_xics.h b/arch/powerpc/kvm/book3s_xics.h
index 055424c..56ea44f

[PULL 06/21] KVM: PPC: Book3S HV: Add guest-host real mode completion counters

2015-04-21 Thread Alexander Graf
From: Suresh E. Warrier warr...@linux.vnet.ibm.com

Add counters to track number of times we switch from guest real mode
to host virtual mode during an interrupt-related hyper call because the
hypercall requires actions that cannot be completed in real mode. This
will help when making optimizations that reduce guest-host transitions.

It is safe to use an ordinary increment rather than an atomic operation
because there is one ICP per virtual CPU and kvmppc_xics_rm_complete()
only works on the ICP for the current VCPU.

The counters are displayed as part of ICS and ICP state provided by
/sys/kernel/debug/powerpc/kvm* for each VM.
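
The single-writer argument can be pictured with a short sketch (not the
kernel code; the field name is borrowed from the patch):

struct icp_stats {
	unsigned long n_rm_kick_vcpu;	/* written only by the owning VCPU */
};

/* Runs only on the VCPU that owns 's', so a plain increment cannot
 * lose updates; the debugfs reader may see a momentarily stale value,
 * which is acceptable for statistics. */
static void count_rm_kick(struct icp_stats *s)
{
	s->n_rm_kick_vcpu++;
}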

Signed-off-by: Suresh Warrier warr...@linux.vnet.ibm.com
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_xics.c | 31 +++
 arch/powerpc/kvm/book3s_xics.h |  6 ++
 2 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_xics.c b/arch/powerpc/kvm/book3s_xics.c
index a4a8d9f..60bdbac 100644
--- a/arch/powerpc/kvm/book3s_xics.c
+++ b/arch/powerpc/kvm/book3s_xics.c
@@ -802,14 +802,22 @@ static noinline int kvmppc_xics_rm_complete(struct 
kvm_vcpu *vcpu, u32 hcall)
	XICS_DBG("XICS_RM: H_%x completing, act: %x state: %lx tgt: %p\n",
		 hcall, icp->rm_action, icp->rm_dbgstate.raw, icp->rm_dbgtgt);
 
-   if (icp->rm_action & XICS_RM_KICK_VCPU)
+   if (icp->rm_action & XICS_RM_KICK_VCPU) {
+   icp->n_rm_kick_vcpu++;
	kvmppc_fast_vcpu_kick(icp->rm_kick_target);
-   if (icp->rm_action & XICS_RM_CHECK_RESEND)
+   }
+   if (icp->rm_action & XICS_RM_CHECK_RESEND) {
+   icp->n_rm_check_resend++;
	icp_check_resend(xics, icp->rm_resend_icp);
-   if (icp->rm_action & XICS_RM_REJECT)
+   }
+   if (icp->rm_action & XICS_RM_REJECT) {
+   icp->n_rm_reject++;
	icp_deliver_irq(xics, icp, icp->rm_reject);
-   if (icp->rm_action & XICS_RM_NOTIFY_EOI)
+   }
+   if (icp->rm_action & XICS_RM_NOTIFY_EOI) {
+   icp->n_rm_notify_eoi++;
	kvm_notify_acked_irq(vcpu->kvm, 0, icp->rm_eoied_irq);
+   }

	icp->rm_action = 0;
 
@@ -872,10 +880,17 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
struct kvm *kvm = xics-kvm;
struct kvm_vcpu *vcpu;
int icsid, i;
+   unsigned long t_rm_kick_vcpu, t_rm_check_resend;
+   unsigned long t_rm_reject, t_rm_notify_eoi;
 
if (!kvm)
return 0;
 
+   t_rm_kick_vcpu = 0;
+   t_rm_notify_eoi = 0;
+   t_rm_check_resend = 0;
+   t_rm_reject = 0;
+
	seq_printf(m, "=========\nICP state\n=========\n");
 
kvm_for_each_vcpu(i, vcpu, kvm) {
@@ -890,8 +905,16 @@ static int xics_debug_show(struct seq_file *m, void 
*private)
		   icp->server_num, state.xisr,
		   state.pending_pri, state.cppr, state.mfrr,
		   state.out_ee, state.need_resend);
+   t_rm_kick_vcpu += icp->n_rm_kick_vcpu;
+   t_rm_notify_eoi += icp->n_rm_notify_eoi;
+   t_rm_check_resend += icp->n_rm_check_resend;
+   t_rm_reject += icp->n_rm_reject;
}
 
+   seq_puts(m, "ICP Guest Real Mode exit totals: ");
+   seq_printf(m, "\tkick_vcpu=%lu check_resend=%lu reject=%lu notify_eoi=%lu\n",
+   t_rm_kick_vcpu, t_rm_check_resend,
+   t_rm_reject, t_rm_notify_eoi);
	for (icsid = 0; icsid <= KVMPPC_XICS_MAX_ICS_ID; icsid++) {
		struct kvmppc_ics *ics = xics->ics[icsid];
 
diff --git a/arch/powerpc/kvm/book3s_xics.h b/arch/powerpc/kvm/book3s_xics.h
index 73f0f27..de970ec 100644
--- a/arch/powerpc/kvm/book3s_xics.h
+++ b/arch/powerpc/kvm/book3s_xics.h
@@ -78,6 +78,12 @@ struct kvmppc_icp {
u32  rm_reject;
u32  rm_eoied_irq;
 
+   /* Counters for each reason we exited real mode */
+   unsigned long n_rm_kick_vcpu;
+   unsigned long n_rm_check_resend;
+   unsigned long n_rm_reject;
+   unsigned long n_rm_notify_eoi;
+
/* Debug stuff for real mode */
union kvmppc_icp_state rm_dbgstate;
struct kvm_vcpu *rm_dbgtgt;
-- 
1.8.1.4



[PULL 12/21] KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

Previously, if kvmppc_run_core() was running a VCPU that needed a VPA
update (i.e. one of its 3 virtual processor areas needed to be pinned
in memory so the host real mode code can update it on guest entry and
exit), we would drop the vcore lock and do the update there and then.
Future changes will make it inconvenient to drop the lock, so instead
we now remove it from the list of runnable VCPUs and wake up its
VCPU task.  This will have the effect that the VCPU task will exit
kvmppc_run_vcpu(), go around the do loop in kvmppc_vcpu_run_hv(), and
re-enter kvmppc_run_vcpu(), whereupon it will do the necessary call
to kvmppc_update_vpas() and then rejoin the vcore.

The one complication is that the runner VCPU (whose VCPU task is the
current task) might be one of the ones that gets removed from the
runnable list.  In that case we just return from kvmppc_run_core()
and let the code in kvmppc_run_vcpu() wake up another VCPU task to be
the runner if necessary.

This all means that the VCORE_STARTING state is no longer used, so we
remove it.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  5 ++--
 arch/powerpc/kvm/book3s_hv.c| 56 -
 2 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index d2068bb..2f339ff 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -306,9 +306,8 @@ struct kvmppc_vcore {
 /* Values for vcore_state */
 #define VCORE_INACTIVE 0
 #define VCORE_SLEEPING 1
-#define VCORE_STARTING 2
-#define VCORE_RUNNING  3
-#define VCORE_EXITING  4
+#define VCORE_RUNNING  2
+#define VCORE_EXITING  3
 
 /*
  * Struct used to manage memory for a virtual processor area
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 64a02d4..b38c10e 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1863,6 +1863,25 @@ static void kvmppc_start_restoring_l2_cache(const struct 
kvmppc_vcore *vc)
mtspr(SPRN_MPPR, mpp_addr | PPC_MPPR_FETCH_WHOLE_TABLE);
 }
 
+static void prepare_threads(struct kvmppc_vcore *vc)
+{
+   struct kvm_vcpu *vcpu, *vnext;
+
+   list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
+arch.run_list) {
+   if (signal_pending(vcpu->arch.run_task))
+   vcpu->arch.ret = -EINTR;
+   else if (vcpu->arch.vpa.update_pending ||
+vcpu->arch.slb_shadow.update_pending ||
+vcpu->arch.dtl.update_pending)
+   vcpu->arch.ret = RESUME_GUEST;
+   else
+   continue;
+   kvmppc_remove_runnable(vc, vcpu);
+   wake_up(&vcpu->arch.cpu_run);
+   }
+}
+
 /*
  * Run a set of guest threads on a physical core.
  * Called with vc-lock held.
@@ -1872,46 +1891,31 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
struct kvm_vcpu *vcpu, *vnext;
long ret;
u64 now;
-   int i, need_vpa_update;
+   int i;
int srcu_idx;
-   struct kvm_vcpu *vcpus_to_update[threads_per_core];
 
-   /* don't start if any threads have a signal pending */
-   need_vpa_update = 0;
-   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
-   if (signal_pending(vcpu->arch.run_task))
-   return;
-   if (vcpu->arch.vpa.update_pending ||
-   vcpu->arch.slb_shadow.update_pending ||
-   vcpu->arch.dtl.update_pending)
-   vcpus_to_update[need_vpa_update++] = vcpu;
-   }
+   /*
+* Remove from the list any threads that have a signal pending
+* or need a VPA update done
+*/
+   prepare_threads(vc);
+
+   /* if the runner is no longer runnable, let the caller pick a new one */
+   if (vc->runner->arch.state != KVMPPC_VCPU_RUNNABLE)
+   return;
 
/*
-* Initialize *vc, in particular vc->vcore_state, so we can
-* drop the vcore lock if necessary.
+* Initialize *vc.
 */
	vc->n_woken = 0;
	vc->nap_count = 0;
	vc->entry_exit_count = 0;
	vc->preempt_tb = TB_NIL;
-   vc->vcore_state = VCORE_STARTING;
	vc->in_guest = 0;
	vc->napping_threads = 0;
	vc->conferring_threads = 0;
 
/*
-* Updating any of the vpas requires calling kvmppc_pin_guest_page,
-* which can't be called with any spinlocks held.
-*/
-   if (need_vpa_update) {
-   spin_unlock(&vc->lock);
-   for (i = 0; i < need_vpa_update; ++i)
-   kvmppc_update_vpas(vcpus_to_update[i]);
-   spin_lock(&vc->lock);
-   }
-
-   /*
 * Make sure we are running on primary threads

[PULL 13/21] KVM: PPC: Book3S HV: Minor cleanups

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

* Remove unused kvmppc_vcore::n_busy field.
* Remove setting of RMOR, since it was only used on PPC970 and the
  PPC970 KVM support has been removed.
* Don't use r1 or r2 in setting the runlatch since they are
  conventionally reserved for other things; use r0 instead.
* Streamline the code a little and remove the ext_interrupt_to_host
  label.
* Add some comments about register usage.
* hcall_try_real_mode doesn't need to be global, and can't be
  called from C code anyway.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  2 --
 arch/powerpc/kernel/asm-offsets.c   |  1 -
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 44 ++---
 3 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 2f339ff..3eecd88 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -227,7 +227,6 @@ struct kvm_arch {
unsigned long host_sdr1;
int tlbie_lock;
unsigned long lpcr;
-   unsigned long rmor;
unsigned long vrma_slb_v;
int hpte_setup_done;
u32 hpt_order;
@@ -271,7 +270,6 @@ struct kvm_arch {
  */
 struct kvmppc_vcore {
int n_runnable;
-   int n_busy;
int num_threads;
int entry_exit_count;
int n_woken;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 3fea721..92ec3fc 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -505,7 +505,6 @@ int main(void)
DEFINE(KVM_NEED_FLUSH, offsetof(struct kvm, arch.need_tlb_flush.bits));
DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
-   DEFINE(KVM_RMOR, offsetof(struct kvm, arch.rmor));
DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b06fe53..f8267e5 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -245,9 +245,9 @@ kvm_novcpu_exit:
 kvm_start_guest:
 
/* Set runlatch bit the minute you wake up from nap */
-   mfspr   r1, SPRN_CTRLF
-   ori r1, r1, 1
-   mtspr   SPRN_CTRLT, r1
+   mfspr   r0, SPRN_CTRLF
+   ori r0, r0, 1
+   mtspr   SPRN_CTRLT, r0
 
ld  r2,PACATOC(r13)
 
@@ -493,11 +493,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
cmpwi   r0,0
beq 20b
 
-   /* Set LPCR and RMOR. */
+   /* Set LPCR. */
 10:ld  r8,VCORE_LPCR(r5)
mtspr   SPRN_LPCR,r8
-   ld  r8,KVM_RMOR(r9)
-   mtspr   SPRN_RMOR,r8
isync
 
/* Check if HDEC expires soon */
@@ -1075,7 +1073,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
bne 2f
mfspr   r3,SPRN_HDEC
cmpwi   r3,0
-   bge ignore_hdec
+   mr  r4,r9
+   bge fast_guest_return
 2:
/* See if this is an hcall we can handle in real mode */
cmpwi   r12,BOOK3S_INTERRUPT_SYSCALL
@@ -1083,26 +1082,21 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
/* External interrupt ? */
cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
-   bne+ext_interrupt_to_host
+   bne+guest_exit_cont
 
/* External interrupt, first check for host_ipi. If this is
 * set, we know the host wants us out so let's do it now
 */
bl  kvmppc_read_intr
cmpdi   r3, 0
-   bgt ext_interrupt_to_host
+   bgt guest_exit_cont
 
/* Check if any CPU is heading out to the host, if so head out too */
ld  r5, HSTATE_KVM_VCORE(r13)
lwz r0, VCORE_ENTRY_EXIT(r5)
cmpwi   r0, 0x100
-   bge ext_interrupt_to_host
-
-   /* Return to guest after delivering any pending interrupt */
mr  r4, r9
-   b   deliver_guest_interrupt
-
-ext_interrupt_to_host:
+   blt deliver_guest_interrupt
 
 guest_exit_cont:   /* r9 = vcpu, r12 = trap, r13 = paca */
/* Save more register state  */
@@ -1763,8 +1757,10 @@ kvmppc_hisi:
  * Returns to the guest if we handle it, or continues on up to
  * the kernel if we can't (i.e. if we don't have a handler for
  * it, or if the handler returns H_TOO_HARD).
+ *
+ * r5 - r8 contain hcall args,
+ * r9 = vcpu, r10 = pc, r11 = msr, r12 = trap, r13 = paca
  */
-   .globl  hcall_try_real_mode
 hcall_try_real_mode:
ld  r3,VCPU_GPR(R3)(r9)
andi.   r0,r11,MSR_PR
@@ -2024,10 +2020,6 @@ hcall_real_table:
.globl  hcall_real_table_end
 hcall_real_table_end:
 
-ignore_hdec

[PULL 03/21] KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.

2015-04-21 Thread Alexander Graf
From: Michael Ellerman mich...@ellerman.id.au

Some PowerNV systems include a hardware random-number generator.
This HWRNG is present on POWER7+ and POWER8 chips and is capable of
generating one 64-bit random number every microsecond.  The random
numbers are produced by sampling a set of 64 unstable high-frequency
oscillators and are almost completely entropic.

PAPR defines an H_RANDOM hypercall which guests can use to obtain one
64-bit random sample from the HWRNG.  This adds a real-mode
implementation of the H_RANDOM hypercall.  This hypercall was
implemented in real mode because the latency of reading the HWRNG is
generally small compared to the latency of a guest exit and entry for
all the threads in the same virtual core.

Userspace can detect the presence of the HWRNG and the H_RANDOM
implementation by querying the KVM_CAP_PPC_HWRNG capability.  The
H_RANDOM hypercall implementation will only be invoked when the guest
does an H_RANDOM hypercall if userspace first enables the in-kernel
H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability.
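
For illustration, the userspace side could look like the sketch below
(assumptions: H_RANDOM is PAPR hcall 0x300 and the uapi headers define
KVM_CAP_PPC_HWRNG; error handling is minimal):

#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

#define H_RANDOM	0x300	/* PAPR hcall number, assumed here */

static int enable_hwrng(int sysfd, int vmfd)
{
	struct kvm_enable_cap cap;

	if (ioctl(sysfd, KVM_CHECK_EXTENSION, KVM_CAP_PPC_HWRNG) <= 0)
		return -1;		/* no HWRNG-backed H_RANDOM */

	memset(&cap, 0, sizeof(cap));
	cap.cap = KVM_CAP_PPC_ENABLE_HCALL;
	cap.args[0] = H_RANDOM;		/* which hcall to enable */
	cap.args[1] = 1;		/* 1 = in-kernel handling on */
	return ioctl(vmfd, KVM_ENABLE_CAP, &cap);
}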

Signed-off-by: Michael Ellerman mich...@ellerman.id.au
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 Documentation/virtual/kvm/api.txt   |  17 +
 arch/powerpc/include/asm/archrandom.h   |  11 ++-
 arch/powerpc/include/asm/kvm_ppc.h  |   2 +
 arch/powerpc/kvm/book3s_hv_builtin.c|  15 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 115 
 arch/powerpc/kvm/powerpc.c  |   3 +
 arch/powerpc/platforms/powernv/rng.c|  29 
 include/uapi/linux/kvm.h|   1 +
 8 files changed, 191 insertions(+), 2 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index bc9f6fe..9fa2bf8 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3573,3 +3573,20 @@ struct {
 @ar   - access register number
 
 KVM handlers should exit to userspace with rc = -EREMOTE.
+
+
+8. Other capabilities.
+--
+
+This section lists capabilities that give information about other
+features of the KVM implementation.
+
+8.1 KVM_CAP_PPC_HWRNG
+
+Architectures: ppc
+
+This capability, if KVM_CHECK_EXTENSION indicates that it is
+available, means that the kernel has an implementation of the
+H_RANDOM hypercall backed by a hardware random-number generator.
+If present, the kernel H_RANDOM handler can be enabled for guest use
+with the KVM_CAP_PPC_ENABLE_HCALL capability.
diff --git a/arch/powerpc/include/asm/archrandom.h 
b/arch/powerpc/include/asm/archrandom.h
index bde5311..0cc6eed 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -30,8 +30,6 @@ static inline int arch_has_random(void)
return !!ppc_md.get_random_long;
 }
 
-int powernv_get_random_long(unsigned long *v);
-
 static inline int arch_get_random_seed_long(unsigned long *v)
 {
return 0;
@@ -47,4 +45,13 @@ static inline int arch_has_random_seed(void)
 
 #endif /* CONFIG_ARCH_RANDOM */
 
+#ifdef CONFIG_PPC_POWERNV
+int powernv_hwrng_present(void);
+int powernv_get_random_long(unsigned long *v);
+int powernv_get_random_real_mode(unsigned long *v);
+#else
+static inline int powernv_hwrng_present(void) { return 0; }
+static inline int powernv_get_random_real_mode(unsigned long *v) { return 0; }
+#endif
+
 #endif /* _ASM_POWERPC_ARCHRANDOM_H */
diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
b/arch/powerpc/include/asm/kvm_ppc.h
index 46bf652..b8475da 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -302,6 +302,8 @@ static inline bool is_kvmppc_hv_enabled(struct kvm *kvm)
	return kvm->arch.kvm_ops == &kvmppc_hv_ops;
 }
 
+extern int kvmppc_hwrng_present(void);
+
 /*
  * Cuts out inst bits with ordering according to spec.
  * That means the leftmost bit is zero. All given bits are included.
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 1f083ff..1954a1c 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -21,6 +21,7 @@
#include <asm/cputable.h>
#include <asm/kvm_ppc.h>
#include <asm/kvm_book3s.h>
+#include <asm/archrandom.h>
 
 #define KVM_CMA_CHUNK_ORDER18
 
@@ -169,3 +170,17 @@ int kvmppc_hcall_impl_hv_realmode(unsigned long cmd)
return 0;
 }
 EXPORT_SYMBOL_GPL(kvmppc_hcall_impl_hv_realmode);
+
+int kvmppc_hwrng_present(void)
+{
+   return powernv_hwrng_present();
+}
+EXPORT_SYMBOL_GPL(kvmppc_hwrng_present);
+
+long kvmppc_h_random(struct kvm_vcpu *vcpu)
+{
+   if (powernv_get_random_real_mode(&vcpu->arch.gpr[4]))
+   return H_SUCCESS;
+
+   return H_HARDWARE;
+}
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6cbf163..0814ca1 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm

[PULL 15/21] KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

We can tell when a secondary thread has finished running a guest by
the fact that it clears its kvm_hstate.kvm_vcpu pointer, so there
is no real need for the nap_count field in the kvmppc_vcore struct.
This changes kvmppc_wait_for_nap to poll the kvm_hstate.kvm_vcpu
pointers of the secondary threads rather than polling vc->nap_count.
Besides reducing the size of the kvmppc_vcore struct by 8 bytes,
this also means that we can tell which secondary threads have got
stuck and thus print a more informative error message.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  2 --
 arch/powerpc/kernel/asm-offsets.c   |  1 -
 arch/powerpc/kvm/book3s_hv.c| 47 +++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 19 +
 4 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 83c4425..1517faa 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -272,8 +272,6 @@ struct kvmppc_vcore {
int n_runnable;
int num_threads;
int entry_exit_count;
-   int n_woken;
-   int nap_count;
int napping_threads;
int first_vcpuid;
u16 pcpu;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 92ec3fc..8aa8246 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -563,7 +563,6 @@ int main(void)
DEFINE(VCPU_WORT, offsetof(struct kvm_vcpu, arch.wort));
DEFINE(VCPU_SHADOW_SRR1, offsetof(struct kvm_vcpu, arch.shadow_srr1));
DEFINE(VCORE_ENTRY_EXIT, offsetof(struct kvmppc_vcore, 
entry_exit_count));
-   DEFINE(VCORE_NAP_COUNT, offsetof(struct kvmppc_vcore, nap_count));
DEFINE(VCORE_IN_GUEST, offsetof(struct kvmppc_vcore, in_guest));
DEFINE(VCORE_NAPPING_THREADS, offsetof(struct kvmppc_vcore, 
napping_threads));
DEFINE(VCORE_KVM, offsetof(struct kvmppc_vcore, kvm));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index fb4f166..7c1335d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1729,8 +1729,10 @@ static int kvmppc_grab_hwthread(int cpu)
	tpaca = &paca[cpu];
 
/* Ensure the thread won't go into the kernel if it wakes */
-   tpaca->kvm_hstate.hwthread_req = 1;
	tpaca->kvm_hstate.kvm_vcpu = NULL;
+   tpaca->kvm_hstate.napping = 0;
+   smp_wmb();
+   tpaca->kvm_hstate.hwthread_req = 1;
 
/*
 * If the thread is already executing in the kernel (e.g. handling
@@ -1773,35 +1775,43 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu)
}
	cpu = vc->pcpu + vcpu->arch.ptid;
	tpaca = &paca[cpu];
-   tpaca->kvm_hstate.kvm_vcpu = vcpu;
	tpaca->kvm_hstate.kvm_vcore = vc;
	tpaca->kvm_hstate.ptid = vcpu->arch.ptid;
	vcpu->cpu = vc->pcpu;
+   /* Order stores to hstate.kvm_vcore etc. before store to kvm_vcpu */
	smp_wmb();
+   tpaca->kvm_hstate.kvm_vcpu = vcpu;
 #if defined(CONFIG_PPC_ICP_NATIVE) && defined(CONFIG_SMP)
-   if (cpu != smp_processor_id()) {
+   if (cpu != smp_processor_id())
	xics_wake_cpu(cpu);
-   if (vcpu->arch.ptid)
-   ++vc->n_woken;
-   }
 #endif
 }
 
-static void kvmppc_wait_for_nap(struct kvmppc_vcore *vc)
+static void kvmppc_wait_for_nap(void)
 {
-   int i;
+   int cpu = smp_processor_id();
+   int i, loops;
 
-   HMT_low();
-   i = 0;
-   while (vc->nap_count < vc->n_woken) {
-   if (++i >= 100) {
-   pr_err("kvmppc_wait_for_nap timeout %d %d\n",
-  vc->nap_count, vc->n_woken);
-   break;
+   for (loops = 0; loops < 100; ++loops) {
+   /*
+* Check if all threads are finished.
+* We set the vcpu pointer when starting a thread
+* and the thread clears it when finished, so we look
+* for any threads that still have a non-NULL vcpu ptr.
+*/
+   for (i = 1; i < threads_per_subcore; ++i)
+   if (paca[cpu + i].kvm_hstate.kvm_vcpu)
+   break;
+   if (i == threads_per_subcore) {
+   HMT_medium();
+   return;
}
-   cpu_relax();
+   HMT_low();
}
HMT_medium();
+   for (i = 1; i < threads_per_subcore; ++i)
+   if (paca[cpu + i].kvm_hstate.kvm_vcpu)
+   pr_err("KVM: CPU %d seems to be stuck\n", cpu + i);
 }
 
 /*
@@ -1942,8 +1952,6 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
/*
 * Initialize *vc.
 */
-   vc->n_woken = 0;

[PULL 00/21] ppc patch queue 2015-04-21 for 4.1

2015-04-21 Thread Alexander Graf
Hi Paolo / Marcelo,

This is my current patch queue for ppc.  Please pull.

Alex


The following changes since commit b79013b2449c23f1f505bdf39c5a6c330338b244:

  Merge tag 'staging-4.1-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging (2015-04-13 
17:37:33 -0700)

are available in the git repository at:


  git://github.com/agraf/linux-2.6.git tags/signed-kvm-ppc-queue

for you to fetch changes up to 66feed61cdf6ee65fd551d3460b1efba6bee55b8:

  KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8 (2015-04-21 
15:21:34 +0200)


Patch queue for ppc - 2015-04-21

This is the latest queue for KVM on PowerPC changes. Highlights this
time around:

  - Book3S HV: Debugging aids
  - Book3S HV: Minor performance improvements
  - Book3S HV: Cleanups


Aneesh Kumar K.V (2):
  KVM: PPC: Book3S HV: Remove RMA-related variables from code
  KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte

David Gibson (1):
  kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

Michael Ellerman (1):
  KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.

Paul Mackerras (12):
  KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT
  KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
  KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update
  KVM: PPC: Book3S HV: Minor cleanups
  KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu
  KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken
  KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI
  KVM: PPC: Book3S HV: Use decrementer to wake napping threads
  KVM: PPC: Book3S HV: Use bitmap of active threads rather than count
  KVM: PPC: Book3S HV: Streamline guest entry and exit
  KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C
  KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8

Suresh E. Warrier (2):
  powerpc: Export __spin_yield
  KVM: PPC: Book3S HV: Add guest-host real mode completion counters

Suresh Warrier (3):
  KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock
  KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode
  KVM: PPC: Book3S HV: Add ICP real mode counters

 Documentation/virtual/kvm/api.txt|  17 +
 arch/powerpc/include/asm/archrandom.h|  11 +-
 arch/powerpc/include/asm/kvm_book3s.h|   3 +
 arch/powerpc/include/asm/kvm_book3s_64.h |  18 +
 arch/powerpc/include/asm/kvm_host.h  |  47 ++-
 arch/powerpc/include/asm/kvm_ppc.h   |   2 +
 arch/powerpc/include/asm/time.h  |   3 +
 arch/powerpc/kernel/asm-offsets.c|  20 +-
 arch/powerpc/kernel/time.c   |   6 +
 arch/powerpc/kvm/Kconfig |  14 +
 arch/powerpc/kvm/book3s.c|  76 +
 arch/powerpc/kvm/book3s_64_mmu_hv.c  | 189 +--
 arch/powerpc/kvm/book3s_hv.c | 435 ++--
 arch/powerpc/kvm/book3s_hv_builtin.c | 100 +-
 arch/powerpc/kvm/book3s_hv_rm_mmu.c  |  25 +-
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 238 +++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 559 +++
 arch/powerpc/kvm/book3s_pr_papr.c|  28 ++
 arch/powerpc/kvm/book3s_xics.c   | 105 --
 arch/powerpc/kvm/book3s_xics.h   |  13 +-
 arch/powerpc/kvm/powerpc.c   |   3 +
 arch/powerpc/lib/locks.c |   1 +
 arch/powerpc/platforms/powernv/rng.c |  29 ++
 include/uapi/linux/kvm.h |   1 +
 virt/kvm/kvm_main.c  |   1 +
 25 files changed, 1580 insertions(+), 364 deletions(-)
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PULL 16/21] KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

When running a multi-threaded guest and vcpu 0 in a virtual core
is not running in the guest (i.e. it is busy elsewhere in the host),
thread 0 of the physical core will switch the MMU to the guest and
then go to nap mode in the code at kvm_do_nap.  If the guest sends
an IPI to thread 0 using the msgsndp instruction, that will wake
up thread 0 and cause all the threads in the guest to exit to the
host unnecessarily.  To avoid the unnecessary exit, this arranges
for the PECEDP bit to be cleared in this situation.  When napping
due to a H_CEDE from the guest, we still set PECEDP so that the
thread will wake up on an IPI sent using msgsndp.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 6716db3..12d7e4c 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -191,6 +191,7 @@ kvmppc_primary_no_guest:
li  r3, NAPPING_NOVCPU
stb r3, HSTATE_NAPPING(r13)
 
+   li  r3, 0   /* Don't wake on privileged (OS) doorbell */
b   kvm_do_nap
 
 kvm_novcpu_wakeup:
@@ -2129,10 +2130,13 @@ _GLOBAL(kvmppc_h_cede)  /* r3 = vcpu pointer, 
r11 = msr, r13 = paca */
bl  kvmhv_accumulate_time
 #endif
 
+   lis r3, LPCR_PECEDP@h   /* Do wake on privileged doorbell */
+
/*
 * Take a nap until a decrementer or external or doorbell interrupt
-* occurs, with PECE1, PECE0 and PECEDP set in LPCR. Also clear the
-* runlatch bit before napping.
+* occurs, with PECE1 and PECE0 set in LPCR.
+* On POWER8, if we are ceding, also set PECEDP.
+* Also clear the runlatch bit before napping.
 */
 kvm_do_nap:
mfspr   r0, SPRN_CTRLF
@@ -2144,7 +2148,7 @@ kvm_do_nap:
mfspr   r5,SPRN_LPCR
ori r5,r5,LPCR_PECE0 | LPCR_PECE1
 BEGIN_FTR_SECTION
-   orisr5,r5,LPCR_PECEDP@h
+   rlwimi  r5, r3, 0, LPCR_PECEDP
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
mtspr   SPRN_LPCR,r5
isync
-- 
1.8.1.4



[PULL 02/21] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-04-21 Thread Alexander Graf
From: David Gibson da...@gibson.dropbear.id.au

On POWER, storage caching is usually configured via the MMU - attributes
such as cache-inhibited are stored in the TLB and the hashed page table.

This makes correctly performing cache inhibited IO accesses awkward when
the MMU is turned off (real mode).  Some CPU models provide special
registers to control the cache attributes of real mode load and stores but
this is not at all consistent.  This is a problem in particular for SLOF,
the firmware used on KVM guests, which runs entirely in real mode, but
which needs to do IO to load the kernel.

To simplify this, qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
a logical address (aka guest physical address).  SLOF uses these for IO.

However, because these are implemented within qemu, not the host kernel,
these bypass any IO devices emulated within KVM itself.  The simplest way
to see this problem is to attempt to boot a KVM guest from a virtio-blk
device with iothread / dataplane enabled.  The iothread code relies on an
in-kernel implementation of the virtio queue notification, which is not
triggered by the IO hcalls, and so the guest will stall in SLOF unable to
load the guest OS.

This patch addresses this by providing in-kernel implementations of the
2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
address not handled by the KVM IO bus will cause a VM exit, hitting the
qemu implementation as before.

Note that a userspace change is also required, in order to enable these
new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
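
For quick reference, the register convention the new handlers use can be
read straight off the code below; in sketch form:

/*
 * Hcall argument convention implemented by this patch:
 *   H_LOGICAL_CI_LOAD:  r4 = size (1, 2, 4 or 8 bytes),
 *                       r5 = logical (guest physical) address;
 *                       on H_SUCCESS the loaded value is returned in r4.
 *   H_LOGICAL_CI_STORE: r4 = size, r5 = logical address, r6 = value.
 * Any access the KVM IO bus cannot satisfy returns H_TOO_HARD, which
 * falls back to the existing qemu implementation.
 */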

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
[agraf: fix compilation]
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s.h |  3 ++
 arch/powerpc/kvm/book3s.c | 76 +++
 arch/powerpc/kvm/book3s_hv.c  | 12 ++
 arch/powerpc/kvm/book3s_pr_papr.c | 28 +
 4 files changed, 119 insertions(+)

diff --git a/arch/powerpc/include/asm/kvm_book3s.h 
b/arch/powerpc/include/asm/kvm_book3s.h
index 942c7b1..578e550 100644
--- a/arch/powerpc/include/asm/kvm_book3s.h
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -292,6 +292,9 @@ static inline bool kvmppc_supports_magic_page(struct 
kvm_vcpu *vcpu)
return !is_kvmppc_hv_enabled(vcpu->kvm);
 }
 
+extern int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu);
+extern int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu);
+
 /* Magic register values loaded into r3 and r4 before the 'sc' assembly
  * instruction for the OSI hypercalls */
 #define OSI_SC_MAGIC_R3    0x113724FA
diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
index cfbcdc6..453a8a4 100644
--- a/arch/powerpc/kvm/book3s.c
+++ b/arch/powerpc/kvm/book3s.c
@@ -821,6 +821,82 @@ void kvmppc_core_destroy_vm(struct kvm *kvm)
 #endif
 }
 
+int kvmppc_h_logical_ci_load(struct kvm_vcpu *vcpu)
+{
+   unsigned long size = kvmppc_get_gpr(vcpu, 4);
+   unsigned long addr = kvmppc_get_gpr(vcpu, 5);
+   u64 buf;
+   int ret;
+
+   if (!is_power_of_2(size) || (size > sizeof(buf)))
+   return H_TOO_HARD;
+
+   ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, addr, size, &buf);
+   if (ret != 0)
+   return H_TOO_HARD;
+
+   switch (size) {
+   case 1:
+   kvmppc_set_gpr(vcpu, 4, *(u8 *)&buf);
+   break;
+
+   case 2:
+   kvmppc_set_gpr(vcpu, 4, be16_to_cpu(*(__be16 *)&buf));
+   break;
+
+   case 4:
+   kvmppc_set_gpr(vcpu, 4, be32_to_cpu(*(__be32 *)&buf));
+   break;
+
+   case 8:
+   kvmppc_set_gpr(vcpu, 4, be64_to_cpu(*(__be64 *)&buf));
+   break;
+
+   default:
+   BUG();
+   }
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_load);
+
+int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu)
+{
+   unsigned long size = kvmppc_get_gpr(vcpu, 4);
+   unsigned long addr = kvmppc_get_gpr(vcpu, 5);
+   unsigned long val = kvmppc_get_gpr(vcpu, 6);
+   u64 buf;
+   int ret;
+
+   switch (size) {
+   case 1:
+   *(u8 *)&buf = val;
+   break;
+
+   case 2:
+   *(__be16 *)&buf = cpu_to_be16(val);
+   break;
+
+   case 4:
+   *(__be32 *)&buf = cpu_to_be32(val);
+   break;
+
+   case 8:
+   *(__be64 *)&buf = cpu_to_be64(val);
+   break;
+
+   default:
+   return H_TOO_HARD;
+   }
+
+   ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, addr, size, &buf);
+   if (ret != 0)
+   return H_TOO_HARD;
+
+   return H_SUCCESS;
+}
+EXPORT_SYMBOL_GPL(kvmppc_h_logical_ci_store);
+
 int kvmppc_core_check_processor_compat(void)
 {
/*
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index de74756

[PULL 14/21] KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

Rather than calling cond_resched() in kvmppc_run_core() before doing
the post-processing for the vcpus that we have just run (that is,
calling kvmppc_handle_exit_hv(), kvmppc_set_timer(), etc.), we now do
that post-processing before calling cond_resched(), and that post-
processing is moved out into its own function, post_guest_process().

The reschedule point is now in kvmppc_run_vcpu() and we define a new
vcore state, VCORE_PREEMPT, to indicate that the vcore's runner
task is runnable but not running.  (Doing the reschedule with the
vcore in VCORE_INACTIVE state would be bad because there are potentially
other vcpus waiting for the runner in kvmppc_wait_for_exec() which
then wouldn't get woken up.)

Also, we make use of the handy cond_resched_lock() function, which
unlocks and relocks vc->lock for us around the reschedule.
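
The core of the new reschedule point, as a minimal sketch (not the verbatim
hunk; the exact state handling around it in the patch is a little more
involved):

	/* Keep the vcore in VCORE_PREEMPT, not VCORE_INACTIVE, so vcpus
	 * blocked in kvmppc_wait_for_exec() still see a live runner. */
	vc->vcore_state = VCORE_PREEMPT;
	cond_resched_lock(&vc->lock);	/* drops/retakes vc->lock on resched */
	vc->vcore_state = VCORE_INACTIVE;	/* restored as appropriate */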

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  5 +-
 arch/powerpc/kvm/book3s_hv.c| 92 +
 2 files changed, 55 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 3eecd88..83c4425 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -304,8 +304,9 @@ struct kvmppc_vcore {
 /* Values for vcore_state */
 #define VCORE_INACTIVE 0
 #define VCORE_SLEEPING 1
-#define VCORE_RUNNING  2
-#define VCORE_EXITING  3
+#define VCORE_PREEMPT  2
+#define VCORE_RUNNING  3
+#define VCORE_EXITING  4
 
 /*
  * Struct used to manage memory for a virtual processor area
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index b38c10e..fb4f166 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1882,15 +1882,50 @@ static void prepare_threads(struct kvmppc_vcore *vc)
}
 }
 
+static void post_guest_process(struct kvmppc_vcore *vc)
+{
+   u64 now;
+   long ret;
+   struct kvm_vcpu *vcpu, *vnext;
+
+   now = get_tb();
+   list_for_each_entry_safe(vcpu, vnext, &vc->runnable_threads,
+arch.run_list) {
+   /* cancel pending dec exception if dec is positive */
+   if (now < vcpu->arch.dec_expires &&
+   kvmppc_core_pending_dec(vcpu))
+   kvmppc_core_dequeue_dec(vcpu);
+
+   trace_kvm_guest_exit(vcpu);
+
+   ret = RESUME_GUEST;
+   if (vcpu->arch.trap)
+   ret = kvmppc_handle_exit_hv(vcpu->arch.kvm_run, vcpu,
+   vcpu->arch.run_task);
+
+   vcpu->arch.ret = ret;
+   vcpu->arch.trap = 0;
+
+   if (vcpu->arch.ceded) {
+   if (!is_kvmppc_resume_guest(ret))
+   kvmppc_end_cede(vcpu);
+   else
+   kvmppc_set_timer(vcpu);
+   }
+   if (!is_kvmppc_resume_guest(vcpu->arch.ret)) {
+   kvmppc_remove_runnable(vc, vcpu);
+   wake_up(&vcpu->arch.cpu_run);
+   }
+   }
+}
+
 /*
  * Run a set of guest threads on a physical core.
 * Called with vc->lock held.
  */
 static void kvmppc_run_core(struct kvmppc_vcore *vc)
 {
-   struct kvm_vcpu *vcpu, *vnext;
-   long ret;
-   u64 now;
+   struct kvm_vcpu *vcpu;
int i;
int srcu_idx;
 
@@ -1922,8 +1957,11 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
 */
if ((threads_per_core > 1) &&
((vc->num_threads > threads_per_subcore) || !on_primary_thread())) {
-   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
+   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
vcpu->arch.ret = -EBUSY;
+   kvmppc_remove_runnable(vc, vcpu);
+   wake_up(&vcpu->arch.cpu_run);
+   }
goto out;
}
 
@@ -1979,44 +2017,12 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
kvm_guest_exit();
 
preempt_enable();
-   cond_resched();
 
spin_lock(&vc->lock);
-   now = get_tb();
-   list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
-   /* cancel pending dec exception if dec is positive */
-   if (now < vcpu->arch.dec_expires &&
-   kvmppc_core_pending_dec(vcpu))
-   kvmppc_core_dequeue_dec(vcpu);
-
-   trace_kvm_guest_exit(vcpu);
-
-   ret = RESUME_GUEST;
-   if (vcpu->arch.trap)
-   ret = kvmppc_handle_exit_hv(vcpu->arch.kvm_run, vcpu,
-   vcpu->arch.run_task);
-
-   vcpu->arch.ret = ret;
-   vcpu->arch.trap = 0;
-
-   if (vcpu

[PULL 20/21] KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This replaces the assembler code for kvmhv_commence_exit() with C code
in book3s_hv_builtin.c.  It also moves the IPI sending code that was
in book3s_hv_rm_xics.c into a new kvmhv_rm_send_ipi() function so it
can be used by kvmhv_commence_exit() as well as icp_rm_set_vcpu_irq().
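
kvmhv_commence_exit() below works on vcore->entry_exit_map; its encoding,
inferred from the code in this patch (the two macros are illustrative and
not part of the patch):

/*
 * vcore->entry_exit_map, as used below:
 *   bits 0-7  : bitmap of threads that have entered the guest
 *   bits 8-15 : bitmap of threads that have started exiting
 * The first thread to set its exit bit interrupts the others.
 */
#define EE_ENTRY_MAP(ee)	((ee) & 0xff)	/* illustrative only */
#define EE_EXIT_MAP(ee)		((ee) >> 8)	/* illustrative only */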

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_64.h |  2 +
 arch/powerpc/kvm/book3s_hv_builtin.c | 63 ++
 arch/powerpc/kvm/book3s_hv_rm_xics.c | 12 +-
 arch/powerpc/kvm/book3s_hv_rmhandlers.S  | 66 
 4 files changed, 75 insertions(+), 68 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h 
b/arch/powerpc/include/asm/kvm_book3s_64.h
index 869c53f..2b84e48 100644
--- a/arch/powerpc/include/asm/kvm_book3s_64.h
+++ b/arch/powerpc/include/asm/kvm_book3s_64.h
@@ -438,6 +438,8 @@ static inline struct kvm_memslots *kvm_memslots_raw(struct 
kvm *kvm)
 
 extern void kvmppc_mmu_debugfs_init(struct kvm *kvm);
 
+extern void kvmhv_rm_send_ipi(int cpu);
+
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 
 #endif /* __ASM_KVM_BOOK3S_64_H__ */
diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index 2754251..c42aa55 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -22,6 +22,7 @@
 #include <asm/kvm_ppc.h>
 #include <asm/kvm_book3s.h>
 #include <asm/archrandom.h>
+#include <asm/xics.h>
 
 #define KVM_CMA_CHUNK_ORDER    18
 
@@ -184,3 +185,65 @@ long kvmppc_h_random(struct kvm_vcpu *vcpu)
 
return H_HARDWARE;
 }
+
+static inline void rm_writeb(unsigned long paddr, u8 val)
+{
+   __asm__ __volatile__("stbcix %0,0,%1"
+   : : "r" (val), "r" (paddr) : "memory");
+}
+
+/*
+ * Send an interrupt to another CPU.
+ * This can only be called in real mode.
+ * The caller needs to include any barrier needed to order writes
+ * to memory vs. the IPI/message.
+ */
+void kvmhv_rm_send_ipi(int cpu)
+{
+   unsigned long xics_phys;
+
+   /* Poke the target */
+   xics_phys = paca[cpu].kvm_hstate.xics_phys;
+   rm_writeb(xics_phys + XICS_MFRR, IPI_PRIORITY);
+}
+
+/*
+ * The following functions are called from the assembly code
+ * in book3s_hv_rmhandlers.S.
+ */
+static void kvmhv_interrupt_vcore(struct kvmppc_vcore *vc, int active)
+{
+   int cpu = vc->pcpu;
+
+   /* Order setting of exit map vs. msgsnd/IPI */
+   smp_mb();
+   for (; active; active >>= 1, ++cpu)
+   if (active & 1)
+   kvmhv_rm_send_ipi(cpu);
+}
+
+void kvmhv_commence_exit(int trap)
+{
+   struct kvmppc_vcore *vc = local_paca->kvm_hstate.kvm_vcore;
+   int ptid = local_paca->kvm_hstate.ptid;
+   int me, ee;
+
+   /* Set our bit in the threads-exiting-guest map in the 0xff00
+  bits of vcore->entry_exit_map */
+   me = 0x100 << ptid;
+   do {
+   ee = vc->entry_exit_map;
+   } while (cmpxchg(&vc->entry_exit_map, ee, ee | me) != ee);
+
+   /* Are we the first here? */
+   if ((ee >> 8) != 0)
+   return;
+
+   /*
+* Trigger the other threads in this vcore to exit the guest.
+* If this is a hypervisor decrementer interrupt then they
+* will be already on their way out of the guest.
+*/
+   if (trap != BOOK3S_INTERRUPT_HV_DECREMENTER)
+   kvmhv_interrupt_vcore(vc, ee & ~(1 << ptid));
+}
diff --git a/arch/powerpc/kvm/book3s_hv_rm_xics.c 
b/arch/powerpc/kvm/book3s_hv_rm_xics.c
index 6dded8c..00e45b6 100644
--- a/arch/powerpc/kvm/book3s_hv_rm_xics.c
+++ b/arch/powerpc/kvm/book3s_hv_rm_xics.c
@@ -26,12 +26,6 @@
 static void icp_rm_deliver_irq(struct kvmppc_xics *xics, struct kvmppc_icp *icp,
 			       u32 new_irq);
 
-static inline void rm_writeb(unsigned long paddr, u8 val)
-{
-   __asm__ __volatile__("sync; stbcix %0,0,%1"
-   : : "r" (val), "r" (paddr) : "memory");
-}
-
 /* -- ICS routines -- */
 static void ics_rm_check_resend(struct kvmppc_xics *xics,
struct kvmppc_ics *ics, struct kvmppc_icp *icp)
@@ -60,7 +54,6 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
struct kvm_vcpu *this_vcpu)
 {
struct kvmppc_icp *this_icp = this_vcpu->arch.icp;
-   unsigned long xics_phys;
int cpu;
 
/* Mark the target VCPU as having an interrupt pending */
@@ -83,9 +76,8 @@ static void icp_rm_set_vcpu_irq(struct kvm_vcpu *vcpu,
/* In SMT cpu will always point to thread 0, we adjust it */
cpu += vcpu->arch.ptid;
 
-   /* Not too hard, then poke the target */
-   xics_phys = paca[cpu].kvm_hstate.xics_phys;
-   rm_writeb(xics_phys + XICS_MFRR, IPI_PRIORITY);
+   smp_mb();
+   kvmhv_rm_send_ipi(cpu);
 }
 
 static void icp_rm_clr_vcpu_irq(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm

[PULL 13/21] KVM: PPC: Book3S HV: Minor cleanups

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

* Remove unused kvmppc_vcore::n_busy field.
* Remove setting of RMOR, since it was only used on PPC970 and the
  PPC970 KVM support has been removed.
* Don't use r1 or r2 in setting the runlatch since they are
  conventionally reserved (r1 is the stack pointer, r2 the TOC
  pointer); use r0 instead.
* Streamline the code a little and remove the ext_interrupt_to_host
  label.
* Add some comments about register usage.
* hcall_try_real_mode doesn't need to be global, and can't be
  called from C code anyway.

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  2 --
 arch/powerpc/kernel/asm-offsets.c   |  1 -
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 44 ++---
 3 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 2f339ff..3eecd88 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -227,7 +227,6 @@ struct kvm_arch {
unsigned long host_sdr1;
int tlbie_lock;
unsigned long lpcr;
-   unsigned long rmor;
unsigned long vrma_slb_v;
int hpte_setup_done;
u32 hpt_order;
@@ -271,7 +270,6 @@ struct kvm_arch {
  */
 struct kvmppc_vcore {
int n_runnable;
-   int n_busy;
int num_threads;
int entry_exit_count;
int n_woken;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 3fea721..92ec3fc 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -505,7 +505,6 @@ int main(void)
DEFINE(KVM_NEED_FLUSH, offsetof(struct kvm, arch.need_tlb_flush.bits));
DEFINE(KVM_ENABLED_HCALLS, offsetof(struct kvm, arch.enabled_hcalls));
DEFINE(KVM_LPCR, offsetof(struct kvm, arch.lpcr));
-   DEFINE(KVM_RMOR, offsetof(struct kvm, arch.rmor));
DEFINE(KVM_VRMA_SLB_V, offsetof(struct kvm, arch.vrma_slb_v));
DEFINE(VCPU_DSISR, offsetof(struct kvm_vcpu, arch.shregs.dsisr));
DEFINE(VCPU_DAR, offsetof(struct kvm_vcpu, arch.shregs.dar));
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b06fe53..f8267e5 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -245,9 +245,9 @@ kvm_novcpu_exit:
 kvm_start_guest:
 
/* Set runlatch bit the minute you wake up from nap */
-   mfspr   r1, SPRN_CTRLF
-   ori r1, r1, 1
-   mtspr   SPRN_CTRLT, r1
+   mfspr   r0, SPRN_CTRLF
+   ori r0, r0, 1
+   mtspr   SPRN_CTRLT, r0
 
ld  r2,PACATOC(r13)
 
@@ -493,11 +493,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
cmpwi   r0,0
beq 20b
 
-   /* Set LPCR and RMOR. */
+   /* Set LPCR. */
 10:ld  r8,VCORE_LPCR(r5)
mtspr   SPRN_LPCR,r8
-   ld  r8,KVM_RMOR(r9)
-   mtspr   SPRN_RMOR,r8
isync
 
/* Check if HDEC expires soon */
@@ -1075,7 +1073,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
bne 2f
mfspr   r3,SPRN_HDEC
cmpwi   r3,0
-   bge ignore_hdec
+   mr  r4,r9
+   bge fast_guest_return
 2:
/* See if this is an hcall we can handle in real mode */
cmpwi   r12,BOOK3S_INTERRUPT_SYSCALL
@@ -1083,26 +1082,21 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
/* External interrupt ? */
cmpwi   r12, BOOK3S_INTERRUPT_EXTERNAL
-   bne+ext_interrupt_to_host
+   bne+guest_exit_cont
 
/* External interrupt, first check for host_ipi. If this is
 * set, we know the host wants us out so let's do it now
 */
bl  kvmppc_read_intr
cmpdi   r3, 0
-   bgt ext_interrupt_to_host
+   bgt guest_exit_cont
 
/* Check if any CPU is heading out to the host, if so head out too */
ld  r5, HSTATE_KVM_VCORE(r13)
lwz r0, VCORE_ENTRY_EXIT(r5)
cmpwi   r0, 0x100
-   bge ext_interrupt_to_host
-
-   /* Return to guest after delivering any pending interrupt */
mr  r4, r9
-   b   deliver_guest_interrupt
-
-ext_interrupt_to_host:
+   blt deliver_guest_interrupt
 
 guest_exit_cont:   /* r9 = vcpu, r12 = trap, r13 = paca */
/* Save more register state  */
@@ -1763,8 +1757,10 @@ kvmppc_hisi:
  * Returns to the guest if we handle it, or continues on up to
  * the kernel if we can't (i.e. if we don't have a handler for
  * it, or if the handler returns H_TOO_HARD).
+ *
+ * r5 - r8 contain hcall args,
+ * r9 = vcpu, r10 = pc, r11 = msr, r12 = trap, r13 = paca
  */
-   .globl  hcall_try_real_mode
 hcall_try_real_mode:
ld  r3,VCPU_GPR(R3)(r9)
andi.   r0,r11,MSR_PR
@@ -2024,10 +2020,6 @@ hcall_real_table:
.globl  hcall_real_table_end
 hcall_real_table_end:
 
-ignore_hdec

[PULL 19/21] KVM: PPC: Book3S HV: Streamline guest entry and exit

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

On entry to the guest, secondary threads now wait for the primary to
switch the MMU after loading up most of their state, rather than before.
This means that the secondary threads get into the guest sooner, in the
common case where the secondary threads get to kvmppc_hv_entry before
the primary thread.

On exit, the first thread out increments the exit count and interrupts
the other threads (to get them out of the guest) before saving most
of its state, rather than after.  That means that the other threads
exit sooner and means that the first thread doesn't spend so much
time waiting for the other threads at the point where the MMU gets
switched back to the host.

This pulls out the code that increments the exit count and interrupts
other threads into a separate function, kvmhv_commence_exit().
This also makes sure that r12 and vcpu->arch.trap are set correctly
in some corner cases.

Statistics from /sys/kernel/debug/kvm/vm*/vcpu*/timings show the
improvement.  Aggregating across vcpus for a guest with 32 vcpus,
8 threads/vcore, running on a POWER8, gives this before the change:

 rm_entry: avg 4537.3ns (222 - 48444, 1068878 samples)
  rm_exit: avg 4787.6ns (152 - 165490, 1010717 samples)
  rm_intr: avg 1673.6ns (12 - 341304, 3818691 samples)

and this after the change:

 rm_entry: avg 3427.7ns (232 - 68150, 1118921 samples)
  rm_exit: avg 4716.0ns (12 - 150720, 1119477 samples)
  rm_intr: avg 1614.8ns (12 - 522436, 3850432 samples)

showing a substantial reduction in the time spent per guest entry in
the real-mode guest entry code, and smaller reductions in the real
mode guest exit and interrupt handling times.  (The test was to start
the guest and boot Fedora 20 big-endian to the login prompt.)
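
The secondary-thread side of the new entry ordering is easiest to see as a
C-level sketch (the helper name is made up; the real code is the assembly
below, and vc->in_guest is the flag the primary sets once the MMU switch
is done):

/* Sketch: what a secondary thread now does before entering the guest. */
static void secondary_wait_for_mmu_switch(struct kvmppc_vcore *vc)
{
	while (!READ_ONCE(vc->in_guest))
		cpu_relax();		/* primary still switching the MMU */
	mtspr(SPRN_LPCR, vc->lpcr);	/* per-vcore LPCR, as in the asm */
	isync();
}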

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 212 +++-
 1 file changed, 126 insertions(+), 86 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S 
b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 245f5c9..3f6fd78 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -175,6 +175,19 @@ kvmppc_primary_no_guest:
/* put the HDEC into the DEC, since HDEC interrupts don't wake us */
mfspr   r3, SPRN_HDEC
mtspr   SPRN_DEC, r3
+   /*
+* Make sure the primary has finished the MMU switch.
+* We should never get here on a secondary thread, but
+* check it for robustness' sake.
+*/
+   ld  r5, HSTATE_KVM_VCORE(r13)
+65:lbz r0, VCORE_IN_GUEST(r5)
+   cmpwi   r0, 0
+   beq 65b
+   /* Set LPCR. */
+   ld  r8,VCORE_LPCR(r5)
+   mtspr   SPRN_LPCR,r8
+   isync
/* set our bit in napping_threads */
ld  r5, HSTATE_KVM_VCORE(r13)
lbz r7, HSTATE_PTID(r13)
@@ -206,7 +219,7 @@ kvm_novcpu_wakeup:
 
/* check the wake reason */
bl  kvmppc_check_wake_reason
-   
+
/* see if any other thread is already exiting */
lwz r0, VCORE_ENTRY_EXIT(r5)
cmpwi   r0, 0x100
@@ -244,7 +257,15 @@ kvm_novcpu_wakeup:
b   kvmppc_got_guest
 
 kvm_novcpu_exit:
-   b   hdec_soon
+#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
+   ld  r4, HSTATE_KVM_VCPU(r13)
+   cmpdi   r4, 0
+   beq 13f
+   addir3, r4, VCPU_TB_RMEXIT
+   bl  kvmhv_accumulate_time
+#endif
+13:bl  kvmhv_commence_exit
+   b   kvmhv_switch_to_host
 
 /*
  * We come in here when wakened from nap mode.
@@ -422,7 +443,7 @@ kvmppc_hv_entry:
/* Primary thread switches to guest partition. */
ld  r9,VCORE_KVM(r5)/* pointer to struct kvm */
cmpwi   r6,0
-   bne 20f
+   bne 10f
ld  r6,KVM_SDR1(r9)
lwz r7,KVM_LPID(r9)
li  r0,LPID_RSVD/* switch to reserved LPID */
@@ -493,26 +514,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 
li  r0,1
stb r0,VCORE_IN_GUEST(r5)   /* signal secondaries to continue */
-   b   10f
-
-   /* Secondary threads wait for primary to have done partition switch */
-20:lbz r0,VCORE_IN_GUEST(r5)
-   cmpwi   r0,0
-   beq 20b
-
-   /* Set LPCR. */
-10:ld  r8,VCORE_LPCR(r5)
-   mtspr   SPRN_LPCR,r8
-   isync
-
-   /* Check if HDEC expires soon */
-   mfspr   r3,SPRN_HDEC
-   cmpwi   r3,512  /* 1 microsecond */
-   li  r12,BOOK3S_INTERRUPT_HV_DECREMENTER
-   blt hdec_soon
 
/* Do we have a guest vcpu to run? */
-   cmpdi   r4, 0
+10:cmpdi   r4, 0
beq kvmppc_primary_no_guest
 kvmppc_got_guest:
 
@@ -837,6 +841,30 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
clrrdi  r6,r6,1
mtspr   SPRN_CTRLT,r6
 4:
+   /* Secondary threads wait for primary to have

[PULL 15/21] KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

We can tell when a secondary thread has finished running a guest by
the fact that it clears its kvm_hstate.kvm_vcpu pointer, so there
is no real need for the nap_count field in the kvmppc_vcore struct.
This changes kvmppc_wait_for_nap to poll the kvm_hstate.kvm_vcpu
pointers of the secondary threads rather than polling vc->nap_count.
Besides reducing the size of the kvmppc_vcore struct by 8 bytes,
this also means that we can tell which secondary threads have got
stuck and thus print a more informative error message.
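
The publish order that makes this polling safe, pulled out of the hunks
below as a sketch:

	/* kvmppc_start_thread(): a non-NULL kvm_vcpu is what marks the
	 * thread as in use, so publish it last. */
	tpaca->kvm_hstate.kvm_vcore = vc;
	tpaca->kvm_hstate.ptid = vcpu->arch.ptid;
	smp_wmb();			/* order the stores above first */
	tpaca->kvm_hstate.kvm_vcpu = vcpu;	/* thread now counts as busy */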

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  2 --
 arch/powerpc/kernel/asm-offsets.c   |  1 -
 arch/powerpc/kvm/book3s_hv.c| 47 +++--
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 19 +
 4 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 83c4425..1517faa 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -272,8 +272,6 @@ struct kvmppc_vcore {
int n_runnable;
int num_threads;
int entry_exit_count;
-   int n_woken;
-   int nap_count;
int napping_threads;
int first_vcpuid;
u16 pcpu;
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 92ec3fc..8aa8246 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -563,7 +563,6 @@ int main(void)
DEFINE(VCPU_WORT, offsetof(struct kvm_vcpu, arch.wort));
DEFINE(VCPU_SHADOW_SRR1, offsetof(struct kvm_vcpu, arch.shadow_srr1));
DEFINE(VCORE_ENTRY_EXIT, offsetof(struct kvmppc_vcore, entry_exit_count));
-   DEFINE(VCORE_NAP_COUNT, offsetof(struct kvmppc_vcore, nap_count));
DEFINE(VCORE_IN_GUEST, offsetof(struct kvmppc_vcore, in_guest));
DEFINE(VCORE_NAPPING_THREADS, offsetof(struct kvmppc_vcore, napping_threads));
DEFINE(VCORE_KVM, offsetof(struct kvmppc_vcore, kvm));
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index fb4f166..7c1335d 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1729,8 +1729,10 @@ static int kvmppc_grab_hwthread(int cpu)
tpaca = &paca[cpu];

/* Ensure the thread won't go into the kernel if it wakes */
-   tpaca->kvm_hstate.hwthread_req = 1;
tpaca->kvm_hstate.kvm_vcpu = NULL;
+   tpaca->kvm_hstate.napping = 0;
+   smp_wmb();
+   tpaca->kvm_hstate.hwthread_req = 1;
 
/*
 * If the thread is already executing in the kernel (e.g. handling
@@ -1773,35 +1775,43 @@ static void kvmppc_start_thread(struct kvm_vcpu *vcpu)
}
cpu = vc->pcpu + vcpu->arch.ptid;
tpaca = &paca[cpu];
-   tpaca->kvm_hstate.kvm_vcpu = vcpu;
tpaca->kvm_hstate.kvm_vcore = vc;
tpaca->kvm_hstate.ptid = vcpu->arch.ptid;
vcpu->cpu = vc->pcpu;
+   /* Order stores to hstate.kvm_vcore etc. before store to kvm_vcpu */
smp_wmb();
+   tpaca->kvm_hstate.kvm_vcpu = vcpu;
 #if defined(CONFIG_PPC_ICP_NATIVE) && defined(CONFIG_SMP)
-   if (cpu != smp_processor_id()) {
+   if (cpu != smp_processor_id())
xics_wake_cpu(cpu);
-   if (vcpu->arch.ptid)
-   ++vc->n_woken;
-   }
 #endif
 }
 
-static void kvmppc_wait_for_nap(struct kvmppc_vcore *vc)
+static void kvmppc_wait_for_nap(void)
 {
-   int i;
+   int cpu = smp_processor_id();
+   int i, loops;
 
-   HMT_low();
-   i = 0;
-   while (vc->nap_count < vc->n_woken) {
-   if (++i >= 100) {
-   pr_err("kvmppc_wait_for_nap timeout %d %d\n",
-  vc->nap_count, vc->n_woken);
-   break;
+   for (loops = 0; loops < 100; ++loops) {
+   /*
+* Check if all threads are finished.
+* We set the vcpu pointer when starting a thread
+* and the thread clears it when finished, so we look
+* for any threads that still have a non-NULL vcpu ptr.
+*/
+   for (i = 1; i < threads_per_subcore; ++i)
+   if (paca[cpu + i].kvm_hstate.kvm_vcpu)
+   break;
+   if (i == threads_per_subcore) {
+   HMT_medium();
+   return;
}
-   cpu_relax();
+   HMT_low();
}
HMT_medium();
+   for (i = 1; i < threads_per_subcore; ++i)
+   if (paca[cpu + i].kvm_hstate.kvm_vcpu)
+   pr_err("KVM: CPU %d seems to be stuck\n", cpu + i);
 }
 
 /*
@@ -1942,8 +1952,6 @@ static void kvmppc_run_core(struct kvmppc_vcore *vc)
/*
 * Initialize *vc.
 */
-   vc->n_woken = 0

[PULL 11/21] KVM: PPC: Book3S HV: Accumulate timing information for real-mode code

2015-04-21 Thread Alexander Graf
From: Paul Mackerras pau...@samba.org

This reads the timebase at various points in the real-mode guest
entry/exit code and uses that to accumulate total, minimum and
maximum time spent in those parts of the code.  Currently these
times are accumulated per vcpu in 5 parts of the code:

* rm_entry - time taken from the start of kvmppc_hv_entry() until
  just before entering the guest.
* rm_intr - time from when we take a hypervisor interrupt in the
  guest until we either re-enter the guest or decide to exit to the
  host.  This includes time spent handling hcalls in real mode.
* rm_exit - time from when we decide to exit the guest until the
  return from kvmppc_hv_entry().
* guest - time spent in the guest
* cede - time spent napping in real mode due to an H_CEDE hcall
  while other threads in the same vcore are active.

These times are exposed in debugfs in a directory per vcpu that
contains a file called timings.  This file contains one line for
each of the 5 timings above, with the name followed by a colon and
4 numbers, which are the count (number of times the code has been
executed), the total time, the minimum time, and the maximum time,
all in nanoseconds.

The overhead of the extra code amounts to about 30ns for an hcall that
is handled in real mode (e.g. H_SET_DABR), which is about 25%.  Since
production environments may not wish to incur this overhead, the new
code is conditional on a new config symbol,
CONFIG_KVM_BOOK3S_HV_EXIT_TIMING.
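
Given the seqcount convention in the struct below ("also count * 2": it is
bumped once when a timed interval starts and once when it ends), a consumer
can read an accumulator with the usual seqlock-style retry loop; a sketch,
not part of the patch:

static u64 read_tb_total(struct kvmhv_tb_accumulator *acc)
{
	u64 seq, total;

	do {
		seq = READ_ONCE(acc->seqcount);	/* odd: update in flight */
		smp_rmb();
		total = acc->tb_total;
		smp_rmb();
	} while ((seq & 1) || READ_ONCE(acc->seqcount) != seq);

	return total;
}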

Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_host.h |  21 +
 arch/powerpc/include/asm/time.h |   3 +
 arch/powerpc/kernel/asm-offsets.c   |  13 +++
 arch/powerpc/kernel/time.c  |   6 ++
 arch/powerpc/kvm/Kconfig|  14 +++
 arch/powerpc/kvm/book3s_hv.c| 150 
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 141 +-
 7 files changed, 346 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index f1d0bbc..d2068bb 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -369,6 +369,14 @@ struct kvmppc_slb {
u8 base_page_size;  /* MMU_PAGE_xxx */
 };
 
+/* Struct used to accumulate timing information in HV real mode code */
+struct kvmhv_tb_accumulator {
+   u64 seqcount;   /* used to synchronize access, also count * 2 */
+   u64 tb_total;   /* total time in timebase ticks */
+   u64 tb_min; /* min time */
+   u64 tb_max; /* max time */
+};
+
 # ifdef CONFIG_PPC_FSL_BOOK3E
 #define KVMPPC_BOOKE_IAC_NUM   2
 #define KVMPPC_BOOKE_DAC_NUM   2
@@ -657,6 +665,19 @@ struct kvm_vcpu_arch {
 
u32 emul_inst;
 #endif
+
+#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
+   struct kvmhv_tb_accumulator *cur_activity;  /* What we're timing */
+   u64 cur_tb_start;   /* when it started */
+   struct kvmhv_tb_accumulator rm_entry;   /* real-mode entry code */
+   struct kvmhv_tb_accumulator rm_intr;/* real-mode intr handling */
+   struct kvmhv_tb_accumulator rm_exit;/* real-mode exit code */
+   struct kvmhv_tb_accumulator guest_time; /* guest execution */
+   struct kvmhv_tb_accumulator cede_time;  /* time napping inside guest */
+
+   struct dentry *debugfs_dir;
+   struct dentry *debugfs_timings;
+#endif /* CONFIG_KVM_BOOK3S_HV_EXIT_TIMING */
 };
 
 #define VCPU_FPR(vcpu, i)  (vcpu)->arch.fp.fpr[i][TS_FPROFFSET]
diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 03cbada..10fc784 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -211,5 +211,8 @@ extern void secondary_cpu_time_init(void);
 
 DECLARE_PER_CPU(u64, decrementers_next_tb);
 
+/* Convert timebase ticks to nanoseconds */
+unsigned long long tb_to_ns(unsigned long long tb_ticks);
+
 #endif /* __KERNEL__ */
 #endif /* __POWERPC_TIME_H */
diff --git a/arch/powerpc/kernel/asm-offsets.c 
b/arch/powerpc/kernel/asm-offsets.c
index 4717859..3fea721 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -459,6 +459,19 @@ int main(void)
DEFINE(VCPU_SPRG2, offsetof(struct kvm_vcpu, arch.shregs.sprg2));
DEFINE(VCPU_SPRG3, offsetof(struct kvm_vcpu, arch.shregs.sprg3));
 #endif
+#ifdef CONFIG_KVM_BOOK3S_HV_EXIT_TIMING
+   DEFINE(VCPU_TB_RMENTRY, offsetof(struct kvm_vcpu, arch.rm_entry));
+   DEFINE(VCPU_TB_RMINTR, offsetof(struct kvm_vcpu, arch.rm_intr));
+   DEFINE(VCPU_TB_RMEXIT, offsetof(struct kvm_vcpu, arch.rm_exit));
+   DEFINE(VCPU_TB_GUEST, offsetof(struct kvm_vcpu, arch.guest_time));
+   DEFINE(VCPU_TB_CEDE, offsetof(struct kvm_vcpu, arch.cede_time));
+   DEFINE(VCPU_CUR_ACTIVITY, offsetof(struct kvm_vcpu, arch.cur_activity));
+   DEFINE
