On Wed, Mar 31, 2021 at 10:32 PM wrote:
>
> From: Guo Ren
>
> This patch introduces a ticket lock implementation for riscv, along the
> same lines as the implementation for arch/arm & arch/csky.
>
> We still use qspinlock as default.
>
> Signed-off-by: Guo Ren
> Cc: Peter Zijlstra
> Cc: Anup
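For readers unfamiliar with the term, the general ticket lock idea mentioned above can be sketched in portable C11. This is only an illustration of the technique, not the riscv patch itself: the type `struct ticket_lock` and the `ticket_lock_*` function names are hypothetical, and the sketch uses `<stdatomic.h>` rather than any kernel primitive.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical type for illustration; not the kernel's arch_spinlock_t. */
struct ticket_lock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently being served */
};

void ticket_lock_init(struct ticket_lock *l)
{
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* Take a ticket, then spin until that ticket is being served (FIFO order). */
void ticket_lock_acquire(struct ticket_lock *l)
{
    unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
                                                memory_order_relaxed);

    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ; /* busy-wait */
}

/* Succeeds only when nobody holds or waits for the lock (next == owner). */
int ticket_lock_try_acquire(struct ticket_lock *l)
{
    unsigned int owner = atomic_load_explicit(&l->owner,
                                              memory_order_relaxed);
    unsigned int expected = owner;

    return atomic_compare_exchange_strong_explicit(&l->next, &expected,
                                                   owner + 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Serve the next ticket. Only the current holder may call this. */
void ticket_lock_release(struct ticket_lock *l)
{
    unsigned int owner = atomic_load_explicit(&l->owner,
                                              memory_order_relaxed);

    /* The lock must be held: some ticket beyond 'owner' was handed out. */
    assert(owner != atomic_load_explicit(&l->next, memory_order_relaxed));
    atomic_store_explicit(&l->owner, owner + 1, memory_order_release);
}
```

Waiters are served in the order they took their tickets, which is the fairness property that distinguishes a ticket lock from a plain test-and-set spinlock.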
On 3/31/21 3:50 PM, Michael Ellerman wrote:
"Aneesh Kumar K.V" writes:
Shivaprasad G Bhat writes:
Add support for ND_REGION_ASYNC capability if the device tree
indicates 'ibm,hcall-flush-required' property in the NVDIMM node.
Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor.
If
POWER9 and later processors always go via the P9 guest entry path now.
Remove the remaining support from the P7/8 path.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c| 62 ++--
arch/powerpc/kvm/book3s_hv_interrupts.S | 9 +-
This additionally has to save and restore the host SLB, and also
ensure that the MMU is off while switching into the guest SLB.
P9 and later CPUs now always go via the P9 path. The "fast" guest
mode is now renamed to the P9 mode, which is consistent with
functionality and naming.
Signed-off-by:
Guest entry/exit has to restore and save/clear the SLB, plus several
other bits to accommodate hash guests in the P9 path.
Radix host, hash guest support is removed from the P7/8 path.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c| 20 ++-
The reflection of sc 1 interrupts from guest PR=1 to the guest kernel is
required to support a hash guest running PR KVM where its guest is
making hcalls with sc 1.
In preparation for hash guest support, add this hcall reflection to the
P9 path. The P7/8 path does this in its realmode hcall
In order to support hash guests in the P9 path (which does not do real
mode hcalls or page fault handling), these real-mode hash specific
interrupts need to be implemented in virt mode.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c| 145 ++--
Functionality should not be changed.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c | 29 +++--
1 file changed, 15 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 5ef43d9b19bc..4b4250c04117
All radix guests go via the P9 path now, so there is no need to limit
nested HV to processors that support "mixed mode" MMU. Remove the
restriction.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
Commit f3c18e9342a44 ("KVM: PPC: Book3S HV: Use XICS hypercalls when
running as a nested hypervisor") added nested HV tests in XICS
hypercalls, but not all are required.
* icp_eoi is only called by kvmppc_deliver_irq_passthru which is only
called by kvmppc_check_passthru which is only called by
Now that the P7/8 path no longer supports radix, real-mode handlers
do not need to deal with being called in virt mode.
This change effectively reverts commit acde25726bc6 ("KVM: PPC: Book3S
HV: Add radix checks in real-mode hypercall handlers").
It removes a few more real-mode tests in rm hcall
The P9 path now runs all supported radix guest combinations, so
remove radix guest support from the P7/8 path.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/asm-offsets.c | 1 -
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 103 +---
2 files changed, 3
Radix guest support will be removed from the P7/8 path, so disallow
dependent threads mode on P9.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/kvm_host.h | 1 -
arch/powerpc/kvm/book3s_hv.c| 27 +--
2 files changed, 5 insertions(+), 23
Rather than partition the guest PID space + flush a rogue guest PID to
work around this problem, instead fix it by always disabling the MMU when
switching in or out of guest MMU context in HV mode.
This may be a bit less efficient, but it is a lot less complicated and
allows the P9 path to
Move MMU context switch as late as reasonably possible to minimise code
running with guest context switched in. This becomes more important when
this code may run in real-mode, with later changes.
Move WARN_ON as early as possible so program check interrupts are less
likely to tangle everything
This is a first step to wrapping supervisor and user SPR saving and
loading up into helpers, which will then be called independently in
bare metal and nested HV cases in order to optimise SPR access.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c | 141
This is wasted work if the time limit is exceeded.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv_interrupt.c | 36 --
1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_interrupt.c
The C conversion caused exit timing to become a bit cramped. Expand it
to cover more of the entry and exit code.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv_interrupt.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git
SRR0/1, DAR, DSISR must all be protected from machine check which can
clobber them. Ensure MSR[RI] is clear while they are live.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c | 11 +++--
arch/powerpc/kvm/book3s_hv_interrupt.c | 33 +++---
Now the initial C implementation is done, inline more HV code to make
rearranging things easier.
And rename __kvmhv_vcpu_entry_p9 to drop the leading underscores as it's
now C, and is now a more complete vcpu entry.
Reviewed-by: Fabiano Rosas
Signed-off-by: Nicholas Piggin
---
Almost all logic is moved to C, by introducing a new in_guest mode for
the P9 path that branches very early in the KVM interrupt handler to
P9 exit code.
The main P9 entry and exit assembly is now only about 160 lines of low
level stack setup and register save/restore, plus a bad-interrupt
Rather than have KVM look up the host timer and fiddle with the
irq-work internal details, have the powerpc/time.c code provide a
function for KVM to re-arm the Linux timer code when exiting a
guest.
This implementation has an improvement over existing code of
marking a decrementer interrupt
The host Linux timer code arms the decrementer with the value
'decrementers_next_tb - current_tb' using set_dec(), which stores
val - 1 on Book3S-64, which is not quite the same as what KVM does
to re-arm the host decrementer when exiting the guest.
This shouldn't be a significant change, but it
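The arithmetic described in this snippet can be illustrated with a small user-space mock. This is a sketch under stated assumptions, not the kernel code: `mock_dec_spr`, `mock_set_dec()`, `mock_arm_host_dec()` and `mock_read_dec()` are hypothetical stand-ins, and the only behaviour modelled is what the text states, namely that the value armed is `decrementers_next_tb - current_tb` and that `set_dec()` stores `val - 1` on Book3S-64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the DEC SPR; not a real register access. */
static int64_t mock_dec_spr;

/* Models the behaviour described above: the stored value is val - 1. */
void mock_set_dec(int64_t val)
{
    mock_dec_spr = val - 1;
}

/*
 * Arm the mock decrementer for the next host timer deadline.
 * next_tb plays the role of decrementers_next_tb and now_tb the role of
 * the current timebase. This sketch only covers a deadline in the future.
 */
void mock_arm_host_dec(uint64_t next_tb, uint64_t now_tb)
{
    assert(next_tb >= now_tb);
    mock_set_dec((int64_t)(next_tb - now_tb));
}

int64_t mock_read_dec(void)
{
    return mock_dec_spr;
}
```

With a deadline 600 timebase ticks away, the mock register ends up holding 599, which is the off-by-one difference the snippet refers to.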
irq_work's use of the DEC SPR is racy with guest<->host switch and guest
entry which flips the DEC interrupt to guest, which could lose a host
work interrupt.
This patch closes one race, and attempts to comment another class of
races.
Signed-off-by: Nicholas Piggin
---
mftb is serialising (dispatch next-to-complete) so it is heavy weight
for a mfspr. Avoid reading it multiple times in the entry or exit paths.
A small number of cycles delay to timers is tolerable.
Reviewed-by: Fabiano Rosas
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c | 9
There is no need to save away the host DEC value, as it is derived
from the host timer subsystem, which maintains the next timer time.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/time.h | 5 +
arch/powerpc/kernel/time.c | 1 +
arch/powerpc/kvm/book3s_hv.c| 14
LPCR[HDICE]=0 suppresses hypervisor decrementer exceptions on some
processors, so it must be enabled before HDEC is set.
Rather than set it in the host LPCR then setting HDEC, move the HDEC
update to after the guest MMU context (including LPCR) is loaded.
There shouldn't be much concern with
On processors that don't suppress the HDEC exceptions when LPCR[HDICE]=0,
this could help reduce needless guest exits due to leftover exceptions on
entering the guest.
Reviewed-by: Alexey Kardashevskiy
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/time.h | 2 ++
In the interest of minimising the amount of code that is run in
"real-mode", don't handle hcalls in real mode in the P9 path. This
requires some new handlers for H_CEDE and xics-on-xive to be added
before xive is pulled or cede logic is checked.
This introduces a change in radix guest behaviour
Move the xive management up so the low level register switching can be
pushed further down in a later patch. XIVE MMIO CI operations can run in
higher level code with machine checks, tracing, etc., available.
Reviewed-by: Alexey Kardashevskiy
Signed-off-by: Nicholas Piggin
---
This is more symmetric with kvmppc_xive_push_vcpu. The extra test in
the asm will go away in a later change.
Reviewed-by: Cédric Le Goater
Reviewed-by: Alexey Kardashevskiy
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/kvm_ppc.h | 2 ++
arch/powerpc/kvm/book3s_hv.c
Switching the MMU from radix<->radix mode is tricky particularly as the
MMU can remain enabled and requires a certain sequence of SPR updates.
Move these together into their own functions.
This also includes the radix TLB check / flush because it's tied in to
MMU switching due to tlbiel getting
This sets up the same calling convention from interrupt entry to
KVM interrupt handler for system calls as exists for other interrupt
types.
This is a better API: it uses a save area rather than SPR, and it has
more registers free to use. Using a single common API helps maintain
it, and it
This is not used by PR KVM.
Reviewed-by: Alexey Kardashevskiy
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_64_entry.S | 4
arch/powerpc/kvm/book3s_hv_rmhandlers.S | 4 +++-
arch/powerpc/kvm/book3s_segment.S | 3 +++
3 files changed, 6 insertions(+), 5 deletions(-)
Like the earlier patch for hcalls, KVM interrupt entry requires a
different calling convention than the Linux interrupt handlers
set up. Move the code that converts from one to the other into KVM.
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/exceptions-64s.S | 131
System calls / hcalls have a different calling convention than
other interrupts, so there is code in the KVMTEST to massage these
into the same form as other interrupt handlers.
Move this work into the KVM hcall handler. This means teaching KVM
a little more about the low level interrupt handler
Add a separate hcall entry point. This can be used to deal with the
different calling convention.
Reviewed-by: Daniel Axtens
Reviewed-by: Fabiano Rosas
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/exceptions-64s.S | 6 +++---
arch/powerpc/kvm/book3s_64_entry.S | 6 +-
2 files
Move the GUEST_MODE_SKIP logic into KVM code. This is quite a KVM
internal detail that has no real need to be in common handlers.
Also add a comment explaining why this thing exists.
Reviewed-by: Daniel Axtens
Reviewed-by: Fabiano Rosas
Signed-off-by: Nicholas Piggin
---
Rather than bifurcate the call depending on whether or not HV is
possible, and have the HV entry test for PR, just make a single
common point which does the demultiplexing. This makes it simpler
to add another type of exit handler.
Acked-by: Paul Mackerras
Reviewed-by: Daniel Axtens
Rather than clear the HV bit from the MSR at guest entry, make it clear
that the hypervisor does not allow the guest to set the bit.
The HV clear is kept in guest entry for now, but a future patch will
warn if it is set.
Acked-by: Paul Mackerras
Signed-off-by: Nicholas Piggin
---
Rather than add the ME bit to the MSR at guest entry, make it clear
that the hypervisor does not allow the guest to clear the bit.
The ME set is kept in guest entry for now, but a future patch will
warn if it's not present.
Acked-by: Paul Mackerras
Reviewed-by: Daniel Axtens
Reviewed-by:
The code being executed in KVM_GUEST_MODE_SKIP is hypervisor code with
MSR[IR]=0, so the faults of concern are the d-side ones caused by access
to guest context by the hypervisor.
Instruction breakpoint interrupts are not a concern here. It's unlikely
any good would come of causing breaks in this
Cell does not support KVM.
Acked-by: Paul Mackerras
Reviewed-by: Fabiano Rosas
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kernel/exceptions-64s.S | 6 --
1 file changed, 6 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64s.S
b/arch/powerpc/kernel/exceptions-64s.S
index
This config option causes the warning in init_default_hcalls to fire
because the TCE handlers are in the default hcall list but not
implemented.
Acked-by: Paul Mackerras
Reviewed-by: Daniel Axtens
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c | 2 ++
1 file changed, 2
The va argument is not used in the function or set by its asm caller,
so remove it to be safe.
Acked-by: Paul Mackerras
Reviewed-by: Daniel Axtens
Signed-off-by: Nicholas Piggin
---
arch/powerpc/include/asm/kvm_ppc.h | 3 +--
arch/powerpc/kvm/book3s_hv_rm_mmu.c | 3 +--
2 files changed, 2
This SPR is set to 0 twice when exiting the guest.
Acked-by: Paul Mackerras
Suggested-by: Fabiano Rosas
Reviewed-by: Daniel Axtens
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c
Prevent radix guests setting LPCR[TC]. This bit only applies to hash
partitions.
Reviewed-by: Alexey Kardashevskiy
Signed-off-by: Nicholas Piggin
---
arch/powerpc/kvm/book3s_hv.c | 4
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
These are already disallowed by H_SET_MODE from the guest, also disallow
these by updating LPCR directly.
AIL modes can affect the host interrupt behaviour while the guest LPCR
value is set, so filter it here too.
Acked-by: Paul Mackerras
Suggested-by: Fabiano Rosas
Signed-off-by: Nicholas
This will get a bit more complicated in future patches. Move it
into the helper function.
This change allows the L1 hypervisor to determine some of the LPCR
bits that the L0 is using to run it, which could be a privilege
violation (LPCR is HV-privileged), although the same problem exists
now for
Git tree here
https://github.com/npiggin/linux/tree/kvm-in-c-v6
This fixes a couple of bugs with the POWER7/8 path and now a
POWER8 SMT guest boots and runs again.
Main changes since v5:
- Fixed changelog and subject for patch to re-arm host timer.
- Fixed compile error with !HV [kernel test
Guest LPCR depends on hardware type, and future changes will add
restrictions based on errata and guest MMU mode. Move this logic
to a common function and use it for the cases where the guest
wants to update its LPCR (or the LPCR of a nested guest).
This also adds a warning in other places that
When neither CONFIG_PCI nor CONFIG_IBMVIO is set/enabled, iommu.c has a
build error. The fault injection code is not useful in that kernel config,
so make the FAIL_IOMMU option depend on PCI || IBMVIO.
Prevents this build error (warning escalated to error):
../arch/powerpc/kernel/iommu.c:178:30:
One user has expressed the need to both append and prepend some
built-in parameters to the command line provided by the bootloader.
Although it is a corner case, it is easy to implement, so let's do it.
When the user chooses to prepend the bootloader provided command line
with the built-in
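The combination described here (built-in parameters both before and after the bootloader-provided command line) can be sketched in plain C. This is a hypothetical user-space illustration, not the patch's implementation: `build_cmdline()` and its parameters are invented for the example, and the real kernel config options involved are not named in the snippet.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: joins "prepend", the bootloader command line and
 * "append" with single spaces, skipping empty or NULL parts.
 * Returns 0 on success, -1 if the result does not fit in 'out'.
 */
int build_cmdline(char *out, size_t out_len, const char *prepend,
                  const char *bootloader, const char *append)
{
    const char *parts[] = { prepend, bootloader, append };
    size_t used = 0;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';

    for (size_t i = 0; i < 3; i++) {
        int n;

        if (parts[i] == NULL || parts[i][0] == '\0')
            continue;

        /* Add a separating space only when something precedes this part. */
        n = snprintf(out + used, out_len - used, "%s%s",
                     used ? " " : "", parts[i]);
        if (n < 0 || (size_t)n >= out_len - used)
            return -1; /* would be truncated */
        used += (size_t)n;
    }
    return 0;
}
```

For example, prepending `console=ttyS0` and appending `quiet` around a bootloader string `root=/dev/sda1` yields `console=ttyS0 root=/dev/sda1 quiet`.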
On 04/04/2021 at 18:31, Vaibhav Jain wrote:
While removing a large number of mappings from hash page tables on
large-memory systems, a soft lockup is reported because of the time
spent inside htab_remove_mapping(), like the one below:
watchdog: BUG: soft lockup - CPU#8 stuck for 23s!
NIP
While removing a large number of mappings from hash page tables on
large-memory systems, a soft lockup is reported because of the time
spent inside htab_remove_mapping(), like the one below:
watchdog: BUG: soft lockup - CPU#8 stuck for 23s!
NIP plpar_hcall+0x38/0x58
LR
'page_address(skb_frag_page()) + skb_frag_off()' can be replaced by an
equivalent 'skb_frag_address()' which is less verbose.
Signed-off-by: Christophe JAILLET
---
drivers/net/ethernet/ibm/ibmvnic.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git
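The equivalence this patch relies on can be shown with mock types. This is a sketch only: `struct mock_page`, `struct mock_frag` and the `mock_*` helpers are hypothetical stand-ins for the kernel's page and skb fragment helpers, written so the example compiles in user space.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for illustration; not the kernel's definitions. */
struct mock_page {
    unsigned char *vaddr;   /* where the page is mapped */
};

struct mock_frag {
    struct mock_page *page; /* page backing this fragment */
    unsigned int offset;    /* offset of the data within the page */
    unsigned int size;      /* length of the data */
};

static inline void *mock_page_address(const struct mock_page *p)
{
    return p->vaddr;
}

static inline struct mock_page *mock_frag_page(const struct mock_frag *f)
{
    return f->page;
}

static inline unsigned int mock_frag_off(const struct mock_frag *f)
{
    return f->offset;
}

/* The less verbose form: page address plus fragment offset in one call. */
void *mock_frag_address(const struct mock_frag *f)
{
    assert(f != NULL && f->page != NULL);
    return (unsigned char *)mock_page_address(mock_frag_page(f)) +
           mock_frag_off(f);
}
```

Both spellings compute the same pointer, so replacing the long form with the single helper call is a readability change rather than a behavioural one.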