When introduced, IRQFD resampling worked on POWER8 with XICS. However
KVM on POWER9 has never implemented it - the compatibility mode code
("XICS-on-XIVE") misses the kvm_notify_acked_irq() call and the native
XIVE mode does not handle INTx in KVM at all.
This moved the capability support
On Tue, Sep 20, 2022 at 9:58 AM Catalin Marinas wrote:
>
> On Tue, Sep 20, 2022 at 05:33:42PM +0100, Marc Zyngier wrote:
> > On Tue, 20 Sep 2022 16:39:47 +0100,
> > Catalin Marinas wrote:
> > > On Mon, Sep 19, 2022 at 07:12:53PM +0100, Marc Zyngier wrote:
> > > > On Mon, 05 Sep 2022 18:01:55
Certain VMMs such as crosvm have features (e.g. sandboxing) that depend
on being able to map guest memory as MAP_SHARED. The current restriction
on sharing MAP_SHARED pages with the guest is preventing the use of
those features with MTE. Now that the races between tasks concurrently
clearing tags
From: Catalin Marinas
Initialising the tags and setting PG_mte_tagged flag for a page can race
between multiple set_pte_at() on shared pages or setting the stage 2 pte
via user_mem_abort(). Introduce a new PG_mte_lock flag as PG_arch_3 and
set it before attempting page initialisation. Given that
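
The lock-then-initialise idea described above can be sketched in user-space C11 atomics. This is only an illustration under stated assumptions: struct toy_page, the TOY_PG_* bits and both helper names are hypothetical and are not taken from the patch. The first caller to set the lock bit owns tag initialisation, and later callers wait until the tagged bit is published.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for PG_mte_lock / PG_mte_tagged. */
#define TOY_PG_MTE_LOCK   (1u << 0)
#define TOY_PG_MTE_TAGGED (1u << 1)

struct toy_page {
	atomic_uint flags;
};

/*
 * Returns true if the caller won the race and must initialise the tags.
 * Returns false once another caller has finished and set the tagged bit.
 */
static bool toy_try_page_tagging(struct toy_page *p)
{
	unsigned int old = atomic_fetch_or_explicit(&p->flags, TOY_PG_MTE_LOCK,
						    memory_order_acquire);

	if (!(old & TOY_PG_MTE_LOCK))
		return true;	/* lock bit was clear: we own initialisation */

	/* Someone else holds the lock: wait until the tags are published. */
	while (!(atomic_load_explicit(&p->flags, memory_order_acquire) &
		 TOY_PG_MTE_TAGGED))
		;
	return false;
}

/* Publish the initialised tags; pairs with the acquire loads above. */
static void toy_set_page_tagged(struct toy_page *p)
{
	atomic_fetch_or_explicit(&p->flags, TOY_PG_MTE_TAGGED,
				 memory_order_release);
}
```

A caller would do its tag initialisation only when toy_try_page_tagging() returns true, then call toy_set_page_tagged().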
As with PG_arch_2, this flag is only allowed on 64-bit architectures due
to the shortage of bits available. It will be used by the arm64 MTE code
in subsequent patches.
Signed-off-by: Peter Collingbourne
Cc: Will Deacon
Cc: Marc Zyngier
Cc: Steven Price
[catalin.mari...@arm.com: added flag
Document both the restriction on VM_MTE_ALLOWED mappings and
the relaxation for shared mappings.
Signed-off-by: Peter Collingbourne
Acked-by: Catalin Marinas
---
Documentation/virt/kvm/api.rst | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git
Previously we allowed creating a memslot containing a private mapping that
was not VM_MTE_ALLOWED, but would later reject KVM_RUN with -EFAULT. Now
we reject the memory region at memslot creation time.
Since this is a minor tweak to the ABI (a VMM that created one of
these memslots would fail
From: Catalin Marinas
Currently sanitise_mte_tags() checks if it's an online page before
attempting to sanitise the tags. Such detection should be done in the
caller via the VM_MTE_ALLOWED vma flag. Since kvm_set_spte_gfn() does
not have the vma, leave the page unmapped if not already tagged.
From: Catalin Marinas
Currently the PG_mte_tagged page flag mostly means the page contains
valid tags and it should be set after the tags have been cleared or
restored. However, in mte_sync_tags() it is set before setting the tags
to avoid, in theory, a race with concurrent mprotect(PROT_MTE)
From: Catalin Marinas
Commit 4beba9486abd ("mm: Add PG_arch_2 page flag") introduced a new
page flag for all 64-bit architectures. However, even if an architecture
is 64-bit, it may still have limited spare bits in the 'flags' member of
'struct page'. This may happen if an architecture enables
Hi,
This patch series allows VMMs to use shared mappings in MTE enabled
guests. The first five patches were taken from Catalin's tree [1] which
addressed some review feedback from when they were previously sent out
as v3 of this series. The first patch from Catalin's tree makes room
for an
From: Paolo Bonzini
KVM_REQ_UNHALT is now unnecessary because it is replaced by the return
value of kvm_vcpu_block/kvm_vcpu_halt. Remove it.
No functional change intended.
Signed-off-by: Paolo Bonzini
Signed-off-by: Sean Christopherson
---
Documentation/virt/kvm/vcpu-requests.rst | 28
Don't snapshot pending INIT/SIPI events prior to checking nested events,
architecturally there's nothing wrong with KVM processing (dropping) a
SIPI that is received immediately after synthesizing a VM-Exit. Taking
and consuming the snapshot makes the flow way more subtle than it needs
to be,
From: Paolo Bonzini
KVM_REQ_UNHALT is a weird request that simply reports the value of
kvm_arch_vcpu_runnable() on exit from kvm_vcpu_halt(). Only
MIPS and x86 are looking at it, the others just clear it. Check
the state of the vCPU directly so that the request is handled
as a nop on all
From: Paolo Bonzini
kvm_vcpu_check_block() is called while not in TASK_RUNNING, and therefore
it cannot sleep. Writing to guest memory is therefore forbidden, but it
can happen on AMD processors if kvm_check_nested_events() causes a vmexit.
Fortunately, all events that are caught by
Explicitly check for a pending INIT/SIPI event when emulating VMXOFF
instead of blindly making an event request. There's obviously no need
to evaluate events if none are pending.
Signed-off-by: Sean Christopherson
---
arch/x86/kvm/vmx/nested.c | 4 ++--
1 file changed, 2 insertions(+), 2
Evaluate interrupts, i.e. set KVM_REQ_EVENT, if INIT or SIPI is pending
when emulating nested VM-Enter. INIT is blocked while the CPU is in VMX
root mode, but not in VMX non-root, i.e. becomes unblocked on VM-Enter.
This bug has been masked by KVM calling ->check_nested_events() in the
core run
Set KVM_REQ_EVENT if INIT or SIPI is pending when the guest enables GIF.
INIT in particular is blocked when GIF=0 and needs to be processed when
GIF is toggled to '1'. This bug has been masked by (a) KVM calling
->check_nested_events() in the core run loop and (b) hypervisors toggling
GIF from
Rename and invert kvm_vcpu_latch_init() to kvm_apic_init_sipi_allowed()
so as to match the behavior of {interrupt,nmi,smi}_allowed(), and expose
the helper so that it can be used by kvm_vcpu_has_events() to determine
whether or not an INIT or SIPI is pending _and_ can be taken immediately.
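
As a rough illustration of the rename-and-invert described above, the toy model below expresses the predicate in the "allowed" sense and combines it with a pending check. struct toy_vcpu, its fields and the toy_* helpers are hypothetical stand-ins, not KVM's real data structures or functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified vCPU state. */
struct toy_vcpu {
	bool init_sipi_pending;	/* an INIT or SIPI has been received */
	bool init_latched;	/* INIT/SIPI currently blocked */
};

/* Old-style predicate: true means "blocked". */
static bool toy_latch_init(const struct toy_vcpu *v)
{
	return v->init_latched;
}

/* Inverted predicate, matching the *_allowed() convention. */
static bool toy_init_sipi_allowed(const struct toy_vcpu *v)
{
	return !toy_latch_init(v);
}

/* A pending INIT/SIPI counts as a wake event only if it can be taken now. */
static bool toy_has_init_sipi_wake_event(const struct toy_vcpu *v)
{
	return v->init_sipi_pending && toy_init_sipi_allowed(v);
}
```

Expressing the helper in the "allowed" sense lets a caller write "pending && allowed" directly, which is the pending _and_ can-be-taken-immediately check mentioned above.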
From: Paolo Bonzini
Do not return true from kvm_vcpu_has_events() if the vCPU isn't going to
immediately process a pending INIT/SIPI. INIT/SIPI shouldn't be treated
as wake events if they are blocked.
Signed-off-by: Paolo Bonzini
[sean: rebase onto refactored INIT/SIPI helpers, massage
Rename kvm_apic_has_events() to kvm_apic_has_pending_init_or_sipi() so
that it's more obvious that "events" really just means "INIT or SIPI".
Opportunistically clean up a weirdly worded comment that referenced
kvm_apic_has_events() instead of kvm_apic_accept_events().
No functional change
Set KVM_REQ_EVENT when MTF becomes pending to ensure that KVM will run
through inject_pending_event() and thus vmx_check_nested_events() prior
to re-entering the guest.
MTF currently works by virtue of KVM's hack that calls
kvm_check_nested_events() from kvm_vcpu_running(), but that hack will
be
Non-x86 folks, there's nothing interesting to see here, y'all got pulled
in because removing KVM_REQ_UNHALT requires deleting kvm_clear_request()
from arch code.
Note, this is based on:
https://github.com/sean-jc/linux.git tags/kvm-x86-6.1-1
to pre-resolve conflicts with the
From: Paolo Bonzini
Interrupts, NMIs etc. sent while in guest mode are already handled
properly by the *_interrupt_allowed callbacks, but other events can
cause a vCPU to be runnable that are specific to guest mode.
In the case of VMX there are two, the preemption timer and the
monitor trap.
On 21/09/2022 02:08, Marc Zyngier wrote:
> On Tue, 20 Sep 2022 13:51:43 +0100,
> Alexey Kardashevskiy wrote:
> > When introduced, IRQFD resampling worked on POWER8 with XICS. However
> > KVM on POWER9 has never implemented it - the compatibility mode code
> > ("XICS-on-XIVE") misses the
On Tue, Sep 20, 2022 at 05:44:01PM +0100, Marc Zyngier wrote:
> Mark Brown wrote:
> > void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu)
> > {
> > BUG_ON(!current->mm);
> > - BUG_ON(test_thread_flag(TIF_SVE));
> > +
> > + fpsimd_kvm_prepare();
>
> Why is this *before* the check against
Hey Ricardo,
On Tue, Sep 20, 2022 at 12:02:08PM -0700, Ricardo Koller wrote:
> On Tue, Sep 20, 2022 at 06:36:29PM +, Oliver Upton wrote:
> > Presently stage2_apply_range() works on a batch of memory addressed by a
> > stage 2 root table entry for the VM. Depending on the IPA limit of the
> >
Ignore kvm-arm.mode if !is_hyp_mode_available(). Specifically, we want
to avoid switching kvm_mode to KVM_MODE_PROTECTED if hypervisor mode is
not available. This prevents "Protected KVM" cpu capability being
reported when Linux is booting in EL1 and would not have KVM enabled.
Reasonably though,
On Tue, Sep 20, 2022 at 07:19:57PM +0100, Marc Zyngier wrote:
> Mark Brown wrote:
> > Now that we are recording the type of floating point register state we
> > are saving when we save it we can use that information when we load to
> > decide which register state is required and bring the
On Tue, Sep 20, 2022 at 06:36:29PM +, Oliver Upton wrote:
> Presently stage2_apply_range() works on a batch of memory addressed by a
> stage 2 root table entry for the VM. Depending on the IPA limit of the
> VM and PAGE_SIZE of the host, this could address a massive range of
> memory. Some
On Tue, Sep 20, 2022 at 06:40:03PM +, Sean Christopherson wrote:
> On Tue, Sep 20, 2022, Ricardo Koller wrote:
> > On Tue, Sep 20, 2022 at 06:07:13PM +, Sean Christopherson wrote:
> > > On Tue, Sep 20, 2022, Ricardo Koller wrote:
> > > > The previous commit added support for callers of
On Tue, Sep 20, 2022 at 07:04:24PM +0100, Marc Zyngier wrote:
> Mark Brown wrote:
> > - switch (last->to_save) {
> > - case FP_STATE_TASK:
> > - break;
> > - case FP_STATE_FPSIMD:
> > - WARN_ON_ONCE(save_sve_regs);
> > - break;
> > - case FP_STATE_SVE:
> > -
On Tue, Sep 20, 2022, Ricardo Koller wrote:
> On Tue, Sep 20, 2022 at 06:07:13PM +, Sean Christopherson wrote:
> > On Tue, Sep 20, 2022, Ricardo Koller wrote:
> > > The previous commit added support for callers of vm_create() to
> > > specify
> >
> > Changelog is stale, vm_create()
Presently stage2_apply_range() works on a batch of memory addressed by a
stage 2 root table entry for the VM. Depending on the IPA limit of the
VM and PAGE_SIZE of the host, this could address a massive range of
memory. Some examples:
4 level, 4K paging -> 512 GB batch size
3 level, 64K
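
The arithmetic behind such batch sizes can be sketched as follows. This is a back-of-the-envelope helper, assuming 8-byte descriptors so that each table level resolves (page_shift - 3) address bits; the function name toy_root_entry_span() is hypothetical and does not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of address space covered by a single root-table entry.
 * Assumption: 8-byte descriptors, so each level resolves
 * (page_shift - 3) bits; the levels below the root plus the page
 * offset determine how much one root entry spans.
 */
static uint64_t toy_root_entry_span(unsigned int page_shift,
				    unsigned int levels)
{
	assert(page_shift > 3 && levels >= 1);

	unsigned int bits_per_level = page_shift - 3;
	unsigned int shift = page_shift + (levels - 1) * bits_per_level;

	assert(shift < 64);
	return UINT64_C(1) << shift;
}
```

With page_shift = 12 and levels = 4 this yields 1 << 39 bytes, which matches the 512 GB figure quoted for 4 level, 4K paging.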
On Tue, Sep 20, 2022 at 06:52:59PM +0100, Marc Zyngier wrote:
> On Mon, 15 Aug 2022 23:55:25 +0100,
> Mark Brown wrote:
> > enum fp_state {
> > + FP_STATE_TASK, /* Save based on current, invalid as fp_type */
> How is that related to the FP_TYPE_TASK in the commit message? What
On Tue, 20 Sep 2022 19:09:15 +0100,
Mark Brown wrote:
>
> On Tue, Sep 20, 2022 at 06:14:13PM +0100, Marc Zyngier wrote:
> > Mark Brown wrote:
>
> > > When we save the state for the floating point registers this can be done
> > > in the form visible through either the FPSIMD V registers
On Tue, Sep 20, 2022 at 06:07:13PM +, Sean Christopherson wrote:
> On Tue, Sep 20, 2022, Ricardo Koller wrote:
> > The previous commit added support for callers of vm_create() to specify
>
> Changelog is stale, vm_create() no longer takes the struct.
>
> Side topic, it's usually a
On Mon, 15 Aug 2022 23:55:27 +0100,
Mark Brown wrote:
>
> Now that we are recording the type of floating point register state we
> are saving when we save it we can use that information when we load to
> decide which register state is required and bring the TIF_SVE state into
> sync with the
On Tue, Sep 20, 2022 at 06:14:13PM +0100, Marc Zyngier wrote:
> Mark Brown wrote:
> > When we save the state for the floating point registers this can be done
> > in the form visible through either the FPSIMD V registers or the SVE Z and
> > P registers. At present we track which format is
On Tue, Sep 20, 2022, Ricardo Koller wrote:
> The previous commit added support for callers of vm_create() to specify
Changelog is stale, vm_create() no longer takes the struct.
Side topic, it's usually a good idea to use "strong" terminology when
referencing past/future changes, e.g.
On Mon, 15 Aug 2022 23:55:26 +0100,
Mark Brown wrote:
>
> Now that we are explicitly telling the host FP code which register state
> it needs to save we can remove the manipulation of TIF_SVE from the KVM
> code, simplifying it and allowing us to optimise our handling of normal
> tasks. Remove
On Mon, 15 Aug 2022 23:55:25 +0100,
Mark Brown wrote:
>
> In order to avoid needlessly saving and restoring the guest registers KVM
> relies on the host FPSIMD code to save the guest registers when we context
> switch away from the guest. This is done by binding the KVM guest state to
> the CPU
On Tue, Sep 20, 2022 at 05:39:19PM +, Sean Christopherson wrote:
> On Tue, Sep 20, 2022, Ricardo Koller wrote:
> > The vm_create() helpers are hardcoded to place most page types (code,
> > page-tables, stacks, etc) in the same memslot #0, and always backed with
> > anonymous 4K. There are a
On Tue, Sep 20, 2022, Ricardo Koller wrote:
> The vm_create() helpers are hardcoded to place most page types (code,
> page-tables, stacks, etc) in the same memslot #0, and always backed with
> anonymous 4K. There are a couple of issues with that. First, tests willing
> to
Preferred kernel
On Mon, 15 Aug 2022 23:55:24 +0100,
Mark Brown wrote:
>
> When we save the state for the floating point registers this can be done
> in the form visible through either the FPSIMD V registers or the SVE Z and
> P registers. At present we track which format is currently used based on
> TIF_SVE and
On Tue, Sep 20, 2022 at 10:05:15AM +0200, Andrew Jones wrote:
> On Tue, Sep 20, 2022 at 04:25:47AM +, Ricardo Koller wrote:
> > Add a new test for stage 2 faults when using different combinations of
> > guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
> > and types of
On Tue, Sep 20, 2022 at 05:33:42PM +0100, Marc Zyngier wrote:
> On Tue, 20 Sep 2022 16:39:47 +0100,
> Catalin Marinas wrote:
> > On Mon, Sep 19, 2022 at 07:12:53PM +0100, Marc Zyngier wrote:
> > > On Mon, 05 Sep 2022 18:01:55 +0100,
> > > Catalin Marinas wrote:
> > > > Peter, please let me know
On Mon, 15 Aug 2022 23:55:23 +0100,
Mark Brown wrote:
>
> Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving)
> KVM has not tracked the host SVE state, relying on the fact that we
> currently disable SVE whenever we perform a syscall. This may not be true
> in future since
On Tue, 20 Sep 2022 16:39:47 +0100,
Catalin Marinas wrote:
>
> On Mon, Sep 19, 2022 at 07:12:53PM +0100, Marc Zyngier wrote:
> > On Mon, 05 Sep 2022 18:01:55 +0100,
> > Catalin Marinas wrote:
> > > Peter, please let me know if you want to pick this series up together
> > > with your other KVM
On Tue, 20 Sep 2022 13:51:43 +0100,
Alexey Kardashevskiy wrote:
>
> When introduced, IRQFD resampling worked on POWER8 with XICS. However
> KVM on POWER9 has never implemented it - the compatibility mode code
> ("XICS-on-XIVE") misses the kvm_notify_acked_irq() call and the native
> XIVE mode
On Mon, Sep 19, 2022 at 07:12:53PM +0100, Marc Zyngier wrote:
> On Mon, 05 Sep 2022 18:01:55 +0100,
> Catalin Marinas wrote:
> > Peter, please let me know if you want to pick this series up together
> > with your other KVM patches. Otherwise I can post it separately, it's
> > worth merging it on
On Tue, Sep 20, 2022 at 02:20:48PM +0100, Alexandru Elisei wrote:
> Hi,
>
> On Tue, Sep 20, 2022 at 10:45:53AM +0200, Andrew Jones wrote:
> > On Tue, Aug 09, 2022 at 10:15:44AM +0100, Alexandru Elisei wrote:
> > > With powerpc moving to the page allocator, there are no architectures left
> > > which
Hi,
On Tue, Sep 20, 2022 at 10:45:53AM +0200, Andrew Jones wrote:
> On Tue, Aug 09, 2022 at 10:15:44AM +0100, Alexandru Elisei wrote:
> > With powerpc moving to the page allocator, there are no architectures left
> > which use the physical allocator after the boot setup: arm, arm64,
> > s390x and
Hi,
On Tue, Sep 20, 2022 at 10:40:47AM +0200, Andrew Jones wrote:
> On Tue, Aug 09, 2022 at 10:15:45AM +0100, Alexandru Elisei wrote:
> > The page allocator has better allocation tracking and is used by all
> > architectures, while the physical allocator is now never used for
> > allocating
When introduced, IRQFD resampling worked on POWER8 with XICS. However
KVM on POWER9 has never implemented it - the compatibility mode code
("XICS-on-XIVE") misses the kvm_notify_acked_irq() call and the native
XIVE mode does not handle INTx in KVM at all.
This moved the capability support
Hi Eric,
On 2022/9/20 17:23, Eric Auger wrote:
> Hi Zenghui,
> On 9/19/22 16:30, Zenghui Yu wrote:
> > Hi Eric,
> > A few comments when looking through the PMU test code (2 years after
> > the series was merged).
> Thank you for reviewing even after this time! Do you want to address the
> issues yourself and
On Tue, Aug 09, 2022 at 10:15:55AM +0100, Alexandru Elisei wrote:
> The vmalloc allocator returns non-id mapped addresses, where the virtual
> address is different than the physical address. This makes it impossible
> to access the stack of the secondary CPUs while the MMU is disabled.
>
> On
I guess this should be squashed into one of the early patches in this
series since we don't have this issue with the current code.
Thanks,
drew
On Tue, Aug 09, 2022 at 10:15:52AM +0100, Alexandru Elisei wrote:
> Include libcflat from page.h to avoid error like this one:
>
>
On Tue, Aug 09, 2022 at 10:15:51AM +0100, Alexandru Elisei wrote:
> Commit b5f659be4775 ("arm/arm64: Remove dcache_line_size global
> variable") moved the dcache_by_line_op macro to assembler.h and changed
> it to take the size of the regions instead of the end address as
> parameter. This was
Hi Denis,
On Tue, 20 Sep 2022 09:20:05 +0100,
Denis Nikitin wrote:
>
> Kernel build with -fprofile-sample-use raises the following failure:
>
> error: arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o: Unexpected SHT_REL
> section ".rel.llvm.call-graph-profile"
How is this flag provided? I don't see any
On Tue, Aug 09, 2022 at 10:15:48AM +0100, Alexandru Elisei wrote:
> For the boot CPU, the entire stack is zeroed in the entry code. For the
> secondaries, only struct thread_info, which lives at the bottom of the
> stack, is zeroed in thread_info_init().
>
> Be consistent and zero the entire
Hi Zenghui,
On 9/19/22 16:30, Zenghui Yu wrote:
> Hi Eric,
>
> A few comments when looking through the PMU test code (2 years after
> the series was merged).
Thank you for reviewing even after this time! Do you want to address the
issues yourself and send a patch series or do you prefer I
On Tue, Aug 09, 2022 at 10:15:47AM +0100, Alexandru Elisei wrote:
> Until commit 031755dbfefb ("arm: enable vmalloc"), the idmap was allocated
> using pgd_alloc(). After that commit, all the page table allocator
> functions were switched to using the page allocator, but pgd_alloc() was
> left
On Tue, Aug 09, 2022 at 10:15:46AM +0100, Alexandru Elisei wrote:
> phys_end was used to cap the linearly mapped memory to 3G to allow 1G of
> room for the vmalloc area to grow down. This was made useless in commit
> c1cd1a2bed69 ("arm/arm64: mmu: Remove memory layout assumptions"), when
>
On Tue, Aug 09, 2022 at 10:15:44AM +0100, Alexandru Elisei wrote:
> With powerpc moving to the page allocator, there are no architectures left
> which use the physical allocator after the boot setup: arm, arm64,
> s390x and powerpc drain the physical allocator to initialize the page
> allocator; and
On Tue, Aug 09, 2022 at 10:15:45AM +0100, Alexandru Elisei wrote:
> The page allocator has better allocation tracking and is used by all
> architectures, while the physical allocator is now never used for
> allocating memory.
>
> Simplify the physical allocator by removing allocation accounting.
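
For illustration, a physical allocator with no allocation accounting reduces to a bump pointer over a fixed range. The sketch below assumes that shape; struct toy_phys_alloc and toy_phys_alloc_aligned() are hypothetical names, not the kvm-unit-tests API, and 0 is used as the failure value purely for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bump allocator state: [base, top) is still free. */
struct toy_phys_alloc {
	uint64_t base;
	uint64_t top;
};

/*
 * Hand out 'size' bytes aligned to 'align' (a power of two).
 * Nothing is recorded about the allocation; the base just moves up.
 * Returns 0 when the request does not fit (assumes 0 is never a
 * valid allocation address in this toy model).
 */
static uint64_t toy_phys_alloc_aligned(struct toy_phys_alloc *a,
				       uint64_t size, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);

	uint64_t addr = (a->base + align - 1) & ~(align - 1);

	/* Reject wrap-around and requests that run past the top. */
	if (addr < a->base || addr > a->top || size > a->top - addr)
		return 0;

	a->base = addr + size;
	return addr;
}
```

Because nothing is tracked per allocation, memory cannot be freed individually, which fits an allocator that is only used during early boot before a page allocator takes over.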
On Tue, Aug 09, 2022 at 10:15:42AM +0100, Alexandru Elisei wrote:
> phys_alloc_aligned_safe() is called only by early_memalign() and the safe
> parameter is always true. In the spirit of simplifying the code, merge the
> two functions together. Rename it to memalign_early(), to match the naming
>
On Tue, Aug 09, 2022 at 10:15:41AM +0100, Alexandru Elisei wrote:
> Commit 11c4715fbf87 ("alloc: implement free") changed align_min from a
> static variable to a field for the alloc_ops struct and carried over the
> initializer value of DEFAULT_MINIMUM_ALIGNMENT.
>
> Commit 7e3e823b78c0
On Tue, Aug 09, 2022 at 10:15:40AM +0100, Alexandru Elisei wrote:
> There are 25 header files today (found with grep -r "#ifndef __ASSEMBLY__")
> with functionality that relies on the __ASSEMBLY__ preprocessor constant being
> correctly defined to work correctly. So far, kvm-unit-tests has relied on
>
On Tue, Sep 20, 2022 at 04:25:47AM +, Ricardo Koller wrote:
> Add a new test for stage 2 faults when using different combinations of
> guest accesses (e.g., write, S1PTW), backing source type (e.g., anon)
> and types of faults (e.g., read on hugetlbfs with a hole). The next
> commits will add
On Tue, Sep 20, 2022 at 04:25:46AM +, Ricardo Koller wrote:
> The previous commit added support for callers of vm_create() to specify
> what memslots to use for code, page-tables, and data allocations. Change
> them accordingly:
>
> - stacks, code, and exception tables use the code
On Tue, Sep 20, 2022 at 04:25:45AM +, Ricardo Koller wrote:
> The vm_create() helpers are hardcoded to place most page types (code,
> page-tables, stacks, etc) in the same memslot #0, and always backed with
> anonymous 4K. There are a couple of issues with that. First, tests willing
> to
>