[RFC PATCH v3 1/2] KVM: arm64: Move CMOs from user_mem_abort to the fault handlers
We currently uniformly perform CMOs of D-cache and I-cache in function
user_mem_abort before calling the fault handlers. If we get concurrent
guest faults (e.g. translation faults, permission faults) or some really
unnecessary guest faults caused by BBM, CMOs for the first vcpu are
necessary while the later ones are not. By moving CMOs to the fault
handlers, we can easily identify conditions where they are really needed
and avoid the unnecessary ones. Because performing CMOs is a time
consuming process, especially when flushing a block range, this reduces
the load on KVM and improves the efficiency of the page table code.

So let's move both the clean of D-cache and the invalidation of I-cache
to the map path, and move only the invalidation of I-cache to the
permission path. Since the original APIs for CMOs in mmu.c are only
called in function user_mem_abort, we now also move them to pgtable.c.

Signed-off-by: Yanan Wang
---
 arch/arm64/include/asm/kvm_mmu.h | 31 ---
 arch/arm64/kvm/hyp/pgtable.c     | 68 +---
 arch/arm64/kvm/mmu.c             | 23 ++-
 3 files changed, 57 insertions(+), 65 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 90873851f677..c31f88306d4e 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -177,37 +177,6 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
 	return (vcpu_read_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
 }

-static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
-{
-	void *va = page_address(pfn_to_page(pfn));
-
-	/*
-	 * With FWB, we ensure that the guest always accesses memory using
-	 * cacheable attributes, and we don't have to clean to PoC when
-	 * faulting in pages. Furthermore, FWB implies IDC, so cleaning to
-	 * PoU is not required either in this case.
-	 */
-	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
-		return;
-
-	kvm_flush_dcache_to_poc(va, size);
-}
-
-static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
-						  unsigned long size)
-{
-	if (icache_is_aliasing()) {
-		/* any kind of VIPT cache */
-		__flush_icache_all();
-	} else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
-		/* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
-		void *va = page_address(pfn_to_page(pfn));
-
-		invalidate_icache_range((unsigned long)va,
-					(unsigned long)va + size);
-	}
-}
-
 void kvm_set_way_flush(struct kvm_vcpu *vcpu);
 void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled);

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 4d177ce1d536..829a34eea526 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -464,6 +464,43 @@ static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
 	return 0;
 }

+static bool stage2_pte_cacheable(kvm_pte_t pte)
+{
+	u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR;
+	return memattr == PAGE_S2_MEMATTR(NORMAL);
+}
+
+static bool stage2_pte_executable(kvm_pte_t pte)
+{
+	return !(pte & KVM_PTE_LEAF_ATTR_HI_S2_XN);
+}
+
+static void stage2_flush_dcache(void *addr, u64 size)
+{
+	/*
+	 * With FWB, we ensure that the guest always accesses memory using
+	 * cacheable attributes, and we don't have to clean to PoC when
+	 * faulting in pages. Furthermore, FWB implies IDC, so cleaning to
+	 * PoU is not required either in this case.
+	 */
+	if (cpus_have_const_cap(ARM64_HAS_STAGE2_FWB))
+		return;
+
+	__flush_dcache_area(addr, size);
+}
+
+static void stage2_invalidate_icache(void *addr, u64 size)
+{
+	if (icache_is_aliasing()) {
+		/* Flush any kind of VIPT icache */
+		__flush_icache_all();
+	} else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
+		/* PIPT or VPIPT at EL2 */
+		invalidate_icache_range((unsigned long)addr,
+					(unsigned long)addr + size);
+	}
+}
+
 static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
 				      kvm_pte_t *ptep,
 				      struct stage2_map_data *data)
@@ -495,6 +532,13 @@ static int stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
 		put_page(page);
 	}

+	/* Perform CMOs before installation of the new PTE */
+	if (!kvm_pte_valid(old) || stage2_pte_cacheable(old))
+		stage2_flush_dcache(__va(phys), granule);
+
+	if (stage2_pte_executable(new))
+		stage2_invalidate_icache(__va(phys), granule);
+
 	smp_store_release(ptep, new);
 	get_page(page);
[RFC PATCH v3 0/2] KVM: arm64: Improve efficiency of stage2 page table
Hi,

This is a new version of the series [1] that I have posted before. It
makes some efficiency improvements to the stage2 page table code, and
there are some test results to quantify the benefit of each patch.

[1] v2: https://lore.kernel.org/lkml/20210310094319.18760-1-wangyana...@huawei.com/

Although there hasn't been any feedback on v2, after plenty of
discussion with Alexandru Elisei it became clear that the series needed
a big change. The conclusion drawn was that CMOs are still needed for
the scenario of coalescing tables, and as a result the benefit of patch
#3 in v2 turns out to be rather small judging from the test results. So
that patch is dropped, and the others, which remain meaningful, are kept.

Changelogs:
v2->v3:
- drop patch #3 in v2
- retest v3 based on v5.12-rc2

v1->v2:
- rebased on top of mainline v5.12-rc2
- also move CMOs of I-cache to the fault handlers
- retest v2 based on v5.12-rc2
- v1: https://lore.kernel.org/lkml/20210208112250.163568-1-wangyana...@huawei.com/

About this v3 series:

Patch #1:
We currently uniformly perform CMOs of D-cache and I-cache in function
user_mem_abort before calling the fault handlers. If we get concurrent
guest faults (e.g. translation faults, permission faults) or some really
unnecessary guest faults caused by BBM, CMOs for the first vcpu are
necessary while the later ones are not. By moving CMOs to the fault
handlers, we can easily identify conditions where they are really needed
and avoid the unnecessary ones. Because performing CMOs is a time
consuming process, especially when flushing a block range, this reduces
the load on KVM and improves the efficiency of the page table code.

So let's move both the clean of D-cache and the invalidation of I-cache
to the map path, and move only the invalidation of I-cache to the
permission path. Since the original APIs for CMOs in mmu.c are only
called in function user_mem_abort, we now also move them to pgtable.c.
The following results represent the benefit of patch #1 alone, and they
were obtained using [2] (kvm/selftest) that I have posted recently.

[2] https://lore.kernel.org/lkml/20210302125751.19080-1-wangyana...@huawei.com/

When there are multiple vcpus concurrently accessing the same memory
region, we can test the execution time of KVM creating new mappings,
updating the permissions of old mappings from RO to RW, and rebuilding
the blocks after they have been split.

hardware platform: HiSilicon Kunpeng920 Server
host kernel: Linux mainline v5.12-rc2

cmdline: ./kvm_page_table_test -m 4 -s anonymous -b 1G -v 80
(80 vcpus, 1G memory, page mappings(normal 4K))
KVM_CREATE_MAPPINGS: before 104.35s -> after 90.42s  +13.35%
KVM_UPDATE_MAPPINGS: before 78.64s  -> after 75.45s  + 4.06%

cmdline: ./kvm_page_table_test -m 4 -s anonymous_thp -b 20G -v 40
(40 vcpus, 20G memory, block mappings(THP 2M))
KVM_CREATE_MAPPINGS: before 15.66s  -> after 6.92s   +55.80%
KVM_UPDATE_MAPPINGS: before 178.80s -> after 123.35s +31.00%
KVM_REBUILD_BLOCKS:  before 187.34s -> after 131.76s +30.65%

cmdline: ./kvm_page_table_test -m 4 -s anonymous_hugetlb_1gb -b 20G -v 40
(40 vcpus, 20G memory, block mappings(HUGETLB 1G))
KVM_CREATE_MAPPINGS: before 104.54s -> after 3.70s   +96.46%
KVM_UPDATE_MAPPINGS: before 174.20s -> after 115.94s +33.44%
KVM_REBUILD_BLOCKS:  before 103.95s -> after 2.96s   +97.15%

Patch #2:
A new method to distinguish cases of memcache allocations is
introduced. By comparing fault_granule and vma_pagesize, cases that
require allocations from memcache and cases that don't can be
distinguished completely.
Yanan Wang (2):
  KVM: arm64: Move CMOs from user_mem_abort to the fault handlers
  KVM: arm64: Distinguish cases of memcache allocations completely

 arch/arm64/include/asm/kvm_mmu.h | 31 ---
 arch/arm64/kvm/hyp/pgtable.c     | 68 +---
 arch/arm64/kvm/mmu.c             | 48 --
 3 files changed, 69 insertions(+), 78 deletions(-)

--
2.19.1
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[RFC PATCH v3 2/2] KVM: arm64: Distinguish cases of memcache allocations completely
With a guest translation fault, the memcache pages are not needed if KVM
is only about to install a new leaf entry into the existing page table.
And with a guest permission fault, the memcache pages are also not
needed for a write_fault during dirty logging if KVM is only about to
update the existing leaf entry instead of collapsing a block entry into
a table.

By comparing fault_granule and vma_pagesize, cases that require
allocations from memcache and cases that don't can be distinguished
completely.

Signed-off-by: Yanan Wang
---
 arch/arm64/kvm/mmu.c | 25 -
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 1eec9f63bc6f..05af40dc60c1 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -810,19 +810,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	gfn = fault_ipa >> PAGE_SHIFT;
 	mmap_read_unlock(current->mm);

-	/*
-	 * Permission faults just need to update the existing leaf entry,
-	 * and so normally don't require allocations from the memcache. The
-	 * only exception to this is when dirty logging is enabled at runtime
-	 * and a write fault needs to collapse a block entry into a table.
-	 */
-	if (fault_status != FSC_PERM || (logging_active && write_fault)) {
-		ret = kvm_mmu_topup_memory_cache(memcache,
-						 kvm_mmu_cache_min_pages(kvm));
-		if (ret)
-			return ret;
-	}
-
 	mmu_seq = vcpu->kvm->mmu_notifier_seq;
 	/*
 	 * Ensure the read of mmu_notifier_seq happens before we call
@@ -880,6 +867,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
 		prot |= KVM_PGTABLE_PROT_X;

+	/*
+	 * Allocations from the memcache are required only when granule of the
+	 * lookup level where the guest fault happened exceeds vma_pagesize,
+	 * which means new page tables will be created in the fault handlers.
+	 */
+	if (fault_granule > vma_pagesize) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(kvm));
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 * Under the premise of getting a FSC_PERM fault, we just need to relax
 	 * permissions only if vma_pagesize equals fault_granule. Otherwise,
--
2.19.1
Re: [RFC PATCH 4/4] KVM: arm64: Distinguish cases of memcache allocations completely
Hi Alex,

On 2021/3/26 1:26, Alexandru Elisei wrote:
> Hi Yanan,
>
> On 2/8/21 11:22 AM, Yanan Wang wrote:
>> With a guest translation fault, the memcache pages are not needed if KVM
>> is only about to install a new leaf entry into the existing page table.
>> And with a guest permission fault, the memcache pages are also not needed
>> for a write_fault in dirty-logging time if KVM is only about to update
>> the existing leaf entry instead of collapsing a block entry into a table.
>>
>> By comparing fault_granule and vma_pagesize, cases that require
>> allocations from memcache and cases that don't can be distinguished
>> completely.
>>
>> Signed-off-by: Yanan Wang
>> ---
>>  arch/arm64/kvm/mmu.c | 25 -
>>  1 file changed, 12 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index d151927a7d62..550498a9104e 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -815,19 +815,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	gfn = fault_ipa >> PAGE_SHIFT;
>>  	mmap_read_unlock(current->mm);
>>
>> -	/*
>> -	 * Permission faults just need to update the existing leaf entry,
>> -	 * and so normally don't require allocations from the memcache. The
>> -	 * only exception to this is when dirty logging is enabled at runtime
>> -	 * and a write fault needs to collapse a block entry into a table.
>> -	 */
>> -	if (fault_status != FSC_PERM || (logging_active && write_fault)) {
>> -		ret = kvm_mmu_topup_memory_cache(memcache,
>> -						 kvm_mmu_cache_min_pages(kvm));
>> -		if (ret)
>> -			return ret;
>> -	}
>> -
>>  	mmu_seq = vcpu->kvm->mmu_notifier_seq;
>>  	/*
>>  	 * Ensure the read of mmu_notifier_seq happens before we call
>> @@ -887,6 +874,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
>>  		prot |= KVM_PGTABLE_PROT_X;
>>
>> +	/*
>> +	 * Allocations from the memcache are required only when granule of the
>> +	 * lookup level where the guest fault happened exceeds vma_pagesize,
>> +	 * which means new page tables will be created in the fault handlers.
>> +	 */
>> +	if (fault_granule > vma_pagesize) {
>> +		ret = kvm_mmu_topup_memory_cache(memcache,
>> +						 kvm_mmu_cache_min_pages(kvm));
>> +		if (ret)
>> +			return ret;
>> +	}
>
> I distinguish three situations:
>
> 1. fault_granule == vma_pagesize. If the stage 2 fault occurs at the leaf
> level, then it means that all the tables that the translation table walker
> traversed until the leaf are valid. No need to allocate a new page, as
> stage 2 will only change the leaf to point to a valid PA.
>
> 2. fault_granule > vma_pagesize. This means that there's a table missing
> at some point in the table walk, so we're going to need to allocate at
> least one table to hold the leaf entry. We need to topup the memory cache.
>
> 3. fault_granule < vma_pagesize. From our discussion in patch #3, this can
> happen only if the userspace translation tables use a block mapping, dirty
> page logging is enabled, the fault_ipa is mapped as a last level entry,
> dirty page logging gets disabled and then we get a fault. In this case,
> the PTE table will be coalesced into a PMD block mapping, and the PMD
> table entry that pointed to the PTE table will be changed to a block
> mapping. No table will be allocated.
>
> Looks to me like this patch is valid, but getting it wrong can break a VM
> and I would feel a lot more comfortable if someone who is more familiar
> with the code would have a look.

Thanks for your explanation here. Above is also what I thought about
this patch.

Thanks,
Yanan

> Thanks,
> Alex
>
>> +
>>  	/*
>>  	 * Under the premise of getting a FSC_PERM fault, we just need to relax
>>  	 * permissions only if vma_pagesize equals fault_granule. Otherwise,
> .
Re: [GIT PULL] KVM/arm64 fixes for 5.12, take #3
On 25/03/21 12:44, Marc Zyngier wrote:
> git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-5.12-3

Pulled, thanks.

Paolo
Re: [RFC PATCH 4/4] KVM: arm64: Distinguish cases of memcache allocations completely
Hi Yanan, On 2/8/21 11:22 AM, Yanan Wang wrote: > With a guest translation fault, the memcache pages are not needed if KVM > is only about to install a new leaf entry into the existing page table. > And with a guest permission fault, the memcache pages are also not needed > for a write_fault in dirty-logging time if KVM is only about to update > the existing leaf entry instead of collapsing a block entry into a table. > > By comparing fault_granule and vma_pagesize, cases that require allocations > from memcache and cases that don't can be distinguished completely. > > Signed-off-by: Yanan Wang > --- > arch/arm64/kvm/mmu.c | 25 - > 1 file changed, 12 insertions(+), 13 deletions(-) > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c > index d151927a7d62..550498a9104e 100644 > --- a/arch/arm64/kvm/mmu.c > +++ b/arch/arm64/kvm/mmu.c > @@ -815,19 +815,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, > phys_addr_t fault_ipa, > gfn = fault_ipa >> PAGE_SHIFT; > mmap_read_unlock(current->mm); > > - /* > - * Permission faults just need to update the existing leaf entry, > - * and so normally don't require allocations from the memcache. The > - * only exception to this is when dirty logging is enabled at runtime > - * and a write fault needs to collapse a block entry into a table. 
> - */ > - if (fault_status != FSC_PERM || (logging_active && write_fault)) { > - ret = kvm_mmu_topup_memory_cache(memcache, > - kvm_mmu_cache_min_pages(kvm)); > - if (ret) > - return ret; > - } > - > mmu_seq = vcpu->kvm->mmu_notifier_seq; > /* >* Ensure the read of mmu_notifier_seq happens before we call > @@ -887,6 +874,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, > phys_addr_t fault_ipa, > else if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) > prot |= KVM_PGTABLE_PROT_X; > > + /* > + * Allocations from the memcache are required only when granule of the > + * lookup level where the guest fault happened exceeds vma_pagesize, > + * which means new page tables will be created in the fault handlers. > + */ > + if (fault_granule > vma_pagesize) { > + ret = kvm_mmu_topup_memory_cache(memcache, > + kvm_mmu_cache_min_pages(kvm)); > + if (ret) > + return ret; > + } I distinguish three situations: 1. fault_granule == vma_pagesize. If the stage 2 fault occurs at the leaf level, then it means that all the tables that the translation table walker traversed until the leaf are valid. No need to allocate a new page, as stage 2 will only change the leaf to point to a valid PA. 2. fault_granule > vma_pagesize. This means that there's a table missing at some point in the table walk, so we're going to need to allocate at least one table to hold the leaf entry. We need to topup the memory cache. 3. fault_granule < vma_pagesize. From our discussion in patch #3, this can happen only if the userspace translation tables use a block mapping, dirty page logging is enabled, the fault_ipa is mapped as a last level entry, dirty page logging gets disabled and then we get a fault. In this case, the PTE table will be coalesced into a PMD block mapping, and the PMD table entry that pointed to the PTE table will be changed to a block mapping. No table will be allocated. 
Looks to me like this patch is valid, but getting it wrong can break a VM and I would feel a lot more comfortable if someone who is more familiar with the code would have a look. Thanks, Alex > + > /* >* Under the premise of getting a FSC_PERM fault, we just need to relax >* permissions only if vma_pagesize equals fault_granule. Otherwise, ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [kvm-unit-tests PATCH v2 0/6] Misc assembly fixes and cleanups
On Mon, Mar 22, 2021 at 03:06:35PM +, Alexandru Elisei wrote: > This series is mostly fixes and cleanups for things I found when playing > with EFI support. Most of them I hope are fairly self-explanatory. > > What is clearly aimed at running on baremetal is patch #2 ("arm/arm64: > Remove dcache_line_size global variable"), which is needed because the > startup environment is different for EFI apps and we're going to need to do > cache maintenance before setup() is run. > > Patch #4 ("lib: arm64: Consolidate register definitions to sysreg.h") is > there to make importing register definitions and other header files from > Linux (like el2_setup.h) easier by switching to the same layout. And arm > is already using sysreg.h for SCTLR fields. > > Changes in v2: > * Gathered Reviewed-by tags, thank you! > * For patch #2 ("arm/arm64: Remove dcache_line_size global variable"), I've > modified the commit message to mention the change in parameters for > dcache_by_line_op, I've added the proper header guards to > lib/arm/asm/assembler.h and I've changed raw_dcache_line_size to use ubfx > instead of ubfm. 
> > Alexandru Elisei (6): > arm64: Remove unnecessary ISB when writing to SPSel > arm/arm64: Remove dcache_line_size global variable > arm/arm64: Remove unnecessary ISB when doing dcache maintenance > lib: arm64: Consolidate register definitions to sysreg.h > arm64: Configure SCTLR_EL1 at boot > arm64: Disable TTBR1_EL1 translation table walks > > lib/arm/asm/assembler.h | 53 ++ > lib/arm/asm/processor.h | 7 - > lib/arm64/asm/arch_gicv3.h| 6 > lib/arm64/asm/assembler.h | 54 +++ > lib/arm64/asm/pgtable-hwdef.h | 1 + > lib/arm64/asm/processor.h | 17 --- > lib/arm64/asm/sysreg.h| 24 > lib/arm/setup.c | 7 - > arm/cstart.S | 19 ++-- > arm/cstart64.S| 28 +++--- > 10 files changed, 145 insertions(+), 71 deletions(-) > create mode 100644 lib/arm/asm/assembler.h > create mode 100644 lib/arm64/asm/assembler.h > > -- > 2.31.0 > Applied to arm/queue https://gitlab.com/rhdrjones/kvm-unit-tests/-/commits/arm/queue Thanks, drew ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[GIT PULL] KVM/arm64 fixes for 5.12, take #3
Hi Paolo,

Here's another set of fixes for KVM/arm64 in 5.12. One patch fixes a
GICv3 MMIO regression introduced when working around a firmware bug.
The last two patches prevent the guest from messing with the ARMv8.4
tracing, a new feature that was introduced in 5.12.

Please pull,

M.

The following changes since commit 1e28eed17697bcf343c6743f0028cc3b5dd88bf0:

  Linux 5.12-rc3 (2021-03-14 14:41:02 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvmarm-fixes-5.12-3

for you to fetch changes up to af22df997d71c32304d6835a8b690281063b8010:

  KVM: arm64: Fix CPU interface MMIO compatibility detection (2021-03-24 17:26:38 +)

KVM/arm64 fixes for 5.12, take #3

- Fix GICv3 MMIO compatibility probing
- Prevent guests from using the ARMv8.4 self-hosted tracing extension

Marc Zyngier (1):
      KVM: arm64: Fix CPU interface MMIO compatibility detection

Suzuki K Poulose (2):
      KVM: arm64: Hide system instruction access to Trace registers
      KVM: arm64: Disable guest access to trace filter controls

 arch/arm64/include/asm/kvm_arm.h | 1 +
 arch/arm64/kernel/cpufeature.c   | 1 -
 arch/arm64/kvm/debug.c           | 2 ++
 arch/arm64/kvm/hyp/vgic-v3-sr.c  | 9 +
 4 files changed, 12 insertions(+), 1 deletion(-)
Re: [PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
On Thu, Mar 25, 2021 at 09:33:17AM +, Marc Zyngier wrote: > On 2021-03-25 09:14, Shameer Kolothum wrote: > > From: Marc Zyngier > > > > commit 9739f6ef053f104a997165701c6e15582c4307ee upstream. > > > > It looks like we have broken firmware out there that wrongly advertises > > a GICv2 compatibility interface, despite the CPUs not being able to deal > > with it. > > > > To work around this, check that the CPU initialising KVM is actually > > able > > to switch to MMIO instead of system registers, and use that as a > > precondition to enable GICv2 compatibility in KVM. > > > > Note that the detection happens on a single CPU. If the firmware is > > lying *and* that the CPUs are asymetric, all hope is lost anyway. > > > > Cc: sta...@vger.kernel.org #5.10 > > Reported-by: Shameerali Kolothum Thodi > > > > Tested-by: Shameer Kolothum > > Signed-off-by: Marc Zyngier > > Message-Id: <20210305185254.3730990-8-...@kernel.org> > > Signed-off-by: Paolo Bonzini > > Signed-off-by: Shameer Kolothum > > Please hold on on that. > > This patch causes a regression, and needs a fix that is currently queued > for 5.12 [1]. Once this hits upstream, please add the fix to the series > and post it as a whole. Both patches now dropped from my queue, thanks. greg k-h ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v6 00/38] KVM: arm64: Stage-2 for the host
On Fri, 19 Mar 2021 10:01:08 +, Quentin Perret wrote: > This is the v6 of the series previously posted here: > > https://lore.kernel.org/r/20210315143536.214621-1-qper...@google.com/ > > This basically allows us to wrap the host with a stage 2 when running in > nVHE, hence paving the way for protecting guest memory from the host in > the future (among other use-cases). For more details about the > motivation and the design angle taken here, I would recommend to have a > look at the cover letter of v1, and/or to watch these presentations at > LPC [1] and KVM forum 2020 [2]. > > [...] Applied to next, thanks! [01/38] arm64: lib: Annotate {clear, copy}_page() as position-independent commit: 8d9902055c57548bb342dc3ca78caa21e9643024 [02/38] KVM: arm64: Link position-independent string routines into .hyp.text commit: 7b4a7b5e6fefd15f708f959dd43e188444e252ec [03/38] arm64: kvm: Add standalone ticket spinlock implementation for use at hyp commit: 67c2d326332ee28079348e43cf4f17bbfe63b260 [04/38] KVM: arm64: Initialize kvm_nvhe_init_params early commit: 9cc7758145fd24b17cff0734b7cfd80de30be052 [05/38] KVM: arm64: Avoid free_page() in page-table allocator commit: cc706a63894fdcc25d226378898921e1ab7dd64e [06/38] KVM: arm64: Factor memory allocation out of pgtable.c commit: 7aef0cbcdcd0995efde9957b3eda9f31a219613d [07/38] KVM: arm64: Introduce a BSS section for use at Hyp commit: 380e18ade4a51334e8806160e6f0fdfaca0b4428 [08/38] KVM: arm64: Make kvm_call_hyp() a function call at Hyp commit: 40a50853d37af3fd2e98b769e1a79839ad16b107 [09/38] KVM: arm64: Allow using kvm_nvhe_sym() in hyp code commit: fa21472a316af8ad7af3114049db89678444c7ed [10/38] KVM: arm64: Introduce an early Hyp page allocator commit: e759604087231c672f91564cc805336e70d333a0 [11/38] KVM: arm64: Stub CONFIG_DEBUG_LIST at Hyp commit: 40d9e41e525c13d07bc72d49968926f4502e5b33 [12/38] KVM: arm64: Introduce a Hyp buddy page allocator commit: 8e17c66249e9ea08b44879c7af0315e70a83316c [13/38] KVM: arm64: Enable access to 
sanitized CPU features at EL2 commit: 7a440cc78392c3caf805ef0afc7ead031e4d0830 [14/38] KVM: arm64: Provide __flush_dcache_area at EL2 commit: d460df12926825a3926da91f054f9f11f88bb33e [15/38] KVM: arm64: Factor out vector address calculation commit: bc1d2892e9aa6dcf6cd83adbd3616051cbd4c429 [16/38] arm64: asm: Provide set_sctlr_el2 macro commit: 8f4de66e247b805e1b3d1c15367ee0ef4cbb6003 [17/38] KVM: arm64: Prepare the creation of s1 mappings at EL2 commit: f320bc742bc23c1d43567712fe2814bf04b19ebc [18/38] KVM: arm64: Elevate hypervisor mappings creation at EL2 commit: bfa79a805454f768b8d76ab683659d9e219a037a [19/38] KVM: arm64: Use kvm_arch for stage 2 pgtable commit: 834cd93deb75f3a43420e479f133dd02fba95aa6 [20/38] KVM: arm64: Use kvm_arch in kvm_s2_mmu commit: cfb1a98de7a9aa51931ff5b336fc5c3c201d01cc [21/38] KVM: arm64: Set host stage 2 using kvm_nvhe_init_params commit: 734864c177bca5148adfe7a96744993d61513430 [22/38] KVM: arm64: Refactor kvm_arm_setup_stage2() commit: bcb25a2b86b4b96385ffbcc54d51c400793b7393 [23/38] KVM: arm64: Refactor __load_guest_stage2() commit: 6ec7e56d3265f6e7673d0788bfa3a76820c9bdfe [24/38] KVM: arm64: Refactor __populate_fault_info() commit: 159b859beed76836a2c7cfa6303c312a40bb9dc7 [25/38] KVM: arm64: Make memcache anonymous in pgtable allocator commit: e37f37a0e780f23210b2a5cb314dab39fea7086a [26/38] KVM: arm64: Reserve memory for host stage 2 commit: 04e5de03093f669ccc233e56b7838bfa7a7af6e1 [27/38] KVM: arm64: Sort the hypervisor memblocks commit: a14307f5310c737744641ff8da7a8d491c3c85cd [28/38] KVM: arm64: Always zero invalid PTEs commit: f60ca2f9321a71ee3d2a7bd620c1827b82ce05f2 [29/38] KVM: arm64: Use page-table to track page ownership commit: 807923e04a0f5c6c34dc2eb52ae544cb0e4e4e66 [30/38] KVM: arm64: Refactor the *_map_set_prot_attr() helpers commit: 3fab82347ffb36c8b7b38dabc8e79276eeb1a81c [31/38] KVM: arm64: Add kvm_pgtable_stage2_find_range() commit: 2fcb3a59401d2d12b5337b62c799eeb22cf40a2c [32/38] KVM: arm64: Introduce 
KVM_PGTABLE_S2_NOFWB stage 2 flag commit: bc224df155c466178128a2950af16cba37b6f218 [33/38] KVM: arm64: Introduce KVM_PGTABLE_S2_IDMAP stage 2 flag commit: 8942a237c771b65f8bc1232536e4b4b829c7701f [34/38] KVM: arm64: Provide sanitized mmfr* registers at EL2 commit: def1aaf9e0bc6987bb4b417aac37226e994a1a74 [35/38] KVM: arm64: Wrap the host with a stage 2 commit: 1025c8c0c6accfcbdc8f52ca1940160f65cd87d6 [36/38] KVM: arm64: Page-align the .hyp sections commit: b83042f0f143a5e9e899924987b542b2ac766e53 [37/38] KVM: arm64: Disable PMU support in protected mode commit: 9589a38cdfeba0889590e6ef4627b439034d456c [38/38] KVM: arm64: Protect the .hyp sections from the host commit:
Re: [PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
On Thu, 25 Mar 2021 09:37:15 +, Shameerali Kolothum Thodi wrote: > > > > > -Original Message- > > From: Marc Zyngier [mailto:m...@kernel.org] > > Sent: 25 March 2021 09:33 > > To: Shameerali Kolothum Thodi > > Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org; > > linux-arm-ker...@lists.infradead.org; sta...@vger.kernel.org; > > pbonz...@redhat.com; Linuxarm > > Subject: Re: [PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware > > wrongly advertising GICv2-on-v3 compatibility > > > > On 2021-03-25 09:14, Shameer Kolothum wrote: > > > From: Marc Zyngier > > > > > > commit 9739f6ef053f104a997165701c6e15582c4307ee upstream. > > > > > > It looks like we have broken firmware out there that wrongly > > > advertises a GICv2 compatibility interface, despite the CPUs not being > > > able to deal with it. > > > > > > To work around this, check that the CPU initialising KVM is actually > > > able to switch to MMIO instead of system registers, and use that as a > > > precondition to enable GICv2 compatibility in KVM. > > > > > > Note that the detection happens on a single CPU. If the firmware is > > > lying *and* that the CPUs are asymetric, all hope is lost anyway. > > > > > > Cc: sta...@vger.kernel.org #5.10 > > > Reported-by: Shameerali Kolothum Thodi > > > > > > Tested-by: Shameer Kolothum > > > Signed-off-by: Marc Zyngier > > > Message-Id: <20210305185254.3730990-8-...@kernel.org> > > > Signed-off-by: Paolo Bonzini > > > Signed-off-by: Shameer Kolothum > > > > Please hold on on that. > > > > This patch causes a regression, and needs a fix that is currently queued > > for 5.12 > > [1]. Once this hits upstream, please add the fix to the series and post it > > as a > > whole. > > Ok. Yes, I noted that. But was thinking if this goes through first > and then we can have a stable tag for that one, we can manage > it. The problem is we'd end-up with 5.10 being subtly broken for a while, and I want to avoid this. 
Especially given that not having this series only affects broken
platforms, while having an incomplete series breaks working systems
(which would be counterproductive).

> Anyway, will wait now.

Thanks for that,

M.

--
Without deviation from the norm, progress is not possible.
RE: [PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
> -Original Message- > From: Marc Zyngier [mailto:m...@kernel.org] > Sent: 25 March 2021 09:33 > To: Shameerali Kolothum Thodi > Cc: kvmarm@lists.cs.columbia.edu; k...@vger.kernel.org; > linux-arm-ker...@lists.infradead.org; sta...@vger.kernel.org; > pbonz...@redhat.com; Linuxarm > Subject: Re: [PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware > wrongly advertising GICv2-on-v3 compatibility > > On 2021-03-25 09:14, Shameer Kolothum wrote: > > From: Marc Zyngier > > > > commit 9739f6ef053f104a997165701c6e15582c4307ee upstream. > > > > It looks like we have broken firmware out there that wrongly > > advertises a GICv2 compatibility interface, despite the CPUs not being > > able to deal with it. > > > > To work around this, check that the CPU initialising KVM is actually > > able to switch to MMIO instead of system registers, and use that as a > > precondition to enable GICv2 compatibility in KVM. > > > > Note that the detection happens on a single CPU. If the firmware is > > lying *and* that the CPUs are asymetric, all hope is lost anyway. > > > > Cc: sta...@vger.kernel.org #5.10 > > Reported-by: Shameerali Kolothum Thodi > > > > Tested-by: Shameer Kolothum > > Signed-off-by: Marc Zyngier > > Message-Id: <20210305185254.3730990-8-...@kernel.org> > > Signed-off-by: Paolo Bonzini > > Signed-off-by: Shameer Kolothum > > Please hold on on that. > > This patch causes a regression, and needs a fix that is currently queued for > 5.12 > [1]. Once this hits upstream, please add the fix to the series and post it as > a > whole. Ok. Yes, I noted that. But was thinking if this goes through first and then we can have a stable tag for that one, we can manage it. Anyway, will wait now. Thanks, Shameer > Thanks, > > M. > > [1] https://lore.kernel.org/r/20210323162301.2049595-1-...@kernel.org > -- > Jazz is not dead. It just smells funny... ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
On 2021-03-25 09:14, Shameer Kolothum wrote:
> From: Marc Zyngier
>
> commit 9739f6ef053f104a997165701c6e15582c4307ee upstream.
>
> It looks like we have broken firmware out there that wrongly
> advertises a GICv2 compatibility interface, despite the CPUs not
> being able to deal with it.
>
> To work around this, check that the CPU initialising KVM is actually
> able to switch to MMIO instead of system registers, and use that as a
> precondition to enable GICv2 compatibility in KVM.
>
> Note that the detection happens on a single CPU. If the firmware is
> lying *and* the CPUs are asymmetric, all hope is lost anyway.
>
> Cc: sta...@vger.kernel.org #5.10
> Reported-by: Shameerali Kolothum Thodi
> Tested-by: Shameer Kolothum
> Signed-off-by: Marc Zyngier
> Message-Id: <20210305185254.3730990-8-...@kernel.org>
> Signed-off-by: Paolo Bonzini
> Signed-off-by: Shameer Kolothum

Please hold off on that.

This patch causes a regression, and needs a fix that is currently queued for 5.12 [1]. Once this hits upstream, please add the fix to the series and post it as a whole.

Thanks,

M.

[1] https://lore.kernel.org/r/20210323162301.2049595-1-...@kernel.org

--
Jazz is not dead. It just smells funny...
[PATCH for-stable-5.10 2/2] KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3 compatibility
From: Marc Zyngier

commit 9739f6ef053f104a997165701c6e15582c4307ee upstream.

It looks like we have broken firmware out there that wrongly
advertises a GICv2 compatibility interface, despite the CPUs not
being able to deal with it.

To work around this, check that the CPU initialising KVM is actually
able to switch to MMIO instead of system registers, and use that as a
precondition to enable GICv2 compatibility in KVM.

Note that the detection happens on a single CPU. If the firmware is
lying *and* the CPUs are asymmetric, all hope is lost anyway.

Cc: sta...@vger.kernel.org #5.10
Reported-by: Shameerali Kolothum Thodi
Tested-by: Shameer Kolothum
Signed-off-by: Marc Zyngier
Message-Id: <20210305185254.3730990-8-...@kernel.org>
Signed-off-by: Paolo Bonzini
Signed-off-by: Shameer Kolothum
---
 arch/arm64/kvm/hyp/vgic-v3-sr.c | 35 +++++++++++++++++++++++++++++++--
 arch/arm64/kvm/vgic/vgic-v3.c   |  8 ++++++--
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 54ce4048d7d1..098b96c121e3 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -406,11 +406,42 @@ void __vgic_v3_init_lrs(void)
 /*
  * Return the GIC CPU configuration:
  * - [31:0]	ICH_VTR_EL2
- * - [63:32]	RES0
+ * - [62:32]	RES0
+ * - [63]	MMIO (GICv2) capable
  */
 u64 __vgic_v3_get_gic_config(void)
 {
-	return read_gicreg(ICH_VTR_EL2);
+	u64 val, sre = read_gicreg(ICC_SRE_EL1);
+	unsigned long flags = 0;
+
+	/*
+	 * To check whether we have a MMIO-based (GICv2 compatible)
+	 * CPU interface, we need to disable the system register
+	 * view. To do that safely, we have to prevent any interrupt
+	 * from firing (which would be deadly).
+	 *
+	 * Note that this only makes sense on VHE, as interrupts are
+	 * already masked for nVHE as part of the exception entry to
+	 * EL2.
+	 */
+	if (has_vhe())
+		flags = local_daif_save();
+
+	write_gicreg(0, ICC_SRE_EL1);
+	isb();
+
+	val = read_gicreg(ICC_SRE_EL1);
+
+	write_gicreg(sre, ICC_SRE_EL1);
+	isb();
+
+	if (has_vhe())
+		local_daif_restore(flags);
+
+	val = (val & ICC_SRE_EL1_SRE) ? 0 : (1ULL << 63);
+	val |= read_gicreg(ICH_VTR_EL2);
+
+	return val;
 }
 
 u64 __vgic_v3_read_vmcr(void)
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 8e7bf3151057..6a4bced0851d 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -584,8 +584,10 @@ early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
 int vgic_v3_probe(const struct gic_kvm_info *info)
 {
 	u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
+	bool has_v2;
 	int ret;
 
+	has_v2 = ich_vtr_el2 >> 63;
 	ich_vtr_el2 = (u32)ich_vtr_el2;
 
 	/*
@@ -605,13 +607,15 @@ int vgic_v3_probe(const struct gic_kvm_info *info)
 			gicv4_enable ? "en" : "dis");
 	}
 
+	kvm_vgic_global_state.vcpu_base = 0;
+
 	if (!info->vcpu.start) {
 		kvm_info("GICv3: no GICV resource entry\n");
-		kvm_vgic_global_state.vcpu_base = 0;
+	} else if (!has_v2) {
+		pr_warn(FW_BUG "CPU interface incapable of MMIO access\n");
 	} else if (!PAGE_ALIGNED(info->vcpu.start)) {
 		pr_warn("GICV physical address 0x%llx not page aligned\n",
 			(unsigned long long)info->vcpu.start);
-		kvm_vgic_global_state.vcpu_base = 0;
 	} else {
 		kvm_vgic_global_state.vcpu_base = info->vcpu.start;
 		kvm_vgic_global_state.can_emulate_gicv2 = true;
-- 
2.17.1
[PATCH for-stable-5.10 1/2] KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to __vgic_v3_get_gic_config()
From: Marc Zyngier

commit b9d699e2694d032aa8ecc15141f698ccb050dc95 upstream.

As we are about to report a bit more information to the rest of the
kernel, rename __vgic_v3_get_ich_vtr_el2() to the more explicit
__vgic_v3_get_gic_config().

No functional change.

Cc: sta...@vger.kernel.org #5.10
Tested-by: Shameer Kolothum
Signed-off-by: Marc Zyngier
Message-Id: <20210305185254.3730990-7-...@kernel.org>
Signed-off-by: Paolo Bonzini
[Shameer: Backported to 5.10]
Signed-off-by: Shameer Kolothum
---
 arch/arm64/include/asm/kvm_asm.h   | 4 ++--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 4 ++--
 arch/arm64/kvm/hyp/vgic-v3-sr.c    | 7 ++++++-
 arch/arm64/kvm/vgic/vgic-v3.c      | 4 +++-
 4 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 54387ccd1ab2..21de2d632f97 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -52,7 +52,7 @@
 #define __KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_local_vmid	5
 #define __KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff		6
 #define __KVM_HOST_SMCCC_FUNC___kvm_enable_ssbs			7
-#define __KVM_HOST_SMCCC_FUNC___vgic_v3_get_ich_vtr_el2		8
+#define __KVM_HOST_SMCCC_FUNC___vgic_v3_get_gic_config		8
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_read_vmcr		9
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_write_vmcr		10
 #define __KVM_HOST_SMCCC_FUNC___vgic_v3_init_lrs		11
@@ -191,7 +191,7 @@ extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
 extern void __kvm_enable_ssbs(void);
 
-extern u64 __vgic_v3_get_ich_vtr_el2(void);
+extern u64 __vgic_v3_get_gic_config(void);
 extern u64 __vgic_v3_read_vmcr(void);
 extern void __vgic_v3_write_vmcr(u32 vmcr);
 extern void __vgic_v3_init_lrs(void);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index e2eafe2c93af..1e09d0052bf3 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -62,8 +62,8 @@ static void handle_host_hcall(unsigned long func_id,
 	case KVM_HOST_SMCCC_FUNC(__kvm_enable_ssbs):
 		__kvm_enable_ssbs();
 		break;
-	case KVM_HOST_SMCCC_FUNC(__vgic_v3_get_ich_vtr_el2):
-		ret = __vgic_v3_get_ich_vtr_el2();
+	case KVM_HOST_SMCCC_FUNC(__vgic_v3_get_gic_config):
+		ret = __vgic_v3_get_gic_config();
 		break;
 	case KVM_HOST_SMCCC_FUNC(__vgic_v3_read_vmcr):
 		ret = __vgic_v3_read_vmcr();
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 452f4cacd674..54ce4048d7d1 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -403,7 +403,12 @@ void __vgic_v3_init_lrs(void)
 		__gic_v3_set_lr(0, i);
 }
 
-u64 __vgic_v3_get_ich_vtr_el2(void)
+/*
+ * Return the GIC CPU configuration:
+ * - [31:0]	ICH_VTR_EL2
+ * - [63:32]	RES0
+ */
+u64 __vgic_v3_get_gic_config(void)
 {
 	return read_gicreg(ICH_VTR_EL2);
 }
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 9cdf39a94a63..8e7bf3151057 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -583,9 +583,11 @@ early_param("kvm-arm.vgic_v4_enable", early_gicv4_enable);
  */
 int vgic_v3_probe(const struct gic_kvm_info *info)
 {
-	u32 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_ich_vtr_el2);
+	u64 ich_vtr_el2 = kvm_call_hyp_ret(__vgic_v3_get_gic_config);
 	int ret;
 
+	ich_vtr_el2 = (u32)ich_vtr_el2;
+
 	/*
 	 * The ListRegs field is 5 bits, but there is an architectural
 	 * maximum of 16 list registers. Just ignore bit 4...
-- 
2.17.1
[PATCH for-stable-5.10 0/2] Backport: Work around firmware wrongly advertising GICv2 compatibility
Backport of 2 patches that are required on ARM64 platforms with firmware wrongly advertising GICv2 compatibility. Patch 1 needed an nVHE-related conflict resolution, while patch 2 applies cleanly.

Tested on the HiSilicon D06 platform.

Marc Zyngier (2):
  KVM: arm64: Rename __vgic_v3_get_ich_vtr_el2() to
    __vgic_v3_get_gic_config()
  KVM: arm64: Workaround firmware wrongly advertising GICv2-on-v3
    compatibility

 arch/arm64/include/asm/kvm_asm.h   |  4 +--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c |  4 +--
 arch/arm64/kvm/hyp/vgic-v3-sr.c    | 40 ++++++++++++++++++++++++++++--
 arch/arm64/kvm/vgic/vgic-v3.c      | 12 +++++++---
 4 files changed, 51 insertions(+), 9 deletions(-)

-- 
2.17.1