[RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA
For pass-through device assignment, the ARM64 KVM hypervisor retrieves the memory region properties (physical address, size, and whether the region is backed by struct page) from the VMA. The prefetchable attribute of a BAR region isn't visible to KVM, so it cannot make an optimal decision for the stage-2 attributes. This patch updates vma->vm_page_prot to map with the write-combine attribute if the associated BAR is prefetchable. On ARM64, pgprot_writecombine() maps to memory type MT_NORMAL_NC, which has no side effects on reads and allows multiple writes to be combined.

Signed-off-by: Shanker Donthineni
---
 drivers/vfio/pci/vfio_pci.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 5023e23db3bc..1b734fe1dd51 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -1703,7 +1703,11 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
 	}
 
 	vma->vm_private_data = vdev;
-	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+	if (IS_ENABLED(CONFIG_ARM64) &&
+	    (pci_resource_flags(pdev, index) & IORESOURCE_PREFETCH))
+		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+	else
+		vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 	vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff;
 
 	/*
-- 
2.17.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
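The decision in the hunk above can be modeled as a small standalone function. This is only a sketch: the enum and the helper name are hypothetical, though the IORESOURCE_PREFETCH value mirrors include/linux/ioport.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Value mirrors IORESOURCE_PREFETCH in include/linux/ioport.h. */
#define IORESOURCE_PREFETCH 0x2000

/* Hypothetical stand-ins for pgprot_noncached()/pgprot_writecombine(). */
enum bar_mapping { MAP_NONCACHED, MAP_WRITECOMBINE };

/* Mirrors the patch's policy: on arm64, a prefetchable BAR is mapped
 * write-combine; everything else stays noncached (strongly ordered). */
static enum bar_mapping choose_bar_mapping(bool is_arm64, unsigned long res_flags)
{
	if (is_arm64 && (res_flags & IORESOURCE_PREFETCH))
		return MAP_WRITECOMBINE;
	return MAP_NONCACHED;
}
```

Note that the policy is deliberately conservative: non-ARM64 architectures keep the existing noncached mapping unconditionally.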
[RFC 0/2] Honor PCI prefetchable attributes for a virtual machine on ARM64
Problem statement:
The virtual machine crashes when the NVIDIA GPU driver accesses a prefetchable BAR space, due to unaligned reads/writes to the pass-through device. The same binary works as expected in the host kernel. Only one BAR holds control & status registers (CSR); the other PCI BARs are marked prefetchable. The NVIDIA GPU driver uses the write-combine feature when mapping the prefetchable BARs to improve performance. This problem applies to any other driver that wants to enable WC.

Solution:
Honor the PCI prefetchable attributes for guest operating systems.

Proposal:
ARM64 KVM uses the VMA struct for the information needed to set up a stage-2 page table, e.g. the region's physical address, size, and memory type (struct-page-backed mapping or anonymous memory). Right now a memory region can be mapped either as DEVICE (strongly ordered) or NORMAL (write-back cacheable), depending on the VM_PFNMAP flag in the VMA. VFIO-PCI will keep the prefetchable (write-combine) information in vma->vm_page_prot, similar to the other fields, and KVM will prepare stage-2 entries based on the memory-type attribute that was set in the VMA.

Shanker Donthineni (2):
  vfio/pci: keep the prefetchable attribute of a BAR region in VMA
  KVM: arm64: Add write-combine support for stage-2 entries

 arch/arm64/include/asm/kvm_mmu.h     |  3 ++-
 arch/arm64/include/asm/kvm_pgtable.h |  2 ++
 arch/arm64/include/asm/memory.h      |  4 +++-
 arch/arm64/kvm/hyp/pgtable.c         |  9 +++--
 arch/arm64/kvm/mmu.c                 | 22 +++---
 arch/arm64/kvm/vgic/vgic-v2.c        |  2 +-
 drivers/vfio/pci/vfio_pci.c          |  6 +-
 7 files changed, 39 insertions(+), 9 deletions(-)

-- 
2.17.1
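The proposed stage-2 policy can be summarized as a pure decision function. This is a sketch, not the actual KVM code; the helper name is hypothetical, and the MemAttr encodings are the ones introduced by this series in arch/arm64/include/asm/memory.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Stage-2 MemAttr encodings from patch 2/2 of this series. */
#define MT_S2_NORMAL        0xf
#define MT_S2_WRITE_COMBINE 0x5
#define MT_S2_DEVICE_nGnRE  0x1

/* Sketch of the proposal: struct-page-backed memory stays Normal-WB;
 * PFNMAP (device) memory becomes Normal-NC when the VMA was mapped
 * write-combine, otherwise Device-nGnRE. */
static unsigned int stage2_memattr(bool vm_pfnmap, bool vma_is_writecombine)
{
	if (!vm_pfnmap)
		return MT_S2_NORMAL;
	return vma_is_writecombine ? MT_S2_WRITE_COMBINE : MT_S2_DEVICE_nGnRE;
}
```

The key design point is that the write-combine hint travels from VFIO-PCI to KVM implicitly, through vma->vm_page_prot, rather than through a new ioctl or capability.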
[RFC 2/2] KVM: arm64: Add write-combine support for stage-2 entries
In the current implementation, device memory is always mapped as DEVICE_nGnRE in stage-2. In the host kernel, device drivers have the flexibility to choose either the Device memory type or write-combine (Normal non-cacheable), depending on the use case. The PCI specification has the concept of a prefetchable BAR, where multiple writes can be combined and reads have no side effects. This provides a huge performance improvement and also allows unaligned accesses.

NVIDIA GPU PCIe devices have 3 BAR regions. Two regions are mapped to video/compute memory and marked prefetchable. The GPU driver takes advantage of the write-combine feature for higher performance. The same driver has no issues in the host kernel but crashes inside the virtual machine because of unaligned accesses.

This patch finds the PTE attributes for device memory in the VMA. It sets the stage-2 attribute to NORMAL_NC for WC regions and keeps the default type DEVICE_nGnRE for non-WC regions.

Signed-off-by: Shanker Donthineni
---
 arch/arm64/include/asm/kvm_mmu.h     |  3 ++-
 arch/arm64/include/asm/kvm_pgtable.h |  2 ++
 arch/arm64/include/asm/memory.h      |  4 +++-
 arch/arm64/kvm/hyp/pgtable.c         |  9 +++--
 arch/arm64/kvm/mmu.c                 | 21 ++---
 arch/arm64/kvm/vgic/vgic-v2.c        |  2 +-
 6 files changed, 33 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 90873851f677..dec498a6ba2f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -160,7 +160,8 @@ void stage2_unmap_vm(struct kvm *kvm);
 int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu);
 void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu);
 int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
-			  phys_addr_t pa, unsigned long size, bool writable);
+			  phys_addr_t pa, unsigned long size, bool writable,
+			  bool writecombine);
 
 int kvm_handle_guest_abort(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/include/asm/kvm_pgtable.h
b/arch/arm64/include/asm/kvm_pgtable.h
index 8886d43cfb11..26f28220f6f3 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -35,6 +35,7 @@ struct kvm_pgtable {
  * @KVM_PGTABLE_PROT_W:		Write permission.
  * @KVM_PGTABLE_PROT_R:		Read permission.
  * @KVM_PGTABLE_PROT_DEVICE:	Device attributes.
+ * @KVM_PGTABLE_PROT_WC:	Normal non-cacheable (WC).
  */
 enum kvm_pgtable_prot {
 	KVM_PGTABLE_PROT_X	= BIT(0),
@@ -42,6 +43,7 @@ enum kvm_pgtable_prot {
 	KVM_PGTABLE_PROT_R	= BIT(2),
 
 	KVM_PGTABLE_PROT_DEVICE	= BIT(3),
+	KVM_PGTABLE_PROT_WC	= BIT(4),
 };
 
 #define PAGE_HYP		(KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 0aabc3be9a75..04a812b59437 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -144,13 +144,15 @@
  * Memory types for Stage-2 translation
  */
 #define MT_S2_NORMAL		0xf
+#define MT_S2_WRITE_COMBINE	5
 #define MT_S2_DEVICE_nGnRE	0x1
 
 /*
  * Memory types for Stage-2 translation when ID_AA64MMFR2_EL1.FWB is 0001
- * Stage-2 enforces Normal-WB and Device-nGnRE
+ * Stage-2 enforces Normal-WB, Normal-NC and Device-nGnRE
  */
 #define MT_S2_FWB_NORMAL	6
+#define MT_S2_FWB_WRITE_COMBINE	5
 #define MT_S2_FWB_DEVICE_nGnRE	1
 
 #ifdef CONFIG_ARM64_4K_PAGES
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 926fc07074f5..bdfed559eae2 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -444,9 +444,14 @@ static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
 				    struct stage2_map_data *data)
 {
 	bool device = prot & KVM_PGTABLE_PROT_DEVICE;
-	kvm_pte_t attr = device ? PAGE_S2_MEMATTR(DEVICE_nGnRE) :
-			    PAGE_S2_MEMATTR(NORMAL);
 	u32 sh = KVM_PTE_LEAF_ATTR_LO_S2_SH_IS;
+	kvm_pte_t attr = PAGE_S2_MEMATTR(NORMAL);
+
+	if (device) {
+		attr = (prot & KVM_PGTABLE_PROT_WC) ?
+			PAGE_S2_MEMATTR(WRITE_COMBINE) :
+			PAGE_S2_MEMATTR(DEVICE_nGnRE);
+	}
 
 	if (!(prot & KVM_PGTABLE_PROT_X))
 		attr |= KVM_PTE_LEAF_ATTR_HI_S2_XN;
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8711894db8c2..5b8ec1ab12e2 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -487,6 +487,16 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
 	}
 }
 
+/**
+ * is_vma_write_combine - check if VMA is mapped with write-combine or not
+ * Return true if VMA is mapped with MT_NORMAL_NC, otherwise false
+ */
+static inline bool is_vma_write_combine
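The archive cuts the patch off inside is_vma_write_combine(), but the kernel-doc comment describes what it must do: extract the memory-attribute index from vma->vm_page_prot and test it against MT_NORMAL_NC. A sketch of that check on a raw pgprot value follows. The field position mirrors PTE_ATTRINDX (bits [4:2]) from arch/arm64/include/asm/pgtable-hwdef.h; the MT_NORMAL_NC index used here is illustrative, since the actual value comes from asm/memory.h and has changed across kernel versions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PTE_ATTRINDX occupies bits [4:2] of the stage-1 descriptor. */
#define PTE_ATTRINDX_SHIFT 2
#define PTE_ATTRINDX_MASK  (UINT64_C(7) << PTE_ATTRINDX_SHIFT)

/* Illustrative index for Normal non-cacheable; the real value is the
 * kernel's MT_NORMAL_NC and is version-dependent. */
#define MT_NORMAL_NC 2

/* Sketch of the truncated helper: report whether the pgprot encodes
 * a write-combine (Normal-NC) mapping. */
static bool pgprot_is_write_combine(uint64_t pgprot_val)
{
	unsigned int idx = (pgprot_val & PTE_ATTRINDX_MASK) >> PTE_ATTRINDX_SHIFT;

	return idx == MT_NORMAL_NC;
}
```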
Re: [PATCH v2] arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
Hi Will,

On 03/09/2018 07:48 AM, Will Deacon wrote:
> Hi Shanker,
>
> On Mon, Mar 05, 2018 at 11:06:43AM -0600, Shanker Donthineni wrote:
>> The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of SMC
>> V1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses
>> the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead
>> of Silicon provider service ID 0xC2001700.
>>
>> Signed-off-by: Shanker Donthineni
>> ---
>> Changes since v1:
>>  - Trivial change in cpucaps.h (refresh after removing
>>    ARM64_HARDEN_BP_POST_GUEST_EXIT)
>>
>>  arch/arm64/include/asm/cpucaps.h | 5 ++--
>>  arch/arm64/include/asm/kvm_asm.h | 2 --
>>  arch/arm64/kernel/bpi.S          | 8 --
>>  arch/arm64/kernel/cpu_errata.c   | 55 ++--
>>  arch/arm64/kvm/hyp/entry.S       | 12 -
>>  arch/arm64/kvm/hyp/switch.c      | 10
>>  6 files changed, 21 insertions(+), 71 deletions(-)
>
> Could you reply to my outstanding question on the last version of this patch
> please?

I replied to your comments. The contents of this patch have been discussed with the QCOM CPU architecture and design teams. Their recommendation was to keep two variants of the variant-2 mitigation in order to take advantage of the Falkor hardware and avoid the unnecessary overhead of always making the SMCCC call.

> http://lists.infradead.org/pipermail/linux-arm-kernel/2018-March/564194.html
>
> Will

-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH] arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
Hi Will,

On 03/06/2018 09:25 AM, Will Deacon wrote:
> On Mon, Mar 05, 2018 at 12:03:33PM -0600, Shanker Donthineni wrote:
>> On 03/05/2018 11:15 AM, Will Deacon wrote:
>>> On Mon, Mar 05, 2018 at 10:57:58AM -0600, Shanker Donthineni wrote:
>>>> On 03/05/2018 09:56 AM, Will Deacon wrote:
>>>>> On Fri, Mar 02, 2018 at 03:50:18PM -0600, Shanker Donthineni wrote:
>>>>>> @@ -199,33 +208,15 @@ static int enable_smccc_arch_workaround_1(void *data)
>>>>>> 	return 0;
>>>>>> }
>>>>>>
>>>>>> +	if (((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR) ||
>>>>>> +	    ((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR_V1))
>>>>>> +		cb = qcom_link_stack_sanitization;
>>>>>
>>>>> Is this just a performance thing? Do you actually see an advantage over
>>>>> always making the firmware call? We've seen minimal impact in our testing.
>>>>
>>>> Yes, we've a couple of advantages using the standard SMCCC_ARCH_WORKAROUND_1
>>>> framework.
>>>>  - Improves the code readability.
>>>>  - Avoids the unnecessary MIDR checks on each vCPU exit.
>>>>  - Validates the ID_AA64PFR0_CVS2 feature for Falkor chips.
>>>>  - Avoids the 2nd link stack sanitization workaround in firmware.
>>>
>>> What I mean is, can we drop qcom_link_stack_sanitization altogether and
>>> use the SMCCC interface for everything?
>>
>> No, we would like to keep qcom_link_stack_sanitization for the host kernel
>> since it takes a few CPU cycles instead of a heavyweight SMCCC call.
>
> Is that something that you can actually measure in the workloads and
> benchmarks that you care about? If so, fine, but that doesn't seem to be the
> case for the Cortex cores we've looked at internally and it would be nice to
> avoid having different workarounds in the kernel just because the SMCCC
> interface wasn't baked in time, rather than because there's a meaningful
> performance difference.

We've seen a noticeable performance improvement with microbenchmark workloads, and also some of our customers have observed improvements on heavy workloads.
Unfortunately I can't share those specific results here. The SMCCC call overhead is much higher than the link stack workaround on Falkor, ~99X. The host kernel workaround takes less than ~20 CPU cycles, whereas SMCCC_ARCH_WORKAROUND_1 consumes thousands of CPU cycles to sanitize the branch prediction on Falkor. Workloads inside virtual machines especially provide much better results, because no KVM involvement is required whenever the guest calls qcom_link_stack_sanitization().

> Will
[PATCH v7] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
The DCache clean & ICache invalidation requirements for instructions to be coherent with data are discoverable through new fields in CTR_EL0. The two control bits DIC and IDC were defined for this purpose. There is no need to perform point-of-unification cache maintenance operations from software on systems where the CPU caches are transparent.

This patch optimizes the three functions __flush_cache_user_range(), clean_dcache_area_pou() and invalidate_icache_range() when the hardware reports CTR_EL0.IDC and/or CTR_EL0.DIC. Basically it skips the two instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic, in order to avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for instruction to data coherence. The meaning of this bit[29]:
  0: Instruction cache invalidation to the point of unification is required for instruction to data coherence.
  1: Instruction cache invalidation to the point of unification is not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data coherence. The meaning of this bit[28]:
  0: Data cache clean to the point of unification is required for instruction to data coherence, unless CLIDR_EL1.LoC == 0b000 or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required for instruction to data coherence.

Co-authored-by: Philip Elcan
Signed-off-by: Shanker Donthineni
---
Changes since v6:
  -Both I-Cache and D-Cache changes are symmetric as Will suggested.
  -Remove Kconfig option.
  -Patch __flush_icache_all().
Changes since v5:
  -Addressed Mark's review comments.
Changes since v4:
  -Moved patching ARM64_HAS_CACHE_DIC inside invalidate_icache_by_line.
  -Removed 'dsb ishst' for ARM64_HAS_CACHE_DIC as Mark suggested.
Changes since v3:
  -Added preprocessor guard CONFIG_xxx to code snippets in cache.S.
  -Changed barrier attributes from ISH to ISHST.
Changes since v2: -Included barriers, DSB/ISB with DIC set, and DSB with IDC set. -Single Kconfig option. Changes since v1: -Reworded commit text. -Used the alternatives framework as Catalin suggested. -Rebased on top of https://patchwork.kernel.org/patch/10227927/ arch/arm64/include/asm/cache.h | 4 arch/arm64/include/asm/cacheflush.h | 7 +-- arch/arm64/include/asm/cpucaps.h| 4 +++- arch/arm64/kernel/cpufeature.c | 36 ++-- arch/arm64/mm/cache.S | 21 - 5 files changed, 62 insertions(+), 10 deletions(-) diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index ea9bb4e..9bbffc7 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -20,8 +20,12 @@ #define CTR_L1IP_SHIFT 14 #define CTR_L1IP_MASK 3 +#define CTR_DMINLINE_SHIFT 16 +#define CTR_ERG_SHIFT 20 #define CTR_CWG_SHIFT 24 #define CTR_CWG_MASK 15 +#define CTR_IDC_SHIFT 28 +#define CTR_DIC_SHIFT 29 #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h index bef9f41..d51bde1 100644 --- a/arch/arm64/include/asm/cacheflush.h +++ b/arch/arm64/include/asm/cacheflush.h @@ -133,8 +133,11 @@ extern void copy_to_user_page(struct vm_area_struct *, struct page *, static inline void __flush_icache_all(void) { - asm("ic ialluis"); - dsb(ish); + /* Instruction cache invalidation is not required for I/D coherence? 
*/ + if (!cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) { + asm("ic ialluis"); + dsb(ish); + } } #define flush_dcache_mmap_lock(mapping) \ diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index bb26382..8dd42ae 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -45,7 +45,9 @@ #define ARM64_HARDEN_BRANCH_PREDICTOR 24 #define ARM64_HARDEN_BP_POST_GUEST_EXIT25 #define ARM64_HAS_RAS_EXTN 26 +#define ARM64_HAS_CACHE_IDC27 +#define ARM64_HAS_CACHE_DIC28 -#define ARM64_NCAPS27 +#define ARM64_NCAPS29 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index 2985a06..9f39e9c 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -199,12 +199,12 @@ static int __init register_cpu_hwcaps_dumper(void) }; static const struct arm64_ftr_bits ftr_ctr[] = { - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */ - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 29, 1, 1), /* DIC */ - AR
Re: [PATCH v6] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Hi Will, On 03/06/2018 12:48 PM, Shanker Donthineni wrote: > Hi Will, > > On 03/06/2018 09:23 AM, Will Deacon wrote: >> Hi Shanker, >> >> On Tue, Mar 06, 2018 at 08:47:27AM -0600, Shanker Donthineni wrote: >>> On 03/06/2018 07:44 AM, Will Deacon wrote: >>>> I think this is a slight asymmetry with the code for the I-side. On the >>>> I-side, you hook into invalidate_icache_by_line, whereas on the D-side you >>>> hook into the callers of dcache_by_line_op. Why is that? >>>> >>> >>> There is no particular reason other than complexity of the macro with >>> another >>> alternative. I tried to avoid this change by updating >>> __clean_dcache_area_pou(). >>> I can change if you're interested to see both I-Side and D-Side changes are >>> symmetric some thing like this... >>> >>> .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2 >>> >>> .if (\op == cvau) >>> alternative_if ARM64_HAS_CACHE_IDC >>> dsb ishst >>> b 9997f >>> alternative_else_nop_endif >>> .endif >>> >>> dcache_line_size \tmp1, \tmp2 >>> add \size, \kaddr, \size >>> sub \tmp2, \tmp1, #1 >>> bic \kaddr, \kaddr, \tmp2 >>> 9998: >>> .if (\op == cvau || \op == cvac) >>> alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE >>> dc \op, \kaddr >>> alternative_else >>> dc civac, \kaddr >>> alternative_endif >>> .elseif (\op == cvap) >>> alternative_if ARM64_HAS_DCPOP >>> sys 3, c7, c12, 1, \kaddr // dc cvap >>> alternative_else >>> dc cvac, \kaddr >>> alternative_endif >>> .else >>> dc \op, \kaddr >>> .endif >>> add \kaddr, \kaddr, \tmp1 >>> cmp \kaddr, \size >>> b.lo9998b >>> dsb \domain >>> 9997: >>> .endm >> >> I think it would be cleaner the other way round, actually -- move the check >> out of invalidate_icache_by_line and into its two callers. >> > > Sure, I'll send out the next patch with your suggestions. > >>>> I notice that the only user other than >>>> flush_icache_range/__flush_cache_user_range or invalidate_icache_by_line >>>> is in KVM, via invalidate_icache_range. 
If you want to hook in there, why >>>> aren't you also patching __flush_icache_all? If so, I'd rather have the >>>> I-side code consistent with the D-side code and do this in the handful of >>>> callers. We might even be able to elide a branch or two that way. >>>> >>> >>> Agree with you, it saves function calls overhead. I'll do this change... >>> >>> static void invalidate_icache_guest_page(kvm_pfn_t pfn, unsigned long size) >>> { >>> if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC) >>> __invalidate_icache_guest_page(pfn, size); >>> } >>> >>> >>>> I'm going to assume that I-cache aliases are all coherent if DIC=1, so it's >>>> safe to elide our alias sync code. >>>> >>> >>> I'm not sure about I-cache whether aliases are all coherent if DIC=1 ot not. >>> Unfortunately I don't have any hardware to test DIC=1. I've verified IDC=1. >> >> I checked with our architects and aliases don't pose a problem here, so you >> can ignore me :) >> > > I also confirmed with Thomas Speier, we can skip __flush_icache_all() if > DIC=1. > > Planning to patch __flush_icache_all() itself instead of changing the callers. This way we can avoid "ic ialluis" completely. Is this okay for you? static inline void __flush_icache_all(void) { /* Instruction cache invalidation is not required for I/D coherence? */ if (!cpus_have_const_cap(ARM64_HAS_CACHE_DIC)) { asm("ic ialluis"); dsb(ish); } } >> Will >> >> ___ >> linux-arm-kernel mailing list >> linux-arm-ker...@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >> > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v6] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Hi Will, On 03/06/2018 09:23 AM, Will Deacon wrote: > Hi Shanker, > > On Tue, Mar 06, 2018 at 08:47:27AM -0600, Shanker Donthineni wrote: >> On 03/06/2018 07:44 AM, Will Deacon wrote: >>> I think this is a slight asymmetry with the code for the I-side. On the >>> I-side, you hook into invalidate_icache_by_line, whereas on the D-side you >>> hook into the callers of dcache_by_line_op. Why is that? >>> >> >> There is no particular reason other than complexity of the macro with >> another >> alternative. I tried to avoid this change by updating >> __clean_dcache_area_pou(). >> I can change if you're interested to see both I-Side and D-Side changes are >> symmetric some thing like this... >> >> .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2 >> >> .if(\op == cvau) >> alternative_if ARM64_HAS_CACHE_IDC >> dsb ishst >> b 9997f >> alternative_else_nop_endif >> .endif >> >> dcache_line_size \tmp1, \tmp2 >> add \size, \kaddr, \size >> sub \tmp2, \tmp1, #1 >> bic \kaddr, \kaddr, \tmp2 >> 9998: >> .if (\op == cvau || \op == cvac) >> alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE >> dc \op, \kaddr >> alternative_else >> dc civac, \kaddr >> alternative_endif >> .elseif (\op == cvap) >> alternative_if ARM64_HAS_DCPOP >> sys 3, c7, c12, 1, \kaddr // dc cvap >> alternative_else >> dc cvac, \kaddr >> alternative_endif >> .else >> dc \op, \kaddr >> .endif >> add \kaddr, \kaddr, \tmp1 >> cmp \kaddr, \size >> b.lo9998b >> dsb \domain >> 9997: >> .endm > > I think it would be cleaner the other way round, actually -- move the check > out of invalidate_icache_by_line and into its two callers. > Sure, I'll send out the next patch with your suggestions. >>> I notice that the only user other than >>> flush_icache_range/__flush_cache_user_range or invalidate_icache_by_line >>> is in KVM, via invalidate_icache_range. If you want to hook in there, why >>> aren't you also patching __flush_icache_all? 
If so, I'd rather have the >>> I-side code consistent with the D-side code and do this in the handful of >>> callers. We might even be able to elide a branch or two that way. >>> >> >> Agree with you, it saves function calls overhead. I'll do this change... >> >> static void invalidate_icache_guest_page(kvm_pfn_t pfn, unsigned long size) >> { >> if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC) >> __invalidate_icache_guest_page(pfn, size); >> } >> >> >>> I'm going to assume that I-cache aliases are all coherent if DIC=1, so it's >>> safe to elide our alias sync code. >>> >> >> I'm not sure about I-cache whether aliases are all coherent if DIC=1 ot not. >> Unfortunately I don't have any hardware to test DIC=1. I've verified IDC=1. > > I checked with our architects and aliases don't pose a problem here, so you > can ignore me :) > I also confirmed with Thomas Speier, we can skip __flush_icache_all() if DIC=1. > Will > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v6] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Hi Will On 03/06/2018 07:44 AM, Will Deacon wrote: > Hi Shanker, > > On Wed, Feb 28, 2018 at 10:14:00PM -0600, Shanker Donthineni wrote: >> The DCache clean & ICache invalidation requirements for instructions >> to be data coherence are discoverable through new fields in CTR_EL0. >> The following two control bits DIC and IDC were defined for this >> purpose. No need to perform point of unification cache maintenance >> operations from software on systems where CPU caches are transparent. >> >> This patch optimize the three functions __flush_cache_user_range(), >> clean_dcache_area_pou() and invalidate_icache_range() if the hardware >> reports CTR_EL0.IDC and/or CTR_EL0.IDC. Basically it skips the two >> instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic >> in order to avoid the unnecessary overhead. >> >> CTR_EL0.DIC: Instruction cache invalidation requirements for >> instruction to data coherence. The meaning of this bit[29]. >> 0: Instruction cache invalidation to the point of unification >> is required for instruction to data coherence. >> 1: Instruction cache cleaning to the point of unification is >> not required for instruction to data coherence. >> >> CTR_EL0.IDC: Data cache clean requirements for instruction to data >> coherence. The meaning of this bit[28]. >> 0: Data cache clean to the point of unification is required for >> instruction to data coherence, unless CLIDR_EL1.LoC == 0b000 >> or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000). >> 1: Data cache clean to the point of unification is not required >> for instruction to data coherence. >> >> Co-authored-by: Philip Elcan >> Signed-off-by: Shanker Donthineni >> --- >> Changes since v5: >> -Addressed Mark's review comments. > > This mostly looks good now. Just a few comments inline. 
> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >> index 7381eeb..41af850 100644 >> --- a/arch/arm64/Kconfig >> +++ b/arch/arm64/Kconfig >> @@ -1091,6 +1091,18 @@ config ARM64_RAS_EXTN >>and access the new registers if the system supports the extension. >>Platform RAS features may additionally depend on firmware support. >> >> +config ARM64_SKIP_CACHE_POU >> +bool "Enable support to skip cache PoU operations" >> +default y >> +help >> + Explicit point of unification cache operations can be eliminated >> + in software if the hardware handles transparently. The new bits in >> + CTR_EL0, CTR_EL0.DIC and CTR_EL0.IDC indicates the hardware >> + capabilities of ICache and DCache PoU requirements. >> + >> + Selecting this feature will allow the kernel to optimize cache >> + maintenance to the PoU. >> + >> endmenu > > Let's not bother with a Kconfig option. I think the extra couple of NOPs > this introduces for CPUs that don't implement the new features isn't going > to hurt anybody. > Okay, I'll get rid of Kconfig option. 
>> diff --git a/arch/arm64/include/asm/assembler.h >> b/arch/arm64/include/asm/assembler.h >> index 3c78835..39f2274 100644 >> --- a/arch/arm64/include/asm/assembler.h >> +++ b/arch/arm64/include/asm/assembler.h >> @@ -444,6 +444,11 @@ >> * Corrupts: tmp1, tmp2 >> */ >> .macro invalidate_icache_by_line start, end, tmp1, tmp2, label >> +#ifdef CONFIG_ARM64_SKIP_CACHE_POU >> +alternative_if ARM64_HAS_CACHE_DIC >> +b 9996f >> +alternative_else_nop_endif >> +#endif >> icache_line_size \tmp1, \tmp2 >> sub \tmp2, \tmp1, #1 >> bic \tmp2, \start, \tmp2 >> @@ -453,6 +458,7 @@ >> cmp \tmp2, \end >> b.lo9997b >> dsb ish >> +9996: >> isb >> .endm >> >> diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h >> index ea9bb4e..d460e9f 100644 >> --- a/arch/arm64/include/asm/cache.h >> +++ b/arch/arm64/include/asm/cache.h >> @@ -20,8 +20,12 @@ >> >> #define CTR_L1IP_SHIFT 14 >> #define CTR_L1IP_MASK 3 >> +#define CTR_DMLINE_SHIFT16 > > This should be "CTR_DMINLINE_SHIFT" > I'll change it. >> +#define CTR_ERG_SHIFT 20 >> #define CTR_CWG_SHIFT 24 >> #define CTR_CWG_MASK15 >> +#define CTR_IDC_SHIFT 28 >> +#define CTR_DIC_SHIFT 2
Re: [PATCH] arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
Hi Will, On 03/05/2018 11:15 AM, Will Deacon wrote: > On Mon, Mar 05, 2018 at 10:57:58AM -0600, Shanker Donthineni wrote: >> Hi Will, >> >> On 03/05/2018 09:56 AM, Will Deacon wrote: >>> Hi Shanker, >>> >>> On Fri, Mar 02, 2018 at 03:50:18PM -0600, Shanker Donthineni wrote: >>>> The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of SMC >>>> V1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses >>>> the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead >>>> of Silicon provider service ID 0xC2001700. >>>> >>>> Signed-off-by: Shanker Donthineni >>>> --- >>>> arch/arm64/include/asm/cpucaps.h | 2 +- >>>> arch/arm64/include/asm/kvm_asm.h | 2 -- >>>> arch/arm64/kernel/bpi.S | 8 -- >>>> arch/arm64/kernel/cpu_errata.c | 55 >>>> ++-- >>>> arch/arm64/kvm/hyp/entry.S | 12 - >>>> arch/arm64/kvm/hyp/switch.c | 10 >>>> 6 files changed, 20 insertions(+), 69 deletions(-) >>> >>> I'm happy to take this via arm64 if I get an ack from Marc/Christoffer. >>> >>>> diff --git a/arch/arm64/include/asm/cpucaps.h >>>> b/arch/arm64/include/asm/cpucaps.h >>>> index bb26382..6ecc249 100644 >>>> --- a/arch/arm64/include/asm/cpucaps.h >>>> +++ b/arch/arm64/include/asm/cpucaps.h >>>> @@ -43,7 +43,7 @@ >>>> #define ARM64_SVE 22 >>>> #define ARM64_UNMAP_KERNEL_AT_EL0 23 >>>> #define ARM64_HARDEN_BRANCH_PREDICTOR 24 >>>> -#define ARM64_HARDEN_BP_POST_GUEST_EXIT 25 >>>> +/* #define ARM64_UNALLOCATED_ENTRY25 */ >>>> #define ARM64_HAS_RAS_EXTN26 >>>> >>>> #define ARM64_NCAPS 27 >>> >>> These aren't ABI, so I think you can just drop >>> ARM64_HARDEN_BP_POST_GUEST_EXIT and repack the others accordingly. >>> >> Sure, I'll remove it completely in v2 patch. 
>> >>>> diff --git a/arch/arm64/include/asm/kvm_asm.h >>>> b/arch/arm64/include/asm/kvm_asm.h >>>> index 24961b7..ab4d0a9 100644 >>>> --- a/arch/arm64/include/asm/kvm_asm.h >>>> +++ b/arch/arm64/include/asm/kvm_asm.h >>>> @@ -68,8 +68,6 @@ >>>> >>>> extern u32 __init_stage2_translation(void); >>>> >>>> -extern void __qcom_hyp_sanitize_btac_predictors(void); >>>> - >>>> #endif >>>> >>>> #endif /* __ARM_KVM_ASM_H__ */ >>>> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S >>>> index e5de335..dc4eb15 100644 >>>> --- a/arch/arm64/kernel/bpi.S >>>> +++ b/arch/arm64/kernel/bpi.S >>>> @@ -55,14 +55,6 @@ ENTRY(__bp_harden_hyp_vecs_start) >>>>.endr >>>> ENTRY(__bp_harden_hyp_vecs_end) >>>> >>>> -ENTRY(__qcom_hyp_sanitize_link_stack_start) >>>> - stp x29, x30, [sp, #-16]! >>>> - .rept 16 >>>> - bl . + 4 >>>> - .endr >>>> - ldp x29, x30, [sp], #16 >>>> -ENTRY(__qcom_hyp_sanitize_link_stack_end) >>>> - >>>> .macro smccc_workaround_1 inst >>>>sub sp, sp, #(8 * 4) >>>>stp x2, x3, [sp, #(8 * 0)] >>>> diff --git a/arch/arm64/kernel/cpu_errata.c >>>> b/arch/arm64/kernel/cpu_errata.c >>>> index 52f15cd..d779ffd4 100644 >>>> --- a/arch/arm64/kernel/cpu_errata.c >>>> +++ b/arch/arm64/kernel/cpu_errata.c >>>> @@ -67,8 +67,6 @@ static int cpu_enable_trap_ctr_access(void *__unused) >>>> DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); >>>> >>>> #ifdef CONFIG_KVM >>>> -extern char __qcom_hyp_sanitize_link_stack_start[]; >>>> -extern char __qcom_hyp_sanitize_link_stack_end[]; >>>> extern char __smccc_workaround_1_smc_start[]; >>>> extern char __smccc_workaround_1_smc_end[]; >>>> extern char __smccc_workaround_1_hvc_start[]; >>>> @@ -115,8 +113,6 @@ static void >>>> __install_bp_hardening_cb(bp_hardening_cb_t fn, >>>>spin_unlock(&bp_lock); >>>> } >>>> #else >>>> -#define __qcom_hyp_sanitize_link_stack_start NULL >>>> -#define __qco
[PATCH v2] arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of the SMC
v1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses the
standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead of the
silicon provider service ID 0xC2001700.

Signed-off-by: Shanker Donthineni
---
Changes since v1:
-Trivial change in cpucaps.h (refresh after removing
 ARM64_HARDEN_BP_POST_GUEST_EXIT)

 arch/arm64/include/asm/cpucaps.h |  5 ++--
 arch/arm64/include/asm/kvm_asm.h |  2 --
 arch/arm64/kernel/bpi.S          |  8 --
 arch/arm64/kernel/cpu_errata.c   | 55 ++--
 arch/arm64/kvm/hyp/entry.S       | 12 -
 arch/arm64/kvm/hyp/switch.c      | 10
 6 files changed, 21 insertions(+), 71 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index bb26382..324c85e 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -43,9 +43,8 @@
 #define ARM64_SVE				22
 #define ARM64_UNMAP_KERNEL_AT_EL0		23
 #define ARM64_HARDEN_BRANCH_PREDICTOR		24
-#define ARM64_HARDEN_BP_POST_GUEST_EXIT		25
-#define ARM64_HAS_RAS_EXTN			26
+#define ARM64_HAS_RAS_EXTN			25

-#define ARM64_NCAPS				27
+#define ARM64_NCAPS				26

 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 24961b7..ab4d0a9 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -68,8 +68,6 @@

 extern u32 __init_stage2_translation(void);

-extern void __qcom_hyp_sanitize_btac_predictors(void);
-
 #endif

 #endif /* __ARM_KVM_ASM_H__ */
diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
index e5de335..dc4eb15 100644
--- a/arch/arm64/kernel/bpi.S
+++ b/arch/arm64/kernel/bpi.S
@@ -55,14 +55,6 @@ ENTRY(__bp_harden_hyp_vecs_start)
 	.endr
 ENTRY(__bp_harden_hyp_vecs_end)

-ENTRY(__qcom_hyp_sanitize_link_stack_start)
-	stp	x29, x30, [sp, #-16]!
-	.rept	16
-	bl	. + 4
-	.endr
-	ldp	x29, x30, [sp], #16
-ENTRY(__qcom_hyp_sanitize_link_stack_end)
-
 .macro smccc_workaround_1 inst
 	sub	sp, sp, #(8 * 4)
 	stp	x2, x3, [sp, #(8 * 0)]
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 52f15cd..d779ffd4 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -67,8 +67,6 @@ static int cpu_enable_trap_ctr_access(void *__unused)
 DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);

 #ifdef CONFIG_KVM
-extern char __qcom_hyp_sanitize_link_stack_start[];
-extern char __qcom_hyp_sanitize_link_stack_end[];
 extern char __smccc_workaround_1_smc_start[];
 extern char __smccc_workaround_1_smc_end[];
 extern char __smccc_workaround_1_hvc_start[];
@@ -115,8 +113,6 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
 	spin_unlock(&bp_lock);
 }
 #else
-#define __qcom_hyp_sanitize_link_stack_start	NULL
-#define __qcom_hyp_sanitize_link_stack_end	NULL
 #define __smccc_workaround_1_smc_start		NULL
 #define __smccc_workaround_1_smc_end		NULL
 #define __smccc_workaround_1_hvc_start		NULL
@@ -161,12 +157,25 @@ static void call_hvc_arch_workaround_1(void)
 	arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL);
 }

+static void qcom_link_stack_sanitization(void)
+{
+	u64 tmp;
+
+	asm volatile("mov	%0, x30		\n"
+		     ".rept	16		\n"
+		     "bl	. + 4		\n"
+		     ".endr			\n"
+		     "mov	x30, %0		\n"
+		     : "=&r" (tmp));
+}
+
 static int enable_smccc_arch_workaround_1(void *data)
 {
 	const struct arm64_cpu_capabilities *entry = data;
 	bp_hardening_cb_t cb;
 	void *smccc_start, *smccc_end;
 	struct arm_smccc_res res;
+	u32 midr = read_cpuid_id();

 	if (!entry->matches(entry, SCOPE_LOCAL_CPU))
 		return 0;
@@ -199,33 +208,15 @@ static int enable_smccc_arch_workaround_1(void *data)
 		return 0;
 	}

+	if (((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR) ||
+	    ((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR_V1))
+		cb = qcom_link_stack_sanitization;
+
 	install_bp_hardening_cb(entry, cb, smccc_start, smccc_end);
 	return 0;
 }

-static void qcom_link_stack_sanitization(void)
-{
-	u64 tmp;
-
-	asm volatile("mov	%0, x30		\n"
-		     ".rept	16		\n"
-		     "bl	. + 4		\n"
-		     ".endr			\n"
-		     "mov
Re: [PATCH] arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
Hi Will, On 03/05/2018 09:56 AM, Will Deacon wrote: > Hi Shanker, > > On Fri, Mar 02, 2018 at 03:50:18PM -0600, Shanker Donthineni wrote: >> The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of SMC >> V1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses >> the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead >> of Silicon provider service ID 0xC2001700. >> >> Signed-off-by: Shanker Donthineni >> --- >> arch/arm64/include/asm/cpucaps.h | 2 +- >> arch/arm64/include/asm/kvm_asm.h | 2 -- >> arch/arm64/kernel/bpi.S | 8 -- >> arch/arm64/kernel/cpu_errata.c | 55 >> ++-- >> arch/arm64/kvm/hyp/entry.S | 12 - >> arch/arm64/kvm/hyp/switch.c | 10 >> 6 files changed, 20 insertions(+), 69 deletions(-) > > I'm happy to take this via arm64 if I get an ack from Marc/Christoffer. > >> diff --git a/arch/arm64/include/asm/cpucaps.h >> b/arch/arm64/include/asm/cpucaps.h >> index bb26382..6ecc249 100644 >> --- a/arch/arm64/include/asm/cpucaps.h >> +++ b/arch/arm64/include/asm/cpucaps.h >> @@ -43,7 +43,7 @@ >> #define ARM64_SVE 22 >> #define ARM64_UNMAP_KERNEL_AT_EL0 23 >> #define ARM64_HARDEN_BRANCH_PREDICTOR 24 >> -#define ARM64_HARDEN_BP_POST_GUEST_EXIT 25 >> +/* #define ARM64_UNALLOCATED_ENTRY 25 */ >> #define ARM64_HAS_RAS_EXTN 26 >> >> #define ARM64_NCAPS 27 > > These aren't ABI, so I think you can just drop > ARM64_HARDEN_BP_POST_GUEST_EXIT and repack the others accordingly. > Sure, I'll remove it completely in v2 patch. 
>> diff --git a/arch/arm64/include/asm/kvm_asm.h >> b/arch/arm64/include/asm/kvm_asm.h >> index 24961b7..ab4d0a9 100644 >> --- a/arch/arm64/include/asm/kvm_asm.h >> +++ b/arch/arm64/include/asm/kvm_asm.h >> @@ -68,8 +68,6 @@ >> >> extern u32 __init_stage2_translation(void); >> >> -extern void __qcom_hyp_sanitize_btac_predictors(void); >> - >> #endif >> >> #endif /* __ARM_KVM_ASM_H__ */ >> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S >> index e5de335..dc4eb15 100644 >> --- a/arch/arm64/kernel/bpi.S >> +++ b/arch/arm64/kernel/bpi.S >> @@ -55,14 +55,6 @@ ENTRY(__bp_harden_hyp_vecs_start) >> .endr >> ENTRY(__bp_harden_hyp_vecs_end) >> >> -ENTRY(__qcom_hyp_sanitize_link_stack_start) >> -stp x29, x30, [sp, #-16]! >> -.rept 16 >> -bl . + 4 >> -.endr >> -ldp x29, x30, [sp], #16 >> -ENTRY(__qcom_hyp_sanitize_link_stack_end) >> - >> .macro smccc_workaround_1 inst >> sub sp, sp, #(8 * 4) >> stp x2, x3, [sp, #(8 * 0)] >> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c >> index 52f15cd..d779ffd4 100644 >> --- a/arch/arm64/kernel/cpu_errata.c >> +++ b/arch/arm64/kernel/cpu_errata.c >> @@ -67,8 +67,6 @@ static int cpu_enable_trap_ctr_access(void *__unused) >> DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); >> >> #ifdef CONFIG_KVM >> -extern char __qcom_hyp_sanitize_link_stack_start[]; >> -extern char __qcom_hyp_sanitize_link_stack_end[]; >> extern char __smccc_workaround_1_smc_start[]; >> extern char __smccc_workaround_1_smc_end[]; >> extern char __smccc_workaround_1_hvc_start[]; >> @@ -115,8 +113,6 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t >> fn, >> spin_unlock(&bp_lock); >> } >> #else >> -#define __qcom_hyp_sanitize_link_stack_startNULL >> -#define __qcom_hyp_sanitize_link_stack_end NULL >> #define __smccc_workaround_1_smc_start NULL >> #define __smccc_workaround_1_smc_endNULL >> #define __smccc_workaround_1_hvc_start NULL >> @@ -161,12 +157,25 @@ static void 
call_hvc_arch_workaround_1(void) >> arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL); >> } >> >> +static void qcom_link_stack_sanitization(void) >> +{ >> +u64 tmp; >> + >> +asm volatile("mov %0, x30 \n" >> + ".rept 16 \n" >> + "bl. + 4 \n" >> + ".endr \n" >> + "mov x30, %0 \n" >> + : "=&r" (tmp));
[PATCH] arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of SMC V1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead of Silicon provider service ID 0xC2001700. Signed-off-by: Shanker Donthineni --- arch/arm64/include/asm/cpucaps.h | 2 +- arch/arm64/include/asm/kvm_asm.h | 2 -- arch/arm64/kernel/bpi.S | 8 -- arch/arm64/kernel/cpu_errata.c | 55 ++-- arch/arm64/kvm/hyp/entry.S | 12 - arch/arm64/kvm/hyp/switch.c | 10 6 files changed, 20 insertions(+), 69 deletions(-) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index bb26382..6ecc249 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -43,7 +43,7 @@ #define ARM64_SVE 22 #define ARM64_UNMAP_KERNEL_AT_EL0 23 #define ARM64_HARDEN_BRANCH_PREDICTOR 24 -#define ARM64_HARDEN_BP_POST_GUEST_EXIT25 +/* #define ARM64_UNALLOCATED_ENTRY 25 */ #define ARM64_HAS_RAS_EXTN 26 #define ARM64_NCAPS27 diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index 24961b7..ab4d0a9 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -68,8 +68,6 @@ extern u32 __init_stage2_translation(void); -extern void __qcom_hyp_sanitize_btac_predictors(void); - #endif #endif /* __ARM_KVM_ASM_H__ */ diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S index e5de335..dc4eb15 100644 --- a/arch/arm64/kernel/bpi.S +++ b/arch/arm64/kernel/bpi.S @@ -55,14 +55,6 @@ ENTRY(__bp_harden_hyp_vecs_start) .endr ENTRY(__bp_harden_hyp_vecs_end) -ENTRY(__qcom_hyp_sanitize_link_stack_start) - stp x29, x30, [sp, #-16]! - .rept 16 - bl . 
+ 4 - .endr - ldp x29, x30, [sp], #16 -ENTRY(__qcom_hyp_sanitize_link_stack_end) - .macro smccc_workaround_1 inst sub sp, sp, #(8 * 4) stp x2, x3, [sp, #(8 * 0)] diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 52f15cd..d779ffd4 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -67,8 +67,6 @@ static int cpu_enable_trap_ctr_access(void *__unused) DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data); #ifdef CONFIG_KVM -extern char __qcom_hyp_sanitize_link_stack_start[]; -extern char __qcom_hyp_sanitize_link_stack_end[]; extern char __smccc_workaround_1_smc_start[]; extern char __smccc_workaround_1_smc_end[]; extern char __smccc_workaround_1_hvc_start[]; @@ -115,8 +113,6 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn, spin_unlock(&bp_lock); } #else -#define __qcom_hyp_sanitize_link_stack_start NULL -#define __qcom_hyp_sanitize_link_stack_end NULL #define __smccc_workaround_1_smc_start NULL #define __smccc_workaround_1_smc_end NULL #define __smccc_workaround_1_hvc_start NULL @@ -161,12 +157,25 @@ static void call_hvc_arch_workaround_1(void) arm_smccc_1_1_hvc(ARM_SMCCC_ARCH_WORKAROUND_1, NULL); } +static void qcom_link_stack_sanitization(void) +{ + u64 tmp; + + asm volatile("mov %0, x30 \n" +".rept 16 \n" +"bl. 
+ 4 \n" +".endr \n" +"mov x30, %0 \n" +: "=&r" (tmp)); +} + static int enable_smccc_arch_workaround_1(void *data) { const struct arm64_cpu_capabilities *entry = data; bp_hardening_cb_t cb; void *smccc_start, *smccc_end; struct arm_smccc_res res; + u32 midr = read_cpuid_id(); if (!entry->matches(entry, SCOPE_LOCAL_CPU)) return 0; @@ -199,33 +208,15 @@ static int enable_smccc_arch_workaround_1(void *data) return 0; } + if (((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR) || + ((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR_V1)) + cb = qcom_link_stack_sanitization; + install_bp_hardening_cb(entry, cb, smccc_start, smccc_end); return 0; } -static void qcom_link_stack_sanitization(void) -{ - u64 tmp; - - asm volatile("mov %0, x30 \n" -".rept 16 \n" -"bl. + 4 \n" -".endr \n" -"mov x30, %0 \n" -: "=&r" (tmp)); -} - -static int qcom_enable_link_stack_sanitization(
[PATCH v6] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
The DCache clean & ICache invalidation requirements for instructions
to be data coherent are discoverable through new fields in CTR_EL0.
The following two control bits, DIC and IDC, were defined for this
purpose. No point-of-unification cache maintenance operations need to
be performed from software on systems where the CPU caches are
transparent.

This patch optimizes the three functions __flush_cache_user_range(),
clean_dcache_area_pou() and invalidate_icache_range() if the hardware
reports CTR_EL0.IDC and/or CTR_EL0.DIC. Basically it skips the two
instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic,
in order to avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for
instruction to data coherence. The meaning of bit[29]:
  0: Instruction cache invalidation to the point of unification
     is required for instruction to data coherence.
  1: Instruction cache invalidation to the point of unification
     is not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data
coherence. The meaning of bit[28]:
  0: Data cache clean to the point of unification is required for
     instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
     or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required
     for instruction to data coherence.

Co-authored-by: Philip Elcan
Signed-off-by: Shanker Donthineni
---
Changes since v5:
-Addressed Mark's review comments.

Changes since v4:
-Moved patching ARM64_HAS_CACHE_DIC inside invalidate_icache_by_line.
-Removed 'dsb ishst' for ARM64_HAS_CACHE_DIC as Mark suggested.

Changes since v3:
-Added preprocessor guard CONFIG_xxx to code snippets in cache.S.
-Changed barrier attributes from ISH to ISHST.

Changes since v2:
-Included barriers, DSB/ISB with DIC set, and DSB with IDC set.
-Single Kconfig option.

Changes since v1:
-Reworded commit text.
-Used the alternatives framework as Catalin suggested.
-Rebased on top of https://patchwork.kernel.org/patch/10227927/ arch/arm64/Kconfig | 12 arch/arm64/include/asm/assembler.h | 6 ++ arch/arm64/include/asm/cache.h | 4 arch/arm64/include/asm/cpucaps.h | 4 +++- arch/arm64/kernel/cpufeature.c | 40 -- arch/arm64/mm/cache.S | 13 + 6 files changed, 72 insertions(+), 7 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 7381eeb..41af850 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1091,6 +1091,18 @@ config ARM64_RAS_EXTN and access the new registers if the system supports the extension. Platform RAS features may additionally depend on firmware support. +config ARM64_SKIP_CACHE_POU + bool "Enable support to skip cache PoU operations" + default y + help + Explicit point of unification cache operations can be eliminated + in software if the hardware handles transparently. The new bits in + CTR_EL0, CTR_EL0.DIC and CTR_EL0.IDC indicates the hardware + capabilities of ICache and DCache PoU requirements. + + Selecting this feature will allow the kernel to optimize cache + maintenance to the PoU. 
+ endmenu config ARM64_SVE diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 3c78835..39f2274 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -444,6 +444,11 @@ * Corrupts: tmp1, tmp2 */ .macro invalidate_icache_by_line start, end, tmp1, tmp2, label +#ifdef CONFIG_ARM64_SKIP_CACHE_POU +alternative_if ARM64_HAS_CACHE_DIC + b 9996f +alternative_else_nop_endif +#endif icache_line_size \tmp1, \tmp2 sub \tmp2, \tmp1, #1 bic \tmp2, \start, \tmp2 @@ -453,6 +458,7 @@ cmp \tmp2, \end b.lo9997b dsb ish +9996: isb .endm diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index ea9bb4e..d460e9f 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -20,8 +20,12 @@ #define CTR_L1IP_SHIFT 14 #define CTR_L1IP_MASK 3 +#define CTR_DMLINE_SHIFT 16 +#define CTR_ERG_SHIFT 20 #define CTR_CWG_SHIFT 24 #define CTR_CWG_MASK 15 +#define CTR_IDC_SHIFT 28 +#define CTR_DIC_SHIFT 29 #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index bb26382..8dd42ae 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -45,7 +45,9 @@ #define ARM64_HARDEN_BRANCH_PREDICTOR 24 #define ARM64_HARDEN_BP_POST_GUEST_EXIT
[PATCH v5] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
The DCache clean & ICache invalidation requirements for instructions
to be data coherent are discoverable through new fields in CTR_EL0.
The following two control bits, DIC and IDC, were defined for this
purpose. No point-of-unification cache maintenance operations need to
be performed from software on systems where the CPU caches are
transparent.

This patch optimizes the three functions __flush_cache_user_range(),
clean_dcache_area_pou() and invalidate_icache_range() if the hardware
reports CTR_EL0.IDC and/or CTR_EL0.DIC. Basically it skips the two
instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic,
in order to avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for
instruction to data coherence. The meaning of bit[29]:
  0: Instruction cache invalidation to the point of unification
     is required for instruction to data coherence.
  1: Instruction cache invalidation to the point of unification
     is not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data
coherence. The meaning of bit[28]:
  0: Data cache clean to the point of unification is required for
     instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
     or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required
     for instruction to data coherence.

Signed-off-by: Philip Elcan
Signed-off-by: Shanker Donthineni
---
Changes since v4:
-Moved patching ARM64_HAS_CACHE_DIC inside invalidate_icache_by_line.
-Removed 'dsb ishst' for ARM64_HAS_CACHE_DIC as Mark suggested.

Changes since v3:
-Added preprocessor guard CONFIG_xxx to code snippets in cache.S.
-Changed barrier attributes from ISH to ISHST.

Changes since v2:
-Included barriers, DSB/ISB with DIC set, and DSB with IDC set.
-Single Kconfig option.

Changes since v1:
-Reworded commit text.
-Used the alternatives framework as Catalin suggested.
-Rebased on top of https://patchwork.kernel.org/patch/10227927/ arch/arm64/Kconfig | 12 arch/arm64/include/asm/assembler.h | 6 ++ arch/arm64/include/asm/cache.h | 5 + arch/arm64/include/asm/cpucaps.h | 4 +++- arch/arm64/kernel/cpufeature.c | 40 -- arch/arm64/mm/cache.S | 13 + 6 files changed, 73 insertions(+), 7 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f55fe5b..82b8053 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1095,6 +1095,18 @@ config ARM64_RAS_EXTN and access the new registers if the system supports the extension. Platform RAS features may additionally depend on firmware support. +config ARM64_SKIP_CACHE_POU + bool "Enable support to skip cache POU operations" + default y + help + Explicit point of unification cache operations can be eliminated + in software if the hardware handles transparently. The new bits in + CTR_EL0, CTR_EL0.DIC and CTR_EL0.IDC indicates the hardware + capabilities of ICache and DCache POU requirements. + + Selecting this feature will allow the kernel to optimize the POU + cache maintaince operations where it requires 'D{I}C C{I}VAU' + endmenu config ARM64_SVE diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 3c78835..39f2274 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -444,6 +444,11 @@ * Corrupts: tmp1, tmp2 */ .macro invalidate_icache_by_line start, end, tmp1, tmp2, label +#ifdef CONFIG_ARM64_SKIP_CACHE_POU +alternative_if ARM64_HAS_CACHE_DIC + b 9996f +alternative_else_nop_endif +#endif icache_line_size \tmp1, \tmp2 sub \tmp2, \tmp1, #1 bic \tmp2, \start, \tmp2 @@ -453,6 +458,7 @@ cmp \tmp2, \end b.lo9997b dsb ish +9996: isb .endm diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index ea9bb4e..e22178b 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -20,8 +20,13 @@ #define CTR_L1IP_SHIFT 14 #define CTR_L1IP_MASK 3 +#define CTR_DMLINE_SHIFT 
16 +#define CTR_ERG_SHIFT 20 #define CTR_CWG_SHIFT 24 #define CTR_CWG_MASK 15 +#define CTR_IDC_SHIFT 28 +#define CTR_DIC_SHIFT 29 +#define CTR_B31_SHIFT 31 #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index bb26382..8dd42ae 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -45,7 +45,9 @@ #define ARM64_HARDEN_BRANCH_PREDICTOR 24 #define ARM64
[PATCH v4] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
The DCache clean & ICache invalidation requirements for instructions
to be data coherent are discoverable through new fields in CTR_EL0.
The following two control bits, DIC and IDC, were defined for this
purpose. No point-of-unification cache maintenance operations need to
be performed from software on systems where the CPU caches are
transparent.

This patch optimizes the three functions __flush_cache_user_range(),
clean_dcache_area_pou() and invalidate_icache_range() if the hardware
reports CTR_EL0.IDC and/or CTR_EL0.DIC. Basically it skips the two
instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic,
in order to avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for
instruction to data coherence. The meaning of bit[29]:
  0: Instruction cache invalidation to the point of unification
     is required for instruction to data coherence.
  1: Instruction cache invalidation to the point of unification
     is not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data
coherence. The meaning of bit[28]:
  0: Data cache clean to the point of unification is required for
     instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
     or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required
     for instruction to data coherence.

Signed-off-by: Philip Elcan
Signed-off-by: Shanker Donthineni
---
Changes since v3:
-Added preprocessor guard CONFIG_xxx to code snippets in cache.S.
-Changed barrier attributes from ISH to ISHST.

Changes since v2:
-Included barriers, DSB/ISB with DIC set, and DSB with IDC set.
-Single Kconfig option.

Changes since v1:
-Reworded commit text.
-Used the alternatives framework as Catalin suggested.
-Rebased on top of https://patchwork.kernel.org/patch/10227927/ arch/arm64/Kconfig | 12 arch/arm64/include/asm/cache.h | 5 + arch/arm64/include/asm/cpucaps.h | 4 +++- arch/arm64/kernel/cpufeature.c | 40 ++-- arch/arm64/mm/cache.S| 29 - 5 files changed, 82 insertions(+), 8 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f55fe5b..82b8053 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1095,6 +1095,18 @@ config ARM64_RAS_EXTN and access the new registers if the system supports the extension. Platform RAS features may additionally depend on firmware support. +config ARM64_SKIP_CACHE_POU + bool "Enable support to skip cache POU operations" + default y + help + Explicit point of unification cache operations can be eliminated + in software if the hardware handles transparently. The new bits in + CTR_EL0, CTR_EL0.DIC and CTR_EL0.IDC indicates the hardware + capabilities of ICache and DCache POU requirements. + + Selecting this feature will allow the kernel to optimize the POU + cache maintaince operations where it requires 'D{I}C C{I}VAU' + endmenu config ARM64_SVE diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index ea9bb4e..e22178b 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -20,8 +20,13 @@ #define CTR_L1IP_SHIFT 14 #define CTR_L1IP_MASK 3 +#define CTR_DMLINE_SHIFT 16 +#define CTR_ERG_SHIFT 20 #define CTR_CWG_SHIFT 24 #define CTR_CWG_MASK 15 +#define CTR_IDC_SHIFT 28 +#define CTR_DIC_SHIFT 29 +#define CTR_B31_SHIFT 31 #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index bb26382..8dd42ae 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -45,7 +45,9 @@ #define ARM64_HARDEN_BRANCH_PREDICTOR 24 #define ARM64_HARDEN_BP_POST_GUEST_EXIT25 #define ARM64_HAS_RAS_EXTN 26 +#define ARM64_HAS_CACHE_IDC27 +#define ARM64_HAS_CACHE_DIC28 -#define 
ARM64_NCAPS27 +#define ARM64_NCAPS29 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index ff8a6e9..c0b0db0 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -199,12 +199,12 @@ static int __init register_cpu_hwcaps_dumper(void) }; static const struct arm64_ftr_bits ftr_ctr[] = { - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */ - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 29, 1, 1), /* DIC */ - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 28, 1, 1),
Re: [PATCH v3] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Hi Mark, On 02/21/2018 09:09 AM, Mark Rutland wrote: > On Wed, Feb 21, 2018 at 07:49:06AM -0600, Shanker Donthineni wrote: >> The DCache clean & ICache invalidation requirements for instructions >> to be data coherence are discoverable through new fields in CTR_EL0. >> The following two control bits DIC and IDC were defined for this >> purpose. No need to perform point of unification cache maintenance >> operations from software on systems where CPU caches are transparent. >> >> This patch optimize the three functions __flush_cache_user_range(), >> clean_dcache_area_pou() and invalidate_icache_range() if the hardware >> reports CTR_EL0.IDC and/or CTR_EL0.IDC. Basically it skips the two >> instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic >> in order to avoid the unnecessary overhead. >> >> CTR_EL0.DIC: Instruction cache invalidation requirements for >> instruction to data coherence. The meaning of this bit[29]. >> 0: Instruction cache invalidation to the point of unification >> is required for instruction to data coherence. >> 1: Instruction cache cleaning to the point of unification is >> not required for instruction to data coherence. >> >> CTR_EL0.IDC: Data cache clean requirements for instruction to data >> coherence. The meaning of this bit[28]. >> 0: Data cache clean to the point of unification is required for >> instruction to data coherence, unless CLIDR_EL1.LoC == 0b000 >> or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000). >> 1: Data cache clean to the point of unification is not required >> for instruction to data coherence. >> >> Signed-off-by: Philip Elcan >> Signed-off-by: Shanker Donthineni >> --- >> Changes since v2: >> -Included barriers, DSB/ISB with DIC set, and DSB with IDC set. >> -Single Kconfig option. >> >> Changes since v1: >> -Reworded commit text. >> -Used the alternatives framework as Catalin suggested. 
>> -Rebased on top of https://patchwork.kernel.org/patch/10227927/ >> >> arch/arm64/Kconfig | 12 >> arch/arm64/include/asm/cache.h | 5 + >> arch/arm64/include/asm/cpucaps.h | 4 +++- >> arch/arm64/kernel/cpufeature.c | 40 >> ++-- >> arch/arm64/mm/cache.S| 21 +++-- >> 5 files changed, 73 insertions(+), 9 deletions(-) >> >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >> index f55fe5b..82b8053 100644 >> --- a/arch/arm64/Kconfig >> +++ b/arch/arm64/Kconfig >> @@ -1095,6 +1095,18 @@ config ARM64_RAS_EXTN >>and access the new registers if the system supports the extension. >>Platform RAS features may additionally depend on firmware support. >> >> +config ARM64_SKIP_CACHE_POU >> +bool "Enable support to skip cache POU operations" >> +default y >> +help >> + Explicit point of unification cache operations can be eliminated >> + in software if the hardware handles transparently. The new bits in >> + CTR_EL0, CTR_EL0.DIC and CTR_EL0.IDC indicates the hardware >> + capabilities of ICache and DCache POU requirements. >> + >> + Selecting this feature will allow the kernel to optimize the POU >> + cache maintaince operations where it requires 'D{I}C C{I}VAU' >> + >> endmenu > > Is it worth having a config option for this at all? The savings from turning > this off seem trivial. 
> >> >> config ARM64_SVE >> diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h >> index ea9bb4e..e22178b 100644 >> --- a/arch/arm64/include/asm/cache.h >> +++ b/arch/arm64/include/asm/cache.h >> @@ -20,8 +20,13 @@ >> >> #define CTR_L1IP_SHIFT 14 >> #define CTR_L1IP_MASK 3 >> +#define CTR_DMLINE_SHIFT16 >> +#define CTR_ERG_SHIFT 20 >> #define CTR_CWG_SHIFT 24 >> #define CTR_CWG_MASK15 >> +#define CTR_IDC_SHIFT 28 >> +#define CTR_DIC_SHIFT 29 >> +#define CTR_B31_SHIFT 31 >> >> #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & >> CTR_L1IP_MASK) >> >> diff --git a/arch/arm64/include/asm/cpucaps.h >> b/arch/arm64/include/asm/cpucaps.h >> index bb26382..8dd42ae 100644 >> --- a/arch/arm64/include/asm/cpucaps.h >> +++ b/arch/arm64/include/asm/cpucaps.h >> @@ -45,7 +45,9
[PATCH v3] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
The DCache clean & ICache invalidation requirements for instructions
to be data coherent are discoverable through new fields in CTR_EL0.
The following two control bits, DIC and IDC, were defined for this
purpose. No point-of-unification cache maintenance operations need to
be performed from software on systems where the CPU caches are
transparent.

This patch optimizes the three functions __flush_cache_user_range(),
clean_dcache_area_pou() and invalidate_icache_range() if the hardware
reports CTR_EL0.IDC and/or CTR_EL0.DIC. Basically it skips the two
instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic,
in order to avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for
instruction to data coherence. The meaning of bit[29]:
  0: Instruction cache invalidation to the point of unification
     is required for instruction to data coherence.
  1: Instruction cache invalidation to the point of unification
     is not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data
coherence. The meaning of bit[28]:
  0: Data cache clean to the point of unification is required for
     instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
     or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required
     for instruction to data coherence.

Signed-off-by: Philip Elcan
Signed-off-by: Shanker Donthineni
---
Changes since v2:
-Included barriers, DSB/ISB with DIC set, and DSB with IDC set.
-Single Kconfig option.

Changes since v1:
-Reworded commit text.
-Used the alternatives framework as Catalin suggested.
-Rebased on top of https://patchwork.kernel.org/patch/10227927/ arch/arm64/Kconfig | 12 arch/arm64/include/asm/cache.h | 5 + arch/arm64/include/asm/cpucaps.h | 4 +++- arch/arm64/kernel/cpufeature.c | 40 ++-- arch/arm64/mm/cache.S| 21 +++-- 5 files changed, 73 insertions(+), 9 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f55fe5b..82b8053 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1095,6 +1095,18 @@ config ARM64_RAS_EXTN and access the new registers if the system supports the extension. Platform RAS features may additionally depend on firmware support. +config ARM64_SKIP_CACHE_POU + bool "Enable support to skip cache POU operations" + default y + help + Explicit point of unification cache operations can be eliminated + in software if the hardware handles transparently. The new bits in + CTR_EL0, CTR_EL0.DIC and CTR_EL0.IDC indicates the hardware + capabilities of ICache and DCache POU requirements. + + Selecting this feature will allow the kernel to optimize the POU + cache maintaince operations where it requires 'D{I}C C{I}VAU' + endmenu config ARM64_SVE diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index ea9bb4e..e22178b 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -20,8 +20,13 @@ #define CTR_L1IP_SHIFT 14 #define CTR_L1IP_MASK 3 +#define CTR_DMLINE_SHIFT 16 +#define CTR_ERG_SHIFT 20 #define CTR_CWG_SHIFT 24 #define CTR_CWG_MASK 15 +#define CTR_IDC_SHIFT 28 +#define CTR_DIC_SHIFT 29 +#define CTR_B31_SHIFT 31 #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index bb26382..8dd42ae 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -45,7 +45,9 @@ #define ARM64_HARDEN_BRANCH_PREDICTOR 24 #define ARM64_HARDEN_BP_POST_GUEST_EXIT25 #define ARM64_HAS_RAS_EXTN 26 +#define ARM64_HAS_CACHE_IDC27 +#define ARM64_HAS_CACHE_DIC28 
-#define ARM64_NCAPS27 +#define ARM64_NCAPS29 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index ff8a6e9..12e100a 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -199,12 +199,12 @@ static int __init register_cpu_hwcaps_dumper(void) }; static const struct arm64_ftr_bits ftr_ctr[] = { - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */ - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 29, 1, 1), /* DIC */ - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 28, 1, 1), /* IDC */ - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, 24, 4, 0), /* CWG */ - ARM64_FTR_BITS(FT
Re: [PATCH v2] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Hi Catalin, On 02/21/2018 05:12 AM, Catalin Marinas wrote: > On Mon, Feb 19, 2018 at 08:59:06PM -0600, Shanker Donthineni wrote: >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >> index f55fe5b..4061210 100644 >> --- a/arch/arm64/Kconfig >> +++ b/arch/arm64/Kconfig >> @@ -1095,6 +1095,27 @@ config ARM64_RAS_EXTN >>and access the new registers if the system supports the extension. >>Platform RAS features may additionally depend on firmware support. >> >> +config ARM64_CACHE_IDC >> +bool "Enable support for DCache clean PoU optimization" >> +default y >> +help >> + The data cache clean to the point of unification is not required >> + for instruction to be data coherence if CTR_EL0.IDC has value 1. >> + >> + Selecting this feature will allow the kernel to optimize the POU >> + cache maintaince operations where it requires 'DC CVAU'. >> + >> +config ARM64_CACHE_DIC >> +bool "Enable support for ICache invalidation PoU optimization" >> +default y >> +help >> + Instruction cache invalidation to the point of unification is not >> + required for instruction to be data coherence if CTR_EL0.DIC has >> + value 1. >> + >> + Selecting this feature will allow the kernel to optimize the POU >> + cache maintaince operations where it requires 'IC IVAU'. > > A single Kconfig entry is sufficient for both features. > I'll do in v3 patch. >> @@ -864,6 +864,22 @@ static bool has_no_fpsimd(const struct >> arm64_cpu_capabilities *entry, int __unus >> ID_AA64PFR0_FP_SHIFT) < 0; >> } >> >> +#ifdef CONFIG_ARM64_CACHE_IDC >> +static bool has_cache_idc(const struct arm64_cpu_capabilities *entry, >> + int __unused) >> +{ >> +return !!(read_sanitised_ftr_reg(SYS_CTR_EL0) & (1UL << CTR_IDC_SHIFT)); >> +} >> +#endif >> + >> +#ifdef CONFIG_ARM64_CACHE_DIC >> +static bool has_cache_dic(const struct arm64_cpu_capabilities *entry, >> + int __unused) >> +{ >> +return !!(read_sanitised_ftr_reg(SYS_CTR_EL0) & (1UL << CTR_DIC_SHIFT)); >> +} >> +#endif > > Nitpick: no need for !! 
since the function type is bool already. > Sure, I'll remove '!!'. >> diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S >> index 758bde7..7d37d71 100644 >> --- a/arch/arm64/mm/cache.S >> +++ b/arch/arm64/mm/cache.S >> @@ -50,6 +50,9 @@ ENTRY(flush_icache_range) >> */ >> ENTRY(__flush_cache_user_range) >> uaccess_ttbr0_enable x2, x3, x4 >> +alternative_if ARM64_HAS_CACHE_IDC >> +b 8f >> +alternative_else_nop_endif >> dcache_line_size x2, x3 >> sub x3, x2, #1 >> bic x4, x0, x3 >> @@ -60,6 +63,11 @@ user_alt 9f, "dc cvau, x4", "dc civac, x4", >> ARM64_WORKAROUND_CLEAN_CACHE >> b.lo1b >> dsb ish >> >> +8: >> +alternative_if ARM64_HAS_CACHE_DIC >> +mov x0, #0 >> +b 1f >> +alternative_else_nop_endif >> invalidate_icache_by_line x0, x1, x2, x3, 9f >> mov x0, #0 >> 1: > > You can add another label at mov x0, #0 below this hunk and keep a > single instruction in the alternative path. > > However, my worry is that in an implementation with DIC set, we also > skip the DSB/ISB sequence in the invalidate_icache_by_line macro. For > example, in an implementation with transparent PoU, we could have: > > str , [addr] > // no cache maintenance or barrier > br > Thanks for pointing out the missing barriers. I think it make sense to follow the existing barrier semantics in order to avoid the unknown things. > Is an ISB required between the instruction store and execution? I would > say yes but maybe Will has a better opinion here. > Agree, an ISB is required especially for self-modifying code. I'll include in v3 patch. -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v2] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
The DCache clean & ICache invalidation requirements for instructions to be data coherent are discoverable through new fields in CTR_EL0. The following two control bits, DIC and IDC, were defined for this purpose. There is no need to perform point of unification cache maintenance operations from software on systems where the CPU caches are transparent.

This patch optimizes the three functions __flush_cache_user_range(), clean_dcache_area_pou() and invalidate_icache_range() if the hardware reports CTR_EL0.IDC and/or CTR_EL0.DIC. Basically it skips the two instructions 'DC CVAU' and 'IC IVAU', and the associated loop logic, in order to avoid the unnecessary overhead.

CTR_EL0.DIC: Instruction cache invalidation requirements for instruction to data coherence. The meaning of this bit[29]:
  0: Instruction cache invalidation to the point of unification is required for instruction to data coherence.
  1: Instruction cache invalidation to the point of unification is not required for instruction to data coherence.

CTR_EL0.IDC: Data cache clean requirements for instruction to data coherence. The meaning of this bit[28]:
  0: Data cache clean to the point of unification is required for instruction to data coherence, unless CLIDR_EL1.LoC == 0b000 or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
  1: Data cache clean to the point of unification is not required for instruction to data coherence.

Signed-off-by: Philip Elcan
Signed-off-by: Shanker Donthineni
---
Changes since v1:
  -Reworded commit text.
  -Used the alternatives framework as Catalin suggested.
-Rebased on top of https://patchwork.kernel.org/patch/10227927/ arch/arm64/Kconfig | 21 +++ arch/arm64/include/asm/cache.h | 5 + arch/arm64/include/asm/cpucaps.h | 4 +++- arch/arm64/kernel/cpufeature.c | 44 ++-- arch/arm64/mm/cache.S| 15 ++ 5 files changed, 82 insertions(+), 7 deletions(-) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index f55fe5b..4061210 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1095,6 +1095,27 @@ config ARM64_RAS_EXTN and access the new registers if the system supports the extension. Platform RAS features may additionally depend on firmware support. +config ARM64_CACHE_IDC + bool "Enable support for DCache clean PoU optimization" + default y + help + The data cache clean to the point of unification is not required + for instruction to be data coherence if CTR_EL0.IDC has value 1. + + Selecting this feature will allow the kernel to optimize the POU + cache maintaince operations where it requires 'DC CVAU'. + +config ARM64_CACHE_DIC + bool "Enable support for ICache invalidation PoU optimization" + default y + help + Instruction cache invalidation to the point of unification is not + required for instruction to be data coherence if CTR_EL0.DIC has + value 1. + + Selecting this feature will allow the kernel to optimize the POU + cache maintaince operations where it requires 'IC IVAU'. 
+ endmenu config ARM64_SVE diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h index ea9bb4e..e22178b 100644 --- a/arch/arm64/include/asm/cache.h +++ b/arch/arm64/include/asm/cache.h @@ -20,8 +20,13 @@ #define CTR_L1IP_SHIFT 14 #define CTR_L1IP_MASK 3 +#define CTR_DMLINE_SHIFT 16 +#define CTR_ERG_SHIFT 20 #define CTR_CWG_SHIFT 24 #define CTR_CWG_MASK 15 +#define CTR_IDC_SHIFT 28 +#define CTR_DIC_SHIFT 29 +#define CTR_B31_SHIFT 31 #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index bb26382..8dd42ae 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -45,7 +45,9 @@ #define ARM64_HARDEN_BRANCH_PREDICTOR 24 #define ARM64_HARDEN_BP_POST_GUEST_EXIT25 #define ARM64_HAS_RAS_EXTN 26 +#define ARM64_HAS_CACHE_IDC27 +#define ARM64_HAS_CACHE_DIC28 -#define ARM64_NCAPS27 +#define ARM64_NCAPS29 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c index ff8a6e9..53a7266 100644 --- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -199,12 +199,12 @@ static int __init register_cpu_hwcaps_dumper(void) }; static const struct arm64_ftr_bits ftr_ctr[] = { - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */ - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 29, 1, 1), /* DIC */ -
Re: [PATCH] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Thanks Catalin for your comments. On 02/19/2018 11:18 AM, Catalin Marinas wrote: > On Mon, Feb 19, 2018 at 10:35:30AM -0600, Shanker Donthineni wrote: >> On 02/19/2018 08:38 AM, Catalin Marinas wrote: >>> On the patch, I'd rather have an alternative framework entry for no VAU >>> cache maint required and some ret instruction at the beginning of the >>> cache maint function rather than jumping out of the loop somewhere >>> inside the cache maintenance code, penalising the CPUs that do require >>> it. >> >> Alternative framework might break things in case of CPU hotplug. I need one >> more confirmation from you on incorporating alternative framework. > > CPU hotplug can be an issue but it should be handled like other similar > cases: if a CPU comes online late and its features are incompatible, it > should not be brought online. The cpufeature code handles this. > > With Will's patch for CTR_EL0, we handle different CPU features during > boot, defaulting to the lowest value for the IDC/DIC bits. > > I suggest you add new ARM64_HAS_* feature bits and enable them based on > CTR_EL0.IDC and DIC. You could check for both being 1 with a single > feature bit but I guess an implementation is allowed to have these > different (e.g. DIC == 0 and IDC == 1). > I'll add two new features ARM64_HAS_DIC and ARM64_HAS_IDC to support all implementations. Unfortunately QCOM server chips supports IDC not DIC. -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Hi Will, On 02/19/2018 08:43 AM, Will Deacon wrote: > Hi Shanker, > > On Fri, Feb 16, 2018 at 06:57:46PM -0600, Shanker Donthineni wrote: >> Two point of unification cache maintenance operations 'DC CVAU' and >> 'IC IVAU' are optional for implementors as per ARMv8 specification. >> This patch parses the updated CTR_EL0 register definition and adds >> the required changes to skip POU operations if the hardware reports >> CTR_EL0.IDC and/or CTR_EL0.IDC. >> >> CTR_EL0.DIC: Instruction cache invalidation requirements for >> instruction to data coherence. The meaning of this bit[29]. >> 0: Instruction cache invalidation to the point of unification >> is required for instruction to data coherence. >> 1: Instruction cache cleaning to the point of unification is >> not required for instruction to data coherence. >> >> CTR_EL0.IDC: Data cache clean requirements for instruction to data >> coherence. The meaning of this bit[28]. >> 0: Data cache clean to the point of unification is required for >> instruction to data coherence, unless CLIDR_EL1.LoC == 0b000 >> or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000). >> 1: Data cache clean to the point of unification is not required >> for instruction to data coherence. >> >> Signed-off-by: Philip Elcan >> Signed-off-by: Shanker Donthineni >> --- >> arch/arm64/include/asm/assembler.h | 48 >> -- >> arch/arm64/include/asm/cache.h | 2 ++ >> arch/arm64/kernel/cpufeature.c | 2 ++ >> arch/arm64/mm/cache.S | 26 ++--- >> 4 files changed, 51 insertions(+), 27 deletions(-) > > I was looking at our CTR_EL0 code last week but forgot to post the patch I > wrote fixing up some of the fields. 
I just send it now, so please can > you rebase on top of: > > http://lists.infradead.org/pipermail/linux-arm-kernel/2018-February/560488.html > > Also: > >> diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h >> index ea9bb4e..aea533b 100644 >> --- a/arch/arm64/include/asm/cache.h >> +++ b/arch/arm64/include/asm/cache.h >> @@ -22,6 +22,8 @@ >> #define CTR_L1IP_MASK 3 >> #define CTR_CWG_SHIFT 24 >> #define CTR_CWG_MASK15 >> +#define CTR_IDC_SHIFT 28 >> +#define CTR_DIC_SHIFT 29 >> >> #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & >> CTR_L1IP_MASK) >> >> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c >> index 29b1f87..f42bb5a 100644 >> --- a/arch/arm64/kernel/cpufeature.c >> +++ b/arch/arm64/kernel/cpufeature.c >> @@ -200,6 +200,8 @@ static int __init register_cpu_hwcaps_dumper(void) >> >> static const struct arm64_ftr_bits ftr_ctr[] = { >> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RAO >> */ >> +ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DIC_SHIFT, >> 1, 0), /* DIC */ >> +ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IDC_SHIFT, >> 1, 0), /* IDC */ >> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_SAFE, 24, 4, 0), >> /* CWG */ >> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 20, 4, 0), >> /* ERG */ >> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, 16, 4, 1), >> /* DminLine */ > > Could you update the other table entries here to use the CTR_*_SHIFT values > as well? > I'll do. > Thanks, > > Will > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. 
Re: [PATCH] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Hi Catalin,

On 02/19/2018 08:38 AM, Catalin Marinas wrote:
> On Fri, Feb 16, 2018 at 06:57:46PM -0600, Shanker Donthineni wrote:
>> Two point of unification cache maintenance operations 'DC CVAU' and
>> 'IC IVAU' are optional for implementors as per ARMv8 specification.
>> This patch parses the updated CTR_EL0 register definition and adds
>> the required changes to skip POU operations if the hardware reports
>> CTR_EL0.IDC and/or CTR_EL0.IDC.
>>
>> CTR_EL0.DIC: Instruction cache invalidation requirements for
>> instruction to data coherence. The meaning of this bit[29].
>> 0: Instruction cache invalidation to the point of unification
>> is required for instruction to data coherence.
>> 1: Instruction cache cleaning to the point of unification is
>> not required for instruction to data coherence.
>>
>> CTR_EL0.IDC: Data cache clean requirements for instruction to data
>> coherence. The meaning of this bit[28].
>> 0: Data cache clean to the point of unification is required for
>> instruction to data coherence, unless CLIDR_EL1.LoC == 0b000
>> or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000).
>> 1: Data cache clean to the point of unification is not required
>> for instruction to data coherence.
>
> There is a difference between cache maintenance to PoU "is not required"
> and the actual instructions being optional (i.e. undef when executed).
> If your caches are transparent and DC CVAU/IC IVAU is not required,
> these instructions should behave as NOPs. So, are you trying to improve
> the performance of the cache maintenance routines in the kernel? If yes,
> please show some (relative) numbers and a better description in the
> commit log.

Yes, I agree with you, POU instructions are NOPs if the caches are transparent. There is no issue from a correctness point of view, but it causes unnecessary overhead in the asm routines where the code walks through the VA range in cache-line-size increments.
This overhead is noticeable with 64K pages, especially with section mappings. I'll reword the commit text to reflect your comments in the v2 patch.

e.g. a 512M section with a 64K PAGE_SIZE kernel, assuming a 64-byte cache line size:
  flush_icache_range() consumes around 256M cpu cycles
  Icache loop overhead: 512Mbytes / 64Bytes * 4 instructions per loop
  Dcache loop overhead: 512Mbytes / 64Bytes * 4 instructions per loop
With this patch it takes less than ~1K cycles.

> On the patch, I'd rather have an alternative framework entry for no VAU
> cache maint required and some ret instruction at the beginning of the
> cache maint function rather than jumping out of the loop somewhere
> inside the cache maintenance code, penalising the CPUs that do require
> it.

The alternatives framework might break things in the case of CPU hotplug. I need one more confirmation from you before incorporating the alternatives framework.

-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH] KVM: arm/arm64: No need to zero CNTVOFF in kvm_timer_vcpu_put() for VHE
In AArch64/AArch32, the virtual counter uses a fixed virtual offset of zero in the following situations, as per the ARMv8 specification:

1) HCR_EL2.E2H is 1, and CNTVCT_EL0/CNTVCT are read from EL2.
2) HCR_EL2.{E2H, TGE} is {1, 1}, and either:
   - CNTVCT_EL0 is read from Non-secure EL0 or EL2.
   - CNTVCT is read from Non-secure EL0.

So there is no need to zero CNTVOFF_EL2/CNTVOFF in the VHE case.

Signed-off-by: Shanker Donthineni
---
 virt/kvm/arm/arch_timer.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 70268c0..86eca324 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -541,9 +541,11 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu)
 	 * The kernel may decide to run userspace after calling vcpu_put, so
 	 * we reset cntvoff to 0 to ensure a consistent read between user
 	 * accesses to the virtual counter and kernel access to the physical
-	 * counter.
+	 * counter in the non-VHE case. For VHE, the virtual counter uses a
+	 * fixed virtual offset of zero, so no need to zero CNTVOFF_EL2.
 	 */
-	set_cntvoff(0);
+	if (!has_vhe())
+		set_cntvoff(0);
 }

 /*
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH] arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC
Two point of unification cache maintenance operations 'DC CVAU' and 'IC IVAU' are optional for implementors as per ARMv8 specification. This patch parses the updated CTR_EL0 register definition and adds the required changes to skip POU operations if the hardware reports CTR_EL0.IDC and/or CTR_EL0.IDC. CTR_EL0.DIC: Instruction cache invalidation requirements for instruction to data coherence. The meaning of this bit[29]. 0: Instruction cache invalidation to the point of unification is required for instruction to data coherence. 1: Instruction cache cleaning to the point of unification is not required for instruction to data coherence. CTR_EL0.IDC: Data cache clean requirements for instruction to data coherence. The meaning of this bit[28]. 0: Data cache clean to the point of unification is required for instruction to data coherence, unless CLIDR_EL1.LoC == 0b000 or (CLIDR_EL1.LoUIS == 0b000 && CLIDR_EL1.LoUU == 0b000). 1: Data cache clean to the point of unification is not required for instruction to data coherence. Signed-off-by: Philip Elcan Signed-off-by: Shanker Donthineni --- arch/arm64/include/asm/assembler.h | 48 -- arch/arm64/include/asm/cache.h | 2 ++ arch/arm64/kernel/cpufeature.c | 2 ++ arch/arm64/mm/cache.S | 26 ++--- 4 files changed, 51 insertions(+), 27 deletions(-) diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index 3c78835..9eaa948 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -30,6 +30,7 @@ #include #include #include +#include .macro save_and_disable_daif, flags mrs \flags, daif @@ -334,9 +335,9 @@ * raw_dcache_line_size - get the minimum D-cache line size on this CPU * from the CTR register. 
*/ - .macro raw_dcache_line_size, reg, tmp - mrs \tmp, ctr_el0 // read CTR - ubfm\tmp, \tmp, #16, #19// cache line size encoding + .macro raw_dcache_line_size, reg, tmp, ctr + mrs \ctr, ctr_el0 // read CTR + ubfm\tmp, \ctr, #16, #19// cache line size encoding mov \reg, #4// bytes per word lsl \reg, \reg, \tmp// actual cache line size .endm @@ -344,9 +345,9 @@ /* * dcache_line_size - get the safe D-cache line size across all CPUs */ - .macro dcache_line_size, reg, tmp - read_ctr\tmp - ubfm\tmp, \tmp, #16, #19// cache line size encoding + .macro dcache_line_size, reg, tmp, ctr + read_ctr\ctr + ubfm\tmp, \ctr, #16, #19// cache line size encoding mov \reg, #4// bytes per word lsl \reg, \reg, \tmp// actual cache line size .endm @@ -355,9 +356,9 @@ * raw_icache_line_size - get the minimum I-cache line size on this CPU * from the CTR register. */ - .macro raw_icache_line_size, reg, tmp - mrs \tmp, ctr_el0 // read CTR - and \tmp, \tmp, #0xf// cache line size encoding + .macro raw_icache_line_size, reg, tmp, ctr + mrs \ctr, ctr_el0 // read CTR + and \tmp, \ctr, #0xf// cache line size encoding mov \reg, #4// bytes per word lsl \reg, \reg, \tmp// actual cache line size .endm @@ -365,9 +366,9 @@ /* * icache_line_size - get the safe I-cache line size across all CPUs */ - .macro icache_line_size, reg, tmp - read_ctr\tmp - and \tmp, \tmp, #0xf// cache line size encoding + .macro icache_line_size, reg, tmp, ctr + read_ctr\ctr + and \tmp, \ctr, #0xf// cache line size encoding mov \reg, #4// bytes per word lsl \reg, \reg, \tmp// actual cache line size .endm @@ -408,13 +409,21 @@ * size: size of the region * Corrupts: kaddr, size, tmp1, tmp2 */ - .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2 - dcache_line_size \tmp1, \tmp2 + .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2, tmp3 + dcache_line_size \tmp1, \tmp2, \tmp3 add \size, \kaddr, \size sub \tmp2, \tmp1, #1 bic \kaddr, \kaddr, \tmp2 9998: - .if (\op == cvau || \op == cvac) + .if (\op == cvau) 
+alternative_if_not ARM64_WORKAROUND_CLEAN_CACHE + tbnz\tmp3, #CTR_IDC_SHIFT, 9997f + dc cvau, \kaddr +alternative_else + dc civac, \kaddr + nop +alternative_endif + .elseif (\op == cvac) alternative_if_not ARM64_WORKA
[PATCH] arm64: Add missing Falkor part number for branch predictor hardening
References to CPU part number MIDR_QCOM_FALKOR were dropped from the mailing list patch due to mainline/arm64 branch dependency. So this patch adds the missing part number. Fixes: ec82b567a74f ("arm64: Implement branch predictor hardening for Falkor") Signed-off-by: Shanker Donthineni --- arch/arm64/kernel/cpu_errata.c | 9 + arch/arm64/kvm/hyp/switch.c| 4 +++- 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index 0782359..52f15cd 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -408,6 +408,15 @@ static int qcom_enable_link_stack_sanitization(void *data) }, { .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR), + .enable = qcom_enable_link_stack_sanitization, + }, + { + .capability = ARM64_HARDEN_BP_POST_GUEST_EXIT, + MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR), + }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN), .enable = enable_smccc_arch_workaround_1, }, diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index 116252a8..870f4b1 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -407,8 +407,10 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu) u32 midr = read_cpuid_id(); /* Apply BTAC predictors mitigation to all Falkor chips */ - if ((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR_V1) + if (((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR) || + ((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR_V1)) { __qcom_hyp_sanitize_btac_predictors(); + } } fp_enabled = __fpsimd_enabled(); -- Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH] irqchip/gic-v3: Use wmb() instead of smp_wmb() in gic_raise_softirq()
Hi Will,

Thanks for your quick reply.

On 02/01/2018 04:33 AM, Will Deacon wrote:
> Hi Shanker,
>
> On Wed, Jan 31, 2018 at 06:03:42PM -0600, Shanker Donthineni wrote:
>> A DMB instruction can be used to ensure the relative order of only
>> memory accesses before and after the barrier. Since writes to system
>> registers are not memory operations, barrier DMB is not sufficient
>> for observability of memory accesses that occur before ICC_SGI1R_EL1
>> writes.
>>
>> A DSB instruction ensures that no instructions that appear in program
>> order after the DSB instruction, can execute until the DSB instruction
>> has completed.
>>
>> Signed-off-by: Shanker Donthineni
>> ---
>> drivers/irqchip/irq-gic-v3.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index b56c3e2..980ae8e 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -688,7 +688,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
>> * Ensure that stores to Normal memory are visible to the
>> * other CPUs before issuing the IPI.
>> */
>> -smp_wmb();
>> +wmb();
>
> I think this is the right thing to do and the smp_wmb() was accidentally
> pulled in here as a copy-paste from the GICv2 driver where it is sufficient
> in practice.
>
> Did you spot this by code inspection, or did the DMB actually cause
> observable failures? (trying to figure out whether or not this need to go
> to -stable).

We inspected the code because the kernel was causing failures in scheduler/IPI_RESCHEDULE. After some time of debugging, we landed in the GIC driver and found that the issue was due to the DMB barrier. Side note: we're also missing synchronization barriers in the GIC driver after writing some of the ICC_XXX system registers. I'm planning to post those changes for comments.
e.g. gic_write_sgi1r(val) and gic_write_eoir(irqnr);

> Anyway:
>
> Acked-by: Will Deacon
>
> Cheers,
>
> Will

-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH] irqchip/gic-v3: Use wmb() instead of smp_wmb() in gic_raise_softirq()
A DMB instruction can only ensure the relative order of memory accesses before and after the barrier. Since writes to system registers are not memory operations, a DMB barrier is not sufficient for observability of memory accesses that occur before ICC_SGI1R_EL1 writes.

A DSB instruction ensures that no instruction that appears in program order after the DSB instruction can execute until the DSB instruction has completed.

Signed-off-by: Shanker Donthineni
---
 drivers/irqchip/irq-gic-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index b56c3e2..980ae8e 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -688,7 +688,7 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 * Ensure that stores to Normal memory are visible to the
 	 * other CPUs before issuing the IPI.
 	 */
-	smp_wmb();
+	wmb();

 	for_each_cpu(cpu, mask) {
 		u64 cluster_id = MPIDR_TO_SGI_CLUSTER_ID(cpu_logical_map(cpu));
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH] arm64: Implement branch predictor hardening for Falkor
Hi Will/Catalin, Please drop https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?h=kpti&id=79ad24ef6c260efa0614896b15e67f4829448e32 in which you've removed FALKOR MIDR change. I've posted v2 patch series including typo fix & FALKOR MIDR patch which is already available in upstream v4.15-rc7 branch. Please merge v2 patch. On 01/08/2018 01:10 PM, Shanker Donthineni wrote: > Hi Will, > > On 01/08/2018 12:44 PM, Will Deacon wrote: >> On Mon, Jan 08, 2018 at 05:09:33PM +, Will Deacon wrote: >>> On Fri, Jan 05, 2018 at 02:28:59PM -0600, Shanker Donthineni wrote: >>>> Falkor is susceptible to branch predictor aliasing and can >>>> theoretically be attacked by malicious code. This patch >>>> implements a mitigation for these attacks, preventing any >>>> malicious entries from affecting other victim contexts. >>> >>> Thanks, Shanker. I'll pick this up (fixing the typo pointed out by Drew). >> >> Note that MIDR_FALKOR doesn't exist in mainline, so I had to drop those >> changes too. See the kpti branch for details. >> > > The FALKOR MIDR patch is already available in the upstream kernel v4.15-rc7 > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64?h=v4.15-rc7&id=c622cc013cece073722592cff1ac6643a33b1622 > > If you want I can resend the above patch in v2 series including typo fix. > >> If you'd like anything else done here, please send additional patches to me >> and Catalin that we can apply on top of what we currently have. Note that >> I'm in the air tomorrow, so won't be picking up email. >> >> Cheers, >> >> Will >> >> ___ >> linux-arm-kernel mailing list >> linux-arm-ker...@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >> > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. 
___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v2 1/2] arm64: Define cputype macros for Falkor CPU
Add cputype definition macros for the Qualcomm Datacenter Technologies Falkor CPU in cputype.h. It's unfortunate that the first revision of the Falkor CPU used the wrong part number 0x800; this was fixed in the v2 chip with part number 0xC00, and the same value will be used for future revisions.

Signed-off-by: Shanker Donthineni
Signed-off-by: Will Deacon
---
This patch is available at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64?h=v4.15-rc7&id=c622cc013cece073722592cff1ac6643a33b1622

 arch/arm64/include/asm/cputype.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 84385b9..424ca71d 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -93,6 +93,7 @@
 #define BRCM_CPU_PART_VULCAN	0x516

 #define QCOM_CPU_PART_FALKOR_V1	0x800
+#define QCOM_CPU_PART_FALKOR	0xC00

 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
@@ -103,6 +104,7 @@
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
 #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)
+#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR)

 #ifndef __ASSEMBLY__
-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH v2 2/2] arm64: Implement branch predictor hardening for Falkor
Falkor is susceptible to branch predictor aliasing and can theoretically be attacked by malicious code. This patch implements a mitigation for these attacks, preventing any malicious entries from affecting other victim contexts. Signed-off-by: Shanker Donthineni --- Changes since v1: Corrected a typo to fix the compilation errors if HARDEN_BRANCH_PREDICTOR=n. This patch requires the FALKOR MIDR, which is available in upstream v4.15-rc7 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64?h=v4.15-rc7&id=c622cc013cece073722592cff1ac6643a33b1622 and is also attached to this v2 patch series. arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/include/asm/kvm_asm.h | 2 ++ arch/arm64/kernel/bpi.S | 8 +++ arch/arm64/kernel/cpu_errata.c | 49 ++-- arch/arm64/kvm/hyp/entry.S | 12 ++ arch/arm64/kvm/hyp/switch.c | 10 6 files changed, 81 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 51616e7..7049b48 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -43,7 +43,8 @@ #define ARM64_SVE 22 #define ARM64_UNMAP_KERNEL_AT_EL0 23 #define ARM64_HARDEN_BRANCH_PREDICTOR 24 +#define ARM64_HARDEN_BP_POST_GUEST_EXIT 25 -#define ARM64_NCAPS 25 +#define ARM64_NCAPS 26 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index ab4d0a9..24961b7 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -68,6 +68,8 @@ extern u32 __init_stage2_translation(void); +extern void __qcom_hyp_sanitize_btac_predictors(void); + #endif #endif /* __ARM_KVM_ASM_H__ */ diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S index 2b10d52..44ffcda 100644 --- a/arch/arm64/kernel/bpi.S +++ b/arch/arm64/kernel/bpi.S @@ -77,3 +77,11 @@ ENTRY(__psci_hyp_bp_inval_start) ldp x2, x3, [sp], #16 ldp x0, x1, [sp], #16 ENTRY(__psci_hyp_bp_inval_end) + +ENTRY(__qcom_hyp_sanitize_link_stack_start) + stp x29, x30, 
[sp, #-16]! + .rept 16 + bl . + 4 + .endr + ldp x29, x30, [sp], #16 +ENTRY(__qcom_hyp_sanitize_link_stack_end) diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index cb0fb37..9ee9d2e 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -54,6 +54,8 @@ static int cpu_enable_trap_ctr_access(void *__unused) #ifdef CONFIG_KVM extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[]; +extern char __qcom_hyp_sanitize_link_stack_start[]; +extern char __qcom_hyp_sanitize_link_stack_end[]; static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, const char *hyp_vecs_end) @@ -96,8 +98,10 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn, spin_unlock(&bp_lock); } #else -#define __psci_hyp_bp_inval_start NULL -#define __psci_hyp_bp_inval_endNULL +#define __psci_hyp_bp_inval_start NULL +#define __psci_hyp_bp_inval_endNULL +#define __qcom_hyp_sanitize_link_stack_start NULL +#define __qcom_hyp_sanitize_link_stack_end NULL static void __install_bp_hardening_cb(bp_hardening_cb_t fn, const char *hyp_vecs_start, @@ -138,6 +142,29 @@ static int enable_psci_bp_hardening(void *data) return 0; } + +static void qcom_link_stack_sanitization(void) +{ + u64 tmp; + + asm volatile("mov %0, x30 \n" +".rept 16 \n" +"bl. 
+ 4 \n" +".endr \n" +"mov x30, %0 \n" +: "=&r" (tmp)); +} + +static int qcom_enable_link_stack_sanitization(void *data) +{ + const struct arm64_cpu_capabilities *entry = data; + + install_bp_hardening_cb(entry, qcom_link_stack_sanitization, + __qcom_hyp_sanitize_link_stack_start, + __qcom_hyp_sanitize_link_stack_end); + + return 0; +} #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */ #define MIDR_RANGE(model, min, max) \ @@ -302,6 +329,24 @@ static int enable_psci_bp_hardening(void *data) MIDR_ALL_VERSIONS(MIDR_CORTEX_A75), .enable = enable_psci_bp_hardening, }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1), + .enable = qcom_enable_link_stack_sanitization, + }, + { + .capability = ARM64_HAR
Re: [PATCH] arm64: Implement branch predictor hardening for Falkor
Hi Will, On 01/08/2018 12:44 PM, Will Deacon wrote: > On Mon, Jan 08, 2018 at 05:09:33PM +, Will Deacon wrote: >> On Fri, Jan 05, 2018 at 02:28:59PM -0600, Shanker Donthineni wrote: >>> Falkor is susceptible to branch predictor aliasing and can >>> theoretically be attacked by malicious code. This patch >>> implements a mitigation for these attacks, preventing any >>> malicious entries from affecting other victim contexts. >> >> Thanks, Shanker. I'll pick this up (fixing the typo pointed out by Drew). > > Note that MIDR_FALKOR doesn't exist in mainline, so I had to drop those > changes too. See the kpti branch for details. > The FALKOR MIDR patch is already available in the upstream kernel v4.15-rc7 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64?h=v4.15-rc7&id=c622cc013cece073722592cff1ac6643a33b1622 If you want I can resend the above patch in v2 series including typo fix. > If you'd like anything else done here, please send additional patches to me > and Catalin that we can apply on top of what we currently have. Note that > I'm in the air tomorrow, so won't be picking up email. > > Cheers, > > Will > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH] arm64: Implement branch predictor hardening for Falkor
Hi Andrew, On 01/08/2018 03:28 AM, Andrew Jones wrote: > Hi Shanker, > > On Fri, Jan 05, 2018 at 02:28:59PM -0600, Shanker Donthineni wrote: > ... >> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c >> index cb0fb37..daf53a5 100644 >> --- a/arch/arm64/kernel/cpu_errata.c >> +++ b/arch/arm64/kernel/cpu_errata.c >> @@ -54,6 +54,8 @@ static int cpu_enable_trap_ctr_access(void *__unused) >> >> #ifdef CONFIG_KVM >> extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[]; >> +extern char __qcom_hyp_sanitize_link_stack_start[]; >> +extern char __qcom_hyp_sanitize_link_stack_end[]; >> >> static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, >> const char *hyp_vecs_end) >> @@ -96,8 +98,10 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t >> fn, >> spin_unlock(&bp_lock); >> } >> #else >> -#define __psci_hyp_bp_inval_start NULL >> -#define __psci_hyp_bp_inval_end NULL >> +#define __psci_hyp_bp_inval_start NULL >> +#define __psci_hyp_bp_inval_end NULL >> +#define __qcom_hyp_sanitize_link_stack_startNULL >> +#define __qcom_hyp_sanitize_link_stack_startNULL > ^^ copy+paste error here Thanks for catching typo, I'll fix in v2 patch. > > Thanks, > drew > > ___________ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH] arm64: Implement branch predictor hardening for Falkor
Falkor is susceptible to branch predictor aliasing and can theoretically be attacked by malicious code. This patch implements a mitigation for these attacks, preventing any malicious entries from affecting other victim contexts. Signed-off-by: Shanker Donthineni --- This patch has been verified using tip of https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti and https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm64?h=v4.15-rc6&id=c622cc013cece073722592cff1ac6643a33b1622 arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/include/asm/kvm_asm.h | 2 ++ arch/arm64/kernel/bpi.S | 8 +++ arch/arm64/kernel/cpu_errata.c | 49 ++-- arch/arm64/kvm/hyp/entry.S | 12 ++ arch/arm64/kvm/hyp/switch.c | 10 6 files changed, 81 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 51616e7..7049b48 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -43,7 +43,8 @@ #define ARM64_SVE 22 #define ARM64_UNMAP_KERNEL_AT_EL0 23 #define ARM64_HARDEN_BRANCH_PREDICTOR 24 +#define ARM64_HARDEN_BP_POST_GUEST_EXIT25 -#define ARM64_NCAPS25 +#define ARM64_NCAPS26 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h index ab4d0a9..24961b7 100644 --- a/arch/arm64/include/asm/kvm_asm.h +++ b/arch/arm64/include/asm/kvm_asm.h @@ -68,6 +68,8 @@ extern u32 __init_stage2_translation(void); +extern void __qcom_hyp_sanitize_btac_predictors(void); + #endif #endif /* __ARM_KVM_ASM_H__ */ diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S index 2b10d52..44ffcda 100644 --- a/arch/arm64/kernel/bpi.S +++ b/arch/arm64/kernel/bpi.S @@ -77,3 +77,11 @@ ENTRY(__psci_hyp_bp_inval_start) ldp x2, x3, [sp], #16 ldp x0, x1, [sp], #16 ENTRY(__psci_hyp_bp_inval_end) + +ENTRY(__qcom_hyp_sanitize_link_stack_start) + stp x29, x30, [sp, #-16]! + .rept 16 + bl . 
+ 4 + .endr + ldp x29, x30, [sp], #16 +ENTRY(__qcom_hyp_sanitize_link_stack_end) diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c index cb0fb37..daf53a5 100644 --- a/arch/arm64/kernel/cpu_errata.c +++ b/arch/arm64/kernel/cpu_errata.c @@ -54,6 +54,8 @@ static int cpu_enable_trap_ctr_access(void *__unused) #ifdef CONFIG_KVM extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[]; +extern char __qcom_hyp_sanitize_link_stack_start[]; +extern char __qcom_hyp_sanitize_link_stack_end[]; static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start, const char *hyp_vecs_end) @@ -96,8 +98,10 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn, spin_unlock(&bp_lock); } #else -#define __psci_hyp_bp_inval_start NULL -#define __psci_hyp_bp_inval_endNULL +#define __psci_hyp_bp_inval_start NULL +#define __psci_hyp_bp_inval_endNULL +#define __qcom_hyp_sanitize_link_stack_start NULL +#define __qcom_hyp_sanitize_link_stack_start NULL static void __install_bp_hardening_cb(bp_hardening_cb_t fn, const char *hyp_vecs_start, @@ -138,6 +142,29 @@ static int enable_psci_bp_hardening(void *data) return 0; } + +static void qcom_link_stack_sanitization(void) +{ + u64 tmp; + + asm volatile("mov %0, x30 \n" +".rept 16 \n" +"bl. 
+ 4 \n" +".endr \n" +"mov x30, %0 \n" +: "=&r" (tmp)); +} + +static int qcom_enable_link_stack_sanitization(void *data) +{ + const struct arm64_cpu_capabilities *entry = data; + + install_bp_hardening_cb(entry, qcom_link_stack_sanitization, + __qcom_hyp_sanitize_link_stack_start, + __qcom_hyp_sanitize_link_stack_end); + + return 0; +} #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */ #define MIDR_RANGE(model, min, max) \ @@ -302,6 +329,24 @@ static int enable_psci_bp_hardening(void *data) MIDR_ALL_VERSIONS(MIDR_CORTEX_A75), .enable = enable_psci_bp_hardening, }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1), + .enable = qcom_enable_link_stack_sanitization, + }, + { + .capability = ARM64_HARDEN_BRANCH_PREDICTOR, + MIDR_ALL_VERSIONS(MIDR_QCO
[PATCH v5 2/2] arm64: Add software workaround for Falkor erratum 1041
The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and the next 4KB region. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the regions permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. 1) A System Error Interrupt (SEI) being raised by the Falkor core due to the errant memory access attempting to access a region of memory that is protected by a slave-side memory protection unit. 2) Unpredictable device behavior due to a speculative read from device memory. This behavior may only occur if the instruction cache is disabled prior to or coincident with translation being changed from enabled to disabled. The conditions leading to this erratum will not occur when either of the following occurs: 1) A higher exception level disables translation of a lower exception level (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0). 2) An exception level disables its stage-1 translation while its stage-2 translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1 to 0 when HCR_EL2[VM] has a value of 1). To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. Signed-off-by: Shanker Donthineni --- Changes since v4: Rebased to kernel v4.15-rc3 and removed the alternatives. Changes since v3: Rebased to kernel v4.15-rc1. Changes since v2: Repost the corrected patches. Changes since v1: Apply the workaround where it's required. 
Documentation/arm64/silicon-errata.txt | 1 + arch/arm64/Kconfig | 12 +++- arch/arm64/include/asm/assembler.h | 10 ++ arch/arm64/kernel/cpu-reset.S | 1 + arch/arm64/kernel/efi-entry.S | 2 ++ arch/arm64/kernel/head.S | 1 + arch/arm64/kernel/relocate_kernel.S| 1 + arch/arm64/kvm/hyp-init.S | 1 + 8 files changed, 28 insertions(+), 1 deletion(-) diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt index 304bf22..fc1c884 100644 --- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -75,3 +75,4 @@ stable kernels. | Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003| | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009| | Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 | +| Qualcomm Tech. | Falkor v{1,2} | E1041 | QCOM_FALKOR_ERRATUM_1041| diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index a93339f..c9a7e9e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -557,7 +557,6 @@ config QCOM_QDF2400_ERRATUM_0065 If unsure, say Y. - config SOCIONEXT_SYNQUACER_PREITS bool "Socionext Synquacer: Workaround for GICv3 pre-ITS" default y @@ -576,6 +575,17 @@ config HISILICON_ERRATUM_161600802 a 128kB offset to be applied to the target address in this commands. If unsure, say Y. + +config QCOM_FALKOR_ERRATUM_E1041 + bool "Falkor E1041: Speculative instruction fetches might cause errant memory access" + default y + help + Falkor CPU may speculatively fetch instructions from an improper + memory location when MMU translation is changed from SCTLR_ELn[M]=1 + to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem. + + If unsure, say Y. + endmenu diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index aef72d8..8b16828 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -512,4 +512,14 @@ #endif .endm +/** + * Errata workaround prior to disable MMU. 
Insert an ISB immediately prior + * to executing the MSR that will change SCTLR_ELn[M] from a value of 1 to 0. + */ + .macro pre_disable_mmu_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 + isb +#endif + .endm + #endif /* __ASM_ASSEMBLER_H */ diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S index 65f42d2..2a752cb 100644 --- a/arch/arm64/kernel/cpu-reset.S +++ b/arch/arm64/kernel/cpu-reset.S @@ -37,6 +37,7 @@ ENTRY(__cpu_soft_restart) mrs x12, sctlr_el1 ldr x13, =SCTLR_ELx_FLAGS bic x12, x12, x13 + pre_disable_mmu_workaround msr sctlr_el1, x12 isb diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry
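All of the call sites touched by this series follow the same shape: the ISB is placed between computing the new SCTLR value and the MSR that clears the M bit. A hedged sketch of the pattern (register numbers and the EL1 target are illustrative, not copied from any one file in the patch):

```asm
	/* About to disable the MMU at the current exception level. */
	mrs	x12, sctlr_el1
	bic	x12, x12, #1		/* clear SCTLR_EL1.M */
	pre_disable_mmu_workaround	/* expands to a bare ISB when
					 * CONFIG_QCOM_FALKOR_ERRATUM_E1041=y,
					 * to nothing otherwise */
	msr	sctlr_el1, x12
	isb				/* synchronize the now-disabled MMU */
```

The ISB drains the pipeline so no speculative instruction fetch started under the old translation regime can straddle the MSR that turns translation off.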
[PATCH v5 1/2] arm64: Define cputype macros for Falkor CPU
Add cputype definition macros for the Qualcomm Datacenter Technologies Falkor CPU in cputype.h. Unfortunately, the first revision of the Falkor CPU used the wrong part number 0x800; this was fixed in the v2 chip with part number 0xC00, and the same value will be used for future revisions. Signed-off-by: Shanker Donthineni --- arch/arm64/include/asm/cputype.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index 235e77d..cbf08d7 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -91,6 +91,7 @@ #define BRCM_CPU_PART_VULCAN 0x516 #define QCOM_CPU_PART_FALKOR_V1 0x800 +#define QCOM_CPU_PART_FALKOR 0xC00 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53) #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57) @@ -99,6 +100,7 @@ #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX) #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1) +#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR) #ifndef __ASSEMBLY__ -- Qualcomm Datacenter Technologies, Inc. on behalf of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [RESEND PATCH v4 2/2] arm64: Add software workaround for Falkor erratum 1041
Thanks Mark, I'll post v5 patch without alternatives. On 12/11/2017 04:45 AM, Mark Rutland wrote: > Hi, > > On Sun, Dec 10, 2017 at 08:03:43PM -0600, Shanker Donthineni wrote: >> +/** >> + * Errata workaround prior to disable MMU. Insert an ISB immediately prior >> + * to executing the MSR that will change SCTLR_ELn[M] from a value of 1 to >> 0. >> + */ >> +.macro pre_disable_mmu_workaround >> +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 >> +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 >> +isb >> +alternative_else_nop_endif >> +#endif >> +.endm > > There's really no need for this to be an alternative. It makes the > kernel larger and more complex due to all the altinstr data and probing > code. > > As Will suggested last time [1], please just use the ifdef, and always > compile-in the extra ISB if CONFIG_QCOM_FALKOR_ERRATUM_E1041 is > selected. Get rid of the alternatives and probing code. > > All you need here is: > > /* >* Some Falkor parts make errant speculative instruction fetches >* when SCTLR_ELx.M is cleared. An ISB before the write to >* SCTLR_ELx prevents this. >*/ > .macro pre_disable_mmu_workaround > #ifdef > isb > #endif > .endm > >> + >> +.macro pre_disable_mmu_early_workaround >> +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 >> +isb >> +#endif >> +.endm >> + > > ... and we don't need a special early variant. > > Thanks, > Mark. > > [1] https://lkml.kernel.org/r/20171201112457.ge18...@arm.com > > ___ > linux-arm-kernel mailing list > linux-arm-ker...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[RESEND PATCH v4 2/2] arm64: Add software workaround for Falkor erratum 1041
The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and the next 4KB region. When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the regions permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors. 1) A System Error Interrupt (SEI) being raised by the Falkor core due to the errant memory access attempting to access a region of memory that is protected by a slave-side memory protection unit. 2) Unpredictable device behavior due to a speculative read from device memory. This behavior may only occur if the instruction cache is disabled prior to or coincident with translation being changed from enabled to disabled. The conditions leading to this erratum will not occur when either of the following occurs: 1) A higher exception level disables translation of a lower exception level (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0). 2) An exception level disables its stage-1 translation while its stage-2 translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1 to 0 when HCR_EL2[VM] has a value of 1). To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. Signed-off-by: Shanker Donthineni --- Changes since v3: Rebased to kernel v4.15-rc1. Changes since v2: Repost the corrected patches. Changes since v1: Apply the workaround where it's required. 
Documentation/arm64/silicon-errata.txt | 1 + arch/arm64/Kconfig | 12 +++- arch/arm64/include/asm/assembler.h | 19 +++ arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/kernel/cpu-reset.S | 1 + arch/arm64/kernel/cpu_errata.c | 16 arch/arm64/kernel/efi-entry.S | 2 ++ arch/arm64/kernel/head.S | 1 + arch/arm64/kernel/relocate_kernel.S| 1 + arch/arm64/kvm/hyp-init.S | 1 + 10 files changed, 55 insertions(+), 2 deletions(-) diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt index 304bf22..fc1c884 100644 --- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -75,3 +75,4 @@ stable kernels. | Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003| | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009| | Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 | +| Qualcomm Tech. | Falkor v{1,2} | E1041 | QCOM_FALKOR_ERRATUM_1041| diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index a93339f..c9a7e9e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -557,7 +557,6 @@ config QCOM_QDF2400_ERRATUM_0065 If unsure, say Y. - config SOCIONEXT_SYNQUACER_PREITS bool "Socionext Synquacer: Workaround for GICv3 pre-ITS" default y @@ -576,6 +575,17 @@ config HISILICON_ERRATUM_161600802 a 128kB offset to be applied to the target address in this commands. If unsure, say Y. + +config QCOM_FALKOR_ERRATUM_E1041 + bool "Falkor E1041: Speculative instruction fetches might cause errant memory access" + default y + help + Falkor CPU may speculatively fetch instructions from an improper + memory location when MMU translation is changed from SCTLR_ELn[M]=1 + to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem. + + If unsure, say Y. 
+ endmenu diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index aef72d8..c77742a 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -31,6 +31,7 @@ #include #include #include +#include .macro save_and_disable_daif, flags mrs \flags, daif @@ -512,4 +513,22 @@ #endif .endm +/** + * Errata workaround prior to disable MMU. Insert an ISB immediately prior + * to executing the MSR that will change SCTLR_ELn[M] from a value of 1 to 0. + */ + .macro pre_disable_mmu_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 + isb +alternative_else_nop_endif +#endif + .endm + + .macro pre_disable_mmu_early_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 + isb +#endif + .endm + #endif /* __ASM_ASSEMBLER_H */ diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h inde
[RESEND PATCH v4 1/2] arm64: Define cputype macros for Falkor CPU
Add cputype definition macros for the Qualcomm Datacenter Technologies Falkor CPU in cputype.h. Unfortunately, the first revision of the Falkor CPU used the wrong part number 0x800; this was fixed in the v2 chip with part number 0xC00, and the same value will be used for future revisions. Signed-off-by: Shanker Donthineni --- arch/arm64/include/asm/cputype.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h index 235e77d..cbf08d7 100644 --- a/arch/arm64/include/asm/cputype.h +++ b/arch/arm64/include/asm/cputype.h @@ -91,6 +91,7 @@ #define BRCM_CPU_PART_VULCAN 0x516 #define QCOM_CPU_PART_FALKOR_V1 0x800 +#define QCOM_CPU_PART_FALKOR 0xC00 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53) #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57) @@ -99,6 +100,7 @@ #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX) #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX) #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1) +#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR) #ifndef __ASSEMBLY__ -- Qualcomm Datacenter Technologies, Inc. on behalf of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v4 2/2] arm64: Add software workaround for Falkor erratum 1041
Hi Will, On 12/03/2017 07:35 AM, Shanker Donthineni wrote: > Hi Will, thanks for your review comments. > > On 12/01/2017 05:24 AM, Will Deacon wrote: >> On Mon, Nov 27, 2017 at 05:18:00PM -0600, Shanker Donthineni wrote: >>> The ARM architecture defines the memory locations that are permitted >>> to be accessed as the result of a speculative instruction fetch from >>> an exception level for which all stages of translation are disabled. >>> Specifically, the core is permitted to speculatively fetch from the >>> 4KB region containing the current program counter 4K and next 4K. >>> >>> When translation is changed from enabled to disabled for the running >>> exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the >>> Falkor core may errantly speculatively access memory locations outside >>> of the 4KB region permitted by the architecture. The errant memory >>> access may lead to one of the following unexpected behaviors. >>> >>> 1) A System Error Interrupt (SEI) being raised by the Falkor core due >>>to the errant memory access attempting to access a region of memory >>>that is protected by a slave-side memory protection unit. >>> 2) Unpredictable device behavior due to a speculative read from device >>>memory. This behavior may only occur if the instruction cache is >>>disabled prior to or coincident with translation being changed from >>>enabled to disabled. >>> >>> The conditions leading to this erratum will not occur when either of the >>> following occur: >>> 1) A higher exception level disables translation of a lower exception level >>>(e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0). >>> 2) An exception level disabling its stage-1 translation if its stage-2 >>> translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1 >>> to 0 when HCR_EL2[VM] has a value of 1). >>> >>> To avoid the errant behavior, software must execute an ISB immediately >>> prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. 
>>> >>> Signed-off-by: Shanker Donthineni >>> --- >>> Changes since v3: >>> Rebased to kernel v4.15-rc1. >>> Changes since v2: >>> Repost the corrected patches. >>> Changes since v1: >>> Apply the workaround where it's required. >>> >>> Documentation/arm64/silicon-errata.txt | 1 + >>> arch/arm64/Kconfig | 12 +++- >>> arch/arm64/include/asm/assembler.h | 19 +++ >>> arch/arm64/include/asm/cpucaps.h | 3 ++- >>> arch/arm64/kernel/cpu-reset.S | 1 + >>> arch/arm64/kernel/cpu_errata.c | 16 >>> arch/arm64/kernel/efi-entry.S | 2 ++ >>> arch/arm64/kernel/head.S | 1 + >>> arch/arm64/kernel/relocate_kernel.S| 1 + >>> arch/arm64/kvm/hyp-init.S | 1 + >> >> This is an awful lot of code just to add an ISB instruction prior to >> disabling the MMU. Why do you need to go through the alternatives framework >> for this? Just do it with an #ifdef; this isn't a fastpath. >> > > We can avoid changes to only two files cpu_errata.c and cpucaps.h without > using > the alternatives framework. Even though it's in slow path, cpu-errata.c > changes > provides a nice debug message which indicates the erratum E1041 is applied. > > Erratum log information would be very useful to conform our customers using > the > right kernel with E1014 patch by looking at dmesg. Other than that I don't > have > any other strong opinion to avoid alternatives and handle using #idef. > > Should I go ahead and post v5 patch without alternatives? > Please provide your thoughts on next step. We would like to merge this erratum to v4.15 kernel. >> Will >> >> ___ >> linux-arm-kernel mailing list >> linux-arm-ker...@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >> > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v4 2/2] arm64: Add software workaround for Falkor erratum 1041
Hi Will, thanks for your review comments. On 12/01/2017 05:24 AM, Will Deacon wrote: > On Mon, Nov 27, 2017 at 05:18:00PM -0600, Shanker Donthineni wrote: >> The ARM architecture defines the memory locations that are permitted >> to be accessed as the result of a speculative instruction fetch from >> an exception level for which all stages of translation are disabled. >> Specifically, the core is permitted to speculatively fetch from the >> 4KB region containing the current program counter 4K and next 4K. >> >> When translation is changed from enabled to disabled for the running >> exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the >> Falkor core may errantly speculatively access memory locations outside >> of the 4KB region permitted by the architecture. The errant memory >> access may lead to one of the following unexpected behaviors. >> >> 1) A System Error Interrupt (SEI) being raised by the Falkor core due >>to the errant memory access attempting to access a region of memory >>that is protected by a slave-side memory protection unit. >> 2) Unpredictable device behavior due to a speculative read from device >>memory. This behavior may only occur if the instruction cache is >>disabled prior to or coincident with translation being changed from >>enabled to disabled. >> >> The conditions leading to this erratum will not occur when either of the >> following occur: >> 1) A higher exception level disables translation of a lower exception level >>(e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0). >> 2) An exception level disabling its stage-1 translation if its stage-2 >> translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1 >> to 0 when HCR_EL2[VM] has a value of 1). >> >> To avoid the errant behavior, software must execute an ISB immediately >> prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. >> >> Signed-off-by: Shanker Donthineni >> --- >> Changes since v3: >> Rebased to kernel v4.15-rc1. 
>> Changes since v2:
>> Repost the corrected patches.
>> Changes since v1:
>> Apply the workaround where it's required.
>>
>>  Documentation/arm64/silicon-errata.txt | 1 +
>>  arch/arm64/Kconfig | 12 +++-
>>  arch/arm64/include/asm/assembler.h | 19 +++
>>  arch/arm64/include/asm/cpucaps.h | 3 ++-
>>  arch/arm64/kernel/cpu-reset.S | 1 +
>>  arch/arm64/kernel/cpu_errata.c | 16 
>>  arch/arm64/kernel/efi-entry.S | 2 ++
>>  arch/arm64/kernel/head.S | 1 +
>>  arch/arm64/kernel/relocate_kernel.S | 1 +
>>  arch/arm64/kvm/hyp-init.S | 1 +
>
> This is an awful lot of code just to add an ISB instruction prior to
> disabling the MMU. Why do you need to go through the alternatives framework
> for this? Just do it with an #ifdef; this isn't a fastpath.
>

We can limit the changes to just two files, cpu_errata.c and cpucaps.h, by not using the alternatives framework. Even though this is a slow path, the cpu_errata.c change provides a nice debug message indicating that erratum E1041 has been applied. That log message would be very useful for confirming that our customers are running a kernel with the E1041 patch, just by looking at dmesg. Other than that, I don't have any strong opinion against avoiding alternatives and handling this with an #ifdef. Should I go ahead and post a v5 patch without alternatives?

> Will
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

--
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v4 0/2] Implement a software workaround for Falkor erratum 1041
On the Falkor CPU, we've discovered a hardware issue which might lead to a kernel crash or unexpected behavior. The Falkor core may errantly access memory locations on speculative instruction fetches. This may happen whenever the MMU translation state (the SCTLR_ELn[M] bit) is being changed from enabled to disabled for the currently running exception level. To prevent the errant hardware behavior, software must execute an ISB immediately prior to executing the MSR that changes SCTLR_ELn[M] from a value of 1 to 0.

These v4 patches are based on 4.15-rc1 and tested on the QDF2400 platform. Patch 2 from the v1 series was dropped to accommodate review comments. Apply the workaround where it's required. The wrong patches were posted in v2.

Shanker Donthineni (2):
  arm64: Define cputype macros for Falkor CPU
  arm64: Add software workaround for Falkor erratum 1041

 Documentation/arm64/silicon-errata.txt | 1 +
 arch/arm64/Kconfig | 12 +++-
 arch/arm64/include/asm/assembler.h | 19 +++
 arch/arm64/include/asm/cpucaps.h | 3 ++-
 arch/arm64/include/asm/cputype.h | 2 ++
 arch/arm64/kernel/cpu-reset.S | 1 +
 arch/arm64/kernel/cpu_errata.c | 16 
 arch/arm64/kernel/efi-entry.S | 2 ++
 arch/arm64/kernel/head.S | 1 +
 arch/arm64/kernel/relocate_kernel.S | 1 +
 arch/arm64/kvm/hyp-init.S | 1 +
 11 files changed, 57 insertions(+), 2 deletions(-)

--
Qualcomm Datacenter Technologies, Inc. on behalf of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH v4 1/2] arm64: Define cputype macros for Falkor CPU
Add cputype definition macros for the Qualcomm Datacenter Technologies Falkor CPU in cputype.h. It's unfortunate that the first revision of the Falkor CPU used the wrong part number 0x800; this was fixed in the v2 chip with part number 0xC00, and the same value will be used for future revisions.

Signed-off-by: Shanker Donthineni
---
 arch/arm64/include/asm/cputype.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 235e77d..cbf08d7 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -91,6 +91,7 @@
 #define BRCM_CPU_PART_VULCAN 0x516
 #define QCOM_CPU_PART_FALKOR_V1 0x800
+#define QCOM_CPU_PART_FALKOR 0xC00
 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
@@ -99,6 +100,7 @@
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
 #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)
+#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR)

 #ifndef __ASSEMBLY__
--
[PATCH v4 2/2] arm64: Add software workaround for Falkor erratum 1041
The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and the next 4KB region.

When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors.

1) A System Error Interrupt (SEI) being raised by the Falkor core due to
   the errant memory access attempting to access a region of memory that
   is protected by a slave-side memory protection unit.
2) Unpredictable device behavior due to a speculative read from device
   memory. This behavior may only occur if the instruction cache is
   disabled prior to or coincident with translation being changed from
   enabled to disabled.

The conditions leading to this erratum will not occur when either of the following occur:
1) A higher exception level disables translation of a lower exception level
   (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0).
2) An exception level disables its stage-1 translation while its stage-2
   translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1
   to 0 when HCR_EL2[VM] has a value of 1).

To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.

Signed-off-by: Shanker Donthineni
---
Changes since v3: Rebased to kernel v4.15-rc1.
Changes since v2: Repost the corrected patches.
Changes since v1: Apply the workaround where it's required.
Documentation/arm64/silicon-errata.txt | 1 + arch/arm64/Kconfig | 12 +++- arch/arm64/include/asm/assembler.h | 19 +++ arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/kernel/cpu-reset.S | 1 + arch/arm64/kernel/cpu_errata.c | 16 arch/arm64/kernel/efi-entry.S | 2 ++ arch/arm64/kernel/head.S | 1 + arch/arm64/kernel/relocate_kernel.S| 1 + arch/arm64/kvm/hyp-init.S | 1 + 10 files changed, 55 insertions(+), 2 deletions(-) diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt index 304bf22..fc1c884 100644 --- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -75,3 +75,4 @@ stable kernels. | Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003| | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009| | Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 | +| Qualcomm Tech. | Falkor v{1,2} | E1041 | QCOM_FALKOR_ERRATUM_1041| diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index a93339f..c9a7e9e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -557,7 +557,6 @@ config QCOM_QDF2400_ERRATUM_0065 If unsure, say Y. - config SOCIONEXT_SYNQUACER_PREITS bool "Socionext Synquacer: Workaround for GICv3 pre-ITS" default y @@ -576,6 +575,17 @@ config HISILICON_ERRATUM_161600802 a 128kB offset to be applied to the target address in this commands. If unsure, say Y. + +config QCOM_FALKOR_ERRATUM_E1041 + bool "Falkor E1041: Speculative instruction fetches might cause errant memory access" + default y + help + Falkor CPU may speculatively fetch instructions from an improper + memory location when MMU translation is changed from SCTLR_ELn[M]=1 + to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem. + + If unsure, say Y. 
+ endmenu diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index aef72d8..c77742a 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -31,6 +31,7 @@ #include #include #include +#include .macro save_and_disable_daif, flags mrs \flags, daif @@ -512,4 +513,22 @@ #endif .endm +/** + * Errata workaround prior to disable MMU. Insert an ISB immediately prior + * to executing the MSR that will change SCTLR_ELn[M] from a value of 1 to 0. + */ + .macro pre_disable_mmu_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 + isb +alternative_else_nop_endif +#endif + .endm + + .macro pre_disable_mmu_early_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 + isb +#endif + .endm + #endif /* __ASM_ASSEMBLER_H */ diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h inde
Re: [PATCH v2 2/2] arm64: Add software workaround for Falkor erratum 1041
Hi,

Sorry, I posted a wrong patch which causes compilation errors. Please disregard this patch; I've posted a v3 patch to fix the build issue:
https://patchwork.kernel.org/patch/10055077/

On 11/12/2017 07:16 PM, Shanker Donthineni wrote:
> The ARM architecture defines the memory locations that are permitted
> to be accessed as the result of a speculative instruction fetch from
> an exception level for which all stages of translation are disabled.
> Specifically, the core is permitted to speculatively fetch from the
> 4KB region containing the current program counter and the next 4KB
> region.
>
> When translation is changed from enabled to disabled for the running
> exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the
> Falkor core may errantly speculatively access memory locations outside
> of the 4KB region permitted by the architecture. The errant memory
> access may lead to one of the following unexpected behaviors.
>
> 1) A System Error Interrupt (SEI) being raised by the Falkor core due
>    to the errant memory access attempting to access a region of memory
>    that is protected by a slave-side memory protection unit.
> 2) Unpredictable device behavior due to a speculative read from device
>    memory. This behavior may only occur if the instruction cache is
>    disabled prior to or coincident with translation being changed from
>    enabled to disabled.
>
> The conditions leading to this erratum will not occur when either of the
> following occur:
> 1) A higher exception level disables translation of a lower exception level
>    (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0).
> 2) An exception level disabling its stage-1 translation if its stage-2
>    translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1
>    to 0 when HCR_EL2[VM] has a value of 1).
>
> To avoid the errant behavior, software must execute an ISB immediately
> prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.
> > Signed-off-by: Shanker Donthineni > --- > Documentation/arm64/silicon-errata.txt | 1 + > arch/arm64/Kconfig | 10 ++ > arch/arm64/include/asm/assembler.h | 18 ++ > arch/arm64/include/asm/cpucaps.h | 3 ++- > arch/arm64/kernel/cpu-reset.S | 1 + > arch/arm64/kernel/cpu_errata.c | 16 > arch/arm64/kernel/efi-entry.S | 2 ++ > arch/arm64/kernel/head.S | 1 + > arch/arm64/kernel/relocate_kernel.S| 1 + > arch/arm64/kvm/hyp-init.S | 1 + > 10 files changed, 53 insertions(+), 1 deletion(-) > > diff --git a/Documentation/arm64/silicon-errata.txt > b/Documentation/arm64/silicon-errata.txt > index 66e8ce1..704770c0 100644 > --- a/Documentation/arm64/silicon-errata.txt > +++ b/Documentation/arm64/silicon-errata.txt > @@ -74,3 +74,4 @@ stable kernels. > | Qualcomm Tech. | Falkor v1 | E1003 | > QCOM_FALKOR_ERRATUM_1003| > | Qualcomm Tech. | Falkor v1 | E1009 | > QCOM_FALKOR_ERRATUM_1009| > | Qualcomm Tech. | QDF2400 ITS | E0065 | > QCOM_QDF2400_ERRATUM_0065 | > +| Qualcomm Tech. | Falkor v{1,2} | E1041 | > QCOM_FALKOR_ERRATUM_1041| > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig > index 0df64a6..8f73eac 100644 > --- a/arch/arm64/Kconfig > +++ b/arch/arm64/Kconfig > @@ -539,6 +539,16 @@ config QCOM_QDF2400_ERRATUM_0065 > > If unsure, say Y. > > +config QCOM_FALKOR_ERRATUM_E1041 > + bool "Falkor E1041: Speculative instruction fetches might cause errant > memory access" > + default y > + help > + Falkor CPU may speculatively fetch instructions from an improper > + memory location when MMU translation is changed from SCTLR_ELn[M]=1 > + to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem. > + > + If unsure, say Y. > + > endmenu > > > diff --git a/arch/arm64/include/asm/assembler.h > b/arch/arm64/include/asm/assembler.h > index d58a625..eb11cdf 100644 > --- a/arch/arm64/include/asm/assembler.h > +++ b/arch/arm64/include/asm/assembler.h > @@ -499,4 +499,22 @@ > #endif > .endm > > +/** > + * Errata workaround prior to disable MMU. 
Insert an ISB immediately prior > + * to executing the MSR that will change SCTLR_ELn[M] from a value of 1 to 0. > + */ > + .macro pre_disable_mmu_workaround > +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 > +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 > + isb > +alternative_else_nop_endif > +#endif > + .end > + > + .macro pre_disable_mmu_early_workaround > +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 > +
[PATCH v3 1/2] arm64: Define cputype macros for Falkor CPU
Add cputype definition macros for the Qualcomm Datacenter Technologies Falkor CPU in cputype.h. It's unfortunate that the first revision of the Falkor CPU used the wrong part number 0x800; this was fixed in the v2 chip with part number 0xC00, and the same value will be used for future revisions.

Signed-off-by: Shanker Donthineni
---
 arch/arm64/include/asm/cputype.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 235e77d..cbf08d7 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -91,6 +91,7 @@
 #define BRCM_CPU_PART_VULCAN 0x516
 #define QCOM_CPU_PART_FALKOR_V1 0x800
+#define QCOM_CPU_PART_FALKOR 0xC00
 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
@@ -99,6 +100,7 @@
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
 #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)
+#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR)

 #ifndef __ASSEMBLY__
--
[PATCH v3 2/2] arm64: Add software workaround for Falkor erratum 1041
The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and the next 4KB region.

When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors.

1) A System Error Interrupt (SEI) being raised by the Falkor core due to
   the errant memory access attempting to access a region of memory that
   is protected by a slave-side memory protection unit.
2) Unpredictable device behavior due to a speculative read from device
   memory. This behavior may only occur if the instruction cache is
   disabled prior to or coincident with translation being changed from
   enabled to disabled.

The conditions leading to this erratum will not occur when either of the following occur:
1) A higher exception level disables translation of a lower exception level
   (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0).
2) An exception level disables its stage-1 translation while its stage-2
   translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1
   to 0 when HCR_EL2[VM] has a value of 1).

To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.

Signed-off-by: Shanker Donthineni
---
Changes since v1: Apply the workaround where it's required.
Changes since v2: Repost the corrected patches.
Documentation/arm64/silicon-errata.txt | 1 + arch/arm64/Kconfig | 10 ++ arch/arm64/include/asm/assembler.h | 19 +++ arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/kernel/cpu-reset.S | 1 + arch/arm64/kernel/cpu_errata.c | 16 arch/arm64/kernel/efi-entry.S | 2 ++ arch/arm64/kernel/head.S | 1 + arch/arm64/kernel/relocate_kernel.S| 1 + arch/arm64/kvm/hyp-init.S | 1 + 10 files changed, 54 insertions(+), 1 deletion(-) diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt index 66e8ce1..704770c0 100644 --- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -74,3 +74,4 @@ stable kernels. | Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003| | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009| | Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 | +| Qualcomm Tech. | Falkor v{1,2} | E1041 | QCOM_FALKOR_ERRATUM_1041| diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 0df64a6..8f73eac 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -539,6 +539,16 @@ config QCOM_QDF2400_ERRATUM_0065 If unsure, say Y. +config QCOM_FALKOR_ERRATUM_E1041 + bool "Falkor E1041: Speculative instruction fetches might cause errant memory access" + default y + help + Falkor CPU may speculatively fetch instructions from an improper + memory location when MMU translation is changed from SCTLR_ELn[M]=1 + to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem. + + If unsure, say Y. + endmenu diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index d58a625..dd9cec5 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -30,6 +30,7 @@ #include #include #include +#include /* * Enable and disable interrupts. @@ -499,4 +500,22 @@ #endif .endm +/** + * Errata workaround prior to disable MMU. 
Insert an ISB immediately prior + * to executing the MSR that will change SCTLR_ELn[M] from a value of 1 to 0. + */ + .macro pre_disable_mmu_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 + isb +alternative_else_nop_endif +#endif + .endm + + .macro pre_disable_mmu_early_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 + isb +#endif + .endm + #endif /* __ASM_ASSEMBLER_H */ diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 8da6216..7f7a59d 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -40,7 +40,8 @@ #define ARM64_WORKAROUND_85892119 #define ARM64_WORKAROUND_CAVIUM_30115 20 #define ARM64_HAS_DCPOP21 +#define ARM64_WORKAROUND_QCOM_FALKOR_E1041 22 -#define ARM64_NCAPS
[PATCH v3 0/2] Implement a software workaround for Falkor erratum 1041
On the Falkor CPU, we've discovered a hardware issue which might lead to a kernel crash or unexpected behavior. The Falkor core may errantly access memory locations on speculative instruction fetches. This may happen whenever the MMU translation state (the SCTLR_ELn[M] bit) is being changed from enabled to disabled for the currently running exception level. To prevent the errant hardware behavior, software must execute an ISB immediately prior to executing the MSR that changes SCTLR_ELn[M] from a value of 1 to 0. To limit the complexity of the workaround, this patch series issues an ISB whenever SCTLR_ELn[M] is changed to 0, fixing Falkor erratum 1041.

Patch 2 from the v1 series was dropped to accommodate review comments. Apply the workaround where it's required. The wrong patches were posted in v2.

Patch1: CPUTYPE definitions for Falkor CPU.
Patch2: Actual workaround changes for erratum E1041.

Shanker Donthineni (2):
  arm64: Define cputype macros for Falkor CPU
  arm64: Add software workaround for Falkor erratum 1041

 Documentation/arm64/silicon-errata.txt | 1 +
 arch/arm64/Kconfig | 10 ++
 arch/arm64/include/asm/assembler.h | 18 ++
 arch/arm64/include/asm/cpucaps.h | 3 ++-
 arch/arm64/include/asm/cputype.h | 2 ++
 arch/arm64/kernel/cpu-reset.S | 1 +
 arch/arm64/kernel/cpu_errata.c | 16 
 arch/arm64/kernel/efi-entry.S | 2 ++
 arch/arm64/kernel/head.S | 1 +
 arch/arm64/kernel/relocate_kernel.S | 1 +
 arch/arm64/kvm/hyp-init.S | 1 +
 11 files changed, 55 insertions(+), 1 deletion(-)
[PATCH v2 0/2] Implement a software workaround for Falkor erratum 1041
On the Falkor CPU, we've discovered a hardware issue which might lead to a kernel crash or unexpected behavior. The Falkor core may errantly access memory locations on speculative instruction fetches. This may happen whenever the MMU translation state (the SCTLR_ELn[M] bit) is being changed from enabled to disabled for the currently running exception level. To prevent the errant hardware behavior, software must execute an ISB immediately prior to executing the MSR that changes SCTLR_ELn[M] from a value of 1 to 0. To limit the complexity of the workaround, this patch series issues an ISB whenever SCTLR_ELn[M] is changed to 0, fixing Falkor erratum 1041.

Patch 2 from the v1 series was dropped to accommodate review comments. Apply the workaround where it's required.

Patch1: CPUTYPE definitions for Falkor CPU.
Patch2: Actual workaround changes for erratum E1041.

Shanker Donthineni (2):
  arm64: Define cputype macros for Falkor CPU
  arm64: Add software workaround for Falkor erratum 1041

 Documentation/arm64/silicon-errata.txt | 1 +
 arch/arm64/Kconfig | 10 ++
 arch/arm64/include/asm/assembler.h | 18 ++
 arch/arm64/include/asm/cpucaps.h | 3 ++-
 arch/arm64/include/asm/cputype.h | 2 ++
 arch/arm64/kernel/cpu-reset.S | 1 +
 arch/arm64/kernel/cpu_errata.c | 16 
 arch/arm64/kernel/efi-entry.S | 2 ++
 arch/arm64/kernel/head.S | 1 +
 arch/arm64/kernel/relocate_kernel.S | 1 +
 arch/arm64/kvm/hyp-init.S | 1 +
 11 files changed, 55 insertions(+), 1 deletion(-)
[PATCH v2 2/2] arm64: Add software workaround for Falkor erratum 1041
The ARM architecture defines the memory locations that are permitted to be accessed as the result of a speculative instruction fetch from an exception level for which all stages of translation are disabled. Specifically, the core is permitted to speculatively fetch from the 4KB region containing the current program counter and the next 4KB region.

When translation is changed from enabled to disabled for the running exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the Falkor core may errantly speculatively access memory locations outside of the 4KB region permitted by the architecture. The errant memory access may lead to one of the following unexpected behaviors.

1) A System Error Interrupt (SEI) being raised by the Falkor core due to
   the errant memory access attempting to access a region of memory that
   is protected by a slave-side memory protection unit.
2) Unpredictable device behavior due to a speculative read from device
   memory. This behavior may only occur if the instruction cache is
   disabled prior to or coincident with translation being changed from
   enabled to disabled.

The conditions leading to this erratum will not occur when either of the following occur:
1) A higher exception level disables translation of a lower exception level
   (e.g. EL2 changing SCTLR_EL1[M] from a value of 1 to 0).
2) An exception level disables its stage-1 translation while its stage-2
   translation is enabled (e.g. EL1 changing SCTLR_EL1[M] from a value of 1
   to 0 when HCR_EL2[VM] has a value of 1).

To avoid the errant behavior, software must execute an ISB immediately prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.
Signed-off-by: Shanker Donthineni --- Documentation/arm64/silicon-errata.txt | 1 + arch/arm64/Kconfig | 10 ++ arch/arm64/include/asm/assembler.h | 18 ++ arch/arm64/include/asm/cpucaps.h | 3 ++- arch/arm64/kernel/cpu-reset.S | 1 + arch/arm64/kernel/cpu_errata.c | 16 arch/arm64/kernel/efi-entry.S | 2 ++ arch/arm64/kernel/head.S | 1 + arch/arm64/kernel/relocate_kernel.S| 1 + arch/arm64/kvm/hyp-init.S | 1 + 10 files changed, 53 insertions(+), 1 deletion(-) diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt index 66e8ce1..704770c0 100644 --- a/Documentation/arm64/silicon-errata.txt +++ b/Documentation/arm64/silicon-errata.txt @@ -74,3 +74,4 @@ stable kernels. | Qualcomm Tech. | Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003| | Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009| | Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 | +| Qualcomm Tech. | Falkor v{1,2} | E1041 | QCOM_FALKOR_ERRATUM_1041| diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 0df64a6..8f73eac 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -539,6 +539,16 @@ config QCOM_QDF2400_ERRATUM_0065 If unsure, say Y. +config QCOM_FALKOR_ERRATUM_E1041 + bool "Falkor E1041: Speculative instruction fetches might cause errant memory access" + default y + help + Falkor CPU may speculatively fetch instructions from an improper + memory location when MMU translation is changed from SCTLR_ELn[M]=1 + to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem. + + If unsure, say Y. + endmenu diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h index d58a625..eb11cdf 100644 --- a/arch/arm64/include/asm/assembler.h +++ b/arch/arm64/include/asm/assembler.h @@ -499,4 +499,22 @@ #endif .endm +/** + * Errata workaround prior to disable MMU. Insert an ISB immediately prior + * to executing the MSR that will change SCTLR_ELn[M] from a value of 1 to 0. 
+ */ + .macro pre_disable_mmu_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 + isb +alternative_else_nop_endif +#endif + .end + + .macro pre_disable_mmu_early_workaround +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_E1041 + isb +#endif + .end + #endif /* __ASM_ASSEMBLER_H */ diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h index 8da6216..7f7a59d 100644 --- a/arch/arm64/include/asm/cpucaps.h +++ b/arch/arm64/include/asm/cpucaps.h @@ -40,7 +40,8 @@ #define ARM64_WORKAROUND_85892119 #define ARM64_WORKAROUND_CAVIUM_30115 20 #define ARM64_HAS_DCPOP21 +#define ARM64_WORKAROUND_QCOM_FALKOR_E1041 22 -#define ARM64_NCAPS22 +#define ARM64_NCAPS23 #endif /* __ASM_CPUCAPS_H */ diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S index 65f42d2..2a752cb 100644 --- a/arch/arm64/kernel/c
[PATCH v2 1/2] arm64: Define cputype macros for Falkor CPU
Add cputype definition macros for the Qualcomm Datacenter Technologies Falkor CPU in cputype.h. It's unfortunate that the first revision of the Falkor CPU used the wrong part number 0x800; this was fixed in the v2 chip with part number 0xC00, and the same value will be used for future revisions.

Signed-off-by: Shanker Donthineni
---
 arch/arm64/include/asm/cputype.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 235e77d..cbf08d7 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -91,6 +91,7 @@
 #define BRCM_CPU_PART_VULCAN 0x516
 #define QCOM_CPU_PART_FALKOR_V1 0x800
+#define QCOM_CPU_PART_FALKOR 0xC00
 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
@@ -99,6 +100,7 @@
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
 #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)
+#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR)

 #ifndef __ASSEMBLY__
--
Re: [PATCH 3/3] arm64: Add software workaround for Falkor erratum 1041
Hi James, On 11/10/2017 04:24 AM, James Morse wrote: > Hi Shanker, > > On 09/11/17 15:22, Shanker Donthineni wrote: >> On 11/09/2017 05:08 AM, James Morse wrote: >>> On 04/11/17 21:43, Shanker Donthineni wrote: >>>> On 11/03/2017 10:11 AM, Robin Murphy wrote: >>>>> On 03/11/17 03:27, Shanker Donthineni wrote: >>>>>> The ARM architecture defines the memory locations that are permitted >>>>>> to be accessed as the result of a speculative instruction fetch from >>>>>> an exception level for which all stages of translation are disabled. >>>>>> Specifically, the core is permitted to speculatively fetch from the >>>>>> 4KB region containing the current program counter and next 4KB. >>>>>> >>>>>> When translation is changed from enabled to disabled for the running >>>>>> exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the >>>>>> Falkor core may errantly speculatively access memory locations outside >>>>>> of the 4KB region permitted by the architecture. The errant memory >>>>>> access may lead to one of the following unexpected behaviors. >>>>>> >>>>>> 1) A System Error Interrupt (SEI) being raised by the Falkor core due >>>>>>to the errant memory access attempting to access a region of memory >>>>>>that is protected by a slave-side memory protection unit. >>>>>> 2) Unpredictable device behavior due to a speculative read from device >>>>>>memory. This behavior may only occur if the instruction cache is >>>>>>disabled prior to or coincident with translation being changed from >>>>>>enabled to disabled. >>>>>> >>>>>> To avoid the errant behavior, software must execute an ISB immediately >>>>>> prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. > >>>>>> diff --git a/arch/arm64/include/asm/assembler.h >>>>>> b/arch/arm64/include/asm/assembler.h >>>>>> index b6dfb4f..4c91efb 100644 >>>>>> --- a/arch/arm64/include/asm/assembler.h >>>>>> +++ b/arch/arm64/include/asm/assembler.h >>>>>> @@ -514,6 +515,22 @@ >>>>>> * reg: the value to be written. 
>>>>>> */ >>>>>> .macro write_sctlr, eln, reg >>>>>> +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041 >>>>>> +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 >>>>>> +tbnz\reg, #0, 8000f // enable MMU? >>> >>> Won't this match any change that leaves the MMU enabled? >> >> Yes. No need to apply workaround if the MMU is going to be enabled. > > (Sorry, looks like I had this upside down) > > My badly-made-point is you can't know if the MMU is being disabled unless you > have both the old and new values. > > As an example, in el2_setup, (where the MMU is disabled), we set the EE/E0E > bits > to match the kernel's endianness. Won't your macro will insert an unnecessary > isb? Is this needed for the errata workaround? > Yes, It's not required in this case. I'll post a v2 patch and apply the workaround where it's absolutely required. Seems handling a workaround inside helper macros causing confusion. > >>> I think the macro is making this more confusing. Disabling the MMU is >>> obvious >>> from the call-site, (and really rare!). Trying to work it out from a macro >>> makes >>> it more complicated than necessary. > >> Not clear, are you suggesting not to use read{write}_sctlr() macros instead >> apply >> the workaround from the call-site based on the MMU-on status? > > Yes. This is the only way to patch only the locations that turn the MMU off. > > >> If yes, It simplifies >> the code logic but CONFIG_QCOM_FALKOR_ERRATUM_1041 references are scatter >> everywhere. > > Wouldn't they only appear in the places that are affected by the errata? > This is exactly what we want, anyone touching that code now knows they need to > double check this behaviour, (and ask you to test it!). > > Otherwise we have a macro second guessing what is happening, if its not quite > right (because some information has been lost), we're now not sure what we > need > to do if we ever refactor any of this code. > > [...] > >>>> I'll prefer alternatives >>>> just to avoid the unnecessar
Re: [PATCH 3/3] arm64: Add software workaround for Falkor erratum 1041
Hi James, On 11/09/2017 05:08 AM, James Morse wrote: > Hi Shanker, Robin, > > On 04/11/17 21:43, Shanker Donthineni wrote: >> On 11/03/2017 10:11 AM, Robin Murphy wrote: >>> On 03/11/17 03:27, Shanker Donthineni wrote: >>>> The ARM architecture defines the memory locations that are permitted >>>> to be accessed as the result of a speculative instruction fetch from >>>> an exception level for which all stages of translation are disabled. >>>> Specifically, the core is permitted to speculatively fetch from the >>>> 4KB region containing the current program counter and next 4KB. >>>> >>>> When translation is changed from enabled to disabled for the running >>>> exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the >>>> Falkor core may errantly speculatively access memory locations outside >>>> of the 4KB region permitted by the architecture. The errant memory >>>> access may lead to one of the following unexpected behaviors. >>>> >>>> 1) A System Error Interrupt (SEI) being raised by the Falkor core due >>>>to the errant memory access attempting to access a region of memory >>>>that is protected by a slave-side memory protection unit. >>>> 2) Unpredictable device behavior due to a speculative read from device >>>>memory. This behavior may only occur if the instruction cache is >>>>disabled prior to or coincident with translation being changed from >>>>enabled to disabled. >>>> >>>> To avoid the errant behavior, software must execute an ISB immediately >>>> prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. > > >>>> diff --git a/arch/arm64/include/asm/assembler.h >>>> b/arch/arm64/include/asm/assembler.h >>>> index b6dfb4f..4c91efb 100644 >>>> --- a/arch/arm64/include/asm/assembler.h >>>> +++ b/arch/arm64/include/asm/assembler.h >>>> @@ -30,6 +30,7 @@ >>>> #include >>>> #include >>>> #include >>>> +#include >>>> >>>> /* >>>> * Enable and disable interrupts. >>>> @@ -514,6 +515,22 @@ >>>> * reg: the value to be written. 
>>>> */ >>>>.macro write_sctlr, eln, reg >>>> +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041 >>>> +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 >>>> + tbnz\reg, #0, 8000f // enable MMU? > > Won't this match any change that leaves the MMU enabled? > Yes. No need to apply workaround if the MMU is going to be enabled. > I think the macro is making this more confusing. Disabling the MMU is obvious > from the call-site, (and really rare!). Trying to work it out from a macro > makes > it more complicated than necessary. > Not clear, are you suggesting not to use read{write}_sctlr() macros instead apply the workaround from the call-site based on the MMU-on status? If yes, It simplifies the code logic but CONFIG_QCOM_FALKOR_ERRATUM_1041 references are scatter everywhere. > >>> Do we really need the branch here? It's not like enabling the MMU is >>> something we do on the syscall fastpath, and I can't imagine an extra >>> ISB hurts much (and is probably comparable to a mispredicted branch > >> I don't have any strong opinion on whether to use an ISB conditionally >> or unconditionally. Yes, the current kernel code is not touching >> SCTLR_ELn register on the system call fast path. I would like to keep >> it as a conditional ISB in case if the future kernel accesses the >> SCTLR_ELn on the fast path. An extra ISB should not hurt a lot but I >> believe it has more overhead than the TBZ+branch mis-prediction on Falkor >> CPU. This patch has been tested on the real hardware to fix the problem. > >> I'm open to change to an unconditional ISB if it's the better fix. >> >>> anyway). In fact, is there any noticeable hit on other >>> microarchitectures if we save the alternative bother and just do it >>> unconditionally always? >>> >> >> I can't comment on the performance impacts of other CPUs since I don't >> have access to their development platforms. 
I'll prefer alternatives >> just to avoid the unnecessary overhead on future Qualcomm Datacenter >> server CPUs and regression on other CPUs because of inserting an ISB > > I think hiding errata on other CPUs is a good argument. > > My suggestion would be: >> #ifdef CONFIG_QCOM_FALK
Re: [PATCH 3/3] arm64: Add software workaround for Falkor erratum 1041
Hi Robin, Thanks for your review comments. On 11/03/2017 10:11 AM, Robin Murphy wrote: > On 03/11/17 03:27, Shanker Donthineni wrote: >> The ARM architecture defines the memory locations that are permitted >> to be accessed as the result of a speculative instruction fetch from >> an exception level for which all stages of translation are disabled. >> Specifically, the core is permitted to speculatively fetch from the >> 4KB region containing the current program counter and next 4KB. >> >> When translation is changed from enabled to disabled for the running >> exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the >> Falkor core may errantly speculatively access memory locations outside >> of the 4KB region permitted by the architecture. The errant memory >> access may lead to one of the following unexpected behaviors. >> >> 1) A System Error Interrupt (SEI) being raised by the Falkor core due >>to the errant memory access attempting to access a region of memory >>that is protected by a slave-side memory protection unit. >> 2) Unpredictable device behavior due to a speculative read from device >>memory. This behavior may only occur if the instruction cache is >>disabled prior to or coincident with translation being changed from >>enabled to disabled. >> >> To avoid the errant behavior, software must execute an ISB immediately >> prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0. 
>> >> Signed-off-by: Shanker Donthineni >> --- >> Documentation/arm64/silicon-errata.txt | 1 + >> arch/arm64/Kconfig | 10 ++ >> arch/arm64/include/asm/assembler.h | 17 + >> arch/arm64/include/asm/cpucaps.h | 3 ++- >> arch/arm64/kernel/cpu_errata.c | 16 >> arch/arm64/kernel/efi-entry.S | 4 ++-- >> arch/arm64/kernel/head.S | 4 ++-- >> 7 files changed, 50 insertions(+), 5 deletions(-) >> >> diff --git a/Documentation/arm64/silicon-errata.txt >> b/Documentation/arm64/silicon-errata.txt >> index 66e8ce1..704770c0 100644 >> --- a/Documentation/arm64/silicon-errata.txt >> +++ b/Documentation/arm64/silicon-errata.txt >> @@ -74,3 +74,4 @@ stable kernels. >> | Qualcomm Tech. | Falkor v1 | E1003 | >> QCOM_FALKOR_ERRATUM_1003| >> | Qualcomm Tech. | Falkor v1 | E1009 | >> QCOM_FALKOR_ERRATUM_1009| >> | Qualcomm Tech. | QDF2400 ITS | E0065 | >> QCOM_QDF2400_ERRATUM_0065 | >> +| Qualcomm Tech. | Falkor v{1,2} | E1041 | >> QCOM_FALKOR_ERRATUM_1041| >> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig >> index 0df64a6..7e933fb 100644 >> --- a/arch/arm64/Kconfig >> +++ b/arch/arm64/Kconfig >> @@ -539,6 +539,16 @@ config QCOM_QDF2400_ERRATUM_0065 >> >>If unsure, say Y. >> >> +config QCOM_FALKOR_ERRATUM_1041 >> +bool "Falkor E1041: Speculative instruction fetches might cause errant >> memory access" >> +default y >> +help >> + Falkor CPU may speculatively fetch instructions from an improper >> + memory location when MMU translation is changed from SCTLR_ELn[M]=1 >> + to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem. >> + >> + If unsure, say Y. >> + >> endmenu >> >> >> diff --git a/arch/arm64/include/asm/assembler.h >> b/arch/arm64/include/asm/assembler.h >> index b6dfb4f..4c91efb 100644 >> --- a/arch/arm64/include/asm/assembler.h >> +++ b/arch/arm64/include/asm/assembler.h >> @@ -30,6 +30,7 @@ >> #include >> #include >> #include >> +#include >> >> /* >> * Enable and disable interrupts. >> @@ -514,6 +515,22 @@ >> * reg: the value to be written. 
>> */ >> .macro write_sctlr, eln, reg >> +#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041 >> +alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041 >> +tbnz\reg, #0, 8000f // enable MMU? > > Do we really need the branch here? It's not like enabling the MMU is > something we do on the syscall fastpath, and I can't imagine an extra > ISB hurts much (and is probably comparable to a mispredicted branch I don't have any strong opinion on whether to use an ISB conditionally or unconditionally. Yes, the current kernel code is not touching SCTLR_ELn register on the system call fast path. I would like to keep it as a conditional ISB in case if the future kernel accesses the SCTLR_ELn on the fast path. An
[PATCH 3/3] arm64: Add software workaround for Falkor erratum 1041
The ARM architecture defines the memory locations that are permitted
to be accessed as the result of a speculative instruction fetch from
an exception level for which all stages of translation are disabled.
Specifically, the core is permitted to speculatively fetch from the
4KB region containing the current program counter and next 4KB.

When translation is changed from enabled to disabled for the running
exception level (SCTLR_ELn[M] changed from a value of 1 to 0), the
Falkor core may errantly speculatively access memory locations outside
of the 4KB region permitted by the architecture. The errant memory
access may lead to one of the following unexpected behaviors.

1) A System Error Interrupt (SEI) being raised by the Falkor core due
   to the errant memory access attempting to access a region of memory
   that is protected by a slave-side memory protection unit.
2) Unpredictable device behavior due to a speculative read from device
   memory. This behavior may only occur if the instruction cache is
   disabled prior to or coincident with translation being changed from
   enabled to disabled.

To avoid the errant behavior, software must execute an ISB immediately
prior to executing the MSR that will change SCTLR_ELn[M] from 1 to 0.

Signed-off-by: Shanker Donthineni
---
 Documentation/arm64/silicon-errata.txt |  1 +
 arch/arm64/Kconfig                     | 10 ++
 arch/arm64/include/asm/assembler.h     | 17 +
 arch/arm64/include/asm/cpucaps.h       |  3 ++-
 arch/arm64/kernel/cpu_errata.c         | 16
 arch/arm64/kernel/efi-entry.S          |  4 ++--
 arch/arm64/kernel/head.S               |  4 ++--
 7 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
index 66e8ce1..704770c0 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -74,3 +74,4 @@ stable kernels.
 | Qualcomm Tech. | Falkor v1       | E1003 | QCOM_FALKOR_ERRATUM_1003  |
 | Qualcomm Tech. | Falkor v1       | E1009 | QCOM_FALKOR_ERRATUM_1009  |
 | Qualcomm Tech. | QDF2400 ITS     | E0065 | QCOM_QDF2400_ERRATUM_0065 |
+| Qualcomm Tech. | Falkor v{1,2}   | E1041 | QCOM_FALKOR_ERRATUM_1041  |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0df64a6..7e933fb 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -539,6 +539,16 @@ config QCOM_QDF2400_ERRATUM_0065

 	  If unsure, say Y.

+config QCOM_FALKOR_ERRATUM_1041
+	bool "Falkor E1041: Speculative instruction fetches might cause errant memory access"
+	default y
+	help
+	  Falkor CPU may speculatively fetch instructions from an improper
+	  memory location when MMU translation is changed from SCTLR_ELn[M]=1
+	  to SCTLR_ELn[M]=0. Prefix an ISB instruction to fix the problem.
+
+	  If unsure, say Y.
+
 endmenu

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index b6dfb4f..4c91efb 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include

 /*
  * Enable and disable interrupts.
@@ -514,6 +515,22 @@
  * reg: the value to be written.
  */
 	.macro write_sctlr, eln, reg
+#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041
+alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1041
+	tbnz	\reg, #0, 8000f		// enable MMU?
+	isb
+8000:
+alternative_else_nop_endif
+#endif
+	msr	sctlr_\eln, \reg
+	.endm
+
+	.macro early_write_sctlr, eln, reg
+#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041
+	tbnz	\reg, #0, 8000f		// enable MMU?
+	isb
+8000:
+#endif
 	msr	sctlr_\eln, \reg
 	.endm
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 8da6216..7f7a59d 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -40,7 +40,8 @@
 #define ARM64_WORKAROUND_858921			19
 #define ARM64_WORKAROUND_CAVIUM_30115		20
 #define ARM64_HAS_DCPOP				21
+#define ARM64_WORKAROUND_QCOM_FALKOR_E1041	22
-#define ARM64_NCAPS				22
+#define ARM64_NCAPS				23

 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 0e27f86..27f9a45 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -179,6 +179,22 @@ static int cpu_enable_trap_ctr_access(void *__unused)
 			   MIDR_CPU_VAR_REV(0, 0)),
 	},
 #endif
+#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1041
+	{
+		.desc = "Qualcomm Technologies Falkor erratum 1041",
+		.capability = ARM64_WORKAROUND_QCOM_FALKOR_E1041,
+		MIDR_RANGE
[PATCH 0/3] Implement a software workaround for Falkor erratum 1041
On the Falkor CPU, we've discovered a hardware issue which might lead to a
kernel crash or unexpected behavior. The Falkor core may errantly access
memory locations on speculative instruction fetches. This may happen
whenever the MMU translation state (the SCTLR_ELn[M] bit) is changed from
enabled to disabled for the currently running exception level. To prevent
the errant hardware behavior, software must execute an ISB immediately
prior to executing the MSR that changes SCTLR_ELn[M] from a value of 1
to 0. To keep the workaround simple, this patch series issues an ISB
whenever SCTLR_ELn[M] is changed to 0 to fix Falkor erratum 1041.

Patch 1:
  - CPUTYPE definitions for the Falkor CPU.
Patch 2:
  - Define two ASM helper macros to read/write the SCTLR_ELn register.
Patch 3:
  - Actual workaround changes for erratum E1041.

Shanker Donthineni (3):
  arm64: Define cputype macros for Falkor CPU
  arm64: Prepare SCTLR_ELn accesses to handle Falkor erratum 1041
  arm64: Add software workaround for Falkor erratum 1041

 Documentation/arm64/silicon-errata.txt |  1 +
 arch/arm64/Kconfig                     | 10 ++
 arch/arm64/include/asm/assembler.h     | 35 ++
 arch/arm64/include/asm/cpucaps.h       |  3 ++-
 arch/arm64/include/asm/cputype.h       |  2 ++
 arch/arm64/kernel/cpu-reset.S          |  4 ++--
 arch/arm64/kernel/cpu_errata.c         | 16
 arch/arm64/kernel/efi-entry.S          |  8
 arch/arm64/kernel/head.S               | 18 -
 arch/arm64/kernel/relocate_kernel.S    |  4 ++--
 arch/arm64/kvm/hyp-init.S              |  6 +++---
 arch/arm64/mm/proc.S                   |  6 +++---
 12 files changed, 89 insertions(+), 24 deletions(-)

--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH 1/3] arm64: Define cputype macros for Falkor CPU
Add cputype definition macros for the Qualcomm Datacenter Technologies
Falkor CPU in cputype.h. It's unfortunate that the first revision of the
Falkor CPU used the wrong part number 0x800; this was fixed in the v2 chip,
which uses part number 0xC00, and the same value will be used for future
revisions.

Signed-off-by: Shanker Donthineni
Signed-off-by: Neil Leeder
---
 arch/arm64/include/asm/cputype.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 235e77d..cbf08d7 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -91,6 +91,7 @@
 #define BRCM_CPU_PART_VULCAN		0x516

 #define QCOM_CPU_PART_FALKOR_V1		0x800
+#define QCOM_CPU_PART_FALKOR		0xC00

 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
@@ -99,6 +100,7 @@
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
 #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)
+#define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR)

 #ifndef __ASSEMBLY__
--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH 2/3] arm64: Prepare SCTLR_ELn accesses to handle Falkor erratum 1041
This patch introduces two helper macros, read_sctlr and write_sctlr, to
access the system register SCTLR_ELn. Replace all MSR/MRS references to
sctlr_el1{el2} with the macros. This should cause no behavioral change.

Signed-off-by: Shanker Donthineni
---
 arch/arm64/include/asm/assembler.h  | 18 ++
 arch/arm64/kernel/cpu-reset.S       |  4 ++--
 arch/arm64/kernel/efi-entry.S       |  8
 arch/arm64/kernel/head.S            | 18 +-
 arch/arm64/kernel/relocate_kernel.S |  4 ++--
 arch/arm64/kvm/hyp-init.S           |  6 +++---
 arch/arm64/mm/proc.S                |  6 +++---
 7 files changed, 41 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d58a625..b6dfb4f 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -499,4 +499,22 @@
 #endif
 	.endm

+/**
+ * Read value of the system control register SCTLR_ELn.
+ * eln: which system control register.
+ * reg: contents of the SCTLR_ELn.
+ */
+	.macro read_sctlr, eln, reg
+	mrs	\reg, sctlr_\eln
+	.endm
+
+/**
+ * Write the value to the system control register SCTLR_ELn.
+ * eln: which system control register.
+ * reg: the value to be written.
+ */
+	.macro write_sctlr, eln, reg
+	msr	sctlr_\eln, \reg
+	.endm
+
 #endif /* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 65f42d2..9224abd 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -34,10 +34,10 @@
  */
ENTRY(__cpu_soft_restart)
 	/* Clear sctlr_el1 flags. */
-	mrs	x12, sctlr_el1
+	read_sctlr el1, x12
 	ldr	x13, =SCTLR_ELx_FLAGS
 	bic	x12, x12, x13
-	msr	sctlr_el1, x12
+	write_sctlr el1, x12
 	isb
 	cbz	x0, 1f				// el2_switch?
diff --git a/arch/arm64/kernel/efi-entry.S b/arch/arm64/kernel/efi-entry.S
index 4e6ad35..acae627 100644
--- a/arch/arm64/kernel/efi-entry.S
+++ b/arch/arm64/kernel/efi-entry.S
@@ -93,17 +93,17 @@ ENTRY(entry)
 	mrs	x0, CurrentEL
 	cmp	x0, #CurrentEL_EL2
 	b.ne	1f
-	mrs	x0, sctlr_el2
+	read_sctlr el2, x0
 	bic	x0, x0, #1 << 0			// clear SCTLR.M
 	bic	x0, x0, #1 << 2			// clear SCTLR.C
-	msr	sctlr_el2, x0
+	write_sctlr el2, x0
 	isb
 	b	2f
1:
-	mrs	x0, sctlr_el1
+	read_sctlr el1, x0
 	bic	x0, x0, #1 << 0			// clear SCTLR.M
 	bic	x0, x0, #1 << 2			// clear SCTLR.C
-	msr	sctlr_el1, x0
+	write_sctlr el1, x0
 	isb
2:
 	/* Jump to kernel entry point */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 0b243ec..b8d5b73 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -388,18 +388,18 @@ ENTRY(el2_setup)
 	mrs	x0, CurrentEL
 	cmp	x0, #CurrentEL_EL2
 	b.eq	1f
-	mrs	x0, sctlr_el1
+	read_sctlr el1, x0
CPU_BE(	orr	x0, x0, #(3 << 24)	)	// Set the EE and E0E bits for EL1
CPU_LE(	bic	x0, x0, #(3 << 24)	)	// Clear the EE and E0E bits for EL1
-	msr	sctlr_el1, x0
+	write_sctlr el1, x0
 	mov	w0, #BOOT_CPU_MODE_EL1		// This cpu booted in EL1
 	isb
 	ret

-1:	mrs	x0, sctlr_el2
+1:	read_sctlr el2, x0
CPU_BE(	orr	x0, x0, #(1 << 25)	)	// Set the EE bit for EL2
CPU_LE(	bic	x0, x0, #(1 << 25)	)	// Clear the EE bit for EL2
-	msr	sctlr_el2, x0
+	write_sctlr el2, x0

 #ifdef CONFIG_ARM64_VHE
 	/*
@@ -511,7 +511,7 @@ install_el2_stub:
 	mov	x0, #0x0800			// Set/clear RES{1,0} bits
CPU_BE(	movk	x0, #0x33d0, lsl #16	)	// Set EE and E0E on BE systems
CPU_LE(	movk	x0, #0x30d0, lsl #16	)	// Clear EE and E0E on LE systems
-	msr	sctlr_el1, x0
+	write_sctlr el1, x0

 	/* Coprocessor traps. */
 	mov	x0, #0x33ff
@@ -664,7 +664,7 @@ ENTRY(__enable_mmu)
 	msr	ttbr0_el1, x1			// load TTBR0
 	msr	ttbr1_el1, x2			// load TTBR1
 	isb
-	msr	sctlr_el1, x0
+	write_sctlr el1, x0
 	isb
 	/*
 	 * Invalidate the local I-cache so that any instructions fetched
@@ -716,7 +716,7 @@ ENDPROC(__relocate_kernel)
__primary_switch:
 #ifdef CONFIG_RANDOMIZE_BASE
 	mov	x19, x0				// preserve new SCTLR_EL1 value
-	mrs	x20, sctlr_el1			// preserve old SCTLR_EL1 value
+	read_sctlr el1, x20			// preserve old SCTLR_EL1 value
 #endif

 	bl	__enable_mmu
@@ -732,14 +732,14 @@ __primary_switch:
 	 * to take into acco
Re: [PATCH v4 00/26] KVM/ARM: Add support for GICv4
Hi Marc,

I've tested this patch series on the QDF2400 server platform using an NVME
card. The basic functionality works fine, and the log messages below show
around 70 interrupts delivered directly to the vCPU.

Tested-by: Shanker Donthineni

From guest kernel:
/mnt # cat /proc/interrupts | grep ITS
 51:  83  ITS-MSI 32768 Edge  nvme0q0, nvme0q1
 52:   0  ITS-MSI 16384 Edge  virtio0-config
 53:   0  ITS-MSI 16385 Edge  virtio0-input.0
 54:   0  ITS-MSI 16386 Edge  virtio0-output.0

From host kernel:
/mnt # cat /proc/interrupts | grep GICv4
388:   9  GICv4-vpe 0 Edge  vcpu

--
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH] arm64: KVM: Reject non-compliant HVC calls from guest kernel
SMC/HVC instructions with a non-zero immediate value are not compliant
according to the 'SMC Calling Convention' system software document. Add a
validation check in handle_hvc() to reject such malicious HVC calls from a
VM, and inject an undefined instruction exception for those calls.

http://infocenter.arm.com/help/topic/com.arm.doc.den0028b/ARM_DEN0028B_SMC_Calling_Convention.pdf

Signed-off-by: Shanker Donthineni
---
 arch/arm64/include/asm/esr.h |  4
 arch/arm64/kvm/handle_exit.c | 12 +++-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 8cabd57..fa988e5 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -107,6 +107,9 @@
 #define ESR_ELx_AR 		(UL(1) << 14)
 #define ESR_ELx_CM 		(UL(1) << 8)

+/* ISS field definitions for HVC/SVC instruction execution traps */
+#define ESR_HVC_IMMEDIATE(esr)	((esr) & 0xffff)
+
 /* ISS field definitions for exceptions taken in to Hyp */
 #define ESR_ELx_CV		(UL(1) << 24)
 #define ESR_ELx_COND_SHIFT	(20)
@@ -114,6 +117,7 @@
 #define ESR_ELx_WFx_ISS_WFE	(UL(1) << 0)
 #define ESR_ELx_xVC_IMM_MASK	((1UL << 16) - 1)

+
 /* ESR value templates for specific events */

 /* BRK instruction trap from AArch64 state */
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 17d8a16..a900dcd 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -42,13 +42,15 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 			    kvm_vcpu_hvc_get_imm(vcpu));
 	vcpu->stat.hvc_exit_stat++;

-	ret = kvm_psci_call(vcpu);
-	if (ret < 0) {
-		kvm_inject_undefined(vcpu);
-		return 1;
+	/* HVC immediate value must be zero for all compliant calls */
+	if (!ESR_HVC_IMMEDIATE(kvm_vcpu_get_hsr(vcpu))) {
+		ret = kvm_psci_call(vcpu);
+		if (ret >= 0)
+			return ret;
 	}

-	return ret;
+	kvm_inject_undefined(vcpu);
+	return 1;
 }

 static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH] KVM: arm/arm64: Fix bug in advertising KVM_CAP_MSI_DEVID capability
Commit 0e4e82f154e3 ("KVM: arm64: vgic-its: Enable ITS emulation as a
virtual MSI controller") tried to advertise KVM_CAP_MSI_DEVID, but the
code was not updating the dist->msis_require_devid field correctly. If
the hypervisor tool creates the ITS device after VGIC initialization,
then we don't advertise the KVM_CAP_MSI_DEVID capability. Set
msis_require_devid to true inside vgic_its_create() to fix the issue.

Fixes: 0e4e82f154e3 ("vgic-its: Enable ITS emulation as a virtual MSI controller")
Signed-off-by: Shanker Donthineni
---
 virt/kvm/arm/vgic/vgic-init.c | 3 ---
 virt/kvm/arm/vgic/vgic-its.c  | 1 +
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 3a0b899..5801261 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -285,9 +285,6 @@ int vgic_init(struct kvm *kvm)
 	if (ret)
 		goto out;

-	if (vgic_has_its(kvm))
-		dist->msis_require_devid = true;
-
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_vgic_vcpu_enable(vcpu);
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 2dff288..aa6b68d 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1598,6 +1598,7 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
 	INIT_LIST_HEAD(&its->device_list);
 	INIT_LIST_HEAD(&its->collection_list);

+	dev->kvm->arch.vgic.msis_require_devid = true;
 	dev->kvm->arch.vgic.has_its = true;
 	its->enabled = false;
 	its->dev = dev;
--
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v2 38/52] KVM: arm/arm64: GICv4: Wire init/teardown of per-VM support
Hi Marc, On 06/28/2017 10:03 AM, Marc Zyngier wrote: > Should the HW support GICv4 and an ITS being associated with this > VM, let's init the its_vm and its_vpe structures. > > Signed-off-by: Marc Zyngier > --- > virt/kvm/arm/vgic/vgic-init.c | 11 ++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c > index 3a0b8999f011..0de1f0d986d4 100644 > --- a/virt/kvm/arm/vgic/vgic-init.c > +++ b/virt/kvm/arm/vgic/vgic-init.c > @@ -285,8 +285,14 @@ int vgic_init(struct kvm *kvm) > if (ret) > goto out; > > - if (vgic_has_its(kvm)) > + if (vgic_has_its(kvm)) { > dist->msis_require_devid = true; > + if (kvm_vgic_global_state.has_gicv4) { > + ret = vgic_v4_init(kvm); > + if (ret) > + goto out; > + } This is not quite right, ITS virtual device may not be initialized at the time of calling vgic-init(). This change breaks the existing KVM functionality with QEMU hypervisor tool. In later patches, code assumes vgic_v4_init(kvm) was called when vgic_has_its(kvm) returns a true value. The right change would be move this logic to inside vgic_its_create() something like this. 
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -285,14 +285,8 @@ int vgic_init(struct kvm *kvm)
 	if (ret)
 		goto out;

-	if (vgic_has_its(kvm)) {
+	if (vgic_has_its(kvm))
 		dist->msis_require_devid = true;
-		if (kvm_vgic_global_state.has_gicv4) {
-			ret = vgic_v4_init(kvm);
-			if (ret)
-				goto out;
-		}
-	}

 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_vgic_vcpu_enable(vcpu);
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1637,6 +1637,7 @@ static int vgic_register_its_iodev(struct kvm *kvm, struct
 static int vgic_its_create(struct kvm_device *dev, u32 type)
 {
 	struct vgic_its *its;
+	int ret;

 	if (type != KVM_DEV_TYPE_ARM_VGIC_ITS)
 		return -ENODEV;
@@ -1657,6 +1658,12 @@ static int vgic_its_create(struct kvm_device *dev, u32 ty
 	its->enabled = false;
 	its->dev = dev;

+	if (kvm_vgic_global_state.has_gicv4) {
+		ret = vgic_v4_init(dev->kvm);
+		if (ret)
+			return -ENOMEM;
+	}
+

> + }
> >
> > kvm_for_each_vcpu(i, vcpu, kvm)
> > kvm_vgic_vcpu_enable(vcpu);
> @@ -323,6 +329,9 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
> >
> > kfree(dist->spis);
> > dist->nr_spis = 0;
> +
> + if (kvm_vgic_global_state.has_gicv4 && vgic_has_its(kvm))
> + vgic_v4_teardown(kvm);
> }
> >
> void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>

--
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code
Aurora Forum, a Linux Foundation Collaborative Project.
RE: [PATCH v2 00/52] irqchip: KVM: Add support for GICv4
Hi Marc, I've verified the basic GICv4 functionality with v2 series + Eric's IRQ bypass patches on QDF2400 platform with a minor change in vgic-init.c successfully. Nice, I don't see any deadlock or catastrophic issues running on QCOM hardware. You can add my tested-by, I'll provide comments after reviewing giant v2 series. Tested-by: Shanker Donthineni -Original Message- From: linux-arm-kernel [mailto:linux-arm-kernel-boun...@lists.infradead.org] On Behalf Of Marc Zyngier Sent: Wednesday, June 28, 2017 10:03 AM To: linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu Cc: Mark Rutland ; Jason Cooper ; Eric Auger ; Christoffer Dall ; Thomas Gleixner ; Shanker Donthineni Subject: [PATCH v2 00/52] irqchip: KVM: Add support for GICv4 Yes, it's been a long time coming, but I really wasn't looking forward to picking this up again. Anyway... This (monster of a) series implements full support for GICv4, bringing direct injection of MSIs to KVM on arm and arm64, assuming you have the right hardware (which is quite unlikely). To get an idea of the design, I'd recommend you start with patch #32, which tries to shed some light on the approach that I've taken. And before that, please digest some of the GICv3/GICv4 architecture documentation[1] (less than 800 pages!). Once you feel reasonably insane, you'll be in the right mood to read the code. The structure of the series is fairly simple. The initial 34 patches add some generic support for GICv4, while the rest of the code plugs KVM into it. This series relies on Eric Auger's irq-bypass series[2], which is a prerequisite for this work. The stack has been *very lightly* tested on an arm64 model, with a PCI virtio block device passed from the host to a guest (using kvmtool and Jean-Philippe Brucker's excellent VFIO support patches[3]). 
As it has never seen any HW, I expect things to be subtly broken, so go forward and test if you can, though I'm mostly interested in people reviewing the code at the moment. I've pushed out a branch based on 4.12-rc6 containing the dependencies (as well as a couple of debug patches): git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git kvm-arm64/gicv4-kvm * From v1: - The bulk of the 30-something initial patches have seen countless bugs being fixed, and some key data structures have been subtly tweaked (or killed altogether). They are still quite similar to what I had in v1 though. - The whole KVM code is brand new and as I said above, only lightly tested. - Collected a bunch of R-bs from Thomas and Eric (many thanks, guys). [1] https://static.docs.arm.com/ihi0069/c/IHI0069C_gic_architecture_specification.pdf [2] http://www.spinics.net/lists/kvm/msg151463.html [3] http://www.spinics.net/lists/kvm/msg151823.html Marc Zyngier (52): genirq: Let irq_set_vcpu_affinity() iterate over hierarchy irqchip/gic-v3: Add redistributor iterator irqchip/gic-v3: Add VLPI/DirectLPI discovery irqchip/gic-v3-its: Move LPI definitions around irqchip/gic-v3-its: Add probing for VLPI properties irqchip/gic-v3-its: Macro-ize its_send_single_command irqchip/gic-v3-its: Implement irq_set_irqchip_state for pending state irqchip/gic-v3-its: Split out property table allocation irqchip/gic-v3-its: Allow use of indirect VCPU tables irqchip/gic-v3-its: Split out pending table allocation irqchip/gic-v3-its: Rework LPI freeing irqchip/gic-v3-its: Generalize device table allocation irqchip/gic-v3-its: Generalize LPI configuration irqchip/gic-v4: Add management structure definitions irqchip/gic-v3-its: Add GICv4 ITS command definitions irqchip/gic-v3-its: Add VLPI configuration hook irqchip/gic-v3-its: Add VLPI map/unmap operations irqchip/gic-v3-its: Add VLPI configuration handling irqchip/gic-v3-its: Add VPE domain infrastructure irqchip/gic-v3-its: Add VPE irq domain 
allocation/teardown irqchip/gic-v3-its: Add VPE irq domain [de]activation irqchip/gic-v3-its: Add VPENDBASER/VPROPBASER accessors irqchip/gic-v3-its: Add VPE scheduling irqchip/gic-v3-its: Add VPE invalidation hook irqchip/gic-v3-its: Add VPE affinity changes irqchip/gic-v3-its: Add VPE interrupt masking irqchip/gic-v3-its: Support VPE doorbell invalidation even when !DirectLPI irqchip/gic-v3-its: Set implementation defined bit to enable VLPIs irqchip/gic-v4: Add per-VM VPE domain creation irqchip/gic-v4: Add VPE command interface irqchip/gic-v4: Add VLPI configuration interface irqchip/gic-v4: Add some basic documentation irqchip/gic-v4: Enable low-level GICv4 operations irqchip/gic-v3: Advertise GICv4 support to KVM KVM: arm/arm64: vgic: Move kvm_vgic_destroy call around KVM: arm/arm64: vITS: Add MSI translation helpers KVM: arm/arm64: GICv4: Add init and teardown of the vPE irq domain KVM: arm/arm64: GICv4: Wire init/teardown of per-VM support KVM: arm/arm
Re: [RFC PATCH 24/33] irqchip/gic-v3-its: Add VPE scheduling
Hi Eric, On 03/16/2017 04:23 PM, Auger Eric wrote: > Hi, > > On 17/01/2017 11:20, Marc Zyngier wrote: >> When a VPE is scheduled to run, the corresponding redistributor must >> be told so, by setting VPROPBASER to the VM's property table, and >> VPENDBASER to the vcpu's pending table. >> >> When scheduled out, we preserve the IDAI and PendingLast bits. The >> latter is specially important, as it tells the hypervisor that >> there are pending interrupts for this vcpu. >> >> Signed-off-by: Marc Zyngier >> --- >> drivers/irqchip/irq-gic-v3-its.c | 57 ++ >> include/linux/irqchip/arm-gic-v3.h | 63 >> ++ >> 2 files changed, 120 insertions(+) >> >> diff --git a/drivers/irqchip/irq-gic-v3-its.c >> b/drivers/irqchip/irq-gic-v3-its.c >> index 598e25b..f918d59 100644 >> --- a/drivers/irqchip/irq-gic-v3-its.c >> +++ b/drivers/irqchip/irq-gic-v3-its.c >> @@ -143,6 +143,7 @@ static DEFINE_IDA(its_vpeid_ida); >> >> #define gic_data_rdist()(raw_cpu_ptr(gic_rdists->rdist)) >> #define gic_data_rdist_rd_base()(gic_data_rdist()->rd_base) >> +#define gic_data_rdist_vlpi_base() (gic_data_rdist_rd_base() + SZ_128K) >> >> static struct its_collection *dev_event_to_col(struct its_device *its_dev, >> u32 event) >> @@ -2039,8 +2040,64 @@ static const struct irq_domain_ops its_domain_ops = { >> .deactivate = its_irq_domain_deactivate, >> }; >> >> +static int its_vpe_set_vcpu_affinity(struct irq_data *d, void *vcpu_info) >> +{ >> +struct its_vpe *vpe = irq_data_get_irq_chip_data(d); >> +struct its_cmd_info *info = vcpu_info; >> +u64 val; >> + >> +switch (info->cmd_type) { >> +case SCHEDULE_VPE: >> +{ >> +void * __iomem vlpi_base = gic_data_rdist_vlpi_base(); >> + >> +/* Schedule the VPE */ >> +val = virt_to_phys(page_address(vpe->its_vm->vprop_page)) & >> +GENMASK_ULL(51, 12); >> +val |= (LPI_NRBITS - 1) & GICR_VPROPBASER_IDBITS_MASK; >> +val |= GICR_VPROPBASER_RaWb; >> +val |= GICR_VPROPBASER_InnerShareable; >> +gits_write_vpropbaser(val, vlpi_base + GICR_VPROPBASER); >> + >> +val = 
virt_to_phys(page_address(vpe->vpt_page)) & GENMASK(51, >> 16); >> +val |= GICR_VPENDBASER_WaWb; >> +val |= GICR_VPENDBASER_NonShareable; >> +val |= GICR_PENDBASER_PendingLast; > don't you want to restore the vpe->pending_last here? anyway I > understand this will force the HW to read the LPI pending table. It's not a good idea to always set the PendingLast bit. There is no correctness issue, but it has a huge impact on system performance: there is no need to read the pending table contents from memory if no VLPIs are pending on the vPE being scheduled. > Reviewed-by: Eric Auger > > Eric > > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [PATCH v2] arm64: kvm: Use has_vhe() instead of hyp_alternate_select()
Hi Marc, On 03/06/2017 02:34 AM, Marc Zyngier wrote: Hi Shanker, On Mon, Mar 06 2017 at 2:33:18 am GMT, Shanker Donthineni wrote: Now all the cpu_hwcaps features have their own static keys. We don't need a separate function hyp_alternate_select() to patch the vhe/nvhe code. We can achieve the same functionality by using has_vhe(). It improves the code readability, uses the jump label instructions, and also compiler generates the better code with a fewer instructions. How do you define "better"? Which compiler? Do you have any benchmarking data? I'm using gcc version 5.2.0. With has_vhe(), it shows a smaller code size, as shown below. I tried to benchmark the code changes using Christoffer's microbench tool, but I'm not seeing a noticeable difference on the QDF2400 platform. hyp_alternate_select() uses BR/BLR instructions to patch the vhe/nvhe code, which is not good for branch prediction purposes. The compiler treats the patched code as a function call, so the contents of registers x0-x18 are not reusable after a vhe/nvhe call. 
Current code: arch/arm64/kvm/hyp/switch.o: file format elf64-littleaarch64 Sections: Idx Name Size VMA LMA File off Algn 0 .text 0040 2**0 CONTENTS, ALLOC, LOAD, READONLY, CODE 1 .data 0040 2**0 CONTENTS, ALLOC, LOAD, DATA 2 .bss 0040 2**0 ALLOC 3 .hyp.text 0550 0040 2**3 CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE New code: arch/arm64/kvm/hyp/switch.o: file format elf64-littleaarch64 Sections: Idx Name Size VMA LMA File off Algn 0 .text 0040 2**0 CONTENTS, ALLOC, LOAD, READONLY, CODE 1 .data 0040 2**0 CONTENTS, ALLOC, LOAD, DATA 2 .bss 0040 2**0 ALLOC 3 .hyp.text 0488 0040 2**3 CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE Signed-off-by: Shanker Donthineni --- v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit arch/arm64/kvm/hyp/debug-sr.c | 12 ++ arch/arm64/kvm/hyp/switch.c| 50 +++--- arch/arm64/kvm/hyp/sysreg-sr.c | 23 +-- 3 files changed, 43 insertions(+), 42 deletions(-) diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c index f5154ed..e5642c2 100644 --- a/arch/arm64/kvm/hyp/debug-sr.c +++ b/arch/arm64/kvm/hyp/debug-sr.c @@ -109,9 +109,13 @@ static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1) dsb(nsh); } -static hyp_alternate_select(__debug_save_spe, - __debug_save_spe_nvhe, __debug_save_spe_vhe, - ARM64_HAS_VIRT_HOST_EXTN); +static void __hyp_text __debug_save_spe(u64 *pmscr_el1) +{ + if (has_vhe()) + __debug_save_spe_vhe(pmscr_el1); + else + __debug_save_spe_nvhe(pmscr_el1); +} I have two worries about this kind of thing: - Not all compilers do support jump labels, leading to a memory access on each static key (GCC 4.8, for example). This would immediately introduce a pretty big regression - The hyp_alternate_select() method doesn't introduce a fast/slow path duality. Each path has the exact same cost. I'm not keen on choosing what is supposed to be the fast path, really. Yes, it'll require a runtime check if the compiler doesn't support ASM GOTO labels. 
Agreed, hyp_alternate_select() has a constant branch overhead, but it might cause a branch-prediction penalty. Thanks, M. -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
[PATCH v2] arm64: kvm: Use has_vhe() instead of hyp_alternate_select()
Now all the cpu_hwcaps features have their own static keys. We don't need a separate function hyp_alternate_select() to patch the vhe/nvhe code. We can achieve the same functionality by using has_vhe(). It improves the code readability, uses the jump label instructions, and also compiler generates the better code with a fewer instructions. Signed-off-by: Shanker Donthineni --- v2: removed 'Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038' from commit arch/arm64/kvm/hyp/debug-sr.c | 12 ++ arch/arm64/kvm/hyp/switch.c| 50 +++--- arch/arm64/kvm/hyp/sysreg-sr.c | 23 +-- 3 files changed, 43 insertions(+), 42 deletions(-) diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c index f5154ed..e5642c2 100644 --- a/arch/arm64/kvm/hyp/debug-sr.c +++ b/arch/arm64/kvm/hyp/debug-sr.c @@ -109,9 +109,13 @@ static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1) dsb(nsh); } -static hyp_alternate_select(__debug_save_spe, - __debug_save_spe_nvhe, __debug_save_spe_vhe, - ARM64_HAS_VIRT_HOST_EXTN); +static void __hyp_text __debug_save_spe(u64 *pmscr_el1) +{ + if (has_vhe()) + __debug_save_spe_vhe(pmscr_el1); + else + __debug_save_spe_nvhe(pmscr_el1); +} static void __hyp_text __debug_restore_spe(u64 pmscr_el1) { @@ -180,7 +184,7 @@ void __hyp_text __debug_cond_save_host_state(struct kvm_vcpu *vcpu) __debug_save_state(vcpu, &vcpu->arch.host_debug_state.regs, kern_hyp_va(vcpu->arch.host_cpu_context)); - __debug_save_spe()(&vcpu->arch.host_debug_state.pmscr_el1); + __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1); } void __hyp_text __debug_cond_restore_host_state(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index aede165..c5c77b8 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -33,13 +33,9 @@ static bool __hyp_text __fpsimd_enabled_vhe(void) return !!(read_sysreg(cpacr_el1) & CPACR_EL1_FPEN); } -static hyp_alternate_select(__fpsimd_is_enabled, - __fpsimd_enabled_nvhe, 
__fpsimd_enabled_vhe, - ARM64_HAS_VIRT_HOST_EXTN); - bool __hyp_text __fpsimd_enabled(void) { - return __fpsimd_is_enabled()(); + return has_vhe() ? __fpsimd_enabled_vhe() : __fpsimd_enabled_nvhe(); } static void __hyp_text __activate_traps_vhe(void) @@ -63,9 +59,10 @@ static void __hyp_text __activate_traps_nvhe(void) write_sysreg(val, cptr_el2); } -static hyp_alternate_select(__activate_traps_arch, - __activate_traps_nvhe, __activate_traps_vhe, - ARM64_HAS_VIRT_HOST_EXTN); +static void __hyp_text __activate_traps_arch(void) +{ + has_vhe() ? __activate_traps_vhe() : __activate_traps_nvhe(); +} static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) { @@ -97,7 +94,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) write_sysreg(0, pmselr_el0); write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0); write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2); - __activate_traps_arch()(); + __activate_traps_arch(); } static void __hyp_text __deactivate_traps_vhe(void) @@ -127,9 +124,10 @@ static void __hyp_text __deactivate_traps_nvhe(void) write_sysreg(CPTR_EL2_DEFAULT, cptr_el2); } -static hyp_alternate_select(__deactivate_traps_arch, - __deactivate_traps_nvhe, __deactivate_traps_vhe, - ARM64_HAS_VIRT_HOST_EXTN); +static void __hyp_text __deactivate_traps_arch(void) +{ + has_vhe() ? 
__deactivate_traps_vhe() : __deactivate_traps_nvhe(); +} static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) { @@ -142,7 +140,7 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) if (vcpu->arch.hcr_el2 & HCR_VSE) vcpu->arch.hcr_el2 = read_sysreg(hcr_el2); - __deactivate_traps_arch()(); + __deactivate_traps_arch(); write_sysreg(0, hstr_el2); write_sysreg(0, pmuserenr_el0); } @@ -183,20 +181,14 @@ static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) __vgic_v2_restore_state(vcpu); } -static bool __hyp_text __true_value(void) +static bool __check_arm_834220(void) { - return true; -} + if (cpus_have_const_cap(ARM64_WORKAROUND_834220)) + return true; -static bool __hyp_text __false_value(void) -{ return false; } -static hyp_alternate_select(__check_arm_834220, - __false_value, __true_value, - ARM64_WORKAROUND_834220); - static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar) {
[PATCH] arm64: kvm: Use has_vhe() instead of hyp_alternate_select()
Now all the cpu_hwcaps features have their own static keys. We don't need a separate function hyp_alternate_select() to patch the vhe/nvhe code. We can achieve the same functionality by using has_vhe(). It improves the code readability, uses the jump label instructions, and also compiler generates the better code with a fewer instructions. Change-Id: Ia8084189833f2081ff13c392deb5070c46a64038 Signed-off-by: Shanker Donthineni --- arch/arm64/kvm/hyp/debug-sr.c | 12 ++ arch/arm64/kvm/hyp/switch.c| 50 +++--- arch/arm64/kvm/hyp/sysreg-sr.c | 23 +-- 3 files changed, 43 insertions(+), 42 deletions(-) diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c index f5154ed..e5642c2 100644 --- a/arch/arm64/kvm/hyp/debug-sr.c +++ b/arch/arm64/kvm/hyp/debug-sr.c @@ -109,9 +109,13 @@ static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1) dsb(nsh); } -static hyp_alternate_select(__debug_save_spe, - __debug_save_spe_nvhe, __debug_save_spe_vhe, - ARM64_HAS_VIRT_HOST_EXTN); +static void __hyp_text __debug_save_spe(u64 *pmscr_el1) +{ + if (has_vhe()) + __debug_save_spe_vhe(pmscr_el1); + else + __debug_save_spe_nvhe(pmscr_el1); +} static void __hyp_text __debug_restore_spe(u64 pmscr_el1) { @@ -180,7 +184,7 @@ void __hyp_text __debug_cond_save_host_state(struct kvm_vcpu *vcpu) __debug_save_state(vcpu, &vcpu->arch.host_debug_state.regs, kern_hyp_va(vcpu->arch.host_cpu_context)); - __debug_save_spe()(&vcpu->arch.host_debug_state.pmscr_el1); + __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1); } void __hyp_text __debug_cond_restore_host_state(struct kvm_vcpu *vcpu) diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c index aede165..c5c77b8 100644 --- a/arch/arm64/kvm/hyp/switch.c +++ b/arch/arm64/kvm/hyp/switch.c @@ -33,13 +33,9 @@ static bool __hyp_text __fpsimd_enabled_vhe(void) return !!(read_sysreg(cpacr_el1) & CPACR_EL1_FPEN); } -static hyp_alternate_select(__fpsimd_is_enabled, - __fpsimd_enabled_nvhe, __fpsimd_enabled_vhe, - 
ARM64_HAS_VIRT_HOST_EXTN); - bool __hyp_text __fpsimd_enabled(void) { - return __fpsimd_is_enabled()(); + return has_vhe() ? __fpsimd_enabled_vhe() : __fpsimd_enabled_nvhe(); } static void __hyp_text __activate_traps_vhe(void) @@ -63,9 +59,10 @@ static void __hyp_text __activate_traps_nvhe(void) write_sysreg(val, cptr_el2); } -static hyp_alternate_select(__activate_traps_arch, - __activate_traps_nvhe, __activate_traps_vhe, - ARM64_HAS_VIRT_HOST_EXTN); +static void __hyp_text __activate_traps_arch(void) +{ + has_vhe() ? __activate_traps_vhe() : __activate_traps_nvhe(); +} static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) { @@ -97,7 +94,7 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu) write_sysreg(0, pmselr_el0); write_sysreg(ARMV8_PMU_USERENR_MASK, pmuserenr_el0); write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2); - __activate_traps_arch()(); + __activate_traps_arch(); } static void __hyp_text __deactivate_traps_vhe(void) @@ -127,9 +124,10 @@ static void __hyp_text __deactivate_traps_nvhe(void) write_sysreg(CPTR_EL2_DEFAULT, cptr_el2); } -static hyp_alternate_select(__deactivate_traps_arch, - __deactivate_traps_nvhe, __deactivate_traps_vhe, - ARM64_HAS_VIRT_HOST_EXTN); +static void __hyp_text __deactivate_traps_arch(void) +{ + has_vhe() ? 
__deactivate_traps_vhe() : __deactivate_traps_nvhe(); +} static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) { @@ -142,7 +140,7 @@ static void __hyp_text __deactivate_traps(struct kvm_vcpu *vcpu) if (vcpu->arch.hcr_el2 & HCR_VSE) vcpu->arch.hcr_el2 = read_sysreg(hcr_el2); - __deactivate_traps_arch()(); + __deactivate_traps_arch(); write_sysreg(0, hstr_el2); write_sysreg(0, pmuserenr_el0); } @@ -183,20 +181,14 @@ static void __hyp_text __vgic_restore_state(struct kvm_vcpu *vcpu) __vgic_v2_restore_state(vcpu); } -static bool __hyp_text __true_value(void) +static bool __check_arm_834220(void) { - return true; -} + if (cpus_have_const_cap(ARM64_WORKAROUND_834220)) + return true; -static bool __hyp_text __false_value(void) -{ return false; } -static hyp_alternate_select(__check_arm_834220, - __false_value, __true_value, - ARM64_WORKAROUND_834220); - static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar) { u64 par, tmp; @@ -251,7 +243,7 @@ static bool __hyp_tex
Re: [RFC PATCH 24/33] irqchip/gic-v3-its: Add VPE scheduling
R_CACHEABILITY_MASK \ + GIC_BASER_CACHEABILITY(GICR_VPROPBASER, OUTER, MASK) +#define GICR_VPROPBASER_CACHEABILITY_MASK \ + GICR_VPROPBASER_INNER_CACHEABILITY_MASK + +#define GICR_VPROPBASER_InnerShareable \ + GIC_BASER_SHAREABILITY(GICR_VPROPBASER, InnerShareable) + +#define GICR_VPROPBASER_nCnB GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, nCnB) +#define GICR_VPROPBASER_nC GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, nC) +#define GICR_VPROPBASER_RaWt GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, RaWt) +#define GICR_VPROPBASER_RaWb GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, RaWt) +#define GICR_VPROPBASER_WaWt GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, WaWt) +#define GICR_VPROPBASER_WaWb GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, WaWb) +#define GICR_VPROPBASER_RaWaWt GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, RaWaWt) +#define GICR_VPROPBASER_RaWaWb GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, RaWaWb) + +#define GICR_VPENDBASER0x0078 + +#define GICR_VPENDBASER_SHAREABILITY_SHIFT (10) +#define GICR_VPENDBASER_INNER_CACHEABILITY_SHIFT (7) +#define GICR_VPENDBASER_OUTER_CACHEABILITY_SHIFT (56) +#define GICR_VPENDBASER_SHAREABILITY_MASK \ + GIC_BASER_SHAREABILITY(GICR_VPENDBASER, SHAREABILITY_MASK) +#define GICR_VPENDBASER_INNER_CACHEABILITY_MASK \ + GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, MASK) +#define GICR_VPENDBASER_OUTER_CACHEABILITY_MASK \ + GIC_BASER_CACHEABILITY(GICR_VPENDBASER, OUTER, MASK) +#define GICR_VPENDBASER_CACHEABILITY_MASK \ + GICR_VPENDBASER_INNER_CACHEABILITY_MASK + +#define GICR_VPENDBASER_NonShareable \ + GIC_BASER_SHAREABILITY(GICR_VPENDBASER, NonShareable) + +#define GICR_VPENDBASER_nCnB GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, nCnB) +#define GICR_VPENDBASER_nC GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, nC) +#define GICR_VPENDBASER_RaWt GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, RaWt) +#define GICR_VPENDBASER_RaWb GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, RaWt) +#define GICR_VPENDBASER_WaWt 
GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, WaWt) +#define GICR_VPENDBASER_WaWb GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, WaWb) +#define GICR_VPENDBASER_RaWaWt GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, RaWaWt) +#define GICR_VPENDBASER_RaWaWb GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, RaWaWb) + +#define GICR_PENDBASER_Dirty (1ULL << 60) +#define GICR_PENDBASER_PendingLast (1ULL << 61) +#define GICR_PENDBASER_IDAI(1ULL << 62) +#define GICR_PENDBASER_Valid (1ULL << 63) + +/* * ITS registers, offsets from ITS_base */ #define GITS_CTLR 0x -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH 28/33] irqchip/gic-v3-its: Support VPE doorbell invalidation even when !DirectLPI
Hi Marc, On 01/17/2017 04:20 AM, Marc Zyngier wrote: When we don't have the DirectLPI feature, we must work around the architecture shortcomings to be able to perform the required invalidation. For this, we create a fake device whose sole purpose is to provide a way to issue a map/inv/unmap sequence (and the corresponding sync operations). That's 6 commands and a full serialization point to be able to do this. You just have hope the hypervisor won't do that too often... Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3-its.c | 59 ++-- 1 file changed, 57 insertions(+), 2 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 008fb71..3787579 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -133,6 +133,9 @@ struct its_device { u32 device_id; }; +static struct its_device *vpe_proxy_dev; +static DEFINE_RAW_SPINLOCK(vpe_proxy_dev_lock); + static LIST_HEAD(its_nodes); static DEFINE_SPINLOCK(its_lock); static struct rdists *gic_rdists; @@ -993,8 +996,35 @@ static void lpi_update_config(struct irq_data *d, u8 clr, u8 set) struct its_vpe *vpe = irq_data_get_irq_chip_data(d); void __iomem *rdbase; - rdbase = per_cpu_ptr(gic_rdists->rdist, vpe->col_idx)->rd_base; - writeq_relaxed(d->hwirq, rdbase + GICR_INVLPIR); + if (gic_rdists->has_direct_lpi) { + rdbase = per_cpu_ptr(gic_rdists->rdist, vpe->col_idx)->rd_base; + writeq_relaxed(d->hwirq, rdbase + GICR_INVLPIR); + } else { + /* +* This is insane. +* +* If a GICv4 doesn't implement Direct LPIs, +* the only way to perform an invalidate is to +* use a fake device to issue a MAP/INV/UNMAP +* sequence. Since each of these commands has +* a sync operation, this is really fast. Not. +* +* We always use event 0, and this serialize +* all VPE invalidations in the system. +* +* Broken by design(tm). 
+*/ + unsigned long flags; + + raw_spin_lock_irqsave(&vpe_proxy_dev_lock, flags); + + vpe_proxy_dev->event_map.col_map[0] = vpe->col_idx; + its_send_mapvi(vpe_proxy_dev, vpe->vpe_db_lpi, 0); + its_send_inv(vpe_proxy_dev, 0); + its_send_discard(vpe_proxy_dev, 0); + + raw_spin_unlock_irqrestore(&vpe_proxy_dev_lock, flags); + } } } @@ -2481,6 +2511,31 @@ static struct irq_domain *its_init_vpe_domain(void) struct fwnode_handle *handle; struct irq_domain *domain; + if (gic_rdists->has_direct_lpi) { + pr_info("ITS: Using DirectLPI for VPE invalidation\n"); + } else { + struct its_node *its; + + list_for_each_entry(its, &its_nodes, entry) { + u32 devid; + + if (!its->is_v4) + continue; + + /* Use the last possible DevID */ + devid = GENMASK(its->device_ids - 1, 0); How do we know this 'devid' is not being used by real hardware devices? I think we need some kind of check in its_msi_prepare() to skip this device or WARN. Unfortunately, Qualcomm hardware doesn't support the Direct LPI feature. -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [RFC PATCH 24/33] irqchip/gic-v3-its: Add VPE scheduling
(1 << 0) /* + * Re-Distributor registers, offsets from VLPI_base + */ +#define GICR_VPROPBASER0x0070 + +#define GICR_VPROPBASER_IDBITS_MASK0x1f + +#define GICR_VPROPBASER_SHAREABILITY_SHIFT (10) +#define GICR_VPROPBASER_INNER_CACHEABILITY_SHIFT (7) +#define GICR_VPROPBASER_OUTER_CACHEABILITY_SHIFT (56) + +#define GICR_VPROPBASER_SHAREABILITY_MASK \ + GIC_BASER_SHAREABILITY(GICR_VPROPBASER, SHAREABILITY_MASK) +#define GICR_VPROPBASER_INNER_CACHEABILITY_MASK \ + GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, MASK) +#define GICR_VPROPBASER_OUTER_CACHEABILITY_MASK \ + GIC_BASER_CACHEABILITY(GICR_VPROPBASER, OUTER, MASK) +#define GICR_VPROPBASER_CACHEABILITY_MASK \ + GICR_VPROPBASER_INNER_CACHEABILITY_MASK + +#define GICR_VPROPBASER_InnerShareable \ + GIC_BASER_SHAREABILITY(GICR_VPROPBASER, InnerShareable) + +#define GICR_VPROPBASER_nCnB GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, nCnB) +#define GICR_VPROPBASER_nC GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, nC) +#define GICR_VPROPBASER_RaWt GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, RaWt) +#define GICR_VPROPBASER_RaWb GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, RaWt) +#define GICR_VPROPBASER_WaWt GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, WaWt) +#define GICR_VPROPBASER_WaWb GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, WaWb) +#define GICR_VPROPBASER_RaWaWt GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, RaWaWt) +#define GICR_VPROPBASER_RaWaWb GIC_BASER_CACHEABILITY(GICR_VPROPBASER, INNER, RaWaWb) + +#define GICR_VPENDBASER0x0078 + +#define GICR_VPENDBASER_SHAREABILITY_SHIFT (10) +#define GICR_VPENDBASER_INNER_CACHEABILITY_SHIFT (7) +#define GICR_VPENDBASER_OUTER_CACHEABILITY_SHIFT (56) +#define GICR_VPENDBASER_SHAREABILITY_MASK \ + GIC_BASER_SHAREABILITY(GICR_VPENDBASER, SHAREABILITY_MASK) +#define GICR_VPENDBASER_INNER_CACHEABILITY_MASK \ + GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, MASK) +#define GICR_VPENDBASER_OUTER_CACHEABILITY_MASK \ + GIC_BASER_CACHEABILITY(GICR_VPENDBASER, OUTER, MASK) 
+#define GICR_VPENDBASER_CACHEABILITY_MASK \ + GICR_VPENDBASER_INNER_CACHEABILITY_MASK + +#define GICR_VPENDBASER_NonShareable \ + GIC_BASER_SHAREABILITY(GICR_VPENDBASER, NonShareable) + +#define GICR_VPENDBASER_nCnB GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, nCnB) +#define GICR_VPENDBASER_nC GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, nC) +#define GICR_VPENDBASER_RaWt GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, RaWt) +#define GICR_VPENDBASER_RaWb GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, RaWt) +#define GICR_VPENDBASER_WaWt GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, WaWt) +#define GICR_VPENDBASER_WaWb GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, WaWb) +#define GICR_VPENDBASER_RaWaWt GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, RaWaWt) +#define GICR_VPENDBASER_RaWaWb GIC_BASER_CACHEABILITY(GICR_VPENDBASER, INNER, RaWaWb) + +#define GICR_PENDBASER_Dirty (1ULL << 60) +#define GICR_PENDBASER_PendingLast (1ULL << 61) +#define GICR_PENDBASER_IDAI(1ULL << 62) +#define GICR_PENDBASER_Valid (1ULL << 63) + +/* * ITS registers, offsets from ITS_base */ #define GITS_CTLR 0x -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH 23/33] irqchip/gic-v3-its: Add VPENDBASER/VPROPBASER accessors
Hi Marc, On 01/17/2017 04:20 AM, Marc Zyngier wrote: V{PEND,PROP}BASER being 64bit registers, they need some ad-hoc accessors on 32bit, specially given that VPENDBASER contains a Valid bit, making the access a bit convoluted. Signed-off-by: Marc Zyngier --- arch/arm/include/asm/arch_gicv3.h | 28 arch/arm64/include/asm/arch_gicv3.h | 5 + 2 files changed, 33 insertions(+) diff --git a/arch/arm/include/asm/arch_gicv3.h b/arch/arm/include/asm/arch_gicv3.h index 2747590..3f18832 100644 --- a/arch/arm/include/asm/arch_gicv3.h +++ b/arch/arm/include/asm/arch_gicv3.h @@ -291,5 +291,33 @@ static inline u64 __gic_readq_nonatomic(const volatile void __iomem *addr) */ #define gits_write_cwriter(v, c) __gic_writeq_nonatomic(v, c) +/* + * GITS_VPROPBASER - hi and lo bits may be accessed independently. + */ +#define gits_write_vpropbaser(v, c)__gic_writeq_nonatomic(v, c) + +/* + * GITS_VPENDBASER - the Valid bit must be cleared before changing + * anything else. + */ +static inline void gits_write_vpendbaser(u64 val, void * __iomem addr) +{ + u32 tmp; + + tmp = readl_relaxed(addr + 4); + if (tmp & GICR_PENDBASER_Valid) { + tmp &= ~GICR_PENDBASER_Valid; + writel_relaxed(tmp, addr + 4); + } + + /* +* Use the fact that __gic_writeq_nonatomic writes the second +* half of the 64bit quantity after the first. +*/ + __gic_writeq_nonatomic(val, addr); I'm not sure whether software has to check the register write pending bit, GICR_CTLR.RWP, or not. The GICv3 spec says the effect of a write to the GICR_VPENDBASER register is not guaranteed to be visible throughout the affinity hierarchy, as indicated by GICR_CTLR.RWP == 0. -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [RFC PATCH 21/33] irqchip/gic-v3-its: Add VPE irq domain allocation/teardown
Hi Marc, On 01/17/2017 04:20 AM, Marc Zyngier wrote: When creating a VM, the low level GICv4 code is responsible for: - allocating each VPE a unique VPEID - allocating a doorbell interrupt for each VPE - allocating the pending tables for each VPE - allocating the property table for the VM This of course has to be reversed when the VM is brought down. All of this is wired into the irq domain alloc/free methods. Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3-its.c | 174 +++ 1 file changed, 174 insertions(+) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index ddd8096..54d0075 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -139,6 +139,7 @@ static struct rdists *gic_rdists; static struct irq_domain *its_parent; static unsigned long its_list_map; +static DEFINE_IDA(its_vpeid_ida); #define gic_data_rdist() (raw_cpu_ptr(gic_rdists->rdist)) #define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base) @@ -1146,6 +1147,11 @@ static struct page *its_allocate_prop_table(gfp_t gfp_flags) return prop_page; } +static void its_free_prop_table(struct page *prop_page) +{ + free_pages((unsigned long)page_address(prop_page), + get_order(LPI_PROPBASE_SZ)); +} static int __init its_alloc_lpi_tables(void) { @@ -1444,6 +1450,12 @@ static struct page *its_allocate_pending_table(gfp_t gfp_flags) return pend_page; } +static void its_free_pending_table(struct page *pt) +{ + free_pages((unsigned long)page_address(pt), + get_order(max(LPI_PENDBASE_SZ, SZ_64K))); +} + static void its_cpu_init_lpis(void) { void __iomem *rbase = gic_data_rdist_rd_base(); @@ -1666,6 +1678,34 @@ static bool its_alloc_device_table(struct its_node *its, u32 dev_id) return its_alloc_table_entry(baser, dev_id); } +static bool its_alloc_vpe_table(u32 vpe_id) +{ + struct its_node *its; + + /* +* Make sure the L2 tables are allocated on *all* v4 ITSs. 
We +* could try and only do it on ITSs corresponding to devices +* that have interrupts targeted at this VPE, but the +* complexity becomes crazy (and you have tons of memory +* anyway, right?). +*/ + list_for_each_entry(its, &its_nodes, entry) { + struct its_baser *baser; + + if (!its->is_v4) + continue; + + baser = its_get_baser(its, GITS_BASER_TYPE_VCPU); + if (!baser) + return false; + + if (!its_alloc_table_entry(baser, vpe_id)) + return false; + } + + return true; +} + static struct its_device *its_create_device(struct its_node *its, u32 dev_id, int nvecs) { @@ -1922,7 +1962,141 @@ static struct irq_chip its_vpe_irq_chip = { .name = "GICv4-vpe", }; +static int its_vpe_id_alloc(void) +{ + return ida_simple_get(&its_vpeid_ida, 0, 1 << 16, GFP_KERNEL); +} + +static void its_vpe_id_free(u16 id) +{ + ida_simple_remove(&its_vpeid_ida, id); +} + +static int its_vpe_init(struct its_vpe *vpe) +{ + struct page *vpt_page; + int vpe_id; + + /* Allocate vpe_id */ + vpe_id = its_vpe_id_alloc(); + if (vpe_id < 0) + return vpe_id; + + /* Allocate VPT */ + vpt_page = its_allocate_pending_table(GFP_KERNEL); + if (vpt_page) { Change to 'if (!vpt_page)'. -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH 17/33] irqchip/gic-v3-its: Add VLPI configuration hook
On 01/17/2017 04:20 AM, Marc Zyngier wrote: Add the skeleton irq_set_vcpu_affinity method that will be used to configure VLPIs. Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3-its.c | 33 + 1 file changed, 33 insertions(+) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 0dbc8b0..1bd78ca 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -36,6 +36,7 @@ #include #include +#include #include #include @@ -771,6 +772,37 @@ static int its_irq_set_irqchip_state(struct irq_data *d, return 0; } +static int its_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu_info) +{ + struct its_device *its_dev = irq_data_get_irq_chip_data(d); + struct its_cmd_info *info = vcpu_info; + u32 event = its_get_event_id(d); + + /* Need a v4 ITS */ + if (!its_dev->its->is_v4 || !info) + return -EINVAL; + + switch (info->cmd_type) { + case MAP_VLPI: + { + return 0; + } + + case UNMAP_VLPI: + { + return 0; + } + + case PROP_UPDATE_VLPI: + { + return 0; + } + + default: + return -EINVAL; + } Missing a return statement. -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH 11/33] irqchip/gic-v3-its: Split out pending table allocation
Hi Marc, On 01/17/2017 04:20 AM, Marc Zyngier wrote: Just as for the property table, let's move the pending table allocation to a separate function. Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3-its.c | 29 - 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 14305db1..dce8f8c 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -1188,6 +1188,24 @@ static int its_alloc_collections(struct its_node *its) return 0; } +static struct page *its_allocate_pending_table(gfp_t gfp_flags) +{ The PEND and PROP table sizes are defined as compile-time macros, but as per the ITS spec a 24-bit LPI space is also a valid implementation. It would be nicer to parametrize both table sizes so that it becomes easier to enable 24-bit LPIs later. Qualcomm server chips actually support 24-bit IDBITS. -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH 10/33] irqchip/gic-v4-its: Allow use of indirect VCPU tables
Hi Marc, On 01/17/2017 04:20 AM, Marc Zyngier wrote: The VCPU tables can be quite sparse as well, and it makes sense to use indirect tables as well if possible. The VCPU table has maximum of 2^16 entries as compared to 2^32 entries in device table. ITS hardware implementations may not support indirect table because of low memory requirement. Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3-its.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index c92ff4d..14305db1 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -1060,10 +1060,13 @@ static int its_setup_baser(struct its_node *its, struct its_baser *baser, return 0; } -static bool its_parse_baser_device(struct its_node *its, struct its_baser *baser, - u32 psz, u32 *order) +static bool its_parse_indirect_baser(struct its_node *its, +struct its_baser *baser, +u32 psz, u32 *order) { - u64 esz = GITS_BASER_ENTRY_SIZE(its_read_baser(its, baser)); + u64 tmp = its_read_baser(its, baser); + u64 type = GITS_BASER_TYPE(tmp); + u64 esz = GITS_BASER_ENTRY_SIZE(tmp); u64 val = GITS_BASER_InnerShareable | GITS_BASER_WaWb; u32 ids = its->device_ids; u32 new_order = *order; @@ -1102,8 +1105,9 @@ static bool its_parse_baser_device(struct its_node *its, struct its_baser *baser if (new_order >= MAX_ORDER) { new_order = MAX_ORDER - 1; ids = ilog2(PAGE_ORDER_TO_SIZE(new_order) / (int)esz); - pr_warn("ITS@%pa: Device Table too large, reduce ids %u->%u\n", - &its->phys_base, its->device_ids, ids); + pr_warn("ITS@%pa: %s Table too large, reduce ids %u->%u\n", + &its->phys_base, its_base_type_string[type], + its->device_ids, ids); } *order = new_order; @@ -1154,8 +1158,10 @@ static int its_alloc_tables(struct its_node *its) if (type == GITS_BASER_TYPE_NONE) continue; - if (type == GITS_BASER_TYPE_DEVICE) - indirect = its_parse_baser_device(its, baser, psz, &order); Try to allocate maximum memory as 
possible, then attempt enabling the indirection table. #define ITS_VPES_MAX (65536) if (type == GITS_BASER_TYPE_VCPU) order = get_order(esz * ITS_VPES_MAX); On the Qualcomm implementation, 1MByte of memory (65536 entries * 16-byte vPE entry size) is enough to cover the sparse 16-bit vPE space. + if (type == GITS_BASER_TYPE_DEVICE || + type == GITS_BASER_TYPE_VCPU) + indirect = its_parse_indirect_baser(its, baser, + psz, &order); err = its_setup_baser(its, baser, cache, shr, psz, order, indirect); if (err < 0) { -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH 06/33] irqchip/gic-v3-its: Add probing for VLPI properties
On 01/17/2017 04:20 AM, Marc Zyngier wrote: Add the probing code for the ITS VLPI support. This includes configuring the ITS number if not supporting the single VMOVP command feature. Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3-its.c | 47 ++ include/linux/irqchip/arm-gic-v3.h | 4 2 files changed, 47 insertions(+), 4 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c index 9304dd2..99f6130 100644 --- a/drivers/irqchip/irq-gic-v3-its.c +++ b/drivers/irqchip/irq-gic-v3-its.c @@ -103,6 +103,7 @@ struct its_node { u32 ite_size; u32 device_ids; int numa_node; + boolis_v4; }; #define ITS_ITT_ALIGN SZ_256 @@ -135,6 +136,8 @@ static DEFINE_SPINLOCK(its_lock); static struct rdists *gic_rdists; static struct irq_domain *its_parent; +static unsigned long its_list_map; + #define gic_data_rdist() (raw_cpu_ptr(gic_rdists->rdist)) #define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base) @@ -1661,8 +1664,8 @@ static int __init its_probe_one(struct resource *res, { struct its_node *its; void __iomem *its_base; - u32 val; - u64 baser, tmp; + u32 val, ctlr; + u64 baser, tmp, typer; int err; its_base = ioremap(res->start, resource_size(res)); @@ -1695,9 +1698,44 @@ static int __init its_probe_one(struct resource *res, raw_spin_lock_init(&its->lock); INIT_LIST_HEAD(&its->entry); INIT_LIST_HEAD(&its->its_device_list); + typer = gic_read_typer(its_base + GITS_TYPER); its->base = its_base; its->phys_base = res->start; - its->ite_size = ((gic_read_typer(its_base + GITS_TYPER) >> 4) & 0xf) + 1; + its->ite_size = ((typer >> 4) & 0xf) + 1; I think we should move bit manipulations to a macro, some thing like this. 
its->ite_size = GITS_TYPER_ITEBITS(typer); #define GITS_TYPER_ITEBITS_SHIFT 4 #define GITS_TYPER_ITEBITS(r) ((((r) >> GITS_TYPER_ITEBITS_SHIFT) & 0xf) + 1) + its->is_v4 = !!(typer & GITS_TYPER_VLPIS); + if (its->is_v4 && !(typer & GITS_TYPER_VMOVP)) { + int its_number; + + its_number = find_first_zero_bit(&its_list_map, 16); + if (its_number >= 16) { + pr_err("ITS@%pa: No ITSList entry available!\n", + &res->start); + err = -EINVAL; + goto out_free_its; + } + + ctlr = readl_relaxed(its_base + GITS_CTLR); + ctlr &= ~GITS_CTLR_ITS_NUMBER; + ctlr |= its_number << GITS_CTLR_ITS_NUMBER_SHIFT; + writel_relaxed(ctlr, its_base + GITS_CTLR); + ctlr = readl_relaxed(its_base + GITS_CTLR); + if ((ctlr & GITS_CTLR_ITS_NUMBER) != (its_number << GITS_CTLR_ITS_NUMBER_SHIFT)) { + its_number = ctlr & GITS_CTLR_ITS_NUMBER; + its_number >>= GITS_CTLR_ITS_NUMBER_SHIFT; + } + + if (test_and_set_bit(its_number, &its_list_map)) { + pr_err("ITS@%pa: Duplicate ITSList entry %d\n", + &res->start, its_number); + err = -EINVAL; + goto out_free_its; + } + + pr_info("ITS@%pa: Using ITS number %d\n", &res->start, its_number); + } else { + pr_info("ITS@%pa: Single VMOVP capable\n", &res->start); + } Can we move this block to a separate function for readability? -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH 02/33] irqchip/gic-v3: Add VLPI/DirectLPI discovery
On 01/17/2017 04:20 AM, Marc Zyngier wrote: Add helper functions that probe for VLPI and DirectLPI properties. Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3.c | 22 ++ include/linux/irqchip/arm-gic-v3.h | 3 +++ 2 files changed, 25 insertions(+) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index 5cadec0..8a6de91 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -514,6 +514,24 @@ static int gic_populate_rdist(void) return -ENODEV; } +static int __gic_update_vlpi_properties(struct redist_region *region, + void __iomem *ptr) +{ + u64 typer = gic_read_typer(ptr + GICR_TYPER); + gic_data.rdists.has_vlpis &= !!(typer & GICR_TYPER_VLPIS); + gic_data.rdists.has_direct_lpi &= !!(typer & GICR_TYPER_DirectLPIS); + + return 1; +} + +static void gic_update_vlpi_properties(void) +{ + gic_scan_rdist_properties(__gic_update_vlpi_properties); + pr_info("%sVLPI support, %sdirect LPI support\n", Would be better if we keep one space after 'no'? -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC PATCH 01/33] irqchip/gic-v3: Add redistributor iterator
Hi Marc, On 01/17/2017 04:20 AM, Marc Zyngier wrote: In order to discover the VLPI properties, we need to iterate over the redistributor regions. As we already have code that does this, let's factor it out and make it slightly more generic. Signed-off-by: Marc Zyngier --- drivers/irqchip/irq-gic-v3.c | 77 1 file changed, 56 insertions(+), 21 deletions(-) diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c index c132f29..5cadec0 100644 --- a/drivers/irqchip/irq-gic-v3.c +++ b/drivers/irqchip/irq-gic-v3.c @@ -421,24 +421,15 @@ static void __init gic_dist_init(void) gic_write_irouter(affinity, base + GICD_IROUTER + i * 8); } -static int gic_populate_rdist(void) +static int gic_scan_rdist_properties(int (*fn)(struct redist_region *, + void __iomem *)) I don't see this function is parsing GICR properties, may be it makes readable on changing name to gic_redist_iterator(). { - unsigned long mpidr = cpu_logical_map(smp_processor_id()); - u64 typer; - u32 aff; + int ret = 0; For readability purpose set ret = ENODEV, to cover error case where gic_data.nr_redist_regions == 0. int i; - /* -* Convert affinity to a 32bit value that can be matched to -* GICR_TYPER bits [63:32]. 
-*/ - aff = (MPIDR_AFFINITY_LEVEL(mpidr, 3) << 24 | - MPIDR_AFFINITY_LEVEL(mpidr, 2) << 16 | - MPIDR_AFFINITY_LEVEL(mpidr, 1) << 8 | - MPIDR_AFFINITY_LEVEL(mpidr, 0)); - for (i = 0; i < gic_data.nr_redist_regions; i++) { void __iomem *ptr = gic_data.redist_regions[i].redist_base; + u64 typer; u32 reg; reg = readl_relaxed(ptr + GICR_PIDR2) & GIC_PIDR2_ARCH_MASK; @@ -450,14 +441,14 @@ static int gic_populate_rdist(void) do { typer = gic_read_typer(ptr + GICR_TYPER); - if ((typer >> 32) == aff) { - u64 offset = ptr - gic_data.redist_regions[i].redist_base; - gic_data_rdist_rd_base() = ptr; - gic_data_rdist()->phys_base = gic_data.redist_regions[i].phys_base + offset; - pr_info("CPU%d: found redistributor %lx region %d:%pa\n", - smp_processor_id(), mpidr, i, - &gic_data_rdist()->phys_base); + ret = fn(gic_data.redist_regions + i, ptr); + switch (ret) { + case 0: return 0; + case -1: + break; + default: + ret = 0; } if (gic_data.redist_regions[i].single_redist) @@ -473,9 +464,53 @@ static int gic_populate_rdist(void) } while (!(typer & GICR_TYPER_LAST)); + if (ret == -1) + ret = -ENODEV; + __gic_populate_rdist() returns 1 to try next entry in the list. We should not return value 0 here if no matching entry is found otherwise the gic_populate_rdist() assumes that it found the corresponding GICR. + return 0; +} + +static int __gic_populate_rdist(struct redist_region *region, void __iomem *ptr) +{ + unsigned long mpidr = cpu_logical_map(smp_processor_id()); + u64 typer; + u32 aff; + + /* +* Convert affinity to a 32bit value that can be matched to +* GICR_TYPER bits [63:32]. 
+*/ + aff = (MPIDR_AFFINITY_LEVEL(mpidr, 3) << 24 | + MPIDR_AFFINITY_LEVEL(mpidr, 2) << 16 | + MPIDR_AFFINITY_LEVEL(mpidr, 1) << 8 | + MPIDR_AFFINITY_LEVEL(mpidr, 0)); + + typer = gic_read_typer(ptr + GICR_TYPER); + if ((typer >> 32) == aff) { + u64 offset = ptr - region->redist_base; + gic_data_rdist_rd_base() = ptr; + gic_data_rdist()->phys_base = region->phys_base + offset; + + pr_info("CPU%d: found redistributor %lx region %d:%pa\n", + smp_processor_id(), mpidr, + (int)(region - gic_data.redist_regions), + &gic_data_rdist()->phys_base); + return 0; + } + + /* Try next one */ + return 1; +} + +static int gic_populate_rdist(void) +{ + if (gic_scan_rdist_properties(__gic_populate_rdist) == 0) what about 'if (!gic_scan_rdist_properties(__gic_populate_rdist))'? + return 0; + /* We couldn't even deal with ourselves... */ WARN(true, "CPU%d: mpidr %lx has no re-distributor!\n", -smp_processor_id(), mpidr); +
[RESEND PATCH] KVM: arm/arm64: vgic: Stop injecting the MSI occurrence twice
The IRQFD framework calls the architecture dependent function twice if the corresponding GSI type is edge triggered. For ARM, the function kvm_set_msi() is getting called twice whenever the IRQFD receives the event signal. The rest of the code path is trying to inject the MSI without any validation checks. No need to call the function vgic_its_inject_msi() second time to avoid an unnecessary overhead in IRQ queue logic. It also avoids the possibility of VM seeing the MSI twice. Simple fix, return -1 if the argument 'level' value is zero. Signed-off-by: Shanker Donthineni Reviewed-by: Eric Auger Reviewed-by: Christoffer Dall --- Forgot to CC the kvmarm list earlier, including now. virt/kvm/arm/vgic/vgic-irqfd.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/virt/kvm/arm/vgic/vgic-irqfd.c b/virt/kvm/arm/vgic/vgic-irqfd.c index d918dcf..f138ed2 100644 --- a/virt/kvm/arm/vgic/vgic-irqfd.c +++ b/virt/kvm/arm/vgic/vgic-irqfd.c @@ -99,6 +99,9 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, if (!vgic_has_its(kvm)) return -ENODEV; + if (!level) + return -1; + return vgic_its_inject_msi(kvm, &msi); } -- Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v3] arm64: KVM: Optimize __guest_enter/exit() to save a few instructions
We are doing an unnecessary stack push/pop operation when restoring the guest registers x0-x18 in __guest_enter(). This patch saves the two instructions by using x18 as a base register. No need to store the vcpu context pointer in stack because it is redundant, the same information is available in tpidr_el2. The function __guest_exit() calling convention is slightly modified, caller only pushes the regs x0-x1 to stack instead of regs x0-x3. Signed-off-by: Shanker Donthineni Reviewed-by: Christoffer Dall --- Tested this patch using the Qualcomm QDF24XXX platform. Changes since v2: Removed macros save_x0_to_x3/restore_x0_to_x3. Modified el1_sync() to use regs x0 and x1. Edited commit text. Changes since v1: Incorporated Cristoffer suggestions. __guest_exit prototype is changed to 'void __guest_exit(u64 reason, struct kvm_vcpu *vcpu)'. arch/arm64/kvm/hyp/entry.S | 101 - arch/arm64/kvm/hyp/hyp-entry.S | 37 ++- 2 files changed, 63 insertions(+), 75 deletions(-) diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index ce9e5e5..3967c231 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -55,79 +55,78 @@ */ ENTRY(__guest_enter) // x0: vcpu - // x1: host/guest context - // x2-x18: clobbered by macros + // x1: host context + // x2-x17: clobbered by macros + // x18: guest context // Store the host regs save_callee_saved_regs x1 - // Preserve vcpu & host_ctxt for use at exit time - stp x0, x1, [sp, #-16]! + // Store the host_ctxt for use at exit time + str x1, [sp, #-16]! - add x1, x0, #VCPU_CONTEXT + add x18, x0, #VCPU_CONTEXT - // Prepare x0-x1 for later restore by pushing them onto the stack - ldp x2, x3, [x1, #CPU_XREG_OFFSET(0)] - stp x2, x3, [sp, #-16]! 
+ // Restore guest regs x0-x17 + ldp x0, x1, [x18, #CPU_XREG_OFFSET(0)] + ldp x2, x3, [x18, #CPU_XREG_OFFSET(2)] + ldp x4, x5, [x18, #CPU_XREG_OFFSET(4)] + ldp x6, x7, [x18, #CPU_XREG_OFFSET(6)] + ldp x8, x9, [x18, #CPU_XREG_OFFSET(8)] + ldp x10, x11, [x18, #CPU_XREG_OFFSET(10)] + ldp x12, x13, [x18, #CPU_XREG_OFFSET(12)] + ldp x14, x15, [x18, #CPU_XREG_OFFSET(14)] + ldp x16, x17, [x18, #CPU_XREG_OFFSET(16)] - // x2-x18 - ldp x2, x3, [x1, #CPU_XREG_OFFSET(2)] - ldp x4, x5, [x1, #CPU_XREG_OFFSET(4)] - ldp x6, x7, [x1, #CPU_XREG_OFFSET(6)] - ldp x8, x9, [x1, #CPU_XREG_OFFSET(8)] - ldp x10, x11, [x1, #CPU_XREG_OFFSET(10)] - ldp x12, x13, [x1, #CPU_XREG_OFFSET(12)] - ldp x14, x15, [x1, #CPU_XREG_OFFSET(14)] - ldp x16, x17, [x1, #CPU_XREG_OFFSET(16)] - ldr x18, [x1, #CPU_XREG_OFFSET(18)] - - // x19-x29, lr - restore_callee_saved_regs x1 - - // Last bits of the 64bit state - ldp x0, x1, [sp], #16 + // Restore guest regs x19-x29, lr + restore_callee_saved_regs x18 + + // Restore guest reg x18 + ldr x18, [x18, #CPU_XREG_OFFSET(18)] // Do not touch any register after this! 
eret ENDPROC(__guest_enter) ENTRY(__guest_exit) - // x0: vcpu - // x1: return code - // x2-x3: free - // x4-x29,lr: vcpu regs - // vcpu x0-x3 on the stack - - add x2, x0, #VCPU_CONTEXT - - stp x4, x5, [x2, #CPU_XREG_OFFSET(4)] - stp x6, x7, [x2, #CPU_XREG_OFFSET(6)] - stp x8, x9, [x2, #CPU_XREG_OFFSET(8)] - stp x10, x11, [x2, #CPU_XREG_OFFSET(10)] - stp x12, x13, [x2, #CPU_XREG_OFFSET(12)] - stp x14, x15, [x2, #CPU_XREG_OFFSET(14)] - stp x16, x17, [x2, #CPU_XREG_OFFSET(16)] - str x18, [x2, #CPU_XREG_OFFSET(18)] - - ldp x6, x7, [sp], #16 // x2, x3 - ldp x4, x5, [sp], #16 // x0, x1 - - stp x4, x5, [x2, #CPU_XREG_OFFSET(0)] - stp x6, x7, [x2, #CPU_XREG_OFFSET(2)] + // x0: return code + // x1: vcpu + // x2-x29,lr: vcpu regs + // vcpu x0-x1 on the stack + + add x1, x1, #VCPU_CONTEXT + + // Store the guest regs x2 and x3 + stp x2, x3, [x1, #CPU_XREG_OFFSET(2)] + + // Retrieve the guest regs x0-x1 from the stack + ldp x2, x3, [sp], #16 // x0, x1 + + // Store the guest regs x0-x1 and x4-x18 + stp x2, x3, [x1, #CPU_XREG_OFFSET(0)] + stp x4, x5, [x1, #CPU_XREG_OFFSET(4)] + stp x6, x7, [x1, #CPU_XREG_OFFSET(6)] + stp x8, x9, [x1, #CPU_XREG_OFFSET(8)] + stp x10, x11, [x1, #CPU_XREG_OFFSET(10)] + stp x12, x13, [x1, #CPU_XREG_OFFSET(12)] + stp x14, x15, [x1, #CPU_XREG_OFFSET(14)] + stp x16, x17, [x1, #C
Re: [PATCH v2] arm64: KVM: Save four instructions in __guest_enter/exit()
Hi Marc, On 08/30/2016 05:54 AM, Marc Zyngier wrote: On 30/08/16 10:55, Christoffer Dall wrote: On Mon, Aug 29, 2016 at 10:51:14PM -0500, Shanker Donthineni wrote: We are doing an unnecessary stack push/pop operation when restoring the guest registers x0-x18 in __guest_enter(). This patch saves the two instructions by using x18 as a base register. No need to store the vcpu context pointer in stack because it is redundant, the same information is available in tpidr_el2. The function __guest_exit() prototype is simplified and caller pushes the regs x0-x1 to stack instead of regs x0-x3. Signed-off-by: Shanker Donthineni This looks reasonable to me: Reviewed-by: Christoffer Dall Unless Marc has any insight into this having a negative effect on ARM CPUs, I'll go ahead an merge this. I've given it a go on Seattle, and couldn't observe any difference with the original code, which is pretty good news! I have some comments below, though: -Christoffer --- Changes since v1: Incorporated Cristoffer suggestions. __guest_exit prototype is changed to 'void __guest_exit(u64 reason, struct kvm_vcpu *vcpu)'. arch/arm64/kvm/hyp/entry.S | 101 + arch/arm64/kvm/hyp/hyp-entry.S | 11 +++-- 2 files changed, 57 insertions(+), 55 deletions(-) diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index ce9e5e5..f70489a 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -55,75 +55,76 @@ */ ENTRY(__guest_enter) // x0: vcpu - // x1: host/guest context - // x2-x18: clobbered by macros + // x1: host context + // x2-x17: clobbered by macros + // x18: guest context // Store the host regs save_callee_saved_regs x1 - // Preserve vcpu & host_ctxt for use at exit time - stp x0, x1, [sp, #-16]! + // Store the host_ctxt for use at exit time + str x1, [sp, #-16]! - add x1, x0, #VCPU_CONTEXT + add x18, x0, #VCPU_CONTEXT - // Prepare x0-x1 for later restore by pushing them onto the stack - ldp x2, x3, [x1, #CPU_XREG_OFFSET(0)] - stp x2, x3, [sp, #-16]! 
+ // Restore guest regs x0-x17 + ldp x0, x1, [x18, #CPU_XREG_OFFSET(0)] + ldp x2, x3, [x18, #CPU_XREG_OFFSET(2)] + ldp x4, x5, [x18, #CPU_XREG_OFFSET(4)] + ldp x6, x7, [x18, #CPU_XREG_OFFSET(6)] + ldp x8, x9, [x18, #CPU_XREG_OFFSET(8)] + ldp x10, x11, [x18, #CPU_XREG_OFFSET(10)] + ldp x12, x13, [x18, #CPU_XREG_OFFSET(12)] + ldp x14, x15, [x18, #CPU_XREG_OFFSET(14)] + ldp x16, x17, [x18, #CPU_XREG_OFFSET(16)] - // x2-x18 - ldp x2, x3, [x1, #CPU_XREG_OFFSET(2)] - ldp x4, x5, [x1, #CPU_XREG_OFFSET(4)] - ldp x6, x7, [x1, #CPU_XREG_OFFSET(6)] - ldp x8, x9, [x1, #CPU_XREG_OFFSET(8)] - ldp x10, x11, [x1, #CPU_XREG_OFFSET(10)] - ldp x12, x13, [x1, #CPU_XREG_OFFSET(12)] - ldp x14, x15, [x1, #CPU_XREG_OFFSET(14)] - ldp x16, x17, [x1, #CPU_XREG_OFFSET(16)] - ldr x18, [x1, #CPU_XREG_OFFSET(18)] + // Restore guest regs x19-x29, lr + restore_callee_saved_regs x18 - // x19-x29, lr - restore_callee_saved_regs x1 - - // Last bits of the 64bit state - ldp x0, x1, [sp], #16 + // Restore guest reg x18 + ldr x18, [x18, #CPU_XREG_OFFSET(18)] // Do not touch any register after this! eret ENDPROC(__guest_enter) +/* + * void __guest_exit(u64 exit_reason, struct kvm_vcpu *vcpu); + */ I'm not sure this comment makes much sense as it stands. This is not a C function by any stretch of the imagination, but the continuation of __guest_enter. The calling convention is not the C one at all (see how the stack is involved), and caller-saved registers are going to be clobbered. I'll remove this confusing comments. 
ENTRY(__guest_exit) - // x0: vcpu - // x1: return code - // x2-x3: free - // x4-x29,lr: vcpu regs - // vcpu x0-x3 on the stack - - add x2, x0, #VCPU_CONTEXT - - stp x4, x5, [x2, #CPU_XREG_OFFSET(4)] - stp x6, x7, [x2, #CPU_XREG_OFFSET(6)] - stp x8, x9, [x2, #CPU_XREG_OFFSET(8)] - stp x10, x11, [x2, #CPU_XREG_OFFSET(10)] - stp x12, x13, [x2, #CPU_XREG_OFFSET(12)] - stp x14, x15, [x2, #CPU_XREG_OFFSET(14)] - stp x16, x17, [x2, #CPU_XREG_OFFSET(16)] - str x18, [x2, #CPU_XREG_OFFSET(18)] - - ldp x6, x7, [sp], #16 // x2, x3 - ldp x4, x5, [sp], #16 // x0, x1 - - stp x4, x5, [x2, #CPU_XREG_OFFSET(0)] - stp x6, x7, [x2, #CPU_XREG_OFFSET(2)] + // x0: return code + // x1: vcpu + // x2-x29,lr: vcpu regs + // vcpu x0-x1 on the stack + + add x1, x
[PATCH v2] arm64: KVM: Save four instructions in __guest_enter/exit()
We are doing an unnecessary stack push/pop operation when restoring the guest registers x0-x18 in __guest_enter(). This patch saves the two instructions by using x18 as a base register. No need to store the vcpu context pointer in stack because it is redundant, the same information is available in tpidr_el2. The function __guest_exit() prototype is simplified and caller pushes the regs x0-x1 to stack instead of regs x0-x3. Signed-off-by: Shanker Donthineni --- Changes since v1: Incorporated Cristoffer suggestions. __guest_exit prototype is changed to 'void __guest_exit(u64 reason, struct kvm_vcpu *vcpu)'. arch/arm64/kvm/hyp/entry.S | 101 + arch/arm64/kvm/hyp/hyp-entry.S | 11 +++-- 2 files changed, 57 insertions(+), 55 deletions(-) diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index ce9e5e5..f70489a 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -55,75 +55,76 @@ */ ENTRY(__guest_enter) // x0: vcpu - // x1: host/guest context - // x2-x18: clobbered by macros + // x1: host context + // x2-x17: clobbered by macros + // x18: guest context // Store the host regs save_callee_saved_regs x1 - // Preserve vcpu & host_ctxt for use at exit time - stp x0, x1, [sp, #-16]! + // Store the host_ctxt for use at exit time + str x1, [sp, #-16]! - add x1, x0, #VCPU_CONTEXT + add x18, x0, #VCPU_CONTEXT - // Prepare x0-x1 for later restore by pushing them onto the stack - ldp x2, x3, [x1, #CPU_XREG_OFFSET(0)] - stp x2, x3, [sp, #-16]! 
+ // Restore guest regs x0-x17 + ldp x0, x1, [x18, #CPU_XREG_OFFSET(0)] + ldp x2, x3, [x18, #CPU_XREG_OFFSET(2)] + ldp x4, x5, [x18, #CPU_XREG_OFFSET(4)] + ldp x6, x7, [x18, #CPU_XREG_OFFSET(6)] + ldp x8, x9, [x18, #CPU_XREG_OFFSET(8)] + ldp x10, x11, [x18, #CPU_XREG_OFFSET(10)] + ldp x12, x13, [x18, #CPU_XREG_OFFSET(12)] + ldp x14, x15, [x18, #CPU_XREG_OFFSET(14)] + ldp x16, x17, [x18, #CPU_XREG_OFFSET(16)] - // x2-x18 - ldp x2, x3, [x1, #CPU_XREG_OFFSET(2)] - ldp x4, x5, [x1, #CPU_XREG_OFFSET(4)] - ldp x6, x7, [x1, #CPU_XREG_OFFSET(6)] - ldp x8, x9, [x1, #CPU_XREG_OFFSET(8)] - ldp x10, x11, [x1, #CPU_XREG_OFFSET(10)] - ldp x12, x13, [x1, #CPU_XREG_OFFSET(12)] - ldp x14, x15, [x1, #CPU_XREG_OFFSET(14)] - ldp x16, x17, [x1, #CPU_XREG_OFFSET(16)] - ldr x18, [x1, #CPU_XREG_OFFSET(18)] + // Restore guest regs x19-x29, lr + restore_callee_saved_regs x18 - // x19-x29, lr - restore_callee_saved_regs x1 - - // Last bits of the 64bit state - ldp x0, x1, [sp], #16 + // Restore guest reg x18 + ldr x18, [x18, #CPU_XREG_OFFSET(18)] // Do not touch any register after this! 
eret ENDPROC(__guest_enter) +/* + * void __guest_exit(u64 exit_reason, struct kvm_vcpu *vcpu); + */ ENTRY(__guest_exit) - // x0: vcpu - // x1: return code - // x2-x3: free - // x4-x29,lr: vcpu regs - // vcpu x0-x3 on the stack - - add x2, x0, #VCPU_CONTEXT - - stp x4, x5, [x2, #CPU_XREG_OFFSET(4)] - stp x6, x7, [x2, #CPU_XREG_OFFSET(6)] - stp x8, x9, [x2, #CPU_XREG_OFFSET(8)] - stp x10, x11, [x2, #CPU_XREG_OFFSET(10)] - stp x12, x13, [x2, #CPU_XREG_OFFSET(12)] - stp x14, x15, [x2, #CPU_XREG_OFFSET(14)] - stp x16, x17, [x2, #CPU_XREG_OFFSET(16)] - str x18, [x2, #CPU_XREG_OFFSET(18)] - - ldp x6, x7, [sp], #16 // x2, x3 - ldp x4, x5, [sp], #16 // x0, x1 - - stp x4, x5, [x2, #CPU_XREG_OFFSET(0)] - stp x6, x7, [x2, #CPU_XREG_OFFSET(2)] + // x0: return code + // x1: vcpu + // x2-x29,lr: vcpu regs + // vcpu x0-x1 on the stack + + add x1, x1, #VCPU_CONTEXT + + // Store the guest regs x2 and x3 + stp x2, x3, [x1, #CPU_XREG_OFFSET(2)] + + // Retrieve the guest regs x0-x1 from the stack + ldp x2, x3, [sp], #16 // x0, x1 + + // Store the guest regs x0-x1 and x4-x18 + stp x2, x3, [x1, #CPU_XREG_OFFSET(0)] + stp x4, x5, [x1, #CPU_XREG_OFFSET(4)] + stp x6, x7, [x1, #CPU_XREG_OFFSET(6)] + stp x8, x9, [x1, #CPU_XREG_OFFSET(8)] + stp x10, x11, [x1, #CPU_XREG_OFFSET(10)] + stp x12, x13, [x1, #CPU_XREG_OFFSET(12)] + stp x14, x15, [x1, #CPU_XREG_OFFSET(14)] + stp x16, x17, [x1, #CPU_XREG_OFFSET(16)] + str x18, [x1, #CPU_XREG_OFFSET(18)] + + // Store the guest regs x19-x29, lr + save_callee_saved_regs x1 - save_callee
Re: [PATCH] arm64: KVM: Save two instructions in __guest_enter()
Hi Christoffer, This change may not provide a measurable performance improvement, but it still saves a few CPU cycles on each vCPU context switch and improves code readability. On 08/25/2016 08:31 AM, Christoffer Dall wrote: Hi Shanker, On Tue, Aug 09, 2016 at 08:15:36PM -0500, Shanker Donthineni wrote: We are doing an unnecessary stack push/pop operation when restoring the guest registers x0-x18 in __guest_enter(). This patch saves the two instructions by using x18 as a base register. No need to store the vcpu context pointer in stack because it is redundant and not being used anywhere, the same information is available in tpidr_el2. Does this have any measurable benefit? Thanks, -Christoffer -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH] arm64: KVM: Save two instructions in __guest_enter()
We are doing an unnecessary stack push/pop operation when restoring the guest registers x0-x18 in __guest_enter(). This patch saves the two instructions by using x18 as a base register. No need to store the vcpu context pointer in stack because it is redundant and not being used anywhere, the same information is available in tpidr_el2. Signed-off-by: Shanker Donthineni --- arch/arm64/kvm/hyp/entry.S | 66 ++ 1 file changed, 32 insertions(+), 34 deletions(-) diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S index ce9e5e5..d2e09a1 100644 --- a/arch/arm64/kvm/hyp/entry.S +++ b/arch/arm64/kvm/hyp/entry.S @@ -55,37 +55,32 @@ */ ENTRY(__guest_enter) // x0: vcpu - // x1: host/guest context - // x2-x18: clobbered by macros + // x1: host context + // x2-x17: clobbered by macros + // x18: guest context // Store the host regs save_callee_saved_regs x1 - // Preserve vcpu & host_ctxt for use at exit time - stp x0, x1, [sp, #-16]! + // Preserve the host_ctxt for use at exit time + str x1, [sp, #-16]! - add x1, x0, #VCPU_CONTEXT + add x18, x0, #VCPU_CONTEXT - // Prepare x0-x1 for later restore by pushing them onto the stack - ldp x2, x3, [x1, #CPU_XREG_OFFSET(0)] - stp x2, x3, [sp, #-16]! 
+ // Restore guest regs x19-x29, lr + restore_callee_saved_regs x18 - // x2-x18 - ldp x2, x3, [x1, #CPU_XREG_OFFSET(2)] - ldp x4, x5, [x1, #CPU_XREG_OFFSET(4)] - ldp x6, x7, [x1, #CPU_XREG_OFFSET(6)] - ldp x8, x9, [x1, #CPU_XREG_OFFSET(8)] - ldp x10, x11, [x1, #CPU_XREG_OFFSET(10)] - ldp x12, x13, [x1, #CPU_XREG_OFFSET(12)] - ldp x14, x15, [x1, #CPU_XREG_OFFSET(14)] - ldp x16, x17, [x1, #CPU_XREG_OFFSET(16)] - ldr x18, [x1, #CPU_XREG_OFFSET(18)] - - // x19-x29, lr - restore_callee_saved_regs x1 - - // Last bits of the 64bit state - ldp x0, x1, [sp], #16 + // Restore guest regs x0-x18 + ldp x0, x1, [x18, #CPU_XREG_OFFSET(0)] + ldp x2, x3, [x18, #CPU_XREG_OFFSET(2)] + ldp x4, x5, [x18, #CPU_XREG_OFFSET(4)] + ldp x6, x7, [x18, #CPU_XREG_OFFSET(6)] + ldp x8, x9, [x18, #CPU_XREG_OFFSET(8)] + ldp x10, x11, [x18, #CPU_XREG_OFFSET(10)] + ldp x12, x13, [x18, #CPU_XREG_OFFSET(12)] + ldp x14, x15, [x18, #CPU_XREG_OFFSET(14)] + ldp x16, x17, [x18, #CPU_XREG_OFFSET(16)] + ldr x18, [x18, #CPU_XREG_OFFSET(18)] // Do not touch any register after this! 
eret @@ -100,6 +95,16 @@ ENTRY(__guest_exit) add x2, x0, #VCPU_CONTEXT + // Store the guest regs x19-x29, lr + save_callee_saved_regs x2 + + // Retrieve the guest regs x0-x3 from the stack + ldp x21, x22, [sp], #16 // x2, x3 + ldp x19, x20, [sp], #16 // x0, x1 + + // Store the guest regs x0-x18 + stp x19, x20, [x2, #CPU_XREG_OFFSET(0)] + stp x21, x22, [x2, #CPU_XREG_OFFSET(2)] stp x4, x5, [x2, #CPU_XREG_OFFSET(4)] stp x6, x7, [x2, #CPU_XREG_OFFSET(6)] stp x8, x9, [x2, #CPU_XREG_OFFSET(8)] @@ -109,20 +114,13 @@ ENTRY(__guest_exit) stp x16, x17, [x2, #CPU_XREG_OFFSET(16)] str x18, [x2, #CPU_XREG_OFFSET(18)] - ldp x6, x7, [sp], #16 // x2, x3 - ldp x4, x5, [sp], #16 // x0, x1 + // Restore the host_ctxt from the stack + ldr x2, [sp], #16 - stp x4, x5, [x2, #CPU_XREG_OFFSET(0)] - stp x6, x7, [x2, #CPU_XREG_OFFSET(2)] - - save_callee_saved_regs x2 - - // Restore vcpu & host_ctxt from the stack - // (preserving return code in x1) - ldp x0, x2, [sp], #16 // Now restore the host regs restore_callee_saved_regs x2 + // Preserving return code (x1) mov x0, x1 ret ENDPROC(__guest_exit) -- Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PULL 00/29] KVM/ARM Changes for v4.7
Hi Itaru, Take a look at this commit, which might be causing the problem. https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/drivers/firmware/efi/arm-runtime.c?id=14c43be60166981f0b1f034ad9c59252c6f99e0d Review your EFI system table and runtime service region attributes. On 05/17/2016 03:00 AM, Julien Grall wrote: > Hello, > > On 17/05/2016 00:28, Itaru Kitayama wrote: >> The new v4.6 upstream kernel gets to the prompt on Mustang (Rev A3). > > I would recommend you bisect Linux and identify the commit or commits > that break booting on your board. > > Regards, > -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PULL 00/29] KVM/ARM Changes for v4.7
mory: 10432K (8003f9a8 - 8003fa4b) >> kvm [1]: 8-bit VMID >> kvm [1]: Hyp mode initialized successfully >> INFO: rcu_preempt detected stalls on CPUs/tasks: >> 0-...: (113 GPs behind) idle=1f9/1/0 softirq=86/86 fqs=0 >> 2-...: (107 GPs behind) idle=559/1/0 softirq=120/120 fqs=0 >> 4-...: (107 GPs behind) idle=33b/1/0 softirq=106/106 fqs=0 >> 5-...: (108 GPs behind) idle=333/1/0 softirq=130/130 fqs=0 >> 6-...: (105 GPs behind) idle=2f7/1/0 softirq=120/120 fqs=0 >> 7-...: (105 GPs behind) idle=327/1/0 softirq=131/131 fqs=0 >> (detected by 1, t=5252 jiffies, g=-135, c=-136, q=8) >> Task dump for CPU 0: >> swapper/0 R running task0 0 0 0x >> Call trace: >> [] __switch_to+0x1c8/0x240 >> [] __f.8076+0x10/0x28 >> Task dump for CPU 2: >> swapper/2 R running task0 0 1 0x >> Call trace: >> [] __switch_to+0x1c8/0x240 >> [<0006a42a>] 0x6a42a >> Task dump for CPU 4: >> swapper/4 R running task0 0 1 0x >> Call trace: >> [] __switch_to+0x1c8/0x240 >> [<00048555>] 0x48555 >> Task dump for CPU 5: >> swapper/5 R running task0 0 1 0x >> Call trace: >> [] __switch_to+0x1c8/0x240 >> [<00048554>] 0x48554 >> Task dump for CPU 6: >> swapper/6 R running task0 0 1 0x >> Call trace: >> [] __switch_to+0x1c8/0x240 >> [<0004855b>] 0x4855b >> Task dump for CPU 7: >> swapper/7 R running task0 0 1 0x >> Call trace: >> [] __switch_to+0x1c8/0x240 >> [<0004855a>] 0x4855a >> rcu_preempt kthread starved for 5252 jiffies! g18446744073709551481 >> c18446744073709551480 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 >> rcu_preempt S 0808ba20 0 7 2 0x >> Call trace: >> [] __switch_to+0x1c8/0x240 >> [] __schedule+0xb68/0x2558 >> [] schedule+0xc4/0x230 >> [] schedule_timeout+0x430/0x858 >> [] rcu_gp_kthread+0x1138/0x1ed0 >> [] kthread+0x1cc/0x1e0 >> [] ret_from_fork+0x10/0x40 >> NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! 
[swapper/0:1] >> Modules linked in: >> >> CPU: 1 PID: 1 Comm: swapper/0 Tainted: GW > 4.6.0-rc7-next-20160513 #6 >> Hardware name: AppliedMicro Mustang/Mustang, BIOS 3.05.05-beta_rc Jan 27 >> 2016 >> task: 8003c034 ti: 8003c040 task.ti: 8003c040 >> PC is at smp_call_function_many+0x780/0x7b0 >> LR is at smp_call_function_many+0x73c/0x7b0 >> pc : [] lr : [] pstate: 8045 >> sp : 8003c0403b50 >> x29: 8003c0403b50 x28: 0a482540 >> x27: 0a482550 x26: 09ffa000 >> x25: 0b34e7c0 x24: 0a041dd0 >> x23: 0b34e7c0 x22: 8003ffe2aa08 >> x21: 0b34e7c0 x20: 8003ffe2aa00 >> x19: 0b34e240 x18: 05f5e0ff >> x17: x16: >> x15: 00c4 x14: 8003fff79500 >> x13: 0a464f28 x12: 0b59cd10 >> x11: x10: 0a62eef8 >> x9 : 0b59ccd0 x8 : 8003c040 >> x7 : x6 : 0b59ccd0 >> x5 : x4 : >> x3 : 8003ffdfdc18 x2 : >> x1 : 0a043548 x0 : 0003 >> >> Kernel panic - not syncing: softlockup: hung tasks >> CPU: 1 PID: 1 Comm: swapper/0 Tainted: GWL >> 4.6.0-rc7-next-20160513 #6 >> Hardware name: AppliedMicro Mustang/Mustang, BIOS 3.05.05-beta_rc Jan 27 >> 2016 >> Call trace: >> [] dump_backtrace+0x0/0x4b0 >> [] show_stack+0x3c/0x60 >> [] dump_stack+0x1dc/0x2c8 >> [] panic+0x264/0x5fc >> [] watchdog_timer_fn+0x804/0x840 >> [] __hrtimer_run_queues+0x4b4/0xf50 >> [] hrtimer_interrupt+0x174/0x478 >> [] arch_timer_handler_phys+0xac/0xc8 >> [] handle_percpu_devid_irq+0x214/0x8b0 >> [] generic_handle_irq+0x8c/0xb0 >> [] __handle_domain_irq+0x178/0x288 >> [] gic_handle_irq+0x1e8/0x270 >> Exception stack(0x8003ffe13fa0 to 0x8003ffe140c0) >> 3fa0: 8003c0403a30 8003ffe2aa00 8003c0403b50 > 082b6390 >> 3fc0: 8045 0a041dd0 8003ffe10020 > 8003ffe14010 >> 3fe0: 0a482550 8003c040 0800f000 > 8003c0403a30 >> 4000: 8003c0403b50 8003c0403a30 > >> 4020: 00
Re: [PATCH v6 10/10] clocksource: arm_arch_timer: Remove arch_timer_get_timecounter
On 04/11/2016 10:33 AM, Julien Grall wrote: > The only call of arch_timer_get_timecounter (in KVM) has been removed. > > Signed-off-by: Julien Grall > Acked-by: Christoffer Dall Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v6 09/10] KVM: arm/arm64: vgic: Rely on the GIC driver to parse the firmware tables
Hi Julien, On 04/11/2016 10:32 AM, Julien Grall wrote: > Currently, the firmware tables are parsed 2 times: once in the GIC > drivers, the other time when initializing the vGIC. It means code > duplication and makes it more tedious to add support for another > firmware table (like ACPI). > > Use the recently introduced helper gic_get_kvm_info() to get > information about the virtual GIC. > > With this change, the virtual GIC becomes agnostic to the firmware > table and KVM will be able to initialize the vGIC on ACPI. > > Signed-off-by: Julien Grall > Reviewed-by: Christoffer Dall > Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server with PAGE_SIZE=4K. > --- > Cc: Marc Zyngier > Cc: Gleb Natapov > Cc: Paolo Bonzini > > Changes in v6: > - Add Christoffer's reviewed-by > > Changes in v4: > - Remove validation checks as they are already done during > parsing. > - Move the alignment check from the parsing to the vGIC code. > - Fix typo in the commit message > > Changes in v2: > - Use 0 rather than a negative value to know when the maintenance > IRQ > is not present. > - Use resource for vcpu and vctrl. 
> --- > include/kvm/arm_vgic.h | 7 +++--- > virt/kvm/arm/vgic-v2.c | 61 > +- > virt/kvm/arm/vgic-v3.c | 47 +- > virt/kvm/arm/vgic.c| 50 ++--- > 4 files changed, 73 insertions(+), 92 deletions(-) > > diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h > index 281caf8..be6037a 100644 > --- a/include/kvm/arm_vgic.h > +++ b/include/kvm/arm_vgic.h > @@ -25,6 +25,7 @@ > #include > #include > #include > +#include > > #define VGIC_NR_IRQS_LEGACY 256 > #define VGIC_NR_SGIS 16 > @@ -353,15 +354,15 @@ bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, > struct irq_phys_map *map); > #define vgic_initialized(k) (!!((k)->arch.vgic.nr_cpus)) > #define vgic_ready(k)((k)->arch.vgic.ready) > > -int vgic_v2_probe(struct device_node *vgic_node, > +int vgic_v2_probe(const struct gic_kvm_info *gic_kvm_info, > const struct vgic_ops **ops, > const struct vgic_params **params); > #ifdef CONFIG_KVM_ARM_VGIC_V3 > -int vgic_v3_probe(struct device_node *vgic_node, > +int vgic_v3_probe(const struct gic_kvm_info *gic_kvm_info, > const struct vgic_ops **ops, > const struct vgic_params **params); > #else > -static inline int vgic_v3_probe(struct device_node *vgic_node, > +static inline int vgic_v3_probe(const struct gic_kvm_info *gic_kvm_info, > const struct vgic_ops **ops, > const struct vgic_params **params) > { > diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c > index 67ec334..7e826c9 100644 > --- a/virt/kvm/arm/vgic-v2.c > +++ b/virt/kvm/arm/vgic-v2.c > @@ -20,9 +20,6 @@ > #include > #include > #include > -#include > -#include > -#include > > #include > > @@ -186,38 +183,39 @@ static void vgic_cpu_init_lrs(void *params) > } > > /** > - * vgic_v2_probe - probe for a GICv2 compatible interrupt controller in > DT > - * @node:pointer to the DT node > - * @ops: address of a pointer to the GICv2 operations > - * @params: address of a pointer to HW-specific parameters > + * vgic_v2_probe - probe for a GICv2 compatible interrupt controller > + * @gic_kvm_info:pointer to the GIC 
description > + * @ops: address of a pointer to the GICv2 operations > + * @params: address of a pointer to HW-specific parameters > * > * Returns 0 if a GICv2 has been found, with the low level operations > * in *ops and the HW parameters in *params. Returns an error code > * otherwise. > */ > -int vgic_v2_probe(struct device_node *vgic_node, > - const struct vgic_ops **ops, > - const struct vgic_params **params) > +int vgic_v2_probe(const struct gic_kvm_info *gic_kvm_info, > +const struct vgic_ops **ops, > +const struct vgic_params **params) > { > int ret; > - struct resource vctrl_res; > - struct resource vcpu_res; > struct vgic_params *vgic = &vgic_v2_params; > + const struct resource *vctrl_res = &gic_kvm_info->vctrl; > + const struct resource *vcpu_res = &gic_kvm_info->vcpu; > > - vgic->maint_irq = irq_of_parse_and_map(vgic_node, 0); > - if (!vgic->maint_irq) { > - kvm_err("err
Re: [PATCH v6 08/10] KVM: arm/arm64: arch_timer: Rely on the arch timer to parse the firmware tables
On 04/11/2016 10:32 AM, Julien Grall wrote: > The firmware table is currently parsed by the virtual timer code in > order to retrieve the virtual timer interrupt. However, this is already > done by the arch timer driver. > > To avoid code duplication, use the newly introduced function > arch_timer_get_kvm_info(), > which returns all the information required by the virtual timer code. > > Signed-off-by: Julien Grall > Acked-by: Christoffer Dall > Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server platform. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v6 07/10] irqchip/gic-v3: Parse and export virtual GIC information
On 04/11/2016 10:32 AM, Julien Grall wrote: > Fill up the recently introduced gic_kvm_info with the hardware > information used for virtualization. > > Signed-off-by: Julien Grall > Cc: Thomas Gleixner > Cc: Jason Cooper > Cc: Marc Zyngier > Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server platform. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v6 06/10] irqchip/gic-v3: Gather all ACPI specific data in a single structure
On 04/11/2016 10:32 AM, Julien Grall wrote: > The ACPI code requires the use of global variables in order to collect > information from the tables. > > To make clear those variables are ACPI specific, gather all of them in a > single structure. > > Furthermore, even if some of the variables are not marked with > __initdata, they are all only used during the initialization. Therefore, > the new variable, which holds the structure, can be marked with > __initdata. > > Signed-off-by: Julien Grall > Acked-by: Christoffer Dall > Reviewed-by: Hanjun Guo > Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server platform. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v6 04/10] irqchip/gic-v2: Parse and export virtual GIC information
On 04/11/2016 10:32 AM, Julien Grall wrote: > For now, the firmware tables are parsed 2 times: once in the GIC > drivers, the other time when initializing the vGIC. It means code > duplication and makes it more tedious to add support for another > firmware table (like ACPI). > > Introduce a new structure and set of helpers to get/set the virtual GIC > information. Also fill up the structure for GICv2. > > Signed-off-by: Julien Grall > Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server platform. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v6 03/10] irqchip/gic-v2: Gather ACPI specific data in a single structure
On 04/11/2016 10:32 AM, Julien Grall wrote: > The ACPI code requires the use of global variables in order to collect > information from the tables. > > For now, a single global variable is used, but more will be added in a > subsequent patch. To make clear they are ACPI specific, gather all the > information in a single structure. > > Signed-off-by: Julien Grall > Acked-by: Christoffer Dall > Acked-by: Hanjun Guo > Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server platform. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v6 02/10] clocksource: arm_arch_timer: Extend arch_timer_kvm_info to get the virtual IRQ
On 04/11/2016 10:32 AM, Julien Grall wrote: > Currently, the firmware table is parsed by the virtual timer code in > order to retrieve the virtual timer interrupt. However, this is already > done by the arch timer driver. > > To avoid code duplication, extend arch_timer_kvm_info to get the virtual > IRQ. > > Note that the KVM code will be modified in a subsequent patch. > > Signed-off-by: Julien Grall > Acked-by: Christoffer Dall > Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server platform. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v6 01/10] clocksource: arm_arch_timer: Gather KVM specific information in a structure
On 04/11/2016 10:32 AM, Julien Grall wrote: > Introduce a structure which are filled up by the arch timer driver and > used by the virtual timer in KVM. > > The first member of this structure will be the timecounter. More members > will be added later. > > A stub for the new helper isn't introduced because KVM requires the arch > timer for both ARM64 and ARM32. > > The function arch_timer_get_timecounter is kept for the time being and > will be dropped in a subsequent patch. > > Signed-off-by: Julien Grall > Acked-by: Christoffer Dall > Tested-by: Shanker Donthineni Using the Qualcomm Technologies QDF2XXX server platform. > --- > Cc: Daniel Lezcano > Cc: Thomas Gleixner > Cc: Marc Zyngier > > Changes in v6: > - Add Christoffer's acked-by > > Changes in v3: > - Rename the patch > - Move the KVM changes and removal of arch_timer_get_timecounter > in separate patches. > --- > drivers/clocksource/arm_arch_timer.c | 12 +--- > include/clocksource/arm_arch_timer.h | 5 + > 2 files changed, 14 insertions(+), 3 deletions(-) > > diff --git a/drivers/clocksource/arm_arch_timer.c > b/drivers/clocksource/arm_arch_timer.c > index 5152b38..62bdfe7 100644 > --- a/drivers/clocksource/arm_arch_timer.c > +++ b/drivers/clocksource/arm_arch_timer.c > @@ -468,11 +468,16 @@ static struct cyclecounter cyclecounter = { > .mask = CLOCKSOURCE_MASK(56), > }; > > -static struct timecounter timecounter; > +static struct arch_timer_kvm_info arch_timer_kvm_info; > + > +struct arch_timer_kvm_info *arch_timer_get_kvm_info(void) > +{ > + return &arch_timer_kvm_info; > +} > > struct timecounter *arch_timer_get_timecounter(void) > { > - return &timecounter; > + return &arch_timer_kvm_info.timecounter; > } > > static void __init arch_counter_register(unsigned type) > @@ -500,7 +505,8 @@ static void __init arch_counter_register(unsigned > type) > clocksource_register_hz(&clocksource_counter, arch_timer_rate); > cyclecounter.mult = clocksource_counter.mult; > cyclecounter.shift = 
clocksource_counter.shift; > - timecounter_init(&timecounter, &cyclecounter, start_count); > + timecounter_init(&arch_timer_kvm_info.timecounter, > + &cyclecounter, start_count); > > /* 56 bits minimum, so we assume worst case rollover */ > sched_clock_register(arch_timer_read_counter, 56, > arch_timer_rate); > diff --git a/include/clocksource/arm_arch_timer.h > b/include/clocksource/arm_arch_timer.h > index 25d0914..9101ed6b 100644 > --- a/include/clocksource/arm_arch_timer.h > +++ b/include/clocksource/arm_arch_timer.h > @@ -49,11 +49,16 @@ enum arch_timer_reg { > > #define ARCH_TIMER_EVT_STREAM_FREQ 1 /* 100us */ > > +struct arch_timer_kvm_info { > + struct timecounter timecounter; > +}; > + > #ifdef CONFIG_ARM_ARCH_TIMER > > extern u32 arch_timer_get_rate(void); > extern u64 (*arch_timer_read_counter)(void); > extern struct timecounter *arch_timer_get_timecounter(void); > +extern struct arch_timer_kvm_info *arch_timer_get_kvm_info(void); > > #else > -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v5 6/9] irqchip/gic-v3: Parse and export virtual GIC information
Hi Julien, On 04/11/2016 09:27 AM, Julien Grall wrote: > Hello Hanjun, > > On 11/04/16 06:27, Hanjun Guo wrote: >> On 2016/4/4 19:37, Julien Grall wrote: >>> +static void __init gic_acpi_setup_kvm_info(void) >>> +{ >>> +int irq; >>> + >>> +if (!gic_acpi_collect_virt_info()) { >>> +pr_warn("Unable to get hardware information used for >>> virtualization\n"); >>> +return; >>> +} >>> + >>> +gic_v3_kvm_info.type = GIC_V3; >>> + >>> +irq = acpi_register_gsi(NULL, acpi_data.maint_irq, >>> +acpi_data.maint_irq_mode, >>> +ACPI_ACTIVE_HIGH); >>> +if (irq <= 0) >>> +return; >>> + >>> +gic_v3_kvm_info.maint_irq = irq; >>> + >>> +if (acpi_data.vcpu_base) { >> >> Sorry, I'm not familiar with KVM, but I got a question here: will >> KVM work without a valid vcpu_base in GICv3 mode? > Yes, KVM works without vcpu_base in GICv3 mode. The vcpu_base will be used for emulating the vGICv2 feature. The vGICv3 emulation is done through the system registers. > vcpu_base is only required for supporting GICv2 on GICv3. > Yes, you are right. > Regards, > -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [PATCH v5 6/9] irqchip/gic-v3: Parse and export virtual GIC information
return 0; > + if (first_madt) { > + first_madt = false; > + > + acpi_data.maint_irq = gicc->vgic_interrupt; > + acpi_data.maint_irq_mode = maint_irq_mode; > + acpi_data.vcpu_base = gicc->gicv_base_address; > + > + return 0; > + } > + > + /* > + * The maintenance interrupt and GICV should be the same for every > CPU > + */ > + if ((acpi_data.maint_irq != gicc->vgic_interrupt) || > + (acpi_data.maint_irq_mode != maint_irq_mode) || > + (acpi_data.vcpu_base != gicc->gicv_base_address)) > + return -EINVAL; > + > + return 0; > +} > + > +static bool __init gic_acpi_collect_virt_info(void) > +{ > + int count; > + > + count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT, > + gic_acpi_parse_virt_madt_gicc, 0); > + > + return (count > 0); > +} > + > #define ACPI_GICV3_DIST_MEM_SIZE (SZ_64K) > +#define ACPI_GICV2_VCTRL_MEM_SIZE(SZ_4K) > +#define ACPI_GICV2_VCPU_MEM_SIZE (SZ_8K) > + > +static void __init gic_acpi_setup_kvm_info(void) > +{ > + int irq; > + > + if (!gic_acpi_collect_virt_info()) { > + pr_warn("Unable to get hardware information used for > virtualization\n"); > + return; > + } > + > + gic_v3_kvm_info.type = GIC_V3; > + > + irq = acpi_register_gsi(NULL, acpi_data.maint_irq, > + acpi_data.maint_irq_mode, > + ACPI_ACTIVE_HIGH); > + if (irq <= 0) > + return; > + > + gic_v3_kvm_info.maint_irq = irq; > + > + if (acpi_data.vcpu_base) { > + struct resource *vcpu = &gic_v3_kvm_info.vcpu; > + > + vcpu->flags = IORESOURCE_MEM; > + vcpu->start = acpi_data.vcpu_base; > + vcpu->end = vcpu->start + ACPI_GICV2_VCPU_MEM_SIZE - 1; > + } > + > + gic_set_kvm_info(&gic_v3_kvm_info); > +} > > static int __init > gic_acpi_init(struct acpi_subtable_header *header, const unsigned long > end) > @@ -1159,6 +1265,8 @@ gic_acpi_init(struct acpi_subtable_header *header, > const unsigned long end) > goto out_fwhandle_free; > > acpi_set_irq_model(ACPI_IRQ_MODEL_GIC, domain_handle); > + gic_acpi_setup_kvm_info(); > + > return 0; > > out_fwhandle_free: > diff --git 
a/include/linux/irqchip/arm-gic-common.h > b/include/linux/irqchip/arm-gic-common.h > index ef34f6f..c647b05 100644 > --- a/include/linux/irqchip/arm-gic-common.h > +++ b/include/linux/irqchip/arm-gic-common.h > @@ -15,6 +15,7 @@ > > enum gic_type { > GIC_V2, > + GIC_V3, > }; > > struct gic_kvm_info { -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: Intermittent guest kernel crashes with v4.5-rc6.
On 03/03/2016 08:03 AM, Marc Zyngier wrote: > On 03/03/16 13:25, Shanker Donthineni wrote: >> >> On 03/02/2016 11:35 AM, Marc Zyngier wrote: >>> On 02/03/16 15:48, Shanker Donthineni wrote: >>> >>>> We haven't started running heavy workloads in VMs. So far we >>>> have noticed this random behavior only during guest >>>> kernel boot (at EL1). >>>> >>>> We didn't see this problem on the 4.3 kernel. Do you think it is >>>> related to TLB conflicts? >>> I cannot imagine why a DSB would solve a TLB conflict. But the fact that >>> you didn't see it crashing on 4.3 is a good indication that something >>> else is at play. >>> >>> In 4.5, we've rewritten a large part of KVM in C, which has changed the >>> ordering of the various accesses a lot. It could be that a latent >>> problem is now exposed more widely. >>> >>> Can you try moving this DSB around and find out what is the earliest >>> point where it solves this problem? Some sort of bisection? >> The maximum I can move the 'dsb ishst' up is to the beginning of >> __guest_enter(), but not outside of this function. >> >> I don't understand why the code below is failing, with the branch >> instruction causing problems. >> >> /* Jump in the fire! */ >> + dsb(ishst); >> exit_code = __guest_enter(vcpu, host_ctxt); >> /* And we're baaack! */ > That's very worrying. I can't see how the branch can have an influence > on the DSB (nor why the DSB has an influence on the rest of the > execution, btw). > > What if you replace the DSB with an ISB? Do you observe a similar > behaviour (works if the barrier is in __guest_enter, but not if it is > outside)? I have already tried an ISB, without success. I did another experiment: flushing stage-2 TLBs before calling __guest_enter(), and that fixed the problem. > Another thing worth looking at is what happened just before we decided > to get back into the guest. Or to put it differently, what was the > reason to exit in the first place. Was it a Stage-2 fault by any chance? 
I will collect as much debug data as possible and update you with the results. I went through your refactored KVM C code and did not find anything suspicious. I am thinking Qualcomm CPUs may have very aggressive prefetch logic that is causing the problem. > Thanks, > > M. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: Intermittent guest kernel crashes with v4.5-rc6.
On 03/02/2016 11:35 AM, Marc Zyngier wrote: > On 02/03/16 15:48, Shanker Donthineni wrote: >> >> We haven't started running heavy workloads in VMs. So far we >> have noticed this random behavior only during guest >> kernel boot (at EL1). >> >> We didn't see this problem on the 4.3 kernel. Do you think it is >> related to TLB conflicts? > I cannot imagine why a DSB would solve a TLB conflict. But the fact that > you didn't see it crashing on 4.3 is a good indication that something > else is at play. > > In 4.5, we've rewritten a large part of KVM in C, which has changed the > ordering of the various accesses a lot. It could be that a latent > problem is now exposed more widely. > > Can you try moving this DSB around and find out what is the earliest > point where it solves this problem? Some sort of bisection? The maximum I can move the 'dsb ishst' up is to the beginning of __guest_enter(), but not outside of this function. I don't understand why the code below is failing, with the branch instruction causing problems. /* Jump in the fire! */ + dsb(ishst); exit_code = __guest_enter(vcpu, host_ctxt); /* And we're baaack! */ > Thanks, > > M. -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
Re: Intermittent guest kernel crashes with v4.5-rc6.
On 03/02/2016 09:09 AM, Marc Zyngier wrote: > On 02/03/16 14:59, Shanker Donthineni wrote: >> Hi Marc, >> >> Thanks for your quick reply. >> >> On 03/02/2016 08:16 AM, Marc Zyngier wrote: >>> On 02/03/16 13:56, Shanker Donthineni wrote: >>>> For some reason v4.5-rc6 kernel is not stable for guest machines on >>>> Qualcomm server platforms. >>>> We are getting IABT translation faults while booting the guest kernel. >>>> The problem disappears with >>>> the following code snippet (insert "dsb ish" instruction just before >>>> switching to EL1 guest). I am >>>> using v4.5-rc6 kernel for both host and guest machines. >>>> >>>> Please let me know if you have any thoughts or ideas for tracing this >>>> problem. >>>> >>>> --- a/arch/arm64/kvm/hyp/entry.S >>>> +++ b/arch/arm64/kvm/hyp/entry.S >>>> @@ -88,6 +88,7 @@ ENTRY(__guest_enter) >>>> ldp x0, x1, [sp], #16 >>>> >>>> // Do not touch any register after this! >>>> + dsb ish >>>> eret >>>>ENDPROC(__guest_enter) >>>> >>>> >>>> Using below QEMU command for launching guest machine: >>>> >>>> qemu-system-aarch64 -machine type=virt,accel=kvm,gic-version=3 \ >>>> -cpu "host" -smp cpus=1,maxcpus=1 -m 256M -serial stdio \ >>>> -kernel /boot/Image -initrd /boot/rootfs.cpio.gz \ >>>> -append 'earlycon=earlycon=pl011,0x0900 \ >>>> console=ttyAMA0,115200 root=/dev/ram' >>>> >>>> >>>> Guest machine crash log messages: >>>> >>>> [0.00] Booting Linux on physical CPU 0x0 >>>> [0.00] Boot CPU: AArch64 Processor [510f2811] >>>> [0.00] Bad mode in Synchronous Abort handler detected, code >>>> 0x860f -- IABT (current EL) >>>> [0.00] CPU: 0 PID: 0 Comm: swapper Not tainted 4.5.rc6+ >>>> [0.00] task: ffc000d52200 ti: ffc000d44000 task.ti: >>>> ffc000d44000 >>>> [0.00] PC is at early_init_dt_scan_root+0x28/0x94 >>>> [0.00] LR is at of_scan_flat_dt+0x9c/0xd0 >>>> [0.00] pc : [] lr : [] >>>> pstate: 83c5 >>>> [0.00] sp : ffc000d47e80 >>>> [0.00] x29: ffc000d47e80 x28: >>>> >>> If you're getting a prefetch abort, it would be interesting to find 
out >>> what instruction is there, whether the page is mapped at stage-2 or not, >>> what are the stage-2 permissions... Basically, a full description of the >>> memory state. >>> >>> Also, does it work if you do a "dsb ishst" instead? >>> >>> Thanks, >>> >>> M. >> Most of the times it is faulting at ldr/str instructions. I have >> verified stage-1 page and the >> the corresponding stage-2 page attributes (SH, AP, PERM), PA etc. after >> IABT, everything >> perfectly matches. I am very confident that stage-1/stage-2 MMU page >> tables are correct. >> >> Instruction "dsb ishst" also fixing the problem. >> >> One more Interesting observation, if retry an instruction fetch that >> caused IABT, second >> time fetch is successful and I don't see IABT. I used below >> experimental code to test. >> >> --- a/arch/arm64/kernel/entry.S >> +++ b/arch/arm64/kernel/entry.S >> @@ -346,6 +346,7 @@ el1_sync: >> b.eqel1_undef >> cmp x24, #ESR_ELx_EC_BREAKPT_CUR// debug exception in EL1 >> b.geel1_dbg >> + kernel_exit 1 >> b el1_inv >> el1_da: >> >> > OK, that's pretty scary, specially considering that we don't have a DSB > on that path. Do you ever see it exploding at EL0? > > Thanks, > > M. We haven't started running heavy workloads in VMs. So far we have noticed this random nature behavior only during guest kernel boot (at EL1). We didn't see this problem on 4.3 kernel. Do you think it is related to TLB conflicts? -- Shanker Donthineni Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc. Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm