Re: [PATCH v5 0/3] LLVM/Clang fixes for a few defconfigs
On Thu, Nov 28, 2019 at 03:59:07PM +1100, Michael Ellerman wrote:
> Nick Desaulniers writes:
> > Hi Michael,
> > Do you have feedback for Nathan? Rebasing these patches is becoming a
> > nuisance for our CI, and we would like to keep building PPC w/ Clang.
>
> Sorry, just lost in the flood of patches.
>
> Merged now.
>
> cheers

Thank you very much for picking them up :)

Cheers,
Nathan
Re: [PATCH v3 4/8] powerpc/vdso32: inline __get_datapage()
Christophe Leroy writes:
> Le 22/11/2019 à 07:38, Michael Ellerman a écrit :
>> Michael Ellerman writes:
>>> Christophe Leroy writes:
>>>> __get_datapage() is only a few instructions to retrieve the address
>>>> of the page where the kernel stores data to the VDSO.
>>>>
>>>> By inlining this function into its users, a bl/blr pair and a
>>>> mflr/mtlr pair is avoided, plus a few reg moves.
>>>>
>>>> The improvement is noticeable (about 55 nsec/call on an 8xx)
>>>>
>>>> vdsotest before the patch:
>>>> gettimeofday:  vdso: 731 nsec/call
>>>> clock-gettime-realtime-coarse:  vdso: 668 nsec/call
>>>> clock-gettime-monotonic-coarse:  vdso: 745 nsec/call
>>>>
>>>> vdsotest after the patch:
>>>> gettimeofday:  vdso: 677 nsec/call
>>>> clock-gettime-realtime-coarse:  vdso: 613 nsec/call
>>>> clock-gettime-monotonic-coarse:  vdso: 690 nsec/call
>>>>
>>>> Signed-off-by: Christophe Leroy
>>>
>>> This doesn't build with gcc 4.6.3:
>>>
>>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S: Assembler messages:
>>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S:41: Error: unsupported relocation against __kernel_datapage_offset
>>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S:86: Error: unsupported relocation against __kernel_datapage_offset
>>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S:213: Error: unsupported relocation against __kernel_datapage_offset
>>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S:247: Error: unsupported relocation against __kernel_datapage_offset
>>>   make[4]: *** [arch/powerpc/kernel/vdso32/gettimeofday.o] Error 1
>>
>> Actually I guess it's binutils, which is v2.22 in this case.
>>
>> Needed this:
>>
>> diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
>> index 12785f72f17d..0048db347ddf 100644
>> --- a/arch/powerpc/include/asm/vdso_datapage.h
>> +++ b/arch/powerpc/include/asm/vdso_datapage.h
>> @@ -117,7 +117,7 @@ extern struct vdso_data *vdso_data;
>>  .macro get_datapage ptr, tmp
>>  	bcl	20, 31, .+4
>>  	mflr	\ptr
>> -	addi	\ptr, \ptr, __kernel_datapage_offset - (.-4)
>> +	addi	\ptr, \ptr, (__kernel_datapage_offset - (.-4))@l
>>  	lwz	\tmp, 0(\ptr)
>>  	add	\ptr, \tmp, \ptr
>>  .endm
>>
>
> Are you still planning to get this series merged ? Do you need any
> help / rebase / re-spin ?

Not sure. I'll possibly send a 2nd pull request next week with it
included.

cheers
Re: [PATCH v11 0/7] KVM: PPC: Driver to manage pages of secure guest
On Mon, Nov 25, 2019 at 08:36:24AM +0530, Bharata B Rao wrote:
> Hi,
>
> This is the next version of the patchset that adds required support
> in the KVM hypervisor to run secure guests on PEF-enabled POWER platforms.

Here is a fix for the issue Hugh identified with the usage of
ksm_madvise() in this patchset. It applies on top of this patchset.

From 8a4d769bf4c61f921c79ce68923be3c403bd5862 Mon Sep 17 00:00:00 2001
From: Bharata B Rao
Date: Thu, 28 Nov 2019 09:31:54 +0530
Subject: [PATCH 1/1] KVM: PPC: Book3S HV: Take write mmap_sem when calling
 ksm_madvise

In order to prevent the device private pages (that correspond to pages
of secure guest) from participating in KSM merging, H_SVM_PAGE_IN calls
ksm_madvise() under read version of mmap_sem. However ksm_madvise()
needs to be under write lock, fix this.

Signed-off-by: Bharata B Rao
---
 arch/powerpc/kvm/book3s_hv_uvmem.c | 29 ++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index f24ac3cfb34c..2de264fc3156 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -46,11 +46,10 @@
  *
  * Locking order
  *
- * 1. srcu_read_lock(&kvm->srcu) - Protects KVM memslots
- * 2. down_read(&kvm->mm->mmap_sem) - find_vma, migrate_vma_pages and helpers
- * 3. mutex_lock(&kvm->arch.uvmem_lock) - protects read/writes to uvmem slots
- *    thus acting as sync-points
- *    for page-in/out
+ * 1. kvm->srcu - Protects KVM memslots
+ * 2. kvm->mm->mmap_sem - find_vma, migrate_vma_pages and helpers, ksm_madvise
+ * 3. kvm->arch.uvmem_lock - protects read/writes to uvmem slots thus acting
+ *    as sync-points for page-in/out
  */
 
 /*
@@ -344,7 +343,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
 static int
 kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
 		   unsigned long end, unsigned long gpa, struct kvm *kvm,
-		   unsigned long page_shift)
+		   unsigned long page_shift, bool *downgrade)
 {
 	unsigned long src_pfn, dst_pfn = 0;
 	struct migrate_vma mig;
@@ -360,8 +359,15 @@ kvmppc_svm_page_in(struct vm_area_struct *vma, unsigned long start,
 	mig.src = &src_pfn;
 	mig.dst = &dst_pfn;
 
+	/*
+	 * We come here with mmap_sem write lock held just for
+	 * ksm_madvise(), otherwise we only need read mmap_sem.
+	 * Hence downgrade to read lock once ksm_madvise() is done.
+	 */
 	ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
 			  MADV_UNMERGEABLE, &vma->vm_flags);
+	downgrade_write(&kvm->mm->mmap_sem);
+	*downgrade = true;
 	if (ret)
 		return ret;
 
@@ -456,6 +462,7 @@ unsigned long
 kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 		     unsigned long flags, unsigned long page_shift)
 {
+	bool downgrade = false;
 	unsigned long start, end;
 	struct vm_area_struct *vma;
 	int srcu_idx;
@@ -476,7 +483,7 @@ kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 	ret = H_PARAMETER;
 	srcu_idx = srcu_read_lock(&kvm->srcu);
-	down_read(&kvm->mm->mmap_sem);
+	down_write(&kvm->mm->mmap_sem);
 
 	start = gfn_to_hva(kvm, gfn);
 	if (kvm_is_error_hva(start))
@@ -492,12 +499,16 @@ kvmppc_h_svm_page_in(struct kvm *kvm, unsigned long gpa,
 	if (!vma || vma->vm_start > start || vma->vm_end < end)
 		goto out_unlock;
 
-	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift))
+	if (!kvmppc_svm_page_in(vma, start, end, gpa, kvm, page_shift,
+				&downgrade))
 		ret = H_SUCCESS;
 out_unlock:
 	mutex_unlock(&kvm->arch.uvmem_lock);
 out:
-	up_read(&kvm->mm->mmap_sem);
+	if (downgrade)
+		up_read(&kvm->mm->mmap_sem);
+	else
+		up_write(&kvm->mm->mmap_sem);
 	srcu_read_unlock(&kvm->srcu, srcu_idx);
 	return ret;
 }
-- 
2.21.0
Re: [PATCH v5 0/3] LLVM/Clang fixes for a few defconfigs
Nick Desaulniers writes:
> Hi Michael,
> Do you have feedback for Nathan? Rebasing these patches is becoming a
> nuisance for our CI, and we would like to keep building PPC w/ Clang.

Sorry, just lost in the flood of patches.

Merged now.

cheers

> On Mon, Nov 18, 2019 at 8:57 PM Nathan Chancellor wrote:
>>
>> Hi all,
>>
>> This series includes a set of fixes for LLVM/Clang when building
>> a few defconfigs (powernv, ppc44x, and pseries are the ones that our
>> CI configuration tests [1]). The first patch fixes pseries_defconfig,
>> which has never worked in mainline. The second and third patches fix
>> issues with all of these configs due to internal changes to LLVM, which
>> point out issues with the kernel.
>>
>> These have been broken since July/August; it would be nice to get these
>> reviewed and applied. Please let me know what I can do to get these
>> applied soon so we can stop applying them out of tree.
>>
>> [1]: https://github.com/ClangBuiltLinux/continuous-integration
>>
>> Previous versions:
>>
>> v3: https://lore.kernel.org/lkml/20190911182049.77853-1-natechancel...@gmail.com/
>>
>> v4: https://lore.kernel.org/lkml/20191014025101.18567-1-natechancel...@gmail.com/
>>
>> Cheers,
>> Nathan
>
> --
> Thanks,
> ~Nick Desaulniers
Re: [Very RFC 45/46] powernv/pci: Remove requirement for a pdn in config accessors
On 20/11/2019 12:28, Oliver O'Halloran wrote:
> :toot:
>
> Signed-off-by: Oliver O'Halloran

Squash it into 26/46 "powernv/pci: Remove pdn from pnv_pci_cfg_{read|write}".

Thanks,

> ---
>  arch/powerpc/platforms/powernv/pci.c | 10 ----------
>  1 file changed, 10 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> index 0eeea8652426..6383dcfec606 100644
> --- a/arch/powerpc/platforms/powernv/pci.c
> +++ b/arch/powerpc/platforms/powernv/pci.c
> @@ -750,17 +750,12 @@
>  static int pnv_pci_read_config(struct pci_bus *bus,
>  			       unsigned int devfn,
>  			       int where, int size, u32 *val)
>  {
> -	struct pci_dn *pdn;
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	u16 bdfn = bus->number << 8 | devfn;
>  	struct eeh_dev *edev;
>  	int ret;
>
>  	*val = 0xFFFFFFFF;
> -	pdn = pci_get_pdn_by_devfn(bus, devfn);
> -	if (!pdn)
> -		return PCIBIOS_DEVICE_NOT_FOUND;
> -
>  	edev = pnv_eeh_find_edev(phb, bdfn);
>  	if (!pnv_eeh_pre_cfg_check(edev))
>  		return PCIBIOS_DEVICE_NOT_FOUND;
> @@ -781,16 +776,11 @@
>  static int pnv_pci_write_config(struct pci_bus *bus,
>  				unsigned int devfn,
>  				int where, int size, u32 val)
>  {
> -	struct pci_dn *pdn;
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(bus);
>  	u16 bdfn = bus->number << 8 | devfn;
>  	struct eeh_dev *edev;
>  	int ret;
>
> -	pdn = pci_get_pdn_by_devfn(bus, devfn);
> -	if (!pdn)
> -		return PCIBIOS_DEVICE_NOT_FOUND;
> -
>  	edev = pnv_eeh_find_edev(phb, bdfn);
>  	if (!pnv_eeh_pre_cfg_check(edev))
>  		return PCIBIOS_DEVICE_NOT_FOUND;

--
Alexey
Re: [PATCH 1/1] powerpc/kvm/book3s: Fixes possible 'use after release' of kvm
On Tue, Nov 26, 2019 at 02:52:12PM -0300, Leonardo Bras wrote:
> Fixes a possible 'use after free' of kvm variable.
> It does use mutex_unlock(&kvm->lock) after possible freeing a variable
> with kvm_put_kvm(kvm).

Comments below...

> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
> index 5834db0a54c6..a402ead833b6 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -316,14 +316,13 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
>
>  	if (ret >= 0)
>  		list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
> -	else
> -		kvm_put_kvm(kvm);
>
>  	mutex_unlock(&kvm->lock);
>
>  	if (ret >= 0)
>  		return ret;
>
> +	kvm_put_kvm(kvm);

There isn't a potential use-after-free here. We are relying on the
property that the release function (kvm_vm_release) cannot be called
in parallel with this function. The reason is that this function
(kvm_vm_ioctl_create_spapr_tce) is handling an ioctl on a kvm VM file
descriptor. That means that a userspace process has the file descriptor
still open. The code that implements the close() system call makes sure
that no thread is still executing inside any system call that is using
the same file descriptor before calling the file descriptor's release
function (in this case, kvm_vm_release). That means that this
kvm_put_kvm() call here cannot make the reference count go to zero.

>  	kfree(stt);
>  fail_acct:
>  	account_locked_vm(current->mm, kvmppc_stt_pages(npages), false);

> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 13efc291b1c7..f37089b60d09 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2744,10 +2744,8 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>  	/* Now it's all set up, let userspace reach it */
>  	kvm_get_kvm(kvm);
>  	r = create_vcpu_fd(vcpu);
> -	if (r < 0) {
> -		kvm_put_kvm(kvm);
> +	if (r < 0)
>  		goto unlock_vcpu_destroy;
> -	}
>
>  	kvm->vcpus[atomic_read(&kvm->online_vcpus)] = vcpu;
>
> @@ -2771,6 +2769,8 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>  	mutex_lock(&kvm->lock);
>  	kvm->created_vcpus--;
>  	mutex_unlock(&kvm->lock);
> +	if (r < 0)
> +		kvm_put_kvm(kvm);
>  	return r;
>  }

Once again we are inside an ioctl on the kvm VM file descriptor, so the
reference count cannot go to zero.

> @@ -3183,10 +3183,10 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
>  	kvm_get_kvm(kvm);
>  	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR | O_CLOEXEC);
>  	if (ret < 0) {
> -		kvm_put_kvm(kvm);
>  		mutex_lock(&kvm->lock);
>  		list_del(&dev->vm_node);
>  		mutex_unlock(&kvm->lock);
> +		kvm_put_kvm(kvm);
>  		ops->destroy(dev);
>  		return ret;
>  	}

Same again here.

Paul.
Re: [Very RFC 44/46] powerpc/pci: Don't set pdn->pe_number when applying the weird P8 NVLink PE hack
On 20/11/2019 12:28, Oliver O'Halloran wrote:
> P8 needs to shove four GPUs into three PEs for $reasons. Remove the
> pdn->pe_assignment done there since we just use the pe_rmap[] now.

Reviewed-by: Alexey Kardashevskiy

> Signed-off-by: Oliver O'Halloran
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 2a9201306543..eceff27357e5 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1183,7 +1183,6 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
>  	long rid;
>  	struct pnv_ioda_pe *pe;
>  	struct pci_dev *gpu_pdev;
> -	struct pci_dn *npu_pdn;
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(npu_pdev->bus);
>
>  	/*
> @@ -1210,9 +1209,8 @@ static struct pnv_ioda_pe *pnv_ioda_setup_npu_PE(struct pci_dev *npu_pdev)
>  	dev_info(&npu_pdev->dev,
>  		 "Associating to existing PE %x\n", pe_num);
>  	pci_dev_get(npu_pdev);
> -	npu_pdn = pci_get_pdn(npu_pdev);
> -	rid = npu_pdev->bus->number << 8 | npu_pdn->devfn;
> -	npu_pdn->pe_number = pe_num;
> +
> +	rid = npu_pdev->bus->number << 8 | npu_pdev->devfn;
>  	phb->ioda.pe_rmap[rid] = pe->pe_number;
>
>  	/* Map the PE to this link */

--
Alexey
Re: [Very RFC 43/46] powernv/pci: Do not set pdn->pe_number for NPU/CAPI devices
cc: Greg.

On 20/11/2019 12:28, Oliver O'Halloran wrote:
> The only thing we need the pdn for in this function is setting the
> pe_number field, which we don't use anymore. Fix the weird refcounting
> behaviour while we're here.
>
> Signed-off-by: Oliver O'Halloran
> ---
> Either Fred or Reza also fixed this in some patch lately and that'll
> probably get merged before this one does.
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 27 ++++++++++-----------------
>  1 file changed, 10 insertions(+), 17 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 45d940730c30..2a9201306543 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -1066,16 +1066,13 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>  static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  {
>  	struct pnv_phb *phb = pci_bus_to_pnvhb(dev->bus);
> -	struct pci_dn *pdn = pci_get_pdn(dev);
> -	struct pnv_ioda_pe *pe;
> +	struct pnv_ioda_pe *pe = pnv_ioda_get_pe(dev);
>
> -	if (!pdn) {
> -		pr_err("%s: Device tree node not associated properly\n",
> -		       pci_name(dev));
> +	/* Already has a PE assigned? huh? */
> +	if (pe) {
> +		WARN_ON(1);
>  		return NULL;
>  	}
> -	if (pdn->pe_number != IODA_INVALID_PE)
> -		return NULL;
>
>  	pe = pnv_ioda_alloc_pe(phb);
>  	if (!pe) {
> @@ -1084,29 +1081,25 @@ static struct pnv_ioda_pe *pnv_ioda_setup_dev_PE(struct pci_dev *dev)
>  		return NULL;
>  	}
>
> -	/* NOTE: We get only one ref to the pci_dev for the pdn, not for the
> -	 * pointer in the PE data structure, both should be destroyed at the
> -	 * same time. However, this needs to be looked at more closely again
> -	 * once we actually start removing things (Hotplug, SR-IOV, ...)
> +	/*
> +	 * NB: We **do not** hold a pci_dev ref for pe->pdev.
>  	 *
> -	 * At some point we want to remove the PDN completely anyways
> +	 * The pci_dev's release function cleans up the ioda_pe state, so:
> +	 * a) We can't take a ref otherwise the release function is never called
> +	 * b) The pe->pdev pointer will always point to valid pci_dev (or NULL)
>  	 */
> -	pci_dev_get(dev);
> -	pdn->pe_number = pe->pe_number;
>  	pe->flags = PNV_IODA_PE_DEV;
>  	pe->pdev = dev;
>  	pe->pbus = NULL;
>  	pe->mve_number = -1;
> -	pe->rid = dev->bus->number << 8 | pdn->devfn;
> +	pe->rid = dev->bus->number << 8 | dev->devfn;
>
>  	pe_info(pe, "Associated device to PE\n");
>
>  	if (pnv_ioda_configure_pe(phb, pe)) {
>  		/* XXX What do we do here ? */
>  		pnv_ioda_free_pe(pe);
> -		pdn->pe_number = IODA_INVALID_PE;
>  		pe->pdev = NULL;
> -		pci_dev_put(dev);
>  		return NULL;
>  	}

--
Alexey
[PATCH] powerpc: add link stack flush mitigation status in debugfs.
The link stack flush status is not visible in debugfs. It can be enabled
even when count cache flush is disabled. Add a separate file for its
status.

Signed-off-by: Michal Suchanek
---
 arch/powerpc/kernel/security.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index 7d4b2080a658..56dce4798a4d 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -446,14 +446,26 @@ static int count_cache_flush_get(void *data, u64 *val)
 	return 0;
 }
 
+static int link_stack_flush_get(void *data, u64 *val)
+{
+	*val = link_stack_flush_enabled;
+
+	return 0;
+}
+
 DEFINE_DEBUGFS_ATTRIBUTE(fops_count_cache_flush, count_cache_flush_get,
 			 count_cache_flush_set, "%llu\n");
+DEFINE_DEBUGFS_ATTRIBUTE(fops_link_stack_flush, link_stack_flush_get,
+			 count_cache_flush_set, "%llu\n");
 
 static __init int count_cache_flush_debugfs_init(void)
 {
 	debugfs_create_file_unsafe("count_cache_flush", 0600,
 				   powerpc_debugfs_root, NULL,
 				   &fops_count_cache_flush);
+	debugfs_create_file_unsafe("link_stack_flush", 0600,
+				   powerpc_debugfs_root, NULL,
+				   &fops_link_stack_flush);
 	return 0;
 }
 device_initcall(count_cache_flush_debugfs_init);
-- 
2.23.0
[PATCH] selftests/powerpc: Use write_pmc instead of count_pmc to reset PMCs at the end of ebb selftests
By using count_pmc() to reset the PMC instead of write_pmc(), an extra
count is performed on ebb_state.stats.pmc_count[PMC_INDEX(pmc)], beyond
the value accounted by ebb_state.stats.ebb_count in the main test loops.
This extra pmc_count makes a few tests fail occasionally on PowerVM
systems with high workloads, such as cycles_test shown hereafter, where
the ebb_count is occasionally above the upper limit due to this extra
count.

Moreover, this is also indicated by an extra PMC1 trace_log entry in the
output of a few tests:

==========
...
	[21]: counter = 8
	[22]: register SPRN_MMCR0 = 0x8080
	[23]: register SPRN_PMC1 = 0x8004
	[24]: counter = 9
	[25]: register SPRN_MMCR0 = 0x8080
	[26]: register SPRN_PMC1 = 0x8004
	[27]: counter = 10
	[28]: register SPRN_MMCR0 = 0x8080
	[29]: register SPRN_PMC1 = 0x8004
>>	[30]: register SPRN_PMC1 = 0x451e
PMC1 count (0x28546) above upper limit 0x283e8 (+0x15e)
[FAIL] Test FAILED on line 52
failure: cycles
==========

Signed-off-by: Desnes A. Nunes do Rosario
---
 .../powerpc/pmu/ebb/back_to_back_ebbs_test.c    |  2 +-
 .../selftests/powerpc/pmu/ebb/cycles_test.c     |  2 +-
 .../powerpc/pmu/ebb/cycles_with_freeze_test.c   |  2 +-
 .../powerpc/pmu/ebb/cycles_with_mmcr2_test.c    |  2 +-
 tools/testing/selftests/powerpc/pmu/ebb/ebb.c   |  2 +-
 .../powerpc/pmu/ebb/ebb_on_willing_child_test.c |  2 +-
 .../powerpc/pmu/ebb/lost_exception_test.c       |  2 +-
 .../powerpc/pmu/ebb/multi_counter_test.c        | 12 ++++++------
 .../powerpc/pmu/ebb/multi_ebb_procs_test.c      |  2 +-
 .../powerpc/pmu/ebb/pmae_handling_test.c        |  2 +-
 .../powerpc/pmu/ebb/pmc56_overflow_test.c       |  2 +-
 11 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/powerpc/pmu/ebb/back_to_back_ebbs_test.c b/tools/testing/selftests/powerpc/pmu/ebb/back_to_back_ebbs_test.c
index a2d7b0e3dca9..f133ab425f10 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/back_to_back_ebbs_test.c
+++ b/tools/testing/selftests/powerpc/pmu/ebb/back_to_back_ebbs_test.c
@@ -91,7 +91,7 @@ int back_to_back_ebbs(void)
 	ebb_global_disable();
 	ebb_freeze_pmcs();
 
-	count_pmc(1, sample_period);
+	write_pmc(1, pmc_sample_period(sample_period));
 
 	dump_ebb_state();
 
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/cycles_test.c b/tools/testing/selftests/powerpc/pmu/ebb/cycles_test.c
index bc893813483e..14a399a64729 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/cycles_test.c
+++ b/tools/testing/selftests/powerpc/pmu/ebb/cycles_test.c
@@ -42,7 +42,7 @@ int cycles(void)
 	ebb_global_disable();
 	ebb_freeze_pmcs();
 
-	count_pmc(1, sample_period);
+	write_pmc(1, pmc_sample_period(sample_period));
 
 	dump_ebb_state();
 
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/cycles_with_freeze_test.c b/tools/testing/selftests/powerpc/pmu/ebb/cycles_with_freeze_test.c
index dcd351d20328..0f2089f6f82c 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/cycles_with_freeze_test.c
+++ b/tools/testing/selftests/powerpc/pmu/ebb/cycles_with_freeze_test.c
@@ -99,7 +99,7 @@ int cycles_with_freeze(void)
 	ebb_global_disable();
 	ebb_freeze_pmcs();
 
-	count_pmc(1, sample_period);
+	write_pmc(1, pmc_sample_period(sample_period));
 
 	dump_ebb_state();
 
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/cycles_with_mmcr2_test.c b/tools/testing/selftests/powerpc/pmu/ebb/cycles_with_mmcr2_test.c
index 94c99c12c0f2..a8f3bee04cd8 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/cycles_with_mmcr2_test.c
+++ b/tools/testing/selftests/powerpc/pmu/ebb/cycles_with_mmcr2_test.c
@@ -71,7 +71,7 @@ int cycles_with_mmcr2(void)
 	ebb_global_disable();
 	ebb_freeze_pmcs();
 
-	count_pmc(1, sample_period);
+	write_pmc(1, pmc_sample_period(sample_period));
 
 	dump_ebb_state();
 
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/ebb.c b/tools/testing/selftests/powerpc/pmu/ebb/ebb.c
index dfbc5c3ad52d..bf6f25dfcf7b 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/ebb.c
+++ b/tools/testing/selftests/powerpc/pmu/ebb/ebb.c
@@ -396,7 +396,7 @@ int ebb_child(union pipe read_pipe, union pipe write_pipe)
 	ebb_global_disable();
 	ebb_freeze_pmcs();
 
-	count_pmc(1, sample_period);
+	write_pmc(1, pmc_sample_period(sample_period));
 
 	dump_ebb_state();
 
diff --git a/tools/testing/selftests/powerpc/pmu/ebb/ebb_on_willing_child_test.c b/tools/testing/selftests/powerpc/pmu/ebb/ebb_on_willing_child_test.c
index ca2f7d729155..513812cdcca1 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/ebb_on_willing_child_test.c
+++ b/tools/testing/selftests/powerpc/pmu/ebb/ebb_on_willing_child_test.c
@@ -38,7 +38,7 @@ static int victim_child(union pipe read_pipe, union pipe write_pipe)
Re: [PATCH 1/1] powerpc/kvm/book3s: Fixes possible 'use after release' of kvm
On 26/11/19 18:52, Leonardo Bras wrote:
> Fixes a possible 'use after free' of kvm variable.
> It does use mutex_unlock(&kvm->lock) after possible freeing a variable
> with kvm_put_kvm(kvm).
>
> Signed-off-by: Leonardo Bras
> ---
>  arch/powerpc/kvm/book3s_64_vio.c | 3 +--
>  virt/kvm/kvm_main.c              | 8 ++++----
>  2 files changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
> index 5834db0a54c6..a402ead833b6 100644
> --- a/arch/powerpc/kvm/book3s_64_vio.c
> +++ b/arch/powerpc/kvm/book3s_64_vio.c
> @@ -316,14 +316,13 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
>
>  	if (ret >= 0)
>  		list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
> -	else
> -		kvm_put_kvm(kvm);
>
>  	mutex_unlock(&kvm->lock);
>
>  	if (ret >= 0)
>  		return ret;
>
> +	kvm_put_kvm(kvm);
>  	kfree(stt);
>  fail_acct:
>  	account_locked_vm(current->mm, kvmppc_stt_pages(npages), false);

This part is a good change, as it makes the code clearer. The
virt/kvm/kvm_main.c bits, however, are not necessary, as explained by
Sean.

Paolo

> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 13efc291b1c7..f37089b60d09 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2744,10 +2744,8 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>  	/* Now it's all set up, let userspace reach it */
>  	kvm_get_kvm(kvm);
>  	r = create_vcpu_fd(vcpu);
> -	if (r < 0) {
> -		kvm_put_kvm(kvm);
> +	if (r < 0)
>  		goto unlock_vcpu_destroy;
> -	}
>
>  	kvm->vcpus[atomic_read(&kvm->online_vcpus)] = vcpu;
>
> @@ -2771,6 +2769,8 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
>  	mutex_lock(&kvm->lock);
>  	kvm->created_vcpus--;
>  	mutex_unlock(&kvm->lock);
> +	if (r < 0)
> +		kvm_put_kvm(kvm);
>  	return r;
>  }
>
> @@ -3183,10 +3183,10 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
>  	kvm_get_kvm(kvm);
>  	ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR | O_CLOEXEC);
>  	if (ret < 0) {
> -		kvm_put_kvm(kvm);
>  		mutex_lock(&kvm->lock);
>  		list_del(&dev->vm_node);
>  		mutex_unlock(&kvm->lock);
> +		kvm_put_kvm(kvm);
>  		ops->destroy(dev);
>  		return ret;
>  	}
RE: [PATCH 09/14] powerpc/vas: Update CSB and notify process for fault CRBs
"Linuxppc-dev" wrote on 11/27/2019 12:46:09 AM: > > > > +static void notify_process(pid_t pid, u64 fault_addr) > > +{ > > + int rc; > > + struct kernel_siginfo info; > > + > > + memset(, 0, sizeof(info)); > > + > > + info.si_signo = SIGSEGV; > > + info.si_errno = EFAULT; > > + info.si_code = SEGV_MAPERR; > > + > > + info.si_addr = (void *)fault_addr; > > + rcu_read_lock(); > > + rc = kill_pid_info(SIGSEGV, , find_vpid(pid)); > > + rcu_read_unlock(); > > + > > + pr_devel("%s(): pid %d kill_proc_info() rc %d\n", __func__, pid, rc); > > +} > > Shouldn't this use force_sig_fault_to_task instead? > > > + /* > > +* User space passed invalid CSB address, Notify process with > > +* SEGV signal. > > +*/ > > + tsk = get_pid_task(window->pid, PIDTYPE_PID); > > + /* > > +* Send window will be closed after processing all NX requests > > +* and process exits after closing all windows. In multi-thread > > +* applications, thread may not exists, but does not close FD > > +* (means send window) upon exit. Parent thread (tgid) can use > > +* and close the window later. > > +*/ > > + if (tsk) { > > + if (tsk->flags & PF_EXITING) > > + task_exit = 1; > > + put_task_struct(tsk); > > + pid = vas_window_pid(window); > > The pid is later used for sending the signal again, why not keep the > reference? Sorry, Not dropping the PID reference here, Happens only when window closed. If the task for this PID is not available, looking for tgid in the case of multi-thread process. > > > + } else { > > + pid = vas_window_tgid(window); > > + > > + rcu_read_lock(); > > + tsk = find_task_by_vpid(pid); > > + if (!tsk) { > > + rcu_read_unlock(); > > + return; > > + } > > + if (tsk->flags & PF_EXITING) > > + task_exit = 1; > > + rcu_read_unlock(); > > Why does this not need a reference to the task, but the other one does? Window is opened with open() and ioctl(fd), will be closed either by close (fd) explicitly or release FD during process exit. Process closes all open windows when it exits. 
So we do not need to keep the reference for this case. In multi-thread case, child thread can open a window, but it does not release FD when it exits. Parent thread (tgid) can continue use this window and closes it upon its exit. So taking reference to PID in case if this pid is assigned to child thread to make sure its pid is not reused until window is closed. We are taking pid reference during window open and releases it when closing the window. Thanks Haren >
RE: [PATCH 03/14] powerpc/vas: Define nx_fault_stamp in coprocessor_request_block
"Linuxppc-dev" wrote on 11/27/2019 12:30:55 AM: > > > +#define crb_csb_addr(c) __be64_to_cpu(c->csb_addr) > > +#define crb_nx_fault_addr(c) __be64_to_cpu > (c->stamp.nx.fault_storage_addr) > > +#define crb_nx_flags(c) c->stamp.nx.flags > > +#define crb_nx_fault_status(c) c->stamp.nx.fault_status > > Except for crb_nx_fault_addr all these macros are unused, and > crb_nx_fault_addr probably makes more sense open coded in the only > caller. Thanks, My mistake, code got changed and forgot to remove unused macros. > > Also please don't use the __ prefixed byte swap helpers in any driver > or arch code. > > > + > > +static inline uint32_t crb_nx_pswid(struct coprocessor_request_block *crb) > > +{ > > + return __be32_to_cpu(crb->stamp.nx.pswid); > > +} > > Same here. Also not sure what the point of the helper is except for > obsfucating the code. >
RE: [PATCH 02/14] Revert "powerpc/powernv: remove the unused vas_win_paste_addr and vas_win_id functions"
"Linuxppc-dev" wrote on 11/27/2019 12:28:10 AM: > > On Tue, Nov 26, 2019 at 05:03:27PM -0800, Haren Myneni wrote: > > > > This reverts commit 452d23c0f6bd97f2fd8a9691fee79b76040a0feb. > > > > User space send windows (NX GZIP compression) need vas_win_paste_addr() > > to mmap window paste address and vas_win_id() to get window ID when > > window address is given. > > Even with your full series applied vas_win_paste_addr is entirely > unused, and vas_win_id is only used once in the same file it is defined. Thanks for the review. vas_win_paste_addr() will be used in NX compression driver and planning to post this series soon. Can I add this change later as part of this series? > > So instead of this patch you should just open code vas_win_id in > init_winctx_for_txwin. > > > +static inline u32 encode_pswid(int vasid, int winid) > > +{ > > + u32 pswid = 0; > > + > > + pswid |= vasid << (31 - 7); > > + pswid |= winid; > > + > > + return pswid; > > This can be simplified down to: > >return (u32)winid | (vasid << (31 - 7)); >
Re: [PATCH 1/1] powerpc/kvm/book3s: Fixes possible 'use after release' of kvm
On Wed, 2019-11-27 at 17:40 +0100, Paolo Bonzini wrote:
> >
> >  	if (ret >= 0)
> >  		list_add_rcu(&stt->list, &kvm->arch.spapr_tce_tables);
> > -	else
> > -		kvm_put_kvm(kvm);
> >
> >  	mutex_unlock(&kvm->lock);
> >
> >  	if (ret >= 0)
> >  		return ret;
> >
> > +	kvm_put_kvm(kvm);
> >  	kfree(stt);
> >  fail_acct:
> >  	account_locked_vm(current->mm, kvmppc_stt_pages(npages), false);
>
> This part is a good change, as it makes the code clearer. The
> virt/kvm/kvm_main.c bits, however, are not necessary as explained by
> Sean.

Thanks! So, like this patch?
https://lkml.org/lkml/2019/11/7/763

Best regards,
Leonardo
Re: [PATCH v3 0/2] Replace current->mm by kvm->mm on powerpc/kvm
Result of Travis-CI testing the change:
https://travis-ci.org/LeoBras/linux-ppc/builds/617712012
Re: Bug 205201 - Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M
On 27 November 2019 at 07:56 am, Mike Rapoport wrote:
> Maybe we'll simply force bottom up allocation before calling
> swiotlb_init()? Anyway, it's the last memblock allocation.
>
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 62f74b1b33bd..771e6cf7e2b9 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -286,14 +286,15 @@ void __init mem_init(void)
>  	/*
>  	 * book3s is limited to 16 page sizes due to encoding this in
>  	 * a 4-bit field for slices.
>  	 */
>  	BUILD_BUG_ON(MMU_PAGE_COUNT > 16);
>  
>  #ifdef CONFIG_SWIOTLB
> +	memblock_set_bottom_up(true);
>  	swiotlb_init(0);
>  #endif
>  
>  	high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>  	set_max_mapnr(max_pfn);
>  	memblock_free_all();

Hello Mike,

I tested the latest Git kernel with your new patch today. My PCI TV
card works without any problems.

Thanks,
Christian
Re: [PATCH v4 2/2] powerpc/irq: inline call_do_irq() and call_do_softirq()
Le 27/11/2019 à 15:59, Segher Boessenkool a écrit :
> On Wed, Nov 27, 2019 at 02:50:30PM +0100, Christophe Leroy wrote:
>> So what do we do ? We just drop the "r2" clobber ?
>
> You have to make sure your asm code works for all ABIs. This is quite
> involved if you do a call to an external function. The compiler does
> *not* see this call, so you will have to make sure that all that the
> compiler and linker do will work, or prevent some of those things
> (say, inlining of the function containing the call).

But the whole purpose of the patch is to inline the call to __do_irq()
in order to avoid the trampoline function.

>> Otherwise, to be on the safe side we can just save r2 in a local var
>> before the bl and restore it after. I guess it won't collapse CPU time
>> on a performant PPC64.
>
> That does not fix everything. The called function requires a specific
> value in r2 on entry.

Euh ... but there is nothing like that when using existing
call_do_irq(). How does GCC know that call_do_irq() has same TOC as
__do_irq() ?

> So all this needs verification. Hopefully you can get away with just
> not clobbering r2 (and not adding a nop after the bl), sure. But this
> needs to be checked. Changing control flow inside inline assembler
> always is problematic. Another problem in this case (on all ABIs) is
> that the compiler does not see you call __do_irq. Again, you can
> probably get away with that too, but :-)

Anyway it sees I reference it, as it is in input arguments. Isn't it
enough ?

Christophe
Re: [PATCH v4 2/2] powerpc/irq: inline call_do_irq() and call_do_softirq()
On Wed, Nov 27, 2019 at 02:50:30PM +0100, Christophe Leroy wrote:
> So what do we do ? We just drop the "r2" clobber ?

You have to make sure your asm code works for all ABIs.  This is quite involved if you do a call to an external function.  The compiler does *not* see this call, so you will have to make sure that all that the compiler and linker do will work, or prevent some of those things (say, inlining of the function containing the call).

> Otherwise, to be on the safe side we can just save r2 in a local var
> before the bl and restore it after. I guess it won't collapse CPU time
> on a performant PPC64.

That does not fix everything.  The called function requires a specific value in r2 on entry.

So all this needs verification.  Hopefully you can get away with just not clobbering r2 (and not adding a nop after the bl), sure.  But this needs to be checked.

Changing control flow inside inline assembler always is problematic.  Another problem in this case (on all ABIs) is that the compiler does not see you call __do_irq.  Again, you can probably get away with that too, but :-)


Segher
Re: [PATCH v3 4/8] powerpc/vdso32: inline __get_datapage()
Hi Michael,

Le 22/11/2019 à 07:38, Michael Ellerman a écrit :
> Michael Ellerman writes:
>> Christophe Leroy writes:
>>> __get_datapage() is only a few instructions to retrieve the
>>> address of the page where the kernel stores data to the VDSO.
>>>
>>> By inlining this function into its users, a bl/blr pair and
>>> a mflr/mtlr pair is avoided, plus a few reg moves.
>>>
>>> The improvement is noticeable (about 55 nsec/call on an 8xx)
>>>
>>> vdsotest before the patch:
>>> gettimeofday:                     vdso: 731 nsec/call
>>> clock-gettime-realtime-coarse:    vdso: 668 nsec/call
>>> clock-gettime-monotonic-coarse:   vdso: 745 nsec/call
>>>
>>> vdsotest after the patch:
>>> gettimeofday:                     vdso: 677 nsec/call
>>> clock-gettime-realtime-coarse:    vdso: 613 nsec/call
>>> clock-gettime-monotonic-coarse:   vdso: 690 nsec/call
>>>
>>> Signed-off-by: Christophe Leroy
>>
>> This doesn't build with gcc 4.6.3:
>>
>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S: Assembler messages:
>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S:41: Error: unsupported relocation against __kernel_datapage_offset
>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S:86: Error: unsupported relocation against __kernel_datapage_offset
>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S:213: Error: unsupported relocation against __kernel_datapage_offset
>>   /linux/arch/powerpc/kernel/vdso32/gettimeofday.S:247: Error: unsupported relocation against __kernel_datapage_offset
>>   make[4]: *** [arch/powerpc/kernel/vdso32/gettimeofday.o] Error 1
>
> Actually I guess it's binutils, which is v2.22 in this case.
>
> Needed this:
>
> diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
> index 12785f72f17d..0048db347ddf 100644
> --- a/arch/powerpc/include/asm/vdso_datapage.h
> +++ b/arch/powerpc/include/asm/vdso_datapage.h
> @@ -117,7 +117,7 @@ extern struct vdso_data *vdso_data;
>  .macro get_datapage ptr, tmp
>  	bcl	20, 31, .+4
>  	mflr	\ptr
> -	addi	\ptr, \ptr, __kernel_datapage_offset - (.-4)
> +	addi	\ptr, \ptr, (__kernel_datapage_offset - (.-4))@l
>  	lwz	\tmp, 0(\ptr)
>  	add	\ptr, \tmp, \ptr
>  .endm

Are you still planning to get this series merged? Do you need any help / rebase / re-spin?

Christophe
Re: [PATCH v1 1/4] powerpc/fixmap: don't clear fixmap area in paging_init()
Le 26/11/2019 à 02:13, Michael Ellerman a écrit :
> On Thu, 2019-09-12 at 13:49:41 UTC, Christophe Leroy wrote:
>> fixmap is intended to map things permanently like the IMMR region on
>> FSL SOC (8xx, 83xx, ...), so don't clear it when initialising paging()
>>
>> Signed-off-by: Christophe Leroy
>
> Applied to powerpc next, thanks.
>
> https://git.kernel.org/powerpc/c/f2bb86937d86ebcb0e52f95b6d19aba1d850e601

Hi,

What happened ? It looks like it is gone in today's powerpc next.

Christophe
Re: [PATCH v4 2/2] powerpc/irq: inline call_do_irq() and call_do_softirq()
Le 25/11/2019 à 15:25, Segher Boessenkool a écrit :
> On Mon, Nov 25, 2019 at 09:32:23PM +1100, Michael Ellerman wrote:
>> Segher Boessenkool writes:
>>>>> +static inline void call_do_irq(struct pt_regs *regs, void *sp)
>>>>> +{
>>>>> +	register unsigned long r3 asm("r3") = (unsigned long)regs;
>>>>> +
>>>>> +	/* Temporarily switch r1 to sp, call __do_irq() then restore r1 */
>>>>> +	asm volatile(
>>>>> +		" "PPC_STLU" 1, %2(%1);\n"
>>>>> +		" mr	1, %1;\n"
>>>>> +		" bl	%3;\n"
>>>>> +		" "PPC_LL" 1, 0(1);\n" :
>>>>> +		"+r"(r3) :
>>>>> +		"b"(sp), "i"(THREAD_SIZE - STACK_FRAME_OVERHEAD), "i"(__do_irq) :
>>>>> +		"lr", "xer", "ctr", "memory", "cr0", "cr1", "cr5", "cr6", "cr7",
>>>>> +		"r0", "r2", "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "r12");
>>>>> +}
>>>>
>>>> If we add a nop after the bl, so the linker could insert a TOC restore,
>>>> then I don't think there's any circumstance under which we expect this
>>>> to actually clobber r2, is there?
>>>
>>> That is mostly correct.
>>
>> That's the standard I aspire to :P
>
>>> If call_do_irq was a no-inline function, there would not be problems.
>>>
>>> What TOC does __do_irq require in r2 on entry, and what will be there
>>> when it returns?
>>
>> The kernel TOC, and also the kernel TOC, unless something's gone wrong
>> or I'm missing something.
>
> If that is the case, we can just do the bl, no nop at all?  And that
> works for all of our ABIs.
>
> If we can be certain that we have the kernel TOC in r2 on entry to
> call_do_irq, that is!  (Or it establishes it itself.)

So what do we do ? We just drop the "r2" clobber ?

Otherwise, to be on the safe side we can just save r2 in a local var before the bl and restore it after. I guess it won't collapse CPU time on a performant PPC64.

Christophe
[GIT PULL] y2038: syscall implementation cleanups
The following changes since commit a99d8080aaf358d5d23581244e5da23b35e340b9:

  Linux 5.4-rc6 (2019-11-03 14:07:26 -0800)

are available in the Git repository at:

  git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground.git tags/y2038-cleanups-5.5

for you to fetch changes up to b111df8447acdeb4b9220f99d5d4b28f83eb56ad:

  y2038: alarm: fix half-second cut-off (2019-11-25 21:52:35 +0100)

y2038: syscall implementation cleanups

This is a series of cleanups for the y2038 work, mostly intended for namespace cleaning: the kernel defines the traditional time_t, timeval and timespec types that often lead to y2038-unsafe code. Even though the unsafe usage is mostly gone from the kernel, having the types and associated functions around means that we can still grow new users, and that we may be missing conversions to safe types that actually matter.

There are still a number of driver specific patches needed to get the last users of these types removed, those have been submitted to the respective maintainers.
Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-a...@arndb.de/
Signed-off-by: Arnd Bergmann

Arnd Bergmann (26):
      y2038: remove CONFIG_64BIT_TIME
      y2038: add __kernel_old_timespec and __kernel_old_time_t
      y2038: vdso: change timeval to __kernel_old_timeval
      y2038: vdso: change timespec to __kernel_old_timespec
      y2038: vdso: change time_t to __kernel_old_time_t
      y2038: vdso: nds32: open-code timespec_add_ns()
      y2038: vdso: powerpc: avoid timespec references
      y2038: ipc: remove __kernel_time_t reference from headers
      y2038: stat: avoid 'time_t' in 'struct stat'
      y2038: uapi: change __kernel_time_t to __kernel_old_time_t
      y2038: rusage: use __kernel_old_timeval
      y2038: syscalls: change remaining timeval to __kernel_old_timeval
      y2038: socket: remove timespec reference in timestamping
      y2038: socket: use __kernel_old_timespec instead of timespec
      y2038: make ns_to_compat_timeval use __kernel_old_timeval
      y2038: elfcore: Use __kernel_old_timeval for process times
      y2038: timerfd: Use timespec64 internally
      y2038: time: avoid timespec usage in settimeofday()
      y2038: itimer: compat handling to itimer.c
      y2038: use compat_{get,set}_itimer on alpha
      y2038: move itimer reset into itimer.c
      y2038: itimer: change implementation to timespec64
      y2038: allow disabling time32 system calls
      y2038: fix typo in powerpc vdso "LOPART"
      y2038: ipc: fix x32 ABI breakage
      y2038: alarm: fix half-second cut-off

 arch/Kconfig                              | 11 +-
 arch/alpha/kernel/osf_sys.c               | 67 +--
 arch/alpha/kernel/syscalls/syscall.tbl    |  4 +-
 arch/ia64/kernel/asm-offsets.c            |  2 +-
 arch/mips/include/uapi/asm/msgbuf.h       |  6 +-
 arch/mips/include/uapi/asm/sembuf.h       |  4 +-
 arch/mips/include/uapi/asm/shmbuf.h       |  6 +-
 arch/mips/include/uapi/asm/stat.h         | 16 +--
 arch/mips/kernel/binfmt_elfn32.c          |  4 +-
 arch/mips/kernel/binfmt_elfo32.c          |  4 +-
 arch/nds32/kernel/vdso/gettimeofday.c     | 61 +-
 arch/parisc/include/uapi/asm/msgbuf.h     |  6 +-
 arch/parisc/include/uapi/asm/sembuf.h     |  4 +-
 arch/parisc/include/uapi/asm/shmbuf.h     |  6 +-
 arch/powerpc/include/asm/asm-prototypes.h |  3 +-
 arch/powerpc/include/asm/vdso_datapage.h  |  6 +-
 arch/powerpc/include/uapi/asm/msgbuf.h    |  6 +-
 arch/powerpc/include/uapi/asm/sembuf.h    |  4 +-
 arch/powerpc/include/uapi/asm/shmbuf.h    |  6 +-
 arch/powerpc/include/uapi/asm/stat.h      |  2 +-
 arch/powerpc/kernel/asm-offsets.c         | 18 ++-
 arch/powerpc/kernel/syscalls.c            |  4 +-
 arch/powerpc/kernel/time.c                |  5 +-
 arch/powerpc/kernel/vdso32/gettimeofday.S |  6 +-
 arch/powerpc/kernel/vdso64/gettimeofday.S |  8 +-
 arch/sparc/include/uapi/asm/msgbuf.h      |  6 +-
 arch/sparc/include/uapi/asm/sembuf.h      |  4 +-
 arch/sparc/include/uapi/asm/shmbuf.h      |  6 +-
 arch/sparc/include/uapi/asm/stat.h        | 24 ++--
 arch/sparc/vdso/vclock_gettime.c          | 36 +++---
 arch/x86/entry/vdso/vclock_gettime.c      |  6 +-
 arch/x86/entry/vsyscall/vsyscall_64.c     |  4 +-
 arch/x86/include/uapi/asm/msgbuf.h        |  6 +-
 arch/x86/include/uapi/asm/sembuf.h        |  4 +-
 arch/x86/include/uapi/asm/shmbuf.h        |  6 +-
 arch/x86/um/vdso/um_vdso.c                | 12 +-
 fs/aio.c                                  |  2 +-
 fs/binfmt_elf.c                           | 12 +-
 fs/binfmt_elf_fdpic.c                     | 12 +-
 fs/compat_binfmt_elf.c                    |  4 +-
 fs/select.c                               | 10 +-
 fs/timerfd.c                              | 14 +--
 fs/utimes.c                               |  8 +-
 include/linux/compat.h
Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
On Wed, 27 Nov 2019 10:47:45 +0100
Frederic Barrat wrote:

> Le 27/11/2019 à 10:33, Greg Kurz a écrit :
> > On Wed, 27 Nov 2019 10:10:13 +0100
> > Frederic Barrat wrote:
> >
> >> Le 27/11/2019 à 09:24, Greg Kurz a écrit :
> >>> On Wed, 27 Nov 2019 18:09:40 +1100
> >>> Alexey Kardashevskiy wrote:
> >>>
> >>>> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> >>>>> The comment here implies that we don't need to take a ref to the pci_dev
> >>>>> because the ioda_pe will always have one. This implies that the current
> >>>>> expection is that the pci_dev for an NPU device will *never* be torn
> >>>>> down since the ioda_pe having a ref to the device will prevent the
> >>>>> release function from being called.
> >>>>>
> >>>>> In other words, the desired behaviour here appears to be leaking a ref.
> >>>>>
> >>>>> Nice!
> >>>>
> >>>> There is a history: https://patchwork.ozlabs.org/patch/1088078/
> >>>>
> >>>> We did not fix anything in particular then, we do not seem to be fixing
> >>>> anything now (in other words - we cannot test it in a normal natural
> >>>> way). I'd drop this one.
> >>>
> >>> Yeah, I didn't fix anything at the time. Just reverted to the ref
> >>> count behavior we had before:
> >>>
> >>> https://patchwork.ozlabs.org/patch/829172/
> >>>
> >>> Frederic recently posted his take on the same topic from the OpenCAPI
> >>> point of view:
> >>>
> >>> http://patchwork.ozlabs.org/patch/1198947/
> >>>
> >>> He seems to indicate the NPU devices as the real culprit because
> >>> nobody ever cared for them to be removable. Fixing that seems be
> >>> a chore nobody really wants to address obviously... :-\
> >>
> >> I had taken a stab at not leaking a ref for the nvlink devices and do
> >> the proper thing regarding ref counting (i.e. fixing all the callers of
> >> get_pci_dev() to drop the reference when they were done). With that, I
> >> could see that the ref count of the nvlink devices could drop to 0
> >> (calling remove for the device in /sys) and that the devices could go away.
> >>
> >> But then, I realized it's not necessarily desirable at this point. There
> >> are several comments in the code saying the npu devices (for nvlink)
> >> don't go away, there's no device release callback defined when it seems
> >> there should be, at least to handle releasing PEs... All in all, it
> >> seems that some work would be needed. And if it hasn't been required by
> >> now...
> >
> > If everyone is ok with leaking a reference in the NPU case, I guess
> > this isn't a problem. But if we move forward with Oliver's patch, a
> > pci_dev_put() would be needed for OpenCAPI, correct ?
>
> No, these code paths are nvlink-only.

Oh yes indeed. Then this patch and yours fit well together :)

>   Fred

> >>>>> Signed-off-by: Oliver O'Halloran
> >>>>> ---
> >>>>>  arch/powerpc/platforms/powernv/npu-dma.c | 11 +++------
> >>>>>  1 file changed, 3 insertions(+), 8 deletions(-)
> >>>>>
> >>>>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> index 72d3749da02c..2eb6e6d45a98 100644
> >>>>> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> >>>>> @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
> >>>>>  			break;
> >>>>>
> >>>>>  	/*
> >>>>> -	 * pci_get_domain_bus_and_slot() increased the reference count of
> >>>>> -	 * the PCI device, but callers don't need that actually as the PE
> >>>>> -	 * already holds a reference to the device. Since callers aren't
> >>>>> -	 * aware of the reference count change, call pci_dev_put() now to
> >>>>> -	 * avoid leaks.
> >>>>> +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
> >>>>> +	 * Caller is responsible for dropping the ref when it's
> >>>>> +	 * finished with it.
> >>>>>  	 */
> >>>>> -	if (pdev)
> >>>>> -		pci_dev_put(pdev);
> >>>>> -
> >>>>>  	return pdev;
> >>>>>  }
Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
On Wed, 27 Nov 2019 10:10:13 +0100
Frederic Barrat wrote:

> Le 27/11/2019 à 09:24, Greg Kurz a écrit :
> > On Wed, 27 Nov 2019 18:09:40 +1100
> > Alexey Kardashevskiy wrote:
> >
> >> On 20/11/2019 12:28, Oliver O'Halloran wrote:
> >>> The comment here implies that we don't need to take a ref to the pci_dev
> >>> because the ioda_pe will always have one. This implies that the current
> >>> expection is that the pci_dev for an NPU device will *never* be torn
> >>> down since the ioda_pe having a ref to the device will prevent the
> >>> release function from being called.
> >>>
> >>> In other words, the desired behaviour here appears to be leaking a ref.
> >>>
> >>> Nice!
> >>
> >> There is a history: https://patchwork.ozlabs.org/patch/1088078/
> >>
> >> We did not fix anything in particular then, we do not seem to be fixing
> >> anything now (in other words - we cannot test it in a normal natural
> >> way). I'd drop this one.
> >
> > Yeah, I didn't fix anything at the time. Just reverted to the ref
> > count behavior we had before:
> >
> > https://patchwork.ozlabs.org/patch/829172/
> >
> > Frederic recently posted his take on the same topic from the OpenCAPI
> > point of view:
> >
> > http://patchwork.ozlabs.org/patch/1198947/
> >
> > He seems to indicate the NPU devices as the real culprit because
> > nobody ever cared for them to be removable. Fixing that seems be
> > a chore nobody really wants to address obviously... :-\
>
> I had taken a stab at not leaking a ref for the nvlink devices and do
> the proper thing regarding ref counting (i.e. fixing all the callers of
> get_pci_dev() to drop the reference when they were done). With that, I
> could see that the ref count of the nvlink devices could drop to 0
> (calling remove for the device in /sys) and that the devices could go away.
>
> But then, I realized it's not necessarily desirable at this point. There
> are several comments in the code saying the npu devices (for nvlink)
> don't go away, there's no device release callback defined when it seems
> there should be, at least to handle releasing PEs... All in all, it
> seems that some work would be needed. And if it hasn't been required by
> now...

If everyone is ok with leaking a reference in the NPU case, I guess this isn't a problem. But if we move forward with Oliver's patch, a pci_dev_put() would be needed for OpenCAPI, correct ?

>   Fred

> >>> Signed-off-by: Oliver O'Halloran
> >>> ---
> >>>  arch/powerpc/platforms/powernv/npu-dma.c | 11 +++------
> >>>  1 file changed, 3 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
> >>> index 72d3749da02c..2eb6e6d45a98 100644
> >>> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> >>> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> >>> @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn)
> >>>  			break;
> >>>
> >>>  	/*
> >>> -	 * pci_get_domain_bus_and_slot() increased the reference count of
> >>> -	 * the PCI device, but callers don't need that actually as the PE
> >>> -	 * already holds a reference to the device. Since callers aren't
> >>> -	 * aware of the reference count change, call pci_dev_put() now to
> >>> -	 * avoid leaks.
> >>> +	 * NB: for_each_pci_dev() elevates the pci_dev refcount.
> >>> +	 * Caller is responsible for dropping the ref when it's
> >>> +	 * finished with it.
> >>>  	 */
> >>> -	if (pdev)
> >>> -		pci_dev_put(pdev);
> >>> -
> >>>  	return pdev;
> >>>  }
[PATCH 1/3] powerpc/pseries: Account for SPURR ticks on idle CPUs
From: "Gautham R. Shenoy"

On PSeries LPARs, to compute the utilization, tools such as lparstat need to know the [S]PURR ticks when the CPUs were busy or idle.

In the pseries cpuidle driver, we keep track of the idle PURR ticks in the VPA variable "wait_state_cycles". This patch extends the support to account for the idle SPURR ticks.

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/kernel/idle.c        |  2 ++
 drivers/cpuidle/cpuidle-pseries.c | 28 +++++++++++++++-----------
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c
index a36fd05..708ec68 100644
--- a/arch/powerpc/kernel/idle.c
+++ b/arch/powerpc/kernel/idle.c
@@ -33,6 +33,8 @@
 unsigned long cpuidle_disable = IDLE_NO_OVERRIDE;
 EXPORT_SYMBOL(cpuidle_disable);
 
+DEFINE_PER_CPU(u64, idle_spurr_cycles);
+
 static int __init powersave_off(char *arg)
 {
 	ppc_md.power_save = NULL;
diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
index 74c2479..45e2be4 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -30,11 +30,14 @@ struct cpuidle_driver pseries_idle_driver = {
 static struct cpuidle_state *cpuidle_state_table __read_mostly;
 static u64 snooze_timeout __read_mostly;
 static bool snooze_timeout_en __read_mostly;
+DECLARE_PER_CPU(u64, idle_spurr_cycles);
 
-static inline void idle_loop_prolog(unsigned long *in_purr)
+static inline void idle_loop_prolog(unsigned long *in_purr,
+				    unsigned long *in_spurr)
 {
 	ppc64_runlatch_off();
 	*in_purr = mfspr(SPRN_PURR);
+	*in_spurr = mfspr(SPRN_SPURR);
 	/*
 	 * Indicate to the HV that we are idle. Now would be
 	 * a good time to find other work to dispatch.
@@ -42,13 +45,16 @@ static inline void idle_loop_prolog(unsigned long *in_purr)
 	get_lppaca()->idle = 1;
 }
 
-static inline void idle_loop_epilog(unsigned long in_purr)
+static inline void idle_loop_epilog(unsigned long in_purr,
+				    unsigned long in_spurr)
 {
 	u64 wait_cycles;
+	u64 *idle_spurr_cycles_ptr = this_cpu_ptr(&idle_spurr_cycles);
 
 	wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
 	wait_cycles += mfspr(SPRN_PURR) - in_purr;
 	get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+	*idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr;
 	get_lppaca()->idle = 0;
 
 	ppc64_runlatch_on();
@@ -58,12 +64,12 @@ static int snooze_loop(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
 			int index)
 {
-	unsigned long in_purr;
+	unsigned long in_purr, in_spurr;
 	u64 snooze_exit_time;
 
 	set_thread_flag(TIF_POLLING_NRFLAG);
 
-	idle_loop_prolog(&in_purr);
+	idle_loop_prolog(&in_purr, &in_spurr);
 	local_irq_enable();
 	snooze_exit_time = get_tb() + snooze_timeout;
 
@@ -87,7 +93,7 @@ static int snooze_loop(struct cpuidle_device *dev,
 
 	local_irq_disable();
 
-	idle_loop_epilog(in_purr);
+	idle_loop_epilog(in_purr, in_spurr);
 
 	return index;
 }
@@ -113,9 +119,9 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv,
 				int index)
 {
-	unsigned long in_purr;
+	unsigned long in_purr, in_spurr;
 
-	idle_loop_prolog(&in_purr);
+	idle_loop_prolog(&in_purr, &in_spurr);
 	get_lppaca()->donate_dedicated_cpu = 1;
 
 	HMT_medium();
@@ -124,7 +130,7 @@ static int dedicated_cede_loop(struct cpuidle_device *dev,
 
 	local_irq_disable();
 	get_lppaca()->donate_dedicated_cpu = 0;
-	idle_loop_epilog(in_purr);
+	idle_loop_epilog(in_purr, in_spurr);
 
 	return index;
 }
@@ -133,9 +139,9 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 			struct cpuidle_driver *drv,
 			int index)
 {
-	unsigned long in_purr;
+	unsigned long in_purr, in_spurr;
 
-	idle_loop_prolog(&in_purr);
+	idle_loop_prolog(&in_purr, &in_spurr);
 
 	/*
 	 * Yield the processor to the hypervisor.  We return if
@@ -147,7 +153,7 @@ static int shared_cede_loop(struct cpuidle_device *dev,
 	check_and_cede_processor();
 
 	local_irq_disable();
-	idle_loop_epilog(in_purr);
+	idle_loop_epilog(in_purr, in_spurr);
 
 	return index;
 }
-- 
1.9.4
[PATCH 0/3] pseries: Track and expose idle PURR and SPURR ticks
From: "Gautham R. Shenoy"

On PSeries LPARs, data center planners desire a more accurate view of system utilization per resource such as CPU, to plan the system capacity requirements better. Such accuracy can be obtained by reading PURR/SPURR registers for CPU resource utilization.

Tools such as lparstat which are used to compute the utilization need to know the [S]PURR ticks when the cpu was busy or idle. The [S]PURR counters are already exposed through sysfs. We already account for PURR ticks when we go to idle so that we can update the VPA area. This patchset extends support to account for SPURR ticks when idle, and expose both via per-cpu sysfs files.

These patches are required for an enhancement to the lparstat utility that computes the CPU utilization based on PURR and SPURR, which can be found here:
https://groups.google.com/forum/#!topic/powerpc-utils-devel/fYRo69xO9r4

Gautham R. Shenoy (3):
  powerpc/pseries: Account for SPURR ticks on idle CPUs
  powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr

 Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++++++++++++++++
 arch/powerpc/kernel/idle.c                         |  2 ++
 arch/powerpc/kernel/sysfs.c                        | 32 ++++++++++++++
 drivers/cpuidle/cpuidle-pseries.c                  | 28 ++++++++----
 4 files changed, 90 insertions(+), 11 deletions(-)

-- 
1.9.4
[PATCH 2/3] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
From: "Gautham R. Shenoy"

On Pseries LPARs, to calculate utilization, we need to know the [S]PURR ticks when the CPUs were busy or idle.

The total PURR and SPURR ticks are already exposed via the per-cpu sysfs files /sys/devices/system/cpu/cpuX/purr and /sys/devices/system/cpu/cpuX/spurr. This patch adds support for exposing the idle PURR and SPURR ticks via /sys/devices/system/cpu/cpuX/idle_purr and /sys/devices/system/cpu/cpuX/idle_spurr.

Signed-off-by: Gautham R. Shenoy
---
 arch/powerpc/kernel/sysfs.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 80a676d..42ade55 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -1044,6 +1044,36 @@ static ssize_t show_physical_id(struct device *dev,
 }
 static DEVICE_ATTR(physical_id, 0444, show_physical_id, NULL);
 
+static ssize_t idle_purr_show(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	unsigned int cpuid = cpu->dev.id;
+	struct lppaca *cpu_lppaca_ptr = paca_ptrs[cpuid]->lppaca_ptr;
+	u64 idle_purr_cycles = be64_to_cpu(cpu_lppaca_ptr->wait_state_cycles);
+
+	return sprintf(buf, "%llx\n", idle_purr_cycles);
+}
+static DEVICE_ATTR_RO(idle_purr);
+
+DECLARE_PER_CPU(u64, idle_spurr_cycles);
+static ssize_t idle_spurr_show(struct device *dev,
+			       struct device_attribute *attr, char *buf)
+{
+	struct cpu *cpu = container_of(dev, struct cpu, dev);
+	unsigned int cpuid = cpu->dev.id;
+	u64 *idle_spurr_cycles_ptr = per_cpu_ptr(&idle_spurr_cycles, cpuid);
+
+	return sprintf(buf, "%llx\n", *idle_spurr_cycles_ptr);
+}
+static DEVICE_ATTR_RO(idle_spurr);
+
+static void create_idle_purr_spurr_sysfs_entry(struct device *cpudev)
+{
+	device_create_file(cpudev, &dev_attr_idle_purr);
+	device_create_file(cpudev, &dev_attr_idle_spurr);
+}
+
 static int __init topology_init(void)
 {
 	int cpu, r;
@@ -1067,6 +1097,8 @@ static int __init topology_init(void)
 			register_cpu(c, cpu);
 
 			device_create_file(&c->dev, &dev_attr_physical_id);
+			if (firmware_has_feature(FW_FEATURE_SPLPAR))
+				create_idle_purr_spurr_sysfs_entry(&c->dev);
 		}
 	}
 	r = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "powerpc/topology:online",
-- 
1.9.4
[PATCH 3/3] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr
From: "Gautham R. Shenoy"

Add documentation for the following sysfs interfaces:
/sys/devices/system/cpu/cpuX/purr
/sys/devices/system/cpu/cpuX/spurr
/sys/devices/system/cpu/cpuX/idle_purr
/sys/devices/system/cpu/cpuX/idle_spurr

Signed-off-by: Gautham R. Shenoy
---
 Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index fc20cde..ecd23fb 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -574,3 +574,42 @@ Description:	Secure Virtual Machine
 		If 1, it means the system is using the Protected Execution
 		Facility in POWER9 and newer processors. i.e., it is a
 		Secure Virtual Machine.
+
+What:		/sys/devices/system/cpu/cpuX/purr
+Date:		Apr 2005
+Contact:	Linux for PowerPC mailing list
+Description:	PURR ticks for this CPU since the system boot.
+
+		The Processor Utilization Resources Register (PURR) is
+		a 64-bit counter which provides an estimate of the
+		resources used by the CPU thread. The contents of this
+		register increases monotonically. This sysfs interface
+		exposes the number of PURR ticks for cpuX.
+
+What:		/sys/devices/system/cpu/cpuX/spurr
+Date:		Dec 2006
+Contact:	Linux for PowerPC mailing list
+Description:	SPURR ticks for this CPU since the system boot.
+
+		The Scaled Processor Utilization Resources Register
+		(SPURR) is a 64-bit counter that provides a frequency
+		invariant estimate of the resources used by the CPU
+		thread. The contents of this register increases
+		monotonically. This sysfs interface exposes the number
+		of SPURR ticks for cpuX.
+
+What:		/sys/devices/system/cpu/cpuX/idle_purr
+Date:		Nov 2019
+Contact:	Linux for PowerPC mailing list
+Description:	PURR ticks for cpuX when it was idle.
+
+		This sysfs interface exposes the number of PURR ticks
+		for cpuX when it was idle.
+
+What:		/sys/devices/system/cpu/cpuX/idle_spurr
+Date:		Nov 2019
+Contact:	Linux for PowerPC mailing list
+Description:	SPURR ticks for cpuX when it was idle.
+
+		This sysfs interface exposes the number of SPURR ticks
+		for cpuX when it was idle.
-- 
1.9.4
Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
On Wed, 27 Nov 2019 20:40:00 +1100
"Oliver O'Halloran" wrote:

> On Wed, Nov 27, 2019 at 8:34 PM Greg Kurz wrote:
> >
> > If everyone is ok with leaking a reference in the NPU case, I guess
> > this isn't a problem. But if we move forward with Oliver's patch, a
> > pci_dev_put() would be needed for OpenCAPI, correct ?
>
> Yes, but I think that's fair enough. By convention it's the caller's
> responsibility to drop the ref when it calls a function that returns a
> refcounted object. Doing anything else creates a race condition since
> the object's count could drop to zero before the caller starts using
> it.

Sure, you're right, especially with Frederic's patch that drops the pci_dev_get(dev) in pnv_ioda_setup_dev_PE().

> Oliver
Re: [Y2038] [PATCH 07/23] y2038: vdso: powerpc: avoid timespec references
On Thu, Nov 21, 2019 at 5:25 PM Christophe Leroy wrote:
> Arnd Bergmann a écrit :
>> On Wed, Nov 20, 2019 at 11:43 PM Ben Hutchings wrote:
>>>
>>> On Fri, 2019-11-08 at 22:07 +0100, Arnd Bergmann wrote:
>>>> @@ -192,7 +190,7 @@ V_FUNCTION_BEGIN(__kernel_time)
>>>>  	bl	__get_datapage@local
>>>>  	mr	r9, r3		/* datapage ptr in r9 */
>>>>
>>>> -	lwz	r3,STAMP_XTIME+TSPEC_TV_SEC(r9)
>>>> +	lwz	r3,STAMP_XTIME_SEC+LOWPART(r9)
>>>
>>> "LOWPART" should be "LOPART".
>>
>> Thanks, fixed both instances in a patch on top now. I considered folding
>> it into the original patch, but as it's close to the merge window I'd
>> rather not rebase it, and this way I also give you credit for
>> finding the bug.
>
> Take care, might conflict with
> https://github.com/linuxppc/linux/commit/5e381d727fe8834ca5a126f510194a7a4ac6dd3a

Sorry for my late reply. I see this commit and no other variant of it has made it into linux-next by now, so I assume this is not getting sent for v5.5 and it's not stopping me from sending my own pull request. Please let me know if I missed something and this will cause problems.

On a related note: are you still working on the generic lib/vdso support for powerpc? Without that, future libc implementations that use 64-bit time_t will have to use the slow clock_gettime64 syscall instead of the vdso, which has a significant performance impact.

       Arnd
[PATCH v2 rebase 34/34] MAINTAINERS: perf: Add pattern that matches ppc perf to the perf entry.
Signed-off-by: Michal Suchanek
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 9d3a5c54a41d..4d2a43542c83 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12774,6 +12774,8 @@
 F:	arch/*/kernel/*/perf_event*.c
 F:	arch/*/kernel/*/*/perf_event*.c
 F:	arch/*/include/asm/perf_event.h
 F:	arch/*/kernel/perf_callchain.c
+F:	arch/*/perf/*
+F:	arch/*/perf/*/*
 F:	arch/*/events/*
 F:	arch/*/events/*/*
 F:	tools/perf/
-- 
2.23.0
[PATCH v2 rebase 33/34] powerpc/perf: split callchain.c by bitness
Building callchain.c with !COMPAT proved quite ugly with all the defines. Splitting out the 32bit and 64bit parts looks better. No code change intended. Signed-off-by: Michal Suchanek --- arch/powerpc/perf/Makefile | 5 +- arch/powerpc/perf/callchain.c| 362 +-- arch/powerpc/perf/callchain.h| 20 ++ arch/powerpc/perf/callchain_32.c | 197 + arch/powerpc/perf/callchain_64.c | 178 +++ 5 files changed, 400 insertions(+), 362 deletions(-) create mode 100644 arch/powerpc/perf/callchain.h create mode 100644 arch/powerpc/perf/callchain_32.c create mode 100644 arch/powerpc/perf/callchain_64.c diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile index c155dcbb8691..53d614e98537 100644 --- a/arch/powerpc/perf/Makefile +++ b/arch/powerpc/perf/Makefile @@ -1,6 +1,9 @@ # SPDX-License-Identifier: GPL-2.0 -obj-$(CONFIG_PERF_EVENTS) += callchain.o perf_regs.o +obj-$(CONFIG_PERF_EVENTS) += callchain.o callchain_$(BITS).o perf_regs.o +ifdef CONFIG_COMPAT +obj-$(CONFIG_PERF_EVENTS) += callchain_32.o +endif obj-$(CONFIG_PPC_PERF_CTRS)+= core-book3s.o bhrb.o obj64-$(CONFIG_PPC_PERF_CTRS) += ppc970-pmu.o power5-pmu.o \ diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c index b9fc2f297f30..dd5051015008 100644 --- a/arch/powerpc/perf/callchain.c +++ b/arch/powerpc/perf/callchain.c @@ -15,11 +15,9 @@ #include #include #include -#ifdef CONFIG_COMPAT -#include "../kernel/ppc32.h" -#endif #include +#include "callchain.h" /* * Is sp valid as the address of the next kernel stack frame after prev_sp? @@ -102,364 +100,6 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re } } -static inline int valid_user_sp(unsigned long sp) -{ - bool is_64 = !is_32bit_task(); - - if (!sp || (sp & (is_64 ? 7 : 3)) || sp > STACK_TOP - (is_64 ? 
32 : 16)) - return 0; - return 1; -} - -#ifdef CONFIG_PPC64 -/* - * On 64-bit we don't want to invoke hash_page on user addresses from - * interrupt context, so if the access faults, we read the page tables - * to find which page (if any) is mapped and access it directly. - */ -static int read_user_stack_slow(void __user *ptr, void *buf, int nb) -{ - int ret = -EFAULT; - pgd_t *pgdir; - pte_t *ptep, pte; - unsigned shift; - unsigned long addr = (unsigned long) ptr; - unsigned long offset; - unsigned long pfn, flags; - void *kaddr; - - pgdir = current->mm->pgd; - if (!pgdir) - return -EFAULT; - - local_irq_save(flags); - ptep = find_current_mm_pte(pgdir, addr, NULL, ); - if (!ptep) - goto err_out; - if (!shift) - shift = PAGE_SHIFT; - - /* align address to page boundary */ - offset = addr & ((1UL << shift) - 1); - - pte = READ_ONCE(*ptep); - if (!pte_present(pte) || !pte_user(pte)) - goto err_out; - pfn = pte_pfn(pte); - if (!page_is_ram(pfn)) - goto err_out; - - /* no highmem to worry about here */ - kaddr = pfn_to_kaddr(pfn); - memcpy(buf, kaddr + offset, nb); - ret = 0; -err_out: - local_irq_restore(flags); - return ret; -} - -static int read_user_stack_64(unsigned long __user *ptr, unsigned long *ret) -{ - if ((unsigned long)ptr > TASK_SIZE - sizeof(unsigned long) || - ((unsigned long)ptr & 7)) - return -EFAULT; - - pagefault_disable(); - if (!__get_user_inatomic(*ret, ptr)) { - pagefault_enable(); - return 0; - } - pagefault_enable(); - - return read_user_stack_slow(ptr, ret, 8); -} - -/* - * 64-bit user processes use the same stack frame for RT and non-RT signals. 
- */ -struct signal_frame_64 { - char dummy[__SIGNAL_FRAMESIZE]; - struct ucontext uc; - unsigned long unused[2]; - unsigned int tramp[6]; - struct siginfo *pinfo; - void *puc; - struct siginfo info; - char abigap[288]; -}; - -static int is_sigreturn_64_address(unsigned long nip, unsigned long fp) -{ - if (nip == fp + offsetof(struct signal_frame_64, tramp)) - return 1; - if (vdso64_rt_sigtramp && current->mm->context.vdso_base && - nip == current->mm->context.vdso_base + vdso64_rt_sigtramp) - return 1; - return 0; -} - -/* - * Do some sanity checking on the signal frame pointed to by sp. - * We check the pinfo and puc pointers in the frame. - */ -static int sane_signal_64_frame(unsigned long sp) -{ - struct signal_frame_64 __user *sf; - unsigned long pinfo, puc; - - sf = (struct signal_frame_64 __user *) sp; - if (read_user_stack_64((unsigned long __user *) &sf->pinfo, &pinfo) || - read_user_stack_64((unsigned long __user *) &sf->puc, &puc)) -
[PATCH v2 rebase 31/34] powerpc/64: make buildable without CONFIG_COMPAT
There are numerous references to 32bit functions in generic and 64bit code so ifdef them out. Signed-off-by: Michal Suchanek --- arch/powerpc/include/asm/thread_info.h | 4 ++-- arch/powerpc/kernel/Makefile | 6 +++--- arch/powerpc/kernel/entry_64.S | 2 ++ arch/powerpc/kernel/signal.c | 3 +-- arch/powerpc/kernel/syscall_64.c | 6 ++ arch/powerpc/kernel/vdso.c | 3 ++- arch/powerpc/perf/callchain.c | 8 +++- 7 files changed, 19 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h index 8e1d0195ac36..c128d8a48ea3 100644 --- a/arch/powerpc/include/asm/thread_info.h +++ b/arch/powerpc/include/asm/thread_info.h @@ -144,10 +144,10 @@ static inline bool test_thread_local_flags(unsigned int flags) return (ti->local_flags & flags) != 0; } -#ifdef CONFIG_PPC64 +#ifdef CONFIG_COMPAT #define is_32bit_task()(test_thread_flag(TIF_32BIT)) #else -#define is_32bit_task()(1) +#define is_32bit_task()(IS_ENABLED(CONFIG_PPC32)) #endif #if defined(CONFIG_PPC64) diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 72ba4622fc2c..0270f4b440a5 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -41,16 +41,16 @@ CFLAGS_btext.o += -DDISABLE_BRANCH_PROFILING endif obj-y := cputable.o ptrace.o syscalls.o \ - irq.o align.o signal_32.o pmc.o vdso.o \ + irq.o align.o signal_$(BITS).o pmc.o vdso.o \ process.o systbl.o idle.o \ signal.o sysfs.o cacheinfo.o time.o \ prom.o traps.o setup-common.o \ udbg.o misc.o io.o misc_$(BITS).o \ of_platform.o prom_parse.o -obj-$(CONFIG_PPC64)+= setup_64.o sys_ppc32.o \ - signal_64.o ptrace32.o \ +obj-$(CONFIG_PPC64)+= setup_64.o \ paca.o nvram_64.o firmware.o note.o \ syscall_64.o +obj-$(CONFIG_COMPAT) += sys_ppc32.o ptrace32.o signal_32.o obj-$(CONFIG_VDSO32) += vdso32/ obj-$(CONFIG_PPC_WATCHDOG) += watchdog.o obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 
00173cc904ef..c339a984958f 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -52,8 +52,10 @@ SYS_CALL_TABLE: .tc sys_call_table[TC],sys_call_table +#ifdef CONFIG_COMPAT COMPAT_SYS_CALL_TABLE: .tc compat_sys_call_table[TC],compat_sys_call_table +#endif /* This value is used to mark exception frames on the stack. */ exception_marker: diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c index 60436432399f..61678cb0e6a1 100644 --- a/arch/powerpc/kernel/signal.c +++ b/arch/powerpc/kernel/signal.c @@ -247,7 +247,6 @@ static void do_signal(struct task_struct *tsk) sigset_t *oldset = sigmask_to_save(); struct ksignal ksig = { .sig = 0 }; int ret; - int is32 = is_32bit_task(); BUG_ON(tsk != current); @@ -277,7 +276,7 @@ static void do_signal(struct task_struct *tsk) rseq_signal_deliver(&ksig, tsk->thread.regs); - if (is32) { + if (is_32bit_task()) { if (ksig.ka.sa.sa_flags & SA_SIGINFO) ret = handle_rt_signal32(&ksig, oldset, tsk); else diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c index 62f44c3072f3..783deda66866 100644 --- a/arch/powerpc/kernel/syscall_64.c +++ b/arch/powerpc/kernel/syscall_64.c @@ -18,7 +18,6 @@ typedef long (*syscall_fn)(long, long, long, long, long, long); long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, unsigned long r0, struct pt_regs *regs) { - unsigned long ti_flags; syscall_fn f; if (IS_ENABLED(CONFIG_PPC_BOOK3S)) @@ -65,8 +64,7 @@ long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, __hard_irq_enable(); - ti_flags = current_thread_info()->flags; - if (unlikely(ti_flags & _TIF_SYSCALL_DOTRACE)) { + if (unlikely(current_thread_info()->flags & _TIF_SYSCALL_DOTRACE)) { /* * We use the return value of do_syscall_trace_enter() as the * syscall number.
If the syscall was rejected for any reason @@ -82,7 +80,7 @@ long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, /* May be faster to do array_index_nospec? */ barrier_nospec(); - if (unlikely(ti_flags & _TIF_32BIT)) { + if (unlikely(is_32bit_task())) { f = (void
[PATCH v2 rebase 32/34] powerpc/64: Make COMPAT user-selectable disabled on littleendian by default.
On bigendian ppc64 it is common to have 32bit legacy binaries but much less so on littleendian. Signed-off-by: Michal Suchanek Reviewed-by: Christophe Leroy --- arch/powerpc/Kconfig | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index e446bb5b3f8d..fabae186eea7 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -267,8 +267,9 @@ config PANIC_TIMEOUT default 180 config COMPAT - bool - default y if PPC64 + bool "Enable support for 32bit binaries" + depends on PPC64 + default y if !CPU_LITTLE_ENDIAN select COMPAT_BINFMT_ELF select ARCH_WANT_OLD_COMPAT_IPC select COMPAT_OLD_SIGACTION -- 2.23.0
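One payoff of making COMPAT selectable is that with COMPAT=n on a 64-bit kernel, `is_32bit_task()` becomes a compile-time constant and the compiler can discard the 32-bit paths entirely. The sketch below models that behaviour in plain userspace C; `MODEL_COMPAT`, `MODEL_PPC32`, and `model_tif_32bit` are illustrative stand-ins for `CONFIG_COMPAT`, `CONFIG_PPC32`, and `test_thread_flag(TIF_32BIT)`, not kernel API.

```c
#include <stdbool.h>

/* Stand-ins for the Kconfig symbols; set here for illustration only. */
#define MODEL_COMPAT 0	/* models CONFIG_COMPAT=n on a 64-bit build */
#define MODEL_PPC32  0	/* models a 64-bit (not PPC32) build */

/* Stand-in for test_thread_flag(TIF_32BIT). */
static bool model_tif_32bit;

/* Model of the reworked is_32bit_task(): only a COMPAT kernel consults
 * the per-task flag; otherwise the answer is a compile-time constant,
 * so dead 32-bit code can be eliminated. */
static bool is_32bit_task_model(void)
{
	if (MODEL_COMPAT)
		return model_tif_32bit;
	return MODEL_PPC32;
}
```

With both model symbols at 0, the function folds to constant false regardless of the runtime flag, which is exactly what lets the COMPAT=n build drop the compat signal and syscall paths.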
[PATCH v2 rebase 30/34] powerpc/perf: consolidate valid_user_sp
Merge the 32bit and 64bit versions. Halve the check constants on 32bit. Use STACK_TOP since it is defined. Passing is_64 is now redundant since is_32bit_task() is used to determine which callchain variant should be used. Use STACK_TOP and is_32bit_task() directly. This removes a page from the valid 32bit area on 64bit: #define TASK_SIZE_USER32 (0x0000000100000000UL - (1 * PAGE_SIZE)) #define STACK_TOP_USER32 TASK_SIZE_USER32 Signed-off-by: Michal Suchanek --- arch/powerpc/perf/callchain.c | 27 +++ 1 file changed, 11 insertions(+), 16 deletions(-) diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c index c6c4c609cc14..a22a19975a19 100644 --- a/arch/powerpc/perf/callchain.c +++ b/arch/powerpc/perf/callchain.c @@ -102,6 +102,15 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re } } +static inline int valid_user_sp(unsigned long sp) +{ + bool is_64 = !is_32bit_task(); + + if (!sp || (sp & (is_64 ? 7 : 3)) || sp > STACK_TOP - (is_64 ? 32 : 16)) + return 0; + return 1; +} + #ifdef CONFIG_PPC64 /* * On 64-bit we don't want to invoke hash_page on user addresses from @@ -165,13 +174,6 @@ static int read_user_stack_64(unsigned long __user *ptr, unsigned long *ret) return read_user_stack_slow(ptr, ret, 8); } -static inline int valid_user_sp(unsigned long sp, int is_64) -{ - if (!sp || (sp & 7) || sp > (is_64 ? TASK_SIZE : 0x100000000UL) - 32) - return 0; - return 1; -} - /* * 64-bit user processes use the same stack frame for RT and non-RT signals.
*/ @@ -230,7 +232,7 @@ static void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry, while (entry->nr < entry->max_stack) { fp = (unsigned long __user *) sp; - if (!valid_user_sp(sp, 1) || read_user_stack_64(fp, &next_sp)) + if (!valid_user_sp(sp) || read_user_stack_64(fp, &next_sp)) return; if (level > 0 && read_user_stack_64(&fp[2], &next_ip)) return; @@ -279,13 +281,6 @@ static inline void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry { } -static inline int valid_user_sp(unsigned long sp, int is_64) -{ - if (!sp || (sp & 7) || sp > TASK_SIZE - 32) - return 0; - return 1; -} - #define __SIGNAL_FRAMESIZE32 __SIGNAL_FRAMESIZE #define sigcontext32 sigcontext #define mcontext32 mcontext @@ -428,7 +423,7 @@ static void perf_callchain_user_32(struct perf_callchain_entry_ctx *entry, while (entry->nr < entry->max_stack) { fp = (unsigned int __user *) (unsigned long) sp; - if (!valid_user_sp(sp, 0) || read_user_stack_32(fp, &next_sp)) + if (!valid_user_sp(sp) || read_user_stack_32(fp, &next_sp)) return; if (level > 0 && read_user_stack_32(&fp[1], &next_ip)) return; -- 2.23.0
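To see what the merged check accepts, it can be modelled in plain userspace C. `MODEL_STACK_TOP` is an illustrative stand-in for the kernel's STACK_TOP, and `uint64_t` stands in for the kernel's `unsigned long` on 64-bit; this is a sketch of the logic, not the kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's STACK_TOP; value chosen for illustration. */
#define MODEL_STACK_TOP 0x100000000ULL

/* Model of the consolidated valid_user_sp(): one function for both
 * task sizes, with the constants halved for 32-bit tasks (4-byte
 * alignment and a 16-byte minimum frame instead of 8/32). */
static int valid_user_sp_model(uint64_t sp, bool is_64)
{
	/* Reject a NULL sp, a misaligned sp, and an sp too close to the
	 * top of the stack to hold a minimal stack frame. */
	if (!sp || (sp & (is_64 ? 7 : 3)) || sp > MODEL_STACK_TOP - (is_64 ? 32 : 16))
		return 0;
	return 1;
}
```

Note how the same pointer can be a valid 32-bit stack pointer but an invalid 64-bit one: a 4-byte-aligned address passes the 32-bit alignment mask while failing the 8-byte 64-bit one.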
[PATCH v2 rebase 29/34] powerpc/perf: consolidate read_user_stack_32
There are two almost identical copies for 32bit and 64bit. The function is used only in 32bit code which will be split out in the next patch so consolidate to one function. Signed-off-by: Michal Suchanek Reviewed-by: Christophe Leroy --- arch/powerpc/perf/callchain.c | 59 +++ 1 file changed, 25 insertions(+), 34 deletions(-) diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c index 35d542515faf..c6c4c609cc14 100644 --- a/arch/powerpc/perf/callchain.c +++ b/arch/powerpc/perf/callchain.c @@ -165,22 +165,6 @@ static int read_user_stack_64(unsigned long __user *ptr, unsigned long *ret) return read_user_stack_slow(ptr, ret, 8); } -static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret) -{ - if ((unsigned long)ptr > TASK_SIZE - sizeof(unsigned int) || - ((unsigned long)ptr & 3)) - return -EFAULT; - - pagefault_disable(); - if (!__get_user_inatomic(*ret, ptr)) { - pagefault_enable(); - return 0; - } - pagefault_enable(); - - return read_user_stack_slow(ptr, ret, 4); -} - static inline int valid_user_sp(unsigned long sp, int is_64) { if (!sp || (sp & 7) || sp > (is_64 ? TASK_SIZE : 0x100000000UL) - 32) @@ -285,25 +269,9 @@ static void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry, } #else /* CONFIG_PPC64 */ -/* - * On 32-bit we just access the address and let hash_page create a - * HPTE if necessary, so there is no need to fall back to reading - * the page tables. Since this is called at interrupt level, - * do_page_fault() won't treat a DSI as a page fault.
- */ -static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret) +static int read_user_stack_slow(void __user *ptr, void *buf, int nb) { - int rc; - - if ((unsigned long)ptr > TASK_SIZE - sizeof(unsigned int) || - ((unsigned long)ptr & 3)) - return -EFAULT; - - pagefault_disable(); - rc = __get_user_inatomic(*ret, ptr); - pagefault_enable(); - - return rc; + return 0; } static inline void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry, @@ -326,6 +294,29 @@ static inline int valid_user_sp(unsigned long sp, int is_64) #endif /* CONFIG_PPC64 */ +/* + * On 32-bit we just access the address and let hash_page create a + * HPTE if necessary, so there is no need to fall back to reading + * the page tables. Since this is called at interrupt level, + * do_page_fault() won't treat a DSI as a page fault. + */ +static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret) +{ + int rc; + + if ((unsigned long)ptr > TASK_SIZE - sizeof(unsigned int) || + ((unsigned long)ptr & 3)) + return -EFAULT; + + pagefault_disable(); + rc = __get_user_inatomic(*ret, ptr); + pagefault_enable(); + + if (IS_ENABLED(CONFIG_PPC64) && rc) + return read_user_stack_slow(ptr, ret, 4); + return rc; +} + /* * Layout for non-RT signal frames */ -- 2.23.0
[PATCH v2 rebase 28/34] powerpc: move common register copy functions from signal_32.c to signal.c
These functions are required for 64bit as well. Signed-off-by: Michal Suchanek Reviewed-by: Christophe Leroy --- arch/powerpc/kernel/signal.c| 141 arch/powerpc/kernel/signal_32.c | 140 --- 2 files changed, 141 insertions(+), 140 deletions(-) diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c index e6c30cee6abf..60436432399f 100644 --- a/arch/powerpc/kernel/signal.c +++ b/arch/powerpc/kernel/signal.c @@ -18,12 +18,153 @@ #include #include #include +#include #include #include #include #include "signal.h" +#ifdef CONFIG_VSX +unsigned long copy_fpr_to_user(void __user *to, + struct task_struct *task) +{ + u64 buf[ELF_NFPREG]; + int i; + + /* save FPR copy to local buffer then write to the thread_struct */ + for (i = 0; i < (ELF_NFPREG - 1) ; i++) + buf[i] = task->thread.TS_FPR(i); + buf[i] = task->thread.fp_state.fpscr; + return __copy_to_user(to, buf, ELF_NFPREG * sizeof(double)); +} + +unsigned long copy_fpr_from_user(struct task_struct *task, +void __user *from) +{ + u64 buf[ELF_NFPREG]; + int i; + + if (__copy_from_user(buf, from, ELF_NFPREG * sizeof(double))) + return 1; + for (i = 0; i < (ELF_NFPREG - 1) ; i++) + task->thread.TS_FPR(i) = buf[i]; + task->thread.fp_state.fpscr = buf[i]; + + return 0; +} + +unsigned long copy_vsx_to_user(void __user *to, + struct task_struct *task) +{ + u64 buf[ELF_NVSRHALFREG]; + int i; + + /* save FPR copy to local buffer then write to the thread_struct */ + for (i = 0; i < ELF_NVSRHALFREG; i++) + buf[i] = task->thread.fp_state.fpr[i][TS_VSRLOWOFFSET]; + return __copy_to_user(to, buf, ELF_NVSRHALFREG * sizeof(double)); +} + +unsigned long copy_vsx_from_user(struct task_struct *task, +void __user *from) +{ + u64 buf[ELF_NVSRHALFREG]; + int i; + + if (__copy_from_user(buf, from, ELF_NVSRHALFREG * sizeof(double))) + return 1; + for (i = 0; i < ELF_NVSRHALFREG ; i++) + task->thread.fp_state.fpr[i][TS_VSRLOWOFFSET] = buf[i]; + return 0; +} + +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM +unsigned long 
copy_ckfpr_to_user(void __user *to, + struct task_struct *task) +{ + u64 buf[ELF_NFPREG]; + int i; + + /* save FPR copy to local buffer then write to the thread_struct */ + for (i = 0; i < (ELF_NFPREG - 1) ; i++) + buf[i] = task->thread.TS_CKFPR(i); + buf[i] = task->thread.ckfp_state.fpscr; + return __copy_to_user(to, buf, ELF_NFPREG * sizeof(double)); +} + +unsigned long copy_ckfpr_from_user(struct task_struct *task, + void __user *from) +{ + u64 buf[ELF_NFPREG]; + int i; + + if (__copy_from_user(buf, from, ELF_NFPREG * sizeof(double))) + return 1; + for (i = 0; i < (ELF_NFPREG - 1) ; i++) + task->thread.TS_CKFPR(i) = buf[i]; + task->thread.ckfp_state.fpscr = buf[i]; + + return 0; +} + +unsigned long copy_ckvsx_to_user(void __user *to, + struct task_struct *task) +{ + u64 buf[ELF_NVSRHALFREG]; + int i; + + /* save FPR copy to local buffer then write to the thread_struct */ + for (i = 0; i < ELF_NVSRHALFREG; i++) + buf[i] = task->thread.ckfp_state.fpr[i][TS_VSRLOWOFFSET]; + return __copy_to_user(to, buf, ELF_NVSRHALFREG * sizeof(double)); +} + +unsigned long copy_ckvsx_from_user(struct task_struct *task, + void __user *from) +{ + u64 buf[ELF_NVSRHALFREG]; + int i; + + if (__copy_from_user(buf, from, ELF_NVSRHALFREG * sizeof(double))) + return 1; + for (i = 0; i < ELF_NVSRHALFREG ; i++) + task->thread.ckfp_state.fpr[i][TS_VSRLOWOFFSET] = buf[i]; + return 0; +} +#endif /* CONFIG_PPC_TRANSACTIONAL_MEM */ +#else +inline unsigned long copy_fpr_to_user(void __user *to, + struct task_struct *task) +{ + return __copy_to_user(to, task->thread.fp_state.fpr, + ELF_NFPREG * sizeof(double)); +} + +inline unsigned long copy_fpr_from_user(struct task_struct *task, + void __user *from) +{ + return __copy_from_user(task->thread.fp_state.fpr, from, + ELF_NFPREG * sizeof(double)); +} + +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM +inline unsigned long copy_ckfpr_to_user(void __user *to, +struct task_struct *task) +{ + return __copy_to_user(to, task->thread.ckfp_state.fpr, + ELF_NFPREG * 
sizeof(double)); +} +
[PATCH v2 rebase 27/34] powerpc: Add back __ARCH_WANT_SYS_LLSEEK macro
This partially reverts commit caf6f9c8a326 ("asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro") When CONFIG_COMPAT is disabled on ppc64 the kernel does not build. There is resistance to both removing the llseek syscall from the 64bit syscall tables and building the llseek interface unconditionally. Link: https://lore.kernel.org/lkml/20190828151552.ga16...@infradead.org/ Link: https://lore.kernel.org/lkml/20190829214319.498c7de2@naga/ Signed-off-by: Michal Suchanek Reviewed-by: Arnd Bergmann --- arch/powerpc/include/asm/unistd.h | 1 + fs/read_write.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h index b0720c7c3fcf..700fcdac2e3c 100644 --- a/arch/powerpc/include/asm/unistd.h +++ b/arch/powerpc/include/asm/unistd.h @@ -31,6 +31,7 @@ #define __ARCH_WANT_SYS_SOCKETCALL #define __ARCH_WANT_SYS_FADVISE64 #define __ARCH_WANT_SYS_GETPGRP +#define __ARCH_WANT_SYS_LLSEEK #define __ARCH_WANT_SYS_NICE #define __ARCH_WANT_SYS_OLD_GETRLIMIT #define __ARCH_WANT_SYS_OLD_UNAME diff --git a/fs/read_write.c b/fs/read_write.c index 5bbf587f5bc1..89aa2701dbeb 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -331,7 +331,8 @@ COMPAT_SYSCALL_DEFINE3(lseek, unsigned int, fd, compat_off_t, offset, unsigned i } #endif -#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT) +#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT) || \ + defined(__ARCH_WANT_SYS_LLSEEK) SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high, unsigned long, offset_low, loff_t __user *, result, unsigned int, whence) -- 2.23.0
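The reason 32-bit userspace needs llseek at all is that a 64-bit file offset does not fit in a single 32-bit register, so the syscall takes the offset as two halves and the kernel recombines them. The sketch below models that combination step in userspace C (it is a model of the offset arithmetic, not the kernel's `sys_llseek` itself):

```c
#include <stdint.h>

/* Model of how llseek reassembles its 64-bit offset (loff_t in the
 * kernel) from the two 32-bit register halves passed by userspace:
 * the high word is shifted up and OR-ed with the low word. */
static int64_t llseek_offset_model(uint32_t offset_high, uint32_t offset_low)
{
	return ((int64_t)offset_high << 32) | offset_low;
}
```

Keeping the syscall available on a 64-bit kernel with COMPAT=n costs only this one entry point, which is what the `__ARCH_WANT_SYS_LLSEEK` condition added above provides.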
[PATCH v2 rebase 26/34] powerpc/64: system call: Fix sparse warning about missing declaration
Sparse warns about missing declarations for these functions: +arch/powerpc/kernel/syscall_64.c:108:23: warning: symbol 'syscall_exit_prepare' was not declared. Should it be static? +arch/powerpc/kernel/syscall_64.c:18:6: warning: symbol 'system_call_exception' was not declared. Should it be static? +arch/powerpc/kernel/syscall_64.c:200:23: warning: symbol 'interrupt_exit_user_prepare' was not declared. Should it be static? +arch/powerpc/kernel/syscall_64.c:288:23: warning: symbol 'interrupt_exit_kernel_prepare' was not declared. Should it be static? Add declaration for them. Signed-off-by: Michal Suchanek --- arch/powerpc/include/asm/asm-prototypes.h | 6 ++ arch/powerpc/kernel/syscall_64.c | 1 + 2 files changed, 7 insertions(+) diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h index 399ca63196e4..841746357833 100644 --- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -96,6 +96,12 @@ ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, s unsigned long __init early_init(unsigned long dt_ptr); void __init machine_init(u64 dt_ptr); #endif +#ifdef CONFIG_PPC64 +long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, unsigned long r0, struct pt_regs *regs); +notrace unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs *regs); +notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned long msr); +notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsigned long msr); +#endif long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, u32 len_high, u32 len_low); diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c index d00cfc4a39a9..62f44c3072f3 100644 --- a/arch/powerpc/kernel/syscall_64.c +++ b/arch/powerpc/kernel/syscall_64.c @@ -1,4 +1,5 @@ #include +#include #include #include #include -- 2.23.0
[PATCH v2 rebase 25/34] powerpc/64s/exception: remove lite interrupt return
From: Nicholas Piggin The difference between lite and regular returns is that the regular case restores all NVGPRs, whereas lite skips that. This is quite clumsy though: most interrupts want the NVGPRs saved for debugging, not to modify in the caller, so the NVGPRs restore is not necessary most of the time. Restore NVGPRs explicitly for one case that requires it, and move everything else over to avoiding the restore unless the interrupt return demands it (e.g., handling a signal). Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/entry_64.S | 4 arch/powerpc/kernel/exceptions-64s.S | 21 +++-- 2 files changed, 11 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index b2e68f5ca8f7..00173cc904ef 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -452,10 +452,6 @@ _GLOBAL(fast_interrupt_return) .balign IFETCH_ALIGN_BYTES _GLOBAL(interrupt_return) - REST_NVGPRS(r1) - - .balign IFETCH_ALIGN_BYTES -_GLOBAL(interrupt_return_lite) ld r4,_MSR(r1) andi.
r0,r4,MSR_PR beq kernel_interrupt_return diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 269edd1460be..1bccc869ebd3 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -1507,7 +1507,7 @@ EXC_COMMON_BEGIN(hardware_interrupt_common) RUNLATCH_ON addir3,r1,STACK_FRAME_OVERHEAD bl do_IRQ - b interrupt_return_lite + b interrupt_return GEN_KVM hardware_interrupt @@ -1694,7 +1694,7 @@ EXC_COMMON_BEGIN(decrementer_common) RUNLATCH_ON addir3,r1,STACK_FRAME_OVERHEAD bl timer_interrupt - b interrupt_return_lite + b interrupt_return GEN_KVM decrementer @@ -1785,7 +1785,7 @@ EXC_COMMON_BEGIN(doorbell_super_common) #else bl unknown_exception #endif - b interrupt_return_lite + b interrupt_return GEN_KVM doorbell_super @@ -2183,7 +2183,7 @@ EXC_COMMON_BEGIN(h_doorbell_common) #else bl unknown_exception #endif - b interrupt_return_lite + b interrupt_return GEN_KVM h_doorbell @@ -2213,7 +2213,7 @@ EXC_COMMON_BEGIN(h_virt_irq_common) RUNLATCH_ON addir3,r1,STACK_FRAME_OVERHEAD bl do_IRQ - b interrupt_return_lite + b interrupt_return GEN_KVM h_virt_irq @@ -2260,7 +2260,7 @@ EXC_COMMON_BEGIN(performance_monitor_common) RUNLATCH_ON addir3,r1,STACK_FRAME_OVERHEAD bl performance_monitor_exception - b interrupt_return_lite + b interrupt_return GEN_KVM performance_monitor @@ -3013,7 +3013,7 @@ do_hash_page: cmpdi r3,0/* see if __hash_page succeeded */ /* Success */ - beq interrupt_return_lite /* Return from exception on success */ + beq interrupt_return/* Return from exception on success */ /* Error */ blt-13f @@ -3027,10 +3027,11 @@ do_hash_page: handle_page_fault: 11:andis. 
r0,r5,DSISR_DABRMATCH@h bne-handle_dabr_fault + bl save_nvgprs addir3,r1,STACK_FRAME_OVERHEAD bl do_page_fault cmpdi r3,0 - beq+interrupt_return_lite + beq+interrupt_return mr r5,r3 addir3,r1,STACK_FRAME_OVERHEAD ld r4,_DAR(r1) @@ -3045,9 +3046,9 @@ handle_dabr_fault: bl do_break /* * do_break() may have changed the NV GPRS while handling a breakpoint. -* If so, we need to restore them with their updated values. Don't use -* interrupt_return_lite here. +* If so, we need to restore them with their updated values. */ + REST_NVGPRS(r1) b interrupt_return -- 2.23.0
[PATCH v2 rebase 24/34] powerpc/64s: interrupt return in C
From: Nicholas Piggin Implement the bulk of interrupt return logic in C. The asm return code must handle a few cases: restoring full GPRs, and emulating stack store. The asm return code is moved into 64e for now. The new logic has made allowance for 64e, but I don't have a full environment that works well to test it, and even booting in emulated qemu is not great for stress testing. 64e shouldn't be too far off working with this, given a bit more testing and auditing of the logic. This is slightly faster on a POWER9 (page fault speed increases about 1.1%), probably due to reduced mtmsrd. Signed-off-by: Nicholas Piggin [ms: Move the FP restore functions to restore_math. They are not used anywhere else and when restore_math is not built gcc warns about them being unused. Add asm/context_tracking.h include to exceptions-64e.S for SCHEDULE_USER definition.] Signed-off-by: Michal Suchanek --- .../powerpc/include/asm/book3s/64/kup-radix.h | 10 + arch/powerpc/include/asm/switch_to.h | 6 + arch/powerpc/kernel/entry_64.S| 475 -- arch/powerpc/kernel/exceptions-64e.S | 255 +- arch/powerpc/kernel/exceptions-64s.S | 119 ++--- arch/powerpc/kernel/process.c | 89 ++-- arch/powerpc/kernel/syscall_64.c | 157 +- arch/powerpc/kernel/vector.S | 2 +- 8 files changed, 623 insertions(+), 490 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h b/arch/powerpc/include/asm/book3s/64/kup-radix.h index 07058edc5970..762afbed4762 100644 --- a/arch/powerpc/include/asm/book3s/64/kup-radix.h +++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h @@ -60,6 +60,12 @@ #include #include +static inline void kuap_restore_amr(struct pt_regs *regs) +{ + if (mmu_has_feature(MMU_FTR_RADIX_KUAP)) + mtspr(SPRN_AMR, regs->kuap); +} + static inline void kuap_check_amr(void) { if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) && mmu_has_feature(MMU_FTR_RADIX_KUAP)) @@ -110,6 +116,10 @@ static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write) "Bug: %s fault blocked by AMR!", is_write ? 
"Write" : "Read"); } #else /* CONFIG_PPC_KUAP */ +static inline void kuap_restore_amr(struct pt_regs *regs) +{ +} + static inline void kuap_check_amr(void) { } diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h index 476008bc3d08..b867b58b1093 100644 --- a/arch/powerpc/include/asm/switch_to.h +++ b/arch/powerpc/include/asm/switch_to.h @@ -23,7 +23,13 @@ extern void switch_booke_debug_regs(struct debug_reg *new_debug); extern int emulate_altivec(struct pt_regs *); +#ifdef CONFIG_PPC_BOOK3S_64 void restore_math(struct pt_regs *regs); +#else +static inline void restore_math(struct pt_regs *regs) +{ +} +#endif void restore_tm_state(struct pt_regs *regs); diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 15bc2a872a76..b2e68f5ca8f7 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -16,6 +16,7 @@ #include #include +#include #include #include #include @@ -279,7 +280,7 @@ flush_count_cache: * state of one is saved on its kernel stack. Then the state * of the other is restored from its kernel stack. The memory * management hardware is updated to the second process's state. - * Finally, we can return to the second process, via ret_from_except. + * Finally, we can return to the second process, via interrupt_return. * On entry, r3 points to the THREAD for the current task, r4 * points to the THREAD for the new task. * @@ -433,408 +434,150 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S) addir1,r1,SWITCH_FRAME_SIZE blr - .align 7 -_GLOBAL(ret_from_except) - ld r11,_TRAP(r1) - andi. r0,r11,1 - bne ret_from_except_lite - REST_NVGPRS(r1) - -_GLOBAL(ret_from_except_lite) +#ifdef CONFIG_PPC_BOOK3S /* -* Disable interrupts so that current_thread_info()->flags -* can't change between when we test it and when we return -* from the interrupt. 
-*/ -#ifdef CONFIG_PPC_BOOK3E - wrteei 0 -#else - li r10,MSR_RI - mtmsrd r10,1 /* Update machine state */ -#endif /* CONFIG_PPC_BOOK3E */ +* If MSR EE/RI was never enabled, IRQs not reconciled, NVGPRs not +* touched, AMR not set, no exit work created, then this can be used. +*/ + .balign IFETCH_ALIGN_BYTES +_GLOBAL(fast_interrupt_return) + ld r4,_MSR(r1) + andi. r0,r4,MSR_PR + bne .Lfast_user_interrupt_return + andi. r0,r4,MSR_RI + bne+.Lfast_kernel_interrupt_return + addir3,r1,STACK_FRAME_OVERHEAD + bl unrecoverable_exception + b . /* should not get here */ - ld r9, PACA_THREAD_INFO(r13) - ld r3,_MSR(r1) -#ifdef
[PATCH v2 rebase 23/34] powerpc/64: system call implement the bulk of the logic in C
From: Nicholas Piggin System call entry and particularly exit code is beyond the limit of what is reasonable to implement in asm. This conversion moves all conditional branches out of the asm code, except for the case that all GPRs should be restored at exit. Null syscall test is about 5% faster after this patch, because the exit work is handled under local_irq_disable, and the hard mask and pending interrupt replay is handled after that, which avoids games with MSR. Signed-off-by: Nicholas Piggin [ms: add endian conversion for dtl_idx] Signed-off-by: Michal Suchanek v3: - Fix !KUAP build [mpe] - Fix BookE build/boot [mpe] - Don't trace irqs with MSR[RI]=0 - Don't allow syscall_exit_prepare to be ftraced, because function graph tracing which traces exits barfs after the IRQ state is prepared for kernel exit. - Fix BE syscall table to use normal function descriptors now that they are called from C. - Comment syscall_exit_prepare. --- arch/powerpc/include/asm/asm-prototypes.h | 11 - .../powerpc/include/asm/book3s/64/kup-radix.h | 14 +- arch/powerpc/include/asm/cputime.h| 24 ++ arch/powerpc/include/asm/hw_irq.h | 4 + arch/powerpc/include/asm/ptrace.h | 3 + arch/powerpc/include/asm/signal.h | 3 + arch/powerpc/include/asm/switch_to.h | 5 + arch/powerpc/include/asm/time.h | 3 + arch/powerpc/kernel/Makefile | 3 +- arch/powerpc/kernel/entry_64.S| 337 +++--- arch/powerpc/kernel/signal.h | 2 - arch/powerpc/kernel/syscall_64.c | 195 ++ arch/powerpc/kernel/systbl.S | 9 +- 13 files changed, 300 insertions(+), 313 deletions(-) create mode 100644 arch/powerpc/kernel/syscall_64.c diff --git a/arch/powerpc/include/asm/asm-prototypes.h b/arch/powerpc/include/asm/asm-prototypes.h index 8561498e653c..399ca63196e4 100644 --- a/arch/powerpc/include/asm/asm-prototypes.h +++ b/arch/powerpc/include/asm/asm-prototypes.h @@ -103,14 +103,6 @@ long sys_switch_endian(void); notrace unsigned int __check_irq_replay(void); void notrace restore_interrupts(void); -/* ptrace */ -long 
do_syscall_trace_enter(struct pt_regs *regs); -void do_syscall_trace_leave(struct pt_regs *regs); - -/* process */ -void restore_math(struct pt_regs *regs); -void restore_tm_state(struct pt_regs *regs); - /* prom_init (OpenFirmware) */ unsigned long __init prom_init(unsigned long r3, unsigned long r4, unsigned long pp, @@ -121,9 +113,6 @@ unsigned long __init prom_init(unsigned long r3, unsigned long r4, void __init early_setup(unsigned long dt_ptr); void early_setup_secondary(void); -/* time */ -void accumulate_stolen_time(void); - /* misc runtime */ extern u64 __bswapdi2(u64); extern s64 __lshrdi3(s64, int); diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h b/arch/powerpc/include/asm/book3s/64/kup-radix.h index f254de956d6a..07058edc5970 100644 --- a/arch/powerpc/include/asm/book3s/64/kup-radix.h +++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h @@ -3,6 +3,7 @@ #define _ASM_POWERPC_BOOK3S_64_KUP_RADIX_H #include +#include #define AMR_KUAP_BLOCK_READUL(0x4000) #define AMR_KUAP_BLOCK_WRITE UL(0x8000) @@ -56,7 +57,14 @@ #ifdef CONFIG_PPC_KUAP -#include +#include +#include + +static inline void kuap_check_amr(void) +{ + if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) && mmu_has_feature(MMU_FTR_RADIX_KUAP)) + WARN_ON_ONCE(mfspr(SPRN_AMR) != AMR_KUAP_BLOCKED); +} /* * We support individually allowing read or write, but we don't support nesting @@ -101,6 +109,10 @@ static inline bool bad_kuap_fault(struct pt_regs *regs, bool is_write) (regs->kuap & (is_write ? AMR_KUAP_BLOCK_WRITE : AMR_KUAP_BLOCK_READ)), "Bug: %s fault blocked by AMR!", is_write ? 
"Write" : "Read"); } +#else /* CONFIG_PPC_KUAP */ +static inline void kuap_check_amr(void) +{ +} #endif /* CONFIG_PPC_KUAP */ #endif /* __ASSEMBLY__ */ diff --git a/arch/powerpc/include/asm/cputime.h b/arch/powerpc/include/asm/cputime.h index 2431b4ada2fa..c43614cffaac 100644 --- a/arch/powerpc/include/asm/cputime.h +++ b/arch/powerpc/include/asm/cputime.h @@ -60,6 +60,30 @@ static inline void arch_vtime_task_switch(struct task_struct *prev) } #endif +static inline void account_cpu_user_entry(void) +{ + unsigned long tb = mftb(); + struct cpu_accounting_data *acct = get_accounting(current); + + acct->utime += (tb - acct->starttime_user); + acct->starttime = tb; +} +static inline void account_cpu_user_exit(void) +{ + unsigned long tb = mftb(); + struct cpu_accounting_data *acct = get_accounting(current); + + acct->stime += (tb - acct->starttime); + acct->starttime_user = tb; +} + #endif /* __KERNEL__ */ +#else /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */ +static inline void
[PATCH v2 rebase 22/34] powerpc/64: system call remove non-volatile GPR save optimisation
From: Nicholas Piggin powerpc has an optimisation where interrupts avoid saving the non-volatile (or callee saved) registers to the interrupt stack frame if they are not required. Two problems with this are that an interrupt does not always know whether it will need non-volatiles; and if it does need them, they can only be saved from the entry-scoped asm code (because we don't control what the C compiler does with these registers). system calls are the most difficult: some system calls always require all registers (e.g., fork, to copy regs into the child). Sometimes registers are only required under certain conditions (e.g., tracing, signal delivery). These cases require ugly logic in the call chains (e.g., ppc_fork), and require a lot of logic to be implemented in asm. So remove the optimisation for system calls, and always save NVGPRs on entry. Modern high performance CPUs are not so sensitive, because the stores are dense in cache and can be hidden by other expensive work in the syscall path -- the null syscall selftests benchmark on POWER9 is not slowed (124.40ns before and 123.64ns after, i.e., within the noise). Other interrupts retain the NVGPR optimisation for now. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/entry_64.S | 72 +--- arch/powerpc/kernel/syscalls/syscall.tbl | 22 +--- 2 files changed, 28 insertions(+), 66 deletions(-) diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 6467bdab8d40..5a3e0b5c9ad1 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -98,13 +98,14 @@ END_BTB_FLUSH_SECTION std r11,_XER(r1) std r11,_CTR(r1) std r9,GPR13(r1) + SAVE_NVGPRS(r1) mflrr10 /* * This clears CR0.SO (bit 28), which is the error indication on * return from this system call. 
*/ rldimi r2,r11,28,(63-28) - li r11,0xc01 + li r11,0xc00 std r10,_LINK(r1) std r11,_TRAP(r1) std r3,ORIG_GPR3(r1) @@ -323,7 +324,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) /* Traced system call support */ .Lsyscall_dotrace: - bl save_nvgprs addir3,r1,STACK_FRAME_OVERHEAD bl do_syscall_trace_enter @@ -408,7 +408,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) mtmsrd r10,1 #endif /* CONFIG_PPC_BOOK3E */ - bl save_nvgprs addir3,r1,STACK_FRAME_OVERHEAD bl do_syscall_trace_leave b ret_from_except @@ -442,62 +441,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) _ASM_NOKPROBE_SYMBOL(system_call_common); _ASM_NOKPROBE_SYMBOL(system_call_exit); -/* Save non-volatile GPRs, if not already saved. */ -_GLOBAL(save_nvgprs) - ld r11,_TRAP(r1) - andi. r0,r11,1 - beqlr- - SAVE_NVGPRS(r1) - clrrdi r0,r11,1 - std r0,_TRAP(r1) - blr -_ASM_NOKPROBE_SYMBOL(save_nvgprs); - - -/* - * The sigsuspend and rt_sigsuspend system calls can call do_signal - * and thus put the process into the stopped state where we might - * want to examine its user state with ptrace. Therefore we need - * to save all the nonvolatile registers (r14 - r31) before calling - * the C code. Similarly, fork, vfork and clone need the full - * register state on the stack so that it can be copied to the child. 
- */ - -_GLOBAL(ppc_fork) - bl save_nvgprs - bl sys_fork - b .Lsyscall_exit - -_GLOBAL(ppc_vfork) - bl save_nvgprs - bl sys_vfork - b .Lsyscall_exit - -_GLOBAL(ppc_clone) - bl save_nvgprs - bl sys_clone - b .Lsyscall_exit - -_GLOBAL(ppc_clone3) - bl save_nvgprs - bl sys_clone3 - b .Lsyscall_exit - -_GLOBAL(ppc32_swapcontext) - bl save_nvgprs - bl compat_sys_swapcontext - b .Lsyscall_exit - -_GLOBAL(ppc64_swapcontext) - bl save_nvgprs - bl sys_swapcontext - b .Lsyscall_exit - -_GLOBAL(ppc_switch_endian) - bl save_nvgprs - bl sys_switch_endian - b .Lsyscall_exit - _GLOBAL(ret_from_fork) bl schedule_tail REST_NVGPRS(r1) @@ -516,6 +459,17 @@ _GLOBAL(ret_from_kernel_thread) li r3,0 b .Lsyscall_exit +/* Save non-volatile GPRs, if not already saved. */ +_GLOBAL(save_nvgprs) + ld r11,_TRAP(r1) + andi. r0,r11,1 + beqlr- + SAVE_NVGPRS(r1) + clrrdi r0,r11,1 + std r0,_TRAP(r1) + blr +_ASM_NOKPROBE_SYMBOL(save_nvgprs); + #ifdef CONFIG_PPC_BOOK3S_64 #define FLUSH_COUNT_CACHE \ diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index 43f736ed47f2..d899bcb5343e 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -9,7 +9,9 @@
[PATCH v2 rebase 21/34] powerpc/64s/exception: soft nmi interrupt should not use ret_from_except
From: Nicholas Piggin

The soft nmi handler does not reconcile interrupt state, so it should not return via the normal ret_from_except path. Return like other NMIs, using the EXCEPTION_RESTORE_REGS macro.

This becomes important when the scv interrupt is implemented, which must handle soft-masked interrupts that have r13 set to something other than the PACA -- returning to kernel in this case must restore r13.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/kernel/exceptions-64s.S | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 38bc66b95516..af1264cd005f 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -2740,7 +2740,11 @@ EXC_COMMON_BEGIN(soft_nmi_common)
 	bl	save_nvgprs
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	soft_nmi_interrupt
-	b	ret_from_except
+	/* Clear MSR_RI before setting SRR0 and SRR1. */
+	li	r9,0
+	mtmsrd	r9,1
+	EXCEPTION_RESTORE_REGS hsrr=0
+	RFI_TO_KERNEL
 #endif /* CONFIG_PPC_WATCHDOG */
-- 
2.23.0
[PATCH v2 rebase 20/34] powerpc/64s/exception: only test KVM in SRR interrupts when PR KVM is supported
From: Nicholas Piggin Apart from SRESET, MCE, and syscall (hcall variant), the SRR type interrupts are not escalated to hypervisor mode, so delivered to the OS. When running PR KVM, the OS is the hypervisor, and the guest runs with MSR[PR]=1, so these interrupts must test if a guest was running when interrupted. These tests are required at the real-mode entry points because the PR KVM host runs with LPCR[AIL]=0. In HV KVM and nested HV KVM, the guest always receives these interrupts, so there is no need for the host to make this test. So remove the tests if PR KVM is not configured. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 65 ++-- 1 file changed, 62 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 2f50587392aa..38bc66b95516 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -214,9 +214,36 @@ do_define_int n #ifdef CONFIG_KVM_BOOK3S_64_HANDLER #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE /* - * If hv is possible, interrupts come into to the hv version - * of the kvmppc_interrupt code, which then jumps to the PR handler, - * kvmppc_interrupt_pr, if the guest is a PR guest. + * All interrupts which set HSRR registers, as well as SRESET and MCE and + * syscall when invoked with "sc 1" switch to MSR[HV]=1 (HVMODE) to be taken, + * so they all generally need to test whether they were taken in guest context. + * + * Note: SRESET and MCE may also be sent to the guest by the hypervisor, and be + * taken with MSR[HV]=0. + * + * Interrupts which set SRR registers (with the above exceptions) do not + * elevate to MSR[HV]=1 mode, though most can be taken when running with + * MSR[HV]=1 (e.g., bare metal kernel and userspace). So these interrupts do + * not need to test whether a guest is running because they get delivered to + * the guest directly, including nested HV KVM guests. 
+ *
+ * The exception is PR KVM, where the guest runs with MSR[PR]=1 and the host
+ * runs with MSR[HV]=0, so the host takes all interrupts on behalf of the
+ * guest. PR KVM runs with LPCR[AIL]=0 which causes interrupts to always be
+ * delivered to the real-mode entry point, therefore such interrupts only test
+ * KVM in their real mode handlers, and only when PR KVM is possible.
+ *
+ * Interrupts that are taken in MSR[HV]=0 and escalate to MSR[HV]=1 are always
+ * delivered in real-mode when the MMU is in hash mode because the MMU
+ * registers are not set appropriately to translate host addresses. In nested
+ * radix mode these can be delivered in virt-mode as the host translations are
+ * used implicitly (see: effective LPID, effective PID).
+ */
+
+/*
+ * If an interrupt is taken while a guest is running, it is immediately routed
+ * to KVM to handle. If both HV and PR KVM are possible, KVM interrupts go
+ * first to kvmppc_interrupt_hv, which handles the PR guest case.
 */
 #define kvmppc_interrupt kvmppc_interrupt_hv
 #else
@@ -1258,8 +1285,10 @@ INT_DEFINE_BEGIN(data_access)
 	IVEC=0x300
 	IDAR=1
 	IDSISR=1
+#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_SKIP=1
 	IKVM_REAL=1
+#endif
 INT_DEFINE_END(data_access)
 
 EXC_REAL_BEGIN(data_access, 0x300, 0x80)
@@ -1306,8 +1335,10 @@ INT_DEFINE_BEGIN(data_access_slb)
 	IAREA=PACA_EXSLB
 	IRECONCILE=0
 	IDAR=1
+#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_SKIP=1
 	IKVM_REAL=1
+#endif
 INT_DEFINE_END(data_access_slb)
 
 EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
@@ -1357,7 +1388,9 @@ INT_DEFINE_BEGIN(instruction_access)
 	IISIDE=1
 	IDAR=1
 	IDSISR=1
+#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_REAL=1
+#endif
 INT_DEFINE_END(instruction_access)
 
 EXC_REAL_BEGIN(instruction_access, 0x400, 0x80)
@@ -1396,7 +1429,9 @@ INT_DEFINE_BEGIN(instruction_access_slb)
 	IRECONCILE=0
 	IISIDE=1
 	IDAR=1
+#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	IKVM_REAL=1
+#endif
 INT_DEFINE_END(instruction_access_slb)
 
 EXC_REAL_BEGIN(instruction_access_slb, 0x480, 0x80)
@@ -1488,7 +1523,9 @@
INT_DEFINE_BEGIN(alignment) IVEC=0x600 IDAR=1 IDSISR=1 +#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE IKVM_REAL=1 +#endif INT_DEFINE_END(alignment) EXC_REAL_BEGIN(alignment, 0x600, 0x100) @@ -1518,7 +1555,9 @@ EXC_COMMON_BEGIN(alignment_common) */ INT_DEFINE_BEGIN(program_check) IVEC=0x700 +#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE IKVM_REAL=1 +#endif INT_DEFINE_END(program_check) EXC_REAL_BEGIN(program_check, 0x700, 0x100) @@ -1581,7 +1620,9 @@ EXC_COMMON_BEGIN(program_check_common) INT_DEFINE_BEGIN(fp_unavailable) IVEC=0x800 IRECONCILE=0 +#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE IKVM_REAL=1 +#endif INT_DEFINE_END(fp_unavailable) EXC_REAL_BEGIN(fp_unavailable, 0x800, 0x100) @@ -1643,7 +1684,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM) INT_DEFINE_BEGIN(decrementer) IVEC=0x900
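The pattern this patch applies, compiling the guest-context test out entirely when PR KVM cannot even be built, can be sketched in plain C. This is an illustrative analogy only: PR_KVM_POSSIBLE is a made-up macro standing in for CONFIG_KVM_BOOK3S_PR_POSSIBLE, and the real kernel makes this decision in asm macros (IKVM_REAL/IKVM_SKIP), not in C:

```c
#include <assert.h>
#include <stdbool.h>

/* Define PR_KVM_POSSIBLE at build time to compile the test in, the way
 * CONFIG_KVM_BOOK3S_PR_POSSIBLE gates IKVM_REAL in the patch above. */
static bool interrupt_tests_kvm(bool guest_running)
{
#ifdef PR_KVM_POSSIBLE
	/* The host may be running a PR guest: the runtime test is needed. */
	return guest_running;
#else
	/* No PR KVM in this build: the test compiles away entirely. */
	(void)guest_running;
	return false;
#endif
}
```

The benefit mirrors the patch's: configurations that can never host a PR guest pay no load-and-branch cost on their real-mode interrupt entry paths.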
[PATCH v2 rebase 19/34] powerpc/64s/exception: add more comments for interrupt handlers
From: Nicholas Piggin A few of the non-standard handlers are left uncommented. Some more description could be added to some. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 391 --- 1 file changed, 353 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index ef37d0ab6594..2f50587392aa 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -121,26 +121,26 @@ name: /* * Interrupt code generation macros */ -#define IVEC .L_IVEC_\name\() -#define IHSRR .L_IHSRR_\name\() -#define IHSRR_IF_HVMODE.L_IHSRR_IF_HVMODE_\name\() -#define IAREA .L_IAREA_\name\() -#define IVIRT .L_IVIRT_\name\() -#define IISIDE .L_IISIDE_\name\() -#define IDAR .L_IDAR_\name\() -#define IDSISR .L_IDSISR_\name\() -#define ISET_RI.L_ISET_RI_\name\() -#define IBRANCH_TO_COMMON .L_IBRANCH_TO_COMMON_\name\() -#define IREALMODE_COMMON .L_IREALMODE_COMMON_\name\() -#define IMASK .L_IMASK_\name\() -#define IKVM_SKIP .L_IKVM_SKIP_\name\() -#define IKVM_REAL .L_IKVM_REAL_\name\() +#define IVEC .L_IVEC_\name\()/* Interrupt vector address */ +#define IHSRR .L_IHSRR_\name\() /* Sets SRR or HSRR registers */ +#define IHSRR_IF_HVMODE.L_IHSRR_IF_HVMODE_\name\() /* HSRR if HV else SRR */ +#define IAREA .L_IAREA_\name\() /* PACA save area */ +#define IVIRT .L_IVIRT_\name\() /* Has virt mode entry point */ +#define IISIDE .L_IISIDE_\name\() /* Uses SRR0/1 not DAR/DSISR */ +#define IDAR .L_IDAR_\name\()/* Uses DAR (or SRR0) */ +#define IDSISR .L_IDSISR_\name\() /* Uses DSISR (or SRR1) */ +#define ISET_RI.L_ISET_RI_\name\() /* Run common code w/ MSR[RI]=1 */ +#define IBRANCH_TO_COMMON .L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch to common */ +#define IREALMODE_COMMON .L_IREALMODE_COMMON_\name\() /* Common runs in realmode */ +#define IMASK .L_IMASK_\name\() /* IRQ soft-mask bit */ +#define IKVM_SKIP .L_IKVM_SKIP_\name\() /* Generate KVM skip handler */ +#define IKVM_REAL 
.L_IKVM_REAL_\name\()	/* Real entry tests KVM */
 #define __IKVM_REAL(name)	.L_IKVM_REAL_ ## name
-#define IKVM_VIRT	.L_IKVM_VIRT_\name\()
-#define ISTACK	.L_ISTACK_\name\()
+#define IKVM_VIRT	.L_IKVM_VIRT_\name\()	/* Virt entry tests KVM */
+#define ISTACK	.L_ISTACK_\name\()	/* Set regular kernel stack */
 #define __ISTACK(name)	.L_ISTACK_ ## name
-#define IRECONCILE	.L_IRECONCILE_\name\()
-#define IKUAP	.L_IKUAP_\name\()
+#define IRECONCILE	.L_IRECONCILE_\name\()	/* Do RECONCILE_IRQ_STATE */
+#define IKUAP	.L_IKUAP_\name\()	/* Do KUAP lock */
 
 #define INT_DEFINE_BEGIN(n)	\
 .macro int_define_ ## n name
@@ -759,6 +759,39 @@ __start_interrupts:
 EXC_VIRT_NONE(0x4000, 0x100)
 
+/**
+ * Interrupt 0x100 - System Reset Interrupt (SRESET aka NMI).
+ * This is a non-maskable, asynchronous interrupt always taken in real-mode.
+ * It is caused by:
+ * - Wake from power-saving state, on powernv.
+ * - An NMI from another CPU, triggered by firmware or hypercall.
+ * - A crash/debug signal injected from BMC, firmware or hypervisor.
+ *
+ * Handling:
+ * Power-save wakeup is the only performance critical path, so this is
+ * determined as quickly as possible first. In this case volatile registers
+ * can be discarded and SPRs like CFAR don't need to be read.
+ *
+ * If not a powersave wakeup, then it's run as a regular interrupt, however
+ * it uses its own stack and PACA save area to preserve the regular kernel
+ * environment for debugging.
+ *
+ * This interrupt is not maskable, so triggering it when MSR[RI] is clear,
+ * or SCRATCH0 is in use, etc. may cause a crash. It's also not entirely
+ * correct to switch to virtual mode to run the regular interrupt handler
+ * because it might be interrupted when the MMU is in a bad state (e.g., SLB
+ * is clear).
+ *
+ * FWNMI:
+ * PAPR specifies a "fwnmi" facility which sends the sreset to a different
+ * entry point with a different register set up. Some hypervisors will
+ * send the sreset to 0x100 in the guest if it is not fwnmi capable.
+ * + * KVM: + * Unlike most SRR interrupts, this may be taken by the host while executing + * in a guest, so a KVM test is required. KVM will pull the CPU out of guest + * mode and then raise the sreset. + */ INT_DEFINE_BEGIN(system_reset) IVEC=0x100 IAREA=PACA_EXNMI @@ -834,6 +867,7 @@ TRAMP_REAL_BEGIN(system_reset_idle_wake) * Vectors for the FWNMI option. Share common code. */ TRAMP_REAL_BEGIN(system_reset_fwnmi) + /* XXX: fwnmi guest could run a nested/PR guest, so why no test? */
[PATCH v2 rebase 18/34] powerpc/64s/exception: Clean up SRR specifiers
From: Nicholas Piggin Remove more magic numbers and replace with nicely named bools. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 68 +--- 1 file changed, 32 insertions(+), 36 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 9494403b9586..ef37d0ab6594 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -105,11 +105,6 @@ name: ori reg,reg,(ABS_ADDR(label))@l;\ addis reg,reg,(ABS_ADDR(label))@h -/* Exception register prefixes */ -#define EXC_HV_OR_STD 2 /* depends on HVMODE */ -#define EXC_HV 1 -#define EXC_STD0 - /* * Branch to label using its 0xC000 address. This results in instruction * address suitable for MSR[IR]=0 or 1, which allows relocation to be turned @@ -128,6 +123,7 @@ name: */ #define IVEC .L_IVEC_\name\() #define IHSRR .L_IHSRR_\name\() +#define IHSRR_IF_HVMODE.L_IHSRR_IF_HVMODE_\name\() #define IAREA .L_IAREA_\name\() #define IVIRT .L_IVIRT_\name\() #define IISIDE .L_IISIDE_\name\() @@ -159,7 +155,10 @@ do_define_int n .error "IVEC not defined" .endif .ifndef IHSRR - IHSRR=EXC_STD + IHSRR=0 + .endif + .ifndef IHSRR_IF_HVMODE + IHSRR_IF_HVMODE=0 .endif .ifndef IAREA IAREA=PACA_EXGEN @@ -257,7 +256,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r9,IAREA+EX_R9(r13) ld r10,IAREA+EX_R10(r13) /* HSRR variants have the 0x2 bit added to their trap number */ - .if IHSRR == EXC_HV_OR_STD + .if IHSRR_IF_HVMODE BEGIN_FTR_SECTION ori r12,r12,(IVEC + 0x2) FTR_SECTION_ELSE @@ -278,7 +277,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r10,IAREA+EX_R10(r13) ld r11,IAREA+EX_R11(r13) ld r12,IAREA+EX_R12(r13) - .if IHSRR == EXC_HV_OR_STD + .if IHSRR_IF_HVMODE BEGIN_FTR_SECTION b kvmppc_skip_Hinterrupt FTR_SECTION_ELSE @@ -403,7 +402,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) stw r10,IAREA+EX_DSISR(r13) .endif - .if IHSRR == EXC_HV_OR_STD + .if IHSRR_IF_HVMODE BEGIN_FTR_SECTION mfspr r11,SPRN_HSRR0 /* save HSRR0 */ mfspr r12,SPRN_HSRR1 /* and 
HSRR1 */ @@ -485,7 +484,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_virt) .abort "Bad maskable vector" .endif - .if IHSRR == EXC_HV_OR_STD + .if IHSRR_IF_HVMODE BEGIN_FTR_SECTION bne masked_Hinterrupt FTR_SECTION_ELSE @@ -618,12 +617,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) * Restore all registers including H/SRR0/1 saved in a stack frame of a * standard exception. */ -.macro EXCEPTION_RESTORE_REGS hsrr +.macro EXCEPTION_RESTORE_REGS hsrr=0 /* Move original SRR0 and SRR1 into the respective regs */ ld r9,_MSR(r1) - .if \hsrr == EXC_HV_OR_STD - .error "EXC_HV_OR_STD Not implemented for EXCEPTION_RESTORE_REGS" - .endif .if \hsrr mtspr SPRN_HSRR1,r9 .else @@ -898,7 +894,7 @@ EXC_COMMON_BEGIN(system_reset_common) ld r10,SOFTE(r1) stb r10,PACAIRQSOFTMASK(r13) - EXCEPTION_RESTORE_REGS EXC_STD + EXCEPTION_RESTORE_REGS RFI_TO_USER_OR_KERNEL GEN_KVM system_reset @@ -952,7 +948,7 @@ TRAMP_REAL_BEGIN(machine_check_fwnmi) lhz r12,PACA_IN_MCE(r13); \ subir12,r12,1; \ sth r12,PACA_IN_MCE(r13); \ - EXCEPTION_RESTORE_REGS EXC_STD + EXCEPTION_RESTORE_REGS EXC_COMMON_BEGIN(machine_check_early_common) /* @@ -1321,7 +1317,7 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX) INT_DEFINE_BEGIN(hardware_interrupt) IVEC=0x500 - IHSRR=EXC_HV_OR_STD + IHSRR_IF_HVMODE=1 IMASK=IRQS_DISABLED IKVM_REAL=1 IKVM_VIRT=1 @@ -1490,7 +1486,7 @@ EXC_COMMON_BEGIN(decrementer_common) INT_DEFINE_BEGIN(hdecrementer) IVEC=0x980 - IHSRR=EXC_HV + IHSRR=1 ISTACK=0 IRECONCILE=0 IKVM_REAL=1 @@ -1732,7 +1728,7 @@ EXC_COMMON_BEGIN(single_step_common) INT_DEFINE_BEGIN(h_data_storage) IVEC=0xe00 - IHSRR=EXC_HV + IHSRR=1 IDAR=1 IDSISR=1 IKVM_SKIP=1 @@ -1764,7 +1760,7 @@ ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_TYPE_RADIX) INT_DEFINE_BEGIN(h_instr_storage) IVEC=0xe20 - IHSRR=EXC_HV + IHSRR=1 IKVM_REAL=1 IKVM_VIRT=1 INT_DEFINE_END(h_instr_storage) @@ -1787,7 +1783,7 @@ EXC_COMMON_BEGIN(h_instr_storage_common) INT_DEFINE_BEGIN(emulation_assist)
[PATCH v2 rebase 17/34] powerpc/64s/exception: re-inline some handlers
From: Nicholas Piggin The reduction in interrupt entry size allows some handlers to be re-inlined. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 7a234e6d7bf5..9494403b9586 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -1186,7 +1186,7 @@ INT_DEFINE_BEGIN(data_access) INT_DEFINE_END(data_access) EXC_REAL_BEGIN(data_access, 0x300, 0x80) - GEN_INT_ENTRY data_access, virt=0, ool=1 + GEN_INT_ENTRY data_access, virt=0 EXC_REAL_END(data_access, 0x300, 0x80) EXC_VIRT_BEGIN(data_access, 0x4300, 0x80) GEN_INT_ENTRY data_access, virt=1 @@ -1216,7 +1216,7 @@ INT_DEFINE_BEGIN(data_access_slb) INT_DEFINE_END(data_access_slb) EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80) - GEN_INT_ENTRY data_access_slb, virt=0, ool=1 + GEN_INT_ENTRY data_access_slb, virt=0 EXC_REAL_END(data_access_slb, 0x380, 0x80) EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80) GEN_INT_ENTRY data_access_slb, virt=1 @@ -1472,7 +1472,7 @@ INT_DEFINE_BEGIN(decrementer) INT_DEFINE_END(decrementer) EXC_REAL_BEGIN(decrementer, 0x900, 0x80) - GEN_INT_ENTRY decrementer, virt=0, ool=1 + GEN_INT_ENTRY decrementer, virt=0 EXC_REAL_END(decrementer, 0x900, 0x80) EXC_VIRT_BEGIN(decrementer, 0x4900, 0x80) GEN_INT_ENTRY decrementer, virt=1 -- 2.23.0
[PATCH v2 rebase 16/34] powerpc/64s/exception: hdecrementer avoid touching the stack
From: Nicholas Piggin

The hdec interrupt handler is reported to sometimes fire in Linux if KVM leaves it pending after a guest exits. This is harmless, so there is a no-op handler for it.

The interrupt handler currently uses the regular kernel stack. Change this to avoid touching the stack entirely.

This should be the last place where the regular Linux stack can be accessed with asynchronous interrupts (including PMI) soft-masked. It might be possible to take advantage of this invariant, e.g., to context switch the kernel stack SLB entry without clearing MSR[EE].

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/time.h      |  1 -
 arch/powerpc/kernel/exceptions-64s.S | 25 -
 arch/powerpc/kernel/time.c           |  9 -
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index 08dbe3e6831c..e0107495c4de 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -24,7 +24,6 @@ extern struct clock_event_device decrementer_clockevent;
 
 extern void generic_calibrate_decr(void);
-extern void hdec_interrupt(struct pt_regs *regs);
 
 /* Some sane defaults: 125 MHz timebase, 1GHz processor */
 extern unsigned long ppc_proc_freq;
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 9fa71d51ecf4..7a234e6d7bf5 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1491,6 +1491,8 @@ EXC_COMMON_BEGIN(decrementer_common)
 INT_DEFINE_BEGIN(hdecrementer)
 	IVEC=0x980
 	IHSRR=EXC_HV
+	ISTACK=0
+	IRECONCILE=0
 	IKVM_REAL=1
 	IKVM_VIRT=1
 INT_DEFINE_END(hdecrementer)
@@ -1502,11 +1504,24 @@ EXC_VIRT_BEGIN(hdecrementer, 0x4980, 0x80)
 	GEN_INT_ENTRY hdecrementer, virt=1
 EXC_VIRT_END(hdecrementer, 0x4980, 0x80)
 EXC_COMMON_BEGIN(hdecrementer_common)
-	GEN_COMMON hdecrementer
-	bl	save_nvgprs
-	addi	r3,r1,STACK_FRAME_OVERHEAD
-	bl	hdec_interrupt
-	b	ret_from_except
+	__GEN_COMMON_ENTRY hdecrementer
+	/*
+	 * Hypervisor decrementer
interrupts not caught by the KVM test +* shouldn't occur but are sometimes left pending on exit from a KVM +* guest. We don't need to do anything to clear them, as they are +* edge-triggered. +* +* Be careful to avoid touching the kernel stack. +*/ + ld r10,PACA_EXGEN+EX_CTR(r13) + mtctr r10 + mtcrf 0x80,r9 + ld r9,PACA_EXGEN+EX_R9(r13) + ld r10,PACA_EXGEN+EX_R10(r13) + ld r11,PACA_EXGEN+EX_R11(r13) + ld r12,PACA_EXGEN+EX_R12(r13) + ld r13,PACA_EXGEN+EX_R13(r13) + HRFI_TO_KERNEL GEN_KVM hdecrementer diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 968ae97382b4..e4572d67cc76 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -663,15 +663,6 @@ void timer_broadcast_interrupt(void) } #endif -/* - * Hypervisor decrementer interrupts shouldn't occur but are sometimes - * left pending on exit from a KVM guest. We don't need to do anything - * to clear them, as they are edge-triggered. - */ -void hdec_interrupt(struct pt_regs *regs) -{ -} - #ifdef CONFIG_SUSPEND static void generic_suspend_disable_irqs(void) { -- 2.23.0
[PATCH v2 rebase 15/34] powerpc/64s/exception: trim unused arguments from KVMTEST macro
From: Nicholas Piggin Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index abf26db36427..9fa71d51ecf4 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -224,7 +224,7 @@ do_define_int n #define kvmppc_interrupt kvmppc_interrupt_pr #endif -.macro KVMTEST name, hsrr, n +.macro KVMTEST name lbz r10,HSTATE_IN_GUEST(r13) cmpwi r10,0 bne \name\()_kvm @@ -293,7 +293,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) .endm #else -.macro KVMTEST name, hsrr, n +.macro KVMTEST name .endm .macro GEN_KVM name .endm @@ -437,7 +437,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) DEFINE_FIXED_SYMBOL(\name\()_common_real) \name\()_common_real: .if IKVM_REAL - KVMTEST \name IHSRR IVEC + KVMTEST \name .endif ld r10,PACAKMSR(r13) /* get MSR value for kernel */ @@ -460,7 +460,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real) DEFINE_FIXED_SYMBOL(\name\()_common_virt) \name\()_common_virt: .if IKVM_VIRT - KVMTEST \name IHSRR IVEC + KVMTEST \name 1: .endif .endif /* IVIRT */ @@ -1595,7 +1595,7 @@ INT_DEFINE_END(system_call) GET_PACA(r13) std r10,PACA_EXGEN+EX_R10(r13) INTERRUPT_TO_KERNEL - KVMTEST system_call EXC_STD 0xc00 /* uses r10, branch to system_call_kvm */ + KVMTEST system_call /* uses r10, branch to system_call_kvm */ mfctr r9 #else mr r9,r13 -- 2.23.0
[PATCH v2 rebase 13/34] powerpc/64s/exception: remove confusing IEARLY option
From: Nicholas Piggin

Replace IEARLY=1 and IEARLY=2 with IBRANCH_TO_COMMON, which controls if the entry code branches to a common handler; and IREALMODE_COMMON, which controls whether the common handler should remain in real mode.

These special cases no longer avoid loading the SRR registers; there is no point, as most of them load the registers immediately anyway.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/kernel/exceptions-64s.S | 48 ++--
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 7db76e7be0aa..716a95ba814f 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -174,7 +174,8 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 #define IDAR	.L_IDAR_\name\()
 #define IDSISR	.L_IDSISR_\name\()
 #define ISET_RI	.L_ISET_RI_\name\()
-#define IEARLY	.L_IEARLY_\name\()
+#define IBRANCH_TO_COMMON	.L_IBRANCH_TO_COMMON_\name\()
+#define IREALMODE_COMMON	.L_IREALMODE_COMMON_\name\()
 #define IMASK	.L_IMASK_\name\()
 #define IKVM_SKIP	.L_IKVM_SKIP_\name\()
 #define IKVM_REAL	.L_IKVM_REAL_\name\()
@@ -218,8 +219,15 @@ do_define_int n
 	.ifndef ISET_RI
 		ISET_RI=1
 	.endif
-	.ifndef IEARLY
-		IEARLY=0
+	.ifndef IBRANCH_TO_COMMON
+		IBRANCH_TO_COMMON=1
+	.endif
+	.ifndef IREALMODE_COMMON
+		IREALMODE_COMMON=0
+	.else
+		.if !
IBRANCH_TO_COMMON + .error "IREALMODE_COMMON=1 but IBRANCH_TO_COMMON=0" + .endif .endif .ifndef IMASK IMASK=0 @@ -353,6 +361,11 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) */ .macro GEN_BRANCH_TO_COMMON name, virt + .if IREALMODE_COMMON + LOAD_HANDLER(r10, \name\()_common) + mtctr r10 + bctr + .else .if \virt #ifndef CONFIG_RELOCATABLE b \name\()_common_virt @@ -366,6 +379,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) mtctr r10 bctr .endif + .endif .endm .macro GEN_INT_ENTRY name, virt, ool=0 @@ -421,11 +435,6 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) stw r10,IAREA+EX_DSISR(r13) .endif - .if IEARLY == 2 - /* nothing more */ - .elseif IEARLY - BRANCH_TO_C000(r11, \name\()_common) - .else .if IHSRR == EXC_HV_OR_STD BEGIN_FTR_SECTION mfspr r11,SPRN_HSRR0 /* save HSRR0 */ @@ -441,6 +450,8 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) mfspr r11,SPRN_SRR0 /* save SRR0 */ mfspr r12,SPRN_SRR1 /* and SRR1 */ .endif + + .if IBRANCH_TO_COMMON GEN_BRANCH_TO_COMMON \name \virt .endif @@ -926,6 +937,7 @@ INT_DEFINE_BEGIN(machine_check_early) IVEC=0x200 IAREA=PACA_EXMC IVIRT=0 /* no virt entry point */ + IREALMODE_COMMON=1 /* * MSR_RI is not enabled, because PACA_EXMC is being used, so a * nested machine check corrupts it. machine_check_common enables @@ -933,7 +945,6 @@ INT_DEFINE_BEGIN(machine_check_early) */ ISET_RI=0 ISTACK=0 - IEARLY=1 IDAR=1 IDSISR=1 IRECONCILE=0 @@ -973,9 +984,6 @@ TRAMP_REAL_BEGIN(machine_check_fwnmi) EXCEPTION_RESTORE_REGS EXC_STD EXC_COMMON_BEGIN(machine_check_early_common) - mfspr r11,SPRN_SRR0 - mfspr r12,SPRN_SRR1 - /* * Switch to mc_emergency stack and handle re-entrancy (we limit * the nested MCE upto level 4 to avoid stack overflow). 
@@ -1822,7 +1830,7 @@ EXC_COMMON_BEGIN(emulation_assist_common) INT_DEFINE_BEGIN(hmi_exception_early) IVEC=0xe60 IHSRR=EXC_HV - IEARLY=1 + IREALMODE_COMMON=1 ISTACK=0 IRECONCILE=0 IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */ @@ -1842,8 +1850,6 @@ EXC_REAL_END(hmi_exception, 0xe60, 0x20) EXC_VIRT_NONE(0x4e60, 0x20) EXC_COMMON_BEGIN(hmi_exception_early_common) - mfspr r11,SPRN_HSRR0 /* Save HSRR0 */ - mfspr r12,SPRN_HSRR1 /* Save HSRR1 */ mr r10,r1 /* Save r1 */ ld r1,PACAEMERGSP(r13) /* Use emergency stack for realmode */ subir1,r1,INT_FRAME_SIZE/* alloc stack frame*/ @@ -2169,29 +2175,23 @@ EXC_VIRT_NONE(0x5400, 0x100) INT_DEFINE_BEGIN(denorm_exception) IVEC=0x1500 IHSRR=EXC_HV - IEARLY=2 + IBRANCH_TO_COMMON=0 IKVM_REAL=1 INT_DEFINE_END(denorm_exception) EXC_REAL_BEGIN(denorm_exception, 0x1500, 0x100) GEN_INT_ENTRY denorm_exception, virt=0 #ifdef CONFIG_PPC_DENORMALISATION - mfspr r10,SPRN_HSRR1 - andis.
[PATCH v2 rebase 14/34] powerpc/64s/exception: remove the SPR saving patch code macros
From: Nicholas Piggin These are used infrequently enough they don't provide much help, so inline them. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 82 ++-- 1 file changed, 28 insertions(+), 54 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 716a95ba814f..abf26db36427 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -110,46 +110,6 @@ name: #define EXC_HV 1 #define EXC_STD0 -/* - * PPR save/restore macros used in exceptions-64s.S - * Used for P7 or later processors - */ -#define SAVE_PPR(area, ra) \ -BEGIN_FTR_SECTION_NESTED(940) \ - ld ra,area+EX_PPR(r13);/* Read PPR from paca */\ - std ra,_PPR(r1);\ -END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,940) - -#define RESTORE_PPR_PACA(area, ra) \ -BEGIN_FTR_SECTION_NESTED(941) \ - ld ra,area+EX_PPR(r13);\ - mtspr SPRN_PPR,ra;\ -END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,941) - -/* - * Get an SPR into a register if the CPU has the given feature - */ -#define OPT_GET_SPR(ra, spr, ftr) \ -BEGIN_FTR_SECTION_NESTED(943) \ - mfspr ra,spr; \ -END_FTR_SECTION_NESTED(ftr,ftr,943) - -/* - * Set an SPR from a register if the CPU has the given feature - */ -#define OPT_SET_SPR(ra, spr, ftr) \ -BEGIN_FTR_SECTION_NESTED(943) \ - mtspr spr,ra; \ -END_FTR_SECTION_NESTED(ftr,ftr,943) - -/* - * Save a register to the PACA if the CPU has the given feature - */ -#define OPT_SAVE_REG_TO_PACA(offset, ra, ftr) \ -BEGIN_FTR_SECTION_NESTED(943) \ - std ra,offset(r13); \ -END_FTR_SECTION_NESTED(ftr,ftr,943) - /* * Branch to label using its 0xC000 address. 
This results in instruction * address suitable for MSR[IR]=0 or 1, which allows relocation to be turned @@ -278,18 +238,18 @@ do_define_int n cmpwi r10,KVM_GUEST_MODE_SKIP beq 89f .else -BEGIN_FTR_SECTION_NESTED(947) +BEGIN_FTR_SECTION ld r10,IAREA+EX_CFAR(r13) std r10,HSTATE_CFAR(r13) -END_FTR_SECTION_NESTED(CPU_FTR_CFAR,CPU_FTR_CFAR,947) +END_FTR_SECTION_IFSET(CPU_FTR_CFAR) .endif ld r10,PACA_EXGEN+EX_CTR(r13) mtctr r10 -BEGIN_FTR_SECTION_NESTED(948) +BEGIN_FTR_SECTION ld r10,IAREA+EX_PPR(r13) std r10,HSTATE_PPR(r13) -END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) +END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r11,IAREA+EX_R11(r13) ld r12,IAREA+EX_R12(r13) std r12,HSTATE_SCRATCH0(r13) @@ -386,10 +346,14 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) SET_SCRATCH0(r13) /* save r13 */ GET_PACA(r13) std r9,IAREA+EX_R9(r13) /* save r9 */ - OPT_GET_SPR(r9, SPRN_PPR, CPU_FTR_HAS_PPR) +BEGIN_FTR_SECTION + mfspr r9,SPRN_PPR +END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) HMT_MEDIUM std r10,IAREA+EX_R10(r13) /* save r10 - r12 */ - OPT_GET_SPR(r10, SPRN_CFAR, CPU_FTR_CFAR) +BEGIN_FTR_SECTION + mfspr r10,SPRN_CFAR +END_FTR_SECTION_IFSET(CPU_FTR_CFAR) .if \ool .if !\virt b tramp_real_\name @@ -402,8 +366,12 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .endif .endif - OPT_SAVE_REG_TO_PACA(IAREA+EX_PPR, r9, CPU_FTR_HAS_PPR) - OPT_SAVE_REG_TO_PACA(IAREA+EX_CFAR, r10, CPU_FTR_CFAR) +BEGIN_FTR_SECTION + std r9,IAREA+EX_PPR(r13) +END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) +BEGIN_FTR_SECTION + std r10,IAREA+EX_CFAR(r13) +END_FTR_SECTION_IFSET(CPU_FTR_CFAR) INTERRUPT_TO_KERNEL mfctr r10 std r10,IAREA+EX_CTR(r13) @@ -558,7 +526,10 @@ DEFINE_FIXED_SYMBOL(\name\()_common_virt) .endif beq 101f/* if from kernel mode */ ACCOUNT_CPU_USER_ENTRY(r13, r9, r10) - SAVE_PPR(IAREA, r9) +BEGIN_FTR_SECTION + ld r9,IAREA+EX_PPR(r13)/* Read PPR from paca */ + std r9,_PPR(r1) +END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) 101: .else .if IKUAP @@ -598,10 +569,10 @@ 
DEFINE_FIXED_SYMBOL(\name\()_common_virt) std r10,_DSISR(r1)
[PATCH v2 rebase 10/34] powerpc/64s/exception: move real->virt switch into the common handler
From: Nicholas Piggin The real mode interrupt entry points currently use rfid to branch to the common handler in virtual mode. This is a significant amount of code, and forces other code (notably the KVM test) to live in the real mode handler. In the interest of minimising the amount of code that runs unrelocated move the switch to virt mode into the common code, and do it with mtmsrd, which avoids clobbering SRRs (although the post-KVMTEST performance of real-mode interrupt handlers is not a big concern these days). This requires CTR to always be saved (real-mode needs to reach 0xc...) but that's not a huge impact these days. It could be optimized away in future. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/exception-64s.h | 4 - arch/powerpc/kernel/exceptions-64s.S | 251 ++- 2 files changed, 109 insertions(+), 146 deletions(-) diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h index 33f4f72eb035..47bd4ea0837d 100644 --- a/arch/powerpc/include/asm/exception-64s.h +++ b/arch/powerpc/include/asm/exception-64s.h @@ -33,11 +33,7 @@ #include /* PACA save area size in u64 units (exgen, exmc, etc) */ -#if defined(CONFIG_RELOCATABLE) #define EX_SIZE10 -#else -#define EX_SIZE9 -#endif /* * maximum recursive depth of MCE exceptions diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index b8588618cdc3..5803ce3b9404 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -32,16 +32,10 @@ #define EX_CCR 52 #define EX_CFAR56 #define EX_PPR 64 -#if defined(CONFIG_RELOCATABLE) #define EX_CTR 72 .if EX_SIZE != 10 .error "EX_SIZE is wrong" .endif -#else -.if EX_SIZE != 9 - .error "EX_SIZE is wrong" -.endif -#endif /* * Following are fixed section helper macros. 
@@ -124,22 +118,6 @@ name: #define EXC_HV 1 #define EXC_STD0 -#if defined(CONFIG_RELOCATABLE) -/* - * If we support interrupts with relocation on AND we're a relocatable kernel, - * we need to use CTR to get to the 2nd level handler. So, save/restore it - * when required. - */ -#define SAVE_CTR(reg, area)mfctr reg ; std reg,area+EX_CTR(r13) -#define GET_CTR(reg, area) ld reg,area+EX_CTR(r13) -#define RESTORE_CTR(reg, area) ld reg,area+EX_CTR(r13) ; mtctr reg -#else -/* ...else CTR is unused and in register. */ -#define SAVE_CTR(reg, area) -#define GET_CTR(reg, area) mfctr reg -#define RESTORE_CTR(reg, area) -#endif - /* * PPR save/restore macros used in exceptions-64s.S * Used for P7 or later processors @@ -199,6 +177,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) #define IVEC .L_IVEC_\name\() #define IHSRR .L_IHSRR_\name\() #define IAREA .L_IAREA_\name\() +#define IVIRT .L_IVIRT_\name\() #define IISIDE .L_IISIDE_\name\() #define IDAR .L_IDAR_\name\() #define IDSISR .L_IDSISR_\name\() @@ -232,6 +211,9 @@ do_define_int n .ifndef IAREA IAREA=PACA_EXGEN .endif + .ifndef IVIRT + IVIRT=1 + .endif .ifndef IISIDE IISIDE=0 .endif @@ -325,7 +307,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) * outside the head section. CONFIG_RELOCATABLE KVM expects CTR * to be saved in HSTATE_SCRATCH1. */ - mfctr r9 + ld r9,IAREA+EX_CTR(r13) std r9,HSTATE_SCRATCH1(r13) __LOAD_FAR_HANDLER(r9, kvmppc_interrupt) mtctr r9 @@ -362,101 +344,6 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .endm #endif -.macro INT_SAVE_SRR_AND_JUMP label, hsrr, set_ri - ld r10,PACAKMSR(r13) /* get MSR value for kernel */ - .if ! 
\set_ri - xorir10,r10,MSR_RI /* Clear MSR_RI */ - .endif - .if \hsrr == EXC_HV_OR_STD - BEGIN_FTR_SECTION - mfspr r11,SPRN_HSRR0 /* save HSRR0 */ - mfspr r12,SPRN_HSRR1 /* and HSRR1 */ - mtspr SPRN_HSRR1,r10 - FTR_SECTION_ELSE - mfspr r11,SPRN_SRR0 /* save SRR0 */ - mfspr r12,SPRN_SRR1 /* and SRR1 */ - mtspr SPRN_SRR1,r10 - ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) - .elseif \hsrr - mfspr r11,SPRN_HSRR0 /* save HSRR0 */ - mfspr r12,SPRN_HSRR1 /* and HSRR1 */ - mtspr SPRN_HSRR1,r10 - .else - mfspr r11,SPRN_SRR0 /* save SRR0 */ - mfspr r12,SPRN_SRR1 /* and SRR1 */ - mtspr SPRN_SRR1,r10 - .endif - LOAD_HANDLER(r10, \label\()) - .if \hsrr == EXC_HV_OR_STD - BEGIN_FTR_SECTION - mtspr SPRN_HSRR0,r10 - HRFI_TO_KERNEL - FTR_SECTION_ELSE - mtspr
[PATCH v2 rebase 12/34] powerpc/64s/exception: move KVM test to common code
From: Nicholas Piggin This allows more code to be moved out of unrelocated regions. The system call KVMTEST is changed to be open-coded and remain in the tramp area to avoid having to move it to entry_64.S. The custom nature of the system call entry code means the hcall case can be made more streamlined than regular interrupt handlers. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S| 239 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 -- arch/powerpc/kvm/book3s_segment.S | 7 - 3 files changed, 119 insertions(+), 138 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index fbc3fbb293f7..7db76e7be0aa 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -44,7 +44,6 @@ * EXC_VIRT_BEGIN/END - virt (AIL), unrelocated exception vectors * TRAMP_REAL_BEGIN- real, unrelocated helpers (virt may call these) * TRAMP_VIRT_BEGIN- virt, unreloc helpers (in practice, real can use) - * TRAMP_KVM_BEGIN - KVM handlers, these are put into real, unrelocated * EXC_COMMON - After switching to virtual, relocated mode. 
*/ @@ -74,13 +73,6 @@ name: #define TRAMP_VIRT_BEGIN(name) \ FIXED_SECTION_ENTRY_BEGIN(virt_trampolines, name) -#ifdef CONFIG_KVM_BOOK3S_64_HANDLER -#define TRAMP_KVM_BEGIN(name) \ - TRAMP_VIRT_BEGIN(name) -#else -#define TRAMP_KVM_BEGIN(name) -#endif - #define EXC_REAL_NONE(start, size) \ FIXED_SECTION_ENTRY_BEGIN_LOCATION(real_vectors, exc_real_##start##_##unused, start, size); \ FIXED_SECTION_ENTRY_END_LOCATION(real_vectors, exc_real_##start##_##unused, start, size) @@ -271,6 +263,9 @@ do_define_int n .endm .macro GEN_KVM name + .balign IFETCH_ALIGN_BYTES +\name\()_kvm: + .if IKVM_SKIP cmpwi r10,KVM_GUEST_MODE_SKIP beq 89f @@ -281,13 +276,18 @@ BEGIN_FTR_SECTION_NESTED(947) END_FTR_SECTION_NESTED(CPU_FTR_CFAR,CPU_FTR_CFAR,947) .endif + ld r10,PACA_EXGEN+EX_CTR(r13) + mtctr r10 BEGIN_FTR_SECTION_NESTED(948) ld r10,IAREA+EX_PPR(r13) std r10,HSTATE_PPR(r13) END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) - ld r10,IAREA+EX_R10(r13) + ld r11,IAREA+EX_R11(r13) + ld r12,IAREA+EX_R12(r13) std r12,HSTATE_SCRATCH0(r13) sldir12,r9,32 + ld r9,IAREA+EX_R9(r13) + ld r10,IAREA+EX_R10(r13) /* HSRR variants have the 0x2 bit added to their trap number */ .if IHSRR == EXC_HV_OR_STD BEGIN_FTR_SECTION @@ -300,29 +300,16 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .else ori r12,r12,(IVEC) .endif - -#ifdef CONFIG_RELOCATABLE - /* -* KVM requires __LOAD_FAR_HANDLER beause kvmppc_interrupt lives -* outside the head section. CONFIG_RELOCATABLE KVM expects CTR -* to be saved in HSTATE_SCRATCH1. 
-*/ - ld r9,IAREA+EX_CTR(r13) - std r9,HSTATE_SCRATCH1(r13) - __LOAD_FAR_HANDLER(r9, kvmppc_interrupt) - mtctr r9 - ld r9,IAREA+EX_R9(r13) - bctr -#else - ld r9,IAREA+EX_R9(r13) b kvmppc_interrupt -#endif - .if IKVM_SKIP 89:mtocrf 0x80,r9 + ld r10,PACA_EXGEN+EX_CTR(r13) + mtctr r10 ld r9,IAREA+EX_R9(r13) ld r10,IAREA+EX_R10(r13) + ld r11,IAREA+EX_R11(r13) + ld r12,IAREA+EX_R12(r13) .if IHSRR == EXC_HV_OR_STD BEGIN_FTR_SECTION b kvmppc_skip_Hinterrupt @@ -407,11 +394,6 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) mfctr r10 std r10,IAREA+EX_CTR(r13) mfcrr9 - - .if (!\virt && IKVM_REAL) || (\virt && IKVM_VIRT) - KVMTEST \name IHSRR IVEC - .endif - std r11,IAREA+EX_R11(r13) std r12,IAREA+EX_R12(r13) @@ -475,6 +457,10 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .macro __GEN_COMMON_ENTRY name DEFINE_FIXED_SYMBOL(\name\()_common_real) \name\()_common_real: + .if IKVM_REAL + KVMTEST \name IHSRR IVEC + .endif + ld r10,PACAKMSR(r13) /* get MSR value for kernel */ /* MSR[RI] is clear iff using SRR regs */ .if IHSRR == EXC_HV_OR_STD @@ -487,9 +473,17 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real) mtmsrd r10 .if IVIRT + .if IKVM_VIRT + b 1f /* skip the virt test coming from real */ + .endif + .balign IFETCH_ALIGN_BYTES DEFINE_FIXED_SYMBOL(\name\()_common_virt) \name\()_common_virt: + .if IKVM_VIRT + KVMTEST \name IHSRR IVEC +1: + .endif .endif /* IVIRT */ .endm @@ -848,8 +842,6 @@
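The diff above notes that "HSRR variants have the 0x2 bit added to their trap number" before branching to kvmppc_interrupt. As a conceptual sketch only (Python, not kernel code — the kernel does this with `ori r12,r12,(IVEC + 0x2)` under a feature section), the encoding is:

```python
def kvm_trap_number(ivec, hsrr):
    # HSRR variants carry the 0x2 bit in the trap number handed to
    # kvmppc_interrupt; SRR variants pass the vector number unchanged.
    return ivec + 0x2 if hsrr else ivec
```

So, for example, the HV doorbell at vector 0xe80 would be reported as trap 0xe82 when taken with HSRR registers.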
[PATCH v2 rebase 11/34] powerpc/64s/exception: move soft-mask test to common code
From: Nicholas Piggin As well as moving code out of the unrelocated vectors, this allows the masked handlers to be moved to common code, and allows the soft_nmi handler to be generated more like a regular handler. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 106 +-- 1 file changed, 49 insertions(+), 57 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 5803ce3b9404..fbc3fbb293f7 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -411,36 +411,6 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .if (!\virt && IKVM_REAL) || (\virt && IKVM_VIRT) KVMTEST \name IHSRR IVEC .endif - .if IMASK - lbz r10,PACAIRQSOFTMASK(r13) - andi. r10,r10,IMASK - /* Associate vector numbers with bits in paca->irq_happened */ - .if IVEC == 0x500 || IVEC == 0xea0 - li r10,PACA_IRQ_EE - .elseif IVEC == 0x900 - li r10,PACA_IRQ_DEC - .elseif IVEC == 0xa00 || IVEC == 0xe80 - li r10,PACA_IRQ_DBELL - .elseif IVEC == 0xe60 - li r10,PACA_IRQ_HMI - .elseif IVEC == 0xf00 - li r10,PACA_IRQ_PMI - .else - .abort "Bad maskable vector" - .endif - - .if IHSRR == EXC_HV_OR_STD - BEGIN_FTR_SECTION - bne masked_Hinterrupt - FTR_SECTION_ELSE - bne masked_interrupt - ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) - .elseif IHSRR - bne masked_Hinterrupt - .else - bne masked_interrupt - .endif - .endif std r11,IAREA+EX_R11(r13) std r12,IAREA+EX_R12(r13) @@ -524,6 +494,37 @@ DEFINE_FIXED_SYMBOL(\name\()_common_virt) .endm .macro __GEN_COMMON_BODY name + .if IMASK + lbz r10,PACAIRQSOFTMASK(r13) + andi. 
r10,r10,IMASK + /* Associate vector numbers with bits in paca->irq_happened */ + .if IVEC == 0x500 || IVEC == 0xea0 + li r10,PACA_IRQ_EE + .elseif IVEC == 0x900 + li r10,PACA_IRQ_DEC + .elseif IVEC == 0xa00 || IVEC == 0xe80 + li r10,PACA_IRQ_DBELL + .elseif IVEC == 0xe60 + li r10,PACA_IRQ_HMI + .elseif IVEC == 0xf00 + li r10,PACA_IRQ_PMI + .else + .abort "Bad maskable vector" + .endif + + .if IHSRR == EXC_HV_OR_STD + BEGIN_FTR_SECTION + bne masked_Hinterrupt + FTR_SECTION_ELSE + bne masked_interrupt + ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) + .elseif IHSRR + bne masked_Hinterrupt + .else + bne masked_interrupt + .endif + .endif + .if ISTACK andi. r10,r12,MSR_PR /* See if coming from user */ mr r10,r1 /* Save r1 */ @@ -2343,18 +2344,10 @@ EXC_VIRT_NONE(0x5800, 0x100) #ifdef CONFIG_PPC_WATCHDOG -#define MASKED_DEC_HANDLER_LABEL 3f - -#define MASKED_DEC_HANDLER(_H) \ -3: /* soft-nmi */ \ - std r12,PACA_EXGEN+EX_R12(r13); \ - GET_SCRATCH0(r10); \ - std r10,PACA_EXGEN+EX_R13(r13); \ - mfspr r11,SPRN_SRR0; /* save SRR0 */ \ - mfspr r12,SPRN_SRR1; /* and SRR1 */ \ - LOAD_HANDLER(r10, soft_nmi_common); \ - mtctr r10;\ - bctr +INT_DEFINE_BEGIN(soft_nmi) + IVEC=0x900 + ISTACK=0 +INT_DEFINE_END(soft_nmi) /* * Branch to soft_nmi_interrupt using the emergency stack. The emergency @@ -2366,19 +2359,16 @@ EXC_VIRT_NONE(0x5800, 0x100) * and run it entirely with interrupts hard disabled. */ EXC_COMMON_BEGIN(soft_nmi_common) + mfspr r11,SPRN_SRR0 mr r10,r1 ld r1,PACAEMERGSP(r13) subir1,r1,INT_FRAME_SIZE - __ISTACK(decrementer)=0 - __GEN_COMMON_BODY decrementer + __GEN_COMMON_BODY soft_nmi bl save_nvgprs addir3,r1,STACK_FRAME_OVERHEAD bl soft_nmi_interrupt b ret_from_except -#else /* CONFIG_PPC_WATCHDOG */ -#define MASKED_DEC_HANDLER_LABEL 2f /* normal return */ -#define MASKED_DEC_HANDLER(_H) #endif /* CONFIG_PPC_WATCHDOG */ /* @@ -2397,7 +2387,6 @@ masked_Hinterrupt: .else
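The soft-mask test moved into `__GEN_COMMON_BODY` associates each maskable vector with a bit in `paca->irq_happened` via the `.if IVEC == …` chain in the diff. A Python sketch of that vector-to-bit mapping (illustrative only — the bit values here are placeholders, not the kernel's `PACA_IRQ_*` constants; only the vector-to-name association follows the diff):

```python
# Placeholder bit values; the mapping of vectors to names mirrors the
# .if/.elseif chain in __GEN_COMMON_BODY.
PACA_IRQ_EE    = 1 << 0
PACA_IRQ_DEC   = 1 << 1
PACA_IRQ_DBELL = 1 << 2
PACA_IRQ_HMI   = 1 << 3
PACA_IRQ_PMI   = 1 << 4

def irq_happened_bit(ivec):
    if ivec in (0x500, 0xea0):
        return PACA_IRQ_EE
    if ivec == 0x900:
        return PACA_IRQ_DEC
    if ivec in (0xa00, 0xe80):
        return PACA_IRQ_DBELL
    if ivec == 0xe60:
        return PACA_IRQ_HMI
    if ivec == 0xf00:
        return PACA_IRQ_PMI
    # corresponds to the .abort "Bad maskable vector" in the macro
    raise ValueError("Bad maskable vector")
```

Because the chain is exhaustive, defining a maskable interrupt with an unknown IVEC fails at assembly time rather than silently masking the wrong bit.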
[PATCH v2 rebase 09/34] powerpc/64s/exception: Add ISIDE option
From: Nicholas Piggin Rather than using DAR=2 to select the i-side registers, add an explicit option. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 23 --- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index bef0c2eee7dc..b8588618cdc3 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -199,6 +199,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) #define IVEC .L_IVEC_\name\() #define IHSRR .L_IHSRR_\name\() #define IAREA .L_IAREA_\name\() +#define IISIDE .L_IISIDE_\name\() #define IDAR .L_IDAR_\name\() #define IDSISR .L_IDSISR_\name\() #define ISET_RI.L_ISET_RI_\name\() @@ -231,6 +232,9 @@ do_define_int n .ifndef IAREA IAREA=PACA_EXGEN .endif + .ifndef IISIDE + IISIDE=0 + .endif .ifndef IDAR IDAR=0 .endif @@ -542,7 +546,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) */ GET_SCRATCH0(r10) std r10,IAREA+EX_R13(r13) - .if IDAR == 1 + .if IDAR && !IISIDE .if IHSRR mfspr r10,SPRN_HDAR .else @@ -550,7 +554,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .endif std r10,IAREA+EX_DAR(r13) .endif - .if IDSISR == 1 + .if IDSISR && !IISIDE .if IHSRR mfspr r10,SPRN_HDSISR .else @@ -625,16 +629,18 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) std r9,GPR11(r1) std r10,GPR12(r1) std r11,GPR13(r1) + .if IDAR - .if IDAR == 2 + .if IISIDE ld r10,_NIP(r1) .else ld r10,IAREA+EX_DAR(r13) .endif std r10,_DAR(r1) .endif + .if IDSISR - .if IDSISR == 2 + .if IISIDE ld r10,_MSR(r1) lis r11,DSISR_SRR1_MATCH_64S@h and r10,r10,r11 @@ -643,6 +649,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .endif std r10,_DSISR(r1) .endif + BEGIN_FTR_SECTION_NESTED(66) ld r10,IAREA+EX_CFAR(r13) std r10,ORIG_GPR3(r1) @@ -1311,8 +1318,9 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX) INT_DEFINE_BEGIN(instruction_access) IVEC=0x400 - IDAR=2 - IDSISR=2 + IISIDE=1 + IDAR=1 + IDSISR=1 
IKVM_REAL=1 INT_DEFINE_END(instruction_access) @@ -1341,7 +1349,8 @@ INT_DEFINE_BEGIN(instruction_access_slb) IVEC=0x480 IAREA=PACA_EXSLB IRECONCILE=0 - IDAR=2 + IISIDE=1 + IDAR=1 IKVM_REAL=1 INT_DEFINE_END(instruction_access_slb) -- 2.23.0
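The patch replaces the magic value `IDAR=2` ("i-side: report NIP as the fault address") with an explicit `IISIDE` flag plus `IDAR=1`. A minimal sketch of the resulting selection logic (Python, illustrative only — the kernel does this with the `.if IDAR` / `.if IISIDE` chain shown in the diff, loading either `_NIP(r1)` or `IAREA+EX_DAR(r13)` into `regs->dar`):

```python
def dar_source(idar, iiside):
    # i-side faults (0x400/0x480) report the faulting address from the
    # saved NIP; d-side faults use the DAR captured into the paca save area.
    if not idar:
        return None
    return "regs->nip" if iiside else "saved DAR (IAREA+EX_DAR)"
```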
[PATCH v2 rebase 08/34] powerpc/64s/exception: Remove old INT_KVM_HANDLER
From: Nicholas Piggin Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 55 +--- 1 file changed, 26 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index f318869607db..bef0c2eee7dc 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -266,15 +266,6 @@ do_define_int n .endif .endm -.macro INT_KVM_HANDLER name, vec, hsrr, area, skip - TRAMP_KVM_BEGIN(\name\()_kvm) - KVM_HANDLER \vec, \hsrr, \area, \skip -.endm - -.macro GEN_KVM name - KVM_HANDLER IVEC, IHSRR, IAREA, IKVM_SKIP -.endm - #ifdef CONFIG_KVM_BOOK3S_64_HANDLER #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE /* @@ -293,35 +284,35 @@ do_define_int n bne \name\()_kvm .endm -.macro KVM_HANDLER vec, hsrr, area, skip - .if \skip +.macro GEN_KVM name + .if IKVM_SKIP cmpwi r10,KVM_GUEST_MODE_SKIP beq 89f .else BEGIN_FTR_SECTION_NESTED(947) - ld r10,\area+EX_CFAR(r13) + ld r10,IAREA+EX_CFAR(r13) std r10,HSTATE_CFAR(r13) END_FTR_SECTION_NESTED(CPU_FTR_CFAR,CPU_FTR_CFAR,947) .endif BEGIN_FTR_SECTION_NESTED(948) - ld r10,\area+EX_PPR(r13) + ld r10,IAREA+EX_PPR(r13) std r10,HSTATE_PPR(r13) END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) - ld r10,\area+EX_R10(r13) + ld r10,IAREA+EX_R10(r13) std r12,HSTATE_SCRATCH0(r13) sldir12,r9,32 /* HSRR variants have the 0x2 bit added to their trap number */ - .if \hsrr == EXC_HV_OR_STD + .if IHSRR == EXC_HV_OR_STD BEGIN_FTR_SECTION - ori r12,r12,(\vec + 0x2) + ori r12,r12,(IVEC + 0x2) FTR_SECTION_ELSE - ori r12,r12,(\vec) + ori r12,r12,(IVEC) ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) - .elseif \hsrr - ori r12,r12,(\vec + 0x2) + .elseif IHSRR + ori r12,r12,(IVEC+ 0x2) .else - ori r12,r12,(\vec) + ori r12,r12,(IVEC) .endif #ifdef CONFIG_RELOCATABLE @@ -334,25 +325,25 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) std r9,HSTATE_SCRATCH1(r13) __LOAD_FAR_HANDLER(r9, kvmppc_interrupt) mtctr r9 - ld r9,\area+EX_R9(r13) 
+ ld r9,IAREA+EX_R9(r13) bctr #else - ld r9,\area+EX_R9(r13) + ld r9,IAREA+EX_R9(r13) b kvmppc_interrupt #endif - .if \skip + .if IKVM_SKIP 89:mtocrf 0x80,r9 - ld r9,\area+EX_R9(r13) - ld r10,\area+EX_R10(r13) - .if \hsrr == EXC_HV_OR_STD + ld r9,IAREA+EX_R9(r13) + ld r10,IAREA+EX_R10(r13) + .if IHSRR == EXC_HV_OR_STD BEGIN_FTR_SECTION b kvmppc_skip_Hinterrupt FTR_SECTION_ELSE b kvmppc_skip_interrupt ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) - .elseif \hsrr + .elseif IHSRR b kvmppc_skip_Hinterrupt .else b kvmppc_skip_interrupt @@ -363,7 +354,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) #else .macro KVMTEST name, hsrr, n .endm -.macro KVM_HANDLER name, vec, hsrr, area, skip +.macro GEN_KVM name .endm #endif @@ -1640,6 +1631,12 @@ EXC_VIRT_NONE(0x4b00, 0x100) * without saving, though xer is not a good idea to use, as hardware may * interpret some bits so it may be costly to change them. */ +INT_DEFINE_BEGIN(system_call) + IVEC=0xc00 + IKVM_REAL=1 + IKVM_VIRT=1 +INT_DEFINE_END(system_call) + .macro SYSTEM_CALL virt #ifdef CONFIG_KVM_BOOK3S_64_HANDLER /* @@ -1733,7 +1730,7 @@ TRAMP_KVM_BEGIN(system_call_kvm) SET_SCRATCH0(r10) std r9,PACA_EXGEN+EX_R9(r13) mfcrr9 - KVM_HANDLER 0xc00, EXC_STD, PACA_EXGEN, 0 + GEN_KVM system_call #endif -- 2.23.0
[PATCH v2 rebase 07/34] powerpc/64s/exception: Remove old INT_COMMON macro
From: Nicholas Piggin Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 51 +--- 1 file changed, 24 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 17e4aaf6ed42..f318869607db 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -591,8 +591,8 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) * If stack=0, then the stack is already set in r1, and r1 is saved in r10. * PPR save and CPU accounting is not done for the !stack case (XXX why not?) */ -.macro INT_COMMON vec, area, stack, kuap, reconcile, dar, dsisr - .if \stack +.macro GEN_COMMON name + .if ISTACK andi. r10,r12,MSR_PR /* See if coming from user */ mr r10,r1 /* Save r1 */ subir1,r1,INT_FRAME_SIZE/* alloc frame on kernel stack */ @@ -609,54 +609,54 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) std r0,GPR0(r1) /* save r0 in stackframe*/ std r10,GPR1(r1)/* save r1 in stackframe*/ - .if \stack - .if \kuap + .if ISTACK + .if IKUAP kuap_save_amr_and_lock r9, r10, cr1, cr0 .endif beq 101f/* if from kernel mode */ ACCOUNT_CPU_USER_ENTRY(r13, r9, r10) - SAVE_PPR(\area, r9) + SAVE_PPR(IAREA, r9) 101: .else - .if \kuap + .if IKUAP kuap_save_amr_and_lock r9, r10, cr1 .endif .endif /* Save original regs values from save area to stack frame. 
*/ - ld r9,\area+EX_R9(r13) /* move r9, r10 to stackframe */ - ld r10,\area+EX_R10(r13) + ld r9,IAREA+EX_R9(r13) /* move r9, r10 to stackframe */ + ld r10,IAREA+EX_R10(r13) std r9,GPR9(r1) std r10,GPR10(r1) - ld r9,\area+EX_R11(r13)/* move r11 - r13 to stackframe */ - ld r10,\area+EX_R12(r13) - ld r11,\area+EX_R13(r13) + ld r9,IAREA+EX_R11(r13)/* move r11 - r13 to stackframe */ + ld r10,IAREA+EX_R12(r13) + ld r11,IAREA+EX_R13(r13) std r9,GPR11(r1) std r10,GPR12(r1) std r11,GPR13(r1) - .if \dar - .if \dar == 2 + .if IDAR + .if IDAR == 2 ld r10,_NIP(r1) .else - ld r10,\area+EX_DAR(r13) + ld r10,IAREA+EX_DAR(r13) .endif std r10,_DAR(r1) .endif - .if \dsisr - .if \dsisr == 2 + .if IDSISR + .if IDSISR == 2 ld r10,_MSR(r1) lis r11,DSISR_SRR1_MATCH_64S@h and r10,r10,r11 .else - lwz r10,\area+EX_DSISR(r13) + lwz r10,IAREA+EX_DSISR(r13) .endif std r10,_DSISR(r1) .endif BEGIN_FTR_SECTION_NESTED(66) - ld r10,\area+EX_CFAR(r13) + ld r10,IAREA+EX_CFAR(r13) std r10,ORIG_GPR3(r1) END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 66) - GET_CTR(r10, \area) + GET_CTR(r10, IAREA) std r10,_CTR(r1) std r2,GPR2(r1) /* save r2 in stackframe*/ SAVE_4GPRS(3, r1) /* save r3 - r6 in stackframe */ @@ -668,26 +668,22 @@ END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 66) mfspr r11,SPRN_XER/* save XER in stackframe */ std r10,SOFTE(r1) std r11,_XER(r1) - li r9,(\vec)+1 + li r9,(IVEC)+1 std r9,_TRAP(r1)/* set trap number */ li r10,0 ld r11,exception_marker@toc(r2) std r10,RESULT(r1) /* clear regs->result */ std r11,STACK_FRAME_OVERHEAD-16(r1) /* mark the frame */ - .if \stack + .if ISTACK ACCOUNT_STOLEN_TIME .endif - .if \reconcile + .if IRECONCILE RECONCILE_IRQ_STATE(r10, r11) .endif .endm -.macro GEN_COMMON name - INT_COMMON IVEC, IAREA, ISTACK, IKUAP, IRECONCILE, IDAR, IDSISR -.endm - /* * Restore all registers including H/SRR0/1 saved in a stack frame of a * standard exception. 
@@ -2400,7 +2396,8 @@ EXC_COMMON_BEGIN(soft_nmi_common) mr r10,r1 ld r1,PACAEMERGSP(r13) subir1,r1,INT_FRAME_SIZE - INT_COMMON 0x900, PACA_EXGEN, 0, 1, 1, 0, 0 + __ISTACK(decrementer)=0 + GEN_COMMON decrementer bl save_nvgprs addir3,r1,STACK_FRAME_OVERHEAD bl soft_nmi_interrupt -- 2.23.0
[PATCH v2 rebase 06/34] powerpc/64s/exception: Remove old INT_ENTRY macro
From: Nicholas Piggin Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 68 1 file changed, 30 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index f70c9fb2566a..17e4aaf6ed42 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -482,13 +482,13 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) * - Fall through and continue executing in real, unrelocated mode. * This is done if early=2. */ -.macro INT_HANDLER name, vec, ool=0, early=0, virt=0, hsrr=0, area=PACA_EXGEN, ri=1, dar=0, dsisr=0, bitmask=0, kvm=0 +.macro GEN_INT_ENTRY name, virt, ool=0 SET_SCRATCH0(r13) /* save r13 */ GET_PACA(r13) - std r9,\area\()+EX_R9(r13) /* save r9 */ + std r9,IAREA+EX_R9(r13) /* save r9 */ OPT_GET_SPR(r9, SPRN_PPR, CPU_FTR_HAS_PPR) HMT_MEDIUM - std r10,\area\()+EX_R10(r13)/* save r10 - r12 */ + std r10,IAREA+EX_R10(r13) /* save r10 - r12 */ OPT_GET_SPR(r10, SPRN_CFAR, CPU_FTR_CFAR) .if \ool .if !\virt @@ -502,47 +502,47 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .endif .endif - OPT_SAVE_REG_TO_PACA(\area\()+EX_PPR, r9, CPU_FTR_HAS_PPR) - OPT_SAVE_REG_TO_PACA(\area\()+EX_CFAR, r10, CPU_FTR_CFAR) + OPT_SAVE_REG_TO_PACA(IAREA+EX_PPR, r9, CPU_FTR_HAS_PPR) + OPT_SAVE_REG_TO_PACA(IAREA+EX_CFAR, r10, CPU_FTR_CFAR) INTERRUPT_TO_KERNEL - SAVE_CTR(r10, \area\()) + SAVE_CTR(r10, IAREA) mfcrr9 - .if \kvm - KVMTEST \name \hsrr \vec + .if (!\virt && IKVM_REAL) || (\virt && IKVM_VIRT) + KVMTEST \name IHSRR IVEC .endif - .if \bitmask + .if IMASK lbz r10,PACAIRQSOFTMASK(r13) - andi. r10,r10,\bitmask + andi. 
r10,r10,IMASK /* Associate vector numbers with bits in paca->irq_happened */ - .if \vec == 0x500 || \vec == 0xea0 + .if IVEC == 0x500 || IVEC == 0xea0 li r10,PACA_IRQ_EE - .elseif \vec == 0x900 + .elseif IVEC == 0x900 li r10,PACA_IRQ_DEC - .elseif \vec == 0xa00 || \vec == 0xe80 + .elseif IVEC == 0xa00 || IVEC == 0xe80 li r10,PACA_IRQ_DBELL - .elseif \vec == 0xe60 + .elseif IVEC == 0xe60 li r10,PACA_IRQ_HMI - .elseif \vec == 0xf00 + .elseif IVEC == 0xf00 li r10,PACA_IRQ_PMI .else .abort "Bad maskable vector" .endif - .if \hsrr == EXC_HV_OR_STD + .if IHSRR == EXC_HV_OR_STD BEGIN_FTR_SECTION bne masked_Hinterrupt FTR_SECTION_ELSE bne masked_interrupt ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) - .elseif \hsrr + .elseif IHSRR bne masked_Hinterrupt .else bne masked_interrupt .endif .endif - std r11,\area\()+EX_R11(r13) - std r12,\area\()+EX_R12(r13) + std r11,IAREA+EX_R11(r13) + std r12,IAREA+EX_R12(r13) /* * DAR/DSISR, SCRATCH0 must be read before setting MSR[RI], @@ -550,47 +550,39 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) * not recoverable if they are live. */ GET_SCRATCH0(r10) - std r10,\area\()+EX_R13(r13) - .if \dar == 1 - .if \hsrr + std r10,IAREA+EX_R13(r13) + .if IDAR == 1 + .if IHSRR mfspr r10,SPRN_HDAR .else mfspr r10,SPRN_DAR .endif - std r10,\area\()+EX_DAR(r13) + std r10,IAREA+EX_DAR(r13) .endif - .if \dsisr == 1 - .if \hsrr + .if IDSISR == 1 + .if IHSRR mfspr r10,SPRN_HDSISR .else mfspr r10,SPRN_DSISR .endif - stw r10,\area\()+EX_DSISR(r13) + stw r10,IAREA+EX_DSISR(r13) .endif - .if \early == 2 + .if IEARLY == 2 /* nothing more */ - .elseif \early + .elseif IEARLY mfctr r10 /* save ctr, even for !RELOCATABLE */ BRANCH_TO_C000(r11, \name\()_common) .elseif !\virt - INT_SAVE_SRR_AND_JUMP \name\()_common, \hsrr, \ri + INT_SAVE_SRR_AND_JUMP \name\()_common, IHSRR, ISET_RI .else - INT_VIRT_SAVE_SRR_AND_JUMP \name\()_common, \hsrr + INT_VIRT_SAVE_SRR_AND_JUMP \name\()_common, IHSRR .endif .if \ool .popsection
[PATCH v2 rebase 05/34] powerpc/64s/exception: Move all interrupt handlers to new style code gen macros
From: Nicholas Piggin Aside from label names and BUG line numbers, the generated code change is an additional HMI KVM handler added for the "late" KVM handler, because early and late HMI generation is achieved by defining two different interrupt types. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 556 --- 1 file changed, 418 insertions(+), 138 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index cefe2e9a9e05..f70c9fb2566a 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -206,8 +206,10 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) #define IMASK .L_IMASK_\name\() #define IKVM_SKIP .L_IKVM_SKIP_\name\() #define IKVM_REAL .L_IKVM_REAL_\name\() +#define __IKVM_REAL(name) .L_IKVM_REAL_ ## name #define IKVM_VIRT .L_IKVM_VIRT_\name\() #define ISTACK .L_ISTACK_\name\() +#define __ISTACK(name) .L_ISTACK_ ## name #define IRECONCILE .L_IRECONCILE_\name\() #define IKUAP .L_IKUAP_\name\() @@ -570,7 +572,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) /* nothing more */ .elseif \early mfctr r10 /* save ctr, even for !RELOCATABLE */ - BRANCH_TO_C000(r11, \name\()_early_common) + BRANCH_TO_C000(r11, \name\()_common) .elseif !\virt INT_SAVE_SRR_AND_JUMP \name\()_common, \hsrr, \ri .else @@ -843,6 +845,19 @@ __start_interrupts: EXC_VIRT_NONE(0x4000, 0x100) +INT_DEFINE_BEGIN(system_reset) + IVEC=0x100 + IAREA=PACA_EXNMI + /* +* MSR_RI is not enabled, because PACA_EXNMI and nmi stack is +* being used, so a nested NMI exception would corrupt it. 
+*/ + ISET_RI=0 + ISTACK=0 + IRECONCILE=0 + IKVM_REAL=1 +INT_DEFINE_END(system_reset) + EXC_REAL_BEGIN(system_reset, 0x100, 0x100) #ifdef CONFIG_PPC_P7_NAP /* @@ -880,11 +895,8 @@ BEGIN_FTR_SECTION END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) #endif - INT_HANDLER system_reset, 0x100, area=PACA_EXNMI, ri=0, kvm=1 + GEN_INT_ENTRY system_reset, virt=0 /* -* MSR_RI is not enabled, because PACA_EXNMI and nmi stack is -* being used, so a nested NMI exception would corrupt it. -* * In theory, we should not enable relocation here if it was disabled * in SRR1, because the MMU may not be configured to support it (e.g., * SLB may have been cleared). In practice, there should only be a few @@ -893,7 +905,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206) */ EXC_REAL_END(system_reset, 0x100, 0x100) EXC_VIRT_NONE(0x4100, 0x100) -INT_KVM_HANDLER system_reset 0x100, EXC_STD, PACA_EXNMI, 0 +TRAMP_KVM_BEGIN(system_reset_kvm) + GEN_KVM system_reset #ifdef CONFIG_PPC_P7_NAP TRAMP_REAL_BEGIN(system_reset_idle_wake) @@ -908,8 +921,8 @@ TRAMP_REAL_BEGIN(system_reset_idle_wake) * Vectors for the FWNMI option. Share common code. 
*/ TRAMP_REAL_BEGIN(system_reset_fwnmi) - /* See comment at system_reset exception, don't turn on RI */ - INT_HANDLER system_reset, 0x100, area=PACA_EXNMI, ri=0 + __IKVM_REAL(system_reset)=0 + GEN_INT_ENTRY system_reset, virt=0 #endif /* CONFIG_PPC_PSERIES */ @@ -929,7 +942,7 @@ EXC_COMMON_BEGIN(system_reset_common) mr r10,r1 ld r1,PACA_NMI_EMERG_SP(r13) subir1,r1,INT_FRAME_SIZE - INT_COMMON 0x100, PACA_EXNMI, 0, 1, 0, 0, 0 + GEN_COMMON system_reset bl save_nvgprs /* * Set IRQS_ALL_DISABLED unconditionally so arch_irqs_disabled does @@ -971,23 +984,46 @@ EXC_COMMON_BEGIN(system_reset_common) RFI_TO_USER_OR_KERNEL -EXC_REAL_BEGIN(machine_check, 0x200, 0x100) - INT_HANDLER machine_check, 0x200, early=1, area=PACA_EXMC, dar=1, dsisr=1 +INT_DEFINE_BEGIN(machine_check_early) + IVEC=0x200 + IAREA=PACA_EXMC /* * MSR_RI is not enabled, because PACA_EXMC is being used, so a * nested machine check corrupts it. machine_check_common enables * MSR_RI. */ + ISET_RI=0 + ISTACK=0 + IEARLY=1 + IDAR=1 + IDSISR=1 + IRECONCILE=0 + IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */ +INT_DEFINE_END(machine_check_early) + +INT_DEFINE_BEGIN(machine_check) + IVEC=0x200 + IAREA=PACA_EXMC + ISET_RI=0 + IDAR=1 + IDSISR=1 + IKVM_SKIP=1 + IKVM_REAL=1 +INT_DEFINE_END(machine_check) + +EXC_REAL_BEGIN(machine_check, 0x200, 0x100) + GEN_INT_ENTRY machine_check_early, virt=0 EXC_REAL_END(machine_check, 0x200, 0x100) EXC_VIRT_NONE(0x4200, 0x100) #ifdef CONFIG_PPC_PSERIES TRAMP_REAL_BEGIN(machine_check_fwnmi) /* See comment at machine_check exception, don't turn on RI */ - INT_HANDLER machine_check, 0x200,
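The diff adds `#define __IKVM_REAL(name) .L_IKVM_REAL_ ## name`, i.e. each INT_DEFINE parameter is just a local assembler symbol derived from the interrupt name. That is what lets a second entry point override one parameter, as `system_reset_fwnmi` does with `__IKVM_REAL(system_reset)=0` before re-invoking `GEN_INT_ENTRY`. A sketch of the name-mangling (Python, illustrative only):

```python
def param_label(param, name):
    # Mirrors token pasting such as .L_IKVM_REAL_ ## name: one local
    # symbol per (parameter, interrupt-name) pair.
    return ".L_{}_{}".format(param, name)
```

Since the symbols are per-interrupt, overriding `system_reset`'s IKVM_REAL for the FWNMI entry cannot disturb the KVM test of any other handler.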
[PATCH v2 rebase 04/34] powerpc/64s/exception: Expand EXC_COMMON and EXC_COMMON_ASYNC macros
From: Nicholas Piggin These don't provide a large amount of code sharing. Removing them makes code easier to shuffle around. For example, some of the common instructions will be moved into the common code gen macro. No generated code change. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 160 --- 1 file changed, 117 insertions(+), 43 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 087df86d03ff..cefe2e9a9e05 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -757,28 +757,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_CAN_NAP) #define FINISH_NAP #endif -#define EXC_COMMON(name, realvec, hdlr) \ - EXC_COMMON_BEGIN(name); \ - INT_COMMON realvec, PACA_EXGEN, 1, 1, 1, 0, 0 ; \ - bl save_nvgprs;\ - addir3,r1,STACK_FRAME_OVERHEAD; \ - bl hdlr; \ - b ret_from_except - -/* - * Like EXC_COMMON, but for exceptions that can occur in the idle task and - * therefore need the special idle handling (finish nap and runlatch) - */ -#define EXC_COMMON_ASYNC(name, realvec, hdlr) \ - EXC_COMMON_BEGIN(name); \ - INT_COMMON realvec, PACA_EXGEN, 1, 1, 1, 0, 0 ; \ - FINISH_NAP; \ - RUNLATCH_ON;\ - addir3,r1,STACK_FRAME_OVERHEAD; \ - bl hdlr; \ - b ret_from_except_lite - - /* * There are a few constraints to be concerned with. * - Real mode exceptions code/data must be located at their physical location. 
@@ -1349,7 +1327,13 @@ EXC_VIRT_BEGIN(hardware_interrupt, 0x4500, 0x100) INT_HANDLER hardware_interrupt, 0x500, virt=1, hsrr=EXC_HV_OR_STD, bitmask=IRQS_DISABLED, kvm=1 EXC_VIRT_END(hardware_interrupt, 0x4500, 0x100) INT_KVM_HANDLER hardware_interrupt, 0x500, EXC_HV_OR_STD, PACA_EXGEN, 0 -EXC_COMMON_ASYNC(hardware_interrupt_common, 0x500, do_IRQ) +EXC_COMMON_BEGIN(hardware_interrupt_common) + INT_COMMON 0x500, PACA_EXGEN, 1, 1, 1, 0, 0 + FINISH_NAP + RUNLATCH_ON + addir3,r1,STACK_FRAME_OVERHEAD + bl do_IRQ + b ret_from_except_lite EXC_REAL_BEGIN(alignment, 0x600, 0x100) @@ -1455,7 +1439,13 @@ EXC_VIRT_BEGIN(decrementer, 0x4900, 0x80) INT_HANDLER decrementer, 0x900, virt=1, bitmask=IRQS_DISABLED EXC_VIRT_END(decrementer, 0x4900, 0x80) INT_KVM_HANDLER decrementer, 0x900, EXC_STD, PACA_EXGEN, 0 -EXC_COMMON_ASYNC(decrementer_common, 0x900, timer_interrupt) +EXC_COMMON_BEGIN(decrementer_common) + INT_COMMON 0x900, PACA_EXGEN, 1, 1, 1, 0, 0 + FINISH_NAP + RUNLATCH_ON + addir3,r1,STACK_FRAME_OVERHEAD + bl timer_interrupt + b ret_from_except_lite EXC_REAL_BEGIN(hdecrementer, 0x980, 0x80) @@ -1465,7 +1455,12 @@ EXC_VIRT_BEGIN(hdecrementer, 0x4980, 0x80) INT_HANDLER hdecrementer, 0x980, virt=1, hsrr=EXC_HV, kvm=1 EXC_VIRT_END(hdecrementer, 0x4980, 0x80) INT_KVM_HANDLER hdecrementer, 0x980, EXC_HV, PACA_EXGEN, 0 -EXC_COMMON(hdecrementer_common, 0x980, hdec_interrupt) +EXC_COMMON_BEGIN(hdecrementer_common) + INT_COMMON 0x980, PACA_EXGEN, 1, 1, 1, 0, 0 + bl save_nvgprs + addir3,r1,STACK_FRAME_OVERHEAD + bl hdec_interrupt + b ret_from_except EXC_REAL_BEGIN(doorbell_super, 0xa00, 0x100) @@ -1475,11 +1470,17 @@ EXC_VIRT_BEGIN(doorbell_super, 0x4a00, 0x100) INT_HANDLER doorbell_super, 0xa00, virt=1, bitmask=IRQS_DISABLED EXC_VIRT_END(doorbell_super, 0x4a00, 0x100) INT_KVM_HANDLER doorbell_super, 0xa00, EXC_STD, PACA_EXGEN, 0 +EXC_COMMON_BEGIN(doorbell_super_common) + INT_COMMON 0xa00, PACA_EXGEN, 1, 1, 1, 0, 0 + FINISH_NAP + RUNLATCH_ON + addir3,r1,STACK_FRAME_OVERHEAD #ifdef 
CONFIG_PPC_DOORBELL -EXC_COMMON_ASYNC(doorbell_super_common, 0xa00, doorbell_exception) + bl doorbell_exception #else -EXC_COMMON_ASYNC(doorbell_super_common, 0xa00, unknown_exception) + bl unknown_exception #endif + b ret_from_except_lite EXC_REAL_NONE(0xb00, 0x100) @@ -1623,7 +1624,12 @@ EXC_VIRT_BEGIN(single_step, 0x4d00, 0x100) INT_HANDLER single_step, 0xd00, virt=1 EXC_VIRT_END(single_step, 0x4d00, 0x100) INT_KVM_HANDLER single_step, 0xd00, EXC_STD, PACA_EXGEN, 0 -EXC_COMMON(single_step_common, 0xd00, single_step_exception) +EXC_COMMON_BEGIN(single_step_common) + INT_COMMON 0xd00, PACA_EXGEN, 1, 1, 1, 0, 0 + bl save_nvgprs +
[PATCH v2 rebase 03/34] powerpc/64s/exception: Add GEN_KVM macro that uses INT_DEFINE parameters
From: Nicholas Piggin No generated code change. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 595e215515e9..087df86d03ff 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -204,6 +204,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) #define ISET_RI.L_ISET_RI_\name\() #define IEARLY .L_IEARLY_\name\() #define IMASK .L_IMASK_\name\() +#define IKVM_SKIP .L_IKVM_SKIP_\name\() #define IKVM_REAL .L_IKVM_REAL_\name\() #define IKVM_VIRT .L_IKVM_VIRT_\name\() #define ISTACK .L_ISTACK_\name\() @@ -243,6 +244,9 @@ do_define_int n .ifndef IMASK IMASK=0 .endif + .ifndef IKVM_SKIP + IKVM_SKIP=0 + .endif .ifndef IKVM_REAL IKVM_REAL=0 .endif @@ -265,6 +269,10 @@ do_define_int n KVM_HANDLER \vec, \hsrr, \area, \skip .endm +.macro GEN_KVM name + KVM_HANDLER IVEC, IHSRR, IAREA, IKVM_SKIP +.endm + #ifdef CONFIG_KVM_BOOK3S_64_HANDLER #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE /* @@ -1226,6 +1234,7 @@ INT_DEFINE_BEGIN(data_access) IVEC=0x300 IDAR=1 IDSISR=1 + IKVM_SKIP=1 IKVM_REAL=1 INT_DEFINE_END(data_access) @@ -1235,7 +1244,8 @@ EXC_REAL_END(data_access, 0x300, 0x80) EXC_VIRT_BEGIN(data_access, 0x4300, 0x80) GEN_INT_ENTRY data_access, virt=1 EXC_VIRT_END(data_access, 0x4300, 0x80) -INT_KVM_HANDLER data_access, 0x300, EXC_STD, PACA_EXGEN, 1 +TRAMP_KVM_BEGIN(data_access_kvm) + GEN_KVM data_access EXC_COMMON_BEGIN(data_access_common) GEN_COMMON data_access ld r4,_DAR(r1) -- 2.23.0
[PATCH v2 rebase 02/34] powerpc/64s/exception: Add GEN_COMMON macro that uses INT_DEFINE parameters
From: Nicholas Piggin No generated code change. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 24 +--- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 0be6d8c34536..595e215515e9 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -206,6 +206,9 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) #define IMASK .L_IMASK_\name\() #define IKVM_REAL .L_IKVM_REAL_\name\() #define IKVM_VIRT .L_IKVM_VIRT_\name\() +#define ISTACK .L_ISTACK_\name\() +#define IRECONCILE .L_IRECONCILE_\name\() +#define IKUAP .L_IKUAP_\name\() #define INT_DEFINE_BEGIN(n)\ .macro int_define_ ## n name @@ -246,6 +249,15 @@ do_define_int n .ifndef IKVM_VIRT IKVM_VIRT=0 .endif + .ifndef ISTACK + ISTACK=1 + .endif + .ifndef IRECONCILE + IRECONCILE=1 + .endif + .ifndef IKUAP + IKUAP=1 + .endif .endm .macro INT_KVM_HANDLER name, vec, hsrr, area, skip @@ -670,6 +682,10 @@ END_FTR_SECTION_NESTED(CPU_FTR_CFAR, CPU_FTR_CFAR, 66) .endif .endm +.macro GEN_COMMON name + INT_COMMON IVEC, IAREA, ISTACK, IKUAP, IRECONCILE, IDAR, IDSISR +.endm + /* * Restore all registers including H/SRR0/1 saved in a stack frame of a * standard exception. @@ -1221,13 +1237,7 @@ EXC_VIRT_BEGIN(data_access, 0x4300, 0x80) EXC_VIRT_END(data_access, 0x4300, 0x80) INT_KVM_HANDLER data_access, 0x300, EXC_STD, PACA_EXGEN, 1 EXC_COMMON_BEGIN(data_access_common) - /* -* Here r13 points to the paca, r9 contains the saved CR, -* SRR0 and SRR1 are saved in r11 and r12, -* r9 - r13 are saved in paca->exgen. -* EX_DAR and EX_DSISR have saved DAR/DSISR -*/ - INT_COMMON 0x300, PACA_EXGEN, 1, 1, 1, 1, 1 + GEN_COMMON data_access ld r4,_DAR(r1) ld r5,_DSISR(r1) BEGIN_MMU_FTR_SECTION -- 2.23.0
[PATCH v2 rebase 00/34] exception cleanup, syscall in C and !COMPAT
Hello, This is merge of https://patchwork.ozlabs.org/cover/1162376/ (except two last experimental patches) and https://patchwork.ozlabs.org/patch/1162079/ rebased on top of master. There was minor conflict in Makefile in the latter series. Refreshed the patchset to fix build error on ppc32 and ppc64e. Rebased on top of powerpc/merge. Thanks Michal Michal Suchanek (9): powerpc/64: system call: Fix sparse warning about missing declaration powerpc: Add back __ARCH_WANT_SYS_LLSEEK macro powerpc: move common register copy functions from signal_32.c to signal.c powerpc/perf: consolidate read_user_stack_32 powerpc/perf: consolidate valid_user_sp powerpc/64: make buildable without CONFIG_COMPAT powerpc/64: Make COMPAT user-selectable disabled on littleendian by default. powerpc/perf: split callchain.c by bitness MAINTAINERS: perf: Add pattern that matches ppc perf to the perf entry. Nicholas Piggin (25): powerpc/64s/exception: Introduce INT_DEFINE parameter block for code generation powerpc/64s/exception: Add GEN_COMMON macro that uses INT_DEFINE parameters powerpc/64s/exception: Add GEN_KVM macro that uses INT_DEFINE parameters powerpc/64s/exception: Expand EXC_COMMON and EXC_COMMON_ASYNC macros powerpc/64s/exception: Move all interrupt handlers to new style code gen macros powerpc/64s/exception: Remove old INT_ENTRY macro powerpc/64s/exception: Remove old INT_COMMON macro powerpc/64s/exception: Remove old INT_KVM_HANDLER powerpc/64s/exception: Add ISIDE option powerpc/64s/exception: move real->virt switch into the common handler powerpc/64s/exception: move soft-mask test to common code powerpc/64s/exception: move KVM test to common code powerpc/64s/exception: remove confusing IEARLY option powerpc/64s/exception: remove the SPR saving patch code macros powerpc/64s/exception: trim unused arguments from KVMTEST macro powerpc/64s/exception: hdecrementer avoid touching the stack powerpc/64s/exception: re-inline some handlers powerpc/64s/exception: Clean up SRR specifiers 
powerpc/64s/exception: add more comments for interrupt handlers powerpc/64s/exception: only test KVM in SRR interrupts when PR KVM is supported powerpc/64s/exception: soft nmi interrupt should not use ret_from_except powerpc/64: system call remove non-volatile GPR save optimisation powerpc/64: system call implement the bulk of the logic in C powerpc/64s: interrupt return in C powerpc/64s/exception: remove lite interrupt return MAINTAINERS |2 + arch/powerpc/Kconfig |5 +- arch/powerpc/include/asm/asm-prototypes.h | 17 +- .../powerpc/include/asm/book3s/64/kup-radix.h | 24 +- arch/powerpc/include/asm/cputime.h| 24 + arch/powerpc/include/asm/exception-64s.h |4 - arch/powerpc/include/asm/hw_irq.h |4 + arch/powerpc/include/asm/ptrace.h |3 + arch/powerpc/include/asm/signal.h |3 + arch/powerpc/include/asm/switch_to.h | 11 + arch/powerpc/include/asm/thread_info.h|4 +- arch/powerpc/include/asm/time.h |4 +- arch/powerpc/include/asm/unistd.h |1 + arch/powerpc/kernel/Makefile |9 +- arch/powerpc/kernel/entry_64.S| 880 ++-- arch/powerpc/kernel/exceptions-64e.S | 255 ++- arch/powerpc/kernel/exceptions-64s.S | 1937 - arch/powerpc/kernel/process.c | 89 +- arch/powerpc/kernel/signal.c | 144 +- arch/powerpc/kernel/signal.h |2 - arch/powerpc/kernel/signal_32.c | 140 -- arch/powerpc/kernel/syscall_64.c | 349 +++ arch/powerpc/kernel/syscalls/syscall.tbl | 22 +- arch/powerpc/kernel/systbl.S |9 +- arch/powerpc/kernel/time.c|9 - arch/powerpc/kernel/vdso.c|3 +- arch/powerpc/kernel/vector.S |2 +- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 - arch/powerpc/kvm/book3s_segment.S |7 - arch/powerpc/perf/Makefile|5 +- arch/powerpc/perf/callchain.c | 370 +--- arch/powerpc/perf/callchain.h | 20 + arch/powerpc/perf/callchain_32.c | 197 ++ arch/powerpc/perf/callchain_64.c | 178 ++ fs/read_write.c |3 +- 35 files changed, 2798 insertions(+), 1949 deletions(-) create mode 100644 arch/powerpc/kernel/syscall_64.c create mode 100644 arch/powerpc/perf/callchain.h create mode 100644 
arch/powerpc/perf/callchain_32.c create mode 100644 arch/powerpc/perf/callchain_64.c -- 2.23.0
[PATCH v2 rebase 01/34] powerpc/64s/exception: Introduce INT_DEFINE parameter block for code generation
From: Nicholas Piggin The code generation macro arguments are difficult to read, and defaults can't easily be used. This introduces a block where parameters can be set for interrupt handler code generation by the subsequent macros, and adds the first generation macro for interrupt entry. One interrupt handler is converted to the new macros to demonstrate the change, the rest will be converted all at once. No generated code change. Signed-off-by: Nicholas Piggin --- arch/powerpc/kernel/exceptions-64s.S | 77 ++-- 1 file changed, 73 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 46508b148e16..0be6d8c34536 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -193,6 +193,61 @@ END_FTR_SECTION_NESTED(ftr,ftr,943) mtctr reg;\ bctr +/* + * Interrupt code generation macros + */ +#define IVEC .L_IVEC_\name\() +#define IHSRR .L_IHSRR_\name\() +#define IAREA .L_IAREA_\name\() +#define IDAR .L_IDAR_\name\() +#define IDSISR .L_IDSISR_\name\() +#define ISET_RI.L_ISET_RI_\name\() +#define IEARLY .L_IEARLY_\name\() +#define IMASK .L_IMASK_\name\() +#define IKVM_REAL .L_IKVM_REAL_\name\() +#define IKVM_VIRT .L_IKVM_VIRT_\name\() + +#define INT_DEFINE_BEGIN(n)\ +.macro int_define_ ## n name + +#define INT_DEFINE_END(n) \ +.endm ; \ +int_define_ ## n n ; \ +do_define_int n + +.macro do_define_int name + .ifndef IVEC + .error "IVEC not defined" + .endif + .ifndef IHSRR + IHSRR=EXC_STD + .endif + .ifndef IAREA + IAREA=PACA_EXGEN + .endif + .ifndef IDAR + IDAR=0 + .endif + .ifndef IDSISR + IDSISR=0 + .endif + .ifndef ISET_RI + ISET_RI=1 + .endif + .ifndef IEARLY + IEARLY=0 + .endif + .ifndef IMASK + IMASK=0 + .endif + .ifndef IKVM_REAL + IKVM_REAL=0 + .endif + .ifndef IKVM_VIRT + IKVM_VIRT=0 + .endif +.endm + .macro INT_KVM_HANDLER name, vec, hsrr, area, skip TRAMP_KVM_BEGIN(\name\()_kvm) KVM_HANDLER \vec, \hsrr, \area, \skip @@ -474,7 +529,7 @@ 
END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) */ GET_SCRATCH0(r10) std r10,\area\()+EX_R13(r13) - .if \dar + .if \dar == 1 .if \hsrr mfspr r10,SPRN_HDAR .else @@ -482,7 +537,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .endif std r10,\area\()+EX_DAR(r13) .endif - .if \dsisr + .if \dsisr == 1 .if \hsrr mfspr r10,SPRN_HDSISR .else @@ -506,6 +561,14 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,948) .endif .endm +.macro GEN_INT_ENTRY name, virt, ool=0 + .if ! \virt + INT_HANDLER \name, IVEC, \ool, IEARLY, \virt, IHSRR, IAREA, ISET_RI, IDAR, IDSISR, IMASK, IKVM_REAL + .else + INT_HANDLER \name, IVEC, \ool, IEARLY, \virt, IHSRR, IAREA, ISET_RI, IDAR, IDSISR, IMASK, IKVM_VIRT + .endif +.endm + /* * On entry r13 points to the paca, r9-r13 are saved in the paca, * r9 contains the saved CR, r11 and r12 contain the saved SRR0 and @@ -1143,12 +1206,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE) bl unrecoverable_exception b . +INT_DEFINE_BEGIN(data_access) + IVEC=0x300 + IDAR=1 + IDSISR=1 + IKVM_REAL=1 +INT_DEFINE_END(data_access) EXC_REAL_BEGIN(data_access, 0x300, 0x80) - INT_HANDLER data_access, 0x300, ool=1, dar=1, dsisr=1, kvm=1 + GEN_INT_ENTRY data_access, virt=0, ool=1 EXC_REAL_END(data_access, 0x300, 0x80) EXC_VIRT_BEGIN(data_access, 0x4300, 0x80) - INT_HANDLER data_access, 0x300, virt=1, dar=1, dsisr=1 + GEN_INT_ENTRY data_access, virt=1 EXC_VIRT_END(data_access, 0x4300, 0x80) INT_KVM_HANDLER data_access, 0x300, EXC_STD, PACA_EXGEN, 1 EXC_COMMON_BEGIN(data_access_common) -- 2.23.0
Re: [Very RFC 35/46] powernv/pci: Remove open-coded PE lookup in pnv_pci_release_device
On Wed, Nov 27, 2019 at 4:24 PM Alexey Kardashevskiy wrote: > > > > On 20/11/2019 12:28, Oliver O'Halloran wrote: > > Signed-off-by: Oliver O'Halloran > > --- > > arch/powerpc/platforms/powernv/pci-ioda.c | 5 ++--- > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c > > b/arch/powerpc/platforms/powernv/pci-ioda.c > > index 4f38652c7cd7..8525642b1256 100644 > > --- a/arch/powerpc/platforms/powernv/pci-ioda.c > > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c > > @@ -3562,14 +3562,14 @@ static void pnv_ioda_release_pe(struct pnv_ioda_pe > > *pe) > > static void pnv_pci_release_device(struct pci_dev *pdev) > > { > > struct pnv_phb *phb = pci_bus_to_pnvhb(pdev->bus); > > + struct pnv_ioda_pe *pe = pnv_ioda_get_pe(pdev); > > struct pci_dn *pdn = pci_get_pdn(pdev); > > - struct pnv_ioda_pe *pe; > > > > /* The VF PE state is torn down when sriov_disable() is called */ > > if (pdev->is_virtfn) > > return; > > > > - if (!pdn || pdn->pe_number == IODA_INVALID_PE) > > + if (WARN_ON(!pe)) > > > Is that WARN_ON because there is always a PE - from upstream bridge or a The device should always belong to a PE. If it doesn't (at this point) then something deeply strange has happened. > reserved one? If it's associated with the reserved PE the rmap is set to IODA_PE_INVALID, so would return NULL and we'd hit the WARN_ON(). I think that's ok though since PE assignment should always succeed. If it failed, or we're tearing down the device before we got to the point of assigning a PE then there's probably a bug.
Re: [PATCH v2 29/35] powerpc/perf: remove current_is_64bit()
On Wed, Nov 27, 2019 at 06:41:09AM +0100, Christophe Leroy wrote: > > > Le 26/11/2019 à 21:13, Michal Suchanek a écrit : > > Since commit ed1cd6deb013 ("powerpc: Activate CONFIG_THREAD_INFO_IN_TASK") > > current_is_64bit() is quivalent to !is_32bit_task(). > > Remove the redundant function. > > > > Link: https://github.com/linuxppc/issues/issues/275 > > Link: https://lkml.org/lkml/2019/9/12/540 > > > > Fixes: linuxppc#275 > > Suggested-by: Christophe Leroy > > Signed-off-by: Michal Suchanek > > This change is already in powerpc/next, see > https://github.com/linuxppc/linux/commit/42484d2c0f82b666292faf6668c77b49a3a04bc0 Right, needs rebase. Thanks Michal > > Christophe > > > --- > > arch/powerpc/perf/callchain.c | 17 + > > 1 file changed, 1 insertion(+), 16 deletions(-) > > > > diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c > > index c84bbd4298a0..35d542515faf 100644 > > --- a/arch/powerpc/perf/callchain.c > > +++ b/arch/powerpc/perf/callchain.c > > @@ -284,16 +284,6 @@ static void perf_callchain_user_64(struct > > perf_callchain_entry_ctx *entry, > > } > > } > > -static inline int current_is_64bit(void) > > -{ > > - /* > > -* We can't use test_thread_flag() here because we may be on an > > -* interrupt stack, and the thread flags don't get copied over > > -* from the thread_info on the main stack to the interrupt stack. 
> > -*/ > > - return !test_ti_thread_flag(task_thread_info(current), TIF_32BIT); > > -} > > - > > #else /* CONFIG_PPC64 */ > > /* > >* On 32-bit we just access the address and let hash_page create a > > @@ -321,11 +311,6 @@ static inline void perf_callchain_user_64(struct > > perf_callchain_entry_ctx *entry > > { > > } > > -static inline int current_is_64bit(void) > > -{ > > - return 0; > > -} > > - > > static inline int valid_user_sp(unsigned long sp, int is_64) > > { > > if (!sp || (sp & 7) || sp > TASK_SIZE - 32) > > @@ -486,7 +471,7 @@ static void perf_callchain_user_32(struct > > perf_callchain_entry_ctx *entry, > > void > > perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct > > pt_regs *regs) > > { > > - if (current_is_64bit()) > > + if (!is_32bit_task()) > > perf_callchain_user_64(entry, regs); > > else > > perf_callchain_user_32(entry, regs); > >
Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
Le 27/11/2019 à 10:33, Greg Kurz a écrit : On Wed, 27 Nov 2019 10:10:13 +0100 Frederic Barrat wrote: Le 27/11/2019 à 09:24, Greg Kurz a écrit : On Wed, 27 Nov 2019 18:09:40 +1100 Alexey Kardashevskiy wrote: On 20/11/2019 12:28, Oliver O'Halloran wrote: The comment here implies that we don't need to take a ref to the pci_dev because the ioda_pe will always have one. This implies that the current expection is that the pci_dev for an NPU device will *never* be torn down since the ioda_pe having a ref to the device will prevent the release function from being called. In other words, the desired behaviour here appears to be leaking a ref. Nice! There is a history: https://patchwork.ozlabs.org/patch/1088078/ We did not fix anything in particular then, we do not seem to be fixing anything now (in other words - we cannot test it in a normal natural way). I'd drop this one. Yeah, I didn't fix anything at the time. Just reverted to the ref count behavior we had before: https://patchwork.ozlabs.org/patch/829172/ Frederic recently posted his take on the same topic from the OpenCAPI point of view: http://patchwork.ozlabs.org/patch/1198947/ He seems to indicate the NPU devices as the real culprit because nobody ever cared for them to be removable. Fixing that seems be a chore nobody really wants to address obviously... :-\ I had taken a stab at not leaking a ref for the nvlink devices and do the proper thing regarding ref counting (i.e. fixing all the callers of get_pci_dev() to drop the reference when they were done). With that, I could see that the ref count of the nvlink devices could drop to 0 (calling remove for the device in /sys) and that the devices could go away. But then, I realized it's not necessarily desirable at this point. 
There are several comments in the code saying the npu devices (for nvlink) don't go away, there's no device release callback defined when it seems there should be, at least to handle releasing PEs All in all, it seems that some work would be needed. And if it hasn't been required by now... If everyone is ok with leaking a reference in the NPU case, I guess this isn't a problem. But if we move forward with Oliver's patch, a pci_dev_put() would be needed for OpenCAPI, correct ? No, these code paths are nvlink-only. Fred Fred Signed-off-by: Oliver O'Halloran --- arch/powerpc/platforms/powernv/npu-dma.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c index 72d3749da02c..2eb6e6d45a98 100644 --- a/arch/powerpc/platforms/powernv/npu-dma.c +++ b/arch/powerpc/platforms/powernv/npu-dma.c @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn) break; /* -* pci_get_domain_bus_and_slot() increased the reference count of -* the PCI device, but callers don't need that actually as the PE -* already holds a reference to the device. Since callers aren't -* aware of the reference count change, call pci_dev_put() now to -* avoid leaks. +* NB: for_each_pci_dev() elevates the pci_dev refcount. +* Caller is responsible for dropping the ref when it's +* finished with it. */ - if (pdev) - pci_dev_put(pdev); - return pdev; }
Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
On Wed, Nov 27, 2019 at 8:34 PM Greg Kurz wrote: > > > If everyone is ok with leaking a reference in the NPU case, I guess > this isn't a problem. But if we move forward with Oliver's patch, a > pci_dev_put() would be needed for OpenCAPI, correct ? Yes, but I think that's fair enough. By convention it's the callers responsibility to drop the ref when it calls a function that returns a refcounted object. Doing anything else creates a race condition since the object's count could drop to zero before the caller starts using it. Oliver
Re: [PATCH 02/14] Revert "powerpc/powernv: remove the unused vas_win_paste_addr and vas_win_id functions"
On Wed, Nov 27, 2019 at 01:20:36AM -0800, Haren Myneni wrote: > Thanks for the review. > vas_win_paste_addr() will be used in NX compression driver and planning to > post this series soon. Can I add this change later as part of this series? Please only add core functionality and exports with the actual users.
Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
Le 27/11/2019 à 09:24, Greg Kurz a écrit : On Wed, 27 Nov 2019 18:09:40 +1100 Alexey Kardashevskiy wrote: On 20/11/2019 12:28, Oliver O'Halloran wrote: The comment here implies that we don't need to take a ref to the pci_dev because the ioda_pe will always have one. This implies that the current expectation is that the pci_dev for an NPU device will *never* be torn down since the ioda_pe having a ref to the device will prevent the release function from being called. In other words, the desired behaviour here appears to be leaking a ref. Nice! There is a history: https://patchwork.ozlabs.org/patch/1088078/ We did not fix anything in particular then, we do not seem to be fixing anything now (in other words - we cannot test it in a normal natural way). I'd drop this one. Yeah, I didn't fix anything at the time. Just reverted to the ref count behavior we had before: https://patchwork.ozlabs.org/patch/829172/ Frederic recently posted his take on the same topic from the OpenCAPI point of view: http://patchwork.ozlabs.org/patch/1198947/ He seems to indicate the NPU devices as the real culprit because nobody ever cared for them to be removable. Fixing that seems to be a chore nobody really wants to address obviously... :-\ I had taken a stab at not leaking a ref for the nvlink devices and doing the proper thing regarding ref counting (i.e. fixing all the callers of get_pci_dev() to drop the reference when they were done). With that, I could see that the ref count of the nvlink devices could drop to 0 (calling remove for the device in /sys) and that the devices could go away. But then, I realized it's not necessarily desirable at this point. There are several comments in the code saying the npu devices (for nvlink) don't go away, there's no device release callback defined when it seems there should be, at least to handle releasing PEs. All in all, it seems that some work would be needed. And if it hasn't been required by now... 
Fred Signed-off-by: Oliver O'Halloran --- arch/powerpc/platforms/powernv/npu-dma.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c index 72d3749da02c..2eb6e6d45a98 100644 --- a/arch/powerpc/platforms/powernv/npu-dma.c +++ b/arch/powerpc/platforms/powernv/npu-dma.c @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node *dn) break; /* -* pci_get_domain_bus_and_slot() increased the reference count of -* the PCI device, but callers don't need that actually as the PE -* already holds a reference to the device. Since callers aren't -* aware of the reference count change, call pci_dev_put() now to -* avoid leaks. +* NB: for_each_pci_dev() elevates the pci_dev refcount. +* Caller is responsible for dropping the ref when it's +* finished with it. */ - if (pdev) - pci_dev_put(pdev); - return pdev; }
Re: [Very RFC 40/46] powernv/npu: Don't drop refcount when looking up GPU pci_devs
On Wed, 27 Nov 2019 18:09:40 +1100 Alexey Kardashevskiy wrote: > > > On 20/11/2019 12:28, Oliver O'Halloran wrote: > > The comment here implies that we don't need to take a ref to the pci_dev > > because the ioda_pe will always have one. This implies that the current > > expectation is that the pci_dev for an NPU device will *never* be torn > > down since the ioda_pe having a ref to the device will prevent the > > release function from being called. > > > > In other words, the desired behaviour here appears to be leaking a ref. > > > > Nice! > > > There is a history: https://patchwork.ozlabs.org/patch/1088078/ > > We did not fix anything in particular then, we do not seem to be fixing > anything now (in other words - we cannot test it in a normal natural > way). I'd drop this one. > Yeah, I didn't fix anything at the time. Just reverted to the ref count behavior we had before: https://patchwork.ozlabs.org/patch/829172/ Frederic recently posted his take on the same topic from the OpenCAPI point of view: http://patchwork.ozlabs.org/patch/1198947/ He seems to indicate the NPU devices as the real culprit because nobody ever cared for them to be removable. Fixing that seems to be a chore nobody really wants to address obviously... :-\ > > > > > > Signed-off-by: Oliver O'Halloran > > --- > > arch/powerpc/platforms/powernv/npu-dma.c | 11 +++ > > 1 file changed, 3 insertions(+), 8 deletions(-) > > > > diff --git a/arch/powerpc/platforms/powernv/npu-dma.c > > b/arch/powerpc/platforms/powernv/npu-dma.c > > index 72d3749da02c..2eb6e6d45a98 100644 > > --- a/arch/powerpc/platforms/powernv/npu-dma.c > > +++ b/arch/powerpc/platforms/powernv/npu-dma.c > > @@ -28,15 +28,10 @@ static struct pci_dev *get_pci_dev(struct device_node > > *dn) > > break; > > > > /* > > -* pci_get_domain_bus_and_slot() increased the reference count of > > -* the PCI device, but callers don't need that actually as the PE > > -* already holds a reference to the device. 
Since callers aren't > > -* aware of the reference count change, call pci_dev_put() now to > > -* avoid leaks. > > +* NB: for_each_pci_dev() elevates the pci_dev refcount. > > +* Caller is responsible for dropping the ref when it's > > +* finished with it. > > */ > > - if (pdev) > > - pci_dev_put(pdev); > > - > > return pdev; > > } > > > > >
Re: Bug 205201 - Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M
On Wed, Nov 27, 2019 at 08:56:25AM +0200, Mike Rapoport wrote: > Maybe we'll simply force bottom up allocation before calling > swiotlb_init()? Anyway, it's the last memblock allocation. That should work, but I don't think it is the proper fix. The underlying issue here is that ZONE_DMA/DMA32 sizing is something that needs to be propagated to memblock and dma-direct, as it is based on addressing limitations. But our zone initialization is such a mess that we can't just reuse a variable. Nicolas has started to clean some of this up, but we need to clean that whole zone initialization mess up a lot more.
Re: [PATCH] powerpc/32: drop unused ISA_DMA_THRESHOLD
On Mon, Nov 25, 2019 at 11:20:33AM +0200, Mike Rapoport wrote: > From: Mike Rapoport > > The ISA_DMA_THRESHOLD variable is set by several platforms but never > referenced. > Remove it. Looks good: Reviewed-by: Christoph Hellwig
Re: [PATCH 09/14] powerpc/vas: Update CSB and notify process for fault CRBs
> > +static void notify_process(pid_t pid, u64 fault_addr) > +{ > + int rc; > + struct kernel_siginfo info; > + > + memset(&info, 0, sizeof(info)); > + > + info.si_signo = SIGSEGV; > + info.si_errno = EFAULT; > + info.si_code = SEGV_MAPERR; > + > + info.si_addr = (void *)fault_addr; > + rcu_read_lock(); > + rc = kill_pid_info(SIGSEGV, &info, find_vpid(pid)); > + rcu_read_unlock(); > + > + pr_devel("%s(): pid %d kill_proc_info() rc %d\n", __func__, pid, rc); > +} Shouldn't this use force_sig_fault_to_task instead? > + /* > + * User space passed invalid CSB address, Notify process with > + * SEGV signal. > + */ > + tsk = get_pid_task(window->pid, PIDTYPE_PID); > + /* > + * Send window will be closed after processing all NX requests > + * and process exits after closing all windows. In multi-thread > + * applications, thread may not exists, but does not close FD > + * (means send window) upon exit. Parent thread (tgid) can use > + * and close the window later. > + */ > + if (tsk) { > + if (tsk->flags & PF_EXITING) > + task_exit = 1; > + put_task_struct(tsk); > + pid = vas_window_pid(window); The pid is later used for sending the signal again, why not keep the reference? > + } else { > + pid = vas_window_tgid(window); > + > + rcu_read_lock(); > + tsk = find_task_by_vpid(pid); > + if (!tsk) { > + rcu_read_unlock(); > + return; > + } > + if (tsk->flags & PF_EXITING) > + task_exit = 1; > + rcu_read_unlock(); Why does this not need a reference to the task, but the other one does?
Re: [PATCH 06/14] powerpc/vas: Setup fault handler per VAS instance
> > +struct task_struct *fault_handler; > + > +void vas_wakeup_fault_handler(int virq, void *arg) > +{ > + struct vas_instance *vinst = arg; > + > + atomic_inc(&vinst->pending_fault); > + wake_up(&vinst->fault_wq); > +} > + > +/* > + * Fault handler thread for each VAS instance and process fault CRBs. > + */ > +static int fault_handler_func(void *arg) > +{ > + struct vas_instance *vinst = (struct vas_instance *)arg; > + > + do { > + if (signal_pending(current)) > + flush_signals(current); > + > + wait_event_interruptible(vinst->fault_wq, > + atomic_read(&vinst->pending_fault) || > + kthread_should_stop()); > + > + if (kthread_should_stop()) > + break; > + > + atomic_dec(&vinst->pending_fault); > + } while (!kthread_should_stop()); > + > + return 0; > +} Please use threaded interrupts instead of reinventing them badly.
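For reference, the threaded-interrupt shape being suggested replaces the kthread/wait-queue/atomic-counter trio with request_threaded_irq(), letting the IRQ core manage the thread, wakeups and teardown. A pseudocode-level sketch only (handler names are invented here, and this is not compilable outside a kernel tree):

```c
/* Hard-IRQ half: runs in interrupt context, just defers to the thread. */
static irqreturn_t vas_fault_isr(int virq, void *arg)
{
	return IRQ_WAKE_THREAD;
}

/* Thread half: runs in sleepable process context, so it can process
 * fault CRBs directly; replaces fault_handler_func() entirely. */
static irqreturn_t vas_fault_thread_fn(int virq, void *arg)
{
	struct vas_instance *vinst = arg;
	/* ... handle pending fault CRBs for vinst ... */
	return IRQ_HANDLED;
}

/* At setup time, instead of kthread_run() + wait_event_interruptible(): */
rc = request_threaded_irq(virq, vas_fault_isr, vas_fault_thread_fn,
			  0, "vas-fault", vinst);
```

With this shape there is also no signal handling to get wrong: the core IRQ thread ignores signals and is stopped cleanly by free_irq().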
Re: [PATCH 05/14] powerpc/vas: Setup fault window per VAS instance
> +/* > + * We do not remove VAS instances. The following functions are needed > + * when VAS hotplug is supported. > + */ > +#if 0 Please don't add dead code to the kernel tree.
Re: [PATCH 04/14] powerpc/vas: Setup IRQ mapping and register port for each window
> +static irqreturn_t vas_irq_handler(int virq, void *data) > +{ > + struct vas_instance *vinst = data; > + > + pr_devel("VAS %d: virq %d\n", vinst->vas_id, virq); > + > + return IRQ_HANDLED; > +} An empty interrupt handler is rather pointless. It later grows code, but adding it without that is a bad idea. Please squash the patches into sensible chunks.
Re: [PATCH 03/14] powerpc/vas: Define nx_fault_stamp in coprocessor_request_block
> +#define crb_csb_addr(c) __be64_to_cpu(c->csb_addr) > +#define crb_nx_fault_addr(c) __be64_to_cpu(c->stamp.nx.fault_storage_addr) > +#define crb_nx_flags(c) c->stamp.nx.flags > +#define crb_nx_fault_status(c) c->stamp.nx.fault_status Except for crb_nx_fault_addr all these macros are unused, and crb_nx_fault_addr probably makes more sense open coded in the only caller. Also please don't use the __ prefixed byte swap helpers in any driver or arch code. > + > +static inline uint32_t crb_nx_pswid(struct coprocessor_request_block *crb) > +{ > + return __be32_to_cpu(crb->stamp.nx.pswid); > +} Same here. Also not sure what the point of the helper is except for obfuscating the code.
Re: [PATCH 02/14] Revert "powerpc/powernv: remove the unused vas_win_paste_addr and vas_win_id functions"
On Tue, Nov 26, 2019 at 05:03:27PM -0800, Haren Myneni wrote: > > This reverts commit 452d23c0f6bd97f2fd8a9691fee79b76040a0feb. > > User space send windows (NX GZIP compression) need vas_win_paste_addr() > to mmap window paste address and vas_win_id() to get window ID when > window address is given. Even with your full series applied vas_win_paste_addr is entirely unused, and vas_win_id is only used once in the same file it is defined. So instead of this patch you should just open code vas_win_id in init_winctx_for_txwin. > +static inline u32 encode_pswid(int vasid, int winid) > +{ > + u32 pswid = 0; > + > + pswid |= vasid << (31 - 7); > + pswid |= winid; > + > + return pswid; This can be simplified down to: return (u32)winid | (vasid << (31 - 7));