Re: [PATCH v1 05/10] powerpc/mm: Do early ioremaps from top to bottom on PPC64 too.
On Tue, Aug 13, 2019 at 08:11:38PM +, Christophe Leroy wrote:
> Until vmalloc system is up and running, ioremap basically
> allocates addresses at the border of the IOREMAP area.

Note that while a few other architectures have a magic hack like powerpc
to make ioremap work before vmalloc, the normal practice would be to
explicitly use early_ioremap. I guess your change is fine for now, but it
might make sense to convert powerpc to the explicit early_ioremap scheme
as well.
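The behaviour the patch gives PPC64 — before vmalloc is up, hand out
addresses going down from the top of the IOREMAP window, as PPC32 already
does — can be sketched as a standalone userspace model. The bounds below
are illustrative only, and this is a model of the allocation policy, not
the kernel code:

```c
#include <assert.h>

#define PAGE_SIZE    0x1000UL
#define IOREMAP_BASE 0xc000000000000000UL   /* illustrative bounds only */
#define IOREMAP_END  0xc000000010000000UL

/* Before vmalloc is running, early ioremap hands out addresses going
 * down from the top of the IOREMAP window (the kernel tracks this in
 * ioremap_bot); vmalloc later uses the space below it. */
static unsigned long ioremap_bot = IOREMAP_END;

static unsigned long early_ioremap_alloc(unsigned long size)
{
	size = (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);	/* page align */
	if (ioremap_bot - size < IOREMAP_BASE)
		return 0;					/* window exhausted */
	ioremap_bot -= size;
	return ioremap_bot;					/* grows downward */
}
```

Successive allocations come back at strictly decreasing addresses, which is
what keeps early mappings out of the way of the later vmalloc-backed ones.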
Re: [PATCH v1 10/10] powerpc/mm: refactor ioremap_range() and use ioremap_page_range()
Somehow this series is missing a cover letter.

While you are touching all this "fun", can you also look into killing
__ioremap? It seems to be a weird non-standard version of ioremap_prot
(probably predating ioremap_prot) that is missing a few lines of code
setting attributes that might not even be applicable for the two drivers
calling it.
Re: [PATCH v1 02/10] powerpc/mm: rework io-workaround invocation.
On Tue, Aug 13, 2019 at 08:11:34PM +, Christophe Leroy wrote:
> ppc_md.ioremap() is only used for I/O workaround on CELL platform,
> so indirect function call can be avoided.
>
> This patch reworks the io-workaround and ioremap() functions to
> use static keys for the activation of io-workaround.
>
> When CONFIG_PPC_IO_WORKAROUNDS or CONFIG_PPC_INDIRECT_MMIO are not
> selected, the I/O workaround ioremap() voids and the static key is
> not used at all.

Why bother with the complex static key? ioremap isn't exactly a fast
path. Just make it a normal branch if enabled, with the option to
compile it out entirely as in your patch.
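Christoph's point is that static keys buy runtime branch patching, which
only pays off on hot paths; for a slow path like ioremap() a plain flag
plus compile-time removal is simpler. A standalone sketch of that pattern
(the `CONFIG_IO_WORKAROUNDS` macro here stands in for the kernel's
`IS_ENABLED()` machinery, and the names are invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for IS_ENABLED(CONFIG_PPC_IO_WORKAROUNDS): 1 when the
 * workaround code is configured in, 0 when it should vanish. */
#define CONFIG_IO_WORKAROUNDS 1

static bool io_workaround_inited;	/* plain flag, no static key */

static void io_workaround_init(void)
{
	io_workaround_inited = true;
}

/* In a slow path like ioremap(), an ordinary conditional branch is
 * plenty; when the config is off, the constant condition lets the
 * compiler drop the branch and the flag entirely. */
static int ioremap_picks_workaround(void)
{
	if (CONFIG_IO_WORKAROUNDS && io_workaround_inited)
		return 1;	/* take the I/O-workaround mapping path */
	return 0;		/* normal ioremap path */
}
```

The observable behaviour is the same as with a static key; only the
cost of flipping the flag and the per-call overhead differ, and neither
matters for ioremap().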
Re: [PATCH] powerpc/32s: fix boot failure with DEBUG_PAGEALLOC without KASAN.
On Wed, Aug 14, 2019 at 05:28:35AM +, Christophe Leroy wrote:
> When KASAN is selected, the definitive hash table has to be
> set up later, but there is already an early temporary one.
>
> When KASAN is not selected, there is no early hash table,
> so the setup of the definitive hash table cannot be delayed.

I think you also want to add this information to the code itself as
comments.
Re: [REGRESSION] Boot failure with DEBUG_PAGEALLOC on Wii, after PPC32 KASAN patches
Hi

On 13/08/2019 at 17:51, Jonathan Neuschäfer wrote:
> Hi,
>
> I noticed that my Nintendo Wii doesn't boot with wii_defconfig plus
> CONFIG_DEBUG_PAGEALLOC=y and CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
> on recent kernels. I get a splash like this one:
>
> [0.022245] BUG: Unable to handle kernel data access at 0x6601
> [0.025172] Faulting instruction address: 0xc01afa48
> [0.027522] Oops: Kernel access of bad area, sig: 11 [#1]
> [0.030076] BE PAGE_SIZE=4K MMU=Hash PREEMPT DEBUG_PAGEALLOC wii
> [...]
>
> (Without CONFIG_DEBUG_PAGEALLOC I haven't noticed any problems.)
>
> 'git bisect' says:
>
> 72f208c6a8f7bc78ef5248babd9e6ed6302bd2a0 is the first bad commit
> commit 72f208c6a8f7bc78ef5248babd9e6ed6302bd2a0
> Author: Christophe Leroy
> Date: Fri Apr 26 16:23:35 2019 +
>
>     powerpc/32s: move hash code patching out of MMU_init_hw()
> [...]
>
> I can revert this commit, and then 5.3-rc2 (plus a patchset adding a
> serial driver) boots again.
>
> Christophe, is there anything I should test in order to figure out how
> to fix this properly?

I just sent out a patch that should fix it. Please test and tell me.

Thanks
Christophe
[PATCH] powerpc/32s: fix boot failure with DEBUG_PAGEALLOC without KASAN.
When KASAN is selected, the definitive hash table has to be
set up later, but there is already an early temporary one.

When KASAN is not selected, there is no early hash table,
so the setup of the definitive hash table cannot be delayed.

Reported-by: Jonathan Neuschäfer
Fixes: 72f208c6a8f7 ("powerpc/32s: move hash code patching out of MMU_init_hw()")
Signed-off-by: Christophe Leroy
---
 arch/powerpc/kernel/head_32.S  | 2 ++
 arch/powerpc/mm/book3s32/mmu.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S
index f255e22184b4..c8b4f7ed318c 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -897,9 +897,11 @@ start_here:
 	bl	machine_init
 	bl	__save_cpu_setup
 	bl	MMU_init
+#ifdef CONFIG_KASAN
 BEGIN_MMU_FTR_SECTION
 	bl	MMU_init_hw_patch
 END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
+#endif
 
 /*
  * Go back to running unmapped so we can load up new values
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index e249fbf6b9c3..6ddcbfad5c9e 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -358,6 +358,11 @@ void __init MMU_init_hw(void)
 	hash_mb2 = hash_mb = 32 - LG_HPTEG_SIZE - lg_n_hpteg;
 	if (lg_n_hpteg > 16)
 		hash_mb2 = 16 - LG_HPTEG_SIZE;
+
+	if (IS_ENABLED(CONFIG_KASAN))
+		return;
+
+	MMU_init_hw_patch();
 }
 
 void __init MMU_init_hw_patch(void)
--
2.13.3
Re: [PATCH v1 08/10] powerpc/mm: move __ioremap_at() and __iounmap_at() into ioremap.c
> +/**
> + * __iounmap_from - Low level function to tear down the page tables
> + *                  for an IO mapping. This is used for mappings that
> + *                  are manipulated manually, like partial unmapping of
> + *                  PCI IOs or ISA space.
> + */
> +void __iounmap_at(void *ea, unsigned long size)

The comment doesn't match the function name. That's why I usually don't
even add the function name, so that it doesn't get out of sync.
Re: [PATCH v1 01/10] powerpc/mm: drop ppc_md.iounmap()
On Tue, Aug 13, 2019 at 08:11:33PM +, Christophe Leroy wrote:
> ppc_md.iounmap() is never set, drop it.
>
> Signed-off-by: Christophe Leroy

Hah, I was just going to send the same patch as part of a tree-wide
ioremap related series.

Reviewed-by: Christoph Hellwig
Re: [PATCH v2 1/3] KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts
On Tue, 2019-08-13 at 20:03 +1000, Paul Mackerras wrote:
> Escalation interrupts are interrupts sent to the host by the XIVE
> hardware when it has an interrupt to deliver to a guest VCPU but that
> VCPU is not running anywhere in the system. Hence we disable the
> escalation interrupt for the VCPU being run when we enter the guest
> and re-enable it when the guest does an H_CEDE hypercall indicating
> it is idle.
>
> It is possible that an escalation interrupt gets generated just as we
> are entering the guest. In that case the escalation interrupt may be
> using a queue entry in one of the interrupt queues, and that queue
> entry may not have been processed when the guest exits with an H_CEDE.
> The existing entry code detects this situation and does not clear the
> vcpu->arch.xive_esc_on flag as an indication that there is a pending
> queue entry (if the queue entry gets processed, xive_esc_irq() will
> clear the flag). There is a comment in the code saying that if the
> flag is still set on H_CEDE, we have to abort the cede rather than
> re-enabling the escalation interrupt, lest we end up with two
> occurrences of the escalation interrupt in the interrupt queue.
>
> However, the exit code doesn't do that; it aborts the cede in the sense
> that vcpu->arch.ceded gets cleared, but it still enables the escalation
> interrupt by setting the source's PQ bits to 00. Instead we need to
> set the PQ bits to 10, indicating that an interrupt has been triggered.
> We also need to avoid setting vcpu->arch.xive_esc_on in this case
> (i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because
> xive_esc_irq() will run at some point and clear it, and if we race with
> that we may end up with an incorrect result (i.e. xive_esc_on set when
> the escalation interrupt has just been handled).
> It is extremely unlikely that having two queue entries would cause
> observable problems; theoretically it could cause queue overflow, but
> the CPU would have to have thousands of interrupts targetted to it for
> that to be possible. However, this fix will also make it possible to
> determine accurately whether there is an unhandled escalation
> interrupt in the queue, which will be needed by the following patch.
>
> Cc: sta...@vger.kernel.org # v4.16+
> Fixes: 9b9b13a6d153 ("KVM: PPC: Book3S HV: Keep XIVE escalation
> interrupt masked unless ceded")
> Signed-off-by: Paul Mackerras
> ---
> v2: don't set xive_esc_on if we're not using a XIVE escalation
> interrupt.
>
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 36 +
>  1 file changed, 23 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index 337e644..2e7e788 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -2831,29 +2831,39 @@ kvm_cede_prodded:
>  kvm_cede_exit:
>  	ld	r9, HSTATE_KVM_VCPU(r13)
>  #ifdef CONFIG_KVM_XICS
> -	/* Abort if we still have a pending escalation */
> +	/* are we using XIVE with single escalation? */
> +	ld	r10, VCPU_XIVE_ESC_VADDR(r9)
> +	cmpdi	r10, 0
> +	beq	3f
> +	li	r6, XIVE_ESB_SET_PQ_00

Would it make sense to put the above instruction down into the 4: label
instead? If we do not branch to 4, r6 is overwritten anyway. I think that
would save a load when we do not branch to 4. Also it would mean that you
could use r5 everywhere instead of changing it to r6?

> +	/*
> +	 * If we still have a pending escalation, abort the cede,
> +	 * and we must set PQ to 10 rather than 00 so that we don't
> +	 * potentially end up with two entries for the escalation
> +	 * interrupt in the XIVE interrupt queue. In that case
> +	 * we also don't want to set xive_esc_on to 1 here in
> +	 * case we race with xive_esc_irq().
> +	 */
>  	lbz	r5, VCPU_XIVE_ESC_ON(r9)
>  	cmpwi	r5, 0
> -	beq	1f
> +	beq	4f
>  	li	r0, 0
>  	stb	r0, VCPU_CEDED(r9)
> -1:	/* Enable XIVE escalation */
> -	li	r5, XIVE_ESB_SET_PQ_00
> +	li	r6, XIVE_ESB_SET_PQ_10
> +	b	5f
> +4:	li	r0, 1
> +	stb	r0, VCPU_XIVE_ESC_ON(r9)
> +	/* make sure store to xive_esc_on is seen before xive_esc_irq runs */
> +	sync
> +5:	/* Enable XIVE escalation */
>  	mfmsr	r0
>  	andi.	r0, r0, MSR_DR		/* in real mode? */
>  	beq	1f
> -	ld	r10, VCPU_XIVE_ESC_VADDR(r9)
> -	cmpdi	r10, 0
> -	beq	3f
> -	ldx	r0, r10, r5
> +	ldx	r0, r10, r6
>  	b	2f
> 1:	ld	r10, VCPU_XIVE_ESC_RADDR(r9)
> -	cmpdi	r10, 0
> -	beq	3f
> -	ldcix	r0, r10, r5
> +	ldcix	r0, r10, r6
> 2:	sync
> -	li	r0, 1
> -	stb	r0, VCPU_XIVE_ESC_ON(r9)
>  #endif /* CONFIG_KVM_XICS */
> 3:	b	guest_exit_cont
>
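The fix hinges on ESB PQ-bit semantics: writing PQ=00 re-enables the
escalation source, so a new trigger can land a second queue entry while
the old one is still pending, whereas PQ=10 latches the source so no new
entry is added. A deliberately simplified standalone model of that
reasoning (queue and PQ behaviour are abstracted from the commit message,
not from the XIVE specification, and all names are invented):

```c
#include <assert.h>

/* Minimal model of one XIVE escalation source: its PQ state plus the
 * number of entries it currently has in the interrupt queue. */
enum pq { PQ_00 = 0, PQ_10 = 2 };	/* 00 = enabled, 10 = latched off */

struct esc_src {
	enum pq pq;
	int queue_entries;
};

static void set_pq(struct esc_src *s, enum pq v)
{
	s->pq = v;
}

/* In this model a trigger only lands a new queue entry when the source
 * is enabled (PQ=00); with PQ=10 the trigger is latched and no further
 * entry is queued. */
static void trigger(struct esc_src *s)
{
	if (s->pq == PQ_00)
		s->queue_entries++;
}
```

Starting from the racy state the commit message describes (one entry
already pending), re-enabling with PQ_00 lets a second trigger queue a
duplicate entry; setting PQ_10 instead keeps the count at one.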
Re: [PATCH v5 1/4] nvdimm: Consider probe return -EOPNOTSUPP as success
Hi Aneesh, logic looks correct but there are some cleanups I'd like to
see and a lead-in patch that I attached.

I've started prefixing nvdimm patches with:

    libnvdimm/$component:

...since this patch mostly impacts the pmem driver let's prefix it
"libnvdimm/pmem: ".

On Fri, Aug 9, 2019 at 12:45 AM Aneesh Kumar K.V wrote:
>
> This patch add -EOPNOTSUPP as return from probe callback to

s/This patch add/Add/

No need to say "this patch", it's obviously a patch.

> indicate we were not able to initialize a namespace due to pfn superblock
> feature/version mismatch. We want to consider this a probe success so that
> we can create new namesapce seed and there by avoid marking the failed
> namespace as the seed namespace.

Please replace usage of "we" with the exact agent involved, as which "we"
is being referred to gets confusing for the reader. i.e. "indicate that
the pmem driver was not..." "The nvdimm core wants to consider this...".

>
> Signed-off-by: Aneesh Kumar K.V
> ---
>  drivers/nvdimm/bus.c  |  2 +-
>  drivers/nvdimm/pmem.c | 26 ++
>  2 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
> index 798c5c4aea9c..16c35e6446a7 100644
> --- a/drivers/nvdimm/bus.c
> +++ b/drivers/nvdimm/bus.c
> @@ -95,7 +95,7 @@ static int nvdimm_bus_probe(struct device *dev)
>  	rc = nd_drv->probe(dev);
>  	debug_nvdimm_unlock(dev);
>
> -	if (rc == 0)
> +	if (rc == 0 || rc == -EOPNOTSUPP)
>  		nd_region_probe_success(nvdimm_bus, dev);

This now makes the nd_region_probe_success() helper obviously misnamed
since it now wants to take actions on non-probe success. I attached a
lead-in cleanup that you can pull into your series that renames that
routine to nd_region_advance_seeds().

When you rebase this needs a comment about why EOPNOTSUPP has special
handling.
> else > nd_region_disable(nvdimm_bus, dev); > diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c > index 4c121dd03dd9..3f498881dd28 100644 > --- a/drivers/nvdimm/pmem.c > +++ b/drivers/nvdimm/pmem.c > @@ -490,6 +490,7 @@ static int pmem_attach_disk(struct device *dev, > > static int nd_pmem_probe(struct device *dev) > { > + int ret; > struct nd_namespace_common *ndns; > > ndns = nvdimm_namespace_common_probe(dev); > @@ -505,12 +506,29 @@ static int nd_pmem_probe(struct device *dev) > if (is_nd_pfn(dev)) > return pmem_attach_disk(dev, ndns); > > - /* if we find a valid info-block we'll come back as that personality > */ > - if (nd_btt_probe(dev, ndns) == 0 || nd_pfn_probe(dev, ndns) == 0 > - || nd_dax_probe(dev, ndns) == 0) Similar need for an updated comment here to explain the special translation of error codes. > + ret = nd_btt_probe(dev, ndns); > + if (ret == 0) > return -ENXIO; > + else if (ret == -EOPNOTSUPP) Are there cases where the btt driver needs to return EOPNOTSUPP? I'd otherwise like to keep this special casing constrained to the pfn / dax info block cases. From 9ec13a8672e87e0b1c5b9427ab926168e53d55bc Mon Sep 17 00:00:00 2001 From: Dan Williams Date: Tue, 13 Aug 2019 13:09:27 -0700 Subject: [PATCH] libnvdimm/region: Rewrite _probe_success() to _advance_seeds() The nd_region_probe_success() helper collides seed management with nvdimm->busy tracking. Given the 'busy' increment is handled internal to the nd_region driver 'probe' path move the decrement to the 'remove' path. With that cleanup the routine can be renamed to the more descriptive nd_region_advance_seeds(). The change is prompted by an incoming need to optionally advance the seeds on other events besides 'probe' success. 
Cc: "Aneesh Kumar K.V" Signed-off-by: Dan Williams --- drivers/nvdimm/bus.c| 7 +--- drivers/nvdimm/namespace_devs.c | 34 ++--- drivers/nvdimm/nd-core.h| 3 +- drivers/nvdimm/region_devs.c| 68 + 4 files changed, 41 insertions(+), 71 deletions(-) diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c index 29479d3b01b0..ee6de34ae525 100644 --- a/drivers/nvdimm/bus.c +++ b/drivers/nvdimm/bus.c @@ -95,10 +95,8 @@ static int nvdimm_bus_probe(struct device *dev) rc = nd_drv->probe(dev); debug_nvdimm_unlock(dev); - if (rc == 0) - nd_region_probe_success(nvdimm_bus, dev); - else - nd_region_disable(nvdimm_bus, dev); + if (rc == 0 && dev->parent && is_nd_region(dev->parent)) + nd_region_advance_seeds(to_nd_region(dev->parent), dev); nvdimm_bus_probe_end(nvdimm_bus); dev_dbg(_bus->dev, "END: %s.probe(%s) = %d\n", dev->driver->name, @@ -121,7 +119,6 @@ static int nvdimm_bus_remove(struct device *dev) rc = nd_drv->remove(dev); debug_nvdimm_unlock(dev); } - nd_region_disable(nvdimm_bus, dev); dev_dbg(_bus->dev, "%s.remove(%s) = %d\n", dev->driver->name, dev_name(dev), rc); diff --git
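The core idea under review is that the bus probe path treats one specific
error, -EOPNOTSUPP, as "probe got far enough to advance the namespace
seed". A standalone userspace model of that translation (the function
names echo the patch, but the probe table and counters are invented for
illustration):

```c
#include <assert.h>
#include <errno.h>	/* ENXIO, EOPNOTSUPP */

static int seeds_advanced;

static void nd_region_advance_seeds(void)
{
	seeds_advanced++;
}

/* Model of nvdimm_bus_probe(): both 0 and -EOPNOTSUPP advance the
 * namespace seed, so a pfn-superblock feature/version mismatch does
 * not leave the failed namespace marked as the seed. Any other error
 * is treated as a plain probe failure. */
static int bus_probe(int (*probe)(void))
{
	int rc = probe();

	if (rc == 0 || rc == -EOPNOTSUPP)
		nd_region_advance_seeds();
	return rc;
}

static int probe_ok(void)       { return 0; }
static int probe_mismatch(void) { return -EOPNOTSUPP; }
static int probe_fail(void)     { return -ENXIO; }
```

Only the mismatch case is special-cased; a hard failure like -ENXIO still
skips the seed advance, which is exactly the asymmetry Dan wants a
comment for.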
[PATCHv6 2/2] PCI: layerscape: Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC separately
Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC separately. Signed-off-by: Xiaowei Bao --- v2: - No change. v3: - modify the commit message. v4: - send the patch again with '--to'. v5: - No change. v6: - remove the [EXT] tag of the $SUBJECT in email. drivers/pci/controller/dwc/Kconfig | 20 ++-- drivers/pci/controller/dwc/Makefile | 3 ++- 2 files changed, 20 insertions(+), 3 deletions(-) diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig index 6ea778a..869c645 100644 --- a/drivers/pci/controller/dwc/Kconfig +++ b/drivers/pci/controller/dwc/Kconfig @@ -131,13 +131,29 @@ config PCI_KEYSTONE_EP DesignWare core functions to implement the driver. config PCI_LAYERSCAPE - bool "Freescale Layerscape PCIe controller" + bool "Freescale Layerscape PCIe controller - Host mode" depends on OF && (ARM || ARCH_LAYERSCAPE || COMPILE_TEST) depends on PCI_MSI_IRQ_DOMAIN select MFD_SYSCON select PCIE_DW_HOST help - Say Y here if you want PCIe controller support on Layerscape SoCs. + Say Y here if you want to enable PCIe controller support on Layerscape + SoCs to work in Host mode. + This controller can work either as EP or RC. The RCW[HOST_AGT_PEX] + determines which PCIe controller works in EP mode and which PCIe + controller works in RC mode. + +config PCI_LAYERSCAPE_EP + bool "Freescale Layerscape PCIe controller - Endpoint mode" + depends on OF && (ARM || ARCH_LAYERSCAPE || COMPILE_TEST) + depends on PCI_ENDPOINT + select PCIE_DW_EP + help + Say Y here if you want to enable PCIe controller support on Layerscape + SoCs to work in Endpoint mode. + This controller can work either as EP or RC. The RCW[HOST_AGT_PEX] + determines which PCIe controller works in EP mode and which PCIe + controller works in RC mode. 
config PCI_HISI depends on OF && (ARM64 || COMPILE_TEST) diff --git a/drivers/pci/controller/dwc/Makefile b/drivers/pci/controller/dwc/Makefile index b085dfd..824fde7 100644 --- a/drivers/pci/controller/dwc/Makefile +++ b/drivers/pci/controller/dwc/Makefile @@ -8,7 +8,8 @@ obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o obj-$(CONFIG_PCI_IMX6) += pci-imx6.o obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone.o -obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o pci-layerscape-ep.o +obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o +obj-$(CONFIG_PCI_LAYERSCAPE_EP) += pci-layerscape-ep.o obj-$(CONFIG_PCIE_QCOM) += pcie-qcom.o obj-$(CONFIG_PCIE_ARMADA_8K) += pcie-armada8k.o obj-$(CONFIG_PCIE_ARTPEC6) += pcie-artpec6.o -- 2.9.5
[PATCHv6 1/2] PCI: layerscape: Add the bar_fixed_64bit property in EP driver.
The PCIe controller of layerscape has just 4 BARs: BAR0 and BAR1 are
32-bit, BAR2 and BAR4 are 64-bit. This is determined by hardware, so set
bar_fixed_64bit to 0x14.

Signed-off-by: Xiaowei Bao
---
v2:
 - Replace value 0x14 with a macro.
v3:
 - No change.
v4:
 - send the patch again with '--to'.
v5:
 - fix the commit message.
v6:
 - remove the [EXT] tag of the $SUBJECT in email.

 drivers/pci/controller/dwc/pci-layerscape-ep.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c b/drivers/pci/controller/dwc/pci-layerscape-ep.c
index be61d96..ca9aa45 100644
--- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
+++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
@@ -44,6 +44,7 @@ static const struct pci_epc_features ls_pcie_epc_features = {
 	.linkup_notifier = false,
 	.msi_capable = true,
 	.msix_capable = false,
+	.bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
 };
 
 static const struct pci_epc_features*
--
2.9.5
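The 0x14 mentioned in the commit message is simply bits 2 and 4 set. A
quick standalone check (the BAR enum is redeclared here only so the
snippet compiles on its own; it mirrors the kernel's BAR_0..BAR_5
ordering):

```c
#include <assert.h>

enum { BAR_0, BAR_1, BAR_2, BAR_3, BAR_4, BAR_5 };	/* mirrors kernel order */

/* BAR0/BAR1 are 32-bit while BAR2 and BAR4 are fixed 64-bit, so the
 * feature mask sets exactly those two bits. */
static unsigned int ls_bar_fixed_64bit(void)
{
	return (1u << BAR_2) | (1u << BAR_4);
}
```

(1 << 2) | (1 << 4) = 4 + 16 = 20 = 0x14, matching the value quoted in
the commit message.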
Re: [PATCH 1/2] powerpc: rewrite LOAD_REG_IMMEDIATE() as an intelligent macro
On Tue, Aug 13, 2019 at 09:59:35AM +, Christophe Leroy wrote:
[snip]
> +.macro __LOAD_REG_IMMEDIATE r, x
> +	.if \x & ~0x != 0
> +	__LOAD_REG_IMMEDIATE_32 \r, (\x) >> 32
> +	rldicr	\r, \r, 32, 31
> +	.if (\x) & 0x != 0
> +	oris	\r, \r, (\x)@__AS_ATHIGH
> +	.endif
> +	.if (\x) & 0x != 0
> +	oris	\r, \r, (\x)@l
> +	.endif
> +	.else
> +	__LOAD_REG_IMMEDIATE_32 \r, \x
> +	.endif
> +.endm

Doesn't this force all negative constants, even small ones, to use the
long sequence? For example,

	__LOAD_REG_IMMEDIATE r3, -1

will generate (as far as I can see):

	li	r3, -1
	rldicr	r3, r3, 32, 31
	oris	r3, r3, 0x
	ori	r3, r3, 0x

which seems suboptimal.

Paul.
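Paul's objection can be checked by counting the instructions the macro
would emit. A standalone model, under the assumption that
__LOAD_REG_IMMEDIATE_32 emits `li` for sign-extended 16-bit values, `lis`
when the low halfword is zero, and `lis`+`ori` otherwise:

```c
#include <assert.h>
#include <stdint.h>

/* Instructions __LOAD_REG_IMMEDIATE_32 needs for one 32-bit value. */
static int insns_load32(int32_t v)
{
	if (v >= -0x8000 && v < 0x8000)
		return 1;		/* li */
	if ((v & 0xffff) == 0)
		return 1;		/* lis */
	return 2;			/* lis + ori */
}

/* Instructions the 64-bit macro emits: it takes the long path whenever
 * any of the high 32 bits is set -- which, for a sign-extended negative
 * constant, is always, even when a single li would do. */
static int insns_load64(int64_t x)
{
	if ((uint64_t)x >> 32 == 0)
		return insns_load32((int32_t)x);
	return insns_load32((int32_t)(x >> 32)) + 1	/* rldicr */
	     + ((x & 0xffff0000u) != 0)			/* oris */
	     + ((x & 0xffffu) != 0);			/* ori */
}
```

For -1 the model gives four instructions, matching Paul's expansion,
where a single `li r3, -1` would suffice; small positive constants still
get the one-instruction path.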
Re: [RFC PATCH] powerpc/64s/radix: introduce option to disable broadcast tlbie
Hi Nick, Just a few comments. Nicholas Piggin writes: > diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c > b/arch/powerpc/mm/book3s64/radix_tlb.c > index 71f7fede2fa4..56ceecbd3d5c 100644 > --- a/arch/powerpc/mm/book3s64/radix_tlb.c > +++ b/arch/powerpc/mm/book3s64/radix_tlb.c > @@ -285,6 +286,30 @@ static inline void _tlbie_pid(unsigned long pid, > unsigned long ric) > asm volatile("eieio; tlbsync; ptesync": : :"memory"); > } > > +struct tlbiel_pid { > + unsigned long pid; > + unsigned long ric; > +}; > + > +static void do_tlbiel_pid(void *info) > +{ > + struct tlbiel_pid *t = info; > + > + if (t->ric == RIC_FLUSH_TLB) > + _tlbiel_pid(t->pid, RIC_FLUSH_TLB); > + else if (t->ric == RIC_FLUSH_PWC) > + _tlbiel_pid(t->pid, RIC_FLUSH_PWC); > + else > + _tlbiel_pid(t->pid, RIC_FLUSH_ALL); > +} > + > +static inline void _tlbiel_pid_broadcast(const struct cpumask *cpus, > + unsigned long pid, unsigned long ric) Can we call these "multicast" instead of "broadcast"? I think that's more accurate, and avoids confusion with tlbie which literally does a broadcast (at least architecturally). > @@ -524,6 +604,12 @@ static bool mm_needs_flush_escalation(struct mm_struct > *mm) > return false; > } > > +static bool tlbie_enabled = true; > +static bool use_tlbie(void) > +{ > + return tlbie_enabled; > +} No synchronisation, but that's OK. Would probably be good to have a comment though explaining why. We could use a static_key but I guess the overhead of a comparison and branch is in the noise vs the tlbie/tlbiel. 
> @@ -1100,3 +1221,13 @@ extern void radix_kvm_prefetch_workaround(struct > mm_struct *mm) > } > EXPORT_SYMBOL_GPL(radix_kvm_prefetch_workaround); > #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */ > + > +static int __init radix_tlb_setup(void) > +{ > + debugfs_create_bool("tlbie_enabled", 0600, > + powerpc_debugfs_root, > + _enabled); > + > + return 0; > +} > +arch_initcall(radix_tlb_setup); For working around hardware bugs we would want a command line parameter or other boot time way to flip this. But I guess you're saying because we haven't converted all uses of tlbie we can't really support that anyway, and so a runtime switch is sufficient? cheers
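Michael's "multicast" naming captures how tlbiel differs from tlbie: the
local form has to be run on each CPU in the mm's cpumask, rather than
being broadcast by hardware. A standalone userspace model of that
dispatch (the loop stands in for the kernel's cross-CPU call machinery,
something like on_each_cpu_mask(); counters replace real TLB flushes):

```c
#include <assert.h>

typedef unsigned long cpumask_t;	/* bit n = CPU n; model only */

struct tlbiel_pid {
	unsigned long pid;
	unsigned long ric;
};

static int flushes[64];			/* per-CPU flush count */

static void do_tlbiel_pid(int cpu, struct tlbiel_pid *t)
{
	(void)t;			/* a real flush would use pid/ric */
	flushes[cpu]++;			/* tlbiel only affects this CPU */
}

/* Multicast: run the local flush on every CPU in the mm's mask,
 * instead of issuing one tlbie that the fabric broadcasts to all
 * CPUs at once. */
static void tlbiel_pid_multicast(cpumask_t cpus, struct tlbiel_pid *t)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (cpus & (1UL << cpu))
			do_tlbiel_pid(cpu, t);
}
```

CPUs outside the mask are never touched, which is exactly why "multicast"
describes this better than "broadcast".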
[PATCH v1 09/10] powerpc/mm: make __ioremap_caller() common to PPC32 and PPC64
__ioremap_caller() do the same thing. Define a common one. __ioremap() is not reused because most of the tests included in it are unnecessary when coming from __ioremap_caller() Signed-off-by: Christophe Leroy --- arch/powerpc/mm/ioremap.c| 99 arch/powerpc/mm/pgtable_32.c | 75 - arch/powerpc/mm/pgtable_64.c | 61 --- 3 files changed, 99 insertions(+), 136 deletions(-) diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c index 889ee656cf64..537c9148cea1 100644 --- a/arch/powerpc/mm/ioremap.c +++ b/arch/powerpc/mm/ioremap.c @@ -76,6 +76,105 @@ void __iomem *ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long f } EXPORT_SYMBOL(ioremap_prot); +int __weak ioremap_range(unsigned long ea, phys_addr_t pa, unsigned long size, +pgprot_t prot, int nid) +{ + unsigned long i; + + for (i = 0; i < size; i += PAGE_SIZE) { + int err = map_kernel_page(ea + i, pa + i, prot); + + if (err) { + if (slab_is_available()) + unmap_kernel_range(ea, size); + else + WARN_ON_ONCE(1); /* Should clean up */ + return err; + } + } + + return 0; +} + +void __iomem *__ioremap_caller(phys_addr_t addr, unsigned long size, + pgprot_t prot, void *caller) +{ + phys_addr_t pa = addr & PAGE_MASK; + int ret; + unsigned long va; + + size = PAGE_ALIGN(addr + size) - pa; + +#ifdef CONFIG_PPC64 + /* We don't support the 4K PFN hack with ioremap */ + if (pgprot_val(prot) & H_PAGE_4K_PFN) + return NULL; +#else + /* +* If the address lies within the first 16 MB, assume it's in ISA +* memory space +*/ + if (pa < SZ_16M) + pa += _ISA_MEM_BASE; + +#ifndef CONFIG_CRASH_DUMP + /* +* Don't allow anybody to remap normal RAM that we're using. +* mem_init() sets high_memory so only do the check after that. 
+*/ + if (slab_is_available() && pa <= virt_to_phys(high_memory - 1) && + page_is_ram(__phys_to_pfn(pa))) { + pr_err("%s(): phys addr 0x%llx is RAM lr %ps\n", __func__, + (unsigned long long)pa, __builtin_return_address(0)); + return NULL; + } +#endif +#endif /* CONFIG_PPC64 */ + + if (size == 0 || pa == 0) + return NULL; + + /* +* Is it already mapped? Perhaps overlapped by a previous +* mapping. +*/ + va = p_block_mapped(pa); + if (va) + return (void __iomem *)va + (addr & ~PAGE_MASK); + + /* +* Choose an address to map it to. +* Once the vmalloc system is running, we use it. +* Before that, we map using addresses going +* down from ioremap_bot. vmalloc will use +* the addresses from IOREMAP_BASE through +* ioremap_bot +* +*/ + if (slab_is_available()) { + struct vm_struct *area; + + area = __get_vm_area_caller(size, VM_IOREMAP, IOREMAP_BASE, + ioremap_bot, caller); + if (area == NULL) + return NULL; + + area->phys_addr = pa; + va = (unsigned long)area->addr; + } else { + ioremap_bot -= size; + va = ioremap_bot; + } + ret = ioremap_range(va, pa, size, prot, NUMA_NO_NODE); + if (!ret) + return (void __iomem *)va + (addr & ~PAGE_MASK); + + if (!slab_is_available()) + ioremap_bot += size; + + return NULL; +} + /* * Unmap an IO region and remove it from vmalloc'd list. * Access to IO memory should be serialized by driver. diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 4597f45e4dc6..bacf3b85191c 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -35,81 +35,6 @@ extern char etext[], _stext[], _sinittext[], _einittext[]; -void __iomem * -__ioremap_caller(phys_addr_t addr, unsigned long size, pgprot_t prot, void *caller) -{ - unsigned long v, i; - phys_addr_t p; - int err; - - /* -* Choose an address to map it to. -* Once the vmalloc system is running, we use it. -* Before then, we use space going down from IOREMAP_TOP -* (ioremap_bot records where we're up to). 
-*/ - p = addr & PAGE_MASK; - size = PAGE_ALIGN(addr + size) - p; - - /* -* If the address lies within the first 16 MB, assume it's in ISA -* memory space -*/ - if (p < 16*1024*1024) - p += _ISA_MEM_BASE; - -#ifndef CONFIG_CRASH_DUMP - /* -* Don't
[PATCH v1 10/10] powerpc/mm: refactor ioremap_range() and use ioremap_page_range()
book3s64's ioremap_range() is almost same as fallback ioremap_range(), except that it calls radix__ioremap_range() when radix is enabled. radix__ioremap_range() is also very similar to the other ones, expect that it calls ioremap_page_range when slab is available. Lets keep only one version of ioremap_range() which calls ioremap_page_range() on all platforms when slab is available. At the same time, drop the nid parameter which is not used. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/64/radix.h | 3 --- arch/powerpc/mm/book3s64/pgtable.c | 21 - arch/powerpc/mm/book3s64/radix_pgtable.c | 20 arch/powerpc/mm/ioremap.c | 23 +-- 4 files changed, 13 insertions(+), 54 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h index e04a839cb5b9..574eca33f893 100644 --- a/arch/powerpc/include/asm/book3s/64/radix.h +++ b/arch/powerpc/include/asm/book3s/64/radix.h @@ -266,9 +266,6 @@ extern void radix__vmemmap_remove_mapping(unsigned long start, extern int radix__map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t flags, unsigned int psz); -extern int radix__ioremap_range(unsigned long ea, phys_addr_t pa, - unsigned long size, pgprot_t prot, int nid); - static inline unsigned long radix__get_tree_size(void) { unsigned long rts_field; diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index 7d0e0d0d22c4..4c8bed856533 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -446,24 +446,3 @@ int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl, return true; } - -int ioremap_range(unsigned long ea, phys_addr_t pa, unsigned long size, pgprot_t prot, int nid) -{ - unsigned long i; - - if (radix_enabled()) - return radix__ioremap_range(ea, pa, size, prot, nid); - - for (i = 0; i < size; i += PAGE_SIZE) { - int err = map_kernel_page(ea + i, pa + i, prot); - if (err) { - if (slab_is_available()) - unmap_kernel_range(ea, size); 
- else - WARN_ON_ONCE(1); /* Should clean up */ - return err; - } - } - - return 0; -} diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index 11303e2fffb1..d39edbb07bd1 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1218,26 +1218,6 @@ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr) return 1; } -int radix__ioremap_range(unsigned long ea, phys_addr_t pa, unsigned long size, - pgprot_t prot, int nid) -{ - if (likely(slab_is_available())) { - int err = ioremap_page_range(ea, ea + size, pa, prot); - if (err) - unmap_kernel_range(ea, size); - return err; - } else { - unsigned long i; - - for (i = 0; i < size; i += PAGE_SIZE) { - int err = map_kernel_page(ea + i, pa + i, prot); - if (WARN_ON_ONCE(err)) /* Should clean up */ - return err; - } - return 0; - } -} - int __init arch_ioremap_p4d_supported(void) { return 0; diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c index 537c9148cea1..dc538d7f2467 100644 --- a/arch/powerpc/mm/ioremap.c +++ b/arch/powerpc/mm/ioremap.c @@ -76,21 +76,24 @@ void __iomem *ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long f } EXPORT_SYMBOL(ioremap_prot); -int __weak ioremap_range(unsigned long ea, phys_addr_t pa, unsigned long size, -pgprot_t prot, int nid) +static int ioremap_range(unsigned long ea, phys_addr_t pa, unsigned long size, +pgprot_t prot) { unsigned long i; + if (slab_is_available()) { + int err = ioremap_page_range(ea, ea + size, pa, prot); + + if (err) + unmap_kernel_range(ea, size); + return err; + } + for (i = 0; i < size; i += PAGE_SIZE) { int err = map_kernel_page(ea + i, pa + i, prot); - if (err) { - if (slab_is_available()) - unmap_kernel_range(ea, size); - else - WARN_ON_ONCE(1); /* Should clean up */ + if (WARN_ON_ONCE(err)) /* Should clean up */ return err; - } } return 0; @@ -165,7 +168,7 @@ void __iomem *__ioremap_caller(phys_addr_t addr, unsigned long size,
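The refactored helper keeps a single loop for the pre-slab case: map page
by page and bail on the first failure, since no unmap helper exists yet.
A standalone userspace model of that early path (the `fail_at` hook is
invented purely so the error case can be exercised; real
map_kernel_page() also takes a physical address and protection bits):

```c
#include <assert.h>

#define PAGE_SIZE 0x1000UL

static long fail_at = -1;	/* test hook: page index that fails to map */
static long pages_mapped;	/* running count of successfully mapped pages */

/* Stand-in for map_kernel_page(): maps one page, or fails when the
 * running count reaches fail_at. */
static int map_kernel_page(unsigned long ea)
{
	(void)ea;
	if (pages_mapped == fail_at)
		return -1;
	pages_mapped++;
	return 0;
}

/* Model of the pre-slab branch of the unified ioremap_range(): map
 * page by page and return the first error as-is -- the kernel WARNs
 * at that point because unmap_kernel_range() is not available yet
 * ("Should clean up"). */
static int ioremap_range_early(unsigned long ea, unsigned long size)
{
	unsigned long i;

	for (i = 0; i < size; i += PAGE_SIZE) {
		int err = map_kernel_page(ea + i);

		if (err)
			return err;
	}
	return 0;
}
```

Once slab is up, the real function instead takes the
ioremap_page_range()/unmap_kernel_range() branch, which is what lets the
radix-specific copy be deleted.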
[PATCH v1 08/10] powerpc/mm: move __ioremap_at() and __iounmap_at() into ioremap.c
Although __ioremap_at() and __iounmap_at() are specific to PPC64, let's
move them into ioremap.c as it wouldn't be worth creating an ioremap_64.c
only for those functions.

Signed-off-by: Christophe Leroy
---
 arch/powerpc/mm/ioremap.c    | 43 +++
 arch/powerpc/mm/pgtable_64.c | 42 --
 2 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index 57d742509cec..889ee656cf64 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -103,3 +103,46 @@ void iounmap(volatile void __iomem *token)
 	vunmap(addr);
 }
 EXPORT_SYMBOL(iounmap);
+
+#ifdef CONFIG_PPC64
+/**
+ * __ioremap_at - Low level function to establish the page tables
+ *                for an IO mapping
+ */
+void __iomem *__ioremap_at(phys_addr_t pa, void *ea, unsigned long size, pgprot_t prot)
+{
+	/* We don't support the 4K PFN hack with ioremap */
+	if (pgprot_val(prot) & H_PAGE_4K_PFN)
+		return NULL;
+
+	if ((ea + size) >= (void *)IOREMAP_END) {
+		pr_warn("Outside the supported range\n");
+		return NULL;
+	}
+
+	WARN_ON(pa & ~PAGE_MASK);
+	WARN_ON(((unsigned long)ea) & ~PAGE_MASK);
+	WARN_ON(size & ~PAGE_MASK);
+
+	if (ioremap_range((unsigned long)ea, pa, size, prot, NUMA_NO_NODE))
+		return NULL;
+
+	return (void __iomem *)ea;
+}
+EXPORT_SYMBOL(__ioremap_at);
+
+/**
+ * __iounmap_from - Low level function to tear down the page tables
+ *                  for an IO mapping. This is used for mappings that
+ *                  are manipulated manually, like partial unmapping of
+ *                  PCI IOs or ISA space.
+ */ +void __iounmap_at(void *ea, unsigned long size) +{ + WARN_ON(((unsigned long)ea) & ~PAGE_MASK); + WARN_ON(size & ~PAGE_MASK); + + unmap_kernel_range((unsigned long)ea, size); +} +EXPORT_SYMBOL(__iounmap_at); +#endif diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c index b50a53a0a42b..32220f7381d7 100644 --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@ -119,45 +119,6 @@ int __weak ioremap_range(unsigned long ea, phys_addr_t pa, unsigned long size, p return 0; } -/** - * __ioremap_at - Low level function to establish the page tables - *for an IO mapping - */ -void __iomem *__ioremap_at(phys_addr_t pa, void *ea, unsigned long size, pgprot_t prot) -{ - /* We don't support the 4K PFN hack with ioremap */ - if (pgprot_val(prot) & H_PAGE_4K_PFN) - return NULL; - - if ((ea + size) >= (void *)IOREMAP_END) { - pr_warn("Outside the supported range\n"); - return NULL; - } - - WARN_ON(pa & ~PAGE_MASK); - WARN_ON(((unsigned long)ea) & ~PAGE_MASK); - WARN_ON(size & ~PAGE_MASK); - - if (ioremap_range((unsigned long)ea, pa, size, prot, NUMA_NO_NODE)) - return NULL; - - return (void __iomem *)ea; -} - -/** - * __iounmap_from - Low level function to tear down the page tables - * for an IO mapping. This is used for mappings that - * are manipulated manually, like partial unmapping of - * PCI IOs or ISA space. - */ -void __iounmap_at(void *ea, unsigned long size) -{ - WARN_ON(((unsigned long)ea) & ~PAGE_MASK); - WARN_ON(size & ~PAGE_MASK); - - unmap_kernel_range((unsigned long)ea, size); -} - void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size, pgprot_t prot, void *caller) { @@ -201,9 +162,6 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size, return ret; } -EXPORT_SYMBOL(__ioremap_at); -EXPORT_SYMBOL(__iounmap_at); - #ifndef __PAGETABLE_PUD_FOLDED /* 4 level page table */ struct page *pgd_page(pgd_t pgd) -- 2.13.3
[PATCH v1 07/10] powerpc/mm: move iounmap() into ioremap.c and drop __iounmap()
On PPC64, iounmap() does nothing other than call __iounmap() and is the only user of __iounmap(). __iounmap() is almost identical to the PPC32 iounmap(). Let's define a common iounmap() and drop __iounmap(). Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/pgtable.h | 2 ++ arch/powerpc/include/asm/io.h| 5 - arch/powerpc/include/asm/nohash/32/pgtable.h | 2 ++ arch/powerpc/mm/ioremap.c| 31 arch/powerpc/mm/pgtable_32.c | 14 - arch/powerpc/mm/pgtable_64.c | 28 - 6 files changed, 35 insertions(+), 47 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index aa1bc5f8da90..af34554d19e8 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -165,6 +165,8 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot); #define IOREMAP_TOPKVIRT_TOP #endif +#define IOREMAP_BASE VMALLOC_START + /* * Just any arbitrary offset to the start of the vmalloc VM area: the * current 16MB value just means that there will be a 64MB "hole" after the diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h index 23e5d5d16c7e..02d6256fe1ea 100644 --- a/arch/powerpc/include/asm/io.h +++ b/arch/powerpc/include/asm/io.h @@ -712,9 +712,6 @@ static inline void iosync(void) * * __ioremap_caller is the same as above but takes an explicit caller * reference rather than using __builtin_return_address(0) * - * * __iounmap, is the low level implementation used by iounmap and cannot * - be hooked (but can be used by a hook on iounmap) - * */ extern void __iomem *ioremap(phys_addr_t address, unsigned long size); extern void __iomem *ioremap_prot(phys_addr_t address, unsigned long size, @@ -734,8 +731,6 @@ extern void __iomem *__ioremap(phys_addr_t, unsigned long size, extern void __iomem *__ioremap_caller(phys_addr_t, unsigned long size, pgprot_t prot, void *caller); -extern void __iounmap(volatile void __iomem *addr); - extern void __iomem
* __ioremap_at(phys_addr_t pa, void *ea, unsigned long size, pgprot_t prot); extern void __iounmap_at(void *ea, unsigned long size); diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index 7ce2a7c9fade..09f2739ab556 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -93,6 +93,8 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot); #define IOREMAP_TOPKVIRT_TOP #endif +#define IOREMAP_BASE VMALLOC_START + /* * Just any arbitrary offset to the start of the vmalloc VM area: the * current 16MB value just means that there will be a 64MB "hole" after the diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c index 0c23660522ca..57d742509cec 100644 --- a/arch/powerpc/mm/ioremap.c +++ b/arch/powerpc/mm/ioremap.c @@ -1,7 +1,10 @@ // SPDX-License-Identifier: GPL-2.0-or-later #include +#include +#include #include +#include unsigned long ioremap_bot; EXPORT_SYMBOL(ioremap_bot); @@ -72,3 +75,31 @@ void __iomem *ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long f return __ioremap_caller(addr, size, pte_pgprot(pte), caller); } EXPORT_SYMBOL(ioremap_prot); + +/* + * Unmap an IO region and remove it from vmalloc'd list. + * Access to IO memory should be serialized by driver. + */ +void iounmap(volatile void __iomem *token) +{ + void *addr; + + /* +* If mapped by BATs then there is nothing to do. 
+*/ + if (v_block_mapped((unsigned long)token)) + return; + + if (!slab_is_available()) + return; + + addr = (void *)((unsigned long __force)PCI_FIX_ADDR(token) & PAGE_MASK); + if (WARN_ON((unsigned long)addr < IOREMAP_BASE)) + return; + if ((unsigned long)addr >= ioremap_bot) { + pr_warn("Attempt to %s early bolted mapping at 0x%p\n", __func__, addr); + return; + } + vunmap(addr); +} +EXPORT_SYMBOL(iounmap); diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 7efdb1dee19b..4597f45e4dc6 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -110,20 +110,6 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, pgprot_t prot, void *call return (void __iomem *) (v + ((unsigned long)addr & ~PAGE_MASK)); } -void iounmap(volatile void __iomem *addr) -{ - /* -* If mapped by BATs then there is nothing to do. -* Calling vfree() generates a benign warning. -*/ - if (v_block_mapped((unsigned long)addr)) - return; - - if (addr > high_memory && (unsigned long) addr <
[PATCH v1 06/10] powerpc/mm: make ioremap_bot common to all
Drop multiple definitions of ioremap_bot and make one common to all subarches. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/book3s/32/pgtable.h | 2 -- arch/powerpc/include/asm/book3s/64/pgtable.h | 1 - arch/powerpc/include/asm/nohash/32/pgtable.h | 2 -- arch/powerpc/include/asm/pgtable.h | 2 ++ arch/powerpc/mm/ioremap.c| 3 +++ arch/powerpc/mm/mmu_decl.h | 1 - arch/powerpc/mm/nohash/tlb.c | 2 ++ arch/powerpc/mm/pgtable_32.c | 3 --- arch/powerpc/mm/pgtable_64.c | 3 --- 9 files changed, 7 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h index 838de59f6754..aa1bc5f8da90 100644 --- a/arch/powerpc/include/asm/book3s/32/pgtable.h +++ b/arch/powerpc/include/asm/book3s/32/pgtable.h @@ -201,8 +201,6 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot); #include #include -extern unsigned long ioremap_bot; - /* Bits to mask out from a PGD to get to the PUD page */ #define PGD_MASKED_BITS0 diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 8308f32e9782..11819e3c755e 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -289,7 +289,6 @@ extern unsigned long __kernel_io_end; #define KERN_IO_END __kernel_io_end extern struct page *vmemmap; -extern unsigned long ioremap_bot; extern unsigned long pci_io_base; #endif /* __ASSEMBLY__ */ diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h index 0284f8f5305f..7ce2a7c9fade 100644 --- a/arch/powerpc/include/asm/nohash/32/pgtable.h +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h @@ -11,8 +11,6 @@ #include/* For sub-arch specific PPC_PIN_SIZE */ #include -extern unsigned long ioremap_bot; - #ifdef CONFIG_44x extern int icache_44x_need_flush; #endif diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 
c58ba7963688..c54bb68c1354 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -68,6 +68,8 @@ extern pgd_t swapper_pg_dir[]; extern void paging_init(void); +extern unsigned long ioremap_bot; + /* * kern_addr_valid is intended to indicate whether an address is a valid * kernel address. Most 32-bit archs define it as always true (like this) diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c index a44d9e4c948a..0c23660522ca 100644 --- a/arch/powerpc/mm/ioremap.c +++ b/arch/powerpc/mm/ioremap.c @@ -3,6 +3,9 @@ #include #include +unsigned long ioremap_bot; +EXPORT_SYMBOL(ioremap_bot); + void __iomem *__ioremap(phys_addr_t addr, unsigned long size, unsigned long flags) { return __ioremap_caller(addr, size, __pgprot(flags), __builtin_return_address(0)); diff --git a/arch/powerpc/mm/mmu_decl.h b/arch/powerpc/mm/mmu_decl.h index 32c1a191c28a..6ee64d5e2824 100644 --- a/arch/powerpc/mm/mmu_decl.h +++ b/arch/powerpc/mm/mmu_decl.h @@ -108,7 +108,6 @@ extern u8 early_hash[]; #endif /* CONFIG_PPC32 */ -extern unsigned long ioremap_bot; extern unsigned long __max_low_memory; extern phys_addr_t __initial_memory_limit_addr; extern phys_addr_t total_memory; diff --git a/arch/powerpc/mm/nohash/tlb.c b/arch/powerpc/mm/nohash/tlb.c index d4acf6fa0596..350a54f70a37 100644 --- a/arch/powerpc/mm/nohash/tlb.c +++ b/arch/powerpc/mm/nohash/tlb.c @@ -704,6 +704,8 @@ static void __init early_init_mmu_global(void) * for use by the TLB miss code */ linear_map_top = memblock_end_of_DRAM(); + + ioremap_bot = IOREMAP_END; } static void __init early_mmu_set_memory_limit(void) diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 8126c2d1afbf..7efdb1dee19b 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -33,9 +33,6 @@ #include -unsigned long ioremap_bot; -EXPORT_SYMBOL(ioremap_bot);/* aka VMALLOC_END */ - extern char etext[], _stext[], _sinittext[], _einittext[]; void __iomem * diff --git 
a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c index 0f0b1e1ea5ab..d631659c8859 100644 --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@ -99,9 +99,6 @@ unsigned long __pte_frag_nr; EXPORT_SYMBOL(__pte_frag_nr); unsigned long __pte_frag_size_shift; EXPORT_SYMBOL(__pte_frag_size_shift); -unsigned long ioremap_bot; -#else /* !CONFIG_PPC_BOOK3S_64 */ -unsigned long ioremap_bot = IOREMAP_END; #endif int __weak ioremap_range(unsigned long ea, phys_addr_t pa, unsigned long size, pgprot_t prot, int nid) -- 2.13.3
[PATCH v1 05/10] powerpc/mm: Do early ioremaps from top to bottom on PPC64 too.
Until the vmalloc system is up and running, ioremap basically allocates addresses at the border of the IOREMAP area. On PPC32, addresses are allocated down from the top of the area while on PPC64, addresses are allocated up from the base of the area. On PPC32, the base of the vmalloc area is not known yet when ioremap() starts to be used, while the end of it is fixed. On PPC64, both the start and the end are already fixed when ioremap() starts being used. Changing PPC64 behaviour is the lightest change, so change PPC64 ioremap() to allocate addresses from the top as PPC32 does. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/book3s64/hash_utils.c| 2 +- arch/powerpc/mm/book3s64/radix_pgtable.c | 2 +- arch/powerpc/mm/pgtable_64.c | 18 +- 3 files changed, 11 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c index e6d471058597..0f954dc40346 100644 --- a/arch/powerpc/mm/book3s64/hash_utils.c +++ b/arch/powerpc/mm/book3s64/hash_utils.c @@ -1030,7 +1030,7 @@ void __init hash__early_init_mmu(void) __kernel_io_start = H_KERN_IO_START; __kernel_io_end = H_KERN_IO_END; vmemmap = (struct page *)H_VMEMMAP_START; - ioremap_bot = IOREMAP_BASE; + ioremap_bot = IOREMAP_END; #ifdef CONFIG_PCI pci_io_base = ISA_IO_BASE; diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c index b4ca9e95e678..11303e2fffb1 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -611,7 +611,7 @@ void __init radix__early_init_mmu(void) __kernel_io_start = RADIX_KERN_IO_START; __kernel_io_end = RADIX_KERN_IO_END; vmemmap = (struct page *)RADIX_VMEMMAP_START; - ioremap_bot = IOREMAP_BASE; + ioremap_bot = IOREMAP_END; #ifdef CONFIG_PCI pci_io_base = ISA_IO_BASE; diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c index 6fa2e969bf0e..0f0b1e1ea5ab 100644 --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@
-101,7 +101,7 @@ unsigned long __pte_frag_size_shift; EXPORT_SYMBOL(__pte_frag_size_shift); unsigned long ioremap_bot; #else /* !CONFIG_PPC_BOOK3S_64 */ -unsigned long ioremap_bot = IOREMAP_BASE; +unsigned long ioremap_bot = IOREMAP_END; #endif int __weak ioremap_range(unsigned long ea, phys_addr_t pa, unsigned long size, pgprot_t prot, int nid) @@ -169,11 +169,11 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size, /* * Choose an address to map it to. -* Once the imalloc system is running, we use it. +* Once the vmalloc system is running, we use it. * Before that, we map using addresses going -* up from ioremap_bot. imalloc will use -* the addresses from ioremap_bot through -* IMALLOC_END +* down from ioremap_bot. vmalloc will use +* the addresses from IOREMAP_BASE through +* ioremap_bot * */ paligned = addr & PAGE_MASK; @@ -186,7 +186,7 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size, struct vm_struct *area; area = __get_vm_area_caller(size, VM_IOREMAP, - ioremap_bot, IOREMAP_END, + IOREMAP_BASE, ioremap_bot, caller); if (area == NULL) return NULL; @@ -194,9 +194,9 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size, area->phys_addr = paligned; ret = __ioremap_at(paligned, area->addr, size, prot); } else { - ret = __ioremap_at(paligned, (void *)ioremap_bot, size, prot); + ret = __ioremap_at(paligned, (void *)ioremap_bot - size, size, prot); if (ret) - ioremap_bot += size; + ioremap_bot -= size; } if (ret) @@ -217,7 +217,7 @@ void __iounmap(volatile void __iomem *token) addr = (void *) ((unsigned long __force) PCI_FIX_ADDR(token) & PAGE_MASK); - if ((unsigned long)addr < ioremap_bot) { + if ((unsigned long)addr >= ioremap_bot) { printk(KERN_WARNING "Attempt to iounmap early bolted mapping" " at 0x%p\n", addr); return; -- 2.13.3
[PATCH v1 02/10] powerpc/mm: rework io-workaround invocation.
ppc_md.ioremap() is only used for the I/O workaround on the CELL platform, so the indirect function call can be avoided. This patch reworks the io-workaround and ioremap() functions to use a static key for the activation of the io-workaround. When CONFIG_PPC_IO_WORKAROUNDS or CONFIG_PPC_INDIRECT_MMIO is not selected, the I/O workaround ioremap() is compiled out and the static key is not used at all. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/io-workarounds.h | 19 +++ arch/powerpc/include/asm/machdep.h| 2 -- arch/powerpc/kernel/io-workarounds.c | 11 ++- arch/powerpc/mm/pgtable_64.c | 17 + 4 files changed, 34 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/include/asm/io-workarounds.h b/arch/powerpc/include/asm/io-workarounds.h index 01567ea4ceaf..ce337d17ac40 100644 --- a/arch/powerpc/include/asm/io-workarounds.h +++ b/arch/powerpc/include/asm/io-workarounds.h @@ -8,6 +8,7 @@ #ifndef _IO_WORKAROUNDS_H #define _IO_WORKAROUNDS_H +#ifdef CONFIG_PPC_IO_WORKAROUNDS #include #include @@ -32,4 +33,22 @@ extern int spiderpci_iowa_init(struct iowa_bus *, void *); #define SPIDER_PCI_DUMMY_READ 0x0810 #define SPIDER_PCI_DUMMY_READ_BASE 0x0814 +#endif + +#if defined(CONFIG_PPC_IO_WORKAROUNDS) && defined(CONFIG_PPC_INDIRECT_MMIO) +DECLARE_STATIC_KEY_FALSE(iowa_key); +static inline bool iowa_is_active(void) +{ + return static_branch_unlikely(&iowa_key); +} +#else +static inline bool iowa_is_active(void) +{ + return false; +} +#endif + +void __iomem *iowa_ioremap(phys_addr_t addr, unsigned long size, + pgprot_t prot, void *caller); + #endif /* _IO_WORKAROUNDS_H */ diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index 3370df4bdaa0..657ec893bdcb 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -31,8 +31,6 @@ struct pci_host_bridge; struct machdep_calls { char*name; #ifdef CONFIG_PPC64 - void __iomem * (*ioremap)(phys_addr_t addr, unsigned long size, - pgprot_t prot, void *caller); #ifdef CONFIG_PM
void(*iommu_save)(void); void(*iommu_restore)(void); diff --git a/arch/powerpc/kernel/io-workarounds.c b/arch/powerpc/kernel/io-workarounds.c index fbd2d0007c52..8b5b2aa70840 100644 --- a/arch/powerpc/kernel/io-workarounds.c +++ b/arch/powerpc/kernel/io-workarounds.c @@ -18,6 +18,7 @@ #include #include +DEFINE_STATIC_KEY_FALSE(iowa_key); #define IOWA_MAX_BUS 8 @@ -149,8 +150,8 @@ static const struct ppc_pci_io iowa_pci_io = { }; #ifdef CONFIG_PPC_INDIRECT_MMIO -static void __iomem *iowa_ioremap(phys_addr_t addr, unsigned long size, - pgprot_t prot, void *caller) +void __iomem *iowa_ioremap(phys_addr_t addr, unsigned long size, + pgprot_t prot, void *caller) { struct iowa_bus *bus; void __iomem *res = __ioremap_caller(addr, size, prot, caller); @@ -163,8 +164,6 @@ static void __iomem *iowa_ioremap(phys_addr_t addr, unsigned long size, } return res; } -#else /* CONFIG_PPC_INDIRECT_MMIO */ -#define iowa_ioremap NULL #endif /* !CONFIG_PPC_INDIRECT_MMIO */ /* Enable IO workaround */ @@ -175,7 +174,9 @@ static void io_workaround_init(void) if (io_workaround_inited) return; ppc_pci_io = iowa_pci_io; - ppc_md.ioremap = iowa_ioremap; +#ifdef CONFIG_PPC_INDIRECT_MMIO + static_branch_enable(&iowa_key); +#endif io_workaround_inited = 1; } diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c index 11eb90ea2d4f..194efc6f39fb 100644 --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -214,8 +215,8 @@ void __iomem * ioremap(phys_addr_t addr, unsigned long size) pgprot_t prot = pgprot_noncached(PAGE_KERNEL); void *caller = __builtin_return_address(0); - if (ppc_md.ioremap) - return ppc_md.ioremap(addr, size, prot, caller); + if (iowa_is_active()) + return iowa_ioremap(addr, size, prot, caller); return __ioremap_caller(addr, size, prot, caller); } @@ -224,8 +225,8 @@ void __iomem * ioremap_wc(phys_addr_t addr, unsigned long size) pgprot_t prot =
pgprot_noncached_wc(PAGE_KERNEL); void *caller = __builtin_return_address(0); - if (ppc_md.ioremap) - return ppc_md.ioremap(addr, size, prot, caller); + if (iowa_is_active()) + return iowa_ioremap(addr, size, prot, caller); return __ioremap_caller(addr, size, prot, caller); } @@ -234,8 +235,8 @@ void __iomem *ioremap_coherent(phys_addr_t addr, unsigned long size) pgprot_t prot = pgprot_cached(PAGE_KERNEL); void *caller =
[PATCH v1 01/10] powerpc/mm: drop ppc_md.iounmap()
ppc_md.iounmap() is never set, drop it. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/machdep.h | 2 -- arch/powerpc/mm/pgtable_64.c | 5 + 2 files changed, 1 insertion(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index c43d6eca9edd..3370df4bdaa0 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -33,8 +33,6 @@ struct machdep_calls { #ifdef CONFIG_PPC64 void __iomem * (*ioremap)(phys_addr_t addr, unsigned long size, pgprot_t prot, void *caller); - void(*iounmap)(volatile void __iomem *token); - #ifdef CONFIG_PM void(*iommu_save)(void); void(*iommu_restore)(void); diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c index 9ad59b733984..11eb90ea2d4f 100644 --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@ -285,10 +285,7 @@ void __iounmap(volatile void __iomem *token) void iounmap(volatile void __iomem *token) { - if (ppc_md.iounmap) - ppc_md.iounmap(token); - else - __iounmap(token); + __iounmap(token); } EXPORT_SYMBOL(ioremap); -- 2.13.3
[PATCH v1 03/10] powerpc/mm: move common 32/64 bits ioremap functions into ioremap.c
ioremap(), __ioremap(), ioremap_wc() and ioremap_coherent() are now identical on PPC32 and PPC64 as iowa_is_active() will always return false on PPC32. Move them into a new common location called ioremap.c. Although ioremap_wt() only exists on PPC32, move it into ioremap.c as well. As it is the only one specific to PPC32, it is not worth creating an ioremap_32.c file, and leaving it in pgtable_32.c would make it the only ioremap function in that file at the end of the series. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/Makefile | 2 +- arch/powerpc/mm/ioremap.c| 52 arch/powerpc/mm/pgtable_32.c | 43 arch/powerpc/mm/pgtable_64.c | 39 - 4 files changed, 53 insertions(+), 83 deletions(-) create mode 100644 arch/powerpc/mm/ioremap.c diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile index 0f499db315d6..29c682fe9144 100644 --- a/arch/powerpc/mm/Makefile +++ b/arch/powerpc/mm/Makefile @@ -7,7 +7,7 @@ ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC) obj-y := fault.o mem.o pgtable.o mmap.o \ init_$(BITS).o pgtable_$(BITS).o \ - pgtable-frag.o \ + pgtable-frag.o ioremap.o \ init-common.o mmu_context.o drmem.o obj-$(CONFIG_PPC_MMU_NOHASH) += nohash/ obj-$(CONFIG_PPC_BOOK3S_32)+= book3s32/ diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c new file mode 100644 index ..89479ee88344 --- /dev/null +++ b/arch/powerpc/mm/ioremap.c @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0-or-later + +#include +#include + +void __iomem *__ioremap(phys_addr_t addr, unsigned long size, unsigned long flags) +{ + return __ioremap_caller(addr, size, __pgprot(flags), __builtin_return_address(0)); +} +EXPORT_SYMBOL(__ioremap); + +void __iomem *ioremap(phys_addr_t addr, unsigned long size) +{ + pgprot_t prot = pgprot_noncached(PAGE_KERNEL); + void *caller = __builtin_return_address(0); + + if (iowa_is_active()) + return iowa_ioremap(addr, size, prot, caller); + return __ioremap_caller(addr, size, prot, caller); } +EXPORT_SYMBOL(ioremap); + +void __iomem
*ioremap_wc(phys_addr_t addr, unsigned long size) +{ + pgprot_t prot = pgprot_noncached_wc(PAGE_KERNEL); + void *caller = __builtin_return_address(0); + + if (iowa_is_active()) + return iowa_ioremap(addr, size, prot, caller); + return __ioremap_caller(addr, size, prot, caller); +} +EXPORT_SYMBOL(ioremap_wc); + +#ifdef CONFIG_PPC32 +void __iomem *ioremap_wt(phys_addr_t addr, unsigned long size) +{ + pgprot_t prot = pgprot_cached_wthru(PAGE_KERNEL); + + return __ioremap_caller(addr, size, prot, __builtin_return_address(0)); +} +EXPORT_SYMBOL(ioremap_wt); +#endif + +void __iomem *ioremap_coherent(phys_addr_t addr, unsigned long size) +{ + pgprot_t prot = pgprot_cached(PAGE_KERNEL); + void *caller = __builtin_return_address(0); + + if (iowa_is_active()) + return iowa_ioremap(addr, size, prot, caller); + return __ioremap_caller(addr, size, prot, caller); +} diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 35cb96cfc258..1999ec11706d 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -39,42 +39,6 @@ EXPORT_SYMBOL(ioremap_bot); /* aka VMALLOC_END */ extern char etext[], _stext[], _sinittext[], _einittext[]; void __iomem * -ioremap(phys_addr_t addr, unsigned long size) -{ - pgprot_t prot = pgprot_noncached(PAGE_KERNEL); - - return __ioremap_caller(addr, size, prot, __builtin_return_address(0)); -} -EXPORT_SYMBOL(ioremap); - -void __iomem * -ioremap_wc(phys_addr_t addr, unsigned long size) -{ - pgprot_t prot = pgprot_noncached_wc(PAGE_KERNEL); - - return __ioremap_caller(addr, size, prot, __builtin_return_address(0)); -} -EXPORT_SYMBOL(ioremap_wc); - -void __iomem * -ioremap_wt(phys_addr_t addr, unsigned long size) -{ - pgprot_t prot = pgprot_cached_wthru(PAGE_KERNEL); - - return __ioremap_caller(addr, size, prot, __builtin_return_address(0)); -} -EXPORT_SYMBOL(ioremap_wt); - -void __iomem * -ioremap_coherent(phys_addr_t addr, unsigned long size) -{ - pgprot_t prot = pgprot_cached(PAGE_KERNEL); - - return 
__ioremap_caller(addr, size, prot, __builtin_return_address(0)); -} -EXPORT_SYMBOL(ioremap_coherent); - -void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags) { pte_t pte = __pte(flags); @@ -92,12 +56,6 @@ ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags) EXPORT_SYMBOL(ioremap_prot); void __iomem * -__ioremap(phys_addr_t addr, unsigned long size, unsigned long flags) -{ - return __ioremap_caller(addr, size, __pgprot(flags),
[PATCH v1 04/10] powerpc/mm: move ioremap_prot() into ioremap.c
Both ioremap_prot() implementations are identical; move them into ioremap.c. Signed-off-by: Christophe Leroy --- arch/powerpc/mm/ioremap.c| 19 +++ arch/powerpc/mm/pgtable_32.c | 17 - arch/powerpc/mm/pgtable_64.c | 24 3 files changed, 19 insertions(+), 41 deletions(-) diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c index 89479ee88344..a44d9e4c948a 100644 --- a/arch/powerpc/mm/ioremap.c +++ b/arch/powerpc/mm/ioremap.c @@ -50,3 +50,22 @@ void __iomem *ioremap_coherent(phys_addr_t addr, unsigned long size) return iowa_ioremap(addr, size, prot, caller); return __ioremap_caller(addr, size, prot, caller); } + +void __iomem *ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags) +{ + pte_t pte = __pte(flags); + void *caller = __builtin_return_address(0); + + /* writeable implies dirty for kernel addresses */ + if (pte_write(pte)) + pte = pte_mkdirty(pte); + + /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */ + pte = pte_exprotect(pte); + pte = pte_mkprivileged(pte); + + if (iowa_is_active()) + return iowa_ioremap(addr, size, pte_pgprot(pte), caller); + return __ioremap_caller(addr, size, pte_pgprot(pte), caller); +} +EXPORT_SYMBOL(ioremap_prot); diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index 1999ec11706d..8126c2d1afbf 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -39,23 +39,6 @@ EXPORT_SYMBOL(ioremap_bot); /* aka VMALLOC_END */ extern char etext[], _stext[], _sinittext[], _einittext[]; void __iomem * -ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags) -{ - pte_t pte = __pte(flags); - - /* writeable implies dirty for kernel addresses */ - if (pte_write(pte)) - pte = pte_mkdirty(pte); - - /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */ - pte = pte_exprotect(pte); - pte = pte_mkprivileged(pte); - - return __ioremap_caller(addr, size, pte_pgprot(pte), __builtin_return_address(0)); -} -EXPORT_SYMBOL(ioremap_prot); - -void __iomem *
__ioremap_caller(phys_addr_t addr, unsigned long size, pgprot_t prot, void *caller) { unsigned long v, i; diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c index 3ad921ac4862..6fa2e969bf0e 100644 --- a/arch/powerpc/mm/pgtable_64.c +++ b/arch/powerpc/mm/pgtable_64.c @@ -204,29 +204,6 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size, return ret; } -void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size, -unsigned long flags) -{ - pte_t pte = __pte(flags); - void *caller = __builtin_return_address(0); - - /* writeable implies dirty for kernel addresses */ - if (pte_write(pte)) - pte = pte_mkdirty(pte); - - /* we don't want to let _PAGE_EXEC leak out */ - pte = pte_exprotect(pte); - /* -* Force kernel mapping. -*/ - pte = pte_mkprivileged(pte); - - if (iowa_is_active()) - return iowa_ioremap(addr, size, pte_pgprot(pte), caller); - return __ioremap_caller(addr, size, pte_pgprot(pte), caller); -} - - /* * Unmap an IO region and remove it from imalloc'd list. * Access to IO memory should be serialized by driver. @@ -253,7 +230,6 @@ void iounmap(volatile void __iomem *token) __iounmap(token); } -EXPORT_SYMBOL(ioremap_prot); EXPORT_SYMBOL(__ioremap_at); EXPORT_SYMBOL(iounmap); EXPORT_SYMBOL(__iounmap); -- 2.13.3
Re: [PATCH v2 2/3] powerpc/rtas: allow rescheduling while changing cpu states
Gautham R Shenoy writes: > On Sat, Aug 3, 2019 at 1:03 AM Nathan Lynch wrote: >> >> rtas_cpu_state_change_mask() potentially operates on scores of cpus, >> so explicitly allow rescheduling in the loop body. >> > > Are we seeing softlockups/rcu stalls while running this ? I have not seen a report yet, but since the loop is bound only by the number of processors in the LPAR I suspect it's only a matter of time. > Reviewed-by: Gautham R. Shenoy Thanks!
Re: [PATCH v2 2/3] powerpc/rtas: allow rescheduling while changing cpu states
On Sat, Aug 3, 2019 at 1:03 AM Nathan Lynch wrote: > > rtas_cpu_state_change_mask() potentially operates on scores of cpus, > so explicitly allow rescheduling in the loop body. > Are we seeing softlockups/rcu stalls while running this ? > Signed-off-by: Nathan Lynch Reviewed-by: Gautham R. Shenoy > --- > arch/powerpc/kernel/rtas.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index 05824eb4323b..b7ca2fde68a9 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -16,6 +16,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -898,6 +899,7 @@ static int rtas_cpu_state_change_mask(enum rtas_cpu_state > state, > cpumask_clear_cpu(cpu, cpus); > } > } > + cond_resched(); > } > > return ret; > -- > 2.20.1 > -- Thanks and Regards gautham.
[RFC PATCH] bpf: handle 32-bit zext during constant blinding
Since BPF constant blinding is performed after the verifier pass, there are certain ALU32 instructions inserted which don't have a corresponding zext instruction inserted after. This is causing a kernel oops on powerpc and can be reproduced by running 'test_cgroup_storage' with bpf_jit_harden=2. Fix this by emitting BPF_ZEXT during constant blinding if prog->aux->verifier_zext is set. Fixes: a4b1d3c1ddf6cb ("bpf: verifier: insert zero extension according to analysis result") Reported-by: Michael Ellerman Signed-off-by: Naveen N. Rao --- This approach (the location where zext is being introduced below, in particular) works for powerpc, but I am not entirely sure if this is sufficient for other architectures as well. This is broken on v5.3-rc4. - Naveen kernel/bpf/core.c | 14 -- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 8191a7db2777..d84146e6fd9e 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -890,7 +890,8 @@ int bpf_jit_get_func_addr(const struct bpf_prog *prog, static int bpf_jit_blind_insn(const struct bpf_insn *from, const struct bpf_insn *aux, - struct bpf_insn *to_buff) + struct bpf_insn *to_buff, + bool emit_zext) { struct bpf_insn *to = to_buff; u32 imm_rnd = get_random_int(); @@ -939,6 +940,8 @@ static int bpf_jit_blind_insn(const struct bpf_insn *from, *to++ = BPF_ALU32_IMM(BPF_MOV, BPF_REG_AX, imm_rnd ^ from->imm); *to++ = BPF_ALU32_IMM(BPF_XOR, BPF_REG_AX, imm_rnd); *to++ = BPF_ALU32_REG(from->code, from->dst_reg, BPF_REG_AX); + if (emit_zext) + *to++ = BPF_ZEXT_REG(from->dst_reg); break; case BPF_ALU64 | BPF_ADD | BPF_K: @@ -992,6 +995,10 @@ static int bpf_jit_blind_insn(const struct bpf_insn *from, off -= 2; *to++ = BPF_ALU32_IMM(BPF_MOV, BPF_REG_AX, imm_rnd ^ from->imm); *to++ = BPF_ALU32_IMM(BPF_XOR, BPF_REG_AX, imm_rnd); + if (emit_zext) { + *to++ = BPF_ZEXT_REG(BPF_REG_AX); + off--; + } *to++ = BPF_JMP32_REG(from->code, from->dst_reg, BPF_REG_AX, off); break; @@ -1005,6 
+1012,8 @@ static int bpf_jit_blind_insn(const struct bpf_insn *from, case 0: /* Part 2 of BPF_LD | BPF_IMM | BPF_DW. */ *to++ = BPF_ALU32_IMM(BPF_MOV, BPF_REG_AX, imm_rnd ^ aux[0].imm); *to++ = BPF_ALU32_IMM(BPF_XOR, BPF_REG_AX, imm_rnd); + if (emit_zext) + *to++ = BPF_ZEXT_REG(BPF_REG_AX); *to++ = BPF_ALU64_REG(BPF_OR, aux[0].dst_reg, BPF_REG_AX); break; @@ -1088,7 +1097,8 @@ struct bpf_prog *bpf_jit_blind_constants(struct bpf_prog *prog) insn[1].code == 0) memcpy(aux, insn, sizeof(aux)); - rewritten = bpf_jit_blind_insn(insn, aux, insn_buff); + rewritten = bpf_jit_blind_insn(insn, aux, insn_buff, + clone->aux->verifier_zext); if (!rewritten) continue; -- 2.22.0
Re: [PATCH v2 1/3] powerpc/rtas: use device model APIs and serialization during LPM
Hello Nathan, On Sat, Aug 3, 2019 at 1:06 AM Nathan Lynch wrote: > > The LPAR migration implementation and userspace-initiated cpu hotplug > can interleave their executions like so: > > 1. Set cpu 7 offline via sysfs. > > 2. Begin a partition migration, whose implementation requires the OS >to ensure all present cpus are online; cpu 7 is onlined: > > rtas_ibm_suspend_me -> rtas_online_cpus_mask -> cpu_up > >This sets cpu 7 online in all respects except for the cpu's >corresponding struct device; dev->offline remains true. > > 3. Set cpu 7 online via sysfs. _cpu_up() determines that cpu 7 is >already online and returns success. The driver core (device_online) >sets dev->offline = false. > > 4. The migration completes and restores cpu 7 to offline state: > > rtas_ibm_suspend_me -> rtas_offline_cpus_mask -> cpu_down > > This leaves cpu7 in a state where the driver core considers the cpu > device online, but in all other respects it is offline and > unused. Attempts to online the cpu via sysfs appear to succeed but the > driver core actually does not pass the request to the lower-level > cpuhp support code. This makes the cpu unusable until the cpu device > is manually set offline and then online again via sysfs. > > Instead of directly calling cpu_up/cpu_down, the migration code should > use the higher-level device core APIs to maintain consistent state and > serialize operations. > > Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to > migration/hibernation") > Signed-off-by: Nathan Lynch Looks good to me. This locking scheme makes the code consistent with dlpar_cpu() which also uses the high-level device APIs. Reviewed-by: Gautham R. 
Shenoy > --- > arch/powerpc/kernel/rtas.c | 11 --- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c > index 5faf0a64c92b..05824eb4323b 100644 > --- a/arch/powerpc/kernel/rtas.c > +++ b/arch/powerpc/kernel/rtas.c > @@ -871,15 +871,17 @@ static int rtas_cpu_state_change_mask(enum > rtas_cpu_state state, > return 0; > > for_each_cpu(cpu, cpus) { > + struct device *dev = get_cpu_device(cpu); > + > switch (state) { > case DOWN: > - cpuret = cpu_down(cpu); > + cpuret = device_offline(dev); > break; > case UP: > - cpuret = cpu_up(cpu); > + cpuret = device_online(dev); > break; > } > - if (cpuret) { > + if (cpuret < 0) { > pr_debug("%s: cpu_%s for cpu#%d returned %d.\n", > __func__, > ((state == UP) ? "up" : "down"), > @@ -968,6 +970,8 @@ int rtas_ibm_suspend_me(u64 handle) > data.token = rtas_token("ibm,suspend-me"); > data.complete = > > + lock_device_hotplug(); > + > /* All present CPUs must be online */ > cpumask_andnot(offline_mask, cpu_present_mask, cpu_online_mask); > cpuret = rtas_online_cpus_mask(offline_mask); > @@ -1006,6 +1010,7 @@ int rtas_ibm_suspend_me(u64 handle) > __func__); > > out: > + unlock_device_hotplug(); > free_cpumask_var(offline_mask); > return atomic_read(); > } > -- > 2.20.1 > -- Thanks and Regards gautham.
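The interleaving in the changelog above can be replayed with a toy model. Everything below is illustrative — the struct and helpers are stand-ins for the real cpuhp and driver-core state, not kernel code — but it shows why mixing cpu_up()/cpu_down() with device_online()/device_offline() leaves dev->offline inconsistent, and why routing everything through the device APIs keeps the two views in sync:

```c
/* Toy model: the driver core tracks a cpu device's offline state
 * separately from the low-level cpuhp "online" state. */
#include <stdbool.h>

struct toy_cpu {
	bool hp_online;   /* low-level cpuhp state (cpu_up/cpu_down) */
	bool dev_online;  /* driver-core state (device_online/offline) */
};

/* sysfs path: goes through the driver core, keeps both in sync */
static void toy_device_online(struct toy_cpu *c)
{
	c->hp_online = true;
	c->dev_online = true;
}

static void toy_device_offline(struct toy_cpu *c)
{
	c->hp_online = false;
	c->dev_online = false;
}

/* old migration path: bypasses the driver core entirely */
static void toy_cpu_up(struct toy_cpu *c)   { c->hp_online = true; }
static void toy_cpu_down(struct toy_cpu *c) { c->hp_online = false; }

/* Replays steps 1-4 from the changelog; returns true if the two views
 * of the cpu's state still agree at the end. */
static bool toy_states_agree(bool use_device_api)
{
	struct toy_cpu cpu7 = { .hp_online = true, .dev_online = true };

	toy_device_offline(&cpu7);		/* 1. offline via sysfs */
	if (use_device_api)			/* 2. migration onlines it */
		toy_device_online(&cpu7);
	else
		toy_cpu_up(&cpu7);
	toy_device_online(&cpu7);		/* 3. sysfs online "succeeds" */
	if (use_device_api)			/* 4. migration offlines it */
		toy_device_offline(&cpu7);
	else
		toy_cpu_down(&cpu7);

	return cpu7.hp_online == cpu7.dev_online;
}
```

With the old direct cpu_up()/cpu_down() calls the two states end up disagreeing (cpuhp offline, dev->offline false), which matches the stuck state described in step 4; with the device APIs, as in the patch, they stay consistent.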
[REGRESSION] Boot failure with DEBUG_PAGEALLOC on Wii, after PPC32 KASAN patches
Hi, I noticed that my Nintendo Wii doesn't boot with wii_defconfig plus CONFIG_DEBUG_PAGEALLOC=y and CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y on recent kernels. I get a splash like this one: [0.022245] BUG: Unable to handle kernel data access at 0x6601 [0.025172] Faulting instruction address: 0xc01afa48 [0.027522] Oops: Kernel access of bad area, sig: 11 [#1] [0.030076] BE PAGE_SIZE=4K MMU=Hash PREEMPT DEBUG_PAGEALLOC wii [0.032917] Modules linked in: [0.034368] CPU: 0 PID: 0 Comm: swapper Not tainted 5.1.0-rc3-wii-00151-g9a634f40158a #1337 [0.038318] NIP: c01afa48 LR: c0195fd0 CTR: [0.040707] REGS: c0c15e78 TRAP: 0300 Not tainted (5.1.0-rc3-wii-00151-g9a634f40158a) [0.044531] MSR: 9032 CR: 84000844 XER: [0.047708] DAR: 6601 DSISR: 4000 [0.047708] GPR00: c0919998 c0c15f30 c0bad460 c0bad434 01010101 [0.047708] GPR08: 0002 0001 0002 0110 44000842 7b67efdb b3a9f2fa 7763f327 [0.047708] GPR16: f5bff97f 797ebc55 3aafa378 e76bacd3 af931fb0 013de444 00d009b0 [0.047708] GPR24: c0951504 c0c3 d3efdcc0 c0951504 c0951500 c0878fe0 c0878fe0 [0.065470] NIP [c01afa48] fs_context_for_mount+0x8/0x1c [0.067988] LR [c0195fd0] vfs_kern_mount.part.6+0x24/0xb0 [0.070540] Call Trace: [0.071699] [c0c15f40] [c019404c] get_fs_type+0x98/0x14c [0.074214] [c0c15f60] [c0919998] mnt_init+0x16c/0x264 [0.076645] [c0c15f90] [c0919594] vfs_caches_init+0x7c/0x94 [0.079283] [c0c15fb0] [c0900c34] start_kernel+0x41c/0x480 [0.081878] [c0c15ff0] [346c] 0x346c [0.083731] Instruction dump: [0.085135] 7d005028 31080001 7d00512d 40a2fff4 2f9a 419e000c 387a0054 48195e99 [0.088805] 935f000c 4bfffef4 9421fff0 7c852378 <80066601> 00725100 3880 38210010 [0.092568] ---[ end trace 7373e1c0f977bdb3 ]--- [0.094750] [1.083137] Kernel panic - not syncing: Attempted to kill the idle task! (Without CONFIG_DEBUG_PAGEALLOC I haven't noticed any problems.) 
'git bisect' says: 72f208c6a8f7bc78ef5248babd9e6ed6302bd2a0 is the first bad commit commit 72f208c6a8f7bc78ef5248babd9e6ed6302bd2a0 Author: Christophe Leroy Date: Fri Apr 26 16:23:35 2019 + powerpc/32s: move hash code patching out of MMU_init_hw() For KASAN, hash table handling will be activated early for accessing to KASAN shadow areas. In order to avoid any modification of the hash functions while they are still used with the early hash table, the code patching is moved out of MMU_init_hw() and put close to the big-bang switch to the final hash table. Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman I can revert this commit, and then 5.3-rc2 (plus a patchset adding a serial driver) boots again. Christophe, is there anything I should test in order to figure out how to fix this properly? Thanks, Jonathan Neuschäfer
[Bug 204371] BUG kmalloc-4k (Tainted: G W ): Object padding overwritten
https://bugzilla.kernel.org/show_bug.cgi?id=204371 Christophe Leroy (christophe.le...@c-s.fr) changed: What|Removed |Added CC||christophe.le...@c-s.fr --- Comment #16 from Christophe Leroy (christophe.le...@c-s.fr) --- Interesting. I see in that commit that in fs/btrfs/free-space-cache.c, copy_page() is done using entry->bitmap. entry->bitmap is allocated with kmalloc() so there is a possibility that entry->bitmap is not page aligned. copy_page() in arch/powerpc/kernel/misc_32.S assumes that source and destination are aligned on cache lines at least. -- You are receiving this mail because: You are on the CC list for the bug.
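The alignment point made above can be sketched in userspace: a generic heap allocation such as kmalloc() only guarantees the allocator's minimum alignment, so a bitmap buffer handed to a copy_page() that assumes cache-line (or page) alignment can legitimately be misaligned. The macro name and constants here are illustrative, not the kernel's:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	4096u
#define TOY_CACHELINE	32u	/* a typical L1 line size on 32-bit PowerPC */

/* same idea as the kernel's IS_ALIGNED(): true if p is a multiple of a
 * (a must be a power of two) */
static bool toy_is_aligned(const void *p, unsigned long a)
{
	return ((uintptr_t)p & (a - 1)) == 0;
}
```

A page-aligned pointer passes both checks; a pointer a few words into the same page fails them, which is exactly the situation copy_page() in misc_32.S cannot handle.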
Re: [PATCH v4 13/25] powernv/fadump: support copying multiple kernel memory regions
On 2019-07-16 17:03:30 Tue, Hari Bathini wrote: > Firmware uses 32-bit field for region size while copying/backing-up > memory during MPIPL. So, the maximum copy size for a region would > be a page less than 4GB (aligned to pagesize) but FADump capture > kernel usually needs more memory than that to be preserved to avoid > running into out of memory errors. > > So, request firmware to copy multiple kernel memory regions instead > of just one (which worked fine for pseries as 64-bit field was used > for size there). With support to copy multiple kernel memory regions, > also handle holes in the memory area to be preserved. Support as many > as 128 kernel memory regions. This allows having an adequate FADump > capture kernel size for different scenarios. Can you split this patch into two? One for handling holes in boot memory and the other for handling the 4GB region size? That would make the changes easier to review. Thanks, -Mahesh. > > Signed-off-by: Hari Bathini > --- > arch/powerpc/kernel/fadump-common.c | 15 ++ > arch/powerpc/kernel/fadump-common.h | 16 ++ > arch/powerpc/kernel/fadump.c | 173 > ++ > arch/powerpc/platforms/powernv/opal-fadump.c | 25 +++- > arch/powerpc/platforms/powernv/opal-fadump.h |5 - > arch/powerpc/platforms/pseries/rtas-fadump.c | 12 ++ > arch/powerpc/platforms/pseries/rtas-fadump.h |5 + > 7 files changed, 211 insertions(+), 40 deletions(-) >
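The multi-region splitting under review here can be sketched in a few lines. This is a simplified model of the idea only — fixed-size chunking with toy limits, no page alignment, and made-up names — not the actual fadump code:

```c
/* Split one contiguous memory area into several copy regions because
 * the firmware interface carries a 32-bit size per region. */
#include <stdint.h>

struct toy_region { uint64_t src, size; };

/* Splits [start, start + len) into at most max_rgns regions of at most
 * max_sz bytes each; returns the number of regions used, or -1 if the
 * area does not fit in the region table. */
static int toy_split_area(uint64_t start, uint64_t len, uint64_t max_sz,
			  struct toy_region *rgn, int max_rgns)
{
	int n = 0;

	while (len) {
		uint64_t sz = len < max_sz ? len : max_sz;

		if (n == max_rgns)
			return -1;
		rgn[n].src = start;
		rgn[n].size = sz;
		n++;
		start += sz;
		len -= sz;
	}
	return n;
}

/* convenience wrapper for quick checks */
static int toy_split_count(uint64_t len, uint64_t max_sz)
{
	struct toy_region rgn[128];

	return toy_split_area(0, len, max_sz, rgn, 128);
}
```

In the real patch the per-region cap would be a page less than 4GB (page-aligned) and the table holds up to 128 entries; holes in the area to be preserved simply become separate splits over each contiguous piece.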
Re: [PATCH v4 12/25] powernv/fadump: define register/un-register callback functions
On 2019-07-16 17:03:23 Tue, Hari Bathini wrote: > Make OPAL calls to register and un-register with firmware for MPIPL. > > Signed-off-by: Hari Bathini > --- > arch/powerpc/platforms/powernv/opal-fadump.c | 71 > +- > 1 file changed, 69 insertions(+), 2 deletions(-) > [...] > @@ -88,12 +104,63 @@ static int opal_fadump_setup_kernel_metadata(struct > fw_dump *fadump_conf) > > static int opal_fadump_register_fadump(struct fw_dump *fadump_conf) > { > - return -EIO; > + int i, err = -EIO; > + s64 rc; > + > + for (i = 0; i < opal_fdm->region_cnt; i++) { > + rc = opal_mpipl_update(OPAL_MPIPL_ADD_RANGE, > +opal_fdm->rgn[i].src, > +opal_fdm->rgn[i].dest, > +opal_fdm->rgn[i].size); > + if (rc != OPAL_SUCCESS) You may want to remove ranges which has been added so far on error and reset opal_fdm->registered_regions. > + break; > + > + opal_fdm->registered_regions++; > + } > + > + switch (rc) { > + case OPAL_SUCCESS: > + pr_info("Registration is successful!\n"); > + fadump_conf->dump_registered = 1; > + err = 0; > + break; > + case OPAL_UNSUPPORTED: > + pr_err("Support not available.\n"); > + fadump_conf->fadump_supported = 0; > + fadump_conf->fadump_enabled = 0; > + break; > + case OPAL_INTERNAL_ERROR: > + pr_err("Failed to register. Hardware Error(%lld).\n", rc); > + break; > + case OPAL_PARAMETER: > + pr_err("Failed to register. Parameter Error(%lld).\n", rc); > + break; > + case OPAL_PERMISSION: You may want to remove this check. With latest opal mpipl patches opal_mpipl_update() no more returns OPAL_PERMISSION. Even if opal does, we can not say fadump already registered just by looking at return status of single entry addition. Thanks, -Mahesh. > + pr_err("Already registered!\n"); > + fadump_conf->dump_registered = 1; > + err = -EEXIST; > + break; > + default: > + pr_err("Failed to register. Unknown Error(%lld).\n", rc); > + break; > + } > + > + return err; > }
Re: [PATCH v2 2/3] KVM: PPC: Book3S HV: Don't push XIVE context when not using XIVE device
On 13/08/2019 12:01, Paul Mackerras wrote: > At present, when running a guest on POWER9 using HV KVM but not using > an in-kernel interrupt controller (XICS or XIVE), for example if QEMU > is run with the kernel_irqchip=off option, the guest entry code goes > ahead and tries to load the guest context into the XIVE hardware, even > though no context has been set up. > > To fix this, we check that the "CAM word" is non-zero before pushing > it to the hardware. The CAM word is initialized to a non-zero value > in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(), > and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu. If a "CAM word" is defined, it means the vCPU (VP) was enabled at the XIVE HW level. So this is the criteria to consider that a vCPU needs to update (push) its XIVE thread interrupt context when scheduled to run. Reviewed-by: Cédric Le Goater Thanks, C. > > Cc: sta...@vger.kernel.org # v4.11+ > Reported-by: Cédric Le Goater > Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt > controller") > Signed-off-by: Paul Mackerras > --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 ++ > arch/powerpc/kvm/book3s_xive.c | 11 ++- > arch/powerpc/kvm/book3s_xive_native.c | 3 +++ > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index 2e7e788..07181d0 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -942,6 +942,8 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300) > ld r11, VCPU_XIVE_SAVED_STATE(r4) > li r9, TM_QW1_OS > lwz r8, VCPU_XIVE_CAM_WORD(r4) > + cmpwi r8, 0 > + beq no_xive > li r7, TM_QW1_OS + TM_WORD2 > mfmsr r0 > andi. r0, r0, MSR_DR /* in real mode? 
*/ > diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c > index 09f838a..586867e 100644 > --- a/arch/powerpc/kvm/book3s_xive.c > +++ b/arch/powerpc/kvm/book3s_xive.c > @@ -67,8 +67,14 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) > void __iomem *tima = local_paca->kvm_hstate.xive_tima_virt; > u64 pq; > > - if (!tima) > + /* > + * Nothing to do if the platform doesn't have a XIVE > + * or this vCPU doesn't have its own XIVE context > + * (e.g. because it's not using an in-kernel interrupt controller). > + */ > + if (!tima || !vcpu->arch.xive_cam_word) > return; > + > eieio(); > __raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS); > __raw_writel(vcpu->arch.xive_cam_word, tima + TM_QW1_OS + TM_WORD2); > @@ -1146,6 +1152,9 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu) > /* Disable the VP */ > xive_native_disable_vp(xc->vp_id); > > + /* Clear the cam word so guest entry won't try to push context */ > + vcpu->arch.xive_cam_word = 0; > + > /* Free the queues */ > for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) { > struct xive_q *q = >queues[i]; > diff --git a/arch/powerpc/kvm/book3s_xive_native.c > b/arch/powerpc/kvm/book3s_xive_native.c > index 368427f..11b91b4 100644 > --- a/arch/powerpc/kvm/book3s_xive_native.c > +++ b/arch/powerpc/kvm/book3s_xive_native.c > @@ -81,6 +81,9 @@ void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) > /* Disable the VP */ > xive_native_disable_vp(xc->vp_id); > > + /* Clear the cam word so guest entry won't try to push context */ > + vcpu->arch.xive_cam_word = 0; > + > /* Free the queues */ > for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) { > kvmppc_xive_native_cleanup_queue(vcpu, i); >
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479 Erhard F. (erhar...@mailbox.org) changed: What|Removed |Added Attachment #284271|0 |1 is obsolete|| --- Comment #21 from Erhard F. (erhar...@mailbox.org) --- Created attachment 284361 --> https://bugzilla.kernel.org/attachment.cgi?id=284361=edit kernel .config (5.3-rc4, PowerMac G4 DP) -- You are receiving this mail because: You are on the CC list for the bug.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479 --- Comment #20 from Erhard F. (erhar...@mailbox.org) --- (In reply to Christophe Leroy from comment #18) > Two possibilities, either the value in .rodata.cst16 is wrong or the stack > gets corrupted. > > Maybe you could try disabling KASAN in lib/raid6/Makefile for altivec8.o ? > Or maybe for the entire lib/raid6/ directory, just to see what happens ? Disabled KASAN with KASAN_SANITIZE := n in lib/raid6/Makefile. As you can see in my latest dmesg, the G4 continues booting without further issues. If btrfs gets loaded it still fails with KASAN (will update bug #204397). Another funny issue. Mounting my nfs share works via: modprobe nfs mount /media/distanthome If I mount it without modprobing nfs beforehand I get: [...] [ 66.271748] == [ 66.272076] BUG: KASAN: global-out-of-bounds in _copy_to_iter+0x3d4/0x5a8 [ 66.272331] Write of size 4096 at addr f1c27000 by task modprobe/312 [ 66.272598] CPU: 0 PID: 312 Comm: modprobe Tainted: GW 5.3.0-rc4+ #1 [ 66.272883] Call Trace: [ 66.272964] [e100b848] [c075026c] dump_stack+0xb0/0x10c (unreliable) [ 66.273211] [e100b878] [c02334a8] print_address_description+0x80/0x45c [ 66.273456] [e100b908] [c0233128] __kasan_report+0x140/0x188 [ 66.273667] [e100b948] [c0233fbc] check_memory_region+0x28/0x184 [ 66.273889] [e100b958] [c023206c] memcpy+0x48/0x74 [ 66.274061] [e100b978] [c044342c] _copy_to_iter+0x3d4/0x5a8 [ 66.274265] [e100baa8] [c04437a8] copy_page_to_iter+0x90/0x550 [ 66.274482] [e100bb08] [c01b6898] generic_file_read_iter+0x5c8/0x7bc [ 66.274720] [e100bb78] [c0249034] __vfs_read+0x1b0/0x1f4 [ 66.274912] [e100bca8] [c0249134] vfs_read+0xbc/0x124 [ 66.275094] [e100bcd8] [c02491f0] kernel_read+0x54/0x70 [ 66.275284] [e100bd08] [c02535c8] kernel_read_file+0x240/0x358 [ 66.275499] [e100bdb8] [c02537cc] kernel_read_file_from_fd+0x54/0x74 [ 66.275737] [e100bdf8] [c01068ac] sys_finit_module+0xd8/0x140 [ 66.275949] [e100bf38] [c001a274] ret_from_syscall+0x0/0x34 [ 66.276152] --- 
interrupt: c01 at 0xa602c4 LR = 0xbe87c4 [ 66.276417] Memory state around the buggy address: [ 66.276588] f1c27a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 66.276824] f1c27a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 66.277060] >f1c27b00: 00 00 00 00 00 00 00 00 05 fa fa fa fa fa fa fa [ 66.277293]^ [ 66.277453] f1c27b80: 07 fa fa fa fa fa fa fa 00 03 fa fa fa fa fa fa [ 66.277688] f1c27c00: 04 fa fa fa fa fa fa fa 00 06 fa fa fa fa fa fa [ 66.277920] == [ 66.428224] RPC: Registered named UNIX socket transport module. [ 66.428484] RPC: Registered udp transport module. [ 66.428647] RPC: Registered tcp transport module. [ 66.428809] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 66.741275] Key type dns_resolver registered [ 67.974192] NFS: Registering the id_resolver key type [ 67.974534] Key type id_resolver registered [ 67.974681] Key type id_legacy registered But maybe it's better to not open too many ppc32 KASAN related bugs for now. ;) It probably can wait until you patches are in some later 5.3-rc I guess. -- You are receiving this mail because: You are on the CC list for the bug.
[Bug 204479] KASAN hit at modprobe zram
https://bugzilla.kernel.org/show_bug.cgi?id=204479 --- Comment #19 from Erhard F. (erhar...@mailbox.org) --- Created attachment 284355 --> https://bugzilla.kernel.org/attachment.cgi?id=284355=edit dmesg (kernel 5.3-rc4 + shadow patch + parallel patch, PowerMac G4 DP) -- You are receiving this mail because: You are on the CC list for the bug.
RE: [EXT] Re: [PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit property in EP driver.
> -Original Message- > From: Lorenzo Pieralisi > Sent: 2019年8月13日 18:04 > To: Xiaowei Bao > Cc: bhelg...@google.com; M.h. Lian ; Mingkai Hu > ; Roy Zang ; > l.st...@pengutronix.de; kis...@ti.com; tpie...@impinj.com; Leonard > Crestez ; andrew.smir...@gmail.com; > yue.w...@amlogic.com; hayashi.kunih...@socionext.com; > d...@amazon.co.uk; jon...@amazon.com; linux-...@vger.kernel.org; > linux-ker...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; > linux-arm-ker...@lists.infradead.org > Subject: [EXT] Re: [PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit > property in EP driver. > > Caution: EXT Email > > git log --oneline --follow drivers/pci/controller/dwc/pci-layerscape.c > > Do you see any commit with a $SUBJECT ending with a period? > > There is not. So remove it from yours too. OK, thanks a lot, I will remove it in the next version of the patch; I have to get approval from the IT team of our company. > > On Tue, Aug 13, 2019 at 02:28:39PM +0800, Xiaowei Bao wrote: > > The PCIe controller of layerscape just have 4 BARs, BAR0 and BAR1 is > > 32bit, BAR2 and BAR4 is 64bit, this is determined by hardware, so set > > the bar_fixed_64bit with 0x14. > > > > Signed-off-by: Xiaowei Bao > > --- > > v2: > > - Replace value 0x14 with a macro. > > v3: > > - No change. > > v4: > > - send the patch again with '--to'. > > v5: > > - fix the commit message. > > > > drivers/pci/controller/dwc/pci-layerscape-ep.c | 1 + > > 1 file changed, 1 insertion(+) > > scripts/get_maintainer.pl -f drivers/pci/controller/dwc/pci-layerscape-ep.c > Now, with the output you get justify all the people you send this email to. > > So, again, trim the CC list and it is the last time I tell you. Do you mean that I should use scripts/get_maintainer.pl -f drivers/pci/controller/dwc/pci-layerscape-ep.c to get the list of recipients I need to send to? I used the command 'scripts/get_maintainer.pl *.patch' to get the mailing list before. If yes, I will use the command you provided. Thanks a lot. 
> > Before sending patches on mailing lists use git --dry-run to check the emails > you are sending. > > Thanks, > Lorenzo > > > diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c > > b/drivers/pci/controller/dwc/pci-layerscape-ep.c > > index be61d96..ca9aa45 100644 > > --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c > > +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c > > @@ -44,6 +44,7 @@ static const struct pci_epc_features > ls_pcie_epc_features = { > > .linkup_notifier = false, > > .msi_capable = true, > > .msix_capable = false, > > + .bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4), > > }; > > > > static const struct pci_epc_features* > > -- > > 2.9.5 > >
Re: [PATCH v4 11/25] powernv/fadump: register kernel metadata address with opal
On 2019-07-16 17:03:15 Tue, Hari Bathini wrote: > OPAL allows registering address with it in the first kernel and > retrieving it after MPIPL. Setup kernel metadata and register its > address with OPAL to use it for processing the crash dump. > > Signed-off-by: Hari Bathini > --- > arch/powerpc/kernel/fadump-common.h |4 + > arch/powerpc/kernel/fadump.c | 65 ++- > arch/powerpc/platforms/powernv/opal-fadump.c | 73 > ++ > arch/powerpc/platforms/powernv/opal-fadump.h | 37 + > arch/powerpc/platforms/pseries/rtas-fadump.c | 32 +-- > 5 files changed, 177 insertions(+), 34 deletions(-) > create mode 100644 arch/powerpc/platforms/powernv/opal-fadump.h > [...] > @@ -346,30 +349,42 @@ int __init fadump_reserve_mem(void) >* use memblock_find_in_range() here since it doesn't allocate >* from bottom to top. >*/ > - for (base = fw_dump.boot_memory_size; > - base <= (memory_boundary - size); > - base += size) { > + while (base <= (memory_boundary - size)) { > if (memblock_is_region_memory(base, size) && > !memblock_is_region_reserved(base, size)) > break; > + > + base += size; > } > - if ((base > (memory_boundary - size)) || > - memblock_reserve(base, size)) { > + > + if (base > (memory_boundary - size)) { > + pr_err("Failed to find memory chunk for reservation\n"); > + goto error_out; > + } > + fw_dump.reserve_dump_area_start = base; > + > + /* > + * Calculate the kernel metadata address and register it with > + * f/w if the platform supports. > + */ > + if (fw_dump.ops->setup_kernel_metadata(_dump) < 0) > + goto error_out; I see setup_kernel_metadata() registers the metadata address with opal without having any minimum data initialized in it. Secondaly, why can't this wait until registration ? I think we should defer this until fadump registration. What if kernel crashes before metadata area is initialized ? > + > + if (memblock_reserve(base, size)) { > pr_err("Failed to reserve memory\n"); > - return 0; > + goto error_out; > } [...] 
> - > static struct fadump_ops rtas_fadump_ops = { > - .init_fadump_mem_struct = rtas_fadump_init_mem_struct, > - .register_fadump= rtas_fadump_register_fadump, > - .unregister_fadump = rtas_fadump_unregister_fadump, > - .invalidate_fadump = rtas_fadump_invalidate_fadump, > - .process_fadump = rtas_fadump_process_fadump, > - .fadump_region_show = rtas_fadump_region_show, > - .fadump_trigger = rtas_fadump_trigger, > + .init_fadump_mem_struct = rtas_fadump_init_mem_struct, > + .get_kernel_metadata_size = rtas_fadump_get_kernel_metadata_size, > + .setup_kernel_metadata = rtas_fadump_setup_kernel_metadata, > + .register_fadump= rtas_fadump_register_fadump, > + .unregister_fadump = rtas_fadump_unregister_fadump, > + .invalidate_fadump = rtas_fadump_invalidate_fadump, > + .process_fadump = rtas_fadump_process_fadump, > + .fadump_region_show = rtas_fadump_region_show, > + .fadump_trigger = rtas_fadump_trigger, Can you make the tab space changes in your previous patch where these were initially introduced ? So that this patch can only show new members that are added. Thanks, -Mahesh.
[PATCH v2 1/3] KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts
Escalation interrupts are interrupts sent to the host by the XIVE hardware when it has an interrupt to deliver to a guest VCPU but that VCPU is not running anywhere in the system. Hence we disable the escalation interrupt for the VCPU being run when we enter the guest and re-enable it when the guest does an H_CEDE hypercall indicating it is idle. It is possible that an escalation interrupt gets generated just as we are entering the guest. In that case the escalation interrupt may be using a queue entry in one of the interrupt queues, and that queue entry may not have been processed when the guest exits with an H_CEDE. The existing entry code detects this situation and does not clear the vcpu->arch.xive_esc_on flag as an indication that there is a pending queue entry (if the queue entry gets processed, xive_esc_irq() will clear the flag). There is a comment in the code saying that if the flag is still set on H_CEDE, we have to abort the cede rather than re-enabling the escalation interrupt, lest we end up with two occurrences of the escalation interrupt in the interrupt queue. However, the exit code doesn't do that; it aborts the cede in the sense that vcpu->arch.ceded gets cleared, but it still enables the escalation interrupt by setting the source's PQ bits to 00. Instead we need to set the PQ bits to 10, indicating that an interrupt has been triggered. We also need to avoid setting vcpu->arch.xive_esc_on in this case (i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because xive_esc_irq() will run at some point and clear it, and if we race with that we may end up with an incorrect result (i.e. xive_esc_on set when the escalation interrupt has just been handled). It is extremely unlikely that having two queue entries would cause observable problems; theoretically it could cause queue overflow, but the CPU would have to have thousands of interrupts targeted to it for that to be possible. 
However, this fix will also make it possible to determine accurately whether there is an unhandled escalation interrupt in the queue, which will be needed by the following patch. Cc: sta...@vger.kernel.org # v4.16+ Fixes: 9b9b13a6d153 ("KVM: PPC: Book3S HV: Keep XIVE escalation interrupt masked unless ceded") Signed-off-by: Paul Mackerras --- v2: don't set xive_esc_on if we're not using a XIVE escalation interrupt. arch/powerpc/kvm/book3s_hv_rmhandlers.S | 36 + 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index 337e644..2e7e788 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -2831,29 +2831,39 @@ kvm_cede_prodded: kvm_cede_exit: ld r9, HSTATE_KVM_VCPU(r13) #ifdef CONFIG_KVM_XICS - /* Abort if we still have a pending escalation */ + /* are we using XIVE with single escalation? */ + ld r10, VCPU_XIVE_ESC_VADDR(r9) + cmpdi r10, 0 + beq 3f + li r6, XIVE_ESB_SET_PQ_00 + /* +* If we still have a pending escalation, abort the cede, +* and we must set PQ to 10 rather than 00 so that we don't +* potentially end up with two entries for the escalation +* interrupt in the XIVE interrupt queue. In that case +* we also don't want to set xive_esc_on to 1 here in +* case we race with xive_esc_irq(). +*/ lbz r5, VCPU_XIVE_ESC_ON(r9) cmpwi r5, 0 - beq 1f + beq 4f li r0, 0 stb r0, VCPU_CEDED(r9) -1: /* Enable XIVE escalation */ - li r5, XIVE_ESB_SET_PQ_00 + li r6, XIVE_ESB_SET_PQ_10 + b 5f +4: li r0, 1 + stb r0, VCPU_XIVE_ESC_ON(r9) + /* make sure store to xive_esc_on is seen before xive_esc_irq runs */ + sync +5: /* Enable XIVE escalation */ mfmsr r0 andi. r0, r0, MSR_DR /* in real mode? 
*/ beq 1f - ld r10, VCPU_XIVE_ESC_VADDR(r9) - cmpdi r10, 0 - beq 3f - ldx r0, r10, r5 + ldx r0, r10, r6 b 2f 1: ld r10, VCPU_XIVE_ESC_RADDR(r9) - cmpdi r10, 0 - beq 3f - ldcix r0, r10, r5 + ldcix r0, r10, r6 2: sync - li r0, 1 - stb r0, VCPU_XIVE_ESC_ON(r9) #endif /* CONFIG_KVM_XICS */ 3: b guest_exit_cont -- 2.7.4
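The PQ-bit reasoning in the changelog above can be checked with a small software model of an ESB source. The trigger rule modelled here — a new event-queue entry is generated only when P is clear, otherwise only Q is latched — is the relevant simplification; this is an illustration, not the XIVE hardware specification:

```c
/* Toy model of a XIVE source's ESB P/Q bits and the number of entries
 * it has sitting in the event queue. */
#include <stdbool.h>

struct toy_esb {
	bool p, q;
	int queue_entries;	/* entries for this source in the queue */
};

/* hardware trigger: creates a queue entry only when P was clear */
static void toy_trigger(struct toy_esb *s)
{
	if (!s->p) {
		s->p = true;
		s->queue_entries++;
	} else {
		s->q = true;
	}
}

static void toy_set_pq(struct toy_esb *s, bool p, bool q)
{
	s->p = p;
	s->q = q;
}

/* Replay the cede path with a pending escalation entry: the old code
 * reset PQ to 00 (entry forgotten), the fix sets PQ to 10 (entry still
 * accounted for). Returns the number of queue entries afterwards. */
static int toy_cede_entries(bool fixed)
{
	struct toy_esb s = { 0 };

	toy_trigger(&s);		/* escalation fires: one entry queued */
	if (fixed)
		toy_set_pq(&s, true, false);	/* PQ = 10 */
	else
		toy_set_pq(&s, false, false);	/* PQ = 00 */
	toy_trigger(&s);		/* escalation fires again */
	return s.queue_entries;
}
```

Restoring PQ = 00 while an entry is still queued lets the next trigger enqueue a second entry for the same source; restoring PQ = 10 keeps the outstanding entry accounted for, which is exactly the change the patch makes on the abort-cede path.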
[PATCH v2 0/3] powerpc/xive: Fix race condition leading to host crashes and hangs
This series fixes a race condition that has been observed in testing on POWER9 machines running KVM guests. An interrupt being freed by free_irq() can have an instance present in a XIVE interrupt queue, which can then be presented to the generic interrupt code after the data structures for it have been freed, leading to a variety of crashes and hangs. This series is based on current upstream kernel source plus Cédric Le Goater's patch "KVM: PPC: Book3S HV: XIVE: Free escalation interrupts before disabling the VP", which is a pre-requisite for this series. As it touches both KVM and generic PPC code, this series will probably go in via Michael Ellerman's powerpc tree. V2 of this patch series adds a patch fixing a bug noticed by Cédric, and also fixes a bug in patch 1/2 of the v1 series. Paul. arch/powerpc/include/asm/xive.h | 8 +++ arch/powerpc/kvm/book3s_hv_rmhandlers.S | 38 +- arch/powerpc/kvm/book3s_xive.c | 42 +++- arch/powerpc/kvm/book3s_xive.h | 2 + arch/powerpc/kvm/book3s_xive_native.c | 6 +++ arch/powerpc/sysdev/xive/common.c | 87 - 6 files changed, 146 insertions(+), 37 deletions(-)
[PATCH v2 2/3] KVM: PPC: Book3S HV: Don't push XIVE context when not using XIVE device
At present, when running a guest on POWER9 using HV KVM but not using an in-kernel interrupt controller (XICS or XIVE), for example if QEMU is run with the kernel_irqchip=off option, the guest entry code goes ahead and tries to load the guest context into the XIVE hardware, even though no context has been set up. To fix this, we check that the "CAM word" is non-zero before pushing it to the hardware. The CAM word is initialized to a non-zero value in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(), and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu. Cc: sta...@vger.kernel.org # v4.11+ Reported-by: Cédric Le Goater Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Signed-off-by: Paul Mackerras --- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 2 ++ arch/powerpc/kvm/book3s_xive.c | 11 ++- arch/powerpc/kvm/book3s_xive_native.c | 3 +++ 3 files changed, 15 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index 2e7e788..07181d0 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -942,6 +942,8 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300) ld r11, VCPU_XIVE_SAVED_STATE(r4) li r9, TM_QW1_OS lwz r8, VCPU_XIVE_CAM_WORD(r4) + cmpwi r8, 0 + beq no_xive li r7, TM_QW1_OS + TM_WORD2 mfmsr r0 andi. r0, r0, MSR_DR /* in real mode? */ diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c index 09f838a..586867e 100644 --- a/arch/powerpc/kvm/book3s_xive.c +++ b/arch/powerpc/kvm/book3s_xive.c @@ -67,8 +67,14 @@ void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) void __iomem *tima = local_paca->kvm_hstate.xive_tima_virt; u64 pq; - if (!tima) + /* +* Nothing to do if the platform doesn't have a XIVE +* or this vCPU doesn't have its own XIVE context +* (e.g. because it's not using an in-kernel interrupt controller). 
+*/ + if (!tima || !vcpu->arch.xive_cam_word) return; + eieio(); __raw_writeq(vcpu->arch.xive_saved_state.w01, tima + TM_QW1_OS); __raw_writel(vcpu->arch.xive_cam_word, tima + TM_QW1_OS + TM_WORD2); @@ -1146,6 +1152,9 @@ void kvmppc_xive_cleanup_vcpu(struct kvm_vcpu *vcpu) /* Disable the VP */ xive_native_disable_vp(xc->vp_id); + /* Clear the cam word so guest entry won't try to push context */ + vcpu->arch.xive_cam_word = 0; + /* Free the queues */ for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) { struct xive_q *q = >queues[i]; diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c index 368427f..11b91b4 100644 --- a/arch/powerpc/kvm/book3s_xive_native.c +++ b/arch/powerpc/kvm/book3s_xive_native.c @@ -81,6 +81,9 @@ void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) /* Disable the VP */ xive_native_disable_vp(xc->vp_id); + /* Clear the cam word so guest entry won't try to push context */ + vcpu->arch.xive_cam_word = 0; + /* Free the queues */ for (i = 0; i < KVMPPC_XIVE_Q_COUNT; i++) { kvmppc_xive_native_cleanup_queue(vcpu, i); -- 2.7.4
[PATCH v2 3/3] powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race
Testing has revealed the existence of a race condition where a XIVE interrupt being shut down can be in one of the XIVE interrupt queues (of which there are up to 8 per CPU, one for each priority) at the point where free_irq() is called. If this happens, can return an interrupt number which has been shut down. This can lead to various symptoms: - irq_to_desc(irq) can be NULL. In this case, no end-of-interrupt function gets called, resulting in the CPU's elevated interrupt priority (numerically lowered CPPR) never gets reset. That then means that the CPU stops processing interrupts, causing device timeouts and other errors in various device drivers. - The irq descriptor or related data structures can be in the process of being freed as the interrupt code is using them. This typically leads to crashes due to bad pointer dereferences. This race is basically what commit 62e0468650c3 ("genirq: Add optional hardware synchronization for shutdown", 2019-06-28) is intended to fix, given a get_irqchip_state() method for the interrupt controller being used. It works by polling the interrupt controller when an interrupt is being freed until the controller says it is not pending. With XIVE, the PQ bits of the interrupt source indicate the state of the interrupt source, and in particular the P bit goes from 0 to 1 at the point where the hardware writes an entry into the interrupt queue that this interrupt is directed towards. Normally, the code will then process the interrupt and do an end-of-interrupt (EOI) operation which will reset PQ to 00 (assuming another interrupt hasn't been generated in the meantime). However, there are situations where the code resets P even though a queue entry exists (for example, by setting PQ to 01, which disables the interrupt source), and also situations where the code leaves P at 1 after removing the queue entry (for example, this is done for escalation interrupts so they cannot fire again until they are explicitly re-enabled). 
The code already has a 'saved_p' flag for the interrupt source which indicates that a queue entry exists, although it isn't maintained consistently. This patch adds a 'stale_p' flag to indicate that P has been left at 1 after processing a queue entry, and adds code to set and clear saved_p and stale_p as necessary to maintain a consistent indication of whether a queue entry may or may not exist. With this, we can implement xive_get_irqchip_state() by looking at stale_p, saved_p and the ESB PQ bits for the interrupt.

There is some additional code to handle escalation interrupts properly, because they are enabled and disabled in KVM assembly code, which does not have access to the xive_irq_data struct for the escalation interrupt. Hence, stale_p may be incorrect when the escalation interrupt is freed in kvmppc_xive_{,native_}cleanup_vcpu(). Fortunately, we can fix it up by looking at vcpu->arch.xive_esc_on, with some careful attention to barriers in order to ensure the correct result if xive_esc_irq() races with kvmppc_xive_cleanup_vcpu().

Finally, this adds code to make noise on the console (pr_crit and WARN_ON(1)) if we find an interrupt queue entry for an interrupt which does not have a descriptor. While this won't catch the race reliably, if it does get triggered it will be an indication that the race is occurring and needs to be debugged.

Signed-off-by: Paul Mackerras
---
v2: call xive_cleanup_single_escalation from kvmppc_xive_native_cleanup_vcpu() too.
 arch/powerpc/include/asm/xive.h       |  8 
 arch/powerpc/kvm/book3s_xive.c        | 31 +
 arch/powerpc/kvm/book3s_xive.h        |  2 +
 arch/powerpc/kvm/book3s_xive_native.c |  3 ++
 arch/powerpc/sysdev/xive/common.c     | 87 ++-
 5 files changed, 108 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index e401698..efb0e59 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -46,7 +46,15 @@ struct xive_irq_data {
 	/* Setup/used by frontend */
 	int target;
+	/*
+	 * saved_p means that there is a queue entry for this interrupt
+	 * in some CPU's queue (not including guest vcpu queues), even
+	 * if P is not set in the source ESB.
+	 * stale_p means that there is no queue entry for this interrupt
+	 * in some CPU's queue, even if P is set in the source ESB.
+	 */
 	bool saved_p;
+	bool stale_p;
 };

 #define XIVE_IRQ_FLAG_STORE_EOI	0x01
 #define XIVE_IRQ_FLAG_LSI	0x02
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 586867e..591bfb4 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -166,6 +166,9 @@ static irqreturn_t xive_esc_irq(int irq, void *data)
 	 */
 	vcpu->arch.xive_esc_on = false;

+	/* This orders xive_esc_on = false vs. subsequent stale_p = true */
+
Re: [PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit property in EP driver.
git log --oneline --follow drivers/pci/controller/dwc/pci-layerscape.c

Do you see any commit with a $SUBJECT ending with a period? There is not, so remove it from yours too.

On Tue, Aug 13, 2019 at 02:28:39PM +0800, Xiaowei Bao wrote:
> The PCIe controller of layerscape just have 4 BARs, BAR0 and BAR1
> is 32bit, BAR2 and BAR4 is 64bit, this is determined by hardware,
> so set the bar_fixed_64bit with 0x14.
> 
> Signed-off-by: Xiaowei Bao
> ---
> v2:
>  - Replace value 0x14 with a macro.
> v3:
>  - No change.
> v4:
>  - send the patch again with '--to'.
> v5:
>  - fix the commit message.
> 
>  drivers/pci/controller/dwc/pci-layerscape-ep.c | 1 +
>  1 file changed, 1 insertion(+)

scripts/get_maintainer.pl -f drivers/pci/controller/dwc/pci-layerscape-ep.c

Now, with the output you get, justify all the people you send this email to. So, again, trim the CC list; it is the last time I tell you. Before sending patches to mailing lists, use git send-email --dry-run to check the emails you are sending.

Thanks,
Lorenzo

> diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> index be61d96..ca9aa45 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> @@ -44,6 +44,7 @@ static const struct pci_epc_features ls_pcie_epc_features =
> {
> 	.linkup_notifier = false,
> 	.msi_capable = true,
> 	.msix_capable = false,
> +	.bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
> };
> 
> static const struct pci_epc_features*
> --
> 2.9.5
> 
[PATCH 2/2] powerpc/32: replace LOAD_MSR_KERNEL() by LOAD_REG_IMMEDIATE()
LOAD_MSR_KERNEL() and LOAD_REG_IMMEDIATE() are doing the same thing in the same way. Drop LOAD_MSR_KERNEL().

Signed-off-by: Christophe Leroy
---
 arch/powerpc/kernel/entry_32.S | 18 +-
 arch/powerpc/kernel/head_32.h  | 21 -
 2 files changed, 13 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 54fab22c9a43..972b05504a0a 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -230,7 +230,7 @@ transfer_to_handler_cont:
 	 */
 	lis	r12,reenable_mmu@h
 	ori	r12,r12,reenable_mmu@l
-	LOAD_MSR_KERNEL(r0, MSR_KERNEL)
+	LOAD_REG_IMMEDIATE(r0, MSR_KERNEL)
 	mtspr	SPRN_SRR0,r12
 	mtspr	SPRN_SRR1,r0
 	SYNC
@@ -304,7 +304,7 @@ stack_ovf:
 	addi	r1,r1,THREAD_SIZE-STACK_FRAME_OVERHEAD
 	lis	r9,StackOverflow@ha
 	addi	r9,r9,StackOverflow@l
-	LOAD_MSR_KERNEL(r10,MSR_KERNEL)
+	LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)
 #if defined(CONFIG_PPC_8xx) && defined(CONFIG_PERF_EVENTS)
 	mtspr	SPRN_NRI, r0
 #endif
@@ -324,7 +324,7 @@ trace_syscall_entry_irq_off:
 	bl	trace_hardirqs_on

 	/* Now enable for real */
-	LOAD_MSR_KERNEL(r10, MSR_KERNEL | MSR_EE)
+	LOAD_REG_IMMEDIATE(r10, MSR_KERNEL | MSR_EE)
 	mtmsr	r10

 	REST_GPR(0, r1)
@@ -394,7 +394,7 @@ ret_from_syscall:
 #endif
 	mr	r6,r3
 	/* disable interrupts so current_thread_info()->flags can't change */
-	LOAD_MSR_KERNEL(r10,MSR_KERNEL)	/* doesn't include MSR_EE */
+	LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)	/* doesn't include MSR_EE */
 	/* Note: We don't bother telling lockdep about it */
 	SYNC
 	MTMSRD(r10)
@@ -824,7 +824,7 @@ ret_from_except:
 	 * can't change between when we test it and when we return
 	 * from the interrupt. */
 	/* Note: We don't bother telling lockdep about it */
-	LOAD_MSR_KERNEL(r10,MSR_KERNEL)
+	LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)
 	SYNC			/* Some chip revs have problems here... */
 	MTMSRD(r10)		/* disable interrupts */
@@ -991,7 +991,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_NEED_PAIRED_STWCX)
 	 * can restart the exception exit path at the label
 	 * exc_exit_restart below.
 	 * -- paulus
 	 */
-	LOAD_MSR_KERNEL(r10,MSR_KERNEL & ~MSR_RI)
+	LOAD_REG_IMMEDIATE(r10,MSR_KERNEL & ~MSR_RI)
 	SYNC
 	MTMSRD(r10)		/* clear the RI bit */
 	.globl exc_exit_restart
@@ -1066,7 +1066,7 @@ exc_exit_restart_end:
 	REST_NVGPRS(r1);					\
 	lwz	r3,_MSR(r1);					\
 	andi.	r3,r3,MSR_PR;					\
-	LOAD_MSR_KERNEL(r10,MSR_KERNEL);			\
+	LOAD_REG_IMMEDIATE(r10,MSR_KERNEL);			\
 	bne	user_exc_return;				\
 	lwz	r0,GPR0(r1);					\
 	lwz	r2,GPR2(r1);					\
@@ -1236,7 +1236,7 @@ recheck:
 	 * neither. Those disable/enable cycles used to peek at
 	 * TI_FLAGS aren't advertised.
 	 */
-	LOAD_MSR_KERNEL(r10,MSR_KERNEL)
+	LOAD_REG_IMMEDIATE(r10,MSR_KERNEL)
 	SYNC
 	MTMSRD(r10)		/* disable interrupts */
 	lwz	r9,TI_FLAGS(r2)
@@ -1329,7 +1329,7 @@ _GLOBAL(enter_rtas)
 	lwz	r4,RTASBASE(r4)
 	mfmsr	r9
 	stw	r9,8(r1)
-	LOAD_MSR_KERNEL(r0,MSR_KERNEL)
+	LOAD_REG_IMMEDIATE(r0,MSR_KERNEL)
 	SYNC			/* disable interrupts so SRR0/1 */
 	MTMSRD(r0)		/* don't get trashed */
 	li	r9,MSR_KERNEL & ~(MSR_IR|MSR_DR)
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h
index 4a692553651f..8abc7783dbe5 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -5,19 +5,6 @@
 #include <asm/ptrace.h>	/* for STACK_FRAME_REGS_MARKER */

 /*
- * MSR_KERNEL is > 0x8000 on 4xx/Book-E since it include MSR_CE.
- */
-.macro __LOAD_MSR_KERNEL r, x
-.if \x >= 0x8000
-	lis \r, (\x)@h
-	ori \r, \r, (\x)@l
-.else
-	li \r, (\x)
-.endif
-.endm
-#define LOAD_MSR_KERNEL(r, x) __LOAD_MSR_KERNEL r, x
-
-/*
  * Exception entry code. This code runs with address translation
  * turned off, i.e. using physical addresses.
  * We assume sprg3 has the physical address of the current
@@ -92,7 +79,7 @@
 #ifdef CONFIG_40x
 	rlwinm	r9,r9,0,14,12		/* clear MSR_WE (necessary?) */
 #else
-	LOAD_MSR_KERNEL(r10, MSR_KERNEL & ~(MSR_IR|MSR_DR)) /* can take exceptions */
+	LOAD_REG_IMMEDIATE(r10, MSR_KERNEL & ~(MSR_IR|MSR_DR)) /* can take exceptions */
 	MTMSRD(r10)			/* (except for mach check
[PATCH 1/2] powerpc: rewrite LOAD_REG_IMMEDIATE() as an intelligent macro
Today LOAD_REG_IMMEDIATE() is a basic #define which loads all parts of a value into a register, including the parts that are zero. This means always 2 instructions on PPC32 and always 5 instructions on PPC64. And those instructions cannot run in parallel as they are updating the same register.

Ex: LOAD_REG_IMMEDIATE(r1,THREAD_SIZE) in head_64.S results in:

3c 20 00 00     lis     r1,0
60 21 00 00     ori     r1,r1,0
78 21 07 c6     rldicr  r1,r1,32,31
64 21 00 00     oris    r1,r1,0
60 21 40 00     ori     r1,r1,16384

Rewrite LOAD_REG_IMMEDIATE() with GAS macros in order to skip the parts that are zero.

Rename the existing LOAD_REG_IMMEDIATE() as LOAD_REG_IMMEDIATE_SYM() and use that one for loading the value of symbols which are not known at compile time.

Now LOAD_REG_IMMEDIATE(r1,THREAD_SIZE) in head_64.S results in:

38 20 40 00     li      r1,16384

Signed-off-by: Christophe Leroy
---
 arch/powerpc/include/asm/ppc_asm.h   | 42 +++-
 arch/powerpc/kernel/exceptions-64e.S | 10 -
 arch/powerpc/kernel/head_64.S        |  2 +-
 3 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index e0637730a8e7..9a7c2ca9b714 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -311,13 +311,43 @@ GLUE(.,name):
 	addis	reg,reg,(name - 0b)@ha;		\
 	addi	reg,reg,(name - 0b)@l;

-#ifdef __powerpc64__
-#ifdef HAVE_AS_ATHIGH
+#if defined(__powerpc64__) && defined(HAVE_AS_ATHIGH)
 #define __AS_ATHIGH high
 #else
 #define __AS_ATHIGH h
 #endif
-#define LOAD_REG_IMMEDIATE(reg,expr)		\
+
+.macro __LOAD_REG_IMMEDIATE_32 r, x
+	.if (\x) >= 0x8000 || (\x) < -0x8000
+		lis \r, (\x)@__AS_ATHIGH
+		.if (\x) & 0xffff != 0
+			ori \r, \r, (\x)@l
+		.endif
+	.else
+		li \r, (\x)@l
+	.endif
+.endm
+
+.macro __LOAD_REG_IMMEDIATE r, x
+	.if \x & ~0xffffffff != 0
+		__LOAD_REG_IMMEDIATE_32 \r, (\x) >> 32
+		rldicr	\r, \r, 32, 31
+		.if (\x) & 0xffff0000 != 0
+			oris \r, \r, (\x)@__AS_ATHIGH
+		.endif
+		.if (\x) & 0xffff != 0
+			ori \r, \r, (\x)@l
+		.endif
+	.else
+		__LOAD_REG_IMMEDIATE_32 \r, \x
+	.endif
+.endm
+
+#ifdef __powerpc64__
+
+#define LOAD_REG_IMMEDIATE(reg, expr) __LOAD_REG_IMMEDIATE reg, expr
+
+#define LOAD_REG_IMMEDIATE_SYM(reg,expr)	\
 	lis	reg,(expr)@highest;		\
 	ori	reg,reg,(expr)@higher;	\
 	rldicr	reg,reg,32,31;		\
@@ -335,11 +365,13 @@ GLUE(.,name):

 #else /* 32-bit */

-#define LOAD_REG_IMMEDIATE(reg,expr)		\
+#define LOAD_REG_IMMEDIATE(reg, expr) __LOAD_REG_IMMEDIATE_32 reg, expr
+
+#define LOAD_REG_IMMEDIATE_SYM(reg,expr)	\
 	lis	reg,(expr)@ha;		\
 	addi	reg,reg,(expr)@l;

-#define LOAD_REG_ADDR(reg,name)		LOAD_REG_IMMEDIATE(reg, name)
+#define LOAD_REG_ADDR(reg,name)		LOAD_REG_IMMEDIATE_SYM(reg, name)

 #define LOAD_REG_ADDRBASE(reg, name)	lis	reg,name@ha
 #define ADDROFF(name)			name@l
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 1cfb3da4a84a..898aae6da167 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -751,8 +751,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
 	ld	r14,interrupt_base_book3e@got(r15)
 	ld	r15,__end_interrupts@got(r15)
 #else
-	LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
-	LOAD_REG_IMMEDIATE(r15,__end_interrupts)
+	LOAD_REG_IMMEDIATE_SYM(r14,interrupt_base_book3e)
+	LOAD_REG_IMMEDIATE_SYM(r15,__end_interrupts)
 #endif
 	cmpld	cr0,r10,r14
 	cmpld	cr1,r10,r15
@@ -821,8 +821,8 @@ kernel_dbg_exc:
 	ld	r14,interrupt_base_book3e@got(r15)
 	ld	r15,__end_interrupts@got(r15)
 #else
-	LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e)
-	LOAD_REG_IMMEDIATE(r15,__end_interrupts)
+	LOAD_REG_IMMEDIATE_SYM(r14,interrupt_base_book3e)
+	LOAD_REG_IMMEDIATE_SYM(r15,__end_interrupts)
 #endif
 	cmpld	cr0,r10,r14
 	cmpld	cr1,r10,r15
@@ -1449,7 +1449,7 @@ a2_tlbinit_code_start:

 a2_tlbinit_after_linear_map:
 	/* Now we branch the new virtual address mapped by this entry */
-	LOAD_REG_IMMEDIATE(r3,1f)
+	LOAD_REG_IMMEDIATE_SYM(r3,1f)
 	mtctr	r3
 	bctr
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 91d297e696dd..1fd44761e997 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -635,7 +635,7 @@ __after_prom_start:
 	sub	r5,r5,r11
 #else
 	/* just copy interrupts */
-	LOAD_REG_IMMEDIATE(r5,
Re: [EXT] Re: [PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit property in EP driver.
You should fix your email client set-up to avoid sticking an [EXT] tag to your emails' $SUBJECT.

On Tue, Aug 13, 2019 at 07:39:48AM +, Xiaowei Bao wrote:
> > -----Original Message-----
> > From: Kishon Vijay Abraham I
> > Sent: 2019年8月13日 15:30
> > To: Xiaowei Bao ; lorenzo.pieral...@arm.com;
> > bhelg...@google.com; M.h. Lian ; Mingkai Hu
> > ; Roy Zang ;
> > l.st...@pengutronix.de; tpie...@impinj.com; Leonard Crestez
> > ; andrew.smir...@gmail.com;
> > yue.w...@amlogic.com; hayashi.kunih...@socionext.com;
> > d...@amazon.co.uk; jon...@amazon.com; linux-...@vger.kernel.org;
> > linux-ker...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> > linux-arm-ker...@lists.infradead.org
> > Subject: [EXT] Re: [PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit
> > property in EP driver.
> >
> > Caution: EXT Email

See above, this "Caution" stuff should disappear. Also, quoting the email header is useless, please configure your email client to remove it.

Thanks,
Lorenzo

> > On 13/08/19 11:58 AM, Xiaowei Bao wrote:
> > > The PCIe controller of layerscape just have 4 BARs, BAR0 and BAR1 is
> > > 32bit, BAR2 and BAR4 is 64bit, this is determined by hardware, so set
> > > the bar_fixed_64bit with 0x14.
> > >
> > > Signed-off-by: Xiaowei Bao
> >
> > Acked-by: Kishon Vijay Abraham I
> >
> > > ---
> > > v2:
> > >  - Replace value 0x14 with a macro.
> > > v3:
> > >  - No change.
> > > v4:
> > >  - send the patch again with '--to'.
> > > v5:
> > >  - fix the commit message.
> > >
> > >  drivers/pci/controller/dwc/pci-layerscape-ep.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > > b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > > index be61d96..ca9aa45 100644
> > > --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > > +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > > @@ -44,6 +44,7 @@ static const struct pci_epc_features
> > ls_pcie_epc_features = {
> > > 	.linkup_notifier = false,
> > > 	.msi_capable = true,
> > > 	.msix_capable = false,
> > > +	.bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
> > > };
> > >
> > > static const struct pci_epc_features*
> I check other platforms, it is 'static const struct pci_epc_features', I can
> get the correct
> Value use this define way in pci-epf-test.c file.
[Bug 204371] BUG kmalloc-4k (Tainted: G W ): Object padding overwritten
https://bugzilla.kernel.org/show_bug.cgi?id=204371

--- Comment #15 from Erhard F. (erhar...@mailbox.org) ---
On Fri, 09 Aug 2019 12:31:26 + bugzilla-dae...@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=204371

# cat ~/bisect01.log
Bisecting: 37903 revisions left to test after this (roughly 15 steps)
[9abf8acea297b4c65f5fa3206e2b8e468e730e84] Merge tag 'tty-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Bisecting: 19051 revisions left to test after this (roughly 14 steps)
[7c00e8ae041b349992047769af741b67379ce19a] Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Bisecting: 9762 revisions left to test after this (roughly 13 steps)
[dafa5f6577a9eecd2941add553d1672c30b02364] Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Bisecting: 4644 revisions left to test after this (roughly 12 steps)
[2ed9db3074fcd8d12709fe40ff0e691d74229818] net: sched: cls_api: fix dead code in switch
Bisecting: 2319 revisions left to test after this (roughly 11 steps)
[b219a1d2de0c025318475e3bbf8e3215cf49d083] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Bisecting: 1153 revisions left to test after this (roughly 10 steps)
[85a0b791bc17f7a49280b33e2905d109c062a47b] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Bisecting: 629 revisions left to test after this (roughly 9 steps)
[10f3e23f07cb0c20f9bcb77a5b5a7eb2a1b2a2fe] Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Bisecting: 273 revisions left to test after this (roughly 8 steps)
[575b94386bd539a7d803aee9fd4a8d275844c40f] Merge tag 'locks-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Bisecting: 136 revisions left to test after this (roughly 7 steps)
[d7e8555b1dd493c809e56e359974eecabe7d3fde] btrfs: remove unused member async_submit_bio::fs_info
Bisecting: 68 revisions left to test after this (roughly 6 steps)
[389305b2aa68723c754f88d9dbd268a400e10664] btrfs: relocation: Only remove reloc rb_trees if reloc control has been initialized
Bisecting: 34 revisions left to test after this (roughly 5 steps)
[d814a49198eafa6163698bdd93961302f3a877a4] btrfs: use correct compare function of dirty_metadata_bytes
Bisecting: 16 revisions left to test after this (roughly 4 steps)
[c7b562c5480322ffaf591f45a4ff7ee089340ab4] btrfs: raid56: catch errors from full_stripe_write
Bisecting: 8 revisions left to test after this (roughly 3 steps)
[65ad010488a5cc0f123a9924f7ad26a1b3f6a4f6] btrfs: pass only eb to num_extent_pages
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[37508515621551538addaf826ab4b8a9aaf0a382] btrfs: simplify some assignments of inode numbers
Bisecting: 1 revision left to test after this (roughly 1 step)
[69d2480456d1baf027a86e530989d7bedd698d5f] btrfs: use copy_page for copying pages instead of memcpy
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[3ffbd68c48320730ef64ebfb5e639220f1f65483] btrfs: simplify pointer chasing of local fs_info variables

69d2480456d1baf027a86e530989d7bedd698d5f is the first bad commit
commit 69d2480456d1baf027a86e530989d7bedd698d5f
Author: David Sterba
Date:   Fri Jun 29 10:56:44 2018 +0200

    btrfs: use copy_page for copying pages instead of memcpy

    Use the helper that's possibly optimized for full page copies.

    Signed-off-by: David Sterba

:040000 040000 87de10a38618c1655c3266ff5a31358068fa1ca6 d0a2612d260215acaff66adaa5183ebd29a4b710 M	fs

--
You are receiving this mail because:
You are on the CC list for the bug.
[Bug 204371] BUG kmalloc-4k (Tainted: G W ): Object padding overwritten
https://bugzilla.kernel.org/show_bug.cgi?id=204371

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed |Added
---------------------------------------------
 Attachment #284035|0       |1
        is obsolete|        |

--- Comment #14 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 284353
  --> https://bugzilla.kernel.org/attachment.cgi?id=284353&action=edit
kernel .config (PowerMac G4 DP, kernel 4.18.0-rc8+, final bisect)

--
You are receiving this mail because:
You are on the CC list for the bug.
Re: [PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit property in EP driver.
On 13/08/19 11:58 AM, Xiaowei Bao wrote:
> The PCIe controller of layerscape just have 4 BARs, BAR0 and BAR1
> is 32bit, BAR2 and BAR4 is 64bit, this is determined by hardware,
> so set the bar_fixed_64bit with 0x14.
> 
> Signed-off-by: Xiaowei Bao

Acked-by: Kishon Vijay Abraham I

> ---
> v2:
>  - Replace value 0x14 with a macro.
> v3:
>  - No change.
> v4:
>  - send the patch again with '--to'.
> v5:
>  - fix the commit message.
> 
>  drivers/pci/controller/dwc/pci-layerscape-ep.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> index be61d96..ca9aa45 100644
> --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> @@ -44,6 +44,7 @@ static const struct pci_epc_features ls_pcie_epc_features =
> {
> 	.linkup_notifier = false,
> 	.msi_capable = true,
> 	.msix_capable = false,
> +	.bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
> };
> 
> static const struct pci_epc_features*
> 
RE: [EXT] Re: [PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit property in EP driver.
> -----Original Message-----
> From: Kishon Vijay Abraham I
> Sent: 2019年8月13日 15:30
> To: Xiaowei Bao ; lorenzo.pieral...@arm.com;
> bhelg...@google.com; M.h. Lian ; Mingkai Hu
> ; Roy Zang ;
> l.st...@pengutronix.de; tpie...@impinj.com; Leonard Crestez
> ; andrew.smir...@gmail.com;
> yue.w...@amlogic.com; hayashi.kunih...@socionext.com;
> d...@amazon.co.uk; jon...@amazon.com; linux-...@vger.kernel.org;
> linux-ker...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> linux-arm-ker...@lists.infradead.org
> Subject: [EXT] Re: [PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit
> property in EP driver.
> 
> Caution: EXT Email
> 
> On 13/08/19 11:58 AM, Xiaowei Bao wrote:
> > The PCIe controller of layerscape just have 4 BARs, BAR0 and BAR1 is
> > 32bit, BAR2 and BAR4 is 64bit, this is determined by hardware, so set
> > the bar_fixed_64bit with 0x14.
> >
> > Signed-off-by: Xiaowei Bao
> 
> Acked-by: Kishon Vijay Abraham I
> 
> > ---
> > v2:
> >  - Replace value 0x14 with a macro.
> > v3:
> >  - No change.
> > v4:
> >  - send the patch again with '--to'.
> > v5:
> >  - fix the commit message.
> >
> >  drivers/pci/controller/dwc/pci-layerscape-ep.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > index be61d96..ca9aa45 100644
> > --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > @@ -44,6 +44,7 @@ static const struct pci_epc_features
> ls_pcie_epc_features = {
> > 	.linkup_notifier = false,
> > 	.msi_capable = true,
> > 	.msix_capable = false,
> > +	.bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
> > };
> >
> > static const struct pci_epc_features*

I checked other platforms: they use 'static const struct pci_epc_features' in the same way, and I can get the correct value through this definition in the pci-epf-test.c file.
[PATCHv5 2/2] PCI: layerscape: Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC separately
Add CONFIG_PCI_LAYERSCAPE_EP to build EP/RC separately.

Signed-off-by: Xiaowei Bao
---
v2:
 - No change.
v3:
 - modify the commit message.
v4:
 - send the patch again with '--to'.
v5:
 - No change.

 drivers/pci/controller/dwc/Kconfig  | 20 ++--
 drivers/pci/controller/dwc/Makefile |  3 ++-
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig
index 6ea778a..869c645 100644
--- a/drivers/pci/controller/dwc/Kconfig
+++ b/drivers/pci/controller/dwc/Kconfig
@@ -131,13 +131,29 @@ config PCI_KEYSTONE_EP
 	  DesignWare core functions to implement the driver.

 config PCI_LAYERSCAPE
-	bool "Freescale Layerscape PCIe controller"
+	bool "Freescale Layerscape PCIe controller - Host mode"
 	depends on OF && (ARM || ARCH_LAYERSCAPE || COMPILE_TEST)
 	depends on PCI_MSI_IRQ_DOMAIN
 	select MFD_SYSCON
 	select PCIE_DW_HOST
 	help
-	  Say Y here if you want PCIe controller support on Layerscape SoCs.
+	  Say Y here if you want to enable PCIe controller support on Layerscape
+	  SoCs to work in Host mode.
+	  This controller can work either as EP or RC. The RCW[HOST_AGT_PEX]
+	  determines which PCIe controller works in EP mode and which PCIe
+	  controller works in RC mode.
+
+config PCI_LAYERSCAPE_EP
+	bool "Freescale Layerscape PCIe controller - Endpoint mode"
+	depends on OF && (ARM || ARCH_LAYERSCAPE || COMPILE_TEST)
+	depends on PCI_ENDPOINT
+	select PCIE_DW_EP
+	help
+	  Say Y here if you want to enable PCIe controller support on Layerscape
+	  SoCs to work in Endpoint mode.
+	  This controller can work either as EP or RC. The RCW[HOST_AGT_PEX]
+	  determines which PCIe controller works in EP mode and which PCIe
+	  controller works in RC mode.

 config PCI_HISI
 	depends on OF && (ARM64 || COMPILE_TEST)
diff --git a/drivers/pci/controller/dwc/Makefile b/drivers/pci/controller/dwc/Makefile
index b085dfd..824fde7 100644
--- a/drivers/pci/controller/dwc/Makefile
+++ b/drivers/pci/controller/dwc/Makefile
@@ -8,7 +8,8 @@ obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o
 obj-$(CONFIG_PCI_IMX6) += pci-imx6.o
 obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o
 obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone.o
-obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o pci-layerscape-ep.o
+obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o
+obj-$(CONFIG_PCI_LAYERSCAPE_EP) += pci-layerscape-ep.o
obj-$(CONFIG_PCIE_QCOM) += pcie-qcom.o
 obj-$(CONFIG_PCIE_ARMADA_8K) += pcie-armada8k.o
 obj-$(CONFIG_PCIE_ARTPEC6) += pcie-artpec6.o
--
2.9.5
[PATCHv5 1/2] PCI: layerscape: Add the bar_fixed_64bit property in EP driver.
The PCIe controller of layerscape just have 4 BARs, BAR0 and BAR1
is 32bit, BAR2 and BAR4 is 64bit, this is determined by hardware,
so set the bar_fixed_64bit with 0x14.

Signed-off-by: Xiaowei Bao
---
v2:
 - Replace value 0x14 with a macro.
v3:
 - No change.
v4:
 - send the patch again with '--to'.
v5:
 - fix the commit message.

 drivers/pci/controller/dwc/pci-layerscape-ep.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c b/drivers/pci/controller/dwc/pci-layerscape-ep.c
index be61d96..ca9aa45 100644
--- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
+++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
@@ -44,6 +44,7 @@ static const struct pci_epc_features ls_pcie_epc_features = {
 	.linkup_notifier = false,
 	.msi_capable = true,
 	.msix_capable = false,
+	.bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
 };

 static const struct pci_epc_features*
--
2.9.5
RE: [EXT] Re: [PATCHv4 1/2] PCI: layerscape: Add the bar_fixed_64bit property in EP driver.
> -----Original Message-----
> From: Kishon Vijay Abraham I
> Sent: 2019年8月13日 12:36
> To: Xiaowei Bao ; lorenzo.pieral...@arm.com;
> bhelg...@google.com; M.h. Lian ; Mingkai Hu
> ; Roy Zang ;
> l.st...@pengutronix.de; tpie...@impinj.com; Leonard Crestez
> ; andrew.smir...@gmail.com;
> yue.w...@amlogic.com; hayashi.kunih...@socionext.com;
> d...@amazon.co.uk; jon...@amazon.com; linux-...@vger.kernel.org;
> linux-ker...@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> linux-arm-ker...@lists.infradead.org
> Subject: [EXT] Re: [PATCHv4 1/2] PCI: layerscape: Add the bar_fixed_64bit
> property in EP driver.
> 
> Caution: EXT Email
> 
> On 13/08/19 8:23 AM, Xiaowei Bao wrote:
> > The PCIe controller of layerscape just have 4 BARs, BAR0 and BAR1 is
> > 32bit, BAR3 and BAR4 is 64bit, this is determined by hardware,
> 
> Do you mean BAR2 instead of BAR3 here?

Yes.

> Thanks
> Kishon
> 
> > so set the bar_fixed_64bit with 0x14.
> >
> > Signed-off-by: Xiaowei Bao
> > ---
> > v2:
> >  - Replace value 0x14 with a macro.
> > v3:
> >  - No change.
> > v4:
> >  - send the patch again with '--to'.
> >
> >  drivers/pci/controller/dwc/pci-layerscape-ep.c |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > index be61d96..227c33b 100644
> > --- a/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
> > @@ -44,6 +44,7 @@ static int ls_pcie_establish_link(struct dw_pcie *pci)
> > 	.linkup_notifier = false,
> > 	.msi_capable = true,
> > 	.msix_capable = false,
> > +	.bar_fixed_64bit = (1 << BAR_2) | (1 << BAR_4),
> > };
> >
> > static const struct pci_epc_features*
> >