[PATCH NEXT 1/4] powerpc/pasemi: Add PCI initialisation for Nemo board.
Hi Michael, Hi All,

kbuild test robot Wed, 03 Jan 2018 04:17:20 -0800 wrote:

> Hi Darren,
>
> Thank you for the patch! Perhaps something to improve:
>
> arch/powerpc/platforms/pasemi/pci.c: In function 'sb600_set_flag':
> >> include/linux/kern_levels.h:5:18: warning: format '%lx' expects argument of
> >> type 'long unsigned int', but argument 2 has type 'resource_size_t {aka long
> >> long unsigned int}' [-Wformat=]
>     #define KERN_SOH "\001" /* ASCII Start Of Header */

---

I was able to fix this small format issue by replacing the format '%08lx' with '%08llx':

+	printk(KERN_CRIT "NEMO SB600 IOB base %08llx\n", res.start);

Is this fix OK, or is there a better solution?

> On 3. May 2018, at 15:06, Michael Ellerman wrote:
>
>> +
>> +printk(KERN_CRIT "NEMO SB600 IOB base %08lx\n",res.start);
>
> That's INFO or even DEBUG.

Michael, what do you think about this fix?

+	printk(KERN_INFO "NEMO SB600 IOB base %08llx\n", res.start);

Thanks,
Christian
Re: Can people please check this patch out for cross-arch issues
On Mon, 9 Jul 2018 16:57:46 -0700 Linus Torvalds wrote: > We have a funny new bugzilla entry for an issue that is 12 years old, > and not really all that critical, but one that *can* become a problem > for people who do very specific things. > > What happens is that 'fork()' will cause a re-try (with > ERESTARTNOINTR) if a signal has come in between the point where the > fork() started, but before we add the process to the process table. > The reason is that the signal possibly *should* affect the new child > process too, but it was never queued to it because we were obviously > in the process of creating it. > > That's normally entirely a non-issue, and I don't think anybody ever > imagined it could matter in practice, but apparently there are loads > where this becomes problematic. > > See kernel bugzilla at > > https://bugzilla.kernel.org/show_bug.cgi?id=200447 > > which has a trial balloon patch for this issue already, to at least > limit that retry to only signals that actually might affect the child > (ie not any random signals sent explicitly and directly only to the > parent process). > > HOWEVER. > > The very first thing I noticed while looking at this was that one of > the more expensive parts of the fork() retry is actually marking all > the parent page tables read-only. Now, it's one of _many_ expensive > parts, and it's not nearly as expensive as all the reference counting > we do for each page, but it's actually very easy to avoid. When we > have repeated fork() calls, there's just no point in repeatedly > marking pages read-only. > > This the attached one-liner patch. > > The reason I'm sending it to the arch people is that while this is a > totally trivial patch on x86 ("pte_write()" literally tests exactly > the same bit that "pte_wrprotect()" and "ptep_set_wrprotect()" > clears), the same is not necessarily always true on other > architectures. 
> > Some other architectures make "ptep_set_wrprotect()" do more than just > clear the one bit we test with "pte_write()". > > Honestly, I don't think it could possibly matter: if "pte_write()" > returns false, then whatever "ptep_set_writeprotect()" does can not > really matter (at least for a COW mapping). But I thought I'd send > this out for comments anyway, despite how trivial it looks. > > So. Comments? Looks good to me (after the huge page bits are done?) If pte_write returns false here, surely any later write access has to fault otherwise we've got bigger problems right? powerpc/64/radix is pretty unsurprising so no problems there (it just modifies the pte, so shouldn't change anything in this case). Hash will actually schedule the hash table entry to be invalidated, but it can't be writable. > It doesn't make a huge difference, honestly, and the real fix for the > "retry too eagerly" is completely different, but at the same time this > one-liner trivial fix does feel like the RightThing(tm) regardless. Acked-by: Nicholas Piggin
Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100
On Thu, 7 Jun 2018 23:03:23 -0600 Alex Williamson wrote: > On Fri, 8 Jun 2018 14:14:23 +1000 > Alexey Kardashevskiy wrote: > > > On 8/6/18 1:44 pm, Alex Williamson wrote: > > > On Fri, 8 Jun 2018 13:08:54 +1000 > > > Alexey Kardashevskiy wrote: > > > > > >> On 8/6/18 8:15 am, Alex Williamson wrote: > > >>> On Fri, 08 Jun 2018 07:54:02 +1000 > > >>> Benjamin Herrenschmidt wrote: > > >>> > > On Thu, 2018-06-07 at 11:04 -0600, Alex Williamson wrote: > > > > > > Can we back up and discuss whether the IOMMU grouping of NVLink > > > connected devices makes sense? AIUI we have a PCI view of these > > > devices and from that perspective they're isolated. That's the view > > > of > > > the device used to generate the grouping. However, not visible to us, > > > these devices are interconnected via NVLink. What isolation > > > properties > > > does NVLink provide given that its entire purpose for existing seems > > > to > > > be to provide a high performance link for p2p between devices? > > > > Not entire. On POWER chips, we also have an nvlink between the device > > and the CPU which is running significantly faster than PCIe. > > > > But yes, there are cross-links and those should probably be accounted > > for in the grouping. > > >>> > > >>> Then after we fix the grouping, can we just let the host driver manage > > >>> this coherent memory range and expose vGPUs to guests? The use case of > > >>> assigning all 6 GPUs to one VM seems pretty limited. (Might need to > > >>> convince NVIDIA to support more than a single vGPU per VM though) > > >> > > >> These are physical GPUs, not virtual sriov-alike things they are > > >> implementing as well elsewhere. > > > > > > vGPUs as implemented on M- and P-series Teslas aren't SR-IOV like > > > either. That's why we have mdev devices now to implement software > > > defined devices. I don't have first hand experience with V-series, but > > > I would absolutely expect a PCIe-based Tesla V100 to support vGPU. 
> > > > So assuming V100 can do vGPU, you are suggesting ditching this patchset and > > using mediated vGPUs instead, correct? > > If it turns out that our PCIe-only-based IOMMU grouping doesn't > account for lack of isolation on the NVLink side and we correct that, > limiting assignment to sets of 3 interconnected GPUs, is that still a > useful feature? OTOH, it's entirely an NVIDIA proprietary decision > whether they choose to support vGPU on these GPUs or whether they can > be convinced to support multiple vGPUs per VM. > > > >> My current understanding is that every P9 chip in that box has some > > >> NVLink2 > > >> logic on it so each P9 is directly connected to 3 GPUs via PCIe and > > >> 2xNVLink2, and GPUs in that big group are interconnected by NVLink2 links > > >> as well. > > >> > > >> From small bits of information I have it seems that a GPU can perfectly > > >> work alone and if the NVIDIA driver does not see these interconnects > > >> (because we do not pass the rest of the big 3xGPU group to this guest), > > >> it > > >> continues with a single GPU. There is an "nvidia-smi -r" big reset hammer > > >> which simply refuses to work until all 3 GPUs are passed so there is some > > >> distinction between passing 1 or 3 GPUs, and I am trying (as we speak) to > > >> get a confirmation from NVIDIA that it is ok to pass just a single GPU. > > >> > > >> So we will either have 6 groups (one per GPU) or 2 groups (one per > > >> interconnected group). > > > > > > I'm not gaining much confidence that we can rely on isolation between > > > NVLink connected GPUs, it sounds like you're simply expecting that > > > proprietary code from NVIDIA on a proprietary interconnect from NVIDIA > > > is going to play nice and nobody will figure out how to do bad things > > > because... obfuscation? 
Thanks, > > > > Well, we already believe that a proprietary firmware of a sriov-capable > > adapter like Mellanox ConnextX is not doing bad things, how is this > > different in principle? > > It seems like the scope and hierarchy are different. Here we're > talking about exposing big discrete devices, which are peers of one > another (and have history of being reverse engineered), to userspace > drivers. Once handed to userspace, each of those devices needs to be > considered untrusted. In the case of SR-IOV, we typically have a > trusted host driver for the PF managing untrusted VFs. We do rely on > some sanity in the hardware/firmware in isolating the VFs from each > other and from the PF, but we also often have source code for Linux > drivers for these devices and sometimes even datasheets. Here we have > neither of those and perhaps we won't know the extent of the lack of > isolation between these devices until nouveau (best case) or some > exploit (worst case) exposes it. IOMMU grouping always assumes a lack > of
[PATCH] powerpc: Replaced msleep with usleep_range
Replaced msleep with usleep_range for delays of less than 10ms, because msleep will often sleep longer than intended. For the original explanation see: Documentation/timers/timers-howto.txt

Signed-off-by: Daniel Klamt
Signed-off-by: Bjoern Noetel
---
 arch/powerpc/sysdev/xive/native.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 311185b9960a..b164b1cdf4d6 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -109,7 +109,7 @@ int xive_native_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq)
 		rc = opal_xive_set_irq_config(hw_irq, target, prio, sw_irq);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100)
 	}
 	return rc == 0 ? 0 : -ENXIO;
 }
@@ -163,7 +163,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q *q, u8 prio,
 		rc = opal_xive_set_queue_info(vp_id, prio, qpage_phys, order, flags);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100);
 	}
 	if (rc) {
 		pr_err("Error %lld setting queue for prio %d\n", rc, prio);
@@ -190,7 +190,7 @@ static void __xive_native_disable_queue(u32 vp_id, struct xive_q *q, u8 prio)
 		rc = opal_xive_set_queue_info(vp_id, prio, 0, 0, 0);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100);
 	}
 	if (rc)
 		pr_err("Error %lld disabling queue for prio %d\n", rc, prio);
@@ -253,7 +253,7 @@ static int xive_native_get_ipi(unsigned int cpu, struct xive_cpu *xc)
 	for (;;) {
 		irq = opal_xive_allocate_irq(chip_id);
 		if (irq == OPAL_BUSY) {
-			msleep(1);
+			usleep_range(1000, 1100);
 			continue;
 		}
 		if (irq < 0) {
@@ -275,7 +275,7 @@ u32 xive_native_alloc_irq(void)
 		rc = opal_xive_allocate_irq(OPAL_XIVE_ANY_CHIP);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100);
 	}
 	if (rc < 0)
 		return 0;
@@ -289,7 +289,7 @@ void xive_native_free_irq(u32 irq)
 		s64 rc = opal_xive_free_irq(irq);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100);
 	}
 }
 EXPORT_SYMBOL_GPL(xive_native_free_irq);
@@ -305,7 +305,7 @@ static void xive_native_put_ipi(unsigned int cpu, struct xive_cpu *xc)
 	for (;;) {
 		rc = opal_xive_free_irq(xc->hw_ipi);
 		if (rc == OPAL_BUSY) {
-			msleep(1);
+			usleep_range(1000, 1100);
 			continue;
 		}
 		xc->hw_ipi = 0;
@@ -400,7 +400,7 @@ static void xive_native_setup_cpu(unsigned int cpu, struct xive_cpu *xc)
 		rc = opal_xive_set_vp_info(vp, OPAL_XIVE_VP_ENABLED, 0);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100);
 	}
 	if (rc) {
 		pr_err("Failed to enable pool VP on CPU %d\n", cpu);
@@ -444,7 +444,7 @@ static void xive_native_teardown_cpu(unsigned int cpu, struct xive_cpu *xc)
 		rc = opal_xive_set_vp_info(vp, 0, 0);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100);
 	}
 }

@@ -645,7 +645,7 @@ u32 xive_native_alloc_vp_block(u32 max_vcpus)
 		rc = opal_xive_alloc_vp_block(order);
 		switch (rc) {
 		case OPAL_BUSY:
-			msleep(1);
+			usleep_range(1000, 1100);
 			break;
 		case OPAL_XIVE_PROVISIONING:
 			if (!xive_native_provision_pages())
@@ -687,7 +687,7 @@ int xive_native_enable_vp(u32 vp_id, bool single_escalation)
 		rc = opal_xive_set_vp_info(vp_id, flags, 0);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100);
 	}
 	return rc ? -EIO : 0;
 }
@@ -701,7 +701,7 @@ int xive_native_disable_vp(u32 vp_id)
 		rc = opal_xive_set_vp_info(vp_id, 0, 0);
 		if (rc != OPAL_BUSY)
 			break;
-		msleep(1);
+		usleep_range(1000, 1100);
 	}
 	return rc ? -EIO : 0;
 }
-- 
2.11.0
Re: [PATCH] Documentation: Add powerpc options for spec_store_bypass_disable
On Mon, Jul 9, 2018 at 7:08 PM, Michael Ellerman wrote: > Document the support for spec_store_bypass_disable that was added for > powerpc in commit a048a07d7f45 ("powerpc/64s: Add support for a store > forwarding barrier at kernel entry/exit"). > > Signed-off-by: Michael Ellerman Reviewed-by: Kees Cook -Kees > --- > Documentation/admin-guide/kernel-parameters.txt | 16 +--- > 1 file changed, 13 insertions(+), 3 deletions(-) > > I tried documenting the differences between the PPC options and X86 ones in > one > section, but it got quite messy, so I went with this instead. Happy to take > advice on how better to structure it if anyone has opinions. > > diff --git a/Documentation/admin-guide/kernel-parameters.txt > b/Documentation/admin-guide/kernel-parameters.txt > index efc7aa7a0670..f320c7168b04 100644 > --- a/Documentation/admin-guide/kernel-parameters.txt > +++ b/Documentation/admin-guide/kernel-parameters.txt > @@ -4060,6 +4060,8 @@ > This parameter controls whether the Speculative Store > Bypass optimization is used. > > + On x86 the options are: > + > on - Unconditionally disable Speculative Store > Bypass > off - Unconditionally enable Speculative Store > Bypass > auto- Kernel detects whether the CPU model > contains an > @@ -4075,12 +4077,20 @@ > seccomp - Same as "prctl" above, but all seccomp > threads > will disable SSB unless they explicitly opt > out. > > - Not specifying this option is equivalent to > - spec_store_bypass_disable=auto. > - > Default mitigations: > X86:If CONFIG_SECCOMP=y "seccomp", otherwise > "prctl" > > + On powerpc the options are: > + > + on,auto - On Power8 and Power9 insert a > store-forwarding > + barrier on kernel entry and exit. On Power7 > + perform a software flush on kernel entry and > + exit. > + off - No action. > + > + Not specifying this option is equivalent to > + spec_store_bypass_disable=auto. > + > spia_io_base= [HW,MTD] > spia_fio_base= > spia_pedr= > -- > 2.14.1 > -- Kees Cook Pixel Security
Re: [kbuild-all] [PATCH v6 2/4] resource: Use list_head to link sibling resource
On 07/10/18 at 08:59am, Ye Xiaolong wrote: > Hi, > > On 07/08, Baoquan He wrote: > >Hi, > > > >On 07/05/18 at 01:00am, kbuild test robot wrote: > >> Hi Baoquan, > >> > >> I love your patch! Yet something to improve: > >> > >> [auto build test ERROR on linus/master] > >> [also build test ERROR on v4.18-rc3 next-20180704] > >> [if your patch is applied to the wrong git tree, please drop us a note to > >> help improve the system] > > > >Thanks for telling. > > > >I cloned 0day-ci/linut to my local pc. > >https://github.com/0day-ci/linux.git > > > >However, I didn't find below branch. And tried to open it in web > >broswer, also failed. > > > > Sorry for the inconvenience, 0day bot didn't push the branch to github > successfully, > Just push it manually, you can have a try again. Thanks, Xiaolong, I have applied them on top of linux-next/master, and copy the config file attached, and run the command to reproduce as suggested. Now I have fixed all those issues reported, will repost. > > > > >> url: > >> https://github.com/0day-ci/linux/commits/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180704-121402 > >> config: mips-rb532_defconfig (attached as .config) > >> compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 > >> reproduce: > >> wget > >> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross > >> -O ~/bin/make.cross > >> chmod +x ~/bin/make.cross > >> # save the attached .config to linux build tree > >> GCC_VERSION=7.2.0 make.cross ARCH=mips > > > >I did find a old one which is for the old version 5 post. > > > >[bhe@linux]$ git remote -v > >0day-ci https://github.com/0day-ci/linux.git (fetch) > >0day-ci https://github.com/0day-ci/linux.git (push) > >[bhe@dhcp-128-28 linux]$ git branch -a| grep Baoquan| grep resource > > > > remotes/0day-ci/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180612-113600 > > > >Could you help have a look at this? 
> > > >Thanks > >Baoquan > > > >> > >> All error/warnings (new ones prefixed by >>): > >> > >> >> arch/mips/pci/pci-rc32434.c:57:11: error: initialization from > >> >> incompatible pointer type [-Werror=incompatible-pointer-types] > >> .child = _res_pci_mem2 > >> ^ > >>arch/mips/pci/pci-rc32434.c:57:11: note: (near initialization for > >> 'rc32434_res_pci_mem1.child.next') > >> >> arch/mips/pci/pci-rc32434.c:51:47: warning: missing braces around > >> >> initializer [-Wmissing-braces] > >> static struct resource rc32434_res_pci_mem1 = { > >> ^ > >>arch/mips/pci/pci-rc32434.c:60:47: warning: missing braces around > >> initializer [-Wmissing-braces] > >> static struct resource rc32434_res_pci_mem2 = { > >> ^ > >>cc1: some warnings being treated as errors > >> > >> vim +57 arch/mips/pci/pci-rc32434.c > >> > >> 73b4390f Ralf Baechle 2008-07-16 50 > >> 73b4390f Ralf Baechle 2008-07-16 @51 static struct resource > >> rc32434_res_pci_mem1 = { > >> 73b4390f Ralf Baechle 2008-07-16 52 .name = "PCI MEM1", > >> 73b4390f Ralf Baechle 2008-07-16 53 .start = 0x5000, > >> 73b4390f Ralf Baechle 2008-07-16 54 .end = 0x5FFF, > >> 73b4390f Ralf Baechle 2008-07-16 55 .flags = IORESOURCE_MEM, > >> 73b4390f Ralf Baechle 2008-07-16 56 .sibling = NULL, > >> 73b4390f Ralf Baechle 2008-07-16 @57 .child = _res_pci_mem2 > >> 73b4390f Ralf Baechle 2008-07-16 58 }; > >> 73b4390f Ralf Baechle 2008-07-16 59 > >> > >> :: The code at line 57 was first introduced by commit > >> :: 73b4390fb23456964201abda79f1210fe337d01a [MIPS] Routerboard 532: > >> Support for base system > >> > >> :: TO: Ralf Baechle > >> :: CC: Ralf Baechle > >> > >> --- > >> 0-DAY kernel test infrastructureOpen Source Technology > >> Center > >> https://lists.01.org/pipermail/kbuild-all Intel > >> Corporation > > > > > >___ > >kbuild-all mailing list > >kbuild-...@lists.01.org > >https://lists.01.org/mailman/listinfo/kbuild-all
[PATCH] Documentation: Add powerpc options for spec_store_bypass_disable
Document the support for spec_store_bypass_disable that was added for powerpc in commit a048a07d7f45 ("powerpc/64s: Add support for a store forwarding barrier at kernel entry/exit"). Signed-off-by: Michael Ellerman --- Documentation/admin-guide/kernel-parameters.txt | 16 +--- 1 file changed, 13 insertions(+), 3 deletions(-) I tried documenting the differences between the PPC options and X86 ones in one section, but it got quite messy, so I went with this instead. Happy to take advice on how better to structure it if anyone has opinions. diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index efc7aa7a0670..f320c7168b04 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -4060,6 +4060,8 @@ This parameter controls whether the Speculative Store Bypass optimization is used. + On x86 the options are: + on - Unconditionally disable Speculative Store Bypass off - Unconditionally enable Speculative Store Bypass auto- Kernel detects whether the CPU model contains an @@ -4075,12 +4077,20 @@ seccomp - Same as "prctl" above, but all seccomp threads will disable SSB unless they explicitly opt out. - Not specifying this option is equivalent to - spec_store_bypass_disable=auto. - Default mitigations: X86:If CONFIG_SECCOMP=y "seccomp", otherwise "prctl" + On powerpc the options are: + + on,auto - On Power8 and Power9 insert a store-forwarding + barrier on kernel entry and exit. On Power7 + perform a software flush on kernel entry and + exit. + off - No action. + + Not specifying this option is equivalent to + spec_store_bypass_disable=auto. + spia_io_base= [HW,MTD] spia_fio_base= spia_pedr= -- 2.14.1
Re: [PATCH 1/3] [v2] powerpc: mac: fix rtc read/write functions
On Mon, 9 Jul 2018, Arnd Bergmann wrote:

> The most likely explanation I have here is that the RTC was indeed set
> to an incorrect date, either because of a depleted battery (not unlikely
> for a ~15 year old box) or because it was previously stored incorrectly.

The PowerMac stores the GMT offset in NVRAM, and this gets used to initialize timezone_offset. If timezone_offset was negative and now.tv_sec was zero, I think this could store a 1969 date in the RTC:

int update_persistent_clock64(struct timespec64 now)
{
	struct rtc_time tm;

	if (!ppc_md.set_rtc_time)
		return -ENODEV;

	rtc_time64_to_tm(now.tv_sec + 1 + timezone_offset, &tm);

	return ppc_md.set_rtc_time(&tm);
}

But maybe now.tv_sec can be shown to be greater than timezone_offset. Then, what would happen when the timezone in /etc/localtime disagrees with the timezone_offset stored in NVRAM (PRAM)?

Besides that, if the battery went flat, what use is a backtrace? Why not scrap the WARN_ON()?

--
Re: [kbuild-all] [PATCH v6 2/4] resource: Use list_head to link sibling resource
Hi, On 07/08, Baoquan He wrote: >Hi, > >On 07/05/18 at 01:00am, kbuild test robot wrote: >> Hi Baoquan, >> >> I love your patch! Yet something to improve: >> >> [auto build test ERROR on linus/master] >> [also build test ERROR on v4.18-rc3 next-20180704] >> [if your patch is applied to the wrong git tree, please drop us a note to >> help improve the system] > >Thanks for telling. > >I cloned 0day-ci/linut to my local pc. >https://github.com/0day-ci/linux.git > >However, I didn't find below branch. And tried to open it in web >broswer, also failed. > Sorry for the inconvenience, 0day bot didn't push the branch to github successfully, Just push it manually, you can have a try again. Thanks, Xiaolong > >> url: >> https://github.com/0day-ci/linux/commits/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180704-121402 >> config: mips-rb532_defconfig (attached as .config) >> compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 >> reproduce: >> wget >> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O >> ~/bin/make.cross >> chmod +x ~/bin/make.cross >> # save the attached .config to linux build tree >> GCC_VERSION=7.2.0 make.cross ARCH=mips > >I did find a old one which is for the old version 5 post. > >[bhe@linux]$ git remote -v >0day-cihttps://github.com/0day-ci/linux.git (fetch) >0day-cihttps://github.com/0day-ci/linux.git (push) >[bhe@dhcp-128-28 linux]$ git branch -a| grep Baoquan| grep resource > > remotes/0day-ci/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180612-113600 > >Could you help have a look at this? 
> >Thanks >Baoquan > >> >> All error/warnings (new ones prefixed by >>): >> >> >> arch/mips/pci/pci-rc32434.c:57:11: error: initialization from >> >> incompatible pointer type [-Werror=incompatible-pointer-types] >> .child = _res_pci_mem2 >> ^ >>arch/mips/pci/pci-rc32434.c:57:11: note: (near initialization for >> 'rc32434_res_pci_mem1.child.next') >> >> arch/mips/pci/pci-rc32434.c:51:47: warning: missing braces around >> >> initializer [-Wmissing-braces] >> static struct resource rc32434_res_pci_mem1 = { >> ^ >>arch/mips/pci/pci-rc32434.c:60:47: warning: missing braces around >> initializer [-Wmissing-braces] >> static struct resource rc32434_res_pci_mem2 = { >> ^ >>cc1: some warnings being treated as errors >> >> vim +57 arch/mips/pci/pci-rc32434.c >> >> 73b4390f Ralf Baechle 2008-07-16 50 >> 73b4390f Ralf Baechle 2008-07-16 @51 static struct resource >> rc32434_res_pci_mem1 = { >> 73b4390f Ralf Baechle 2008-07-16 52 .name = "PCI MEM1", >> 73b4390f Ralf Baechle 2008-07-16 53 .start = 0x5000, >> 73b4390f Ralf Baechle 2008-07-16 54 .end = 0x5FFF, >> 73b4390f Ralf Baechle 2008-07-16 55 .flags = IORESOURCE_MEM, >> 73b4390f Ralf Baechle 2008-07-16 56 .sibling = NULL, >> 73b4390f Ralf Baechle 2008-07-16 @57 .child = _res_pci_mem2 >> 73b4390f Ralf Baechle 2008-07-16 58 }; >> 73b4390f Ralf Baechle 2008-07-16 59 >> >> :: The code at line 57 was first introduced by commit >> :: 73b4390fb23456964201abda79f1210fe337d01a [MIPS] Routerboard 532: >> Support for base system >> >> :: TO: Ralf Baechle >> :: CC: Ralf Baechle >> >> --- >> 0-DAY kernel test infrastructureOpen Source Technology Center >> https://lists.01.org/pipermail/kbuild-all Intel Corporation > > >___ >kbuild-all mailing list >kbuild-...@lists.01.org >https://lists.01.org/mailman/listinfo/kbuild-all
Re: [PATCH] powerpc: Replaced msleep with usleep_range
On Mon, 2018-07-09 at 15:57 +0200, Daniel Klamt wrote: > Replaced msleep for less than 10ms with usleep_range because will > often sleep longer than intended. > For original explanation see: > Documentation/timers/timers-howto.txt Why ? This is pointless. The original code is smaller and more readable. We don't care how long it actually sleeps, this is the FW telling us it's busy (or the HW is), come back a bit later. Ben. > Signed-off-by: Daniel Klamt > Signed-off-by: Bjoern Noetel > --- > arch/powerpc/sysdev/xive/native.c | 24 > 1 file changed, 12 insertions(+), 12 deletions(-) > > diff --git a/arch/powerpc/sysdev/xive/native.c > b/arch/powerpc/sysdev/xive/native.c > index 311185b9960a..b164b1cdf4d6 100644 > --- a/arch/powerpc/sysdev/xive/native.c > +++ b/arch/powerpc/sysdev/xive/native.c > @@ -109,7 +109,7 @@ int xive_native_configure_irq(u32 hw_irq, u32 target, u8 > prio, u32 sw_irq) > rc = opal_xive_set_irq_config(hw_irq, target, prio, sw_irq); > if (rc != OPAL_BUSY) > break; > - msleep(1); > + usleep_range(1000, 1100) > } > return rc == 0 ? 
0 : -ENXIO; > } > @@ -163,7 +163,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q > *q, u8 prio, > rc = opal_xive_set_queue_info(vp_id, prio, qpage_phys, order, > flags); > if (rc != OPAL_BUSY) > break; > - msleep(1); > + usleep_range(1000, 1100); > } > if (rc) { > pr_err("Error %lld setting queue for prio %d\n", rc, prio); > @@ -190,7 +190,7 @@ static void __xive_native_disable_queue(u32 vp_id, struct > xive_q *q, u8 prio) > rc = opal_xive_set_queue_info(vp_id, prio, 0, 0, 0); > if (rc != OPAL_BUSY) > break; > - msleep(1); > + usleep_range(1000, 1100); > } > if (rc) > pr_err("Error %lld disabling queue for prio %d\n", rc, prio); > @@ -253,7 +253,7 @@ static int xive_native_get_ipi(unsigned int cpu, struct > xive_cpu *xc) > for (;;) { > irq = opal_xive_allocate_irq(chip_id); > if (irq == OPAL_BUSY) { > - msleep(1); > + usleep_range(1000, 1100); > continue; > } > if (irq < 0) { > @@ -275,7 +275,7 @@ u32 xive_native_alloc_irq(void) > rc = opal_xive_allocate_irq(OPAL_XIVE_ANY_CHIP); > if (rc != OPAL_BUSY) > break; > - msleep(1); > + usleep_range(1000, 1100); > } > if (rc < 0) > return 0; > @@ -289,7 +289,7 @@ void xive_native_free_irq(u32 irq) > s64 rc = opal_xive_free_irq(irq); > if (rc != OPAL_BUSY) > break; > - msleep(1); > + usleep_range(1000, 1100); > } > } > EXPORT_SYMBOL_GPL(xive_native_free_irq); > @@ -305,7 +305,7 @@ static void xive_native_put_ipi(unsigned int cpu, struct > xive_cpu *xc) > for (;;) { > rc = opal_xive_free_irq(xc->hw_ipi); > if (rc == OPAL_BUSY) { > - msleep(1); > + usleep_range(1000, 1100); > continue; > } > xc->hw_ipi = 0; > @@ -400,7 +400,7 @@ static void xive_native_setup_cpu(unsigned int cpu, > struct xive_cpu *xc) > rc = opal_xive_set_vp_info(vp, OPAL_XIVE_VP_ENABLED, 0); > if (rc != OPAL_BUSY) > break; > - msleep(1); > + usleep_range(1000, 1100); > } > if (rc) { > pr_err("Failed to enable pool VP on CPU %d\n", cpu); > @@ -444,7 +444,7 @@ static void xive_native_teardown_cpu(unsigned int cpu, > struct xive_cpu *xc) > rc = 
opal_xive_set_vp_info(vp, 0, 0); > if (rc != OPAL_BUSY) > break; > - msleep(1); > + usleep_range(1000, 1100); > } > } > > @@ -645,7 +645,7 @@ u32 xive_native_alloc_vp_block(u32 max_vcpus) > rc = opal_xive_alloc_vp_block(order); > switch (rc) { > case OPAL_BUSY: > - msleep(1); > + usleep_range(1000, 1100); > break; > case OPAL_XIVE_PROVISIONING: > if (!xive_native_provision_pages()) > @@ -687,7 +687,7 @@ int xive_native_enable_vp(u32 vp_id, bool > single_escalation) > rc = opal_xive_set_vp_info(vp_id, flags, 0); > if (rc != OPAL_BUSY) > break; > - msleep(1); > + usleep_range(1000, 1100); > } > return rc ? -EIO : 0; > } > @@ -701,7 +701,7 @@ int xive_native_disable_vp(u32 vp_id) > rc = opal_xive_set_vp_info(vp_id, 0, 0); > if (rc
Re: NXP p1010se device trees only correct for P1010E/P1014E, not P1010/P1014 SoCs.
On Mon, 2018-07-09 at 09:38 +0100, Tim Small wrote: > On 06/07/18 19:41, Scott Wood wrote: > > > My openwrt patch > > > just does a: > > > > > > /delete-node/ crypto@3; > > > > > > after the p1010si-post.dtsi include. > > > > U-Boot should already be removing the node on non-E chips -- see > > ft_cpu_setup() in arch/powerpc/cpu/mpc85xx/fdt.c > > > Hi Scott, > > Thanks for your email. The device in question ships an old uboot (a > vendor fork of U-Boot 2010.12-svn15934). This was added by commit 6b70ffb9d1b2e, committed in July 2008... maybe there's a problem with the old U-Boot finding the crypto node on this particular chip? > I am right in saying that the right fix is to either: > > Use a bootloader (such as current upstream uboot) which adjusts the > device tree properly... > > or: > > In the case (such as OpenWRT) where the preferred installation method is > to retain the vendor bootloader, then the distro kernel should handle > the device tree fixup itself? The NXP PPC device trees in the kernel are meant to be completed by U-Boot (years ago I repeatedly suggested that the trees be moved into the U-Boot source to reflect this, but nobody seemed interested). Generally that is mainline and NXP SDK U-Boot, but a board dts file might cater to some other U-Boot fork (or other bootloader) if that's what ships on the board. Does this hardware have a board dts file in mainline Linux? -Scott
Re: [PATCH 1/3] [v2] powerpc: mac: fix rtc read/write functions
On Sun, Jul 1, 2018 at 5:47 PM, Meelis Roos wrote:

> A patch for the subject is now upstream. That made me finally take some
> time to test it on my PowerMac G4. The date is OK but I get two warnings
> with backtrace on bootup. Full dmesg below.

Thanks for testing this, and sorry for the slow reply.

> [4.026490] WARNING: CPU: 0 PID: 1 at arch/powerpc/platforms/powermac/time.c:154 pmu_get_time+0x7c/0xc8
> [4.032261] Modules linked in:
> [4.037878] CPU: 0 PID: 1 Comm: swapper Tainted: G W 4.18.0-rc2-00223-g1904148a361a #88
> [4.043750] NIP: c0021354 LR: c0021308 CTR:
> [4.049585] REGS: ef047cd0 TRAP: 0700 Tainted: G W (4.18.0-rc2-00223-g1904148a361a)
> [4.055572] MSR: 00029032 CR: 44000222 XER: 2000
> [4.061620]
> GPR00: c0021308 ef047d80 ef048000 00d7029c 0004 0001 009c
> GPR08: 00d7 0001 0200 c06a 24000228 c0004c9c
> GPR16: c067 c0601a38
> GPR24: 0008 c0630f18 c062a40c c05fc10c ef047e50 ef273800 ef047e50 ef047e50
> [4.092393] NIP [c0021354] pmu_get_time+0x7c/0xc8
> [4.098596] LR [c0021308] pmu_get_time+0x30/0xc8

I don't see how the WARN_ON() triggered unless the PMU time is actually before 1970.
> [4.104779] Call Trace:
> [4.110909] [ef047d80] [c0021308] pmu_get_time+0x30/0xc8 (unreliable)
> [4.117209] [ef047df0] [c00213e8] pmac_get_rtc_time+0x28/0x40
> [4.123470] [ef047e00] [c000bc04] rtc_generic_get_time+0x20/0x34
> [4.129770] [ef047e10] [c03aca34] __rtc_read_time+0x5c/0xe0
> [4.136060] [ef047e20] [c03acafc] rtc_read_time+0x44/0x7c
> [4.142356] [ef047e40] [c061e000] rtc_hctosys+0x64/0x11c
> [4.148616] [ef047ea0] [c0004aa4] do_one_initcall+0x4c/0x1a8
> [4.154866] [ef047f00] [c06022f0] kernel_init_freeable+0x12c/0x1f4
> [4.161123] [ef047f30] [c0004cb4] kernel_init+0x18/0x130
> [4.167359] [ef047f40] [c00121c4] ret_from_kernel_thread+0x14/0x1c
> [4.173610] Instruction dump:
> [4.179766] 8941002e 5484c00e 5508801e 88e1002f 7c844214 554a402e 7c845214 7c843a14
> [4.186076] 7d244810 7d294910 7d2948f8 552907fe <0f09> 3d2083da 80010074 38210070
> [4.192388] ---[ end trace 2e01ad9337fe08fd ]---
> [4.198643] rtc-generic rtc-generic: hctosys: unable to read the hardware clock

The last message here happens exactly in that case as well: tm_year is before 1970:

int rtc_valid_tm(struct rtc_time *tm)
{
	if (tm->tm_year < 70 ||
	    ((unsigned)tm->tm_mon) >= 12 ||
	    tm->tm_mday < 1 ||
	    tm->tm_mday > rtc_month_days(tm->tm_mon, tm->tm_year + 1900) ||
	    ((unsigned)tm->tm_hour) >= 24 ||
	    ((unsigned)tm->tm_min) >= 60 ||
	    ((unsigned)tm->tm_sec) >= 60)
		return -EINVAL;

	return 0;
}

The most likely explanation I have here is that the RTC was indeed set to an incorrect date, either because of a depleted battery (not unlikely for a ~15 year old box) or because it was previously stored incorrectly. You say that the time is correct, but that could also be the case if the machine is connected to the network and synchronized using NTP. It should not have gotten the time from the RTC after that error.
If you have the time to do another test, can you boot the machine with its network disconnected, see if the warning persists (it may have been repaired after the correct time got written into the RTC during shutdown), what the output of 'sudo hwclock' is, and whether anything changes after you set the correct time using 'hwclock --systohc' and reboot? Arnd
Re: [PATCH] powerpc: Replaced msleep with usleep_range
Hi Daniel, Thank you for the patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on v4.18-rc4 next-20180709] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Daniel-Klamt/powerpc-Replaced-msleep-with-usleep_range/20180709-231913 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-defconfig (attached as .config) compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=powerpc All errors (new ones prefixed by >>): arch/powerpc/sysdev/xive/native.c: In function 'xive_native_configure_irq': >> arch/powerpc/sysdev/xive/native.c:113:2: error: expected ';' before '}' token } ^ vim +113 arch/powerpc/sysdev/xive/native.c 243e2511 Benjamin Herrenschmidt 2017-04-05 103 243e2511 Benjamin Herrenschmidt 2017-04-05 104 int xive_native_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq) 243e2511 Benjamin Herrenschmidt 2017-04-05 105 { 243e2511 Benjamin Herrenschmidt 2017-04-05 106 s64 rc; 243e2511 Benjamin Herrenschmidt 2017-04-05 107 243e2511 Benjamin Herrenschmidt 2017-04-05 108 for (;;) { 243e2511 Benjamin Herrenschmidt 2017-04-05 109 rc = opal_xive_set_irq_config(hw_irq, target, prio, sw_irq); 243e2511 Benjamin Herrenschmidt 2017-04-05 110 if (rc != OPAL_BUSY) 243e2511 Benjamin Herrenschmidt 2017-04-05 111 break; c332c793 Daniel Klamt 2018-07-09 112 usleep_range(1000, 1100) 243e2511 Benjamin Herrenschmidt 2017-04-05 @113 } 243e2511 Benjamin Herrenschmidt 2017-04-05 114 return rc == 0 ? 
0 : -ENXIO;
243e2511 Benjamin Herrenschmidt 2017-04-05  115  }
5af50993 Benjamin Herrenschmidt 2017-04-05  116  EXPORT_SYMBOL_GPL(xive_native_configure_irq);
5af50993 Benjamin Herrenschmidt 2017-04-05  117
243e2511 Benjamin Herrenschmidt 2017-04-05  118

:: The code at line 113 was first introduced by commit
:: 243e25112d06b348f087a6f7aba4bbc288285bdd powerpc/xive: Native exploitation of the XIVE interrupt controller

:: TO: Benjamin Herrenschmidt
:: CC: Michael Ellerman

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Re: [next-20180709][bisected 9cf57731][ppc] build fail with ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734
On Mon, 2018-07-09 at 13:47 +0200, Peter Zijlstra wrote:
> On Mon, Jul 09, 2018 at 03:21:23PM +0530, Abdul Haleem wrote:
> > Greeting's
> >
> > Today's next fails to build on powerpc with below error
> >
> > kernel/cpu.o:(.data.rel+0x18e0): undefined reference to `lockup_detector_online_cpu'
> > ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734
> > kernel/cpu.o:(.data.rel+0x18e8): undefined reference to `lockup_detector_offline_cpu'
> > ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734
> > Makefile:1005: recipe for target 'vmlinux' failed
> > make: *** [vmlinux] Error 1
>
> Urgh, sorry about that. I think the below should cure that.
>
> I got confused by all the various CONFIG options hereabouts and
> conflated CONFIG_LOCKUP_DETECTOR and CONFIG_SOFTLOCKUP_DETECTOR, it
> seems.
>
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index 80664bbeca43..08f9247e9827 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -33,15 +33,10 @@ extern int sysctl_hardlockup_all_cpu_backtrace;
>  #define sysctl_hardlockup_all_cpu_backtrace 0
>  #endif /* !CONFIG_SMP */
>
> -extern int lockup_detector_online_cpu(unsigned int cpu);
> -extern int lockup_detector_offline_cpu(unsigned int cpu);
> -
>  #else /* CONFIG_LOCKUP_DETECTOR */
>  static inline void lockup_detector_init(void) { }
>  static inline void lockup_detector_soft_poweroff(void) { }
>  static inline void lockup_detector_cleanup(void) { }
> -#define lockup_detector_online_cpu NULL
> -#define lockup_detector_offline_cpu NULL
>  #endif /* !CONFIG_LOCKUP_DETECTOR */
>
>  #ifdef CONFIG_SOFTLOCKUP_DETECTOR
> @@ -50,12 +45,18 @@ extern void touch_softlockup_watchdog(void);
>  extern void touch_softlockup_watchdog_sync(void);
>  extern void touch_all_softlockup_watchdogs(void);
>  extern unsigned int softlockup_panic;
> -#else
> +
> +extern int lockup_detector_online_cpu(unsigned int cpu);
> +extern int lockup_detector_offline_cpu(unsigned int cpu);
> +#else /* 
CONFIG_SOFTLOCKUP_DETECTOR */
>  static inline void touch_softlockup_watchdog_sched(void) { }
>  static inline void touch_softlockup_watchdog(void) { }
>  static inline void touch_softlockup_watchdog_sync(void) { }
>  static inline void touch_all_softlockup_watchdogs(void) { }
> -#endif
> +
> +#define lockup_detector_online_cpu NULL
> +#define lockup_detector_offline_cpu NULL
> +#endif /* CONFIG_SOFTLOCKUP_DETECTOR */
>
>  #ifdef CONFIG_DETECT_HUNG_TASK
>  void reset_hung_task_detector(void);

Thanks Peter for the patch, build and boot is fine.

Reported-and-tested-by: Abdul Haleem

--
Regards,
Abdul Haleem
IBM Linux Technology Centre
Re: [PATCH 1/2] mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
On 07/09/2018 05:19 AM, Marek Szyprowski wrote: cma_alloc() function doesn't really support gfp flags other than __GFP_NOWARN, so convert gfp_mask parameter to boolean no_warn parameter. This will help to avoid giving false feeling that this function supports standard gfp flags and callers can pass __GFP_ZERO to get zeroed buffer, what has already been an issue: see commit dd65a941f6ba ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag"). For Ion, Acked-by: Laura Abbott Signed-off-by: Marek Szyprowski --- arch/powerpc/kvm/book3s_hv_builtin.c | 2 +- drivers/s390/char/vmcp.c | 2 +- drivers/staging/android/ion/ion_cma_heap.c | 2 +- include/linux/cma.h| 2 +- kernel/dma/contiguous.c| 3 ++- mm/cma.c | 8 mm/cma_debug.c | 2 +- 7 files changed, 11 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c index d4a3f4da409b..fc6bb9630a9c 100644 --- a/arch/powerpc/kvm/book3s_hv_builtin.c +++ b/arch/powerpc/kvm/book3s_hv_builtin.c @@ -77,7 +77,7 @@ struct page *kvm_alloc_hpt_cma(unsigned long nr_pages) VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT); return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES), -GFP_KERNEL); +false); } EXPORT_SYMBOL_GPL(kvm_alloc_hpt_cma); diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c index 948ce82a7725..0fa1b6b1491a 100644 --- a/drivers/s390/char/vmcp.c +++ b/drivers/s390/char/vmcp.c @@ -68,7 +68,7 @@ static void vmcp_response_alloc(struct vmcp_session *session) * anymore the system won't work anyway. 
*/ if (order > 2) - page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL); + page = cma_alloc(vmcp_cma, nr_pages, 0, false); if (page) { session->response = (char *)page_to_phys(page); session->cma_alloc = 1; diff --git a/drivers/staging/android/ion/ion_cma_heap.c b/drivers/staging/android/ion/ion_cma_heap.c index 49718c96bf9e..3fafd013d80a 100644 --- a/drivers/staging/android/ion/ion_cma_heap.c +++ b/drivers/staging/android/ion/ion_cma_heap.c @@ -39,7 +39,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct ion_buffer *buffer, if (align > CONFIG_CMA_ALIGNMENT) align = CONFIG_CMA_ALIGNMENT; - pages = cma_alloc(cma_heap->cma, nr_pages, align, GFP_KERNEL); + pages = cma_alloc(cma_heap->cma, nr_pages, align, false); if (!pages) return -ENOMEM; diff --git a/include/linux/cma.h b/include/linux/cma.h index bf90f0bb42bd..190184b5ff32 100644 --- a/include/linux/cma.h +++ b/include/linux/cma.h @@ -33,7 +33,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, const char *name, struct cma **res_cma); extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, - gfp_t gfp_mask); + bool no_warn); extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count); extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data); diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c index d987dcd1bd56..19ea5d70150c 100644 --- a/kernel/dma/contiguous.c +++ b/kernel/dma/contiguous.c @@ -191,7 +191,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count, if (align > CONFIG_CMA_ALIGNMENT) align = CONFIG_CMA_ALIGNMENT; - return cma_alloc(dev_get_cma_area(dev), count, align, gfp_mask); + return cma_alloc(dev_get_cma_area(dev), count, align, +gfp_mask & __GFP_NOWARN); } /** diff --git a/mm/cma.c b/mm/cma.c index 5809bbe360d7..4cb76121a3ab 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -395,13 +395,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { } * @cma: 
Contiguous memory region for which the allocation is performed. * @count: Requested number of pages. * @align: Requested alignment of pages (in PAGE_SIZE order). - * @gfp_mask: GFP mask to use during compaction + * @no_warn: Avoid printing message about failed allocation * * This function allocates part of contiguous memory on specific * contiguous memory area. */ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, - gfp_t gfp_mask) + bool no_warn) { unsigned long mask, offset; unsigned long pfn = -1; @@ -447,7 +447,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
Re: [PATCH v6 8/8] powernv/pseries: consolidate code for mce early handling.
On Fri, 6 Jul 2018 19:40:24 +1000 Nicholas Piggin wrote: > On Wed, 04 Jul 2018 23:30:12 +0530 > Mahesh J Salgaonkar wrote: > > > From: Mahesh Salgaonkar > > > > Now that other platforms also implements real mode mce handler, > > lets consolidate the code by sharing existing powernv machine check > > early code. Rename machine_check_powernv_early to > > machine_check_common_early and reuse the code. > > > > Signed-off-by: Mahesh Salgaonkar > > --- > > arch/powerpc/kernel/exceptions-64s.S | 56 > > +++--- 1 file changed, 11 > > insertions(+), 45 deletions(-) > > > > diff --git a/arch/powerpc/kernel/exceptions-64s.S > > b/arch/powerpc/kernel/exceptions-64s.S index > > 0038596b7906..3e877ec55d50 100644 --- > > a/arch/powerpc/kernel/exceptions-64s.S +++ > > b/arch/powerpc/kernel/exceptions-64s.S @@ -243,14 +243,13 @@ > > EXC_REAL_BEGIN(machine_check, 0x200, 0x100) > > SET_SCRATCH0(r13) /* save r13 */ > > EXCEPTION_PROLOG_0(PACA_EXMC) BEGIN_FTR_SECTION > > - b machine_check_powernv_early > > + b machine_check_common_early > > FTR_SECTION_ELSE > > b machine_check_pSeries_0 > > ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) > > EXC_REAL_END(machine_check, 0x200, 0x100) > > EXC_VIRT_NONE(0x4200, 0x100) > > -TRAMP_REAL_BEGIN(machine_check_powernv_early) > > -BEGIN_FTR_SECTION > > +TRAMP_REAL_BEGIN(machine_check_common_early) > > EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200) > > /* > > * Register contents: > > @@ -306,7 +305,9 @@ BEGIN_FTR_SECTION > > /* Save r9 through r13 from EXMC save area to stack frame. > > */ EXCEPTION_PROLOG_COMMON_2(PACA_EXMC) > > mfmsr r11 /* get MSR value */ > > +BEGIN_FTR_SECTION > > ori r11,r11,MSR_ME /* turn on ME bit > > */ +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE) > > ori r11,r11,MSR_RI /* turn on RI bit > > */ LOAD_HANDLER(r12, machine_check_handle_early) > > 1: mtspr SPRN_SRR0,r12 > > @@ -325,7 +326,6 @@ BEGIN_FTR_SECTION > > andcr11,r11,r10 /* Turn off MSR_ME > > */ b1b > > b . 
/* prevent speculative execution */ > > -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE) > > > > TRAMP_REAL_BEGIN(machine_check_pSeries) > > .globl machine_check_fwnmi > > @@ -333,7 +333,7 @@ machine_check_fwnmi: > > SET_SCRATCH0(r13) /* save r13 */ > > EXCEPTION_PROLOG_0(PACA_EXMC) > > BEGIN_FTR_SECTION > > - b machine_check_pSeries_early > > + b machine_check_common_early > > END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE) > > machine_check_pSeries_0: > > EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200) > > @@ -346,45 +346,6 @@ machine_check_pSeries_0: > > > > TRAMP_KVM_SKIP(PACA_EXMC, 0x200) > > > > -TRAMP_REAL_BEGIN(machine_check_pSeries_early) > > -BEGIN_FTR_SECTION > > - EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200) > > - mr r10,r1 /* Save r1 */ > > - ld r1,PACAMCEMERGSP(r13) /* Use MC emergency > > stack */ > > - subir1,r1,INT_FRAME_SIZE/* alloc stack > > frame */ > > - mfspr r11,SPRN_SRR0 /* Save SRR0 */ > > - mfspr r12,SPRN_SRR1 /* Save SRR1 */ > > - EXCEPTION_PROLOG_COMMON_1() > > - EXCEPTION_PROLOG_COMMON_2(PACA_EXMC) > > - EXCEPTION_PROLOG_COMMON_3(0x200) > > - addir3,r1,STACK_FRAME_OVERHEAD > > - BRANCH_LINK_TO_FAR(machine_check_early) /* Function call > > ABI */ - > > - /* Move original SRR0 and SRR1 into the respective regs */ > > - ld r9,_MSR(r1) > > - mtspr SPRN_SRR1,r9 > > - ld r3,_NIP(r1) > > - mtspr SPRN_SRR0,r3 > > - ld r9,_CTR(r1) > > - mtctr r9 > > - ld r9,_XER(r1) > > - mtxer r9 > > - ld r9,_LINK(r1) > > - mtlrr9 > > - REST_GPR(0, r1) > > - REST_8GPRS(2, r1) > > - REST_GPR(10, r1) > > - ld r11,_CCR(r1) > > - mtcrr11 > > - REST_GPR(11, r1) > > - REST_2GPRS(12, r1) > > - /* restore original r1. 
*/ > > - ld r1,GPR1(r1) > > - SET_SCRATCH0(r13) /* save r13 */ > > - EXCEPTION_PROLOG_0(PACA_EXMC) > > - b machine_check_pSeries_0 > > -END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE) > > - > > EXC_COMMON_BEGIN(machine_check_common) > > /* > > * Machine check is different because we use a different > > @@ -483,6 +444,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early) > > bl machine_check_early > > std r3,RESULT(r1) /* Save result */ > > ld r12,_MSR(r1) > > +BEGIN_FTR_SECTION > > + bne 9f /* pSeries: continue > > to V mode. */ +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE) > > Should this be "b 9f" ? Although... > > > > > #ifdef CONFIG_PPC_P7_NAP > > /* > > @@ -564,7 +528,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early) > > 9: > > /* Deliver the machine check to host kernel in V mode. */ > > MACHINE_CHECK_HANDLER_WINDUP > > - b machine_check_pSeries > > + SET_SCRATCH0(r13)
[PATCH v06 9/9] hotplug/pmt: Update topology after PMT
hotplug/pmt: Call rebuild_sched_domains after applying changes to update CPU associativity i.e. 'readd' CPUs. This is to ensure that the deferred calls to arch_update_cpu_topology are now reflected in the system data structures. Signed-off-by: Michael Bringmann --- arch/powerpc/platforms/pseries/dlpar.c |4 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c index 7264b8e..ea3c08a 100644 --- a/arch/powerpc/platforms/pseries/dlpar.c +++ b/arch/powerpc/platforms/pseries/dlpar.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include @@ -451,6 +452,9 @@ static int dlpar_pmt(struct pseries_hp_errorlog *work) ssleep(10); } + ssleep(5); + rebuild_sched_domains(); + return 0; }
[PATCH v06 8/9] hotplug/rtas: No rtas_event_scan during PMT update
hotplug/rtas: Disable rtas_event_scan during device-tree property updates after migration to reduce conflicts with changes propagated to other parts of the kernel configuration, such as CPUs or memory. Signed-off-by: Michael Bringmann --- arch/powerpc/platforms/pseries/hotplug-cpu.c |4 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 6267b53..f5c9e8f 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -686,14 +686,18 @@ static int dlpar_cpu_readd_by_index(u32 drc_index) pr_info("Attempting to re-add CPU, drc index %x\n", drc_index); + rtas_event_scan_disable(); arch_update_cpu_topology_suspend(); rc = dlpar_cpu_remove_by_index(drc_index, false); arch_update_cpu_topology_resume(); + rtas_event_scan_enable(); if (!rc) { + rtas_event_scan_disable(); arch_update_cpu_topology_suspend(); rc = dlpar_cpu_add(drc_index, false); arch_update_cpu_topology_resume(); + rtas_event_scan_enable(); } if (rc)
[PATCH v06 7/9] powerpc/rtas: Allow disabling rtas_event_scan
powerpc/rtas: Provide mechanism by which the rtas_event_scan can be disabled/re-enabled by other portions of the powerpc code. Among other things, this simplifies the usage of locking mechanisms for shared kernel resources. Signed-off-by: Michael Bringmann --- arch/powerpc/include/asm/rtas.h |4 arch/powerpc/kernel/rtasd.c | 14 ++ 2 files changed, 18 insertions(+) diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h index 4f601c7..4ab605a 100644 --- a/arch/powerpc/include/asm/rtas.h +++ b/arch/powerpc/include/asm/rtas.h @@ -386,8 +386,12 @@ extern int early_init_dt_scan_rtas(unsigned long node, #ifdef CONFIG_PPC_RTAS_DAEMON extern void rtas_cancel_event_scan(void); +extern void rtas_event_scan_disable(void); +extern void rtas_event_scan_enable(void); #else static inline void rtas_cancel_event_scan(void) { } +static inline void rtas_event_scan_disable(void) { } +static inline void rtas_event_scan_enable(void) { } #endif /* Error types logged. */ diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c index 44d66c33d..af69e44 100644 --- a/arch/powerpc/kernel/rtasd.c +++ b/arch/powerpc/kernel/rtasd.c @@ -455,11 +455,25 @@ static void do_event_scan(void) */ static unsigned long event_scan_delay = 1*HZ; static int first_pass = 1; +static int res_enable = 1; + +void rtas_event_scan_disable(void) +{ + res_enable = 0; +} + +void rtas_event_scan_enable(void) +{ + res_enable = 1; +} static void rtas_event_scan(struct work_struct *w) { unsigned int cpu; + if (!res_enable) + return; + do_event_scan(); get_online_cpus();
[PATCH v06 6/9] pmt/numa: Disable arch_update_cpu_topology during CPU readd
pmt/numa: Disable arch_update_cpu_topology during post migration CPU readd updates when evaluating device-tree changes after LPM to avoid thread deadlocks trying to update node assignments. System timing between all of the threads and timers restarted in a migrated system overlapped frequently allowing tasks to start acquiring resources (get_online_cpus) needed by rebuild_sched_domains. Defer the operation of that function until after the CPU readd has completed. Signed-off-by: Michael Bringmann --- arch/powerpc/platforms/pseries/hotplug-cpu.c |9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 8f28160..6267b53 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -26,6 +26,7 @@ #include/* for idle_task_exit */ #include #include +#include #include #include #include @@ -685,9 +686,15 @@ static int dlpar_cpu_readd_by_index(u32 drc_index) pr_info("Attempting to re-add CPU, drc index %x\n", drc_index); + arch_update_cpu_topology_suspend(); rc = dlpar_cpu_remove_by_index(drc_index, false); - if (!rc) + arch_update_cpu_topology_resume(); + + if (!rc) { + arch_update_cpu_topology_suspend(); rc = dlpar_cpu_add(drc_index, false); + arch_update_cpu_topology_resume(); + } if (rc) pr_info("Failed to update cpu at drc_index %lx\n",
[PATCH v06 5/9] numa: Disable/enable arch_update_cpu_topology
numa: Provide mechanism to disable/enable operation of arch_update_cpu_topology/numa_update_cpu_topology. This is a simple tool to eliminate some avenues for thread deadlock observed during system execution. Signed-off-by: Michael Bringmann --- arch/powerpc/include/asm/topology.h | 10 ++ arch/powerpc/mm/numa.c | 14 ++ 2 files changed, 24 insertions(+) diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h index 16b0778..d9ceba6 100644 --- a/arch/powerpc/include/asm/topology.h +++ b/arch/powerpc/include/asm/topology.h @@ -43,6 +43,8 @@ static inline int pcibus_to_node(struct pci_bus *bus) extern int sysfs_add_device_to_node(struct device *dev, int nid); extern void sysfs_remove_device_from_node(struct device *dev, int nid); extern int numa_update_cpu_topology(bool cpus_locked); +extern void arch_update_cpu_topology_suspend(void); +extern void arch_update_cpu_topology_resume(void); static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node) { @@ -82,6 +84,14 @@ static inline int numa_update_cpu_topology(bool cpus_locked) return 0; } +static inline void arch_update_cpu_topology_suspend(void) +{ +} + +static inline void arch_update_cpu_topology_resume(void) +{ +} + static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node) {} #endif /* CONFIG_NUMA */ diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index b22e27a..2352489 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -1079,6 +1079,7 @@ struct topology_update_data { static int topology_timer_secs = 1; static int topology_inited; static int topology_update_needed; +static int topology_update_enabled = 1; static struct mutex topology_update_lock; /* @@ -1313,6 +1314,9 @@ int numa_update_cpu_topology(bool cpus_locked) return 0; } + if (!topology_update_enabled) + return 0; + weight = cpumask_weight(_associativity_changes_mask); if (!weight) return 0; @@ -1439,6 +1443,16 @@ int arch_update_cpu_topology(void) return 
numa_update_cpu_topology(true); } +void arch_update_cpu_topology_suspend(void) +{ + topology_update_enabled = 0; +} + +void arch_update_cpu_topology_resume(void) +{ + topology_update_enabled = 1; +} + static void topology_work_fn(struct work_struct *work) { rebuild_sched_domains();
[PATCH v06 4/9] mobility/numa: Ensure numa update does not overlap
mobility/numa: Ensure that numa_update_cpu_topology() can not be entered multiple times concurrently. It may be accessed through many different paths / concurrent work functions, and the lock ordering may be difficult to ensure otherwise.

Signed-off-by: Michael Bringmann
---
 arch/powerpc/mm/numa.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index a789d57..b22e27a 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1079,6 +1079,7 @@ struct topology_update_data {
 static int topology_timer_secs = 1;
 static int topology_inited;
 static int topology_update_needed;
+static struct mutex topology_update_lock;
 
 /*
  * Change polling interval for associativity changes.
@@ -1320,6 +1321,11 @@ int numa_update_cpu_topology(bool cpus_locked)
 	if (!updates)
 		return 0;
 
+	if (!mutex_trylock(&topology_update_lock)) {
+		kfree(updates);
+		return 0;
+	}
+
 	cpumask_clear(&updated_cpus);
 
 	for_each_cpu(cpu, &cpu_associativity_changes_mask) {
@@ -1424,6 +1430,7 @@ int numa_update_cpu_topology(bool cpus_locked)
 out:
 	kfree(updates);
 	topology_update_needed = 0;
+	mutex_unlock(&topology_update_lock);
 
 	return changed;
 }
@@ -1598,6 +1605,8 @@ static ssize_t topology_write(struct file *file, const char __user *buf,
 
 static int topology_update_init(void)
 {
+	mutex_init(&topology_update_lock);
+
 	/* Do not poll for changes if disabled at boot */
 	if (topology_updates_enabled)
 		start_topology_update();
[PATCH v06 3/9] hotplug/cpu: Provide CPU readd operation
powerpc/dlpar: Provide hotplug CPU 'readd by index' operation to support LPAR Post Migration state updates. When such changes are invoked by the PowerPC 'mobility' code, they will be queued up so that modifications to CPU properties will take place after the new property value is written to the device-tree. Signed-off-by: Michael Bringmann --- Changes in patch: -- Add CPU validity check to pseries_smp_notifier -- Improve check on 'ibm,associativity' property -- Add check for cpu type to new update property entry -- Cleanup reference to outdated queuing function. --- arch/powerpc/platforms/pseries/hotplug-cpu.c | 58 ++ 1 file changed, 58 insertions(+) diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 3632db2..8f28160 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -305,6 +305,36 @@ static int pseries_add_processor(struct device_node *np) return err; } +static int pseries_update_processor(struct of_reconfig_data *pr) +{ + int old_entries, new_entries, rc = 0; + __be32 *old_assoc, *new_assoc; + + /* We only handle changes due to 'ibm,associativity' property +*/ + old_assoc = pr->old_prop->value; + old_entries = be32_to_cpu(*old_assoc++); + + new_assoc = pr->prop->value; + new_entries = be32_to_cpu(*new_assoc++); + + if (old_entries == new_entries) { + int sz = old_entries * sizeof(int); + + if (memcmp(old_assoc, new_assoc, sz)) + rc = dlpar_queue_action( + PSERIES_HP_ELOG_RESOURCE_CPU, + PSERIES_HP_ELOG_ACTION_READD, + pr->dn->phandle); + } else { + rc = dlpar_queue_action(PSERIES_HP_ELOG_RESOURCE_CPU, + PSERIES_HP_ELOG_ACTION_READD, + pr->dn->phandle); + } + + return rc; +} + /* * Update the present map for a cpu node which is going away, and set * the hard id in the paca(s) to -1 to be consistent with boot time @@ -649,6 +679,26 @@ static int dlpar_cpu_remove_by_index(u32 drc_index, bool release_drc) return rc; } +static int 
dlpar_cpu_readd_by_index(u32 drc_index) +{ + int rc = 0; + + pr_info("Attempting to re-add CPU, drc index %x\n", drc_index); + + rc = dlpar_cpu_remove_by_index(drc_index, false); + if (!rc) + rc = dlpar_cpu_add(drc_index, false); + + if (rc) + pr_info("Failed to update cpu at drc_index %lx\n", + (unsigned long int)drc_index); + else + pr_info("CPU at drc_index %lx was updated\n", + (unsigned long int)drc_index); + + return rc; +} + static int find_dlpar_cpus_to_remove(u32 *cpu_drcs, int cpus_to_remove) { struct device_node *dn; @@ -839,6 +889,9 @@ int dlpar_cpu(struct pseries_hp_errorlog *hp_elog) else rc = -EINVAL; break; + case PSERIES_HP_ELOG_ACTION_READD: + rc = dlpar_cpu_readd_by_index(drc_index); + break; default: pr_err("Invalid action (%d) specified\n", hp_elog->action); rc = -EINVAL; @@ -902,6 +955,11 @@ static int pseries_smp_notifier(struct notifier_block *nb, case OF_RECONFIG_DETACH_NODE: pseries_remove_processor(rd->dn); break; + case OF_RECONFIG_UPDATE_PROPERTY: + if (!strcmp(rd->dn->type, "cpu") && + !strcmp(rd->prop->name, "ibm,associativity")) + pseries_update_processor(rd); + break; } return notifier_from_errno(err); }
[PATCH v06 2/9] hotplug/cpu: Add operation queuing function
migration/dlpar: This patch adds a function, dlpar_queue_action(), which queues up information about a CPU/Memory 'readd' operation according to resource type, action code, and DRC index. At a subsequent point, the list of operations can be run/played in series. Examples of such operations include 'readd' of CPU and Memory blocks identified as having changed their associativity during an LPAR migration event.

Signed-off-by: Michael Bringmann
---
Changes in patch:
  -- Correct drc_index before adding to pseries_hp_errorlog struct
  -- Correct text of notice
  -- Revise queuing model to save up all of the DLPAR actions for later execution.
  -- Restore list init statement missing from patch
  -- Move call to apply queued operations into 'mobility.c'
  -- Compress some code
  -- Rename some of queueing function APIs
  -- Revise implementation to push execution of queued operations to a workqueue task.
  -- Cleanup reference to outdated queuing operation.
---
 arch/powerpc/include/asm/rtas.h           |  2 +
 arch/powerpc/platforms/pseries/dlpar.c    | 61 +
 arch/powerpc/platforms/pseries/mobility.c |  4 ++
 arch/powerpc/platforms/pseries/pseries.h  |  2 +
 4 files changed, 69 insertions(+)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 71e393c..4f601c7 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -310,12 +310,14 @@ struct pseries_hp_errorlog {
 		struct { __be32 count, index; } ic;
 		char	drc_name[1];
 	} _drc_u;
+
+	struct list_head list;
 };
 
 #define PSERIES_HP_ELOG_RESOURCE_CPU	1
 #define PSERIES_HP_ELOG_RESOURCE_MEM	2
 #define PSERIES_HP_ELOG_RESOURCE_SLOT	3
 #define PSERIES_HP_ELOG_RESOURCE_PHB	4
+#define PSERIES_HP_ELOG_RESOURCE_PMT	5
 
 #define PSERIES_HP_ELOG_ACTION_ADD	1
 #define PSERIES_HP_ELOG_ACTION_REMOVE	2

diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index a0b20c0..7264b8e 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include
 #include
 
 static struct workqueue_struct *pseries_hp_wq;
@@ -329,6 +330,8 @@ int dlpar_release_drc(u32 drc_index)
 	return 0;
 }
 
+static int dlpar_pmt(struct pseries_hp_errorlog *work);
+
 static int handle_dlpar_errorlog(struct pseries_hp_errorlog *hp_elog)
 {
 	int rc;
@@ -357,6 +360,9 @@ static int handle_dlpar_errorlog(struct pseries_hp_errorlog *hp_elog)
 	case PSERIES_HP_ELOG_RESOURCE_CPU:
 		rc = dlpar_cpu(hp_elog);
 		break;
+	case PSERIES_HP_ELOG_RESOURCE_PMT:
+		rc = dlpar_pmt(hp_elog);
+		break;
 	default:
 		pr_warn_ratelimited("Invalid resource (%d) specified\n",
 				    hp_elog->resource);
@@ -407,6 +413,61 @@ void queue_hotplug_event(struct pseries_hp_errorlog *hp_errlog,
 	}
 }
 
+LIST_HEAD(dlpar_delayed_list);
+
+int dlpar_queue_action(int resource, int action, u32 drc_index)
+{
+	struct pseries_hp_errorlog *hp_errlog;
+
+	hp_errlog = kmalloc(sizeof(struct pseries_hp_errorlog), GFP_KERNEL);
+	if (!hp_errlog)
+		return -ENOMEM;
+
+	hp_errlog->resource = resource;
+	hp_errlog->action = action;
+	hp_errlog->id_type = PSERIES_HP_ELOG_ID_DRC_INDEX;
+	hp_errlog->_drc_u.drc_index = cpu_to_be32(drc_index);
+
+	list_add_tail(&hp_errlog->list, &dlpar_delayed_list);
+
+	return 0;
+}
+
+static int dlpar_pmt(struct pseries_hp_errorlog *work)
+{
+	struct list_head *pos, *q;
+
+	ssleep(15);
+
+	list_for_each_safe(pos, q, &dlpar_delayed_list) {
+		struct pseries_hp_errorlog *tmp;
+
+		tmp = list_entry(pos, struct pseries_hp_errorlog, list);
+		handle_dlpar_errorlog(tmp);
+
+		list_del(pos);
+		kfree(tmp);
+
+		ssleep(10);
+	}
+
+	return 0;
+}
+
+int dlpar_queued_actions_run(void)
+{
+	if (!list_empty(&dlpar_delayed_list)) {
+		struct pseries_hp_errorlog hp_errlog;
+
+		hp_errlog.resource = PSERIES_HP_ELOG_RESOURCE_PMT;
+		hp_errlog.action = 0;
+		hp_errlog.id_type = 0;
+
+		queue_hotplug_event(&hp_errlog, 0, 0);
+	}
+	return 0;
+}
+
 static int dlpar_parse_resource(char **cmd, struct pseries_hp_errorlog *hp_elog)
 {
 	char *arg;

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c index f6364d9..d0d1cae 100644 --- a/arch/powerpc/platforms/pseries/mobility.c +++ b/arch/powerpc/platforms/pseries/mobility.c @@ -378,6 +378,10 @@ static ssize_t migration_store(struct class *class, return rc; post_mobility_fixup(); + + /* Apply any necessary changes identified during fixup */ +
[PATCH v06 1/9] hotplug/cpu: Conditionally acquire/release DRC index
powerpc/cpu: Modify dlpar_cpu_add and dlpar_cpu_remove to allow the skipping of DRC index acquire or release operations during the CPU add or remove operations. This is intended to support subsequent changes to provide a 'CPU readd' operation. Signed-off-by: Michael Bringmann --- Changes in patch: -- Move new validity check added to pseries_smp_notifier to another patch --- arch/powerpc/platforms/pseries/hotplug-cpu.c | 68 +++--- 1 file changed, 39 insertions(+), 29 deletions(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 6ef77ca..3632db2 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -432,7 +432,7 @@ static bool valid_cpu_drc_index(struct device_node *parent, u32 drc_index) return found; } -static ssize_t dlpar_cpu_add(u32 drc_index) +static ssize_t dlpar_cpu_add(u32 drc_index, bool acquire_drc) { struct device_node *dn, *parent; int rc, saved_rc; @@ -457,19 +457,22 @@ static ssize_t dlpar_cpu_add(u32 drc_index) return -EINVAL; } - rc = dlpar_acquire_drc(drc_index); - if (rc) { - pr_warn("Failed to acquire DRC, rc: %d, drc index: %x\n", - rc, drc_index); - of_node_put(parent); - return -EINVAL; + if (acquire_drc) { + rc = dlpar_acquire_drc(drc_index); + if (rc) { + pr_warn("Failed to acquire DRC, rc: %d, drc index: %x\n", + rc, drc_index); + of_node_put(parent); + return -EINVAL; + } } dn = dlpar_configure_connector(cpu_to_be32(drc_index), parent); if (!dn) { pr_warn("Failed call to configure-connector, drc index: %x\n", drc_index); - dlpar_release_drc(drc_index); + if (acquire_drc) + dlpar_release_drc(drc_index); of_node_put(parent); return -EINVAL; } @@ -484,8 +487,9 @@ static ssize_t dlpar_cpu_add(u32 drc_index) pr_warn("Failed to attach node %s, rc: %d, drc index: %x\n", dn->name, rc, drc_index); - rc = dlpar_release_drc(drc_index); - if (!rc) + if (acquire_drc) + rc = dlpar_release_drc(drc_index); + if (!rc || acquire_drc) 
dlpar_free_cc_nodes(dn); return saved_rc; @@ -498,7 +502,7 @@ static ssize_t dlpar_cpu_add(u32 drc_index) dn->name, rc, drc_index); rc = dlpar_detach_node(dn); - if (!rc) + if (!rc && acquire_drc) dlpar_release_drc(drc_index); return saved_rc; @@ -566,7 +570,8 @@ static int dlpar_offline_cpu(struct device_node *dn) } -static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index) +static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index, + bool release_drc) { int rc; @@ -579,12 +584,14 @@ static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index) return -EINVAL; } - rc = dlpar_release_drc(drc_index); - if (rc) { - pr_warn("Failed to release drc (%x) for CPU %s, rc: %d\n", - drc_index, dn->name, rc); - dlpar_online_cpu(dn); - return rc; + if (release_drc) { + rc = dlpar_release_drc(drc_index); + if (rc) { + pr_warn("Failed to release drc (%x) for CPU %s, rc: %d\n", + drc_index, dn->name, rc); + dlpar_online_cpu(dn); + return rc; + } } rc = dlpar_detach_node(dn); @@ -593,7 +600,10 @@ static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index) pr_warn("Failed to detach CPU %s, rc: %d", dn->name, rc); - rc = dlpar_acquire_drc(drc_index); + if (release_drc) + rc = dlpar_acquire_drc(drc_index); + else + rc = 0; if (!rc) dlpar_online_cpu(dn); @@ -622,7 +632,7 @@ static struct device_node *cpu_drc_index_to_dn(u32 drc_index) return dn; } -static int dlpar_cpu_remove_by_index(u32 drc_index) +static int dlpar_cpu_remove_by_index(u32 drc_index, bool release_drc) { struct device_node *dn; int rc; @@ -634,7 +644,7 @@ static int dlpar_cpu_remove_by_index(u32 drc_index) return -ENODEV; } - rc = dlpar_cpu_remove(dn, drc_index); + rc = dlpar_cpu_remove(dn, drc_index, release_drc);
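[Editor's note] The flag-gated acquire/release pattern this patch introduces can be sketched in isolation, outside the kernel. All names here (mock_acquire_drc, cpu_add, etc.) are illustrative stand-ins, not the real pseries API — the point is only that when the flag is clear, neither the acquire nor the error-path release runs, which is what lets a later "CPU readd" reuse this code without touching the DRC:

```c
#include <assert.h>
#include <stdbool.h>

static int acquire_calls, release_calls;

/* Stand-ins for dlpar_acquire_drc()/dlpar_release_drc(): they just
 * count calls so the gating behaviour is observable. */
static int mock_acquire_drc(unsigned int drc_index)
{
	(void)drc_index;
	acquire_calls++;
	return 0; /* success */
}

static void mock_release_drc(unsigned int drc_index)
{
	(void)drc_index;
	release_calls++;
}

/* Mirrors the shape of the patched dlpar_cpu_add(): DRC acquire, and
 * the matching release on the failure path, only happen when
 * acquire_drc is set. */
static int cpu_add(unsigned int drc_index, bool acquire_drc,
		   bool configure_fails)
{
	if (acquire_drc && mock_acquire_drc(drc_index))
		return -1;
	if (configure_fails) {
		if (acquire_drc)
			mock_release_drc(drc_index);
		return -1;
	}
	return 0;
}
```

A readd-style caller passes acquire_drc=false and the DRC state is left entirely alone, even on error paths.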
[PATCH v06 0/9] powerpc/hotplug: Update affinity for migrated CPUs
The migration of LPARs across Power systems affects many attributes including that of the associativity of CPUs. The patches in this set execute when a system is coming up fresh upon a migration target. They are intended to, * Recognize changes to the associativity of CPUs recorded in internal data structures when compared to the latest copies in the device tree. * Generate calls to other code layers to reset the data structures related to associativity of the CPUs. * Re-register the 'changed' entities into the target system. Re-registration of CPUs mostly entails acting as if they have been newly hot-added into the target system. Signed-off-by: Michael Bringmann Michael Bringmann (9): hotplug/cpu: Conditionally acquire/release DRC index hotplug/cpu: Add operation queuing function hotplug/cpu: Provide CPU readd operation mobility/numa: Ensure numa update does not overlap numa: Disable/enable arch_update_cpu_topology pmt/numa: Disable arch_update_cpu_topology during CPU readd powerpc/rtas: Allow disabling rtas_event_scan hotplug/rtas: No rtas_event_scan during PMT update hotplug/pmt: Update topology after PMT --- Changes in patch: -- Restructure and rearrange content of patches to co-locate similar or related modifications -- Rename pseries_update_drconf_cpu to pseries_update_processor -- Simplify code to update CPU nodes during mobility checks. Remove functions to generate extra HP_ELOG messages in favor of direct function calls to dlpar_cpu_readd_by_index. -- Revise code order in dlpar_cpu_readd_by_index() to present more appropriate error codes from underlying layers of the implementation. -- Add hotplug device lock around all property updates -- Add call to rebuild_sched_domains in case of changes -- Various code cleanups and compaction -- Rebase to 4.18-rc1 kernel -- Change operation to run CPU readd after end of migration store. -- Improve descriptive text -- Cleanup patch reference to outdated function
Re: [RFC PATCH 1/2] dma-mapping: Clean up dma_set_*mask() hooks
On 08/07/18 16:07, Christoph Hellwig wrote: On Fri, Jul 06, 2018 at 03:20:34PM +0100, Robin Murphy wrote: What are you trying to do? I really don't want to see more users of the hooks as they are a horribly bad idea. I really need to fix the ongoing problem we have where, due to funky integrations, devices suffer some downstream addressing limit (described by DT dma-ranges or ACPI IORT/_DMA) which we carefully set up in dma_configure(), but then just gets lost when the driver probes and innocently calls dma_set_mask() with something wider. I think it's effectively the generalised case of the VIA 32-bit quirk, if I understand that one correctly. I'd much rather fix this in generic code. How funky are your limitations? In fact when I did the 32-bit quirk (which will also be used by a Xilinx PCIe root port usable on a lot of architectures) I did initially consider adding a bus_dma_mask or similar to struct device, but opted for the simplest implementation for now. I'd be happy to change this. Especially these days, when busses and IP blocks are generally not tied to a specific cpu instruction set, I really believe that having any more architecture code than absolutely required is a bad idea. Oh, for sure, the generic fix would be the longer-term goal; this was just an expedient compromise because I want to get *something* landed for 4.19. Since in practice this is predominantly affecting arm64, doing the arch-specific fix to appease affected customers and then working to generalise it afterwards seemed to carry the lowest risk. That said, I think I can see a relatively safe and clean alternative approach based on converting dma_32bit_limit to a mask, so I'll spin some patches around that idea ASAP to continue the discussion.
The approach that seemed to me to be safest is largely based on the one proposed in a thread from ages ago[1]; namely to make dma_configure() better at distinguishing firmware-specified masks from bus defaults, capture the firmware mask in dev->archdata during arch_setup_dma_ops(), then use the custom set_mask routines to ensure any subsequent updates never exceed that. It doesn't seem possible to make this work robustly without storing *some* additional per-device data, and for that archdata is a lesser evil than struct device itself. Plus even though it's not actually an arch-specific issue, it feels like there's such a risk of breaking other platforms that I'm reluctant to even try handling it entirely in generic code. My plan for a few merge windows from now is that dma_mask and coherent_mask are 100% in device control and dma_set_mask will never fail. It will be up to the dma ops to make sure things are addressable. It's entirely possible to plug an old PCI soundcard via a bridge adapter into a modern board where the card's 24-bit DMA mask reaches nothing but the SoC's boot flash, and no IOMMU is available (e.g. some of the smaller NXP Layerscape stuff); I still think there should be an error in such rare cases when DMA is utterly impossible, but otherwise I agree it would be much nicer for drivers to just provide their preferred mask and let the ops massage it as necessary. Robin.
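[Editor's note] The capture-then-cap idea discussed above — remember the firmware-described limit at configure time, and clamp any later dma_set_mask() request against it instead of rejecting it — can be sketched as follows. The struct and function names are invented for illustration; the real implementation lives in the arch set_mask hooks and dev->archdata:

```c
#include <assert.h>
#include <stdint.h>

struct fake_dev {
	uint64_t fw_dma_mask;  /* firmware-described limit (0 = none known) */
	uint64_t dma_mask;     /* what the driver ends up with */
};

/* A set_mask hook in this scheme caps rather than rejects: the driver
 * keeps working with the widest mask that is actually reachable, and
 * only a completely unaddressable device gets an error. */
static int fake_set_dma_mask(struct fake_dev *dev, uint64_t requested)
{
	uint64_t mask = requested;

	if (dev->fw_dma_mask && mask > dev->fw_dma_mask)
		mask = dev->fw_dma_mask;
	if (!mask)
		return -1; /* DMA utterly impossible */
	dev->dma_mask = mask;
	return 0;
}
```

So a driver innocently asking for a 64-bit mask on a bus with a 40-bit dma-ranges limit silently ends up with the 40-bit mask, which is exactly the "never exceed the firmware mask" behaviour described.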
[PATCH 2/2] powerpc: Add ppc64le and ppc64_book3e allmodconfig targets
Similarly as we just did for 32-bit, add phony targets for generating a little endian and Book3E allmodconfig. These aren't covered by the regular allmodconfig, which is big endian and Book3S due to the way the Kconfig symbols are structured. Signed-off-by: Michael Ellerman --- arch/powerpc/Makefile | 10 ++ 1 file changed, 10 insertions(+) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 2556c2182789..48e887f03a6c 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -359,6 +359,16 @@ ppc32_allmodconfig: $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/book3s_32.config \ -f $(srctree)/Makefile allmodconfig +PHONY += ppc64le_allmodconfig +ppc64le_allmodconfig: + $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/le.config \ + -f $(srctree)/Makefile allmodconfig + +PHONY += ppc64_book3e_allmodconfig +ppc64_book3e_allmodconfig: + $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/85xx-64bit.config \ + -f $(srctree)/Makefile allmodconfig + define archhelp @echo '* zImage - Build default images selected by kernel config' @echo ' zImage.*- Compressed kernel image (arch/$(ARCH)/boot/zImage.*)' -- 2.14.1
[PATCH 1/2] powerpc: Add ppc32_allmodconfig defconfig target
Because the allmodconfig logic just sets every symbol to M or Y, it has the effect of always generating a 64-bit config, because CONFIG_PPC64 becomes Y. So to make it easier for folks to test 32-bit code, provide a phony defconfig target that generates a 32-bit allmodconfig. The 32-bit port has several mutually exclusive CPU types, we choose the Book3S variants as that's what the help text in Kconfig says is most common. Signed-off-by: Michael Ellerman --- arch/powerpc/Makefile | 5 + arch/powerpc/configs/book3s_32.config | 2 ++ 2 files changed, 7 insertions(+) create mode 100644 arch/powerpc/configs/book3s_32.config diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 2ea575cb3401..2556c2182789 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -354,6 +354,11 @@ mpc86xx_smp_defconfig: $(call merge_into_defconfig,mpc86xx_basic_defconfig,\ 86xx-smp 86xx-hw fsl-emb-nonhw) +PHONY += ppc32_allmodconfig +ppc32_allmodconfig: + $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/book3s_32.config \ + -f $(srctree)/Makefile allmodconfig + define archhelp @echo '* zImage - Build default images selected by kernel config' @echo ' zImage.*- Compressed kernel image (arch/$(ARCH)/boot/zImage.*)' diff --git a/arch/powerpc/configs/book3s_32.config b/arch/powerpc/configs/book3s_32.config new file mode 100644 index ..8721eb7b1294 --- /dev/null +++ b/arch/powerpc/configs/book3s_32.config @@ -0,0 +1,2 @@ +CONFIG_PPC64=n +CONFIG_PPC_BOOK3S_32=y -- 2.14.1
Re: [PATCH v4 00/11] hugetlb: Factorize hugetlb architecture primitives
[CC hugetlb guys - http://lkml.kernel.org/r/20180705110716.3919-1-a...@ghiti.fr] On Thu 05-07-18 11:07:05, Alexandre Ghiti wrote: > In order to reduce copy/paste of functions across architectures and then > make riscv hugetlb port (and future ports) simpler and smaller, this > patchset intends to factorize the numerous hugetlb primitives that are > defined across all the architectures. > > Except for prepare_hugepage_range, this patchset moves the versions that > are just pass-through to standard pte primitives into > asm-generic/hugetlb.h by using the same #ifdef semantic that can be > found in asm-generic/pgtable.h, i.e. __HAVE_ARCH_***. > > s390 architecture has not been tackled in this series since it does not > use asm-generic/hugetlb.h at all. > powerpc could be factorized a bit more (cf huge_ptep_set_wrprotect). > > This patchset has been compiled on x86 only. > > Changelog: > > v4: > Fix powerpc build error due to misplacing of #include > outside of #ifdef CONFIG_HUGETLB_PAGE, as > pointed out by Christophe Leroy. 
> > v1, v2, v3: > Same version, just problems with email provider and misuse of > --batch-size option of git send-email > > Alexandre Ghiti (11): > hugetlb: Harmonize hugetlb.h arch specific defines with pgtable.h > hugetlb: Introduce generic version of hugetlb_free_pgd_range > hugetlb: Introduce generic version of set_huge_pte_at > hugetlb: Introduce generic version of huge_ptep_get_and_clear > hugetlb: Introduce generic version of huge_ptep_clear_flush > hugetlb: Introduce generic version of huge_pte_none > hugetlb: Introduce generic version of huge_pte_wrprotect > hugetlb: Introduce generic version of prepare_hugepage_range > hugetlb: Introduce generic version of huge_ptep_set_wrprotect > hugetlb: Introduce generic version of huge_ptep_set_access_flags > hugetlb: Introduce generic version of huge_ptep_get > > arch/arm/include/asm/hugetlb-3level.h| 32 +- > arch/arm/include/asm/hugetlb.h | 33 +-- > arch/arm64/include/asm/hugetlb.h | 39 +++- > arch/ia64/include/asm/hugetlb.h | 47 ++- > arch/mips/include/asm/hugetlb.h | 40 +++-- > arch/parisc/include/asm/hugetlb.h| 33 +++ > arch/powerpc/include/asm/book3s/32/pgtable.h | 2 + > arch/powerpc/include/asm/book3s/64/pgtable.h | 1 + > arch/powerpc/include/asm/hugetlb.h | 43 ++ > arch/powerpc/include/asm/nohash/32/pgtable.h | 2 + > arch/powerpc/include/asm/nohash/64/pgtable.h | 1 + > arch/sh/include/asm/hugetlb.h| 54 ++--- > arch/sparc/include/asm/hugetlb.h | 40 +++-- > arch/x86/include/asm/hugetlb.h | 72 +-- > include/asm-generic/hugetlb.h| 88 > +++- > 15 files changed, 143 insertions(+), 384 deletions(-) > > -- > 2.16.2 -- Michal Hocko SUSE Labs
Re: [PATCH 1/2] mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
On Mon 09-07-18 14:19:55, Marek Szyprowski wrote: > cma_alloc() function doesn't really support gfp flags other than > __GFP_NOWARN, so convert gfp_mask parameter to boolean no_warn parameter. > > This will help to avoid giving false feeling that this function supports > standard gfp flags and callers can pass __GFP_ZERO to get zeroed buffer, > what has already been an issue: see commit dd65a941f6ba ("arm64: > dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag"). > > Signed-off-by: Marek Szyprowski Thanks! This makes perfect sense to me. If there is a real need for the gfp_mask then we should start by defining the semantic first. Acked-by: Michal Hocko > --- > arch/powerpc/kvm/book3s_hv_builtin.c | 2 +- > drivers/s390/char/vmcp.c | 2 +- > drivers/staging/android/ion/ion_cma_heap.c | 2 +- > include/linux/cma.h| 2 +- > kernel/dma/contiguous.c| 3 ++- > mm/cma.c | 8 > mm/cma_debug.c | 2 +- > 7 files changed, 11 insertions(+), 10 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c > b/arch/powerpc/kvm/book3s_hv_builtin.c > index d4a3f4da409b..fc6bb9630a9c 100644 > --- a/arch/powerpc/kvm/book3s_hv_builtin.c > +++ b/arch/powerpc/kvm/book3s_hv_builtin.c > @@ -77,7 +77,7 @@ struct page *kvm_alloc_hpt_cma(unsigned long nr_pages) > VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT); > > return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES), > - GFP_KERNEL); > + false); > } > EXPORT_SYMBOL_GPL(kvm_alloc_hpt_cma); > > diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c > index 948ce82a7725..0fa1b6b1491a 100644 > --- a/drivers/s390/char/vmcp.c > +++ b/drivers/s390/char/vmcp.c > @@ -68,7 +68,7 @@ static void vmcp_response_alloc(struct vmcp_session > *session) >* anymore the system won't work anyway. 
>*/ > if (order > 2) > - page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL); > + page = cma_alloc(vmcp_cma, nr_pages, 0, false); > if (page) { > session->response = (char *)page_to_phys(page); > session->cma_alloc = 1; > diff --git a/drivers/staging/android/ion/ion_cma_heap.c > b/drivers/staging/android/ion/ion_cma_heap.c > index 49718c96bf9e..3fafd013d80a 100644 > --- a/drivers/staging/android/ion/ion_cma_heap.c > +++ b/drivers/staging/android/ion/ion_cma_heap.c > @@ -39,7 +39,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct > ion_buffer *buffer, > if (align > CONFIG_CMA_ALIGNMENT) > align = CONFIG_CMA_ALIGNMENT; > > - pages = cma_alloc(cma_heap->cma, nr_pages, align, GFP_KERNEL); > + pages = cma_alloc(cma_heap->cma, nr_pages, align, false); > if (!pages) > return -ENOMEM; > > diff --git a/include/linux/cma.h b/include/linux/cma.h > index bf90f0bb42bd..190184b5ff32 100644 > --- a/include/linux/cma.h > +++ b/include/linux/cma.h > @@ -33,7 +33,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, > phys_addr_t size, > const char *name, > struct cma **res_cma); > extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int > align, > - gfp_t gfp_mask); > + bool no_warn); > extern bool cma_release(struct cma *cma, const struct page *pages, unsigned > int count); > > extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void > *data); > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c > index d987dcd1bd56..19ea5d70150c 100644 > --- a/kernel/dma/contiguous.c > +++ b/kernel/dma/contiguous.c > @@ -191,7 +191,8 @@ struct page *dma_alloc_from_contiguous(struct device > *dev, size_t count, > if (align > CONFIG_CMA_ALIGNMENT) > align = CONFIG_CMA_ALIGNMENT; > > - return cma_alloc(dev_get_cma_area(dev), count, align, gfp_mask); > + return cma_alloc(dev_get_cma_area(dev), count, align, > + gfp_mask & __GFP_NOWARN); > } > > /** > diff --git a/mm/cma.c b/mm/cma.c > index 5809bbe360d7..4cb76121a3ab 100644 > --- a/mm/cma.c 
> +++ b/mm/cma.c > @@ -395,13 +395,13 @@ static inline void cma_debug_show_areas(struct cma > *cma) { } > * @cma: Contiguous memory region for which the allocation is performed. > * @count: Requested number of pages. > * @align: Requested alignment of pages (in PAGE_SIZE order). > - * @gfp_mask: GFP mask to use during compaction > + * @no_warn: Avoid printing message about failed allocation > * > * This function allocates part of contiguous memory on specific > * contiguous memory area. > */ > struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, > -gfp_t gfp_mask) > +bool no_warn) >
[PATCH 2/2] dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()
The CMA memory allocator doesn't support standard gfp flags for memory allocation, so there is no point having one as a parameter for the dma_alloc_from_contiguous() function. Replace it by a boolean no_warn argument, which covers all that the underlying cma_alloc() function supports. This will help to avoid giving the false feeling that this function supports standard gfp flags and that callers can pass __GFP_ZERO to get a zeroed buffer, which has already been an issue: see commit dd65a941f6ba ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag"). Signed-off-by: Marek Szyprowski --- arch/arm/mm/dma-mapping.c | 5 +++-- arch/arm64/mm/dma-mapping.c| 4 ++-- arch/xtensa/kernel/pci-dma.c | 2 +- drivers/iommu/amd_iommu.c | 2 +- drivers/iommu/intel-iommu.c| 3 ++- include/linux/dma-contiguous.h | 4 ++-- kernel/dma/contiguous.c| 7 +++ kernel/dma/direct.c| 3 ++- 8 files changed, 16 insertions(+), 14 deletions(-) diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c index be0fa7e39c26..121c6c3ba9e0 100644 --- a/arch/arm/mm/dma-mapping.c +++ b/arch/arm/mm/dma-mapping.c @@ -594,7 +594,7 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size, struct page *page; void *ptr = NULL; - page = dma_alloc_from_contiguous(dev, count, order, gfp); + page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN); if (!page) return NULL; @@ -1294,7 +1294,8 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, unsigned long order = get_order(size); struct page *page; - page = dma_alloc_from_contiguous(dev, count, order, gfp); + page = dma_alloc_from_contiguous(dev, count, order, +gfp & __GFP_NOWARN); if (!page) goto error; diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index 61e93f0b5482..072c51fb07d7 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -355,7 +355,7 @@ static int __init atomic_pool_init(void) if (dev_get_cma_area(NULL)) page = dma_alloc_from_contiguous(NULL, 
nr_pages, -pool_size_order, GFP_KERNEL); +pool_size_order, false); else page = alloc_pages(GFP_DMA32, pool_size_order); @@ -573,7 +573,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size, struct page *page; page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT, -get_order(size), gfp); + get_order(size), gfp & __GFP_NOWARN); if (!page) return NULL; diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c index ba4640cc0093..b2c7ba91fb08 100644 --- a/arch/xtensa/kernel/pci-dma.c +++ b/arch/xtensa/kernel/pci-dma.c @@ -137,7 +137,7 @@ static void *xtensa_dma_alloc(struct device *dev, size_t size, if (gfpflags_allow_blocking(flag)) page = dma_alloc_from_contiguous(dev, count, get_order(size), -flag); +flag & __GFP_NOWARN); if (!page) page = alloc_pages(flag, get_order(size)); diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index 64cfe854e0f5..5ec97ffb561a 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -2622,7 +2622,7 @@ static void *alloc_coherent(struct device *dev, size_t size, return NULL; page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT, -get_order(size), flag); + get_order(size), flag & __GFP_NOWARN); if (!page) return NULL; } diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 869321c594e2..dd2d343428ab 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -3746,7 +3746,8 @@ static void *intel_alloc_coherent(struct device *dev, size_t size, if (gfpflags_allow_blocking(flags)) { unsigned int count = size >> PAGE_SHIFT; - page = dma_alloc_from_contiguous(dev, count, order, flags); + page = dma_alloc_from_contiguous(dev, count, order, +flags & __GFP_NOWARN); if (page && iommu_no_mapping(dev) && page_to_phys(page) + size > dev->coherent_dma_mask) { dma_release_from_contiguous(dev, page, count); diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
[PATCH 0/2] CMA: remove unsupported gfp mask parameter
Dear All, The CMA related functions cma_alloc() and dma_alloc_from_contiguous() have a gfp mask parameter, but sadly they only support the __GFP_NOWARN flag. This gave their users a misleading feeling that any standard memory allocation flags are supported, which resulted in a security issue when a caller set the __GFP_ZERO flag and expected the buffer to be cleared. This patchset changes the gfp_mask parameter to a simple boolean no_warn argument, which covers all that the underlying code supports. This patchset is a result of the following discussion: https://patchwork.kernel.org/patch/10461919/ Best regards Marek Szyprowski Samsung R&D Institute Poland Patch summary: Marek Szyprowski (2): mm/cma: remove unsupported gfp_mask parameter from cma_alloc() dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous() arch/arm/mm/dma-mapping.c | 5 +++-- arch/arm64/mm/dma-mapping.c| 4 ++-- arch/powerpc/kvm/book3s_hv_builtin.c | 2 +- arch/xtensa/kernel/pci-dma.c | 2 +- drivers/iommu/amd_iommu.c | 2 +- drivers/iommu/intel-iommu.c| 3 ++- drivers/s390/char/vmcp.c | 2 +- drivers/staging/android/ion/ion_cma_heap.c | 2 +- include/linux/cma.h| 2 +- include/linux/dma-contiguous.h | 4 ++-- kernel/dma/contiguous.c| 6 +++--- kernel/dma/direct.c| 3 ++- mm/cma.c | 8 mm/cma_debug.c | 2 +- 14 files changed, 25 insertions(+), 22 deletions(-) -- 2.17.1
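[Editor's note] The call-site side of this gfp-to-bool conversion is the same everywhere in the series: mask out the one supported flag and let the non-zero result collapse into the boolean parameter. A minimal sketch, with made-up flag values (the real ones live in gfp.h):

```c
#include <assert.h>
#include <stdbool.h>

#define FAKE_GFP_NOWARN 0x200u /* illustrative value only */
#define FAKE_GFP_KERNEL 0x400u /* illustrative value only */

static bool last_no_warn;

/* New-style allocator entry point: a bool instead of a gfp_t, so no
 * unsupported flags can sneak in. */
static int fake_cma_alloc(unsigned int count, bool no_warn)
{
	last_no_warn = no_warn;
	return count ? 0 : -1;
}

/* Converted call site: `gfp_mask & FAKE_GFP_NOWARN` is zero or
 * non-zero, and C's bool conversion turns that into false/true. */
static int legacy_caller(unsigned int count, unsigned int gfp_mask)
{
	return fake_cma_alloc(count, gfp_mask & FAKE_GFP_NOWARN);
}
```

This is why the patches can mechanically replace `gfp_mask` arguments with `gfp_mask & __GFP_NOWARN` without changing any caller's observable behaviour.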
[PATCH 1/2] mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
The cma_alloc() function doesn't really support gfp flags other than __GFP_NOWARN, so convert the gfp_mask parameter to a boolean no_warn parameter. This will help to avoid giving the false feeling that this function supports standard gfp flags and that callers can pass __GFP_ZERO to get a zeroed buffer, which has already been an issue: see commit dd65a941f6ba ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag"). Signed-off-by: Marek Szyprowski --- arch/powerpc/kvm/book3s_hv_builtin.c | 2 +- drivers/s390/char/vmcp.c | 2 +- drivers/staging/android/ion/ion_cma_heap.c | 2 +- include/linux/cma.h| 2 +- kernel/dma/contiguous.c| 3 ++- mm/cma.c | 8 mm/cma_debug.c | 2 +- 7 files changed, 11 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c index d4a3f4da409b..fc6bb9630a9c 100644 --- a/arch/powerpc/kvm/book3s_hv_builtin.c +++ b/arch/powerpc/kvm/book3s_hv_builtin.c @@ -77,7 +77,7 @@ struct page *kvm_alloc_hpt_cma(unsigned long nr_pages) VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT); return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES), -GFP_KERNEL); +false); } EXPORT_SYMBOL_GPL(kvm_alloc_hpt_cma); diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c index 948ce82a7725..0fa1b6b1491a 100644 --- a/drivers/s390/char/vmcp.c +++ b/drivers/s390/char/vmcp.c @@ -68,7 +68,7 @@ static void vmcp_response_alloc(struct vmcp_session *session) * anymore the system won't work anyway. 
*/ if (order > 2) - page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL); + page = cma_alloc(vmcp_cma, nr_pages, 0, false); if (page) { session->response = (char *)page_to_phys(page); session->cma_alloc = 1; diff --git a/drivers/staging/android/ion/ion_cma_heap.c b/drivers/staging/android/ion/ion_cma_heap.c index 49718c96bf9e..3fafd013d80a 100644 --- a/drivers/staging/android/ion/ion_cma_heap.c +++ b/drivers/staging/android/ion/ion_cma_heap.c @@ -39,7 +39,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct ion_buffer *buffer, if (align > CONFIG_CMA_ALIGNMENT) align = CONFIG_CMA_ALIGNMENT; - pages = cma_alloc(cma_heap->cma, nr_pages, align, GFP_KERNEL); + pages = cma_alloc(cma_heap->cma, nr_pages, align, false); if (!pages) return -ENOMEM; diff --git a/include/linux/cma.h b/include/linux/cma.h index bf90f0bb42bd..190184b5ff32 100644 --- a/include/linux/cma.h +++ b/include/linux/cma.h @@ -33,7 +33,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, const char *name, struct cma **res_cma); extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, - gfp_t gfp_mask); + bool no_warn); extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count); extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data); diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c index d987dcd1bd56..19ea5d70150c 100644 --- a/kernel/dma/contiguous.c +++ b/kernel/dma/contiguous.c @@ -191,7 +191,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count, if (align > CONFIG_CMA_ALIGNMENT) align = CONFIG_CMA_ALIGNMENT; - return cma_alloc(dev_get_cma_area(dev), count, align, gfp_mask); + return cma_alloc(dev_get_cma_area(dev), count, align, +gfp_mask & __GFP_NOWARN); } /** diff --git a/mm/cma.c b/mm/cma.c index 5809bbe360d7..4cb76121a3ab 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -395,13 +395,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { } * @cma: 
Contiguous memory region for which the allocation is performed. * @count: Requested number of pages. * @align: Requested alignment of pages (in PAGE_SIZE order). - * @gfp_mask: GFP mask to use during compaction + * @no_warn: Avoid printing message about failed allocation * * This function allocates part of contiguous memory on specific * contiguous memory area. */ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, - gfp_t gfp_mask) + bool no_warn) { unsigned long mask, offset; unsigned long pfn = -1; @@ -447,7 +447,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit); mutex_lock(_mutex); ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, -
Re: powerpc: 32BIT vs. 64BIT (PPC32 vs. PPC64)
On Sun, Jul 8, 2018 at 1:53 PM Michael Ellerman wrote: > > Randy Dunlap writes: > > Hi, > > > > Is there a good way (or a shortcut) to do something like: > > The best I know of is: > > > $ make ARCH=powerpc O=PPC32 [other_options] allmodconfig > > to get a PPC32/32BIT allmodconfig > > $ echo CONFIG_PPC64=n > allmod.config > $ KCONFIG_ALLCONFIG=1 make allmodconfig > $ grep PPC32 .config > CONFIG_PPC32=y > > Which is still a bit clunky. > > > I looked at this a while back and the problem we have is that the 32-bit > kernel is not a single thing. There are multiple 32-bit platforms which > are mutually exclusive. > > eg, from menuconfig: > > - 512x/52xx/6xx/7xx/74xx/82xx/83xx/86xx > - Freescale 85xx > - Freescale 8xx > - AMCC 40x > - AMCC 44x, 46x or 47x > - Freescale e200 Most Linux distros seem to have dropped support for ppc32. So I'd suggest picking the Debian powerpc default config (but I agree that I am a little biased here). > > So we could have a 32-bit allmodconfig, but we'd need to choose one of > the above, and we'd still only be testing some of the code. > > Having said that you're the 2nd person to ask about this, so we should > clearly do something to make a 32-bit allmodconfig easier, even if it's > not perfect. > > cheers
Re: powerpc: 32BIT vs. 64BIT (PPC32 vs. PPC64)
Nicholas Piggin writes: > On Fri, 6 Jul 2018 21:58:29 -0700 > Randy Dunlap wrote: > >> On 07/06/2018 06:45 PM, Benjamin Herrenschmidt wrote: >> > On Thu, 2018-07-05 at 14:30 -0700, Randy Dunlap wrote: >> >> Hi, >> >> >> >> Is there a good way (or a shortcut) to do something like: >> >> >> >> $ make ARCH=powerpc O=PPC32 [other_options] allmodconfig >> >> to get a PPC32/32BIT allmodconfig >> >> >> >> and also be able to do: >> >> >> >> $make ARCH=powerpc O=PPC64 [other_options] allmodconfig >> >> to get a PPC64/64BIT allmodconfig? >> > >> > Hrm... O= is for the separate build dir, so there must be something >> > else. >> > >> > You mean having ARCH= aliases like ppc/ppc32 and ppc64 ? >> >> Yes. >> >> > That would be a matter of overriding some .config defaults I suppose, I >> > don't know how this is done on other archs. >> > >> > I see the aliasing trick in the Makefile but that's about it. >> > >> >> Note that arch/x86, arch/sh, and arch/sparc have ways to do >> >> some flavor(s) of this (from Documentation/kbuild/kbuild.txt; >> >> sh and sparc based on a recent "fix" patch from me): >> > >> > I fail to see what you are actually talking about here ... sorry. Do >> > you have concrete examples on x86 or sparc ? From what I can tell the >> > "i386" or "sparc32/sparc64" aliases just change SRCARCH in Makefile and >> > 32 vs 64-bit is just a Kconfig option... >> >> Yes, your summary is mostly correct. >> >> I'm just looking for a way to do cross-compile builds that are close to >> ppc32 allmodconfig and ppc64 allmodconfig. > > Would there be a problem with adding ARCH=ppc32 / ppc64 matching? This > seems to work... It's a cute trick but I'd rather avoid it. It overloads ARCH which can be confusing to people and tools. For example I'd have to special case it in kisskb. I think we can achieve a similar result by having more PHONY defconfig targets. eg, we can do ppc32_allmodconfig like below. And if there's interest we could do a 4xx_allmodconfig etc. 
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 2ea575cb3401..2556c2182789 100644 --- a/arch/powerpc/Makefile +++ b/arch/powerpc/Makefile @@ -354,6 +354,11 @@ mpc86xx_smp_defconfig: $(call merge_into_defconfig,mpc86xx_basic_defconfig,\ 86xx-smp 86xx-hw fsl-emb-nonhw) +PHONY += ppc32_allmodconfig +ppc32_allmodconfig: + $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/book3s_32.config \ + -f $(srctree)/Makefile allmodconfig + define archhelp @echo '* zImage - Build default images selected by kernel config' @echo ' zImage.*- Compressed kernel image (arch/$(ARCH)/boot/zImage.*)' diff --git a/arch/powerpc/configs/book3s_32.config b/arch/powerpc/configs/book3s_32.config new file mode 100644 index ..8721eb7b1294 --- /dev/null +++ b/arch/powerpc/configs/book3s_32.config @@ -0,0 +1,2 @@ +CONFIG_PPC64=n +CONFIG_PPC_BOOK3S_32=y cheers
Re: [next-20180709][bisected 9cf57731][ppc] build fail with ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734
On Mon, Jul 09, 2018 at 03:21:23PM +0530, Abdul Haleem wrote: > Greeting's > > Today's next fails to build on powerpc with below error > > kernel/cpu.o:(.data.rel+0x18e0): undefined reference to > `lockup_detector_online_cpu' > ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734 > kernel/cpu.o:(.data.rel+0x18e8): undefined reference to > `lockup_detector_offline_cpu' > ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734 > Makefile:1005: recipe for target 'vmlinux' failed > make: *** [vmlinux] Error 1 Urgh, sorry about that. I think the below should cure that. I got confused by all the various CONFIG options hereabouts and conflated CONFIG_LOCKUP_DETECTOR and CONFIG_SOFTLOCKUP_DETECTOR it seems. diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 80664bbeca43..08f9247e9827 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -33,15 +33,10 @@ extern int sysctl_hardlockup_all_cpu_backtrace; #define sysctl_hardlockup_all_cpu_backtrace 0 #endif /* !CONFIG_SMP */ -extern int lockup_detector_online_cpu(unsigned int cpu); -extern int lockup_detector_offline_cpu(unsigned int cpu); - #else /* CONFIG_LOCKUP_DETECTOR */ static inline void lockup_detector_init(void) { } static inline void lockup_detector_soft_poweroff(void) { } static inline void lockup_detector_cleanup(void) { } -#define lockup_detector_online_cpu NULL -#define lockup_detector_offline_cpuNULL #endif /* !CONFIG_LOCKUP_DETECTOR */ #ifdef CONFIG_SOFTLOCKUP_DETECTOR @@ -50,12 +45,18 @@ extern void touch_softlockup_watchdog(void); extern void touch_softlockup_watchdog_sync(void); extern void touch_all_softlockup_watchdogs(void); extern unsigned int softlockup_panic; -#else + +extern int lockup_detector_online_cpu(unsigned int cpu); +extern int lockup_detector_offline_cpu(unsigned int cpu); +#else /* CONFIG_SOFTLOCKUP_DETECTOR */ static inline void touch_softlockup_watchdog_sched(void) { } static inline void touch_softlockup_watchdog(void) { } static inline void 
touch_softlockup_watchdog_sync(void) { } static inline void touch_all_softlockup_watchdogs(void) { } -#endif + +#define lockup_detector_online_cpu NULL +#define lockup_detector_offline_cpu NULL +#endif /* CONFIG_SOFTLOCKUP_DETECTOR */ #ifdef CONFIG_DETECT_HUNG_TASK void reset_hung_task_detector(void);
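The shape of the fix — declaring the real hotplug callbacks only under CONFIG_SOFTLOCKUP_DETECTOR, and defining the names to NULL otherwise so the callback table still compiles — can be sketched outside the kernel. All names below are illustrative stand-ins, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for CONFIG_SOFTLOCKUP_DETECTOR; uncomment to build the "on" case. */
/* #define SOFTLOCKUP_DETECTOR 1 */

#ifdef SOFTLOCKUP_DETECTOR
static int detector_online_cpu(unsigned int cpu)  { printf("online %u\n", cpu);  return 0; }
static int detector_offline_cpu(unsigned int cpu) { printf("offline %u\n", cpu); return 0; }
#else
/* With the feature off, the names must still exist so the callback table
 * below links; the original build break came from these living under the
 * wrong #ifdef, so only some configs saw any definition at all. */
#define detector_online_cpu  NULL
#define detector_offline_cpu NULL
#endif

/* Toy version of a cpuhp step: NULL callbacks mean "nothing to do". */
struct hp_step {
	int (*startup)(unsigned int cpu);
	int (*teardown)(unsigned int cpu);
};

static int run_step(const struct hp_step *s, unsigned int cpu)
{
	return s->startup ? s->startup(cpu) : 0;
}
```

Either way the config switch is set, the table initializer compiles and the core can safely skip NULL callbacks at runtime.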
[PATCH 7/7 v6] arm64: dts: ls208xa: comply with the iommu map binding for fsl_mc
The fsl-mc bus supports the new iommu-map property. Comply with this binding for the fsl-mc bus. Signed-off-by: Nipun Gupta Reviewed-by: Laurentiu Tudor --- arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi index 137ef4d..3d5e049 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi @@ -184,6 +184,7 @@ #address-cells = <2>; #size-cells = <2>; ranges; + dma-ranges = <0x0 0x0 0x0 0x0 0x1 0x>; clockgen: clocking@130 { compatible = "fsl,ls2080a-clockgen"; @@ -357,6 +358,8 @@ reg = <0x0008 0x0c00 0 0x40>,/* MC portal base */ <0x 0x0834 0 0x4>; /* MC control reg */ msi-parent = <>; + iommu-map = <0 0 0>; /* This is fixed-up by u-boot */ + dma-coherent; #address-cells = <3>; #size-cells = <1>; @@ -460,6 +463,9 @@ compatible = "arm,mmu-500"; reg = <0 0x500 0 0x80>; #global-interrupts = <12>; + #iommu-cells = <1>; + stream-match-mask = <0x7C00>; + dma-coherent; interrupts = <0 13 4>, /* global secure fault */ <0 14 4>, /* combined secure interrupt */ <0 15 4>, /* global non-secure fault */ @@ -502,7 +508,6 @@ <0 204 4>, <0 205 4>, <0 206 4>, <0 207 4>, <0 208 4>, <0 209 4>; - mmu-masters = <_mc 0x300 0>; }; dspi: dspi@210 { -- 1.9.1
[PATCH 6/7 v6] bus/fsl-mc: set coherent dma mask for devices on fsl-mc bus
The of_dma_configure() API expects coherent_dma_mask to be set correctly on the devices. This patch sets it accordingly. Signed-off-by: Nipun Gupta Reviewed-by: Robin Murphy --- drivers/bus/fsl-mc/fsl-mc-bus.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c index fa43c7d..624828b 100644 --- a/drivers/bus/fsl-mc/fsl-mc-bus.c +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c @@ -627,6 +627,7 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc, mc_dev->icid = parent_mc_dev->icid; mc_dev->dma_mask = FSL_MC_DEFAULT_DMA_MASK; mc_dev->dev.dma_mask = _dev->dma_mask; + mc_dev->dev.coherent_dma_mask = mc_dev->dma_mask; dev_set_msi_domain(_dev->dev, dev_get_msi_domain(_mc_dev->dev)); } -- 1.9.1
[PATCH 5/7 v6] bus/fsl-mc: support dma configure for devices on fsl-mc bus
This patch adds support for DMA configuration of devices on the fsl-mc bus, using the 'dma_configure' callback for buses. Also, the direct call to arch_setup_dma_ops is removed from the fsl-mc bus. Signed-off-by: Nipun Gupta Reviewed-by: Laurentiu Tudor --- drivers/bus/fsl-mc/fsl-mc-bus.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c index 5d8266c..fa43c7d 100644 --- a/drivers/bus/fsl-mc/fsl-mc-bus.c +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c @@ -127,6 +127,16 @@ static int fsl_mc_bus_uevent(struct device *dev, struct kobj_uevent_env *env) return 0; } +static int fsl_mc_dma_configure(struct device *dev) +{ + struct device *dma_dev = dev; + + while (dev_is_fsl_mc(dma_dev)) + dma_dev = dma_dev->parent; + + return of_dma_configure(dev, dma_dev->of_node, 0); +} + static ssize_t modalias_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -148,6 +158,7 @@ struct bus_type fsl_mc_bus_type = { .name = "fsl-mc", .match = fsl_mc_bus_match, .uevent = fsl_mc_bus_uevent, + .dma_configure = fsl_mc_dma_configure, .dev_groups = fsl_mc_dev_groups, }; EXPORT_SYMBOL_GPL(fsl_mc_bus_type); @@ -633,10 +644,6 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc, goto error_cleanup_dev; } - /* Objects are coherent, unless 'no shareability' flag set. */ - if (!(obj_desc->flags & FSL_MC_OBJ_FLAG_NO_MEM_SHAREABILITY)) - arch_setup_dma_ops(_dev->dev, 0, 0, NULL, true); - /* * The device-specific probe callback will get invoked by device_add() */ -- 1.9.1
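The loop in fsl_mc_dma_configure() climbs the parent chain to the first ancestor that is not itself an fsl-mc device (the platform device above the root DPRC), and hands that device's of_node to of_dma_configure(). The walk can be modelled in a few lines of plain C — struct and function names here are invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for struct device: fsl-mc devices chain up to a
 * platform device that owns the DT node used for DMA configuration. */
struct toy_device {
	struct toy_device *parent;
	bool on_fsl_mc_bus;
};

/* Same walk as fsl_mc_dma_configure(): stop at the first ancestor
 * that is not on the fsl-mc bus. */
static struct toy_device *dma_config_source(struct toy_device *dev)
{
	while (dev->on_fsl_mc_bus)
		dev = dev->parent;
	return dev;
}
```

For a chain platform <- dprc <- dpni, the walk from the dpni ends at the platform device, which is where the iommu-map/dma properties live in the DT.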
[PATCH 4/7 v6] iommu/arm-smmu: Add support for the fsl-mc bus
Implement bus specific support for the fsl-mc bus including registering arm_smmu_ops and bus specific device add operations. Signed-off-by: Nipun Gupta --- drivers/iommu/arm-smmu.c | 7 +++ drivers/iommu/iommu.c| 13 + include/linux/fsl/mc.h | 8 include/linux/iommu.h| 2 ++ 4 files changed, 30 insertions(+) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index f7a96bc..a011bb6 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -52,6 +52,7 @@ #include #include +#include #include "io-pgtable.h" #include "arm-smmu-regs.h" @@ -1459,6 +1460,8 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev) if (dev_is_pci(dev)) group = pci_device_group(dev); + else if (dev_is_fsl_mc(dev)) + group = fsl_mc_device_group(dev); else group = generic_device_group(dev); @@ -2037,6 +2040,10 @@ static void arm_smmu_bus_init(void) bus_set_iommu(_bus_type, _smmu_ops); } #endif +#ifdef CONFIG_FSL_MC_BUS + if (!iommu_present(_mc_bus_type)) + bus_set_iommu(_mc_bus_type, _smmu_ops); +#endif } static int arm_smmu_device_probe(struct platform_device *pdev) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index d227b86..df2f49e 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -32,6 +32,7 @@ #include #include #include +#include #include static struct kset *iommu_group_kset; @@ -988,6 +989,18 @@ struct iommu_group *pci_device_group(struct device *dev) return iommu_group_alloc(); } +/* Get the IOMMU group for device on fsl-mc bus */ +struct iommu_group *fsl_mc_device_group(struct device *dev) +{ + struct device *cont_dev = fsl_mc_cont_dev(dev); + struct iommu_group *group; + + group = iommu_group_get(cont_dev); + if (!group) + group = iommu_group_alloc(); + return group; +} + /** * iommu_group_get_for_dev - Find or create the IOMMU group for a device * @dev: target device diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h index f27cb14..dddaca1 100644 --- a/include/linux/fsl/mc.h +++ b/include/linux/fsl/mc.h @@ 
-351,6 +351,14 @@ struct fsl_mc_io { #define dev_is_fsl_mc(_dev) (0) #endif +/* Macro to check if a device is a container device */ +#define fsl_mc_is_cont_dev(_dev) (to_fsl_mc_device(_dev)->flags & \ + FSL_MC_IS_DPRC) + +/* Macro to get the container device of a MC device */ +#define fsl_mc_cont_dev(_dev) (fsl_mc_is_cont_dev(_dev) ? \ + (_dev) : (_dev)->parent) + /* * module_fsl_mc_driver() - Helper macro for drivers that don't do * anything special in module init/exit. This eliminates a lot of diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 7447b0b..209891d 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -389,6 +389,8 @@ static inline size_t iommu_map_sg(struct iommu_domain *domain, extern struct iommu_group *pci_device_group(struct device *dev); /* Generic device grouping function */ extern struct iommu_group *generic_device_group(struct device *dev); +/* FSL-MC device grouping function */ +struct iommu_group *fsl_mc_device_group(struct device *dev); /** * struct iommu_fwspec - per-device IOMMU instance data -- 1.9.1
[PATCH 3/7 v6] iommu/of: support iommu configuration for fsl-mc devices
With of_pci_map_rid available for all the buses, use the function for configuration of devices on the fsl-mc bus. Signed-off-by: Nipun Gupta Reviewed-by: Robin Murphy --- drivers/iommu/of_iommu.c | 20 1 file changed, 20 insertions(+) diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index 811e160..284474d 100644 --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c @@ -24,6 +24,7 @@ #include #include #include +#include #define NO_IOMMU 1 @@ -159,6 +160,23 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data) return err; } +static int of_fsl_mc_iommu_init(struct fsl_mc_device *mc_dev, + struct device_node *master_np) +{ + struct of_phandle_args iommu_spec = { .args_count = 1 }; + int err; + + err = of_map_rid(master_np, mc_dev->icid, "iommu-map", +"iommu-map-mask", _spec.np, +iommu_spec.args); + if (err) + return err == -ENODEV ? NO_IOMMU : err; + + err = of_iommu_xlate(_dev->dev, _spec); + of_node_put(iommu_spec.np); + return err; +} + const struct iommu_ops *of_iommu_configure(struct device *dev, struct device_node *master_np) { @@ -190,6 +208,8 @@ const struct iommu_ops *of_iommu_configure(struct device *dev, err = pci_for_each_dma_alias(to_pci_dev(dev), of_pci_iommu_init, ); + } else if (dev_is_fsl_mc(dev)) { + err = of_fsl_mc_iommu_init(to_fsl_mc_device(dev), master_np); } else { struct of_phandle_args iommu_spec; int idx = 0; -- 1.9.1
[PATCH 2/7 v6] iommu/of: make of_pci_map_rid() available for other devices too
iommu-map property is also used by devices with fsl-mc. This patch moves the of_pci_map_rid to generic location, so that it can be used by other busses too. 'of_pci_map_rid' is renamed here to 'of_map_rid' and there is no functional change done in the API. Signed-off-by: Nipun Gupta Reviewed-by: Rob Herring Reviewed-by: Robin Murphy Acked-by: Bjorn Helgaas --- drivers/iommu/of_iommu.c | 5 +-- drivers/of/base.c| 102 +++ drivers/of/irq.c | 5 +-- drivers/pci/of.c | 101 -- include/linux/of.h | 11 + include/linux/of_pci.h | 10 - 6 files changed, 117 insertions(+), 117 deletions(-) diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c index 5c36a8b..811e160 100644 --- a/drivers/iommu/of_iommu.c +++ b/drivers/iommu/of_iommu.c @@ -149,9 +149,8 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data) struct of_phandle_args iommu_spec = { .args_count = 1 }; int err; - err = of_pci_map_rid(info->np, alias, "iommu-map", -"iommu-map-mask", _spec.np, -iommu_spec.args); + err = of_map_rid(info->np, alias, "iommu-map", "iommu-map-mask", +_spec.np, iommu_spec.args); if (err) return err == -ENODEV ? NO_IOMMU : err; diff --git a/drivers/of/base.c b/drivers/of/base.c index 848f549..c7aac81 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1995,3 +1995,105 @@ int of_find_last_cache_level(unsigned int cpu) return cache_level; } + +/** + * of_map_rid - Translate a requester ID through a downstream mapping. + * @np: root complex device node. + * @rid: device requester ID to map. + * @map_name: property name of the map to use. + * @map_mask_name: optional property name of the mask to use. + * @target: optional pointer to a target device node. + * @id_out: optional pointer to receive the translated ID. + * + * Given a device requester ID, look up the appropriate implementation-defined + * platform ID and/or the target device which receives transactions on that + * ID, as per the "iommu-map" and "msi-map" bindings. 
Either of @target or + * @id_out may be NULL if only the other is required. If @target points to + * a non-NULL device node pointer, only entries targeting that node will be + * matched; if it points to a NULL value, it will receive the device node of + * the first matching target phandle, with a reference held. + * + * Return: 0 on success or a standard error code on failure. + */ +int of_map_rid(struct device_node *np, u32 rid, + const char *map_name, const char *map_mask_name, + struct device_node **target, u32 *id_out) +{ + u32 map_mask, masked_rid; + int map_len; + const __be32 *map = NULL; + + if (!np || !map_name || (!target && !id_out)) + return -EINVAL; + + map = of_get_property(np, map_name, _len); + if (!map) { + if (target) + return -ENODEV; + /* Otherwise, no map implies no translation */ + *id_out = rid; + return 0; + } + + if (!map_len || map_len % (4 * sizeof(*map))) { + pr_err("%pOF: Error: Bad %s length: %d\n", np, + map_name, map_len); + return -EINVAL; + } + + /* The default is to select all bits. */ + map_mask = 0x; + + /* +* Can be overridden by "{iommu,msi}-map-mask" property. +* If of_property_read_u32() fails, the default is used. +*/ + if (map_mask_name) + of_property_read_u32(np, map_mask_name, _mask); + + masked_rid = map_mask & rid; + for ( ; map_len > 0; map_len -= 4 * sizeof(*map), map += 4) { + struct device_node *phandle_node; + u32 rid_base = be32_to_cpup(map + 0); + u32 phandle = be32_to_cpup(map + 1); + u32 out_base = be32_to_cpup(map + 2); + u32 rid_len = be32_to_cpup(map + 3); + + if (rid_base & ~map_mask) { + pr_err("%pOF: Invalid %s translation - %s-mask (0x%x) ignores rid-base (0x%x)\n", + np, map_name, map_name, + map_mask, rid_base); + return -EFAULT; + } + + if (masked_rid < rid_base || masked_rid >= rid_base + rid_len) + continue; + + phandle_node = of_find_node_by_phandle(phandle); + if (!phandle_node) + return -ENODEV; + + if (target) { + if (*target) + of_node_put(phandle_node); + else + *target = phandle_node; + + if
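Stripped of the device-tree plumbing, each map entry in of_map_rid() implements a simple interval translation: mask the requester ID, check it against [rid-base, rid-base + length), and rebase a match onto out-base. A self-contained sketch of that arithmetic — the struct and function names are invented for illustration, and the phandle field is omitted:

```c
#include <assert.h>
#include <stdint.h>

/* One (rid-base, phandle, out-base, length) tuple from an "iommu-map"
 * or "msi-map" property, with the phandle left out of this sketch. */
struct rid_map_entry {
	uint32_t rid_base;
	uint32_t out_base;
	uint32_t rid_len;
};

/* Translate rid through one entry; returns 0 and fills *id_out on a
 * match, -1 when the masked ID falls outside the entry's interval. */
static int map_rid_one(const struct rid_map_entry *e, uint32_t map_mask,
		       uint32_t rid, uint32_t *id_out)
{
	uint32_t masked = rid & map_mask;

	if (masked < e->rid_base || masked >= e->rid_base + e->rid_len)
		return -1;

	*id_out = masked - e->rid_base + e->out_base;
	return 0;
}
```

With the binding example from patch 1/7, an entry equivalent to iommu-map = <23 23 41> maps ICID 30 to specifier 30 and rejects ICID 64, which lies just past the interval.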
[PATCH 1/7 v6] Documentation: fsl-mc: add iommu-map device-tree binding for fsl-mc bus
The existing IOMMU bindings cannot be used to specify the relationship between fsl-mc devices and IOMMUs. This patch adds a generic binding for mapping fsl-mc devices to IOMMUs, using iommu-map property. Signed-off-by: Nipun Gupta Reviewed-by: Rob Herring --- .../devicetree/bindings/misc/fsl,qoriq-mc.txt | 39 ++ 1 file changed, 39 insertions(+) diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt index 6611a7c..01fdc33 100644 --- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt +++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt @@ -9,6 +9,25 @@ blocks that can be used to create functional hardware objects/devices such as network interfaces, crypto accelerator instances, L2 switches, etc. +For an overview of the DPAA2 architecture and fsl-mc bus see: +Documentation/networking/dpaa2/overview.rst + +As described in the above overview, all DPAA2 objects in a DPRC share the +same hardware "isolation context" and a 10-bit value called an ICID +(isolation context id) is expressed by the hardware to identify +the requester. + +The generic 'iommus' property is insufficient to describe the relationship +between ICIDs and IOMMUs, so an iommu-map property is used to define +the set of possible ICIDs under a root DPRC and how they map to +an IOMMU. + +For generic IOMMU bindings, see +Documentation/devicetree/bindings/iommu/iommu.txt. + +For arm-smmu binding, see: +Documentation/devicetree/bindings/iommu/arm,smmu.txt. + Required properties: - compatible @@ -88,14 +107,34 @@ Sub-nodes: Value type: Definition: Specifies the phandle to the PHY device node associated with the this dpmac. +Optional properties: + +- iommu-map: Maps an ICID to an IOMMU and associated iommu-specifier + data. + + The property is an arbitrary number of tuples of + (icid-base,iommu,iommu-base,length). 
+ + Any ICID i in the interval [icid-base, icid-base + length) is + associated with the listed IOMMU, with the iommu-specifier + (i - icid-base + iommu-base). Example: +smmu: iommu@500 { + compatible = "arm,mmu-500"; + #iommu-cells = <1>; + stream-match-mask = <0x7C00>; + ... +}; + fsl_mc: fsl-mc@80c00 { compatible = "fsl,qoriq-mc"; reg = <0x0008 0x0c00 0 0x40>,/* MC portal base */ <0x 0x0834 0 0x4>; /* MC control reg */ msi-parent = <>; +/* define map for ICIDs 23-64 */ +iommu-map = <23 23 41>; #address-cells = <3>; #size-cells = <1>; -- 1.9.1
[PATCH 0/7 v6] Support for fsl-mc bus and its devices in SMMU
This patchset defines IOMMU DT binding for fsl-mc bus and adds support in SMMU for fsl-mc bus. The patch series is based on top of dma-mapping tree (for-next branch): http://git.infradead.org/users/hch/dma-mapping.git These patches - Define property 'iommu-map' for fsl-mc bus (patch 1) - Integrates the fsl-mc bus with the SMMU using this IOMMU binding (patch 2,3,4) - Adds the dma configuration support for fsl-mc bus (patch 5, 6) - Updates the fsl-mc device node with iommu/dma related changes (patch 7) Changes in v2: - use iommu-map property for fsl-mc bus - rebase over patchset https://patchwork.kernel.org/patch/10317337/ and make corresponding changes for dma configuration of devices on fsl-mc bus Changes in v3: - move of_map_rid in drivers/of/address.c Changes in v4: - move of_map_rid in drivers/of/base.c Changes in v5: - break patch 5 in two separate patches (now patch 5/7 and patch 6/7) - add changelog text in patch 3/7 and patch 5/7 - typo fix Changes in v6: - Updated fsl_mc_device_group() API to be more rational - Added dma-coherent property in the LS2 smmu device node - Minor fixes in the device-tree documentation Nipun Gupta (7): Documentation: fsl-mc: add iommu-map device-tree binding for fsl-mc bus iommu/of: make of_pci_map_rid() available for other devices too iommu/of: support iommu configuration for fsl-mc devices iommu/arm-smmu: Add support for the fsl-mc bus bus: fsl-mc: support dma configure for devices on fsl-mc bus bus: fsl-mc: set coherent dma mask for devices on fsl-mc bus arm64: dts: ls208xa: comply with the iommu map binding for fsl_mc .../devicetree/bindings/misc/fsl,qoriq-mc.txt | 39 arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 7 +- drivers/bus/fsl-mc/fsl-mc-bus.c| 16 +++- drivers/iommu/arm-smmu.c | 7 ++ drivers/iommu/iommu.c | 13 +++ drivers/iommu/of_iommu.c | 25 - drivers/of/base.c | 102 + drivers/of/irq.c | 5 +- drivers/pci/of.c | 101 include/linux/fsl/mc.h | 8 ++ include/linux/iommu.h | 2 + include/linux/of.h | 11 +++ 
include/linux/of_pci.h | 10 -- 13 files changed, 224 insertions(+), 122 deletions(-) -- 1.9.1
Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset
On Mon, Jul 9, 2018 at 10:40 AM, Pingfan Liu wrote: > On Mon, Jul 9, 2018 at 3:48 PM Rafael J. Wysocki wrote: >> >> On Mon, Jul 9, 2018 at 8:48 AM, Pingfan Liu wrote: >> > On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki wrote: [cut] >> >> I simply think that there should be one way to iterate over devices >> for both system-wide PM and shutdown. >> >> The reason why it is not like that today is because of the development >> history, but if it doesn't work and we want to fix it, let's just >> consolidate all of that. >> >> Now, system-wide suspend resume sometimes iterates the list in the >> reverse order which would be hard without having a list, wouldn't it? >> > Yes, it would be hard without having a list. I just thought to use > device tree info to build up a shadowed list, and rebuild the list > until there is new device_link_add() operation. For > device_add/_remove(), it can modify the shadowed list directly. Right, and that's the idea of dpm_list, generally speaking: It represents one of the (possibly many) orders in which devices can be suspended (or shut down) based on the information coming from the device hierarchy and device links. So it appears straightforward (even though it may be complicated because of the build-time dependencies) to start using dpm_list for shutdown too - and to ensure that it is properly maintained everywhere. Thanks, Rafael
Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset
On Mon, Jul 9, 2018 at 3:48 PM Rafael J. Wysocki wrote: > > On Mon, Jul 9, 2018 at 8:48 AM, Pingfan Liu wrote: > > On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki wrote: > >> > >> On Sat, Jul 7, 2018 at 6:24 AM, Pingfan Liu wrote: > >> > On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu wrote: > >> >> > >> >> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki > >> >> wrote: > >> >> > > >> >> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner wrote: > >> >> > > [cc += Kishon Vijay Abraham] > >> >> > > > >> >> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote: > >> >> > >> OK, so calling devices_kset_move_last() from really_probe() > >> >> > >> clearly is > >> >> > >> a mistake. > >> >> > >> > >> >> > >> I'm not really sure what the intention of it was as the changelog > >> >> > >> of > >> >> > >> commit 52cdbdd49853d doesn't really explain that (why would it be > >> >> > >> insufficient without that change?) > >> >> > > > >> >> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC > >> >> > > whose reset pin needs to be driven high on shutdown, lest the MMC > >> >> > > won't be found on the next boot. > >> >> > > > >> >> > > The boards' devicetrees use a kludge wherein the reset pin is > >> >> > > modelled > >> >> > > as a regulator. The regulator is enabled when the MMC probes and > >> >> > > disabled on driver unbind and shutdown. As a result, the pin is > >> >> > > driven > >> >> > > low on shutdown and the MMC is not found on the next boot. > >> >> > > > >> >> > > To fix this, another kludge was invented wherein the GPIO expander > >> >> > > driving the reset pin unconditionally drives all its pins high on > >> >> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c > >> >> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state > >> >> > > of all pcf lines"). 
> >> >> > > > >> >> > > For this kludge to work, the GPIO expander's ->shutdown hook needs > >> >> > > to > >> >> > > be executed after the MMC expander's ->shutdown hook. > >> >> > > > >> >> > > Commit 52cdbdd49853d achieved that by reordering devices_kset > >> >> > > according > >> >> > > to the probe order. Apparently the MMC probes after the GPIO > >> >> > > expander, > >> >> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator > >> >> > > isn't > >> >> > > available yet, see mmc_regulator_get_supply(). > >> >> > > > >> >> > > Note, I'm just piecing the information together from git history, > >> >> > > I'm not responsible for these kludges. (I'm innocent!) > >> >> > > >> >> > Sure enough. :-) > >> >> > > >> >> > In any case, calling devices_kset_move_last() in really_probe() is > >> >> > plain broken and if its only purpose was to address a single, arguably > >> >> > kludgy, use case, let's just get rid of it in the first place IMO. > >> >> > > >> >> Yes, if it is only used for a single use case. > >> >> > >> > Think it again, I saw other potential issue with the current code. > >> > device_link_add->device_reorder_to_tail() can break the > >> > "supplier<-consumer" order. During moving children after parent's > >> > supplier, it ignores the order of child's consumer. > >> > >> What do you mean? > >> > > The drivers use device_link_add() to build "supplier<-consumer" order > > without knowing each other. Hence there is the following potential > > odds: (consumerX, child_a, ...) (consumer_a,..) (supplierX), where > > consumer_a consumes child_a. > > Well, what's the initial state of the list? > > > When device_link_add()->device_reorder_to_tail() moves all descendant of > > consumerX to the tail, it breaks the "supplier<-consumer" order by > > "consumer_a <- child_a". > > That depends on what the initial ordering of the list is and please > note that circular dependencies are explicitly assumed to be not > present. 
> > The assumption is that the initial ordering of the list reflects the > correct suspend (or shutdown) order without the new link. Therefore > initially all children are located after their parents and all known > consumers are located after their suppliers. > > If a new link is added, the new consumer goes to the end of the list > and all of its children and all of its consumers go after it. > device_reorder_to_tail() is recursive, so for each of the devices that > went to the end of the list, all of its children and all of its > consumers go after it and so on. > > Now, that operation doesn't change the order of any of the > parent<-child or supplier<-consumer pairs that get moved and since all > of the devices that depend on any device that get moved go to the end > of list after it, the only devices that don't go to the end of list > are guaranteed to not depend on any of them (they may be parents or > suppliers of the devices that go to the end of the list, but not their > children or suppliers). > Thanks for the detailed explain. It is clear now, and you are right. > > And we need recrusion to resolve the item in > >
Re: NXP p1010se device trees only correct for P1010E/P1014E, not P1010/P1014 SoCs.
On 06/07/18 19:41, Scott Wood wrote: My openwrt patch just does a: /delete-node/ crypto@3; after the p1010si-post.dtsi include. U-Boot should already be removing the node on non-E chips -- see ft_cpu_setup() in arch/powerpc/cpu/mpc85xx/fdt.c Hi Scott, Thanks for your email. The device in question ships an old uboot (a vendor fork of U-Boot 2010.12-svn15934). Am I right in saying that the right fix is to either: use a bootloader (such as current upstream U-Boot) which adjusts the device tree properly, or, in the case (such as OpenWRT) where the preferred installation method is to retain the vendor bootloader, have the distro kernel handle the device tree fixup itself? Regards, Tim.
Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset
On Mon, Jul 9, 2018 at 8:48 AM, Pingfan Liu wrote: > On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki wrote: >> >> On Sat, Jul 7, 2018 at 6:24 AM, Pingfan Liu wrote: >> > On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu wrote: >> >> >> >> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki >> >> wrote: >> >> > >> >> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner wrote: >> >> > > [cc += Kishon Vijay Abraham] >> >> > > >> >> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote: >> >> > >> OK, so calling devices_kset_move_last() from really_probe() clearly >> >> > >> is >> >> > >> a mistake. >> >> > >> >> >> > >> I'm not really sure what the intention of it was as the changelog of >> >> > >> commit 52cdbdd49853d doesn't really explain that (why would it be >> >> > >> insufficient without that change?) >> >> > > >> >> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC >> >> > > whose reset pin needs to be driven high on shutdown, lest the MMC >> >> > > won't be found on the next boot. >> >> > > >> >> > > The boards' devicetrees use a kludge wherein the reset pin is modelled >> >> > > as a regulator. The regulator is enabled when the MMC probes and >> >> > > disabled on driver unbind and shutdown. As a result, the pin is >> >> > > driven >> >> > > low on shutdown and the MMC is not found on the next boot. >> >> > > >> >> > > To fix this, another kludge was invented wherein the GPIO expander >> >> > > driving the reset pin unconditionally drives all its pins high on >> >> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c >> >> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state >> >> > > of all pcf lines"). >> >> > > >> >> > > For this kludge to work, the GPIO expander's ->shutdown hook needs to >> >> > > be executed after the MMC expander's ->shutdown hook. >> >> > > >> >> > > Commit 52cdbdd49853d achieved that by reordering devices_kset >> >> > > according >> >> > > to the probe order. 
Apparently the MMC probes after the GPIO >> >> > > expander, >> >> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't >> >> > > available yet, see mmc_regulator_get_supply(). >> >> > > >> >> > > Note, I'm just piecing the information together from git history, >> >> > > I'm not responsible for these kludges. (I'm innocent!) >> >> > >> >> > Sure enough. :-) >> >> > >> >> > In any case, calling devices_kset_move_last() in really_probe() is >> >> > plain broken and if its only purpose was to address a single, arguably >> >> > kludgy, use case, let's just get rid of it in the first place IMO. >> >> > >> >> Yes, if it is only used for a single use case. >> >> >> > Think it again, I saw other potential issue with the current code. >> > device_link_add->device_reorder_to_tail() can break the >> > "supplier<-consumer" order. During moving children after parent's >> > supplier, it ignores the order of child's consumer. >> >> What do you mean? >> > The drivers use device_link_add() to build "supplier<-consumer" order > without knowing each other. Hence there is the following potential > odds: (consumerX, child_a, ...) (consumer_a,..) (supplierX), where > consumer_a consumes child_a. Well, what's the initial state of the list? > When device_link_add()->device_reorder_to_tail() moves all descendant of > consumerX to the tail, it breaks the "supplier<-consumer" order by > "consumer_a <- child_a". That depends on what the initial ordering of the list is and please note that circular dependencies are explicitly assumed to be not present. The assumption is that the initial ordering of the list reflects the correct suspend (or shutdown) order without the new link. Therefore initially all children are located after their parents and all known consumers are located after their suppliers. If a new link is added, the new consumer goes to the end of the list and all of its children and all of its consumers go after it. 
device_reorder_to_tail() is recursive, so for each of the devices that went to the end of the list, all of its children and all of its consumers go after it and so on. Now, that operation doesn't change the order of any of the parent<-child or supplier<-consumer pairs that get moved and since all of the devices that depend on any device that get moved go to the end of list after it, the only devices that don't go to the end of list are guaranteed to not depend on any of them (they may be parents or suppliers of the devices that go to the end of the list, but not their children or suppliers). > And we need recrusion to resolve the item in > (consumer_a,..), each time when moving a consumer behind its supplier, > we may break "parent<-child". I don't see this as per the above. Say, device_reorder_to_tail() moves a parent after its child. This means that device_reorder_to_tail() was not called for the child after it had been called for the parent, but that is not true, because it is called for all of the children of each
Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset
On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki wrote: > > On Sat, Jul 7, 2018 at 6:24 AM, Pingfan Liu wrote: > > On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu wrote: > >> > >> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki wrote: > >> > > >> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner wrote: > >> > > [cc += Kishon Vijay Abraham] > >> > > > >> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote: > >> > >> OK, so calling devices_kset_move_last() from really_probe() clearly is > >> > >> a mistake. > >> > >> > >> > >> I'm not really sure what the intention of it was as the changelog of > >> > >> commit 52cdbdd49853d doesn't really explain that (why would it be > >> > >> insufficient without that change?) > >> > > > >> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC > >> > > whose reset pin needs to be driven high on shutdown, lest the MMC > >> > > won't be found on the next boot. > >> > > > >> > > The boards' devicetrees use a kludge wherein the reset pin is modelled > >> > > as a regulator. The regulator is enabled when the MMC probes and > >> > > disabled on driver unbind and shutdown. As a result, the pin is driven > >> > > low on shutdown and the MMC is not found on the next boot. > >> > > > >> > > To fix this, another kludge was invented wherein the GPIO expander > >> > > driving the reset pin unconditionally drives all its pins high on > >> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c > >> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state > >> > > of all pcf lines"). > >> > > > >> > > For this kludge to work, the GPIO expander's ->shutdown hook needs to > >> > > be executed after the MMC expander's ->shutdown hook. > >> > > > >> > > Commit 52cdbdd49853d achieved that by reordering devices_kset according > >> > > to the probe order. 
Apparently the MMC probes after the GPIO expander, > >> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't > >> > > available yet, see mmc_regulator_get_supply(). > >> > > > >> > > Note, I'm just piecing the information together from git history, > >> > > I'm not responsible for these kludges. (I'm innocent!) > >> > > >> > Sure enough. :-) > >> > > >> > In any case, calling devices_kset_move_last() in really_probe() is > >> > plain broken and if its only purpose was to address a single, arguably > >> > kludgy, use case, let's just get rid of it in the first place IMO. > >> > > >> Yes, if it is only used for a single use case. > >> > > Think it again, I saw other potential issue with the current code. > > device_link_add->device_reorder_to_tail() can break the > > "supplier<-consumer" order. During moving children after parent's > > supplier, it ignores the order of child's consumer. > > What do you mean? > The drivers use device_link_add() to build "supplier<-consumer" order without knowing each other. Hence there is the following potential odds: (consumerX, child_a, ...) (consumer_a,..) (supplierX), where consumer_a consumes child_a. When device_link_add()->device_reorder_to_tail() moves all descendant of consumerX to the tail, it breaks the "supplier<-consumer" order by "consumer_a <- child_a". And we need recrusion to resolve the item in (consumer_a,..), each time when moving a consumer behind its supplier, we may break "parent<-child". > > Beside this, essentially both devices_kset_move_after/_before() and > > device_pm_move_after/_before() expose the shutdown order to the > > indirect caller, and we can not expect that the caller can not handle > > it correctly. It should be a job of drivers core. > > Arguably so, but that's how those functions were designed and the > callers should be aware of the limitation. > > If they aren't, there is a bug in the caller. 
> If we consider device_move()->device_pm_move_after/_before() more carefully, as described above, then we can hide the detail from the caller and keep the pm-ordering information inside the core. > > It is hard to extract high-dimensional info and pack it into a one-dimensional > > linked list. > > Well, yes and no. > By "hard", I mean that we need two interleaved recursions to make the order correct. Otherwise, I think it is a bug or a limitation. > We know it for a fact that there is a linear ordering that will work. > It is inefficient to figure it out every time during system suspend > and resume, for one, and that's why we have dpm_list. > Yeah, I agree that iterating over the device tree may hurt performance. I guess the iteration will not account for the majority of the suspend time, compared to device_suspend(), which synchronizes with the hardware. But data would be more persuasive. Besides the performance, do you have any other concerns so far? > Now, if we have it for suspend and resume, it can also be used for shutdown. > Yes, I do think so. Thanks and regards, Pingfan
[PATCH] powerpc/64s: Show ori31 availability in spectre_v1 sysfs file not v2
When I added the spectre_v2 information in sysfs, I included the availability of the ori31 speculation barrier. Although the ori31 barrier can be used to mitigate v2, it's primarily intended as a spectre v1 mitigation. Spectre v2 is mitigated by hardware changes. So rework the sysfs files to show the ori31 information in the spectre_v1 file, rather than v2.

Currently we display eg:

  $ grep . spectre_v*
  spectre_v1:Mitigation: __user pointer sanitization
  spectre_v2:Mitigation: Indirect branch cache disabled, ori31 speculation barrier enabled

After:

  $ grep . spectre_v*
  spectre_v1:Mitigation: __user pointer sanitization, ori31 speculation barrier enabled
  spectre_v2:Mitigation: Indirect branch cache disabled

Fixes: d6fbe1c55c55 ("powerpc/64s: Wire up cpu_show_spectre_v2()")
Cc: sta...@vger.kernel.org # v4.17+
Signed-off-by: Michael Ellerman
---
 arch/powerpc/kernel/security.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index a8b277362931..4cb8f1f7b593 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -117,25 +117,35 @@ ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, cha
 
 ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	if (!security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR))
-		return sprintf(buf, "Not affected\n");
+	struct seq_buf s;
+
+	seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
-	if (barrier_nospec_enabled)
-		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
+	if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) {
+		if (barrier_nospec_enabled)
+			seq_buf_printf(&s, "Mitigation: __user pointer sanitization");
+		else
+			seq_buf_printf(&s, "Vulnerable");
 
-	return sprintf(buf, "Vulnerable\n");
+		if (security_ftr_enabled(SEC_FTR_SPEC_BAR_ORI31))
+			seq_buf_printf(&s, ", ori31 speculation barrier enabled");
+
+		seq_buf_printf(&s, "\n");
+	} else
+		seq_buf_printf(&s, "Not affected\n");
+
+	return s.len;
 }
 
 ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	bool bcs, ccd, ori;
 	struct seq_buf s;
+	bool bcs, ccd;
 
 	seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
 	bcs = security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED);
 	ccd = security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED);
-	ori = security_ftr_enabled(SEC_FTR_SPEC_BAR_ORI31);
 
 	if (bcs || ccd) {
 		seq_buf_printf(&s, "Mitigation: ");
@@ -151,9 +161,6 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c
 	} else
 		seq_buf_printf(&s, "Vulnerable");
 
-	if (ori)
-		seq_buf_printf(&s, ", ori31 speculation barrier enabled");
-
 	seq_buf_printf(&s, "\n");
 
 	return s.len;
-- 
2.14.1