[PATCH NEXT 1/4] powerpc/pasemi: Add PCI initialisation for Nemo board.

2018-07-09 Thread Christian Zigotzky
Hi Michael,
Hi All,

kbuild test robot Wed, 03 Jan 2018 04:17:20 -0800 wrote:

Hi Darren,

Thank you for the patch! Perhaps something to improve:

arch/powerpc/platforms/pasemi/pci.c: In function 'sb600_set_flag':
>> include/linux/kern_levels.h:5:18: warning: format '%lx' expects argument of
>> type 'long unsigned int', but argument 2 has type 'resource_size_t {aka long
>> long unsigned int}' [-Wformat=]
   #define KERN_SOH "\001" /* ASCII Start Of Header */

---

I was able to fix this small format issue. I replaced the format '%08lx' with 
'%08llx'.

+ printk(KERN_CRIT "NEMO SB600 IOB base %08llx\n",res.start);

Is this fix OK or is there a better solution?

> On 3. May 2018, at 15:06, Michael Ellerman  wrote:
> 
>> +
>> +printk(KERN_CRIT "NEMO SB600 IOB base %08lx\n",res.start);
> 
> That's INFO or even DEBUG.
>> 

Michael,

What do you think about this fix?

+ printk(KERN_INFO "NEMO SB600 IOB base %08llx\n",res.start);

Thanks,
Christian
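
For reference, a minimal alternative sketch that avoids adjusting the format
width by hand: the kernel's printk pointer extensions can print a
resource_size_t or a whole struct resource directly (see
Documentation/core-api/printk-formats.rst), so the line could also read:

	printk(KERN_INFO "NEMO SB600 IOB base %pa\n", &res.start);

or, printing the full resource (start-end plus flags):

	printk(KERN_INFO "NEMO SB600 IOB %pR\n", &res);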

Re: Can people please check this patch out for cross-arch issues

2018-07-09 Thread Nicholas Piggin
On Mon, 9 Jul 2018 16:57:46 -0700
Linus Torvalds  wrote:

> We have a funny new bugzilla entry for an issue that is 12 years old,
> and not really all that critical, but one that *can* become a problem
> for people who do very specific things.
> 
> What happens is that 'fork()' will cause a re-try (with
> ERESTARTNOINTR) if a signal has come in between the point where the
> fork() started, but before we add the process to the process table.
> The reason is that the signal possibly *should* affect the new child
> process too, but it was never queued to it because we were obviously
> in the process of creating it.
> 
> That's normally entirely a non-issue, and I don't think anybody ever
> imagined it could matter in practice, but apparently there are loads
> where this becomes problematic.
> 
> See kernel bugzilla at
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=200447
> 
> which has a trial balloon patch for this issue already, to at least
> limit that retry to only signals that actually might affect the child
> (ie not any random signals sent explicitly and directly only to the
> parent process).
> 
> HOWEVER.
> 
> The very first thing I noticed while looking at this was that one of
> the more expensive parts of the fork() retry is actually marking all
> the parent page tables read-only. Now, it's one of _many_ expensive
> parts, and it's not nearly as expensive as all the reference counting
> we do for each page, but it's actually very easy to avoid. When we
> have repeated fork() calls, there's just no point in repeatedly
> marking pages read-only.
> 
> Thus the attached one-liner patch.
> 
> The reason I'm sending it to the arch people is that while this is a
> totally trivial patch on x86 ("pte_write()" literally tests exactly
> the same bit that "pte_wrprotect()" and "ptep_set_wrprotect()"
> clears), the same is not necessarily always true on other
> architectures.
> 
> Some other architectures make "ptep_set_wrprotect()" do more than just
> clear the one bit we test with "pte_write()".
> 
> Honestly, I don't think it could possibly matter: if "pte_write()"
> returns false, then whatever "ptep_set_wrprotect()" does can not
> really matter (at least for a COW mapping). But I thought I'd send
> this out for comments anyway, despite how trivial it looks.
> 
> So. Comments?

Looks good to me (after the huge page bits are done?)

If pte_write returns false here, surely any later write access has to
fault otherwise we've got bigger problems right? powerpc/64/radix is
pretty unsurprising so no problems there (it just modifies the pte, so
shouldn't change anything in this case). Hash will actually schedule
the hash table entry to be invalidated, but it can't be writable.

> It doesn't make a huge difference, honestly, and the real fix for the
> "retry too eagerly" is completely different, but at the same time this
> one-liner trivial fix does feel like the RightThing(tm) regardless.

Acked-by: Nicholas Piggin 
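
For context, the one-liner being discussed amounts to roughly the following
check in the fork()/COW copy path; this is a sketch of the relevant spot in
mm/memory.c (copy_one_pte()), not the literal attached patch:

	/*
	 * If it's a COW mapping, write-protect the PTE in both parent and
	 * child -- but skip the work when the parent PTE is already
	 * read-only, e.g. because a previously retried fork() already did it.
	 */
	if (is_cow_mapping(vm_flags) && pte_write(pte)) {
		ptep_set_wrprotect(src_mm, addr, src_pte);
		pte = pte_wrprotect(pte);
	}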


Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100

2018-07-09 Thread Alexey Kardashevskiy
On Thu, 7 Jun 2018 23:03:23 -0600
Alex Williamson  wrote:

> On Fri, 8 Jun 2018 14:14:23 +1000
> Alexey Kardashevskiy  wrote:
> 
> > On 8/6/18 1:44 pm, Alex Williamson wrote:  
> > > On Fri, 8 Jun 2018 13:08:54 +1000
> > > Alexey Kardashevskiy  wrote:
> > > 
> > >> On 8/6/18 8:15 am, Alex Williamson wrote:
> > >>> On Fri, 08 Jun 2018 07:54:02 +1000
> > >>> Benjamin Herrenschmidt  wrote:
> > >>>   
> >  On Thu, 2018-06-07 at 11:04 -0600, Alex Williamson wrote:  
> > >
> > > Can we back up and discuss whether the IOMMU grouping of NVLink
> > > connected devices makes sense?  AIUI we have a PCI view of these
> > > devices and from that perspective they're isolated.  That's the view 
> > > of
> > > the device used to generate the grouping.  However, not visible to us,
> > > these devices are interconnected via NVLink.  What isolation 
> > > properties
> > > does NVLink provide given that its entire purpose for existing seems 
> > > to
> > > be to provide a high performance link for p2p between devices?
> > 
> >  Not entirely. On POWER chips, we also have an NVLink between the device
> >  and the CPU which is running significantly faster than PCIe.
> > 
> >  But yes, there are cross-links and those should probably be accounted
> >  for in the grouping.  
> > >>>
> > >>> Then after we fix the grouping, can we just let the host driver manage
> > >>> this coherent memory range and expose vGPUs to guests?  The use case of
> > >>> assigning all 6 GPUs to one VM seems pretty limited.  (Might need to
> > >>> convince NVIDIA to support more than a single vGPU per VM though)  
> > >>
> > >> These are physical GPUs, not the virtual SR-IOV-alike things they are
> > >> also implementing elsewhere.
> > > 
> > > vGPUs as implemented on M- and P-series Teslas aren't SR-IOV like
> > > either.  That's why we have mdev devices now to implement software
> > > defined devices.  I don't have first hand experience with V-series, but
> > > I would absolutely expect a PCIe-based Tesla V100 to support vGPU.
> > 
> > So assuming V100 can do vGPU, you are suggesting ditching this patchset and
> > using mediated vGPUs instead, correct?  
> 
> If it turns out that our PCIe-only-based IOMMU grouping doesn't
> account for lack of isolation on the NVLink side and we correct that,
> limiting assignment to sets of 3 interconnected GPUs, is that still a
> useful feature?  OTOH, it's entirely an NVIDIA proprietary decision
> whether they choose to support vGPU on these GPUs or whether they can
> be convinced to support multiple vGPUs per VM.
> 
> > >> My current understanding is that every P9 chip in that box has some 
> > >> NVLink2
> > >> logic on it so each P9 is directly connected to 3 GPUs via PCIe and
> > >> 2xNVLink2, and GPUs in that big group are interconnected by NVLink2 links
> > >> as well.
> > >>
> > >> From small bits of information I have it seems that a GPU can perfectly
> > >> work alone and if the NVIDIA driver does not see these interconnects
> > >> (because we do not pass the rest of the big 3xGPU group to this guest), 
> > >> it
> > >> continues with a single GPU. There is an "nvidia-smi -r" big reset hammer
> > >> which simply refuses to work until all 3 GPUs are passed so there is some
> > >> distinction between passing 1 or 3 GPUs, and I am trying (as we speak) to
> > >> get a confirmation from NVIDIA that it is ok to pass just a single GPU.
> > >>
> > >> So we will either have 6 groups (one per GPU) or 2 groups (one per
> > >> interconnected group).
> > > 
> > > I'm not gaining much confidence that we can rely on isolation between
> > > NVLink connected GPUs, it sounds like you're simply expecting that
> > > proprietary code from NVIDIA on a proprietary interconnect from NVIDIA
> > > is going to play nice and nobody will figure out how to do bad things
> > > because... obfuscation?  Thanks,
> > 
> > Well, we already believe that the proprietary firmware of an SR-IOV-capable
> > adapter like Mellanox ConnectX is not doing bad things, so how is this
> > different in principle?  
> 
> It seems like the scope and hierarchy are different.  Here we're
> talking about exposing big discrete devices, which are peers of one
> another (and have history of being reverse engineered), to userspace
> drivers.  Once handed to userspace, each of those devices needs to be
> considered untrusted.  In the case of SR-IOV, we typically have a
> trusted host driver for the PF managing untrusted VFs.  We do rely on
> some sanity in the hardware/firmware in isolating the VFs from each
> other and from the PF, but we also often have source code for Linux
> drivers for these devices and sometimes even datasheets.  Here we have
> neither of those and perhaps we won't know the extent of the lack of
> isolation between these devices until nouveau (best case) or some
> exploit (worst case) exposes it.  IOMMU grouping always assumes a lack
> of 

[PATCH] powerpc: Replaced msleep with usleep_range

2018-07-09 Thread Daniel Klamt
Replace msleep() for delays of less than 10ms with usleep_range(), because
msleep() will often sleep longer than intended.
For original explanation see:
Documentation/timers/timers-howto.txt

Signed-off-by: Daniel Klamt 
Signed-off-by: Bjoern Noetel 
---
 arch/powerpc/sysdev/xive/native.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/sysdev/xive/native.c 
b/arch/powerpc/sysdev/xive/native.c
index 311185b9960a..b164b1cdf4d6 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -109,7 +109,7 @@ int xive_native_configure_irq(u32 hw_irq, u32 target, u8 
prio, u32 sw_irq)
rc = opal_xive_set_irq_config(hw_irq, target, prio, sw_irq);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100)
}
return rc == 0 ? 0 : -ENXIO;
 }
@@ -163,7 +163,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q 
*q, u8 prio,
rc = opal_xive_set_queue_info(vp_id, prio, qpage_phys, order, 
flags);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100);
}
if (rc) {
pr_err("Error %lld setting queue for prio %d\n", rc, prio);
@@ -190,7 +190,7 @@ static void __xive_native_disable_queue(u32 vp_id, struct 
xive_q *q, u8 prio)
rc = opal_xive_set_queue_info(vp_id, prio, 0, 0, 0);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100);
}
if (rc)
pr_err("Error %lld disabling queue for prio %d\n", rc, prio);
@@ -253,7 +253,7 @@ static int xive_native_get_ipi(unsigned int cpu, struct 
xive_cpu *xc)
for (;;) {
irq = opal_xive_allocate_irq(chip_id);
if (irq == OPAL_BUSY) {
-   msleep(1);
+   usleep_range(1000, 1100);
continue;
}
if (irq < 0) {
@@ -275,7 +275,7 @@ u32 xive_native_alloc_irq(void)
rc = opal_xive_allocate_irq(OPAL_XIVE_ANY_CHIP);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100);
}
if (rc < 0)
return 0;
@@ -289,7 +289,7 @@ void xive_native_free_irq(u32 irq)
s64 rc = opal_xive_free_irq(irq);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100);
}
 }
 EXPORT_SYMBOL_GPL(xive_native_free_irq);
@@ -305,7 +305,7 @@ static void xive_native_put_ipi(unsigned int cpu, struct 
xive_cpu *xc)
for (;;) {
rc = opal_xive_free_irq(xc->hw_ipi);
if (rc == OPAL_BUSY) {
-   msleep(1);
+   usleep_range(1000, 1100);
continue;
}
xc->hw_ipi = 0;
@@ -400,7 +400,7 @@ static void xive_native_setup_cpu(unsigned int cpu, struct 
xive_cpu *xc)
rc = opal_xive_set_vp_info(vp, OPAL_XIVE_VP_ENABLED, 0);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100);
}
if (rc) {
pr_err("Failed to enable pool VP on CPU %d\n", cpu);
@@ -444,7 +444,7 @@ static void xive_native_teardown_cpu(unsigned int cpu, 
struct xive_cpu *xc)
rc = opal_xive_set_vp_info(vp, 0, 0);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100);
}
 }
 
@@ -645,7 +645,7 @@ u32 xive_native_alloc_vp_block(u32 max_vcpus)
rc = opal_xive_alloc_vp_block(order);
switch (rc) {
case OPAL_BUSY:
-   msleep(1);
+   usleep_range(1000, 1100);
break;
case OPAL_XIVE_PROVISIONING:
if (!xive_native_provision_pages())
@@ -687,7 +687,7 @@ int xive_native_enable_vp(u32 vp_id, bool single_escalation)
rc = opal_xive_set_vp_info(vp_id, flags, 0);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100);
}
return rc ? -EIO : 0;
 }
@@ -701,7 +701,7 @@ int xive_native_disable_vp(u32 vp_id)
rc = opal_xive_set_vp_info(vp_id, 0, 0);
if (rc != OPAL_BUSY)
break;
-   msleep(1);
+   usleep_range(1000, 1100);
}
return rc ? -EIO : 0;
 }
-- 
2.11.0



Re: [PATCH] Documentation: Add powerpc options for spec_store_bypass_disable

2018-07-09 Thread Kees Cook
On Mon, Jul 9, 2018 at 7:08 PM, Michael Ellerman  wrote:
> Document the support for spec_store_bypass_disable that was added for
> powerpc in commit a048a07d7f45 ("powerpc/64s: Add support for a store
> forwarding barrier at kernel entry/exit").
>
> Signed-off-by: Michael Ellerman 

Reviewed-by: Kees Cook 

-Kees

> ---
>  Documentation/admin-guide/kernel-parameters.txt | 16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> I tried documenting the differences between the PPC options and X86 ones in 
> one
> section, but it got quite messy, so I went with this instead. Happy to take
> advice on how better to structure it if anyone has opinions.
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt 
> b/Documentation/admin-guide/kernel-parameters.txt
> index efc7aa7a0670..f320c7168b04 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4060,6 +4060,8 @@
> This parameter controls whether the Speculative Store
> Bypass optimization is used.
>
> +   On x86 the options are:
> +
> on  - Unconditionally disable Speculative Store 
> Bypass
> off - Unconditionally enable Speculative Store 
> Bypass
> auto- Kernel detects whether the CPU model 
> contains an
> @@ -4075,12 +4077,20 @@
> seccomp - Same as "prctl" above, but all seccomp 
> threads
>   will disable SSB unless they explicitly opt 
> out.
>
> -   Not specifying this option is equivalent to
> -   spec_store_bypass_disable=auto.
> -
> Default mitigations:
> X86:If CONFIG_SECCOMP=y "seccomp", otherwise 
> "prctl"
>
> +   On powerpc the options are:
> +
> +   on,auto - On Power8 and Power9 insert a 
> store-forwarding
> + barrier on kernel entry and exit. On Power7
> + perform a software flush on kernel entry and
> + exit.
> +   off - No action.
> +
> +   Not specifying this option is equivalent to
> +   spec_store_bypass_disable=auto.
> +
> spia_io_base=   [HW,MTD]
> spia_fio_base=
> spia_pedr=
> --
> 2.14.1
>



-- 
Kees Cook
Pixel Security


Re: [kbuild-all] [PATCH v6 2/4] resource: Use list_head to link sibling resource

2018-07-09 Thread Baoquan He
On 07/10/18 at 08:59am, Ye Xiaolong wrote:
> Hi,
> 
> On 07/08, Baoquan He wrote:
> >Hi,
> >
> >On 07/05/18 at 01:00am, kbuild test robot wrote:
> >> Hi Baoquan,
> >> 
> >> I love your patch! Yet something to improve:
> >> 
> >> [auto build test ERROR on linus/master]
> >> [also build test ERROR on v4.18-rc3 next-20180704]
> >> [if your patch is applied to the wrong git tree, please drop us a note to 
> >> help improve the system]
> >
> >Thanks for telling me.
> >
> >I cloned 0day-ci/linux to my local PC.
> >https://github.com/0day-ci/linux.git
> >
> >However, I didn't find below branch. And tried to open it in web
> >broswer, also failed.
> >
> 
> Sorry for the inconvenience; the 0day bot didn't push the branch to GitHub
> successfully. I've just pushed it manually, so you can try again.

Thanks, Xiaolong. I have applied them on top of linux-next/master, copied
the attached config file, and run the command to reproduce as suggested.
I have now fixed all the reported issues and will repost.

> 
> >
> >> url:
> >> https://github.com/0day-ci/linux/commits/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180704-121402
> >> config: mips-rb532_defconfig (attached as .config)
> >> compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
> >> reproduce:
> >> wget 
> >> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross 
> >> -O ~/bin/make.cross
> >> chmod +x ~/bin/make.cross
> >> # save the attached .config to linux build tree
> >> GCC_VERSION=7.2.0 make.cross ARCH=mips 
> >
> >I did find an old one, which is for the old version 5 post.
> >
> >[bhe@linux]$ git remote -v
> >0day-ci  https://github.com/0day-ci/linux.git (fetch)
> >0day-ci  https://github.com/0day-ci/linux.git (push)
> >[bhe@dhcp-128-28 linux]$ git branch -a| grep Baoquan| grep resource
> >  
> > remotes/0day-ci/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180612-113600
> >
> >Could you help have a look at this?
> >
> >Thanks
> >Baoquan
> >
> >> 
> >> All error/warnings (new ones prefixed by >>):
> >> 
> >> >> arch/mips/pci/pci-rc32434.c:57:11: error: initialization from 
> >> >> incompatible pointer type [-Werror=incompatible-pointer-types]
> >>  .child = &rc32434_res_pci_mem2
> >>   ^
> >>arch/mips/pci/pci-rc32434.c:57:11: note: (near initialization for 
> >> 'rc32434_res_pci_mem1.child.next')
> >> >> arch/mips/pci/pci-rc32434.c:51:47: warning: missing braces around 
> >> >> initializer [-Wmissing-braces]
> >> static struct resource rc32434_res_pci_mem1 = {
> >>   ^
> >>arch/mips/pci/pci-rc32434.c:60:47: warning: missing braces around 
> >> initializer [-Wmissing-braces]
> >> static struct resource rc32434_res_pci_mem2 = {
> >>   ^
> >>cc1: some warnings being treated as errors
> >> 
> >> vim +57 arch/mips/pci/pci-rc32434.c
> >> 
> >> 73b4390f Ralf Baechle 2008-07-16  50  
> >> 73b4390f Ralf Baechle 2008-07-16 @51  static struct resource 
> >> rc32434_res_pci_mem1 = {
> >> 73b4390f Ralf Baechle 2008-07-16  52   .name = "PCI MEM1",
> >> 73b4390f Ralf Baechle 2008-07-16  53   .start = 0x5000,
> >> 73b4390f Ralf Baechle 2008-07-16  54   .end = 0x5FFF,
> >> 73b4390f Ralf Baechle 2008-07-16  55   .flags = IORESOURCE_MEM,
> >> 73b4390f Ralf Baechle 2008-07-16  56   .sibling = NULL,
> > >> 73b4390f Ralf Baechle 2008-07-16 @57   .child = &rc32434_res_pci_mem2
> >> 73b4390f Ralf Baechle 2008-07-16  58  };
> >> 73b4390f Ralf Baechle 2008-07-16  59  
> >> 
> >> :: The code at line 57 was first introduced by commit
> >> :: 73b4390fb23456964201abda79f1210fe337d01a [MIPS] Routerboard 532: 
> >> Support for base system
> >> 
> >> :: TO: Ralf Baechle 
> >> :: CC: Ralf Baechle 
> >> 
> >> ---
> >> 0-DAY kernel test infrastructure            Open Source Technology Center
> >> https://lists.01.org/pipermail/kbuild-all   Intel Corporation
> >
> >
> >___
> >kbuild-all mailing list
> >kbuild-...@lists.01.org
> >https://lists.01.org/mailman/listinfo/kbuild-all


[PATCH] Documentation: Add powerpc options for spec_store_bypass_disable

2018-07-09 Thread Michael Ellerman
Document the support for spec_store_bypass_disable that was added for
powerpc in commit a048a07d7f45 ("powerpc/64s: Add support for a store
forwarding barrier at kernel entry/exit").

Signed-off-by: Michael Ellerman 
---
 Documentation/admin-guide/kernel-parameters.txt | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

I tried documenting the differences between the PPC options and X86 ones in one
section, but it got quite messy, so I went with this instead. Happy to take
advice on how better to structure it if anyone has opinions.

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index efc7aa7a0670..f320c7168b04 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4060,6 +4060,8 @@
This parameter controls whether the Speculative Store
Bypass optimization is used.
 
+   On x86 the options are:
+
on  - Unconditionally disable Speculative Store 
Bypass
off - Unconditionally enable Speculative Store 
Bypass
auto- Kernel detects whether the CPU model contains 
an
@@ -4075,12 +4077,20 @@
seccomp - Same as "prctl" above, but all seccomp threads
  will disable SSB unless they explicitly opt 
out.
 
-   Not specifying this option is equivalent to
-   spec_store_bypass_disable=auto.
-
Default mitigations:
X86:If CONFIG_SECCOMP=y "seccomp", otherwise "prctl"
 
+   On powerpc the options are:
+
+   on,auto - On Power8 and Power9 insert a store-forwarding
+ barrier on kernel entry and exit. On Power7
+ perform a software flush on kernel entry and
+ exit.
+   off - No action.
+
+   Not specifying this option is equivalent to
+   spec_store_bypass_disable=auto.
+
spia_io_base=   [HW,MTD]
spia_fio_base=
spia_pedr=
-- 
2.14.1



Re: [PATCH 1/3] [v2] powerpc: mac: fix rtc read/write functions

2018-07-09 Thread Finn Thain
On Mon, 9 Jul 2018, Arnd Bergmann wrote:

> 
> The most likely explanation I have here is that the RTC was indeed set 
> to an incorrect date, either because of a depleted battery (not unlikely 
> for a ~15 year old box) or because it was previously stored incorrectly.

The PowerMac stores the GMT offset in NVRAM, and this gets used to 
initialize timezone_offset.

If timezone_offset was negative and now.tv_sec was zero, I think this 
could store a 1969 date in the RTC:

int update_persistent_clock64(struct timespec64 now)
{
struct rtc_time tm;

if (!ppc_md.set_rtc_time)
return -ENODEV;

rtc_time64_to_tm(now.tv_sec + 1 + timezone_offset, &tm);

return ppc_md.set_rtc_time(&tm);
}

But maybe now.tv_sec can be shown to be greater than timezone_offset.

Then, what would happen when the timezone in /etc/localtime disagrees with 
the timezone_offset stored in NVRAM (PRAM)?

Besides that, if the battery went flat, what use is a backtrace? Why not 
scrap the WARN_ON()?
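
To make the negative-time concern concrete, a hypothetical guard (not part of
any posted patch) could refuse to write a pre-1970 value instead of feeding a
negative time to the RTC:

int update_persistent_clock64(struct timespec64 now)
{
	struct rtc_time tm;
	time64_t t = now.tv_sec + 1 + timezone_offset;

	if (!ppc_md.set_rtc_time)
		return -ENODEV;

	/* Hypothetical: don't store a pre-1970 date in the RTC. */
	if (t < 0)
		return -EINVAL;

	rtc_time64_to_tm(t, &tm);

	return ppc_md.set_rtc_time(&tm);
}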

-- 


Re: [kbuild-all] [PATCH v6 2/4] resource: Use list_head to link sibling resource

2018-07-09 Thread Ye Xiaolong
Hi,

On 07/08, Baoquan He wrote:
>Hi,
>
>On 07/05/18 at 01:00am, kbuild test robot wrote:
>> Hi Baoquan,
>> 
>> I love your patch! Yet something to improve:
>> 
>> [auto build test ERROR on linus/master]
>> [also build test ERROR on v4.18-rc3 next-20180704]
>> [if your patch is applied to the wrong git tree, please drop us a note to 
>> help improve the system]
>
>Thanks for telling me.
>
>I cloned 0day-ci/linux to my local PC.
>https://github.com/0day-ci/linux.git
>
>However, I didn't find the branch below, and trying to open it in a web
>browser also failed.
>

Sorry for the inconvenience; the 0day bot didn't push the branch to GitHub
successfully. I've just pushed it manually, so you can try again.

Thanks,
Xiaolong


>
>> url:
>> https://github.com/0day-ci/linux/commits/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180704-121402
>> config: mips-rb532_defconfig (attached as .config)
>> compiler: mipsel-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
>> reproduce:
>> wget 
>> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
>> ~/bin/make.cross
>> chmod +x ~/bin/make.cross
>> # save the attached .config to linux build tree
>> GCC_VERSION=7.2.0 make.cross ARCH=mips 
>
>I did find an old one, which is for the old version 5 post.
>
>[bhe@linux]$ git remote -v
>0day-ci  https://github.com/0day-ci/linux.git (fetch)
>0day-ci  https://github.com/0day-ci/linux.git (push)
>[bhe@dhcp-128-28 linux]$ git branch -a| grep Baoquan| grep resource
>  
> remotes/0day-ci/Baoquan-He/resource-Use-list_head-to-link-sibling-resource/20180612-113600
>
>Could you help have a look at this?
>
>Thanks
>Baoquan
>
>> 
>> All error/warnings (new ones prefixed by >>):
>> 
>> >> arch/mips/pci/pci-rc32434.c:57:11: error: initialization from 
>> >> incompatible pointer type [-Werror=incompatible-pointer-types]
>>  .child = &rc32434_res_pci_mem2
>>   ^
>>arch/mips/pci/pci-rc32434.c:57:11: note: (near initialization for 
>> 'rc32434_res_pci_mem1.child.next')
>> >> arch/mips/pci/pci-rc32434.c:51:47: warning: missing braces around 
>> >> initializer [-Wmissing-braces]
>> static struct resource rc32434_res_pci_mem1 = {
>>   ^
>>arch/mips/pci/pci-rc32434.c:60:47: warning: missing braces around 
>> initializer [-Wmissing-braces]
>> static struct resource rc32434_res_pci_mem2 = {
>>   ^
>>cc1: some warnings being treated as errors
>> 
>> vim +57 arch/mips/pci/pci-rc32434.c
>> 
>> 73b4390f Ralf Baechle 2008-07-16  50  
>> 73b4390f Ralf Baechle 2008-07-16 @51  static struct resource 
>> rc32434_res_pci_mem1 = {
>> 73b4390f Ralf Baechle 2008-07-16  52 .name = "PCI MEM1",
>> 73b4390f Ralf Baechle 2008-07-16  53 .start = 0x5000,
>> 73b4390f Ralf Baechle 2008-07-16  54 .end = 0x5FFF,
>> 73b4390f Ralf Baechle 2008-07-16  55 .flags = IORESOURCE_MEM,
>> 73b4390f Ralf Baechle 2008-07-16  56 .sibling = NULL,
>> 73b4390f Ralf Baechle 2008-07-16 @57 .child = &rc32434_res_pci_mem2
>> 73b4390f Ralf Baechle 2008-07-16  58  };
>> 73b4390f Ralf Baechle 2008-07-16  59  
>> 
>> :: The code at line 57 was first introduced by commit
>> :: 73b4390fb23456964201abda79f1210fe337d01a [MIPS] Routerboard 532: 
>> Support for base system
>> 
>> :: TO: Ralf Baechle 
>> :: CC: Ralf Baechle 
>> 
>> ---
>> 0-DAY kernel test infrastructure            Open Source Technology Center
>> https://lists.01.org/pipermail/kbuild-all   Intel Corporation
>
>
>___
>kbuild-all mailing list
>kbuild-...@lists.01.org
>https://lists.01.org/mailman/listinfo/kbuild-all


Re: [PATCH] powerpc: Replaced msleep with usleep_range

2018-07-09 Thread Benjamin Herrenschmidt
On Mon, 2018-07-09 at 15:57 +0200, Daniel Klamt wrote:
> Replaced msleep for less than 10ms with usleep_range because will
> often sleep longer than intended.
> For original explanation see:
> Documentation/timers/timers-howto.txt

Why ? This is pointless. The original code is smaller and more
readable. We don't care how long it actually sleeps, this is the FW
telling us it's busy (or the HW is), come back a bit later.

Ben.

> Signed-off-by: Daniel Klamt 
> Signed-off-by: Bjoern Noetel 
> ---
>  arch/powerpc/sysdev/xive/native.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/sysdev/xive/native.c 
> b/arch/powerpc/sysdev/xive/native.c
> index 311185b9960a..b164b1cdf4d6 100644
> --- a/arch/powerpc/sysdev/xive/native.c
> +++ b/arch/powerpc/sysdev/xive/native.c
> @@ -109,7 +109,7 @@ int xive_native_configure_irq(u32 hw_irq, u32 target, u8 
> prio, u32 sw_irq)
>   rc = opal_xive_set_irq_config(hw_irq, target, prio, sw_irq);
>   if (rc != OPAL_BUSY)
>   break;
> - msleep(1);
> + usleep_range(1000, 1100)
>   }
>   return rc == 0 ? 0 : -ENXIO;
>  }
> @@ -163,7 +163,7 @@ int xive_native_configure_queue(u32 vp_id, struct xive_q 
> *q, u8 prio,
>   rc = opal_xive_set_queue_info(vp_id, prio, qpage_phys, order, 
> flags);
>   if (rc != OPAL_BUSY)
>   break;
> - msleep(1);
> + usleep_range(1000, 1100);
>   }
>   if (rc) {
>   pr_err("Error %lld setting queue for prio %d\n", rc, prio);
> @@ -190,7 +190,7 @@ static void __xive_native_disable_queue(u32 vp_id, struct 
> xive_q *q, u8 prio)
>   rc = opal_xive_set_queue_info(vp_id, prio, 0, 0, 0);
>   if (rc != OPAL_BUSY)
>   break;
> - msleep(1);
> + usleep_range(1000, 1100);
>   }
>   if (rc)
>   pr_err("Error %lld disabling queue for prio %d\n", rc, prio);
> @@ -253,7 +253,7 @@ static int xive_native_get_ipi(unsigned int cpu, struct 
> xive_cpu *xc)
>   for (;;) {
>   irq = opal_xive_allocate_irq(chip_id);
>   if (irq == OPAL_BUSY) {
> - msleep(1);
> + usleep_range(1000, 1100);
>   continue;
>   }
>   if (irq < 0) {
> @@ -275,7 +275,7 @@ u32 xive_native_alloc_irq(void)
>   rc = opal_xive_allocate_irq(OPAL_XIVE_ANY_CHIP);
>   if (rc != OPAL_BUSY)
>   break;
> - msleep(1);
> + usleep_range(1000, 1100);
>   }
>   if (rc < 0)
>   return 0;
> @@ -289,7 +289,7 @@ void xive_native_free_irq(u32 irq)
>   s64 rc = opal_xive_free_irq(irq);
>   if (rc != OPAL_BUSY)
>   break;
> - msleep(1);
> + usleep_range(1000, 1100);
>   }
>  }
>  EXPORT_SYMBOL_GPL(xive_native_free_irq);
> @@ -305,7 +305,7 @@ static void xive_native_put_ipi(unsigned int cpu, struct 
> xive_cpu *xc)
>   for (;;) {
>   rc = opal_xive_free_irq(xc->hw_ipi);
>   if (rc == OPAL_BUSY) {
> - msleep(1);
> + usleep_range(1000, 1100);
>   continue;
>   }
>   xc->hw_ipi = 0;
> @@ -400,7 +400,7 @@ static void xive_native_setup_cpu(unsigned int cpu, 
> struct xive_cpu *xc)
>   rc = opal_xive_set_vp_info(vp, OPAL_XIVE_VP_ENABLED, 0);
>   if (rc != OPAL_BUSY)
>   break;
> - msleep(1);
> + usleep_range(1000, 1100);
>   }
>   if (rc) {
>   pr_err("Failed to enable pool VP on CPU %d\n", cpu);
> @@ -444,7 +444,7 @@ static void xive_native_teardown_cpu(unsigned int cpu, 
> struct xive_cpu *xc)
>   rc = opal_xive_set_vp_info(vp, 0, 0);
>   if (rc != OPAL_BUSY)
>   break;
> - msleep(1);
> + usleep_range(1000, 1100);
>   }
>  }
>  
> @@ -645,7 +645,7 @@ u32 xive_native_alloc_vp_block(u32 max_vcpus)
>   rc = opal_xive_alloc_vp_block(order);
>   switch (rc) {
>   case OPAL_BUSY:
> - msleep(1);
> + usleep_range(1000, 1100);
>   break;
>   case OPAL_XIVE_PROVISIONING:
>   if (!xive_native_provision_pages())
> @@ -687,7 +687,7 @@ int xive_native_enable_vp(u32 vp_id, bool 
> single_escalation)
>   rc = opal_xive_set_vp_info(vp_id, flags, 0);
>   if (rc != OPAL_BUSY)
>   break;
> - msleep(1);
> + usleep_range(1000, 1100);
>   }
>   return rc ? -EIO : 0;
>  }
> @@ -701,7 +701,7 @@ int xive_native_disable_vp(u32 vp_id)
>   rc = opal_xive_set_vp_info(vp_id, 0, 0);
>   if (rc 

Re: NXP p1010se device trees only correct for P1010E/P1014E, not P1010/P1014 SoCs.

2018-07-09 Thread Scott Wood
On Mon, 2018-07-09 at 09:38 +0100, Tim Small wrote:
> On 06/07/18 19:41, Scott Wood wrote:
> > > My openwrt patch
> > > just does a:
> > > 
> > > /delete-node/  crypto@3;
> > > 
> > > after the p1010si-post.dtsi include.
> > 
> > U-Boot should already be removing the node on non-E chips -- see
> > ft_cpu_setup() in arch/powerpc/cpu/mpc85xx/fdt.c
> 
> 
> Hi Scott,
> 
> Thanks for your email.  The device in question ships an old uboot (a 
> vendor fork of U-Boot 2010.12-svn15934).

This was added by commit 6b70ffb9d1b2e, committed in July 2008... maybe
there's a problem with the old U-Boot finding the crypto node on this
particular chip?

> Am I right in saying that the right fix is to either:
> 
> Use a bootloader (such as current upstream uboot) which adjusts the 
> device tree properly...
> 
> or:
> 
> In the case (such as OpenWRT) where the preferred installation method is 
> to retain the vendor bootloader, then the distro kernel should handle 
> the device tree fixup itself?

The NXP PPC device trees in the kernel are meant to be completed by U-Boot
(years ago I repeatedly suggested that the trees be moved into the U-Boot
source to reflect this, but nobody seemed interested).  Generally that is
mainline and NXP SDK U-Boot, but a board dts file might cater to some other
U-Boot fork (or other bootloader) if that's what ships on the board.  Does
this hardware have a board dts file in mainline Linux?

-Scott
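
If the kernel were to do the fixup itself, as Tim suggests for the
retain-the-vendor-bootloader case, a board-specific sketch might look like
the following. soc_has_sec() is a hypothetical helper standing in for the
SVR "security enabled" check, the compatible string should be checked against
the board dts, and this assumes CONFIG_OF_DYNAMIC; it is not an existing
upstream fixup:

#include <linux/of.h>

/*
 * Sketch only: drop the SEC/crypto node on non-E parts when the old vendor
 * U-Boot has not already done so.  soc_has_sec() is hypothetical.
 */
static int __init p1010_fixup_crypto_node(void)
{
	struct device_node *np;

	if (soc_has_sec())
		return 0;

	np = of_find_compatible_node(NULL, NULL, "fsl,sec-v4.0");
	if (np) {
		of_detach_node(np);
		of_node_put(np);
	}
	return 0;
}
machine_arch_initcall(p1010_rdb, p1010_fixup_crypto_node);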



Re: [PATCH 1/3] [v2] powerpc: mac: fix rtc read/write functions

2018-07-09 Thread Arnd Bergmann
On Sun, Jul 1, 2018 at 5:47 PM, Meelis Roos  wrote:
> A patch for the subject is now upstream. That made me finally take some
> time to test it on my PowerMac G4. Tha date is OK but I get two warnings
> with backtrace on bootup. Full dmesg below.

Thanks for testing this, and sorry for the slow reply.

> [4.026490] WARNING: CPU: 0 PID: 1 at 
> arch/powerpc/platforms/powermac/time.c:154 pmu_get_time+0x7c/0xc8
> [4.032261] Modules linked in:
> [4.037878] CPU: 0 PID: 1 Comm: swapper Tainted: GW 
> 4.18.0-rc2-00223-g1904148a361a #88
> [4.043750] NIP:  c0021354 LR: c0021308 CTR: 
> [4.049585] REGS: ef047cd0 TRAP: 0700   Tainted: GW  
> (4.18.0-rc2-00223-g1904148a361a)
> [4.055572] MSR:  00029032   CR: 44000222  XER: 2000
> [4.061620]
>GPR00: c0021308 ef047d80 ef048000  00d7029c 0004 
> 0001 009c
>GPR08: 00d7 0001 0200 c06a 24000228  
> c0004c9c 
>GPR16:       
> c067 c0601a38
>GPR24: 0008 c0630f18 c062a40c c05fc10c ef047e50 ef273800 
> ef047e50 ef047e50
> [4.092393] NIP [c0021354] pmu_get_time+0x7c/0xc8
> [4.098596] LR [c0021308] pmu_get_time+0x30/0xc8

I don't see how the WARN_ON() triggered unless the PMU time is
actually before 1970.

> [4.104779] Call Trace:
> [4.110909] [ef047d80] [c0021308] pmu_get_time+0x30/0xc8 (unreliable)
> [4.117209] [ef047df0] [c00213e8] pmac_get_rtc_time+0x28/0x40
> [4.123470] [ef047e00] [c000bc04] rtc_generic_get_time+0x20/0x34
> [4.129770] [ef047e10] [c03aca34] __rtc_read_time+0x5c/0xe0
> [4.136060] [ef047e20] [c03acafc] rtc_read_time+0x44/0x7c
> [4.142356] [ef047e40] [c061e000] rtc_hctosys+0x64/0x11c
> [4.148616] [ef047ea0] [c0004aa4] do_one_initcall+0x4c/0x1a8
> [4.154866] [ef047f00] [c06022f0] kernel_init_freeable+0x12c/0x1f4
> [4.161123] [ef047f30] [c0004cb4] kernel_init+0x18/0x130
> [4.167359] [ef047f40] [c00121c4] ret_from_kernel_thread+0x14/0x1c
> [4.173610] Instruction dump:
> [4.179766] 8941002e 5484c00e 5508801e 88e1002f 7c844214 554a402e 7c845214 
> 7c843a14
> [4.186076] 7d244810 7d294910 7d2948f8 552907fe <0f09> 3d2083da 
> 80010074 38210070
> [4.192388] ---[ end trace 2e01ad9337fe08fd ]---
> [4.198643] rtc-generic rtc-generic: hctosys: unable to read the hardware 
> clock

The last message here happens exactly in that case as well: tm_year is before
1970:

int rtc_valid_tm(struct rtc_time *tm)
{
if (tm->tm_year < 70
|| ((unsigned)tm->tm_mon) >= 12
|| tm->tm_mday < 1
|| tm->tm_mday > rtc_month_days(tm->tm_mon, tm->tm_year + 1900)
|| ((unsigned)tm->tm_hour) >= 24
|| ((unsigned)tm->tm_min) >= 60
|| ((unsigned)tm->tm_sec) >= 60)
return -EINVAL;

return 0;
}

The most likely explanation I have here is that the RTC was indeed set to an
incorrect date, either because of a depleted battery (not unlikely for
a ~15 year
old box) or because it was previously stored incorrectly. You say that the
time is correct, but that could also be the case if the machine is connected to
the network and synchronized using NTP. It should not have gotten the
time from the RTC after that error.

If you have the time to do another test, can you boot the machine with
its network disconnected, see if the warning persists (it may have been
repaired after the correct time got written into the RTC during shutdown),
what the output of 'sudo hwclock' is, and whether anything changes after
you set the correct time using 'hwclock --systohc' and reboot?

 Arnd


Re: [PATCH] powerpc: Replaced msleep with usleep_range

2018-07-09 Thread kbuild test robot
Hi Daniel,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.18-rc4 next-20180709]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Daniel-Klamt/powerpc-Replaced-msleep-with-usleep_range/20180709-231913
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/sysdev/xive/native.c: In function 'xive_native_configure_irq':
>> arch/powerpc/sysdev/xive/native.c:113:2: error: expected ';' before '}' token
 }
 ^

vim +113 arch/powerpc/sysdev/xive/native.c

243e2511 Benjamin Herrenschmidt 2017-04-05  103  
243e2511 Benjamin Herrenschmidt 2017-04-05  104  int 
xive_native_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq)
243e2511 Benjamin Herrenschmidt 2017-04-05  105  {
243e2511 Benjamin Herrenschmidt 2017-04-05  106 s64 rc;
243e2511 Benjamin Herrenschmidt 2017-04-05  107  
243e2511 Benjamin Herrenschmidt 2017-04-05  108 for (;;) {
243e2511 Benjamin Herrenschmidt 2017-04-05  109 rc = 
opal_xive_set_irq_config(hw_irq, target, prio, sw_irq);
243e2511 Benjamin Herrenschmidt 2017-04-05  110 if (rc != 
OPAL_BUSY)
243e2511 Benjamin Herrenschmidt 2017-04-05  111 break;
c332c793 Daniel Klamt   2018-07-09  112 
usleep_range(1000, 1100)
243e2511 Benjamin Herrenschmidt 2017-04-05 @113 }
243e2511 Benjamin Herrenschmidt 2017-04-05  114 return rc == 0 ? 0 : 
-ENXIO;
243e2511 Benjamin Herrenschmidt 2017-04-05  115  }
5af50993 Benjamin Herrenschmidt 2017-04-05  116  
EXPORT_SYMBOL_GPL(xive_native_configure_irq);
5af50993 Benjamin Herrenschmidt 2017-04-05  117  
243e2511 Benjamin Herrenschmidt 2017-04-05  118  

:: The code at line 113 was first introduced by commit
:: 243e25112d06b348f087a6f7aba4bbc288285bdd powerpc/xive: Native 
exploitation of the XIVE interrupt controller

:: TO: Benjamin Herrenschmidt 
:: CC: Michael Ellerman 

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [next-20180709][bisected 9cf57731][ppc] build fail with ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734

2018-07-09 Thread Abdul Haleem
On Mon, 2018-07-09 at 13:47 +0200, Peter Zijlstra wrote:
> On Mon, Jul 09, 2018 at 03:21:23PM +0530, Abdul Haleem wrote:
> > Greeting's
> > 
> > Today's next fails to build on powerpc with below error
> > 
> > kernel/cpu.o:(.data.rel+0x18e0): undefined reference to
> > `lockup_detector_online_cpu'
> > ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734
> > kernel/cpu.o:(.data.rel+0x18e8): undefined reference to
> > `lockup_detector_offline_cpu'
> > ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734
> > Makefile:1005: recipe for target 'vmlinux' failed
> > make: *** [vmlinux] Error 1
> 
> Urgh, sorry about that. I think the below should cure that.
> 
> I got confused by all the various CONFIG options hereabouts and
> conflated CONFIG_LOCKUP_DETECTOR and CONFIG_SOFTLOCKUP_DETECTOR it
> seems.
> 
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index 80664bbeca43..08f9247e9827 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -33,15 +33,10 @@ extern int sysctl_hardlockup_all_cpu_backtrace;
>  #define sysctl_hardlockup_all_cpu_backtrace 0
>  #endif /* !CONFIG_SMP */
> 
> -extern int lockup_detector_online_cpu(unsigned int cpu);
> -extern int lockup_detector_offline_cpu(unsigned int cpu);
> -
>  #else /* CONFIG_LOCKUP_DETECTOR */
>  static inline void lockup_detector_init(void) { }
>  static inline void lockup_detector_soft_poweroff(void) { }
>  static inline void lockup_detector_cleanup(void) { }
> -#define lockup_detector_online_cpu   NULL
> -#define lockup_detector_offline_cpu  NULL
>  #endif /* !CONFIG_LOCKUP_DETECTOR */
> 
>  #ifdef CONFIG_SOFTLOCKUP_DETECTOR
> @@ -50,12 +45,18 @@ extern void touch_softlockup_watchdog(void);
>  extern void touch_softlockup_watchdog_sync(void);
>  extern void touch_all_softlockup_watchdogs(void);
>  extern unsigned int  softlockup_panic;
> -#else
> +
> +extern int lockup_detector_online_cpu(unsigned int cpu);
> +extern int lockup_detector_offline_cpu(unsigned int cpu);
> +#else /* CONFIG_SOFTLOCKUP_DETECTOR */
>  static inline void touch_softlockup_watchdog_sched(void) { }
>  static inline void touch_softlockup_watchdog(void) { }
>  static inline void touch_softlockup_watchdog_sync(void) { }
>  static inline void touch_all_softlockup_watchdogs(void) { }
> -#endif
> +
> +#define lockup_detector_online_cpu   NULL
> +#define lockup_detector_offline_cpu  NULL
> +#endif /* CONFIG_SOFTLOCKUP_DETECTOR */
> 
>  #ifdef CONFIG_DETECT_HUNG_TASK
>  void reset_hung_task_detector(void);
> 

Thanks Peter for the patch; build and boot are fine.

Reported-and-tested-by: Abdul Haleem 

-- 
Regard's

Abdul Haleem
IBM Linux Technology Centre





Re: [PATCH 1/2] mm/cma: remove unsupported gfp_mask parameter from cma_alloc()

2018-07-09 Thread Laura Abbott

On 07/09/2018 05:19 AM, Marek Szyprowski wrote:

The cma_alloc() function doesn't really support gfp flags other than
__GFP_NOWARN, so convert the gfp_mask parameter to a boolean no_warn
parameter.

This will help to avoid giving the false impression that this function
supports standard gfp flags and that callers can pass __GFP_ZERO to get a
zeroed buffer, which has already been an issue: see commit dd65a941f6ba
("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").
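
For illustration, a call site after the conversion looks roughly like this;
note that zeroing is the caller's job because cma_alloc() never honoured
__GFP_ZERO (a sketch, assuming a lowmem allocation):

	/* Suppress the allocation-failure warning explicitly. */
	struct page *page = cma_alloc(cma, nr_pages, align, true /* no_warn */);

	if (page)
		memset(page_address(page), 0, nr_pages << PAGE_SHIFT);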



For Ion,

Acked-by: Laura Abbott 


Signed-off-by: Marek Szyprowski 
---
  arch/powerpc/kvm/book3s_hv_builtin.c   | 2 +-
  drivers/s390/char/vmcp.c   | 2 +-
  drivers/staging/android/ion/ion_cma_heap.c | 2 +-
  include/linux/cma.h| 2 +-
  kernel/dma/contiguous.c| 3 ++-
  mm/cma.c   | 8 
  mm/cma_debug.c | 2 +-
  7 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index d4a3f4da409b..fc6bb9630a9c 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -77,7 +77,7 @@ struct page *kvm_alloc_hpt_cma(unsigned long nr_pages)
VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
  
  	return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES),

-GFP_KERNEL);
+false);
  }
  EXPORT_SYMBOL_GPL(kvm_alloc_hpt_cma);
  
diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c

index 948ce82a7725..0fa1b6b1491a 100644
--- a/drivers/s390/char/vmcp.c
+++ b/drivers/s390/char/vmcp.c
@@ -68,7 +68,7 @@ static void vmcp_response_alloc(struct vmcp_session *session)
 * anymore the system won't work anyway.
 */
if (order > 2)
-   page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL);
+   page = cma_alloc(vmcp_cma, nr_pages, 0, false);
if (page) {
session->response = (char *)page_to_phys(page);
session->cma_alloc = 1;
diff --git a/drivers/staging/android/ion/ion_cma_heap.c 
b/drivers/staging/android/ion/ion_cma_heap.c
index 49718c96bf9e..3fafd013d80a 100644
--- a/drivers/staging/android/ion/ion_cma_heap.c
+++ b/drivers/staging/android/ion/ion_cma_heap.c
@@ -39,7 +39,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct 
ion_buffer *buffer,
if (align > CONFIG_CMA_ALIGNMENT)
align = CONFIG_CMA_ALIGNMENT;
  
-	pages = cma_alloc(cma_heap->cma, nr_pages, align, GFP_KERNEL);

+   pages = cma_alloc(cma_heap->cma, nr_pages, align, false);
if (!pages)
return -ENOMEM;
  
diff --git a/include/linux/cma.h b/include/linux/cma.h

index bf90f0bb42bd..190184b5ff32 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -33,7 +33,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, 
phys_addr_t size,
const char *name,
struct cma **res_cma);
  extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int 
align,
- gfp_t gfp_mask);
+ bool no_warn);
  extern bool cma_release(struct cma *cma, const struct page *pages, unsigned 
int count);
  
  extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index d987dcd1bd56..19ea5d70150c 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -191,7 +191,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, 
size_t count,
if (align > CONFIG_CMA_ALIGNMENT)
align = CONFIG_CMA_ALIGNMENT;
  
-	return cma_alloc(dev_get_cma_area(dev), count, align, gfp_mask);

+   return cma_alloc(dev_get_cma_area(dev), count, align,
+gfp_mask & __GFP_NOWARN);
  }
  
  /**

diff --git a/mm/cma.c b/mm/cma.c
index 5809bbe360d7..4cb76121a3ab 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -395,13 +395,13 @@ static inline void cma_debug_show_areas(struct cma *cma) 
{ }
   * @cma:   Contiguous memory region for which the allocation is performed.
   * @count: Requested number of pages.
   * @align: Requested alignment of pages (in PAGE_SIZE order).
- * @gfp_mask:  GFP mask to use during compaction
+ * @no_warn: Avoid printing message about failed allocation
   *
   * This function allocates part of contiguous memory on specific
   * contiguous memory area.
   */
  struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
-  gfp_t gfp_mask)
+  bool no_warn)
  {
unsigned long mask, offset;
unsigned long pfn = -1;
@@ -447,7 +447,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, 
unsigned int align,
pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);

Re: [PATCH v6 8/8] powernv/pseries: consolidate code for mce early handling.

2018-07-09 Thread Michal Suchánek
On Fri, 6 Jul 2018 19:40:24 +1000
Nicholas Piggin  wrote:

> On Wed, 04 Jul 2018 23:30:12 +0530
> Mahesh J Salgaonkar  wrote:
> 
> > From: Mahesh Salgaonkar 
> > 
> > Now that other platforms also implement a real mode MCE handler,
> > let's consolidate the code by sharing the existing powernv machine check
> > early code. Rename machine_check_powernv_early to
> > machine_check_common_early and reuse the code.
> > 
> > Signed-off-by: Mahesh Salgaonkar 
> > ---
> >  arch/powerpc/kernel/exceptions-64s.S |   56
> > +++--- 1 file changed, 11
> > insertions(+), 45 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/exceptions-64s.S
> > b/arch/powerpc/kernel/exceptions-64s.S index
> > 0038596b7906..3e877ec55d50 100644 ---
> > a/arch/powerpc/kernel/exceptions-64s.S +++
> > b/arch/powerpc/kernel/exceptions-64s.S @@ -243,14 +243,13 @@
> > EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
> > SET_SCRATCH0(r13)   /* save r13 */
> > EXCEPTION_PROLOG_0(PACA_EXMC) BEGIN_FTR_SECTION
> > -   b   machine_check_powernv_early
> > +   b   machine_check_common_early
> >  FTR_SECTION_ELSE
> > b   machine_check_pSeries_0
> >  ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> >  EXC_REAL_END(machine_check, 0x200, 0x100)
> >  EXC_VIRT_NONE(0x4200, 0x100)
> > -TRAMP_REAL_BEGIN(machine_check_powernv_early)
> > -BEGIN_FTR_SECTION
> > +TRAMP_REAL_BEGIN(machine_check_common_early)
> > EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
> > /*
> >  * Register contents:
> > @@ -306,7 +305,9 @@ BEGIN_FTR_SECTION
> > /* Save r9 through r13 from EXMC save area to stack frame.
> > */ EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
> > mfmsr   r11 /* get MSR value */
> > +BEGIN_FTR_SECTION
> > ori r11,r11,MSR_ME  /* turn on ME bit
> > */ +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
> > ori r11,r11,MSR_RI  /* turn on RI bit
> > */ LOAD_HANDLER(r12, machine_check_handle_early)
> >  1: mtspr   SPRN_SRR0,r12
> > @@ -325,7 +326,6 @@ BEGIN_FTR_SECTION
> > andcr11,r11,r10 /* Turn off MSR_ME
> > */ b1b
> > b   .   /* prevent speculative execution */
> > -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
> >  
> >  TRAMP_REAL_BEGIN(machine_check_pSeries)
> > .globl machine_check_fwnmi
> > @@ -333,7 +333,7 @@ machine_check_fwnmi:
> > SET_SCRATCH0(r13)   /* save r13 */
> > EXCEPTION_PROLOG_0(PACA_EXMC)
> >  BEGIN_FTR_SECTION
> > -   b   machine_check_pSeries_early
> > +   b   machine_check_common_early
> >  END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
> >  machine_check_pSeries_0:
> > EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
> > @@ -346,45 +346,6 @@ machine_check_pSeries_0:
> >  
> >  TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
> >  
> > -TRAMP_REAL_BEGIN(machine_check_pSeries_early)
> > -BEGIN_FTR_SECTION
> > -   EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
> > -   mr  r10,r1  /* Save r1 */
> > -   ld  r1,PACAMCEMERGSP(r13)   /* Use MC emergency
> > stack */
> > -   subir1,r1,INT_FRAME_SIZE/* alloc stack
> > frame   */
> > -   mfspr   r11,SPRN_SRR0   /* Save SRR0 */
> > -   mfspr   r12,SPRN_SRR1   /* Save SRR1 */
> > -   EXCEPTION_PROLOG_COMMON_1()
> > -   EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
> > -   EXCEPTION_PROLOG_COMMON_3(0x200)
> > -   addir3,r1,STACK_FRAME_OVERHEAD
> > -   BRANCH_LINK_TO_FAR(machine_check_early) /* Function call
> > ABI */ -
> > -   /* Move original SRR0 and SRR1 into the respective regs */
> > -   ld  r9,_MSR(r1)
> > -   mtspr   SPRN_SRR1,r9
> > -   ld  r3,_NIP(r1)
> > -   mtspr   SPRN_SRR0,r3
> > -   ld  r9,_CTR(r1)
> > -   mtctr   r9
> > -   ld  r9,_XER(r1)
> > -   mtxer   r9
> > -   ld  r9,_LINK(r1)
> > -   mtlrr9
> > -   REST_GPR(0, r1)
> > -   REST_8GPRS(2, r1)
> > -   REST_GPR(10, r1)
> > -   ld  r11,_CCR(r1)
> > -   mtcrr11
> > -   REST_GPR(11, r1)
> > -   REST_2GPRS(12, r1)
> > -   /* restore original r1. */
> > -   ld  r1,GPR1(r1)
> > -   SET_SCRATCH0(r13)   /* save r13 */
> > -   EXCEPTION_PROLOG_0(PACA_EXMC)
> > -   b   machine_check_pSeries_0
> > -END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
> > -
> >  EXC_COMMON_BEGIN(machine_check_common)
> > /*
> >  * Machine check is different because we use a different
> > @@ -483,6 +444,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
> > bl  machine_check_early
> > std r3,RESULT(r1)   /* Save result */
> > ld  r12,_MSR(r1)
> > +BEGIN_FTR_SECTION
> > +   bne 9f  /* pSeries: continue
> > to V mode. */ +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)  
> 
> Should this be "b 9f" ? Although...
> 
> >  
> >  #ifdef CONFIG_PPC_P7_NAP
> > /*
> > @@ -564,7 +528,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
> >  9:
> > /* Deliver the machine check to host kernel in V mode. */
> > MACHINE_CHECK_HANDLER_WINDUP
> > -   b   machine_check_pSeries
> > +   SET_SCRATCH0(r13)  

[PATCH v06 9/9] hotplug/pmt: Update topology after PMT

2018-07-09 Thread Michael Bringmann
hotplug/pmt: Call rebuild_sched_domains after applying changes that
update CPU associativity, i.e. after 'readd'ing CPUs.  This ensures
that the deferred calls to arch_update_cpu_topology are now reflected
in the system data structures.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/platforms/pseries/dlpar.c |4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/dlpar.c 
b/arch/powerpc/platforms/pseries/dlpar.c
index 7264b8e..ea3c08a 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -451,6 +452,9 @@ static int dlpar_pmt(struct pseries_hp_errorlog *work)
ssleep(10);
}
 
+   ssleep(5);
+   rebuild_sched_domains();
+
return 0;
 }
 



[PATCH v06 8/9] hotplug/rtas: No rtas_event_scan during PMT update

2018-07-09 Thread Michael Bringmann
hotplug/rtas: Disable rtas_event_scan during device-tree property
updates after migration to reduce conflicts with changes propagated
to other parts of the kernel configuration, such as CPUs or memory.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c |4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 6267b53..f5c9e8f 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -686,14 +686,18 @@ static int dlpar_cpu_readd_by_index(u32 drc_index)
 
pr_info("Attempting to re-add CPU, drc index %x\n", drc_index);
 
+   rtas_event_scan_disable();
arch_update_cpu_topology_suspend();
rc = dlpar_cpu_remove_by_index(drc_index, false);
arch_update_cpu_topology_resume();
+   rtas_event_scan_enable();
 
if (!rc) {
+   rtas_event_scan_disable();
arch_update_cpu_topology_suspend();
rc = dlpar_cpu_add(drc_index, false);
arch_update_cpu_topology_resume();
+   rtas_event_scan_enable();
}
 
if (rc)



[PATCH v06 7/9] powerpc/rtas: Allow disabling rtas_event_scan

2018-07-09 Thread Michael Bringmann
powerpc/rtas: Provide mechanism by which the rtas_event_scan can
be disabled/re-enabled by other portions of the powerpc code.
Among other things, this simplifies the usage of locking mechanisms
for shared kernel resources.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/include/asm/rtas.h |4 
 arch/powerpc/kernel/rtasd.c |   14 ++
 2 files changed, 18 insertions(+)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 4f601c7..4ab605a 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -386,8 +386,12 @@ extern int early_init_dt_scan_rtas(unsigned long node,
 
 #ifdef CONFIG_PPC_RTAS_DAEMON
 extern void rtas_cancel_event_scan(void);
+extern void rtas_event_scan_disable(void);
+extern void rtas_event_scan_enable(void);
 #else
 static inline void rtas_cancel_event_scan(void) { }
+static inline void rtas_event_scan_disable(void) { }
+static inline void rtas_event_scan_enable(void) { }
 #endif
 
 /* Error types logged.  */
diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index 44d66c33d..af69e44 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -455,11 +455,25 @@ static void do_event_scan(void)
  */
 static unsigned long event_scan_delay = 1*HZ;
 static int first_pass = 1;
+static int res_enable = 1;
+
+void rtas_event_scan_disable(void)
+{
+   res_enable = 0;
+}
+
+void rtas_event_scan_enable(void)
+{
+   res_enable = 1;
+}
 
 static void rtas_event_scan(struct work_struct *w)
 {
unsigned int cpu;
 
+   if (!res_enable)
+   return;
+
do_event_scan();
 
get_online_cpus();



[PATCH v06 6/9] pmt/numa: Disable arch_update_cpu_topology during CPU readd

2018-07-09 Thread Michael Bringmann
pmt/numa: Disable arch_update_cpu_topology during post-migration
CPU readd updates when evaluating device-tree changes after LPM,
to avoid thread deadlocks while updating node assignments.
The timing of the threads and timers restarted in a migrated system
overlapped frequently, allowing tasks to start acquiring resources
(get_online_cpus) needed by rebuild_sched_domains.  Defer the
operation of that function until after the CPU readd has completed.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 8f28160..6267b53 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -26,6 +26,7 @@
 #include/* for idle_task_exit */
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -685,9 +686,15 @@ static int dlpar_cpu_readd_by_index(u32 drc_index)
 
pr_info("Attempting to re-add CPU, drc index %x\n", drc_index);
 
+   arch_update_cpu_topology_suspend();
rc = dlpar_cpu_remove_by_index(drc_index, false);
-   if (!rc)
+   arch_update_cpu_topology_resume();
+
+   if (!rc) {
+   arch_update_cpu_topology_suspend();
rc = dlpar_cpu_add(drc_index, false);
+   arch_update_cpu_topology_resume();
+   }
 
if (rc)
pr_info("Failed to update cpu at drc_index %lx\n",



[PATCH v06 5/9] numa: Disable/enable arch_update_cpu_topology

2018-07-09 Thread Michael Bringmann
numa: Provide mechanism to disable/enable operation of
arch_update_cpu_topology/numa_update_cpu_topology.  This is
a simple tool to eliminate some avenues for thread deadlock
observed during system execution.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/include/asm/topology.h |   10 ++
 arch/powerpc/mm/numa.c  |   14 ++
 2 files changed, 24 insertions(+)

diff --git a/arch/powerpc/include/asm/topology.h 
b/arch/powerpc/include/asm/topology.h
index 16b0778..d9ceba6 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -43,6 +43,8 @@ static inline int pcibus_to_node(struct pci_bus *bus)
 extern int sysfs_add_device_to_node(struct device *dev, int nid);
 extern void sysfs_remove_device_from_node(struct device *dev, int nid);
 extern int numa_update_cpu_topology(bool cpus_locked);
+extern void arch_update_cpu_topology_suspend(void);
+extern void arch_update_cpu_topology_resume(void);
 
 static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node)
 {
@@ -82,6 +84,14 @@ static inline int numa_update_cpu_topology(bool cpus_locked)
return 0;
 }
 
+static inline void arch_update_cpu_topology_suspend(void)
+{
+}
+
+static inline void arch_update_cpu_topology_resume(void)
+{
+}
+
 static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node) {}
 
 #endif /* CONFIG_NUMA */
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index b22e27a..2352489 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1079,6 +1079,7 @@ struct topology_update_data {
 static int topology_timer_secs = 1;
 static int topology_inited;
 static int topology_update_needed;
+static int topology_update_enabled = 1;
 static struct mutex topology_update_lock;
 
 /*
@@ -1313,6 +1314,9 @@ int numa_update_cpu_topology(bool cpus_locked)
return 0;
}
 
+   if (!topology_update_enabled)
+   return 0;
+
weight = cpumask_weight(&cpu_associativity_changes_mask);
if (!weight)
return 0;
@@ -1439,6 +1443,16 @@ int arch_update_cpu_topology(void)
return numa_update_cpu_topology(true);
 }
 
+void arch_update_cpu_topology_suspend(void)
+{
+   topology_update_enabled = 0;
+}
+
+void arch_update_cpu_topology_resume(void)
+{
+   topology_update_enabled = 1;
+}
+
 static void topology_work_fn(struct work_struct *work)
 {
rebuild_sched_domains();



[PATCH v06 4/9] mobility/numa: Ensure numa update does not overlap

2018-07-09 Thread Michael Bringmann
mobility/numa: Ensure that numa_update_cpu_topology() can not be
entered multiple times concurrently.  It may be accessed through
many different paths / concurrent work functions, and the lock
ordering may be difficult to ensure otherwise.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/mm/numa.c |9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index a789d57..b22e27a 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1079,6 +1079,7 @@ struct topology_update_data {
 static int topology_timer_secs = 1;
 static int topology_inited;
 static int topology_update_needed;
+static struct mutex topology_update_lock;
 
 /*
  * Change polling interval for associativity changes.
@@ -1320,6 +1321,11 @@ int numa_update_cpu_topology(bool cpus_locked)
if (!updates)
return 0;
 
+   if (!mutex_trylock(&topology_update_lock)) {
+   kfree(updates);
+   return 0;
+   }
+
	cpumask_clear(&updated_cpus);
 
	for_each_cpu(cpu, &cpu_associativity_changes_mask) {
@@ -1424,6 +1430,7 @@ int numa_update_cpu_topology(bool cpus_locked)
 out:
kfree(updates);
topology_update_needed = 0;
+   mutex_unlock(&topology_update_lock);
return changed;
 }
 
@@ -1598,6 +1605,8 @@ static ssize_t topology_write(struct file *file, const 
char __user *buf,
 
 static int topology_update_init(void)
 {
+   mutex_init(&topology_update_lock);
+
/* Do not poll for changes if disabled at boot */
if (topology_updates_enabled)
start_topology_update();



[PATCH v06 3/9] hotplug/cpu: Provide CPU readd operation

2018-07-09 Thread Michael Bringmann
powerpc/dlpar: Provide hotplug CPU 'readd by index' operation to
support LPAR Post Migration state updates.  When such changes are
invoked by the PowerPC 'mobility' code, they will be queued up so
that modifications to CPU properties will take place after the new
property value is written to the device-tree.

Signed-off-by: Michael Bringmann 
---
Changes in patch:
  -- Add CPU validity check to pseries_smp_notifier
  -- Improve check on 'ibm,associativity' property
  -- Add check for cpu type to new update property entry
  -- Cleanup reference to outdated queuing function.
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c |   58 ++
 1 file changed, 58 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 3632db2..8f28160 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -305,6 +305,36 @@ static int pseries_add_processor(struct device_node *np)
return err;
 }
 
+static int pseries_update_processor(struct of_reconfig_data *pr)
+{
+   int old_entries, new_entries, rc = 0;
+   __be32 *old_assoc, *new_assoc;
+
+   /* We only handle changes due to 'ibm,associativity' property
+*/
+   old_assoc = pr->old_prop->value;
+   old_entries = be32_to_cpu(*old_assoc++);
+
+   new_assoc = pr->prop->value;
+   new_entries = be32_to_cpu(*new_assoc++);
+
+   if (old_entries == new_entries) {
+   int sz = old_entries * sizeof(int);
+
+   if (memcmp(old_assoc, new_assoc, sz))
+   rc = dlpar_queue_action(
+   PSERIES_HP_ELOG_RESOURCE_CPU,
+   PSERIES_HP_ELOG_ACTION_READD,
+   pr->dn->phandle);
+   } else {
+   rc = dlpar_queue_action(PSERIES_HP_ELOG_RESOURCE_CPU,
+   PSERIES_HP_ELOG_ACTION_READD,
+   pr->dn->phandle);
+   }
+
+   return rc;
+}
+
 /*
  * Update the present map for a cpu node which is going away, and set
  * the hard id in the paca(s) to -1 to be consistent with boot time
@@ -649,6 +679,26 @@ static int dlpar_cpu_remove_by_index(u32 drc_index, bool 
release_drc)
return rc;
 }
 
+static int dlpar_cpu_readd_by_index(u32 drc_index)
+{
+   int rc = 0;
+
+   pr_info("Attempting to re-add CPU, drc index %x\n", drc_index);
+
+   rc = dlpar_cpu_remove_by_index(drc_index, false);
+   if (!rc)
+   rc = dlpar_cpu_add(drc_index, false);
+
+   if (rc)
+   pr_info("Failed to update cpu at drc_index %lx\n",
+   (unsigned long int)drc_index);
+   else
+   pr_info("CPU at drc_index %lx was updated\n",
+   (unsigned long int)drc_index);
+
+   return rc;
+}
+
 static int find_dlpar_cpus_to_remove(u32 *cpu_drcs, int cpus_to_remove)
 {
struct device_node *dn;
@@ -839,6 +889,9 @@ int dlpar_cpu(struct pseries_hp_errorlog *hp_elog)
else
rc = -EINVAL;
break;
+   case PSERIES_HP_ELOG_ACTION_READD:
+   rc = dlpar_cpu_readd_by_index(drc_index);
+   break;
default:
pr_err("Invalid action (%d) specified\n", hp_elog->action);
rc = -EINVAL;
@@ -902,6 +955,11 @@ static int pseries_smp_notifier(struct notifier_block *nb,
case OF_RECONFIG_DETACH_NODE:
pseries_remove_processor(rd->dn);
break;
+   case OF_RECONFIG_UPDATE_PROPERTY:
+   if (!strcmp(rd->dn->type, "cpu") &&
+   !strcmp(rd->prop->name, "ibm,associativity"))
+   pseries_update_processor(rd);
+   break;
}
return notifier_from_errno(err);
 }



[PATCH v06 2/9] hotplug/cpu: Add operation queuing function

2018-07-09 Thread Michael Bringmann
migration/dlpar: This patch adds the function dlpar_queue_action(),
which queues up information about a CPU/Memory 'readd' operation
according to resource type, action code, and DRC index.  At a
subsequent point, the list of operations can be run/played in series.
Examples of such operations include the 'readd' of CPU and Memory
blocks identified as having changed their associativity during an
LPAR migration event.
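
For illustration only, here is a rough sketch of how the new helpers are
meant to be used together.  The real callers are added later in this
series (pseries_update_processor() and the post-migration path in
mobility.c); error handling is omitted, and the header carrying the new
prototypes is an assumption:

#include <linux/types.h>
#include <asm/rtas.h>		/* PSERIES_HP_ELOG_* definitions */
#include "pseries.h"		/* assumed home of the new prototypes */

/* Record a CPU 'readd'; nothing is executed at this point. */
static void example_note_cpu_readd(u32 drc_index)
{
	dlpar_queue_action(PSERIES_HP_ELOG_RESOURCE_CPU,
			   PSERIES_HP_ELOG_ACTION_READD, drc_index);
}

/* Once the device tree has been updated at the end of a migration,
 * replay the whole queue on the pseries hotplug workqueue. */
static void example_apply_queued_readds(void)
{
	dlpar_queued_actions_run();
}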

Signed-off-by: Michael Bringmann 
---
Changes in patch:
  -- Correct drc_index before adding to pseries_hp_errorlog struct
  -- Correct text of notice
  -- Revise queuing model to save up all of the DLPAR actions for
 later execution.
  -- Restore list init statement missing from patch
  -- Move call to apply queued operations into 'mobility.c'
  -- Compress some code
  -- Rename some of queueing function APIs
  -- Revise implementation to push execution of queued operations
 to a workqueue task.
  -- Cleanup reference to outdated queuing operation.
---
 arch/powerpc/include/asm/rtas.h   |2 +
 arch/powerpc/platforms/pseries/dlpar.c|   61 +
 arch/powerpc/platforms/pseries/mobility.c |4 ++
 arch/powerpc/platforms/pseries/pseries.h  |2 +
 4 files changed, 69 insertions(+)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 71e393c..4f601c7 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -310,12 +310,14 @@ struct pseries_hp_errorlog {
struct { __be32 count, index; } ic;
	char		drc_name[1];
} _drc_u;
+   struct list_head list;
 };
 
 #define PSERIES_HP_ELOG_RESOURCE_CPU   1
 #define PSERIES_HP_ELOG_RESOURCE_MEM   2
 #define PSERIES_HP_ELOG_RESOURCE_SLOT  3
 #define PSERIES_HP_ELOG_RESOURCE_PHB   4
+#define PSERIES_HP_ELOG_RESOURCE_PMT   5
 
 #define PSERIES_HP_ELOG_ACTION_ADD 1
 #define PSERIES_HP_ELOG_ACTION_REMOVE  2
diff --git a/arch/powerpc/platforms/pseries/dlpar.c 
b/arch/powerpc/platforms/pseries/dlpar.c
index a0b20c0..7264b8e 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static struct workqueue_struct *pseries_hp_wq;
@@ -329,6 +330,8 @@ int dlpar_release_drc(u32 drc_index)
return 0;
 }
 
+static int dlpar_pmt(struct pseries_hp_errorlog *work);
+
 static int handle_dlpar_errorlog(struct pseries_hp_errorlog *hp_elog)
 {
int rc;
@@ -357,6 +360,9 @@ static int handle_dlpar_errorlog(struct pseries_hp_errorlog 
*hp_elog)
case PSERIES_HP_ELOG_RESOURCE_CPU:
rc = dlpar_cpu(hp_elog);
break;
+   case PSERIES_HP_ELOG_RESOURCE_PMT:
+   rc = dlpar_pmt(hp_elog);
+   break;
default:
pr_warn_ratelimited("Invalid resource (%d) specified\n",
hp_elog->resource);
@@ -407,6 +413,61 @@ void queue_hotplug_event(struct pseries_hp_errorlog 
*hp_errlog,
}
 }
 
+LIST_HEAD(dlpar_delayed_list);
+
+int dlpar_queue_action(int resource, int action, u32 drc_index)
+{
+   struct pseries_hp_errorlog *hp_errlog;
+
+   hp_errlog = kmalloc(sizeof(struct pseries_hp_errorlog), GFP_KERNEL);
+   if (!hp_errlog)
+   return -ENOMEM;
+
+   hp_errlog->resource = resource;
+   hp_errlog->action = action;
+   hp_errlog->id_type = PSERIES_HP_ELOG_ID_DRC_INDEX;
+   hp_errlog->_drc_u.drc_index = cpu_to_be32(drc_index);
+
+   list_add_tail(&hp_errlog->list, &dlpar_delayed_list);
+
+   return 0;
+}
+
+static int dlpar_pmt(struct pseries_hp_errorlog *work)
+{
+   struct list_head *pos, *q;
+
+   ssleep(15);
+
+   list_for_each_safe(pos, q, &dlpar_delayed_list) {
+   struct pseries_hp_errorlog *tmp;
+
+   tmp = list_entry(pos, struct pseries_hp_errorlog, list);
+   handle_dlpar_errorlog(tmp);
+
+   list_del(pos);
+   kfree(tmp);
+
+   ssleep(10);
+   }
+
+   return 0;
+}
+
+int dlpar_queued_actions_run(void)
+{
+   if (!list_empty(&dlpar_delayed_list)) {
+   struct pseries_hp_errorlog hp_errlog;
+
+   hp_errlog.resource = PSERIES_HP_ELOG_RESOURCE_PMT;
+   hp_errlog.action = 0;
+   hp_errlog.id_type = 0;
+
+   queue_hotplug_event(&hp_errlog, 0, 0);
+   }
+   return 0;
+}
+
 static int dlpar_parse_resource(char **cmd, struct pseries_hp_errorlog 
*hp_elog)
 {
char *arg;
diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index f6364d9..d0d1cae 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -378,6 +378,10 @@ static ssize_t migration_store(struct class *class,
return rc;
 
post_mobility_fixup();
+
+   /* Apply any necessary changes identified during fixup */
+   dlpar_queued_actions_run();

[PATCH v06 1/9] hotplug/cpu: Conditionally acquire/release DRC index

2018-07-09 Thread Michael Bringmann
powerpc/cpu: Modify dlpar_cpu_add and dlpar_cpu_remove to allow the
skipping of DRC index acquire or release operations during the CPU
add or remove operations.  This is intended to support subsequent
changes to provide a 'CPU readd' operation.

Signed-off-by: Michael Bringmann 
---
Changes in patch:
  -- Move new validity check added to pseries_smp_notifier
 to another patch
---
 arch/powerpc/platforms/pseries/hotplug-cpu.c |   68 +++---
 1 file changed, 39 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 6ef77ca..3632db2 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -432,7 +432,7 @@ static bool valid_cpu_drc_index(struct device_node *parent, 
u32 drc_index)
return found;
 }
 
-static ssize_t dlpar_cpu_add(u32 drc_index)
+static ssize_t dlpar_cpu_add(u32 drc_index, bool acquire_drc)
 {
struct device_node *dn, *parent;
int rc, saved_rc;
@@ -457,19 +457,22 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
return -EINVAL;
}
 
-   rc = dlpar_acquire_drc(drc_index);
-   if (rc) {
-   pr_warn("Failed to acquire DRC, rc: %d, drc index: %x\n",
-   rc, drc_index);
-   of_node_put(parent);
-   return -EINVAL;
+   if (acquire_drc) {
+   rc = dlpar_acquire_drc(drc_index);
+   if (rc) {
+   pr_warn("Failed to acquire DRC, rc: %d, drc index: 
%x\n",
+   rc, drc_index);
+   of_node_put(parent);
+   return -EINVAL;
+   }
}
 
dn = dlpar_configure_connector(cpu_to_be32(drc_index), parent);
if (!dn) {
pr_warn("Failed call to configure-connector, drc index: %x\n",
drc_index);
-   dlpar_release_drc(drc_index);
+   if (acquire_drc)
+   dlpar_release_drc(drc_index);
of_node_put(parent);
return -EINVAL;
}
@@ -484,8 +487,9 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
pr_warn("Failed to attach node %s, rc: %d, drc index: %x\n",
dn->name, rc, drc_index);
 
-   rc = dlpar_release_drc(drc_index);
-   if (!rc)
+   if (acquire_drc)
+   rc = dlpar_release_drc(drc_index);
+   if (!rc || acquire_drc)
dlpar_free_cc_nodes(dn);
 
return saved_rc;
@@ -498,7 +502,7 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
dn->name, rc, drc_index);
 
rc = dlpar_detach_node(dn);
-   if (!rc)
+   if (!rc && acquire_drc)
dlpar_release_drc(drc_index);
 
return saved_rc;
@@ -566,7 +570,8 @@ static int dlpar_offline_cpu(struct device_node *dn)
 
 }
 
-static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index)
+static ssize_t dlpar_cpu_remove(struct device_node *dn, u32 drc_index,
+   bool release_drc)
 {
int rc;
 
@@ -579,12 +584,14 @@ static ssize_t dlpar_cpu_remove(struct device_node *dn, 
u32 drc_index)
return -EINVAL;
}
 
-   rc = dlpar_release_drc(drc_index);
-   if (rc) {
-   pr_warn("Failed to release drc (%x) for CPU %s, rc: %d\n",
-   drc_index, dn->name, rc);
-   dlpar_online_cpu(dn);
-   return rc;
+   if (release_drc) {
+   rc = dlpar_release_drc(drc_index);
+   if (rc) {
+   pr_warn("Failed to release drc (%x) for CPU %s, rc: 
%d\n",
+   drc_index, dn->name, rc);
+   dlpar_online_cpu(dn);
+   return rc;
+   }
}
 
rc = dlpar_detach_node(dn);
@@ -593,7 +600,10 @@ static ssize_t dlpar_cpu_remove(struct device_node *dn, 
u32 drc_index)
 
pr_warn("Failed to detach CPU %s, rc: %d", dn->name, rc);
 
-   rc = dlpar_acquire_drc(drc_index);
+   if (release_drc)
+   rc = dlpar_acquire_drc(drc_index);
+   else
+   rc = 0;
if (!rc)
dlpar_online_cpu(dn);
 
@@ -622,7 +632,7 @@ static struct device_node *cpu_drc_index_to_dn(u32 
drc_index)
return dn;
 }
 
-static int dlpar_cpu_remove_by_index(u32 drc_index)
+static int dlpar_cpu_remove_by_index(u32 drc_index, bool release_drc)
 {
struct device_node *dn;
int rc;
@@ -634,7 +644,7 @@ static int dlpar_cpu_remove_by_index(u32 drc_index)
return -ENODEV;
}
 
-   rc = dlpar_cpu_remove(dn, drc_index);
+   rc = dlpar_cpu_remove(dn, drc_index, release_drc);

[PATCH v06 0/9] powerpc/hotplug: Update affinity for migrated CPUs

2018-07-09 Thread Michael Bringmann
The migration of LPARs across Power systems affects many attributes
including that of the associativity of CPUs.  The patches in this
set execute when a system is coming up fresh upon a migration target.
They are intended to:

* Recognize changes to the associativity of CPUs recorded in internal
  data structures when compared to the latest copies in the device tree.
* Generate calls to other code layers to reset the data structures
  related to associativity of the CPUs.
* Re-register the 'changed' entities into the target system.
  Re-registration of CPUs mostly entails acting as if they have been
  newly hot-added into the target system.
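
For orientation, a condensed sketch of the resulting flow, using the
function names introduced in patches 1-3 (error handling, locking and the
workqueue plumbing are omitted; this is not the literal code):

/* 1. The OF reconfig notifier sees an updated 'ibm,associativity'
 *    property on a cpu node and queues a READD for its DRC index
 *    (pseries_update_processor(), patch 3/9).
 * 2. After the migration store completes, the queued operations are
 *    replayed in series (dlpar_queued_actions_run(), patch 2/9).
 * 3. Each READD removes and re-adds the CPU while keeping its DRC
 *    index held (patch 1/9):
 */
static int example_cpu_readd(u32 drc_index)
{
	int rc;

	rc = dlpar_cpu_remove_by_index(drc_index, false);	/* keep DRC */
	if (!rc)
		rc = dlpar_cpu_add(drc_index, false);		/* DRC still held */

	return rc;
}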

Signed-off-by: Michael Bringmann 

Michael Bringmann (9):
  hotplug/cpu: Conditionally acquire/release DRC index
  hotplug/cpu: Add operation queuing function
  hotplug/cpu: Provide CPU readd operation
  mobility/numa: Ensure numa update does not overlap
  numa: Disable/enable arch_update_cpu_topology
  pmt/numa: Disable arch_update_cpu_topology during CPU readd
  powerpc/rtas: Allow disabling rtas_event_scan
  hotplug/rtas: No rtas_event_scan during PMT update
  hotplug/pmt: Update topology after PMT
---
Changes in patch:
  -- Restructure and rearrange content of patches to co-locate
 similar or related modifications
  -- Rename pseries_update_drconf_cpu to pseries_update_processor
  -- Simplify code to update CPU nodes during mobility checks.
 Remove functions to generate extra HP_ELOG messages in favor
 of direct function calls to dlpar_cpu_readd_by_index.
  -- Revise code order in dlpar_cpu_readd_by_index() to present
 more appropriate error codes from underlying layers of the
 implementation.
  -- Add hotplug device lock around all property updates
  -- Add call to rebuild_sched_domains in case of changes
  -- Various code cleanups and compaction
  -- Rebase to 4.18-rc1 kernel
  -- Change operation to run CPU readd after end of migration store.
  -- Improve descriptive text
  -- Cleanup patch reference to outdated function



Re: [RFC PATCH 1/2] dma-mapping: Clean up dma_set_*mask() hooks

2018-07-09 Thread Robin Murphy

On 08/07/18 16:07, Christoph Hellwig wrote:

On Fri, Jul 06, 2018 at 03:20:34PM +0100, Robin Murphy wrote:

What are you trying to do?  I really don't want to see more users of
the hooks as they are are a horribly bad idea.


I really need to fix the ongoing problem we have where, due to funky
integrations, devices suffer some downstream addressing limit (described by
DT dma-ranges or ACPI IORT/_DMA) which we carefully set up in
dma_configure(), but then just gets lost when the driver probes and
innocently calls dma_set_mask() with something wider. I think it's
effectively the generalised case of the VIA 32-bit quirk, if I understand
that one correctly.
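
(For illustration, the failing sequence in driver terms; a sketch only,
with a made-up probe function and an arbitrary 40-bit firmware limit:)

#include <linux/device.h>
#include <linux/dma-mapping.h>

/*
 * By the time this runs, dma_configure() has already derived a narrow
 * mask for the device from dma-ranges / IORT / _DMA, say 40 bits.
 */
static int example_probe(struct device *dev)
{
	/*
	 * Innocent, ubiquitous driver boilerplate: this succeeds and
	 * silently replaces the 40-bit limit, so later mappings may be
	 * placed where the device cannot actually reach them.
	 */
	return dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
}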


I'd much rather fix this in generic code.  How funky are your limitations?
In fact, when I did the 32-bit quirk (which will also be used by a Xilinx
PCIe root port usable on a lot of architectures) I did initially consider
adding a bus_dma_mask or similar to struct device, but opted for the
simplest implementation for now.  I'd be happy to change this.

Especially these days where busses and IP blocks are generally not tied
to a specific cpu instruction set I really believe that having any
more architecture code than absolutely required is a bad idea.


Oh, for sure, the generic fix would be the longer-term goal, this was 
just an expedient compromise because I want to get *something* landed 
for 4.19. Since in practice this is predominantly affecting arm64, doing 
the arch-specific fix to appease affected customers then working to 
generalise it afterwards seemed to carry the lowest risk.


That said, I think I can see a relatively safe and clean alternative 
approach based on converting dma_32bit_limit to a mask, so I'll spin 
some patches around that idea ASAP to continue the discussion.



The approach that seemed to me to be safest is largely based on the one
proposed in a thread from ages ago[1]; namely to make dma_configure()
better at distinguishing firmware-specified masks from bus defaults,
capture the firmware mask in dev->archdata during arch_setup_dma_ops(),
then use the custom set_mask routines to ensure any subsequent updates
never exceed that. It doesn't seem possible to make this work robustly
without storing *some* additional per-device data, and for that archdata is
a lesser evil than struct device itself. Plus even though it's not actually
an arch-specific issue it feels like there's such a risk of breaking other
platforms that I'm reticent to even try handling it entirely in generic
code.
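
(For illustration, a minimal sketch of that idea; every name below is
invented, and the real series would keep the stored limit in dev->archdata
and wire these helpers into arch_setup_dma_ops() and the arch's
dma_set_mask() hook:)

#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/kernel.h>

/* Hypothetical per-device record of the firmware-described limit. */
struct fw_dma_limit {
	u64 mask;
};

/* Called from the dma_configure()/arch_setup_dma_ops() path. */
static void fw_dma_limit_record(struct fw_dma_limit *lim, u64 fw_mask)
{
	lim->mask = fw_mask ? fw_mask : DMA_BIT_MASK(32);
}

/* Called from the arch's dma_set_mask() hook: whatever the driver asks
 * for, never exceed what the interconnect can actually address. */
static int fw_dma_limit_set_mask(struct device *dev,
				 struct fw_dma_limit *lim, u64 mask)
{
	mask = min_t(u64, mask, lim->mask);

	if (!dev->dma_mask || !dma_supported(dev, mask))
		return -EIO;

	*dev->dma_mask = mask;
	return 0;
}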


My plan for a few merge windows from now is that dma_mask and
coherent_mask are 100% in device control and dma_set_mask will never
fail.  It will be up to the dma ops to make sure things are addressable.


It's entirely possible to plug an old PCI soundcard via a bridge adapter 
into a modern board where the card's 24-bit DMA mask reaches nothing but 
the SoC's boot flash, and no IOMMU is available (e.g. some of the 
smaller NXP Layercape stuff); I still think there should be an error in 
such rare cases when DMA is utterly impossible, but otherwise I agree it 
would be much nicer for drivers to just provide their preferred mask and 
let the ops massage it as necessary.


Robin.


[PATCH 2/2] powerpc: Add ppc64le and ppc64_book3e allmodconfig targets

2018-07-09 Thread Michael Ellerman
Similarly as we just did for 32-bit, add phony targets for generating
a little endian and Book3E allmodconfig. These aren't covered by the
regular allmodconfig, which is big endian and Book3S due to the way
the Kconfig symbols are structured.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/Makefile | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 2556c2182789..48e887f03a6c 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -359,6 +359,16 @@ ppc32_allmodconfig:
$(Q)$(MAKE) 
KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/book3s_32.config \
-f $(srctree)/Makefile allmodconfig
 
+PHONY += ppc64le_allmodconfig
+ppc64le_allmodconfig:
+   $(Q)$(MAKE) KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/le.config 
\
+   -f $(srctree)/Makefile allmodconfig
+
+PHONY += ppc64_book3e_allmodconfig
+ppc64_book3e_allmodconfig:
+   $(Q)$(MAKE) 
KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/85xx-64bit.config \
+   -f $(srctree)/Makefile allmodconfig
+
 define archhelp
   @echo '* zImage  - Build default images selected by kernel config'
   @echo '  zImage.*- Compressed kernel image 
(arch/$(ARCH)/boot/zImage.*)'
-- 
2.14.1



[PATCH 1/2] powerpc: Add ppc32_allmodconfig defconfig target

2018-07-09 Thread Michael Ellerman
Because the allmodconfig logic just sets every symbol to M or Y, it
has the effect of always generating a 64-bit config, because
CONFIG_PPC64 becomes Y.

So to make it easier for folks to test 32-bit code, provide a phony
defconfig target that generates a 32-bit allmodconfig.

The 32-bit port has several mutually exclusive CPU types, we choose
the Book3S variants as that's what the help text in Kconfig says is
most common.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/Makefile | 5 +
 arch/powerpc/configs/book3s_32.config | 2 ++
 2 files changed, 7 insertions(+)
 create mode 100644 arch/powerpc/configs/book3s_32.config

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 2ea575cb3401..2556c2182789 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -354,6 +354,11 @@ mpc86xx_smp_defconfig:
$(call merge_into_defconfig,mpc86xx_basic_defconfig,\
86xx-smp 86xx-hw fsl-emb-nonhw)
 
+PHONY += ppc32_allmodconfig
+ppc32_allmodconfig:
+   $(Q)$(MAKE) 
KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/book3s_32.config \
+   -f $(srctree)/Makefile allmodconfig
+
 define archhelp
   @echo '* zImage  - Build default images selected by kernel config'
   @echo '  zImage.*- Compressed kernel image 
(arch/$(ARCH)/boot/zImage.*)'
diff --git a/arch/powerpc/configs/book3s_32.config 
b/arch/powerpc/configs/book3s_32.config
new file mode 100644
index ..8721eb7b1294
--- /dev/null
+++ b/arch/powerpc/configs/book3s_32.config
@@ -0,0 +1,2 @@
+CONFIG_PPC64=n
+CONFIG_PPC_BOOK3S_32=y
-- 
2.14.1



Re: [PATCH v4 00/11] hugetlb: Factorize hugetlb architecture primitives

2018-07-09 Thread Michal Hocko
[CC hugetlb guys - http://lkml.kernel.org/r/20180705110716.3919-1-a...@ghiti.fr]

On Thu 05-07-18 11:07:05, Alexandre Ghiti wrote:
> In order to reduce copy/paste of functions across architectures and then
> make riscv hugetlb port (and future ports) simpler and smaller, this
> patchset intends to factorize the numerous hugetlb primitives that are
> defined across all the architectures.
> 
> Except for prepare_hugepage_range, this patchset moves the versions that
> are just pass-through to standard pte primitives into
> asm-generic/hugetlb.h by using the same #ifdef semantic that can be
> found in asm-generic/pgtable.h, i.e. __HAVE_ARCH_***.
> 
> s390 architecture has not been tackled in this series since it does not
> use asm-generic/hugetlb.h at all.
> powerpc could be factorized a bit more (cf huge_ptep_set_wrprotect).
> 
> This patchset has been compiled on x86 only. 
> 
> Changelog:
> 
> v4:
>   Fix powerpc build error due to misplacing of #include
>outside of #ifdef CONFIG_HUGETLB_PAGE, as
>   pointed by Christophe Leroy.
> 
> v1, v2, v3:
>   Same version, just problems with email provider and misuse of
>   --batch-size option of git send-email
> 
> Alexandre Ghiti (11):
>   hugetlb: Harmonize hugetlb.h arch specific defines with pgtable.h
>   hugetlb: Introduce generic version of hugetlb_free_pgd_range
>   hugetlb: Introduce generic version of set_huge_pte_at
>   hugetlb: Introduce generic version of huge_ptep_get_and_clear
>   hugetlb: Introduce generic version of huge_ptep_clear_flush
>   hugetlb: Introduce generic version of huge_pte_none
>   hugetlb: Introduce generic version of huge_pte_wrprotect
>   hugetlb: Introduce generic version of prepare_hugepage_range
>   hugetlb: Introduce generic version of huge_ptep_set_wrprotect
>   hugetlb: Introduce generic version of huge_ptep_set_access_flags
>   hugetlb: Introduce generic version of huge_ptep_get
> 
>  arch/arm/include/asm/hugetlb-3level.h| 32 +-
>  arch/arm/include/asm/hugetlb.h   | 33 +--
>  arch/arm64/include/asm/hugetlb.h | 39 +++-
>  arch/ia64/include/asm/hugetlb.h  | 47 ++-
>  arch/mips/include/asm/hugetlb.h  | 40 +++--
>  arch/parisc/include/asm/hugetlb.h| 33 +++
>  arch/powerpc/include/asm/book3s/32/pgtable.h |  2 +
>  arch/powerpc/include/asm/book3s/64/pgtable.h |  1 +
>  arch/powerpc/include/asm/hugetlb.h   | 43 ++
>  arch/powerpc/include/asm/nohash/32/pgtable.h |  2 +
>  arch/powerpc/include/asm/nohash/64/pgtable.h |  1 +
>  arch/sh/include/asm/hugetlb.h| 54 ++---
>  arch/sparc/include/asm/hugetlb.h | 40 +++--
>  arch/x86/include/asm/hugetlb.h   | 72 +--
>  include/asm-generic/hugetlb.h| 88 
> +++-
>  15 files changed, 143 insertions(+), 384 deletions(-)
> 
> -- 
> 2.16.2

-- 
Michal Hocko
SUSE Labs
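
As an aside, the __HAVE_ARCH_*** scheme the quoted cover letter refers to
boils down to pass-through defaults like the sketch below (pte_t and
pte_none() come from the arch pgtable headers; the exact macro spelling
follows that convention but is shown here only as an illustration):

/* asm-generic/hugetlb.h style default: used unless the architecture
 * defines the __HAVE_ARCH_ macro and supplies its own version. */
#ifndef __HAVE_ARCH_HUGE_PTE_NONE
static inline int huge_pte_none(pte_t pte)
{
	return pte_none(pte);
}
#endif

/* An architecture that needs special behaviour would instead put
 *
 *	#define __HAVE_ARCH_HUGE_PTE_NONE
 *	static inline int huge_pte_none(pte_t pte) { ... }
 *
 * in its asm/hugetlb.h before including asm-generic/hugetlb.h. */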


Re: [PATCH 1/2] mm/cma: remove unsupported gfp_mask parameter from cma_alloc()

2018-07-09 Thread Michal Hocko
On Mon 09-07-18 14:19:55, Marek Szyprowski wrote:
> The cma_alloc() function doesn't really support gfp flags other than
> __GFP_NOWARN, so convert the gfp_mask parameter to a boolean no_warn
> parameter.
> 
> This will help to avoid giving the false impression that this function
> supports standard gfp flags and that callers can pass __GFP_ZERO to get a
> zeroed buffer, which has already been an issue: see commit dd65a941f6ba
> ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").
> 
> Signed-off-by: Marek Szyprowski 

Thanks! This makes perfect sense to me. If there is a real need for the
gfp_mask then we should start by defining the semantic first.

Acked-by: Michal Hocko 

> ---
>  arch/powerpc/kvm/book3s_hv_builtin.c   | 2 +-
>  drivers/s390/char/vmcp.c   | 2 +-
>  drivers/staging/android/ion/ion_cma_heap.c | 2 +-
>  include/linux/cma.h| 2 +-
>  kernel/dma/contiguous.c| 3 ++-
>  mm/cma.c   | 8 
>  mm/cma_debug.c | 2 +-
>  7 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
> b/arch/powerpc/kvm/book3s_hv_builtin.c
> index d4a3f4da409b..fc6bb9630a9c 100644
> --- a/arch/powerpc/kvm/book3s_hv_builtin.c
> +++ b/arch/powerpc/kvm/book3s_hv_builtin.c
> @@ -77,7 +77,7 @@ struct page *kvm_alloc_hpt_cma(unsigned long nr_pages)
>   VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
>  
>   return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES),
> -  GFP_KERNEL);
> +  false);
>  }
>  EXPORT_SYMBOL_GPL(kvm_alloc_hpt_cma);
>  
> diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c
> index 948ce82a7725..0fa1b6b1491a 100644
> --- a/drivers/s390/char/vmcp.c
> +++ b/drivers/s390/char/vmcp.c
> @@ -68,7 +68,7 @@ static void vmcp_response_alloc(struct vmcp_session 
> *session)
>* anymore the system won't work anyway.
>*/
>   if (order > 2)
> - page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL);
> + page = cma_alloc(vmcp_cma, nr_pages, 0, false);
>   if (page) {
>   session->response = (char *)page_to_phys(page);
>   session->cma_alloc = 1;
> diff --git a/drivers/staging/android/ion/ion_cma_heap.c 
> b/drivers/staging/android/ion/ion_cma_heap.c
> index 49718c96bf9e..3fafd013d80a 100644
> --- a/drivers/staging/android/ion/ion_cma_heap.c
> +++ b/drivers/staging/android/ion/ion_cma_heap.c
> @@ -39,7 +39,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct 
> ion_buffer *buffer,
>   if (align > CONFIG_CMA_ALIGNMENT)
>   align = CONFIG_CMA_ALIGNMENT;
>  
> - pages = cma_alloc(cma_heap->cma, nr_pages, align, GFP_KERNEL);
> + pages = cma_alloc(cma_heap->cma, nr_pages, align, false);
>   if (!pages)
>   return -ENOMEM;
>  
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index bf90f0bb42bd..190184b5ff32 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -33,7 +33,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, 
> phys_addr_t size,
>   const char *name,
>   struct cma **res_cma);
>  extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int 
> align,
> -   gfp_t gfp_mask);
> +   bool no_warn);
>  extern bool cma_release(struct cma *cma, const struct page *pages, unsigned 
> int count);
>  
>  extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void 
> *data);
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index d987dcd1bd56..19ea5d70150c 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -191,7 +191,8 @@ struct page *dma_alloc_from_contiguous(struct device 
> *dev, size_t count,
>   if (align > CONFIG_CMA_ALIGNMENT)
>   align = CONFIG_CMA_ALIGNMENT;
>  
> - return cma_alloc(dev_get_cma_area(dev), count, align, gfp_mask);
> + return cma_alloc(dev_get_cma_area(dev), count, align,
> +  gfp_mask & __GFP_NOWARN);
>  }
>  
>  /**
> diff --git a/mm/cma.c b/mm/cma.c
> index 5809bbe360d7..4cb76121a3ab 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -395,13 +395,13 @@ static inline void cma_debug_show_areas(struct cma 
> *cma) { }
>   * @cma:   Contiguous memory region for which the allocation is performed.
>   * @count: Requested number of pages.
>   * @align: Requested alignment of pages (in PAGE_SIZE order).
> - * @gfp_mask:  GFP mask to use during compaction
> + * @no_warn: Avoid printing message about failed allocation
>   *
>   * This function allocates part of contiguous memory on specific
>   * contiguous memory area.
>   */
>  struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
> -gfp_t gfp_mask)
> +bool no_warn)
>  

[PATCH 2/2] dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()

2018-07-09 Thread Marek Szyprowski
The CMA memory allocator doesn't support standard gfp flags for memory
allocation, so there is no point in having one as a parameter of the
dma_alloc_from_contiguous() function. Replace it with a boolean no_warn
argument, which covers all that the underlying cma_alloc() function
supports.

This will help to avoid giving the false impression that this function
supports standard gfp flags and that callers can pass __GFP_ZERO to get a
zeroed buffer, which has already been an issue: see commit dd65a941f6ba
("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").

Signed-off-by: Marek Szyprowski 
---
 arch/arm/mm/dma-mapping.c  | 5 +++--
 arch/arm64/mm/dma-mapping.c| 4 ++--
 arch/xtensa/kernel/pci-dma.c   | 2 +-
 drivers/iommu/amd_iommu.c  | 2 +-
 drivers/iommu/intel-iommu.c| 3 ++-
 include/linux/dma-contiguous.h | 4 ++--
 kernel/dma/contiguous.c| 7 +++
 kernel/dma/direct.c| 3 ++-
 8 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index be0fa7e39c26..121c6c3ba9e0 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -594,7 +594,7 @@ static void *__alloc_from_contiguous(struct device *dev, 
size_t size,
struct page *page;
void *ptr = NULL;
 
-   page = dma_alloc_from_contiguous(dev, count, order, gfp);
+   page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
if (!page)
return NULL;
 
@@ -1294,7 +1294,8 @@ static struct page **__iommu_alloc_buffer(struct device 
*dev, size_t size,
unsigned long order = get_order(size);
struct page *page;
 
-   page = dma_alloc_from_contiguous(dev, count, order, gfp);
+   page = dma_alloc_from_contiguous(dev, count, order,
+gfp & __GFP_NOWARN);
if (!page)
goto error;
 
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 61e93f0b5482..072c51fb07d7 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -355,7 +355,7 @@ static int __init atomic_pool_init(void)
 
if (dev_get_cma_area(NULL))
page = dma_alloc_from_contiguous(NULL, nr_pages,
-pool_size_order, GFP_KERNEL);
+pool_size_order, false);
else
page = alloc_pages(GFP_DMA32, pool_size_order);
 
@@ -573,7 +573,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t 
size,
struct page *page;
 
page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-get_order(size), gfp);
+   get_order(size), gfp & __GFP_NOWARN);
if (!page)
return NULL;
 
diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index ba4640cc0093..b2c7ba91fb08 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -137,7 +137,7 @@ static void *xtensa_dma_alloc(struct device *dev, size_t 
size,
 
if (gfpflags_allow_blocking(flag))
page = dma_alloc_from_contiguous(dev, count, get_order(size),
-flag);
+flag & __GFP_NOWARN);
 
if (!page)
page = alloc_pages(flag, get_order(size));
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 64cfe854e0f5..5ec97ffb561a 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2622,7 +2622,7 @@ static void *alloc_coherent(struct device *dev, size_t 
size,
return NULL;
 
page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-get_order(size), flag);
+   get_order(size), flag & __GFP_NOWARN);
if (!page)
return NULL;
}
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 869321c594e2..dd2d343428ab 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3746,7 +3746,8 @@ static void *intel_alloc_coherent(struct device *dev, 
size_t size,
if (gfpflags_allow_blocking(flags)) {
unsigned int count = size >> PAGE_SHIFT;
 
-   page = dma_alloc_from_contiguous(dev, count, order, flags);
+   page = dma_alloc_from_contiguous(dev, count, order,
+flags & __GFP_NOWARN);
if (page && iommu_no_mapping(dev) &&
page_to_phys(page) + size > dev->coherent_dma_mask) {
dma_release_from_contiguous(dev, page, count);
diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h

[PATCH 0/2] CMA: remove unsupported gfp mask parameter

2018-07-09 Thread Marek Szyprowski
Dear All,

The CMA related functions cma_alloc() and dma_alloc_from_contiguous()
have a gfp mask parameter, but sadly they only support the __GFP_NOWARN
flag.  This gave their users the misleading impression that any standard
memory allocation flags are supported, which resulted in a security issue
when a caller set the __GFP_ZERO flag and expected the buffer to be
cleared.

This patchset changes the gfp_mask parameter to a simple boolean no_warn
argument, which covers all that the underlying code supports.
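
For illustration, what the conversion means for a caller that actually
wants a cleared buffer (a sketch only: the cma area and size are made up,
and page_address() assumes the pages live in the kernel direct mapping):

#include <linux/cma.h>
#include <linux/mm.h>
#include <linux/string.h>

static struct page *example_alloc_cleared(struct cma *area, size_t nr_pages)
{
	struct page *page;

	/* The old prototype invited cma_alloc(area, nr_pages, 0,
	 * GFP_KERNEL | __GFP_ZERO), where __GFP_ZERO was silently ignored. */

	/* New prototype: only "warn on failure or not" is expressible ... */
	page = cma_alloc(area, nr_pages, 0, false);
	if (!page)
		return NULL;

	/* ... so clearing the buffer is explicitly the caller's job. */
	memset(page_address(page), 0, nr_pages << PAGE_SHIFT);
	return page;
}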

This patchset is a result of the following discussion:
https://patchwork.kernel.org/patch/10461919/

Best regards
Marek Szyprowski
Samsung R&D Institute Poland


Patch summary:

Marek Szyprowski (2):
  mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
  dma: remove unsupported gfp_mask parameter from
dma_alloc_from_contiguous()

 arch/arm/mm/dma-mapping.c  | 5 +++--
 arch/arm64/mm/dma-mapping.c| 4 ++--
 arch/powerpc/kvm/book3s_hv_builtin.c   | 2 +-
 arch/xtensa/kernel/pci-dma.c   | 2 +-
 drivers/iommu/amd_iommu.c  | 2 +-
 drivers/iommu/intel-iommu.c| 3 ++-
 drivers/s390/char/vmcp.c   | 2 +-
 drivers/staging/android/ion/ion_cma_heap.c | 2 +-
 include/linux/cma.h| 2 +-
 include/linux/dma-contiguous.h | 4 ++--
 kernel/dma/contiguous.c| 6 +++---
 kernel/dma/direct.c| 3 ++-
 mm/cma.c   | 8 
 mm/cma_debug.c | 2 +-
 14 files changed, 25 insertions(+), 22 deletions(-)

-- 
2.17.1



[PATCH 1/2] mm/cma: remove unsupported gfp_mask parameter from cma_alloc()

2018-07-09 Thread Marek Szyprowski
The cma_alloc() function doesn't really support gfp flags other than
__GFP_NOWARN, so convert the gfp_mask parameter to a boolean no_warn
parameter.

This will help to avoid giving the false impression that this function
supports standard gfp flags and that callers can pass __GFP_ZERO to get a
zeroed buffer, which has already been an issue: see commit dd65a941f6ba
("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").

Signed-off-by: Marek Szyprowski 
---
 arch/powerpc/kvm/book3s_hv_builtin.c   | 2 +-
 drivers/s390/char/vmcp.c   | 2 +-
 drivers/staging/android/ion/ion_cma_heap.c | 2 +-
 include/linux/cma.h| 2 +-
 kernel/dma/contiguous.c| 3 ++-
 mm/cma.c   | 8 
 mm/cma_debug.c | 2 +-
 7 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c 
b/arch/powerpc/kvm/book3s_hv_builtin.c
index d4a3f4da409b..fc6bb9630a9c 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -77,7 +77,7 @@ struct page *kvm_alloc_hpt_cma(unsigned long nr_pages)
VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
 
return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES),
-GFP_KERNEL);
+false);
 }
 EXPORT_SYMBOL_GPL(kvm_alloc_hpt_cma);
 
diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c
index 948ce82a7725..0fa1b6b1491a 100644
--- a/drivers/s390/char/vmcp.c
+++ b/drivers/s390/char/vmcp.c
@@ -68,7 +68,7 @@ static void vmcp_response_alloc(struct vmcp_session *session)
 * anymore the system won't work anyway.
 */
if (order > 2)
-   page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL);
+   page = cma_alloc(vmcp_cma, nr_pages, 0, false);
if (page) {
session->response = (char *)page_to_phys(page);
session->cma_alloc = 1;
diff --git a/drivers/staging/android/ion/ion_cma_heap.c 
b/drivers/staging/android/ion/ion_cma_heap.c
index 49718c96bf9e..3fafd013d80a 100644
--- a/drivers/staging/android/ion/ion_cma_heap.c
+++ b/drivers/staging/android/ion/ion_cma_heap.c
@@ -39,7 +39,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct 
ion_buffer *buffer,
if (align > CONFIG_CMA_ALIGNMENT)
align = CONFIG_CMA_ALIGNMENT;
 
-   pages = cma_alloc(cma_heap->cma, nr_pages, align, GFP_KERNEL);
+   pages = cma_alloc(cma_heap->cma, nr_pages, align, false);
if (!pages)
return -ENOMEM;
 
diff --git a/include/linux/cma.h b/include/linux/cma.h
index bf90f0bb42bd..190184b5ff32 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -33,7 +33,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, 
phys_addr_t size,
const char *name,
struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int 
align,
- gfp_t gfp_mask);
+ bool no_warn);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned 
int count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void 
*data);
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index d987dcd1bd56..19ea5d70150c 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -191,7 +191,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, 
size_t count,
if (align > CONFIG_CMA_ALIGNMENT)
align = CONFIG_CMA_ALIGNMENT;
 
-   return cma_alloc(dev_get_cma_area(dev), count, align, gfp_mask);
+   return cma_alloc(dev_get_cma_area(dev), count, align,
+gfp_mask & __GFP_NOWARN);
 }
 
 /**
diff --git a/mm/cma.c b/mm/cma.c
index 5809bbe360d7..4cb76121a3ab 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -395,13 +395,13 @@ static inline void cma_debug_show_areas(struct cma *cma) 
{ }
  * @cma:   Contiguous memory region for which the allocation is performed.
  * @count: Requested number of pages.
  * @align: Requested alignment of pages (in PAGE_SIZE order).
- * @gfp_mask:  GFP mask to use during compaction
+ * @no_warn: Avoid printing message about failed allocation
  *
  * This function allocates part of contiguous memory on specific
  * contiguous memory area.
  */
 struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
-  gfp_t gfp_mask)
+  bool no_warn)
 {
unsigned long mask, offset;
unsigned long pfn = -1;
@@ -447,7 +447,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, 
unsigned int align,
pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
	mutex_lock(&cma_mutex);
ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
- 

Re: powerpc: 32BIT vs. 64BIT (PPC32 vs. PPC64)

2018-07-09 Thread Mathieu Malaterre
On Sun, Jul 8, 2018 at 1:53 PM Michael Ellerman  wrote:
>
> Randy Dunlap  writes:
> > Hi,
> >
> > Is there a good way (or a shortcut) to do something like:
>
> The best I know of is:
>
> > $ make ARCH=powerpc O=PPC32 [other_options] allmodconfig
> >   to get a PPC32/32BIT allmodconfig
>
> $ echo CONFIG_PPC64=n > allmod.config
> $ KCONFIG_ALLCONFIG=1 make allmodconfig
> $ grep PPC32 .config
> CONFIG_PPC32=y
>
> Which is still a bit clunky.
>
>
> I looked at this a while back and the problem we have is that the 32-bit
> kernel is not a single thing. There are multiple 32-bit platforms which
> are mutually exclusive.
>
> eg, from menuconfig:
>
>  - 512x/52xx/6xx/7xx/74xx/82xx/83xx/86xx
>  - Freescale 85xx
>  - Freescale 8xx
>  - AMCC 40x
>  - AMCC 44x, 46x or 47x
>  - Freescale e200

Most Linux distros seem to have dropped support for ppc32. So I'd suggest
picking the Debian powerpc default config (but I agree that I am a little
biased here).

>
> So we could have a 32-bit allmodconfig, but we'd need to chose one of
> the above, and we'd still only be testing some of the code.
>
> Having said that you're the 2nd person to ask about this, so we should
> clearly do something to make a 32-bit allmodconfig easier, even if it's
> not perfect.
>
> cheers


Re: powerpc: 32BIT vs. 64BIT (PPC32 vs. PPC64)

2018-07-09 Thread Michael Ellerman
Nicholas Piggin  writes:

> On Fri, 6 Jul 2018 21:58:29 -0700
> Randy Dunlap  wrote:
>
>> On 07/06/2018 06:45 PM, Benjamin Herrenschmidt wrote:
>> > On Thu, 2018-07-05 at 14:30 -0700, Randy Dunlap wrote:  
>> >> Hi,
>> >>
>> >> Is there a good way (or a shortcut) to do something like:
>> >>
>> >> $ make ARCH=powerpc O=PPC32 [other_options] allmodconfig
>> >>   to get a PPC32/32BIT allmodconfig
>> >>
>> >> and also be able to do:
>> >>
>> >> $make ARCH=powerpc O=PPC64 [other_options] allmodconfig
>> >>   to get a PPC64/64BIT allmodconfig?  
>> > 
>> > Hrm... O= is for the separate build dir, so there much be something
>> > else.
>> > 
>> > You mean having ARCH= aliases like ppc/ppc32 and ppc64 ?  
>> 
>> Yes.
>> 
>> > That would be a matter of overriding some .config defaults I suppose, I
>> > don't know how this is done on other archs.
>> > 
>> > I see the aliasing trick in the Makefile but that's about it.
>> >   
>> >> Note that arch/x86, arch/sh, and arch/sparc have ways to do
>> >> some flavor(s) of this (from Documentation/kbuild/kbuild.txt;
>> >> sh and sparc based on a recent "fix" patch from me):  
>> > 
>> > I fail to see what you are actually talking about here ... sorry. Do
>> > you have concrete examples on x86 or sparc ? From what I can tell the
>> > "i386" or "sparc32/sparc64" aliases just change SRCARCH in Makefile and
>> > 32 vs 64-bit is just a Kconfig option...  
>> 
>> Yes, your summary is mostly correct.
>> 
>> I'm just looking for a way to do cross-compile builds that are close to
>> ppc32 allmodconfig and ppc64 allmodconfig.
>
> Would there a problem with adding ARCH=ppc32 / ppc64 matching? This
> seems to work...

It's a cute trick but I'd rather avoid it.

It overloads ARCH which can be confusing to people and tools. For
example I'd have to special case it in kisskb.

I think we can achieve a similar result by having more PHONY defconfig
targets.

eg, we can do ppc32_allmodconfig like below. And if there's interest we
could do a 4xx_allmodconfig etc.

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 2ea575cb3401..2556c2182789 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -354,6 +354,11 @@ mpc86xx_smp_defconfig:
$(call merge_into_defconfig,mpc86xx_basic_defconfig,\
86xx-smp 86xx-hw fsl-emb-nonhw)
 
+PHONY += ppc32_allmodconfig
+ppc32_allmodconfig:
+   $(Q)$(MAKE) 
KCONFIG_ALLCONFIG=$(srctree)/arch/powerpc/configs/book3s_32.config \
+   -f $(srctree)/Makefile allmodconfig
+
 define archhelp
   @echo '* zImage  - Build default images selected by kernel config'
   @echo '  zImage.*- Compressed kernel image 
(arch/$(ARCH)/boot/zImage.*)'
diff --git a/arch/powerpc/configs/book3s_32.config 
b/arch/powerpc/configs/book3s_32.config
new file mode 100644
index ..8721eb7b1294
--- /dev/null
+++ b/arch/powerpc/configs/book3s_32.config
@@ -0,0 +1,2 @@
+CONFIG_PPC64=n
+CONFIG_PPC_BOOK3S_32=y


cheers


Re: [next-20180709][bisected 9cf57731][ppc] build fail with ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734

2018-07-09 Thread Peter Zijlstra
On Mon, Jul 09, 2018 at 03:21:23PM +0530, Abdul Haleem wrote:
> Greetings,
> 
> Today's next fails to build on powerpc with the below error:
> 
> kernel/cpu.o:(.data.rel+0x18e0): undefined reference to
> `lockup_detector_online_cpu'
> ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734
> kernel/cpu.o:(.data.rel+0x18e8): undefined reference to
> `lockup_detector_offline_cpu'
> ld: BFD version 2.26.1-1.fc25 assertion fail elf64-ppc.c:14734
> Makefile:1005: recipe for target 'vmlinux' failed
> make: *** [vmlinux] Error 1

Urgh, sorry about that. I think the below should cure that.

I got confused by all the various CONFIG options hereabouts and
conflated CONFIG_LOCKUP_DETECTOR and CONFIG_SOFTLOCKUP_DETECTOR, it
seems.

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 80664bbeca43..08f9247e9827 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -33,15 +33,10 @@ extern int sysctl_hardlockup_all_cpu_backtrace;
 #define sysctl_hardlockup_all_cpu_backtrace 0
 #endif /* !CONFIG_SMP */
 
-extern int lockup_detector_online_cpu(unsigned int cpu);
-extern int lockup_detector_offline_cpu(unsigned int cpu);
-
 #else /* CONFIG_LOCKUP_DETECTOR */
 static inline void lockup_detector_init(void) { }
 static inline void lockup_detector_soft_poweroff(void) { }
 static inline void lockup_detector_cleanup(void) { }
-#define lockup_detector_online_cpu NULL
-#define lockup_detector_offline_cpu	NULL
 #endif /* !CONFIG_LOCKUP_DETECTOR */
 
 #ifdef CONFIG_SOFTLOCKUP_DETECTOR
@@ -50,12 +45,18 @@ extern void touch_softlockup_watchdog(void);
 extern void touch_softlockup_watchdog_sync(void);
 extern void touch_all_softlockup_watchdogs(void);
 extern unsigned int  softlockup_panic;
-#else
+
+extern int lockup_detector_online_cpu(unsigned int cpu);
+extern int lockup_detector_offline_cpu(unsigned int cpu);
+#else /* CONFIG_SOFTLOCKUP_DETECTOR */
 static inline void touch_softlockup_watchdog_sched(void) { }
 static inline void touch_softlockup_watchdog(void) { }
 static inline void touch_softlockup_watchdog_sync(void) { }
 static inline void touch_all_softlockup_watchdogs(void) { }
-#endif
+
+#define lockup_detector_online_cpu NULL
+#define lockup_detector_offline_cpu	NULL
+#endif /* CONFIG_SOFTLOCKUP_DETECTOR */
 
 #ifdef CONFIG_DETECT_HUNG_TASK
 void reset_hung_task_detector(void);


[PATCH 7/7 v6] arm64: dts: ls208xa: comply with the iommu map binding for fsl_mc

2018-07-09 Thread Nipun Gupta
The fsl-mc bus supports the new iommu-map property. Comply with this
binding for the fsl_mc bus.

Signed-off-by: Nipun Gupta 
Reviewed-by: Laurentiu Tudor 
---
 arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi 
b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
index 137ef4d..3d5e049 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
@@ -184,6 +184,7 @@
#address-cells = <2>;
#size-cells = <2>;
ranges;
+   dma-ranges = <0x0 0x0 0x0 0x0 0x10000 0x00000000>;
 
clockgen: clocking@130 {
compatible = "fsl,ls2080a-clockgen";
@@ -357,6 +358,8 @@
reg = <0x0008 0x0c00 0 0x40>,/* MC portal 
base */
  <0x 0x0834 0 0x4>; /* MC control 
reg */
	msi-parent = <&its>;
+   iommu-map = <0 &smmu 0 0>;	/* This is fixed-up by u-boot */
+   dma-coherent;
#address-cells = <3>;
#size-cells = <1>;
 
@@ -460,6 +463,9 @@
compatible = "arm,mmu-500";
reg = <0 0x500 0 0x80>;
#global-interrupts = <12>;
+   #iommu-cells = <1>;
+   stream-match-mask = <0x7C00>;
+   dma-coherent;
interrupts = <0 13 4>, /* global secure fault */
 <0 14 4>, /* combined secure interrupt */
 <0 15 4>, /* global non-secure fault */
@@ -502,7 +508,6 @@
 <0 204 4>, <0 205 4>,
 <0 206 4>, <0 207 4>,
 <0 208 4>, <0 209 4>;
-   mmu-masters = <_mc 0x300 0>;
};
 
dspi: dspi@210 {
-- 
1.9.1



[PATCH 6/7 v6] bus/fsl-mc: set coherent dma mask for devices on fsl-mc bus

2018-07-09 Thread Nipun Gupta
The of_dma_configure() API expects coherent_dma_mask to be correctly
set in the devices. This patch sets it accordingly.

Signed-off-by: Nipun Gupta 
Reviewed-by: Robin Murphy 
---
 drivers/bus/fsl-mc/fsl-mc-bus.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index fa43c7d..624828b 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -627,6 +627,7 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc,
mc_dev->icid = parent_mc_dev->icid;
mc_dev->dma_mask = FSL_MC_DEFAULT_DMA_MASK;
	mc_dev->dev.dma_mask = &mc_dev->dma_mask;
+   mc_dev->dev.coherent_dma_mask = mc_dev->dma_mask;
		dev_set_msi_domain(&mc_dev->dev,
				   dev_get_msi_domain(&parent_mc_dev->dev));
}
-- 
1.9.1



[PATCH 5/7 v6] bus/fsl-mc: support dma configure for devices on fsl-mc bus

2018-07-09 Thread Nipun Gupta
This patch adds support for DMA configuration of devices on the fsl-mc
bus using the 'dma_configure' callback for buses. Also, the direct call
to arch_setup_dma_ops() is removed from the fsl-mc bus.

Signed-off-by: Nipun Gupta 
Reviewed-by: Laurentiu Tudor 
---
 drivers/bus/fsl-mc/fsl-mc-bus.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 5d8266c..fa43c7d 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -127,6 +127,16 @@ static int fsl_mc_bus_uevent(struct device *dev, struct 
kobj_uevent_env *env)
return 0;
 }
 
+static int fsl_mc_dma_configure(struct device *dev)
+{
+   struct device *dma_dev = dev;
+
+   while (dev_is_fsl_mc(dma_dev))
+   dma_dev = dma_dev->parent;
+
+   return of_dma_configure(dev, dma_dev->of_node, 0);
+}
+
 static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
 char *buf)
 {
@@ -148,6 +158,7 @@ struct bus_type fsl_mc_bus_type = {
.name = "fsl-mc",
.match = fsl_mc_bus_match,
.uevent = fsl_mc_bus_uevent,
+   .dma_configure  = fsl_mc_dma_configure,
.dev_groups = fsl_mc_dev_groups,
 };
 EXPORT_SYMBOL_GPL(fsl_mc_bus_type);
@@ -633,10 +644,6 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc *obj_desc,
goto error_cleanup_dev;
}
 
-   /* Objects are coherent, unless 'no shareability' flag set. */
-   if (!(obj_desc->flags & FSL_MC_OBJ_FLAG_NO_MEM_SHAREABILITY))
-   arch_setup_dma_ops(_dev->dev, 0, 0, NULL, true);
-
/*
 * The device-specific probe callback will get invoked by device_add()
 */
-- 
1.9.1



[PATCH 4/7 v6] iommu/arm-smmu: Add support for the fsl-mc bus

2018-07-09 Thread Nipun Gupta
Implement bus-specific support for the fsl-mc bus, including
registering arm_smmu_ops and bus-specific device add operations.

Signed-off-by: Nipun Gupta 
---
 drivers/iommu/arm-smmu.c |  7 +++
 drivers/iommu/iommu.c| 13 +
 include/linux/fsl/mc.h   |  8 
 include/linux/iommu.h|  2 ++
 4 files changed, 30 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index f7a96bc..a011bb6 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -52,6 +52,7 @@
 #include 
 
 #include 
+#include 
 
 #include "io-pgtable.h"
 #include "arm-smmu-regs.h"
@@ -1459,6 +1460,8 @@ static struct iommu_group *arm_smmu_device_group(struct 
device *dev)
 
if (dev_is_pci(dev))
group = pci_device_group(dev);
+   else if (dev_is_fsl_mc(dev))
+   group = fsl_mc_device_group(dev);
else
group = generic_device_group(dev);
 
@@ -2037,6 +2040,10 @@ static void arm_smmu_bus_init(void)
	bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
}
 #endif
+#ifdef CONFIG_FSL_MC_BUS
+   if (!iommu_present(&fsl_mc_bus_type))
+   bus_set_iommu(&fsl_mc_bus_type, &arm_smmu_ops);
+#endif
 }
 
 static int arm_smmu_device_probe(struct platform_device *pdev)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d227b86..df2f49e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static struct kset *iommu_group_kset;
@@ -988,6 +989,18 @@ struct iommu_group *pci_device_group(struct device *dev)
return iommu_group_alloc();
 }
 
+/* Get the IOMMU group for device on fsl-mc bus */
+struct iommu_group *fsl_mc_device_group(struct device *dev)
+{
+   struct device *cont_dev = fsl_mc_cont_dev(dev);
+   struct iommu_group *group;
+
+   group = iommu_group_get(cont_dev);
+   if (!group)
+   group = iommu_group_alloc();
+   return group;
+}
+
 /**
  * iommu_group_get_for_dev - Find or create the IOMMU group for a device
  * @dev: target device
diff --git a/include/linux/fsl/mc.h b/include/linux/fsl/mc.h
index f27cb14..dddaca1 100644
--- a/include/linux/fsl/mc.h
+++ b/include/linux/fsl/mc.h
@@ -351,6 +351,14 @@ struct fsl_mc_io {
 #define dev_is_fsl_mc(_dev) (0)
 #endif
 
+/* Macro to check if a device is a container device */
+#define fsl_mc_is_cont_dev(_dev) (to_fsl_mc_device(_dev)->flags & \
+   FSL_MC_IS_DPRC)
+
+/* Macro to get the container device of a MC device */
+#define fsl_mc_cont_dev(_dev) (fsl_mc_is_cont_dev(_dev) ? \
+   (_dev) : (_dev)->parent)
+
 /*
  * module_fsl_mc_driver() - Helper macro for drivers that don't do
  * anything special in module init/exit.  This eliminates a lot of
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7447b0b..209891d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -389,6 +389,8 @@ static inline size_t iommu_map_sg(struct iommu_domain 
*domain,
 extern struct iommu_group *pci_device_group(struct device *dev);
 /* Generic device grouping function */
 extern struct iommu_group *generic_device_group(struct device *dev);
+/* FSL-MC device grouping function */
+struct iommu_group *fsl_mc_device_group(struct device *dev);
 
 /**
  * struct iommu_fwspec - per-device IOMMU instance data
-- 
1.9.1



[PATCH 3/7 v6] iommu/of: support iommu configuration for fsl-mc devices

2018-07-09 Thread Nipun Gupta
With of_pci_map_rid() available for all buses, use the function
for the configuration of devices on the fsl-mc bus.

Signed-off-by: Nipun Gupta 
Reviewed-by: Robin Murphy 
---
 drivers/iommu/of_iommu.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 811e160..284474d 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define NO_IOMMU   1
 
@@ -159,6 +160,23 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 
alias, void *data)
return err;
 }
 
+static int of_fsl_mc_iommu_init(struct fsl_mc_device *mc_dev,
+   struct device_node *master_np)
+{
+   struct of_phandle_args iommu_spec = { .args_count = 1 };
+   int err;
+
+   err = of_map_rid(master_np, mc_dev->icid, "iommu-map",
+"iommu-map-mask", _spec.np,
+iommu_spec.args);
+   if (err)
+   return err == -ENODEV ? NO_IOMMU : err;
+
+   err = of_iommu_xlate(&mc_dev->dev, &iommu_spec);
+   of_node_put(iommu_spec.np);
+   return err;
+}
+
 const struct iommu_ops *of_iommu_configure(struct device *dev,
   struct device_node *master_np)
 {
@@ -190,6 +208,8 @@ const struct iommu_ops *of_iommu_configure(struct device 
*dev,
 
err = pci_for_each_dma_alias(to_pci_dev(dev),
					     of_pci_iommu_init, &info);
+   } else if (dev_is_fsl_mc(dev)) {
+   err = of_fsl_mc_iommu_init(to_fsl_mc_device(dev), master_np);
} else {
struct of_phandle_args iommu_spec;
int idx = 0;
-- 
1.9.1



[PATCH 2/7 v6] iommu/of: make of_pci_map_rid() available for other devices too

2018-07-09 Thread Nipun Gupta
The iommu-map property is also used by devices on the fsl-mc bus. This
patch moves of_pci_map_rid() to a generic location, so that it
can be used by other buses too.

'of_pci_map_rid' is renamed here to 'of_map_rid'; there is no
functional change to the API.

Signed-off-by: Nipun Gupta 
Reviewed-by: Rob Herring 
Reviewed-by: Robin Murphy 
Acked-by: Bjorn Helgaas 
---
 drivers/iommu/of_iommu.c |   5 +--
 drivers/of/base.c| 102 +++
 drivers/of/irq.c |   5 +--
 drivers/pci/of.c | 101 --
 include/linux/of.h   |  11 +
 include/linux/of_pci.h   |  10 -
 6 files changed, 117 insertions(+), 117 deletions(-)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 5c36a8b..811e160 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -149,9 +149,8 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 
alias, void *data)
struct of_phandle_args iommu_spec = { .args_count = 1 };
int err;
 
-   err = of_pci_map_rid(info->np, alias, "iommu-map",
-"iommu-map-mask", _spec.np,
-iommu_spec.args);
+   err = of_map_rid(info->np, alias, "iommu-map", "iommu-map-mask",
+			 &iommu_spec.np, iommu_spec.args);
if (err)
return err == -ENODEV ? NO_IOMMU : err;
 
diff --git a/drivers/of/base.c b/drivers/of/base.c
index 848f549..c7aac81 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1995,3 +1995,105 @@ int of_find_last_cache_level(unsigned int cpu)
 
return cache_level;
 }
+
+/**
+ * of_map_rid - Translate a requester ID through a downstream mapping.
+ * @np: root complex device node.
+ * @rid: device requester ID to map.
+ * @map_name: property name of the map to use.
+ * @map_mask_name: optional property name of the mask to use.
+ * @target: optional pointer to a target device node.
+ * @id_out: optional pointer to receive the translated ID.
+ *
+ * Given a device requester ID, look up the appropriate implementation-defined
+ * platform ID and/or the target device which receives transactions on that
+ * ID, as per the "iommu-map" and "msi-map" bindings. Either of @target or
+ * @id_out may be NULL if only the other is required. If @target points to
+ * a non-NULL device node pointer, only entries targeting that node will be
+ * matched; if it points to a NULL value, it will receive the device node of
+ * the first matching target phandle, with a reference held.
+ *
+ * Return: 0 on success or a standard error code on failure.
+ */
+int of_map_rid(struct device_node *np, u32 rid,
+  const char *map_name, const char *map_mask_name,
+  struct device_node **target, u32 *id_out)
+{
+   u32 map_mask, masked_rid;
+   int map_len;
+   const __be32 *map = NULL;
+
+   if (!np || !map_name || (!target && !id_out))
+   return -EINVAL;
+
+   map = of_get_property(np, map_name, &map_len);
+   if (!map) {
+   if (target)
+   return -ENODEV;
+   /* Otherwise, no map implies no translation */
+   *id_out = rid;
+   return 0;
+   }
+
+   if (!map_len || map_len % (4 * sizeof(*map))) {
+   pr_err("%pOF: Error: Bad %s length: %d\n", np,
+   map_name, map_len);
+   return -EINVAL;
+   }
+
+   /* The default is to select all bits. */
+   map_mask = 0x;
+
+   /*
+* Can be overridden by "{iommu,msi}-map-mask" property.
+* If of_property_read_u32() fails, the default is used.
+*/
+   if (map_mask_name)
+   of_property_read_u32(np, map_mask_name, &map_mask);
+
+   masked_rid = map_mask & rid;
+   for ( ; map_len > 0; map_len -= 4 * sizeof(*map), map += 4) {
+   struct device_node *phandle_node;
+   u32 rid_base = be32_to_cpup(map + 0);
+   u32 phandle = be32_to_cpup(map + 1);
+   u32 out_base = be32_to_cpup(map + 2);
+   u32 rid_len = be32_to_cpup(map + 3);
+
+   if (rid_base & ~map_mask) {
+   pr_err("%pOF: Invalid %s translation - %s-mask (0x%x) 
ignores rid-base (0x%x)\n",
+   np, map_name, map_name,
+   map_mask, rid_base);
+   return -EFAULT;
+   }
+
+   if (masked_rid < rid_base || masked_rid >= rid_base + rid_len)
+   continue;
+
+   phandle_node = of_find_node_by_phandle(phandle);
+   if (!phandle_node)
+   return -ENODEV;
+
+   if (target) {
+   if (*target)
+   of_node_put(phandle_node);
+   else
+   *target = phandle_node;
+
+   if 

[PATCH 1/7 v6] Documentation: fsl-mc: add iommu-map device-tree binding for fsl-mc bus

2018-07-09 Thread Nipun Gupta
The existing IOMMU bindings cannot be used to specify the relationship
between fsl-mc devices and IOMMUs. This patch adds a generic binding for
mapping fsl-mc devices to IOMMUs, using the iommu-map property.

Signed-off-by: Nipun Gupta 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/misc/fsl,qoriq-mc.txt  | 39 ++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
index 6611a7c..01fdc33 100644
--- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
+++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
@@ -9,6 +9,25 @@ blocks that can be used to create functional hardware objects/devices
 such as network interfaces, crypto accelerator instances, L2 switches,
 etc.
 
+For an overview of the DPAA2 architecture and fsl-mc bus see:
+Documentation/networking/dpaa2/overview.rst
+
+As described in the above overview, all DPAA2 objects in a DPRC share the
+same hardware "isolation context" and a 10-bit value called an ICID
+(isolation context id) is expressed by the hardware to identify
+the requester.
+
+The generic 'iommus' property is insufficient to describe the relationship
+between ICIDs and IOMMUs, so an iommu-map property is used to define
+the set of possible ICIDs under a root DPRC and how they map to
+an IOMMU.
+
+For generic IOMMU bindings, see
+Documentation/devicetree/bindings/iommu/iommu.txt.
+
+For arm-smmu binding, see:
+Documentation/devicetree/bindings/iommu/arm,smmu.txt.
+
 Required properties:
 
 - compatible
@@ -88,14 +107,34 @@ Sub-nodes:
   Value type: <phandle>
   Definition: Specifies the phandle to the PHY device node associated
   with the this dpmac.
+Optional properties:
+
+- iommu-map: Maps an ICID to an IOMMU and associated iommu-specifier
+  data.
+
+  The property is an arbitrary number of tuples of
+  (icid-base,iommu,iommu-base,length).
+
+  Any ICID i in the interval [icid-base, icid-base + length) is
+  associated with the listed IOMMU, with the iommu-specifier
+  (i - icid-base + iommu-base).
 
 Example:
 
+smmu: iommu@500 {
+   compatible = "arm,mmu-500";
+   #iommu-cells = <1>;
+   stream-match-mask = <0x7C00>;
+   ...
+};
+
 fsl_mc: fsl-mc@80c00 {
 compatible = "fsl,qoriq-mc";
 reg = <0x0008 0x0c00 0 0x40>,/* MC portal base */
   <0x 0x0834 0 0x4>; /* MC control reg */
 msi-parent = <&its>;
+/* define map for ICIDs 23-64 */
+iommu-map = <23 &smmu 23 41>;
 #address-cells = <3>;
 #size-cells = <1>;
 
-- 
1.9.1
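
A quick way to sanity-check the binding arithmetic is the small standalone
program below (plain C, not kernel code; the structure layout and the test
ICIDs are made up for illustration). It models how an ICID is translated
through an iommu-map entry such as the <23 &smmu 23 41> example above,
i.e. tuples of (icid-base, iommu, iommu-base, length) with the default
all-ones mask:

/*
 * Standalone model (plain C, not kernel code) of the iommu-map lookup.
 * The entry layout (icid-base, iommu, iommu-base, length) and the default
 * all-ones mask follow the binding text above; everything else here is
 * made up for illustration.
 */
#include <stdint.h>
#include <stdio.h>

struct map_entry {
	uint32_t icid_base;	/* icid-base */
	uint32_t out_base;	/* iommu-base (stream ID base) */
	uint32_t length;
};

/* Return 0 and fill *out on a match, -1 if no entry covers the ICID. */
static int map_icid(const struct map_entry *map, int n, uint32_t mask,
		    uint32_t icid, uint32_t *out)
{
	uint32_t masked = icid & mask;
	int i;

	for (i = 0; i < n; i++) {
		if (masked < map[i].icid_base ||
		    masked >= map[i].icid_base + map[i].length)
			continue;
		*out = masked - map[i].icid_base + map[i].out_base;
		return 0;
	}
	return -1;
}

int main(void)
{
	/* the single entry from the example: iommu-map = <23 &smmu 23 41> */
	const struct map_entry map[] = {
		{ .icid_base = 23, .out_base = 23, .length = 41 },
	};
	uint32_t icids[] = { 23, 40, 63, 64 };
	uint32_t sid;
	unsigned int i;

	for (i = 0; i < sizeof(icids) / sizeof(icids[0]); i++) {
		if (map_icid(map, 1, 0xffffffff, icids[i], &sid))
			printf("ICID %u: no match\n", icids[i]);
		else
			printf("ICID %u -> stream ID %u\n", icids[i], sid);
	}
	return 0;
}

For ICIDs 23, 40 and 63 this prints the identical stream IDs (the example
maps the range one-to-one), and ICID 64 reports no match because the
length of 41 makes the upper bound exclusive.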



[PATCH 0/7 v6] Support for fsl-mc bus and its devices in SMMU

2018-07-09 Thread Nipun Gupta
This patchset defines an IOMMU DT binding for the fsl-mc bus and adds
support for the fsl-mc bus in the SMMU driver.

The patch series is based on top of dma-mapping tree (for-next branch):
http://git.infradead.org/users/hch/dma-mapping.git

These patches
  - Define the 'iommu-map' property for the fsl-mc bus (patch 1)
  - Integrate the fsl-mc bus with the SMMU using this
IOMMU binding (patches 2, 3, 4)
  - Add DMA configuration support for the fsl-mc bus (patches 5, 6)
  - Update the fsl-mc device node with IOMMU/DMA related changes (patch 7)

Changes in v2:
  - use iommu-map property for fsl-mc bus
  - rebase over patchset https://patchwork.kernel.org/patch/10317337/
and make corresponding changes for dma configuration of devices on
fsl-mc bus

Changes in v3:
  - move of_map_rid in drivers/of/address.c

Changes in v4:
  - move of_map_rid in drivers/of/base.c

Changes in v5:
  - break patch 5 in two separate patches (now patch 5/7 and patch 6/7)
  - add changelog text in patch 3/7 and patch 5/7
  - typo fix

Changes in v6:
  - Updated fsl_mc_device_group() API to be more rational
  - Added dma-coherent property in the LS2 smmu device node
  - Minor fixes in the device-tree documentation

Nipun Gupta (7):
  Documentation: fsl-mc: add iommu-map device-tree binding for fsl-mc
bus
  iommu/of: make of_pci_map_rid() available for other devices too
  iommu/of: support iommu configuration for fsl-mc devices
  iommu/arm-smmu: Add support for the fsl-mc bus
  bus: fsl-mc: support dma configure for devices on fsl-mc bus
  bus: fsl-mc: set coherent dma mask for devices on fsl-mc bus
  arm64: dts: ls208xa: comply with the iommu map binding for fsl_mc

 .../devicetree/bindings/misc/fsl,qoriq-mc.txt  |  39 
 arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi |   7 +-
 drivers/bus/fsl-mc/fsl-mc-bus.c|  16 +++-
 drivers/iommu/arm-smmu.c   |   7 ++
 drivers/iommu/iommu.c  |  13 +++
 drivers/iommu/of_iommu.c   |  25 -
 drivers/of/base.c  | 102 +
 drivers/of/irq.c   |   5 +-
 drivers/pci/of.c   | 101 
 include/linux/fsl/mc.h |   8 ++
 include/linux/iommu.h  |   2 +
 include/linux/of.h |  11 +++
 include/linux/of_pci.h |  10 --
 13 files changed, 224 insertions(+), 122 deletions(-)

-- 
1.9.1



Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-09 Thread Rafael J. Wysocki
On Mon, Jul 9, 2018 at 10:40 AM, Pingfan Liu  wrote:
> On Mon, Jul 9, 2018 at 3:48 PM Rafael J. Wysocki  wrote:
>>
>> On Mon, Jul 9, 2018 at 8:48 AM, Pingfan Liu  wrote:
>> > On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki  wrote:

[cut]

>>
>> I simply think that there should be one way to iterate over devices
>> for both system-wide PM and shutdown.
>>
>> The reason why it is not like that today is because of the development
>> history, but if it doesn't work and we want to fix it, let's just
>> consolidate all of that.
>>
>> Now, system-wide suspend resume sometimes iterates the list in the
>> reverse order which would be hard without having a list, wouldn't it?
>>
> Yes, it would be hard without having a list. I just thought to use
> device tree info to build up a shadowed list, and rebuild the list
> until there is new device_link_add() operation. For
> device_add/_remove(), it can modify the shadowed list directly.

Right, and that's the idea of dpm_list, generally speaking: It
represents one of the (possibly many) orders in which devices can be
suspended (or shut down) based on the information coming from the
device hierarchy and device links.

So it appears straightforward (even though it may be complicated
because of the build-time dependencies) to start using dpm_list for
shutdown too - and to ensure that it is properly maintained
everywhere.

Thanks,
Rafael


Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-09 Thread Pingfan Liu
On Mon, Jul 9, 2018 at 3:48 PM Rafael J. Wysocki  wrote:
>
> On Mon, Jul 9, 2018 at 8:48 AM, Pingfan Liu  wrote:
> > On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki  wrote:
> >>
> >> On Sat, Jul 7, 2018 at 6:24 AM, Pingfan Liu  wrote:
> >> > On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu  wrote:
> >> >>
> >> >> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki  
> >> >> wrote:
> >> >> >
> >> >> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner  wrote:
> >> >> > > [cc += Kishon Vijay Abraham]
> >> >> > >
> >> >> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
> >> >> > >> OK, so calling devices_kset_move_last() from really_probe() 
> >> >> > >> clearly is
> >> >> > >> a mistake.
> >> >> > >>
> >> >> > >> I'm not really sure what the intention of it was as the changelog 
> >> >> > >> of
> >> >> > >> commit 52cdbdd49853d doesn't really explain that (why would it be
> >> >> > >> insufficient without that change?)
> >> >> > >
> >> >> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC
> >> >> > > whose reset pin needs to be driven high on shutdown, lest the MMC
> >> >> > > won't be found on the next boot.
> >> >> > >
> >> >> > > The boards' devicetrees use a kludge wherein the reset pin is 
> >> >> > > modelled
> >> >> > > as a regulator.  The regulator is enabled when the MMC probes and
> >> >> > > disabled on driver unbind and shutdown.  As a result, the pin is 
> >> >> > > driven
> >> >> > > low on shutdown and the MMC is not found on the next boot.
> >> >> > >
> >> >> > > To fix this, another kludge was invented wherein the GPIO expander
> >> >> > > driving the reset pin unconditionally drives all its pins high on
> >> >> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
> >> >> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state
> >> >> > > of all pcf lines").
> >> >> > >
> >> >> > > For this kludge to work, the GPIO expander's ->shutdown hook needs 
> >> >> > > to
> >> >> > > be executed after the MMC expander's ->shutdown hook.
> >> >> > >
> >> >> > > Commit 52cdbdd49853d achieved that by reordering devices_kset 
> >> >> > > according
> >> >> > > to the probe order.  Apparently the MMC probes after the GPIO 
> >> >> > > expander,
> >> >> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator 
> >> >> > > isn't
> >> >> > > available yet, see mmc_regulator_get_supply().
> >> >> > >
> >> >> > > Note, I'm just piecing the information together from git history,
> >> >> > > I'm not responsible for these kludges.  (I'm innocent!)
> >> >> >
> >> >> > Sure enough. :-)
> >> >> >
> >> >> > In any case, calling devices_kset_move_last() in really_probe() is
> >> >> > plain broken and if its only purpose was to address a single, arguably
> >> >> > kludgy, use case, let's just get rid of it in the first place IMO.
> >> >> >
> >> >> Yes, if it is only used for a single use case.
> >> >>
> >> > Think it again, I saw other potential issue with the current code.
> >> > device_link_add->device_reorder_to_tail() can break the
> >> > "supplier<-consumer" order. During moving children after parent's
> >> > supplier, it ignores the order of child's consumer.
> >>
> >> What do you mean?
> >>
> > The drivers use device_link_add() to build "supplier<-consumer" order
> > without knowing each other. Hence there is the following potential
> > odds: (consumerX, child_a, ...) (consumer_a,..) (supplierX), where
> > consumer_a consumes child_a.
>
> Well, what's the initial state of the list?
>
> > When device_link_add()->device_reorder_to_tail() moves all descendant of
> > consumerX to the tail, it breaks the "supplier<-consumer" order by
> > "consumer_a <- child_a".
>
> That depends on what the initial ordering of the list is and please
> note that circular dependencies are explicitly assumed to be not
> present.
>
> The assumption is that the initial ordering of the list reflects the
> correct suspend (or shutdown) order without the new link.  Therefore
> initially all children are located after their parents and all known
> consumers are located after their suppliers.
>
> If a new link is added, the new consumer goes to the end of the list
> and all of its children and all of its consumers go after it.
> device_reorder_to_tail() is recursive, so for each of the devices that
> went to the end of the list, all of its children and all of its
> consumers go after it and so on.
>
> Now, that operation doesn't change the order of any of the
> parent<-child or supplier<-consumer pairs that get moved and since all
> of the devices that depend on any device that get moved go to the end
> of list after it, the only devices that don't go to the end of list
> are guaranteed to not depend on any of them (they may be parents or
> suppliers of the devices that go to the end of the list, but not their
> children or suppliers).
>
Thanks for the detailed explanation. It is clear now, and you are right.

> > And we need recrusion to resolve the item in
> > 

Re: NXP p1010se device trees only correct for P1010E/P1014E, not P1010/P1014 SoCs.

2018-07-09 Thread Tim Small

On 06/07/18 19:41, Scott Wood wrote:

My openwrt patch
just does a:

/delete-node/  crypto@3;

after the p1010si-post.dtsi include.

U-Boot should already be removing the node on non-E chips -- see
ft_cpu_setup() in arch/powerpc/cpu/mpc85xx/fdt.c



Hi Scott,

Thanks for your email.  The device in question ships an old U-Boot (a 
vendor fork of U-Boot 2010.12-svn15934).


Am I right in saying that the correct fix is to either:

Use a bootloader (such as current upstream U-Boot) which adjusts the 
device tree properly...


or:

In the case (such as OpenWrt) where the preferred installation method is 
to retain the vendor bootloader, have the distro kernel handle 
the device tree fixup itself (a rough sketch of this option follows below)?


Regards,

Tim.
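
For the second option, a very rough kernel-side sketch is below. It is
untested and full of assumptions: the "crypto" node name, the SVR bit used
to detect non-E parts (0x80000, as U-Boot's IS_E_PROCESSOR() uses) and the
availability of of_detach_node() (CONFIG_OF_DYNAMIC) are all placeholders
rather than anything taken from this thread:

/*
 * Untested sketch only: detach the SEC node at boot on non-E parts when
 * the bootloader has not already removed it.  The node name, the SVR bit
 * and the use of of_detach_node() are assumptions.  Would be called from
 * the platform's setup code.
 */
#include <linux/init.h>
#include <linux/of.h>
#include <asm/reg.h>

static void __init p1010_remove_crypto_on_non_e(void)
{
	struct device_node *np;

	if (mfspr(SPRN_SVR) & 0x80000)	/* assumed "E" (crypto fused on) bit */
		return;

	np = of_find_node_by_name(NULL, "crypto");
	if (np) {
		of_detach_node(np);	/* needs CONFIG_OF_DYNAMIC */
		of_node_put(np);
	}
}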


Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-09 Thread Rafael J. Wysocki
On Mon, Jul 9, 2018 at 8:48 AM, Pingfan Liu  wrote:
> On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki  wrote:
>>
>> On Sat, Jul 7, 2018 at 6:24 AM, Pingfan Liu  wrote:
>> > On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu  wrote:
>> >>
>> >> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki  
>> >> wrote:
>> >> >
>> >> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner  wrote:
>> >> > > [cc += Kishon Vijay Abraham]
>> >> > >
>> >> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
>> >> > >> OK, so calling devices_kset_move_last() from really_probe() clearly 
>> >> > >> is
>> >> > >> a mistake.
>> >> > >>
>> >> > >> I'm not really sure what the intention of it was as the changelog of
>> >> > >> commit 52cdbdd49853d doesn't really explain that (why would it be
>> >> > >> insufficient without that change?)
>> >> > >
>> >> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC
>> >> > > whose reset pin needs to be driven high on shutdown, lest the MMC
>> >> > > won't be found on the next boot.
>> >> > >
>> >> > > The boards' devicetrees use a kludge wherein the reset pin is modelled
>> >> > > as a regulator.  The regulator is enabled when the MMC probes and
>> >> > > disabled on driver unbind and shutdown.  As a result, the pin is 
>> >> > > driven
>> >> > > low on shutdown and the MMC is not found on the next boot.
>> >> > >
>> >> > > To fix this, another kludge was invented wherein the GPIO expander
>> >> > > driving the reset pin unconditionally drives all its pins high on
>> >> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
>> >> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state
>> >> > > of all pcf lines").
>> >> > >
>> >> > > For this kludge to work, the GPIO expander's ->shutdown hook needs to
>> >> > > be executed after the MMC expander's ->shutdown hook.
>> >> > >
>> >> > > Commit 52cdbdd49853d achieved that by reordering devices_kset 
>> >> > > according
>> >> > > to the probe order.  Apparently the MMC probes after the GPIO 
>> >> > > expander,
>> >> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't
>> >> > > available yet, see mmc_regulator_get_supply().
>> >> > >
>> >> > > Note, I'm just piecing the information together from git history,
>> >> > > I'm not responsible for these kludges.  (I'm innocent!)
>> >> >
>> >> > Sure enough. :-)
>> >> >
>> >> > In any case, calling devices_kset_move_last() in really_probe() is
>> >> > plain broken and if its only purpose was to address a single, arguably
>> >> > kludgy, use case, let's just get rid of it in the first place IMO.
>> >> >
>> >> Yes, if it is only used for a single use case.
>> >>
>> > Think it again, I saw other potential issue with the current code.
>> > device_link_add->device_reorder_to_tail() can break the
>> > "supplier<-consumer" order. During moving children after parent's
>> > supplier, it ignores the order of child's consumer.
>>
>> What do you mean?
>>
> The drivers use device_link_add() to build "supplier<-consumer" order
> without knowing each other. Hence there is the following potential
> odds: (consumerX, child_a, ...) (consumer_a,..) (supplierX), where
> consumer_a consumes child_a.

Well, what's the initial state of the list?

> When device_link_add()->device_reorder_to_tail() moves all descendant of
> consumerX to the tail, it breaks the "supplier<-consumer" order by
> "consumer_a <- child_a".

That depends on what the initial ordering of the list is and please
note that circular dependencies are explicitly assumed to be not
present.

The assumption is that the initial ordering of the list reflects the
correct suspend (or shutdown) order without the new link.  Therefore
initially all children are located after their parents and all known
consumers are located after their suppliers.

If a new link is added, the new consumer goes to the end of the list
and all of its children and all of its consumers go after it.
device_reorder_to_tail() is recursive, so for each of the devices that
went to the end of the list, all of its children and all of its
consumers go after it and so on.

Now, that operation doesn't change the order of any of the
parent<-child or supplier<-consumer pairs that get moved and since all
of the devices that depend on any device that get moved go to the end
of list after it, the only devices that don't go to the end of list
are guaranteed to not depend on any of them (they may be parents or
suppliers of the devices that go to the end of the list, but not their
children or suppliers).

> And we need recrusion to resolve the item in
> (consumer_a,..), each time when moving a consumer behind its supplier,
> we may break "parent<-child".

I don't see this as per the above.

Say, device_reorder_to_tail() moves a parent after its child.  This
means that device_reorder_to_tail() was not called for the child after
it had been called for the parent, but that is not true, because it is
called for all of the children of each 

Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-09 Thread Pingfan Liu
On Sun, Jul 8, 2018 at 4:25 PM Rafael J. Wysocki  wrote:
>
> On Sat, Jul 7, 2018 at 6:24 AM, Pingfan Liu  wrote:
> > On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu  wrote:
> >>
> >> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki  wrote:
> >> >
> >> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner  wrote:
> >> > > [cc += Kishon Vijay Abraham]
> >> > >
> >> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
> >> > >> OK, so calling devices_kset_move_last() from really_probe() clearly is
> >> > >> a mistake.
> >> > >>
> >> > >> I'm not really sure what the intention of it was as the changelog of
> >> > >> commit 52cdbdd49853d doesn't really explain that (why would it be
> >> > >> insufficient without that change?)
> >> > >
> >> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC
> >> > > whose reset pin needs to be driven high on shutdown, lest the MMC
> >> > > won't be found on the next boot.
> >> > >
> >> > > The boards' devicetrees use a kludge wherein the reset pin is modelled
> >> > > as a regulator.  The regulator is enabled when the MMC probes and
> >> > > disabled on driver unbind and shutdown.  As a result, the pin is driven
> >> > > low on shutdown and the MMC is not found on the next boot.
> >> > >
> >> > > To fix this, another kludge was invented wherein the GPIO expander
> >> > > driving the reset pin unconditionally drives all its pins high on
> >> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
> >> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state
> >> > > of all pcf lines").
> >> > >
> >> > > For this kludge to work, the GPIO expander's ->shutdown hook needs to
> >> > > be executed after the MMC expander's ->shutdown hook.
> >> > >
> >> > > Commit 52cdbdd49853d achieved that by reordering devices_kset according
> >> > > to the probe order.  Apparently the MMC probes after the GPIO expander,
> >> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't
> >> > > available yet, see mmc_regulator_get_supply().
> >> > >
> >> > > Note, I'm just piecing the information together from git history,
> >> > > I'm not responsible for these kludges.  (I'm innocent!)
> >> >
> >> > Sure enough. :-)
> >> >
> >> > In any case, calling devices_kset_move_last() in really_probe() is
> >> > plain broken and if its only purpose was to address a single, arguably
> >> > kludgy, use case, let's just get rid of it in the first place IMO.
> >> >
> >> Yes, if it is only used for a single use case.
> >>
> > Think it again, I saw other potential issue with the current code.
> > device_link_add->device_reorder_to_tail() can break the
> > "supplier<-consumer" order. During moving children after parent's
> > supplier, it ignores the order of child's consumer.
>
> What do you mean?
>
The drivers use device_link_add() to build the "supplier<-consumer" order
without knowing about each other. Hence the following ordering is
possible: (consumerX, child_a, ...) (consumer_a, ...) (supplierX), where
consumer_a consumes child_a. When
device_link_add()->device_reorder_to_tail() moves all descendants of
consumerX to the tail, it breaks the "supplier<-consumer" order between
child_a and consumer_a. And we need recursion to resolve the items in
(consumer_a, ...); each time we move a consumer behind its supplier,
we may break "parent<-child".

> > Beside this, essentially both devices_kset_move_after/_before() and
> > device_pm_move_after/_before() expose  the shutdown order to the
> > indirect caller,  and we can not expect that the caller can not handle
> > it correctly. It should be a job of drivers core.
>
> Arguably so, but that's how those functions were designed and the
> callers should be aware of the limitation.
>
> If they aren't, there is a bug in the caller.
>
If we handle device_move()->device_pm_move_after/_before() more
carefully, as described above, then we can hide the detail from the
caller and keep the PM ordering information inside the core.

> > It is hard to extract high dimension info and pack them into one dimension
> > linked-list.
>
> Well, yes and no.
>
For "hard", I means that we need two interleaved recursion to make the
order correct. Otherwise, I think it is a bug or limitation.

> We know it for a fact that there is a linear ordering that will work.
> It is inefficient to figure it out every time during system suspend
> and resume, for one and that's why we have dpm_list.
>
Yeah, I agree that iterating over the device tree may hurt performance.
I guess the iteration will not account for the majority of the suspend
time compared with device_suspend(), which has to synchronise with the
hardware. But data would be more persuasive. Besides the performance, do
you have any other concerns so far?

> Now, if we have it for suspend and resume, it can also be used for shutdown.
>
Yes, I do think so.

Thanks and regards,
Pingfan
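
To make the reordering argument in this thread concrete, here is a small
standalone model (userspace C, not kernel code; the device names follow
the (consumerX, child_a, consumer_a, supplierX) example above, and the
tiny array-based list is made up for illustration) of the rule
device_reorder_to_tail() applies: move the consumer to the tail, then
recursively move its children and consumers after it:

/*
 * Standalone model of the device_reorder_to_tail() rule discussed above.
 * Circular dependencies are assumed absent, as in the real code.
 */
#include <stdio.h>
#include <string.h>

#define N 5

static const char *name[N] = {
	"consumerX", "child_a", "consumer_a", "other", "supplierX"
};
static int parent[N] = { -1, 0, -1, -1, -1 };	/* child_a's parent is consumerX */
static int consumes[N][N];			/* consumes[c][s]: c consumes s */
static int order[N];

static void move_to_tail(int dev)
{
	int i, found = 0;

	/* remove dev from its current position and append it */
	for (i = 0; i < N; i++) {
		if (order[i] == dev)
			found = 1;
		if (found && i + 1 < N)
			order[i] = order[i + 1];
	}
	order[N - 1] = dev;

	/* then recurse over dev's children and consumers */
	for (i = 0; i < N; i++)
		if (parent[i] == dev || consumes[i][dev])
			move_to_tail(i);
}

int main(void)
{
	/* initial order: parents before children, suppliers before consumers */
	const int init[N] = { 0, 1, 2, 3, 4 };
	int i;

	memcpy(order, init, sizeof(order));
	consumes[2][1] = 1;	/* pre-existing link: consumer_a consumes child_a */

	consumes[0][4] = 1;	/* new link: consumerX consumes supplierX */
	move_to_tail(0);	/* what device_link_add() would trigger */

	for (i = 0; i < N; i++)
		printf("%s%s", name[order[i]], i == N - 1 ? "\n" : " -> ");
	return 0;
}

With the pre-existing consumer_a -> child_a link and the new
consumerX -> supplierX link it prints
other -> supplierX -> consumerX -> child_a -> consumer_a: suppliers still
precede their consumers and parents still precede their children, which
is the point made above.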


[PATCH] powerpc64s: Show ori31 availability in spectre_v1 sysfs file not v2

2018-07-09 Thread Michael Ellerman
When I added the spectre_v2 information in sysfs, I included the
availability of the ori31 speculation barrier.

Although the ori31 barrier can be used to mitigate v2, it's primarily
intended as a spectre v1 mitigation. Spectre v2 is mitigated by
hardware changes.

So rework the sysfs files to show the ori31 information in the
spectre_v1 file, rather than v2.

Currently we display eg:

  $ grep . spectre_v*
  spectre_v1:Mitigation: __user pointer sanitization
  spectre_v2:Mitigation: Indirect branch cache disabled, ori31 speculation barrier enabled

After:

  $ grep . spectre_v*
  spectre_v1:Mitigation: __user pointer sanitization, ori31 speculation barrier enabled
  spectre_v2:Mitigation: Indirect branch cache disabled

Fixes: d6fbe1c55c55 ("powerpc/64s: Wire up cpu_show_spectre_v2()")
Cc: sta...@vger.kernel.org # v4.17+
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/security.c | 27 +--
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index a8b277362931..4cb8f1f7b593 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -117,25 +117,35 @@ ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, cha
 
 ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf)
 {
-   if (!security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR))
-   return sprintf(buf, "Not affected\n");
+   struct seq_buf s;
+
+   seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
-   if (barrier_nospec_enabled)
-   return sprintf(buf, "Mitigation: __user pointer 
sanitization\n");
+   if (security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR)) {
+   if (barrier_nospec_enabled)
+   seq_buf_printf(, "Mitigation: __user pointer 
sanitization");
+   else
+   seq_buf_printf(, "Vulnerable");
 
-   return sprintf(buf, "Vulnerable\n");
+   if (security_ftr_enabled(SEC_FTR_SPEC_BAR_ORI31))
+   seq_buf_printf(, ", ori31 speculation barrier 
enabled");
+
+   seq_buf_printf(, "\n");
+   } else
+   seq_buf_printf(, "Not affected\n");
+
+   return s.len;
 }
 
 ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf)
 {
-   bool bcs, ccd, ori;
struct seq_buf s;
+   bool bcs, ccd;
 
 seq_buf_init(&s, buf, PAGE_SIZE - 1);
 
bcs = security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED);
ccd = security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED);
-   ori = security_ftr_enabled(SEC_FTR_SPEC_BAR_ORI31);
 
if (bcs || ccd) {
seq_buf_printf(, "Mitigation: ");
@@ -151,9 +161,6 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c
} else
seq_buf_printf(, "Vulnerable");
 
-   if (ori)
-   seq_buf_printf(, ", ori31 speculation barrier enabled");
-
seq_buf_printf(, "\n");
 
return s.len;
-- 
2.14.1