Re: powerpc: 32BIT vs. 64BIT (PPC32 vs. PPC64)

2018-07-06 Thread Randy Dunlap
On 07/06/2018 06:45 PM, Benjamin Herrenschmidt wrote:
> On Thu, 2018-07-05 at 14:30 -0700, Randy Dunlap wrote:
>> Hi,
>>
>> Is there a good way (or a shortcut) to do something like:
>>
>> $ make ARCH=powerpc O=PPC32 [other_options] allmodconfig
>>   to get a PPC32/32BIT allmodconfig
>>
>> and also be able to do:
>>
>> $make ARCH=powerpc O=PPC64 [other_options] allmodconfig
>>   to get a PPC64/64BIT allmodconfig?
> 
> Hrm... O= is for the separate build dir, so there much be something
> else.
> 
> You mean having ARCH= aliases like ppc/ppc32 and ppc64 ?

Yes.

> That would be a matter of overriding some .config defaults I suppose, I
> don't know how this is done on other archs.
> 
> I see the aliasing trick in the Makefile but that's about it.
> 
>> Note that arch/x86, arch/sh, and arch/sparc have ways to do
>> some flavor(s) of this (from Documentation/kbuild/kbuild.txt;
>> sh and sparc based on a recent "fix" patch from me):
> 
> I fail to see what you are actually talking about here ... sorry. Do
> you have concrete examples on x86 or sparc ? From what I can tell the
> "i386" or "sparc32/sparc64" aliases just change SRCARCH in Makefile and
> 32 vs 64-bit is just a Kconfig option...

Yes, your summary is mostly correct.

I'm just looking for a way to do cross-compile builds that are close to
ppc32 allmodconfig and ppc64 allmodconfig.


>> x86: i386 for 32 bit, x86_64 for 64 bit
>> sh: sh for 32 bit, sh64 for 64 bit
>> sparc: sparc32 for 32 bit, sparc64 for 64 bit

I saw the powerpc merge-config targets after I sent this original email.
I can do my own homebrew that I prefer over those, though.


thanks,
-- 
~Randy


Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-06 Thread Pingfan Liu
On Fri, Jul 6, 2018 at 9:55 PM Pingfan Liu  wrote:
>
> On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki  wrote:
> >
> > On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner  wrote:
> > > [cc += Kishon Vijay Abraham]
> > >
> > > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
> > >> OK, so calling devices_kset_move_last() from really_probe() clearly is
> > >> a mistake.
> > >>
> > >> I'm not really sure what the intention of it was as the changelog of
> > >> commit 52cdbdd49853d doesn't really explain that (why would it be
> > >> insufficient without that change?)
> > >
> > > It seems 52cdbdd49853d fixed an issue with boards which have an MMC
> > > whose reset pin needs to be driven high on shutdown, lest the MMC
> > > won't be found on the next boot.
> > >
> > > The boards' devicetrees use a kludge wherein the reset pin is modelled
> > > as a regulator.  The regulator is enabled when the MMC probes and
> > > disabled on driver unbind and shutdown.  As a result, the pin is driven
> > > low on shutdown and the MMC is not found on the next boot.
> > >
> > > To fix this, another kludge was invented wherein the GPIO expander
> > > driving the reset pin unconditionally drives all its pins high on
> > > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
> > > (commit adc284755055, "gpio: pcf857x: restore the initial line state
> > > of all pcf lines").
> > >
> > > For this kludge to work, the GPIO expander's ->shutdown hook needs to
> > > be executed after the MMC expander's ->shutdown hook.
> > >
> > > Commit 52cdbdd49853d achieved that by reordering devices_kset according
> > > to the probe order.  Apparently the MMC probes after the GPIO expander,
> > > possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't
> > > available yet, see mmc_regulator_get_supply().
> > >
> > > Note, I'm just piecing the information together from git history,
> > > I'm not responsible for these kludges.  (I'm innocent!)
> >
> > Sure enough. :-)
> >
> > In any case, calling devices_kset_move_last() in really_probe() is
> > plain broken and if its only purpose was to address a single, arguably
> > kludgy, use case, let's just get rid of it in the first place IMO.
> >
> Yes, if it is only used for a single use case.
>
Think it again, I saw other potential issue with the current code.
device_link_add->device_reorder_to_tail() can break the
"supplier<-consumer" order. During moving children after parent's
supplier, it ignores the order of child's consumer.  Beside this,
essentially both devices_kset_move_after/_before() and
device_pm_move_after/_before() expose  the shutdown order to the
indirect caller,  and we can not expect that the caller can not handle
it correctly. It should be a job of drivers core.  It is hard to
extract high dimension info and pack them into one dimension
linked-list. And in theory, it is warranted that the shutdown seq is
correct by using device tree info. More important, it is cheap with
the data structure in hand. So I think it is time to resolve the issue
once for all.

Thanks and regards,
Pingfan


Re: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

2018-07-06 Thread Pingfan Liu
On Fri, Jul 6, 2018 at 5:54 PM Rafael J. Wysocki  wrote:
>
> On Friday, July 6, 2018 5:02:15 AM CEST Pingfan Liu wrote:
> > On Thu, Jul 5, 2018 at 6:13 PM Rafael J. Wysocki  wrote:
> > >
> > > On Tuesday, July 3, 2018 8:50:40 AM CEST Pingfan Liu wrote:
> > > > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> > > > places an assumption of supplier<-consumer order on the process of 
> > > > probe.
> > > > But it turns out to break down the parent <- child order in some scene.
> > > > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> > > > have been probed. Then comes the bridge's module, which enables extra
> > > > feature(such as hotplug) on this bridge. This will break the
> > > > parent<-children order and cause failure when "kexec -e" in some 
> > > > scenario.
> > > >
> > > > The detailed description of the scenario:
> > > > An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a 
> > > > mod)
> > > > match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
> > > > to some issue. For this case, the bridge is moved after its children in
> > > > devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can 
> > > > not
> > > > write back buffer in flight due to the former shutdown of the bridge 
> > > > which
> > > > clears the BusMaster bit.
> > > >
> > > > It is a little hard to impose both "parent<-child" and 
> > > > "supplier<-consumer"
> > > > order on devices_kset. Take the following scene:
> > > > step0: before a consumer's probing, (note child_a is supplier of 
> > > > consumer_a)
> > > >   [ consumer-X, child_a, , child_z] [... consumer_a, ..., 
> > > > consumer_z, ...] supplier-X
> > > >  ^^ affected range 
> > > > ^^
> > > > step1: when probing, moving consumer-X after supplier-X
> > > >   [ child_a, , child_z] [ consumer_a, ..., consumer_z, ...] 
> > > > supplier-X, consumer-X
> > > > step2: the children of consumer-X should be re-ordered to maintain the 
> > > > seq
> > > >   [... consumer_a, ..., consumer_z, ] supplier-X  [consumer-X, 
> > > > child_a, , child_z]
> > > > step3: the consumer_a should be re-ordered to maintain the seq
> > > >   [... consumer_z, ...] supplier-X [ consumer-X, child_a, consumer_a 
> > > > ..., child_z]
> > > >
> > > > It requires two nested recursion to drain out all out-of-order item in
> > > > "affected range". To avoid such complicated code, this patch suggests
> > > > to utilize the info in device tree, instead of using the order of
> > > > devices_kset during shutdown. It iterates the device tree, and firstly
> > > > shutdown a device's children and consumers. After this patch, the buggy
> > > > commit is hollow and left to clean.
> > > >
> > > > Cc: Greg Kroah-Hartman 
> > > > Cc: Rafael J. Wysocki 
> > > > Cc: Grygorii Strashko 
> > > > Cc: Christoph Hellwig 
> > > > Cc: Bjorn Helgaas 
> > > > Cc: Dave Young 
> > > > Cc: linux-...@vger.kernel.org
> > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > Signed-off-by: Pingfan Liu 
> > > > ---
> > > >  drivers/base/core.c| 48 
> > > > +++-
> > > >  include/linux/device.h |  1 +
> > > >  2 files changed, 44 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > > index a48868f..684b994 100644
> > > > --- a/drivers/base/core.c
> > > > +++ b/drivers/base/core.c
> > > > @@ -1446,6 +1446,7 @@ void device_initialize(struct device *dev)
> > > >   INIT_LIST_HEAD(>links.consumers);
> > > >   INIT_LIST_HEAD(>links.suppliers);
> > > >   dev->links.status = DL_DEV_NO_DRIVER;
> > > > + dev->shutdown = false;
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(device_initialize);
> > > >
> > > > @@ -2811,7 +2812,6 @@ static void __device_shutdown(struct device *dev)
> > > >* lock is to be held
> > > >*/
> > > >   parent = get_device(dev->parent);
> > > > - get_device(dev);
> > >
> > > Why is the get_/put_device() not needed any more?
> > >
> > They are moved upper layer into device_for_each_child_shutdown().
> > Since there is lock breakage in __device_shutdown(), resorting to
> > ref++ to protect the ancestor.  And I think the
> > get_device(dev->parent) can be deleted either.
>
> Wouldn't that break USB?
>
Sorry, I can not figure out. Is USB not modeled up-to-down? This
recursion can handle the up-to-down ref issue automatically, due to
the nature of device tree. Any hints? Thanks.

> > > >   /*
> > > >* Make sure the device is off the kset list, in the
> > > >* event that dev->*->shutdown() doesn't remove it.
> > > > @@ -2842,23 +2842,60 @@ static void __device_shutdown(struct device 
> > > > *dev)
> > > >   dev_info(dev, "shutdown\n");
> > > >   dev->driver->shutdown(dev);
> > > >   }
> > > > -
> > > > + dev->shutdown = true;
> > > >   device_unlock(dev);
> > > >   if (parent)

Re: powerpc: 32BIT vs. 64BIT (PPC32 vs. PPC64)

2018-07-06 Thread Benjamin Herrenschmidt
On Thu, 2018-07-05 at 14:30 -0700, Randy Dunlap wrote:
> Hi,
> 
> Is there a good way (or a shortcut) to do something like:
> 
> $ make ARCH=powerpc O=PPC32 [other_options] allmodconfig
>   to get a PPC32/32BIT allmodconfig
> 
> and also be able to do:
> 
> $make ARCH=powerpc O=PPC64 [other_options] allmodconfig
>   to get a PPC64/64BIT allmodconfig?

Hrm... O= is for the separate build dir, so there much be something
else.

You mean having ARCH= aliases like ppc/ppc32 and ppc64 ?

That would be a matter of overriding some .config defaults I suppose, I
don't know how this is done on other archs.

I see the aliasing trick in the Makefile but that's about it.

> Note that arch/x86, arch/sh, and arch/sparc have ways to do
> some flavor(s) of this (from Documentation/kbuild/kbuild.txt;
> sh and sparc based on a recent "fix" patch from me):

I fail to see what you are actually talking about here ... sorry. Do
you have concrete examples on x86 or sparc ? From what I can tell the
"i386" or "sparc32/sparc64" aliases just change SRCARCH in Makefile and
32 vs 64-bit is just a Kconfig option...

> x86: i386 for 32 bit, x86_64 for 64 bit
> sh: sh for 32 bit, sh64 for 64 bit
> sparc: sparc32 for 32 bit, sparc64 for 64 bit




NXP p1010se device trees only correct for P1010E/P1014E, not P1010/P1014 SoCs.

2018-07-06 Thread Tim Small

Hello,

I contributed a patch to OpenWRT a couple of days ago to fix the device 
tree for a device which uses an NXP p1014 without the SEC4 module.  I 
was wondering about getting a similar fix applied upstream...


The device uses the P1014 (without SEC4 functionality), and includes:

fsl/p1010si-pre.dtsi and fsl/p1010si-post.dtsi

... in its device tree.  The latter pulls in leads to a node for 
soc@ffe0/crypto@3 - which then causes the CAAM modules to be 
used for crypto operations, when then fail at runtime.


AN4938 states that there are versions of both the p1010 and p1014 
without the SEC4 module.


The P1010 errata at:

https://media.digikey.com/pdf/PCNs/Freescale/P1010CE_RevL.pdf

(table 2 on page 2), says that the P1010 and P1014 don't have the SEC4 
module, only the P1010E and P1014E models do.


So, I think there should probably be device trees for p101xE (with SEC) 
and p101x (without SEC).


Any thoughts?  Not really sure how best to do that...  My openwrt patch 
just does a:


/delete-node/ crypto@3;

after the p1010si-post.dtsi include.

Cheers,

Tim.
--
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309


Re: NXP p1010se device trees only correct for P1010E/P1014E, not P1010/P1014 SoCs.

2018-07-06 Thread Scott Wood
On Fri, 2018-07-06 at 15:50 +0100, Tim Small wrote:
> Hello,
> 
> I contributed a patch to OpenWRT a couple of days ago to fix the device 
> tree for a device which uses an NXP p1014 without the SEC4 module.  I 
> was wondering about getting a similar fix applied upstream...
> 
> The device uses the P1014 (without SEC4 functionality), and includes:
> 
> fsl/p1010si-pre.dtsi and fsl/p1010si-post.dtsi
> 
> ... in its device tree.  The latter pulls in leads to a node for 
> soc@ffe0/crypto@3 - which then causes the CAAM modules to be 
> used for crypto operations, when then fail at runtime.
> 
> AN4938 states that there are versions of both the p1010 and p1014 
> without the SEC4 module.
> 
> The P1010 errata at:
> 
> https://media.digikey.com/pdf/PCNs/Freescale/P1010CE_RevL.pdf
> 
> (table 2 on page 2), says that the P1010 and P1014 don't have the SEC4 
> module, only the P1010E and P1014E models do.
> 
> So, I think there should probably be device trees for p101xE (with SEC) 
> and p101x (without SEC).
> 
> Any thoughts?  Not really sure how best to do that...  My openwrt patch 
> just does a:
> 
> /delete-node/ crypto@3;
> 
> after the p1010si-post.dtsi include.

U-Boot should already be removing the node on non-E chips -- see
ft_cpu_setup() in arch/powerpc/cpu/mpc85xx/fdt.c

-Scott



Re: [RFC PATCH 2/2] dma-mapping: Clean up dma_get_required_mask() hooks

2018-07-06 Thread Robin Murphy

On 05/07/18 20:38, Christoph Hellwig wrote:

On Wed, Jul 04, 2018 at 06:50:12PM +0100, Robin Murphy wrote:

As for the other mask-related hooks, standardise the arch override into
a Kconfig option, and also pull the generic implementation into the DMA
mapping code rather than having it hide away in the platform bus code.


Heh, I have a somewhat similar patch in my queue.  I didn't want it out
because dma_get_required_mask is rather ill defined at the moment and
I wanted to clean that up first.  But I guess I could apply this first
and clean up later.

I just fear you might be wanting to add an arm64 user, so I'd really like
to understand why and how.


Nah, this guy is just pure cleanup since it also fell out of my 'git 
grep ARCH_HAS_DMA'


Robin.


Re: [RFC PATCH 1/2] dma-mapping: Clean up dma_set_*mask() hooks

2018-07-06 Thread Robin Murphy

On 05/07/18 20:37, Christoph Hellwig wrote:

On Wed, Jul 04, 2018 at 06:50:11PM +0100, Robin Murphy wrote:

Arch-specific implementions for dma_set_{coherent_,}mask() currently
rely on an inconsistent mix of arch-defined Kconfig symbols and macro
overrides. Now that we have a nice centralised home for DMA API gubbins,
let's consolidate these loose ends under consistent config options.

Signed-off-by: Robin Murphy 
---

Here's hoping the buildbot comes by to point out what I've inevitably
missed, although I did check a cursory cross-compile of ppc64_defconfig
to iron out the obvious howlers.


The patch looks sensible to me, although I was hoping to get rid of these
hooks in this or the next merge window as they are a horrible bad idea.


The motivation here is that I'm looking at adding set_mask overrides
for arm64, and having discovered a bit of a mess it seemed prudent to
clean up before ingraining it any more.


What are you trying to do?  I really don't want to see more users of
the hooks as they are are a horribly bad idea.


I really need to fix the ongoing problem we have where, due to funky 
integrations, devices suffer some downstream addressing limit (described 
by DT dma-ranges or ACPI IORT/_DMA) which we carefully set up in 
dma_configure(), but then just gets lost when the driver probes and 
innocently calls dma_set_mask() with something wider. I think it's 
effectively the generalised case of the VIA 32-bit quirk, if I 
understand that one correctly.


The approach that seemed to me to be safest is largely based on the one 
proposed in a thread from ages ago[1]; namely to make dma_configure() 
better at distinguishing firmware-specified masks from bus defaults, 
capture the firmware mask in dev->archdata during arch_setup_dma_ops(), 
then use the custom set_mask routines to ensure any subsequent updates 
never exceed that. It doesn't seem possible to make this work robustly 
without storing *some* additional per-device data, and for that archdata 
is a lesser evil than struct device itself. Plus even though it's not 
actually an arch-specific issue it feels like there's such a risk of 
breaking other platforms that I'm reticent to even try handling it 
entirely in generic code.


Robin.

[1] 
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1306507.html


Re: [PATCH V2 00/10] KASan ppc64 support

2018-07-06 Thread Christophe LEROY




Le 06/07/2018 à 16:11, Aneesh Kumar K.V a écrit :

Christophe LEROY  writes:


Hi Aneesh,

Are you still working on support for KASan for ppc64 ?



Haven't got time to work on this. The hash memory layout makes it
a bit complicated to implement this.



Ok, maybe would be easier to start with nohash.

Is there some literature somewhere about what an arch has to implement 
to use KAsan ?


Thanks,
Christophe


Re: [PATCH V2 00/10] KASan ppc64 support

2018-07-06 Thread Aneesh Kumar K.V
Christophe LEROY  writes:

> Hi Aneesh,
>
> Are you still working on support for KASan for ppc64 ?
>

Haven't got time to work on this. The hash memory layout makes it
a bit complicated to implement this.

-aneesh



Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-06 Thread Pingfan Liu
On Fri, Jul 6, 2018 at 4:47 PM Rafael J. Wysocki  wrote:
>
> On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner  wrote:
> > [cc += Kishon Vijay Abraham]
> >
> > On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
> >> OK, so calling devices_kset_move_last() from really_probe() clearly is
> >> a mistake.
> >>
> >> I'm not really sure what the intention of it was as the changelog of
> >> commit 52cdbdd49853d doesn't really explain that (why would it be
> >> insufficient without that change?)
> >
> > It seems 52cdbdd49853d fixed an issue with boards which have an MMC
> > whose reset pin needs to be driven high on shutdown, lest the MMC
> > won't be found on the next boot.
> >
> > The boards' devicetrees use a kludge wherein the reset pin is modelled
> > as a regulator.  The regulator is enabled when the MMC probes and
> > disabled on driver unbind and shutdown.  As a result, the pin is driven
> > low on shutdown and the MMC is not found on the next boot.
> >
> > To fix this, another kludge was invented wherein the GPIO expander
> > driving the reset pin unconditionally drives all its pins high on
> > shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
> > (commit adc284755055, "gpio: pcf857x: restore the initial line state
> > of all pcf lines").
> >
> > For this kludge to work, the GPIO expander's ->shutdown hook needs to
> > be executed after the MMC expander's ->shutdown hook.
> >
> > Commit 52cdbdd49853d achieved that by reordering devices_kset according
> > to the probe order.  Apparently the MMC probes after the GPIO expander,
> > possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't
> > available yet, see mmc_regulator_get_supply().
> >
> > Note, I'm just piecing the information together from git history,
> > I'm not responsible for these kludges.  (I'm innocent!)
>
> Sure enough. :-)
>
> In any case, calling devices_kset_move_last() in really_probe() is
> plain broken and if its only purpose was to address a single, arguably
> kludgy, use case, let's just get rid of it in the first place IMO.
>
Yes, if it is only used for a single use case.

> > @Pingfan Liu, if you just remove the call to devices_kset_move_last()
> > from really_probe(), does the issue go away?
>
> I would think so from the description of the problem (elsewhere in this 
> thread).
>
Yes.

Thanks,
Pingfan


Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-06 Thread Pingfan Liu
On Fri, Jul 6, 2018 at 4:36 PM Lukas Wunner  wrote:
>
> [cc += Kishon Vijay Abraham]
>
> On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
> > OK, so calling devices_kset_move_last() from really_probe() clearly is
> > a mistake.
> >
> > I'm not really sure what the intention of it was as the changelog of
> > commit 52cdbdd49853d doesn't really explain that (why would it be
> > insufficient without that change?)
>
> It seems 52cdbdd49853d fixed an issue with boards which have an MMC
> whose reset pin needs to be driven high on shutdown, lest the MMC
> won't be found on the next boot.
>
> The boards' devicetrees use a kludge wherein the reset pin is modelled
> as a regulator.  The regulator is enabled when the MMC probes and
> disabled on driver unbind and shutdown.  As a result, the pin is driven
> low on shutdown and the MMC is not found on the next boot.
>
> To fix this, another kludge was invented wherein the GPIO expander
> driving the reset pin unconditionally drives all its pins high on
> shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
> (commit adc284755055, "gpio: pcf857x: restore the initial line state
> of all pcf lines").
>
> For this kludge to work, the GPIO expander's ->shutdown hook needs to
> be executed after the MMC expander's ->shutdown hook.
>
> Commit 52cdbdd49853d achieved that by reordering devices_kset according
> to the probe order.  Apparently the MMC probes after the GPIO expander,
> possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't
> available yet, see mmc_regulator_get_supply().
>
> Note, I'm just piecing the information together from git history,
> I'm not responsible for these kludges.  (I'm innocent!)
>
Thanks for your exploration, very clearly. I had tried, but failed
since these commits are contributed with different authors. I am not
familiar with ARM and dts, so had thought
really_probe()->devices_kset_move_last() is used to address a very
popular "supplier<-consumer" order issue in smart phone, based on the
configuration hard-coded in "bios(or counterpart in ARM).

> @Pingfan Liu, if you just remove the call to devices_kset_move_last()
> from really_probe(), does the issue go away?
>
Yes, it goes away.

> If so, it might be best to remove that call and model the dependency
> with a call to device_link_add() in mmc_regulator_get_supply().
> Another idea would be to automatically add a device_link in the
> driver core if -EPROBE_DEFER is returned.
>
Maybe the first one is better, as it is already used by other drivers.

Thanks,
Pingfan


RE: [PATCH 7/7 v5] arm64: dts: ls208xa: comply with the iommu map binding for fsl_mc

2018-07-06 Thread Nipun Gupta


> -Original Message-
> From: Robin Murphy [mailto:robin.mur...@arm.com]
> Sent: Tuesday, July 3, 2018 10:06 PM
> To: Nipun Gupta ; will.dea...@arm.com;
> robh...@kernel.org; r...@kernel.org; mark.rutl...@arm.com;
> catalin.mari...@arm.com; gre...@linuxfoundation.org; Laurentiu Tudor
> ; bhelg...@google.com
> Cc: h...@lst.de; j...@8bytes.org; m.szyprow...@samsung.com;
> shawn...@kernel.org; frowand.l...@gmail.com; iommu@lists.linux-
> foundation.org; linux-ker...@vger.kernel.org; devicet...@vger.kernel.org;
> linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org; linux-
> p...@vger.kernel.org; Bharat Bhushan ;
> stuyo...@gmail.com; Leo Li 
> Subject: Re: [PATCH 7/7 v5] arm64: dts: ls208xa: comply with the iommu map
> binding for fsl_mc
> 
> On 20/05/18 14:49, Nipun Gupta wrote:
> > fsl-mc bus support the new iommu-map property. Comply to this binding
> > for fsl_mc bus.
> >
> > Signed-off-by: Nipun Gupta 
> > Reviewed-by: Laurentiu Tudor 
> > ---
> >   arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 6 +-
> >   1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
> b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
> > index 137ef4d..6010505 100644
> > --- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
> > +++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi
> > @@ -184,6 +184,7 @@
> > #address-cells = <2>;
> > #size-cells = <2>;
> > ranges;
> > +   dma-ranges = <0x0 0x0 0x0 0x0 0x1 0x>;
> >
> > clockgen: clocking@130 {
> > compatible = "fsl,ls2080a-clockgen";
> > @@ -357,6 +358,8 @@
> > reg = <0x0008 0x0c00 0 0x40>,/* MC portal
> base */
> >   <0x 0x0834 0 0x4>; /* MC
> control reg */
> > msi-parent = <>;
> > +   iommu-map = <0  0 0>;  /* This is fixed-up by
> u-boot */
> > +   dma-coherent;
> > #address-cells = <3>;
> > #size-cells = <1>;
> >
> > @@ -460,6 +463,8 @@
> > compatible = "arm,mmu-500";
> > reg = <0 0x500 0 0x80>;
> > #global-interrupts = <12>;
> > +   #iommu-cells = <1>;
> > +   stream-match-mask = <0x7C00>;
> > interrupts = <0 13 4>, /* global secure fault */
> >  <0 14 4>, /* combined secure interrupt */
> >  <0 15 4>, /* global non-secure fault */
> > @@ -502,7 +507,6 @@
> >  <0 204 4>, <0 205 4>,
> >  <0 206 4>, <0 207 4>,
> >  <0 208 4>, <0 209 4>;
> > -   mmu-masters = <_mc 0x300 0>;
> 
> Since we're in here, is the SMMU itself also coherent? If it is, you
> probably want to say so and avoid the overhead of pointlessly cleaning
> cache lines on every page table update.

Yes, dma-coherent property is also required here. I missed it somehow.
Thanks for pointing this.

Regards,
Nipun

> 
> Robin.
> 
> > };
> >
> > dspi: dspi@210 {
> >


RE: [PATCH 5/7 v5] bus: fsl-mc: support dma configure for devices on fsl-mc bus

2018-07-06 Thread Nipun Gupta


> -Original Message-
> From: Robin Murphy [mailto:robin.mur...@arm.com]
> Sent: Tuesday, July 3, 2018 9:44 PM
> To: Nipun Gupta ; will.dea...@arm.com;
> robh...@kernel.org; r...@kernel.org; mark.rutl...@arm.com;
> catalin.mari...@arm.com; gre...@linuxfoundation.org; Laurentiu Tudor
> ; bhelg...@google.com
> Cc: h...@lst.de; j...@8bytes.org; m.szyprow...@samsung.com;
> shawn...@kernel.org; frowand.l...@gmail.com; iommu@lists.linux-
> foundation.org; linux-ker...@vger.kernel.org; devicet...@vger.kernel.org;
> linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org; linux-
> p...@vger.kernel.org; Bharat Bhushan ;
> stuyo...@gmail.com; Leo Li 
> Subject: Re: [PATCH 5/7 v5] bus: fsl-mc: support dma configure for devices
> on fsl-mc bus
> 
> On 20/05/18 14:49, Nipun Gupta wrote:
> > This patch adds support of dma configuration for devices on fsl-mc
> > bus using 'dma_configure' callback for busses. Also, directly calling
> > arch_setup_dma_ops is removed from the fsl-mc bus.
> 
> Looks like this is the final arch_setup_dma_ops offender, yay!
> 
> > Signed-off-by: Nipun Gupta 
> > Reviewed-by: Laurentiu Tudor 
> > ---
> >   drivers/bus/fsl-mc/fsl-mc-bus.c | 15 +++
> >   1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-
> bus.c
> > index 5d8266c..fa43c7d 100644
> > --- a/drivers/bus/fsl-mc/fsl-mc-bus.c
> > +++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
> > @@ -127,6 +127,16 @@ static int fsl_mc_bus_uevent(struct device *dev,
> struct kobj_uevent_env *env)
> > return 0;
> >   }
> >
> > +static int fsl_mc_dma_configure(struct device *dev)
> > +{
> > +   struct device *dma_dev = dev;
> > +
> > +   while (dev_is_fsl_mc(dma_dev))
> > +   dma_dev = dma_dev->parent;
> > +
> > +   return of_dma_configure(dev, dma_dev->of_node, 0);
> > +}
> > +
> >   static ssize_t modalias_show(struct device *dev, struct device_attribute
> *attr,
> >  char *buf)
> >   {
> > @@ -148,6 +158,7 @@ struct bus_type fsl_mc_bus_type = {
> > .name = "fsl-mc",
> > .match = fsl_mc_bus_match,
> > .uevent = fsl_mc_bus_uevent,
> > +   .dma_configure  = fsl_mc_dma_configure,
> > .dev_groups = fsl_mc_dev_groups,
> >   };
> >   EXPORT_SYMBOL_GPL(fsl_mc_bus_type);
> > @@ -633,10 +644,6 @@ int fsl_mc_device_add(struct fsl_mc_obj_desc
> *obj_desc,
> > goto error_cleanup_dev;
> > }
> >
> > -   /* Objects are coherent, unless 'no shareability' flag set. */
> > -   if (!(obj_desc->flags &
> FSL_MC_OBJ_FLAG_NO_MEM_SHAREABILITY))
> 
> Although it seems we do end up without any handling of this
> "non-coherent object behind coherent MC" case, and I'm not sure how
> easily that could be accommodated by generic code... :/ How important is
> the quirk?

We have all the devices as coherent in our SoC's now :) So this is fine.
We have internally discussed it.

Regards,
Nipun

> 
> Robin.
> 
> > -   arch_setup_dma_ops(_dev->dev, 0, 0, NULL, true);
> > -
> > /*
> >  * The device-specific probe callback will get invoked by device_add()
> >  */
> >


RE: [PATCH 1/7 v5] Docs: dt: add fsl-mc iommu-map device-tree binding

2018-07-06 Thread Nipun Gupta


> -Original Message-
> From: Robin Murphy [mailto:robin.mur...@arm.com]
> Sent: Tuesday, July 3, 2018 8:10 PM
> To: Nipun Gupta ; will.dea...@arm.com;
> robh...@kernel.org; r...@kernel.org; mark.rutl...@arm.com;
> catalin.mari...@arm.com; gre...@linuxfoundation.org; Laurentiu Tudor
> ; bhelg...@google.com
> Cc: h...@lst.de; j...@8bytes.org; m.szyprow...@samsung.com;
> shawn...@kernel.org; frowand.l...@gmail.com; iommu@lists.linux-
> foundation.org; linux-ker...@vger.kernel.org; devicet...@vger.kernel.org;
> linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org; linux-
> p...@vger.kernel.org; Bharat Bhushan ;
> stuyo...@gmail.com; Leo Li 
> Subject: Re: [PATCH 1/7 v5] Docs: dt: add fsl-mc iommu-map device-tree
> binding
> 
> On 20/05/18 14:49, Nipun Gupta wrote:
> > The existing IOMMU bindings cannot be used to specify the relationship
> > between fsl-mc devices and IOMMUs. This patch adds a generic binding for
> > mapping fsl-mc devices to IOMMUs, using iommu-map property.
> >
> > Signed-off-by: Nipun Gupta 
> > Reviewed-by: Rob Herring 
> > ---
> >   .../devicetree/bindings/misc/fsl,qoriq-mc.txt  | 39
> ++
> >   1 file changed, 39 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> > index 6611a7c..8cbed4f 100644
> > --- a/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> > +++ b/Documentation/devicetree/bindings/misc/fsl,qoriq-mc.txt
> > @@ -9,6 +9,25 @@ blocks that can be used to create functional hardware
> objects/devices
> >   such as network interfaces, crypto accelerator instances, L2 switches,
> >   etc.
> >
> > +For an overview of the DPAA2 architecture and fsl-mc bus see:
> > +drivers/staging/fsl-mc/README.txt
> 
> Nit: Looks like that's Documentation/networking/dpaa2/overview.rst now.
> 
> > +
> > +As described in the above overview, all DPAA2 objects in a DPRC share the
> > +same hardware "isolation context" and a 10-bit value called an ICID
> > +(isolation context id) is expressed by the hardware to identify
> > +the requester.
> > +
> > +The generic 'iommus' property is insufficient to describe the relationship
> > +between ICIDs and IOMMUs, so an iommu-map property is used to
> define
> > +the set of possible ICIDs under a root DPRC and how they map to
> > +an IOMMU.
> > +
> > +For generic IOMMU bindings, see
> > +Documentation/devicetree/bindings/iommu/iommu.txt.
> > +
> > +For arm-smmu binding, see:
> > +Documentation/devicetree/bindings/iommu/arm,smmu.txt.
> > +
> >   Required properties:
> >
> >   - compatible
> > @@ -88,14 +107,34 @@ Sub-nodes:
> > Value type: 
> > Definition: Specifies the phandle to the PHY device node 
> > associated
> > with the this dpmac.
> > +Optional properties:
> > +
> > +- iommu-map: Maps an ICID to an IOMMU and associated iommu-
> specifier
> > +  data.
> > +
> > +  The property is an arbitrary number of tuples of
> > +  (icid-base,iommu,iommu-base,length).
> > +
> > +  Any ICID i in the interval [icid-base, icid-base + length) is
> > +  associated with the listed IOMMU, with the iommu-specifier
> > +  (i - icid-base + iommu-base).
> >
> >   Example:
> >
> > +smmu: iommu@500 {
> > +   compatible = "arm,mmu-500";
> > +   #iommu-cells = <2>;
> 
> This should be 1 if stream-match-mask is present. Bad example is bad :)

Agree :). Ill update it.

> 
> Robin.
> 
> > +   stream-match-mask = <0x7C00>;
> > +   ...
> > +};
> > +
> >   fsl_mc: fsl-mc@80c00 {
> >   compatible = "fsl,qoriq-mc";
> >   reg = <0x0008 0x0c00 0 0x40>,/* MC portal 
> > base */
> > <0x 0x0834 0 0x4>; /* MC control 
> > reg */
> >   msi-parent = <>;
> > +/* define map for ICIDs 23-64 */
> > +iommu-map = <23  23 41>;
> >   #address-cells = <3>;
> >   #size-cells = <1>;
> >
> >


[PATCH v4 3/3] hwmon: Document the sensor enable attribute

2018-07-06 Thread Shilpasri G Bhat
Signed-off-by: Shilpasri G Bhat 
---
 Documentation/hwmon/sysfs-interface | 62 +
 1 file changed, 62 insertions(+)

diff --git a/Documentation/hwmon/sysfs-interface 
b/Documentation/hwmon/sysfs-interface
index fc337c3..f865dd7 100644
--- a/Documentation/hwmon/sysfs-interface
+++ b/Documentation/hwmon/sysfs-interface
@@ -171,6 +171,15 @@ in[0-*]_label  Suggested voltage channel label.
user-space.
RO
 
+in[0-*]_enable
+   Enable or disable the sensors.
+   When disabled the sensor read will return -ENODATA. This
+   attribute can be used as per-sensor or per-sub-group attribute
+   depending on what is supported by the hardware.
+   1: Enable
+   0: Disable
+   RW
+
 cpu[0-*]_vid   CPU core reference voltage.
Unit: millivolt
RO
@@ -236,6 +245,15 @@ fan[1-*]_label Suggested fan channel label.
In all other cases, the label is provided by user-space.
RO
 
+fan[1-*]_enable
+   Enable or disable the sensors.
+   When disabled the sensor read will return -ENODATA. This
+   attribute can be used as per-sensor or per-sub-group attribute
+   depending on what is supported by the hardware.
+   1: Enable
+   0: Disable
+   RW
+
 Also see the Alarms section for status flags associated with fans.
 
 
@@ -409,6 +427,15 @@ temp_reset_history
Reset temp_lowest and temp_highest for all sensors
WO
 
+temp[1-*]_enable
+   Enable or disable the sensors.
+   When disabled the sensor read will return -ENODATA. This
+   attribute can be used as per-sensor or per-sub-group attribute
+   depending on what is supported by the hardware.
+   1: Enable
+   0: Disable
+   RW
+
 Some chips measure temperature using external thermistors and an ADC, and
 report the temperature measurement as a voltage. Converting this voltage
 back to a temperature (or the other way around for limits) requires
@@ -468,6 +495,15 @@ curr_reset_history
Reset currX_lowest and currX_highest for all sensors
WO
 
+curr[1-*]_enable
+   Enable or disable the sensors.
+   When disabled the sensor read will return -ENODATA. This
+   attribute can be used as per-sensor or per-sub-group attribute
+   depending on what is supported by the hardware.
+   1: Enable
+   0: Disable
+   RW
+
 Also see the Alarms section for status flags associated with currents.
 
 *
@@ -566,6 +602,15 @@ power[1-*]_critCritical maximum power.
Unit: microWatt
RW
 
+power[1-*]_enable  Enable or disable the sensors.
+   When disabled the sensor read will return
+   -ENODATA. This attribute can be used as
+   per-sensor or per-sub-group attribute depending
+   on what is supported by the hardware.
+   1: Enable
+   0: Disable
+   RW
+
 Also see the Alarms section for status flags associated with power readings.
 
 **
@@ -576,6 +621,14 @@ energy[1-*]_input  Cumulative energy use
Unit: microJoule
RO
 
+energy[1-*]_enable Enable or disable the sensors.
+   When disabled the sensor read will return
+   -ENODATA. This attribute can be used as
+   per-sensor or per-sub-group attribute depending
+   on what is supported by the hardware.
+   1: Enable
+   0: Disable
+   RW
 
 
 * Humidity *
@@ -586,6 +639,15 @@ humidity[1-*]_inputHumidity
RO
 
 
+humidity[1-*]_enable   Enable or disable the sensors
+   When disabled the sensor read will return
+   -ENODATA. This attribute can be used as
+   per-sensor or per-sub-group attribute depending
+   on what is supported by the hardware.
+   1: Enable
+   0: Disable
+   RW
+
 **
 * Alarms *
 **
-- 
1.8.3.1



[PATCH v4 2/3] hwmon: ibmpowernv: Add attributes to enable/disable sensor groups

2018-07-06 Thread Shilpasri G Bhat
On-Chip-Controller(OCC) is an embedded micro-processor in POWER9 chip
which measures various system and chip level sensors. These sensors
comprises of environmental sensors (like power, temperature, current
and voltage) and performance sensors (like utilization, frequency).
All these sensors are copied to main memory at a regular interval of
100ms. OCC provides a way to select a group of sensors that is copied
to the main memory to increase the update frequency of selected sensor
groups. When a sensor-group is disabled, OCC will not copy it to main
memory and those sensors read 0 values.

This patch provides support for enabling/disabling the sensor groups
like power, temperature, current and voltage. This patch adds new
per-senor sysfs attribute to disable and enable them.

Signed-off-by: Shilpasri G Bhat 
---
Changes from v3:
- Add 'enable' attribute for each sensor sub-group

 Documentation/hwmon/ibmpowernv |  43 +++-
 drivers/hwmon/ibmpowernv.c | 225 +++--
 2 files changed, 234 insertions(+), 34 deletions(-)

diff --git a/Documentation/hwmon/ibmpowernv b/Documentation/hwmon/ibmpowernv
index 8826ba2..5646825 100644
--- a/Documentation/hwmon/ibmpowernv
+++ b/Documentation/hwmon/ibmpowernv
@@ -33,9 +33,48 @@ fanX_input   Measured RPM value.
 fanX_min   Threshold RPM for alert generation.
 fanX_fault 0: No fail condition
1: Failing fan
+
 tempX_inputMeasured ambient temperature.
 tempX_max  Threshold ambient temperature for alert generation.
-inX_input  Measured power supply voltage
+tempX_highest  Historical maximum temperature
+tempX_lowest   Historical minimum temperature
+tempX_enable   Enable/disable all temperature sensors belonging to the
+   sub-group. In POWER9, this attribute corresponds to
+   each OCC. Using this attribute each OCC can be asked to
+   disable/enable all of its temperature sensors.
+   1: Enable
+   0: Disable
+
+inX_input  Measured power supply voltage (millivolt)
 inX_fault  0: No fail condition.
1: Failing power supply.
-power1_input   System power consumption (microWatt)
+inX_highestHistorical maximum voltage
+inX_lowest Historical minimum voltage
+inX_enable Enable/disable all voltage sensors belonging to the
+   sub-group. In POWER9, this attribute corresponds to
+   each OCC. Using this attribute each OCC can be asked to
+   disable/enable all of its voltage sensors.
+   1: Enable
+   0: Disable
+
+powerX_input   Power consumption (microWatt)
+powerX_input_highest   Historical maximum power
+powerX_input_lowestHistorical minimum power
+powerX_enable  Enable/disable all power sensors belonging to the
+   sub-group. In POWER9, this attribute corresponds to
+   each OCC. Using this attribute each OCC can be asked to
+   disable/enable all of its power sensors.
+   1: Enable
+   0: Disable
+
+currX_inputMeasured current (milliampere)
+currX_highest  Historical maximum current
+currX_lowest   Historical minimum current
+currX_enable   Enable/disable all current sensors belonging to the
+   sub-group. In POWER9, this attribute corresponds to
+   each OCC. Using this attribute each OCC can be asked to
+   disable/enable all of its current sensors.
+   1: Enable
+   0: Disable
+
+energyX_input  Cumulative energy (microJoule)
diff --git a/drivers/hwmon/ibmpowernv.c b/drivers/hwmon/ibmpowernv.c
index f829dad..33cf7d2 100644
--- a/drivers/hwmon/ibmpowernv.c
+++ b/drivers/hwmon/ibmpowernv.c
@@ -90,11 +90,23 @@ struct sensor_data {
char label[MAX_LABEL_LEN];
char name[MAX_ATTR_LEN];
struct device_attribute dev_attr;
+   struct sensor_group_data *sgdata;
+};
+
+struct sensor_group_data {
+   struct mutex mutex;
+   const __be32 *phandles;
+   u32 gid;
+   u32 nr_phandle;
+   enum sensors type;
+   bool enable;
 };
 
 struct platform_data {
const struct attribute_group *attr_groups[MAX_SENSOR_TYPE + 1];
+   struct sensor_group_data *sg_data;
u32 sensors_count; /* Total count of sensors from each group */
+   u32 nr_sensor_groups; /* Total number of sensor sub-groups */
 };
 
 static ssize_t show_sensor(struct device *dev, struct device_attribute 
*devattr,
@@ -105,6 +117,9 @@ static ssize_t show_sensor(struct device *dev, struct 
device_attribute *devattr,
ssize_t ret;
u64 x;
 
+   if 

[PATCH v4 1/3] powernv:opal-sensor-groups: Add support to enable sensor groups

2018-07-06 Thread Shilpasri G Bhat
Adds support to enable/disable a sensor group at runtime. This
can be used to select the sensor groups that needs to be copied to
main memory by OCC. Sensor groups like power, temperature, current,
voltage, frequency, utilization can be enabled/disabled at runtime.

Signed-off-by: Shilpasri G Bhat 
---
 arch/powerpc/include/asm/opal-api.h|  1 +
 arch/powerpc/include/asm/opal.h|  2 ++
 .../powerpc/platforms/powernv/opal-sensor-groups.c | 28 ++
 arch/powerpc/platforms/powernv/opal-wrappers.S |  1 +
 4 files changed, 32 insertions(+)

diff --git a/arch/powerpc/include/asm/opal-api.h 
b/arch/powerpc/include/asm/opal-api.h
index 3bab299..56a94a1 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -206,6 +206,7 @@
 #define OPAL_NPU_SPA_CLEAR_CACHE   160
 #define OPAL_NPU_TL_SET161
 #define OPAL_SENSOR_READ_U64   162
+#define OPAL_SENSOR_GROUP_ENABLE   163
 #define OPAL_PCI_GET_PBCQ_TUNNEL_BAR   164
 #define OPAL_PCI_SET_PBCQ_TUNNEL_BAR   165
 #define OPAL_LAST  165
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index e1b2910..fc0550e 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -292,6 +292,7 @@ int64_t opal_imc_counters_init(uint32_t type, uint64_t 
address,
 int opal_get_power_shift_ratio(u32 handle, int token, u32 *psr);
 int opal_set_power_shift_ratio(u32 handle, int token, u32 psr);
 int opal_sensor_group_clear(u32 group_hndl, int token);
+int opal_sensor_group_enable(u32 group_hndl, int token, bool enable);
 
 s64 opal_signal_system_reset(s32 cpu);
 s64 opal_quiesce(u64 shutdown_type, s32 cpu);
@@ -326,6 +327,7 @@ extern int opal_async_wait_response_interruptible(uint64_t 
token,
struct opal_msg *msg);
 extern int opal_get_sensor_data(u32 sensor_hndl, u32 *sensor_data);
 extern int opal_get_sensor_data_u64(u32 sensor_hndl, u64 *sensor_data);
+extern int sensor_group_enable(u32 grp_hndl, bool enable);
 
 struct rtc_time;
 extern time64_t opal_get_boot_time(void);
diff --git a/arch/powerpc/platforms/powernv/opal-sensor-groups.c 
b/arch/powerpc/platforms/powernv/opal-sensor-groups.c
index 541c9ea..f7d04b6 100644
--- a/arch/powerpc/platforms/powernv/opal-sensor-groups.c
+++ b/arch/powerpc/platforms/powernv/opal-sensor-groups.c
@@ -32,6 +32,34 @@ struct sg_attr {
struct sg_attr *sgattrs;
 } *sgs;
 
+int sensor_group_enable(u32 handle, bool enable)
+{
+   struct opal_msg msg;
+   int token, ret;
+
+   token = opal_async_get_token_interruptible();
+   if (token < 0)
+   return token;
+
+   ret = opal_sensor_group_enable(handle, token, enable);
+   if (ret == OPAL_ASYNC_COMPLETION) {
+   ret = opal_async_wait_response(token, );
+   if (ret) {
+   pr_devel("Failed to wait for the async response\n");
+   ret = -EIO;
+   goto out;
+   }
+   ret = opal_error_code(opal_get_async_rc(msg));
+   } else {
+   ret = opal_error_code(ret);
+   }
+
+out:
+   opal_async_release_token(token);
+   return ret;
+}
+EXPORT_SYMBOL_GPL(sensor_group_enable);
+
 static ssize_t sg_store(struct kobject *kobj, struct kobj_attribute *attr,
const char *buf, size_t count)
 {
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S 
b/arch/powerpc/platforms/powernv/opal-wrappers.S
index a8d9b40..8268a1e 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -327,3 +327,4 @@ OPAL_CALL(opal_npu_tl_set,  
OPAL_NPU_TL_SET);
 OPAL_CALL(opal_pci_get_pbcq_tunnel_bar,
OPAL_PCI_GET_PBCQ_TUNNEL_BAR);
 OPAL_CALL(opal_pci_set_pbcq_tunnel_bar,
OPAL_PCI_SET_PBCQ_TUNNEL_BAR);
 OPAL_CALL(opal_sensor_read_u64,OPAL_SENSOR_READ_U64);
+OPAL_CALL(opal_sensor_group_enable,OPAL_SENSOR_GROUP_ENABLE);
-- 
1.8.3.1



[PATCH v4 0/3] hwmon: Add attributes to enable/disable sensors

2018-07-06 Thread Shilpasri G Bhat
This patch series adds new attribute to enable or disable a sensor in
runtime.

v3 : https://lkml.org/lkml/2018/7/5/476
v2 : https://lkml.org/lkml/2018/7/4/263
v1 : https://lkml.org/lkml/2018/3/22/214

Shilpasri G Bhat (3):
  powernv:opal-sensor-groups: Add support to enable sensor groups
  hwmon: ibmpowernv: Add attributes to enable/disable sensor groups
  hwmon: Document the sensor enable attribute

 Documentation/hwmon/ibmpowernv |  43 +++-
 Documentation/hwmon/sysfs-interface|  62 ++
 arch/powerpc/include/asm/opal-api.h|   1 +
 arch/powerpc/include/asm/opal.h|   2 +
 .../powerpc/platforms/powernv/opal-sensor-groups.c |  28 +++
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 drivers/hwmon/ibmpowernv.c | 225 ++---
 7 files changed, 328 insertions(+), 34 deletions(-)

-- 
1.8.3.1



Re: [PATCH 0/7 v5] Support for fsl-mc bus and its devices in SMMU

2018-07-06 Thread j...@8bytes.org
Hi Nipun,

On Thu, Jun 21, 2018 at 03:59:27AM +, Nipun Gupta wrote:
> Will this patch-set be taken for the next kernel release (and via which tree)?

I can take this through IOMMU tree if nobody objects. Please work out
the last remaining comment on patch 7 with Robin and then re-send with
all Acks and Reviewed-by: tags included.

Oh, and please use 'iommu/of:' prefix instead of 'iommu: of:' and follow
that sheme for the other patches/prefixes as well.

Thanks,

Joerg



RE: [PATCH 0/7 v5] Support for fsl-mc bus and its devices in SMMU

2018-07-06 Thread Nipun Gupta


Thanks Joro,

I will be sending it by mid of next week.

Regards,
Nipun

> -Original Message-
> From: j...@8bytes.org [mailto:j...@8bytes.org]
> Sent: Friday, July 6, 2018 4:43 PM
> To: Nipun Gupta 
> Cc: robin.mur...@arm.com; will.dea...@arm.com;
> gre...@linuxfoundation.org; h...@lst.de; m.szyprow...@samsung.com;
> shawn...@kernel.org; frowand.l...@gmail.com; iommu@lists.linux-
> foundation.org; linux-ker...@vger.kernel.org; devicet...@vger.kernel.org;
> linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org; linux-
> p...@vger.kernel.org; Bharat Bhushan ;
> stuyo...@gmail.com; Leo Li 
> Subject: Re: [PATCH 0/7 v5] Support for fsl-mc bus and its devices in SMMU
> 
> Hi Nipun,
> 
> On Thu, Jun 21, 2018 at 03:59:27AM +, Nipun Gupta wrote:
> > Will this patch-set be taken for the next kernel release (and via which 
> > tree)?
> 
> I can take this through IOMMU tree if nobody objects. Please work out
> the last remaining comment on patch 7 with Robin and then re-send with
> all Acks and Reviewed-by: tags included.
> 
> Oh, and please use 'iommu/of:' prefix instead of 'iommu: of:' and follow
> that sheme for the other patches/prefixes as well.
> 
> Thanks,
> 
>   Joerg



Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-06 Thread Kishon Vijay Abraham I
+Grygorii, linux-omap

On Friday 06 July 2018 02:06 PM, Lukas Wunner wrote:
> [cc += Kishon Vijay Abraham]
> 
> On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
>> OK, so calling devices_kset_move_last() from really_probe() clearly is
>> a mistake.
>>
>> I'm not really sure what the intention of it was as the changelog of
>> commit 52cdbdd49853d doesn't really explain that (why would it be
>> insufficient without that change?)
> 
> It seems 52cdbdd49853d fixed an issue with boards which have an MMC
> whose reset pin needs to be driven high on shutdown, lest the MMC
> won't be found on the next boot.
> 
> The boards' devicetrees use a kludge wherein the reset pin is modelled
> as a regulator.  The regulator is enabled when the MMC probes and
> disabled on driver unbind and shutdown.  As a result, the pin is driven
> low on shutdown and the MMC is not found on the next boot.
> 
> To fix this, another kludge was invented wherein the GPIO expander
> driving the reset pin unconditionally drives all its pins high on
> shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
> (commit adc284755055, "gpio: pcf857x: restore the initial line state
> of all pcf lines").
> 
> For this kludge to work, the GPIO expander's ->shutdown hook needs to
> be executed after the MMC expander's ->shutdown hook.
> 
> Commit 52cdbdd49853d achieved that by reordering devices_kset according
> to the probe order.  Apparently the MMC probes after the GPIO expander,
> possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't
> available yet, see mmc_regulator_get_supply().
> 
> Note, I'm just piecing the information together from git history,
> I'm not responsible for these kludges.  (I'm innocent!)
> 
> @Pingfan Liu, if you just remove the call to devices_kset_move_last()
> from really_probe(), does the issue go away?
> 
> If so, it might be best to remove that call and model the dependency
> with a call to device_link_add() in mmc_regulator_get_supply().

hmm.. had a quick look on this and looks like struct regulator is a regulator
frameworks internal data structure, so its members are not accessible outside.
struct regulator's device pointer is required for device_link_add().

Thanks
Kishon


Re: [PATCHv3 2/4] drivers/base: utilize device tree info to shutdown devices

2018-07-06 Thread Rafael J. Wysocki
On Friday, July 6, 2018 5:02:15 AM CEST Pingfan Liu wrote:
> On Thu, Jul 5, 2018 at 6:13 PM Rafael J. Wysocki  wrote:
> >
> > On Tuesday, July 3, 2018 8:50:40 AM CEST Pingfan Liu wrote:
> > > commit 52cdbdd49853 ("driver core: correct device's shutdown order")
> > > places an assumption of supplier<-consumer order on the process of probe.
> > > But it turns out to break down the parent <- child order in some scene.
> > > E.g in pci, a bridge is enabled by pci core, and behind it, the devices
> > > have been probed. Then comes the bridge's module, which enables extra
> > > feature(such as hotplug) on this bridge. This will break the
> > > parent<-children order and cause failure when "kexec -e" in some scenario.
> > >
> > > The detailed description of the scenario:
> > > An IBM Power9 machine on which, two drivers portdrv_pci and shpchp(a mod)
> > > match the PCI_CLASS_BRIDGE_PCI, but neither of them success to probe due
> > > to some issue. For this case, the bridge is moved after its children in
> > > devices_kset. Then, when "kexec -e", a ata-disk behind the bridge can not
> > > write back buffer in flight due to the former shutdown of the bridge which
> > > clears the BusMaster bit.
> > >
> > > It is a little hard to impose both "parent<-child" and 
> > > "supplier<-consumer"
> > > order on devices_kset. Take the following scene:
> > > step0: before a consumer's probing, (note child_a is supplier of 
> > > consumer_a)
> > >   [ consumer-X, child_a, , child_z] [... consumer_a, ..., consumer_z, 
> > > ...] supplier-X
> > >  ^^ affected range 
> > > ^^
> > > step1: when probing, moving consumer-X after supplier-X
> > >   [ child_a, , child_z] [ consumer_a, ..., consumer_z, ...] 
> > > supplier-X, consumer-X
> > > step2: the children of consumer-X should be re-ordered to maintain the seq
> > >   [... consumer_a, ..., consumer_z, ] supplier-X  [consumer-X, 
> > > child_a, , child_z]
> > > step3: the consumer_a should be re-ordered to maintain the seq
> > >   [... consumer_z, ...] supplier-X [ consumer-X, child_a, consumer_a ..., 
> > > child_z]
> > >
> > > It requires two nested recursion to drain out all out-of-order item in
> > > "affected range". To avoid such complicated code, this patch suggests
> > > to utilize the info in device tree, instead of using the order of
> > > devices_kset during shutdown. It iterates the device tree, and firstly
> > > shutdown a device's children and consumers. After this patch, the buggy
> > > commit is hollow and left to clean.
> > >
> > > Cc: Greg Kroah-Hartman 
> > > Cc: Rafael J. Wysocki 
> > > Cc: Grygorii Strashko 
> > > Cc: Christoph Hellwig 
> > > Cc: Bjorn Helgaas 
> > > Cc: Dave Young 
> > > Cc: linux-...@vger.kernel.org
> > > Cc: linuxppc-dev@lists.ozlabs.org
> > > Signed-off-by: Pingfan Liu 
> > > ---
> > >  drivers/base/core.c| 48 
> > > +++-
> > >  include/linux/device.h |  1 +
> > >  2 files changed, 44 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > > index a48868f..684b994 100644
> > > --- a/drivers/base/core.c
> > > +++ b/drivers/base/core.c
> > > @@ -1446,6 +1446,7 @@ void device_initialize(struct device *dev)
> > >   INIT_LIST_HEAD(>links.consumers);
> > >   INIT_LIST_HEAD(>links.suppliers);
> > >   dev->links.status = DL_DEV_NO_DRIVER;
> > > + dev->shutdown = false;
> > >  }
> > >  EXPORT_SYMBOL_GPL(device_initialize);
> > >
> > > @@ -2811,7 +2812,6 @@ static void __device_shutdown(struct device *dev)
> > >* lock is to be held
> > >*/
> > >   parent = get_device(dev->parent);
> > > - get_device(dev);
> >
> > Why is the get_/put_device() not needed any more?
> >
> They are moved upper layer into device_for_each_child_shutdown().
> Since there is lock breakage in __device_shutdown(), resorting to
> ref++ to protect the ancestor.  And I think the
> get_device(dev->parent) can be deleted either.

Wouldn't that break USB?

> > >   /*
> > >* Make sure the device is off the kset list, in the
> > >* event that dev->*->shutdown() doesn't remove it.
> > > @@ -2842,23 +2842,60 @@ static void __device_shutdown(struct device *dev)
> > >   dev_info(dev, "shutdown\n");
> > >   dev->driver->shutdown(dev);
> > >   }
> > > -
> > > + dev->shutdown = true;
> > >   device_unlock(dev);
> > >   if (parent)
> > >   device_unlock(parent);
> > >
> > > - put_device(dev);
> > >   put_device(parent);
> > >   spin_lock(_kset->list_lock);
> > >  }
> > >
> > > +/* shutdown dev's children and consumer firstly, then itself */
> > > +static int device_for_each_child_shutdown(struct device *dev)
> >
> > Confusing name.
> >
> > What about device_shutdown_subordinate()?
> >
> Fine. My understanding of words is not exact.
> 
> > > +{
> > > + struct klist_iter i;
> > > + struct 

Re: [PATCH v6 8/8] powernv/pseries: consolidate code for mce early handling.

2018-07-06 Thread Nicholas Piggin
On Wed, 04 Jul 2018 23:30:12 +0530
Mahesh J Salgaonkar  wrote:

> From: Mahesh Salgaonkar 
> 
> Now that other platforms also implements real mode mce handler,
> lets consolidate the code by sharing existing powernv machine check
> early code. Rename machine_check_powernv_early to
> machine_check_common_early and reuse the code.
> 
> Signed-off-by: Mahesh Salgaonkar 
> ---
>  arch/powerpc/kernel/exceptions-64s.S |   56 
> +++---
>  1 file changed, 11 insertions(+), 45 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index 0038596b7906..3e877ec55d50 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -243,14 +243,13 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
>   SET_SCRATCH0(r13)   /* save r13 */
>   EXCEPTION_PROLOG_0(PACA_EXMC)
>  BEGIN_FTR_SECTION
> - b   machine_check_powernv_early
> + b   machine_check_common_early
>  FTR_SECTION_ELSE
>   b   machine_check_pSeries_0
>  ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>  EXC_REAL_END(machine_check, 0x200, 0x100)
>  EXC_VIRT_NONE(0x4200, 0x100)
> -TRAMP_REAL_BEGIN(machine_check_powernv_early)
> -BEGIN_FTR_SECTION
> +TRAMP_REAL_BEGIN(machine_check_common_early)
>   EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
>   /*
>* Register contents:
> @@ -306,7 +305,9 @@ BEGIN_FTR_SECTION
>   /* Save r9 through r13 from EXMC save area to stack frame. */
>   EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
>   mfmsr   r11 /* get MSR value */
> +BEGIN_FTR_SECTION
>   ori r11,r11,MSR_ME  /* turn on ME bit */
> +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>   ori r11,r11,MSR_RI  /* turn on RI bit */
>   LOAD_HANDLER(r12, machine_check_handle_early)
>  1:   mtspr   SPRN_SRR0,r12
> @@ -325,7 +326,6 @@ BEGIN_FTR_SECTION
>   andcr11,r11,r10 /* Turn off MSR_ME */
>   b   1b
>   b   .   /* prevent speculative execution */
> -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>  
>  TRAMP_REAL_BEGIN(machine_check_pSeries)
>   .globl machine_check_fwnmi
> @@ -333,7 +333,7 @@ machine_check_fwnmi:
>   SET_SCRATCH0(r13)   /* save r13 */
>   EXCEPTION_PROLOG_0(PACA_EXMC)
>  BEGIN_FTR_SECTION
> - b   machine_check_pSeries_early
> + b   machine_check_common_early
>  END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
>  machine_check_pSeries_0:
>   EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
> @@ -346,45 +346,6 @@ machine_check_pSeries_0:
>  
>  TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
>  
> -TRAMP_REAL_BEGIN(machine_check_pSeries_early)
> -BEGIN_FTR_SECTION
> - EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
> - mr  r10,r1  /* Save r1 */
> - ld  r1,PACAMCEMERGSP(r13)   /* Use MC emergency stack */
> - subir1,r1,INT_FRAME_SIZE/* alloc stack frame*/
> - mfspr   r11,SPRN_SRR0   /* Save SRR0 */
> - mfspr   r12,SPRN_SRR1   /* Save SRR1 */
> - EXCEPTION_PROLOG_COMMON_1()
> - EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
> - EXCEPTION_PROLOG_COMMON_3(0x200)
> - addir3,r1,STACK_FRAME_OVERHEAD
> - BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */
> -
> - /* Move original SRR0 and SRR1 into the respective regs */
> - ld  r9,_MSR(r1)
> - mtspr   SPRN_SRR1,r9
> - ld  r3,_NIP(r1)
> - mtspr   SPRN_SRR0,r3
> - ld  r9,_CTR(r1)
> - mtctr   r9
> - ld  r9,_XER(r1)
> - mtxer   r9
> - ld  r9,_LINK(r1)
> - mtlrr9
> - REST_GPR(0, r1)
> - REST_8GPRS(2, r1)
> - REST_GPR(10, r1)
> - ld  r11,_CCR(r1)
> - mtcrr11
> - REST_GPR(11, r1)
> - REST_2GPRS(12, r1)
> - /* restore original r1. */
> - ld  r1,GPR1(r1)
> - SET_SCRATCH0(r13)   /* save r13 */
> - EXCEPTION_PROLOG_0(PACA_EXMC)
> - b   machine_check_pSeries_0
> -END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
> -
>  EXC_COMMON_BEGIN(machine_check_common)
>   /*
>* Machine check is different because we use a different
> @@ -483,6 +444,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>   bl  machine_check_early
>   std r3,RESULT(r1)   /* Save result */
>   ld  r12,_MSR(r1)
> +BEGIN_FTR_SECTION
> + bne 9f  /* pSeries: continue to V mode. */
> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)

Should this be "b 9f" ? Although...

>  
>  #ifdef   CONFIG_PPC_P7_NAP
>   /*
> @@ -564,7 +528,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early)
>  9:
>   /* Deliver the machine check to host kernel in V mode. */
>   MACHINE_CHECK_HANDLER_WINDUP
> - b   machine_check_pSeries
> + SET_SCRATCH0(r13)   /* save r13 */
> + EXCEPTION_PROLOG_0(PACA_EXMC)
> + b   machine_check_pSeries_0

I'm not sure that's quite right. You're missing out 

[PATCH v3 1/2] powerpc: Detect the presence of big-cores via "ibm, thread-groups"

2018-07-06 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

On IBM POWER9, the device tree exposes a property array identifed by
"ibm,thread-groups" which will indicate which groups of threads share a
particular set of resources.

As of today we only have one form of grouping identifying the group of
threads in the core that share the L1 cache, translation cache and
instruction data flow.

This patch defines the helper function to parse the contents of
"ibm,thread-groups" and a new structure to contain the parsed output.

The patch also creates the sysfs file named "small_core_siblings" that
returns the physical ids of the threads in the core that share the L1
cache, translation cache and instruction data flow.

Signed-off-by: Gautham R. Shenoy 
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |   8 ++
 arch/powerpc/include/asm/cputhreads.h  |  22 +++
 arch/powerpc/kernel/setup-common.c | 154 +
 arch/powerpc/kernel/sysfs.c|  35 +
 4 files changed, 219 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu 
b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 9c5e7732..62f24de 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -487,3 +487,11 @@ Description:   Information about CPU vulnerabilities
"Not affected"CPU is not affected by the vulnerability
"Vulnerable"  CPU is affected and no mitigation in effect
"Mitigation: $M"  CPU is affected and mitigation $M is in effect
+
+What:  /sys/devices/system/cpu/cpu[0-9]+/small_core_siblings
+Date:  05-Jul-2018
+KernelVersion: v4.18.0
+Contact:   Gautham R. Shenoy 
+Description:   List of Physical ids of CPUs which share the the L1 cache,
+   translation cache and instruction data-flow with this CPU.
+Values:Comma separated list of decimal integers.
diff --git a/arch/powerpc/include/asm/cputhreads.h 
b/arch/powerpc/include/asm/cputhreads.h
index d71a909..33226d7 100644
--- a/arch/powerpc/include/asm/cputhreads.h
+++ b/arch/powerpc/include/asm/cputhreads.h
@@ -23,11 +23,13 @@
 extern int threads_per_core;
 extern int threads_per_subcore;
 extern int threads_shift;
+extern bool has_big_cores;
 extern cpumask_t threads_core_mask;
 #else
 #define threads_per_core   1
 #define threads_per_subcore1
 #define threads_shift  0
+#define has_big_cores  0
 #define threads_core_mask  (*get_cpu_mask(0))
 #endif
 
@@ -69,12 +71,32 @@ static inline cpumask_t cpu_online_cores_map(void)
return cpu_thread_mask_to_cores(cpu_online_mask);
 }
 
+#define MAX_THREAD_LIST_SIZE   8
+struct thread_groups {
+   unsigned int property;
+   unsigned int nr_groups;
+   unsigned int threads_per_group;
+   unsigned int thread_list[MAX_THREAD_LIST_SIZE];
+};
+
 #ifdef CONFIG_SMP
 int cpu_core_index_of_thread(int cpu);
 int cpu_first_thread_of_core(int core);
+int parse_thread_groups(struct device_node *dn, struct thread_groups *tg);
+int get_cpu_thread_group_start(int cpu, struct thread_groups *tg);
 #else
 static inline int cpu_core_index_of_thread(int cpu) { return cpu; }
 static inline int cpu_first_thread_of_core(int core) { return core; }
+static inline int parse_thread_groups(struct device_node *dn,
+ struct thread_groups *tg)
+{
+   return -ENODATA;
+}
+
+static inline int get_cpu_thread_group_start(int cpu, struct thread_groups *tg)
+{
+   return -1;
+}
 #endif
 
 static inline int cpu_thread_in_core(int cpu)
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 40b44bb..989edc1 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -402,10 +402,12 @@ void __init check_for_initrd(void)
 #ifdef CONFIG_SMP
 
 int threads_per_core, threads_per_subcore, threads_shift;
+bool has_big_cores;
 cpumask_t threads_core_mask;
 EXPORT_SYMBOL_GPL(threads_per_core);
 EXPORT_SYMBOL_GPL(threads_per_subcore);
 EXPORT_SYMBOL_GPL(threads_shift);
+EXPORT_SYMBOL_GPL(has_big_cores);
 EXPORT_SYMBOL_GPL(threads_core_mask);
 
 static void __init cpu_init_thread_core_maps(int tpc)
@@ -433,6 +435,152 @@ static void __init cpu_init_thread_core_maps(int tpc)
 
 u32 *cpu_to_phys_id = NULL;
 
+/*
+ * parse_thread_groups: Parses the "ibm,thread-groups" device tree
+ *  property for the CPU device node @dn and stores
+ *  the parsed output in the thread_groups
+ *  structure @tg.
+ *
+ * @dn: The device node of the CPU device.
+ * @tg: Pointer to a thread group structure into which the parsed
+ * output of "ibm,thread-groups" is stored.
+ *
+ * ibm,thread-groups[0..N-1] array defines which group of threads in
+ * the CPU-device node can be grouped together based on the property.
+ *
+ * ibm,thread-groups[0] tells us the property based 

[PATCH v3 2/2] powerpc: Enable CPU_FTR_ASYM_SMT for interleaved big-cores

2018-07-06 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

A pair of IBM POWER9 SMT4 cores can be fused together to form a big-core
with 8 SMT threads. This can be discovered via the "ibm,thread-groups"
CPU property in the device tree which will indicate which group of
threads that share the L1 cache, translation cache and instruction data
flow. If there are multiple such group of threads, then the core is a
big-core.

Furthermore, if the thread-ids of the threads of the big-core can be
obtained by interleaving the thread-ids of the thread-groups
(component small core), then such a big-core is called an interleaved
big-core.

Eg: Threads in the pair of component SMT4 cores of an interleaved
big-core are numbered {0,2,4,6} and {1,3,5,7} respectively.

The SMT4 cores forming a big-core are more or less independent
units. Thus when multiple tasks are scheduled to run on the fused
core, we get the best performance when the tasks are spread across the
pair of SMT4 cores.

This patch enables CPU_FTR_ASYM_SMT bit in the cpu-features on
detecting the presence of interleaved big-cores at boot up. This will
will bias the load-balancing of tasks on smaller numbered threads,
which will automatically result in spreading the tasks uniformly
across the associated pair of SMT4 cores.

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/setup-common.c | 67 +-
 1 file changed, 66 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 989edc1..c200fb8 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -581,6 +581,56 @@ int get_cpu_thread_group_start(int cpu, struct 
thread_groups *tg)
return -1;
 }
 
+/*
+ * check_interleaved_big_core - Checks if the thread group tg
+ * corresponds to a big-core whose threads are interleavings of the
+ * threads of the component small cores.
+ *
+ * @tg: A thread-group struct for the core.
+ *
+ * Returns true if the core is a interleaved big-core.
+ * Returns false otherwise.
+ */
+static inline bool check_interleaved_big_core(struct thread_groups *tg)
+{
+   int nr_groups;
+   int threads_per_group;
+   int cur_cpu, next_cpu, i, j;
+
+   nr_groups = tg->nr_groups;
+   threads_per_group = tg->threads_per_group;
+
+   if (tg->property != 1)
+   return false;
+
+   if (nr_groups < 2 || threads_per_group < 2)
+   return false;
+
+   /*
+* In case of an interleaved big-core, the thread-ids of the
+* big-core can be obtained by interleaving the the thread-ids
+* of the component small
+*
+* Eg: On a 8-thread big-core with two SMT4 small cores, the
+* threads of the two component small cores will be
+* {0, 2, 4, 6} and {1, 3, 5, 7}.
+*/
+   for (i = 0; i < nr_groups; i++) {
+   int group_start = i * threads_per_group;
+
+   for (j = 0; j < threads_per_group - 1; j++) {
+   int cur_idx = group_start + j;
+
+   cur_cpu = tg->thread_list[cur_idx];
+   next_cpu = tg->thread_list[cur_idx + 1];
+   if (next_cpu != cur_cpu + nr_groups)
+   return false;
+   }
+   }
+
+   return true;
+}
+
 /**
  * setup_cpu_maps - initialize the following cpu maps:
  *  cpu_possible_mask
@@ -604,6 +654,7 @@ void __init smp_setup_cpu_maps(void)
struct device_node *dn;
int cpu = 0;
int nthreads = 1;
+   bool has_interleaved_big_cores = true;
 
has_big_cores = true;
DBG("smp_setup_cpu_maps()\n");
@@ -657,6 +708,12 @@ void __init smp_setup_cpu_maps(void)
 
if (has_big_cores && !dt_has_big_core(dn, )) {
has_big_cores = false;
+   has_interleaved_big_cores = false;
+   }
+
+   if (has_interleaved_big_cores) {
+   has_interleaved_big_cores =
+   check_interleaved_big_core();
}
 
if (cpu >= nr_cpu_ids) {
@@ -713,7 +770,15 @@ void __init smp_setup_cpu_maps(void)
vdso_data->processorCount = num_present_cpus();
 #endif /* CONFIG_PPC64 */
 
-/* Initialize CPU <=> thread mapping/
+   if (has_interleaved_big_cores) {
+   int key = __builtin_ctzl(CPU_FTR_ASYM_SMT);
+
+   cur_cpu_spec->cpu_features |= CPU_FTR_ASYM_SMT;
+   static_branch_enable(_feature_keys[key]);
+   pr_info("Detected interleaved big-cores\n");
+   }
+
+   /* Initialize CPU <=> thread mapping/
 *
 * WARNING: We assume that the number of threads is the same for
 * every CPU in the system. If that is not the case, then some code
-- 
1.9.4



[PATCH v3 0/2] powerpc: Detection and scheduler optimization for POWER9 bigcore

2018-07-06 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Hi,

This is the third iteration of the patchset to add support for
big-core on POWER9.

The previous versions can be found here:

v2: https://lkml.org/lkml/2018/7/3/401
v1: https://lkml.org/lkml/2018/5/11/245

Changes : v2 --> v3
- Set sane values in the tg->property, tg->nr_groups inside
parse_thread_groups before returning due to an error.
- Define a helper function to determine whether a CPU device node
  is a big-core or not.
- Updated the comments around the functions to describe the
  arguments passed to them.

These changes were suggested by Murilo Opsfelder Araujo.

v1 --> v2
- Added comments explaining the "ibm,thread-groups" device tree property.
- Uses cleaner device-tree parsing functions to parse the u32 arrays.
- Adds a sysfs file listing the small-core siblings for every CPU.
- Enables the scheduler optimization by setting the CPU_FTR_ASYM_SMT bit
  in the cur_cpu_spec->cpu_features on detecting the presence
  of interleaved big-core.
- Handles the corner case where there is only a single thread-group
  or when there is a single thread in a thread-group.

Description:

A pair of IBM POWER9 SMT4 cores can be fused together to form a
big-core with 8 SMT threads. This can be discovered via the
"ibm,thread-groups" CPU property in the device tree which will
indicate which group of threads that share the L1 cache, translation
cache and instruction data flow.  If there are multiple such group of
threads, then the core is a big-core. Furthermore, the thread-ids of
such a big-core is obtained by interleaving the thread-ids of the
component SMT4 cores.

Eg: Threads in the pair of component SMT4 cores of an interleaved
big-core are numbered {0,2,4,6} and {1,3,5,7} respectively.

On such a big-core, when multiple tasks are scheduled to run on the
big-core, we get the best performance when the tasks are spread across
the pair of SMT4 cores.

The Linux scheduler supports a flag called "SD_ASYM_PACKING" which
when set in the SMT sched-domain, biases the load-balancing of the
tasks on the smaller numbered threads in the core. On an big-core
whose threads are interleavings of the threads of the small cores,
enabling SD_ASYM_PACKING in the SMT sched-domain automatically results
in spreading the tasks uniformly across the associated pair of SMT4
cores, thereby yielding better performance.

This patchset contains two patches which on detecting the presence of
interleaved big-cores will enable the the CPU_FTR_ASYM_SMT bit in the
cur_cpu_spec->cpu_feature.

Patch 1: adds support to detect the presence of
big-cores and reports the small-core siblings of each CPU X
via the sysfs file "/sys/devices/system/cpu/cpuX/big_core_siblings".

Patch 2: checks if the thread-ids of the component small-cores are
interleaved, in which case we enable the the CPU_FTR_ASYM_SMT bit in
the cur_cpu_spec->cpu_features which results in the SD_ASYM_PACKING
flag being set at the SMT level sched-domain.

Results:
~
Experimental results for ebizzy with 2 threads, bound to a single big-core
show a marked improvement with this patchset over the 4.18-rc3 vanilla
kernel.

The result of 100 such runs for 4.18-rc3 kernel and the 4.18-rc3 +
big-core-patches are as follows

4.18-rc3:
~~~
records/s:  # samples  : Histogram
~~~
[0 -   100]  :  0  : #
[100 - 200]  :  6  : ##
[200 - 300]  :  8  : ##
[300 - 400]  :  15 : 
[400 - 500]  :  0  : #
[500 - 600]  :  71 : ###

4.18-rc3 + big-core-patches
~~~
records/s:  # samples  : Histogram
~~~
[0 -   100]  :  0  : #
[100 - 200]  :  0  : #
[200 - 300]  :  9  : ##
[300 - 400]  :  0  : #
[400 - 500]  :  0  : #
[500 - 600]  :  91 : ###


Gautham R. Shenoy (2):
  powerpc: Detect the presence of big-cores via "ibm,thread-groups"
  powerpc: Enable CPU_FTR_ASYM_SMT for interleaved big-cores

 Documentation/ABI/testing/sysfs-devices-system-cpu |   8 +
 arch/powerpc/include/asm/cputhreads.h  |  22 ++
 arch/powerpc/kernel/setup-common.c | 221 -
 arch/powerpc/kernel/sysfs.c|  35 
 4 files changed, 285 insertions(+), 1 deletion(-)

-- 
1.9.4



Re: [PATCHv3 0/4] drivers/base: bugfix for supplier<-consumer ordering in device_kset

2018-07-06 Thread Rafael J. Wysocki
On Fri, Jul 6, 2018 at 10:36 AM, Lukas Wunner  wrote:
> [cc += Kishon Vijay Abraham]
>
> On Thu, Jul 05, 2018 at 11:18:28AM +0200, Rafael J. Wysocki wrote:
>> OK, so calling devices_kset_move_last() from really_probe() clearly is
>> a mistake.
>>
>> I'm not really sure what the intention of it was as the changelog of
>> commit 52cdbdd49853d doesn't really explain that (why would it be
>> insufficient without that change?)
>
> It seems 52cdbdd49853d fixed an issue with boards which have an MMC
> whose reset pin needs to be driven high on shutdown, lest the MMC
> won't be found on the next boot.
>
> The boards' devicetrees use a kludge wherein the reset pin is modelled
> as a regulator.  The regulator is enabled when the MMC probes and
> disabled on driver unbind and shutdown.  As a result, the pin is driven
> low on shutdown and the MMC is not found on the next boot.
>
> To fix this, another kludge was invented wherein the GPIO expander
> driving the reset pin unconditionally drives all its pins high on
> shutdown, see pcf857x_shutdown() in drivers/gpio/gpio-pcf857x.c
> (commit adc284755055, "gpio: pcf857x: restore the initial line state
> of all pcf lines").
>
> For this kludge to work, the GPIO expander's ->shutdown hook needs to
> be executed after the MMC expander's ->shutdown hook.
>
> Commit 52cdbdd49853d achieved that by reordering devices_kset according
> to the probe order.  Apparently the MMC probes after the GPIO expander,
> possibly because it returns -EPROBE_DEFER if the vmmc regulator isn't
> available yet, see mmc_regulator_get_supply().
>
> Note, I'm just piecing the information together from git history,
> I'm not responsible for these kludges.  (I'm innocent!)

Sure enough. :-)

In any case, calling devices_kset_move_last() in really_probe() is
plain broken and if its only purpose was to address a single, arguably
kludgy, use case, let's just get rid of it in the first place IMO.

> @Pingfan Liu, if you just remove the call to devices_kset_move_last()
> from really_probe(), does the issue go away?

I would think so from the description of the problem (elsewhere in this thread).

Thanks,
Rafael


Re: [PATCH kernel v4 2/2] KVM: PPC: Check if IOMMU page is contained in the pinned physical page

2018-07-06 Thread David Gibson
On Fri, Jul 06, 2018 at 04:00:30PM +1000, Michael Ellerman wrote:
> Alexey Kardashevskiy  writes:
> 
> > diff --git a/arch/powerpc/mm/mmu_context_iommu.c 
> > b/arch/powerpc/mm/mmu_context_iommu.c
> > index abb4364..11e1029 100644
> > --- a/arch/powerpc/mm/mmu_context_iommu.c
> > +++ b/arch/powerpc/mm/mmu_context_iommu.c
> > @@ -159,6 +161,7 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long 
> > ua, unsigned long entries,
> > goto unlock_exit;
> > }
> >  
> > +   mem->pageshift = __builtin_ctzl(ua | (entries << PAGE_SHIFT));
> 
> __builtin_ctzl(0) is undefined, so are we guaranteed that
> (ua | (entries << PAGE_SHIFT)) is never zero? I couldn't convince
> myself.

We certainly don't need to care about that case, since registering a
zero-length region (entries == 0) is no-op.  But maybe we need to
filter it above.

> > @@ -199,9 +202,17 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long 
> > ua, unsigned long entries,
> > }
> > }
> >  populate:
> > +   pageshift = PAGE_SHIFT;
> > +   if (PageCompound(page))
> > +   pageshift += compound_order(compound_head(page));
> > +   mem->pageshift = min(mem->pageshift, pageshift);
> > mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
> > }
> >  
> > +   /* We have an incomplete huge page, default to PAGE_SHIFT */
> > +   if (head)
> > +   mem->pageshift = PAGE_SHIFT;
> > +
> 
> You never set head AFIACS? (other than in the initialiser)

That looks like a leftover from the previous version.

> 
> cheers
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [PATCH kernel v4 2/2] KVM: PPC: Check if IOMMU page is contained in the pinned physical page

2018-07-06 Thread Michael Ellerman
Alexey Kardashevskiy  writes:

> diff --git a/arch/powerpc/mm/mmu_context_iommu.c 
> b/arch/powerpc/mm/mmu_context_iommu.c
> index abb4364..11e1029 100644
> --- a/arch/powerpc/mm/mmu_context_iommu.c
> +++ b/arch/powerpc/mm/mmu_context_iommu.c
> @@ -159,6 +161,7 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, 
> unsigned long entries,
>   goto unlock_exit;
>   }
>  
> + mem->pageshift = __builtin_ctzl(ua | (entries << PAGE_SHIFT));

__builtin_ctzl(0) is undefined, so are we guaranteed that
(ua | (entries << PAGE_SHIFT)) is never zero? I couldn't convince
myself.

> @@ -199,9 +202,17 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long 
> ua, unsigned long entries,
>   }
>   }
>  populate:
> + pageshift = PAGE_SHIFT;
> + if (PageCompound(page))
> + pageshift += compound_order(compound_head(page));
> + mem->pageshift = min(mem->pageshift, pageshift);
>   mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
>   }
>  
> + /* We have an incomplete huge page, default to PAGE_SHIFT */
> + if (head)
> + mem->pageshift = PAGE_SHIFT;
> +

You never set head AFIACS? (other than in the initialiser)

cheers