date:20220609

[PATCH v2] powerpc/perf: Give generic PMU a nice name

2022-06-09 Thread Joel Stanley

When booting on a machine that uses the compat pmu driver we see this:

 [0.071192] GENERIC_COMPAT performance monitor hardware support registered

Which is a bit shouty. Give it a nicer name.

Signed-off-by: Joel Stanley 
---
v2: Go with ISAv3

 arch/powerpc/perf/generic-compat-pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/generic-compat-pmu.c 
b/arch/powerpc/perf/generic-compat-pmu.c
index f3db88aee4dd..16392962c511 100644
--- a/arch/powerpc/perf/generic-compat-pmu.c
+++ b/arch/powerpc/perf/generic-compat-pmu.c
@@ -292,7 +292,7 @@ static int generic_compute_mmcr(u64 event[], int n_ev,
 }
 
 static struct power_pmu generic_compat_pmu = {
-   .name   = "GENERIC_COMPAT",
+   .name   = "ISAv3",
.n_counter  = MAX_PMU_COUNTERS,
.add_fields = ISA207_ADD_FIELDS,
.test_adder = ISA207_TEST_ADDER,
-- 
2.35.1

Re: [PATCH] powerpc/perf: Give generic PMU a nice name

2022-06-09 Thread Joel Stanley

On Tue, 31 May 2022 at 08:53, Madhavan Srinivasan  wrote:
>
>
> On 5/26/22 12:07 PM, Joel Stanley wrote:
> > When booting on a machine that uses the compat pmu driver we see this:
> >
> >   [0.071192] GENERIC_COMPAT performance monitor hardware support 
> > registered
> Sorry that was my mistake.
> I agree having it as ISAv3 is better.

Okay. The downside of this is it's not as clear that you're using a
fallback driver.

I'll send a v2 with ISAv3

>
> Maddy
>
> >
> > Which is a bit shouty. Give it a nicer name.
> >
> > Signed-off-by: Joel Stanley 
> > ---
> >
> > Other options:
> >
> >   - ISAv3 (because it is relevant for PowerISA 3.0B and beyond, see the
> > comment in init_generic_compat_pmu)
> >
> >   - Generic Compat (same, but less shouty)
> >
> >   arch/powerpc/perf/generic-compat-pmu.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/perf/generic-compat-pmu.c 
> > b/arch/powerpc/perf/generic-compat-pmu.c
> > index f3db88aee4dd..5be5a5ebaf42 100644
> > --- a/arch/powerpc/perf/generic-compat-pmu.c
> > +++ b/arch/powerpc/perf/generic-compat-pmu.c
> > @@ -292,7 +292,7 @@ static int generic_compute_mmcr(u64 event[], int n_ev,
> >   }
> >
> >   static struct power_pmu generic_compat_pmu = {
> > - .name   = "GENERIC_COMPAT",
> > + .name   = "Architected",
> >   .n_counter  = MAX_PMU_COUNTERS,
> >   .add_fields = ISA207_ADD_FIELDS,
> >   .test_adder = ISA207_TEST_ADDER,

Re: [PATCH 1/6] powerpc: Add ZERO_GPRS macros for register clears

2022-06-09 Thread Rohan McLure

> On 2 Jun 2022, at 2:00 am, Segher Boessenkool  
> wrote:
> 
> Hi!
> 
> On Wed, Jun 01, 2022 at 03:48:45PM +1000, Rohan McLure wrote:
>> +.macro BINOP_REGS op, rhs, start, end
>> +.Lreg=\start
>> +.rept (\end - \start + 1)
>> +\op .Lreg, \rhs
>> +.Lreg=.Lreg+1
>> +.endr
>> +.endm
> 
> This is for unary operations, not binary operations (there is only one
> item on the RHS).  You can in principle put a string "a,b" in the rhs
> parameter, but in practice you need a or b to depend on the loop counter
> as well, so even such trickiness won't do.  Make the naming less
> confusing, maybe?  Or don't have an unused extra level of abstraction in
> the first place :-)
> 
> 
> Segher

Thanks Segher, Christophe for reviewing this.

Yep I see how having a macro to perform rX = rX <> Y for arbitrary infix <> and 
operand
is unlikely to find much use outside of ZERO_GPRS. As I resubmit this patch 
series I
will rename it to ZERO_REGS or similar to be more explicitly coupled to 
ZERO_GPRS.

Something like this I was thinking:

.macro ZERO_REGS start, end
.Lreg=\start
.rept (\end - \start + 1)
li  .Lreg, 0
.Lreg=.Lreg+1
.endr
.endm

Thanks,
Rohan

Re: [PATCH 11/12] powerpc: wiiu: don't enforce flat memory

2022-06-09 Thread Pali Rohár

On Friday 20 May 2022 14:30:02 Pali Rohár wrote:
> + linux-mm
> 
> Do you know what are requirements for kernel to support non-contiguous
> memory support and what is needed to enable it for 32-bit powerpc?

Any hints?

> Currently powerpc arch code does not support "memblock.memory.cnt > 1"
> except for WII which seems like a hack... See below.
> 
> On Friday 20 May 2022 20:44:04 Ash Logan wrote:
> > On 20/5/22 18:04, Pali Rohár wrote:
> > > On Friday 20 May 2022 13:41:04 Ash Logan wrote:
> > >> On 14/5/22 08:43, Pali Rohár wrote:
> > >>> On Wednesday 02 March 2022 15:44:05 Ash Logan wrote:
> >  pgtable_32.c:mapin_ram loops over each valid memory range, which means
> >  non-contiguous memory just works.
> > >>>
> > >>> Hello! Does it mean that non-contiguous memory works for any 32-bit
> > >>> powerpc platform, and not only for wiiu? If yes, should not be
> > >>> non-contiguous memory support enabled for all 32-bit ppc boards then?
> > >>
> > >> Hi! Sorry for my delayed response. As best I can tell, it does indeed
> > >> Just Work, but I have only been able to test on wiiu which is missing a
> > >> lot of features other boards have (like PCI) - so it's possible there's
> > >> still an assumption elsewhere in the kernel that I haven't hit.
> > >>
> > >> As best I can tell, the Wii and Wii U are the only 32-bit powerpc boards
> > >> out there where it's even possible to have non-contiguous memory.
> > > 
> > > What is the reason that those two boards are the **only**? Is there some
> > > specific requirement from bootloader or hardware to "enable"
> > > non-contiguous memory support?
> > 
> > Not that I know of, I was just saying that I was only aware of those two
> > boards where the memory map isn't contiguous, and that is the only place
> > where it has been tested. Evidently you know of another board!
> > 
> > > I'm interested in enabling non-contiguous memory support for P2020-based
> > > board as it has gaps in its 32-bit memory layout and which could be used
> > > for RAM mapping when 4GB DDR3 module is plugged in (default is 2GB).
> > 
> > If it's like the Wii or Wii U (some memory at 0, a gap for MMIO or
> > whatever, then more memory at a higher address) then you should try a
> > patch along these lines, because barring the unknowns I mentioned before
> > it should work. At least as far as I'm aware ;)
> > 
> >  Signed-off-by: Ash Logan 
> >  ---
> >   arch/powerpc/mm/init_32.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> >  diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
> >  index 3d690be48e84..59a84629d9a0 100644
> >  --- a/arch/powerpc/mm/init_32.c
> >  +++ b/arch/powerpc/mm/init_32.c
> >  @@ -125,10 +125,10 @@ void __init MMU_init(void)
> >  * lowmem_end_addr is initialized below.
> >  */
> > if (memblock.memory.cnt > 1) {
> >  -#ifndef CONFIG_WII
> >  +#if !defined(CONFIG_WII) && !defined(CONFIG_WIIU)
> > 
> >  memblock_enforce_memory_limit(memblock.memory.regions[0].size);
> > pr_warn("Only using first contiguous memory region\n");
> >  -#else
> >  +#elif defined(CONFIG_WII)
> > wii_memory_fixups();
> >   #endif
> > }
> >  -- 
> >  2.35.1
> >

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Tyrel Datwyler

On 6/9/22 13:21, Guilherme G. Piccoli wrote:
> First of all, thanks for looping me Bjorn! Much appreciated.
> I'm also CCing Ben and Gavin, that know a lot of PPC PCI stuff.
> 
> 
> On 09/06/2022 16:34, Bjorn Helgaas wrote:
>> [...]
>>> Upgrading powerpc kernels from LTS 4.4 version (which does not
>>> contain mentioned commit) to new LTS versions brings a
>>> regression in domain assignment.
>>
>> Can you elaborate why it is a regression ?
>> 63a72284b159 That commit says 'no functionnal changes', I'm
>> having hard time understanding how a nochange can be a
>> regression.
>
> It is not 'no functional change'. That commit completely changed
> PCI domain assignment in a way that is incompatible with other
> architectures and also incompatible with the way how it was done
> prior that commit.

 I agree that the "no functional change" statement is incorrect.
 However, for most powerpc platforms it ended up being simply a
 cosmetic behavior change. As far as I can tell there is nothing
 requiring domain ids to increase montonically from zero or that
 each architecture is required to use the same domain numbering
 scheme.
> 
> Strongly agree here in both points: first, this was not a "no functional
> change" thing, and I apologize for adding this in the commit message.
> What I meant is that: despite changing the numbering, (as Tyrel said)
> nothing should require increasing monotonic mutable PCI domains. At
> least, I'm not aware of such requirement in any spec or even in the
> kernel and adjacent tooling.
> 
> 
>>> [...]
 We could properly limit it to powernv and pseries by using
 ibm,fw-phb-id instead of reg property in the look up that follows
 a failed ibm,opal-phbid lookup. I think this is acceptable as long
 as no other powerpc platforms have started using this behavior for
 persistent naming.
>>>
>>> And what about setting that new config option to enabled by default
>>> for those series?
> 
> I don't remember all the details about PPC dt, but it should already be
> restricted to pseries/powernv, right? At least, the commit has a comment:
> 
> /* If not pseries nor powernv, or if fixed PHB numbering tried to add
>  * the same PHB number twice, then fallback to dynamic PHB numbering.*/
> 
> If this is *not* restricted to these 2 platforms, I agree with Pali's
> approach, although I'd consider the correct is to keep the persistent
> domain scheme for both pnv and pseries, as it's working like this for 5
> years and counting, and this *does* prevent a lot of issues with PCI
> hotplugging in PPC servers.

I mentioned this in a previous post, but it is clear the Author's intent was for
this only to apply to powernv and pseries platforms. However, it only really
checks for powernv, and if that fails it does a read the reg property for the
domain which works for and PPC platform. If we really only want this on powernv
and pseries and revert all other PPC platforms back we can fix this with a
pseries check instead of a config property. Using ibm,fw-phb-id instead of reg
property if ibm,opal-phbid lookup fails does the trick.

-Tyrel

> 
> 
>>> [...]
 I forgot to ask before about the actual regression here.  The commit
 log says domain numbers are different, but I don't know the connection
 from there to something failing.  I assume there's some script or
 config file that depends on specific domain numbers?  And that
 dependency is (hopefully) powerpc-specific?
>>>
>>> You assume correct. For example this is the way how OpenWRT handles PCI
>>> devices (but not only OpenWRT). This OpenWRT case is not
>>> powerpc-specific but generic to all architectures. This is just one
>>> example.
>>
>> So basically everybody uses D/b/d/f for persistent names.  That's ...
>> well, somewhat stable for things soldered down or in a motherboard
>> slot, but a terrible idea for things that can be hot-plugged.
>>
>> Even for more core things, it's possible for firmware to change bus
>> numbering between boots.  For example, if a complicated hierarchy is
>> cold-plugged into one slot, firmware is likely to assign different bus
>> numbers on the next boot to make room for it.  Obviously this can also
>> happen as a hot-add, and Linux needs the flexibility to do similar
>> renumbering then, although we don't support it yet.
>>
>> It looks like 63a72284b159 was intended to make domain numbers *more*
>> consistent, so it's ironic that this actually broke something by
>> changing domain numbers.  Maybe there's a way to limit the scope of
>> 63a72284b159 so it avoids the breakage.  I don't know enough about the
>> powerpc landscape to even guess at how.
> 
> I don't considereit breaks the userspace since this is definitely no
> stable ABI (or if it is, I'm not aware and my apologies). If scripts
> rely on

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Guilherme G. Piccoli

First of all, thanks for looping me Bjorn! Much appreciated.
I'm also CCing Ben and Gavin, that know a lot of PPC PCI stuff.

On 09/06/2022 16:34, Bjorn Helgaas wrote:
> [...]
>> Upgrading powerpc kernels from LTS 4.4 version (which does not
>> contain mentioned commit) to new LTS versions brings a
>> regression in domain assignment.
>
> Can you elaborate why it is a regression ?
> 63a72284b159 That commit says 'no functionnal changes', I'm
> having hard time understanding how a nochange can be a
> regression.

 It is not 'no functional change'. That commit completely changed
 PCI domain assignment in a way that is incompatible with other
 architectures and also incompatible with the way how it was done
 prior that commit.
>>>
>>> I agree that the "no functional change" statement is incorrect.
>>> However, for most powerpc platforms it ended up being simply a
>>> cosmetic behavior change. As far as I can tell there is nothing
>>> requiring domain ids to increase montonically from zero or that
>>> each architecture is required to use the same domain numbering
>>> scheme.

Strongly agree here in both points: first, this was not a "no functional
change" thing, and I apologize for adding this in the commit message.
What I meant is that: despite changing the numbering, (as Tyrel said)
nothing should require increasing monotonic mutable PCI domains. At
least, I'm not aware of such requirement in any spec or even in the
kernel and adjacent tooling.

>> [...]
>>> We could properly limit it to powernv and pseries by using
>>> ibm,fw-phb-id instead of reg property in the look up that follows
>>> a failed ibm,opal-phbid lookup. I think this is acceptable as long
>>> as no other powerpc platforms have started using this behavior for
>>> persistent naming.
>>
>> And what about setting that new config option to enabled by default
>> for those series?

I don't remember all the details about PPC dt, but it should already be
restricted to pseries/powernv, right? At least, the commit has a comment:

/* If not pseries nor powernv, or if fixed PHB numbering tried to add
 * the same PHB number twice, then fallback to dynamic PHB numbering.*/

If this is *not* restricted to these 2 platforms, I agree with Pali's
approach, although I'd consider the correct is to keep the persistent
domain scheme for both pnv and pseries, as it's working like this for 5
years and counting, and this *does* prevent a lot of issues with PCI
hotplugging in PPC servers.

>> [...]
>>> I forgot to ask before about the actual regression here.  The commit
>>> log says domain numbers are different, but I don't know the connection
>>> from there to something failing.  I assume there's some script or
>>> config file that depends on specific domain numbers?  And that
>>> dependency is (hopefully) powerpc-specific?
>>
>> You assume correct. For example this is the way how OpenWRT handles PCI
>> devices (but not only OpenWRT). This OpenWRT case is not
>> powerpc-specific but generic to all architectures. This is just one
>> example.
> 
> So basically everybody uses D/b/d/f for persistent names.  That's ...
> well, somewhat stable for things soldered down or in a motherboard
> slot, but a terrible idea for things that can be hot-plugged.
> 
> Even for more core things, it's possible for firmware to change bus
> numbering between boots.  For example, if a complicated hierarchy is
> cold-plugged into one slot, firmware is likely to assign different bus
> numbers on the next boot to make room for it.  Obviously this can also
> happen as a hot-add, and Linux needs the flexibility to do similar
> renumbering then, although we don't support it yet.
> 
> It looks like 63a72284b159 was intended to make domain numbers *more*
> consistent, so it's ironic that this actually broke something by
> changing domain numbers.  Maybe there's a way to limit the scope of
> 63a72284b159 so it avoids the breakage.  I don't know enough about the
> powerpc landscape to even guess at how.

I don't considereit breaks the userspace since this is definitely no
stable ABI (or if it is, I'm not aware and my apologies). If scripts
rely on that, they are doing the wrong thing it seems.

With that said, I'm definitely not against improving the situation with
Pali's KConfig - just think that we somehow should keep the persistent
behavior for powernv and pseries.
Hopefully PPC folks has more to say on that!
Cheers,

Guilherme

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Pali Rohár

On Thursday 09 June 2022 14:34:51 Bjorn Helgaas wrote:
> On Thu, Jun 09, 2022 at 08:05:26PM +0200, Pali Rohár wrote:
> > On Thursday 09 June 2022 12:10:22 Bjorn Helgaas wrote:
> > > On Thu, Jun 09, 2022 at 06:27:25PM +0200, Pali Rohár wrote:
> > > > On Thursday 09 June 2022 11:22:55 Bjorn Helgaas wrote:
> > > > > [+cc Guilherme, Michael, Ben (author of 63a72284b159 and PPC folks), 
> > > > > thread:
> > > > > https://lore.kernel.org/r/20220504175718.29011-1-p...@kernel.org]
> > > > > 
> > > > > On Fri, May 06, 2022 at 12:33:02AM +0200, Pali Rohár wrote:
> > > > > > On Thursday 05 May 2022 15:10:01 Tyrel Datwyler wrote:
> > > > > > > On 5/5/22 02:31, Pali Rohár wrote:
> > > > > > > > On Thursday 05 May 2022 07:16:40 Christophe Leroy wrote:
> > > > > > > >> Le 04/05/2022 à 19:57, Pali Rohár a écrit :
> > > > > > > >>> Since commit 63a72284b159 ("powerpc/pci: Assign fixed PHB
> > > > > > > >>> number based on device-tree properties"), powerpc kernel
> > > > > > > >>> always fallback to PCI domain assignment from OF / Device Tree
> > > > > > > >>> 'reg' property of the PCI controller.
> > > > > > > >>>
> > > > > > > >>> PCI code for other Linux architectures use increasing
> > > > > > > >>> assignment of the PCI domain for individual controllers
> > > > > > > >>> (assign the first free number), like it was also for powerpc
> > > > > > > >>> prior mentioned commit.
> > > > > > > >>>
> > > > > > > >>> Upgrading powerpc kernels from LTS 4.4 version (which does not
> > > > > > > >>> contain mentioned commit) to new LTS versions brings a
> > > > > > > >>> regression in domain assignment.
> > > > > > > >>
> > > > > > > >> Can you elaborate why it is a regression ?
> > > > > > > >> 63a72284b159 That commit says 'no functionnal changes', I'm
> > > > > > > >> having hard time understanding how a nochange can be a
> > > > > > > >> regression.
> > > > > > > > 
> > > > > > > > It is not 'no functional change'. That commit completely changed
> > > > > > > > PCI domain assignment in a way that is incompatible with other
> > > > > > > > architectures and also incompatible with the way how it was done
> > > > > > > > prior that commit.
> > > > > > > 
> > > > > > > I agree that the "no functional change" statement is incorrect.
> > > > > > > However, for most powerpc platforms it ended up being simply a
> > > > > > > cosmetic behavior change. As far as I can tell there is nothing
> > > > > > > requiring domain ids to increase montonically from zero or that
> > > > > > > each architecture is required to use the same domain numbering
> > > > > > > scheme.
> > > > > > 
> > > > > > That is truth. But it looks really suspicious why domains are not
> > > > > > assigned monotonically. Some scripts / applications are using PCI
> > > > > > location (domain:bus:dev:func) for remembering PCI device and domain
> > > > > > change can cause issue for config files. And some (older) 
> > > > > > applications
> > > > > > expects existence of domain zero. In systems without hot plug 
> > > > > > support
> > > > > > with small number of domains (e.g. 3) it means that there are always
> > > > > > domains 0, 1 and 2.
> > > > > > 
> > > > > > > Its hard to call this a true regression unless it actually broke
> > > > > > > something. The commit in question has been in the kernel since 4.8
> > > > > > > which was released over 5 1/2 years ago.
> > > > > > 
> > > > > > I agree, it really depends on how you look at it.
> > > > > > 
> > > > > > The important is that lot of people are using LTS versions and are
> > > > > > doing upgrades when LTS support is dropped. Which for 4.4 now
> > > > > > happened. So not all smaller or "cosmetic" changes could be detected
> > > > > > until longer LTS period pass.
> > > > > > 
> > > > > > > With all that said looking closer at the code in question I think
> > > > > > > it is fair to assume that the author only intended this change for
> > > > > > > powernv and pseries platforms and not every powerpc platform. That
> > > > > > > change was done to make persistent naming easier to manage in
> > > > > > > userspace.
> > > > > > 
> > > > > > I agree that this behavior change may be useful in some situations
> > > > > > and I do not object this need.
> > > > > > 
> > > > > > > Your change defaults back to the old behavior which will now break
> > > > > > > both powernv and pseries platforms with regard to hotplugging and
> > > > > > > persistent naming.
> > > > > > 
> > > > > > I was aware of it, that change could cause issues. And that is why I
> > > > > > added config option for choosing behavior. So users would be able to
> > > > > > choose what they need.
> > > > > > 
> > > > > > > We could properly limit it to powernv and pseries by using
> > > > > > > ibm,fw-phb-id instead of reg property in the look up that follows
> > > > > > > a failed ibm,opal-phbid lookup. I think this is acceptable as long
> > > > > > > as no other powerpc platforms have started using this behavior for
> > > > > > > persistent naming.
> >

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Bjorn Helgaas

On Thu, Jun 09, 2022 at 08:05:26PM +0200, Pali Rohár wrote:
> On Thursday 09 June 2022 12:10:22 Bjorn Helgaas wrote:
> > On Thu, Jun 09, 2022 at 06:27:25PM +0200, Pali Rohár wrote:
> > > On Thursday 09 June 2022 11:22:55 Bjorn Helgaas wrote:
> > > > [+cc Guilherme, Michael, Ben (author of 63a72284b159 and PPC folks), 
> > > > thread:
> > > > https://lore.kernel.org/r/20220504175718.29011-1-p...@kernel.org]
> > > > 
> > > > On Fri, May 06, 2022 at 12:33:02AM +0200, Pali Rohár wrote:
> > > > > On Thursday 05 May 2022 15:10:01 Tyrel Datwyler wrote:
> > > > > > On 5/5/22 02:31, Pali Rohár wrote:
> > > > > > > On Thursday 05 May 2022 07:16:40 Christophe Leroy wrote:
> > > > > > >> Le 04/05/2022 à 19:57, Pali Rohár a écrit :
> > > > > > >>> Since commit 63a72284b159 ("powerpc/pci: Assign fixed PHB
> > > > > > >>> number based on device-tree properties"), powerpc kernel
> > > > > > >>> always fallback to PCI domain assignment from OF / Device Tree
> > > > > > >>> 'reg' property of the PCI controller.
> > > > > > >>>
> > > > > > >>> PCI code for other Linux architectures use increasing
> > > > > > >>> assignment of the PCI domain for individual controllers
> > > > > > >>> (assign the first free number), like it was also for powerpc
> > > > > > >>> prior mentioned commit.
> > > > > > >>>
> > > > > > >>> Upgrading powerpc kernels from LTS 4.4 version (which does not
> > > > > > >>> contain mentioned commit) to new LTS versions brings a
> > > > > > >>> regression in domain assignment.
> > > > > > >>
> > > > > > >> Can you elaborate why it is a regression ?
> > > > > > >> 63a72284b159 That commit says 'no functionnal changes', I'm
> > > > > > >> having hard time understanding how a nochange can be a
> > > > > > >> regression.
> > > > > > > 
> > > > > > > It is not 'no functional change'. That commit completely changed
> > > > > > > PCI domain assignment in a way that is incompatible with other
> > > > > > > architectures and also incompatible with the way how it was done
> > > > > > > prior that commit.
> > > > > > 
> > > > > > I agree that the "no functional change" statement is incorrect.
> > > > > > However, for most powerpc platforms it ended up being simply a
> > > > > > cosmetic behavior change. As far as I can tell there is nothing
> > > > > > requiring domain ids to increase montonically from zero or that
> > > > > > each architecture is required to use the same domain numbering
> > > > > > scheme.
> > > > > 
> > > > > That is truth. But it looks really suspicious why domains are not
> > > > > assigned monotonically. Some scripts / applications are using PCI
> > > > > location (domain:bus:dev:func) for remembering PCI device and domain
> > > > > change can cause issue for config files. And some (older) applications
> > > > > expects existence of domain zero. In systems without hot plug support
> > > > > with small number of domains (e.g. 3) it means that there are always
> > > > > domains 0, 1 and 2.
> > > > > 
> > > > > > Its hard to call this a true regression unless it actually broke
> > > > > > something. The commit in question has been in the kernel since 4.8
> > > > > > which was released over 5 1/2 years ago.
> > > > > 
> > > > > I agree, it really depends on how you look at it.
> > > > > 
> > > > > The important is that lot of people are using LTS versions and are
> > > > > doing upgrades when LTS support is dropped. Which for 4.4 now
> > > > > happened. So not all smaller or "cosmetic" changes could be detected
> > > > > until longer LTS period pass.
> > > > > 
> > > > > > With all that said looking closer at the code in question I think
> > > > > > it is fair to assume that the author only intended this change for
> > > > > > powernv and pseries platforms and not every powerpc platform. That
> > > > > > change was done to make persistent naming easier to manage in
> > > > > > userspace.
> > > > > 
> > > > > I agree that this behavior change may be useful in some situations
> > > > > and I do not object this need.
> > > > > 
> > > > > > Your change defaults back to the old behavior which will now break
> > > > > > both powernv and pseries platforms with regard to hotplugging and
> > > > > > persistent naming.
> > > > > 
> > > > > I was aware of it, that change could cause issues. And that is why I
> > > > > added config option for choosing behavior. So users would be able to
> > > > > choose what they need.
> > > > > 
> > > > > > We could properly limit it to powernv and pseries by using
> > > > > > ibm,fw-phb-id instead of reg property in the look up that follows
> > > > > > a failed ibm,opal-phbid lookup. I think this is acceptable as long
> > > > > > as no other powerpc platforms have started using this behavior for
> > > > > > persistent naming.
> > > > > 
> > > > > And what about setting that new config option to enabled by default
> > > > > for those series?
> > > > > 
> > > > > Or is there issue with introduction of the new config option?
> > > > > 
> > > > > One of the point is that it is

Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.19-2 tag

2022-06-09 Thread pr-tracker-bot

The pull request you sent on Fri, 10 Jun 2022 00:59:45 +1000:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> tags/powerpc-5.19-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/95fc76c81b9270a9ab38f4947fe5cb786c8c79cc

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Pali Rohár

On Thursday 09 June 2022 12:10:22 Bjorn Helgaas wrote:
> On Thu, Jun 09, 2022 at 06:27:25PM +0200, Pali Rohár wrote:
> > On Thursday 09 June 2022 11:22:55 Bjorn Helgaas wrote:
> > > [+cc Guilherme, Michael, Ben (author of 63a72284b159 and PPC folks), 
> > > thread:
> > > https://lore.kernel.org/r/20220504175718.29011-1-p...@kernel.org]
> > > 
> > > On Fri, May 06, 2022 at 12:33:02AM +0200, Pali Rohár wrote:
> > > > On Thursday 05 May 2022 15:10:01 Tyrel Datwyler wrote:
> > > > > On 5/5/22 02:31, Pali Rohár wrote:
> > > > > > On Thursday 05 May 2022 07:16:40 Christophe Leroy wrote:
> > > > > >> Le 04/05/2022 à 19:57, Pali Rohár a écrit :
> > > > > >>> Since commit 63a72284b159 ("powerpc/pci: Assign fixed PHB
> > > > > >>> number based on device-tree properties"), powerpc kernel
> > > > > >>> always fallback to PCI domain assignment from OF / Device Tree
> > > > > >>> 'reg' property of the PCI controller.
> > > > > >>>
> > > > > >>> PCI code for other Linux architectures use increasing
> > > > > >>> assignment of the PCI domain for individual controllers
> > > > > >>> (assign the first free number), like it was also for powerpc
> > > > > >>> prior mentioned commit.
> > > > > >>>
> > > > > >>> Upgrading powerpc kernels from LTS 4.4 version (which does not
> > > > > >>> contain mentioned commit) to new LTS versions brings a
> > > > > >>> regression in domain assignment.
> > > > > >>
> > > > > >> Can you elaborate why it is a regression ?
> > > > > >> 63a72284b159 That commit says 'no functionnal changes', I'm
> > > > > >> having hard time understanding how a nochange can be a
> > > > > >> regression.
> > > > > > 
> > > > > > It is not 'no functional change'. That commit completely changed
> > > > > > PCI domain assignment in a way that is incompatible with other
> > > > > > architectures and also incompatible with the way how it was done
> > > > > > prior that commit.
> > > > > 
> > > > > I agree that the "no functional change" statement is incorrect.
> > > > > However, for most powerpc platforms it ended up being simply a
> > > > > cosmetic behavior change. As far as I can tell there is nothing
> > > > > requiring domain ids to increase montonically from zero or that
> > > > > each architecture is required to use the same domain numbering
> > > > > scheme.
> > > > 
> > > > That is truth. But it looks really suspicious why domains are not
> > > > assigned monotonically. Some scripts / applications are using PCI
> > > > location (domain:bus:dev:func) for remembering PCI device and domain
> > > > change can cause issue for config files. And some (older) applications
> > > > expects existence of domain zero. In systems without hot plug support
> > > > with small number of domains (e.g. 3) it means that there are always
> > > > domains 0, 1 and 2.
> > > > 
> > > > > Its hard to call this a true regression unless it actually broke
> > > > > something. The commit in question has been in the kernel since 4.8
> > > > > which was released over 5 1/2 years ago.
> > > > 
> > > > I agree, it really depends on how you look at it.
> > > > 
> > > > The important is that lot of people are using LTS versions and are
> > > > doing upgrades when LTS support is dropped. Which for 4.4 now
> > > > happened. So not all smaller or "cosmetic" changes could be detected
> > > > until longer LTS period pass.
> > > > 
> > > > > With all that said looking closer at the code in question I think
> > > > > it is fair to assume that the author only intended this change for
> > > > > powernv and pseries platforms and not every powerpc platform. That
> > > > > change was done to make persistent naming easier to manage in
> > > > > userspace.
> > > > 
> > > > I agree that this behavior change may be useful in some situations
> > > > and I do not object this need.
> > > > 
> > > > > Your change defaults back to the old behavior which will now break
> > > > > both powernv and pseries platforms with regard to hotplugging and
> > > > > persistent naming.
> > > > 
> > > > I was aware of it, that change could cause issues. And that is why I
> > > > added config option for choosing behavior. So users would be able to
> > > > choose what they need.
> > > > 
> > > > > We could properly limit it to powernv and pseries by using
> > > > > ibm,fw-phb-id instead of reg property in the look up that follows
> > > > > a failed ibm,opal-phbid lookup. I think this is acceptable as long
> > > > > as no other powerpc platforms have started using this behavior for
> > > > > persistent naming.
> > > > 
> > > > And what about setting that new config option to enabled by default
> > > > for those series?
> > > > 
> > > > Or is there issue with introduction of the new config option?
> > > > 
> > > > One of the point is that it is really a good idea to have
> > > > similar/same behavior for all linux platforms. And if it cannot be
> > > > enabled by default (for backward compatibility) add at least some
> > > > option, so new platforms can start using it or users can

[FSL P50x0] Keyboard and mouse don't work anymore after the devicetree updates for 5.19

2022-06-09 Thread Christian Zigotzky


On 06 June 2022 at 07:06 pm, Rob Herring wrote:

On Mon, Jun 6, 2022 at 11:14 AM Christian Zigotzky
 wrote:

On 06 June 2022 at 04:58 pm, Rob Herring wrote:

On Fri, May 27, 2022 at 9:23 AM Rob Herring  wrote:

On Fri, May 27, 2022 at 3:33 AM Christian Zigotzky
 wrote:

On 27 May 2022 at 10:14 am, Prabhakar Mahadev Lad wrote:

Hi,


-Original Message-
From: Christian Zigotzky 

On 27 May 2022 at 09:56 am, Prabhakar Mahadev Lad wrote:

Hi,


-Original Message-
From: Christophe Leroy 

[...]


Looks like the driver which you are using has not been converted to use

platform_get_irq(), could you please check that.

Cheers,
Prabhakar

Do you mean the mouse and keyboard driver?


No it could be your gpio/pinctrl driver assuming the keyboard/mouse are using 
GPIO's. If you are using interrupts then it might be some hierarchal irqc 
driver in drivers/irqchip/.

Cheers,
Prabhakar

Good to know. I only use unmodified drivers from the official Linux
kernel so it's not an issue of the Cyrus+ board.

The issue is in drivers/usb/host/fsl-mph-dr-of.c which copies the
resources to a child platform device. Can you try the following
change:

diff --git a/drivers/usb/host/fsl-mph-dr-of.c b/drivers/usb/host/fsl-mph-dr-of.c
index 44a7e58a26e3..47d9b7be60da 100644
--- a/drivers/usb/host/fsl-mph-dr-of.c
+++ b/drivers/usb/host/fsl-mph-dr-of.c
@@ -80,8 +80,6 @@ static struct platform_device *fsl_usb2_device_register(
  const char *name, int id)
   {
  struct platform_device *pdev;
-   const struct resource *res = ofdev->resource;
-   unsigned int num = ofdev->num_resources;
  int retval;

  pdev = platform_device_alloc(name, id);
@@ -106,11 +104,7 @@ static struct platform_device *fsl_usb2_device_register(
  if (retval)
  goto error;

-   if (num) {
-   retval = platform_device_add_resources(pdev, res, num);
-   if (retval)
-   goto error;
-   }
+   pdev->dev.of_node = ofdev->dev.of_node;

>From the log, I think you also need to add this line:

pdev->dev.of_node_reused = true;


  retval = platform_device_add(pdev);
  if (retval)

Hello Rob,

Thanks a lot for your answer.

Is the following patch correct?

Yes


--- a/drivers/usb/host/fsl-mph-dr-of.c2022-05-28 09:10:26.797688422
+0200
+++ b/drivers/usb/host/fsl-mph-dr-of.c2022-05-28 09:15:01.668594809
+0200
@@ -80,8 +80,6 @@ static struct platform_device *fsl_usb2_
   const char *name, int id)
   {
   struct platform_device *pdev;
-const struct resource *res = ofdev->resource;
-unsigned int num = ofdev->num_resources;
   int retval;

   pdev = platform_device_alloc(name, id);
@@ -106,11 +104,7 @@ static struct platform_device *fsl_usb2_
   if (retval)
   goto error;

-if (num) {
-retval = platform_device_add_resources(pdev, res, num);
-if (retval)
-goto error;
-}
+pdev->dev.of_node = ofdev->dev.of_node;
+pdev->dev.of_node_reused = true;

   retval = platform_device_add(pdev);
   if (retval)

---

Thanks,
Christian

Hello Rob,

I tested this patch today and unfortunately the issue still exists.

Do you have another idea?

Thanks,
Christian

Updated patch:

--- a/drivers/usb/host/fsl-mph-dr-of.c  2022-06-06 02:18:54.0 +0200
+++ b/drivers/usb/host/fsl-mph-dr-of.c  2022-06-09 19:31:50.135472793 +0200
@@ -80,8 +80,6 @@ static struct platform_device *fsl_usb2_
    const char *name, int id)
 {
    struct platform_device *pdev;
-   const struct resource *res = ofdev->resource;
-   unsigned int num = ofdev->num_resources;
    int retval;

    pdev = platform_device_alloc(name, id);
@@ -105,12 +103,8 @@ static struct platform_device *fsl_usb2_
    retval = platform_device_add_data(pdev, pdata, sizeof(*pdata));
    if (retval)
    goto error;
-
-   if (num) {
-   retval = platform_device_add_resources(pdev, res, num);
-   if (retval)
-   goto error;
-   }
+    pdev->dev.of_node = ofdev->dev.of_node;
+    pdev->dev.of_node_reused = true;

    retval = platform_device_add(pdev);
    if (retval)

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Bjorn Helgaas

On Thu, Jun 09, 2022 at 06:27:25PM +0200, Pali Rohár wrote:
> On Thursday 09 June 2022 11:22:55 Bjorn Helgaas wrote:
> > [+cc Guilherme, Michael, Ben (author of 63a72284b159 and PPC folks), thread:
> > https://lore.kernel.org/r/20220504175718.29011-1-p...@kernel.org]
> > 
> > On Fri, May 06, 2022 at 12:33:02AM +0200, Pali Rohár wrote:
> > > On Thursday 05 May 2022 15:10:01 Tyrel Datwyler wrote:
> > > > On 5/5/22 02:31, Pali Rohár wrote:
> > > > > On Thursday 05 May 2022 07:16:40 Christophe Leroy wrote:
> > > > >> Le 04/05/2022 à 19:57, Pali Rohár a écrit :
> > > > >>> Since commit 63a72284b159 ("powerpc/pci: Assign fixed PHB
> > > > >>> number based on device-tree properties"), powerpc kernel
> > > > >>> always fallback to PCI domain assignment from OF / Device Tree
> > > > >>> 'reg' property of the PCI controller.
> > > > >>>
> > > > >>> PCI code for other Linux architectures use increasing
> > > > >>> assignment of the PCI domain for individual controllers
> > > > >>> (assign the first free number), like it was also for powerpc
> > > > >>> prior mentioned commit.
> > > > >>>
> > > > >>> Upgrading powerpc kernels from LTS 4.4 version (which does not
> > > > >>> contain mentioned commit) to new LTS versions brings a
> > > > >>> regression in domain assignment.
> > > > >>
> > > > >> Can you elaborate why it is a regression ?
> > > > >> 63a72284b159 That commit says 'no functionnal changes', I'm
> > > > >> having hard time understanding how a nochange can be a
> > > > >> regression.
> > > > > 
> > > > > It is not 'no functional change'. That commit completely changed
> > > > > PCI domain assignment in a way that is incompatible with other
> > > > > architectures and also incompatible with the way how it was done
> > > > > prior that commit.
> > > > 
> > > > I agree that the "no functional change" statement is incorrect.
> > > > However, for most powerpc platforms it ended up being simply a
> > > > cosmetic behavior change. As far as I can tell there is nothing
> > > > requiring domain ids to increase montonically from zero or that
> > > > each architecture is required to use the same domain numbering
> > > > scheme.
> > > 
> > > That is truth. But it looks really suspicious why domains are not
> > > assigned monotonically. Some scripts / applications are using PCI
> > > location (domain:bus:dev:func) for remembering PCI device and domain
> > > change can cause issue for config files. And some (older) applications
> > > expects existence of domain zero. In systems without hot plug support
> > > with small number of domains (e.g. 3) it means that there are always
> > > domains 0, 1 and 2.
> > > 
> > > > Its hard to call this a true regression unless it actually broke
> > > > something. The commit in question has been in the kernel since 4.8
> > > > which was released over 5 1/2 years ago.
> > > 
> > > I agree, it really depends on how you look at it.
> > > 
> > > The important is that lot of people are using LTS versions and are
> > > doing upgrades when LTS support is dropped. Which for 4.4 now
> > > happened. So not all smaller or "cosmetic" changes could be detected
> > > until longer LTS period pass.
> > > 
> > > > With all that said looking closer at the code in question I think
> > > > it is fair to assume that the author only intended this change for
> > > > powernv and pseries platforms and not every powerpc platform. That
> > > > change was done to make persistent naming easier to manage in
> > > > userspace.
> > > 
> > > I agree that this behavior change may be useful in some situations
> > > and I do not object this need.
> > > 
> > > > Your change defaults back to the old behavior which will now break
> > > > both powernv and pseries platforms with regard to hotplugging and
> > > > persistent naming.
> > > 
> > > I was aware of it, that change could cause issues. And that is why I
> > > added config option for choosing behavior. So users would be able to
> > > choose what they need.
> > > 
> > > > We could properly limit it to powernv and pseries by using
> > > > ibm,fw-phb-id instead of reg property in the look up that follows
> > > > a failed ibm,opal-phbid lookup. I think this is acceptable as long
> > > > as no other powerpc platforms have started using this behavior for
> > > > persistent naming.
> > > 
> > > And what about setting that new config option to enabled by default
> > > for those series?
> > > 
> > > Or is there issue with introduction of the new config option?
> > > 
> > > One of the point is that it is really a good idea to have
> > > similar/same behavior for all linux platforms. And if it cannot be
> > > enabled by default (for backward compatibility) add at least some
> > > option, so new platforms can start using it or users can decide to
> > > switch behavior.
> > 
> > This is a powerpc thing so I'm just kibbitzing a little.
> > 
> > This basically looks like a new config option to selectively revert
> > 63a72284b159.  That seems hard to maintain and doesn't seem like
> >

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Pali Rohár

On Thursday 09 June 2022 11:22:55 Bjorn Helgaas wrote:
> [+cc Guilherme, Michael, Ben (author of 63a72284b159 and PPC folks), thread:
> https://lore.kernel.org/r/20220504175718.29011-1-p...@kernel.org]
> 
> On Fri, May 06, 2022 at 12:33:02AM +0200, Pali Rohár wrote:
> > On Thursday 05 May 2022 15:10:01 Tyrel Datwyler wrote:
> > > On 5/5/22 02:31, Pali Rohár wrote:
> > > > On Thursday 05 May 2022 07:16:40 Christophe Leroy wrote:
> > > >> Le 04/05/2022 à 19:57, Pali Rohár a écrit :
> > > >>> Since commit 63a72284b159 ("powerpc/pci: Assign fixed PHB
> > > >>> number based on device-tree properties"), powerpc kernel
> > > >>> always fallback to PCI domain assignment from OF / Device Tree
> > > >>> 'reg' property of the PCI controller.
> > > >>>
> > > >>> PCI code for other Linux architectures use increasing
> > > >>> assignment of the PCI domain for individual controllers
> > > >>> (assign the first free number), like it was also for powerpc
> > > >>> prior mentioned commit.
> > > >>>
> > > >>> Upgrading powerpc kernels from LTS 4.4 version (which does not
> > > >>> contain mentioned commit) to new LTS versions brings a
> > > >>> regression in domain assignment.
> > > >>
> > > >> Can you elaborate why it is a regression ?
> > > >> 63a72284b159 That commit says 'no functionnal changes', I'm
> > > >> having hard time understanding how a nochange can be a
> > > >> regression.
> > > > 
> > > > It is not 'no functional change'. That commit completely changed
> > > > PCI domain assignment in a way that is incompatible with other
> > > > architectures and also incompatible with the way how it was done
> > > > prior that commit.
> > > 
> > > I agree that the "no functional change" statement is incorrect.
> > > However, for most powerpc platforms it ended up being simply a
> > > cosmetic behavior change. As far as I can tell there is nothing
> > > requiring domain ids to increase montonically from zero or that
> > > each architecture is required to use the same domain numbering
> > > scheme.
> > 
> > That is truth. But it looks really suspicious why domains are not
> > assigned monotonically. Some scripts / applications are using PCI
> > location (domain:bus:dev:func) for remembering PCI device and domain
> > change can cause issue for config files. And some (older) applications
> > expects existence of domain zero. In systems without hot plug support
> > with small number of domains (e.g. 3) it means that there are always
> > domains 0, 1 and 2.
> > 
> > > Its hard to call this a true regression unless it actually broke
> > > something. The commit in question has been in the kernel since 4.8
> > > which was released over 5 1/2 years ago.
> > 
> > I agree, it really depends on how you look at it.
> > 
> > The important is that lot of people are using LTS versions and are
> > doing upgrades when LTS support is dropped. Which for 4.4 now
> > happened. So not all smaller or "cosmetic" changes could be detected
> > until longer LTS period pass.
> > 
> > > With all that said looking closer at the code in question I think
> > > it is fair to assume that the author only intended this change for
> > > powernv and pseries platforms and not every powerpc platform. That
> > > change was done to make persistent naming easier to manage in
> > > userspace.
> > 
> > I agree that this behavior change may be useful in some situations
> > and I do not object this need.
> > 
> > > Your change defaults back to the old behavior which will now break
> > > both powernv and pseries platforms with regard to hotplugging and
> > > persistent naming.
> > 
> > I was aware of it, that change could cause issues. And that is why I
> > added config option for choosing behavior. So users would be able to
> > choose what they need.
> > 
> > > We could properly limit it to powernv and pseries by using
> > > ibm,fw-phb-id instead of reg property in the look up that follows
> > > a failed ibm,opal-phbid lookup. I think this is acceptable as long
> > > as no other powerpc platforms have started using this behavior for
> > > persistent naming.
> > 
> > And what about setting that new config option to enabled by default
> > for those series?
> > 
> > Or is there issue with introduction of the new config option?
> > 
> > One of the point is that it is really a good idea to have
> > similar/same behavior for all linux platforms. And if it cannot be
> > enabled by default (for backward compatibility) add at least some
> > option, so new platforms can start using it or users can decide to
> > switch behavior.
> 
> This is a powerpc thing so I'm just kibbitzing a little.
> 
> This basically looks like a new config option to selectively revert
> 63a72284b159.  That seems hard to maintain and doesn't seem like
> something that needs to be baked into the kernel at compile-time.
> 
> The 63a72284b159 commit log says persistent NIC names are tied to PCI
> domain/bus/dev/fn addresses, which seems like something we should
> discourage because we can't predict PCI

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Bjorn Helgaas

[+cc Guilherme, Michael, Ben (author of 63a72284b159 and PPC folks), thread:
https://lore.kernel.org/r/20220504175718.29011-1-p...@kernel.org]

On Fri, May 06, 2022 at 12:33:02AM +0200, Pali Rohár wrote:
> On Thursday 05 May 2022 15:10:01 Tyrel Datwyler wrote:
> > On 5/5/22 02:31, Pali Rohár wrote:
> > > On Thursday 05 May 2022 07:16:40 Christophe Leroy wrote:
> > >> Le 04/05/2022 à 19:57, Pali Rohár a écrit :
> > >>> Since commit 63a72284b159 ("powerpc/pci: Assign fixed PHB
> > >>> number based on device-tree properties"), powerpc kernel
> > >>> always fallback to PCI domain assignment from OF / Device Tree
> > >>> 'reg' property of the PCI controller.
> > >>>
> > >>> PCI code for other Linux architectures use increasing
> > >>> assignment of the PCI domain for individual controllers
> > >>> (assign the first free number), like it was also for powerpc
> > >>> prior mentioned commit.
> > >>>
> > >>> Upgrading powerpc kernels from LTS 4.4 version (which does not
> > >>> contain mentioned commit) to new LTS versions brings a
> > >>> regression in domain assignment.
> > >>
> > >> Can you elaborate why it is a regression ?
> > >> 63a72284b159 That commit says 'no functionnal changes', I'm
> > >> having hard time understanding how a nochange can be a
> > >> regression.
> > > 
> > > It is not 'no functional change'. That commit completely changed
> > > PCI domain assignment in a way that is incompatible with other
> > > architectures and also incompatible with the way how it was done
> > > prior that commit.
> > 
> > I agree that the "no functional change" statement is incorrect.
> > However, for most powerpc platforms it ended up being simply a
> > cosmetic behavior change. As far as I can tell there is nothing
> > requiring domain ids to increase montonically from zero or that
> > each architecture is required to use the same domain numbering
> > scheme.
> 
> That is truth. But it looks really suspicious why domains are not
> assigned monotonically. Some scripts / applications are using PCI
> location (domain:bus:dev:func) for remembering PCI device and domain
> change can cause issue for config files. And some (older) applications
> expects existence of domain zero. In systems without hot plug support
> with small number of domains (e.g. 3) it means that there are always
> domains 0, 1 and 2.
> 
> > Its hard to call this a true regression unless it actually broke
> > something. The commit in question has been in the kernel since 4.8
> > which was released over 5 1/2 years ago.
> 
> I agree, it really depends on how you look at it.
> 
> The important is that lot of people are using LTS versions and are
> doing upgrades when LTS support is dropped. Which for 4.4 now
> happened. So not all smaller or "cosmetic" changes could be detected
> until longer LTS period pass.
> 
> > With all that said looking closer at the code in question I think
> > it is fair to assume that the author only intended this change for
> > powernv and pseries platforms and not every powerpc platform. That
> > change was done to make persistent naming easier to manage in
> > userspace.
> 
> I agree that this behavior change may be useful in some situations
> and I do not object this need.
> 
> > Your change defaults back to the old behavior which will now break
> > both powernv and pseries platforms with regard to hotplugging and
> > persistent naming.
> 
> I was aware of it, that change could cause issues. And that is why I
> added config option for choosing behavior. So users would be able to
> choose what they need.
> 
> > We could properly limit it to powernv and pseries by using
> > ibm,fw-phb-id instead of reg property in the look up that follows
> > a failed ibm,opal-phbid lookup. I think this is acceptable as long
> > as no other powerpc platforms have started using this behavior for
> > persistent naming.
> 
> And what about setting that new config option to enabled by default
> for those series?
> 
> Or is there issue with introduction of the new config option?
> 
> One of the point is that it is really a good idea to have
> similar/same behavior for all linux platforms. And if it cannot be
> enabled by default (for backward compatibility) add at least some
> option, so new platforms can start using it or users can decide to
> switch behavior.

This is a powerpc thing so I'm just kibbitzing a little.

This basically looks like a new config option to selectively revert
63a72284b159.  That seems hard to maintain and doesn't seem like
something that needs to be baked into the kernel at compile-time.

The 63a72284b159 commit log says persistent NIC names are tied to PCI
domain/bus/dev/fn addresses, which seems like something we should
discourage because we can't predict PCI addresses in general.  I
assume other platforms typically use udev with MAC addresses or
something?

> > > For example, prior that commit on P2020 RDB board were PCI
> > > domains 0, 1 and 2.
> > > 
> > > $ lspci
> > > :00:00.0 PCI bridge: Freescale

RE: [PATCH 09/10] scsi/ibmvscsi: Replace srp tasklet with work

2022-06-09 Thread David Laight

From: Sebastian Andrzej Siewior
> Sent: 09 June 2022 16:03
> 
> On 2022-05-30 16:15:11 [-0700], Davidlohr Bueso wrote:
> > Tasklets have long been deprecated as being too heavy on the system
> > by running in irq context - and this is not a performance critical
> > path. If a higher priority process wants to run, it must wait for
> > the tasklet to finish before doing so.
> >
> > Process srps asynchronously in process context in a dedicated
> > single threaded workqueue.
> 
> I would suggest threaded interrupts instead. The pattern here is the
> same as in the previous driver except here is less locking.

How long do these actions runs for, and what is waiting for
them to finish?

These changes seem to drop the priority from above that of the
highest priority RT process down to that of a default priority
user process.
There is no real guarantee that the latter will run 'any time soon'.

Consider some workloads I'm setting up where most of the cpu are
likely to spend 90%+ of the time running processes under the RT
scheduler that are processing audio.

It is quite likely that a non-RT thread (especially one bound
to a specific cpu) won't run for several milliseconds.
(We have to go through 'hoops' to avoid dropping ethernet frames.)

I'd have thought that some of these kernel threads really
need to run at a 'middling' RT priority.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

Re: [PATCH 09/10] scsi/ibmvscsi: Replace srp tasklet with work

2022-06-09 Thread Sebastian Andrzej Siewior

On 2022-05-30 16:15:11 [-0700], Davidlohr Bueso wrote:
> Tasklets have long been deprecated as being too heavy on the system
> by running in irq context - and this is not a performance critical
> path. If a higher priority process wants to run, it must wait for
> the tasklet to finish before doing so.
> 
> Process srps asynchronously in process context in a dedicated
> single threaded workqueue.

I would suggest threaded interrupts instead. The pattern here is the
same as in the previous driver except here is less locking.

Sebastian

[GIT PULL] Please pull powerpc/linux.git powerpc-5.19-2 tag

2022-06-09 Thread Michael Ellerman

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

Hi Linus,

Please pull powerpc fixes for 5.19:

The following changes since commit 6112bd00e84e5dbffebc3c1e908cbe914ca772ee:

  Merge tag 'powerpc-5.19-1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux (2022-05-28 
11:27:17 -0700)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-5.19-2

for you to fetch changes up to 8e127846fc97778a5e5c99bca1ce0bbc5ec9:

  powerpc/32: Fix overread/overwrite of thread_struct via ptrace (2022-06-09 
23:32:56 +1000)

- --
powerpc fixes for 5.19 #2

 - On 32-bit fix overread/overwrite of thread_struct via ptrace PEEK/POKE.

 - Fix softirqs not switching to the softirq stack since we moved irq_exit().

 - Force thread size increase when KASAN is enabled to avoid stack overflows.

 - On Book3s 64 mark more code as not to be instrumented by KASAN to avoid 
crashes.

 - Exempt __get_wchan() from KASAN checking, as it's inherently racy.

 - Fix a recently introduced crash in the papr_scm driver in some 
configurations.

 - Remove include of  which is forbidden.

Thanks to: Ariel Miculas, Chen Jingwen, Christophe Leroy, Erhard Furtner, He 
Ying, Kees
Cook, Masahiro Yamada, Nageswara R Sastry, Paul Mackerras, Sachin Sant, Vaibhav 
Jain,
Wanming Hu.

- --
He Ying (1):
  powerpc/kasan: Silence KASAN warnings in __get_wchan()

Masahiro Yamada (1):
  powerpc/book3e: get rid of #include 

Michael Ellerman (3):
  powerpc: Don't select HAVE_IRQ_EXIT_ON_IRQ_STACK
  powerpc/kasan: Force thread size increase with KASAN
  powerpc/32: Fix overread/overwrite of thread_struct via ptrace

Paul Mackerras (1):
  powerpc/kasan: Mark more real-mode code as not to be instrumented

Vaibhav Jain (1):
  powerpc/papr_scm: don't requests stats with '0' sized stats buffer


 arch/powerpc/Kconfig  |  2 --
 arch/powerpc/include/asm/thread_info.h| 10 --
 arch/powerpc/kernel/Makefile  |  2 ++
 arch/powerpc/kernel/process.c |  4 ++--
 arch/powerpc/kernel/ptrace/ptrace-fpu.c   | 20 ++--
 arch/powerpc/kernel/ptrace/ptrace.c   |  3 +++
 arch/powerpc/kernel/rtas.c|  4 ++--
 arch/powerpc/kexec/crash.c|  2 +-
 arch/powerpc/mm/nohash/kaslr_booke.c  |  8 ++--
 arch/powerpc/platforms/powernv/Makefile   |  1 +
 arch/powerpc/platforms/pseries/papr_scm.c |  3 +++
 11 files changed, 38 insertions(+), 21 deletions(-)
-BEGIN PGP SIGNATURE-

iQIzBAEBCAAdFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmKiClMACgkQUevqPMjh
pYBl2w//cMfBNA1wR3b+r9+Ro9SlFMzfCtk0ue+A44QLo0Va71H9eZLmZ7qM6LK4
B37Am/bNZPnI0DM1pjTXKq0aEt7KRC9VyEiS9Gl4w0mf5bRjZy8Q8A+YA/CDoKrr
Ctd1iAZzk1mAbegdSeS66DlmGVGvDwzLK8jkJDPjOhaaXZMLZLRx+j4USeLuTCMS
Ub7LXpLcewa7E7xOXshoFIFd6wCSEpsiayCEAlCd4yICKeDfNdWc/9GBd3e6z+1e
7cSr7E59TY1cJ5WZ00p1UPigImNroYBEPmZj+F3vSt/MQwb/VXZYQuFojytk126z
sl69pzyghp4oCdvNvVVxHTHjDYZhnoPEKSoPpgD+CegrrdgKBeXweGE8T8JHKr4i
1FSFrX+zwXoTRRpwyDVEtDu1ld/AMHMQO8NiFFDSuYDdgrgYCyqk1Rg3Yb6aYpLQ
+b8uwX9b1lUZyrJz/Whf4PpTW0TmU9eyNPyFScpyojX1HMcu+VDBtowuaS1FkzIb
2ft6XsQUV0f4EIrJTsgWWyIw32kE9TiCPKwOX3wMQfdx5j6YuQJ7N2ALZVoQnQbu
aUvpGu6Fv/tasSaboThHWpAsRZLMGjFD94WwvX1Kn0NtvWpoeocA6barup0Nosey
3avApSIyPG5Q9RA3m1SjM0dLJdpyUmt2hDuvuFitRmeqePOoVxc=
=s319
-END PGP SIGNATURE-

Re: [PATCH] powerpc: get rid of #include

2022-06-09 Thread Michael Ellerman

On Sat, 4 Jun 2022 17:50:50 +0900, Masahiro Yamada wrote:
> You cannot include  here because it is generated
> in init/Makefile but there is no guarantee that it happens before
> arch/powerpc/mm/nohash/kaslr_booke.c is compiled for parallel builds.
> 
> The places where you can reliably include  are:
> 
>   - init/  (because init/Makefile can specify the dependency)
>   - arch/*/boot/   (because it is compiled after vmlinux)
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc: get rid of #include 
  https://git.kernel.org/powerpc/c/7ad4bd887d27c6b6ffbef216f19c19f8fe2b8f52

cheers

Re: [PATCH -v2] powerpc/process, kasan: Silence KASAN warnings in __get_wchan()

2022-06-09 Thread Michael Ellerman

On Thu, 20 Jan 2022 20:44:18 -0500, He Ying wrote:
> The following KASAN warning was reported in our kernel.
> 
>   BUG: KASAN: stack-out-of-bounds in get_wchan+0x188/0x250
>   Read of size 4 at addr d216f958 by task ps/14437
> 
>   CPU: 3 PID: 14437 Comm: ps Tainted: G   O  5.10.0 #1
>   Call Trace:
>   [daa63858] [c0654348] dump_stack+0x9c/0xe4 (unreliable)
>   [daa63888] [c035cf0c] print_address_description.constprop.3+0x8c/0x570
>   [daa63908] [c035d6bc] kasan_report+0x1ac/0x218
>   [daa63948] [c00496e8] get_wchan+0x188/0x250
>   [daa63978] [c0461ec8] do_task_stat+0xce8/0xe60
>   [daa63b98] [c0455ac8] proc_single_show+0x98/0x170
>   [daa63bc8] [c03cab8c] seq_read_iter+0x1ec/0x900
>   [daa63c38] [c03cb47c] seq_read+0x1dc/0x290
>   [daa63d68] [c037fc94] vfs_read+0x164/0x510
>   [daa63ea8] [c03808e4] ksys_read+0x144/0x1d0
>   [daa63f38] [c005b1dc] ret_from_syscall+0x0/0x38
>   --- interrupt: c00 at 0x8fa8f4
>   LR = 0x8fa8cc
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/process, kasan: Silence KASAN warnings in __get_wchan()
  https://git.kernel.org/powerpc/c/a1b29ba2f2c171b9bea73be993bfdf0a62d37d15

cheers

Re: [PATCH] powerpc/kasan: Mark more real-mode code as not to be instrumented

2022-06-09 Thread Michael Ellerman

On Thu, 19 May 2022 17:45:21 +1000, Paul Mackerras wrote:
> This marks more files and functions that can possibly be called in
> real mode as not to be instrumented by KASAN.  Most were found by
> inspection, except for get_pseries_errorlog() which was reported as
> causing a crash in testing.
> 
> 

Applied to powerpc/fixes.

[1/1] powerpc/kasan: Mark more real-mode code as not to be instrumented
  https://git.kernel.org/powerpc/c/743cdb7bd0f1cb32c03680c8b38257957db2e692

cheers

Re: [PATCH] powerpc/kasan: Force thread size increase with KASAN

2022-06-09 Thread Michael Ellerman

On Thu, 2 Jun 2022 00:31:14 +1000, Michael Ellerman wrote:
> KASAN causes increased stack usage, which can lead to stack overflows.
> 
> The logic in Kconfig to suggest a larger default doesn't work if a user
> has CONFIG_EXPERT enabled and has an existing .config with a smaller
> value.
> 
> Follow the lead of x86 and arm64, and force the thread size to be
> increased when KASAN is enabled.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/kasan: Force thread size increase with KASAN
  https://git.kernel.org/powerpc/c/3e8635fb2e072672cbc650989ffedf8300ad67fb

cheers

Re: [PATCH] powerpc: Don't select HAVE_IRQ_EXIT_ON_IRQ_STACK

2022-06-09 Thread Michael Ellerman

On Wed, 25 May 2022 13:26:39 +1000, Michael Ellerman wrote:
> The HAVE_IRQ_EXIT_ON_IRQ_STACK option tells generic code that irq_exit()
> is called while still running on the hard irq stack (hardirq_ctx[] in
> the powerpc code).
> 
> Selecting the option means the generic code will *not* switch to the
> softirq stack before running softirqs, because the code is already
> running on the (mostly empty) hard irq stack.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc: Don't select HAVE_IRQ_EXIT_ON_IRQ_STACK
  https://git.kernel.org/powerpc/c/1346d00e1bdfd4067f92bc14e8a6131a01de4190

cheers

Re: [PATCH] powerpc/papr_scm: don't requests stats with '0' sized stats buffer

2022-06-09 Thread Michael Ellerman

On Tue, 24 May 2022 16:53:53 +0530, Vaibhav Jain wrote:
> Sachin reported [1] that on a POWER-10 lpar he is seeing a kernel panic being
> reported with vPMEM when papr_scm probe is being called. The panic is of the
> form below and is observed only with following option disabled(profile) for 
> the
> said LPAR 'Enable Performance Information Collection' in the HMC:
> 
>  Kernel attempted to write user page (1c) - exploit attempt? (uid: 0)
>  BUG: Kernel NULL pointer dereference on write at 0x001c
>  Faulting instruction address: 0xc00801b90844
>  Oops: Kernel access of bad area, sig: 11 [#1]
> 
>  NIP [c00801b90844] drc_pmem_query_stats+0x5c/0x270 [papr_scm]
>  LR [c00801b92794] papr_scm_probe+0x2ac/0x6ec [papr_scm]
>  Call Trace:
>0xc941bca0 (unreliable)
>papr_scm_probe+0x2ac/0x6ec [papr_scm]
>platform_probe+0x98/0x150
>really_probe+0xfc/0x510
>__driver_probe_device+0x17c/0x230
> 
>  ---[ end trace  ]---
>  Kernel panic - not syncing: Fatal exception
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/papr_scm: don't requests stats with '0' sized stats buffer
  https://git.kernel.org/powerpc/c/07bf9431b1590d1cd7a8d62075d0b50b073f0495

cheers

Re: [PATCH] powerpc/32: Fix overread/overwrite of thread_struct via ptrace

2022-06-09 Thread Michael Ellerman

On Thu, 9 Jun 2022 23:32:45 +1000, Michael Ellerman wrote:
> The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
> to read/write registers of another process.
> 
> To get/set a register, the API takes an index into an imaginary address
> space called the "USER area", where the registers of the process are
> laid out in some fashion.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/32: Fix overread/overwrite of thread_struct via ptrace
  https://git.kernel.org/powerpc/c/8e127846fc97778a5e5c99bca1ce0bbc5ec9

cheers

Re: [PATCH V2] ASoC: imx-audmux: remove unnecessary check of clk_disable_unprepare/clk_prepare_enable

2022-06-09 Thread Mark Brown

On Mon, 6 Jun 2022 03:37:05 +, cgel@gmail.com wrote:
> From: Minghao Chi 
> 
> Because clk_disable_unprepare/clk_prepare_enable already checked NULL clock
> parameter, so the additional checks are unnecessary, just remove them.
> 
> 

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: imx-audmux: remove unnecessary check of 
clk_disable_unprepare/clk_prepare_enable
  commit: 142d456204cf4dabe18be59e043d806440f609d4

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

[PATCH] powerpc/32: Fix overread/overwrite of thread_struct via ptrace

2022-06-09 Thread Michael Ellerman

The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.

To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.

The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.

The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.

The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.

Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.

To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.

Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.

On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields than can be overwritten,
including some pointers. As such it may be exploitable.

It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
  
https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40...@keymile.com/

Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.

Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.

Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers 
in little endian builds")
Cc: sta...@vger.kernel.org # v3.13+
Reported-by: Ariel Miculas 
Tested-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/ptrace/ptrace-fpu.c | 20 ++--
 arch/powerpc/kernel/ptrace/ptrace.c |  3 +++
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c 
b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
index 5dca19361316..09c49632bfe5 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
@@ -17,9 +17,13 @@ int ptrace_get_fpr(struct task_struct *child, int index, 
unsigned long *data)
 
 #ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
-   if (fpidx < (PT_FPSCR - PT_FPR0))
-   memcpy(data, >thread.TS_FPR(fpidx), sizeof(long));
-   else
+   if (fpidx < (PT_FPSCR - PT_FPR0)) {
+   if (IS_ENABLED(CONFIG_PPC32))
+   // On 32-bit the index we are passed refers to 32-bit 
words
+   *data = ((u32 *)child->thread.fp_state.fpr)[fpidx];
+   else
+   memcpy(data, >thread.TS_FPR(fpidx), 
sizeof(long));
+   } else
*data = child->thread.fp_state.fpscr;
 #else
*data = 0;
@@ -39,9 +43,13 @@ int ptrace_put_fpr(struct task_struct *child, int index, 
unsigned long data)
 
 #ifdef CONFIG_PPC_FPU_REGS
flush_fp_to_thread(child);
-   if (fpidx < (PT_FPSCR - PT_FPR0))
-   memcpy(>thread.TS_FPR(fpidx), , sizeof(long));
-   else
+   if (fpidx < (PT_FPSCR - PT_FPR0)) {
+   if (IS_ENABLED(CONFIG_PPC32))
+   // On 32-bit the index we are passed refers to 32-bit 
words
+   ((u32 *)child->thread.fp_state.fpr)[fpidx] = data;
+   else
+   memcpy(>thread.TS_FPR(fpidx), , 
sizeof(long));
+   } else
child->thread.fp_state.fpscr = data;
 #endif
 
diff --git a/arch/powerpc/kernel/ptrace/ptrace.c 
b/arch/powerpc/kernel/ptrace/ptrace.c
index 4d2dc22d4a2d..5d7a72b41ae7 100644
--- a/arch/powerpc/kernel/ptrace/ptrace.c
+++ b/arch/powerpc/kernel/ptrace/ptrace.c
@@ -444,4 +444,7 @@ void __init pt_regs_check(void)
 * real registers.
 */
BUILD_BUG_ON(PT_DSCR < sizeof(struct user_pt_regs) / sizeof(unsigned 
long));
+
+   // ptrace_get/put_fpr() rely on PPC32 and VSX being

Re: [PATCH 2/2] uio:powerpc:mpc85xx: l2-cache-sram uio driver implementation

2022-06-09 Thread Greg KH

On Thu, Jun 09, 2022 at 03:28:55AM -0700, Wang Wenhu wrote:
> The l2-cache could be optionally configured as SRAM partly or fully.
> Users can make use of it as a block of independent memory that offers
> special usage, such as for debuging or other cratical status info
> storage which keeps consistently even when the whole system crashed.
> 
> The hardware related configuration process utilized the work of the
> earlier implementation, which has been removed now.
> See: 
> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/commit/?id=dc21ed2aef4150fc2fcf58227a4ff24502015c03
> 
> Cc: Christophe Leroy 
> Signed-off-by: Wang Wenhu 
> ---
>  drivers/uio/Kconfig   |  10 +
>  drivers/uio/Makefile  |   1 +
>  drivers/uio/uio_fsl_85xx_cache_sram.c | 286 ++
>  3 files changed, 297 insertions(+)
>  create mode 100644 drivers/uio/uio_fsl_85xx_cache_sram.c
> 
> diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig
> index 2e16c5338e5b..9199ced03880 100644
> --- a/drivers/uio/Kconfig
> +++ b/drivers/uio/Kconfig
> @@ -105,6 +105,16 @@ config UIO_NETX
> To compile this driver as a module, choose M here; the module
> will be called uio_netx.
>  
> +config UIO_FSL_85XX_CACHE_SRAM
> + tristate "Freescale 85xx Cache-Sram driver"
> + depends on FSL_SOC_BOOKE && PPC32
> + help
> +   Generic driver for accessing the Cache-Sram form user level. This
> +   is extremely helpful for some user-space applications that require
> +   high performance memory accesses.
> +
> +   If you don't know what to do here, say N.

Module name information?

> +
>  config UIO_FSL_ELBC_GPCM
>   tristate "eLBC/GPCM driver"
>   depends on FSL_LBC
> diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile
> index f2f416a14228..1ba07d92a1b1 100644
> --- a/drivers/uio/Makefile
> +++ b/drivers/uio/Makefile
> @@ -12,3 +12,4 @@ obj-$(CONFIG_UIO_MF624) += uio_mf624.o
>  obj-$(CONFIG_UIO_FSL_ELBC_GPCM)  += uio_fsl_elbc_gpcm.o
>  obj-$(CONFIG_UIO_HV_GENERIC) += uio_hv_generic.o
>  obj-$(CONFIG_UIO_DFL)+= uio_dfl.o
> +obj-$(CONFIG_UIO_FSL_85XX_CACHE_SRAM)+= uio_fsl_85xx_cache_sram.o
> diff --git a/drivers/uio/uio_fsl_85xx_cache_sram.c 
> b/drivers/uio/uio_fsl_85xx_cache_sram.c
> new file mode 100644
> index ..d363f9d2b179
> --- /dev/null
> +++ b/drivers/uio/uio_fsl_85xx_cache_sram.c
> @@ -0,0 +1,286 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2022 Wang Wenhu 
> + * All rights reserved.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define DRIVER_NAME  "uio_mpc85xx_cache_sram"
> +#define UIO_INFO_VER "0.0.1"
> +#define UIO_NAME "uio_cache_sram"
> +
> +#define L2CR_L2FI0x4000  /* L2 flash 
> invalidate */
> +#define L2CR_L2IO0x0020  /* L2 
> instruction only */
> +#define L2CR_SRAM_ZERO   0x  /* L2SRAM zero 
> size */
> +#define L2CR_SRAM_FULL   0x0001  /* L2SRAM full 
> size */
> +#define L2CR_SRAM_HALF   0x0002  /* L2SRAM half 
> size */
> +#define L2CR_SRAM_TWO_HALFS  0x0003  /* L2SRAM two half 
> sizes */
> +#define L2CR_SRAM_QUART  0x0004  /* L2SRAM one 
> quarter size */
> +#define L2CR_SRAM_TWO_QUARTS 0x0005  /* L2SRAM two quarter size */
> +#define L2CR_SRAM_EIGHTH 0x0006  /* L2SRAM one eighth 
> size */
> +#define L2CR_SRAM_TWO_EIGHTH 0x0007  /* L2SRAM two eighth size */
> +
> +#define L2SRAM_OPTIMAL_SZ_SHIFT  0x0003  /* Optimum size for 
> L2SRAM */
> +
> +#define L2SRAM_BAR_MSK_LO18  0xC000  /* Lower 18 bits */
> +#define L2SRAM_BARE_MSK_HI4  0x000F  /* Upper 4 bits */
> +
> +enum cache_sram_lock_ways {
> + LOCK_WAYS_ZERO,
> + LOCK_WAYS_EIGHTH,
> + LOCK_WAYS_TWO_EIGHTH,

Why not have values for these?

> + LOCK_WAYS_HALF = 4,
> + LOCK_WAYS_FULL = 8,
> +};
> +
> +struct mpc85xx_l2ctlr {
> + u32 ctl;/* 0x000 - L2 control */

What is the endian of these u32 values?  You map them directly to
memory, so they must be specified some way, right?  Please make it
obvious what they are.

> + u8  res1[0xC];
> + u32 ewar0;  /* 0x010 - External write address 0 */
> + u32 ewarea0;/* 0x014 - External write address extended 0 */
> + u32 ewcr0;  /* 0x018 - External write ctrl */
> + u8  res2[4];
> + u32 ewar1;  /* 0x020 - External write address 1 */
> + u32 ewarea1;/* 0x024 - External write address extended 1 */
> + u32 ewcr1;  /* 0x028 - External write ctrl 1 */
> + u8  res3[4];
> + u32 ewar2;  /* 0x030 - External write address 2 */
> + u32

Re: [PATCH 2/6] powerpc: Provide syscall wrapper

2022-06-09 Thread Christophe Leroy





Le 01/06/2022 à 10:29, Christophe Leroy a écrit :



Le 01/06/2022 à 07:48, Rohan McLure a écrit :
[Vous ne recevez pas souvent de courriers de la part de 
rmcl...@linux.ibm.com. Découvrez pourquoi cela peut être important à 
l'adresse https://aka.ms/LearnAboutSenderIdentification.]


Syscall wrapper implemented as per s390, x86, arm64, providing the
option for gprs to be cleared on entry to the kernel, reducing caller
influence influence on speculation within syscall routine. The wrapper
is a macro that emits syscall handler implementations with parameters
passed by stack pointer.


Passing parameters by stack is going to be sub-optimal. Did you make any 
measurement of the implied performance degradation ? We usually use the 
null_syscall selftest for that everytime we touch syscall entries/exits.


I did a test with null_syscall on an 8xx. Surprisingly I get more than 
20% improvement with your series.


Looking at the generated code in more details, we see that 
system_call_exception() is lighter as now no stack frame is needed, the 
compiler has enough registers available.


Before the patch:

c000c9ec :
c000c9ec:   94 21 ff f0 stwur1,-16(r1)
c000c9f0:   93 e1 00 0c stw r31,12(r1)
c000c9f4:   7d 5f 53 78 mr  r31,r10
c000c9f8:   81 4a 00 84 lwz r10,132(r10)
c000c9fc:   90 7f 00 88 stw r3,136(r31)
c000ca00:   71 4b 00 02 andi.   r11,r10,2
c000ca04:   41 82 00 4c beq c000ca50 
c000ca08:   71 4b 40 00 andi.   r11,r10,16384
c000ca0c:   41 82 00 50 beq c000ca5c 
c000ca10:   71 4a 80 00 andi.   r10,r10,32768
c000ca14:   41 82 00 54 beq c000ca68 
c000ca18:   7c 50 13 a6 mtspr   80,r2
c000ca1c:   81 42 00 4c lwz r10,76(r2)
c000ca20:   71 4a 84 91 andi.   r10,r10,33937
c000ca24:   40 82 00 50 bne c000ca74 
c000ca28:   28 09 01 c2 cmplwi  r9,450
c000ca2c:   41 81 00 88 bgt c000cab4 
c000ca30:   3d 40 c0 6f lis r10,-16273
c000ca34:   55 29 10 3a rlwinm  r9,r9,2,0,29
c000ca38:   39 4a c1 c5 addir10,r10,-15931
c000ca3c:   7d 2a 48 2e lwzxr9,r10,r9
c000ca40:   83 e1 00 0c lwz r31,12(r1)
c000ca44:   7d 29 03 a6 mtctr   r9
c000ca48:   38 21 00 10 addir1,r1,16
c000ca4c:   4e 80 04 20 bctr
...

After the patch:
c000cc94 :
c000cc94:   81 24 00 84 lwz r9,132(r4)
c000cc98:   81 44 00 0c lwz r10,12(r4)
c000cc9c:   71 28 00 02 andi.   r8,r9,2
c000cca0:   91 44 00 88 stw r10,136(r4)
c000cca4:   41 82 00 48 beq c000ccec 
c000cca8:   71 2a 40 00 andi.   r10,r9,16384
c000ccac:   41 82 00 44 beq c000ccf0 
c000ccb0:   71 29 80 00 andi.   r9,r9,32768
c000ccb4:   41 82 00 40 beq c000ccf4 
c000ccb8:   7c 50 13 a6 mtspr   80,r2
c000ccbc:   81 22 00 4c lwz r9,76(r2)
c000ccc0:   71 29 84 91 andi.   r9,r9,33937
c000ccc4:   40 82 00 34 bne c000ccf8 
c000ccc8:   28 03 01 c2 cmplwi  r3,450
c000:   41 81 00 78 bgt c000cd44 
c000ccd0:   3d 20 c0 70 lis r9,-16272
c000ccd4:   54 63 10 3a rlwinm  r3,r3,2,0,29
c000ccd8:   39 29 81 c5 addir9,r9,-32315
c000ccdc:   7d 29 18 2e lwzxr9,r9,r3
c000cce0:   7c 83 23 78 mr  r3,r4
c000cce4:   7d 29 03 a6 mtctr   r9
c000cce8:   4e 80 04 20 bctr
...





Why going via stack ? The main advantage of a RISC processor like 
powerpc is that, unlike x86, there are enough registers to avoid going 
through memory. RISC processors are optimised with three operands 
operations and many registers, and usually have slow memory in return.


Well, thinking about it once more. In fact registers are saved to the 
stack anyway. At the start of syscall functions they are likely to still 
be hot in the cache, so reading them back is just a few cycles. And it 
eventually provide the compiler the opportunity to organise stuff better.







For platforms supporting this syscall wrapper, emit symbols with usual
in-register parameters (`sys...`) to support calls to syscall handlers
from within the kernel.

Syscalls are wrapped on all platforms except Cell processor. SPUs require
access syscall prototypes which are omitted with ARCH_HAS_SYSCALL_WRAPPER
enabled.
This commit message isn't very clear, please describe in more details 
what is done, how and why.





Christophe

Re: [PATCH 08/10] scsi/ibmvfc: Replace tasklet with work

2022-06-09 Thread Sebastian Andrzej Siewior

On 2022-05-30 16:15:10 [-0700], Davidlohr Bueso wrote:
> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
> index d0eab5700dc5..31b1900489e7 100644
> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
> @@ -891,7 +891,7 @@ static void ibmvfc_release_crq_queue(struct ibmvfc_host 
> *vhost)
>  
>   ibmvfc_dbg(vhost, "Releasing CRQ\n");
>   free_irq(vdev->irq, vhost);
> - tasklet_kill(>tasklet);
> +cancel_work_sync(>work);
s/ {8}/\t/

is there a reason not to use threaded interrupts? The workqueue _might_
migrate to another CPU. The locking ensures that nothing bad happens but
ibmvfc_tasklet() has this piece:

| spin_lock_irqsave(vhost->host->host_lock, flags);
…
| while ((async = ibmvfc_next_async_crq(vhost)) != NULL) {
| ibmvfc_handle_async(async, vhost);
| async->valid = 0;
| wmb();
| }
…
| vio_enable_interrupts(vdev);
potentially enables interrupts which fires right away.

| if ((async = ibmvfc_next_async_crq(vhost)) != NULL) {
| vio_disable_interrupts(vdev);

disables it again.

| }
| 
| spin_unlock(vhost->crq.q_lock);
| spin_unlock_irqrestore(vhost->host->host_lock, flags);

If the worker runs on another CPU then the CPU servicing the interrupt
will be blocked on the lock which is not nice.

My guess is that you could enable interrupts right before unlocking but
is a different story.

>   do {
>   if (rc)
>   msleep(100);

Sebastian

[PATCH] powerpc: Enable execve syscall exit tracepoint

2022-06-09 Thread Naveen N. Rao

On execve[at], we are zero'ing out most of the thread register state
including gpr[0], which contains the syscall number. Due to this, we
fail to trigger the syscall exit tracepoint properly. Fix this by
retaining gpr[0] in the thread register state.

Before this patch:
  # tail /sys/kernel/debug/tracing/trace
   cat-123 [000] .61.449351: sys_execve(filename:
  7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8)
   cat-124 [000] .62.428481: sys_execve(filename:
  7fffa6b23448, argv: 7fffa6b233e0, envp: 7fffa6b233f8)
  echo-125 [000] .65.813702: sys_execve(filename:
  7fffa6b23378, argv: 7fffa6b233a0, envp: 7fffa6b233b0)
  echo-125 [000] .65.822214: sys_execveat(fd: 0,
  filename: 1009ac48, argv: 765d0c98, envp: 765d0ca8, flags: 0)

After this patch:
  # tail /sys/kernel/debug/tracing/trace
   cat-127 [000] .   100.416262: sys_execve(filename:
  7fffa41b3448, argv: 7fffa41b33e0, envp: 7fffa41b33f8)
   cat-127 [000] .   100.418203: sys_execve -> 0x0
  echo-128 [000] .   103.873968: sys_execve(filename:
  7fffa41b3378, argv: 7fffa41b33a0, envp: 7fffa41b33b0)
  echo-128 [000] .   103.875102: sys_execve -> 0x0
  echo-128 [000] .   103.882097: sys_execveat(fd: 0,
  filename: 1009ac48, argv: 7fffd10d2148, envp: 7fffd10d2158, flags: 0)
  echo-128 [000] .   103.883225: sys_execveat -> 0x0

Cc: sta...@vger.kernel.org
Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/kernel/process.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index d00b20c6596671..bb4da23ecdd7c2 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1854,7 +1854,7 @@ void start_thread(struct pt_regs *regs, unsigned long 
start, unsigned long sp)
tm_reclaim_current(0);
 #endif
 
-   memset(regs->gpr, 0, sizeof(regs->gpr));
+   memset(>gpr[1], 0, sizeof(regs->gpr) - sizeof(regs->gpr[0]));
regs->ctr = 0;
regs->link = 0;
regs->xer = 0;

base-commit: 16332b7fbbe46581ddac80c6d32834c1269bc450
-- 
2.36.1

Re: [PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

2022-06-09 Thread Pali Rohár

On Tuesday 24 May 2022 11:17:56 Pali Rohár wrote:
> On Friday 06 May 2022 00:33:02 Pali Rohár wrote:
> > On Thursday 05 May 2022 15:10:01 Tyrel Datwyler wrote:
> > > On 5/5/22 02:31, Pali Rohár wrote:
> > > > Hello!
> > > > 
> > > > On Thursday 05 May 2022 07:16:40 Christophe Leroy wrote:
> > > >> Le 04/05/2022 à 19:57, Pali Rohár a écrit :
> > > >>> Since commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number 
> > > >>> based on
> > > >>> device-tree properties"), powerpc kernel always fallback to PCI domain
> > > >>> assignment from OF / Device Tree 'reg' property of the PCI controller.
> > > >>>
> > > >>> PCI code for other Linux architectures use increasing assignment of 
> > > >>> the PCI
> > > >>> domain for individual controllers (assign the first free number), 
> > > >>> like it
> > > >>> was also for powerpc prior mentioned commit.
> > > >>>
> > > >>> Upgrading powerpc kernels from LTS 4.4 version (which does not contain
> > > >>> mentioned commit) to new LTS versions brings a regression in domain
> > > >>> assignment.
> > > >>
> > > >> Can you elaborate why it is a regression ?
> > > >>63a72284b159
> > > >> That commit says 'no functionnal changes', I'm having hard time 
> > > >> understanding how a nochange can be a regression.
> > > > 
> > > > It is not 'no functional change'. That commit completely changed PCI
> > > > domain assignment in a way that is incompatible with other architectures
> > > > and also incompatible with the way how it was done prior that commit.
> > > 
> > > I agree that the "no functional change" statement is incorrect. However, 
> > > for
> > > most powerpc platforms it ended up being simply a cosmetic behavior 
> > > change. As
> > > far as I can tell there is nothing requiring domain ids to increase 
> > > montonically
> > > from zero or that each architecture is required to use the same domain 
> > > numbering
> > > scheme.
> > 
> > That is truth. But it looks really suspicious why domains are not
> > assigned monotonically. Some scripts / applications are using PCI
> > location (domain:bus:dev:func) for remembering PCI device and domain
> > change can cause issue for config files. And some (older) applications
> > expects existence of domain zero. In systems without hot plug support
> > with small number of domains (e.g. 3) it means that there are always
> > domains 0, 1 and 2.
> > 
> > > Its hard to call this a true regression unless it actually broke
> > > something. The commit in question has been in the kernel since 4.8 which 
> > > was
> > > released over 5 1/2 years ago.
> > 
> > I agree, it really depends on how you look at it.
> > 
> > The important is that lot of people are using LTS versions and are doing
> > upgrades when LTS support is dropped. Which for 4.4 now happened. So not
> > all smaller or "cosmetic" changes could be detected until longer LTS
> > period pass.
> > 
> > > With all that said looking closer at the code in question I think it is 
> > > fair to
> > > assume that the author only intended this change for powernv and pseries
> > > platforms and not every powerpc platform. That change was done to make
> > > persistent naming easier to manage in userspace.
> > 
> > I agree that this behavior change may be useful in some situations and I
> > do not object this need.
> > 
> > > Your change defaults back to
> > > the old behavior which will now break both powernv and pseries platforms 
> > > with
> > > regard to hotplugging and persistent naming.
> > 
> > I was aware of it, that change could cause issues. And that is why I
> > added config option for choosing behavior. So users would be able to
> > choose what they need.
> > 
> > > We could properly limit it to powernv and pseries by using ibm,fw-phb-id 
> > > instead
> > > of reg property in the look up that follows a failed ibm,opal-phbid 
> > > lookup. I
> > > think this is acceptable as long as no other powerpc platforms have 
> > > started
> > > using this behavior for persistent naming.
> > 
> > And what about setting that new config option to enabled by default for
> > those series?
> > 
> > Or is there issue with introduction of the new config option?
> 
> PING? Any opinion?

PING?

> > One of the point is that it is really a good idea to have similar/same
> > behavior for all linux platforms. And if it cannot be enabled by default
> > (for backward compatibility) add at least some option, so new platforms
> > can start using it or users can decide to switch behavior.
> > 
> > > -Tyrel
> > > 
> > > > For example, prior that commit on P2020 RDB board were PCI domains 0, 1 
> > > > and 2.
> > > > 
> > > > $ lspci
> > > > :00:00.0 PCI bridge: Freescale Semiconductor Inc P2020E (rev 21)
> > > > :01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 
> > > > 3.0 xHCI Host Controller (rev 02)
> > > > 0001:02:00.0 PCI bridge: Freescale Semiconductor Inc P2020E (rev 21)
> > > > 0001:03:00.0 Network controller: Qualcomm Atheros AR93xx Wireless 
> > > > Network Adapter

Re: [PATCH] powerpc: dts: Add DTS file for CZ.NIC Turris 1.x routers

2022-06-09 Thread Pali Rohár

On Tuesday 24 May 2022 11:23:32 Pali Rohár wrote:
> On Wednesday 11 May 2022 16:37:12 Pali Rohár wrote:
> > CZ.NIC Turris 1.0 and 1.1 are open source routers, they have dual-core
> > PowerPC Freescale P2020 CPU and are based on Freescale P2020RDB-PC-A board.
> > Hardware design is fully open source, all firmware and hardware design
> > files are available at Turris project website:
> > 
> > https://docs.turris.cz/hw/turris-1x/turris-1x/
> > https://project.turris.cz/en/hardware.html
> > 
> > Signed-off-by: Pali Rohár 
> > ---
> >  arch/powerpc/boot/dts/turris1x.dts | 470 +
> >  1 file changed, 470 insertions(+)
> >  create mode 100644 arch/powerpc/boot/dts/turris1x.dts
> 
> Michael, Rob: PING?

PING?

> > diff --git a/arch/powerpc/boot/dts/turris1x.dts 
> > b/arch/powerpc/boot/dts/turris1x.dts
> > new file mode 100644
> > index ..2a624f117586
> > --- /dev/null
> > +++ b/arch/powerpc/boot/dts/turris1x.dts
> > @@ -0,0 +1,470 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +/*
> > + * Turris 1.x Device Tree Source
> > + *
> > + * Copyright 2013 - 2022 CZ.NIC z.s.p.o. (http://www.nic.cz/)
> > + *
> > + * Pinout, Schematics and Altium hardware design files are open source
> > + * and available at: https://docs.turris.cz/hw/turris-1x/turris-1x/
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +/include/ "fsl/p2020si-pre.dtsi"
> > +
> > +/ {
> > +   model = "Turris 1.x";
> > +   compatible = "cznic,turris1x", "fsl,P2020RDB-PC"; /* fsl,P2020RDB-PC is 
> > required for booting Linux */
> > +
> > +   aliases {
> > +   ethernet0 = 
> > +   ethernet1 = 
> > +   ethernet2 = 
> > +   serial0 = 
> > +   serial1 = 
> > +   pci0 = 
> > +   pci1 = 
> > +   pci2 = 
> > +   spi0 = 
> > +   };
> > +
> > +   memory {
> > +   device_type = "memory";
> > +   };
> > +
> > +   soc: soc@ffe0 {
> > +   ranges = <0x0 0x0 0xffe0 0x0010>;
> > +
> > +   i2c@3000 {
> > +   /* PCA9557PW GPIO controller for boot config */
> > +   gpio-controller@18 {
> > +   compatible = "nxp,pca9557";
> > +   label = "bootcfg";
> > +   reg = <0x18>;
> > +   #gpio-cells = <2>;
> > +   gpio-controller;
> > +   polarity = <0x00>;
> > +   };
> > +
> > +   /* STM32F030R8T6 MCU for power control */
> > +   power-control@32 {
> > +   /*
> > +* Turris Power Control firmware runs on 
> > STM32F0 MCU.
> > +* This firmware is open source and available 
> > at:
> > +* 
> > https://gitlab.nic.cz/turris/hw/turris_power_control
> > +*/
> > +   reg = <0x32>;
> > +   };
> > +
> > +   /* SA56004ED temperature control */
> > +   temperature-sensor@4c {
> > +   compatible = "nxp,sa56004";
> > +   reg = <0x4c>;
> > +   interrupt-parent = <>;
> > +   interrupts = <12 IRQ_TYPE_LEVEL_LOW>, /* GPIO12 
> > - ALERT pin */
> > +<13 IRQ_TYPE_LEVEL_LOW>; /* GPIO13 
> > - CRIT pin */
> > +   };
> > +
> > +   /* DDR3 SPD/EEPROM */
> > +   eeprom@52 {
> > +   compatible = "atmel,spd";
> > +   reg = <0x52>;
> > +   };
> > +
> > +   /* ATSHA204-TH-DA-T crypto module */
> > +   crypto@64 {
> > +   compatible = "atmel,atsha204";
> > +   reg = <0x64>;
> > +   };
> > +
> > +   /* IDT6V49205BNLGI clock generator */
> > +   clock-generator@69 {
> > +   compatible = "idt,6v49205b";
> > +   reg = <0x69>;
> > +   };
> > +
> > +   /* MCP79402-I/ST Protected EEPROM */
> > +   eeprom@57 {
> > +   reg = <0x57>;
> > +   };
> > +
> > +   /* MCP79402-I/ST RTC */
> > +   rtc@6f {
> > +   compatible = "microchip,mcp7940x";
> > +   reg = <0x6f>;
> > +   interrupt-parent = <>;
> > +   interrupts = <14 0>; /* GPIO14 - MFP pin */
> > +   };
> > +   };
> > +
> > +   /* SPI on connector P1 */
> > +   spi0: spi@7000 {
> > +   };
> > +
> > +   gpio: gpio-controller@fc00 {
> > +   #interrupt-cells = <2>;
> > +

[PATCH v2 2/3] powerpc/irq: Perform stack_overflow detection after switching to IRQ stack

2022-06-09 Thread Christophe Leroy

When KASAN is enabled, as shown by the Oops below, the 2k limit is not
enough to allow stack dump after a stack overflow detection when
CONFIG_DEBUG_STACKOVERFLOW is selected:

do_IRQ: stack overflow: 1984
CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1
Call Trace:
Oops: Kernel stack overflow, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in: sr_mod cdrom radeon(+) ohci_pci(+) hwmon 
i2c_algo_bit drm_ttm_helper ttm drm_dp_helper snd_aoa_i2sbus snd_aoa_soundbus 
snd_pcm ehci_pci snd_timer ohci_hcd snd ssb ehci_hcd 8250_pci soundcore 
drm_kms_helper pcmcia 8250 pcmcia_core syscopyarea usbcore sysfillrect 
8250_base sysimgblt serial_mctrl_gpio fb_sys_fops usb_common pkcs8_key_parser 
fuse drm drm_panel_orientation_quirks configfs
CPU: 0 PID: 126 Comm: systemd-udevd Not tainted 5.18.0-gentoo-PMacG4 #1
NIP:  c02e5558 LR: c07eb3bc CTR: c07f46a8
REGS: e7fe9f50 TRAP:    Not tainted  (5.18.0-gentoo-PMacG4)
MSR:  1032   CR: 44a14824  XER: 2000

GPR00: c07eb3bc eaa1c000 c26baea0 eaa1c0a0 0008  c07eb3bc 
eaa1c010
GPR08: eaa1c0a8 04f3f3f3 f1f1f1f1 c07f4c84 44a14824 0080f7e4 0005 
0010
GPR16: 0025 eaa1c154 eaa1c158 c0dbad64 0020 fd543810 eaa1c0a0 
eaa1c29e
GPR24: c0dbad44 c0db8740 05ff fd543802 eaa1c150 c0c9a3c0 eaa1c0a0 
c0c9a3c0
NIP [c02e5558] kasan_check_range+0xc/0x2b4
LR [c07eb3bc] format_decode+0x80/0x604
Call Trace:
[eaa1c000] [c07eb3bc] format_decode+0x80/0x604 (unreliable)
[eaa1c070] [c07f4dac] vsnprintf+0x128/0x938
[eaa1c110] [c07f5788] sprintf+0xa0/0xc0
[eaa1c180] [c0154c1c] __sprint_symbol.constprop.0+0x170/0x198
[eaa1c230] [c07ee71c] symbol_string+0xf8/0x260
[eaa1c430] [c07f46d0] pointer+0x15c/0x710
[eaa1c4b0] [c07f4fbc] vsnprintf+0x338/0x938
[eaa1c550] [c00e8fa0] vprintk_store+0x2a8/0x678
[eaa1c690] [c00e94e4] vprintk_emit+0x174/0x378
[eaa1c6d0] [c00ea008] _printk+0x9c/0xc0
[eaa1c750] [c000ca94] show_stack+0x21c/0x260
[eaa1c7a0] [c07d0bd4] dump_stack_lvl+0x60/0x90
[eaa1c7c0] [c0009234] __do_IRQ+0x170/0x174
[eaa1c800] [c0009258] do_IRQ+0x20/0x34
[eaa1c820] [c00045b4] HardwareInterrupt_virt+0x108/0x10c
...

As the detection is asynchronously performed at IRQs, we could be long
after the limit has been crossed, so increasing the limit would not
solve the problem completely.

In order to be sure that there is enough stack space for the stack
dump, do it after the switch to the IRQ stack. That way it is sure
that the stack is large enough, unless the IRQ stack has been
overfilled in which case the end of life is close.

Reported-by: Erhard Furtner 
Signed-off-by: Christophe Leroy 
---
v2: Use provided 'sp' instead of overwritting it with current stack pointer
---
 arch/powerpc/kernel/irq.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 370434f6c316..e5b7fb5282ee 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -184,14 +184,12 @@ u64 arch_irq_stat_cpu(unsigned int cpu)
return sum;
 }
 
-static inline void check_stack_overflow(void)
+static inline void check_stack_overflow(unsigned long sp)
 {
-   long sp;
-
if (!IS_ENABLED(CONFIG_DEBUG_STACKOVERFLOW))
return;
 
-   sp = current_stack_pointer & (THREAD_SIZE - 1);
+   sp &= THREAD_SIZE - 1;
 
/* check for stack overflow: is there less than 2KB free? */
if (unlikely(sp < 2048)) {
@@ -221,12 +219,14 @@ static __always_inline void call_do_softirq(const void 
*sp)
 
 DEFINE_STATIC_CALL_RET0(ppc_get_irq, *ppc_md.get_irq);
 
-static void __do_irq(struct pt_regs *regs)
+static void __do_irq(struct pt_regs *regs, unsigned long oldsp)
 {
unsigned int irq;
 
trace_irq_entry(regs);
 
+   check_stack_overflow(oldsp);
+
/*
 * Query the platform PIC for the interrupt & ack it.
 *
@@ -254,6 +254,7 @@ static __always_inline void call_do_irq(struct pt_regs 
*regs, void *sp)
/* Temporarily switch r1 to sp, call __do_irq() then restore r1. */
asm volatile (
 PPC_STLU " %%r1, %[offset](%[sp])  ;"
+   "mr %%r4, %%r1  ;"
"mr %%r1, %[sp] ;"
"bl %[callee]   ;"
 PPC_LL "   %%r1, 0(%%r1)   ;"
@@ -279,11 +280,9 @@ void __do_IRQ(struct pt_regs *regs)
irqsp = hardirq_ctx[raw_smp_processor_id()];
sirqsp = softirq_ctx[raw_smp_processor_id()];
 
-   check_stack_overflow();
-
/* Already there ? */
if (unlikely(cursp == irqsp || cursp == sirqsp)) {
-   __do_irq(regs);
+   __do_irq(regs,

[PATCH v2 1/3] powerpc/irq: Make __do_irq() static

2022-06-09 Thread Christophe Leroy

Since commit 48cf12d88969 ("powerpc/irq: Inline call_do_irq() and
call_do_softirq()"), __do_irq() is not used outside irq.c

Reorder functions and make __do_irq() static and
drop the declaration in irq.h.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/irq.h |  1 -
 arch/powerpc/kernel/irq.c  | 46 +-
 2 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/irq.h b/arch/powerpc/include/asm/irq.h
index 13f0409dd617..5c1516a5ba8f 100644
--- a/arch/powerpc/include/asm/irq.h
+++ b/arch/powerpc/include/asm/irq.h
@@ -54,7 +54,6 @@ extern void *softirq_ctx[NR_CPUS];
 
 void __do_IRQ(struct pt_regs *regs);
 extern void __init init_IRQ(void);
-extern void __do_irq(struct pt_regs *regs);
 
 int irq_choose_cpu(const struct cpumask *mask);
 
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 873e6dffb868..370434f6c316 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -219,31 +219,9 @@ static __always_inline void call_do_softirq(const void *sp)
);
 }
 
-static __always_inline void call_do_irq(struct pt_regs *regs, void *sp)
-{
-   register unsigned long r3 asm("r3") = (unsigned long)regs;
-
-   /* Temporarily switch r1 to sp, call __do_irq() then restore r1. */
-   asm volatile (
-PPC_STLU " %%r1, %[offset](%[sp])  ;"
-   "mr %%r1, %[sp] ;"
-   "bl %[callee]   ;"
-PPC_LL "   %%r1, 0(%%r1)   ;"
-: // Outputs
-  "+r" (r3)
-: // Inputs
-  [sp] "b" (sp), [offset] "i" (THREAD_SIZE - 
STACK_FRAME_OVERHEAD),
-  [callee] "i" (__do_irq)
-: // Clobbers
-  "lr", "xer", "ctr", "memory", "cr0", "cr1", "cr5", "cr6",
-  "cr7", "r0", "r4", "r5", "r6", "r7", "r8", "r9", "r10",
-  "r11", "r12"
-   );
-}
-
 DEFINE_STATIC_CALL_RET0(ppc_get_irq, *ppc_md.get_irq);
 
-void __do_irq(struct pt_regs *regs)
+static void __do_irq(struct pt_regs *regs)
 {
unsigned int irq;
 
@@ -269,6 +247,28 @@ void __do_irq(struct pt_regs *regs)
trace_irq_exit(regs);
 }
 
+static __always_inline void call_do_irq(struct pt_regs *regs, void *sp)
+{
+   register unsigned long r3 asm("r3") = (unsigned long)regs;
+
+   /* Temporarily switch r1 to sp, call __do_irq() then restore r1. */
+   asm volatile (
+PPC_STLU " %%r1, %[offset](%[sp])  ;"
+   "mr %%r1, %[sp] ;"
+   "bl %[callee]   ;"
+PPC_LL "   %%r1, 0(%%r1)   ;"
+: // Outputs
+  "+r" (r3)
+: // Inputs
+  [sp] "b" (sp), [offset] "i" (THREAD_SIZE - 
STACK_FRAME_OVERHEAD),
+  [callee] "i" (__do_irq)
+: // Clobbers
+  "lr", "xer", "ctr", "memory", "cr0", "cr1", "cr5", "cr6",
+  "cr7", "r0", "r4", "r5", "r6", "r7", "r8", "r9", "r10",
+  "r11", "r12"
+   );
+}
+
 void __do_IRQ(struct pt_regs *regs)
 {
struct pt_regs *old_regs = set_irq_regs(regs);
-- 
2.35.3

[PATCH v2 3/3] powerpc/irq: Simplify __do_irq()

2022-06-09 Thread Christophe Leroy

Remove duplicated code by implementing a proper if/else.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/irq.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index e5b7fb5282ee..b007f16ccbd3 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -280,14 +280,11 @@ void __do_IRQ(struct pt_regs *regs)
irqsp = hardirq_ctx[raw_smp_processor_id()];
sirqsp = softirq_ctx[raw_smp_processor_id()];
 
-   /* Already there ? */
-   if (unlikely(cursp == irqsp || cursp == sirqsp)) {
+   /* Already there ? If not switch stack and call */
+   if (unlikely(cursp == irqsp || cursp == sirqsp))
__do_irq(regs, current_stack_pointer);
-   set_irq_regs(old_regs);
-   return;
-   }
-   /* Switch stack and call */
-   call_do_irq(regs, irqsp);
+   else
+   call_do_irq(regs, irqsp);
 
set_irq_regs(old_regs);
 }
-- 
2.35.3

Re: [PATCH] powerpc/ptrace: Fix buffer overflow when handling PTRACE_PEEKUSER and PTRACE_POKEUSER

2022-06-09 Thread Christophe Leroy

Hi Michael,

Le 06/06/2022 à 16:45, Michael Ellerman a écrit :
> Hi Ariel,
> 
> I've added Christophe to Cc who works on ppc32.

I've added powerpc list

> 
> I haven't actually reproduced the crash with gdbserver, but I have a
> test case which shows the bug, so I've been able to confirm it and
> test a fix.
> 
> Thanks for your patch, but I wanted to fix it differently. Can you try
> the patch below and make sure it fixes the bug for you?
> 
> I've also attached the test case I've been using.
> 
> Christophe are you able to test these on some 32-bit machines? I've
> tested it in qemu and on one 32-bit machine I have here, but some more
> real testing would be good.

Yes I will test it but my CPUs have no FPU so it will be with the kernel 
software Math emulation. But it should make no difference I guess ?

Christophe

> 
> If the patch works then I'll need to do manual back ports for several of
> the stable kernels, and then once those are ready I will publish the
> patch.
> 
> cheers
> 
> 
> ---8<---
>  From eaa9a32fe38d8722d2da8773965309365805d66d Mon Sep 17 00:00:00 2001
> From: Michael Ellerman 
> Date: Tue, 7 Jun 2022 00:34:56 +1000
> Subject: [PATCH] powerpc/32: Fix overread/overwrite of thread_struct via
>   ptrace
> 
> The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
> to read/write registers of another process.
> 
> To get/set a register, the API takes an index into an imaginary address
> space called the "USER area", where the registers of the process are
> laid out in some fashion.
> 
> The kernel then maps that index to a particular register in its own data
> structures and gets/sets the value.
> 
> The API only allows a single machine-word to be read/written at a time.
> So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.
> 
> The way floating point registers (FPRs) are addressed is somewhat
> complicated, because double precision float values are 64-bit even on
> 32-bit CPUs. That means on 32-bit kernels each FPR occupies two
> word-sized locations in the USER area. On 64-bit kernels each FPR
> occupies one word-sized location in the USER area.
> 
> Internally the kernel stores the FPRs in an array of u64s, or if VSX is
> enabled, an array of pairs of u64s where one half of each pair stores
> the FPR. Which half of the pair stores the FPR depends on the kernel's
> endianness.
> 
> To handle the different layouts of the FPRs depending on VSX/no-VSX and
> big/little endian, the TS_FPR() macro was introduced.
> 
> Unfortunately the TS_FPR() macro does not take into account the fact
> that the addressing of each FPR differs between 32-bit and 64-bit
> kernels. It just takes the index into the "USER area" passed from
> userspace and indexes into the fp_state.fpr array.
> 
> On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
> the fp_state.fpr array, meaning the user can read/write 256 bytes past
> the end of the array. Because the fp_state sits in the middle of the
> thread_struct there are various fields than can be overwritten,
> including some pointers. As such it is probably exploitable.
> 
> It has also been observed to cause systems to hang or otherwise
> misbehave when using gdbserver, and is probably the root cause of this
> report which could not be easily reproduced:
>
> https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40...@keymile.com/
> 
> Rather than trying to make the TS_FPR() macro even more complicated to
> fix the bug, or add more macros, instead add a special-case for 32-bit
> kernels. This is more obvious and hopefully avoids a similar bug
> happening again in future. Note that because 32-bit kernels never have
> VSX enabled the code doesn't need to consider TS_FPRWIDTH/OFFSET at all.
> 
> Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR 
> registers in little endian builds")
> Cc: sta...@vger.kernel.org # v3.13+
> Reported-by: Ariel Miculas 
> Signed-off-by: Michael Ellerman 
> ---
>   arch/powerpc/kernel/ptrace/ptrace-fpu.c | 20 ++--
>   1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/ptrace/ptrace-fpu.c 
> b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
> index 5dca19361316..f406436a0f6c 100644
> --- a/arch/powerpc/kernel/ptrace/ptrace-fpu.c
> +++ b/arch/powerpc/kernel/ptrace/ptrace-fpu.c
> @@ -17,9 +17,13 @@ int ptrace_get_fpr(struct task_struct *child, int index, 
> unsigned long *data)
>   
>   #ifdef CONFIG_PPC_FPU_REGS
>   flush_fp_to_thread(child);
> - if (fpidx < (PT_FPSCR - PT_FPR0))
> - memcpy(data, >thread.TS_FPR(fpidx), sizeof(long));
> - else
> + if (fpidx < (PT_FPSCR - PT_FPR0)) {
> + if (IS_ENABLED(CONFIG_PPC32))
> + // The 32-bit ptrace API addresses the FPRs as 32-bit 
> words
> + *data = ((u32 *)child->thread.fp_state.fpr)[fpidx];
> + else
> + memcpy(data, >thread.TS_FPR(fpidx), 
> sizeof(long));
> +

Re: [PATCH 0/2] Disabling NMI watchdog during LPM's memory transfer

2022-06-09 Thread Laurent Dufour

On 09/06/2022, 09:45:49, Michael Ellerman wrote:
> Nathan Lynch  writes:
>> Laurent Dufour  writes:
> ...
>>
>>> There are  ongoing investigations to clarify where and how this latency is
>>> happening. I'm not excluding any other issue in the Linux kernel, but right
>>> now, this looks to be the best option to prevent system crash during
>>> LPM.
>>
>> It will prevent the likely crash mode for enterprise distros with
>> default watchdog tunables that our internal test environments happen to
>> use. But if someone were to run the same scenario with softlockup_panic
>> enabled, or with the RCU stall timeout lower than the watchdog
>> threshold, the failure mode would be different.
>>
>> Basically I'm saying:
>> * Some users may actually want the OS to panic when it's in this state,
>>   because their applications can't work correctly.
>> * But if we're going to inhibit one watchdog, we should inhibit them
>>   all.
> 
> I'm sympathetic to both of your arguments.
> 
> But I think there is a key difference between the NMI watchdog and other
> watchdogs, which is that the NMI watchdog will use the unsafe NMI to
> interrupt other CPUs, and that can cause the system to crash when other
> watchdogs would just print a backtrace.
> 
> We had the same problem with the rcu_sched stall detector until we
> changed it to use the "safe" NMI, see:
>   5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")
> 
> 
> So even if the NMI watchdog is disabled there are still the other
> watchdogs enabled, which should print backtraces by default, and if
> desired can also be configured to cause a panic.
> 
> Instead of disabling the NMI watchdog, can we instead increase the
> timeout (by how much?) during LPM, so that it is less likely to fire in
> normal usage, but is still there as a backup if the system is completely
> clogged.

That's probably doable, tweaking wd_smp_panic_timeout_tb and
wd_panic_timeout_tb when the LPM is in progress.

I'll add a new sysctl value, so administrator will have the capability to
change that and also fully disable the NMI watchdog during LPM if he want.

I've no idea what should be the default factor, I guess this will be a bit
empiric.

I'll rework my patch in that way.

cheers,
Laurent.

Re: [PATCH 0/2] Disabling NMI watchdog during LPM's memory transfer

2022-06-09 Thread Michael Ellerman

Nathan Lynch  writes:
> Laurent Dufour  writes:
...
>
>> There are  ongoing investigations to clarify where and how this latency is
>> happening. I'm not excluding any other issue in the Linux kernel, but right
>> now, this looks to be the best option to prevent system crash during
>> LPM.
>
> It will prevent the likely crash mode for enterprise distros with
> default watchdog tunables that our internal test environments happen to
> use. But if someone were to run the same scenario with softlockup_panic
> enabled, or with the RCU stall timeout lower than the watchdog
> threshold, the failure mode would be different.
>
> Basically I'm saying:
> * Some users may actually want the OS to panic when it's in this state,
>   because their applications can't work correctly.
> * But if we're going to inhibit one watchdog, we should inhibit them
>   all.

I'm sympathetic to both of your arguments.

But I think there is a key difference between the NMI watchdog and other
watchdogs, which is that the NMI watchdog will use the unsafe NMI to
interrupt other CPUs, and that can cause the system to crash when other
watchdogs would just print a backtrace.

We had the same problem with the rcu_sched stall detector until we
changed it to use the "safe" NMI, see:
  5cc05910f26e ("powerpc/64s: Wire up arch_trigger_cpumask_backtrace()")

So even if the NMI watchdog is disabled there are still the other
watchdogs enabled, which should print backtraces by default, and if
desired can also be configured to cause a panic.

Instead of disabling the NMI watchdog, can we instead increase the
timeout (by how much?) during LPM, so that it is less likely to fire in
normal usage, but is still there as a backup if the system is completely
clogged.

cheers

38 matches

Mail list logo