Re: Calculating array sizes in C - was: Re: Build regressions/improvements in v6.2-rc1

2023-01-20 Thread Michael Karcher

Hello Adrian,

Could you post a kernel patch for that? I would be happy to test it on my
SH-7785CLR board. Also, I'm going to file a bug report against GCC.


I filed the bug already. It's 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108483.


The diff is attached. It's published as CC0 in case anyone considers 
this trivial change copyrightable. This patch prevents this one specific 
warning from being upgraded to "error" even if you configure the kernel 
to use "-Werror". It still keeps it active as warning, though.


Kind regards,
  Michael Karcher
diff --git a/Makefile b/Makefile
index e09fe100efb2..b4cd075c6a19 100644
--- a/Makefile
+++ b/Makefile
@@ -870,7 +870,7 @@ stackp-flags-$(CONFIG_STACKPROTECTOR_STRONG)  := 
-fstack-protector-strong
 
 KBUILD_CFLAGS += $(stackp-flags-y)
 
-KBUILD_CPPFLAGS-$(CONFIG_WERROR) += -Werror
+KBUILD_CPPFLAGS-$(CONFIG_WERROR) += -Werror -Wno-error=sizeof-pointer-div
 KBUILD_CPPFLAGS += $(KBUILD_CPPFLAGS-y)
 KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds
 


Re: [PATCH RESEND 0/8] Resend LED patches

2023-01-20 Thread Lee Jones
On Fri, 20 Jan 2023, Arnd Bergmann wrote:

> On Fri, Jan 20, 2023, at 18:15, Lee Jones wrote:
> > On Fri, 20 Jan 2023, Arnd Bergmann wrote:
> 
> >> > Marek Behún (3):
> >> >   leds: turris-omnia: support HW controlled mode via private trigger
> >> >   leds: turris-omnia: initialize multi-intensity to full
> >> >   leds: turris-omnia: change max brightness from 255 to 1
> >> >
> >> > Pali Rohár (5):
> >> >   dt-bindings: leds: register-bit-led: Add active-low property
> >> >   leds: syscon: Implement support for active-low property
> >> >   powerpc/85xx: DTS: Add CPLD definitions for P1021RDB Combo Board CPL
> >> > Design
> >> >   dt-bindings: leds: Add cznic,turris1x-leds.yaml binding
> >> >   leds: Add support for Turris 1.x LEDs
> >> >
> >> >  .../testing/sysfs-class-led-driver-turris1x   |  31 ++
> >> >  .../bindings/leds/cznic,turris1x-leds.yaml| 118 +
> >> >  .../bindings/leds/register-bit-led.yaml   |   5 +
> >> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc.dtsi|  92 
> >> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc_32b.dts |   6 +-
> >> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc_36b.dts |   6 +-
> >> >  arch/powerpc/boot/dts/fsl/p1020rdb-pd.dts |  44 +-
> >> >  arch/powerpc/boot/dts/fsl/p1020utm-pc.dtsi|  37 ++
> >> >  arch/powerpc/boot/dts/fsl/p1020utm-pc_32b.dts |   4 +-
> >> >  arch/powerpc/boot/dts/fsl/p1020utm-pc_36b.dts |   4 +-
> >> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc.dtsi|  37 ++
> >> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc_32b.dts |   5 +-
> >> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc_36b.dts |   5 +-
> >> >  arch/powerpc/boot/dts/fsl/p2020rdb-pc.dtsi|  33 +-
> >
> >> >  drivers/leds/Kconfig  |  10 +
> >> >  drivers/leds/Makefile |   1 +
> >> >  drivers/leds/leds-syscon.c|  14 +-
> >> >  drivers/leds/leds-turris-1x.c | 474 ++
> >> >  drivers/leds/leds-turris-omnia.c  |  46 +-
> >
> > If everyone is convinced that applying these drivers is the correct
> > thing to do, I'd be happy to (rather) take them via LEDs.
> 
> Ok, thanks. I had not actually looked at the patches until today.
> They were in the soc tree backlog but appeared to be misplaced
> there until I read the  0/10 message text.
> 
> Looking at it now, I see:
> 
> - patches 1 and 2 seem obvious and have been reviewed by
>   others already
> 
> - patch 3 is for arch/powerpc and should get merged through
>   there if there are no objections to the binding in patch 4.
> 
> - patch 5 is the big driver patch, with a Reviewed-by tag
>   from Marek Behún, who is the author of the last three patches.
>   An earlier version of this patch was sent in June and got
>   a few Acks and detailed feedback from Andy [1], but he's also
>   not on Cc, and I don't know if his comments are all resolved
>   in this version.
> 
> - Patches 6, 7 and 8 all seem simple LED subsystem patches,
>   they just need review from you in order to get applied.
>   These are also missing a Signed-off-by from the submitter
>   in addition to the author in order to be applied.

Very well.  Let's have them resent then please (with past reviewers on
Cc:) and we'll go from there.

-- 
Lee Jones [李琼斯]


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Liam R. Howlett
* Suren Baghdasaryan  [230120 12:50]:
> On Fri, Jan 20, 2023 at 9:32 AM Matthew Wilcox  wrote:
> >
> > On Fri, Jan 20, 2023 at 09:17:46AM -0800, Suren Baghdasaryan wrote:
> > > On Fri, Jan 20, 2023 at 9:08 AM Liam R. Howlett  
> > > wrote:
> > > >
> > > > * Matthew Wilcox  [230120 11:50]:
> > > > > On Fri, Jan 20, 2023 at 08:45:21AM -0800, Suren Baghdasaryan wrote:
> > > > > > On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > > > > > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko 
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > > > > > > > call_rcu() can take a long time when callback offloading 
> > > > > > > > > > > is enabled.
> > > > > > > > > > > Its use in the vm_area_free can cause regressions in the 
> > > > > > > > > > > exit path when
> > > > > > > > > > > multiple VMAs are being freed. To minimize that impact, 
> > > > > > > > > > > place VMAs into
> > > > > > > > > > > a list and free them in groups using one call_rcu() call 
> > > > > > > > > > > per group.
> > > > > > > > > >
> > > > > > > > > > After some more clarification I can understand how call_rcu 
> > > > > > > > > > might not be
> > > > > > > > > > super happy about thousands of callbacks to be invoked and 
> > > > > > > > > > I do agree
> > > > > > > > > > that this is not really optimal.
> > > > > > > > > >
> > > > > > > > > > On the other hand I do not like this solution much either.
> > > > > > > > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help 
> > > > > > > > > > all that
> > > > > > > > > > much with processes with a huge number of vmas either. It 
> > > > > > > > > > would still be
> > > > > > > > > > in housands of callbacks to be scheduled without a good 
> > > > > > > > > > reason.
> > > > > > > > > >
> > > > > > > > > > Instead, are there any other cases than remove_vma that 
> > > > > > > > > > need this
> > > > > > > > > > batching? We could easily just link all the vmas into 
> > > > > > > > > > linked list and
> > > > > > > > > > use a single call_rcu instead, no? This would both simplify 
> > > > > > > > > > the
> > > > > > > > > > implementation, remove the scaling issue as well and we do 
> > > > > > > > > > not have to
> > > > > > > > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or 
> > > > > > > > > > epsilon + 1.
> > > > > > > > >
> > > > > > > > > Yes, I agree the solution is not stellar. I wanted something 
> > > > > > > > > simple
> > > > > > > > > but this is probably too simple. OTOH keeping all dead 
> > > > > > > > > vm_area_structs
> > > > > > > > > on the list without hooking up a shrinker (additional 
> > > > > > > > > complexity) does
> > > > > > > > > not sound too appealing either.
> > > > > > > >
> > > > > > > > I suspect you have missed my idea. I do not really want to keep 
> > > > > > > > the list
> > > > > > > > around or any shrinker. It is dead simple. Collect all vmas in
> > > > > > > > remove_vma and then call_rcu the whole list at once after the 
> > > > > > > > whole list
> > > > > > > > (be it from exit_mmap or remove_mt). See?
> > > > > > >
> > > > > > > Yes, I understood your idea but keeping dead objects until the 
> > > > > > > process
> > > > > > > exits even when the system is low on memory (no shrinkers 
> > > > > > > attached)
> > > > > > > seems too wasteful. If we do this I would advocate for attaching a
> > > > > > > shrinker.
> > > > > >
> > > > > > Maybe even simpler, since we are hit with this VMA freeing flood
> > > > > > during exit_mmap (when all VMAs are destroyed), we pass a hint to
> > > > > > vm_area_free to batch the destruction and all other cases call
> > > > > > call_rcu()? I don't think there will be other cases of VMA 
> > > > > > destruction
> > > > > > floods.
> > > > >
> > > > > ... or have two different call_rcu functions; one for munmap() and
> > > > > one for exit.  It'd be nice to use kmem_cache_free_bulk().
> > > >
> > > > Do we even need a call_rcu on exit?  At the point of freeing the VMAs we
> > > > have set the MMF_OOM_SKIP bit and unmapped the vmas under the read lock.
> > > > Once we have obtained the write lock again, I think it's safe to say we
> > > > can just go ahead and free the VMAs directly.
> > >
> > > I think that would be still racy if the page fault handler found that
> > > VMA under read-RCU protection but did not lock it yet (no locks are
> > > held yet). If it's preempted, the VMA can be freed and destroyed from
> > > under it without RCU grace period.
> >
> > The page fault handler (or whatever other reader -- ptrace, proc, etc)
> > should have a refcount on the mm_struct, so we can't be in this path
> > trying to free VMAs.  Right?
> 
> Hmm. That sounds right. I checked process_mrelease() as well, which
> operated 

Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Suren Baghdasaryan
On Fri, Jan 20, 2023 at 9:21 AM Paul E. McKenney  wrote:
>
> On Fri, Jan 20, 2023 at 04:49:42PM +, Matthew Wilcox wrote:
> > On Fri, Jan 20, 2023 at 08:45:21AM -0800, Suren Baghdasaryan wrote:
> > > On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan  
> > > wrote:
> > > >
> > > > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  wrote:
> > > > >
> > > > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > > > > call_rcu() can take a long time when callback offloading is 
> > > > > > > > enabled.
> > > > > > > > Its use in the vm_area_free can cause regressions in the exit 
> > > > > > > > path when
> > > > > > > > multiple VMAs are being freed. To minimize that impact, place 
> > > > > > > > VMAs into
> > > > > > > > a list and free them in groups using one call_rcu() call per 
> > > > > > > > group.
> > > > > > >
> > > > > > > After some more clarification I can understand how call_rcu might 
> > > > > > > not be
> > > > > > > super happy about thousands of callbacks to be invoked and I do 
> > > > > > > agree
> > > > > > > that this is not really optimal.
> > > > > > >
> > > > > > > On the other hand I do not like this solution much either.
> > > > > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all 
> > > > > > > that
> > > > > > > much with processes with a huge number of vmas either. It would 
> > > > > > > still be
> > > > > > > in housands of callbacks to be scheduled without a good reason.
> > > > > > >
> > > > > > > Instead, are there any other cases than remove_vma that need this
> > > > > > > batching? We could easily just link all the vmas into linked list 
> > > > > > > and
> > > > > > > use a single call_rcu instead, no? This would both simplify the
> > > > > > > implementation, remove the scaling issue as well and we do not 
> > > > > > > have to
> > > > > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon 
> > > > > > > + 1.
> > > > > >
> > > > > > Yes, I agree the solution is not stellar. I wanted something simple
> > > > > > but this is probably too simple. OTOH keeping all dead 
> > > > > > vm_area_structs
> > > > > > on the list without hooking up a shrinker (additional complexity) 
> > > > > > does
> > > > > > not sound too appealing either.
> > > > >
> > > > > I suspect you have missed my idea. I do not really want to keep the 
> > > > > list
> > > > > around or any shrinker. It is dead simple. Collect all vmas in
> > > > > remove_vma and then call_rcu the whole list at once after the whole 
> > > > > list
> > > > > (be it from exit_mmap or remove_mt). See?
> > > >
> > > > Yes, I understood your idea but keeping dead objects until the process
> > > > exits even when the system is low on memory (no shrinkers attached)
> > > > seems too wasteful. If we do this I would advocate for attaching a
> > > > shrinker.
> > >
> > > Maybe even simpler, since we are hit with this VMA freeing flood
> > > during exit_mmap (when all VMAs are destroyed), we pass a hint to
> > > vm_area_free to batch the destruction and all other cases call
> > > call_rcu()? I don't think there will be other cases of VMA destruction
> > > floods.
> >
> > ... or have two different call_rcu functions; one for munmap() and
> > one for exit.  It'd be nice to use kmem_cache_free_bulk().
>
> Good point, kfree_rcu(p, r) where "r" is the name of the rcu_head
> structure's field, is much more cache-efficient.
>
> The penalty is that there is no callback function to do any cleanup.
> There is just a kfree()/kvfree (bulk version where applicable),
> nothing else.

If Liam's suggestion works then we won't need anything additional. We
will free the vm_area_structs directly on process exit and use
call_rcu() in all other cases. Let's see if Michal knows of any case
which still needs an RCU grace period during exit_mmap.

>
> Thanx, Paul
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.
>


Re: Calculating array sizes in C - was: Re: Build regressions/improvements in v6.2-rc1

2023-01-20 Thread Michael.Karcher

Hello!

Can someone please file a GCC PR?  With reduced testcase preferably.


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108483

There you are.

Kind regars,
  Michael Karcher



[PATCH v3] of: Make of framebuffer devices unique

2023-01-20 Thread Michal Suchanek
Since Linux 5.19 this error is observed:

sysfs: cannot create duplicate filename '/devices/platform/of-display'

This is because multiple devices with the same name 'of-display' are
created on the same bus.

Update the code to create numbered device names for the non-boot
disaplay.

cc: linuxppc-dev@lists.ozlabs.org
References: https://bugzilla.kernel.org/show_bug.cgi?id=216095
Fixes: 52b1b46c39ae ("of: Create platform devices for OF framebuffers")
Reported-by: Erhard F. 
Suggested-by: Thomas Zimmermann 
Signed-off-by: Michal Suchanek 
---
v3:
- merge fix into original patch
- Update the device name format
- add missing const
- do not continue with iterating display devices when formatting device
  name fails
---
 drivers/of/platform.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 81c8c227ab6b..4d3a4d9f79f2 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -525,6 +525,9 @@ static int __init of_platform_default_populate_init(void)
if (IS_ENABLED(CONFIG_PPC)) {
struct device_node *boot_display = NULL;
struct platform_device *dev;
+   int display_number = 0;
+   char buf[14];
+   const char *of_display_format = "of-display.%d";
int ret;
 
/* Check if we have a MacOS display without a node spec */
@@ -555,7 +558,10 @@ static int __init of_platform_default_populate_init(void)
if (!of_get_property(node, "linux,opened", NULL) ||
!of_get_property(node, "linux,boot-display", NULL))
continue;
-   dev = of_platform_device_create(node, "of-display", 
NULL);
+   ret = snprintf(buf, sizeof(buf), of_display_format, 
display_number++);
+   if (ret >= sizeof(buf))
+   return -EOVERFLOW;
+   dev = of_platform_device_create(node, buf, NULL);
if (WARN_ON(!dev))
return -ENOMEM;
boot_display = node;
@@ -564,7 +570,10 @@ static int __init of_platform_default_populate_init(void)
for_each_node_by_type(node, "display") {
if (!of_get_property(node, "linux,opened", NULL) || 
node == boot_display)
continue;
-   of_platform_device_create(node, "of-display", NULL);
+   ret = snprintf(buf, sizeof(buf), of_display_format, 
display_number++);
+   if (ret >= sizeof(buf))
+   break;
+   of_platform_device_create(node, buf, NULL);
}
 
} else {
-- 
2.35.3



Re: [PATCH v2] of: Fix of platform build on powerpc due to bad of disaply code

2023-01-20 Thread Michal Suchánek
Hello,

On Fri, Jan 20, 2023 at 11:23:39AM -0600, Rob Herring wrote:
> On Thu, Jan 19, 2023 at 3:53 AM Michal Suchanek  wrote:
> >
> > The commit 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> > breaks build because of wrong argument to snprintf. That certainly
> > avoids the runtime error but is not the intended outcome.
> >
> > Also use standard device name format of-display.N for all created
> > devices.
> >
> > Fixes: 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> > Signed-off-by: Michal Suchanek 
> > ---
> > v2: Update the device name format
> > ---
> >  drivers/of/platform.c | 12 
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > index f2a5d679a324..8c1b1de22036 100644
> > --- a/drivers/of/platform.c
> > +++ b/drivers/of/platform.c
> > @@ -525,7 +525,9 @@ static int __init 
> > of_platform_default_populate_init(void)
> > if (IS_ENABLED(CONFIG_PPC)) {
> > struct device_node *boot_display = NULL;
> > struct platform_device *dev;
> > -   int display_number = 1;
> > +   int display_number = 0;
> > +   char buf[14];
> > +   char *of_display_format = "of-display.%d";
> 
> static const as suggested and can we just move on please...
Only const, static could be dodgy

> > int ret;
> >
> > /* Check if we have a MacOS display without a node spec */
> > @@ -556,7 +558,10 @@ static int __init 
> > of_platform_default_populate_init(void)
> > if (!of_get_property(node, "linux,opened", NULL) ||
> > !of_get_property(node, "linux,boot-display", 
> > NULL))
> > continue;
> > -   dev = of_platform_device_create(node, "of-display", 
> > NULL);
> > +   ret = snprintf(buf, sizeof(buf), of_display_format, 
> > display_number++);
> 
> The boot display is always "of-display.0". Just use the fixed string
> here. Then we can get rid of the whole debate around static const.

I prefer to use the same format string when the names should be
consistent. Also it would resurrect the starting from 1 debate.

But if you really want to have two strings I do not care all that much.

> 
> > +   if (ret >= sizeof(buf))
> > +   continue;
> 
> This only happens if display_number becomes too big. Why continue on?
> The next iteration will fail too.

Yes, there is no need to continue with the loop.

Thanks

Michal

> 
> > +   dev = of_platform_device_create(node, buf, NULL);
> > if (WARN_ON(!dev))
> > return -ENOMEM;
> > boot_display = node;
> > @@ -564,10 +569,9 @@ static int __init 
> > of_platform_default_populate_init(void)
> > }
> >
> > for_each_node_by_type(node, "display") {
> > -   char *buf[14];
> > if (!of_get_property(node, "linux,opened", NULL) || 
> > node == boot_display)
> > continue;
> > -   ret = snprintf(buf, "of-display-%d", 
> > display_number++);
> > +   ret = snprintf(buf, sizeof(buf), of_display_format, 
> > display_number++);
> > if (ret >= sizeof(buf))
> > continue;
> 
> Here too in the original change.
> 
> > of_platform_device_create(node, buf, NULL);
> > --
> > 2.35.3
> >


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Suren Baghdasaryan
On Fri, Jan 20, 2023 at 9:32 AM Matthew Wilcox  wrote:
>
> On Fri, Jan 20, 2023 at 09:17:46AM -0800, Suren Baghdasaryan wrote:
> > On Fri, Jan 20, 2023 at 9:08 AM Liam R. Howlett  
> > wrote:
> > >
> > > * Matthew Wilcox  [230120 11:50]:
> > > > On Fri, Jan 20, 2023 at 08:45:21AM -0800, Suren Baghdasaryan wrote:
> > > > > On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan 
> > > > >  wrote:
> > > > > >
> > > > > > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > > > > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > > > > > > call_rcu() can take a long time when callback offloading is 
> > > > > > > > > > enabled.
> > > > > > > > > > Its use in the vm_area_free can cause regressions in the 
> > > > > > > > > > exit path when
> > > > > > > > > > multiple VMAs are being freed. To minimize that impact, 
> > > > > > > > > > place VMAs into
> > > > > > > > > > a list and free them in groups using one call_rcu() call 
> > > > > > > > > > per group.
> > > > > > > > >
> > > > > > > > > After some more clarification I can understand how call_rcu 
> > > > > > > > > might not be
> > > > > > > > > super happy about thousands of callbacks to be invoked and I 
> > > > > > > > > do agree
> > > > > > > > > that this is not really optimal.
> > > > > > > > >
> > > > > > > > > On the other hand I do not like this solution much either.
> > > > > > > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help 
> > > > > > > > > all that
> > > > > > > > > much with processes with a huge number of vmas either. It 
> > > > > > > > > would still be
> > > > > > > > > in housands of callbacks to be scheduled without a good 
> > > > > > > > > reason.
> > > > > > > > >
> > > > > > > > > Instead, are there any other cases than remove_vma that need 
> > > > > > > > > this
> > > > > > > > > batching? We could easily just link all the vmas into linked 
> > > > > > > > > list and
> > > > > > > > > use a single call_rcu instead, no? This would both simplify 
> > > > > > > > > the
> > > > > > > > > implementation, remove the scaling issue as well and we do 
> > > > > > > > > not have to
> > > > > > > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or 
> > > > > > > > > epsilon + 1.
> > > > > > > >
> > > > > > > > Yes, I agree the solution is not stellar. I wanted something 
> > > > > > > > simple
> > > > > > > > but this is probably too simple. OTOH keeping all dead 
> > > > > > > > vm_area_structs
> > > > > > > > on the list without hooking up a shrinker (additional 
> > > > > > > > complexity) does
> > > > > > > > not sound too appealing either.
> > > > > > >
> > > > > > > I suspect you have missed my idea. I do not really want to keep 
> > > > > > > the list
> > > > > > > around or any shrinker. It is dead simple. Collect all vmas in
> > > > > > > remove_vma and then call_rcu the whole list at once after the 
> > > > > > > whole list
> > > > > > > (be it from exit_mmap or remove_mt). See?
> > > > > >
> > > > > > Yes, I understood your idea but keeping dead objects until the 
> > > > > > process
> > > > > > exits even when the system is low on memory (no shrinkers attached)
> > > > > > seems too wasteful. If we do this I would advocate for attaching a
> > > > > > shrinker.
> > > > >
> > > > > Maybe even simpler, since we are hit with this VMA freeing flood
> > > > > during exit_mmap (when all VMAs are destroyed), we pass a hint to
> > > > > vm_area_free to batch the destruction and all other cases call
> > > > > call_rcu()? I don't think there will be other cases of VMA destruction
> > > > > floods.
> > > >
> > > > ... or have two different call_rcu functions; one for munmap() and
> > > > one for exit.  It'd be nice to use kmem_cache_free_bulk().
> > >
> > > Do we even need a call_rcu on exit?  At the point of freeing the VMAs we
> > > have set the MMF_OOM_SKIP bit and unmapped the vmas under the read lock.
> > > Once we have obtained the write lock again, I think it's safe to say we
> > > can just go ahead and free the VMAs directly.
> >
> > I think that would be still racy if the page fault handler found that
> > VMA under read-RCU protection but did not lock it yet (no locks are
> > held yet). If it's preempted, the VMA can be freed and destroyed from
> > under it without RCU grace period.
>
> The page fault handler (or whatever other reader -- ptrace, proc, etc)
> should have a refcount on the mm_struct, so we can't be in this path
> trying to free VMAs.  Right?

Hmm. That sounds right. I checked process_mrelease() as well, which
operated on mm with only mmgrab()+mmap_read_lock() but it only unmaps
VMAs without freeing them, so we are still good. Michal, do you agree
this is ok?

lock_vma_under_rcu() receives mm as a parameter, so I guess it's
implied that the caller should either mmget() it or 

Re: [PATCH RESEND 0/8] Resend LED patches

2023-01-20 Thread Arnd Bergmann
On Fri, Jan 20, 2023, at 18:15, Lee Jones wrote:
> On Fri, 20 Jan 2023, Arnd Bergmann wrote:

>> > Marek Behún (3):
>> >   leds: turris-omnia: support HW controlled mode via private trigger
>> >   leds: turris-omnia: initialize multi-intensity to full
>> >   leds: turris-omnia: change max brightness from 255 to 1
>> >
>> > Pali Rohár (5):
>> >   dt-bindings: leds: register-bit-led: Add active-low property
>> >   leds: syscon: Implement support for active-low property
>> >   powerpc/85xx: DTS: Add CPLD definitions for P1021RDB Combo Board CPL
>> > Design
>> >   dt-bindings: leds: Add cznic,turris1x-leds.yaml binding
>> >   leds: Add support for Turris 1.x LEDs
>> >
>> >  .../testing/sysfs-class-led-driver-turris1x   |  31 ++
>> >  .../bindings/leds/cznic,turris1x-leds.yaml| 118 +
>> >  .../bindings/leds/register-bit-led.yaml   |   5 +
>> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc.dtsi|  92 
>> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc_32b.dts |   6 +-
>> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc_36b.dts |   6 +-
>> >  arch/powerpc/boot/dts/fsl/p1020rdb-pd.dts |  44 +-
>> >  arch/powerpc/boot/dts/fsl/p1020utm-pc.dtsi|  37 ++
>> >  arch/powerpc/boot/dts/fsl/p1020utm-pc_32b.dts |   4 +-
>> >  arch/powerpc/boot/dts/fsl/p1020utm-pc_36b.dts |   4 +-
>> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc.dtsi|  37 ++
>> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc_32b.dts |   5 +-
>> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc_36b.dts |   5 +-
>> >  arch/powerpc/boot/dts/fsl/p2020rdb-pc.dtsi|  33 +-
>
>> >  drivers/leds/Kconfig  |  10 +
>> >  drivers/leds/Makefile |   1 +
>> >  drivers/leds/leds-syscon.c|  14 +-
>> >  drivers/leds/leds-turris-1x.c | 474 ++
>> >  drivers/leds/leds-turris-omnia.c  |  46 +-
>
> If everyone is convinced that applying these drivers is the correct
> thing to do, I'd be happy to (rather) take them via LEDs.

Ok, thanks. I had not actually looked at the patches until today.
They were in the soc tree backlog but appeared to be misplaced
there until I read the  0/10 message text.

Looking at it now, I see:

- patches 1 and 2 seem obvious and have been reviewed by
  others already

- patch 3 is for arch/powerpc and should get merged through
  there if there are no objections to the binding in patch 4.

- patch 5 is the big driver patch, with a Reviewed-by tag
  from Marek Behún, who is the author of the last three patches.
  An earlier version of this patch was sent in June and got
  a few Acks and detailed feedback from Andy [1], but he's also
  not on Cc, and I don't know if his comments are all resolved
  in this version.

- Patches 6, 7 and 8 all seem simple LED subsystem patches,
  they just need review from you in order to get applied.
  These are also missing a Signed-off-by from the submitter
  in addition to the author in order to be applied.
  
  Arnd

[1] 
https://lore.kernel.org/all/CAHp75Vcr6o2rm+T6Tr8sS4VXCLVHtmLPWy-njOKAvO4AcZoW=a...@mail.gmail.com/


Re: [PATCH v2 01/10] dt-bindings: arm: fsl: add TQ-Systems LS1021A board

2023-01-20 Thread Krzysztof Kozlowski
On 20/01/2023 14:34, Alexander Stein wrote:
> From: Matthias Schiffer 
> 
> TQMLS102xA is a SOM family using NXP LS1021A CPU family.
> MBLS102xA is an evaluation mainboard for this SOM.
> 
> Signed-off-by: Matthias Schiffer 
> Signed-off-by: Alexander Stein 
> ---
> Changes in v2:
> * Improved the description mentioning this is a socketable module
> 


Acked-by: Krzysztof Kozlowski 

Best regards,
Krzysztof



Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Matthew Wilcox
On Fri, Jan 20, 2023 at 09:17:46AM -0800, Suren Baghdasaryan wrote:
> On Fri, Jan 20, 2023 at 9:08 AM Liam R. Howlett  
> wrote:
> >
> > * Matthew Wilcox  [230120 11:50]:
> > > On Fri, Jan 20, 2023 at 08:45:21AM -0800, Suren Baghdasaryan wrote:
> > > > On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan  
> > > > wrote:
> > > > >
> > > > > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  wrote:
> > > > > >
> > > > > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > > > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > > > > > call_rcu() can take a long time when callback offloading is 
> > > > > > > > > enabled.
> > > > > > > > > Its use in the vm_area_free can cause regressions in the exit 
> > > > > > > > > path when
> > > > > > > > > multiple VMAs are being freed. To minimize that impact, place 
> > > > > > > > > VMAs into
> > > > > > > > > a list and free them in groups using one call_rcu() call per 
> > > > > > > > > group.
> > > > > > > >
> > > > > > > > After some more clarification I can understand how call_rcu 
> > > > > > > > might not be
> > > > > > > > super happy about thousands of callbacks to be invoked and I do 
> > > > > > > > agree
> > > > > > > > that this is not really optimal.
> > > > > > > >
> > > > > > > > On the other hand I do not like this solution much either.
> > > > > > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all 
> > > > > > > > that
> > > > > > > > much with processes with a huge number of vmas either. It would 
> > > > > > > > still be
> > > > > > > > in housands of callbacks to be scheduled without a good reason.
> > > > > > > >
> > > > > > > > Instead, are there any other cases than remove_vma that need 
> > > > > > > > this
> > > > > > > > batching? We could easily just link all the vmas into linked 
> > > > > > > > list and
> > > > > > > > use a single call_rcu instead, no? This would both simplify the
> > > > > > > > implementation, remove the scaling issue as well and we do not 
> > > > > > > > have to
> > > > > > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or 
> > > > > > > > epsilon + 1.
> > > > > > >
> > > > > > > Yes, I agree the solution is not stellar. I wanted something 
> > > > > > > simple
> > > > > > > but this is probably too simple. OTOH keeping all dead 
> > > > > > > vm_area_structs
> > > > > > > on the list without hooking up a shrinker (additional complexity) 
> > > > > > > does
> > > > > > > not sound too appealing either.
> > > > > >
> > > > > > I suspect you have missed my idea. I do not really want to keep the 
> > > > > > list
> > > > > > around or any shrinker. It is dead simple. Collect all vmas in
> > > > > > remove_vma and then call_rcu the whole list at once after the whole 
> > > > > > list
> > > > > > (be it from exit_mmap or remove_mt). See?
> > > > >
> > > > > Yes, I understood your idea but keeping dead objects until the process
> > > > > exits even when the system is low on memory (no shrinkers attached)
> > > > > seems too wasteful. If we do this I would advocate for attaching a
> > > > > shrinker.
> > > >
> > > > Maybe even simpler, since we are hit with this VMA freeing flood
> > > > during exit_mmap (when all VMAs are destroyed), we pass a hint to
> > > > vm_area_free to batch the destruction and all other cases call
> > > > call_rcu()? I don't think there will be other cases of VMA destruction
> > > > floods.
> > >
> > > ... or have two different call_rcu functions; one for munmap() and
> > > one for exit.  It'd be nice to use kmem_cache_free_bulk().
> >
> > Do we even need a call_rcu on exit?  At the point of freeing the VMAs we
> > have set the MMF_OOM_SKIP bit and unmapped the vmas under the read lock.
> > Once we have obtained the write lock again, I think it's safe to say we
> > can just go ahead and free the VMAs directly.
> 
> I think that would be still racy if the page fault handler found that
> VMA under read-RCU protection but did not lock it yet (no locks are
> held yet). If it's preempted, the VMA can be freed and destroyed from
> under it without RCU grace period.

The page fault handler (or whatever other reader -- ptrace, proc, etc)
should have a refcount on the mm_struct, so we can't be in this path
trying to free VMAs.  Right?


Re: [PATCH v2] of: Fix of platform build on powerpc due to bad of disaply code

2023-01-20 Thread Rob Herring
On Thu, Jan 19, 2023 at 3:53 AM Michal Suchanek  wrote:
>
> The commit 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> breaks build because of wrong argument to snprintf. That certainly
> avoids the runtime error but is not the intended outcome.
>
> Also use standard device name format of-display.N for all created
> devices.
>
> Fixes: 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> Signed-off-by: Michal Suchanek 
> ---
> v2: Update the device name format
> ---
>  drivers/of/platform.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index f2a5d679a324..8c1b1de22036 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -525,7 +525,9 @@ static int __init of_platform_default_populate_init(void)
> if (IS_ENABLED(CONFIG_PPC)) {
> struct device_node *boot_display = NULL;
> struct platform_device *dev;
> -   int display_number = 1;
> +   int display_number = 0;
> +   char buf[14];
> +   char *of_display_format = "of-display.%d";

static const as suggested and can we just move on please...

> int ret;
>
> /* Check if we have a MacOS display without a node spec */
> @@ -556,7 +558,10 @@ static int __init of_platform_default_populate_init(void)
> if (!of_get_property(node, "linux,opened", NULL) ||
> !of_get_property(node, "linux,boot-display", 
> NULL))
> continue;
> -   dev = of_platform_device_create(node, "of-display", 
> NULL);
> +   ret = snprintf(buf, sizeof(buf), of_display_format, 
> display_number++);

The boot display is always "of-display.0". Just use the fixed string
here. Then we can get rid of the whole debate around static const.

> +   if (ret >= sizeof(buf))
> +   continue;

This only happens if display_number becomes too big. Why continue on?
The next iteration will fail too.

> +   dev = of_platform_device_create(node, buf, NULL);
> if (WARN_ON(!dev))
> return -ENOMEM;
> boot_display = node;
> @@ -564,10 +569,9 @@ static int __init of_platform_default_populate_init(void)
> }
>
> for_each_node_by_type(node, "display") {
> -   char *buf[14];
> if (!of_get_property(node, "linux,opened", NULL) || 
> node == boot_display)
> continue;
> -   ret = snprintf(buf, "of-display-%d", 
> display_number++);
> +   ret = snprintf(buf, sizeof(buf), of_display_format, 
> display_number++);
> if (ret >= sizeof(buf))
> continue;

Here too in the original change.

> of_platform_device_create(node, buf, NULL);
> --
> 2.35.3
>


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Paul E. McKenney
On Fri, Jan 20, 2023 at 04:49:42PM +, Matthew Wilcox wrote:
> On Fri, Jan 20, 2023 at 08:45:21AM -0800, Suren Baghdasaryan wrote:
> > On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan  
> > wrote:
> > >
> > > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  wrote:
> > > >
> > > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  wrote:
> > > > > >
> > > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > > > call_rcu() can take a long time when callback offloading is 
> > > > > > > enabled.
> > > > > > > Its use in the vm_area_free can cause regressions in the exit 
> > > > > > > path when
> > > > > > > multiple VMAs are being freed. To minimize that impact, place 
> > > > > > > VMAs into
> > > > > > > a list and free them in groups using one call_rcu() call per 
> > > > > > > group.
> > > > > >
> > > > > > After some more clarification I can understand how call_rcu might 
> > > > > > not be
> > > > > > super happy about thousands of callbacks to be invoked and I do 
> > > > > > agree
> > > > > > that this is not really optimal.
> > > > > >
> > > > > > On the other hand I do not like this solution much either.
> > > > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all that
> > > > > > much with processes with a huge number of vmas either. It would 
> > > > > > still be
> > > > > > in housands of callbacks to be scheduled without a good reason.
> > > > > >
> > > > > > Instead, are there any other cases than remove_vma that need this
> > > > > > batching? We could easily just link all the vmas into linked list 
> > > > > > and
> > > > > > use a single call_rcu instead, no? This would both simplify the
> > > > > > implementation, remove the scaling issue as well and we do not have 
> > > > > > to
> > > > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon + 
> > > > > > 1.
> > > > >
> > > > > Yes, I agree the solution is not stellar. I wanted something simple
> > > > > but this is probably too simple. OTOH keeping all dead vm_area_structs
> > > > > on the list without hooking up a shrinker (additional complexity) does
> > > > > not sound too appealing either.
> > > >
> > > > I suspect you have missed my idea. I do not really want to keep the list
> > > > around or any shrinker. It is dead simple. Collect all vmas in
> > > > remove_vma and then call_rcu the whole list at once after the whole list
> > > > (be it from exit_mmap or remove_mt). See?
> > >
> > > Yes, I understood your idea but keeping dead objects until the process
> > > exits even when the system is low on memory (no shrinkers attached)
> > > seems too wasteful. If we do this I would advocate for attaching a
> > > shrinker.
> > 
> > Maybe even simpler, since we are hit with this VMA freeing flood
> > during exit_mmap (when all VMAs are destroyed), we pass a hint to
> > vm_area_free to batch the destruction and all other cases call
> > call_rcu()? I don't think there will be other cases of VMA destruction
> > floods.
> 
> ... or have two different call_rcu functions; one for munmap() and
> one for exit.  It'd be nice to use kmem_cache_free_bulk().

Good point, kfree_rcu(p, r) where "r" is the name of the rcu_head
structure's field, is much more cache-efficient.

The penalty is that there is no callback function to do any cleanup.
There is just a kfree()/kvfree (bulk version where applicable),
nothing else.

Thanx, Paul


Re: [PATCH v2 10/10] ARM: add multi_v7_lpae_defconfig

2023-01-20 Thread Arnd Bergmann
On Fri, Jan 20, 2023, at 14:34, Alexander Stein wrote:
> From: Nicolas Saenz Julienne 
>
> The only missing configuration option preventing us from using
> multi_v7_defconfig with the Raspberry Pi 4 is ARM_LPAE. It's needed as
> the PCIe controller found on the SoC depends on 64bit addressing, yet
> can't be included as not all v7 boards support LPAE.
>
> Introduce multi_v7_lpae_defconfig, built off multi_v7_defconfig, which will
> avoid us having to duplicate and maintain multiple similar configurations.
>
> Needless to say the Raspberry Pi 4 is not the only platform that can
> benefit from this new configuration.
>
> Signed-off-by: Nicolas Saenz Julienne 
> Signed-off-by: Alexander Stein 

This is ok in principle, two minor points though:

> +include $(srctree)/scripts/Makefile.defconf
> +PHONY += multi_v7_lpae_defconfig
> +multi_v7_lpae_defconfig:
> + $(call merge_into_defconfig,multi_v7_defconfig,lpae)
> 
>  define archhelp
>echo  '* zImage- Compressed kernel image 
> (arch/$(ARCH)/boot/zImage)'

The new target does not get listed in 'make help' as far as I can
tell, can you add it there in the process?

> diff --git a/arch/arm/configs/lpae.config b/arch/arm/configs/lpae.config
> new file mode 100644
> index 0..19bab134e014b
> --- /dev/null
> +++ b/arch/arm/configs/lpae.config
> @@ -0,0 +1 @@
> +CONFIG_ARM_LPAE=y

My feeling is that we probably want CONFIG_VMSPLIT_2G=y here
as well, given that a lot of the systems that want LPAE
will have a lot of memory, and are limited by the amount
of lowmem even when CONFIG_HIGHMEM is enabled.

Can you make sure that this works on your machine, and
include this option?

 Arnd


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Suren Baghdasaryan
On Fri, Jan 20, 2023 at 9:08 AM Liam R. Howlett  wrote:
>
> * Matthew Wilcox  [230120 11:50]:
> > On Fri, Jan 20, 2023 at 08:45:21AM -0800, Suren Baghdasaryan wrote:
> > > On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan  
> > > wrote:
> > > >
> > > > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  wrote:
> > > > >
> > > > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > > > > call_rcu() can take a long time when callback offloading is 
> > > > > > > > enabled.
> > > > > > > > Its use in the vm_area_free can cause regressions in the exit 
> > > > > > > > path when
> > > > > > > > multiple VMAs are being freed. To minimize that impact, place 
> > > > > > > > VMAs into
> > > > > > > > a list and free them in groups using one call_rcu() call per 
> > > > > > > > group.
> > > > > > >
> > > > > > > After some more clarification I can understand how call_rcu might 
> > > > > > > not be
> > > > > > > super happy about thousands of callbacks to be invoked and I do 
> > > > > > > agree
> > > > > > > that this is not really optimal.
> > > > > > >
> > > > > > > On the other hand I do not like this solution much either.
> > > > > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all 
> > > > > > > that
> > > > > > > much with processes with a huge number of vmas either. It would 
> > > > > > > still be
> > > > > > > in housands of callbacks to be scheduled without a good reason.
> > > > > > >
> > > > > > > Instead, are there any other cases than remove_vma that need this
> > > > > > > batching? We could easily just link all the vmas into linked list 
> > > > > > > and
> > > > > > > use a single call_rcu instead, no? This would both simplify the
> > > > > > > implementation, remove the scaling issue as well and we do not 
> > > > > > > have to
> > > > > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon 
> > > > > > > + 1.
> > > > > >
> > > > > > Yes, I agree the solution is not stellar. I wanted something simple
> > > > > > but this is probably too simple. OTOH keeping all dead 
> > > > > > vm_area_structs
> > > > > > on the list without hooking up a shrinker (additional complexity) 
> > > > > > does
> > > > > > not sound too appealing either.
> > > > >
> > > > > I suspect you have missed my idea. I do not really want to keep the 
> > > > > list
> > > > > around or any shrinker. It is dead simple. Collect all vmas in
> > > > > remove_vma and then call_rcu the whole list at once after the whole 
> > > > > list
> > > > > (be it from exit_mmap or remove_mt). See?
> > > >
> > > > Yes, I understood your idea but keeping dead objects until the process
> > > > exits even when the system is low on memory (no shrinkers attached)
> > > > seems too wasteful. If we do this I would advocate for attaching a
> > > > shrinker.
> > >
> > > Maybe even simpler, since we are hit with this VMA freeing flood
> > > during exit_mmap (when all VMAs are destroyed), we pass a hint to
> > > vm_area_free to batch the destruction and all other cases call
> > > call_rcu()? I don't think there will be other cases of VMA destruction
> > > floods.
> >
> > ... or have two different call_rcu functions; one for munmap() and
> > one for exit.  It'd be nice to use kmem_cache_free_bulk().
>
> Do we even need a call_rcu on exit?  At the point of freeing the VMAs we
> have set the MMF_OOM_SKIP bit and unmapped the vmas under the read lock.
> Once we have obtained the write lock again, I think it's safe to say we
> can just go ahead and free the VMAs directly.

I think that would be still racy if the page fault handler found that
VMA under read-RCU protection but did not lock it yet (no locks are
held yet). If it's preempted, the VMA can be freed and destroyed from
under it without RCU grace period.

>
> --
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to kernel-team+unsubscr...@android.com.
>


Re: [PATCH RESEND 0/8] Resend LED patches

2023-01-20 Thread Lee Jones
On Fri, 20 Jan 2023, Arnd Bergmann wrote:

> On Mon, Dec 26, 2022, at 13:36, Pali Rohár wrote:
> > Linus Walleij suggested me to send these patches to SoC tree [1]
> > instead. So I'm doing it.
> >
> > This patch series contains LED patches which are on the linux-leds
> > mailing list for a long time without any future movement. Could you
> > please handle them here via SoC tree? Thanks.
> >
> > [1] - 
> > https://lore.kernel.org/linux-leds/cacrpkdad6wdo7rgfa4mw8zz0mlxmcpho+sec-ylqnrz_kdr...@mail.gmail.com/
> 
> I'm going through the backlog of patches sent to s...@kernel.org
> and came across this series. While I don't mind taking these
> patches through the soc tree in principle, it is important
> that this is only done as an exception, and with all the
> relevant parties on Cc.
> 
> In particular, the original series that you got no
> feedback for did not include the arch/powerpc/ changes,
> and I would assume those should go through the powerpc
> tree anyway. We have recently decided to take
> risc-v and loongarch dts changes through the soc
> tree, and I don't mind doing it for powerpc as well
> if the powerpc maintainers prefer that, but this is
> not something we have even discussed so far.
> 
> I've added everyone to Cc on this mail, but please
> resend the series once more so everyone has the patches,
> and then we can decide who will pick up what.

Thanks Arnd (PSB).

> > Marek Behún (3):
> >   leds: turris-omnia: support HW controlled mode via private trigger
> >   leds: turris-omnia: initialize multi-intensity to full
> >   leds: turris-omnia: change max brightness from 255 to 1
> >
> > Pali Rohár (5):
> >   dt-bindings: leds: register-bit-led: Add active-low property
> >   leds: syscon: Implement support for active-low property
> >   powerpc/85xx: DTS: Add CPLD definitions for P1021RDB Combo Board CPL
> > Design
> >   dt-bindings: leds: Add cznic,turris1x-leds.yaml binding
> >   leds: Add support for Turris 1.x LEDs
> >
> >  .../testing/sysfs-class-led-driver-turris1x   |  31 ++
> >  .../bindings/leds/cznic,turris1x-leds.yaml| 118 +
> >  .../bindings/leds/register-bit-led.yaml   |   5 +
> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc.dtsi|  92 
> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc_32b.dts |   6 +-
> >  arch/powerpc/boot/dts/fsl/p1020mbg-pc_36b.dts |   6 +-
> >  arch/powerpc/boot/dts/fsl/p1020rdb-pd.dts |  44 +-
> >  arch/powerpc/boot/dts/fsl/p1020utm-pc.dtsi|  37 ++
> >  arch/powerpc/boot/dts/fsl/p1020utm-pc_32b.dts |   4 +-
> >  arch/powerpc/boot/dts/fsl/p1020utm-pc_36b.dts |   4 +-
> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc.dtsi|  37 ++
> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc_32b.dts |   5 +-
> >  arch/powerpc/boot/dts/fsl/p1021rdb-pc_36b.dts |   5 +-
> >  arch/powerpc/boot/dts/fsl/p2020rdb-pc.dtsi|  33 +-

> >  drivers/leds/Kconfig  |  10 +
> >  drivers/leds/Makefile |   1 +
> >  drivers/leds/leds-syscon.c|  14 +-
> >  drivers/leds/leds-turris-1x.c | 474 ++
> >  drivers/leds/leds-turris-omnia.c  |  46 +-

If everyone is convinced that applying these drivers is the correct
thing to do, I'd be happy to (rather) take them via LEDs.

-- 
Lee Jones [李琼斯]


Re: [PATCH v2 09/10] kbuild: Add config fragment merge functionality

2023-01-20 Thread Arnd Bergmann
On Fri, Jan 20, 2023, at 14:34, Alexander Stein wrote:
> From: Nicolas Saenz Julienne 
>
> So far this function was only used locally in powerpc, some other
> architectures might benefit from it. Move it into
> scripts/Makefile.defconf.
>
> Signed-off-by: Nicolas Saenz Julienne 
> Signed-off-by: Alexander Stein 
> ---
> Directly applied from 
> https://lore.kernel.org/linux-arm-kernel/20200203184820.4433-2-nsaenzjulie...@suse.de/T/#m96968dd45c0aaa88e0a7387024b5ac13b002363d

Acked-by: Arnd Bergmann 

I can probably best take this through the soc tree directly
along with patch 10, while the first eight patches should
got through the nxp maintainer tree.

Arnd


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Liam R. Howlett
* Matthew Wilcox  [230120 11:50]:
> On Fri, Jan 20, 2023 at 08:45:21AM -0800, Suren Baghdasaryan wrote:
> > On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan  
> > wrote:
> > >
> > > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  wrote:
> > > >
> > > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  wrote:
> > > > > >
> > > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > > > call_rcu() can take a long time when callback offloading is 
> > > > > > > enabled.
> > > > > > > Its use in the vm_area_free can cause regressions in the exit 
> > > > > > > path when
> > > > > > > multiple VMAs are being freed. To minimize that impact, place 
> > > > > > > VMAs into
> > > > > > > a list and free them in groups using one call_rcu() call per 
> > > > > > > group.
> > > > > >
> > > > > > After some more clarification I can understand how call_rcu might 
> > > > > > not be
> > > > > > super happy about thousands of callbacks to be invoked and I do 
> > > > > > agree
> > > > > > that this is not really optimal.
> > > > > >
> > > > > > On the other hand I do not like this solution much either.
> > > > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all that
> > > > > > much with processes with a huge number of vmas either. It would 
> > > > > > still be
> > > > > > in housands of callbacks to be scheduled without a good reason.
> > > > > >
> > > > > > Instead, are there any other cases than remove_vma that need this
> > > > > > batching? We could easily just link all the vmas into linked list 
> > > > > > and
> > > > > > use a single call_rcu instead, no? This would both simplify the
> > > > > > implementation, remove the scaling issue as well and we do not have 
> > > > > > to
> > > > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon + 
> > > > > > 1.
> > > > >
> > > > > Yes, I agree the solution is not stellar. I wanted something simple
> > > > > but this is probably too simple. OTOH keeping all dead vm_area_structs
> > > > > on the list without hooking up a shrinker (additional complexity) does
> > > > > not sound too appealing either.
> > > >
> > > > I suspect you have missed my idea. I do not really want to keep the list
> > > > around or any shrinker. It is dead simple. Collect all vmas in
> > > > remove_vma and then call_rcu the whole list at once after the whole list
> > > > (be it from exit_mmap or remove_mt). See?
> > >
> > > Yes, I understood your idea but keeping dead objects until the process
> > > exits even when the system is low on memory (no shrinkers attached)
> > > seems too wasteful. If we do this I would advocate for attaching a
> > > shrinker.
> > 
> > Maybe even simpler, since we are hit with this VMA freeing flood
> > during exit_mmap (when all VMAs are destroyed), we pass a hint to
> > vm_area_free to batch the destruction and all other cases call
> > call_rcu()? I don't think there will be other cases of VMA destruction
> > floods.
> 
> ... or have two different call_rcu functions; one for munmap() and
> one for exit.  It'd be nice to use kmem_cache_free_bulk().

Do we even need a call_rcu on exit?  At the point of freeing the VMAs we
have set the MMF_OOM_SKIP bit and unmapped the vmas under the read lock.
Once we have obtained the write lock again, I think it's safe to say we
can just go ahead and free the VMAs directly.



Re: [PATCH RESEND 0/8] Resend LED patches

2023-01-20 Thread Arnd Bergmann
On Mon, Dec 26, 2022, at 13:36, Pali Rohár wrote:
> Linus Walleij suggested me to send these patches to SoC tree [1]
> instead. So I'm doing it.
>
> This patch series contains LED patches which are on the linux-leds
> mailing list for a long time without any future movement. Could you
> please handle them here via SoC tree? Thanks.
>
> [1] - 
> https://lore.kernel.org/linux-leds/cacrpkdad6wdo7rgfa4mw8zz0mlxmcpho+sec-ylqnrz_kdr...@mail.gmail.com/

I'm going through the backlog of patches sent to s...@kernel.org
and came across this series. While I don't mind taking these
patches through the soc tree in principle, it is important
that this is only done as an exception, and with all the
relevant parties on Cc.

In particular, the original series that you got no
feedback for did not include the arch/powerpc/ changes,
and I would assume those should go through the powerpc
tree anyway. We have recently decided to take
risc-v and loongarch dts changes through the soc
tree, and I don't mind doing it for powerpc as well
if the powerpc maintainers prefer that, but this is
not something we have even discussed so far.

I've added everyone to Cc on this mail, but please
resend the series once more so everyone has the patches,
and then we can decide who will pick up what.

Arnd

>
> Marek Behún (3):
>   leds: turris-omnia: support HW controlled mode via private trigger
>   leds: turris-omnia: initialize multi-intensity to full
>   leds: turris-omnia: change max brightness from 255 to 1
>
> Pali Rohár (5):
>   dt-bindings: leds: register-bit-led: Add active-low property
>   leds: syscon: Implement support for active-low property
>   powerpc/85xx: DTS: Add CPLD definitions for P1021RDB Combo Board CPL
> Design
>   dt-bindings: leds: Add cznic,turris1x-leds.yaml binding
>   leds: Add support for Turris 1.x LEDs
>
>  .../testing/sysfs-class-led-driver-turris1x   |  31 ++
>  .../bindings/leds/cznic,turris1x-leds.yaml| 118 +
>  .../bindings/leds/register-bit-led.yaml   |   5 +
>  arch/powerpc/boot/dts/fsl/p1020mbg-pc.dtsi|  92 
>  arch/powerpc/boot/dts/fsl/p1020mbg-pc_32b.dts |   6 +-
>  arch/powerpc/boot/dts/fsl/p1020mbg-pc_36b.dts |   6 +-
>  arch/powerpc/boot/dts/fsl/p1020rdb-pd.dts |  44 +-
>  arch/powerpc/boot/dts/fsl/p1020utm-pc.dtsi|  37 ++
>  arch/powerpc/boot/dts/fsl/p1020utm-pc_32b.dts |   4 +-
>  arch/powerpc/boot/dts/fsl/p1020utm-pc_36b.dts |   4 +-
>  arch/powerpc/boot/dts/fsl/p1021rdb-pc.dtsi|  37 ++
>  arch/powerpc/boot/dts/fsl/p1021rdb-pc_32b.dts |   5 +-
>  arch/powerpc/boot/dts/fsl/p1021rdb-pc_36b.dts |   5 +-
>  arch/powerpc/boot/dts/fsl/p2020rdb-pc.dtsi|  33 +-
>  drivers/leds/Kconfig  |  10 +
>  drivers/leds/Makefile |   1 +
>  drivers/leds/leds-syscon.c|  14 +-
>  drivers/leds/leds-turris-1x.c | 474 ++
>  drivers/leds/leds-turris-omnia.c  |  46 +-
>  19 files changed, 945 insertions(+), 27 deletions(-)
>  create mode 100644 
> Documentation/ABI/testing/sysfs-class-led-driver-turris1x
>  create mode 100644 
> Documentation/devicetree/bindings/leds/cznic,turris1x-leds.yaml
>  create mode 100644 drivers/leds/leds-turris-1x.c
>
> -- 
> 2.20.1


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Matthew Wilcox
On Fri, Jan 20, 2023 at 08:45:21AM -0800, Suren Baghdasaryan wrote:
> On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan  wrote:
> >
> > On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  wrote:
> > >
> > > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  wrote:
> > > > >
> > > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > > call_rcu() can take a long time when callback offloading is enabled.
> > > > > > Its use in the vm_area_free can cause regressions in the exit path 
> > > > > > when
> > > > > > multiple VMAs are being freed. To minimize that impact, place VMAs 
> > > > > > into
> > > > > > a list and free them in groups using one call_rcu() call per group.
> > > > >
> > > > > After some more clarification I can understand how call_rcu might not 
> > > > > be
> > > > > super happy about thousands of callbacks to be invoked and I do agree
> > > > > that this is not really optimal.
> > > > >
> > > > > On the other hand I do not like this solution much either.
> > > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all that
> > > > > much with processes with a huge number of vmas either. It would still 
> > > > > be
> > > > > in housands of callbacks to be scheduled without a good reason.
> > > > >
> > > > > Instead, are there any other cases than remove_vma that need this
> > > > > batching? We could easily just link all the vmas into linked list and
> > > > > use a single call_rcu instead, no? This would both simplify the
> > > > > implementation, remove the scaling issue as well and we do not have to
> > > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon + 1.
> > > >
> > > > Yes, I agree the solution is not stellar. I wanted something simple
> > > > but this is probably too simple. OTOH keeping all dead vm_area_structs
> > > > on the list without hooking up a shrinker (additional complexity) does
> > > > not sound too appealing either.
> > >
> > > I suspect you have missed my idea. I do not really want to keep the list
> > > around or any shrinker. It is dead simple. Collect all vmas in
> > > remove_vma and then call_rcu the whole list at once after the whole list
> > > (be it from exit_mmap or remove_mt). See?
> >
> > Yes, I understood your idea but keeping dead objects until the process
> > exits even when the system is low on memory (no shrinkers attached)
> > seems too wasteful. If we do this I would advocate for attaching a
> > shrinker.
> 
> Maybe even simpler, since we are hit with this VMA freeing flood
> during exit_mmap (when all VMAs are destroyed), we pass a hint to
> vm_area_free to batch the destruction and all other cases call
> call_rcu()? I don't think there will be other cases of VMA destruction
> floods.

... or have two different call_rcu functions; one for munmap() and
one for exit.  It'd be nice to use kmem_cache_free_bulk().


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Suren Baghdasaryan
On Fri, Jan 20, 2023 at 8:20 AM Suren Baghdasaryan  wrote:
>
> On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  wrote:
> >
> > On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  wrote:
> > > >
> > > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > > call_rcu() can take a long time when callback offloading is enabled.
> > > > > Its use in the vm_area_free can cause regressions in the exit path 
> > > > > when
> > > > > multiple VMAs are being freed. To minimize that impact, place VMAs 
> > > > > into
> > > > > a list and free them in groups using one call_rcu() call per group.
> > > >
> > > > After some more clarification I can understand how call_rcu might not be
> > > > super happy about thousands of callbacks to be invoked and I do agree
> > > > that this is not really optimal.
> > > >
> > > > On the other hand I do not like this solution much either.
> > > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all that
> > > > much with processes with a huge number of vmas either. It would still be
> > > > in housands of callbacks to be scheduled without a good reason.
> > > >
> > > > Instead, are there any other cases than remove_vma that need this
> > > > batching? We could easily just link all the vmas into linked list and
> > > > use a single call_rcu instead, no? This would both simplify the
> > > > implementation, remove the scaling issue as well and we do not have to
> > > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon + 1.
> > >
> > > Yes, I agree the solution is not stellar. I wanted something simple
> > > but this is probably too simple. OTOH keeping all dead vm_area_structs
> > > on the list without hooking up a shrinker (additional complexity) does
> > > not sound too appealing either.
> >
> > I suspect you have missed my idea. I do not really want to keep the list
> > around or any shrinker. It is dead simple. Collect all vmas in
> > remove_vma and then call_rcu the whole list at once after the whole list
> > (be it from exit_mmap or remove_mt). See?
>
> Yes, I understood your idea but keeping dead objects until the process
> exits even when the system is low on memory (no shrinkers attached)
> seems too wasteful. If we do this I would advocate for attaching a
> shrinker.

Maybe even simpler, since we are hit with this VMA freeing flood
during exit_mmap (when all VMAs are destroyed), we pass a hint to
vm_area_free to batch the destruction and all other cases call
call_rcu()? I don't think there will be other cases of VMA destruction
floods.

>
> >
> > --
> > Michal Hocko
> > SUSE Labs


Re: [PATCH v9 00/10] phy: Add support for Lynx 10G SerDes

2023-01-20 Thread Sean Anderson
On 1/20/23 03:06, Vinod Koul wrote:
> On 19-01-23, 11:22, Sean Anderson wrote:
>> On 1/18/23 11:54, Vinod Koul wrote:
>> > On 17-01-23, 11:46, Sean Anderson wrote:
>> >> 
>> >> I noticed that this series is marked "changes requested" on patchwork.
>> >> However, I have received only automated feedback. I have done my best
>> >> effort to address feedback I have received on prior revisions. I would
>> >> appreciate getting another round of review before resending this series.
>> > 
>> > Looking at the series, looks like kernel-bot sent some warnings on the
>> > series so I was expecting an updated series for review
>> > 
>> 
>> Generally, multiple reviewers will comment on a patch, even if another
>> reviewer finds something which needs to be changed. This is a one-line
>> fix, so I would appreciate getting more substantial feedback before
>> respinning. Every time I send a new series I have to rebase and test on
>> hardware. It's work that I would rather do when there is something to be
>> gained.
> 
> I review to apply, if I can apply, I would typically skip this
> 

It is much more efficient to conduct reviews in parallel. So e.g. the
bindings can be reviewed at the same time as the driver, at the same
time as the device tree changes. This way, I can get a series applied
after max(N, M, ...) revisions, where I would otherwise need N revisions
to get the bindings ready, M revisions to get the driver ready, etc.
But what's happening is that I have to make N + M + ... revisions! I am
very frustrated by your refusal to review anything until there are no
other comments, since it unnecessarily extends the process of getting a
series applied. I have been trying to get this series applied since
June, with nine revisions, and you have reviewed it *twice*! I think the
driver is in a good state and is ready to be applied (aside from the one
known issue), but I have no idea if you agree with that assessment.

--Sean


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Suren Baghdasaryan
On Fri, Jan 20, 2023 at 12:52 AM Michal Hocko  wrote:
>
> On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> > On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  wrote:
> > >
> > > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > > call_rcu() can take a long time when callback offloading is enabled.
> > > > Its use in the vm_area_free can cause regressions in the exit path when
> > > > multiple VMAs are being freed. To minimize that impact, place VMAs into
> > > > a list and free them in groups using one call_rcu() call per group.
> > >
> > > After some more clarification I can understand how call_rcu might not be
> > > super happy about thousands of callbacks to be invoked and I do agree
> > > that this is not really optimal.
> > >
> > > On the other hand I do not like this solution much either.
> > > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all that
> > > much with processes with a huge number of vmas either. It would still be
> > > in housands of callbacks to be scheduled without a good reason.
> > >
> > > Instead, are there any other cases than remove_vma that need this
> > > batching? We could easily just link all the vmas into linked list and
> > > use a single call_rcu instead, no? This would both simplify the
> > > implementation, remove the scaling issue as well and we do not have to
> > > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon + 1.
> >
> > Yes, I agree the solution is not stellar. I wanted something simple
> > but this is probably too simple. OTOH keeping all dead vm_area_structs
> > on the list without hooking up a shrinker (additional complexity) does
> > not sound too appealing either.
>
> I suspect you have missed my idea. I do not really want to keep the list
> around or any shrinker. It is dead simple. Collect all vmas in
> remove_vma and then call_rcu the whole list at once after the whole list
> (be it from exit_mmap or remove_mt). See?

Yes, I understood your idea but keeping dead objects until the process
exits even when the system is low on memory (no shrinkers attached)
seems too wasteful. If we do this I would advocate for attaching a
shrinker.

>
> --
> Michal Hocko
> SUSE Labs


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Paul E. McKenney
On Fri, Jan 20, 2023 at 09:57:05AM +0100, Michal Hocko wrote:
> On Thu 19-01-23 11:17:07, Paul E. McKenney wrote:
> > On Thu, Jan 19, 2023 at 01:52:14PM +0100, Michal Hocko wrote:
> > > On Wed 18-01-23 11:01:08, Suren Baghdasaryan wrote:
> > > > On Wed, Jan 18, 2023 at 10:34 AM Paul E. McKenney  
> > > > wrote:
> > > [...]
> > > > > There are a couple of possibilities here.
> > > > >
> > > > > First, if I am remembering correctly, the time between the call_rcu()
> > > > > and invocation of the corresponding callback was taking multiple 
> > > > > seconds,
> > > > > but that was because the kernel was built with CONFIG_LAZY_RCU=y in
> > > > > order to save power by batching RCU work over multiple call_rcu()
> > > > > invocations.  If this is causing a problem for a given call site, the
> > > > > shiny new call_rcu_hurry() can be used instead.  Doing this gets back
> > > > > to the old-school non-laziness, but can of course consume more power.
> > > > 
> > > > That would not be the case because CONFIG_LAZY_RCU was not an option
> > > > at the time I was profiling this issue.
> > > > Laxy RCU would be a great option to replace this patch but
> > > > unfortunately it's not the default behavior, so I would still have to
> > > > implement this batching in case lazy RCU is not enabled.
> > > > 
> > > > >
> > > > > Second, there is a much shorter one-jiffy delay between the call_rcu()
> > > > > and the invocation of the corresponding callback in kernels built with
> > > > > either CONFIG_NO_HZ_FULL=y (but only on CPUs mentioned in the 
> > > > > nohz_full
> > > > > or rcu_nocbs kernel boot parameters) or CONFIG_RCU_NOCB_CPU=y (but 
> > > > > only
> > > > > on CPUs mentioned in the rcu_nocbs kernel boot parameters).  The 
> > > > > purpose
> > > > > of this delay is to avoid lock contention, and so this delay is 
> > > > > incurred
> > > > > only on CPUs that are queuing callbacks at a rate exceeding 
> > > > > 16K/second.
> > > > > This is reduced to a per-jiffy limit, so on a HZ=1000 system, a CPU
> > > > > invoking call_rcu() at least 16 times within a given jiffy will incur
> > > > > the added delay.  The reason for this delay is the use of a separate
> > > > > ->nocb_bypass list.  As Suren says, this bypass list is used to reduce
> > > > > lock contention on the main ->cblist.  This is not needed in 
> > > > > old-school
> > > > > kernels built without either CONFIG_NO_HZ_FULL=y or 
> > > > > CONFIG_RCU_NOCB_CPU=y
> > > > > (including most datacenter kernels) because in that case the callbacks
> > > > > enqueued by call_rcu() are touched only by the corresponding CPU, so
> > > > > that there is no need for locks.
> > > > 
> > > > I believe this is the reason in my profiled case.
> > > > 
> > > > >
> > > > > Third, if you are instead seeing multiple milliseconds of CPU 
> > > > > consumed by
> > > > > call_rcu() in the common case (for example, without the aid of 
> > > > > interrupts,
> > > > > NMIs, or SMIs), please do let me know.  That sounds to me like a bug.
> > > > 
> > > > I don't think I've seen such a case.
> > > > Thanks for clarifications, Paul!
> > > 
> > > Thanks for the explanation Paul. I have to say this has caught me as a
> > > surprise. There are just not enough details about the benchmark to
> > > understand what is going on but I find it rather surprising that
> > > call_rcu can induce a higher overhead than the actual kmem_cache_free
> > > which is the callback. My naive understanding has been that call_rcu is
> > > really fast way to defer the execution to the RCU safe context to do the
> > > final cleanup.
> > 
> > If I am following along correctly (ha!), then your "induce a higher
> > overhead" should be something like "induce a higher to-kfree() latency".
> 
> Yes, this is expected.
> 
> > Of course, there already is a higher latency-to-kfree via call_rcu()
> > than via a direct call to kfree(), and callback-offload CPUs that are
> > being flooded with callbacks raise that latency a jiffy or so more in
> > order to avoid lock contention.
> > 
> > If this becomes a problem, the callback-offloading code can be a bit
> > smarter about avoiding lock contention, but need to see a real problem
> > before I make that change.  But if there is a real problem I will of
> > course fix it.
> 
> I believe that Suren claims that the call_rcu is really visible in the
> exit_mmap case. Time-to-free actual vmas shouldn't really be material
> for that path. If that happens much more later on there could be some
> side effects by an increased memory consumption but that should be
> marginal. How fast exit_mmap really is should only depend on direct
> calls from that path.
> 
> But I guess we need some specific numbers from Suren to be sure what is
> going on here.

Actually, Suren did discuss these (perhaps offlist) back in August.
I was just being forgetful.  :-/

Thanx, Paul


Re: [PATCH V3] tools/perf/tests: Fix string substitutions in build id test

2023-01-20 Thread Arnaldo Carvalho de Melo
Em Fri, Jan 20, 2023 at 08:51:59AM +, David Laight escreveu:
> From: Arnaldo Carvalho de Melo
> > Sent: 19 January 2023 17:00
> > 
> > Em Thu, Jan 19, 2023 at 03:49:15PM +, David Laight escreveu:
> > > From: Athira Rajeev
> > > > Sent: 19 January 2023 14:27
> > > ...
> > > > The test script "tests/shell/buildid.sh" uses some of the
> > > > string substitution ways which are supported in bash, but not in
> > > > "sh" or other shells. Above error on line number 69 that reports
> > > > "Bad substitution" is:
> > >
> > > Looks better - assuming it works :-)
> > 
> > :-)
> > 
> > Can I take this as an:
> > 
> > Acked-by: David Laight 
> 
> I'm not sure that is worth anything.
> You can add a Reviewed-by

Thanks, I'll then add:

Reviewed-by: David Laight 

- Arnaldo


Re: [PATCH] tools/perf: Disable perf probe when libtraceevent is missing

2023-01-20 Thread Arnaldo Carvalho de Melo
Em Fri, Jan 20, 2023 at 05:32:56PM +0530, Athira Rajeev escreveu:
> While parsing the tracepoint events in parse_events_add_tracepoint()
> function, code checks for HAVE_LIBTRACEEVENT support. This is needed
> since libtraceevent is necessary for tracepoint. But while adding
> probe points, check for LIBTRACEEVENT is not done in case of perf probe.
> Hence, in environment with missing libtraceevent-devel, it is
> observed that adding a probe point works even though its not
> supported.
 
> Example:
> Adding probe point:
>   ./perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
>   Added new event:
> probe:vfs_getname(on getname_flags:72 with 
> pathname=result->name:string)
 
>   You can now use it in all perf tools, such as:
 
>   perf record -e probe:vfs_getname -aR sleep 1
 
> But trying perf record:
>   ./perf  record -e probe:vfs_getname -aR sleep 1
>   event syntax error: 'probe:vfs_getname'
>   \___ unsupported tracepoint
>   libtraceevent is necessary for tracepoint support
>   Run 'perf list' for a list of valid events
> 
> Fix this by wrapping "builtin-probe" compilation and
> "perf probe" usage under "CONFIG_LIBTRACEEVENT" check.

Humm, but 'perf probe' continues to work, as uou demoed above, the
problem is with the suggestion to use other perf tools that need to
parse tracefs and without libtraceevent, currently can't do it:

[root@quaco ~]# perf probe 'vfs_getname=getname_flags:72 
pathname=result->name:string'
Added new event:
  probe:vfs_getname(on getname_flags:72 with pathname=result->name:string)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_getname -aR sleep 1

[root@quaco ~]# perf probe -l
  probe:vfs_getname(on getname_flags:72@fs/namei.c with pathname)
[root@quaco ~]# perf trace -e probe:vfs_getname
perf: 'trace' is not a perf-command. See 'perf --help'.
[root@quaco ~]# cd /sys/kernel/tracing/events/probe/vfs_getname/
[root@quaco vfs_getname]# ls
enable  filter  format  hist  id  trigger
[root@quaco vfs_getname]# ls -la
total 0
drwxr-x---. 2 root root 0 Jan 20 11:18 .
drwxr-x---. 3 root root 0 Jan 20 11:18 ..
-rw-r-. 1 root root 0 Jan 20 11:18 enable
-rw-r-. 1 root root 0 Jan 20 11:18 filter
-r--r-. 1 root root 0 Jan 20 11:18 format
-r--r-. 1 root root 0 Jan 20 11:18 hist
-r--r-. 1 root root 0 Jan 20 11:18 id
-rw-r-. 1 root root 0 Jan 20 11:18 trigger
[root@quaco vfs_getname]#

But we can go on from there:

[root@quaco tracing]# pwd
/sys/kernel/tracing
[root@quaco tracing]# echo 1 > 
/sys/kernel/tracing/events/probe/vfs_getname/enable
[root@quaco tracing]# echo 1 > tracing_on
[root@quaco tracing]# head trace_pipe
systemd-oomd-979 [003] . 96369.978971: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) 
pathname="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service"
systemd-oomd-979 [003] . 96369.979022: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) 
pathname="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.current"
systemd-oomd-979 [003] . 96369.979084: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) pathname=""
systemd-oomd-979 [003] . 96369.979162: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) 
pathname="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.min"
systemd-oomd-979 [003] . 96369.979197: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) pathname=""
systemd-oomd-979 [003] . 96369.979267: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) 
pathname="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.low"
systemd-oomd-979 [003] . 96369.979303: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) pathname=""
systemd-oomd-979 [003] . 96369.979372: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) 
pathname="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.swap.current"
systemd-oomd-979 [003] . 96369.979406: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) pathname=""
systemd-oomd-979 [003] . 96369.979475: vfs_getname: 
(getname_flags.part.0+0x6b/0x1c0) 
pathname="/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.stat"
[root@quaco tracing]#

So you could instead replace the suggestion from:

"
   You can now use it in all perf tools, such as:

   perf record -e probe:vfs_getname -aR sleep 1
"

To:

"
perf is not linked with libtraceevent, to use the new probe you
can use tracefs:

cd /sys/kernel/tracing/
echo 1 > events/probe/vfs_getname/enable
echo 1 > tracing_on
cat trace_pipe
"

wdyt?

- Arnaldo
 
> Signed-off-by: Athira Rajeev 
> ---
>  tools/perf/Build  | 4 +++-
>  tools/perf/perf.c | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/Build b/tools/perf/Build
> index 

[PATCH v2 10/10] ARM: add multi_v7_lpae_defconfig

2023-01-20 Thread Alexander Stein
From: Nicolas Saenz Julienne 

The only missing configuration option preventing us from using
multi_v7_defconfig with the Raspberry Pi 4 is ARM_LPAE. It's needed as
the PCIe controller found on the SoC depends on 64bit addressing, yet
can't be included as not all v7 boards support LPAE.

Introduce multi_v7_lpae_defconfig, built off multi_v7_defconfig, which will
avoid us having to duplicate and maintain multiple similar configurations.

Needless to say the Raspberry Pi 4 is not the only platform that can
benefit from this new configuration.

Signed-off-by: Nicolas Saenz Julienne 
Signed-off-by: Alexander Stein 
---
Directly applied from 
https://lore.kernel.org/linux-arm-kernel/20200203184820.4433-2-nsaenzjulie...@suse.de/T/#m96968dd45c0aaa88e0a7387024b5ac13b002363d
Although I had to apply manually

 arch/arm/Makefile| 4 
 arch/arm/configs/lpae.config | 1 +
 2 files changed, 5 insertions(+)
 create mode 100644 arch/arm/configs/lpae.config

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 506dbc72323bc..80d9eaf3dc06a 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -314,6 +314,10 @@ endif
 # My testing targets (bypasses dependencies)
 bp:;   $(Q)$(MAKE) $(build)=$(boot) $(boot)/bootpImage
 
+include $(srctree)/scripts/Makefile.defconf
+PHONY += multi_v7_lpae_defconfig
+multi_v7_lpae_defconfig:
+   $(call merge_into_defconfig,multi_v7_defconfig,lpae)
 
 define archhelp
   echo  '* zImage- Compressed kernel image (arch/$(ARCH)/boot/zImage)'
diff --git a/arch/arm/configs/lpae.config b/arch/arm/configs/lpae.config
new file mode 100644
index 0..19bab134e014b
--- /dev/null
+++ b/arch/arm/configs/lpae.config
@@ -0,0 +1 @@
+CONFIG_ARM_LPAE=y
-- 
2.34.1



[PATCH v2 09/10] kbuild: Add config fragment merge functionality

2023-01-20 Thread Alexander Stein
From: Nicolas Saenz Julienne 

So far this function was only used locally in powerpc, some other
architectures might benefit from it. Move it into
scripts/Makefile.defconf.

Signed-off-by: Nicolas Saenz Julienne 
Signed-off-by: Alexander Stein 
---
Directly applied from 
https://lore.kernel.org/linux-arm-kernel/20200203184820.4433-2-nsaenzjulie...@suse.de/T/#m96968dd45c0aaa88e0a7387024b5ac13b002363d

 arch/powerpc/Makefile| 12 +---
 scripts/Makefile.defconf | 15 +++
 2 files changed, 16 insertions(+), 11 deletions(-)
 create mode 100644 scripts/Makefile.defconf

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index dc4cbf0a5ca95..533457466ce25 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -253,17 +253,7 @@ PHONY += bootwrapper_install
 bootwrapper_install:
$(Q)$(MAKE) $(build)=$(boot) $(patsubst %,$(boot)/%,$@)
 
-# Used to create 'merged defconfigs'
-# To use it $(call) it with the first argument as the base defconfig
-# and the second argument as a space separated list of .config files to merge,
-# without the .config suffix.
-define merge_into_defconfig
-   $(Q)$(CONFIG_SHELL) $(srctree)/scripts/kconfig/merge_config.sh \
-   -m -O $(objtree) $(srctree)/arch/$(ARCH)/configs/$(1) \
-   $(foreach 
config,$(2),$(srctree)/arch/$(ARCH)/configs/$(config).config)
-   +$(Q)$(MAKE) -f $(srctree)/Makefile olddefconfig
-endef
-
+include $(srctree)/scripts/Makefile.defconf
 PHONY += pseries_le_defconfig
 pseries_le_defconfig:
$(call merge_into_defconfig,pseries_defconfig,le)
diff --git a/scripts/Makefile.defconf b/scripts/Makefile.defconf
new file mode 100644
index 0..ab332f7534f51
--- /dev/null
+++ b/scripts/Makefile.defconf
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0
+# Configuration heplers
+
+# Creates 'merged defconfigs'
+# ---
+# Usage:
+#   $(call merge_into_defconfig,base_config,config_fragment1 config_fragment2 
...)
+#
+# Input config fragments without '.config' suffix
+define merge_into_defconfig
+   $(Q)$(CONFIG_SHELL) $(srctree)/scripts/kconfig/merge_config.sh \
+   -m -O $(objtree) $(srctree)/arch/$(ARCH)/configs/$(1) \
+   $(foreach 
config,$(2),$(srctree)/arch/$(ARCH)/configs/$(config).config)
+   +$(Q)$(MAKE) -f $(srctree)/Makefile olddefconfig
+endef
-- 
2.34.1



[PATCH v2 08/10] ARM: multi_v7_defconfig: Add options to support TQMLS102xA series

2023-01-20 Thread Alexander Stein
Enable drivers used on TQMLS102xA + MBLS1021A.

Signed-off-by: Alexander Stein 
---
Changes in v2:
* Changed symbols to 'm' where possible

 arch/arm/configs/multi_v7_defconfig | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index ee184eb37adcf..92628a160cfb3 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -191,6 +191,7 @@ CONFIG_PCI_TEGRA=y
 CONFIG_PCI_RCAR_GEN2=y
 CONFIG_PCIE_RCAR_HOST=y
 CONFIG_PCI_DRA7XX_EP=y
+CONFIG_PCI_LAYERSCAPE=y
 CONFIG_PCI_ENDPOINT=y
 CONFIG_PCI_ENDPOINT_CONFIGFS=y
 CONFIG_PCI_EPF_TEST=m
@@ -249,6 +250,7 @@ CONFIG_AHCI_ST=y
 CONFIG_AHCI_IMX=y
 CONFIG_AHCI_SUNXI=y
 CONFIG_AHCI_TEGRA=y
+CONFIG_AHCI_QORIQ=y
 CONFIG_SATA_HIGHBANK=y
 CONFIG_SATA_MV=y
 CONFIG_SATA_RCAR=y
@@ -329,6 +331,7 @@ CONFIG_TOUCHSCREEN_ADC=m
 CONFIG_TOUCHSCREEN_ATMEL_MXT=m
 CONFIG_TOUCHSCREEN_ELAN=m
 CONFIG_TOUCHSCREEN_MMS114=m
+CONFIG_TOUCHSCREEN_EDT_FT5X06=m
 CONFIG_TOUCHSCREEN_WM97XX=m
 CONFIG_TOUCHSCREEN_ST1232=m
 CONFIG_TOUCHSCREEN_STMPE=y
@@ -483,6 +486,7 @@ CONFIG_GPIO_ASPEED_SGPIO=y
 CONFIG_GPIO_DAVINCI=y
 CONFIG_GPIO_DWAPB=y
 CONFIG_GPIO_EM=y
+CONFIG_GPIO_MPC8XXX=y
 CONFIG_GPIO_MXC=y
 CONFIG_GPIO_RCAR=y
 CONFIG_GPIO_SYSCON=y
@@ -493,6 +497,7 @@ CONFIG_GPIO_PCA953X=y
 CONFIG_GPIO_PCA953X_IRQ=y
 CONFIG_GPIO_PCF857X=y
 CONFIG_GPIO_PALMAS=y
+CONFIG_GPIO_STMPE=y
 CONFIG_GPIO_TPS6586X=y
 CONFIG_GPIO_TPS65910=y
 CONFIG_GPIO_TWL4030=y
@@ -533,6 +538,7 @@ CONFIG_SENSORS_INA2XX=m
 CONFIG_CPU_THERMAL=y
 CONFIG_DEVFREQ_THERMAL=y
 CONFIG_IMX_THERMAL=y
+CONFIG_QORIQ_THERMAL=m
 CONFIG_ROCKCHIP_THERMAL=y
 CONFIG_RCAR_THERMAL=y
 CONFIG_ARMADA_THERMAL=y
@@ -821,6 +827,8 @@ CONFIG_SND_SOC_MSM8916_WCD_ANALOG=m
 CONFIG_SND_SOC_MSM8916_WCD_DIGITAL=m
 CONFIG_SND_SOC_SGTL5000=m
 CONFIG_SND_SOC_STI_SAS=m
+CONFIG_SND_SOC_TLV320AIC32X4=m
+CONFIG_SND_SOC_TLV320AIC32X4_I2C=m
 CONFIG_SND_SOC_WM8978=m
 CONFIG_SND_AUDIO_GRAPH_CARD=m
 CONFIG_USB=y
@@ -830,6 +838,7 @@ CONFIG_USB_XHCI_MVEBU=y
 CONFIG_USB_XHCI_TEGRA=m
 CONFIG_USB_BRCMSTB=m
 CONFIG_USB_EHCI_HCD=y
+CONFIG_USB_EHCI_FSL=m
 CONFIG_USB_EHCI_HCD_STI=y
 CONFIG_USB_EHCI_EXYNOS=m
 CONFIG_USB_EHCI_MV=m
@@ -933,6 +942,8 @@ CONFIG_NEW_LEDS=y
 CONFIG_LEDS_CLASS=y
 CONFIG_LEDS_CLASS_FLASH=m
 CONFIG_LEDS_CPCAP=m
+CONFIG_LEDS_PCA9532=m
+CONFIG_LEDS_PCA9532_GPIO=y
 CONFIG_LEDS_GPIO=y
 CONFIG_LEDS_PWM=y
 CONFIG_LEDS_MAX8997=m
@@ -949,6 +960,7 @@ CONFIG_LEDS_TRIGGER_DEFAULT_ON=y
 CONFIG_LEDS_TRIGGER_TRANSIENT=y
 CONFIG_LEDS_TRIGGER_CAMERA=y
 CONFIG_EDAC=y
+CONFIG_EDAC_LAYERSCAPE=y
 CONFIG_EDAC_HIGHBANK_MC=y
 CONFIG_EDAC_HIGHBANK_L2=y
 CONFIG_RTC_CLASS=y
@@ -962,6 +974,7 @@ CONFIG_RTC_DRV_MAX8997=m
 CONFIG_RTC_DRV_MAX77686=y
 CONFIG_RTC_DRV_RK808=m
 CONFIG_RTC_DRV_RS5C372=m
+CONFIG_RTC_DRV_PCF85063=m
 CONFIG_RTC_DRV_PCF85363=m
 CONFIG_RTC_DRV_BQ32K=m
 CONFIG_RTC_DRV_TWL4030=y
-- 
2.34.1



[PATCH v2 07/10] ARM: dts: ls1021a: add TQMLS1021A/MBLS102xA LVDS CDTECH FC21 overlay

2023-01-20 Thread Alexander Stein
Add device tree overlay for LVDS display usage.

Signed-off-by: Alexander Stein 
---
Changes in v2:
* None

 arch/arm/boot/dts/Makefile|  2 +
 ...1021a-tqmls1021a-mbls1021a-cdtech-fc21.dts | 55 +++
 2 files changed, 57 insertions(+)
 create mode 100644 
arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-fc21.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 774cc57274e20..bc51a5b868c10 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -806,10 +806,12 @@ dtb-$(CONFIG_SOC_LS1021A) += \
ls1021a-tsn.dtb \
ls1021a-twr.dtb
 
+ls1021a-tqmls1021a-mbls1021a-cdtech-fc21-dtbs := 
ls1021a-tqmls1021a-mbls1021a.dtb ls1021a-tqmls1021a-mbls1021a-cdtech-fc21.dtbo
 ls1021a-tqmls1021a-mbls1021a-cdtech-dc44-dtbs := 
ls1021a-tqmls1021a-mbls1021a.dtb ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dtbo
 ls1021a-tqmls1021a-mbls1021a-hdmi-dtbs := ls1021a-tqmls1021a-mbls1021a.dtb 
ls1021a-tqmls1021a-mbls1021a-hdmi.dtbo
 ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33-dtbs := 
ls1021a-tqmls1021a-mbls1021a.dtb 
ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtbo
 
+dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-cdtech-fc21.dtb
 dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dtb
 dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-hdmi.dtb
 dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtb
diff --git a/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-fc21.dts 
b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-fc21.dts
new file mode 100644
index 0..4bd10d0e17b90
--- /dev/null
+++ b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-fc21.dts
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: (GPL-2.0-or-later OR X11)
+/*
+ * Copyright 2018-2023 TQ-Systems GmbH ,
+ * D-82229 Seefeld, Germany.
+ * Author: Alexander Stein
+
+ */
+
+/dts-v1/;
+/plugin/;
+
+#include 
+#include 
+
+_dcu {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+
+   port {
+   dcu_out: endpoint {
+   remote-endpoint = <_in>;
+   };
+   };
+};
+
+ {
+   compatible = "cdtech,s070pws19hp-fc21";
+   status = "okay";
+
+   port {
+   panel_in: endpoint {
+   remote-endpoint = <_out>;
+   };
+   };
+};
+
+ {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   polytouch: touchscreen@38 {
+   compatible = "edt,edt-ft5406";
+   reg = <0x38>;
+   interrupt-parent = <_0>;
+   interrupts = <6 IRQ_TYPE_EDGE_FALLING>;
+   /* LCD_PWR_EN -> TSC_WAKE */
+   wake-gpios = <_1 4 GPIO_ACTIVE_HIGH>;
+   gain = <20>;
+   touchscreen-size-x = <1024>;
+   touchscreen-size-y = <600>;
+   };
+};
-- 
2.34.1



[PATCH v2 06/10] ARM: dts: ls1021a: add TQMLS1021A/MBLS102xA LVDS CDTECH DC44 overlay

2023-01-20 Thread Alexander Stein
Add device tree overlay for LVDS display usage.

Signed-off-by: Alexander Stein 
---
Changes in v2:
* None

 arch/arm/boot/dts/Makefile|  2 +
 ...1021a-tqmls1021a-mbls1021a-cdtech-dc44.dts | 55 +++
 2 files changed, 57 insertions(+)
 create mode 100644 
arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index d5105e8179431..774cc57274e20 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -806,9 +806,11 @@ dtb-$(CONFIG_SOC_LS1021A) += \
ls1021a-tsn.dtb \
ls1021a-twr.dtb
 
+ls1021a-tqmls1021a-mbls1021a-cdtech-dc44-dtbs := 
ls1021a-tqmls1021a-mbls1021a.dtb ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dtbo
 ls1021a-tqmls1021a-mbls1021a-hdmi-dtbs := ls1021a-tqmls1021a-mbls1021a.dtb 
ls1021a-tqmls1021a-mbls1021a-hdmi.dtbo
 ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33-dtbs := 
ls1021a-tqmls1021a-mbls1021a.dtb 
ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtbo
 
+dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dtb
 dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-hdmi.dtb
 dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtb
 
diff --git a/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dts 
b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dts
new file mode 100644
index 0..ddc71bc597295
--- /dev/null
+++ b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dts
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: (GPL-2.0-or-later OR X11)
+/*
+ * Copyright 2018-2023 TQ-Systems GmbH ,
+ * D-82229 Seefeld, Germany.
+ * Author: Alexander Stein
+
+ */
+
+/dts-v1/;
+/plugin/;
+
+#include 
+#include 
+
+_dcu {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+
+   port {
+   dcu_out: endpoint {
+   remote-endpoint = <_in>;
+   };
+   };
+};
+
+ {
+   compatible = "cdtech,s070swv29hg-dc44";
+   status = "okay";
+
+   port {
+   panel_in: endpoint {
+   remote-endpoint = <_out>;
+   };
+   };
+};
+
+ {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   polytouch: touchscreen@38 {
+   compatible = "edt,edt-ft5406";
+   reg = <0x38>;
+   interrupt-parent = <_0>;
+   interrupts = <6 IRQ_TYPE_EDGE_FALLING>;
+   /* LCD_PWR_EN -> TSC_WAKE */
+   wake-gpios = <_1 4 GPIO_ACTIVE_HIGH>;
+   gain = <20>;
+   touchscreen-size-x = <800>;
+   touchscreen-size-y = <480>;
+   };
+};
-- 
2.34.1



[PATCH v2 05/10] ARM: dts: ls1021a: add TQMLS1021A/MBLS102xA HDMI overlay

2023-01-20 Thread Alexander Stein
Add device tree overlay for HDMI usage.

Signed-off-by: Alexander Stein 
---
Changes in v2:
* None

 arch/arm/boot/dts/Makefile|  2 ++
 .../ls1021a-tqmls1021a-mbls1021a-hdmi.dtso| 36 +++
 2 files changed, 38 insertions(+)
 create mode 100644 arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-hdmi.dtso

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index e6943dc73a3fd..d5105e8179431 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -806,8 +806,10 @@ dtb-$(CONFIG_SOC_LS1021A) += \
ls1021a-tsn.dtb \
ls1021a-twr.dtb
 
+ls1021a-tqmls1021a-mbls1021a-hdmi-dtbs := ls1021a-tqmls1021a-mbls1021a.dtb 
ls1021a-tqmls1021a-mbls1021a-hdmi.dtbo
 ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33-dtbs := 
ls1021a-tqmls1021a-mbls1021a.dtb 
ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtbo
 
+dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-hdmi.dtb
 dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtb
 
 dtb-$(CONFIG_SOC_VF610) += \
diff --git a/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-hdmi.dtso 
b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-hdmi.dtso
new file mode 100644
index 0..f5ca22643c08e
--- /dev/null
+++ b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-hdmi.dtso
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: (GPL-2.0-or-later OR X11)
+/*
+ * Copyright 2018-2023 TQ-Systems GmbH ,
+ * D-82229 Seefeld, Germany.
+ * Author: Alexander Stein
+
+ */
+
+/dts-v1/;
+/plugin/;
+
+ {
+   status = "okay";
+
+   port {
+   dcu_out: endpoint {
+   remote-endpoint = <_in>;
+   };
+   };
+};
+
+_out {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+
+   ports {
+   port@0 {
+   sii9022a_in: endpoint {
+   remote-endpoint = <_out>;
+   };
+   };
+   };
+};
-- 
2.34.1



[PATCH v2 04/10] ARM: dts: ls1021a: add TQMLS1021A/MBLS102xA LVDS TM070JVHG33 overlay

2023-01-20 Thread Alexander Stein
Add device tree overlay for LVDS display usage.

Signed-off-by: Alexander Stein 
---
Changes in v2:
* None

 arch/arm/boot/dts/Makefile|  5 ++
 ...tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtso | 56 +++
 2 files changed, 61 insertions(+)
 create mode 100644 
arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtso

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 44b5ed44b13d6..e6943dc73a3fd 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -805,6 +805,11 @@ dtb-$(CONFIG_SOC_LS1021A) += \
ls1021a-tqmls1021a-mbls1021a.dtb \
ls1021a-tsn.dtb \
ls1021a-twr.dtb
+
+ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33-dtbs := 
ls1021a-tqmls1021a-mbls1021a.dtb 
ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtbo
+
+dtb-$(CONFIG_SOC_LS1021A) += ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtb
+
 dtb-$(CONFIG_SOC_VF610) += \
vf500-colibri-eval-v3.dtb \
vf610-bk4.dtb \
diff --git 
a/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtso 
b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtso
new file mode 100644
index 0..252ef982dd862
--- /dev/null
+++ b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtso
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: (GPL-2.0-or-later OR X11)
+/*
+ * Copyright 2018-2023 TQ-Systems GmbH ,
+ * D-82229 Seefeld, Germany.
+ * Author: Alexander Stein
+
+ */
+
+/dts-v1/;
+/plugin/;
+
+#include 
+
+_dcu {
+   status = "okay";
+};
+
+ {
+   status = "okay";
+
+   port {
+
+   dcu_out: endpoint {
+   remote-endpoint = <_encoder_in>;
+   };
+   };
+};
+
+ {
+   compatible = "tianma,tm070jvhg33";
+   status = "okay";
+
+   port {
+   panel_in: endpoint {
+   remote-endpoint = <_encoder_out>;
+   };
+   };
+};
+
+_encoder {
+   status = "okay";
+
+   ports {
+   port@0 {
+   lvds_encoder_in: endpoint {
+   remote-endpoint = <_out>;
+   };
+   };
+
+   port@1 {
+   lvds_encoder_out: endpoint {
+   remote-endpoint = <_in>;
+   };
+   };
+   };
+};
-- 
2.34.1



[PATCH v2 03/10] ARM: dts: ls1021a: add TQMLS1021A flash partition layout

2023-01-20 Thread Alexander Stein
The bootloader does not add the partitions into DT, so add them manually
here.

Signed-off-by: Alexander Stein 
---
Changes in v2:
* None

 arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi 
b/arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi
index 24ad4a76fe597..ea6734253ba88 100644
--- a/arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi
@@ -77,5 +77,36 @@ qflash0: flash@0 {
spi-rx-bus-width = <4>;
spi-tx-bus-width = <4>;
reg = <0>;
+
+   partitions {
+   compatible = "fixed-partitions";
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   uboot@0 {
+   label = "U-Boot-PBL";
+   reg = <0x0 0xe>;
+   };
+
+   env@e {
+   label = "U-Boot Environment";
+   reg = <0xe 0x1>;
+   };
+
+   dtb@f {
+   label = "DTB";
+   reg = <0xf 0x1>;
+   };
+
+   linux@10 {
+   label = "Linux";
+   reg = <0x10 0x70>;
+   };
+
+   rootfs@80 {
+   label = "RootFS";
+   reg = <0x80 0x380>;
+   };
+   };
};
 };
-- 
2.34.1



[PATCH v2 02/10] ARM: dts: ls1021a: add TQ-Systems MBLS102xA device tree

2023-01-20 Thread Alexander Stein
Add device tree for the MBLS102xA mainboard with TQMLS1021A SoM.

Signed-off-by: Alexander Stein 
---
Changes in v2:
* Remove unnecessary status = "okay"
* Remove underscore from node names
* Move reg direct below compatiblefor i2c devices
* Remove i2c device nodes without software support
  Add a comment about existance for the device though

 arch/arm/boot/dts/Makefile|   1 +
 .../boot/dts/ls1021a-tqmls1021a-mbls1021a.dts | 406 ++
 arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi |  81 
 3 files changed, 488 insertions(+)
 create mode 100644 arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a.dts
 create mode 100644 arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index d0c07867aeabe..44b5ed44b13d6 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -802,6 +802,7 @@ dtb-$(CONFIG_SOC_LS1021A) += \
ls1021a-iot.dtb \
ls1021a-moxa-uc-8410a.dtb \
ls1021a-qds.dtb \
+   ls1021a-tqmls1021a-mbls1021a.dtb \
ls1021a-tsn.dtb \
ls1021a-twr.dtb
 dtb-$(CONFIG_SOC_VF610) += \
diff --git a/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a.dts 
b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a.dts
new file mode 100644
index 0..aa8b605344655
--- /dev/null
+++ b/arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a.dts
@@ -0,0 +1,406 @@
+// SPDX-License-Identifier: (GPL-2.0-or-later OR X11)
+/*
+ * Copyright 2013-2014 Freescale Semiconductor, Inc.
+ * Copyright 2018-2023 TQ-Systems GmbH ,
+ * D-82229 Seefeld, Germany.
+ * Author: Alexander Stein
+ */
+
+/dts-v1/;
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "ls1021a-tqmls1021a.dtsi"
+
+/ {
+   model = "TQMLS102xA SOM on MBLS102xA";
+   compatible = "tq,ls1021a-tqmls1021a-mbls102xa", 
"tq,ls1021a-tqmls1021a", "fsl,ls1021a";
+
+   audio_mclk: audio-clock {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <2500>;
+   };
+
+   backlight_dcu: backlight {
+   compatible = "gpio-backlight";
+   gpios = < 0 GPIO_ACTIVE_LOW>;
+   status = "disabled";
+   };
+
+   gpio-keys {
+   compatible = "gpio-keys";
+   autorepeat;
+
+   switch-1 {
+   label = "S6";
+   linux,code = ;
+   gpios = <_0 0 GPIO_ACTIVE_LOW>;
+   };
+
+   btn2: switch-2 {
+   label = "S7";
+   linux,code = ;
+   gpios = <_0 1 GPIO_ACTIVE_LOW>;
+   };
+
+   switch-3 {
+   label = "S8";
+   linux,code = ;
+   gpios = <_0 2 GPIO_ACTIVE_LOW>;
+   };
+   };
+
+   gpio_leds: gpio-leds {
+   compatible = "gpio-leds";
+
+   led-0 {
+   color = ;
+   function = LED_FUNCTION_STATUS;
+   function-enumerator = <0>;
+   gpios = <_2 4 GPIO_ACTIVE_LOW>;
+   linux,default-trigger = "default-on";
+   };
+
+   led-1 {
+   color = ;
+   function = LED_FUNCTION_STATUS;
+   function-enumerator = <1>;
+   gpios = <_2 5 GPIO_ACTIVE_LOW>;
+   linux,default-trigger = "default-on";
+   };
+
+   led-2 {
+   color = ;
+   function = LED_FUNCTION_STATUS;
+   function-enumerator = <2>;
+   gpios = <_2 6 GPIO_ACTIVE_LOW>;
+   linux,default-trigger = "default-on";
+   };
+
+   led-3 {
+   color = ;
+   function = LED_FUNCTION_HEARTBEAT;
+   function-enumerator = <0>;
+   gpios = <_2 7 GPIO_ACTIVE_LOW>;
+   linux,default-trigger = "heartbeat";
+   };
+   };
+
+   lvds_encoder: lvds-encoder {
+   compatible = "ti,sn75lvds83", "lvds-encoder";
+   power-supply = <_3p3v>;
+   status = "disabled";
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+
+   lvds_encoder_in: endpoint {};
+   };
+
+   port@1 {
+   reg = <1>;
+
+   lvds_encoder_out: endpoint {};
+   };
+   };
+   };
+
+   reg_1p2v: regulator-1p2v {
+   compatible = "regulator-fixed";
+   regulator-name = "1P2V";
+   

[PATCH v2 01/10] dt-bindings: arm: fsl: add TQ-Systems LS1021A board

2023-01-20 Thread Alexander Stein
From: Matthias Schiffer 

TQMLS102xA is a SOM family using NXP LS1021A CPU family.
MBLS102xA is an evaluation mainboard for this SOM.

Signed-off-by: Matthias Schiffer 
Signed-off-by: Alexander Stein 
---
Changes in v2:
* Improved the description mentioning this is a socketable module

 Documentation/devicetree/bindings/arm/fsl.yaml | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/devicetree/bindings/arm/fsl.yaml 
b/Documentation/devicetree/bindings/arm/fsl.yaml
index 3ba354578e8f9..489d350c3bbbe 100644
--- a/Documentation/devicetree/bindings/arm/fsl.yaml
+++ b/Documentation/devicetree/bindings/arm/fsl.yaml
@@ -1164,6 +1164,16 @@ properties:
   - fsl,ls1021a-twr
   - const: fsl,ls1021a
 
+  - description:
+  TQ-Systems TQMLS102xA is a series of socketable SOM featuring
+  LS102x system-on-chip variants. MBLS102xA mainboard can be used as
+  starterkit.
+items:
+  - enum:
+  - tq,ls1021a-tqmls1021a-mbls102xa
+  - const: tq,ls1021a-tqmls1021a
+  - const: fsl,ls1021a
+
   - description: LS1028A based Boards
 items:
   - enum:
-- 
2.34.1



[PATCH v2 00/10] TQMLS1021A support

2023-01-20 Thread Alexander Stein
Hi,

this is the second round of this series to add support for the TQMLS1021A
using the MBLS1021A mainboard.
The changelog is included in the individual patches but the most notable
one is picking up the old patch series for adding lpae config snippet
support.

Best regards,
Alexander

Alexander Stein (7):
  ARM: dts: ls1021a: add TQ-Systems MBLS102xA device tree
  ARM: dts: ls1021a: add TQMLS1021A flash partition layout
  ARM: dts: ls1021a: add TQMLS1021A/MBLS102xA LVDS TM070JVHG33 overlay
  ARM: dts: ls1021a: add TQMLS1021A/MBLS102xA HDMI overlay
  ARM: dts: ls1021a: add TQMLS1021A/MBLS102xA LVDS CDTECH DC44 overlay
  ARM: dts: ls1021a: add TQMLS1021A/MBLS102xA LVDS CDTECH FC21 overlay
  ARM: multi_v7_defconfig: Add options to support TQMLS102xA series

Matthias Schiffer (1):
  dt-bindings: arm: fsl: add TQ-Systems LS1021A board

Nicolas Saenz Julienne (2):
  kbuild: Add config fragment merge functionality
  ARM: add multi_v7_lpae_defconfig

 .../devicetree/bindings/arm/fsl.yaml  |  10 +
 arch/arm/Makefile |   4 +
 arch/arm/boot/dts/Makefile|  12 +
 ...1021a-tqmls1021a-mbls1021a-cdtech-dc44.dts |  55 +++
 ...1021a-tqmls1021a-mbls1021a-cdtech-fc21.dts |  55 +++
 .../ls1021a-tqmls1021a-mbls1021a-hdmi.dtso|  36 ++
 ...tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtso |  56 +++
 .../boot/dts/ls1021a-tqmls1021a-mbls1021a.dts | 406 ++
 arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi | 112 +
 arch/arm/configs/lpae.config  |   1 +
 arch/arm/configs/multi_v7_defconfig   |  13 +
 arch/powerpc/Makefile |  12 +-
 scripts/Makefile.defconf  |  15 +
 13 files changed, 776 insertions(+), 11 deletions(-)
 create mode 100644 
arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-dc44.dts
 create mode 100644 
arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-cdtech-fc21.dts
 create mode 100644 arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-hdmi.dtso
 create mode 100644 
arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a-lvds-tm070jvhg33.dtso
 create mode 100644 arch/arm/boot/dts/ls1021a-tqmls1021a-mbls1021a.dts
 create mode 100644 arch/arm/boot/dts/ls1021a-tqmls1021a.dtsi
 create mode 100644 arch/arm/configs/lpae.config
 create mode 100644 scripts/Makefile.defconf

-- 
2.34.1



Re: [PATCH v3 16/51] cpuidle: Annotate poll_idle()

2023-01-20 Thread Peter Zijlstra
On Thu, Jan 12, 2023 at 08:43:30PM +0100, Peter Zijlstra wrote:
> The __cpuidle functions will become a noinstr class, as such they need
> explicit annotations.
> 
> Signed-off-by: Peter Zijlstra (Intel) 
> Reviewed-by: Rafael J. Wysocki 
> Acked-by: Frederic Weisbecker 
> Tested-by: Tony Lindgren 
> Tested-by: Ulf Hansson 
> ---
>  drivers/cpuidle/poll_state.c |6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> --- a/drivers/cpuidle/poll_state.c
> +++ b/drivers/cpuidle/poll_state.c
> @@ -13,7 +13,10 @@
>  static int __cpuidle poll_idle(struct cpuidle_device *dev,
>  struct cpuidle_driver *drv, int index)
>  {
> - u64 time_start = local_clock();
> + u64 time_start;
> +
> + instrumentation_begin();
> + time_start = local_clock();
>  
>   dev->poll_time_limit = false;
>  
> @@ -39,6 +42,7 @@ static int __cpuidle poll_idle(struct cp
>   raw_local_irq_disable();
>  
>   current_clr_polling();
> + instrumentation_end();
>  
>   return index;
>  }

Pff, this patch is garbage. However wrote it didn't have his brain
engaged :/

Something like the below fixes it, but I still need to build me funny
configs like ia64 and paravirt to see if I didn't wreck me something...

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a78e73da4a74..70c07e11caa6 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -215,7 +215,7 @@ static void __init cyc2ns_init_secondary_cpus(void)
 /*
  * Scheduler clock - returns current time in nanosec units.
  */
-u64 native_sched_clock(void)
+noinstr u64 native_sched_clock(void)
 {
if (static_branch_likely(&__use_tsc)) {
u64 tsc_now = rdtsc();
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 500d1720421e..0b00f21cefe3 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -426,7 +426,7 @@ void cpuidle_reflect(struct cpuidle_device *dev, int index)
  * @dev:   the cpuidle device
  *
  */
-u64 cpuidle_poll_time(struct cpuidle_driver *drv,
+__cpuidle u64 cpuidle_poll_time(struct cpuidle_driver *drv,
  struct cpuidle_device *dev)
 {
int i;
diff --git a/drivers/cpuidle/poll_state.c b/drivers/cpuidle/poll_state.c
index d25ec52846e6..bdcfeaecd228 100644
--- a/drivers/cpuidle/poll_state.c
+++ b/drivers/cpuidle/poll_state.c
@@ -15,7 +15,6 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
 {
u64 time_start;
 
-   instrumentation_begin();
time_start = local_clock();
 
dev->poll_time_limit = false;
@@ -42,7 +41,6 @@ static int __cpuidle poll_idle(struct cpuidle_device *dev,
raw_local_irq_disable();
 
current_clr_polling();
-   instrumentation_end();
 
return index;
 }
diff --git a/include/linux/sched/clock.h b/include/linux/sched/clock.h
index 867d588314e0..7960f0769884 100644
--- a/include/linux/sched/clock.h
+++ b/include/linux/sched/clock.h
@@ -45,7 +45,7 @@ static inline u64 cpu_clock(int cpu)
return sched_clock();
 }
 
-static inline u64 local_clock(void)
+static __always_inline u64 local_clock(void)
 {
return sched_clock();
 }
@@ -79,7 +79,7 @@ static inline u64 cpu_clock(int cpu)
return sched_clock_cpu(cpu);
 }
 
-static inline u64 local_clock(void)
+static __always_inline u64 local_clock(void)
 {
return sched_clock_cpu(raw_smp_processor_id());
 }
diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index e374c0c923da..6b3b0559e53c 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -260,7 +260,7 @@ notrace static inline u64 wrap_max(u64 x, u64 y)
  *  - filter out backward motion
  *  - use the GTOD tick value to create a window to filter crazy TSC values
  */
-notrace static u64 sched_clock_local(struct sched_clock_data *scd)
+noinstr static u64 sched_clock_local(struct sched_clock_data *scd)
 {
u64 now, clock, old_clock, min_clock, max_clock, gtod;
s64 delta;
@@ -287,7 +287,7 @@ notrace static u64 sched_clock_local(struct 
sched_clock_data *scd)
clock = wrap_max(clock, min_clock);
clock = wrap_min(clock, max_clock);
 
-   if (!try_cmpxchg64(>clock, _clock, clock))
+   if (!arch_try_cmpxchg64(>clock, _clock, clock))
goto again;
 
return clock;
@@ -360,7 +360,7 @@ notrace static u64 sched_clock_remote(struct 
sched_clock_data *scd)
  *
  * See cpu_clock().
  */
-notrace u64 sched_clock_cpu(int cpu)
+noinstr u64 sched_clock_cpu(int cpu)
 {
struct sched_clock_data *scd;
u64 clock;


Re: [PATCH v4 5/7] treewide: Trace IPIs sent via smp_send_reschedule()

2023-01-20 Thread Huacai Chen
For LoongArch parts,
Acked-by: Huacai Chen 

On Thu, Jan 19, 2023 at 10:37 PM Valentin Schneider  wrote:
>
> To be able to trace invocations of smp_send_reschedule(), rename the
> arch-specific definitions of it to arch_smp_send_reschedule() and wrap it
> into an smp_send_reschedule() that contains a tracepoint.
>
> Changes to include the declaration of the tracepoint were driven by the
> following coccinelle script:
>
>   @func_use@
>   @@
>   smp_send_reschedule(...);
>
>   @include@
>   @@
>   #include 
>
>   @no_include depends on func_use && !include@
>   @@
> #include <...>
>   +
>   + #include 
>
> Signed-off-by: Valentin Schneider 
> [csky bits]
> Acked-by: Guo Ren 
> [riscv bits]
> Acked-by: Palmer Dabbelt 
> ---
>  arch/alpha/kernel/smp.c  |  2 +-
>  arch/arc/kernel/smp.c|  2 +-
>  arch/arm/kernel/smp.c|  2 +-
>  arch/arm/mach-actions/platsmp.c  |  2 ++
>  arch/arm64/kernel/smp.c  |  2 +-
>  arch/csky/kernel/smp.c   |  2 +-
>  arch/hexagon/kernel/smp.c|  2 +-
>  arch/ia64/kernel/smp.c   |  4 ++--
>  arch/loongarch/kernel/smp.c  |  4 ++--
>  arch/mips/include/asm/smp.h  |  2 +-
>  arch/mips/kernel/rtlx-cmp.c  |  2 ++
>  arch/openrisc/kernel/smp.c   |  2 +-
>  arch/parisc/kernel/smp.c |  4 ++--
>  arch/powerpc/kernel/smp.c|  6 --
>  arch/powerpc/kvm/book3s_hv.c |  3 +++
>  arch/powerpc/platforms/powernv/subcore.c |  2 ++
>  arch/riscv/kernel/smp.c  |  4 ++--
>  arch/s390/kernel/smp.c   |  2 +-
>  arch/sh/kernel/smp.c |  2 +-
>  arch/sparc/kernel/smp_32.c   |  2 +-
>  arch/sparc/kernel/smp_64.c   |  2 +-
>  arch/x86/include/asm/smp.h   |  2 +-
>  arch/x86/kvm/svm/svm.c   |  4 
>  arch/x86/kvm/x86.c   |  2 ++
>  arch/xtensa/kernel/smp.c |  2 +-
>  include/linux/smp.h  | 11 +--
>  virt/kvm/kvm_main.c  |  2 ++
>  27 files changed, 52 insertions(+), 26 deletions(-)
>
> diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
> index f4e20f75438f8..38637eb9eebd5 100644
> --- a/arch/alpha/kernel/smp.c
> +++ b/arch/alpha/kernel/smp.c
> @@ -562,7 +562,7 @@ handle_ipi(struct pt_regs *regs)
>  }
>
>  void
> -smp_send_reschedule(int cpu)
> +arch_smp_send_reschedule(int cpu)
>  {
>  #ifdef DEBUG_IPI_MSG
> if (cpu == hard_smp_processor_id())
> diff --git a/arch/arc/kernel/smp.c b/arch/arc/kernel/smp.c
> index ad93fe6e4b77d..409cfa4675b40 100644
> --- a/arch/arc/kernel/smp.c
> +++ b/arch/arc/kernel/smp.c
> @@ -292,7 +292,7 @@ static void ipi_send_msg(const struct cpumask *callmap, 
> enum ipi_msg_type msg)
> ipi_send_msg_one(cpu, msg);
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
> ipi_send_msg_one(cpu, IPI_RESCHEDULE);
>  }
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 45b8ca2ce521f..dea24a6e0ed6f 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -744,7 +744,7 @@ void __init set_smp_ipi_range(int ipi_base, int n)
> ipi_setup(smp_processor_id());
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
> smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/arch/arm/mach-actions/platsmp.c b/arch/arm/mach-actions/platsmp.c
> index f26618b435145..7b208e96fbb67 100644
> --- a/arch/arm/mach-actions/platsmp.c
> +++ b/arch/arm/mach-actions/platsmp.c
> @@ -20,6 +20,8 @@
>  #include 
>  #include 
>
> +#include 
> +
>  #define OWL_CPU1_ADDR  0x50
>  #define OWL_CPU1_FLAG  0x5c
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 937d2623e06ba..8d108edc4a89f 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -976,7 +976,7 @@ void __init set_smp_ipi_range(int ipi_base, int n)
> ipi_setup(smp_processor_id());
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
> smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/arch/csky/kernel/smp.c b/arch/csky/kernel/smp.c
> index 4b605aa2e1d65..fd7f81be16dd6 100644
> --- a/arch/csky/kernel/smp.c
> +++ b/arch/csky/kernel/smp.c
> @@ -140,7 +140,7 @@ void smp_send_stop(void)
> on_each_cpu(ipi_stop, NULL, 1);
>  }
>
> -void smp_send_reschedule(int cpu)
> +void arch_smp_send_reschedule(int cpu)
>  {
> send_ipi_message(cpumask_of(cpu), IPI_RESCHEDULE);
>  }
> diff --git a/arch/hexagon/kernel/smp.c b/arch/hexagon/kernel/smp.c
> index 4ba93e59370c4..4e8bee25b8c68 100644
> --- a/arch/hexagon/kernel/smp.c
> +++ b/arch/hexagon/kernel/smp.c
> @@ -217,7 +217,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
> }
>  }
>
> -void smp_send_reschedule(int 

Re: [PATCH v2] of: Fix of platform build on powerpc due to bad of disaply code

2023-01-20 Thread Michal Suchánek
On Thu, Jan 19, 2023 at 11:34:46AM +0100, Michal Suchánek wrote:
> Hello,
> 
> On Thu, Jan 19, 2023 at 10:24:07AM +, Christophe Leroy wrote:
> > 
> > 
> > Le 19/01/2023 à 10:53, Michal Suchanek a écrit :
> > > The commit 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> > > breaks build because of wrong argument to snprintf. That certainly
> > > avoids the runtime error but is not the intended outcome.
> > > 
> > > Also use standard device name format of-display.N for all created
> > > devices.
> > > 
> > > Fixes: 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> > > Signed-off-by: Michal Suchanek 
> > > ---
> > > v2: Update the device name format
> > > ---
> > >   drivers/of/platform.c | 12 
> > >   1 file changed, 8 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > > index f2a5d679a324..8c1b1de22036 100644
> > > --- a/drivers/of/platform.c
> > > +++ b/drivers/of/platform.c
> > > @@ -525,7 +525,9 @@ static int __init 
> > > of_platform_default_populate_init(void)
> > >   if (IS_ENABLED(CONFIG_PPC)) {
> > >   struct device_node *boot_display = NULL;
> > >   struct platform_device *dev;
> > > - int display_number = 1;
> > > + int display_number = 0;
> > > + char buf[14];
> > 
> > Can you declare that in the for block where it is used instead ?
> 
> No, there are two for blocks.
> 
> > 
> > > + char *of_display_format = "of-display.%d";
> > 
> > Should be const ?
> 
> Yes, could be.
> 
> > 
> > >   int ret;
> > >   
> > >   /* Check if we have a MacOS display without a node spec 
> > > */
> > > @@ -556,7 +558,10 @@ static int __init 
> > > of_platform_default_populate_init(void)
> > >   if (!of_get_property(node, "linux,opened", 
> > > NULL) ||
> > >   !of_get_property(node, 
> > > "linux,boot-display", NULL))
> > >   continue;
> > > - dev = of_platform_device_create(node, "of-display", 
> > > NULL);
> > > + ret = snprintf(buf, sizeof(buf), of_display_format, 
> > > display_number++);
> > > + if (ret >= sizeof(buf))
> > > + continue;
> > 
> > 
> > Can you make buf big enough to avoid that ?
> 
> It would be a bit fragile that way.
> 
> The buffer would have to theoretically accomodate
> "of-display.-9223372036854775808", and any change to the format requires
> recalculating the length, by hand.
> 
> Of course, the memory would run out way before allocating that many
> devices so it's kind of pointless to try and accomodate all possible
> device numbers.
> 
> > 
> > And by the way could it be called something else than 'buf' ?
> > 
> > See exemple here : 
> > https://elixir.bootlin.com/linux/v6.1/source/drivers/fsi/fsi-occ.c#L690
> 
> Yes, that is quite possible. Nonetheless, just like 'ret' generic
> variable names also work.

And in fact judicious use of short generic variable names is more
readeable than naming all variables foobar_* as far as I am concerned.
Of course, YMMV.

Thanks

Michal


[PATCH] tools/perf: Disable perf probe when libtraceevent is missing

2023-01-20 Thread Athira Rajeev
While parsing the tracepoint events in parse_events_add_tracepoint()
function, code checks for HAVE_LIBTRACEEVENT support. This is needed
since libtraceevent is necessary for tracepoint. But while adding
probe points, check for LIBTRACEEVENT is not done in case of perf probe.
Hence, in environment with missing libtraceevent-devel, it is
observed that adding a probe point works even though its not
supported.

Example:
Adding probe point:
./perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string'
Added new event:
  probe:vfs_getname(on getname_flags:72 with 
pathname=result->name:string)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_getname -aR sleep 1

But trying perf record:
./perf  record -e probe:vfs_getname -aR sleep 1
event syntax error: 'probe:vfs_getname'
\___ unsupported tracepoint
libtraceevent is necessary for tracepoint support
Run 'perf list' for a list of valid events

Fix this by wrapping "builtin-probe" compilation and
"perf probe" usage under "CONFIG_LIBTRACEEVENT" check.

Signed-off-by: Athira Rajeev 
---
 tools/perf/Build  | 4 +++-
 tools/perf/perf.c | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Build b/tools/perf/Build
index 6dd67e502295..a138a2304929 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -33,7 +33,9 @@ ifeq ($(CONFIG_LIBTRACEEVENT),y)
   perf-$(CONFIG_TRACE) += trace/beauty/
 endif
 
-perf-$(CONFIG_LIBELF) += builtin-probe.o
+ifeq ($(CONFIG_LIBTRACEEVENT),y)
+  perf-$(CONFIG_LIBELF) += builtin-probe.o
+endif
 
 perf-y += bench/
 perf-y += tests/
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 82bbe0ca858b..7b0d79284d5a 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -80,9 +80,11 @@ static struct cmd_struct commands[] = {
 #ifdef HAVE_LIBTRACEEVENT
{ "sched",  cmd_sched,  0 },
 #endif
+#ifdef HAVE_LIBTRACEEVENT
 #ifdef HAVE_LIBELF_SUPPORT
{ "probe",  cmd_probe,  0 },
 #endif
+#endif
 #ifdef HAVE_LIBTRACEEVENT
{ "kmem",   cmd_kmem,   0 },
{ "lock",   cmd_lock,   0 },
-- 
2.39.0



Re: [PATCH v2] of: Fix of platform build on powerpc due to bad of disaply code

2023-01-20 Thread Michal Suchánek
On Fri, Jan 20, 2023 at 12:39:23PM +0100, Thomas Zimmermann wrote:
> Hi
> 
> Am 20.01.23 um 12:27 schrieb Michal Suchánek:
> > Hello,
> > 
> > On Thu, Jan 19, 2023 at 04:20:57PM +0100, Thomas Zimmermann wrote:
> > > Hi
> > > 
> > > Am 19.01.23 um 14:23 schrieb Michal Suchánek:
> > > > On Thu, Jan 19, 2023 at 02:11:13PM +0100, Thomas Zimmermann wrote:
> > > > > Hi
> > > > > 
> > > > > Am 19.01.23 um 11:24 schrieb Christophe Leroy:
> > > > > > 
> > > > > > 
> > > > > > Le 19/01/2023 à 10:53, Michal Suchanek a écrit :
> > > > > > > The commit 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> > > > > > > breaks build because of wrong argument to snprintf. That certainly
> > > > > > > avoids the runtime error but is not the intended outcome.
> > > > > > > 
> > > > > > > Also use standard device name format of-display.N for all created
> > > > > > > devices.
> > > > > > > 
> > > > > > > Fixes: 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> > > > > > > Signed-off-by: Michal Suchanek 
> > > > > > > ---
> > > > > > > v2: Update the device name format
> > > > > > > ---
> > > > > > >  drivers/of/platform.c | 12 
> > > > > > >  1 file changed, 8 insertions(+), 4 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > > > > > > index f2a5d679a324..8c1b1de22036 100644
> > > > > > > --- a/drivers/of/platform.c
> > > > > > > +++ b/drivers/of/platform.c
> > > > > > > @@ -525,7 +525,9 @@ static int __init 
> > > > > > > of_platform_default_populate_init(void)
> > > > > > >   if (IS_ENABLED(CONFIG_PPC)) {
> > > > > > >   struct device_node *boot_display = NULL;
> > > > > > >   struct platform_device *dev;
> > > > > > > - int display_number = 1;
> > > > > > > + int display_number = 0;
> > > > > > > + char buf[14];
> > > > > > 
> > > > > > Can you declare that in the for block where it is used instead ?
> > > > > > 
> > > > > > > + char *of_display_format = "of-display.%d";
> > > > > > 
> > > > > > Should be const ?
> > > > > 
> > > > > That should be static const of_display_format[] = then
> > > > 
> > > > Why? It sounds completely fine to have a const pointer to a string
> > > > constatnt.
> > > 
> > > Generally speaking:
> > > 
> > > 'static' because your const pointer is then not a local variable, so it
> > > takes pressure off the stack. For global variables, you don't want them to
> > > show up in any linker symbol tables.
> > 
> > This sounds a lot like an exemplar case of premature optimization.
> > A simplistic compiler might do exactly what you say, and allocate a slot
> > for the variable on the stack the moment the function is entered.
> > 
> > However, in real compilers there is no stack pressure from having a
> > local variable:
> >   - the compiler can put the variable into a register
> >   - it can completely omit the variable before and after it's actually
> > used which is that specific function call
> > 
> > > The string "of-display.%d" is stored as an array in the ELF data section.
> > > And your char pointer is a reference to that array. For static pointers,
> > > these indirections take CPU cycles to update when the loader has to 
> > > relocate
> > 
> > Provided that the char pointer ever exists in the compiled code. Its
> > address is not taken so it does not need to.
> > 
> > > sections. If you declare of_display_format[] directly as array, you avoid
> > > the reference and work directly with the array.
> > > 
> > > Of course, this is a kernel module and the string is self-contained within
> > > the function. So the compiler can probably detect that and optimize the 
> > > code
> > > to be like the 'static const []' version. It's still good to follow best
> > > practices, as someone might copy from this function.
> > 
> > If it could not detect it there would be a lot of trouble all around.
> 
> The issues definitely exist in userspace code. Kernel modules are simpler,
> so compiler optimization is easier.
> 
> But I'm not really trying to make a technical argument. My point here is
> that someone might read your code and duplicate the pattern. That's not
> unreasonable: it's core Linux code, so it can be assumed to be good (or at
> least not bad). But your current code teaches the reader a bad practices,
> which should be avoided. It is better to do the correct thing, even if it
> makes no difference to the compiled code.

The point I am trying to get across is that besides the original
objection about missing 'const' this code is not bad. Loading a string
constant address into a local variable and passing it as function call
argument is perfectly fine.

If you get any advantage by the alternate convoluted construct it's
more likely than anything else a bug in the compiler you are using.

It may be necessary to work around such bugs in performance-critical
code but not in driver probing code that runs exactly once during boot.

Thanks


RE: Calculating array sizes in C - was: Re: Build regressions/improvements in v6.2-rc1

2023-01-20 Thread David Laight
From: Segher Boessenkool
> Sent: 20 January 2023 10:54
...
> > > I suggest to file a bug against gcc complaining about a "spurious
> > > warning", and using "-Werror -Wno-error-sizeof-pointer-div" until gcc is
> > > adapted to not emit the warning about the pointer division if the result
> > > is not used.
> 
> Yeah.  If the first operand of a conditional operator is non-zero, the
> second operand is not evaluated, and if the first is zero, the third
> operand is not evaluated.  It is better if we do not warn about
> something we do not evaluate.  In cases like here where it is clear at
> compile time which branch is taken, that shouldn't be too hard.
> 
> Can someone please file a GCC PR?  With reduced testcase preferably.

It's not a bug.
All the operands of the conditional operator have to be valid.
It might be that the optimiser can discard one, but that happens
much later on.
Even the operands of choose_expr() have to be valid - but can
have different types.

I'm not sure what the code is trying to do or why it is failing.
Was it a fail in userspace - where the option to allow sizeof (void)
isn't allowed.

FWIW you can check for a compile-time NULL (or 0) with:
#define is_null(x) (sizeof *(0 : (void *)(x) ? (int *)0) != 1)

Although that is a compile-time error for non-NULL unless
'void *' arithmetic is allowed.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: [PATCH v7 4/8] crash: add phdr for possible CPUs in elfcorehdr

2023-01-20 Thread Laurent Dufour
On 19/01/2023 19:29:52, Laurent Dufour wrote:
> On 15/01/2023 16:02:02, Sourabh Jain wrote:
>> On architectures like PowerPC the crash notes are available for all
>> possible CPUs. So let's populate the elfcorehdr for all possible
>> CPUs having crash notes to avoid updating elfcorehdr during in-kernel
>> crash update on CPU hotplug events.
>>
>> The similar technique is used in kexec-tool for kexec_load case.
>>
>> Signed-off-by: Sourabh Jain 
>> ---
>>  kernel/crash_core.c | 9 ++---
>>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> This patch is not applying on ppc/next (53ab112a9508).
> 
> As far as I could see, crash_prepare_elf64_headers() is defined in the file
> kernel/kexec_file.c and that's not recent, see babac4a84a88 (kexec_file,
> x86: move re-factored code to generic side, 2018-04-13)
> 
> Am I missing something?

My mistake, sounds that your series is based on top of the Eric's one (not yet 
upstream):

https://lore.kernel.org/lkml/20230118213544.2128-1-eric.devol...@oracle.com/

> 
>>
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 910d377ea317e..19f987b3851e8 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -364,8 +364,8 @@ int crash_prepare_elf64_headers(struct kimage *image, 
>> struct crash_mem *mem,
>>  ehdr->e_ehsize = sizeof(Elf64_Ehdr);
>>  ehdr->e_phentsize = sizeof(Elf64_Phdr);
>>  
>> -/* Prepare one phdr of type PT_NOTE for each present CPU */
>> -for_each_present_cpu(cpu) {
>> +/* Prepare one phdr of type PT_NOTE for possible CPU with crash note. */
>> +for_each_possible_cpu(cpu) {
>>  #ifdef CONFIG_CRASH_HOTPLUG
>>  if (IS_ENABLED(CONFIG_HOTPLUG_CPU)) {
>>  /* Skip the soon-to-be offlined cpu */
>> @@ -373,8 +373,11 @@ int crash_prepare_elf64_headers(struct kimage *image, 
>> struct crash_mem *mem,
>>  continue;
>>  }
>>  #endif
>> -phdr->p_type = PT_NOTE;
>>  notes_addr = per_cpu_ptr_to_phys(per_cpu_ptr(crash_notes, cpu));
>> +if (!notes_addr)
>> +continue;
>> +
>> +phdr->p_type = PT_NOTE;
>>  phdr->p_offset = phdr->p_paddr = notes_addr;
>>  phdr->p_filesz = phdr->p_memsz = sizeof(note_buf_t);
>>  (ehdr->e_phnum)++;
> 



Re: [PATCH v2] of: Fix of platform build on powerpc due to bad of disaply code

2023-01-20 Thread Thomas Zimmermann

Hi

Am 20.01.23 um 12:27 schrieb Michal Suchánek:

Hello,

On Thu, Jan 19, 2023 at 04:20:57PM +0100, Thomas Zimmermann wrote:

Hi

Am 19.01.23 um 14:23 schrieb Michal Suchánek:

On Thu, Jan 19, 2023 at 02:11:13PM +0100, Thomas Zimmermann wrote:

Hi

Am 19.01.23 um 11:24 schrieb Christophe Leroy:



Le 19/01/2023 à 10:53, Michal Suchanek a écrit :

The commit 2d681d6a23a1 ("of: Make of framebuffer devices unique")
breaks build because of wrong argument to snprintf. That certainly
avoids the runtime error but is not the intended outcome.

Also use standard device name format of-display.N for all created
devices.

Fixes: 2d681d6a23a1 ("of: Make of framebuffer devices unique")
Signed-off-by: Michal Suchanek 
---
v2: Update the device name format
---
 drivers/of/platform.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index f2a5d679a324..8c1b1de22036 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -525,7 +525,9 @@ static int __init of_platform_default_populate_init(void)
if (IS_ENABLED(CONFIG_PPC)) {
struct device_node *boot_display = NULL;
struct platform_device *dev;
-   int display_number = 1;
+   int display_number = 0;
+   char buf[14];


Can you declare that in the for block where it is used instead ?


+   char *of_display_format = "of-display.%d";


Should be const ?


That should be static const of_display_format[] = then


Why? It sounds completely fine to have a const pointer to a string
constatnt.


Generally speaking:

'static' because your const pointer is then not a local variable, so it
takes pressure off the stack. For global variables, you don't want them to
show up in any linker symbol tables.


This sounds a lot like an exemplar case of premature optimization.
A simplistic compiler might do exactly what you say, and allocate a slot
for the variable on the stack the moment the function is entered.

However, in real compilers there is no stack pressure from having a
local variable:
  - the compiler can put the variable into a register
  - it can completely omit the variable before and after it's actually
used which is that specific function call


The string "of-display.%d" is stored as an array in the ELF data section.
And your char pointer is a reference to that array. For static pointers,
these indirections take CPU cycles to update when the loader has to relocate


Provided that the char pointer ever exists in the compiled code. Its
address is not taken so it does not need to.


sections. If you declare of_display_format[] directly as array, you avoid
the reference and work directly with the array.

Of course, this is a kernel module and the string is self-contained within
the function. So the compiler can probably detect that and optimize the code
to be like the 'static const []' version. It's still good to follow best
practices, as someone might copy from this function.


If it could not detect it there would be a lot of trouble all around.


The issues definitely exist in userspace code. Kernel modules are 
simpler, so compiler optimization is easier.


But I'm not really trying to make a technical argument. My point here is 
that someone might read your code and duplicate the pattern. That's not 
unreasonable: it's core Linux code, so it can be assumed to be good (or 
at least not bad). But your current code teaches the reader a bad 
practices, which should be avoided. It is better to do the correct 
thing, even if it makes no difference to the compiled code.


Best regards
Thomas



Thanks

Michal


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev


OpenPGP_signature
Description: OpenPGP digital signature


Re: [PATCH v2] of: Fix of platform build on powerpc due to bad of disaply code

2023-01-20 Thread Michal Suchánek
Hello,

On Thu, Jan 19, 2023 at 04:20:57PM +0100, Thomas Zimmermann wrote:
> Hi
> 
> Am 19.01.23 um 14:23 schrieb Michal Suchánek:
> > On Thu, Jan 19, 2023 at 02:11:13PM +0100, Thomas Zimmermann wrote:
> > > Hi
> > > 
> > > Am 19.01.23 um 11:24 schrieb Christophe Leroy:
> > > > 
> > > > 
> > > > Le 19/01/2023 à 10:53, Michal Suchanek a écrit :
> > > > > The commit 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> > > > > breaks build because of wrong argument to snprintf. That certainly
> > > > > avoids the runtime error but is not the intended outcome.
> > > > > 
> > > > > Also use standard device name format of-display.N for all created
> > > > > devices.
> > > > > 
> > > > > Fixes: 2d681d6a23a1 ("of: Make of framebuffer devices unique")
> > > > > Signed-off-by: Michal Suchanek 
> > > > > ---
> > > > > v2: Update the device name format
> > > > > ---
> > > > > drivers/of/platform.c | 12 
> > > > > 1 file changed, 8 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> > > > > index f2a5d679a324..8c1b1de22036 100644
> > > > > --- a/drivers/of/platform.c
> > > > > +++ b/drivers/of/platform.c
> > > > > @@ -525,7 +525,9 @@ static int __init 
> > > > > of_platform_default_populate_init(void)
> > > > >   if (IS_ENABLED(CONFIG_PPC)) {
> > > > >   struct device_node *boot_display = NULL;
> > > > >   struct platform_device *dev;
> > > > > - int display_number = 1;
> > > > > + int display_number = 0;
> > > > > + char buf[14];
> > > > 
> > > > Can you declare that in the for block where it is used instead ?
> > > > 
> > > > > + char *of_display_format = "of-display.%d";
> > > > 
> > > > Should be const ?
> > > 
> > > That should be static const of_display_format[] = then
> > 
> > Why? It sounds completely fine to have a const pointer to a string
> > constatnt.
> 
> Generally speaking:
> 
> 'static' because your const pointer is then not a local variable, so it
> takes pressure off the stack. For global variables, you don't want them to
> show up in any linker symbol tables.

This sounds a lot like an exemplar case of premature optimization.
A simplistic compiler might do exactly what you say, and allocate a slot
for the variable on the stack the moment the function is entered.

However, in real compilers there is no stack pressure from having a
local variable:
 - the compiler can put the variable into a register
 - it can completely omit the variable before and after it's actually
   used which is that specific function call

> The string "of-display.%d" is stored as an array in the ELF data section.
> And your char pointer is a reference to that array. For static pointers,
> these indirections take CPU cycles to update when the loader has to relocate

Provided that the char pointer ever exists in the compiled code. Its
address is not taken so it does not need to.

> sections. If you declare of_display_format[] directly as array, you avoid
> the reference and work directly with the array.
> 
> Of course, this is a kernel module and the string is self-contained within
> the function. So the compiler can probably detect that and optimize the code
> to be like the 'static const []' version. It's still good to follow best
> practices, as someone might copy from this function.

If it could not detect it there would be a lot of trouble all around.

Thanks

Michal


Re: Calculating array sizes in C - was: Re: Build regressions/improvements in v6.2-rc1

2023-01-20 Thread Segher Boessenkool
On Thu, Jan 19, 2023 at 09:31:21PM -0600, Rob Landley wrote:
> On 1/19/23 16:11, Michael.Karcher wrote:
> > I don't see a clear bug at this point. We are talking about the C expression
> > 
> >    __same_type((void*)0, (void*)0)? 0 : sizeof((void*)0)/sizeof(*((void*0))

(__same_type is a kernel macro, it expands to something with
__builtin_compatible_type()).

> *(void*) is type "void" which does not have a size.

It has size 1, in GCC, so that you can do arithmetic on pointers to
void.  This is a long-standing and very widely used GCC extension.

"""
6.24 Arithmetic on 'void'- and Function-Pointers


In GNU C, addition and subtraction operations are supported on pointers
to 'void' and on pointers to functions.  This is done by treating the
size of a 'void' or of a function as 1.

 A consequence of this is that 'sizeof' is also allowed on 'void' and on
function types, and returns 1.

 The option '-Wpointer-arith' requests a warning if these extensions are
used.
"""

> The problem is gcc "optimizing out" an earlier type check, the same way it
> "optimizes out" checks for signed integer math overflowing, or "optimizes 
> out" a
> comparison to pointers from two different local variables from different
> function calls trying to calculate the amount of stack used, or "optimizes 
> out"

Are you saying something in the kernel code here is invalid code?
Because your other examples are.

> using char *x = (char *)1; as a flag value and then doing "if (!(x-1)) because
> it can "never happen"...

Like here.  And no, this is not allowed by -fno-strict-aliasing.

> > I suggest to file a bug against gcc complaining about a "spurious 
> > warning", and using "-Werror -Wno-error-sizeof-pointer-div" until gcc is 
> > adapted to not emit the warning about the pointer division if the result 
> > is not used.

Yeah.  If the first operand of a conditional operator is non-zero, the
second operand is not evaluated, and if the first is zero, the third
operand is not evaluated.  It is better if we do not warn about
something we do not evaluate.  In cases like here where it is clear at
compile time which branch is taken, that shouldn't be too hard.

Can someone please file a GCC PR?  With reduced testcase preferably.


Segher


[PATCH] powerpc/powernv/ioda: Skip unallocated resources when mapping to PE

2023-01-20 Thread Frederic Barrat
pnv_ioda_setup_pe_res() calls opal to map a resource with a PE. However,
the code assumes the resource is allocated and it uses the resource
address to find out the segment(s) which need to be mapped to the
PE. In the unlikely case where the resource hasn't been allocated, the
computation for the segment number is garbage, which can lead to
invalid memory access and potentially a kernel crash, such as:

[ ] pci_bus 0002:02: Configuring PE for bus
[ ] pci 0002:02 : [PE# fc] Secondary bus 
0x0002..0x0002 associated with PE#fc
[ ] BUG: Kernel NULL pointer dereference on write at 0x
[ ] Faulting instruction address: 0xc005eac4
[ ] Oops: Kernel access of bad area, sig: 7 [#1]
[ ] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[ ] Modules linked in:
[ ] CPU: 12 PID: 1 Comm: swapper/20 Not tainted 5.10.50-openpower1 #2
[ ] NIP:  c005eac4 LR: c005ea44 CTR: 30061b9c
[ ] REGS: c00027383650 TRAP: 0300   Not tainted  (5.10.50-openpower1)
[ ] MSR:  90009033   CR: 44000224  XER: 
2004
[ ] CFAR: c005eaa0 DAR:  DSISR: 0208 IRQMASK: 0
[ ] GPR00: c005dd98 c000273838e0 c185de00 c000200fff018960
[ ] GPR04: 00fc 0003  
[ ] GPR08:    90001033
[ ] GPR12: 31cb c00e6a80 c0010a58 
[ ] GPR16:    
[ ] GPR20:    c711e200
[ ] GPR24: 0100 c00029501120 c0002cee2800 03ff
[ ] GPR28: c000200fff018960  c000200ffcb7fd00 
[ ] NIP [c005eac4] pnv_ioda_setup_pe_res+0x94/0x1a0
[ ] LR [c005ea44] pnv_ioda_setup_pe_res+0x14/0x1a0
[ ] Call Trace:
[ ] [c000273838e0] [c005eb98] pnv_ioda_setup_pe_res+0x168/0x1a0 
(unreliable)
[ ] [c00027383970] [c005dd98] pnv_pci_ioda_dma_dev_setup+0x43c/0x970
[ ] [c00027383a60] [c0032cdc] pcibios_bus_add_device+0x78/0x18c
[ ] [c00027383aa0] [c028f2bc] pci_bus_add_device+0x28/0xbc
[ ] [c00027383b10] [c028f3a0] pci_bus_add_devices+0x50/0x7c
[ ] [c00027383b50] [c028f3c4] pci_bus_add_devices+0x74/0x7c
[ ] [c00027383b90] [c028f3c4] pci_bus_add_devices+0x74/0x7c
[ ] [c00027383bd0] [c069ad0c] pcibios_init+0xf0/0x104
[ ] [c00027383c50] [c00106d8] do_one_initcall+0x84/0x1c4
[ ] [c00027383d20] [c06910b8] kernel_init_freeable+0x264/0x268
[ ] [c00027383dc0] [c0010a68] kernel_init+0x18/0x138
[ ] [c00027383e20] [c000cbfc] ret_from_kernel_thread+0x5c/0x80
[ ] Instruction dump:
[ ] 7f89e840 409d000c 7fbbf840 409c000c 38210090 4848f448 809c002c e95e0120
[ ] 7ba91764 38a3 57a7043e 38c0 <7c8a492e> 5484043e e87e0018 4bff23bd

Hitting the problem is not that easy. It was seen with a (semi-bogus)
PCI device with a class code of 0. The generic PCI framework doesn't
allocate resources in such a case.

The patch is simply skipping resources which are still flagged with
IORESOURCE_UNSET.

We don't have the problem with 64-bit mem resources, as the address of
the resource is checked to be within the range of the 64-bit mmio
window. See pnv_ioda_reserve_dev_m64_pe() and pnv_pci_is_m64().

Reported-by: Andrew Jeffery 
Fixes: 23e79425fe7c ("powerpc/powernv: Simplify pnv_ioda_setup_pe_seg()")
Signed-off-by: Frederic Barrat 
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5c144c05cbfd..4f6e20a35aa1 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2325,7 +2325,8 @@ static void pnv_ioda_setup_pe_res(struct pnv_ioda_pe *pe,
int index;
int64_t rc;
 
-   if (!res || !res->flags || res->start > res->end)
+   if (!res || !res->flags || res->start > res->end ||
+   res->flags & IORESOURCE_UNSET)
return;
 
if (res->flags & IORESOURCE_IO) {
-- 
2.39.0



Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Michal Hocko
On Thu 19-01-23 11:17:07, Paul E. McKenney wrote:
> On Thu, Jan 19, 2023 at 01:52:14PM +0100, Michal Hocko wrote:
> > On Wed 18-01-23 11:01:08, Suren Baghdasaryan wrote:
> > > On Wed, Jan 18, 2023 at 10:34 AM Paul E. McKenney  
> > > wrote:
> > [...]
> > > > There are a couple of possibilities here.
> > > >
> > > > First, if I am remembering correctly, the time between the call_rcu()
> > > > and invocation of the corresponding callback was taking multiple 
> > > > seconds,
> > > > but that was because the kernel was built with CONFIG_LAZY_RCU=y in
> > > > order to save power by batching RCU work over multiple call_rcu()
> > > > invocations.  If this is causing a problem for a given call site, the
> > > > shiny new call_rcu_hurry() can be used instead.  Doing this gets back
> > > > to the old-school non-laziness, but can of course consume more power.
> > > 
> > > That would not be the case because CONFIG_LAZY_RCU was not an option
> > > at the time I was profiling this issue.
> > > Laxy RCU would be a great option to replace this patch but
> > > unfortunately it's not the default behavior, so I would still have to
> > > implement this batching in case lazy RCU is not enabled.
> > > 
> > > >
> > > > Second, there is a much shorter one-jiffy delay between the call_rcu()
> > > > and the invocation of the corresponding callback in kernels built with
> > > > either CONFIG_NO_HZ_FULL=y (but only on CPUs mentioned in the nohz_full
> > > > or rcu_nocbs kernel boot parameters) or CONFIG_RCU_NOCB_CPU=y (but only
> > > > on CPUs mentioned in the rcu_nocbs kernel boot parameters).  The purpose
> > > > of this delay is to avoid lock contention, and so this delay is incurred
> > > > only on CPUs that are queuing callbacks at a rate exceeding 16K/second.
> > > > This is reduced to a per-jiffy limit, so on a HZ=1000 system, a CPU
> > > > invoking call_rcu() at least 16 times within a given jiffy will incur
> > > > the added delay.  The reason for this delay is the use of a separate
> > > > ->nocb_bypass list.  As Suren says, this bypass list is used to reduce
> > > > lock contention on the main ->cblist.  This is not needed in old-school
> > > > kernels built without either CONFIG_NO_HZ_FULL=y or 
> > > > CONFIG_RCU_NOCB_CPU=y
> > > > (including most datacenter kernels) because in that case the callbacks
> > > > enqueued by call_rcu() are touched only by the corresponding CPU, so
> > > > that there is no need for locks.
> > > 
> > > I believe this is the reason in my profiled case.
> > > 
> > > >
> > > > Third, if you are instead seeing multiple milliseconds of CPU consumed 
> > > > by
> > > > call_rcu() in the common case (for example, without the aid of 
> > > > interrupts,
> > > > NMIs, or SMIs), please do let me know.  That sounds to me like a bug.
> > > 
> > > I don't think I've seen such a case.
> > > Thanks for clarifications, Paul!
> > 
> > Thanks for the explanation Paul. I have to say this has caught me as a
> > surprise. There are just not enough details about the benchmark to
> > understand what is going on but I find it rather surprising that
> > call_rcu can induce a higher overhead than the actual kmem_cache_free
> > which is the callback. My naive understanding has been that call_rcu is
> > really fast way to defer the execution to the RCU safe context to do the
> > final cleanup.
> 
> If I am following along correctly (ha!), then your "induce a higher
> overhead" should be something like "induce a higher to-kfree() latency".

Yes, this is expected.

> Of course, there already is a higher latency-to-kfree via call_rcu()
> than via a direct call to kfree(), and callback-offload CPUs that are
> being flooded with callbacks raise that latency a jiffy or so more in
> order to avoid lock contention.
> 
> If this becomes a problem, the callback-offloading code can be a bit
> smarter about avoiding lock contention, but need to see a real problem
> before I make that change.  But if there is a real problem I will of
> course fix it.

I believe that Suren claims that the call_rcu is really visible in the
exit_mmap case. Time-to-free actual vmas shouldn't really be material
for that path. If that happens much more later on there could be some
side effects by an increased memory consumption but that should be
marginal. How fast exit_mmap really is should only depend on direct
calls from that path.

But I guess we need some specific numbers from Suren to be sure what is
going on here.

Thanks!
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

2023-01-20 Thread Michal Hocko
On Thu 19-01-23 10:52:03, Suren Baghdasaryan wrote:
> On Thu, Jan 19, 2023 at 4:59 AM Michal Hocko  wrote:
> >
> > On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > > call_rcu() can take a long time when callback offloading is enabled.
> > > Its use in the vm_area_free can cause regressions in the exit path when
> > > multiple VMAs are being freed. To minimize that impact, place VMAs into
> > > a list and free them in groups using one call_rcu() call per group.
> >
> > After some more clarification I can understand how call_rcu might not be
> > super happy about thousands of callbacks to be invoked and I do agree
> > that this is not really optimal.
> >
> > On the other hand I do not like this solution much either.
> > VM_AREA_FREE_LIST_MAX is arbitrary and it won't really help all that
> > much with processes with a huge number of vmas either. It would still be
> > in housands of callbacks to be scheduled without a good reason.
> >
> > Instead, are there any other cases than remove_vma that need this
> > batching? We could easily just link all the vmas into linked list and
> > use a single call_rcu instead, no? This would both simplify the
> > implementation, remove the scaling issue as well and we do not have to
> > argue whether VM_AREA_FREE_LIST_MAX should be epsilon or epsilon + 1.
> 
> Yes, I agree the solution is not stellar. I wanted something simple
> but this is probably too simple. OTOH keeping all dead vm_area_structs
> on the list without hooking up a shrinker (additional complexity) does
> not sound too appealing either.

I suspect you have missed my idea. I do not really want to keep the list
around or any shrinker. It is dead simple. Collect all vmas in
remove_vma and then call_rcu the whole list at once after the whole list
(be it from exit_mmap or remove_mt). See?

-- 
Michal Hocko
SUSE Labs


RE: [PATCH V3] tools/perf/tests: Fix string substitutions in build id test

2023-01-20 Thread David Laight
From: Arnaldo Carvalho de Melo
> Sent: 19 January 2023 17:00
> 
> Em Thu, Jan 19, 2023 at 03:49:15PM +, David Laight escreveu:
> > From: Athira Rajeev
> > > Sent: 19 January 2023 14:27
> > ...
> > > The test script "tests/shell/buildid.sh" uses some of the
> > > string substitution ways which are supported in bash, but not in
> > > "sh" or other shells. Above error on line number 69 that reports
> > > "Bad substitution" is:
> >
> > Looks better - assuming it works :-)
> 
> :-)
> 
> Can I take this as an:
> 
> Acked-by: David Laight 

I'm not sure that is worth anything.
You can add a Reviewed-by

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: Calculating array sizes in C - was: Re: Build regressions/improvements in v6.2-rc1

2023-01-20 Thread John Paul Adrian Glaubitz

Hi Michael!

On 1/19/23 23:11, Michael.Karcher wrote:

I suggest to file a bug against gcc complaining about a "spurious warning",
and using "-Werror -Wno-error-sizeof-pointer-div" until gcc is adapted to
not emit the warning about the pointer division if the result is not used.


Could you post a kernel patch for that? I would be happy to test it on my
SH-7785CLR board. Also, I'm going to file a bug report against GCC.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: [PATCH v9 00/10] phy: Add support for Lynx 10G SerDes

2023-01-20 Thread Vinod Koul
On 19-01-23, 11:22, Sean Anderson wrote:
> On 1/18/23 11:54, Vinod Koul wrote:
> > On 17-01-23, 11:46, Sean Anderson wrote:
> >> 
> >> I noticed that this series is marked "changes requested" on patchwork.
> >> However, I have received only automated feedback. I have done my best
> >> effort to address feedback I have received on prior revisions. I would
> >> appreciate getting another round of review before resending this series.
> > 
> > Looking at the series, looks like kernel-bot sent some warnings on the
> > series so I was expecting an updated series for review
> > 
> 
> Generally, multiple reviewers will comment on a patch, even if another
> reviewer finds something which needs to be changed. This is a one-line
> fix, so I would appreciate getting more substantial feedback before
> respinning. Every time I send a new series I have to rebase and test on
> hardware. It's work that I would rather do when there is something to be
> gained.

I review to apply, if I can apply, I would typically skip this

-- 
~Vinod


Re: [PATCH V3] tools/perf/tests: Fix string substitutions in build id test

2023-01-20 Thread Disha Goel

Environment with /bin/sh
# readlink -f $(which sh)
/bin/dash

Running perf test from tmp.perf/urgent

# ./perf test -v "build id cache operations"
 73: build id cache operations   :
--- start ---
test child forked, pid 71063
WARNING: wine not found. PE binaries will not be run.
test binaries: /tmp/perf.ex.SHA1.RNm /tmp/perf.ex.MD5.Flx 
./tests/shell/../pe-file.exe
DEBUGINFOD_URLS=
Adding 4abd406f041feb4f10ecde3fc30fd0639e1a91cb /tmp/perf.ex.SHA1.RNm: Ok
build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
./tests/shell/buildid.sh: 69: ./tests/shell/buildid.sh: Bad substitution
test child finished with -2
 end 

After applying the patch in same environment, error is not seen and test runs 
fine.

Tested the same patch in bash as well.
# readlink -f $(which sh)
/usr/bin/bash

The test works fine with the changes.

Tested-by: Disha Goel

On 1/19/23 7:57 PM, Athira Rajeev wrote:

The perf test named “build id cache operations” skips with below
error on some distros:

<<>>
  78: build id cache operations   :
test child forked, pid 01
WARNING: wine not found. PE binaries will not be run.
test binaries: /tmp/perf.ex.SHA1.PKz /tmp/perf.ex.MD5.Gt3 
./tests/shell/../pe-file.exe
DEBUGINFOD_URLS=
Adding 4abd406f041feb4f10ecde3fc30fd0639e1a91cb /tmp/perf.ex.SHA1.PKz: Ok
build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
./tests/shell/buildid.sh: 69: ./tests/shell/buildid.sh: Bad substitution
test child finished with -2
build id cache operations: Skip
<<>>

The test script "tests/shell/buildid.sh" uses some of the
string substitution ways which are supported in bash, but not in
"sh" or other shells. Above error on line number 69 that reports
"Bad substitution" is:

<<>>
link=${build_id_dir}/.build-id/${id:0:2}/${id:2}
<<>>

Here the way of getting first two characters from id ie,
${id:0:2} and similarly expressions like ${id:2} is not
recognised in "sh". So the line errors and instead of
hitting failure, the test gets skipped as shown in logs.
So the syntax issue causes test not to be executed in
such cases. Similarly usage : "${@: -1}" [ to pick last
argument passed to a function] in “test_record” doesn’t
work in all distros.

Fix this by using alternative way with shell substitution
to pick required characters from the string. Also fix the
usage of “${@: -1}” to work in all cases.

Another usage in “test_record” is:
<<>>
${perf} record --buildid-all -o ${data} $@ &> ${log}
<<>>

This causes the perf record to start in background and
Results in the data file not being created by the time
"check" function is invoked. Below log shows perf record
result getting displayed after the call to "check" function.

<<>>
running: perf record /tmp/perf.ex.SHA1.EAU
build id: 4abd406f041feb4f10ecde3fc30fd0639e1a91cb
link: /tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb
failed: link 
/tmp/perf.debug.mLT/.build-id/4a/bd406f041feb4f10ecde3fc30fd0639e1a91cb does 
not exist
test child finished with -1
build id cache operations: FAILED!
root@machine:~/athira/linux/tools/perf# Couldn't synthesize bpf events.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.010 MB /tmp/perf.data.bFF ]
<<>>

Fix this by redirecting output instead of using “&” which
starts the command in background.

Signed-off-by: Athira Rajeev
Acked-by: Ian Rogers
---
Changelog:
 From v2 -> v3
- Addressed review comments from David Laight
   for using shell substitutions.

 From v1 -> v2
- Added Acked-by from Ian.
- Rebased to tmp.perf/urgent of:
  git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git

  tools/perf/tests/shell/buildid.sh | 18 +++---
  1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/tools/perf/tests/shell/buildid.sh 
b/tools/perf/tests/shell/buildid.sh
index aaf851108ca3..0ce22ea0a7f1 100755
--- a/tools/perf/tests/shell/buildid.sh
+++ b/tools/perf/tests/shell/buildid.sh
@@ -66,7 +66,9 @@ check()
esac
echo "build id: ${id}"
  
-	link=${build_id_dir}/.build-id/${id:0:2}/${id:2}

+   id_file=${id#??}
+   id_dir=${id%$id_file}
+   link=$build_id_dir/.build-id/$id_dir/$id_file
  	echo "link: ${link}" if [ ! -h $link ]; then @@ -74,7 +76,7 @@ check() exit 1 fi - 
file=${build_id_dir}/.build-id/${id:0:2}/`readlink ${link}`/elf + 
file=${build_id_dir}/.build-id/$id_dir/`readlink ${link}`/elf echo "file: ${file}"
  
  	# Check for file permission of original file

@@ -130,20 +132,22 @@ test_record()
  {
data=$(mktemp /tmp/perf.data.XXX)
build_id_dir=$(mktemp -d /tmp/perf.debug.XXX)
-   log=$(mktemp /tmp/perf.log.XXX)
+   log_out=$(mktemp /tmp/perf.log.out.XXX)
+   log_err=$(mktemp /tmp/perf.log.err.XXX)
perf="perf --buildid-dir ${build_id_dir}"
  
  	echo "running: perf record $@"

-   ${perf} record --buildid-all -o ${data} $@ &> ${log}
+   ${perf} record --buildid-all -o ${data} $@ 1>${log_out} 2>${log_err}
if [ 

[PATCH v4 07/24] powerpc/secvar: Handle max object size in the consumer

2023-01-20 Thread Andrew Donnellan
From: Russell Currey 

Currently the max object size is handled in the core secvar code with an
entirely OPAL-specific implementation, so create a new max_size() op and
move the existing implementation into the powernv platform.  Should be
no functional change.

Signed-off-by: Russell Currey 
Signed-off-by: Andrew Donnellan 

---

v3: Change uint64_t type to u64 (mpe)

v4: Return immediately if node is NULL (gjoyce)
---
 arch/powerpc/include/asm/secvar.h|  1 +
 arch/powerpc/kernel/secvar-sysfs.c   | 17 +++
 arch/powerpc/platforms/powernv/opal-secvar.c | 22 
 3 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/include/asm/secvar.h 
b/arch/powerpc/include/asm/secvar.h
index 2d9816dff128..b97ab793cc8a 100644
--- a/arch/powerpc/include/asm/secvar.h
+++ b/arch/powerpc/include/asm/secvar.h
@@ -18,6 +18,7 @@ struct secvar_operations {
int (*get_next)(const char *key, u64 *key_len, u64 keybufsize);
int (*set)(const char *key, u64 key_len, u8 *data, u64 data_size);
ssize_t (*format)(char *buf, size_t bufsize);
+   int (*max_size)(u64 *max_size);
 };
 
 #ifdef CONFIG_PPC_SECURE_BOOT
diff --git a/arch/powerpc/kernel/secvar-sysfs.c 
b/arch/powerpc/kernel/secvar-sysfs.c
index 4beec935f5e7..d2f65a67845c 100644
--- a/arch/powerpc/kernel/secvar-sysfs.c
+++ b/arch/powerpc/kernel/secvar-sysfs.c
@@ -132,27 +132,16 @@ static struct kobj_type secvar_ktype = {
 static int update_kobj_size(void)
 {
 
-   struct device_node *node;
u64 varsize;
-   int rc = 0;
+   int rc = secvar_ops->max_size();
 
-   node = of_find_compatible_node(NULL, NULL, "ibm,secvar-backend");
-   if (!of_device_is_available(node)) {
-   rc = -ENODEV;
-   goto out;
-   }
-
-   rc = of_property_read_u64(node, "max-var-size", );
if (rc)
-   goto out;
+   return rc;
 
data_attr.size = varsize;
update_attr.size = varsize;
 
-out:
-   of_node_put(node);
-
-   return rc;
+   return 0;
 }
 
 static int secvar_sysfs_load(void)
diff --git a/arch/powerpc/platforms/powernv/opal-secvar.c 
b/arch/powerpc/platforms/powernv/opal-secvar.c
index e33bb703ecbc..a8436bf35e2f 100644
--- a/arch/powerpc/platforms/powernv/opal-secvar.c
+++ b/arch/powerpc/platforms/powernv/opal-secvar.c
@@ -122,11 +122,33 @@ static ssize_t opal_secvar_format(char *buf, size_t 
bufsize)
return rc;
 }
 
+static int opal_secvar_max_size(u64 *max_size)
+{
+   int rc;
+   struct device_node *node;
+
+   node = of_find_compatible_node(NULL, NULL, "ibm,secvar-backend");
+   if (!node)
+   return -ENODEV;
+
+   if (!of_device_is_available(node)) {
+   rc = -ENODEV;
+   goto out;
+   }
+
+   rc = of_property_read_u64(node, "max-var-size", max_size);
+
+out:
+   of_node_put(node);
+   return rc;
+}
+
 static const struct secvar_operations opal_secvar_ops = {
.get = opal_get_variable,
.get_next = opal_get_next_variable,
.set = opal_set_variable,
.format = opal_secvar_format,
+   .max_size = opal_secvar_max_size,
 };
 
 static int opal_secvar_probe(struct platform_device *pdev)
-- 
2.39.0



[PATCH v4 12/24] powerpc/secvar: Don't print error on ENOENT when reading variables

2023-01-20 Thread Andrew Donnellan
If attempting to read the size or data attributes of a  non-existent
variable (which will be possible after a later patch to expose the PLPKS
via the secvar interface), don't spam the kernel log with error messages.
Only print errors for return codes that aren't ENOENT.

Reported-by: Sudhakar Kuppusamy 
Signed-off-by: Andrew Donnellan 

---

v3: New patch
---
 arch/powerpc/kernel/secvar-sysfs.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/secvar-sysfs.c 
b/arch/powerpc/kernel/secvar-sysfs.c
index 6dd9b4f6f87c..33d1797851ba 100644
--- a/arch/powerpc/kernel/secvar-sysfs.c
+++ b/arch/powerpc/kernel/secvar-sysfs.c
@@ -43,8 +43,8 @@ static ssize_t size_show(struct kobject *kobj, struct 
kobj_attribute *attr,
 
rc = secvar_ops->get(kobj->name, strlen(kobj->name) + 1, NULL, );
if (rc) {
-   pr_err("Error retrieving %s variable size %d\n", kobj->name,
-  rc);
+   if (rc != -ENOENT)
+   pr_err("Error retrieving %s variable size %d\n", 
kobj->name, rc);
return rc;
}
 
@@ -61,7 +61,8 @@ static ssize_t data_read(struct file *filep, struct kobject 
*kobj,
 
rc = secvar_ops->get(kobj->name, strlen(kobj->name) + 1, NULL, );
if (rc) {
-   pr_err("Error getting %s variable size %d\n", kobj->name, rc);
+   if (rc != -ENOENT)
+   pr_err("Error getting %s variable size %d\n", 
kobj->name, rc);
return rc;
}
pr_debug("dsize is %llu\n", dsize);
-- 
2.39.0



[PATCH v4 18/24] powerpc/pseries: Make caller pass buffer to plpks_read_var()

2023-01-20 Thread Andrew Donnellan
Currently, plpks_read_var() allocates a buffer to pass to the
H_PKS_READ_OBJECT hcall, then allocates another buffer, of the caller's
preferred size if necessary, into which the data is copied, and returns
that buffer to the caller.

This is a bit over the top - while we probably still want to allocate a
separate buffer to pass to the hypervisor in the hcall, we can let the
caller allocate the final buffer and specify the size.

Don't allocate var->data in plpks_read_var(), instead expect the caller to
allocate it. If the caller needs to discover the size, it can set
var->data to NULL and var->datalen will be populated. Update header file
to document this.

Suggested-by: Michael Ellerman 
Signed-off-by: Andrew Donnellan 
Signed-off-by: Russell Currey 

---

v3: New patch (mpe)
---
 arch/powerpc/include/asm/plpks.h   | 12 
 arch/powerpc/platforms/pseries/plpks.c | 11 ---
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/plpks.h b/arch/powerpc/include/asm/plpks.h
index e7204e6c0ca4..0c49969b0864 100644
--- a/arch/powerpc/include/asm/plpks.h
+++ b/arch/powerpc/include/asm/plpks.h
@@ -88,16 +88,28 @@ int plpks_remove_var(char *component, u8 varos,
 
 /**
  * Returns the data for the specified os variable.
+ *
+ * Caller must allocate a buffer in var->data with length in var->datalen.
+ * If no buffer is provided, var->datalen will be populated with the object's
+ * size.
  */
 int plpks_read_os_var(struct plpks_var *var);
 
 /**
  * Returns the data for the specified firmware variable.
+ *
+ * Caller must allocate a buffer in var->data with length in var->datalen.
+ * If no buffer is provided, var->datalen will be populated with the object's
+ * size.
  */
 int plpks_read_fw_var(struct plpks_var *var);
 
 /**
  * Returns the data for the specified bootloader variable.
+ *
+ * Caller must allocate a buffer in var->data with length in var->datalen.
+ * If no buffer is provided, var->datalen will be populated with the object's
+ * size.
  */
 int plpks_read_bootloader_var(struct plpks_var *var);
 
diff --git a/arch/powerpc/platforms/pseries/plpks.c 
b/arch/powerpc/platforms/pseries/plpks.c
index 96a026a37285..5d9c6a3b2014 100644
--- a/arch/powerpc/platforms/pseries/plpks.c
+++ b/arch/powerpc/platforms/pseries/plpks.c
@@ -578,17 +578,14 @@ static int plpks_read_var(u8 consumer, struct plpks_var 
*var)
goto out_free_output;
}
 
-   if (var->datalen == 0 || var->datalen > retbuf[0])
+   if (!var->data || var->datalen > retbuf[0])
var->datalen = retbuf[0];
 
-   var->data = kzalloc(var->datalen, GFP_KERNEL);
-   if (!var->data) {
-   rc = -ENOMEM;
-   goto out_free_output;
-   }
var->policy = retbuf[1];
 
-   memcpy(var->data, output, var->datalen);
+   if (var->data)
+   memcpy(var->data, output, var->datalen);
+
rc = 0;
 
 out_free_output:
-- 
2.39.0



[PATCH v4 23/24] integrity/powerpc: Improve error handling & reporting when loading certs

2023-01-20 Thread Andrew Donnellan
From: Russell Currey 

A few improvements to load_powerpc.c:

 - include integrity.h for the pr_fmt()
 - move all error reporting out of get_cert_list()
 - use ERR_PTR() to better preserve error detail
 - don't use pr_err() for missing keys

Signed-off-by: Russell Currey 
Signed-off-by: Andrew Donnellan 

---

v3: New patch
---
 .../integrity/platform_certs/load_powerpc.c   | 26 ++-
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/security/integrity/platform_certs/load_powerpc.c 
b/security/integrity/platform_certs/load_powerpc.c
index 1e4f80a4e71c..dee51606d5f4 100644
--- a/security/integrity/platform_certs/load_powerpc.c
+++ b/security/integrity/platform_certs/load_powerpc.c
@@ -14,9 +14,15 @@
 #include 
 #include 
 #include "keyring_handler.h"
+#include "../integrity.h"
 
 /*
  * Get a certificate list blob from the named secure variable.
+ *
+ * Returns:
+ *  - a pointer to a kmalloc'd buffer containing the cert list on success
+ *  - NULL if the key does not exist
+ *  - an ERR_PTR on error
  */
 static __init void *get_cert_list(u8 *key, unsigned long keylen, u64 *size)
 {
@@ -25,19 +31,19 @@ static __init void *get_cert_list(u8 *key, unsigned long 
keylen, u64 *size)
 
rc = secvar_ops->get(key, keylen, NULL, size);
if (rc) {
-   pr_err("Couldn't get size: %d\n", rc);
-   return NULL;
+   if (rc == -ENOENT)
+   return NULL;
+   return ERR_PTR(rc);
}
 
db = kmalloc(*size, GFP_KERNEL);
if (!db)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
 
rc = secvar_ops->get(key, keylen, db, size);
if (rc) {
kfree(db);
-   pr_err("Error reading %s var: %d\n", key, rc);
-   return NULL;
+   return ERR_PTR(rc);
}
 
return db;
@@ -69,7 +75,11 @@ static int __init load_powerpc_certs(void)
 */
db = get_cert_list("db", 3, );
if (!db) {
-   pr_err("Couldn't get db list from firmware\n");
+   pr_info("Couldn't get db list from firmware\n");
+   } else if (IS_ERR(db)) {
+   rc = PTR_ERR(db);
+   pr_err("Error reading db from firmware: %d\n", rc);
+   return rc;
} else {
rc = parse_efi_signature_list("powerpc:db", db, dbsize,
  get_handler_for_db);
@@ -81,6 +91,10 @@ static int __init load_powerpc_certs(void)
dbx = get_cert_list("dbx", 4,  );
if (!dbx) {
pr_info("Couldn't get dbx list from firmware\n");
+   } else if (IS_ERR(dbx)) {
+   rc = PTR_ERR(dbx);
+   pr_err("Error reading dbx from firmware: %d\n", rc);
+   return rc;
} else {
rc = parse_efi_signature_list("powerpc:dbx", dbx, dbxsize,
  get_handler_for_dbx);
-- 
2.39.0



[PATCH v4 19/24] powerpc/pseries: Turn PSERIES_PLPKS into a hidden option

2023-01-20 Thread Andrew Donnellan
It seems a bit unnecessary for the PLPKS code to have a user-visible
config option when it doesn't do anything on its own, and there's existing
options for enabling Secure Boot-related features.

It should be enabled by PPC_SECURE_BOOT, which will eventually be what
uses PLPKS to populate keyrings.

However, we can't get of the separate option completely, because it will
also be used for SED Opal purposes.

Change PSERIES_PLPKS into a hidden option, which is selected by
PPC_SECURE_BOOT.

Signed-off-by: Andrew Donnellan 
Signed-off-by: Russell Currey 

---

v3: New patch
---
 arch/powerpc/Kconfig   |  1 +
 arch/powerpc/platforms/pseries/Kconfig | 11 +--
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b8c4ac56bddc..d4ed46101bec 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1029,6 +1029,7 @@ config PPC_SECURE_BOOT
depends on PPC_POWERNV || PPC_PSERIES
depends on IMA_ARCH_POLICY
imply IMA_SECURE_AND_OR_TRUSTED_BOOT
+   select PSERIES_PLPKS if PPC_PSERIES
help
  Systems with firmware secure boot enabled need to define security
  policies to extend secure boot to the OS. This config allows a user
diff --git a/arch/powerpc/platforms/pseries/Kconfig 
b/arch/powerpc/platforms/pseries/Kconfig
index a3b4d99567cb..82b6f993be0f 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -151,16 +151,7 @@ config IBMEBUS
 
 config PSERIES_PLPKS
depends on PPC_PSERIES
-   bool "Support for the Platform Key Storage"
-   help
- PowerVM provides an isolated Platform Keystore(PKS) storage
- allocation for each LPAR with individually managed access
- controls to store sensitive information securely. It can be
- used to store asymmetric public keys or secrets as required
- by different usecases. Select this config to enable
- operating system interface to hypervisor to access this space.
-
- If unsure, select N.
+   bool
 
 config PAPR_SCM
depends on PPC_PSERIES && MEMORY_HOTPLUG && LIBNVDIMM
-- 
2.39.0



[PATCH v4 21/24] powerpc/pseries: Pass PLPKS password on kexec

2023-01-20 Thread Andrew Donnellan
From: Russell Currey 

Before interacting with the PLPKS, we ask the hypervisor to generate a
password for the current boot, which is then required for most further
PLPKS operations.

If we kexec into a new kernel, the new kernel will try and fail to
generate a new password, as the password has already been set.

Pass the password through to the new kernel via the device tree, in
/chosen/plpks-pw. Check for the presence of this property before trying
to generate a new password - if it exists, use the existing password and
remove it from the device tree.

Signed-off-by: Russell Currey 
Signed-off-by: Andrew Donnellan 

---

v3: New patch

v4: Fix compile when CONFIG_PSERIES_PLPKS=n (snowpatch)

Fix error handling on fdt_path_offset() call (ruscur)
---
 arch/powerpc/kexec/file_load_64.c  | 18 ++
 arch/powerpc/platforms/pseries/plpks.c | 18 +-
 2 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index af8854f9eae3..0c9130af60cc 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct umem_info {
u64 *buf;   /* data buffer for usable-memory property */
@@ -1156,6 +1157,9 @@ int setup_new_fdt_ppc64(const struct kimage *image, void 
*fdt,
 {
struct crash_mem *umem = NULL, *rmem = NULL;
int i, nr_ranges, ret;
+#ifdef CONFIG_PSERIES_PLPKS
+   int chosen_offset;
+#endif
 
/*
 * Restrict memory usage for kdump kernel by setting up
@@ -1230,6 +1234,20 @@ int setup_new_fdt_ppc64(const struct kimage *image, void 
*fdt,
}
}
 
+#ifdef CONFIG_PSERIES_PLPKS
+   // If we have PLPKS active, we need to provide the password
+   if (plpks_is_available()) {
+   chosen_offset = fdt_path_offset(fdt, "/chosen");
+   if (chosen_offset < 0) {
+   pr_err("Can't find chosen node: %s\n",
+  fdt_strerror(chosen_offset));
+   goto out;
+   }
+   ret = fdt_setprop(fdt, chosen_offset, "ibm,plpks-pw",
+ plpks_get_password(), 
plpks_get_passwordlen());
+   }
+#endif // CONFIG_PSERIES_PLPKS
+
 out:
kfree(rmem);
kfree(umem);
diff --git a/arch/powerpc/platforms/pseries/plpks.c 
b/arch/powerpc/platforms/pseries/plpks.c
index b3c7410a4f13..0350f10e1755 100644
--- a/arch/powerpc/platforms/pseries/plpks.c
+++ b/arch/powerpc/platforms/pseries/plpks.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -126,7 +127,22 @@ static int plpks_gen_password(void)
 {
unsigned long retbuf[PLPAR_HCALL_BUFSIZE] = { 0 };
u8 *password, consumer = PLPKS_OS_OWNER;
-   int rc;
+   struct property *prop;
+   int rc, len;
+
+   // Before we generate the password, we may have been booted by kexec and
+   // provided with a previous password.  Check for that first.
+   prop = of_find_property(of_chosen, "ibm,plpks-pw", );
+   if (prop) {
+   ospasswordlength = (u16)len;
+   ospassword = kzalloc(ospasswordlength, GFP_KERNEL);
+   if (!ospassword) {
+   of_remove_property(of_chosen, prop);
+   return -ENOMEM;
+   }
+   memcpy(ospassword, prop->value, len);
+   return of_remove_property(of_chosen, prop);
+   }
 
// The password must not cross a page boundary, so we align to the next 
power of 2
password = kzalloc(roundup_pow_of_two(maxpwsize), GFP_KERNEL);
-- 
2.39.0



[PATCH v4 22/24] powerpc/pseries: Implement secvars for dynamic secure boot

2023-01-20 Thread Andrew Donnellan
From: Russell Currey 

The pseries platform can support dynamic secure boot (i.e. secure boot
using user-defined keys) using variables contained with the PowerVM LPAR
Platform KeyStore (PLPKS).  Using the powerpc secvar API, expose the
relevant variables for pseries dynamic secure boot through the existing
secvar filesystem layout.

The relevant variables for dynamic secure boot are signed in the
keystore, and can only be modified using the H_PKS_SIGNED_UPDATE hcall.
Object labels in the keystore are encoded using ucs2 format.  With our
fixed variable names we don't have to care about encoding outside of the
necessary byte padding.

When a user writes to a variable, the first 8 bytes of data must contain
the signed update flags as defined by the hypervisor.

When a user reads a variable, the first 4 bytes of data contain the
policies defined for the object.

Limitations exist due to the underlying implementation of sysfs binary
attributes, as is the case for the OPAL secvar implementation -
partial writes are unsupported and writes cannot be larger than PAGE_SIZE.
(Even when using bin_attributes, which can be larger than a single page,
sysfs only gives us one page's worth of write buffer at a time, and the
hypervisor does not expose an interface for partial writes.)

Co-developed-by: Nayna Jain 
Signed-off-by: Nayna Jain 
Co-developed-by: Andrew Donnellan 
Signed-off-by: Andrew Donnellan 
Signed-off-by: Russell Currey 

---

v2: Remove unnecessary config vars from sysfs and document the others,
thanks to review from Greg.  If we end up needing to expose more, we
can add them later and update the docs.

Use sysfs_emit() instead of sprintf(), thanks to Greg.

Change the size of the sysfs binary attributes to include the 8-byte
flags header, preventing truncation of large writes.

v3: plpks_set_variable(): pass var to plpks_signed_update_var() as a
pointer (mpe)

Update copyright date (ajd)

Consistent comment style (ajd)

Change device_initcall() to machine_arch_initcall(pseries...) so we
don't try to load on powernv and kill the machine (mpe)

Add config attributes into plpks_secvar_ops (mpe)

Get rid of PLPKS_SECVAR_COUNT macro (mpe)

Reworded descriptions in ABI documentation (mpe)

Switch to using secvar_ops->var_names rather than
secvar_ops->get_next() (ajd/mpe)

Optimise allocation/copying of buffers (mpe)

Elaborate the comment documenting the "format" string (mpe)

Return -EIO on errors in the read case (mpe)

Add "grubdbx" variable (Sudhakar Kuppusamy)

Use utf8s_to_utf16s() rather than our own "UCS-2" conversion code (mpe)

Change uint64_t to u64 (mpe)

Fix SB_VERSION data length (ruscur)

Stop prepending policy data on read (ruscur)

Enforce max format length on format string (not strictly needed, but
makes the length limit clear) (ajd)

Update include of plpks.h to reflect new path (ruscur)

Consistent constant naming scheme (ruscur)

v4: Return set_secvar_ops() return code

Pass buffer size to plpks_secvar_format() (stefanb, npiggin)

Add missing null check (stefanb)

Add comment to commit message explaining PAGE_SIZE write limit (joel)
---
 Documentation/ABI/testing/sysfs-secvar|  75 +-
 arch/powerpc/platforms/pseries/Makefile   |   4 +-
 arch/powerpc/platforms/pseries/plpks-secvar.c | 215 ++
 3 files changed, 291 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/platforms/pseries/plpks-secvar.c

diff --git a/Documentation/ABI/testing/sysfs-secvar 
b/Documentation/ABI/testing/sysfs-secvar
index feebb8c57294..a19f4d5fcec6 100644
--- a/Documentation/ABI/testing/sysfs-secvar
+++ b/Documentation/ABI/testing/sysfs-secvar
@@ -18,6 +18,14 @@ Description: A string indicating which backend is in use by 
the firmware.
This determines the format of the variable and the accepted
format of variable updates.
 
+   On powernv/OPAL, this value is provided by the OPAL firmware
+   and is expected to be "ibm,edk2-compat-v1".
+
+   On pseries/PLPKS, this is generated by the kernel based on the
+   version number in the SB_VERSION variable in the keystore, and
+   has the form "ibm,plpks-sb-v", or
+   "ibm,plpks-sb-unknown" if there is no SB_VERSION variable.
+
 What:  /sys/firmware/secvar/vars/
 Date:  August 2019
 Contact:   Nayna Jain 
@@ -34,7 +42,7 @@ Description:  An integer representation of the size of the 
content of the
 
 What:  /sys/firmware/secvar/vars//data
 Date:  August 2019
-Contact:   Nayna Jain h
+Contact:   Nayna Jain 
 Description:   A read-only file containing the value of the variable. The size
of the file represents the maximum size of the variable data.
 
@@ -44,3 +52,68 @@ Contact: Nayna Jain 
 Description:   A write-only file that is used to submit the new