Re: [PATCH 289/606] macintosh: ams/ams-i2c: Convert to i2c's .probe_new()

2022-11-18 Thread Christophe Leroy


On 18/11/2022 at 23:40, Uwe Kleine-König wrote:
> From: Uwe Kleine-König 
> 
> The probe function doesn't make use of the i2c_device_id * parameter so it
> can be trivially converted.
> 
> Signed-off-by: Uwe Kleine-König 

The patch itself and the others seem OK, but can you group all
macintosh changes into a single patch instead of the 9 patches you sent?

See the documentation on submitting patches,
https://docs.kernel.org/process/submitting-patches.html, especially
the item "NO No more huge patch bombs to linux-ker...@vger.kernel.org
people!" and its associated reference
https://lore.kernel.org/all/20050711.125305.08322243.da...@davemloft.net/ :

If you feel the need to send, say, more than 15 patches at once, 
reconsider.

Thanks
Christophe

> ---
>   drivers/macintosh/ams/ams-i2c.c | 8 +++-
>   1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/macintosh/ams/ams-i2c.c b/drivers/macintosh/ams/ams-i2c.c
> index 3ded340699fb..a4a1035eb412 100644
> --- a/drivers/macintosh/ams/ams-i2c.c
> +++ b/drivers/macintosh/ams/ams-i2c.c
> @@ -56,8 +56,7 @@ enum ams_i2c_cmd {
>   AMS_CMD_START,
>   };
>   
> -static int ams_i2c_probe(struct i2c_client *client,
> -  const struct i2c_device_id *id);
> +static int ams_i2c_probe(struct i2c_client *client);
>   static void ams_i2c_remove(struct i2c_client *client);
>   
>   static const struct i2c_device_id ams_id[] = {
> @@ -70,7 +69,7 @@ static struct i2c_driver ams_i2c_driver = {
>   .driver = {
>   .name   = "ams",
>   },
> - .probe  = ams_i2c_probe,
> + .probe_new  = ams_i2c_probe,
>   .remove = ams_i2c_remove,
>   .id_table   = ams_id,
>   };
> @@ -155,8 +154,7 @@ static void ams_i2c_get_xyz(s8 *x, s8 *y, s8 *z)
>   *z = ams_i2c_read(AMS_DATAZ);
>   }
>   
> -static int ams_i2c_probe(struct i2c_client *client,
> -  const struct i2c_device_id *id)
> +static int ams_i2c_probe(struct i2c_client *client)
>   {
>   int vmaj, vmin;
>   int result;


Re: [PATCH 000/606] i2c: Complete conversion to i2c_probe_new

2022-11-18 Thread patchwork-bot+chrome-platform
Hello:

This patch was applied to chrome-platform/linux.git (for-kernelci)
by Tzung-Bi Shih :

On Fri, 18 Nov 2022 23:35:34 +0100 you wrote:
> Hello,
> 
> since commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new()
> call-back type") from 2016 there is a "temporary" alternative probe
> callback for i2c drivers.
> 
> This series completes all drivers to this new callback (unless I missed
> something). It's based on current next/master.
> A part of the patches depend on commit 662233731d66 ("i2c: core:
> Introduce i2c_client_get_device_id helper function"), there is a branch that
> you can pull into your tree to get it:
> 
> [...]

Here is the summary with links:
  - [512/606] platform/chrome: cros_ec: Convert to i2c's .probe_new()
https://git.kernel.org/chrome-platform/c/f9e510dc92df

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH 000/606] i2c: Complete conversion to i2c_probe_new

2022-11-18 Thread patchwork-bot+chrome-platform
Hello:

This patch was applied to chrome-platform/linux.git (for-next)
by Tzung-Bi Shih :

On Fri, 18 Nov 2022 23:35:34 +0100 you wrote:
> Hello,
> 
> since commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new()
> call-back type") from 2016 there is a "temporary" alternative probe
> callback for i2c drivers.
> 
> This series completes all drivers to this new callback (unless I missed
> something). It's based on current next/master.
> A part of the patches depend on commit 662233731d66 ("i2c: core:
> Introduce i2c_client_get_device_id helper function"), there is a branch that
> you can pull into your tree to get it:
> 
> [...]

Here is the summary with links:
  - [512/606] platform/chrome: cros_ec: Convert to i2c's .probe_new()
https://git.kernel.org/chrome-platform/c/f9e510dc92df

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




[PATCH] powerpc: Fix potential memory leak in icp_native_map_one_cpu()

2022-11-18 Thread Xiu Jianfeng
icp_native_map_one_cpu() allocates memory with kasprintf() and saves the
pointer in @rname, but its error paths return without freeing it, which
leaks the memory. Free @rname before returning an error.

Fixes: 0b05ac6e2480 ("powerpc/xics: Rewrite XICS driver")
Signed-off-by: Xiu Jianfeng 
---
 arch/powerpc/sysdev/xics/icp-native.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/sysdev/xics/icp-native.c b/arch/powerpc/sysdev/xics/icp-native.c
index edc17b6b1cc2..126230206350 100644
--- a/arch/powerpc/sysdev/xics/icp-native.c
+++ b/arch/powerpc/sysdev/xics/icp-native.c
@@ -239,6 +239,7 @@ static int __init icp_native_map_one_cpu(int hw_id, unsigned long addr,
if (!request_mem_region(addr, size, rname)) {
pr_warn("icp_native: Could not reserve ICP MMIO for CPU %d, interrupt server #0x%x\n",
cpu, hw_id);
+   kfree(rname);
return -EBUSY;
}
 
@@ -248,6 +249,7 @@ static int __init icp_native_map_one_cpu(int hw_id, unsigned long addr,
pr_warn("icp_native: Failed ioremap for CPU %d, interrupt server #0x%x, addr %#lx\n",
cpu, hw_id, addr);
release_mem_region(addr, size);
+   kfree(rname);
return -ENOMEM;
}
return 0;
-- 
2.17.1
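
The pattern this patch applies (free a heap-allocated name on every error path, keep it only on success) can be sketched in userspace C. This is a minimal illustration, not the kernel code: alloc_name(), free_name(), live_names and the reserve_ok/map_ok flags are hypothetical stand-ins for kasprintf(), kfree() and the outcome of request_mem_region() and ioremap().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch-only bookkeeping: number of name buffers not yet freed. */
static int live_names;

/* Hypothetical stand-in for kasprintf(): allocate and format a name. */
static char *alloc_name(int cpu)
{
	char *name = malloc(32);

	if (name) {
		snprintf(name, 32, "CPU %d [sketch]", cpu);
		live_names++;
	}
	return name;
}

/* Hypothetical stand-in for kfree(). */
static void free_name(char *name)
{
	if (name) {
		live_names--;
		free(name);
	}
}

/*
 * reserve_ok and map_ok simulate whether reserving and mapping the
 * region succeed. On success the name is handed to the caller through
 * *out_name; on every error path it is freed before returning.
 */
static int map_one_cpu(int cpu, int reserve_ok, int map_ok, char **out_name)
{
	char *rname;

	assert(out_name != NULL);
	*out_name = NULL;

	rname = alloc_name(cpu);
	if (!rname)
		return -ENOMEM;

	if (!reserve_ok) {
		free_name(rname);	/* mirrors the first added kfree() */
		return -EBUSY;
	}

	if (!map_ok) {
		/* the real code also releases the region at this point */
		free_name(rname);	/* mirrors the second added kfree() */
		return -ENOMEM;
	}

	*out_name = rname;
	return 0;
}
```

With this shape, a caller can check that live_names returns to zero after a failed call, which is the property the two added kfree() calls are meant to restore.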



Re: build failure in linux-next: offb missing fb helpers

2022-11-18 Thread Randy Dunlap
Hi--

[adding Masahiro-san]


On 11/18/22 07:03, Michal Suchánek wrote:
> Hello,
> 
> I am seeing these errors:
> 
> [ 3825s]   AR  built-in.a
> [ 3827s]   AR  vmlinux.a
> [ 3835s]   LD  vmlinux.o
> [ 3835s]   OBJCOPY modules.builtin.modinfo
> [ 3835s]   GEN modules.builtin
> [ 3835s]   GEN .vmlinux.objs
> [ 3848s]   MODPOST Module.symvers
> [ 3848s]   CC  .vmlinux.export.o
> [ 3849s]   UPD include/generated/utsversion.h
> [ 3849s]   CC  init/version-timestamp.o
> [ 3849s]   LD  .tmp_vmlinux.btf
> [ 3864s] ld: drivers/video/fbdev/offb.o:(.data.rel.ro+0x58): undefined
> reference to `cfb_fillrect'
> [ 3864s] ld: drivers/video/fbdev/offb.o:(.data.rel.ro+0x60): undefined
> reference to `cfb_copyarea'
> [ 3864s] ld: drivers/video/fbdev/offb.o:(.data.rel.ro+0x68): undefined
> reference to `cfb_imageblit'
> 
> cfb_fillrect is provided by drivers/video/fbdev/core/cfbfillrect.c
> 
> It is compiled when CONFIG_FB_CFB_FILLRECT
> drivers/video/fbdev/core/Makefile:obj-$(CONFIG_FB_CFB_FILLRECT)  += 
> cfbfillrect.o
> 
> drivers/video/fbdev/Makefile:obj-$(CONFIG_FB_OF)   += offb.o
> is compiled when CONFIG_FB_OF
> 
> It selects CONFIG_FB_CFB_FILLRECT
> config FB_OF
> bool "Open Firmware frame buffer device support"
> depends on (FB = y) && PPC && (!PPC_PSERIES || PCI)
> select APERTURE_HELPERS
> select FB_CFB_FILLRECT
> select FB_CFB_COPYAREA
> select FB_CFB_IMAGEBLIT
> select FB_MACMODES
> 
> The config has FB_OF built-in and FB_CFB_FILLRECT modular
> config/ppc64le/vanilla:CONFIG_FB_CFB_FILLRECT=m
> config/ppc64le/vanilla:CONFIG_FB_CFB_COPYAREA=m
> config/ppc64le/vanilla:CONFIG_FB_CFB_IMAGEBLIT=m
> config/ppc64le/vanilla:CONFIG_FB_OF=y
> 
> It only depends on FB which must be built-in for FB_OF
> config FB_CFB_FILLRECT
> tristate
> depends on FB
> 
> Is select in kconfig broken?
> 
> Attaching the config in question.

The symbol info from xconfig says:

Symbol: FB_CFB_FILLRECT [=m]
Type : tristate
Defined at drivers/video/fbdev/Kconfig:69
Depends on: HAS_IOMEM [=y] && FB [=y]
Selected by [m]:
[deleted]
- FB_OF [=y] && HAS_IOMEM [=y] && FB [=y]=y && PPC [=y] && (!PPC_PSERIES [=y] || PCI [=y]) && !DRM_OFDRM [=m]

I don't see why the 'select' from (bool) FB_OF would leave FB_CFB_FILLRECT
(and the others) as =m instead of =y.

Hopefully Masahiro can shed some light on this.

-- 
~Randy
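
One possible reading of the "Selected by [m]" line, offered as an illustration rather than a confirmed diagnosis: the Kconfig language documentation defines tristate arithmetic as n=0, m=1, y=2, with '!' yielding 2 minus the value, '&&' the minimum and '||' the maximum. Evaluating the quoted condition under those rules, using the symbol values shown in the xconfig output, gives m because of the trailing !DRM_OFDRM term. The struct and function names below are hypothetical.

```c
#include <assert.h>

/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* '!' is defined as 2 minus the operand, so !m is still m. */
static enum tristate tri_not(enum tristate a)
{
	assert(a == TRI_N || a == TRI_M || a == TRI_Y);
	return (enum tristate)(2 - (int)a);
}

/* '&&' is the minimum of both operands. */
static enum tristate tri_and(enum tristate a, enum tristate b)
{
	return a < b ? a : b;
}

/* '||' is the maximum of both operands. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* "SYM=y" style comparison: y when equal, n otherwise. */
static enum tristate tri_eq(enum tristate a, enum tristate b)
{
	return a == b ? TRI_Y : TRI_N;
}

/* Hypothetical container for the symbols named in the xconfig output. */
struct sketch_config {
	enum tristate fb_of, has_iomem, fb, ppc, ppc_pseries, pci, drm_ofdrm;
};

/*
 * FB_OF && HAS_IOMEM && FB=y && PPC && (!PPC_PSERIES || PCI) && !DRM_OFDRM
 */
static enum tristate fb_of_select_value(const struct sketch_config *c)
{
	enum tristate v = c->fb_of;

	v = tri_and(v, c->has_iomem);
	v = tri_and(v, tri_eq(c->fb, TRI_Y));
	v = tri_and(v, c->ppc);
	v = tri_and(v, tri_or(tri_not(c->ppc_pseries), c->pci));
	v = tri_and(v, tri_not(c->drm_ofdrm));
	return v;
}
```

Under these rules the expression only reaches y when DRM_OFDRM is n; with DRM_OFDRM=m it stays at m, which would match the value xconfig reports for the select.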


Re: [PATCH 2/4] fs: define a firmware security filesystem named fwsecurityfs

2022-11-18 Thread Nayna



On 11/17/22 16:27, Greg Kroah-Hartman wrote:

On Mon, Nov 14, 2022 at 06:03:43PM -0500, Nayna wrote:

On 11/10/22 04:58, Greg Kroah-Hartman wrote:

On Wed, Nov 09, 2022 at 03:10:37PM -0500, Nayna wrote:

On 11/9/22 08:46, Greg Kroah-Hartman wrote:

On Sun, Nov 06, 2022 at 04:07:42PM -0500, Nayna Jain wrote:

securityfs is meant for Linux security subsystems to expose policies/logs
or any other information. However, there are various firmware security
features which expose their variables for user management via the kernel.
There is currently no single place to expose these variables. Different
platforms use sysfs/platform specific filesystem(efivarfs)/securityfs
interface as they find it appropriate. Thus, there is a gap in kernel
interfaces to expose variables for security features.

Define a firmware security filesystem (fwsecurityfs) to be used by
security features enabled by the firmware. These variables are platform
specific. This filesystem provides platforms a way to implement their
own underlying semantics by defining own inode and file operations.

Similar to securityfs, the firmware security filesystem is recommended
to be exposed on a well known mount point /sys/firmware/security.
Platforms can define their own directory or file structure under this path.

Example:

# mount -t fwsecurityfs fwsecurityfs /sys/firmware/security

Why not just use securityfs in /sys/security/firmware/ instead?  Then
you don't have to create a new filesystem and convince userspace to
mount it in a specific location?

  From man 5 sysfs page:

/sys/firmware: This subdirectory contains interfaces for viewing and
manipulating firmware-specific objects and attributes.

/sys/kernel: This subdirectory contains various files and subdirectories
that provide information about the running kernel.

The security variables which are being exposed via fwsecurityfs are managed
by firmware, stored in firmware managed space and also often consumed by
firmware for enabling various security features.

Ok, then just use the normal sysfs interface for /sys/firmware, why do
you need a whole new filesystem type?


From git commit b67dbf9d4c1987c370fd18fdc4cf9d8aaea604c2, the purpose of
securityfs (/sys/kernel/security) is to provide a common place for all kernel
LSMs. The idea of fwsecurityfs (/sys/firmware/security) is to similarly
provide a common place for all firmware security objects.

/sys/firmware already exists. The patch now defines a new /security
directory in it for firmware security features. Using /sys/kernel/security
would mean scattering firmware objects in multiple places and confusing the
purpose of /sys/kernel and /sys/firmware.

sysfs is confusing already, no problem with making it more confusing :)

Just document where you add things and all should be fine.


Even though fwsecurityfs code is based on securityfs, since the two
filesystems expose different types of objects and have different
requirements, there are distinctions:

1. fwsecurityfs lets users create files in userspace, securityfs only allows
kernel subsystems to create files.

Wait, why would a user ever create a file in this filesystem?  If you
need that, why not use configfs?  That's what that is for, right?

The purpose of fwsecurityfs is not to expose configuration items but rather
security objects used for firmware security features. I think these are more
comparable to EFI variables, which are exposed via an EFI-specific
filesystem, efivarfs, rather than configfs.


2. firmware and kernel objects may have different requirements. For example,
consideration of namespacing. As per my understanding, namespacing is
applied to kernel resources and not firmware resources. That's why it makes
sense to add support for namespacing in securityfs, but we concluded that
fwsecurityfs currently doesn't need it. Another but similar example of it
is: TPM space, which is exposed from hardware. For containers, the TPM would
be made as virtual/software TPM. Similarly for firmware space for
containers, it would have to be something virtualized/software version of
it.

I do not understand, sorry.  What do namespaces have to do with this?
sysfs can already handle namespaces just fine, why not use that?

Firmware objects are not namespaced. I mentioned it here as an example of
the difference between firmware and kernel objects. It is also in response
to the feedback from James Bottomley in RFC v2 
[https://lore.kernel.org/linuxppc-dev/41ca51e8db9907d9060cc38adb59a66dcae4c59b.ca...@hansenpartnership.com/].

I do not understand, sorry.  Do you want to use a namespace for these or
not?  The code does not seem to be using namespaces.  You can use sysfs
with, or without, a namespace so I don't understand the issue here.

With your code, there is no namespace.


You are correct. There's no namespace for these.





3. firmware objects are persistent and read at boot time by interaction with
firmware, unlike kernel objects which are not persistent.

That doesn't matter, sys

[PATCH AUTOSEL 5.10 12/18] scsi: ibmvfc: Avoid path failures during live migration

2022-11-18 Thread Sasha Levin
From: Brian King 

[ Upstream commit 62fa3ce05d5d73c5eccc40b2db493f55fecfc446 ]

Fix an issue reported when performing a live migration when multipath is
configured with a short fast fail timeout of 5 seconds and also to have
no_path_retry set to fail. In this scenario, all paths would go into the
devloss state while the ibmvfc driver went through discovery to log back
in. On a loaded system, the discovery might take longer than 5 seconds,
which was resulting in all paths being marked failed, which then resulted
in a read only filesystem.

This patch changes the migration code in ibmvfc to avoid deleting rports at
all in this scenario, so we avoid losing all paths.

Signed-off-by: Brian King 
Link: https://lore.kernel.org/r/20221026181356.148517-1-brk...@linux.vnet.ibm.com
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index f6d6539c657f..b793e342ab7c 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -635,8 +635,13 @@ static void ibmvfc_init_host(struct ibmvfc_host *vhost)
memset(vhost->async_crq.msgs, 0, PAGE_SIZE);
vhost->async_crq.cur = 0;
 
-   list_for_each_entry(tgt, &vhost->targets, queue)
-   ibmvfc_del_tgt(tgt);
+   list_for_each_entry(tgt, &vhost->targets, queue) {
+   if (vhost->client_migrated)
+   tgt->need_login = 1;
+   else
+   ibmvfc_del_tgt(tgt);
+   }
+
scsi_block_requests(vhost->host);
ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT);
vhost->job_step = ibmvfc_npiv_login;
@@ -2822,9 +2827,12 @@ static void ibmvfc_handle_crq(struct ibmvfc_crq *crq, struct ibmvfc_host *vhost)
/* We need to re-setup the interpartition connection */
dev_info(vhost->dev, "Partition migrated, Re-enabling adapter\n");
vhost->client_migrated = 1;
+
+   scsi_block_requests(vhost->host);
ibmvfc_purge_requests(vhost, DID_REQUEUE);
-   ibmvfc_link_down(vhost, IBMVFC_LINK_DOWN);
+   ibmvfc_set_host_state(vhost, IBMVFC_LINK_DOWN);
ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_REENABLE);
+   wake_up(&vhost->work_wait_q);
} else if (crq->format == IBMVFC_PARTNER_FAILED || crq->format == IBMVFC_PARTNER_DEREGISTER) {
dev_err(vhost->dev, "Host partner adapter deregistered or failed (rc=%d)\n", crq->format);
ibmvfc_purge_requests(vhost, DID_ERROR);
-- 
2.35.1
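
The first hunk of this patch changes what happens to each target when the host is re-initialised: after a migration the targets are kept and only flagged for re-login, otherwise they are deleted as before. A minimal userspace sketch of that branch follows. The mock_target and mock_host types are hypothetical stand-ins, and the deleted field merely records that the ibmvfc_del_tgt() path would have been taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a target entry. */
struct mock_target {
	int need_login;	/* re-login requested, remote port kept */
	int deleted;	/* records that the delete path was taken */
};

/* Hypothetical stand-in for the host, with an array instead of a list. */
struct mock_host {
	int client_migrated;
	struct mock_target *targets;
	size_t ntargets;
};

/* Mirrors the loop body added by the patch. */
static void mock_init_host(struct mock_host *h)
{
	size_t i;

	assert(h != NULL);
	for (i = 0; i < h->ntargets; i++) {
		if (h->client_migrated)
			h->targets[i].need_login = 1;
		else
			h->targets[i].deleted = 1;
	}
}
```

Keeping the targets in the migrated case is what lets the paths survive while the driver logs back in, as the commit message describes.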



[PATCH AUTOSEL 5.15 18/27] scsi: ibmvfc: Avoid path failures during live migration

2022-11-18 Thread Sasha Levin
From: Brian King 

[ Upstream commit 62fa3ce05d5d73c5eccc40b2db493f55fecfc446 ]

Fix an issue reported when performing a live migration when multipath is
configured with a short fast fail timeout of 5 seconds and also to have
no_path_retry set to fail. In this scenario, all paths would go into the
devloss state while the ibmvfc driver went through discovery to log back
in. On a loaded system, the discovery might take longer than 5 seconds,
which was resulting in all paths being marked failed, which then resulted
in a read only filesystem.

This patch changes the migration code in ibmvfc to avoid deleting rports at
all in this scenario, so we avoid losing all paths.

Signed-off-by: Brian King 
Link: https://lore.kernel.org/r/20221026181356.148517-1-brk...@linux.vnet.ibm.com
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index b3531065a438..45ef78f388dc 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -708,8 +708,13 @@ static void ibmvfc_init_host(struct ibmvfc_host *vhost)
memset(vhost->async_crq.msgs.async, 0, PAGE_SIZE);
vhost->async_crq.cur = 0;
 
-   list_for_each_entry(tgt, &vhost->targets, queue)
-   ibmvfc_del_tgt(tgt);
+   list_for_each_entry(tgt, &vhost->targets, queue) {
+   if (vhost->client_migrated)
+   tgt->need_login = 1;
+   else
+   ibmvfc_del_tgt(tgt);
+   }
+
scsi_block_requests(vhost->host);
ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT);
vhost->job_step = ibmvfc_npiv_login;
@@ -3235,9 +3240,12 @@ static void ibmvfc_handle_crq(struct ibmvfc_crq *crq, struct ibmvfc_host *vhost,
/* We need to re-setup the interpartition connection */
dev_info(vhost->dev, "Partition migrated, Re-enabling adapter\n");
vhost->client_migrated = 1;
+
+   scsi_block_requests(vhost->host);
ibmvfc_purge_requests(vhost, DID_REQUEUE);
-   ibmvfc_link_down(vhost, IBMVFC_LINK_DOWN);
+   ibmvfc_set_host_state(vhost, IBMVFC_LINK_DOWN);
ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_REENABLE);
+   wake_up(&vhost->work_wait_q);
} else if (crq->format == IBMVFC_PARTNER_FAILED || crq->format == IBMVFC_PARTNER_DEREGISTER) {
dev_err(vhost->dev, "Host partner adapter deregistered or failed (rc=%d)\n", crq->format);
ibmvfc_purge_requests(vhost, DID_ERROR);
-- 
2.35.1



[PATCH AUTOSEL 6.0 23/44] scsi: ibmvfc: Avoid path failures during live migration

2022-11-18 Thread Sasha Levin
From: Brian King 

[ Upstream commit 62fa3ce05d5d73c5eccc40b2db493f55fecfc446 ]

Fix an issue reported when performing a live migration when multipath is
configured with a short fast fail timeout of 5 seconds and also to have
no_path_retry set to fail. In this scenario, all paths would go into the
devloss state while the ibmvfc driver went through discovery to log back
in. On a loaded system, the discovery might take longer than 5 seconds,
which was resulting in all paths being marked failed, which then resulted
in a read only filesystem.

This patch changes the migration code in ibmvfc to avoid deleting rports at
all in this scenario, so we avoid losing all paths.

Signed-off-by: Brian King 
Link: https://lore.kernel.org/r/20221026181356.148517-1-brk...@linux.vnet.ibm.com
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 00684e11976b..1a0c0b7289d2 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -708,8 +708,13 @@ static void ibmvfc_init_host(struct ibmvfc_host *vhost)
memset(vhost->async_crq.msgs.async, 0, PAGE_SIZE);
vhost->async_crq.cur = 0;
 
-   list_for_each_entry(tgt, &vhost->targets, queue)
-   ibmvfc_del_tgt(tgt);
+   list_for_each_entry(tgt, &vhost->targets, queue) {
+   if (vhost->client_migrated)
+   tgt->need_login = 1;
+   else
+   ibmvfc_del_tgt(tgt);
+   }
+
scsi_block_requests(vhost->host);
ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_INIT);
vhost->job_step = ibmvfc_npiv_login;
@@ -3235,9 +3240,12 @@ static void ibmvfc_handle_crq(struct ibmvfc_crq *crq, struct ibmvfc_host *vhost,
/* We need to re-setup the interpartition connection */
dev_info(vhost->dev, "Partition migrated, Re-enabling adapter\n");
vhost->client_migrated = 1;
+
+   scsi_block_requests(vhost->host);
ibmvfc_purge_requests(vhost, DID_REQUEUE);
-   ibmvfc_link_down(vhost, IBMVFC_LINK_DOWN);
+   ibmvfc_set_host_state(vhost, IBMVFC_LINK_DOWN);
ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_REENABLE);
+   wake_up(&vhost->work_wait_q);
} else if (crq->format == IBMVFC_PARTNER_FAILED || crq->format == IBMVFC_PARTNER_DEREGISTER) {
dev_err(vhost->dev, "Host partner adapter deregistered or failed (rc=%d)\n", crq->format);
ibmvfc_purge_requests(vhost, DID_ERROR);
-- 
2.35.1



Re: [PATCH mm-unstable v1 20/20] mm: rename FOLL_FORCE to FOLL_PTRACE

2022-11-18 Thread Kees Cook
On Fri, Nov 18, 2022 at 12:09:02PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 16, 2022 at 10:16:34AM -0800, Linus Torvalds wrote:
> > Following the history of it is a bit of a mess, because there's a
> > number of renamings and re-organizations, but it seems to go back to
> > 2007 and commit b6a2fea39318 ("mm: variable length argument support").
> 
> I went back and read parts of the discussions with Ollie, and the
> .force=1 thing just magically appeared one day when we were sending
> work-in-progress patches back and forth without mention of where it came
> from :-/
> 
> And I certainly can't remember now..
> 
> Looking at it now, I have the same reaction as both you and Kees had, it
> seems entirely superfluous. So I'm all for trying to remove it.

Thanks for digging through the history! I've pushed the change to -next:
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=for-next/execve&id=cd57e443831d8eeb083c7165bce195d886e216d4

-- 
Kees Cook


Re: [patch 23/39] PCI/MSI: Move pci_alloc_irq_vectors_affinity() to api.c

2022-11-18 Thread Ahmed S. Darwish
On Wed, Nov 16, 2022 at 10:23:22AM -0600, Bjorn Helgaas wrote:
> On Fri, Nov 11, 2022 at 02:54:51PM +0100, Thomas Gleixner wrote:
...
> > +
> > +/**
> > + * pci_alloc_irq_vectors_affinity() - Allocate multiple device interrupt
> > + *vectors with affinity requirements
> > + * @dev:  the PCI device to operate on
> > + * @min_vecs: minimum required number of vectors (must be >= 1)
> > + * @max_vecs: maximum desired number of vectors
> > + * @flags:allocation flags, as in pci_alloc_irq_vectors()
> > + * @affd: affinity requirements (can be %NULL).
> > + *
> > + * Same as pci_alloc_irq_vectors(), but with the extra @affd parameter.
> > + * Check that function docs, and &struct irq_affinity, for more details.
>
> Is "&struct irq_affinity" some kernel-doc syntax, or is the "&"
> superfluous?
>

Hmmm, I stole it from Documentation/doc-guide/kernel-doc.rst. htmldoc
parses it and generates a link to the referenced structure's kernel-doc.

But, yeah, this was literally the first usage of such a doc pattern in
the entire kernel's C code :)

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH
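
For readers unfamiliar with the notation discussed here: the kernel-doc guide lists "&struct name" as the cross-reference form for structures, alongside "@param" for parameters and "%CONST" for constants. A small sketch of a comment using all three follows. The struct demo_affinity type and demo_alloc_vectors() function are hypothetical and only exist to carry the comment; they are not the PCI/MSI API.

```c
#include <assert.h>
#include <stddef.h>

/**
 * struct demo_affinity - hypothetical affinity description
 * @pre_vectors: vectors reserved ahead of the spread set
 */
struct demo_affinity {
	unsigned int pre_vectors;
};

/**
 * demo_alloc_vectors() - Pick a vector count, optionally with affinity
 * @min_vecs: minimum required number of vectors (must be >= 1)
 * @max_vecs: maximum desired number of vectors
 * @affd:     affinity requirements (can be %NULL)
 *
 * See &struct demo_affinity for what @affd describes.
 *
 * Return: the number of vectors granted, or -1 if the request is invalid.
 */
static int demo_alloc_vectors(unsigned int min_vecs, unsigned int max_vecs,
			      const struct demo_affinity *affd)
{
	if (min_vecs < 1 || max_vecs < min_vecs)
		return -1;
	if (affd && affd->pre_vectors > max_vecs)
		return -1;

	assert(max_vecs >= min_vecs);
	return (int)max_vecs;
}
```

When the documentation is built, the "&struct demo_affinity" reference would be rendered as a link to the structure's own kernel-doc entry, which is the behaviour described above.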


Re: [PATCH mm-unstable v1 20/20] mm: rename FOLL_FORCE to FOLL_PTRACE

2022-11-18 Thread Peter Zijlstra
On Wed, Nov 16, 2022 at 10:16:34AM -0800, Linus Torvalds wrote:
> Following the history of it is a bit of a mess, because there's a
> number of renamings and re-organizations, but it seems to go back to
> 2007 and commit b6a2fea39318 ("mm: variable length argument support").

I went back and read parts of the discussions with Ollie, and the
.force=1 thing just magically appeared one day when we were sending
work-in-progress patches back and forth without mention of where it came
from :-/

And I certainly can't remember now..

Looking at it now, I have the same reaction as both you and Kees had, it
seems entirely superfluous. So I'm all for trying to remove it.


[PATCH 289/606] macintosh: ams/ams-i2c: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/ams/ams-i2c.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/drivers/macintosh/ams/ams-i2c.c b/drivers/macintosh/ams/ams-i2c.c
index 3ded340699fb..a4a1035eb412 100644
--- a/drivers/macintosh/ams/ams-i2c.c
+++ b/drivers/macintosh/ams/ams-i2c.c
@@ -56,8 +56,7 @@ enum ams_i2c_cmd {
AMS_CMD_START,
 };
 
-static int ams_i2c_probe(struct i2c_client *client,
-const struct i2c_device_id *id);
+static int ams_i2c_probe(struct i2c_client *client);
 static void ams_i2c_remove(struct i2c_client *client);
 
 static const struct i2c_device_id ams_id[] = {
@@ -70,7 +69,7 @@ static struct i2c_driver ams_i2c_driver = {
.driver = {
.name   = "ams",
},
-   .probe  = ams_i2c_probe,
+   .probe_new  = ams_i2c_probe,
.remove = ams_i2c_remove,
.id_table   = ams_id,
 };
@@ -155,8 +154,7 @@ static void ams_i2c_get_xyz(s8 *x, s8 *y, s8 *z)
*z = ams_i2c_read(AMS_DATAZ);
 }
 
-static int ams_i2c_probe(struct i2c_client *client,
-const struct i2c_device_id *id)
+static int ams_i2c_probe(struct i2c_client *client)
 {
int vmaj, vmin;
int result;
-- 
2.38.1



[PATCH 293/606] macintosh: windfarm_fcu_controls: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/windfarm_fcu_controls.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/macintosh/windfarm_fcu_controls.c b/drivers/macintosh/windfarm_fcu_controls.c
index c5b1ca5bcd73..e027d889d7e8 100644
--- a/drivers/macintosh/windfarm_fcu_controls.c
+++ b/drivers/macintosh/windfarm_fcu_controls.c
@@ -514,8 +514,7 @@ static int wf_fcu_init_chip(struct wf_fcu_priv *pv)
return 0;
 }
 
-static int wf_fcu_probe(struct i2c_client *client,
-   const struct i2c_device_id *id)
+static int wf_fcu_probe(struct i2c_client *client)
 {
struct wf_fcu_priv *pv;
 
@@ -590,7 +589,7 @@ static struct i2c_driver wf_fcu_driver = {
.name   = "wf_fcu",
.of_match_table = wf_fcu_of_id,
},
-   .probe  = wf_fcu_probe,
+   .probe_new  = wf_fcu_probe,
.remove = wf_fcu_remove,
.id_table   = wf_fcu_id,
 };
-- 
2.38.1



[PATCH 292/606] macintosh: windfarm_ad7417_sensor: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/windfarm_ad7417_sensor.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/macintosh/windfarm_ad7417_sensor.c b/drivers/macintosh/windfarm_ad7417_sensor.c
index c5c54a4ce91f..33b4723d235e 100644
--- a/drivers/macintosh/windfarm_ad7417_sensor.c
+++ b/drivers/macintosh/windfarm_ad7417_sensor.c
@@ -229,8 +229,7 @@ static void wf_ad7417_init_chip(struct wf_ad7417_priv *pv)
pv->config = config;
 }
 
-static int wf_ad7417_probe(struct i2c_client *client,
-  const struct i2c_device_id *id)
+static int wf_ad7417_probe(struct i2c_client *client)
 {
struct wf_ad7417_priv *pv;
const struct mpu_data *mpu;
@@ -321,7 +320,7 @@ static struct i2c_driver wf_ad7417_driver = {
.name   = "wf_ad7417",
.of_match_table = wf_ad7417_of_id,
},
-   .probe  = wf_ad7417_probe,
+   .probe_new  = wf_ad7417_probe,
.remove = wf_ad7417_remove,
.id_table   = wf_ad7417_id,
 };
-- 
2.38.1



[PATCH 294/606] macintosh: windfarm_lm75_sensor: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

.probe_new() doesn't get the i2c_device_id * parameter, so determine
that explicitly in the probe function.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/windfarm_lm75_sensor.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/macintosh/windfarm_lm75_sensor.c b/drivers/macintosh/windfarm_lm75_sensor.c
index 204661c8e918..24f0a444d312 100644
--- a/drivers/macintosh/windfarm_lm75_sensor.c
+++ b/drivers/macintosh/windfarm_lm75_sensor.c
@@ -87,9 +87,9 @@ static const struct wf_sensor_ops wf_lm75_ops = {
.owner  = THIS_MODULE,
 };
 
-static int wf_lm75_probe(struct i2c_client *client,
-const struct i2c_device_id *id)
-{  
+static int wf_lm75_probe(struct i2c_client *client)
+{
+   const struct i2c_device_id *id = i2c_client_get_device_id(client);
struct wf_lm75_sensor *lm;
int rc, ds1775;
const char *name, *loc;
@@ -177,7 +177,7 @@ static struct i2c_driver wf_lm75_driver = {
.name   = "wf_lm75",
.of_match_table = wf_lm75_of_id,
},
-   .probe  = wf_lm75_probe,
+   .probe_new  = wf_lm75_probe,
.remove = wf_lm75_remove,
.id_table   = wf_lm75_id,
 };
-- 
2.38.1
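
For drivers like this one that still need the matched table entry, the conversion replaces the id parameter with a lookup inside the probe function. The sketch below models that idea in userspace C under stated assumptions: the mock_* types are simplified stand-ins for the kernel structures, the table entries "sensor-a" and "sensor-b" are hypothetical, and the lookup (match the client name against a sentinel-terminated table) is a simplified model of what the helper is understood to do, not its actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct i2c_device_id. */
struct mock_i2c_device_id {
	char name[20];
	unsigned long driver_data;
};

/*
 * Simplified stand-in for struct i2c_client. In the kernel the table is
 * reached through the bound driver; here it is stored directly.
 */
struct mock_i2c_client {
	char name[20];
	const struct mock_i2c_device_id *id_table;
};

/* Walk the table up to the empty-name sentinel, matching on the name. */
static const struct mock_i2c_device_id *
mock_client_get_device_id(const struct mock_i2c_client *client)
{
	const struct mock_i2c_device_id *id;

	assert(client != NULL);
	id = client->id_table;
	if (!id)
		return NULL;

	for (; id->name[0]; id++)
		if (strcmp(client->name, id->name) == 0)
			return id;
	return NULL;
}

/* Hypothetical id table; driver_data distinguishes two chip variants. */
static const struct mock_i2c_device_id mock_sensor_ids[] = {
	{ "sensor-a", 0 },
	{ "sensor-b", 1 },
	{ "", 0 },	/* sentinel */
};

/* New-style probe: only the client is passed, the id is looked up. */
static int mock_sensor_probe(const struct mock_i2c_client *client)
{
	const struct mock_i2c_device_id *id = mock_client_get_device_id(client);

	if (!id)
		return -1;
	return (int)id->driver_data;	/* e.g. selects the chip variant */
}
```

Drivers whose probe never reads the id, which is most of this series, need no lookup at all and only drop the parameter.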



[PATCH 296/606] macintosh: windfarm_max6690_sensor: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/windfarm_max6690_sensor.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/macintosh/windfarm_max6690_sensor.c b/drivers/macintosh/windfarm_max6690_sensor.c
index c0d404ebc792..6c5ab657b6b3 100644
--- a/drivers/macintosh/windfarm_max6690_sensor.c
+++ b/drivers/macintosh/windfarm_max6690_sensor.c
@@ -60,8 +60,7 @@ static const struct wf_sensor_ops wf_max6690_ops = {
.owner  = THIS_MODULE,
 };
 
-static int wf_max6690_probe(struct i2c_client *client,
-   const struct i2c_device_id *id)
+static int wf_max6690_probe(struct i2c_client *client)
 {
const char *name, *loc;
struct wf_6690_sensor *max;
@@ -129,7 +128,7 @@ static struct i2c_driver wf_max6690_driver = {
.name   = "wf_max6690",
.of_match_table = wf_max6690_of_id,
},
-   .probe  = wf_max6690_probe,
+   .probe_new  = wf_max6690_probe,
.remove = wf_max6690_remove,
.id_table   = wf_max6690_id,
 };
-- 
2.38.1



[PATCH 297/606] macintosh: windfarm_smu_sat: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/windfarm_smu_sat.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/macintosh/windfarm_smu_sat.c b/drivers/macintosh/windfarm_smu_sat.c
index be5d4593db93..ebc4256a9e4a 100644
--- a/drivers/macintosh/windfarm_smu_sat.c
+++ b/drivers/macintosh/windfarm_smu_sat.c
@@ -189,8 +189,7 @@ static const struct wf_sensor_ops wf_sat_ops = {
.owner  = THIS_MODULE,
 };
 
-static int wf_sat_probe(struct i2c_client *client,
-   const struct i2c_device_id *id)
+static int wf_sat_probe(struct i2c_client *client)
 {
struct device_node *dev = client->dev.of_node;
struct wf_sat *sat;
@@ -349,7 +348,7 @@ static struct i2c_driver wf_sat_driver = {
.name   = "wf_smu_sat",
.of_match_table = wf_sat_of_id,
},
-   .probe  = wf_sat_probe,
+   .probe_new  = wf_sat_probe,
.remove = wf_sat_remove,
.id_table   = wf_sat_id,
 };
-- 
2.38.1



[PATCH 291/606] macintosh: therm_windtunnel: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

.probe_new() doesn't get the i2c_device_id * parameter, so determine
that explicitly in the probe function.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/therm_windtunnel.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/macintosh/therm_windtunnel.c b/drivers/macintosh/therm_windtunnel.c
index b8228ca40454..22b15efcc025 100644
--- a/drivers/macintosh/therm_windtunnel.c
+++ b/drivers/macintosh/therm_windtunnel.c
@@ -411,8 +411,9 @@ static const struct i2c_device_id therm_windtunnel_id[] = {
 MODULE_DEVICE_TABLE(i2c, therm_windtunnel_id);
 
 static int
-do_probe(struct i2c_client *cl, const struct i2c_device_id *id)
+do_probe(struct i2c_client *cl)
 {
+   const struct i2c_device_id *id = i2c_client_get_device_id(cl);
struct i2c_adapter *adapter = cl->adapter;
int ret = 0;
 
@@ -441,7 +442,7 @@ static struct i2c_driver g4fan_driver = {
.driver = {
.name   = "therm_windtunnel",
},
-   .probe  = do_probe,
+   .probe_new  = do_probe,
.remove = do_remove,
.id_table   = therm_windtunnel_id,
 };
-- 
2.38.1



[PATCH 295/606] macintosh: windfarm_lm87_sensor: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/windfarm_lm87_sensor.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/macintosh/windfarm_lm87_sensor.c b/drivers/macintosh/windfarm_lm87_sensor.c
index 40d25463346e..f37a32c2070c 100644
--- a/drivers/macintosh/windfarm_lm87_sensor.c
+++ b/drivers/macintosh/windfarm_lm87_sensor.c
@@ -95,8 +95,7 @@ static const struct wf_sensor_ops wf_lm87_ops = {
.owner  = THIS_MODULE,
 };
 
-static int wf_lm87_probe(struct i2c_client *client,
-const struct i2c_device_id *id)
+static int wf_lm87_probe(struct i2c_client *client)
 {  
struct wf_lm87_sensor *lm;
const char *name = NULL, *loc;
@@ -173,7 +172,7 @@ static struct i2c_driver wf_lm87_driver = {
.name   = "wf_lm87",
.of_match_table = wf_lm87_of_id,
},
-   .probe  = wf_lm87_probe,
+   .probe_new  = wf_lm87_probe,
.remove = wf_lm87_remove,
.id_table   = wf_lm87_id,
 };
-- 
2.38.1



[PATCH 290/606] macintosh: therm_adt746x: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

.probe_new() doesn't get the i2c_device_id * parameter, so determine
that explicitly in the probe function.

Signed-off-by: Uwe Kleine-König 
---
 drivers/macintosh/therm_adt746x.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/macintosh/therm_adt746x.c b/drivers/macintosh/therm_adt746x.c
index b004ea2a1102..8f5db9093c9a 100644
--- a/drivers/macintosh/therm_adt746x.c
+++ b/drivers/macintosh/therm_adt746x.c
@@ -464,9 +464,9 @@ static void thermostat_remove_files(struct thermostat *th)
 
 }
 
-static int probe_thermostat(struct i2c_client *client,
-   const struct i2c_device_id *id)
+static int probe_thermostat(struct i2c_client *client)
 {
+   const struct i2c_device_id *id = i2c_client_get_device_id(client);
struct device_node *np = client->dev.of_node;
struct thermostat* th;
const __be32 *prop;
@@ -598,7 +598,7 @@ static struct i2c_driver thermostat_driver = {
.driver = {
.name   = "therm_adt746x",
},
-   .probe = probe_thermostat,
+   .probe_new = probe_thermostat,
.remove = remove_thermostat,
.id_table = therm_adt746x_id,
 };
-- 
2.38.1



[PATCH 000/606] i2c: Complete conversion to i2c_probe_new

2022-11-18 Thread Uwe Kleine-König
Hello,

Since commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new()
call-back type") from 2016 there has been a "temporary" alternative probe
callback for i2c drivers.

This series converts all drivers to this new callback (unless I missed
something). It's based on current next/master.
Some of the patches depend on commit 662233731d66 ("i2c: core:
Introduce i2c_client_get_device_id helper function"); there is a branch that
you can pull into your tree to get it:

https://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git i2c/client_device_id_helper-immutable

I don't think it's feasible to apply this series in one go, so I ask the
maintainers of the changed files to apply via their tree. I guess it
will take a few kernel release iterations until all patches are in, but I
think a single tree would create too many conflicts.

The last patch changes i2c_driver::probe; all non-converted drivers will
fail to compile then. So I hope the build bots will tell me about any
driver I failed to convert. This patch is obviously not for application
now.

I dropped most individuals from the recipients of this mail so as not to
challenge the mail servers and mailing list filters too much. Sorry if
you had to make an extra effort to find this mail.

Best regards
Uwe

Uwe Kleine-König (606):
  tpm: st33zp24: Convert to Convert to i2c's .probe_new()
  tpm: tpm_i2c_atmel: Convert to i2c's .probe_new()
  tpm: tpm_i2c_infineon: Convert to i2c's .probe_new()
  tpm: tpm_i2c_nuvoton: Convert to i2c's .probe_new()
  tpm: tis_i2c: Convert to i2c's .probe_new()
  crypto: atmel-ecc - Convert to i2c's .probe_new()
  crypto: atmel-sha204a - Convert to i2c's .probe_new()
  extcon: fsa9480: Convert to i2c's .probe_new()
  extcon: rt8973: Convert to i2c's .probe_new()
  extcon: usbc-tusb320: Convert to i2c's .probe_new()
  gpio: max732x: Convert to i2c's .probe_new()
  gpio: pca953x: Convert to i2c's .probe_new()
  gpio: pcf857x: Convert to i2c's .probe_new()
  drm/bridge: adv7511: Convert to i2c's .probe_new()
  drm/bridge/analogix/anx6345: Convert to i2c's .probe_new()
  drm/bridge/analogix/anx78xx: Convert to i2c's .probe_new()
  drm/bridge: anx7625: Convert to i2c's .probe_new()
  drm/bridge: icn6211: Convert to i2c's .probe_new()
  drm/bridge: chrontel-ch7033: Convert to i2c's .probe_new()
  drm/bridge: it6505: Convert to i2c's .probe_new()
  drm/bridge: it66121: Convert to i2c's .probe_new()
  drm/bridge: lt8912b: Convert to i2c's .probe_new()
  drm/bridge: lt9211: Convert to i2c's .probe_new()
  drm/bridge: lt9611: Convert to i2c's .probe_new()
  drm/bridge: lt9611uxc: Convert to i2c's .probe_new()
  drm/bridge: megachips: Convert to i2c's .probe_new()
  drm/bridge: nxp-ptn3460: Convert to i2c's .probe_new()
  drm/bridge: parade-ps8622: Convert to i2c's .probe_new()
  drm/bridge: sii902x: Convert to i2c's .probe_new()
  drm/bridge: sii9234: Convert to i2c's .probe_new()
  drm/bridge: sii8620: Convert to i2c's .probe_new()
  drm/bridge: tc358767: Convert to i2c's .probe_new()
  drm/bridge: tc358768: Convert to i2c's .probe_new()
  drm/bridge/tc358775: Convert to i2c's .probe_new()
  drm/bridge: ti-sn65dsi83: Convert to i2c's .probe_new()
  drm/bridge: ti-sn65dsi86: Convert to i2c's .probe_new()
  drm/bridge: tfp410: Convert to i2c's .probe_new()
  drm/i2c/ch7006: Convert to i2c's .probe_new()
  drm/i2c/sil164: Convert to i2c's .probe_new()
  drm/i2c/tda9950: Convert to i2c's .probe_new()
  drm/i2c/tda998x: Convert to i2c's .probe_new()
  drm/panel: olimex-lcd-olinuxino: Convert to i2c's .probe_new()
  drm/panel: raspberrypi-touchscreen: Convert to i2c's .probe_new()
  i2c: core: Convert to i2c's .probe_new()
  i2c: slave-eeprom: Convert to i2c's .probe_new()
  i2c: smbus: Convert to i2c's .probe_new()
  i2c: mux: pca9541: Convert to i2c's .probe_new()
  i2c: mux: pca954x: Convert to i2c's .probe_new()
  iio: accel: adxl372_i2c: Convert to i2c's .probe_new()
  iio: accel: bma180: Convert to i2c's .probe_new()
  iio: accel: bma400: Convert to i2c's .probe_new()
  iio: accel: bmc150: Convert to i2c's .probe_new()
  iio: accel: da280: Convert to i2c's .probe_new()
  iio: accel: kxcjk-1013: Convert to i2c's .probe_new()
  iio: accel: mma7455_i2c: Convert to i2c's .probe_new()
  iio: accel: mma8452: Convert to i2c's .probe_new()
  iio: accel: mma9551: Convert to i2c's .probe_new()
  iio: accel: mma9553: Convert to i2c's .probe_new()
  iio: adc: ad7091r5: Convert to i2c's .probe_new()
  iio: adc: ad7291: Convert to i2c's .probe_new()
  iio: adc: ad799x: Convert to i2c's .probe_new()
  iio: adc: ina2xx-adc: Convert to i2c's .probe_new()
  iio: adc: ltc2471: Convert to i2c's .probe_new()
  iio: adc: ltc2485: Convert to i2c's .probe_new()
  iio: adc: ltc2497: Convert to i2c's .probe_new()
  iio: adc: max1363: Convert to i2c's .probe_new()
  iio: adc: max9611: Convert to i2c's .probe_new()
  iio: adc: mcp3422: Convert to i2c's .probe_new()
  iio: adc: ti-adc081c: Convert to i2c's .probe_new()
  iio: adc: ti-ads1015: Convert to i2c's .probe_new()
  iio: c

[PATCH 598/606] ALSA: aoa: tas: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König 
---
 sound/aoa/codecs/tas.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/sound/aoa/codecs/tas.c b/sound/aoa/codecs/tas.c
index ab89475b7715..f906e9aaddcf 100644
--- a/sound/aoa/codecs/tas.c
+++ b/sound/aoa/codecs/tas.c
@@ -875,8 +875,7 @@ static void tas_exit_codec(struct aoa_codec *codec)
 }
 
 
-static int tas_i2c_probe(struct i2c_client *client,
-const struct i2c_device_id *id)
+static int tas_i2c_probe(struct i2c_client *client)
 {
struct device_node *node = client->dev.of_node;
struct tas *tas;
@@ -937,7 +936,7 @@ static struct i2c_driver tas_driver = {
.driver = {
.name = "aoa_codec_tas",
},
-   .probe = tas_i2c_probe,
+   .probe_new = tas_i2c_probe,
.remove = tas_i2c_remove,
.id_table = tas_i2c_id,
 };
-- 
2.38.1



[PATCH 597/606] ALSA: aoa: onyx: Convert to i2c's .probe_new()

2022-11-18 Thread Uwe Kleine-König
From: Uwe Kleine-König 

The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.

Signed-off-by: Uwe Kleine-König 
---
 sound/aoa/codecs/onyx.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/sound/aoa/codecs/onyx.c b/sound/aoa/codecs/onyx.c
index 2d0f904aba00..4c75381f5ab8 100644
--- a/sound/aoa/codecs/onyx.c
+++ b/sound/aoa/codecs/onyx.c
@@ -990,8 +990,7 @@ static void onyx_exit_codec(struct aoa_codec *codec)
onyx->codec.soundbus_dev->detach_codec(onyx->codec.soundbus_dev, onyx);
 }
 
-static int onyx_i2c_probe(struct i2c_client *client,
- const struct i2c_device_id *id)
+static int onyx_i2c_probe(struct i2c_client *client)
 {
struct device_node *node = client->dev.of_node;
struct onyx *onyx;
@@ -1049,7 +1048,7 @@ static struct i2c_driver onyx_driver = {
.driver = {
.name = "aoa_codec_onyx",
},
-   .probe = onyx_i2c_probe,
+   .probe_new = onyx_i2c_probe,
.remove = onyx_i2c_remove,
.id_table = onyx_i2c_id,
 };
-- 
2.38.1



Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc

2022-11-18 Thread Christophe Leroy


Le 18/11/2022 à 18:28, Song Liu a écrit :
> On Fri, Nov 18, 2022 at 3:47 AM Christophe Leroy
>  wrote:
>>
>>
>>
>> Le 18/11/2022 à 10:39, Hari Bathini a écrit :
>>>
>>>
>>> On 18/11/22 2:21 pm, Christophe Leroy wrote: >
>>> I had the same config but hit this problem:
>>>
>>> # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf
>>> test_bpf: #0 TAX
>>> [ cut here ]
>>> WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367
>>> bpf_int_jit_compile+0x8a0/0x9f8
>>
>> I get no such problem, on QEMU, and I checked the .config has:
>
>> CONFIG_STRICT_KERNEL_RWX=y
>> CONFIG_STRICT_MODULE_RWX=y
>
> Yeah. That did the trick.

 Interesting. I guess we have to find out why it fails when those config
 are missing.

 Maybe module code plays with RO and NX flags even if
 CONFIG_STRICT_MODULE_RWX is not selected ?
>>>
>>> Need to look at the code closely but fwiw, observing same failure on
>>> 64-bit as well with !STRICT_RWX...
>>
>> The problem is in bpf_prog_pack_alloc() and in alloc_new_pack() : They
>> do set_memory_ro() and set_memory_x() without taking into account
>> CONFIG_STRICT_MODULE_RWX.
>>
>> When CONFIG_STRICT_MODULE_RWX is selected, powerpc module_alloc()
>> allocates PAGE_KERNEL memory, that is RW memory, and expects the user to
>> call do set_memory_ro() and set_memory_x().
>>
>> But when CONFIG_STRICT_MODULE_RWX is not selected, powerpc
>> module_alloc() allocates PAGE_KERNEL_TEXT memory, that is RWX memory,
>> and expects to be able to always write into it.
> 
> Ah, I see. x86_64 requires CONFIG_STRICT_MODULE_RWX, so this hasn't
> been a problem yet.
> 

In fact it shouldn't be a problem for BPF on powerpc either, because
powerpc BPF expects RO at all times and today uses bpf_jit_binary_lock_ro().

It just means that we can't use patch_instruction() for that. Anyway,
using patch_instruction() was sub-optimal.

All we have to do, I think, is set up a mirror of the page using vmap(),
then memcpy() the code into it and vunmap() it. A call to
flush_tlb_kernel_range() may also be needed, unless BPF already does it.

Christophe
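
As a rough userspace analogy of the vmap()/memcpy()/vunmap() idea above (not kernel code, and not a proposal for the actual patch), the same pages can be mapped twice: a long-lived read-only view and a short-lived writable alias. The model_* helpers are hypothetical, and a temporary file stands in for the pages that the kernel would hand to vmap().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: back a region of len bytes with an unlinked temp file. */
static int model_backing_fd(size_t len)
{
	FILE *f = tmpfile();
	int fd;

	if (!f)
		return -1;

	fd = dup(fileno(f)); /* keep a descriptor after closing the FILE */
	fclose(f);

	if (fd >= 0 && ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		fd = -1;
	}
	return fd;
}

/* Long-lived view: read-only, standing in for the RO JIT image. */
static const unsigned char *model_map_ro(int fd, size_t len)
{
	void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Short-lived writable alias of the same pages: map, memcpy(), unmap. */
static int model_patch_via_alias(int fd, size_t len, size_t off,
				 const void *src, size_t n)
{
	unsigned char *rw;

	if (off > len || n > len - off)
		return -1; /* refuse to write outside the region */

	rw = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (rw == MAP_FAILED)
		return -1;

	memcpy(rw + off, src, n);

	return munmap(rw, len);
}
```

The read-only view never changes its protection; only the temporary alias is writable. Whether a TLB flush is needed afterwards is a kernel-side question this model cannot answer.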


Re: [PATCH] powerpc/pseries: unregister VPA when hot unplugging a CPU

2022-11-18 Thread Nathan Lynch
Laurent Dufour  writes:

> The VPA should unregister when offlining a CPU. Otherwise there could be a
> short window where 2 CPUs could share the same VPA.
>
> This happens because the hypervisor is still keeping the VPA attached to
> the vCPU even if it became offline.
>
> Here is a potential situation:
>  1. remove proc A,
>  2. add proc B. If proc B gets proc A's place in cpu_present_map, then it
> registers proc A's VPAs.
>  3. If proc B is then re-added to the LP, its threads are sharing VPAs with
> proc A briefly as they come online.
>
> As the hypervisor may check for the VPA's yield_count field oddity, it may
> detect an unexpected value and kill the LPAR.
>
> Suggested-by: Nathan Lynch 
> Signed-off-by: Laurent Dufour 
> ---
>  arch/powerpc/platforms/pseries/hotplug-cpu.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index e0a7ac5db15d..090ae5a1e0f5 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -70,6 +70,7 @@ static void pseries_cpu_offline_self(void)
>   xics_teardown_cpu();
>  
>   unregister_slb_shadow(hwcpu);
> + unregister_vpa(hwcpu);
>   rtas_stop_self();
>  
>   /* Should never get here... */

Reviewed-by: Nathan Lynch 

I was wondering whether we could leave an active dispatch trace log
buffer registered, which could interfere with releasing the VPA, but I
verified that DTL has the appropriate cpuhp callback for that
(dtl_worker_offline()).

Alternatively we could change the code to dynamically register and
unregister VPAs only on processor add and remove, as opposed to CPU
online/offline. But I can't see any significant advantage to that.


Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc

2022-11-18 Thread Song Liu
On Fri, Nov 18, 2022 at 3:47 AM Christophe Leroy
 wrote:
>
>
>
> Le 18/11/2022 à 10:39, Hari Bathini a écrit :
> >
> >
> > On 18/11/22 2:21 pm, Christophe Leroy wrote: >
> > I had the same config but hit this problem:
> >
> ># echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf
> >test_bpf: #0 TAX
> >[ cut here ]
> >WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367
> > bpf_int_jit_compile+0x8a0/0x9f8
> 
>  I get no such problem, on QEMU, and I checked the .config has:
> >>>
>  CONFIG_STRICT_KERNEL_RWX=y
>  CONFIG_STRICT_MODULE_RWX=y
> >>>
> >>> Yeah. That did the trick.
> >>
> >> Interesting. I guess we have to find out why it fails when those config
> >> are missing.
> >>
> >> Maybe module code plays with RO and NX flags even if
> >> CONFIG_STRICT_MODULE_RWX is not selected ?
> >
> > Need to look at the code closely but fwiw, observing same failure on
> > 64-bit as well with !STRICT_RWX...
>
> The problem is in bpf_prog_pack_alloc() and in alloc_new_pack() : They
> do set_memory_ro() and set_memory_x() without taking into account
> CONFIG_STRICT_MODULE_RWX.
>
> When CONFIG_STRICT_MODULE_RWX is selected, powerpc module_alloc()
> allocates PAGE_KERNEL memory, that is RW memory, and expects the user to
> call do set_memory_ro() and set_memory_x().
>
> But when CONFIG_STRICT_MODULE_RWX is not selected, powerpc
> module_alloc() allocates PAGE_KERNEL_TEXT memory, that is RWX memory,
> and expects to be able to always write into it.

Ah, I see. x86_64 requires CONFIG_STRICT_MODULE_RWX, so this hasn't
been a problem yet.

Thanks,
Song


Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

2022-11-18 Thread Song Liu
Hi Petr,

On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek  wrote:
>
> On Thu 2022-09-01 10:12:52, Song Liu wrote:
[...]
> >
> >  arch/powerpc/kernel/module_32.c |  10 
> >  arch/powerpc/kernel/module_64.c |  49 +++
> >  arch/s390/kernel/module.c   |   8 +++
> >  arch/x86/kernel/module.c| 102 +++-
> >  include/linux/moduleloader.h|   7 +++
> >  kernel/livepatch/core.c |  41 -
>
> First, thanks a lot for working on this.
>
> I can't check or test the powerpc and s390 code easily.
>
> I am going to comment only x86 and generic code. It looks good
> but it needs some changes to improve maintainability.

Thanks for these comments and suggestions. I will work on them
and send v4.

Song


Re: [PATCH mm-unstable v1 05/20] mm: add early FAULT_FLAG_WRITE consistency checks

2022-11-18 Thread Vlastimil Babka
On 11/16/22 11:26, David Hildenbrand wrote:
> Let's catch abuse of FAULT_FLAG_WRITE early, such that we don't have to
> care in all other handlers and might get "surprises" if we forget to do
> so.
> 
> Write faults without VM_MAYWRITE don't make any sense, and our
> maybe_mkwrite() logic could have hidden such abuse for now.
> 
> Write faults without VM_WRITE on something that is not a COW mapping is
> similarly broken, and e.g., do_wp_page() could end up placing an
> anonymous page into a shared mapping, which would be bad.
> 
> This is a preparation for reliable R/O long-term pinning of pages in
> private mappings, whereby we want to make sure that we will never break
> COW in a read-only private mapping.
> 
> Signed-off-by: David Hildenbrand 

Reviewed-by: Vlastimil Babka 

> ---
>  mm/memory.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index e014435a87db..c4fa378ec2a0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5170,6 +5170,14 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma,
>*/
>   if (!is_cow_mapping(vma->vm_flags))
>   *flags &= ~FAULT_FLAG_UNSHARE;
> + } else if (*flags & FAULT_FLAG_WRITE) {
> + /* Write faults on read-only mappings are impossible ... */
> + if (WARN_ON_ONCE(!(vma->vm_flags & VM_MAYWRITE)))
> + return VM_FAULT_SIGSEGV;
> + /* ... and FOLL_FORCE only applies to COW mappings. */
> + if (WARN_ON_ONCE(!(vma->vm_flags & VM_WRITE) &&
> +  !is_cow_mapping(vma->vm_flags)))
> + return VM_FAULT_SIGSEGV;
>   }
>   return 0;
>  }
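
The checks in the hunk above can be exercised in isolation with a small userspace model. This is a sketch under assumptions: the flag bit values are illustrative rather than the kernel's, the WARN_ON_ONCE() reporting is omitted, and the mutual-exclusion test in the UNSHARE branch is inferred from the description of patch 04/20 rather than from the lines quoted here.

```c
#include <stdbool.h>

/* Illustrative bit values only; not taken from the kernel headers. */
#define MODEL_VM_WRITE    0x1u
#define MODEL_VM_SHARED   0x2u
#define MODEL_VM_MAYWRITE 0x4u

#define MODEL_FAULT_FLAG_WRITE   0x1u
#define MODEL_FAULT_FLAG_UNSHARE 0x2u

#define MODEL_FAULT_OK      0
#define MODEL_FAULT_SIGSEGV 1

/* A COW mapping is private (not shared) and may become writable. */
static bool model_is_cow_mapping(unsigned int vm_flags)
{
	return (vm_flags & (MODEL_VM_SHARED | MODEL_VM_MAYWRITE)) ==
	       MODEL_VM_MAYWRITE;
}

static int model_sanitize_fault_flags(unsigned int vm_flags,
				      unsigned int *flags)
{
	if (*flags & MODEL_FAULT_FLAG_UNSHARE) {
		/* Assumed: UNSHARE and WRITE are mutually exclusive. */
		if (*flags & MODEL_FAULT_FLAG_WRITE)
			return MODEL_FAULT_SIGSEGV;
		/* UNSHARE outside a COW mapping degrades to a read fault. */
		if (!model_is_cow_mapping(vm_flags))
			*flags &= ~MODEL_FAULT_FLAG_UNSHARE;
	} else if (*flags & MODEL_FAULT_FLAG_WRITE) {
		/* Write faults on mappings that can never be writable. */
		if (!(vm_flags & MODEL_VM_MAYWRITE))
			return MODEL_FAULT_SIGSEGV;
		/* Forced writes are only valid on COW mappings. */
		if (!(vm_flags & MODEL_VM_WRITE) &&
		    !model_is_cow_mapping(vm_flags))
			return MODEL_FAULT_SIGSEGV;
	}
	return MODEL_FAULT_OK;
}
```

In this model a write fault on a private mapping without VM_WRITE is accepted (the forced COW case), while the same fault on a shared mapping is rejected.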



Re: [PATCH mm-unstable v1 04/20] mm: add early FAULT_FLAG_UNSHARE consistency checks

2022-11-18 Thread Vlastimil Babka
On 11/16/22 11:26, David Hildenbrand wrote:
> For now, FAULT_FLAG_UNSHARE only applies to anonymous pages, which
> implies a COW mapping. Let's hide FAULT_FLAG_UNSHARE early if we're not
> dealing with a COW mapping, such that we treat it like a read fault as
> documented and don't have to worry about the flag throughout all fault
> handlers.
> 
> While at it, centralize the check for mutual exclusion of
> FAULT_FLAG_UNSHARE and FAULT_FLAG_WRITE and just drop the check that
> either flag is set in the WP handler.
> 
> Signed-off-by: David Hildenbrand 
> ---
>  mm/huge_memory.c |  3 ---
>  mm/hugetlb.c |  5 -
>  mm/memory.c  | 23 ---
>  3 files changed, 20 insertions(+), 11 deletions(-)

Reviewed-by: Vlastimil Babka 



Re: [RFC PATCH v2 8/8] sched, smp: Trace smp callback causing an IPI

2022-11-18 Thread Daniel Bristot de Oliveira
On 11/18/22 10:12, Peter Zijlstra wrote:
> On Thu, Nov 17, 2022 at 02:45:29PM +, Valentin Schneider wrote:
> 
>>> +   if (trace_ipi_send_cpumask_enabled()) {
>>> +   call_single_data_t *csd;
>>> +   smp_call_func_t func;
>>> +
>>> +   csd = container_of(node, call_single_data_t, node.llist);
>>> +
>>> +   func = sched_ttwu_pending;
>>> +   if (CSD_TYPE(csd) != CSD_TYPE_TTWU)
>>> +   func = csd->func;
>>> +
>>> +   if (raw_smp_call_single_queue(cpu, node))
>>> +   trace_ipi_send_cpumask(cpumask_of(cpu), _RET_IP_, func);
>> So I went with the tracepoint being placed *before* the actual IPI gets
>> sent to have a somewhat sane ordering between trace_ipi_send_cpumask() and
>> e.g. trace_call_function_single_entry().
>>
>> Packaging the call_single_queue logic makes the code less horrible, but it
>> does mix up the event ordering...
> Keeps em sharp ;-)
> 

Having the trace before the IPI avoids the (non-ideal) case where the trace
stops because of an IPI execution before we have a trace entry showing who
sent it... :-(.

-- Daniel



Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

2022-11-18 Thread Petr Mladek
On Thu 2022-09-01 10:12:52, Song Liu wrote:
> From: Miroslav Benes 
> 
> Josh reported a bug:
> 
>   When the object to be patched is a module, and that module is
>   rmmod'ed and reloaded, it fails to load with:
> 
>   module: x86/modules: Skipping invalid relocation target, existing value is nonzero for type 2, loc ba0302e9, val a03e293c
>   livepatch: failed to initialize patch 'livepatch_nfsd' for module 'nfsd' (-8)
>   livepatch: patch 'livepatch_nfsd' failed for module 'nfsd', refusing to load module 'nfsd'
> 
>   The livepatch module has a relocation which references a symbol
>   in the _previous_ loading of nfsd. When apply_relocate_add()
>   tries to replace the old relocation with a new one, it sees that
>   the previous one is nonzero and it errors out.
> 
> We thus decided to reverse the relocation patching (clear all relocation
> targets on x86_64). The solution is not
> universal and is too much arch-specific, but it may prove to be simpler
> in the end.
>
>  arch/powerpc/kernel/module_32.c |  10 
>  arch/powerpc/kernel/module_64.c |  49 +++
>  arch/s390/kernel/module.c   |   8 +++
>  arch/x86/kernel/module.c| 102 +++-
>  include/linux/moduleloader.h|   7 +++
>  kernel/livepatch/core.c |  41 -

First, thanks a lot for working on this.

I can't check or test the powerpc and s390 code easily.

I am going to comment only x86 and generic code. It looks good
but it needs some changes to improve maintainability.

>  6 files changed, 189 insertions(+), 28 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/module_32.c b/arch/powerpc/kernel/module_32.c
> index ea6536171778..e3c312770453 100644
> --- a/arch/powerpc/kernel/module_32.c
> +++ b/arch/powerpc/kernel/module_32.c
> @@ -285,6 +285,16 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
>   return 0;
>  }
>  
> +#ifdef CONFIG_LIVEPATCH
> +void clear_relocate_add(Elf32_Shdr *sechdrs,
> +const char *strtab,
> +unsigned int symindex,
> +unsigned int relsec,
> +struct module *me)
> +{
> +}
> +#endif
> +
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  notrace int module_trampoline_target(struct module *mod, unsigned long addr,
>unsigned long *target)
> diff --git a/arch/powerpc/kernel/module_64.c b/arch/powerpc/kernel/module_64.c
> index 7e45dc98df8a..514951f97391 100644
> --- a/arch/powerpc/kernel/module_64.c
> +++ b/arch/powerpc/kernel/module_64.c
> @@ -739,6 +739,55 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
>   return 0;
>  }
>  
> +#ifdef CONFIG_LIVEPATCH
> +void clear_relocate_add(Elf64_Shdr *sechdrs,
> +const char *strtab,
> +unsigned int symindex,
> +unsigned int relsec,
> +struct module *me)
> +{
> + unsigned int i;
> + Elf64_Rela *rela = (void *)sechdrs[relsec].sh_addr;
> + Elf64_Sym *sym;
> + unsigned long *location;
> + const char *symname;
> + u32 *instruction;
> +
> + pr_debug("Clearing ADD relocate section %u to %u\n", relsec,
> +  sechdrs[relsec].sh_info);
> +
> + for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
> + location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
> + + rela[i].r_offset;
> + sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
> + + ELF64_R_SYM(rela[i].r_info);
> + symname = me->core_kallsyms.strtab
> + + sym->st_name;
> +
> + if (ELF64_R_TYPE(rela[i].r_info) != R_PPC_REL24)
> + continue;
> + /*
> +  * reverse the operations in apply_relocate_add() for case
> +  * R_PPC_REL24.
> +  */
> + if (sym->st_shndx != SHN_UNDEF &&
> + sym->st_shndx != SHN_LIVEPATCH)
> + continue;
> +
> + instruction = (u32 *)location;
> + if (is_mprofile_ftrace_call(symname))
> + continue;
> +
> + if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
> + continue;
> +
> + instruction += 1;
> + patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
> + }
> +
> +}

This looks like a lot of duplicated code, doesn't it?

> +#endif
> +
>  #ifdef CONFIG_DYNAMIC_FTRACE
>  int module_trampoline_target(struct module *mod, unsigned long addr,
>unsigned long *target)
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -128,18 +128,20 @@ int apply_relocate(Elf32_Shdr *sechdrs,
>   return 0;
>  }
>  #else /*X86_64*/
> -static int __apply_relocate_add(Elf64_Shdr *sechdrs,
> +static int __apply_clear_relocate_add(Elf64_Shdr *sechdrs,

Nit: Honestly, the combination of 4 verbs: "apply", "clear", "relocate", and
"add" is really crazy. It is fa

Re: [PATCH mm-unstable v1 01/20] selftests/vm: anon_cow: prepare for non-anonymous COW tests

2022-11-18 Thread Vlastimil Babka
On 11/16/22 11:26, David Hildenbrand wrote:
> Originally, the plan was to have separate tests for testing COW of
> non-anonymous (e.g., shared zeropage) pages.
> 
> Turns out, that we'd need a lot of similar functionality and that there
> isn't a really good reason to separate it. So let's prepare for non-anon
> tests by renaming to "cow".
> 
> Signed-off-by: David Hildenbrand 

Acked-by: Vlastimil Babka 



[PATCH 04/13] powerpc/rtas: avoid scheduling in rtas_os_term()

2022-11-18 Thread Nathan Lynch
It's unsafe to use rtas_busy_delay() to handle a busy status from
the ibm,os-term RTAS function in rtas_os_term():

Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b
BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:618
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
preempt_count: 2, expected: 0
CPU: 7 PID: 1 Comm: swapper/0 Tainted: G  D 6.0.0-rc5-02182-gf8553a572277-dirty #9
Call Trace:
[c7b8f000] [c1337110] dump_stack_lvl+0xb4/0x110 (unreliable)
[c7b8f040] [c02440e4] __might_resched+0x394/0x3c0
[c7b8f0e0] [c004f680] rtas_busy_delay+0x120/0x1b0
[c7b8f100] [c0052d04] rtas_os_term+0xb8/0xf4
[c7b8f180] [c01150fc] pseries_panic+0x50/0x68
[c7b8f1f0] [c0036354] ppc_panic_platform_handler+0x34/0x50
[c7b8f210] [c02303c4] notifier_call_chain+0xd4/0x1c0
[c7b8f2b0] [c02306cc] atomic_notifier_call_chain+0xac/0x1c0
[c7b8f2f0] [c01d62b8] panic+0x228/0x4d0
[c7b8f390] [c01e573c] do_exit+0x140c/0x1420
[c7b8f480] [c01e586c] make_task_dead+0xdc/0x200

Use rtas_busy_delay_time() instead, which signals without side effects
whether to attempt the ibm,os-term RTAS call again.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 81e4996012b7..51f0508593a7 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -965,10 +965,15 @@ void rtas_os_term(char *str)
 
snprintf(rtas_os_term_buf, 2048, "OS panic: %s", str);
 
+   /*
+* Keep calling as long as RTAS returns a "try again" status,
+* but don't use rtas_busy_delay(), which potentially
+* schedules.
+*/
do {
status = rtas_call(ibm_os_term_token, 1, 1, NULL,
   __pa(rtas_os_term_buf));
-   } while (rtas_busy_delay(status));
+   } while (rtas_busy_delay_time(status));
 
if (status != 0)
printk(KERN_EMERG "ibm,os-term call failed %d\n", status);
-- 
2.37.1
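
The difference between the two helpers can be illustrated with a userspace model: a pure status-to-delay function can drive a retry loop from atomic context because it never sleeps. The sketch below is an approximation under assumptions. The status constants are written from memory of the RTAS busy and extended-delay conventions, and the fake firmware call and all model_* names are hypothetical.

```c
/* Assumed status conventions; treat the exact values as assumptions. */
#define MODEL_RTAS_BUSY               (-2)
#define MODEL_RTAS_EXTENDED_DELAY_MIN 9900
#define MODEL_RTAS_EXTENDED_DELAY_MAX 9905

/* Pure function: suggested delay in ms, or 0 when the status is final. */
static unsigned int model_busy_delay_time(int status)
{
	unsigned int ms = 0;
	int order;

	if (status == MODEL_RTAS_BUSY) {
		ms = 1;
	} else if (status >= MODEL_RTAS_EXTENDED_DELAY_MIN &&
		   status <= MODEL_RTAS_EXTENDED_DELAY_MAX) {
		/* 9900 + n suggests a delay of 10^n milliseconds. */
		order = status - MODEL_RTAS_EXTENDED_DELAY_MIN;
		for (ms = 1; order > 0; order--)
			ms *= 10;
	}
	return ms;
}

typedef int (*model_fw_call_t)(void *ctx);

/* Retry while the status says "try again"; never sleeps or schedules. */
static int model_call_until_done(model_fw_call_t call, void *ctx)
{
	int status;

	do {
		status = call(ctx);
	} while (model_busy_delay_time(status));

	return status;
}

/* Fake firmware for demonstration: busy until *remaining reaches zero. */
static int model_fake_os_term(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return MODEL_RTAS_BUSY;
	}
	return 0;
}
```

The loop ignores the suggested delay and simply retries, which mirrors the patch's use of the return value as a boolean.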



[PATCH 11/13] powerpc/rtas: strengthen do_enter_rtas() type safety, drop inline

2022-11-18 Thread Nathan Lynch
Make do_enter_rtas() take a pointer to struct rtas_args and do the
__pa() conversion in one place instead of leaving it to callers. This
also makes it possible to introduce enter/exit tracepoints that access
the rtas_args struct fields.

There's no apparent reason to force inlining of do_enter_rtas()
either, and it seems to bloat the code a bit. Let the compiler decide.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index a88db3b3486f..198366d641d0 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -522,7 +522,7 @@ static const struct rtas_function *rtas_token_to_function(s32 token)
 /* This is here deliberately so it's only used in this file */
 void enter_rtas(unsigned long);
 
-static inline void do_enter_rtas(unsigned long args)
+static void do_enter_rtas(struct rtas_args *args)
 {
unsigned long msr;
 
@@ -537,7 +537,7 @@ static inline void do_enter_rtas(unsigned long args)
 
hard_irq_disable(); /* Ensure MSR[EE] is disabled on PPC64 */
 
-   enter_rtas(args);
+   enter_rtas(__pa(args));
 
srr_regs_clobbered(); /* rtas uses SRRs, invalidate */
 }
@@ -908,7 +908,7 @@ static char *__fetch_rtas_last_error(char *altbuf)
save_args = rtas.args;
rtas.args = err_args;
 
-   do_enter_rtas(__pa(&rtas.args));
+   do_enter_rtas(&rtas.args);
 
err_args = rtas.args;
rtas.args = save_args;
@@ -955,7 +955,7 @@ va_rtas_call_unlocked(struct rtas_args *args, int token, int nargs, int nret,
for (i = 0; i < nret; ++i)
args->rets[i] = 0;
 
-   do_enter_rtas(__pa(args));
+   do_enter_rtas(args);
 }
 
 void rtas_call_unlocked(struct rtas_args *args, int token, int nargs, int nret, ...)
@@ -1731,7 +1731,7 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
flags = lock_rtas();
 
rtas.args = args;
-   do_enter_rtas(__pa(&rtas.args));
+   do_enter_rtas(&rtas.args);
args = rtas.args;
 
/* A -1 return code indicates that the last command couldn't
-- 
2.37.1
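
The type-safety argument can be shown with a small userspace model: when the wrapper takes a typed pointer and performs the address conversion itself, a caller cannot forget or repeat the conversion, and the compiler rejects an unrelated pointer type. Everything here is hypothetical: the struct is a cut-down stand-in, and model_pa() uses a made-up fixed offset in place of __pa().

```c
#include <stdint.h>

/* Cut-down stand-in for the argument block; fields are illustrative. */
struct model_rtas_args {
	uint32_t token;
	uint32_t nargs;
	uint32_t nret;
};

/* Made-up translation: "physical" = virtual minus a fixed offset. */
#define MODEL_PAGE_OFFSET 0x1000u

static uintptr_t model_pa(const void *vaddr)
{
	return (uintptr_t)vaddr - MODEL_PAGE_OFFSET;
}

/* Records what the low-level entry point was handed, for inspection. */
static uintptr_t model_last_entered;

/* Low-level entry: takes an untyped "physical" address. */
static void model_enter_rtas(uintptr_t args_pa)
{
	model_last_entered = args_pa;
}

/* Typed wrapper: the conversion lives in exactly one place. */
static void model_do_enter_rtas(struct model_rtas_args *args)
{
	model_enter_rtas(model_pa(args));
}
```

With the old unsigned long signature, passing an already-converted or unconverted address would both compile; with the typed wrapper only a struct pointer does.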



[PATCH 09/13] powerpc/rtas: mandate RTAS syscall filtering

2022-11-18 Thread Nathan Lynch
CONFIG_PPC_RTAS_FILTER has been optional but default-enabled since its
introduction. It's been enabled in enterprise distro kernels for a
while without causing ABI breakage that wasn't easily fixed, and it
prevents harmful abuses of the rtas syscall.

Let's make it unconditional.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/Kconfig   | 13 -
 arch/powerpc/kernel/rtas.c | 16 
 2 files changed, 29 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2ca5418457ed..8092915a4e9b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -1012,19 +1012,6 @@ config PPC_SECVAR_SYSFS
  read/write operations on these variables. Say Y if you have
  secure boot enabled and want to expose variables to userspace.
 
-config PPC_RTAS_FILTER
-   bool "Enable filtering of RTAS syscalls"
-   default y
-   depends on PPC_RTAS
-   help
- The RTAS syscall API has security issues that could be used to
- compromise system integrity. This option enforces restrictions on the
- RTAS calls and arguments passed by userspace programs to mitigate
- these issues.
-
- Say Y unless you know what you are doing and the filter is causing
- problems for you.
-
 endmenu
 
 config ISA_DMA_API
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index c3142d352f41..3929bcea92c0 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1051,8 +1051,6 @@ noinstr struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log
return NULL;
 }
 
-#ifdef CONFIG_PPC_RTAS_FILTER
-
 /*
  * The sys_rtas syscall, as originally designed, allows root to pass
  * arbitrary physical addresses to RTAS calls. A number of RTAS calls
@@ -1201,20 +1199,6 @@ static void __init rtas_syscall_filter_init(void)
rtas_filters[i].token = rtas_token(rtas_filters[i].name);
 }
 
-#else
-
-static bool block_rtas_call(int token, int nargs,
-   struct rtas_args *args)
-{
-   return false;
-}
-
-static void __init rtas_syscall_filter_init(void)
-{
-}
-
-#endif /* CONFIG_PPC_RTAS_FILTER */
-
 /* We assume to be passed big endian arguments */
 SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 {
-- 
2.37.1



[PATCH 13/13] powerpc/rtas: place tracepoints in do_enter_rtas()

2022-11-18 Thread Nathan Lynch
Call the just-added rtas tracepoints in do_enter_rtas(), taking care
to avoid function name lookups in the CPU offline path.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 198366d641d0..3487b42cfbf7 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 enum rtas_function_flags {
@@ -525,6 +526,7 @@ void enter_rtas(unsigned long);
 static void do_enter_rtas(struct rtas_args *args)
 {
unsigned long msr;
+   const char *name = NULL;
 
/*
 * Make sure MSR[RI] is currently enabled as it will be forced later
@@ -537,9 +539,30 @@ static void do_enter_rtas(struct rtas_args *args)
 
hard_irq_disable(); /* Ensure MSR[EE] is disabled on PPC64 */
 
+   if ((trace_rtas_input_enabled() || trace_rtas_output_enabled())) {
+   /*
+* rtas_token_to_function() uses xarray which uses RCU,
+* but this code can run in the CPU offline path
+* (e.g. stop-self), after it's become invalid to call
+* RCU APIs.
+*/
+   if (cpu_online(smp_processor_id())) {
+   const s32 token = be32_to_cpu(args->token);
+   const struct rtas_function *func = rtas_token_to_function(token);
+
+   name = func->name;
+   }
+   }
+
+   trace_rtas_input(args, name);
+   trace_rtas_ll_entry(args);
+
enter_rtas(__pa(args));
 
srr_regs_clobbered(); /* rtas uses SRRs, invalidate */
+
+   trace_rtas_ll_exit(args);
+   trace_rtas_output(args, name);
 }
 
 struct rtas_t rtas = {
-- 
2.37.1



[PATCH 05/13] powerpc/pseries/eeh: use correct API for error log size

2022-11-18 Thread Nathan Lynch
rtas-error-log-max is not the name of an RTAS function, so
rtas_token() is not the appropriate API for retrieving its value. We
already have rtas_get_error_log_max() which returns a sensible value
if the property is absent for any reason, so use that instead.

Signed-off-by: Nathan Lynch 
Fixes: 8d633291b4fc ("powerpc/eeh: pseries platform EEH error log retrieval")
---
 arch/powerpc/platforms/pseries/eeh_pseries.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 8e40ccac0f44..e5e4f4aa5afd 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -848,7 +848,7 @@ static int __init eeh_pseries_init(void)
}
 
/* Initialize error log size */
-   eeh_error_buf_size = rtas_token("rtas-error-log-max");
+   eeh_error_buf_size = rtas_get_error_log_max();
if (eeh_error_buf_size == RTAS_UNKNOWN_SERVICE) {
pr_info("%s: unknown EEH error log size\n",
__func__);
-- 
2.37.1



[PATCH 12/13] powerpc/tracing: tracepoints for RTAS entry and exit

2022-11-18 Thread Nathan Lynch
Add two sets of tracepoints to be used around RTAS entry:

* rtas_input/rtas_output, which emit the function name, its inputs,
  the returned status, and any other outputs. These produce an API-level
  record of OS<->RTAS activity.

* rtas_ll_entry/rtas_ll_exit, which are lower-level and emit the
  entire contents of the parameter block (aka rtas_args) on entry and
  exit. Likely useful only for debugging.

With uses of these tracepoints in do_enter_rtas() to be added in the
following patch, examples of get-time-of-day and event-scan functions
as rendered by trace-cmd (with some multi-line formatting manually
imposed on the rtas_ll_* entries to avoid extremely long lines in the
commit message):

cat-36800 [059]  4978.518303: rtas_input:   get-time-of-day arguments:
cat-36800 [059]  4978.518306: rtas_ll_entry:token=3 nargs=0 nret=8
params: [0]=0x 
[1]=0x [2]=0x [3]=0x
[4]=0x 
[5]=0x [6]=0x [7]=0x
[8]=0x 
[9]=0x [10]=0x [11]=0x
[12]=0x 
[13]=0x [14]=0x [15]=0x
cat-36800 [059]  4978.518366: rtas_ll_exit: token=3 nargs=0 nret=8
params: [0]=0x 
[1]=0x07e6 [2]=0x000b [3]=0x0001
[4]=0x 
[5]=0x000e [6]=0x0008 [7]=0x2e0dac40
[8]=0x 
[9]=0x [10]=0x [11]=0x
[12]=0x 
[13]=0x [14]=0x [15]=0x
cat-36800 [059]  4978.518366: rtas_output:  get-time-of-day status: 0, other outputs: 2022 11 1 0 14 8 772648000

kworker/39:1-336   [039]  4982.731623: rtas_input:   event-scan arguments: 4294967295 0 80484920 2048
kworker/39:1-336   [039]  4982.731626: rtas_ll_entry:token=6 nargs=4 nret=1
 params: 
[0]=0x [1]=0x [2]=0x04cc1a38 [3]=0x0800
 
[4]=0x [5]=0x000e [6]=0x0008 [7]=0x2e0dac40
 
[8]=0x [9]=0x [10]=0x [11]=0x
 
[12]=0x [13]=0x [14]=0x [15]=0x
kworker/39:1-336   [039]  4982.731676: rtas_ll_exit: token=6 nargs=4 nret=1
 params: 
[0]=0x [1]=0x [2]=0x04cc1a38 [3]=0x0800
 
[4]=0x0001 [5]=0x000e [6]=0x0008 [7]=0x2e0dac40
 
[8]=0x [9]=0x [10]=0x [11]=0x
 
[12]=0x [13]=0x [14]=0x [15]=0x
kworker/39:1-336   [039]  4982.731677: rtas_output:  event-scan status: 1, other outputs:

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/include/asm/trace.h | 116 +++
 1 file changed, 116 insertions(+)

diff --git a/arch/powerpc/include/asm/trace.h b/arch/powerpc/include/asm/trace.h
index 08cd60cd70b7..e7a301c9eb95 100644
--- a/arch/powerpc/include/asm/trace.h
+++ b/arch/powerpc/include/asm/trace.h
@@ -119,6 +119,122 @@ TRACE_EVENT_FN_COND(hcall_exit,
 );
 #endif
 
+#ifdef CONFIG_PPC_RTAS
+
+#include 
+
+/*
+ * Since stop-self is how CPUs go offline on RTAS platforms,
+ * these tracepoints are conditional.
+ */
+
+TRACE_EVENT_CONDITION(rtas_input,
+
+   TP_PROTO(struct rtas_args *rtas_args, const char *name),
+
+   TP_ARGS(rtas_args, name),
+
+   TP_CONDITION(cpu_online(raw_smp_processor_id())),
+
+   TP_STRUCT__entry(
+   __field(__u32, nargs)
+   __string(name, name)
+   __dynamic_array(__u32, inputs, be32_to_cpu(rtas_args->nargs))
+   ),
+
+   TP_fast_assign(
+   __entry->nargs = be32_to_cpu(rtas_args->nargs);
+   __assign_str(name, name);
+   be32_to_cpu_array(__get_dynamic_array(inputs), rtas_args->args, __entry->nargs);
+   ),
+
+   TP_printk("%s arguments: %s", __get_str(name),
+ __print_array(__get_dynamic_array(inputs), __entry->nargs, 4)
+   )
+);
+
+TRACE_EVENT_CONDITION(rtas_output,
+
+   TP_PROTO(struct rtas_args *rtas_args, const char *name),
+
+   TP_ARGS(rtas_args, name),
+
+   TP_CONDITION(cpu_online(raw_smp_proce

[PATCH 08/13] powerpc/rtas: define pr_fmt and convert printk call sites

2022-11-18 Thread Nathan Lynch
Set pr_fmt to "rtas: " and convert the handful of printk() uses in
rtas.c, adjusting the messages to remove now-redundant "RTAS"
strings.

Note that rtas_restart(), rtas_power_off(), and rtas_halt() all
currently use printk() without specifying a log level. These have been
changed to use pr_emerg(), which matches the behavior of
rtas_os_term().
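
The effect of defining pr_fmt can be illustrated outside the kernel. The
following is a minimal userspace sketch, not kernel code: demo_pr() and
demo_unexpected_error() are hypothetical names, and the message is
formatted into a caller-supplied buffer instead of the kernel log. The
assumption being modelled is that the pr_*() helpers pass their format
string through pr_fmt(), so the prefix is attached by string literal
concatenation at compile time.

```c
#include <stdio.h>

/* Must be defined before the logging macro is used, as in rtas.c. */
#define pr_fmt(fmt) "rtas: " fmt

/*
 * Hypothetical stand-in for pr_err()/pr_emerg(): formats into buf
 * rather than the kernel log. pr_fmt() prepends the prefix to the
 * format string literal.
 */
#define demo_pr(buf, len, fmt, ...) \
	snprintf((buf), (len), pr_fmt(fmt), __VA_ARGS__)

/* Mirrors the converted message in rtas_error_rc(). */
static int demo_unexpected_error(char *buf, size_t len, int rtas_rc)
{
	return demo_pr(buf, len, "%s: unexpected error %d\n", __func__, rtas_rc);
}
```

With the prefix supplied by pr_fmt, the message text itself no longer
needs to say "RTAS", which is why the converted call sites drop it.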

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 7a5812624e11..c3142d352f41 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -7,6 +7,8 @@
  * Copyright (C) 2001 IBM.
  */
 
+#define pr_fmt(fmt)"rtas: " fmt
+
 #include 
 #include 
 #include 
@@ -718,8 +720,7 @@ static int rtas_error_rc(int rtas_rc)
rc = -ENODEV;
break;
default:
-   printk(KERN_ERR "%s: unexpected RTAS error %d\n",
-   __func__, rtas_rc);
+   pr_err("%s: unexpected error %d\n", __func__, rtas_rc);
rc = -ERANGE;
break;
}
@@ -923,8 +924,8 @@ void __noreturn rtas_restart(char *cmd)
 {
if (rtas_flash_term_hook)
rtas_flash_term_hook(SYS_RESTART);
-   printk("RTAS system-reboot returned %d\n",
-  rtas_call(rtas_token("system-reboot"), 0, 1, NULL));
+   pr_emerg("system-reboot returned %d\n",
+rtas_call(rtas_token("system-reboot"), 0, 1, NULL));
for (;;);
 }
 
@@ -933,8 +934,8 @@ void rtas_power_off(void)
if (rtas_flash_term_hook)
rtas_flash_term_hook(SYS_POWER_OFF);
/* allow power on only with power button press */
-   printk("RTAS power-off returned %d\n",
-  rtas_call(rtas_token("power-off"), 2, 1, NULL, -1, -1));
+   pr_emerg("power-off returned %d\n",
+rtas_call(rtas_token("power-off"), 2, 1, NULL, -1, -1));
for (;;);
 }
 
@@ -943,8 +944,8 @@ void __noreturn rtas_halt(void)
if (rtas_flash_term_hook)
rtas_flash_term_hook(SYS_HALT);
/* allow power on only with power button press */
-   printk("RTAS power-off returned %d\n",
-  rtas_call(rtas_token("power-off"), 2, 1, NULL, -1, -1));
+   pr_emerg("power-off returned %d\n",
+rtas_call(rtas_token("power-off"), 2, 1, NULL, -1, -1));
for (;;);
 }
 
@@ -979,7 +980,7 @@ void rtas_os_term(char *str)
} while (rtas_busy_delay_time(status));
 
if (status != 0)
-   printk(KERN_EMERG "ibm,os-term call failed %d\n", status);
+   pr_emerg("ibm,os-term call failed %d\n", status);
 }
 
 /**
-- 
2.37.1



[PATCH 10/13] powerpc/rtas: improve function information lookups

2022-11-18 Thread Nathan Lynch
The core RTAS support code and its clients perform two types of lookup
for RTAS firmware function information.

First, mapping a known function name to a token. The typical use case
invokes rtas_token() to retrieve the token value to pass to
rtas_call(). rtas_token() relies on of_get_property(), which performs
a linear search of the /rtas node's property list under a lock with
IRQs disabled.

Second, and less common: given a token value, looking up some
information about the function. The primary example is the sys_rtas
filter path, which linearly scans a small table to match the token to
a rtas_filter struct. Another use case to come is RTAS entry/exit
tracepoints, which will require efficient lookup of function names
from token values. Currently there is no general API for this.

We need something much like the existing rtas_filters table, but more
general and organized to facilitate efficient lookups.

Introduce:

* A new rtas_function type, aggregating function name, token,
  and filter. Other function characteristics could be added in the
  future.

* An array of rtas_function, where each element corresponds to a known
  RTAS function. All information in the table is static save the token
  values, which are derived from the device tree at boot. The array is
  sorted by function name to allow binary search.

* A named constant for each known RTAS function, used to index the
  function array. These also will be used in a client-facing API to be
  added later.

* An xarray that maps valid tokens to rtas_function objects.

Fold the existing rtas_filter table into the new rtas_function array,
with the appropriate adjustments to block_rtas_call(). Remove
now-redundant fields from struct rtas_filter.

Convert rtas_token() to use a lockless binary search on the function
table. Fall back to the old behavior for lookups against names that
are not known to be RTAS functions, but issue a warning. rtas_token()
is for function names; it is not a general facility for accessing
arbitrary properties of the /rtas node. All known misuses of
rtas_token() have been converted to more appropriate of_ APIs in
preceding changes.
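
The two lookup directions described above can be sketched in userspace
C. This is an illustration under stated assumptions, not the code in
this patch: all demo_* names are hypothetical, the table holds only four
entries, tokens 3 and 6 are taken from the trace output quoted in patch
12 while 42 is invented, and the token-to-function direction uses a
plain linear scan as a stand-in for the xarray.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_UNKNOWN_SERVICE (-1)

struct demo_rtas_function {
	const char *name;
	int token;	/* derived from the device tree at boot in the real code */
};

/* Sorted by name so that bsearch() can be used. Token values are examples. */
static const struct demo_rtas_function demo_functions[] = {
	{ "check-exception", DEMO_UNKNOWN_SERVICE },
	{ "event-scan",      6 },
	{ "get-time-of-day", 3 },
	{ "ibm,os-term",     42 },
};

#define DEMO_NR_FUNCTIONS (sizeof(demo_functions) / sizeof(demo_functions[0]))

static int demo_cmp_name(const void *key, const void *elem)
{
	const struct demo_rtas_function *func = elem;

	return strcmp(key, func->name);
}

/* Name -> token: binary search over the static table, no locking. */
static int demo_rtas_token(const char *name)
{
	const struct demo_rtas_function *func;

	func = bsearch(name, demo_functions, DEMO_NR_FUNCTIONS,
		       sizeof(demo_functions[0]), demo_cmp_name);
	return func ? func->token : DEMO_UNKNOWN_SERVICE;
}

/* Token -> name: linear scan standing in for the xarray lookup. */
static const char *demo_rtas_token_to_name(int token)
{
	size_t i;

	if (token == DEMO_UNKNOWN_SERVICE)
		return NULL;

	for (i = 0; i < DEMO_NR_FUNCTIONS; i++) {
		if (demo_functions[i].token == token)
			return demo_functions[i].name;
	}
	return NULL;
}
```

In this sketch a name that is not an RTAS function, such as
"rtas-error-log-max", simply yields DEMO_UNKNOWN_SERVICE; the patch
instead falls back to the old property lookup and issues a warning.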

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/include/asm/rtas.h |  87 
 arch/powerpc/kernel/rtas.c  | 735 +++-
 2 files changed, 709 insertions(+), 113 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 479a95cb2770..14fe79217c26 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -16,6 +16,93 @@
  * Copyright (C) 2001 PPC 64 Team, IBM Corp
  */
 
+#define rtas_fnidx(x_) RTAS_FNIDX__ ## x_
+
+enum rtas_function_index {
+   rtas_fnidx(CHECK_EXCEPTION),
+   rtas_fnidx(DISPLAY_CHARACTER),
+   rtas_fnidx(EVENT_SCAN),
+   rtas_fnidx(FREEZE_TIME_BASE),
+   rtas_fnidx(GET_POWER_LEVEL),
+   rtas_fnidx(GET_SENSOR_STATE),
+   rtas_fnidx(GET_TERM_CHAR),
+   rtas_fnidx(GET_TIME_OF_DAY),
+   rtas_fnidx(IBM_ACTIVATE_FIRMWARE),
+   rtas_fnidx(IBM_CBE_START_PTCAL),
+   rtas_fnidx(IBM_CBE_STOP_PTCAL),
+   rtas_fnidx(IBM_CHANGE_MSI),
+   rtas_fnidx(IBM_CLOSE_ERRINJCT),
+   rtas_fnidx(IBM_CONFIGURE_BRIDGE),
+   rtas_fnidx(IBM_CONFIGURE_CONNECTOR),
+   rtas_fnidx(IBM_CONFIGURE_KERNEL_DUMP),
+   rtas_fnidx(IBM_CONFIGURE_PE),
+   rtas_fnidx(IBM_CREATE_PE_DMA_WINDOW),
+   rtas_fnidx(IBM_DISPLAY_MESSAGE),
+   rtas_fnidx(IBM_ERRINJCT),
+   rtas_fnidx(IBM_EXTI2C),
+   rtas_fnidx(IBM_GET_CONFIG_ADDR_INFO),
+   rtas_fnidx(IBM_GET_CONFIG_ADDR_INFO2),
+   rtas_fnidx(IBM_GET_DYNAMIC_SENSOR_STATE),
+   rtas_fnidx(IBM_GET_INDICES),
+   rtas_fnidx(IBM_GET_RIO_TOPOLOGY),
+   rtas_fnidx(IBM_GET_SYSTEM_PARAMETER),
+   rtas_fnidx(IBM_GET_VPD),
+   rtas_fnidx(IBM_GET_XIVE),
+   rtas_fnidx(IBM_INT_OFF),
+   rtas_fnidx(IBM_INT_ON),
+   rtas_fnidx(IBM_IO_QUIESCE_ACK),
+   rtas_fnidx(IBM_LPAR_PERFTOOLS),
+   rtas_fnidx(IBM_MANAGE_FLASH_IMAGE),
+   rtas_fnidx(IBM_MANAGE_STORAGE_PRESERVATION),
+   rtas_fnidx(IBM_NMI_INTERLOCK),
+   rtas_fnidx(IBM_NMI_REGISTER),
+   rtas_fnidx(IBM_OPEN_ERRINJCT),
+   rtas_fnidx(IBM_OPEN_SRIOV_ALLOW_UNFREEZE),
+   rtas_fnidx(IBM_OPEN_SRIOV_MAP_PE_NUMBER),
+   rtas_fnidx(IBM_OS_TERM),
+   rtas_fnidx(IBM_PARTNER_CONTROL),
+   rtas_fnidx(IBM_PHYSICAL_ATTESTATION),
+   rtas_fnidx(IBM_PLATFORM_DUMP),
+   rtas_fnidx(IBM_POWER_OFF_UPS),
+   rtas_fnidx(IBM_QUERY_INTERRUPT_SOURCE_NUMBER),
+   rtas_fnidx(IBM_QUERY_PE_DMA_WINDOW),
+   rtas_fnidx(IBM_READ_PCI_CONFIG),
+   rtas_fnidx(IBM_READ_SLOT_RESET_STATE),
+   rtas_fnidx(IBM_READ_SLOT_RESET_STATE2),
+   rtas_fnidx(IBM_REMOVE_PE_DMA_WINDOW),
+   rtas_fnidx(IBM_RESET_PE_DMA_WINDOWS),
+   rtas_fnidx(IBM_SCAN_LOG_DUMP),
+   rtas_fnidx(IBM_SET_DYNAMIC_INDICATOR),
+   rtas_fnidx(IBM_SET_EEH_OPTION),
+   rtas_fnidx(IBM_SET_SLOT_RESET),
+  

[PATCH 07/13] powerpc/rtas: clean up includes

2022-11-18 Thread Nathan Lynch
rtas.c used to host complex code related to pseries-specific guest
migration and suspend, which used atomics, completions, hcalls, and
CPU hotplug APIs. That's all been deleted or moved, so remove the
include directives that have been rendered unnecessary. Sort the
remainder (with linux/ before asm/) to impose some order on where
future additions go.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 42 +++---
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 3fa84c247415..7a5812624e11 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -7,43 +7,33 @@
  * Copyright (C) 2001 IBM.
  */
 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
 #include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
 #include 
-#include 
+#include 
+#include 
 #include 
+#include 
 #include 
+#include 
+#include 
+#include 
 #include 
-#include 
-#include 
+#include 
+#include 
 
+#include 
+#include 
 #include 
-#include 
-#include 
 #include 
-#include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
 #include 
-#include 
-#include 
+#include 
 
 /* This is here deliberately so it's only used in this file */
 void enter_rtas(unsigned long);
-- 
2.37.1



[PATCH 00/13] RTAS maintenance

2022-11-18 Thread Nathan Lynch
A collection of loosely-related RTAS code changes, most notably:

* Fixing misuses of rtas_token() for non-function properties.
* The stronger validation for sys_rtas() offered by the
  PPC_RTAS_FILTER config option is always enabled.
* Improved function token lookups, including efficient "reverse"
  token-to-name mappings.
* Static tracepoints for RTAS entry and exit.

Nathan Lynch (13):
  powerpc/rtas: document rtas_call()
  powerpc/rtasd: use correct OF API for event scan rate
  powerpc/rtas: avoid device tree lookups in rtas_os_term()
  powerpc/rtas: avoid scheduling in rtas_os_term()
  powerpc/pseries/eeh: use correct API for error log size
  powerpc/rtas: clean up rtas_error_log_max initialization
  powerpc/rtas: clean up includes
  powerpc/rtas: define pr_fmt and convert printk call sites
  powerpc/rtas: mandate RTAS syscall filtering
  powerpc/rtas: improve function information lookups
  powerpc/rtas: strengthen do_enter_rtas() type safety, drop inline
  powerpc/tracing: tracepoints for RTAS entry and exit
  powerpc/rtas: place tracepoints in do_enter_rtas()

 arch/powerpc/Kconfig |  13 -
 arch/powerpc/include/asm/rtas.h  | 102 +-
 arch/powerpc/include/asm/trace.h | 116 +++
 arch/powerpc/kernel/rtas.c   | 961 +++
 arch/powerpc/kernel/rtasd.c  |   7 +-
 arch/powerpc/platforms/pseries/eeh_pseries.c |   2 +-
 6 files changed, 986 insertions(+), 215 deletions(-)

-- 
2.37.1



[PATCH 06/13] powerpc/rtas: clean up rtas_error_log_max initialization

2022-11-18 Thread Nathan Lynch
The code in rtas_get_error_log_max() doesn't cause problems in
practice, but there are no measures to ensure that the lazy
initialization of the static rtas_error_log_max variable is atomic,
and it's not worth adding them.

Initialize the static rtas_error_log_max variable at boot when we're
single-threaded instead of lazily on first use. Use the more
appropriate of_property_read_u32() API instead of rtas_token() to
consult the "rtas-error-log-max" property, which is not the name of an
RTAS function. Convert use of printk() to pr_warn() and distinguish
the possible error cases.
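
The boot-time initialization and clamping can be modelled in a few lines
of userspace C. This is a sketch under assumptions: the demo_* names are
hypothetical, the have_prop/prop_val parameters stand in for the result
of the device tree read, and DEMO_ERROR_LOG_MAX is assumed to mirror
RTAS_ERROR_LOG_MAX with a value of 2048.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ERROR_LOG_MAX 2048u	/* assumed value of RTAS_ERROR_LOG_MAX */

static uint32_t demo_error_log_max = DEMO_ERROR_LOG_MAX;

/*
 * have_prop/prop_val model the outcome of reading "rtas-error-log-max"
 * from the device tree. Intended to run once, during single-threaded
 * boot.
 */
static void demo_init_error_log_max(bool have_prop, uint32_t prop_val)
{
	/* Property absent: fall back to the default. */
	uint32_t max = have_prop ? prop_val : DEMO_ERROR_LOG_MAX;

	/* Property too large: clamp to the buffer size we can handle. */
	if (max > DEMO_ERROR_LOG_MAX)
		max = DEMO_ERROR_LOG_MAX;

	demo_error_log_max = max;
}

/* After boot the getter is a plain read, so there is no lazy-init race. */
static uint32_t demo_get_error_log_max(void)
{
	return demo_error_log_max;
}
```

Handling the two failure cases separately is what lets the real code
print a distinct warning for each.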

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 37 ++---
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 51f0508593a7..3fa84c247415 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -353,6 +353,9 @@ int rtas_service_present(const char *service)
 EXPORT_SYMBOL(rtas_service_present);
 
 #ifdef CONFIG_RTAS_ERROR_LOGGING
+
+static u32 rtas_error_log_max __ro_after_init = RTAS_ERROR_LOG_MAX;
+
 /*
  * Return the firmware-specified size of the error log buffer
  *  for all rtas calls that require an error buffer argument.
@@ -360,21 +363,30 @@ EXPORT_SYMBOL(rtas_service_present);
  */
 int rtas_get_error_log_max(void)
 {
-   static int rtas_error_log_max;
-   if (rtas_error_log_max)
-   return rtas_error_log_max;
-
-   rtas_error_log_max = rtas_token ("rtas-error-log-max");
-   if ((rtas_error_log_max == RTAS_UNKNOWN_SERVICE) ||
-   (rtas_error_log_max > RTAS_ERROR_LOG_MAX)) {
-   printk (KERN_WARNING "RTAS: bad log buffer size %d\n",
-   rtas_error_log_max);
-   rtas_error_log_max = RTAS_ERROR_LOG_MAX;
-   }
return rtas_error_log_max;
 }
 EXPORT_SYMBOL(rtas_get_error_log_max);
 
+static void __init init_error_log_max(void)
+{
+   static const char propname[] __initconst = "rtas-error-log-max";
+   u32 max;
+
+   if (of_property_read_u32(rtas.dev, propname, &max)) {
+   pr_warn("%s not found, using default of %u\n",
+   propname, RTAS_ERROR_LOG_MAX);
+   max = RTAS_ERROR_LOG_MAX;
+   }
+
+   if (max > RTAS_ERROR_LOG_MAX) {
+   pr_warn("%s = %u, clamping max error log size to %u\n",
+   propname, max, RTAS_ERROR_LOG_MAX);
+   max = RTAS_ERROR_LOG_MAX;
+   }
+
+   rtas_error_log_max = max;
+}
+
 
 static char rtas_err_buf[RTAS_ERROR_LOG_MAX];
 static int rtas_last_error_token;
@@ -432,6 +444,7 @@ static char *__fetch_rtas_last_error(char *altbuf)
 #else /* CONFIG_RTAS_ERROR_LOGGING */
 #define __fetch_rtas_last_error(x) NULL
 #define get_errorlog_buffer()  NULL
+static void __init init_error_log_max(void) {}
 #endif
 
 
@@ -1341,6 +1354,8 @@ void __init rtas_initialize(void)
no_entry = of_property_read_u32(rtas.dev, "linux,rtas-entry", &entry);
rtas.entry = no_entry ? rtas.base : entry;
 
+   init_error_log_max();
+
/*
 * Discover these now to avoid device tree lookups in the
 * panic path.
-- 
2.37.1



[PATCH 02/13] powerpc/rtasd: use correct OF API for event scan rate

2022-11-18 Thread Nathan Lynch
rtas_token() should be used only for properties that are RTAS function
tokens. "rtas-event-scan-rate" does not contain a function token, but it
has the same size/format as token properties so reading it with
rtas_token() happens to work.

Convert to of_property_read_u32().

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtasd.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index 5270b450bbde..cc56ac6ba4b0 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -499,6 +500,8 @@ EXPORT_SYMBOL_GPL(rtas_cancel_event_scan);
 
 static int __init rtas_event_scan_init(void)
 {
+   int err;
+
if (!machine_is(pseries) && !machine_is(chrp))
return 0;
 
@@ -509,8 +512,8 @@ static int __init rtas_event_scan_init(void)
return -ENODEV;
}
 
-   rtas_event_scan_rate = rtas_token("rtas-event-scan-rate");
-   if (rtas_event_scan_rate == RTAS_UNKNOWN_SERVICE) {
+   err = of_property_read_u32(rtas.dev, "rtas-event-scan-rate", &rtas_event_scan_rate);
+   if (err) {
printk(KERN_ERR "rtasd: no rtas-event-scan-rate on system\n");
return -ENODEV;
}
-- 
2.37.1



[PATCH 01/13] powerpc/rtas: document rtas_call()

2022-11-18 Thread Nathan Lynch
rtas_call() has a complex calling convention, non-standard return
values, and many users. Add kernel-doc for it and remove the less
structured commentary from rtas.h.
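
As one example of the non-standard return values, the kernel-doc added
by this patch describes statuses 9900...9905 as an advisory delay of
about 10^x ms, where x is the last digit. A minimal userspace sketch of
that arithmetic follows; demo_extended_delay_ms() is a hypothetical name
used for illustration only, and in-kernel callers would normally rely on
rtas_busy_delay() as the kernel-doc recommends.

```c
/*
 * Suggested wait in milliseconds for an extended-delay status
 * (9900..9905), or 0 for any other status. Hypothetical helper;
 * the status range is taken from the rtas_call() kernel-doc.
 */
static unsigned int demo_extended_delay_ms(int status)
{
	unsigned int ms = 1;
	int order;

	if (status < 9900 || status > 9905)
		return 0;

	/* ms = 10^x, where x is the last digit of the status. */
	for (order = status - 9900; order > 0; order--)
		ms *= 10;

	return ms;
}
```

Status -2 deliberately maps to 0 here, matching the note in the
kernel-doc that such a call may be reattempted immediately.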

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/include/asm/rtas.h | 15 -
 arch/powerpc/kernel/rtas.c  | 58 +
 2 files changed, 58 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index 56319aea646e..479a95cb2770 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -33,21 +33,6 @@
 #define RTAS_THREADS_ACTIVE -9005 /* Multiple processor threads active */
 #define RTAS_OUTSTANDING_COPROC -9006 /* Outstanding coprocessor operations */
 
-/*
- * In general to call RTAS use rtas_token("string") to lookup
- * an RTAS token for the given string (e.g. "event-scan").
- * To actually perform the call use
- *ret = rtas_call(token, n_in, n_out, ...)
- * Where n_in is the number of input parameters and
- *   n_out is the number of output parameters
- *
- * If the "string" is invalid on this system, RTAS_UNKNOWN_SERVICE
- * will be returned as a token.  rtas_call() does look for this
- * token and error out gracefully so rtas_call(rtas_token("str"), ...)
- * may be safely used for one-shot calls to RTAS.
- *
- */
-
 /* RTAS event classes */
 #define RTAS_INTERNAL_ERROR0x8000 /* set bit 0 */
 #define RTAS_EPOW_WARNING  0x4000 /* set bit 1 */
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index e847f9b1c5b9..c12dd5ed5e00 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -467,6 +467,64 @@ void rtas_call_unlocked(struct rtas_args *args, int token, int nargs, int nret,
 static int ibm_open_errinjct_token;
 static int ibm_errinjct_token;
 
+/**
+ * rtas_call() - Invoke an RTAS firmware function.
+ * @token: Identifies the function being invoked.
+ * @nargs: Number of input parameters. Does not include token.
+ * @nret: Number of output parameters, including the call status.
+ * @outputs: Array of @nret output words.
+ * @: List of @nargs input parameters.
+ *
+ * Invokes the RTAS function indicated by @token, which the caller
+ * should obtain via rtas_token().
+ *
+ * The @nargs and @nret arguments must match the number of input and
+ * output parameters specified for the RTAS function.
+ *
+ * rtas_call() returns RTAS status codes, not conventional Linux errno
+ * values. Callers must translate any failure to an appropriate errno
+ * in syscall context. Most callers of RTAS functions that can return
+ * -2 or 990x should use rtas_busy_delay() to correctly handle those
+ * statuses before calling again.
+ *
+ * The return value descriptions are adapted from 7.2.8 [RTAS] Return
+ * Codes of the PAPR and CHRP specifications.
+ *
+ * Context: Process context preferably, interrupt context if
+ *  necessary.  Acquires an internal spinlock and may perform
+ *  GFP_ATOMIC slab allocation in error path. Unsafe for NMI
+ *  context.
+ * Return:
+ * *  0 - RTAS function call succeeded.
+ * * -1 - RTAS function encountered a hardware or
+ *        platform error, or the token is invalid,
+ *        or the function is restricted by kernel policy.
+ * * -2 - Specs say "A necessary hardware device was busy,
+ *        and the requested function could not be
+ *        performed. The operation should be retried at
+ *        a later time." This is misleading, at least with
+ *        respect to current RTAS implementations. What it
+ *        usually means in practice is that the function
+ *        could not be completed while meeting RTAS's
+ *        deadline for returning control to the OS (250us
+ *        for PAPR/PowerVM, typically), but the call may be
+ *        immediately reattempted to resume work on it.
+ * * -3 - Parameter error.
+ * * -7 - Unexpected state change.
+ * *9000...9899 - Vendor-specific success codes.
+ * *9900...9905 - Advisory extended delay. Caller should try
+ *        again after ~10^x ms has elapsed, where x is
+ *        the last digit of the status [0-5]. Again going
+ *        beyond the PAPR text, 990x on PowerVM indicates
+ *        contention for RTAS-internal resources. Other
+ *        RTAS call sequences in progress should be
+ *        allowed to complete before reattempting the
+ *        call.

[PATCH 03/13] powerpc/rtas: avoid device tree lookups in rtas_os_term()

2022-11-18 Thread Nathan Lynch
rtas_os_term() is called during panic. Its behavior depends on a
couple of conditions in the /rtas node of the device tree, the
traversal of which entails locking and local IRQ state changes. If the
kernel panics while devtree_lock is held, rtas_os_term() as currently
written could hang.

Instead of discovering the relevant characteristics at panic time,
cache them in file-static variables at boot. Note the lookup for
"ibm,extended-os-term" is converted to of_property_read_bool() since
it is a boolean property, not a RTAS function token.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index c12dd5ed5e00..81e4996012b7 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -947,6 +947,8 @@ void __noreturn rtas_halt(void)
 
 /* Must be in the RMO region, so we place it here */
 static char rtas_os_term_buf[2048];
+static s32 ibm_os_term_token = RTAS_UNKNOWN_SERVICE;
+static bool ibm_extended_os_term;
 
 void rtas_os_term(char *str)
 {
@@ -958,14 +960,13 @@ void rtas_os_term(char *str)
 * this property may terminate the partition which we want to avoid
 * since it interferes with panic_timeout.
 */
-   if (RTAS_UNKNOWN_SERVICE == rtas_token("ibm,os-term") ||
-   RTAS_UNKNOWN_SERVICE == rtas_token("ibm,extended-os-term"))
+   if (ibm_os_term_token == RTAS_UNKNOWN_SERVICE || !ibm_extended_os_term)
return;
 
snprintf(rtas_os_term_buf, 2048, "OS panic: %s", str);
 
do {
-   status = rtas_call(rtas_token("ibm,os-term"), 1, 1, NULL,
+   status = rtas_call(ibm_os_term_token, 1, 1, NULL,
   __pa(rtas_os_term_buf));
} while (rtas_busy_delay(status));
 
@@ -1335,6 +1336,13 @@ void __init rtas_initialize(void)
no_entry = of_property_read_u32(rtas.dev, "linux,rtas-entry", &entry);
rtas.entry = no_entry ? rtas.base : entry;
 
+   /*
+* Discover these now to avoid device tree lookups in the
+* panic path.
+*/
+   ibm_os_term_token = rtas_token("ibm,os-term");
+   ibm_extended_os_term = of_property_read_bool(rtas.dev, "ibm,extended-os-term");
+
/* If RTAS was found, allocate the RMO buffer for it and look for
 * the stop-self token if any
 */
-- 
2.37.1



Re: [PATCH printk v5 00/40] reduce console_lock scope

2022-11-18 Thread Petr Mladek
On Fri 2022-11-18 12:22:58, Petr Mladek wrote:
> On Wed 2022-11-16 17:27:12, John Ogness wrote:
> > This is v5 of a series to prepare for threaded/atomic
> > printing. v4 is here [0]. This series focuses on reducing the
> > scope of the BKL console_lock. It achieves this by switching to
> > SRCU and a dedicated mutex for console list iteration and
> > modification, respectively. The console_lock will no longer
> > offer this protection.
> 
> The patchset looks ready for linux-next from my POV.
> 
> I am going to push it there right now to get as much testing
> as possible before the merge window.

JFYI, the patchset is committed in printk/linux.git,
branch rework/console-list-lock.

I'll eventually merge it into rework/kthreads. But I wanted to have
it separated until it gets some more testing in linux-next and
eventually some more review.

Best Regards,
Petr


Re: [patch 23/39] PCI/MSI: Move pci_alloc_irq_vectors_affinity() to api.c

2022-11-18 Thread Peter Zijlstra
On Fri, Nov 18, 2022 at 01:34:12PM +0100, Ahmed S. Darwish wrote:
> On Wed, Nov 16, 2022 at 10:23:22AM -0600, Bjorn Helgaas wrote:
> > On Fri, Nov 11, 2022 at 02:54:51PM +0100, Thomas Gleixner wrote:
> ...
> > > +
> > > +/**
> > > + * pci_alloc_irq_vectors_affinity() - Allocate multiple device interrupt
> > > + *vectors with affinity requirements
> > > + * @dev:  the PCI device to operate on
> > > + * @min_vecs: minimum required number of vectors (must be >= 1)
> > > + * @max_vecs: maximum desired number of vectors
> > > + * @flags:allocation flags, as in pci_alloc_irq_vectors()
> > > + * @affd: affinity requirements (can be %NULL).
> > > + *
> > > + * Same as pci_alloc_irq_vectors(), but with the extra @affd parameter.
> > > + * Check that function docs, and &struct irq_affinity, for more details.
> >
> > Is "&struct irq_affinity" some kernel-doc syntax, or is the "&"
> > superfluous?
> >
> 
> Hmmm, I stole it from Documentation/doc-guide/kernel-doc.rst. htmldoc
> parses it and generates a link to the referenced structure's kernel-doc.
> 
> But, yeah, this was literally the first usage of such a doc pattern in
> the entire kernel's C code :)

Perhaps then not start with it and instead try and convince John to make
his script more clever -- this same script already recognises functions
by their () suffix, might as well also key off the 'struct' keyword, no?

This is a Code comment, to be read in a text editor. That & is a syntax
error.



Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc

2022-11-18 Thread Christophe Leroy


On 18/11/2022 at 10:39, Hari Bathini wrote:
> 
> 
> On 18/11/22 2:21 pm, Christophe Leroy wrote:
> I had the same config but hit this problem:
>
>    # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf
>    test_bpf: #0 TAX
>    [ cut here ]
>    WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367
> bpf_int_jit_compile+0x8a0/0x9f8

 I get no such problem, on QEMU, and I checked the .config has:
>>>
 CONFIG_STRICT_KERNEL_RWX=y
 CONFIG_STRICT_MODULE_RWX=y
>>>
>>> Yeah. That did the trick.
>>
>> Interesting. I guess we have to find out why it fails when those config
>> are missing.
>>
>> Maybe module code plays with RO and NX flags even if
>> CONFIG_STRICT_MODULE_RWX is not selected ?
> 
> Need to look at the code closely but fwiw, observing same failure on
> 64-bit as well with !STRICT_RWX...

The problem is in bpf_prog_pack_alloc() and alloc_new_pack(): they call
set_memory_ro() and set_memory_x() without taking
CONFIG_STRICT_MODULE_RWX into account.

When CONFIG_STRICT_MODULE_RWX is selected, powerpc module_alloc()
allocates PAGE_KERNEL memory, that is RW memory, and expects the user to
call set_memory_ro() and set_memory_x().

But when CONFIG_STRICT_MODULE_RWX is not selected, powerpc
module_alloc() allocates PAGE_KERNEL_TEXT memory, that is RWX memory,
and expects to be able to always write into it.

Christophe


Re: [PATCH printk v5 00/40] reduce console_lock scope

2022-11-18 Thread Petr Mladek
On Wed 2022-11-16 17:27:12, John Ogness wrote:
> This is v5 of a series to prepare for threaded/atomic
> printing. v4 is here [0]. This series focuses on reducing the
> scope of the BKL console_lock. It achieves this by switching to
> SRCU and a dedicated mutex for console list iteration and
> modification, respectively. The console_lock will no longer
> offer this protection.

The patchset looks ready for linux-next from my POV.

I am going to push it there right now to get as much testing
as possible before the merge window.

Any review and comments are still appreciated. We could always
take it back if some critical problems are discovered and
can't be solved easily.

Best Regards,
Petr


[PATCH] powerpc: Use "grep -E" instead of "egrep"

2022-11-18 Thread Tiezhu Yang
The latest version of grep reports that egrep is obsolescent, so the
build now emits warnings such as:
egrep: warning: egrep is obsolescent; using grep -E
Fix this by changing the affected file to use "grep -E" instead.

  sed -i "s/egrep/grep -E/g" `grep egrep -rwl arch/powerpc`

Here are the steps to install the latest grep:

  wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
  tar xf grep-3.8.tar.gz
  cd grep-3.8 && ./configure && make
  sudo make install
  export PATH=/usr/local/bin:$PATH

Signed-off-by: Tiezhu Yang 
---
 arch/powerpc/boot/wrapper | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index 5bdd4dd..a86ae11 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -581,7 +581,7 @@ ps3)
 # reached, then enter the system reset vector of the partially decompressed
 # image.  No warning is issued.
 rm -f "$odir"/{otheros,otheros-too-big}.bld
-size=$(${CROSS}nm --no-sort --radix=d "$ofile" | egrep ' _end$' | cut -d' ' -f1)
+size=$(${CROSS}nm --no-sort --radix=d "$ofile" | grep -E ' _end$' | cut -d' ' -f1)
 bld="otheros.bld"
 if [ $size -gt $((0x100)) ]; then
 bld="otheros-too-big.bld"
-- 
2.1.0
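
For readers who want to see what the changed pipeline extracts, here is a
minimal sketch. It assumes a made-up symbol listing in place of real
"nm --radix=d" output; the addresses and every symbol other than _end
are hypothetical and only illustrate the matching.

```shell
# Hypothetical nm-style listing (decimal addresses are invented).
nm_output='00000000 T _start
16777216 B _end
00004096 T _etext_end'

# Same pipeline shape as in the wrapper, with "grep -E" replacing "egrep".
# The pattern ' _end$' only matches a line whose last field is exactly _end,
# so the _etext_end line is skipped.
size=$(printf '%s\n' "$nm_output" | grep -E ' _end$' | cut -d' ' -f1)
echo "$size"
```

The pattern uses no extended-regex syntax, so the match itself behaves
the same under either spelling; only the obsolescence warning goes away.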



Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc

2022-11-18 Thread Hari Bathini




On 18/11/22 2:21 pm, Christophe Leroy wrote:
> On 18/11/2022 at 09:39, Hari Bathini wrote:
>> On 17/11/22 12:29 pm, Christophe Leroy wrote:
>>> On 16/11/2022 at 18:01, Hari Bathini wrote:
>>>> On 16/11/22 12:14 am, Christophe Leroy wrote:
>>>>> On 14/11/2022 at 18:27, Christophe Leroy wrote:
>>>>>> On 14/11/2022 at 15:47, Hari Bathini wrote:
>>>>>>> Hi Christophe,
>>>>>>>
>>>>>>> On 11/11/22 4:55 pm, Christophe Leroy wrote:
>>>>>>>> On 10/11/2022 at 19:43, Hari Bathini wrote:
>>>>>>>>> Most BPF programs are small, but they consume a page each. For systems
>>>>>>>>> with busy traffic and many BPF programs, this may also add significant
>>>>>>>>> pressure on instruction TLB. High iTLB pressure usually slows down the
>>>>>>>>> whole system causing visible performance degradation for production
>>>>>>>>> workloads.
>>>>>>>>>
>>>>>>>>> bpf_prog_pack, a customized allocator that packs multiple bpf programs
>>>>>>>>> into preallocated memory chunks, was proposed [1] to address it. This
>>>>>>>>> series extends this support on powerpc.
>>>>>>>>>
>>>>>>>>> Patches 1 & 2 add the arch specific functions needed to support this
>>>>>>>>> feature. Patch 3 enables the support for powerpc. The last patch
>>>>>>>>> ensures cleanup is handled gracefully.
>>>>>>>>>
>>>>>>>>> Tested the changes successfully on a PowerVM. patch_instruction(),
>>>>>>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it.
>>>>>>>>> Posting the patches in the meanwhile for feedback on these changes.
>>>>>>>>
>>>>>>>> I did a quick test on ppc32, I don't get such a problem, only something
>>>>>>>> wrong in the dump print as traps instructions only are dumped, but
>>>>>>>> tcpdump works as expected:
>>>>>>>
>>>>>>> Thanks for the quick test. Could you please share the config you used?
>>>>>>> I am probably missing a few knobs in my config...
>>>>>
>>>>> I also managed to test it on QEMU. The config is based on
>>>>> pmac32_defconfig.
>>>>
>>>> I had the same config but hit this problem:
>>>>
>>>>   # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf
>>>>   test_bpf: #0 TAX
>>>>   ------------[ cut here ]------------
>>>>   WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 bpf_int_jit_compile+0x8a0/0x9f8
>>>
>>> I get no such problem, on QEMU, and I checked the .config has:
>>>
>>> CONFIG_STRICT_KERNEL_RWX=y
>>> CONFIG_STRICT_MODULE_RWX=y
>>
>> Yeah. That did the trick.
>
> Interesting. I guess we have to find out why it fails when those config
> options are missing.
>
> Maybe module code plays with RO and NX flags even if
> CONFIG_STRICT_MODULE_RWX is not selected?

Need to look at the code closely but fwiw, observing same failure on
64-bit as well with !STRICT_RWX...

Thanks
Hari


Re: [RFC PATCH v2 8/8] sched, smp: Trace smp callback causing an IPI

2022-11-18 Thread Peter Zijlstra
On Thu, Nov 17, 2022 at 02:45:29PM +, Valentin Schneider wrote:

> > +   if (trace_ipi_send_cpumask_enabled()) {
> > +   call_single_data_t *csd;
> > +   smp_call_func_t func;
> > +
> > +   csd = container_of(node, call_single_data_t, node.llist);
> > +
> > +   func = sched_ttwu_pending;
> > +   if (CSD_TYPE(csd) != CSD_TYPE_TTWU)
> > +   func = csd->func;
> > +
> > +   if (raw_smp_call_single_queue(cpu, node))
> > +   trace_ipi_send_cpumask(cpumask_of(cpu), _RET_IP_, func);
> 
> So I went with the tracepoint being placed *before* the actual IPI gets
> sent to have a somewhat sane ordering between trace_ipi_send_cpumask() and
> e.g. trace_call_function_single_entry().
> 
> Packaging the call_single_queue logic makes the code less horrible, but it
> does mix up the event ordering...

Keeps em sharp ;-)

> > +   return;
> > +   }
> > +
> > +   raw_smp_call_single_queue(cpu, node);
> >  }
> >
> >  /*
> > @@ -983,10 +1017,13 @@ static void smp_call_function_many_cond(
> >* number of CPUs might be zero due to concurrent changes to the
> >* provided mask.
> >*/
> > -   if (nr_cpus == 1)
> > +   if (nr_cpus == 1) {
> > +   trace_ipi_send_cpumask(cpumask_of(last_cpu), _RET_IP_, func);
> >   send_call_function_single_ipi(last_cpu);
> 
> This'll yield an IPI event even if no IPI is sent due to the idle task
> polling, no?

Oh, right..


Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc

2022-11-18 Thread Christophe Leroy


On 18/11/2022 at 09:39, Hari Bathini wrote:
> On 17/11/22 12:29 pm, Christophe Leroy wrote:
>> On 16/11/2022 at 18:01, Hari Bathini wrote:
>>> On 16/11/22 12:14 am, Christophe Leroy wrote:
>>>> On 14/11/2022 at 18:27, Christophe Leroy wrote:
>>>>> On 14/11/2022 at 15:47, Hari Bathini wrote:
>>>>>> Hi Christophe,
>>>>>>
>>>>>> On 11/11/22 4:55 pm, Christophe Leroy wrote:
>>>>>>> On 10/11/2022 at 19:43, Hari Bathini wrote:
>>>>>>>> Most BPF programs are small, but they consume a page each. For systems
>>>>>>>> with busy traffic and many BPF programs, this may also add significant
>>>>>>>> pressure on instruction TLB. High iTLB pressure usually slows down the
>>>>>>>> whole system causing visible performance degradation for production
>>>>>>>> workloads.
>>>>>>>>
>>>>>>>> bpf_prog_pack, a customized allocator that packs multiple bpf programs
>>>>>>>> into preallocated memory chunks, was proposed [1] to address it. This
>>>>>>>> series extends this support on powerpc.
>>>>>>>>
>>>>>>>> Patches 1 & 2 add the arch specific functions needed to support this
>>>>>>>> feature. Patch 3 enables the support for powerpc. The last patch
>>>>>>>> ensures cleanup is handled gracefully.
>>>>>>>>
>>>>>>>> Tested the changes successfully on a PowerVM. patch_instruction(),
>>>>>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it.
>>>>>>>> Posting the patches in the meanwhile for feedback on these changes.
>>>>>>>
>>>>>>> I did a quick test on ppc32, I don't get such a problem, only something
>>>>>>> wrong in the dump print as traps instructions only are dumped, but
>>>>>>> tcpdump works as expected:
>>>>>>
>>>>>> Thanks for the quick test. Could you please share the config you used?
>>>>>> I am probably missing a few knobs in my config...
>>>>
>>>> I also managed to test it on QEMU. The config is based on
>>>> pmac32_defconfig.
>>>
>>> I had the same config but hit this problem:
>>>
>>>   # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf
>>>   test_bpf: #0 TAX
>>>   ------------[ cut here ]------------
>>>   WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 bpf_int_jit_compile+0x8a0/0x9f8
>>
>> I get no such problem, on QEMU, and I checked the .config has:
>>
>> CONFIG_STRICT_KERNEL_RWX=y
>> CONFIG_STRICT_MODULE_RWX=y
>
> Yeah. That did the trick.

Interesting. I guess we have to find out why it fails when those config
options are missing.

Maybe module code plays with RO and NX flags even if
CONFIG_STRICT_MODULE_RWX is not selected?

Christophe


Re: [RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc

2022-11-18 Thread Hari Bathini




On 17/11/22 12:29 pm, Christophe Leroy wrote:
> On 16/11/2022 at 18:01, Hari Bathini wrote:
>> On 16/11/22 12:14 am, Christophe Leroy wrote:
>>> On 14/11/2022 at 18:27, Christophe Leroy wrote:
>>>> On 14/11/2022 at 15:47, Hari Bathini wrote:
>>>>> Hi Christophe,
>>>>>
>>>>> On 11/11/22 4:55 pm, Christophe Leroy wrote:
>>>>>> On 10/11/2022 at 19:43, Hari Bathini wrote:
>>>>>>> Most BPF programs are small, but they consume a page each. For systems
>>>>>>> with busy traffic and many BPF programs, this may also add significant
>>>>>>> pressure on instruction TLB. High iTLB pressure usually slows down the
>>>>>>> whole system causing visible performance degradation for production
>>>>>>> workloads.
>>>>>>>
>>>>>>> bpf_prog_pack, a customized allocator that packs multiple bpf programs
>>>>>>> into preallocated memory chunks, was proposed [1] to address it. This
>>>>>>> series extends this support on powerpc.
>>>>>>>
>>>>>>> Patches 1 & 2 add the arch specific functions needed to support this
>>>>>>> feature. Patch 3 enables the support for powerpc. The last patch
>>>>>>> ensures cleanup is handled gracefully.
>>>>>>>
>>>>>>> Tested the changes successfully on a PowerVM. patch_instruction(),
>>>>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it.
>>>>>>> Posting the patches in the meanwhile for feedback on these changes.
>>>>>>
>>>>>> I did a quick test on ppc32, I don't get such a problem, only something
>>>>>> wrong in the dump print as traps instructions only are dumped, but
>>>>>> tcpdump works as expected:
>>>>>
>>>>> Thanks for the quick test. Could you please share the config you used?
>>>>> I am probably missing a few knobs in my config...
>>>
>>> I also managed to test it on QEMU. The config is based on
>>> pmac32_defconfig.
>>
>> I had the same config but hit this problem:
>>
>>   # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf
>>   test_bpf: #0 TAX
>>   ------------[ cut here ]------------
>>   WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 bpf_int_jit_compile+0x8a0/0x9f8
>
> I get no such problem, on QEMU, and I checked the .config has:
>
> CONFIG_STRICT_KERNEL_RWX=y
> CONFIG_STRICT_MODULE_RWX=y

Yeah. That did the trick. These options were missing in my config and
the pmac config you shared. I could not run the other config you shared
on QEMU. Thanks for all the pointers.

- Hari
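
Since the failure in this thread came down to two options missing from a
config, here is a rough sketch of how a config file could be checked for
them before testing. The helper name check_strict_rwx and the throwaway
demo file are illustrative assumptions, not part of the series; point
the function at a real .config to use it.

```shell
# check_strict_rwx FILE: report whether each STRICT_*_RWX option is =y in FILE.
# (Hypothetical helper, named here only for illustration.)
check_strict_rwx() {
    for opt in CONFIG_STRICT_KERNEL_RWX CONFIG_STRICT_MODULE_RWX; do
        if grep -q "^${opt}=y\$" "$1"; then
            echo "$opt set"
        else
            echo "$opt missing"
        fi
    done
}

# Demo on a temporary file standing in for a real .config.
demo_config=$(mktemp)
printf '%s\n' 'CONFIG_STRICT_KERNEL_RWX=y' \
    '# CONFIG_STRICT_MODULE_RWX is not set' > "$demo_config"
result=$(check_strict_rwx "$demo_config")
rm -f "$demo_config"
echo "$result"
```

On the demo file this reports the first option as set and the second as
missing, which is the kind of mismatch that separated the working and
failing setups above.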