[GIT PULL] vhost,virtio,vdpa,firmware: bugfixes

2023-11-14 Thread Michael S. Tsirkin
The following changes since commit 86f6c224c97911b4392cb7b402e6a4ed323a449e:

  vdpa_sim: implement .reset_map support (2023-11-01 09:20:00 -0400)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to e07754e0a1ea2d63fb29574253d1fd7405607343:

  vhost-vdpa: fix use after free in vhost_vdpa_probe() (2023-11-01 09:31:16 -0400)


vhost,virtio,vdpa,firmware: bugfixes

bugfixes all over the place

Signed-off-by: Michael S. Tsirkin 


Björn Töpel (1):
  riscv, qemu_fw_cfg: Add support for RISC-V architecture

Dan Carpenter (1):
  vhost-vdpa: fix use after free in vhost_vdpa_probe()

Jakub Sitnicki (1):
  virtio_pci: Switch away from deprecated irq_set_affinity_hint

Michael S. Tsirkin (1):
  virtio_pci: move structure to a header

Stefano Garzarella (1):
  vdpa_sim_blk: allocate the buffer zeroed

 drivers/firmware/Kconfig   |  2 +-
 drivers/firmware/qemu_fw_cfg.c |  2 +-
 drivers/vdpa/vdpa_sim/vdpa_sim_blk.c   |  4 ++--
 drivers/vhost/vdpa.c   |  1 -
 drivers/virtio/virtio_pci_common.c |  6 +++---
 drivers/virtio/virtio_pci_modern_dev.c |  7 ---
 include/linux/virtio_pci_modern.h  |  7 ---
 include/uapi/linux/virtio_pci.h| 11 +++
 8 files changed, 22 insertions(+), 18 deletions(-)




Re: [PATCH] tracing: fix UAF caused by memory ordering issue

2023-11-14 Thread Kairui Song
On Tue, Nov 14, 2023 at 06:17, Mark Rutland wrote:
>

Hi, Mark and Steven

Thank you so much for the detailed comments.

> On Sun, Nov 12, 2023 at 11:00:30PM +0800, Kairui Song wrote:
> > From: Kairui Song 
> >
> > Following kernel panic was observed when doing ftrace stress test:
>
> Can you share some more details:
>
> * What test specifically are you running? Can you share this so that others can
>   try to reproduce the issue?

Yes, the panic happened when doing LTP ftrace stress test:
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/tracing/ftrace_test/ftrace_stress_test.sh

>
> * Which machines are you testing on (i.e. which CPU microarchitecture is this
>   seen with) ?

The panic was seen on an ARM64 VM. lscpu output:
Architecture:   aarch64
  CPU op-mode(s):   64-bit
  Byte Order:   Little Endian
CPU(s): 4
  On-line CPU(s) list:  0-3
Vendor ID:  HiSilicon
  BIOS Vendor ID:   QEMU
  Model name:   Kunpeng-920
BIOS Model name:virt-rhel8.6.0  CPU @ 2.0GHz
BIOS CPU family:1
Model:  0
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s):  4
Stepping:   0x1
BogoMIPS:   200.00
Flags:  fp asimd evtstrm aes pmull sha1 sha2 crc32
atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm

The host machine is a Kunpeng-920 with 4 NUMA nodes and 128 cores.

>
> * Which compiler are you using?

gcc 12.3.1

>
> * The log shows this is with v6.1.61+. Can you reproduce this with a mainline
>   kernel? e.g. v6.6 or v6.7-rc1?

It's reproducible with LTS but not yet tested with mainline. I'll try to
reproduce it with the latest mainline, but because it reproduces so
rarely this may take a while.

>
> > Unable to handle kernel paging request at virtual address 9699b0f8ece28240
> > Mem abort info:
> >   ESR = 0x9604
> >   EC = 0x25: DABT (current EL), IL = 32 bits
> >   SET = 0, FnV = 0
> >   EA = 0, S1PTW = 0
> >   FSC = 0x04: level 0 translation fault
> > Data abort info:
> >   ISV = 0, ISS = 0x0004
> >   CM = 0, WnR = 0
> > [9699b0f8ece28240] address between user and kernel address ranges
> > Internal error: Oops: 9604 [#1] SMP
> > Modules linked in: rpcrdma rdma_cm iw_cm ib_cm ib_core rfkill vfat fat loop 
> > fuse nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache 
> > jbd2 sr_mod cdrom crct10dif_ce ghash_ce sha2_ce virtio_gpu virtio_dma_buf 
> > drm_shmem_helper virtio_blk drm_kms_helper syscopyarea sysfillrect 
> > sysimgblt fb_sys_fops virtio_console sha256_arm64 sha1_ce drm virtio_scsi 
> > i2c_core virtio_net net_failover failover virtio_mmio dm_multipath dm_mod 
> > autofs4 [last unloaded: ipmi_msghandler]
> > CPU: 0 PID: 499719 Comm: sh Kdump: loaded Not tainted 6.1.61+ #2
> > Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> > pstate: 6045 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > pc : __kmem_cache_alloc_node+0x1dc/0x2e4
> > lr : __kmem_cache_alloc_node+0xac/0x2e4
> > sp : 8ad23aa0
> > x29: 8ad23ab0 x28: 0004052b8000 x27: c513863b
> > x26: 0040 x25: c51384f21ca4 x24: 
> > x23: d615521430b1b1a5 x22: c51386044770 x21: 
> > x20: 0cc0 x19: c0001200 x18: 
> > x17:  x16:  x15: e65e1630
> > x14: 0004 x13: c513863e67a0 x12: c513863af6d8
> > x11: 0001 x10: 8ad23aa0 x9 : c51385058078
> > x8 : 0018 x7 : 0001 x6 : 0010
> > x5 : c09c2280 x4 : c51384f21ca4 x3 : 0040
> > x2 : 9699b0f8ece28240 x1 : c09c2280 x0 : 9699b0f8ece28200
> > Call trace:
> >  __kmem_cache_alloc_node+0x1dc/0x2e4
> >  __kmalloc+0x6c/0x1c0
> >  func_add+0x1a4/0x200
> >  tracepoint_add_func+0x70/0x230
> >  tracepoint_probe_register+0x6c/0xb4
> >  trace_event_reg+0x8c/0xa0
> >  __ftrace_event_enable_disable+0x17c/0x440
> >  __ftrace_set_clr_event_nolock+0xe0/0x150
> >  system_enable_write+0xe0/0x114
> >  vfs_write+0xd0/0x2dc
> >  ksys_write+0x78/0x110
> >  __arm64_sys_write+0x24/0x30
> >  invoke_syscall.constprop.0+0x58/0xf0
> >  el0_svc_common.constprop.0+0x54/0x160
> >  do_el0_svc+0x2c/0x60
> >  el0_svc+0x40/0x1ac
> >  el0t_64_sync_handler+0xf4/0x120
> >  el0t_64_sync+0x19c/0x1a0
> > Code: b9402a63 f9405e77 8b030002 d5384101 (f8636803)
> >
> > Panic was caused by corrupted freelist pointer. After more debugging,
> > I found the root cause is UAF of slab allocated object in ftrace
> > introduced by commit eecb91b9f98d ("tracing: Fix memleak due to race
> > between current_tracer and trace"), and so far it's only reproducible
> > on some ARM64 machines, the UAF and free stack is:
> >
> > UAF:
> > kasan_report+0xa8/0x1bc
> > __asan_report_load8_noabort+0x28/0x3c
> > print_graph_function_flags+0x524/0x5a0
> > print_graph_function_event+0x28/0x40
> > print_

Re: [RESEND PATCH v3 1/2] remoteproc: Make rproc_get_by_phandle() work for clusters

2023-11-14 Thread Mathieu Poirier
On Tue, 14 Nov 2023 at 08:22, Bjorn Andersson  wrote:
>
> On Sat, Oct 14, 2023 at 04:15:47PM -0700, Tanmay Shah wrote:
> > From: Mathieu Poirier 
> >
> > Multi-cluster remoteproc designs typically have the following DT
> > declaration:
> >
> >   remoteproc_cluster {
> >   compatible = "soc,remoteproc-cluster";
> >
> > core0: core0 {
> >   compatible = "soc,remoteproc-core"
> > memory-region;
> > sram;
> > };
> >
> > core1: core1 {
> >   compatible = "soc,remoteproc-core"
> > memory-region;
> > sram;
> > }
> > };
> >
> > A driver exists for the cluster rather than the individual cores
> > themselves so that operation mode and HW specific configurations
> > applicable to the cluster can be made.
> >
> > Because the driver exists at the cluster level and not the individual
> > core level, function rproc_get_by_phandle() fails to return the
> > remoteproc associated with the phandle it is called for.
> >
> > This patch enhances rproc_get_by_phandle() by looking for the cluster's
> > driver when the driver for the immediate remoteproc's parent is not
> > found.
> >
> > Reported-by: Ben Levinsky 
> > Signed-off-by: Mathieu Poirier 
> > Tested-by: Ben Levinsky 
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 28 +++-
> >  1 file changed, 27 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/remoteproc/remoteproc_core.c 
> > b/drivers/remoteproc/remoteproc_core.c
> > index 695cce218e8c..3a8191803885 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -33,6 +33,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -2111,7 +2112,9 @@ EXPORT_SYMBOL(rproc_detach);
> >  #ifdef CONFIG_OF
> >  struct rproc *rproc_get_by_phandle(phandle phandle)
> >  {
> > + struct platform_device *cluster_pdev;
> >   struct rproc *rproc = NULL, *r;
> > + struct device_driver *driver;
> >   struct device_node *np;
> >
> >   np = of_find_node_by_phandle(phandle);
> > @@ -2122,7 +2125,30 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
> >   list_for_each_entry_rcu(r, &rproc_list, node) {
> >   if (r->dev.parent && device_match_of_node(r->dev.parent, np)) {
> >   /* prevent underlying implementation from being removed */
> > - if (!try_module_get(r->dev.parent->driver->owner)) {
> > +
> > + /*
> > +  * If the remoteproc's parent has a driver, the
> > +  * remoteproc is not part of a cluster and we can use
> > +  * that driver.
> > +  */
> > + driver = r->dev.parent->driver;
> > +
> > + /*
> > +  * If the remoteproc's parent does not have a driver,
> > +  * look for the driver associated with the cluster.
> > +  */
> > + if (!driver) {
> > + cluster_pdev = of_find_device_by_node(np->parent);
>
> Both the Ti and Xilinx drivers are using of_platform_populate(), so
> their r->dev.parent should have a parent reference to the cluster
> device.
>

So you are proposing to get the cluster's driver using something like
r->dev.parent->parent->driver?

I will have to verify the parent/child relationship is set up properly
through the of_platform_populate().  If it is, following the pointer
trail is an equally valid approach and I will respin this set.
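
To make the two lookups concrete, here is a minimal user-space sketch of
the pointer walk being discussed. The structs and the helper name
find_owner_driver() are hypothetical stand-ins, not the kernel's
struct device or any remoteproc API, and the sketch assumes the
grandparent of the rproc device is the cluster device, which is exactly
what still has to be verified:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's driver/device types. */
struct mock_driver {
	const char *name;
};

struct mock_device {
	struct mock_device *parent;
	struct mock_driver *driver;
};

/*
 * Return the driver bound to the rproc's parent. If that parent has no
 * driver (the cluster case), fall back to the driver bound to the
 * grandparent. Returns NULL when neither exists, so the caller can bail
 * out instead of dereferencing an unbound driver.
 */
static struct mock_driver *find_owner_driver(const struct mock_device *rproc_dev)
{
	const struct mock_device *core = rproc_dev->parent;

	if (!core)
		return NULL;
	if (core->driver)
		return core->driver;
	if (core->parent)
		return core->parent->driver;
	return NULL;
}
```

With a cluster device that has a driver and a core device that has none,
the helper returns the cluster's driver; when the core itself is bound,
its own driver wins.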

> Unless I'm reading the code wrong, I think we should follow that
> pointer, rather than taking the detour in the DeviceTree data.
>
> Regards,
> Bjorn
>
> > + if (!cluster_pdev) {
> > + dev_err(&r->dev, "can't get parent\n");
> > + break;
> > + }
> > +
> > + driver = cluster_pdev->dev.driver;
> > + put_device(&cluster_pdev->dev);
> > + }
> > +
> > + if (!try_module_get(driver->owner)) {
> >   dev_err(&r->dev, "can't get owner\n");
> >   break;
> >   }
> > --
> > 2.25.1
> >


Re: [PATCH v4 3/5] x86/paravirt: introduce ALT_NOT_XEN

2023-11-14 Thread Juergen Gross

On 14.11.23 16:09, Borislav Petkov wrote:
> On Mon, Oct 30, 2023 at 03:25:06PM +0100, Juergen Gross wrote:
> > Introduce the macro ALT_NOT_XEN as a short form of
> > ALT_NOT(X86_FEATURE_XENPV).
>
> Not crazy about adding yet another macro indirection - at least with the
> X86_FEATURE_ it is clear what this is. But ok, whatever.
>
> Anyway, this patch can be the first one in the series.

Okay.


Juergen




Re: [RESEND PATCH v3 1/2] remoteproc: Make rproc_get_by_phandle() work for clusters

2023-11-14 Thread Bjorn Andersson
On Sat, Oct 14, 2023 at 04:15:47PM -0700, Tanmay Shah wrote:
> From: Mathieu Poirier 
> 
> Multi-cluster remoteproc designs typically have the following DT
> declaration:
> 
>   remoteproc_cluster {
>   compatible = "soc,remoteproc-cluster";
> 
> core0: core0 {
>   compatible = "soc,remoteproc-core"
> memory-region;
> sram;
> };
> 
> core1: core1 {
>   compatible = "soc,remoteproc-core"
> memory-region;
> sram;
> }
> };
> 
> A driver exists for the cluster rather than the individual cores
> themselves so that operation mode and HW specific configurations
> applicable to the cluster can be made.
> 
> Because the driver exists at the cluster level and not the individual
> core level, function rproc_get_by_phandle() fails to return the
> remoteproc associated with the phandle it is called for.
> 
> This patch enhances rproc_get_by_phandle() by looking for the cluster's
> driver when the driver for the immediate remoteproc's parent is not
> found.
> 
> Reported-by: Ben Levinsky 
> Signed-off-by: Mathieu Poirier 
> Tested-by: Ben Levinsky 
> ---
>  drivers/remoteproc/remoteproc_core.c | 28 +++-
>  1 file changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c 
> b/drivers/remoteproc/remoteproc_core.c
> index 695cce218e8c..3a8191803885 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -33,6 +33,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -2111,7 +2112,9 @@ EXPORT_SYMBOL(rproc_detach);
>  #ifdef CONFIG_OF
>  struct rproc *rproc_get_by_phandle(phandle phandle)
>  {
> + struct platform_device *cluster_pdev;
>   struct rproc *rproc = NULL, *r;
> + struct device_driver *driver;
>   struct device_node *np;
>  
>   np = of_find_node_by_phandle(phandle);
> @@ -2122,7 +2125,30 @@ struct rproc *rproc_get_by_phandle(phandle phandle)
>   list_for_each_entry_rcu(r, &rproc_list, node) {
>   if (r->dev.parent && device_match_of_node(r->dev.parent, np)) {
>   /* prevent underlying implementation from being removed */
> - if (!try_module_get(r->dev.parent->driver->owner)) {
> +
> + /*
> +  * If the remoteproc's parent has a driver, the
> +  * remoteproc is not part of a cluster and we can use
> +  * that driver.
> +  */
> + driver = r->dev.parent->driver;
> +
> + /*
> +  * If the remoteproc's parent does not have a driver,
> +  * look for the driver associated with the cluster.
> +  */
> + if (!driver) {
> + cluster_pdev = of_find_device_by_node(np->parent);

Both the Ti and Xilinx drivers are using of_platform_populate(), so
their r->dev.parent should have a parent reference to the cluster
device.

Unless I'm reading the code wrong, I think we should follow that
pointer, rather than taking the detour in the DeviceTree data.

Regards,
Bjorn

> + if (!cluster_pdev) {
> + dev_err(&r->dev, "can't get parent\n");
> + break;
> + }
> +
> + driver = cluster_pdev->dev.driver;
> + put_device(&cluster_pdev->dev);
> + }
> +
> + if (!try_module_get(driver->owner)) {
>   dev_err(&r->dev, "can't get owner\n");
>   break;
>   }
> -- 
> 2.25.1
> 


Re: [PATCH v4 3/5] x86/paravirt: introduce ALT_NOT_XEN

2023-11-14 Thread Borislav Petkov
On Mon, Oct 30, 2023 at 03:25:06PM +0100, Juergen Gross wrote:
> Introduce the macro ALT_NOT_XEN as a short form of
> ALT_NOT(X86_FEATURE_XENPV).

Not crazy about adding yet another macro indirection - at least with the
X86_FEATURE_ it is clear what this is. But ok, whatever.

Anyway, this patch can be the first one in the series.
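
For readers following along, a small user-space sketch of what the
shorthand amounts to. The flag layout and the feature number below are
illustrative assumptions made for this example, not definitions copied
from the kernel headers; only the last macro mirrors what the patch
description states:

```c
#include <stdint.h>

/* Assumed layout for illustration: feature bit in the low 16 bits, flags above. */
#define ALT_FLAGS_SHIFT   16
#define ALT_FLAG_NOT      (1u << 0)
#define ALT_NOT(feature)  ((ALT_FLAG_NOT << ALT_FLAGS_SHIFT) | (feature))

/* Assumed feature number, for illustration only. */
#define X86_FEATURE_XENPV (8 * 32 + 16)

/* The short form the patch introduces, per its description. */
#define ALT_NOT_XEN       ALT_NOT(X86_FEATURE_XENPV)

/* Extract the feature number from a combined flags/feature word. */
static uint32_t alt_feature_bits(uint32_t ft_flags)
{
	return ft_flags & 0xffffu;
}

/* Non-zero when the "not" flag is set in a combined flags/feature word. */
static int alt_is_inverted(uint32_t ft_flags)
{
	return (int)((ft_flags >> ALT_FLAGS_SHIFT) & ALT_FLAG_NOT);
}
```

The two spellings expand to the same value, so the choice between them
is purely about readability at the call site.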

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette



[PATCH v1] lib: objpool: fix head overrun on RK3588 SBC

2023-11-14 Thread wuqiang.matt
The objpool overrun stress test (test_objpool) on an OrangePi5+ SBC
triggered the following kernel warning:

WARNING: CPU: 6 PID: 3115 at lib/objpool.c:168 objpool_push+0xc0/0x100

This message is from objpool.c:168:

WARN_ON_ONCE(tail - head > pool->nr_objs);

The overrun test case validates the situation where the pre-allocated
objects are insufficient: 8 objects are pre-allocated for each node, and a
consumer thread per node tries to grab 16 objects in a row. The test system
is an OrangePi 5+ with an RK3588, a big.LITTLE SoC with 4x A76 and 4x A55
cores. With either all 4 big or all 4 little cores disabled, the overrun
tests run fine; with big and little cores mixed, the overrun test always
ends up in an overrun loop. The memory timing differences between big and
little cores are the likely cause. Here is the debugging data from
objpool_try_get_slot() after try_cmpxchg_release():

objpool_pop: cpu: 4/0 0:0 head: 278/279 tail:278 last:276/278

The local copies of 'head' and 'last' were 278 and 276, while reloading
'slot->head' and 'slot->last' gave 279 and 278. After try_cmpxchg_release(),
'slot->head' became 'head + 1', which is correct. What is wrong here is
the stale value of 'last', which eventually led to the overrun of 'head'.

'last' and 'head' are updated independently in push() and pop(), which
could be what makes updates to 'last' and 'head' become visible out of
order. So for objpool_try_get_slot() it is not enough to check only the
condition 'head != last'; the implicit condition 'last - head <= nr_objs'
must also be asserted explicitly to guarantee that 'last' is always behind
'head' before the object is retrieved.
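
As a minimal user-space sketch of that check (the helper names are
hypothetical and nr_objs = 8 is only an illustrative value), the
two-comparison form and the folded single-comparison form used in the
diff below can be written side by side. Both rely on unsigned 32-bit
wraparound:

```c
#include <stdbool.h>
#include <stdint.h>

/* Explicit form: the slot is non-empty and holds at most nr_objs entries. */
static bool slot_ok_explicit(uint32_t head, uint32_t last, uint32_t nr_objs)
{
	return last != head && (uint32_t)(last - head) <= nr_objs;
}

/*
 * Folded form: when last == head the subtraction wraps to 0xffffffff,
 * which can never be below nr_objs, so the empty case is rejected by
 * the same single comparison.
 */
static bool slot_ok_folded(uint32_t head, uint32_t last, uint32_t nr_objs)
{
	return (uint32_t)(last - head - 1) < nr_objs;
}
```

With the values from the log above (local head 278, stale last 276),
'last - head' wraps to a huge unsigned number, so both forms reject the
pair and the caller would reload 'head' instead of reading an entry.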

This patch checks 'head' and 'last' and reloads them as needed to ensure
'last' is behind 'head' at the time the object is retrieved. Performance
tests show an average impact of about 0.1% on X86_64 and 1.12% on ARM64.
Here are the results:

OS: Debian 10 X86_64, Linux 6.6rc
HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
                   1T         2T         4T         8T        16T
native:      49543304   99277826  199017659  399070324  795185848
objpool:     29909085   59865637  119692073  239750369  478005250
objpool+:    29879313   59230743  119609856  239067773  478509029
                  32T        48T        64T        96T       128T
native:    1596927073 2390099988 2929397330 3183875848 3257546602
objpool:    957553042 1435814086 1680872925 2043126796 2165424198
objpool+:   956476281 1434491297 1666055740 2041556569 2157415622

OS: Debian 11 AARCH64, Linux 6.6rc
HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
                   1T         2T         4T         8T        16T
native:      30890508   60399915  123111980  242257008  494002946
objpool:     14742531   28883047   57739948  115886644  232455421
objpool+:    14107220   29032998   57286084  113730493  232232850
                  24T        32T        48T        64T        96T
native:     746406039 1000174750 1493236240 1998318364 2942911180
objpool:    349164852  467284332  702296756  934459713 1387898285
objpool+:   348388180  462750976  696606096  927865887 1368402195

Signed-off-by: wuqiang.matt 
---
 lib/objpool.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/lib/objpool.c b/lib/objpool.c
index ce0087f64400..cfdc02420884 100644
--- a/lib/objpool.c
+++ b/lib/objpool.c
@@ -201,6 +201,23 @@ static inline void *objpool_try_get_slot(struct objpool_head *pool, int cpu)
while (head != READ_ONCE(slot->last)) {
void *obj;
 
+   /*
+* data visibility of 'last' and 'head' could be out of
+* order since memory updating of 'last' and 'head' are
+* performed in push() and pop() independently
+*
+* before any retrieving attempts, pop() must guarantee
+* 'last' is behind 'head', that is to say, there must
+* be available objects in slot, which could be ensured
+* by condition 'last != head && last - head <= nr_objs'
+* that is equivalent to 'last - head - 1 < nr_objs' as
+* 'last' and 'head' are both unsigned int32
+*/
+   if (READ_ONCE(slot->last) - head - 1 >= pool->nr_objs) {
+   head = READ_ONCE(slot->head);
+   continue;
+   }
+
/* obj must be retrieved before moving forward head */
obj = READ_ONCE(slot->entries[head & slot->mask]);
 
-- 
2.40.1