Re: [RFC PATCH 0/8] vhost_tasks: Use CLONE_THREAD/SIGHAND
On 01.06.23 12:47, Christian Brauner wrote:
> On Thu, Jun 01, 2023 at 09:58:38AM +0200, Thorsten Leemhuis wrote:
>> On 19.05.23 14:15, Christian Brauner wrote:
>>> On Thu, May 18, 2023 at 10:25:11AM +0200, Christian Brauner wrote:
>>>> On Wed, May 17, 2023 at 07:09:12PM -0500, Mike Christie wrote:
>>>>> This patch allows the vhost and vhost_task code to use CLONE_THREAD,
>>>>> CLONE_SIGHAND and CLONE_FILES. It's a RFC because I didn't do all the
>>>>> normal testing, haven't converted vsock and vdpa, and I know you guys
>>>>> will not like the first patch. However, I think it better shows what
>>>>
>>>> Just to summarize: the core idea behind my proposal is that no signal
>>>> handling changes are needed unless there's a bug in the current way
>>>> io_uring workers already work. All that should be needed is
>>>> s/PF_IO_WORKER/PF_USER_WORKER/ in signal.c.
>> [...]
>>>> So it feels like this should be achievable by adding a callback to
>>>> struct vhost_worker that gets called when vhost_worker() gets SIGKILL
>>>> and that all the users of vhost workers are forced to implement.
>>>>
>>>> Yes, it is more work, but I think that's the right thing to do, rather
>>>> than complicating our signal handling.
>>>>
>>>> Worst case, if this can't be done fast enough we'll have to revert the
>>>> vhost parts. I think the user worker parts are mostly sane and are
>>>
>>> As mentioned, if we can't settle this cleanly before -rc4 we should
>>> revert the vhost parts unless Linus wants to have it earlier.
>>
>> Meanwhile -rc5 is just a few days away and there are still a lot of
>> discussions in the patch-set proposed to address the issues[1]. Which is
>> kinda great (albeit also why I haven't given it a spin yet), but on the
>> other hand makes me wonder:
>
> You might've missed it in the thread, but it seems everyone is currently
> operating under the assumption that the preferred way is to fix this
> rather than revert.
I saw that, but that was also a week ago already, so I slowly started to
wonder if plans might have/should be changed. Anyway: if that's still the
plan forward, it's totally fine for me if it's fine for Linus. :-D

BTW: I for now didn't sit down to test Mike's patches, as due to all the
discussions I assumed new ones would be coming sooner or later anyway. If
it's worth giving them a shot, please let me know.

> [...]

Thx for the update!

Ciao, Thorsten

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH 0/8] vhost_tasks: Use CLONE_THREAD/SIGHAND
On 19.05.23 14:15, Christian Brauner wrote:
> On Thu, May 18, 2023 at 10:25:11AM +0200, Christian Brauner wrote:
>> On Wed, May 17, 2023 at 07:09:12PM -0500, Mike Christie wrote:
>>> This patch allows the vhost and vhost_task code to use CLONE_THREAD,
>>> CLONE_SIGHAND and CLONE_FILES. It's a RFC because I didn't do all the
>>> normal testing, haven't converted vsock and vdpa, and I know you guys
>>> will not like the first patch. However, I think it better shows what
>>
>> Just to summarize: the core idea behind my proposal is that no signal
>> handling changes are needed unless there's a bug in the current way
>> io_uring workers already work. All that should be needed is
>> s/PF_IO_WORKER/PF_USER_WORKER/ in signal.c.
[...]
>> So it feels like this should be achievable by adding a callback to
>> struct vhost_worker that gets called when vhost_worker() gets SIGKILL
>> and that all the users of vhost workers are forced to implement.
>>
>> Yes, it is more work, but I think that's the right thing to do, rather
>> than complicating our signal handling.
>>
>> Worst case, if this can't be done fast enough we'll have to revert the
>> vhost parts. I think the user worker parts are mostly sane and are
>
> As mentioned, if we can't settle this cleanly before -rc4 we should
> revert the vhost parts unless Linus wants to have it earlier.

Meanwhile -rc5 is just a few days away and there are still a lot of
discussions in the patch-set proposed to address the issues[1]. Which is
kinda great (albeit also why I haven't given it a spin yet), but on the
other hand makes me wonder:

Is it maybe time to revert the vhost parts for 6.4 and try again next
cycle?

[1] https://lore.kernel.org/all/20230522025124.5863-1-michael.chris...@oracle.com/

Ciao, Thorsten
"not sure if I'm asking because I'm affected, or because it's my duty as
regression tracker" Leemhuis
Re: [PATCH v11 8/8] vhost: use vhost_tasks for worker threads
[CCing the regression list]

On 06.05.23 00:37, Mike Christie wrote:
> On 5/5/23 1:22 PM, Linus Torvalds wrote:
>> On Fri, May 5, 2023 at 6:40 AM Nicolas Dichtel wrote:
>>>
>>> Is this an intended behavior?
>>> This breaks some of our scripts.

Jumping in here, as I found another problem with that patch: it broke
s2idle on my laptop when a qemu-kvm VM is running, as freezing user space
processes now fails for me:

```
[  195.442949] PM: suspend entry (s2idle)
[  195.641271] Filesystems sync: 0.198 seconds
[  195.833828] Freezing user space processes
[  215.841084] Freezing user space processes failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
[  215.841255] task:vhost-3221 state:R stack:0 pid:3250 ppid:3221 flags:0x4006
[  215.841264] Call Trace:
[  215.841266]
[  215.841270]  ? update_rq_clock+0x39/0x270
[  215.841283]  ? _raw_spin_unlock+0x19/0x40
[  215.841290]  ? __schedule+0x3f/0x1510
[  215.841296]  ? sysvec_apic_timer_interrupt+0xaf/0xd0
[  215.841306]  ? schedule+0x61/0xe0
[  215.841313]  ? vhost_worker+0x87/0xb0 [vhost]
[  215.841329]  ? vhost_task_fn+0x1a/0x30
[  215.841336]  ? __pfx_vhost_task_fn+0x10/0x10
[  215.841341]  ? ret_from_fork+0x2c/0x50
[  215.841352]
[  215.841936] OOM killer enabled.
[  215.841938] Restarting tasks ... done.
[  215.844204] random: crng reseeded on system resumption
[  215.957095] PM: suspend exit
[  215.957185] PM: suspend entry (s2idle)
[  215.967646] Filesystems sync: 0.010 seconds
[  215.971326] Freezing user space processes
[  235.974400] Freezing user space processes failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
[  235.974574] task:vhost-3221 state:R stack:0 pid:3250 ppid:3221 flags:0x4806
[  235.974583] Call Trace:
[  235.974586]
[  235.974593]  ? __schedule+0x184/0x1510
[  235.974605]  ? sysvec_apic_timer_interrupt+0xaf/0xd0
[  235.974616]  ? schedule+0x61/0xe0
[  235.974624]  ? vhost_worker+0x87/0xb0 [vhost]
[  235.974648]  ? vhost_task_fn+0x1a/0x30
[  235.974656]  ? __pfx_vhost_task_fn+0x10/0x10
[  235.974662]  ? ret_from_fork+0x2c/0x50
[  235.974673]
[  235.975190] OOM killer enabled.
[  235.975192] Restarting tasks ... done.
[  235.978131] random: crng reseeded on system resumption
[  236.091219] PM: suspend exit
```

After running into the problem I booted 6.3.1-rc1 again, and there s2idle
still worked. I didn't do a bisection, just looked at the vhost commits
during the latest merge window; 6e890c5d502 ("vhost: use vhost_tasks for
worker threads") looked suspicious, so I reverted it on top of latest
mainline, and then things worked again. Through a search on lore I
arrived in this thread and found the patch below from Mike. Gave it a try
on top of latest mainline, but it didn't help.

Ciao, Thorsten

> [...]
> If it's ok to change the behavior of "ps -u root", then we can do this patch:
> (Nicolas, I confirmed it fixes the 'ps a' case, but couldn't replicate the 'ps'
> case. If you could test the ps-only case or give me info on what /usr/bin/example
> was doing, I can replicate and test here):
>
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index ed4e01daccaa..eb9ffc58e211 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2269,8 +2269,14 @@ __latent_entropy struct task_struct *copy_process(
>  	/*
>  	 * Thread groups must share signals as well, and detached threads
>  	 * can only be started up within the thread group.
> +	 *
> +	 * A userworker's parent thread will normally have a signal handler
> +	 * that performs management operations, but the worker will not,
> +	 * because the parent will handle the signal and then use a worker-
> +	 * specific interface to manage the thread and related resources.
>  	 */
> -	if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND))
> +	if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND) &&
> +	    !args->user_worker && !args->ignore_signals)
>  		return ERR_PTR(-EINVAL);
>
>  	/*
> diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
> index b7cbd66f889e..3700c21ea39d 100644
> --- a/kernel/vhost_task.c
> +++ b/kernel/vhost_task.c
> @@ -75,7 +78,8 @@ struct vhost_task *vhost_task_create(int (*fn)(void *), void *arg,
>  				     const char *name)
>  {
>  	struct kernel_clone_args args = {
> -		.flags		= CLONE_FS | CLONE_UNTRACED | CLONE_VM,
> +		.flags		= CLONE_FS | CLONE_THREAD | CLONE_VM |
> +				  CLONE_UNTRACED,
>  		.exit_signal	= 0,
>  		.fn		= vhost_task_fn,
>  		.name		= name
Re: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest
On 23.03.2017 17:41, Michael S. Tsirkin wrote:
> On Thu, Mar 23, 2017 at 03:19:07PM +, Richard W.M. Jones wrote:
>> On Thu, Mar 23, 2017 at 01:13:50PM +0800, Jason Wang wrote:
>>> From 312859b596e83a2164a8430343d31fce2a5ad808 Mon Sep 17 00:00:00 2001
>>> From: Jason Wang <jasow...@redhat.com>
>>> Date: Thu, 23 Mar 2017 13:07:16 +0800
>>> Subject: [PATCH] virtio_pci: fix out of bound access for msix_names
>>> Signed-off-by: Jason Wang <jasow...@redhat.com>
>>
>> I tested this, and it does appear to fix the crashes in
>> vp_modern_find_vqs. Therefore:
>>
>> Tested-by: Richard W.M. Jones <rjo...@redhat.com>
>
> I've queued the fix, thanks everyone!

Thx. Feel free to add

Tested-by: Thorsten Leemhuis <li...@leemhuis.info>

as a kernel with that patch works well in my reboot tests so far.

Ciao, Thorsten