Mike Christie <michael.chris...@oracle.com> writes:

> This has us deqeue SIGKILL even if SIGNAL_GROUP_EXIT/group_exec_task is
> set when we are dealing with PF_USER_WORKER tasks.

> When a vhost_task gets a SIGKILL, we could have outstanding IO in flight.
> We can easily stop new work/IO from being queued to the vhost_task, but
> for IO that's already been sent to something like the block layer we
> need to wait for the response then process it. These type of IO
> completions use the vhost_task to process the completion so we can't
> exit immediately.


I understand the concern.

> We need to handle wait for then handle those completions from the
> vhost_task, but when we have a SIGKLL pending, functions like
> schedule() return immediately so we can't wait like normal. Functions
> like vhost_worker() degrade to just a while(1); loop.
>
> This patch has get_signal drop down to the normal code path when
> SIGNAL_GROUP_EXIT/group_exec_task is set so the caller can still detect
> there is a SIGKILL but still perform some blocking cleanup.
>
> Note that in that chunk I'm now bypassing that does:
>
> sigdelset(&current->pending.signal, SIGKILL);
>
> we look to be ok, because in the places we set SIGNAL_GROUP_EXIT/
> group_exec_task we are already doing that on the threads in the
> group.

What you are doing does not make any sense to me.

First there is the semantic non-sense, of queuing something that
is not a signal.   The per task SIGKILL bit is used as a flag with
essentially the same meaning as SIGNAL_GROUP_EXIT, reporting that
the task has been scheduled for exit.

More so is what happens afterwards.

As I read your patch it is roughly equivalent to doing:

        if ((current->flags & PF_USER_WORKER) &&
            fatal_signal_pending(current)) {
                sigdelset(&current->pending.signal, SIGKILL);
                clear_siginfo(&ksig->info);
                ksig->info.si_signo = SIGKILL;
                ksig->info.si_code = SI_USER;
                recalc_sigpending();
                trace_signal_deliver(SIGKILL, &ksig->info,
                        &sighand->action[SIGKILL - 1]);
                goto fatal;
        }

Before the "(SIGNAL_GROUP_EXIT || signal->group_exec_task)" test.

To get that code I stripped the active statements out of the
dequeue_signal path the code executes after your change below.

I don't get why you are making it though because the code you
are opting out of does:

                /* Has this task already been marked for death? */
                if ((signal->flags & SIGNAL_GROUP_EXIT) ||
                     signal->group_exec_task) {
                        clear_siginfo(&ksig->info);
                        ksig->info.si_signo = signr = SIGKILL;
                        sigdelset(&current->pending.signal, SIGKILL);
                        trace_signal_deliver(SIGKILL, SEND_SIG_NOINFO,
                                &sighand->action[SIGKILL - 1]);
                        recalc_sigpending();
                        goto fatal;
                }

I don't see what in practice changes, other than the fact that by going
through the ordinary dequeue_signal path that other signals can be
processed after a SIGKILL has arrived.  Of course those signal all
should be blocked.




The trailing bit that expands the PF_IO_WORKER test to be PF_USER_WORKER
appears reasonable, and possibly needed.

Eric


> Signed-off-by: Mike Christie <michael.chris...@oracle.com>
> ---
>  kernel/signal.c | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index 8f6330f0e9ca..ae4972eea5db 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2705,9 +2705,18 @@ bool get_signal(struct ksignal *ksig)
>               struct k_sigaction *ka;
>               enum pid_type type;
>  
> -             /* Has this task already been marked for death? */
> -             if ((signal->flags & SIGNAL_GROUP_EXIT) ||
> -                  signal->group_exec_task) {
> +             /*
> +              * Has this task already been marked for death?
> +              *
> +              * If this is a PF_USER_WORKER then the task may need to do
> +              * extra work that requires waiting on running work, so we want
> +              * to dequeue the signal below and tell the caller its time to
> +              * start its exit procedure. When the work has completed then
> +              * the task will exit.
> +              */
> +             if (!(current->flags & PF_USER_WORKER) &&
> +                 ((signal->flags & SIGNAL_GROUP_EXIT) ||
> +                  signal->group_exec_task)) {
>                       clear_siginfo(&ksig->info);
>                       ksig->info.si_signo = signr = SIGKILL;
>                       sigdelset(&current->pending.signal, SIGKILL);
> @@ -2861,11 +2870,11 @@ bool get_signal(struct ksignal *ksig)
>               }
>  
>               /*
> -              * PF_IO_WORKER threads will catch and exit on fatal signals
> +              * PF_USER_WORKER threads will catch and exit on fatal signals
>                * themselves. They have cleanup that must be performed, so
>                * we cannot call do_exit() on their behalf.
>                */
> -             if (current->flags & PF_IO_WORKER)
> +             if (current->flags & PF_USER_WORKER)
>                       goto out;
>  
>               /*
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Reply via email to