Re: Signals and application shutdown (was Re: [PATCH] sigwait())

Antti Kantee Thu, 12 Mar 2015 19:01:57 -0700

On 12/03/15 15:57, Martin Lucina wrote:

[email protected] said:

Right, but signals is something that should be attacked critically,
not just bend over and emulating the existing concept.  This whole
work is pointless if the end result is what is known as an OS ;)


I do NOT want to be doing a full-blown signals implementation. What I do
want (and the problem I'm trying to solve) is to be able to shutdown an
application cleanly, using either "xl shutdown", ACPI poweroff, magic
packet or whatever.

Let me try to clarify what I mean by "critically". I don't think emulating signals is the way to go, "full-blown" implementation or not. Anecdote: if we take the original motivation for rump kernels ("kernel driver development in userspace") and just figure out how to emulate the facilities required by the kernel in userspace, we end up with a usermode OS, not a rump kernel. While the usermode OS is a solution, it's not the better solution out of the two; everyone is of course welcome to disagree with my assertion, but if so, use specific arguments against section 4.5 of my thesis.

Maybe emulating signals will pretend to solve one problem in one case, and maybe there's some merit in that. However, as you should be well aware of, the problem with emulating the real world is in the actual implementation and the testing against actual code. I can say that if I were to emulate signals, which I'm not sure I would, I'd probably start down the path you describe. However, I can't really give any useful insight because that would require me implementing and testing the whole thing.

I'd still go for a "what is actually needed" approach first, definitely looking at what mods mysql would require, and perhaps surveying unikernel projects and teaming up with them for a common interface. Sure, the concept of a clean shutdown without signals doesn't really fit into the POSIX interface, but if nobody is willing to even imagine what things actually should look like, there will never be progress.

The accepted way to cleanly shutdown a well-behaved POSIX application is to
send it SIGTERM, wait a while and possibly send it SIGKILL if it doesn't
get its act together.

Here's a naive way of how I think this could work. This is likely to be
full of race conditions or potential deadlocks, but I want to get it on the
table anyway:

1) We implement a sigaction() which allows setting a SIGTERM handler and
ignores any other operations.

2) (Xen-specific, but other platforms will work similarly). We reinstate
the mini-os shutdown thread from upstream. This waits on a xenbus watch for
the "xl shutdown" signal.

3) When we receive a shutdown signal, the following needs to happen, in
this order:

    a) We run the application's SIGTERM handler if set, in the context of
    the shutdown thread.

    b) We unblock any application threads waiting inside the Rump Kernel,
    with all calls returning EINTR.

This should be enough for a well-behaved application to figure out that
it's supposed to terminate.

Judging by the words "should" and "well-behaved", your confidence in the proposed emulation being a solution is about the same level as mine.

4) (This needs to be added to normal _exit() / abort() / after callmain
handling anyway). We close all open file descriptors, and do a chdir("/"),
so that rumpconfig can cleanly unmount filesystems and down network
interfaces.

Is this possible? Specifically, step 3) b)?

Yes, 3b is possible, sort of: something similar was required for supporting exec() by multithreaded remote clients. However, the last time it came up on this list with ping6(1) using alarm() instead of the timeout argument to poll(), our conclusion was that applications should be adjusted instead of trying to bend over backwards and sideways to emulate signals. Now, granted, that discussion did not include addressing how to shut down long-running servers, so it does not directly apply. Really, I don't really know what's required without banging on it for a week or two, not something I can solve by writing emails.
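To illustrate the kind of application adjustment meant by the ping6(1) case, here is a minimal sketch of waiting with the poll() timeout argument rather than arming alarm() and relying on SIGALRM to interrupt a blocking call. It is not the actual ping6 change; wait_readable() is a hypothetical helper name, and the sketch simply restarts the full timeout if poll() is interrupted.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable (or at EOF), 0 on timeout, -1 on error.
 * No signal delivery is needed for the timeout to fire.
 */
int
wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rv;

    assert(timeout_ms >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Simplification: an interrupted poll() restarts the full timeout. */
    do {
        rv = poll(&pfd, 1, timeout_ms);
    } while (rv == -1 && errno == EINTR);

    if (rv <= 0)
        return rv;      /* 0: timed out, -1: poll() failed */
    if (pfd.revents & (POLLIN | POLLHUP))
        return 1;       /* data available or peer closed */
    return -1;          /* POLLERR or POLLNVAL */
}
```

A caller would use something like wait_readable(sock, 1000) before each recv() and treat a return of 0 as "no reply within one second".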

For 4, a rump kernel provides the notion of a process exiting. A no-longer-existing process does not have file descriptors or cwd. This facility is not a secret and documented e.g. in rump_lwproc(3).

Now, I really think you should start by implementing a working solution for _at least_ one non-trivial program (e.g. mysql) before proposing a generalization.

Separately, but related, as I have found with MySQL's normal shutdown
process, we will also need to implement pthread_kill(thread, SIGKILL).

-mato
