Re: data structures for use in signal handlers
Bruno Haible wrote: > Eric Wong wrote: > > > How would setting up a pipe or eventfd help with a communication > > > problem between threads in the same process? I thought these were > > > for communication between processes. > > > > The "self-pipe trick" is a common idiom in Unix for signal > > handling; especially when pselect/ppoll aren't available. > > I seem to recall learning of it reading something by djb. > What I understand from [1] is that if a program uses signals AND > select()/poll(), then it is useful to map signals into events that > can be received by select()/poll(). The self-pipe trick is a way > to do that. Right. > Maybe I didn't describe accurately the situation I am asking for. > I'm not attempting to transform a program to an event-based engine. > I'm looking to let signal handlers (SIGTERM, SIGSEGV, SIGWINCH, etc.) > read and traverse data structures that the main program has prepared > (and that contain information about files to delete, memory areas, > window lines to redraw etc.). No worries, I figured pipes/sockets weren't an options given your mention of libsigsegv in the original post. One possible solution is to spawn a dedicated thread which does nothing but run an event loop to handle signals sequentially without reentrancy. Then, signal handlers are guaranteed to run alone and existing thread-safe structures (e.g. rculfhash) and locks would be able to work properly. It all depends on whether you can spawn a dedicated thread for that. FWIW, liburcu uses background threads internally, too. > [1] http://osiris.978.org/~alex/safesignalhandling.html
Re: data structures for use in signal handlers
On Tue, Feb 9, 2021 at 6:07 PM Bruno Haible wrote: > Maybe I didn't describe accurately the situation I am asking for. > I'm not attempting to transform a program to an event-based engine. > I'm looking to let signal handlers (SIGTERM, SIGSEGV, SIGWINCH, etc.) > read and traverse data structures that the main program has prepared > (and that contain information about files to delete, memory areas, > window lines to redraw etc.). This is a situation that I've always worked hard to avoid. So little can be safely done in a signal handler that it's worthwhile to try not to do more than the minimum possible. Have you already concluded that you must do significant work in a signal handler? I would not want to try to, say, redraw lines in a window in a signal handler, because the functions that I would need to call to do that would not be prepared to run inside such a handler: I don't think ncurses or GTK+, for example, would be happy at all about it.
Re: data structures for use in signal handlers
On 2/9/21 5:21 PM, Eric Wong wrote: I know the (C)Ruby VM uses it internally (or eventfd on Linux); I think CPython's GVL does, too. GNU Make does too. It's where I learned of the trick.
Re: data structures for use in signal handlers
Eric Wong wrote: > > How would setting up a pipe or eventfd help with a communication > > problem between threads in the same process? I thought these were > > for communication between processes. > > The "self-pipe trick" is a common idiom in Unix for signal > handling; especially when pselect/ppoll aren't available. > I seem to recall learning of it reading something by djb. > > It's perfectly valid to use pipes in single-threaded or > multi-threaded, or multi-process situations. POSIX gives > well-defined semantics w.r.t. atomicity. Atomicity alone is not an argument; the compiler and OS have sufficient primitives (CAS instructions, barriers, spin-locks, locks) to make things atomic between threads. What I understand from [1] is that if a program uses signals AND select()/poll(), then it is useful to map signals into events that can be received by select()/poll(). The self-pipe trick is a way to do that. Maybe I didn't describe accurately the situation I am asking for. I'm not attempting to transform a program to an event-based engine. I'm looking to let signal handlers (SIGTERM, SIGSEGV, SIGWINCH, etc.) read and traverse data structures that the main program has prepared (and that contain information about files to delete, memory areas, window lines to redraw etc.). Bruno [1] http://osiris.978.org/~alex/safesignalhandling.html
Re: data structures for use in signal handlers
On Tue, Feb 9, 2021 at 5:01 PM Bruno Haible wrote: > > Ben Pfaff wrote: > > Building an RCU implementation isn't necessarily difficult (I have done it, > > but the implementation isn't suitable for gnulib). > > > > There is a liburcu that is under LGPL v2.1: https://liburcu.org/ > > Thanks for the pointers, Ben. Yes, RCU is the technique I had in mind. > So: "can it actually be made to work?" -> Yes. > > But the implementation of liburcu uses extra signals and a helper thread of > its own (or even a helper thread per CPU). > > I have to rephrase the question: can it be made to work in a straightforward > (not necessarily scalability-optimized) way? Signals are not needed. I haven't studied liburcu (it wasn't suitable for my userspace use case because it only worked with GCC), so I am not sure why it needs signals. The userspace RCU implementation that I built in OVS (which is free but Apache licensed) does use a thread, but not signals. The nice thing about using a thread is that you always have something freeing memory that is known to no longer be in use. But I can imagine implementations that don't even use an extra thread. For example, each thread could keep its own local list of callbacks to be called after the next grace period. Then, whenever a thread quiesces, it goes through its own list and executes all of the callbacks that now can be. It could take a long time for some callbacks to be called, though, especially if a thread goes to sleep and thus won't call any of its own until it wakes up again.
Re: data structures for use in signal handlers
Bruno Haible wrote: > Hi Eric, > > Eric Wong wrote: > > I would've recommended just using a pipe, socket or eventfd; > > How would setting up a pipe or eventfd help with a communication > problem between threads in the same process? I thought these were > for communication between processes. The "self-pipe trick" is a common idiom in Unix for signal handling; especially when pselect/ppoll aren't available. I seem to recall learning of it reading something by djb. It's perfectly valid to use pipes in single-threaded or multi-threaded, or multi-process situations. POSIX gives well-defined semantics w.r.t. atomicity. I know the (C)Ruby VM uses it internally (or eventfd on Linux); I think CPython's GVL does, too. My gnulib-using cmogstored uses a self-pipe when epoll_pwait isn't available (Though it could probably use EVFILT_SIGNAL on *BSD).
Re: data structures for use in signal handlers
Hi Eric, Eric Wong wrote: > I would've recommended just using a pipe, socket or eventfd; How would setting up a pipe or eventfd help with a communication problem between threads in the same process? I thought these were for communication between processes. Bruno
Re: data structures for use in signal handlers
Ben Pfaff wrote: > On Sun, Feb 7, 2021 at 2:57 PM Bruno Haible wrote: > > (2) Let the signal handler work only on immutable copies of the data > > structure. Whenever the other code manipulates the data structure, > > it creates an immutable copy of it, for the signal handler to use. > > Use an 'asyncsafe-spin' lock through which the signal handler tells > > the other threads to not free that immutable copy while it running. > > > > This is tricky; can it actually be made to work? > > > > Btw, there is an obvious requirement: the technique should use > > malloc/ > > free for memory management, and should not have memory leaks. > > Algorithms that assume a garbage collected memory management are not > > suitable here. > > This makes me think of read-copy-update aka RCU: > https://en.wikipedia.org/wiki/Read-copy-update > https://lwn.net/Articles/262464/ > > In RCU, code that updates the data structure takes a lock, creates and > modifies a copy, and then installs a new pointer to the data structure, > which is otherwise immutable. Readers always access the data structure > through a pointer. Whichever pointer they happen to see when they > read the pointer remains available until they're done with it. > > Using RCU is pretty straightforward once you've done a little of it, but > it takes some reading to understand all of its concepts. It's probably best > for me not to try to explain it all, because I'll surely get some parts of it > wrong. > > Building an RCU implementation isn't necessarily difficult (I have done it, > but the implementation isn't suitable for gnulib). > > There is a liburcu that is under LGPL v2.1: https://liburcu.org/ Thanks for the pointers, Ben. Yes, RCU is the technique I had in mind. So: "can it actually be made to work?" -> Yes. But the implementation of liburcu uses extra signals and a helper thread of its own (or even a helper thread per CPU). I have to rephrase the question: can it be made to work in a straightforward (not necessarily scalability-optimized) way? Bruno
Re: data structures for use in signal handlers
Bruno Haible wrote: > Hi Marc, Ben, > > I need your expertise regarding data structures. > > Assume a program has a signal handler, which needs to share some data > structure with the rest of the program. > > There are two cases: > > (A) Assume that the program is single-threaded. > Then it is sufficient to use 'volatile' variables for communication > between the handler and the rest of the program. > > (B) Assume that the program is multi-threaded. > Then 'volatile' is not sufficient, because the signal handler and some > other threads may be running at the same time. > > As far as I can see, there are three possibilities to write such a > signal handler: > (2) Let the signal handler work only on immutable copies of the data > structure. Whenever the other code manipulates the data structure, > it creates an immutable copy of it, for the signal handler to use. > Use an 'asyncsafe-spin' lock through which the signal handler tells > the other threads to not free that immutable copy while it running. > > This is tricky; can it actually be made to work? Maybe, it sounds like RCU as Ben mentioned. I've never used RCU inside signal handlers, though. > Btw, there is an obvious requirement: the technique should use malloc/ > free for memory management, and should not have memory leaks. > Algorithms that assume a garbage collected memory management are not > suitable here. liburcu uses intrusive data structures (via container_of) like the Linux kernel does, so memory can be pre-allocated at startup before a signal handler runs. > (3) Use lock-free algorithms. What lock-free algorithms can you propose? > What I want is > (a) a list, I haven't used it for signal handlers, but wfcqueue from liburcu might be a possibility, here. > (b) a balanced binary tree, That seems way tricky. I've never gone beyond atomic ops (xchg, cmpxchg, add/sub/or, etc...) and the usual pipe/eventfd/socket stuff. > and the other code can malloc/free/add/insert in the data structure, > whereas the signal handler should only be allowed to iterate and > search an element within the data structure. > > What algorithms can you recommend for this purpose? > This would be useful for gnulib in general. The balanced binary tree would > also help making GNU libsigsegv multithread-safe. I would've recommended just using a pipe, socket or eventfd; but I guess that's not an option for your particular structure or libsigsegv?
Re: data structures for use in signal handlers
On Sun, Feb 7, 2021 at 2:57 PM Bruno Haible wrote: > (2) Let the signal handler work only on immutable copies of the data > structure. Whenever the other code manipulates the data structure, > it creates an immutable copy of it, for the signal handler to use. > Use an 'asyncsafe-spin' lock through which the signal handler tells > the other threads to not free that immutable copy while it running. > > This is tricky; can it actually be made to work? > > Btw, there is an obvious requirement: the technique should use malloc/ > free for memory management, and should not have memory leaks. > Algorithms that assume a garbage collected memory management are not > suitable here. This makes me think of read-copy-update aka RCU: https://en.wikipedia.org/wiki/Read-copy-update https://lwn.net/Articles/262464/ In RCU, code that updates the data structure takes a lock, creates and modifies a copy, and then installs a new pointer to the data structure, which is otherwise immutable. Readers always access the data structure through a pointer. Whichever pointer they happen to see when they read the pointer remains available until they're done with it. Using RCU is pretty straightforward once you've done a little of it, but it takes some reading to understand all of its concepts. It's probably best for me not to try to explain it all, because I'll surely get some parts of it wrong. Building an RCU implementation isn't necessarily difficult (I have done it, but the implementation isn't suitable for gnulib). There is a liburcu that is under LGPL v2.1: https://liburcu.org/