On 4/21/2015 2:56 PM, Buck Evan wrote:
> My understanding of s6 socket activation is that services should open and hold onto their listening socket while they're up, and that s6 relies on the OS for swapping out inactive services. It's not socket activation in the usual sense. http://skarnet.org/software/s6/socket-activation.html
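
(Concretely, that means the service's run script binds the socket itself and the daemon holds it for its whole lifetime. A minimal sketch using s6-tcpserver from s6-networking - the daemon name is made up:

    #!/bin/sh
    # ./run - bind 0.0.0.0:8080 ourselves and hold the socket for as long
    # as the service is up; s6-tcpserver spawns one mydaemond (hypothetical
    # name) per incoming connection.
    exec s6-tcpserver 0.0.0.0 8080 mydaemond

)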

> I apologize, I was a bit hasty and I think I need more sleep. I was confusing socket activation with some other s6 feature - perhaps with how s6-notifywhenup is used... http://skarnet.org/software/s6/s6-notifywhenup.html
> So I wonder what the "full guarantee" you mentioned s6 providing looks like. It seems like in such a world all services would race, and the determinism of the race would depend on each service's implementation.
This I do understand, having gone through it with supervision-scripts. The basic problem is that a running service does not mean the service is ready; it only means it's "up".

Dependency handling "with guarantee" means there is some means by which the child service itself signals "I'm fully up and running", vs. "I'm started but not ready". Because there is no polling going on, this allows the start-up of the parent daemon to sleep until it either is notified or times out. And you get a "clean" start-up of the parent because the children have directly signaled that "we're all ready".

Dependency handling "without guarantee" is what my project does as an optional feature - it brings up the child process and then calls the child's ./check script to see if everything is OK, which is polling the child (and wasting CPU cycles). This is fine for "light" use because most child processes will start quickly and the parent won't time out while waiting. There are trade-offs for using this feature. First, ./check scripts may have unintended bugs, behaviors, or issues that you can't see or resolve, unlike the child directly signalling that it is ready for use. Second, the polling approach adds to CPU overhead, making it less than ideal for mobile computing - it will draw more power over time. Third, there are edge cases where it can make a bad situation worse - picture a heavily loaded system that takes 20+ minutes to start a child process, and the result being the parent spawn-loops repeatedly, which just adds even more load. That's just the three I can think off off the top of my head - I'm sure there's more. It's also why it's not enabled by default.
