Re: Thoughts on First Class Services
Note: this re-post is due to an error I made earlier today. I've gutted out a bunch of stuff as well. My apologies for the duplication.

On 4/28/2015 11:34 AM, Laurent Bercot wrote:

I'm also interested in Avery's experience with dependency handling.

Hm. Today isn't the best day to write this (having been up since 4am) but I'll try to digest all the little bits and pieces into something. Here we go...

First, I will qualify a few things. The project's scope is, compared to a lot of the discussion on the mailing list, very narrow. There are several goals but the primary thrust of the project is to create a generic, universal set of service definitions that could be plugged into many init, distribution, and supervision framework arrangements. That's a tall order in itself, but there are ways around a lot of this. So while the next three paragraphs are off-topic, they are there to address those three concerns mentioned.

With regard to init work, I don't touch it. Trying to describe a proper init sequence is already beyond the scope of the project. I'm leaving that to other implementers.

With regard to distributions, well, I'm trying to make it as generic as possible. Development is done on a Debian 7 box but I have made efforts to avoid any Debian-isms in the actual project itself. In theory, you should be able to use the scripts on any distribution.

With regard to the supervision programs used, the differences in command names have been abstracted away. I'm not entirely wild about how it is currently done, but creating definitions is a higher priority than revisiting this at the moment. In the future, I will probably restructure it.

~ ~ ~ ~ ~ ~ ~ ~

What
---

The dependency handling in supervision-scripts is meant to be used in installations that don't have access to it. Put another way, it's a Poor Man's Solution to the problem and functions as a convenience.
The feature is turned off by default, and this will cause any service definition that requires other services to run-loop repeatedly until someone starts them manually. This could be said to be the default behavior of most installations that don't have dependency handling, so I'm not introducing a disruptive behavior with this feature.

Why
---

I could have hard-coded many of the dependencies into the various run scripts, but this would have created a number of problems for other areas.

1. Hard-coding prevents switching from shell to execline in the future, by necessitating a re-write. There will be an estimated 1,000+ scripts when the project is complete, so this is a major concern.

2. We are already using the filesystem as an ad-hoc database, so it makes sense to continue with this concept. The dependencies should be stored on the filesystem and not inside of the script.

With this in mind, I picked sv/(service)/needs as a directory to hold the definitions to be used. Because I can't envision what every init and future dependency management framework would look like, I'll simply make it as generic as I can, leaving things as open as possible to additional changes.

A side note: it is by fortuitous circumstance that anopa uses a ./needs directory that has the same functionality and behavior. I use soft links just because. Anopa uses named files. The net effect is the same.

Each dependency is simply a named soft link that points to a service that needs to be started, typically something like sv/(service)/needs/foobar pointing to /service/foobar. In this case, a soft link is made with the name of the service, pointing to the service definition in /service. This also allows me to ensure that the dependency is actually available, and not just assume that it is there.

A single rule determines what goes into ./needs: you can only have the names of other services that are explicitly needed.
You can say "foo needs baz" and "baz needs bar" but NEVER would you say "foo needs baz, foo needs bar". This is intentional because it's not the job of the starting service to handle the entire chain. It simplifies the list of dependencies because a service will only worry about its immediate needs, and not the needs of dependent services it launches. It also has the desirable property of making dependency chains self-organizing, which is an important decision with hundreds of services having potentially hundreds of dependencies. Setup is straightforward and you can easily extend a service need by adding one soft link to the new dependency. This also fits with my current use of a single launch script; I don't have to change the script, just the parameters that the script uses. The new soft link becomes just another parameter. You could call this peer-level dependency resolution if you like.

How
---

Enabling this behavior requires that you set sv/.env/NEEDS_ENABLED to the single character 1. It is normally set to 0. With the setting disabled (zero), the entire
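
For illustration, the ./needs layout described in this message could be set up and inspected roughly as follows. This is a sketch under stated assumptions, not code from supervision-scripts: the scratch directory, the service names foo, baz and bar, and the helper name list_needs are all made up for the example.

```sh
#!/bin/sh
# Sketch only: builds a throwaway tree that mimics sv/(service)/needs.
# The service names (foo, baz, bar) and list_needs are hypothetical.

root=$(mktemp -d)                 # scratch area standing in for the real tree
mkdir -p "$root/service/baz" "$root/service/bar"
mkdir -p "$root/sv/foo/needs" "$root/sv/baz/needs"

# foo needs baz; baz needs bar. foo never lists bar directly.
ln -s "$root/service/baz" "$root/sv/foo/needs/baz"
ln -s "$root/service/bar" "$root/sv/baz/needs/bar"

# Print the direct needs of one service, one name per line.
# A dangling link means the dependency is not actually available.
list_needs() {
    for link in "$root/sv/$1/needs"/*; do
        [ -L "$link" ] || continue        # skip the unexpanded glob
        if [ -e "$link" ]; then
            echo "${link##*/}"
        else
            echo "${link##*/} (missing)" >&2
        fi
    done
}

list_needs foo
```

Here foo reports only baz; bar shows up only when baz is asked for its own needs, which is the "immediate needs only" rule in practice.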
Re: Thoughts on First Class Services
On Tue, Apr 28, 2015 at 8:38 AM, Avery Payne avery.p.pa...@gmail.com wrote:

The steps described are explicit and hard-coded into a service's ./run script. This design works just OK and ensures complicated start-up sequences can be carried out, but it does not allow easy replacement of child services. The catch is that the name of child(ren) will be baked into the parent ./run script.

I guess I don't know what this means, in practice. My child services generally know about the parent in their ./run script and the parent (sometimes) has to know about the children in his ./finish script.

The loosely-coupled approach taken by anopa (and my own work) allows easy replacement of a child service with some other service, at the cost of slightly increased complexity in the child's run script.

My child scripts are definitely more complex than the parents. I've never had a problem replacing a child service, it's them who have to know about my parents and not the other way around. In practical terms: /etc/sv/postgresql/run does not know about any of his children, but /etc/sv/pgbouncer/run does know about his parent (postgresql) and /etc/sv/app_which_uses_pg/run does know about his parent (pgbouncer), but not his grandparent. None of the parents know anything about their children.

The net effect is the same but the difference is subtle.

I'd like to see the difference in a code example. I haven't had a chance to dig in to anopa yet enough to see how they couple it more loosely.

Tj
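
A minimal sketch of the arrangement described in this message, where the dependent service names the service it relies on, might look like the following. It assumes runit's sv is the control program; the function name run_pgbouncer, the SV and DAEMON override variables, and the pgbouncer config path are assumptions added so the sketch can be tried without either program installed.

```sh
#!/bin/sh
# Sketch for a hypothetical /etc/sv/pgbouncer/run. SV and DAEMON can be
# overridden for experimentation; the config path is an assumption.
run_pgbouncer() {
    # "sv check" exits non-zero when postgresql is not up. Returning here
    # lets the supervisor retry ./run on its next cycle.
    ${SV:-sv} check postgresql >/dev/null 2>&1 || return 1
    # postgresql is up: replace this shell with the daemon.
    exec ${DAEMON:-pgbouncer /etc/pgbouncer/pgbouncer.ini}
}

# A real ./run would end with:  run_pgbouncer || exit 1
```

The name postgresql appears only in pgbouncer's script; nothing in postgresql's own run script would need to change if pgbouncer were swapped for something else.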
Re: Thoughts on First Class Services
Dang it. Hit the send button. It will be a bit, I'll follow up with the completed email. Sorry for the half-baked posting.
Re: Thoughts on First Class Services
On Tue, Apr 28, 2015 at 12:31 PM, Steve Litt sl...@troubleshooters.com wrote:

Good! I was about to ask the definitions of parent and child, but the preceding makes it clear.

Well at least we're talking the same language now, though reversing parent/child is disconcerting to my OCD.

Here's the current version of run.sh, with dependency support baked in: https://bitbucket.org/avery_payne/supervision-scripts/src/b8383ed5aaa1f6d848c1a85e6216e59ba98c3440/sv/.run/run.sh?at=default

That's a gnarly run script. It's as big as a lot of sysvinit or OpenRC scripts I've seen. One of the reasons I like daemontools style package management is my run scripts are usually less than 10 lines.

This was my thought, as well. It adds a level of complexity we try to avoid in our run scripts. It also seems to me that there is less typing involved in individual run scripts than the individual things that have to be configured for this script. If one goal of this abstraction is to minimize mistakes, adding more moving parts to edit doesn't seem to work towards that goal.

And, as you said in a past email, having a run-once capability without insane kludges would be nice, and as you said in another past email, it's not enough to test for the child service to be up according to runit, but it must pass a test to indicate the process itself is functional. I've been doing that ever since you mentioned it.

When run-once is necessary we decide per-service whether it's a true one-shot (networking, etc) or a spooky background daemon that we may want to watch. If the latter, the ./run script contains our (probably polling) supervisor; if it's a networking type job, we end with exec /bin/pause (part of runit-void). For the 'making sure it is actually up', a ./check script suffices.

Tj
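
The one-shot pattern mentioned in this message (do the job, then end with exec /bin/pause) could be sketched as below. The names setup_job and oneshot are placeholders, and the PAUSE override exists only so the function can be tried on a machine without runit-void.

```sh
#!/bin/sh
# Sketch of a one-shot ./run: do the work once, then park on a process
# that never exits so the supervisor keeps the service marked "up".
# setup_job is a placeholder; PAUSE is overridable for experimentation.
setup_job() {
    echo "one-shot work done"     # stand-in for e.g. bringing up a network
}

oneshot() {
    setup_job || return 1         # failed setup: supervisor will retry ./run
    exec "${PAUSE:-/bin/pause}"   # success: hold the "up" state
}

# A real ./run would end with:  oneshot || exit 1
```

Because the shell is replaced by the pausing process, stopping the service simply kills that process, and a ./finish script (if any) can undo the one-shot work.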
Re: Thoughts on First Class Services
On 4/28/2015 10:50 AM, bougyman wrote:

Well at least we're talking the same language now, though reversing parent/child is disconcerting to my OCD.

Sorry if the terminology is reversed.

Here's the current version of run.sh, with dependency support baked in: https://bitbucket.org/avery_payne/supervision-scripts/src/b8383ed5aaa1f6d848c1a85e6216e59ba98c3440/sv/.run/run.sh?at=default

That's a gnarly run script. It's as big as a lot of sysvinit or OpenRC scripts I've seen. One of the reasons I like daemontools style package management is my run scripts are usually less than 10 lines.

This was my thought, as well. It adds a level of complexity we try to avoid in our run scripts. It also seems to me that there is less typing involved in individual run scripts than the individual things that have to be configured for this script. If one goal of this abstraction is to minimize mistakes, adding more moving parts to edit doesn't seem to work towards that goal.

Currently there are the following sections, in sequence:

1. shunt stderr to stdout for logging purposes
2. shunt supporting symlinks into the $PATH so that tools are called correctly. This is critical to supporting more than just a single framework; all of the programs referenced in .bin are actually symlinks that point to the correct program to run. See the .bin/use-* scripts for details.
3. if a definition is broken in some way, then immediately write a message to the log and abort the run.
4. if dependency handling is enabled, then process dependencies. Otherwise, just skip the entire thing. By default, dependencies are disabled; this means ./run scripts behave as if they have no dependency support.
4a. should dependency handling fail, log the failing child in the parent's log, and abort the run.
5. figure out if user or group IDs are in use, and define them.
6. figure out if a run state directory is needed. If so, set it up.
7. start the daemon.
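
The sequence listed above can be outlined in shell roughly as follows. This is not the project's run.sh, only a sketch of the control flow: the ./broken marker file, the START_NEED and DAEMON variables, and the function name run_outline are placeholders invented here, and steps 5 and 6 are left out. Only ../.bin, ./needs and ../.env/NEEDS_ENABLED are taken from the thread, and "sv start" assumes runit.

```sh
#!/bin/sh
# Outline of the listed steps as one function running in a subshell.
# Placeholders: ./broken, START_NEED, DAEMON, run_outline.
run_outline() (
    cd "$1" || exit 1
    exec 2>&1                                   # 1. stderr joins stdout (the log)
    PATH="$PWD/../.bin:$PATH"                   # 2. framework symlinks come first
    if [ -e ./broken ]; then                    # 3. refuse a broken definition
        echo "definition is marked broken"
        exit 1
    fi
    if [ "$(cat ../.env/NEEDS_ENABLED 2>/dev/null)" = 1 ]; then   # 4.
        for need in ./needs/*; do
            [ -L "$need" ] || continue          # no links: nothing to start
            if ! ${START_NEED:-sv start} "$need" >/dev/null; then
                echo "could not start ${need##*/}"                # 4a.
                exit 1
            fi
        done
    fi
    # 5. and 6. (user/group IDs, run state directory) are omitted here.
    daemon=${DAEMON:-echo daemon would start here}
    exec $daemon                                # 7. become the daemon
)
```

With NEEDS_ENABLED set to 0 the dependency loop is skipped entirely, which matches the "behave as if they have no dependency support" default.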
Re: Thoughts on First Class Services
On 4/28/2015 11:34 AM, Laurent Bercot wrote: If a lot of people would like to participate but don't want to subscribe to the skaware mailing-list, I'll move the thread here. Good point, I'm going to stop discussion here and go over there, where the discussion belongs.
Re: Thoughts on First Class Services
On 4/28/2015 10:31 AM, Steve Litt wrote:

Good! I was about to ask the definitions of parent and child, but the preceding makes it clear.

I'm taking it from the viewpoint that says the service that the user wishes to start is the parent of all other service dependencies that must start.

So what you're doing here is minimizing polling, right? Instead of saying "whoops, child not running yet, continue the runit loop", you actually start the child, the hope being that no service will ever be skipped and have to wait for the next iteration. Do I have that right?

Kinda. A failure of a single child still causes a run loop, but the next time around, some of the children are already started, and a start of the child will quickly return a success, allowing the script to skip over it quickly until it is looking at the same problem child from the last time. The time lost is only on failed starts, and child starts typically don't take that long. If they do, well, it's not the parent's fault...

Here's the current version of run.sh, with dependency support baked in: https://bitbucket.org/avery_payne/supervision-scripts/src/b8383ed5aaa1f6d848c1a85e6216e59ba98c3440/sv/.run/run.sh?at=default

That's a gnarly run script.

Yup. For the moment.

If I'm not mistaken, everything inside the if test $( cat ../.env/NEEDS_ENABLED ) -gt 0; then block is boilerplate that could be put inside a shellscript callable from any ./run.

True, and that idea has merit.

That would hack off 45 lines right there. I think you could do something similar with everything between lines 83 and 110. The person who is truly interested in the low level details could look at the called shellscripts (perhaps called with the dot operator). I'm thinking you could knock this ./run down to less than 35 lines of shellscript by putting boilerplate in shellscripts.

I've seen this done in other projects, and for the sake of simplicity (and reducing subshell spawns) I've tried to avoid it.
But that doesn't mean I'm against the idea. Certainly, all of these are improvements with merit, provided that they don't interfere with some of the other project goals. If I can get the time to look at all of it, I'll re-write it by segmenting out the various components. In fact, you may have given me an idea to solve an existing problem I'm having with certain daemons...

You're doing more of a recursive start. No doubt, when there are two or three levels of dependency and services take a non-trivial amount of time to start (seconds), yours results in the quicker boot. But for typical stuff, I'd imagine the old "wait til next time if your ducks aren't in line" will be almost as fast, will be conceptually simpler, and more codeable by the end user. Not because your method is any harder, but because you're applying it against a program whose native behavior is "wait til next cycle".

Actually, I was looking for the lowest-cost solution to "how do I keep track of dependency trees between multiple services". The result was a self-organizing set of data and scripts. I don't manage *anything* beyond "service A must have service B". It doesn't matter how deep that dependency tree goes, or even if there are common leaf nodes at the end of the tree, because it self-organizes. This reduces my cognitive workload; as the project grows to hundreds of scripts, the number of possible combinations reaches a point where it would be unmanageable otherwise. Using this approach means I don't care how many there are, I only care about what is needed for a specific service.

And, as you said in a past email, having a run-once capability without insane kludges would be nice, and as you said in another past email, it's not enough to test for the child service to be up according to runit, but it must pass a test to indicate the process itself is functional. I've been doing that ever since you mentioned it.

At some point I have to go back and start writing ./check scripts. :(
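
The self-organizing behaviour described in this message, where each service records only its direct needs and deeper chains or shared leaves sort themselves out, can be simulated without any supervisor. In the sketch below, start_service stands in for "the supervisor runs this service's ./run", and the service names a, b, c, d, the scratch directory and the "up" marker file are all hypothetical.

```sh
#!/bin/sh
# Simulation only: no supervisor is involved. Each call to start_service
# looks at nothing but its own ./needs directory.
tree=$(mktemp -d)
for s in a b c d; do mkdir -p "$tree/$s/needs"; done
ln -s "$tree/b" "$tree/a/needs/b"     # a needs b
ln -s "$tree/c" "$tree/a/needs/c"     # a needs c
ln -s "$tree/d" "$tree/b/needs/d"     # b needs d
ln -s "$tree/d" "$tree/c/needs/d"     # c needs d (shared leaf)

start_service() {
    [ -e "$tree/$1/up" ] && return 0          # already started: quick success
    for link in "$tree/$1/needs"/*; do
        [ -L "$link" ] || continue            # no needs: nothing to do
        start_service "${link##*/}" || return 1
    done
    : > "$tree/$1/up"                         # mark this service as running
    echo "started $1"
}

start_service a
```

Run as written, this prints "started d", "started b", "started c", "started a", one per line. The shared leaf d is started once, and the second request for it returns immediately, even though a never mentions d.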
Re: Thoughts on First Class Services
On 28/04/2015 20:49, Avery Payne wrote: Good point, I'm going to stop discussion here and go over there, where the discussion belongs. That's not what I meant :) Keep your thread here, it interests people who are not subscribed to skaware, and it will be simpler. But please come over there to give your opinion on how s6-rc should do things. :) -- Laurent