Re: dependant services
On 6/8/2015 10:44 AM, Steve Litt wrote:

Just so we're all on the same page, am I correct that the subject of your response here is *not* socket activation, the awesome and wonderful feature of systemd? You're simply talking about a service opening its socket before it's ready to exchange information, right?

That is my understanding, yes. We are discussing using UCSPI to hold a socket for clients to connect to, then launching the service and connecting the socket on demand; as a by-product, the assumption is that the client will block while the launch is occurring. Of course, to make this work, there is an implicit assumption that the launch distinguishes "service is up" from "service is ready".

Isn't this all controlled by the service? sshd decides when to open its socket: the admin has nothing to do with it.

UCSPI is basically the inetd concept re-done daemontools-style. It can be a local socket, a network socket, etc. The UCSPI program creates and holds the socket; upon connection, the service spawns.

[Snip 2 paragraphs discussing the complexity of sockets used in a certain context]

If I were to write support for sockets in, I would guess that it would probably augment the existing ./needs approach by checking for a socket first (when the feature is enabled), and then, failing to find one, proceed to peer-level dependency management (when it is enabled).

Man, is all this brouhaha about dependencies?

Sequencing, actually; I'm just mixing a metaphor here, in that my version of dependencies is sequential and self-organizing, but not manually ordered. Order is obtained by sequentially walking the tree, so while you have a little control by organizing the relationships, you don't have any control over which relationship launches first at a given level.

=
if /usr/local/bin/networkisdown; then
  sleep 5
  exit 1
fi
exec /usr/sbin/sshd -d -q
=

Is this all about using the existence of a socket to decide whether to exec your service or not? 
If it is, personally I think it's too generic, for the reasons you said: on an arbitrary service, perhaps written by a genius, perhaps written by a poodle, having a socket running is no proof of anything. I know you're trying to write generic run scripts, but at some point, especially with dependencies on specific but arbitrary processes, you need to know how the process works and about the specific environment in which it's working. And it's not all that difficult, if you allow a human to do it. I think that such edge-case dependencies are much easier for humans to handle than for algorithms.

Oh, don't get me wrong, I'm saying that the human should not only be involved but also have a choice. Yes, I will have explicit assumptions about "X needs Y", but there's still a human around who can decide if they want to flip the switch on to get that behavior.

If this really is about recognizing when a process is fully functional, because the process being spawned depends on it, I'd start collecting a bunch of best-practice, portable scripts called ServiceXIsDown and ServiceXIsUp.

This is of passing interest to me, because a lot of that accumulated knowledge can be re-implemented to support run scripts. I may write about that separately in a little bit.

Sorry for the DP101 shellscript grammar: shellscripts are a second language for me.

The project is currently written in shell, so you're in good company.

Anyway, each possible dependent program could have one or more best-practice "is it up" type test shellscripts. Some would involve sockets, some wouldn't. I don't think this is something you can code into the actual process manager, without a kudzu field of if statements.

It wouldn't be any more difficult than the existing peer code. Yes, I know you peeked at that once and found it a bit baroque, but if you take the time to walk through it, it's not all that bad, and I'm trying hard to make sure each line is clear about its intention and use. 
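A ServiceXIsUp script along these lines might look like the sketch below. It is only an illustration: the pidfile-based probe and the function name are assumptions, since, as noted above, each daemon really needs its own daemon-specific best-practice test.

```shell
# Hypothetical sketch of a generic "is it up" probe, written as a shell
# function so a run script could call it. A real ServiceXIsUp script would
# encode daemon-specific knowledge; this one only checks a pidfile.
service_is_up() {
  pidfile=$1
  # The daemon must have written its pidfile...
  [ -r "$pidfile" ] || return 1
  # ...and the recorded process must still be alive
  # (signal 0 probes existence without sending anything).
  kill -0 "$(cat "$pidfile")" 2>/dev/null
}
```

A socket-based variant would replace the pidfile test with an actual connection attempt, which is exactly the daemon-specific knowledge being discussed.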
Regarding an older comment that was made about relocating peer dependencies into a separate script, I'm about 80% convinced to do it, if only to make things a little more modular internally.

[snip a couple paragraphs that were way above my head]

Of course, there are no immediate plans to support UCSPI, although I've already made the mistake of baking in some support with a bcron definition. I think I need to go back and revisit that entry...

I'm a big fan of parsimonious scope and parsimonious dependencies, so IMHO the less that's baked in, the better.

The minimum dependencies are there. If anything, my dependencies are probably lighter than most - there isn't anything in shell that is baked in (i.e. explicit "service X start" statements in the script outright), and the dependencies themselves are simply symlinks that can be changed.

As a side note, I'm beginning to suspect that the desire for true parallel startup is more of a mirage caused by desire rather than by design.
Re: dependant services
On 08/06/2015 16:00, Avery Payne wrote:

This is where I've resisted using sockets. Not because they are bad - they are not. I've resisted because they are difficult to make 100% portable between environments. Let me explain.

I have trouble understanding several points of your message.

- "You've resisted using sockets." What does that mean? A daemon will, or will not, use a socket; as an integrator, you don't have much say in the matter. You can decide where the socket will be, what will open it, how the daemon will get it if it doesn't open it itself, and other similar details; but you cannot, say, write a run script for an X11 server if you don't want any Unix domain sockets on the machine. :) So, can you clarify your resistance?

- "What tools are available." What does that have to do with daemons using sockets? UCSPI tools will, or will not, be available, but daemons will do as they please. If your scripts rely on UCSPI tools to ease socket management, then add a package dependency - your scripts need UCSPI tools installed, end of story. Dependencies are not a bad thing per se, they just need to be controlled and justified. "UCSPI sockets" does not make sense. You'll have Unix sockets and INET sockets, and maybe one or two esoteric things such as netlink. UCSPI is a framework that helps manipulate sockets with command-line utilities. Use the tools or don't use them, but I don't understand what your actual problem is.

So where do the sockets live? /var/run? /run? /var/sockets? /insert-my-own-flavor-here?

How about the service directory of the daemon using the socket? That's what a service directory is for.

* Make socket activation an admin-controlled feature that is disabled by default. You want socket activation, you ask for it first. The admin gets control, I get more headache, and mostly everyone can be happy.

If all this fuss is about socket activation, then you can simply forget it altogether. 
Jonathan was simply mentioning socket activation as an alternative to real dependency management, as in "that's what some people do". I don't think he implied it was a good idea. Only Lennart says it's a good idea. Or people who blindly repeat what Lennart says.

As a side note, I'm beginning to suspect that the desire for true parallel startup is more of a mirage caused by desire rather than by design. What I'm saying is that it may be more of an ideal we aspire to rather than a design that was thought through. If you have sequenced dependencies, can you truly gain a lot of time by attempting parallel startup? Is the gain for the effort really that important? Can we even speed things up when fsck is deemed mandatory by the admin for a given situation? Questions like these make me wonder if this is really a feasible feature at all.

It's feasible. Anopa does it. s6-rc does it too - there are still a lot of things missing so I can't release it now, but the core functionality is done and it works. At least, if by "parallel startup" you mean "start things as soon as they can be started without risk, without needless waiting times". Because if you want parallel startup as in "start all the things and pray it works", you can already do it today, with any supervision framework (services will restart until their dependencies are met) or with systemd (socket activation: everything will be fine unless something important crashes, in which case we will deny there is a problem).

What I don't think is feasible is having an easy way of accomplishing parallel startup without a real service management tool specifically thought out and designed to handle dependencies properly, and especially mixing one-shot services and long-run services. Which is exactly what anopa and s6-rc do.

-- Laurent
Re: dependant services
Laurent Bercot wrote:

If all this fuss is about socket activation, then you can simply forget it altogether. Jonathan was simply mentioning socket activation as an alternative to real dependency management, as in "that's what some people do". I don't think he implied it was a good idea. Only Lennart says it's a good idea. Or people who blindly repeat what Lennart says.

Actually, I carefully wrote "opening server sockets early", talking about the specific mechanism that is employed to weaken client ordering dependencies upon servers.

As to whether opening server sockets early is a good idea: I'm not in a hurry to naysay. It achieves the stated effect. Arguably, indeed, it can be described as *what the system already does* if one has a lot of daemontools-style services spawned through UCSPI toolsets. They all start up early and in parallel, opening the sockets very first thing with something like tcpserver or tcp-socket-listen and *then* progressing to starting the main server program, thereby allowing clients to connect and block (rather than fail to connect and abend over and over) in parallel.

So it would be possibly a bit rich for me to agree that this is a Lennartism. Especially given that I have machines that do this and have had since before systemd was a twinkle in upstart's eye. (-:
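As a concrete illustration of the pattern, a daemontools-style service of this kind might carry a run script like the sketch below (written to a scratch directory here so nothing real is touched). tcpserver is from Bernstein's ucspi-tcp toolset; the fingerd invocation is purely an example.

```shell
# Write an illustrative UCSPI-style run script to a scratch location.
# The point of the pattern: tcpserver binds and listens on the socket
# *before* any server program runs, so clients can connect and block
# while the rest of the system is still coming up.
svc_dir=$(mktemp -d)
cat > "$svc_dir/run" <<'EOF'
#!/bin/sh
# The socket is open as soon as tcpserver starts; in.fingerd is only
# spawned per connection, daemontools/inetd style.
exec tcpserver -v 0 79 /usr/sbin/in.fingerd -w
EOF
chmod +x "$svc_dir/run"
```

The same shape works for local sockets by swapping tcpserver for a Unix-domain UCSPI server such as unixserver or tcp-socket-listen's local-socket counterpart.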
Re: dependant services
On 08/06/2015 22:40, Jonathan de Boyne Pollard wrote:

As to whether opening server sockets early is a good idea: I'm not in a hurry to naysay. It achieves the stated effect. Arguably, indeed, it can be described as *what the system already does* if one has a lot of daemontools-style services spawned through UCSPI toolsets. They all start up early and in parallel, opening the sockets very first thing with something like tcpserver or tcp-socket-listen and *then* progressing to starting the main server program, thereby allowing clients to connect and block (rather than fail to connect and abend over and over) in parallel.

That's an interesting way of seeing it, and it certainly puts things into perspective; but I don't think it's accurate. Not every service can be run under a generic superserver. The ones that can have one thing in common: they have no shared state among various client connections. Also, since they fork a process per client connection, they usually have little state overall, in order to avoid computing the state with every connection.

So, services running under a generic superserver can pretty much serve instantly. Read stdin, do stuff, write the result to stdout, done, no prep time needed. If you're running sshd under a superserver, you'll need a little prep, but it's only CPU cycles, which can hardly fail. (sshd can still fail if, for instance, the host key cannot be found, but honestly, people usually test their sshd, and failures of that kind are very rare.) It's usually not a stretch to say that once the superserver has listen()ed to the socket, the service is really ready. (At this point, s6-tcpserver4d sends its readiness notification.) Failures past that point are uncommon, due to the simple nature of UCSPI servers.

Socket activation is a different beast. 
It's supposed to work with any service, no matter what the daemon has to do before being ready to serve, no matter how much state it has to build, no matter whether it depends on something that can fail. The risk of failure-after-readiness is much bigger with socket-activated services, because the early socket opening will apply to services that do not necessarily play well or safely afterwards. Would you open a socket early for your RabbitMQ server, with its legendary starting times?

-- Laurent
Re: dependant services
On Mon, 08 Jun 2015 21:08:38 +0100, Jonathan de Boyne Pollard <j.deboynepollard-newsgro...@ntlworld.com> wrote:

The systemd dictum is that to truly take advantage of parallel startup, one eliminates orderings as far as possible. Which is where socket activation comes in. Part of socket activation is systemd opening server sockets early, and passing them to the server processes that get run. Because clients depend from the availability of the sockets, rather than from the availability of the final services, clients and servers can actually start in parallel, and the client is *not* declared as dependent from the *service* being up.

Eeeeuuu! Am I the only one here who is grossed out by the preceding paragraph? A socket-activated init has no idea who programmed the server, or what idioms that programmer used. Telling clients "here's your socket, we're sure that it works" sounds a little like "the check's in the mail" or "we come in peace". Oh, wait, will there be a systemd compliance sticker given only to servers that do it the systemd way? Nice!

Jonathan, am I hallucinating, or does your paragraph basically say that socket activation depends on an assumption?

Thanks,

SteveT

Steve Litt
June 2015 featured book: The Key to Everyday Excellence
http://www.troubleshooters.com/key
Re: dependant services
On 6/8/2015 2:15 PM, Steve Litt wrote:

I'm not familiar with inetd. Using sockets to activate what? In what manner? Whose socket?

~ ~ ~

Let's go back in time a little bit. The year is 1996. I'm downstairs, literally in my basement, with my creaky old 486 with 16MB of RAM, and I'm trying to squeeze as much as I can into my Slackware 3.6 install that I made with 12 floppy disks. There are some of these service-thingys that I'm learning about, and they all take up gobs of expen$ive RAM, and while I can swap to disk and deal with that, swapping is a slooow affair, because a drive that pushes 10 megabytes per second is speedy. Heck, my drive isn't even IDE, it's ESDI, and being a full-height 5 1/4" drive it is actually larger than a brick. But I digress. It would be cool if there was a way to reduce the RAM consumption...

~ ~ ~

Me: There's got to be something that can free up some RAM... time to dig around documentation and articles online with my uber-kool 14.4 dialup modem! Let's see here... what's this? Inetd? Whoa, it frees up RAM while providing services! Now I just need RAM to run inetd, and all the RAM I save from not running other things can be used for mischief!

~ ~ ~

What inetd does is:

1. Has a giant list of port numbers defined, with a program that pairs with each port number (/etc/inetd.conf).
2. Opens port numbers out of that list when the inetd daemon is run, and listens to all of them.
3. When someone talks to a port, the corresponding program is launched and the port connected to the program. If the program fails to launch, the connection is closed.
4. You only need RAM for inetd + any services that launch.
5. ...
6. Profit!

Meanwhile, in the same year, halfway across the country in Illinois, in a dark lab...

~ ~ ~

DJB: (swiveling around in a dramatic swivel chair, but no cat, because cats would shed hair on his cool-looking sweater) I shall take the old inetd concept, and make it generic and decoupled and streamlined and secure. I shall gift this to you, the Internet, so that you may all be secure, unlike Sendmail's Security Exploit of the Month Club, which keeps arriving in my inbox when I didn't ask for it. Go forth, and provide much joy to sysadmins everywhere! (cue dramatic music)

~ ~ ~

...and thus, UCSPI was born. Fast forward to 2014... while surfing various Linux news articles, I stumble into something that sounds like an infomercial...

~ ~ ~

...Systemd will now do socket activation with not only file sockets but also network sockets too! NETWORK SOCKETS! It's like an Armed Bear riding a Shark with Frickin' Laser Beams while singing the National Anthem with an Exploding Background!! Get your copy today for THREE easy payments!!! Order Now While Supplies Last OPERATORS ARE STANDING BY!!!

~ ~ ~

Yes, that juicy sound is the sound of my eyes rolling up into their sockets as I read that article, attempting to retreat to the relative safety of my skull as I Cannot Un-see What I Have Seen... as you can tell, this isn't exactly a new concept, and it's been done before, many many times, in various ways (inetd, xinetd, various flavors of UCSPI, and now systemd's flavor).
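For reference, the "giant list" in step 1 above is just a text file; classic /etc/inetd.conf entries look something like the following (the exact services, users, and program paths vary by system, and these lines are illustrative):

```
# service  type    proto  wait    user    program               arguments
ftp      stream  tcp    nowait  root    /usr/sbin/in.ftpd     in.ftpd -l
finger   stream  tcp    nowait  nobody  /usr/sbin/in.fingerd  in.fingerd -w
daytime  stream  tcp    nowait  root    internal
```

The last line shows inetd's "internal" keyword: a handful of trivial services (daytime, echo, discard) are served by inetd itself, costing no extra RAM at all.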
Re: dependant services
On 5/14/2015 3:25 PM, Jonathan de Boyne Pollard wrote:

The most widespread general-purpose practice for breaking (i.e. avoiding) this kind of ordering is of course opening server sockets early. Client and server then don't need to be so strongly ordered.

This is where I've resisted using sockets. Not because they are bad - they are not. I've resisted because they are difficult to make 100% portable between environments. Let me explain.

First, there is the question of "what environment am I running in?" This can break down into several sub-questions: what variable settings do I have, what does my directory structure look like, and what tools are available. That last one - what tools are installed - is what kills me. Because while I can be assured that the bulk of a framework will be present, there is no guarantee that I will have UCSPI sockets around.

Let's say I decide to only support frameworks that package UCSPI out of the box, so that I am assured that socket activation is 100% guaranteed to be possible, ignoring the fact that I just jettisoned several other frameworks in the process simply to support this one feature. So we press on with the design assumption "it is safe to assume that UCSPI is installed and therefore can be encoded into run scripts".

Now we have another problem - integration. Using sockets means I need a well-defined namespace to locate the sockets themselves, and that means a well-known area in the filesystem, because the filesystem is what organizes the namespace. So where do the sockets live? /var/run? /run? /var/sockets? /insert-my-own-flavor-here? Let's take it a step further: I decide on some name - I'll pull one out of a hat and simply call it /var/run/ucspi-sockets - and ignore all of the toes I'm stepping on in the process, including the possibility that some distribution already has that name reserved. 
Now that I have (a) the assurance that UCSPI is supported and (b) a place for UCSPI to get its groove on, we have the next problem: getting all of the services to play nice within this context. Do I write everything to depend on UCSPI sockets so that I get automatic blocking? Do I make it entirely the choice of the administrator to activate this feature via a switch that can be thrown? Or is it used for edge cases only? Getting consistency out of it would be great, but then I back the admin into a corner with "this is design policy and you get it, like it or not". If I go with admin-controlled, that means yet another code path in an already bloaty ./run.sh script that may or may not activate, and the admin has their day with it, but the number of potential problem vectors grows. Or I can hybridize it and do it for edge cases only, but now the admin is left scratching their head asking "why is it here, but not there? it's not consistent, what were they thinking?"

Personally, I would do the following:

* Create a socket directory in whatever passes for /var/run, and name it /var/run/ucspi-sockets.

* For each service definition that has active sockets, there would be /var/run/ucspi-sockets/{directory}, where {directory} is the name of the service, and inside of that is a socket file named /var/run/ucspi-sockets/{directory}/socket.

That is about as generic and safe as I can get, given that /var/run on Linux is a symlink that points to /run in some cases. It is consistent - the admin knows where to find the socket every single time, and is assured that the socket inside of the directory is the one that connects to a service. It is a reasonable name - the odds of /var/run/ucspi-sockets being taken for anything else are fairly low, and the odds of me stepping on top of some other construct in that directory are low as well, because any existing sub-directory in that location is probably there for the same reason. 
* Make socket activation an admin-controlled feature that is disabled by default. You want socket activation, you ask for it first. The admin gets control, I get more headache, and mostly everyone can be happy.

We've answered the where and the when, and now we are left with the how. I suspect that you and Laurent would argue that I shouldn't be using sockets inside of ./run as it is, that it should be in the layer above, in service management proper, meaning that the entire construct shouldn't exist at that level. Which means I shouldn't even support it inside of ./run. Which means I can't package this feature in my scripts. And we're back to square one.

Let's say I ignore this advice (at my own peril) and provide support for those frameworks that don't have external management layers on top of them. This was the entire reason I wrote my silly peer-level dependency support to begin with, so that other folks would have one or two of these features available to them, even though they don't have external management like nosh or s6-rc or anopa. It's a poor man's
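The proposed layout is easy to sketch. The helper below parameterizes the root directory so it can be tried anywhere (the proposal itself uses /var/run/ucspi-sockets), and the unixserver line in the comment is only one example of a UCSPI tool that could bind the socket; both the function name and the tool choice are illustrative.

```shell
# Sketch of the proposed namespace: one directory per service under a
# well-known root, each holding a single file named "socket".
SOCKET_ROOT=${SOCKET_ROOT:-/var/run/ucspi-sockets}

make_socket_dir() {
  svc=$1
  mkdir -p "$SOCKET_ROOT/$svc"
  # A UCSPI tool would then bind the socket here, e.g.:
  #   unixserver "$SOCKET_ROOT/$svc/socket" ./run-the-daemon
  echo "$SOCKET_ROOT/$svc/socket"
}
```

Because every service follows the same {root}/{service}/socket shape, the admin can always predict where a given service's socket lives.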
Re: dependant services
Buck Evan:

For example, I'd like to encode the fact that I don't expect service A to be able to come up before service B.

In nosh, the filesystem is the database. This is an ordering, not a dependency. One can separately encode in nosh (a) that the start of service B will cause the start of service A, and (b) that the start of service A has to be scheduled after the start of service B rather than in parallel. The latter is an ordering, and it's a symbolic link: either A/after/B pointing to B, or B/before/A pointing to A. The former is a symbolic link B/wants/A pointing to A.

This is a very brief précis. There's a detailed explanation of service bundles and all of the subdirectories in them in the manual. Start with "man system-control". Or even "man manual/system-control.1" if you haven't been brave enough to actually export the software into /usr/local. (-:

The most widespread general-purpose practice for breaking (i.e. avoiding) this kind of ordering is of course opening server sockets early. Client and server then don't need to be so strongly ordered.
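In concrete terms, the two symbolic links described above would be created like this (a sketch using a scratch directory standing in for wherever the service bundles actually live; consult the nosh manual for the real layout):

```shell
# Scratch area standing in for the service bundle directory.
bundles=$(mktemp -d)
mkdir -p "$bundles/A/after" "$bundles/B/wants"

# Ordering: the start of A is scheduled after the start of B.
ln -s ../../B "$bundles/A/after/B"

# Dependency: starting B also causes A to be started.
ln -s ../../A "$bundles/B/wants/A"
```

Because the filesystem is the database, adding or dropping a relationship is just creating or removing a symlink; no registry or compiled configuration is involved.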
Re: dependant services
On 4/21/2015 2:19 PM, Buck Evan wrote:

Does s6 (or friends) have first-class support for dependant services? I know that runit and daemontools do not.

I do know that nosh has direct support for this. I believe s6 supports it through various intermediary tools, i.e. using socket activation to bring services up, so you could say that while it supports it directly and provides a full guarantee, it's not first-class in the sense that you can simply provide a list of "bring these up first" and it will do it out of the box. The recently announced anopa init system fills in this gap and makes it first-class, in the sense that you can simply provide the names of definitions that need to start and everything else is handled for you.

Alternatively, are there general-purpose practices for breaking this kind of dependency?

Strange as it sounds, renaming the child definition of a dependency chain (which typically translates into the directory name of the definition) seems to be a regular issue. Changing the name of the definition typically causes various links to break, causing the parent service to be unable to locate its children by name at start-up.
Re: dependant services
On 4/21/2015 2:56 PM, Buck Evan wrote:

My understanding of s6 socket activation is that services should open and hold onto their listening socket when they're up, and s6 relies on the OS for swapping out inactive services. It's not socket activation in the usual sense. http://skarnet.org/software/s6/socket-activation.html

I apologize, I was a bit hasty and I think I need more sleep. I was confusing socket activation with some other s6 feature; perhaps I was confusing it with how s6-notifywhenup is used... http://skarnet.org/software/s6/s6-notifywhenup.html

So I wonder what the "full guarantee" provided by s6 that you mentioned looks like. It seems like in such a world all services would race, and the determinism of the race would depend on each service's implementation.

This I do understand, having gone through it with supervision-scripts. The basic problem is that a running service does not mean the service is ready; it only means it's up. Dependency handling with guarantee means there is some mechanism by which the child service itself signals "I'm fully up and running", as opposed to "I'm started but not ready". Because there is no polling going on, this allows the start-up of the parent daemon to sleep until it either is notified or times out. And you get a clean start-up of the parent, because the children have directly signaled that they are all ready.

Dependency handling without guarantee is what my project does as an optional feature - it brings up the child process and then calls the child's ./check script to see if everything is OK, which is polling the child (and wasting CPU cycles). This is fine for light use, because most child processes will start quickly and the parent won't time out while waiting.

There are trade-offs to using this feature. First, ./check scripts may have unintended bugs, behaviors, or issues that you can't see or resolve, unlike the child directly signalling that it is ready for use. Second, the polling approach adds to CPU overhead, making it less than ideal for mobile computing - it will draw more power over time. Third, there are edge cases where it can make a bad situation worse - picture a heavily loaded system that takes 20+ minutes to start a child process, with the result that the parent spawn-loops repeatedly, which just adds even more load. Those are just the three I can think of off the top of my head - I'm sure there are more. It's also why it's not enabled by default.
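The "without guarantee" polling approach can be sketched as a loop like the one below. The function name, interval, and timeout are illustrative; a real version would live in the parent's run script and use the framework's own conventions.

```shell
# Poll a child's ./check script until it reports ready or we give up.
# Returns 0 on readiness, 1 on timeout. Every iteration burns a fork
# and a sleep, which is the CPU overhead discussed above.
wait_for_child() {
  child_dir=$1
  tries=${2:-30}    # illustrative default: roughly 30 seconds
  while [ "$tries" -gt 0 ]; do
    if "$child_dir/check"; then
      return 0      # the child claims it is ready
    fi
    sleep 1
    tries=$((tries - 1))
  done
  echo "child $child_dir never became ready" >&2
  return 1
}
```

Contrast this with the "with guarantee" approach, where the child writes a readiness notification and the parent simply blocks on it, with no polling at all.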
Re: dependant services
On 4/21/2015 3:08 PM, Buck Evan wrote:

On Tue, Apr 21, 2015 at 2:46 PM, Avery Payne <avery.p.pa...@gmail.com> wrote:

Alternatively, are there general-purpose practices for breaking this kind of dependency?

Strange as it sounds, renaming the child definition of a dependency chain (which typically translates into the directory name of the definition) seems to be a regular issue. Changing the name of the definition typically causes various links to break, causing the parent service to be unable to locate its children by name at start-up.

Ah, I just realized you misunderstood me. You understood "breaking dependencies" to mean "the dependent system no longer works", where what I meant was "the dependency is no longer relevant".

With regard to practice or policy, I can only speak to my own project. I try to stick with "minimum feasible assumption" when designing things. In the case of the run script handling dependencies, it only assumes that the child failed for reasons known only to the child, and therefore the parent will abort out and eventually spawn-loop. Prior to exiting the script, a message is left for the systems administrator about which child failed, so that they can at least see why the parent refused to start.

Beyond that, I try not to assume too much. If the dependency is no longer relevant, then that is a small issue - the ./needs directory holds the names of all the child processes that are needed, and if the child will fail because it's broken / moved / uninstalled / picked up its marbles and went home, then the parent will simply continue to fail to start, until the child's name is removed from the ./needs directory. Again, you'll see a recorded message in the parent log about the child causing the failure, but not much more than that. It can be easily fixed by simply removing the child's symlink in ./needs, which will cause the parent to forget about the child. 
This possibility should be documented somewhere in the project, and I know I haven't done so yet. Thanks for bringing it up, I'll try to get to it soon.
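The behavior described above can be sketched as a small walk over ./needs. The function name and messages here are hypothetical, but the logic - a dangling symlink means the child is gone, and removing the link makes the parent forget about the child - is the behavior being described:

```shell
# Walk a ./needs directory. Each entry is a symlink named for a required
# child, pointing at that child's definition directory. Return 1 (so the
# parent's run script can abort and spawn-loop) if any child is missing.
check_needs() {
  needs_dir=$1
  for link in "$needs_dir"/*; do
    # An empty ./needs leaves the glob unexpanded; nothing is required.
    [ -e "$link" ] || [ -L "$link" ] || continue
    if [ ! -e "$link" ]; then
      # Dangling symlink: the child was renamed, moved, or uninstalled.
      # The parent keeps refusing to start until the link is removed.
      echo "needed service '$(basename "$link")' is missing" >&2
      return 1
    fi
  done
  return 0
}
```

Removing the stale symlink is the whole repair: on the next start attempt the loop no longer sees the dead child, and the parent comes up.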