Re: runit-scripts gone, supervision-scripts progress
On 03/01/2015 01:13, Avery Payne wrote:
> I'm thinking "spawn to background and exit just after that".

That solves the problem I mentioned, but creates other ones.

If ./finish is about cleanup, and you background it, then ./run may start again before the cleanup has completed, so there will be competition for resources, and race conditions.

If the service starts again, then dies again, you will have two concurrent ./finish processes. More race conditions, unless they are reentrant, which is a heavy constraint on a finish script.

If the service is in a failure loop and dies faster than cleanup completes, you will have an accumulation of ./finish processes, which will end up eating a lot of resources. You do not want to risk cascading failure.

All in all, I think it's safer not to background ./finish, and just make sure it doesn't block.

> Right now I'm having an internal dialog about whether I should have an environment variable that "hints" the framework to the scripts, which in turn would allow me to support framework-specific features. I like the idea but I'm concerned that it will be unmaintainable without templates.

Welcome to the wonderful world of integration! As you must have guessed by now, DJB, Wayne, Gerrit and I regularly meet in secret to find new ways to make you pull your hair out, and we've decided that we really went too easy on you in 2014, so expect more work in 2015. ;)

--
Laurent
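To make the "foreground, but brief" advice concrete, here is a minimal sketch of what a ./finish body could look like. The function name cleanup_stale, the file names, and the state directory are all hypothetical and chosen only for illustration; the point is that every command either succeeds immediately or fails immediately, and nothing is backgrounded.

```shell
#!/bin/sh
# Hypothetical ./finish body: short, foreground-only cleanup.
# cleanup_stale and the file names below are illustrative assumptions.
cleanup_stale() {
    state_dir=$1
    # rm -f never prompts and does not complain about missing files,
    # so this returns quickly whether or not anything is left behind.
    rm -f "$state_dir/service.pid" "$state_dir"/*.tmp
}

# A real ./finish would end with something like:
#   cleanup_stale /path/to/the/service/state
```

Because nothing is spawned in the background, a restarted ./run cannot overlap with this cleanup, and a failure loop cannot pile up ./finish processes.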
Re: runit-scripts gone, supervision-scripts progress
Hi James,

> I have had a thought, why not include symlinkable functionality for halt, poweroff, shutdown, and reboot directly in s6-svscanctl

s6-svscan can be used as a normal process, not only as process 1, and there can be more than one scan directory on the system. Calling s6-svscanctl "shutdown" would only be valid when applied to the main scan directory: this implies embedding some policy into the software. s6 is not the place to do that. However, the s6-linux-init package I'm working on will provide "shutdown" compatibility binaries.

> and move s6-pause into s6 itself to simplify the packages

s6-pause is a hack to have a live process that does nothing. I'm using it as a test tool and as a placeholder for real run scripts. But in a real installation, it should not be needed. If you need s6-pause in a real configuration, you are probably using a process supervision framework to implement supervision of services that do not need a long-lived process, and this is ugly. I'm working on finding a better design.

Think of s6-pause as a long "sleep" process. You wouldn't want that in your scripts, would you? And if you *really* need that functionality, use "sleep 2147483647". I guarantee your system will reboot before that sleep exits. :P

> Anyways, I'll be posting more frequently about getting init-stage-1/2/3 drafted correctly and in execline script language.

Don't let execline steal your focus. If you want to write scripts and are more comfortable with sh, write in sh and get something running. Converting scripts to execline is always possible later. The only place where execline is important to have is the early logging pipe in stage 1, and even there you only need redirfd.

I like writing in execline because it makes chain loading a lot easier than sh does, it's more predictable, on embedded systems the resource savings are not negligible, and most importantly, I've grown accustomed to it and can now speak it as fluently as sh.

But for early development, if you are more familiar with sh, by all means use what you are familiar with and focus on the job, not on the tool.

--
Laurent
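For readers who take the "write it in sh first" route, chain loading in plain sh comes down to setting up the process state and then using exec so the shell is replaced by the next program. The sketch below is only an illustration: run_service is a hypothetical helper name, and a real ./run would usually name its daemon directly instead of taking it as arguments.

```shell
#!/bin/sh
# Sketch of a chain-loading launcher in plain sh.
# run_service is a hypothetical helper, used here so the pattern is
# visible in isolation.
run_service() {
    # Point stderr at the same destination as stdout, so whatever reads
    # the service's output sees both streams.
    exec 2>&1
    # Replace the shell with the target program: no intermediate shell
    # stays between the supervisor and the daemon.
    exec "$@"
}

# A real ./run would end with something like:
#   run_service /path/to/daemon --foreground
```

Because the last step is exec, the supervisor's child is the daemon itself, so signals sent by the supervisor reach the daemon directly.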
Re: runit-scripts gone, supervision-scripts progress
> One way or the other, ./finish should only be used sparingly, for clean-up duties that absolutely need to happen when the long-lived process has died: removing stale or temporary files, for instance. Those should be brief operations and absolutely cannot block.

I'm thinking "spawn to background and exit just after that".

> So, if you're implementing reporting in ./finish, make sure you are using fast, non-blocking commands that just fail (possibly logging an error message) if they have trouble doing their job.
>
> The way I would implement reporting wouldn't be based on ./finish, but on an external set of processes listening to down/up/ready notifications in /service/foobar/event. It would only work with s6, though.

Unfortunately I don't have a firm plan for supporting framework enhancements just yet. Still, every note and suggestion you give will be remembered, and when the time comes, I'll see what I can do to incorporate them.

Right now I'm having an internal dialog about whether I should have an environment variable that "hints" the framework to the scripts, which in turn would allow me to support framework-specific features. I like the idea but I'm concerned that it will be unmaintainable without templates.
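As a rough illustration of the "hint" idea, a script could branch on a single variable to pick the framework's own privilege-dropping tool. The variable name SUPERVISION_FRAMEWORK, the function name, and the choice of daemontools as the fallback are assumptions made for this sketch, not an existing convention; the three tools themselves (setuidgid, chpst -u, s6-setuidgid) are the ones shipped by daemontools, runit and s6.

```shell
#!/bin/sh
# Hypothetical framework hint: SUPERVISION_FRAMEWORK is an invented name.
# Prints the command prefix that drops privileges under each framework.
setuidgid_tool() {
    case "${SUPERVISION_FRAMEWORK:-daemontools}" in
        daemontools) echo "setuidgid" ;;
        runit)       echo "chpst -u" ;;
        s6)          echo "s6-setuidgid" ;;
        *)
            echo "unknown framework: $SUPERVISION_FRAMEWORK" >&2
            return 1
            ;;
    esac
}

# A ./run script could then end with something like:
#   exec $(setuidgid_tool) someuser /path/to/daemon
```

Even in this small form the maintainability concern is visible: every framework-specific feature needs its own case block, which is where templates would help.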
Re: runit-scripts gone, supervision-scripts progress
On Fri, Jan 2, 2015 at 3:42 PM, James Powell wrote:
> Anyways, I'll be posting more frequently about getting init-stage-1/2/3 drafted correctly and in execline script language. Avery maybe you can share your notes as well on this with me, if possible.

I'll provide what little I know. There's a lot of ground to cover.
RE: runit-scripts gone, supervision-scripts progress
Hey Laurent,

Over at LQ, I'm working on importing s6 into LFS again, but this time at a slower pace. I was hoping to also see about using the native LFS utilities as much as possible and only include the init-shim tools (halt, shutdown, pause, and runlevel scripts and binaries) from Runit-For-LFS for low level system management if possible to avoid using more extras.

I have had a thought: why not include symlinkable functionality for halt, poweroff, shutdown, and reboot directly in s6-svscanctl, and move s6-pause into s6 itself to simplify the packages? (You could even have a configure trigger --with-s6-pause to enable or disable it during build.) Just a suggestion, but no biggie.

Anyways, I'll be posting more frequently about getting init-stage-1/2/3 drafted correctly and in execline script language. Avery maybe you can share your notes as well on this with me, if possible.

Thanks,
Jim

From: Laurent Bercot <ska-supervis...@skarnet.org>
Sent: 1/2/2015 4:59 AM
To: supervision@list.skarnet.org
Subject: Re: runit-scripts gone, supervision-scripts progress
Re: runit-scripts gone, supervision-scripts progress
Hi Avery,

Happy new year to you! Congratulations on the achievements so far, even if they're not reaching the bar you set for yourself.

Just a little note:

> + The ./finish concept needs development and refinement.
>
> + Need to incorporate some kind of alerting or reporting mechanism into ./finish, so that the sysadmin receives notifications

./finish is a delicate beast. It is not only run when the admin brings the service down, which is fine, but also when the service stops in an untimely fashion; and the service cannot start again as long as ./finish is running. So, if anything time-consuming, or worse, blocking, happens in ./finish, the service can be totally hosed.

Services should do all their necessary work in ./run, before executing into the long-lived process: when they are in ./run, it's a known and manageable state, they are up, even if they are not ready yet. But in ./finish, it's kind of a limbo state that shouldn't be drawn out. The service is down, but it's still doing something, can't be brought up right now, etc. Having a service stuck in "finish" state is about as infuriating as having a process stuck in "D" state on Linux.

s6-supervise has a built-in protection against misbehaving ./finish scripts: if ./finish is still around after 5 seconds, it kills it. (With a SIGKILL. When a service is down is not the time to be polite.) AFAICT, runsv does not have such a protection, which makes it even more important to pay attention when writing ./finish scripts.

One way or the other, ./finish should only be used sparingly, for clean-up duties that absolutely need to happen when the long-lived process has died: removing stale or temporary files, for instance. Those should be brief operations and absolutely cannot block.

So, if you're implementing reporting in ./finish, make sure you are using fast, non-blocking commands that just fail (possibly logging an error message) if they have trouble doing their job.

The way I would implement reporting wouldn't be based on ./finish, but on an external set of processes listening to down/up/ready notifications in /service/foobar/event. It would only work with s6, though.

--
Laurent
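Where the supervisor does not enforce a limit itself, a ./finish script can impose one on its own commands. The following is a sketch under stated assumptions, not how any supervisor implements it: run_with_deadline is a hypothetical helper written in plain sh, and a 5-second limit would simply mirror the s6-supervise behaviour described above.

```shell
#!/bin/sh
# Hypothetical helper: run a command, SIGKILL it if it outlives a deadline.
# Usage: run_with_deadline SECONDS command [args...]
run_with_deadline() {
    rwd_deadline=$1
    shift
    # Start the command in the background so it can be watched.
    "$@" &
    rwd_job=$!
    # Watchdog: after the deadline, kill the command without ceremony.
    ( sleep "$rwd_deadline"; kill -KILL "$rwd_job" 2>/dev/null ) \
        </dev/null >/dev/null 2>&1 &
    rwd_watchdog=$!
    # Collect the command's exit status (above 128 if it was killed).
    rwd_status=0
    wait "$rwd_job" || rwd_status=$?
    # The command is gone either way, so the watchdog is no longer needed.
    kill "$rwd_watchdog" 2>/dev/null || true
    wait "$rwd_watchdog" 2>/dev/null || true
    return "$rwd_status"
}

# In a ./finish script this could be used as, for example:
#   run_with_deadline 5 rm -f /path/to/stale/files
```

The caller still waits for the command, so ./finish stays in the foreground; the helper only bounds how long that wait can last. If GNU coreutils or BusyBox is available, their timeout utility is a ready-made alternative.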
runit-scripts gone, supervision-scripts progress
Happy belated New Year!

As discussed elsewhere, the runit-scripts repository has been removed. A link has been left that redirects to the supervision-scripts project. The new project should be a 100% compatible replacement.

I did not achieve my personal goal of a 0.1 release by January 1. I feel bad about this, but it has been a hectic holiday for my family. Currently, the project is about 50 definitions short of what the release needs; the release would put it at 10% coverage, or roughly 120 definitions.

Here's what little has been done so far:

Done:
- - - - - - - -
+ getty support is via a template, and supports 3 different types
+ socklog is now via a template for its three different modes
+ user-controlled services are now via a template, in pure shell script for all three frameworks (although it's not fully tested)
+ Incorporate pgrphack, envdir, and setuidgid regardless of framework used
+ system-wide environment PATH in .env
+ Migrate environment variables off of the ./options shell file and onto envdir for service-specific settings
+ Retired run-simple completely in favor of run-envdir, making it possible to have non-shell ./run launchers
+ Removed the dependency of the directory name matching the program
+ Service definition directories can now be named arbitrarily vs. the actual name of the daemon, meaning it may be possible to support runit's SysV shim mode again!

In Progress:
- - - - - - - -
+ hunt down the last vestiges of any runit-specific scripting, and replace it with generic framework scripting for all three frameworks
+ hunt down ./run scripts in the wild, gather them, and give the authors attribution. Goal: accelerate development
+ Re-organize the definition creation sequence around Debian's pop-con data, with the most common services being written first. Goal: increase the project's usefulness by making common things accessible
+ Experimental service dependencies in 100% shell script. Goal: No compiling required upon install!
+ Experimental one-shot definitions that don't need a pause(1) or a signal. Goal: No PIDs or sleep(y) programs
+ Reach that 0.1 release!!!

To-Do / Experimental:
- - - - - - - -
+ The ./finish concept needs development and refinement.
+ Need to incorporate some kind of alerting or reporting mechanism into ./finish, so that the sysadmin receives notifications
+ service definition names may be changed in the future to better support SysV shimming, but this is not a definite plan, and may be cancelled.
+ replace the user-controlled service template with an active program that seeks out service directories and starts them up as needed; there is a Github project to this effect, but I have not been able to contact the author.
+ Look at re-writing the project in execline(!), although several features may stop working
+ Refine logging to support all three frameworks. Currently it assumes that (service)/log/run is sane, when in fact it's just a pointer to something else.
+ Refine the logging mechanism closer to Laurent's "logging chain" concept, if possible for all three
+ Not everything needs per-service logging. At the moment, all service definitions receive this, regardless of whether it is needed. This "blanket logging" ensures nothing is lost but it's inefficient. I plan on backtracking through in the future and cleaning this up as part of the logging re-structure.
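For readers unfamiliar with the envdir layout mentioned in the "Done" list, the convention is one file per variable: the file name is the variable name, the first line is the value, and an empty file means the variable is removed. The sketch below approximates that behaviour in pure shell for illustration only. load_envdir is a hypothetical helper, it is not the project's actual code, and it skips details of the real envdir program such as trailing-whitespace trimming.

```shell
#!/bin/sh
# Hypothetical pure-shell approximation of the envdir convention.
# Usage: load_envdir DIRECTORY
load_envdir() {
    for le_file in "$1"/*; do
        [ -f "$le_file" ] || continue
        le_name=${le_file##*/}
        # Skip file names that are not valid shell variable names.
        case "$le_name" in
            ''|[0-9]*|*[!A-Za-z0-9_]*) continue ;;
        esac
        if [ -s "$le_file" ]; then
            # First line of the file becomes the value.
            IFS= read -r le_value < "$le_file" || true
            export "$le_name=$le_value"
        else
            # An empty file removes the variable from the environment.
            unset "$le_name"
        fi
    done
}

# A ./run script could then do something like:
#   load_envdir ./env
#   exec /path/to/daemon
```

Unlike sourcing an ./options shell file, the settings here are plain data, which is what makes non-shell ./run launchers possible.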