Hey Laurent,

Over at LQ, I'm working on importing s6 into LFS again, but this time at a slower pace. I'm also hoping to use the native LFS utilities as much as possible and, to avoid pulling in more extras, to include only the init-shim tools (halt, shutdown, pause, and the runlevel scripts and binaries) from Runit-For-LFS for low-level system management.

I have had a thought: why not include symlinkable functionality for halt, poweroff, shutdown, and reboot directly in s6-svscanctl, and move s6-pause into s6 itself to simplify the packages? (You could even have a configure option, --with-s6-pause, to enable or disable it during the build.) Just a suggestion, but no biggie.

Anyways, I'll be posting more frequently about getting init-stage-1/2/3 drafted correctly and in the execline scripting language. Avery, maybe you can share your notes on this with me as well, if possible.

Thanks,
Jim

________________________________
From: Laurent Bercot <[email protected]>
Sent: 1/2/2015 4:59 AM
To: [email protected]
Subject: Re: runit-scripts gone, supervision-scripts progress

Hi Avery,

Happy new year to you! Congratulations on the achievements so far, even if they're not reaching the bar you set for yourself.

Just a little note:

> + The ./finish concept needs development and refinement.
>
> + Need to incorporate some kind of alerting or reporting mechanism into
> ./finish, so that the sysadmin receives notifications

./finish is a delicate beast. It is run not only when the admin brings the service down, which is fine, but also when the service stops in an untimely fashion; and the service cannot start again as long as ./finish is running. So, if anything time-consuming, or worse, blocking, happens in ./finish, the service can be totally hosed.

Services should do all their necessary work in ./run, before executing into the long-lived process: when they are in ./run, it's a known and manageable state, they are up, even if they are not ready yet. But in ./finish, it's kind of a limbo state that shouldn't be drawn out. The service is down, but it's still doing something, can't be brought up right now, etc. Having a service stuck in "finish" state is about as infuriating as having a process stuck in "D" state on Linux.

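To make the ordering concrete, here is a minimal Python sketch of one supervision cycle as described above. It is only a toy model under stated assumptions, not how s6-supervise or runsv is actually implemented; `supervise_once` and the stand-in commands are hypothetical names chosen for illustration.

```python
import subprocess
import sys
import time


def supervise_once(run_cmd, finish_cmd):
    """Toy model of one cycle: run the service, then its finish step.

    Returns the number of seconds spent in the finish step, i.e. how long
    the service stayed down before a restart could even be attempted.
    """
    # "Up" state: the long-lived process is running.
    subprocess.run(run_cmd)
    died_at = time.monotonic()
    # Limbo state: the service is down, but it cannot be brought up again
    # until the finish step returns.
    subprocess.run(finish_cmd)
    return time.monotonic() - died_at


if __name__ == "__main__":
    # Hypothetical stand-ins: a service that exits at once, a slow finish.
    service = [sys.executable, "-c", "pass"]
    slow_finish = [sys.executable, "-c", "import time; time.sleep(1)"]
    delay = supervise_once(service, slow_finish)
    print(f"restart delayed by {delay:.1f}s because of the finish step")
```

Anything slow in the finish step shows up directly as downtime in this model, which is the reason for keeping the real work in ./run.
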
s6-supervise has a built-in protection against misbehaving ./finish scripts: if ./finish is still around after 5 seconds, it kills it. (With a SIGKILL. When a service is down is not the time to be polite.) AFAICT, runsv does not have such a protection, which makes it even more important to pay attention when writing ./finish scripts.

One way or the other, ./finish should only be used sparingly, for clean-up duties that absolutely need to happen when the long-lived process has died: removing stale or temporary files, for instance. Those should be brief operations and absolutely cannot block.

So, if you're implementing reporting in ./finish, make sure you are using fast, non-blocking commands that just fail (possibly logging an error message) if they have trouble doing their job.

The way I would implement reporting wouldn't be based on ./finish, but on an external set of processes listening to down/up/ready notifications in /service/foobar/event. It would only work with s6, though.

--
Laurent
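
The deadline behaviour described above (kill ./finish with SIGKILL if it is still around after 5 seconds) can be modelled with a short Python sketch. This is an illustration only and makes no claim about s6-supervise's real code; `run_finish` and `FINISH_TIMEOUT` are hypothetical names, and the 5-second value is simply the figure quoted in the message.

```python
import subprocess
import sys

FINISH_TIMEOUT = 5.0  # seconds; the grace period quoted in the message


def run_finish(finish_cmd, timeout=FINISH_TIMEOUT):
    """Run a finish command, killing it if it outlives the deadline.

    Returns True if the command ended on its own, False if it was killed.
    """
    proc = subprocess.Popen(finish_cmd)
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX: no polite SIGTERM first
        proc.wait()  # reap the child so it does not linger as a zombie
        return False


if __name__ == "__main__":
    # Hypothetical stand-in for a finish script that blocks.
    stuck = [sys.executable, "-c", "import time; time.sleep(60)"]
    print("finished cleanly:", run_finish(stuck, timeout=1.0))
```

A supervisor without such a deadline would simply wait on the blocked command, which is why a ./finish script under it has to be written with extra care.
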
