Re: runit-scripts gone, supervision-scripts progress

2015-01-03 Thread Laurent Bercot

On 03/01/2015 01:13, Avery Payne wrote:

> I'm thinking "spawn to background and exit just after that".


 That solves the problem I mentioned, but creates other ones.

 If ./finish is about cleanup, and you background it, then ./run
may start again before the cleanup has completed, so there will
be competition for resources, and race conditions.
 If the service starts again, then dies again, you will have two
concurrent ./finish processes. More race conditions, unless they
are reentrant, which is a heavy constraint on a finish script.
 If the service is in a failure loop and dies faster than
cleanup completes, you will have an accumulation of ./finish
processes, which will end up eating a lot of resources. You do
not want to risk cascading failure.

 All in all, I think it's safer not to background ./finish, and
just make sure it doesn't block.
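To make the "don't background it, just keep it from blocking" advice concrete, here is a minimal sketch of a ./finish that stays in the foreground and only does quick, repeatable work. The file names and the scratch directory are hypothetical, and this is only one way to write such a script.

```shell
#!/bin/sh
# Sketch of a foreground ./finish: no "&", no waiting on anything external.
# All paths are hypothetical stand-ins for a real service's runtime files.

cleanup() {
    rundir=$1
    # rm -f never prompts and does not fail on missing files, so the
    # step is quick and safe to repeat if the service dies again.
    rm -f "$rundir/foobar.pid" "$rundir/foobar.sock"
}

# Demonstration against a scratch directory standing in for /run/foobar.
demo_dir=$(mktemp -d)
touch "$demo_dir/foobar.pid"
cleanup "$demo_dir"
cleanup "$demo_dir"   # a second run is harmless
```

Because the script only returns once cleanup is done, the supervisor cannot start ./run against half-cleaned state, and two cleanups never overlap.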



> Right now I'm having an internal dialog about whether I should have an
> environment variable that "hints" the framework to the scripts, which in
> turn would allow me to support framework-specific features.  I like the
> idea but I'm concerned that it will be unmaintainable without templates.


 Welcome to the wonderful world of integration!
 As you must have guessed by now, DJB, Wayne, Gerrit and I regularly
meet in secret to find new ways to make you pull your hair out, and
we've decided that we really went too easy on you in 2014, so expect
more work in 2015. ;)

--
 Laurent



Re: runit-scripts gone, supervision-scripts progress

2015-01-03 Thread Laurent Bercot

 Hi James,



> I have had a thought, why not include symlinkable functionality for
> halt, poweroff, shutdown, and reboot directly in s6-svscanctl


 s6-svscan can be used as a normal process, not only as process 1,
and there can be more than one scan directory on the system.
Calling s6-svscanctl "shutdown" would only be valid when applied
to the main scan directory: this implies embedding some policy
into the software. s6 is not the place to do that. However, the
s6-linux-init package I'm working on will provide "shutdown"
compatibility binaries.



> and move s6-pause into s6 itself to simplify the packages


 s6-pause is a hack to have a live process that does nothing. I'm
using it as a test tool and as a placeholder for real run scripts.
But in a real installation, it should not be needed. If you need
s6-pause in a real configuration, you are probably using a
process supervision framework to implement supervision of services
that do not need a long-lived process, and this is ugly. I'm
working on finding a better design.

 Think of s6-pause as a long "sleep" process. You wouldn't want
that in your scripts, would you? And if you *really* need that
functionality, use "sleep 2147483647". I guarantee your system will
reboot before that sleep exits. :P



> Anyways, I'll be posting more frequently about getting
> init-stage-1/2/3 drafted correctly and in execline script language.


 Don't let execline steal your focus. If you want to write scripts
and are more comfortable with sh, write in sh and get something
running. Converting scripts to execline is always possible later.
The only place where execline is important to have is the early
logging pipe in stage 1, and even there you only need redirfd.

 I like writing in execline because it makes chain loading a lot
easier than sh does, it's more predictable, on embedded systems the
resource savings are not negligible, and most important, I've
grown accustomed to it and can now speak it as fluently as sh.
But for early development, if you are more familiar with sh, by
all means use what you are familiar with and focus on the job, not
on the tool.

--
 Laurent


Re: runit-scripts gone, supervision-scripts progress

2015-01-02 Thread Avery Payne
>
>  One way or the other, ./finish should only be used sparingly, for clean-up
> duties that absolutely need to happen when the long-lived process has died:
> removing stale or temporary files, for instance. Those should be brief
> operations and absolutely cannot block.
>

I'm thinking "spawn to background and exit just after that".


>  So, if you're implementing reporting in ./finish, make sure you are using
> fast, non-blocking commands that just fail (possibly logging an error
> message) if they have trouble doing their job.
>
>  The way I would implement reporting wouldn't be based on ./finish, but on
> an external set of processes listening to down/up/ready notifications in
> /service/foobar/event. It would only work with s6, though.


Unfortunately I don't have a firm plan for supporting framework
enhancements just yet.  Although every little note and suggestion you give
will certainly be remembered, and when the time comes, I'll see what I can
do to incorporate them.

Right now I'm having an internal dialog about whether I should have an
environment variable that "hints" the framework to the scripts, which in
turn would allow me to support framework-specific features.  I like the
idea but I'm concerned that it will be unmaintainable without templates.
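
A rough sketch of what such a hint could look like, assuming a variable named SUPERVISION_FRAMEWORK (a made-up name, not something the project defines) and using the envdir-style tool of each framework as the framework-specific feature. Defaulting to daemontools when the variable is unset is also just an assumption for the example.

```shell
#!/bin/sh
# Sketch only: SUPERVISION_FRAMEWORK is a hypothetical variable name.

# Print the envdir-style command for the hinted framework.
envdir_tool() {
    case "${SUPERVISION_FRAMEWORK:-daemontools}" in
        runit)       echo "chpst -e" ;;
        s6)          echo "s6-envdir" ;;
        daemontools) echo "envdir" ;;
        *)           echo "unknown framework: $SUPERVISION_FRAMEWORK" >&2
                     return 1 ;;
    esac
}

# A ./run script could then do something like this
# (foobar-daemon is a placeholder):
#   exec $(envdir_tool) ./env foobar-daemon
SUPERVISION_FRAMEWORK=runit
envdir_tool
```

Every framework-specific feature would need its own case table like this one, which is where the maintainability concern comes from.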




Re: runit-scripts gone, supervision-scripts progress

2015-01-02 Thread Avery Payne
On Fri, Jan 2, 2015 at 3:42 PM, James Powell wrote:

>
> Anyways, I'll be posting more frequently about getting init-stage-1/2/3
> drafted correctly and in execline script language. Avery maybe you can
> share your notes as well on this with me, if possible.
>

I'll provide what little I know.  There's a lot of ground to cover.


RE: runit-scripts gone, supervision-scripts progress

2015-01-02 Thread James Powell
Hey Laurent,

Over at LQ, I'm working on importing s6 into LFS again, but this time at a 
slower pace. I was hoping to also see about using the native LFS utilities as 
much as possible and only include the init-shim tools (halt, shutdown, pause, 
and runlevel scripts and binaries) from Runit-For-LFS for low level system 
management if possible to avoid using more extras.

I have had a thought: why not include symlinkable functionality for halt,
poweroff, shutdown, and reboot directly in s6-svscanctl and move s6-pause into
s6 itself to simplify the packages? (You could even have a configure trigger,
--with-s6-pause, to enable or disable it during build.) Just a suggestion, but no
biggie.

Anyways, I'll be posting more frequently about getting init-stage-1/2/3 drafted 
correctly and in execline script language. Avery maybe you can share your notes 
as well on this with me, if possible.

Thanks,
Jim



Re: runit-scripts gone, supervision-scripts progress

2015-01-02 Thread Laurent Bercot


 Hi Avery,
 Happy new year to you!

 Congratulations on the achievements so far, even if they're not reaching
the bar you set for yourself.

 Just a little note:


> + The ./finish concept needs development and refinement.
>
> + Need to incorporate some kind of alerting or reporting mechanism into
> ./finish, so that the sysadmin receives notifications


 ./finish is a delicate beast. It is not only run when the admin brings
the service down, which is fine, but also when the service stops in an
untimely fashion; and the service cannot start again as long as ./finish
is running. So, if anything time-consuming, or worse, blocking, happens
in ./finish, the service can be totally hosed.
 Services should do all their necessary work in ./run, before executing
into the long-lived process: when they are in ./run, it's a known and
manageable state, they are up, even if they are not ready yet. But in
./finish, it's kind of a limbo state that shouldn't be drawn out. The
service is down, but it's still doing something, can't be brought up
right now, etc. Having a service stuck in "finish" state is about as
infuriating as having a process stuck in "D" state on Linux.

 s6-supervise has a built-in protection against misbehaving ./finish
scripts: if ./finish is still around after 5 seconds, it kills it.
(With a SIGKILL. When a service is down is not the time to be polite.)
AFAICT, runsv does not have such a protection, which makes it even more
important to pay attention when writing ./finish scripts.
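
For frameworks without that protection, a ./finish script can impose a deadline on its own work. The following is a minimal sketch in plain sh; run_with_deadline is a hypothetical helper name, and this is not how s6-supervise implements its timeout, only an approximation of the same idea from inside the script.

```shell
#!/bin/sh
# run_with_deadline SECONDS COMMAND [ARGS...]
# Runs COMMAND, and sends it SIGKILL if it is still alive after SECONDS.
# Hypothetical helper; note the small window in which the PID could be
# reused between the job exiting and the watchdog firing.
run_with_deadline() {
    deadline=$1
    shift
    "$@" &
    job=$!
    # Watchdog: detached from our stdio so it never holds a pipe open.
    ( sleep "$deadline" && kill -KILL "$job" ) </dev/null >/dev/null 2>&1 &
    watchdog=$!
    status=0
    wait "$job" || status=$?
    # The job is done (or dead): cancel the watchdog. Its sleep may
    # linger harmlessly until the deadline passes.
    kill "$watchdog" 2>/dev/null || true
    wait "$watchdog" 2>/dev/null || true
    return "$status"
}

# Demonstration: a 30-second "cleanup" is cut off after 1 second.
run_with_deadline 1 sleep 30 || echo "cleanup overran its deadline (status $?)"
```

In a real ./finish the call would look like `run_with_deadline 5 ./cleanup-step`, where ./cleanup-step is a placeholder. The script still waits for the cleanup, so the service is never restarted mid-cleanup, but the wait is bounded.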

 One way or the other, ./finish should only be used sparingly, for clean-up
duties that absolutely need to happen when the long-lived process has died:
removing stale or temporary files, for instance. Those should be brief
operations and absolutely cannot block.
 So, if you're implementing reporting in ./finish, make sure you are using
fast, non-blocking commands that just fail (possibly logging an error
message) if they have trouble doing their job.
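
As an illustration of "just fail" reporting, here is a small sketch that appends one line to a local file and gives up immediately if that is not possible. The log path and the report_exit name are hypothetical, and it assumes the supervisor hands the exit information to ./finish as arguments.

```shell
#!/bin/sh
# Sketch of fail-fast reporting from ./finish. Names and paths are
# hypothetical.

report_exit() {
    logfile=$1
    code=$2
    detail=$3
    # One short local append; if it cannot be done, complain and move on.
    if ! { printf 'service exited: code=%s detail=%s\n' "$code" "$detail" >>"$logfile"; } 2>/dev/null
    then
        echo "finish: cannot write to $logfile, skipping report" >&2
    fi
    return 0   # reporting trouble must never hold up ./finish
}

# In a real ./finish this would be roughly:
#   report_exit /var/log/foobar-exits.log "$1" "$2"
demo_log=$(mktemp)
report_exit "$demo_log" 111 0
report_exit /nonexistent-dir/exits.log 111 0   # fails quietly, returns 0
```

Anything slower than a local append (mail, network calls) is better left to a separate process that reads this file.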

 The way I would implement reporting wouldn't be based on ./finish, but on
an external set of processes listening to down/up/ready notifications in
/service/foobar/event. It would only work with s6, though.

--
 Laurent



runit-scripts gone, supervision-scripts progress

2015-01-02 Thread Avery Payne
Happy belated New Year!

As discussed elsewhere, the runit-scripts repository has been removed.  A
link has been left that redirects to the supervision-scripts project.  The
new project should be a 100% compatible replacement.

I did not achieve my personal goal of a 0.1 release by January 1.  I feel
badly about this, but it has been a hectic holiday for my family.  Currently,
the project is short about 50 definitions needed for the release, which
would put it at 10% coverage, or about 120 definitions.  Here's what
little has been done so far:

Done:
- - - - - - - -
+ getty support is via a template, and supports 3 different types

+ socklog is now via a template for its three different modes

+ user-controlled services are now via a template, in pure shell script for
all three frameworks (although it's not fully tested)

+ Incorporate pgrphack, envdir, and setuidgid regardless of framework used

+ system-wide environment PATH in .env

+ Migrate environment variables off of the ./options shell file and onto
envdir for service-specific settings

+ Retired run-simple completely in favor of run-envdir, making it possible
to have non-shell ./run launchers

+ Removed the dependency of the directory name matching the program

+ Service definition directories can now be named arbitrarily vs. the
actual name of the daemon, meaning it may be possible to support runit's
SysV shim mode again!
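
For readers unfamiliar with the envdir convention mentioned in the list above: each file in a directory names one environment variable, and its first line is the value. The sketch below is a pure-shell approximation for illustration only; real ./run scripts would use the actual tools (envdir, chpst -e, or s6-envdir), and load_envdir is a hypothetical name. It skips details such as trimming trailing whitespace.

```shell
#!/bin/sh
# Pure-shell approximation of what an envdir tool does. Illustration only.
load_envdir() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        # Skip file names that are not valid shell variable names.
        case $name in
            ''|[0-9]*|*[!A-Za-z0-9_]*) continue ;;
        esac
        if [ -s "$f" ]; then
            value=
            IFS= read -r value <"$f" || true   # first line only
            export "$name=$value"
        else
            unset "$name"                      # empty file: remove the variable
        fi
    done
}

# Demonstration with a scratch directory standing in for ./env.
demo_env=$(mktemp -d)
echo 8080 >"$demo_env/PORT"
: >"$demo_env/DEBUG"
DEBUG=1
load_envdir "$demo_env"
echo "PORT=$PORT DEBUG=${DEBUG-unset}"
```

Since the settings live in plain files rather than a sourced shell fragment, a launcher written in any language can read them, which is what makes non-shell ./run launchers possible.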


In Progress:
- - - - - - - -
+ hunt down the last vestiges of any runit-specific scripting, and replace
it with generic framework scripting for all three frameworks

+ hunt down ./run scripts in the wild, gather them, and give the authors
attribution.  Goal: accelerate development

+ Re-organize the definition creation sequence around Debian's pop-con
data, with the most common services being written first.  Goal: increase
the project's usefulness by making common things accessible

+ Experimental service dependencies in 100% shell script.  Goal: No
compiling required upon install!

+ Experimental one-shot definitions that don't need a pause(1) or a
signal.  Goal: No PIDs or sleep(y) programs

+ Reach that 0.1 release!!!


To-Do / Experimental:
- - - - - - - -
+ The ./finish concept needs development and refinement.

+ Need to incorporate some kind of alerting or reporting mechanism into
./finish, so that the sysadmin receives notifications

+ service definition names may be changed in the future to better support
SysV shimming, but this is not a definite plan, and may be cancelled.

+ replace the user-controlled service template with an active program that
seeks out service directories and starts them up as needed; there is a
Github project to this effect, but I have not been able to contact the
author.

+ Look at re-writing the project in execline(!), although several features
may stop working

+ Refine logging to support all three frameworks.  Currently it assumes
that (service)/log/run is sane, when in fact it's just a pointer to
something else.

+ Refine the logging mechanism closer to Laurent's "logging chain" concept,
if possible for all three

+ Not everything needs per-service logging.  At the moment, all service
definitions receive this, regardless of whether it is needed.  This "blanket
logging" ensures nothing is lost but it's inefficient.  I plan on
backtracking through in the future and cleaning this up as part of the
logging re-structure.