Re: runit SIGPWR support

2020-03-16 Thread Jeff
25.02.2020, 10:08, "Jonathan de Boyne Pollard" :
> Yes. First: This is a kernel virtual terminal thing not a console
> thing. Strictly speaking, it is doing it wrongly to access it through
> the console device, which is not necessarily a KVT. Second: The
> mechanism does not require an open file descriptor. It requires that
> the target process never terminate, because there's no symmetrical
> kernel API function to disable the mechanism once it has been enabled,
> but it does not require that that process have any particular file
> descriptors open.

how is all this handled on the BSDs ?

their inits do not provide configuration hooks like the "ctrlaltdel"
and "kbrequest" inittab stanzas. does that mean there exist no
BSD equivalents for the "secure attention key" and "keyboard request"
Linux events ? or are the corresponding responses just hardcoded into
their inits and cannot be altered by configuration ?

do the BSD kernels signal process #1 on the occurrence of certain events ?
do they even know about events triggered by user input on the console ?



Re: runit SIGPWR support

2020-03-16 Thread Jeff
25.02.2020, 10:08, "Jonathan de Boyne Pollard" :
>>   Of course, but once the fd is closed, /dev/console should not have
>>  any impact on the process, so would a kbrequest still reach it?
>
> Yes. First: This is a kernel virtual terminal thing not a console
> thing. Strictly speaking, it is doing it wrongly to access it through
> the console device, which is not necessarily a KVT. Second: The
> mechanism does not require an open file descriptor. It requires that
> the target process never terminate, because there's no symmetrical
> kernel API function to disable the mechanism once it has been enabled,
> but it does not require that that process have any particular file
> descriptors open.

so /dev/tty0 should be open(2)ed instead of /dev/console since it is
the tty master.

and again:
will the arrangement achieved by the ioctl(2) call survive exec chaining
into another binary (the stage-2 executable s6-svscan, in the case of s6) ?



Re: runit SIGPWR support

2020-03-16 Thread Jeff
24.02.2020, 23:25, "Laurent Bercot" :
> However, I was not aware that kbrequest needed a special ioctl call
> before it can be accepted, so thank you for that; I'll add the call
> to s6-l-i.

will the setting achieved by this very ioctl() call survive after
exec()ing into another binary (s6-svscan/stage 2) ?

the ioctl code has to be placed in s6-svscan if not.
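
for reference, a minimal sketch of the Linux call in question
(KDSIGACCEPT, as used by sysvinit for kbrequest; the choice of
/dev/tty0 and the missing error handling are my simplifications):

#include <fcntl.h>
#include <signal.h>
#include <sys/ioctl.h>
#include <linux/kd.h>
#include <unistd.h>

static void accept_kbrequest(void)
{
  /* open a kernel virtual terminal, not /dev/console */
  int fd = open("/dev/tty0", O_RDONLY | O_NOCTTY);
  if (fd < 0) return;
  /* ask the kernel to send us SIGWINCH on keyboard request;
   * the setting is per-process, so the fd can be closed again */
  ioctl(fd, KDSIGACCEPT, SIGWINCH);
  close(fd);
}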



Re: runit SIGPWR support

2020-02-24 Thread Jeff
24.02.2020, 11:23, "Laurent Bercot" :
>> SIGRTMIN+3 should also be caught and processed.

why only this one and not ALL of the real time signals ?

> What piece of software sends SIGRTMIN+3 to pid 1 when you're not
> running systemd?

in this case systemd compatibility can be trivially achieved,
so there is no real reason to abstain from it.

systemd uses real-time signals since they were introduced
for exactly this purpose:
signals without an already assigned default meaning,
free for applications to (ab)use. hence the systemd approach is
absolutely correct here.

support code for ALL of the RT signals on ALL platforms that
provide them can be added without much effort
(in a "POSIX-correct" way that is always so important to you):

https://man.voidlinux.org/signal.h

...

#include <signal.h>

...

#if defined (SIGRTMIN) && defined (SIGRTMAX)
  /* catch and handle them via a hook executable, say
   * "SIGRT signum",
   * called with the RT signal number as its first parameter
   */
#else
  /* probably OpenBSD */
#endif

...
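
a minimal sketch of what i mean (the hook name "SIGRT" and the
flag-based handoff to the main loop are my own assumptions):

#include <signal.h>

#if defined (SIGRTMIN) && defined (SIGRTMAX)
static volatile sig_atomic_t pending_rt = -1;

static void rt_handler(int sig)
{
  /* only record the number here; the main loop (e. g. via the
   * self-pipe trick) later spawns the hook "SIGRT <signum>" */
  pending_rt = sig - SIGRTMIN;
}

static void catch_rt_signals(void)
{
  struct sigaction sa = { 0 };
  int s;
  sa.sa_handler = rt_handler;
  sigemptyset(&sa.sa_mask);
  /* SIGRTMIN/SIGRTMAX need not be compile-time constants,
   * hence a runtime loop */
  for (s = SIGRTMIN; s <= SIGRTMAX; ++s)
    sigaction(s, &sa, 0);
}
#endif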



Re: runit SIGPWR support

2020-02-23 Thread Jeff
18.02.2020, 10:39, "Laurent Bercot" :
> you're telling me that s6-svscan needs to understand SIGPWR in case the
> kernel wants to signal a power failure, you actually have a good point,
> and yes, I should implement SIGPWR support when this signal exists.

BTW:

have you ever used s6 as process #1 on any platform other than Linux ?
i bet you have not even tried to do so on any of the BSDs.
so why are you sticking to all this "POSIX-correctness" ?
adding a few lines of code to support a specific platform (Linux or any
other unix) does not look like a big problem to me.
sticking to POSIX features in the default case is a good way to
achieve portability, that's right.
but avoiding platform-specific advantages at all costs seems
pretty strange to me.

Solaris, AIX and even OS X are all POSIX platforms, hence it
would be interesting to see whether s6 works out of the box there
(as process #1; handling SIGPWR may be a requirement here).
i am sure it will since, unlike systemd, it is portable.

if not, those platforms cannot be "POSIX-correct".
hence their kernels should be made "POSIX-correct" to run the
"POSIX-correct" s6 unchanged as process #1.

so who has to adapt, s6 or those kernels ?



Re: runit SIGPWR support

2020-02-23 Thread Jeff
18.02.2020, 10:39, "Laurent Bercot" :
> An additional reason is that signaling init is not a casual operation;
> instead it's part of a very limited API between the kernel and user
> space, to be used in very controlled, exhaustively listed, situations.

right.

> Now, *as a separate conversation*, you can say that s6-svscan should
> be able to handle every signal that the kernel can throw at it, no
> matter how unportable. And it is a reasonable request: there are good
> arguments for it.

indeed.

> In the latter case, the kernel takes precedence over init, the kernel
> decides what the API is and init must adapt. If the kernel says "when
> I get a power failure, I send you SIGPWR", init cannot say "uh, no,
> I wish you'd send SIGUSR2 instead". Shut up and handle SIGPWR.

right.

> In the former case, lxd *emulates* a kernel, and is supposed to adapt
> to every kind of init that runs in a container, so it should follow
> existing conventions and be able to adapt to every init. And that's
> exactly why the lxc.signal.stop configuration switch exists!

really ? a process #1 in a namespace is not the "real" process #1,
hence there is no requirement to use a "real" init program here.
instead it is required to react to all signals lxd may send if said process
#1 was spawned by it. of course things would be easier for everybody
if lxd followed existing conventions on the linux platform; i cannot
see why it does not use TERM, USR1/2 and so on instead to notify
the process #1 it started. but it has no obligation to do so.

i guess the only case where SIGPWR has a special meaning is when the
real kernel notifies the real process #1 of a power failure.
hence lxd is free to abuse this signal for its own purposes,
but this default choice does look quite strange indeed.


> systemd, always being a special snowflake, uses SIGRTMIN+3
> and SIGRTMIN+4, because any other choice made way too much sense.

why should it not use the RT sigs for this ? this is absolutely ok as linux
provides them anyway (unlike OpenBSD).

> None of them uses SIGPWR, and for a good reason: SIGPWR does not mean
> "the admin requested a system shutdown", it means "power failure". And
> it is very possible that the action implemented by the system in case
> of a power failure is very different from a shutdown: it could be a
> suspend-to-disk, for instance (which is faster than a full shutdown, and
> when the power fails you want to save your data *fast*). So, even for
> inits that actually understand SIGPWR - and most of them actually do -
> SIGPWR is a *terrible* default choice of signal to send as a shutdown
> request. It already has a use, and the use is not a normal shutdown.

right, agreed.

> Arguably, lxc.signal.halt should *always* be set to something else, be
> it SIGTERM, SIGUSR1, SIGUSR2, or even lolSIGRTMIN+3.

that would have been a more obvious choice indeed, but they decided
against it, and that is also ok since this is not the kernel.

> So, if you're asking me to implement SIGPWR support in s6 because that's
> what lxd sends by default to signal a container shutdown, I will laugh
> at you, because you are being, uh, "ridicolous".

not really, catch it and let the user handle it; that way s6-svscan could
be used as process #1 in an LXC process namespace without problems.

> On the other hand, if
> you're telling me that s6-svscan needs to understand SIGPWR in case the
> kernel wants to signal a power failure, you actually have a good point,
> and yes, I should implement SIGPWR support when this signal exists.

right, it should be caught anyway and the user should decide via a hook
executable what to do about it (checking whether power returns after a
while, and syncing and suspending to disk if not, naturally comes to
mind here).

s6 should also catch SIGWINCH (keyboard request) and let the user handle
it via a hook executable where the signal exists, btw. dunno if it
already does so.

you are absolutely right that one should not abuse SIGPWR to signal
poweroff to the "real" process #1 started by the kernel; there exist
enough other signals for that purpose.



Re: runit SIGPWR support

2020-02-23 Thread Jeff
> Most init systems allow the SIGPWR behavior to be configured.
> This includes Upstart, systemd, and my own "little init":
>
> https://gitlab.com/chinstrap/linit#configuration
>
> I provide a guide for using linit with runit here, but the process is
> experimental:
>
> https://gitlab.com/chinstrap/linit/-/blob/master/README.runit.md

linit does less than runit since it does not respawn any subprocess,
so this may not be a good choice for runit fans.

the only solution here is to patch runit's unmaintained code and
add support for SIGPWR (similar to the way it reacts to SIGINT
(secure attention key/ctrl-alt-del)) and SIGWINCH (keyboard request).
while you are at it you can also handle other signals (e. g. the
real-time signals, in a way compatible with systemd).

this is highly linux-specific of course but no real problem.

> Lastly I want to mention that lxc.signal.halt is not available with LXD.

does that mean SIGPWR cannot be replaced by another signal ?



Re: runit SIGPWR support

2020-02-17 Thread Jeff
17.02.2020, 11:00, "innerspacepilot" :
> Just as a thought: You have implemented signal diversion, but limited to
> known signals. Why not just pass unknown signals as numbers or something
> like (S6SIG55011), so they can be diverted by user? You wouldn't have to
> catalogue them.

absolutely right, totally agreed.
i also wondered why he refuses to add this.
just catch and handle ALL possible signals, including the RT signals
and leave it to the user how to react.

> We need good, flexible and user-friendly init alternatives for linux.

right.

>>  But even if your containers were using s6, which has a well-defined
>>  upstream (me) and which does not understand SIGPWR either, I would not
>>  apply your patch suggestion. Why? Because SIGPWR is not standardized,
>>  and s6 aims to be portable, it works ootb on other systems than Linux
>>  and making it use SIGPWR would endanger that. It's the exact kind of
>>  problems you haven't thought of but run into when you want to patch
>>  software, and makes patching always more complex than it seems from the
>>  outside.

sorry Laurent, this is absolutely ridiculous.

we are talking about using s6 as Linux process #1, so
it should catch, handle and react to all possible signals the
kernel may send to said process; there might be a good reason
for each of them. the same goes for any other platform, be it
the BSDs or the SysV unices.

this is inherently unportable per se. there exists no POSIX standard
describing the signals a kernel may send to notify process #1 about
certain events.



Re: runit SIGPWR support

2020-02-17 Thread Jeff
17.02.2020, 15:45, "Jeff" :
> what about SIGINT and SIGWINCH ? are they required by the POSIX
> standard ? if not why does runit handle both ?

oh no, i just saw that it "POSIX-correctly" ignores SIGWINCH ...
the BSD kernels do not send SIGWINCH to process #1, so (ab)using
it would violate the POSIX standard, right ?


Re: runit SIGPWR support

2020-02-17 Thread Jeff
12.02.2020, 22:54, "Colin Booth" :
> far as I know SIGPWR is a Linux-specific signal so services that are
> aiming for portability will either need to have special handling for
> that in the linux case or need to ignore it. Ergo, runit (and all other
> POSIX-compliant inits) currently have no special handling around SIGPWR
> as they don't understand what it is.

what about SIGINT and SIGWINCH ? are they required by the POSIX
standard ? if not why does runit handle both ?


Re: runit SIGPWR support

2020-02-17 Thread Jeff
14.02.2020, 13:29, "innerspacepilot" :
> I would suggest it should be a graceful shutdown ( stopping all daemons,
> syncing filesystems and stuff )

yes, of course, this should precede the powerdown step.

a more "correct" solution would be the approach taken by SysV init
via the "powerfail" stanza for the "real" process #1 (not those running
in containers/other process namespaces). it starts a subprocess to
handle the situation, i. e. check whether power returns and shut down
ASAP if not.
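
for illustration, inittab stanzas of the kind meant here (the exact
commands and grace period are made up):

pf::powerfail:/sbin/shutdown -h +2 "power failure, going down in 2 minutes"
po::powerokwait:/sbin/shutdown -c "power restored, shutdown cancelled"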

there is no excuse for a Linux process #1 to ignore SIGPWR anyway
since that signal is sent by the kernel in powerfail situations
(on Linux and the System V unices). it also makes sense to abuse it to
shut down a container, so i cannot understand why runit just ignores it.



Re: runit SIGPWR support

2020-02-14 Thread Jeff
12.02.2020, 22:54, "Colin Booth" :
> I wasn't trying to be hostile, apologies if it came across that way. As
> far as I know SIGPWR is a Linux-specific signal so services that are
> aiming for portability will either need to have special handling for
> that in the linux case or need to ignore it. Ergo, runit (and all other
> POSIX-compliant inits) currently have no special handling around SIGPWR
> as they don't understand what it is.

what should SIGPWR mean to a Linux init ?
i would suggest: halt and power down the system ASAP.



Re: runit SIGPWR support

2020-02-14 Thread Jeff
12.02.2020, 22:54, "Colin Booth" :
> I wasn't trying to be hostile, apologies if it came across that way. As
> far as I know SIGPWR is a Linux-specific signal so services that are

this is a SysV signal that is sent in case of power supply problems.
it has no special meaning per se and can be (ab)used for anything
the coder sees fit.

is the abuse of SIGTSTP in BSD init "POSIX-compliant" or even necessary ?
certainly not.

> aiming for portability will either need to have special handling for
> that in the linux case or need to ignore it. Ergo, runit (and all other
> POSIX-compliant inits) currently have no special handling around SIGPWR
> as they don't understand what it is.

what is a "POSIX-compliant init" btw ?

> Is this the right behavior? I don't know.

yes. what should power supply problems mean to a container ?

> Something like SIGPWR as an
> alerting mechanism when you're switched to UPS battery is pretty nice in
> a general case but using that as your container shutdown solution
> isolates you into a very SysV-specific world.

BS. Linux provides this signal so you can (ab)use it for anything
you wish.

> Overriding the default via
> lxc.signal.halt will allow you to modify what you send to something that
> is within the POSIX spec and allow you to trigger shutdowns the "right"
> way. It's a little lame but it is portable, and LXC using a non-portable
> signal is a little bit of a bummer.

why "fix" LXC when adding handler code for SIGPWR to runit is
no big deal:

#ifdef SIGPWR
  /* handle it analogously to SIGTERM, for example */
#endif

this is also very portable to systems that do not provide SIGPWR.
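
a slightly fuller sketch of the same idea (the flag-based handoff to
the main loop is my own assumption, not runit's actual structure):

#include <signal.h>

#ifdef SIGPWR
static volatile sig_atomic_t got_pwr = 0;

static void pwr_handler(int sig)
{
  (void) sig;
  got_pwr = 1;  /* the main loop then runs the shutdown/hook logic */
}

static void setup_pwr(void)
{
  struct sigaction sa = { 0 };
  sa.sa_handler = pwr_handler;
  sa.sa_flags = SA_RESTART;
  sigemptyset(&sa.sa_mask);
  sigaction(SIGPWR, &sa, 0);
}
#endif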

and the signal is not really non-portable: it is shared between Linux and
the SysV unices. it is a good idea to use it since it is rarely used for
anything else; another solution could be the usage of real-time signals.



Re: runit SIGPWR support

2020-02-14 Thread Jeff
12.02.2020, 22:54, "Colin Booth" :
> On Wed, Feb 12, 2020 at 05:25:56PM +0300, innerspacepilot wrote:
>>  Why not just make runit systems run inside containers out of the box?
>>  We are talking about one/two lines of code.

you should patch the code, runit is dead anyway.
try something along these lines in the source:

#ifdef SIGPWR
  /* handle that one */
  ...
#endif

i can't see the problem, you have to patch the runit sources to
fulfil your requirements since that project is dead and the code
is not maintained anymore.

>>  Why can't we be just a little bit more friendly to each other?

that would be indeed helpful.

> I wasn't trying to be hostile, apologies if it came across that way. As
> far as I know SIGPWR is a Linux-specific signal so services that are
> aiming for portability will either need to have special handling for
> that in the linux case or need to ignore it. Ergo, runit (and all other
> POSIX-compliant inits) currently have no special handling around SIGPWR
> as they don't understand what it is.
>
> Is this the right behavior? I don't know. Something like SIGPWR as an
> alerting mechanism when you're switched to UPS battery is pretty nice in
> a general case but using that as your container shutdown solution
> isolates you into a very SysV-specific world. Overriding the default via
> lxc.signal.halt will allow you to modify what you send to something that
> is within the POSIX spec and allow you to trigger shutdowns the "right"
> way. It's a little lame but it is portable, and LXC using a non-portable
> signal is a little bit of a bummer.

just BS. adding a bit of handler code for SIGPWR is no big deal,
so please stop your lament, it's so boring.



Re: s6 usability (was: runit patches to fix compiler warnings on RHEL 7)

2019-12-02 Thread Jeff
30.11.2019, 19:58, "Laurent Bercot" :
>> the solution here could be a simple symlink to the original s6 tool without
>> the prefix if you prefer (maybe even located in an other dir than /bin).
>
> That would be a decision for users, not software authors - else it would
> defeat the point of not invading the namespace. Daemontools is still
> around with names such as "svc".

sure, that was just an idea for Jan. he could just create a dir somewhere,
populate it with symlinks he prefers to the original s6 tools and put this
dir in front of the PATH when running s6, since it seems the utilities do
not care under what name they run.
 
>> using a single combined tool is more efficient since it avoids wasteful
>> further exec chaining steps, though.
>
> Sure, but if we're talking about UI, optimization at this level is a
> very
> moot point. A human choosing between "chpst" and "s6-applyuidgid" will
> *not* notice the extra millisecond taken by an execve() step. The
> primary focus should be usability.

i prefer short names like "chpst" (change process state ?) with multiple
command line options from a usability perspective. but the usage of single
tools with descriptive names is of course easier to read (not to write),
and hence to understand when they occur in a script, that's true.

> I am reluctant to make the ABI details public because I want the freedom
> to change them. If people start relying on internals, their stuff may
> break when updating, which would be bad.
> There are *some* details that I could document as official and stable,
> but I'd need to go through all of it and decide with precision what can
> be guaranteed and what cannot - and that's extra work, so it will have
> to wait.

ok. i was asking more about insights into the design of the whole s6-rc
toolset. are the up/down scripts run by a dedicated service from within
the supervision tree ? what exactly is the task of the "s6-rc-oneshot-run"
and "s6-rc-fdholder-filler" internal programs ? how is the startup
organized, how are "longruns" and "oneshots" intertwined ?

having to read the sources to get this information is somewhat inconvenient.
:-(



Re: The "Unix Philosophy 2020" document

2019-11-30 Thread Jeff
30.11.2019, 16:48, "Casper Ti. Vector" :
> See also this design:
> .

since scheme was mentioned in that article:
an init/supervisor written entirely in (Guile) Scheme already
exists (http://www.gnu.org/software/shepherd/):

The GNU Daemon Shepherd or GNU Shepherd,
formerly known as GNU dmd, is a service manager that looks
after the herd of system services. It provides a replacement for
the service-managing capabilities of SysV-init (or any other init)
with a both powerful and beautiful dependency-based system with
a convenient interface. It is intended for use on GNU/Hurd, but it is
supposed to work on every POSIX-like system where Guile is available.
In particular, it is used as PID 1 by GNU Guix.

this is also not entirely off topic here since it is used as process #1,
daemon supervisor and "service manager".



Re: The "Unix Philosophy 2020" document

2019-11-30 Thread Jeff
30.11.2019, 15:43, "Casper Ti. Vector" :
> "builtins" are supported by the `nosh' interpreter

a useful command interpreter should provide some builtins IMO.
this is much more efficient and avoids excessive exec chaining
(analogous to single combined utils for several tasks vs the
"one task, one tool" approach). there might be a very good reason
shells provide builtin "cd", "umask", "(u)limit" etc ...

i dunno if such builtins would be possible with execline, too.



Re: s6 usability (was: runit patches to fix compiler warnings on RHEL 7)

2019-11-30 Thread Jeff
30.11.2019, 11:15, "Laurent Bercot" :
> This is very interesting. I thought that having a s6- prefix was a *good*
> thing, because I valued clarity above everything, and especially above
> terseness. I understand the advantages of having commands named "sv" and
> "chpst", but I believed, in my naïveté, that it wasn't a good idea for a
> specialized package to tread on a large namespace; and to me the s6-
> prefix would help users recognize exactly the domain of the command
> they're using, and then they could abstract it away and focus on the
> real command name.

totally agreed, Laurent.

using a dedicated namespace prefix like "s6-" is a very good idea.
this avoids name clashes (i. e. overwriting on installation) with similar
utilities of other supervision suites and frees Laurent from the task
of coming up with proper AND unique command names. consider the
name clashes of several "init" programs, for example.

the solution here could be a simple symlink to the original s6 tool without
the prefix if you prefer (maybe even located in an other dir than /bin).

> The number of executables is a choice; I like to have more, smaller,
> executables than less, bigger ones. One function, one tool. It makes
> code easier to write; this is not really for historical reasons, it's a
> design choice. Personally, it's easier for me to remember several
> process state change command names than all the options to chpst.
> whenever I use chpst, I always need to check the doc; when I use
> something like softlimit or setuidgid, I may need to check the doc for
> specific options, but I always remember which command I want and its
> general syntax. So, I suppose it comes down to individual preference
> there.

using a single combined tool is more efficient since it avoids wasteful further
exec chaining steps, though.

> Would a generic "s6" command, that takes alternative syntax and rewrites
> itself into "internal" commands, help? It could emulate runit syntax,
> among other things.
>
> s6 runsv ... -> s6-supervise ...
> s6 sv ... -> s6-svc ...
> s6 chpst ... -> various s6-prefixed process state change commands
>
> My plan is for the future s6-frontend package to include such a
> one-stop-shop command that controls various aspects of an s6
> installation,
> but if this command can help with s6 adoption, I can work on it much
> earlier than the rest of the s6-frontend functionality.
>
> Or, if you have other ideas that could help with easier assimilation of
> the s6 commands, I'm very open to suggestions.

Busy/ToyBox style ?

> Would entirely removing s6's dependency on execline help clear that
> misunderstanding and help with s6 adoption? This could be made possible
> by:
> - duplicating el_semicolon() functionality in s6-ftrig-listen.c
> (it's not elegant, but clearing the dep may be worth it)
> - adding an alternative '?' processor directive to s6-log, that spawns
> the processor using /bin/sh instead of execlineb. (The '!' directive
> would still be there; processor invocations using '!' would just fail
> if execline is not installed.)

does not sound too bad, IMO, though i personally can live without it,
especially since the other suites also provide loggers (without any
execline deps of course); he could use daemontools encore's "multilog"
utility.

> s6-rc, however, absolutely cannot do without execline, since it uses
> autogenerated execline scripts.

could you document the way s6-rc works (i. e. its architecture) somewhere ?
or are users expected to read your C code to find out how it works
exactly ?

> But s6-rc is a different beast, that
> requires a lot more involvement than s6 anyway, and that isn't needed
> at all if we're just talking about runit-like functionality.

indeed.



Re: runit patches to fix compiler warnings on RHEL 7

2019-11-30 Thread Jeff


>>  chpst is a monster of a program and at least with runscripts written in
>>  execline it's generally easier to understand 3-4 process state
>>  manipulators than a pile of chpst options.
>
> this is more complicated to use, though.

it is even unnecessary, inefficient, and wasteful.

why exec-chain through several different utils when doing all the
requested process state changes in one go with the same
utility would suffice ?



Re: runit patches to fix compiler warnings on RHEL 7

2019-11-30 Thread Jeff
30.11.2019, 01:22, "Colin Booth" :
>>  2) runit has manpages. s6 has HTML. :(
> Yeah, this sucks. I know Laurent isn't going to do it but I've been
> meaning to take some time off and dedicate it to rewriting all of the
> documentation into something that can compile into both mandoc and html.

what about writing the docs in Perl's POD format or Markdown ?
it is easy to convert POD to html AND manpages (pod2(html,man))
and to deliver the generated docs in the source releases.
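
for example (file names are hypothetical):

pod2man s6-svscan.pod > s6-svscan.1
pod2html --infile=s6-svscan.pod --outfile=s6-svscan.html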

>>  3) s6 executables are somehow worse named than runit's. This may be
>> highly subjective, but I can recall and recognize the runit commands
>> far easier than the s6 ones. Possibly it's the "s6-" prefix getting
>> in the way of my brain pattern matching on visual appearance of glyph
>> sequences.
>> This point is exacerbated by #2 and the number of s6 executables.
>> Compare chpst with s6-envdir s6-envuidgid s6-fghack s6-setsid
>> s6-setuidgid s6-applyuidgid s6-softlimit. Yes, I know about the
>> historical reasons, but sti

totally agreed.

> chpst is a monster of a program and at least with runscripts written in
> execline it's generally easier to understand 3-4 process state
> manipulators than a pile of chpst options.

this is more complicated to use, though.
therefore i personally prefer perp's approach of providing both:
the single process state manipulators and their combination into one
single tool ("run..." vs "runtool").

>>  Brainstorming possible ways forward:
>>
>>  A) Gerrit Pape becomes more active in maintianing runit, at least
>> acknowledging patches posted here.
>>  B) Somebody else steps in as (co-)maintainer.
>>  C) We get a dumping ground (wiki or somesuch) for patches to allow
>> - contributors to publish their patches (after discussing them here)
>> - users to easily find and download patches they'd be interested in
>> - Gerrit Pape to review and apply patches at his leisure when he
>>   feels like making a new release.
>>  D) The maintainers of distros shipping runit work out a patch-sharing
>> scheme among them.

runit is dead, i recommend against using it at all. the only tools of
interest this supervision suite provides are "chpst" and "utmpset"
(though the latter is indeed not as powerful as it should be to make it
really useful).

besides waking up to poll for rundir changes, shutting it down really
sucks since it has problems closing the log files properly; i have not
seen this with any of daemontools encore, perp, and s6.

consider switching: daemontools encore and perp were not meant to run
as process #1, but they can be supervised by (Busy,Toy)Box or SysV init
easily.

daemontools encore's "svscan" utility also wakes up frequently to poll
the rundir for changes, though; unlike s6 and perp it does not provide
any option to do this only on request (e. g. by just listening for a
given signal).

so the final conclusion and recommendation here is:
stop using runit for supervision (and as process #1) and switch to s6.



Re: The "Unix Philosophy 2020" document

2019-11-30 Thread Jeff
> this sounds even more funny with regard to the posting's title
> "Unix Philosophy 2020(!!!)".

C++ is the perfect language to implement systemd in btw,
though even prof. dr. L. Poettering abstained from doing so.

will we see a C++ rewrite of our favourite "init framework"
(the "master of the cgroups") ?

or is it just too early since our beloved friend and every admin's
darling is still not "feature complete" (or even stable :-) after all ?

let's wait and see what pleasant surprises christmas eve may bring ...



Re: The "Unix Philosophy 2020" document

2019-11-30 Thread Jeff
> why C++ btw ?
>
> i don't see any benefit in not using C in the first place,
> since when does one write Unix system tools in C++ ?
>
> is it the added "advantage" of bloated binaries and additional
> lib dependencies ?

this sounds even more funny with regard to the posting's title
"Unix Philosophy 2020(!!!)".
:-(



Re: The "Unix Philosophy 2020" document

2019-11-30 Thread Jeff
>>  Macros and/or helper functions (again cf. [1]; they can be factored into
>>  a mini-library in nosh) can also be used to reduce boilerplate like
>>  > const int error(errno);
>>  > std::fprintf(stderr, ..., std::strerror(error));
>>  > throw EXIT_FAILURE;
>>  which can be easily observed after the attached patch is applied.
>
> The first chunk in the patch is incorrect, and the new code should be
> (in other words, swap the `ENOENT == error' and the `const int' lines)
>>  if (!self_cgroup) {
>>  if (ENOENT == error) return; // This is what we'll see on a BSD.
>>  const int error(errno);
>>  std::fprintf(stderr, "%s: FATAL: %s: %s\n", prog, "/proc/self/cgroup", std::strerror(error));
>>  throw EXIT_FAILURE;
  ^^ using an exception here looks like a REALLY brilliant idea to me
>>  }

why C++ btw ?

i don't see any benefit in not using C in the first place,
since when does one write Unix system tools in C++ ?

is it the added "advantage" of bloated binaries and additional
lib dependencies ?



Re: chpst -u and supplementary groups

2019-08-27 Thread Jeff
> Apparently everyone re-implementing daemontools does something like
> this. So that brings me back to my original question:
> is there consensus that the historical behaviour is a bug?

no, this is no bug.

> Or are there valid use cases?

most of the time one does not want the subprocess to run under
additional GIDs, so that is a sane default behaviour.

obviously there should be an option that makes "chpst" add all
supplementary GIDs the UID belongs to, though
(when this is desired by the user).

it would not be too much work to add such a command line option to it.
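
a minimal sketch of such an opt-in behaviour (function name and error
handling are mine; this is not chpst's actual code):

#include <grp.h>
#include <pwd.h>
#include <unistd.h>

static int drop_to(const char *user, int with_supplementary)
{
  struct passwd *pw = getpwnam(user);
  if (!pw) return -1;
  if (with_supplementary) {
    /* all groups the user is a member of, plus the primary GID */
    if (initgroups(pw->pw_name, pw->pw_gid) < 0) return -1;
  } else {
    /* historical daemontools behaviour: primary GID only */
    if (setgroups(1, &pw->pw_gid) < 0) return -1;
  }
  if (setgid(pw->pw_gid) < 0) return -1;
  return setuid(pw->pw_uid);
}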



combining s6 utils

2019-06-09 Thread Jeff


what about combining daemontools-like s6 utilities like
s6-(setuidgid,softlimit,setsid) into one tool akin to runit's
"chpst" ("change process state" ?) and perp's "runtool"
while also retaining the originals ?

this is redundant but shortens exec chains.
(such a tool could also provide ch(dir,root) functionality)



where is the right place to run s6-rc from ?

2019-05-19 Thread Jeff


i wonder whether s6-rc (or any other service manager or startup script)
should be started directly from within the supervision tree, as a
service (dir) in the scandir.

let us assume we created a service dir "rc" or so in the otherwise
unpopulated scandir (except for required loggers).
its run script starts s6-rc (or another service manager) and then
execs into "pause" or something similar that just pauses until killed.
that way it is possible to associate a log service with said "rc" service,
and when "pause" exits the service manager can be called from the "finish"
script to bring down the other services correctly
(dunno if this script's output will be logged).
in this case shutdown would just mean stopping the "rc" service;
a possible layout is sketched below.
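
a hypothetical layout (paths made up):

/run/service/rc/run      # starts s6-rc, then execs into a pause-like no-op
/run/service/rc/finish   # calls s6-rc to bring the services down again
/run/service/rc/log/run  # the logger associated with the "rc" service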

of course it would be more natural to make this very service run only
once via the "once" file. but i guess such "once" services cannot have
an associated logger, right ?

another hack would be to make this run-once service write its output to a
fifo that is read by another logging service; could this be s6-svscan's
own catch-all default logger ?

but when using such a catch-all logger to log s6-svscan's own output
there is no need for the "rc" service to have its own logger since
everything it outputs goes into this catch-all logger.

in case of a run-once service s6-rc would be called from s6-svscan's
signal handler scripts again to bring down services.

why not ? apparently a good idea ...



Re: emergency IPC with SysV message queues

2019-05-19 Thread Jeff


> What details need to be conveyed other than "stand up", "sit down",
> and "roll over" (boot, sigpwr, sigint)?

depends on what you plan to do. for a minimal init, handling SIGCHLD
(that is an interesting point indeed. is it really necessary ?
i still have to find out. it would be nice if one could run without it,
though.) and (on Linux) SIG(INT,WINCH) should be sufficient.
in the case of the latter 2 it would be enough to run an external
executable to announce their arrival and let the admin decide what
to do about them. maybe SIGPWR is of relevance too.
that suffices for init itself.

a supervisor needs more information, such as:
start this service, stop that one, disable another, restart one,
signal another one and so on, depends on what capabilities the
supervisor provides.

and this has to be encoded in such a protocol that uses 2 ipc
mechanisms: sysv ipc and a specific signal (SIGIO comes to mind)
to notify the daemon (maybe a third one: (abstract) sockets).

> Abstract namespace sockets have two shortcomings:
>
> * not portable beyond linux

true, but i would use them where available and standard unix sockets
elsewhere.

> * need to use access restrictions

don't you use credentials anyway ?
AFAIK all the BSDs and Solaris have them too.

> * shared across different mount namespaces;
>   one needs a new network namespace for different instances

so you need to care for those namespaces too. this can be an
advantage, though, since it is decoupled from mount namespaces.

i did not consider namespaces at all since i follow the systemd
development approach: works on my laptop, hence works everywhere. :-(

> I am considering dropping it for a socket in /run in my supervisor.

why not ? i would use standard unix sockets for everything with
PID > 1 too, but in the process #1 situation i guess they provide
an advantage.



Re: emergency IPC with SysV message queues

2019-05-19 Thread Jeff


On Thu, May 16, 2019 at 09:25:09PM +, Laurent Bercot wrote:
> Oh? And the other complaints haven't given you a clue?
> We are a friendly community, and that includes choosing to follow
> widely adopted threading conventions in order to make your readers
> comfortable, instead of breaking them because you happen not to like
> them. Please behave accordingly and don't be a jerk.

breaking long threads that went in a direction that has nothing to do
with the original thread topic is in no way unfriendly or offensive,
nor does that make me a "jerk".

> Okay, so your IPC mechanism isn't just message queues, it's a mix
> of two different channels: message queues *plus* signals.

well, no. the mechanism is SysV msg queues, and the protocol for
clients to use to communicate includes - among other things - notifying
the daemon (its PID is well known) by sending a signal to wake it up and
have it process the request input queue.
you do not use just fifos (the mechanism) either; there is also a protocol
involved that clients and server use.

> Signals for notification, message queues for data transmission. Yes,
> it can work, but it's more complex than it has to be, using two Unix
> facilities instead of one.

indeed, this is more complex than - say - just sockets. on the other
hand it does not involve any locking to protect against concurrent
access to the resource, as would be needed with a fifo.

and again: it is just an emergency backup solution; the preferred way
is (on Linux: abstract) unix sockets, of course. such complicated ipc is
not even necessary in my case, but for more complex and integrated
inits it is. that was why i suggested it: to make their ipc
independent of rw fs access.

and of course one can tell a reliable init by the way it does ipc.

> You basically need a small library for the client side. Meh.

right, the client has to know the protocol.
first try via the socket, then try to reach init via the msg queue.
for little things like shutdown requests signaling suffices.

> A fifo or a socket works as both a notification mechanism and a
> data transmission mechanism,

true, but the protocol used by requests has to be designed as well.
and in the case of fifos: they have to be guarded against concurrent
writing by clients via locking (which requires rw fs access).

> and it's as simple as it gets.

the code used for the msg queueing is not complicated either.

> Yes, they can... but on Linux, they are implemented via a virtual
> filesystem, mqueue. And your goal, in using message queues, was to
> avoid having to mount a read-write filesystem to perform IPC with
> process 1 - so that eliminates them from contention, since mounting
> a mqueue is just as heavy a requirement as mounting a tmpfs.

indeed, they usually live in /dev/mqueue while posix shared memory
lives in /dev/shm.

that was the reason i did not mention them in the first place
(i dunno if OpenBSD has them, as it usually lags behind the other
unices when it comes to posix conformance).

i just mentioned them to point out that you can be notified about
events involving the posix SysV ipc successors.
i never used them in any way since they require a tmpfs for this.

> Also, it is not clear from the documentation, and I haven't
> performed tests, but it's even possible that the Linux implementation
> of SysV message queues requires a mqueue mount just like POSIX ones,
> in which case this whole discussion would be moot anyway.

which in fact is not the case; try it with "ipcmk -Q". the same goes for
the other SysV ipc mechanisms like shared memory and semaphores.
you can see that easily when running firefox: it uses shared memory
without semaphores, akin to "epoch" (btw: if anyone uses "epoch" init
it would be interesting to see what ipcs(1) outputs).
this is in fact a very fast ipc mechanism (the fastest ??), though
a clever protocol must be used to avoid deadlocks, concurrent accesses
and such. the msg queues have the advantage that messages are already
separated and sorted in order of arrival.

> You've lost me there. Why do you want several methods of IPCs in
> your init system? Why don't you pick one and stick to it?

since SysV msg queues are a quite portable ipc mechanism that does
not need any rw access. so they make up for a reliable ipc backup
emergency method.

> Sockets are available on every Unix system.

these days, yes (IoT comes to mind). but i guess SysV init (Linux) does
not use them since there might have been kernels in use without
socket support (?? dunno, just a guess).
on the BSDs sockets should always be available, since it was said that
they implement piping via socketpairs.

> So are FIFOs.

i do not like to use them at all, especially since they need rw
(is that true everywhere ??).

> If you're going to use sockets by default, then use sockets,
> and you'll never need to fall back on SysV IPC, because sockets work.

true for abstract sockets (where available), dunno what access rights
are needed to use unix sockets residing on a fs.

killall test run

2019-05-19 Thread Jeff
19.05.2019, 13:24, "fungal-net" :
> I am glad some of you can tell more than I can about this, and since you
> did I tried my weirdest of setups. This is Adélie adelielinux.org
> installation on HD. Although it is confusing to me how they set this up
> still, after months of following its development (beta3), there is
> sysvinit on the first steps of booting then OpenRC takes over, and then
> s6-supervisor handles everything running. It is like a fruit punch in
> my eyes. For those that don't know this is built on musl.

they use SysV init + OpenRC (with some scripts from Alpine).
OpenRC is known to work with s6-svscan (this is done via the
/libexec/rc/sh/s6.sh shell script), although it is better to start
s6-svscan from /etc/inittab (directly or - as i prefer - via a starter
shell/execline script that ensures at least that the scandir exists)
since SysV init can respawn processes anyway (a supervised
supervisor ;-).

by default OpenRC (better: s6.sh) uses /run/openrc/s6-scan as the
service scandir and /var/svc.d as service template dir, this can
be changed easily of course since only shell scripts are involved here.
when starting an s6 service it copies the service template dir into
the scandir /run/openrc/s6-scan. for this to work properly an already
running instance of s6-svscan (that runs on said scandir) is required.
OpenRC achieves this by adding "s6-svscan" to the "need" call in
the depend function of the corresponding service script.

when starting s6-svscan from inittab, OpenRC does not have to
start it, and there is no need to add it to
other services' dependencies. although i do not know the order
in which sysvinit processes the inittab entries for a given (SysV)
"runlevel": do "wait" entries precede "respawn" entries, followed
by the "once" entries ? dunno. this needs a bit of hackery to make
it work, but this ordering/dependency problem is definitely solvable.

> # kill -9 -1 on tty1 brought me back to tty1 login screen with 5 more
> ttys active. So everything is respawned almost instantly to a system
> just like it had just booted. Doing the same from terminal on X had the
> same exact outcome.

as expected. :D

> Both s6/s6-rc and 66 pkgs are available through void's repositories but
> s6-rc has been modified and I haven't been able to get it to work.

you can find out easily about your init via
$ ps -Fjlp 1
also have a look in the /proc/1 dir when procfs is mounted on /proc,
the target of the exe symlink is of interest here.

to see how all this is organised check with
$ ps -FHjle fww
BEFORE kill -1 -9 and again after sending the kill blast.

> Void uses arch-like /bin /sbin --> /usr/bin,

yes, i noticed this too. this is not "arch"-like but was invented by
fedora/red hat instead. being the fedora testbed, the lame arch distro
had to follow suit immediately, as is typical for them.

> Adélie has more traditional 4 separate directories.

this is what one would expect.




Re: interesting claims

2019-05-17 Thread Jeff
18.05.2019, 00:58, "Guillermo" :
>>  OpenRC: Nice,
>>    init
>> |_ zsh
>>    when I exited the shell there was nothing but a dead cursor on my screen

in this case the shell is not signaled since "-1" does not signal the sending
process.

> May I ask what was this setup like? You made a different entry for
> sysvinit, presumably with the customary getty processes configured in
> /etc/inittab 'respawn' entries, judging by your results, so how was
> the OpenRC case different?

i also wondered whether he used openrc-init here.
in that case he may have also used openrc's "supervise-daemon" util,
whose daemons do not get restarted after they were terminated by the
kill -9 -1 blast, and hence the gettys cannot respawn. looks like you
were pretty hosed when you quit the super-user zsh (which sent the
kill blast via its "kill" builtin) ?

you should provide more information on the used init here as openrc
is not an init per se and works well with sysv + busybox init, runit, ...

>>  sysV: init and 6 ttys with shell ... nothing can kill it that I know off.

what do you mean here ?
were the gettys respawned by SysV init or did they not die at all ?
where did you send the signal from ?
i would assume from a super-user zsh on a console tty ?



Re: interesting claims

2019-05-16 Thread Jeff
16.05.2019, 10:31, "Laurent Bercot" :
>> The Question: As a newbie outsider I wonder, after following the
>> discussion of supervision and tasks on stages (1,2,3), that there is a
>> restrictive linear progression that prevents reversal. In terms of pid1
>> that I may not totally understand, is there a way that an admin can
>> reduce the system back to pid1 and restart processes instead of taking
>> the system down and restarting? If a glitch is found, usually it is
>> corrected and we find it simple to just do a reboot. What if you can
>> fix the problem and do it on the fly. The question would be why (or why
>> not), and I am not sure I can answer it, but if you theoretically can do
>> so, then can you also kill pid2 while pid10 is still running. With my
>> limited vision I see stages as one-way check valves in a series of fluid
>> linear flow.

take a look at (the now defunct) depinit:
http://sf.net/p/depinit/
http://depinit.sf.net/

it is said to provide very extended rollback of dependencies
(so extended gettys will not work with it according to the docs).

> Stage 1 isn't reversible; once it's done, you never touch it again,
> you don't need to "reverse" it. It would be akin to also unloading
> the kernel from memory before shutting down - it's just not necessary.

indeed.
and when something fails in that first stage, a super-user rescue shell
should be started to fix it, instead of any services that depend on it.
(stupid example: sethostname failed for some reason; spawn a rescue
shell for the admin to do something about it ;-).
in such cases it has to be considered whether this failure is important
enough to justify interruption of the boot phase.

if not: start as many other services as possible,
output/log an error message, keep calm, and carry on;
things can be handled once a getty is up.

> stage 4

i would prefer to call it "stage 3b", since stage 4 would start after
stages 3a + b, i. e. process #1 execs into another executable, maybe
required in connection with an initramfs; anopa provides such a stage-4
execline script.

> - If you want to kill every process but pid 1 and have the system
> reconstruct itself from there, then yes, it is possible, and that is
> the whole point of having a supervision tree rooted in pid 1. When
> you kill every process, the supervision tree respawns, so you always
> have a certain set of services running, and the system can always
> recover from whatever you throw at it. Try it: grab a machine with
> a supervision tree and a root shell, run "kill -9 -1", see what happens.

i wonder what happens if process #1 reacts to, say, SIGTERM
by starting the shutdown phase and rebooting afterwards.
what if process #1 is signaled "accidentally" by kill -TERM 1
(as we saw in preceding posts, -1 will not reach it) ?
nothing is restarted and the system goes down instead, since
it is assumed that the signal was not sent "accidentally".

in the case of a process #1 that does not supervise anything, with the
supervisor running at a PID > 1: when everything is killed "accidentally"
(via kill ( -1, SIGKILL ) for example), the system is bricked and the
reset button has to be used.

only a privileged process can reach everything with PID > 1 that
way; still, there seems to be something wrong that should be fixed ASAP.
in the case of process #1 respawning the supervisor:
it restarts everything, maybe the "accident" happens again, and so on ...
this could lead to the system being caught in such an "endless" loop.
maybe this too can only be fixed by powering down ...

a non-supervising process #1: same, but worse: the reset button has to
be used, state is lost, filesystems are not unmounted cleanly and whatnot.

but in the situation of a supervising process #1 it can also be possible
to be prevented from entering the shutdown phase cleanly.



Re: correct init

2019-05-16 Thread Jeff
> process #1 should not rely on conditions that it has not previously
> ascertained to be true (eg by setting something up by itself, so it
> exists and is safe to use/rely on it).

> sounds self-evident ? sadly many inits do not comply with that
> postulation, a well known example that comes immediately to mind
> is the poorly designed/thought-out SysV init itself.

it also relies on syslog for logging, though it cannot ensure someone
listens to /dev/log and processes the log messages.




correct init

2019-05-16 Thread Jeff
new postulation:

process #1 should not rely on conditions that it has not previously
ascertained to be true (eg by setting something up by itself, so it
exists and is safe to use/rely on it).

sounds self-evident ? sadly many inits do not comply with that
postulation, a well known example that comes immediately to mind
is the poorly designed/thought-out SysV init itself.

the poor crap relies on the initctl fifo for ipc, how dumb can one be ?



Re: emergency IPC with SysV message queues

2019-05-16 Thread Jeff
11.05.2019, 15:33, "Laurent Bercot" :
> Please stop breaking threads. This makes conversations needlessly
> difficult to follow, and clutters up mailboxes.

i do that intentionally since i find it easier to follow that way.
that often leads to complaints on other lists as well.

> That is obviously not going to work.

obviously ? to me this is not obvious at all.

> Operations such as msgsnd() and msgrcv() are synchronous and
> cannot be mixed with asynchronous event loops.

what does that mean exactly ? do you mean they block ?
this is not the case when the IPC_NOWAIT flag is used.

> There is no way to be asynchronously notified of the
> readability or writability of a message queue, which would be
> essential for working with poll() and selfpipes for signals.

i do not understand at all what you mean here.
the client should signal us (SIGIO for example) to wake us up; then
we look at the input queue without blocking (handling SIGIO via a
selfpipe), else we ignore the msg queue. again: this is just a backup
ipc protocol. i suggest not using an output queue to reply to client
requests, which keeps things simpler. but if you insist on a reply queue,
just use IPC_NOWAIT in the msgsnd(2) call that writes the reply: when
the queue is full it then fails with EAGAIN instead of blocking, so
nothing blocks either way.

and i have not even mentioned posix message queues which can be
used from a select/poll based event loop ...
(dunno if OpenBSD has them)
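
for illustration, a sketch of the non-blocking read side (struct layout
and function name are my own assumptions):

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct req { long mtype; char mtext [128]; };

static void drain_requests(int qid)
{
  struct req r;
  /* msgtyp 0: fetch messages in order of arrival; IPC_NOWAIT:
   * fail with ENOMSG instead of blocking when the queue is empty */
  while (msgrcv(qid, &r, sizeof r.mtext, 0, IPC_NOWAIT) >= 0) {
    /* handle_request(&r); */
  }
  /* queue drained, back to the signal-driven main loop */
}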

please also take into consideration the following:

* we run as process #1 which we should exploit (SysV ipc ids) here

* SysV ipc is just used as a backup ipc protocol where unix (abstract
  on Linux) sockets are the preferred default method 
  (and signals of course).

> If you are suggesting a synchronous architecture around message
> queues where the execution flow can be disrupted by signal handlers,
> that is theoretically possible, but leads to incredibly ugly code
> that is exceptionally difficult to write and maintain, to the point
> where I would absolutely choose to use threads over this.

just a claim; it is not too much work. now that you beg for it i am
considering adding it again, just for you, as otherwise i would have
restricted myself to signal handling only, which is absolutely
sufficient in that case.

> Yes. Every modern Unix system has a way to create and mount a RAM
 ^^ see ? how modern ? not that portable, right ?
 there was a reason why the initctl fifo was stored in /dev ...

> filesystem. The APIs themselves are not portable, and that is why
> s6 doesn't do it itself, but the concept totally is. If I had more
> experience with the BSD APIs, I could easily port s6-linux-init to
> the BSDs. I hope someone more BSD-savvy than me will do so.

no, this is not portable; that is just your assumption, since s6 will not
work without rw fs access, which is indeed very limiting and far from
correct behaviour. the need for rw access is an unnecessary, artificial
requirement introduced to use your daemontools-style supervisor tool
suite as process #1.

> Using non-portable *functionality*, on the other hand, is an
> entirely different game. Abstract sockets only exist on Linux, and
> porting code that uses abstract sockets to another platform, without
> using non-abstract Unix domain sockets, is much more difficult than
> porting code that mounts a tmpfs.

we use normal unix sockets on BSD; on Solaris one could also use SysV
STREAMS for local ipc (which also needs rw access AFAIK).

that is ok since we provide a portable backup/emergency method that
works even on OpenBSD. on Linux we exploit its non-portable abstract
sockets, which makes ipc even less dependent on rw mounts.
that is perfectly ok IMO.

> That is my opinion backed by 20 years of experience working with
> Unix systems and 8 years with init systems, and evidence I've been
> patiently laying since you started making various claims in the mailing-list.

it is you who makes claims on your web pages, so i just wondered how
you back them. quote:

> System V IPCs, i.e. message queues and semaphores.

why use semaphores ? they are primarily meant to ease the usage
of SysV shared memory. but epoch init uses shared memory without them.

> The interfaces to those IPCs are quite specific and can't mix with

specific to what ? even older unices support them.
and SysV shared memory is in fact a very fast ipc method.

> select/poll loops, that's why nobody in their right mind uses them. 

there exist also the respective successor posix ipc mechanisms
where one could do exactly that.

> You are obviously free to disagree with my opinion,
> but I wish your arguments came backed with as much evidence as mine.

they are backed very well by the man pages of the syscalls in question.
please also notice the difference between an ipc mechanism and a
protocol that makes use of it.

>> (hello opendir(3) !!! ;-). :PP

hello (pu,se)tenv(3) !!



emergency IPC with SysV message queues

2019-05-11 Thread Jeff
10.05.2019, 20:03, "Laurent Bercot" :
> Have you tried working with SysV message queues before recommending
> them ?

yes, i had the code in an init once, but i never completed that init.
but dealing with SysV msg queues was not such a big deal from the
code side.

i used it merely as an additional emergency ipc method when other
ipc methods become impossible. though actually signals were sufficient
in case of emergency.

> Because my recommendation would be the exact opposite. Don't
> ever use SysV message queues if you can avoid it. The API is very
> clumsy, and does not mix with event loops, so it constrains your
> programming model immensely - you're practically forced to use threads.
> And that's a can of worms you certainly don't want to open in pid 1.

that is wrong. just read the msg queue when a signal arrives
(say SIGIO, for example). catch that signal via selfpiping and there
you are, no need to use threads. we were talking about process #1
anyway, so some msg queue restrictions do not apply here (like
finding the ipc key). if you need to send replies back to clients, set up
a second queue for the replies (with ipc key 2; we are process #1 and
free to grab ANY possible key). if those clients wait for their results
and block, you could also signal them after writing the reply to the
second queue (that means clients send their pid along with their
requests; usually the message type field is (ab ?)used to pass that
information).
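
a minimal sketch of that setup (key values and permissions are my
assumptions, not a recommendation):

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

enum { REQ_KEY = 1, REP_KEY = 2 };  /* pid 1 can claim fixed keys */

static int req_qid = -1, rep_qid = -1;

static int setup_emergency_ipc(void)
{
  /* clients write requests here */
  req_qid = msgget((key_t) REQ_KEY, IPC_CREAT | 0622);
  /* clients read replies here (mtype = client pid) */
  rep_qid = msgget((key_t) REP_KEY, IPC_CREAT | 0644);
  return (req_qid < 0 || rep_qid < 0) ? -1 : 0;
}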

> Abstract sockets are cool - the only issue is that they're not
> portable, which would limit your init system to Linux only.

well, we are talking about Linux here, since that is where that
obscene systemd monstrosity has been rampaging for quite a while now.

> If you're going for a minimal init, it's a shame not to make it
> portable.

in my case it will be portable and will work solely signal driven.
but i do not see to much need for other unices to change their
init systems, especially not for BSDs.

of course BSD init could be improved upon, but it just works and
it is rather easy to understand how. they did not follow the SysV
runlevel BS, and speeding their inits up would mostly mean speeding
up /etc/rc and friends. it also has respawn capabilities via /etc/ttys,
so a supervisor can be (re)started from there. though i agree it is
quite big for not doing too much ...

i hope the FreeBSD team (i do not think the other BSDs even consider
such a step) will not follow the BS introduced elsewhere (OpenBSD
will probably not). a danger for FreeBSD was some of their users'
demand to port that launchd joke over from macOS, or to come up
with something even worse (more in the direction of systemd).

> Really, the argument for an ipc mechanism that does not require a
> rw fs is a weak one.

not at all.

> Every modern Unix can mount a RAM filesystem nowadays,

that is a poor excuse; you wanted to be portable, right ?

> and it is basically essential to do so if you want early logs.

use the console device for your early logs, though that requires
console access ... something along these lines:
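
(a minimal sketch; error handling and the case of an unusable
console are ignored)

  #include <fcntl.h>
  #include <unistd.h>

  /* route early logs to the console device: no rw fs needed */
  static void early_logs_to_console ( void )
  {
    const int fd = open ( "/dev/console", O_WRONLY | O_NOCTTY ) ;

    if ( 0 <= fd ) {
      (void) dup2 ( fd, 1 ) ;
      (void) dup2 ( fd, 2 ) ;
      if ( 2 < fd ) { (void) close ( fd ) ; }
    }
  }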

> Having no logs at all until you can mount a rw fs is a big
> no-no, and being unable to log what your init system does *at all*,
> unless you're willing to store everything in RAM until a rw fs
> comes up, is worse. An early tmpfs technically stores things in RAM
> too, but is much, *much* cleaner, and doesn't need ad-hoc code or
> weird convolutions to make it work.

> Just use an early tmpfs and use whatever IPC you want that uses a
> rendez-vous point in that tmpfs to communicate with process 1.
> But just say no to message queues.

that is just your opinion, since your solution works that way.
other solutions are of course possible, eg using msg queues
just as a backup ipc method, since we can exploit being process #1
here. in the case of Linux i do not see any reason not to use
abstract unix sockets as the preferred ipc method in process #1
(except when the kernel does not support sockets, which is rare
these days, right ? the BSD kernels AFAIK also support sockets,
since socketpairs were said to be used there to implement pipes).
for the BSDs, use normal unix sockets (this is safe to do even on
OpenBSD since we have emergency ipc via signals and SysV msg queues);
on Solaris, SysV streams (and possibly doors) might be used, as we
also have said backup mechanism in reserve.

but to be honest: a simple, reliable init implementation should be solely
signal driven. i was just thinking about more complex integrated inits
that have higher ipc demands (dunno what systemd does :-).

you can tell a reliable init by the way it does ipc.
many inits do not get that right and rely on ipc mechanisms that require
rw fs access. if mounting the corresponding fs read-write fails, they are
pretty hosed, since their ipc does not work anymore and their authors
were too clueless to just react to signals in case of emergency, abusing
the signal numbers to reread their config or for other needless BS instead.

i top my claims even more:
you can tell a reliable init by not even using malloc directly nor indirectly.

SysV shutdown util

2019-05-11 Thread Jeff
10.05.2019, 20:03, "Laurent Bercot" :
> then signals are not enough:
> you need to be able to convey more information to pid 1
> than a few signals can.

such as ?
what more information than the runlevel (0 or 6, maybe 1 to go
into single user mode) does SysV init need to start the system shutdown ?
and shutdown itself just notifies all users via wall, logs the shutdown
time to wtmp and then notifies init via the /dev/initctl fifo.
this could all be done solely with 2 different signals.

the Void Linux team just made a shell script out of it that just brings
down the runit services and runsvdir itself.



ipc in process #1

2019-05-09 Thread Jeff


IMO process #1 should be solely signal driven.
i guess there is no need for further ipc by other means,
especially for those that require rw fs access to work
(eg fifos, unix sockets).

process #1 has to react to (some) incoming signals and thus
signal handling has to be enabled anyway (mainly SIGCHLD,
but also for signals that are sent by the kernel to indicate the
occurrence of important events like the 3 finger salute et al).

apart from that process #1 has IMO only the additional duty of
leaving stage 2 and entering stage 3 when requested. this can also
be done by sending process #1 signals that indicate what to do
(reboot, halt, poweroff, maybe suspend). access control is easy here
since only the super-user may signal process #1.
there are also quite a lot of real-time signals that might be used
for the purpose of notifying process #1.

hence there is no need for further ipc i guess.
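
a minimal sketch of such a solely signal driven core loop, assuming
Linux reboot(2); the signal-to-action mapping is an arbitrary example,
and the classic race between checking and pause(2) is ignored here
(sigsuspend(2) with a blocked mask would be the robust variant):

  #include <signal.h>
  #include <sys/reboot.h>
  #include <sys/wait.h>
  #include <unistd.h>

  static volatile sig_atomic_t got = 0 ;

  static void handler ( int sig ) { got = sig ; }

  int main ( void )
  {
    struct sigaction sa = { 0 } ;

    sa.sa_handler = handler ;
    (void) sigaction ( SIGCHLD, & sa, (struct sigaction *) 0 ) ;
    (void) sigaction ( SIGTERM, & sa, (struct sigaction *) 0 ) ;  /* example: poweroff */
    (void) sigaction ( SIGHUP,  & sa, (struct sigaction *) 0 ) ;  /* example: reboot */
    (void) sigaction ( SIGUSR1, & sa, (struct sigaction *) 0 ) ;  /* example: halt */

    /* (spawn stage 2 / the supervisor here) */

    for ( ;; ) {
      (void) pause () ;

      /* always reap: we are the default subprocess reaper */
      while ( 0 < waitpid ( -1, (int *) 0, WNOHANG ) ) { ; }

      /* (a real init would run its stage 3 tasks before reboot(2)) */
      if ( SIGTERM == got ) { (void) reboot ( RB_POWER_OFF ) ; }
      else if ( SIGHUP == got ) { (void) reboot ( RB_AUTOBOOT ) ; }
      else if ( SIGUSR1 == got ) { (void) reboot ( RB_HALT_SYSTEM ) ; }

      got = 0 ;
    }
  }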
for those who still need more than what is provided by signals
i would recommend using abstract unix domain sockets (Linux only)
or SysV message queues (the latter work even on OpenBSD) since
those ipc mechanisms can work properly without rw fs access.
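
binding an abstract socket is straightforward; a minimal sketch
(Linux only, "init-ipc" being an arbitrary example name):

  #include <stddef.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/socket.h>
  #include <sys/un.h>

  /* the name lives in the kernel's abstract namespace,
   * so no read-write fs access is needed */
  static int bind_abstract ( void )
  {
    struct sockaddr_un sa ;
    const socklen_t len = offsetof ( struct sockaddr_un, sun_path ) + 1 + 8 ;
    const int fd = socket ( AF_UNIX, SOCK_STREAM, 0 ) ;

    if ( 0 > fd ) { return -1 ; }

    memset ( & sa, 0, sizeof sa ) ;
    sa.sun_family = AF_UNIX ;
    /* a leading NUL byte in sun_path selects the abstract namespace */
    memcpy ( 1 + sa.sun_path, "init-ipc", 8 ) ;

    if ( 0 != bind ( fd, (struct sockaddr *) & sa, len ) ||
         0 != listen ( fd, 5 ) )
    {
      (void) close ( fd ) ;
      return -1 ;
    }

    return fd ;
  }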

SysV msg queues are especially useful in situations that only require
one-way ipc (ie the init process just reacts to commands sent by clients
without sending information (results like successful task completion)
back to them), since they are rather portable and provide cheap and
easy-to-set-up access control. and since process #1 is the first
process to create and use a SysV msg queue, the usual problems with
SysV ipc ids do not occur at all: process #1 can just grab ANY possible
ipc id, like say 1, without interfering with other processes' msg queues,
and so all clients know which msg queue id to use for writing requests
(they can also send a signal like SIGIO to process #1 to wake it up and
have it process its msg queue; process #1's pid is also well known. ;-).

this can also be used as an emergency protocol when ipc by other means
(such as unix sockets) becomes unavailable.



ToyBox init

2019-05-04 Thread Jeff
> maybe you should have a look at the tiny "oneit" utility that is
> part of/included in ToyBox ( http://landley.net/toybox/ ):

ToyBox also provides its own rewrite of BusyBox init which is (almost ?)
compatible with the latter but consists of less code.

it is licensed under the very permissive ToyBox license (almost public domain)
instead of the GPL.



is it required to call kill() from process #1 ?

2019-05-04 Thread Jeff


> Before the reboot(2) system call, at some point you need to
> kill all processes ("kill -9 -1") so you can unmount filesystems
> and *then* call reboot(2).

indeed.

> That's relying on a behaviour that Linux implements, and possibly
> BSD too, but that is not specified in POSIX: that the process
> that does a kill(-1, signal) is not affected by the kill() call.

true when using kill( -1, sig ).

> With the extended behaviour, the process that performs the kill -9 -1
> survives, and can then go on to "stage 4", i.e. unmounting everything
> and telling the hardware to halt/reboot. But that is not POSIX.
> POSIX specifies that the kill signal will be sent to all processes
> "excluding an unspecified set of system processes". pid 1 is naturally
> part of those "system processes", but a shell, or a program that
> performs the shutdown sequence, with a random pid, cannot be.

there are at least two other solutions to the killall problem:

* on Linux you probably have the procfs mounted; on the BSDs, Solaris,
  and AIX you can use kvm to do the following:
  find all running processes (except your own and possibly your own
  session) via the procfs or kvm and signal them; your own process
  (and session) are then not signaled (this is how the killall5 utility
  actually works). in the case of kvm you do not even need to have the
  procfs mounted.

* if you can not rely on such a mechanism you can still do a brute-force
  search for running processes along these lines:

  #include <signal.h>
  #include <unistd.h>

  /* get_pid_max () is a hypothetical helper: on Linux it could read
   * /proc/sys/kernel/pid_max, elsewhere use the platform's limit */
  extern pid_t get_pid_max ( void ) ;

  void kill_almost_all ( const int sig )
  {
    pid_t p ;
    const pid_t mypid = getpid () ;
    const pid_t u = get_pid_max () ;

    for ( p = 2 ; u >= p ; ++ p ) {
      // this ignores session ids
      if ( mypid != p && 0 == kill ( p, 0 ) ) {
        (void) kill ( p, sig ) ;
      }
    }
  }

i personally do it from process #1 as well, since calling kill( -1, sig )
from there is much simpler and should be faster (the work is done by the
kernel; no need to find all running processes ourselves).

> The only ways to perform a proper shutdown sequence that strictly
> conforms to POSIX are:
>  - do it in pid 1
>  - do it *under a supervision tree*. When the shutdown sequence kills
> everything, it may also kill itself; if it is the case, it is restarted
> by the supervision tree, and can then go on to stage 4.

i prefer to call it "stage 3b". :PP
stage 3a terminates known services.
then everything is killed by process #1 and stage 3b is run thereafter
to complete the remaining shutdown tasks like swapoff and unmounting
the fs.

BTW: i do not unmount/remount pseudo fs like procfs, sysfs, devtmpfs etc
whose mountpoints are located directly on the root fs or reachable via a
direct path of pseudo fs from the root fs. this works well when one does
not use an initramfs and the like. could this cause trouble somewhere ?

> The shutdown sequence generated by the current s6-linux-init-maker
> does the former. The shutdown sequence in the upcoming s6-linux-init
> performs the latter.

ok, when will you release it ? you made me curious ...

> It is not strictly necessary to do so on Linux, and apparently on
> BSD either, since those systems ensure the survival of the process
> sending the big nuke. But you need to be aware of this implementation
> detail before advertising the "classical BSD way". :)

:PP

actually it may not be, since it looks like behaviour inherited from even
older Unix implementations' init. the Linux SysV init incarnation and minit
also do not run any of the system shutdown tasks themselves but instead
delegate these to subprocesses.



supervising the supervisor

2019-05-02 Thread Jeff
from the last replies we have the following possibilities regarding
process #1's supervision capabilities:

- no supervision/respawning (maybe also not handling system shutdown
  at all):
  this simplifies the process #1 implementation (especially in the latter
  case); supervision can be delegated to a subprocess, which also simplifies
  that supervisor's implementation, since there is no need for it to handle
  process #1 specific duties
  (process #1's duties then amount to no more than reaping zombies as the
  default subreaper and successfully starting at least one necessary child
  process to which the remaining duties are delegated).
  disadvantage: "incorrect" behaviour; when all other processes die, this
  leads to a bricked system, deep shit ahead.

- respawning (at least one) given services/daemons, possibly even with log
  output redirection to logger processes (s6-svscan et al)

- a compromise between the above 2 solutions (see the sketch after this
  list):
  process #1 supervises (i. e. respawns, possibly only under certain
  conditions) at most 2 subprocesses: a real "supervisor", whose output it
  maybe redirects via pipe(2) to a separate supervised dedicated logger
  subprocess.

  in that case those child processes should only be respawned under certain
  conditions (respawn throttling maybe, i. e. stop respawning if one of the 2
  repeatedly fails in a certain amount of time). if those conditions are not
  met it should start a single user rescue shell (possibly via sulogin)
  and/or reboot.

  only in case the logger child process repeatedly fails: do not redirect the
  supervisor's output; use our own (possibly kernel-opened) output fds
  for the supervisor child process (probably the console device) instead
  of the pipe fds.

  it could also be a good idea to close all of process #1's stdio fds and
  only open the console device for output when the need arises.
  this has the advantage that we do not keep this device open all the time
  (in case /dev needs to get re/unmounted).
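
a minimal sketch of that compromise (the command lines are placeholders;
respawn throttling, the sulogin fallback, the console fallback and fd
hygiene in the children are omitted). note that process #1 keeps both
pipe ends open itself, so the log pipe survives the death of either child:

  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  static int logpipe [ 2 ] ;

  static pid_t spawn ( char * const argv [ ], const int in, const int out )
  {
    const pid_t p = fork () ;

    if ( 0 == p ) {
      /* child: wire up the requested ends of the log pipe */
      char * env [ ] = { (char *) 0 } ;
      if ( 0 <= in ) { (void) dup2 ( in, 0 ) ; }
      if ( 0 <= out ) { (void) dup2 ( out, 1 ) ; (void) dup2 ( out, 2 ) ; }
      (void) execve ( argv [ 0 ], argv, env ) ;
      _exit ( 111 ) ;
    }

    return p ;
  }

  int main ( void )
  {
    /* placeholder command lines */
    char * sup [ ] = { "/bin/s6-svscan", "/run/service", (char *) 0 } ;
    char * log [ ] = { "/bin/catch-all-logger", (char *) 0 } ;
    pid_t psup, plog, died ;
    int wstat ;

    (void) pipe ( logpipe ) ;
    plog = spawn ( log, logpipe [ 0 ], -1 ) ;
    psup = spawn ( sup, -1, logpipe [ 1 ] ) ;

    /* reap everything (default subreaper), respawn only our 2 children */
    while ( 0 < ( died = waitpid ( -1, & wstat, 0 ) ) ) {
      if ( died == plog ) { plog = spawn ( log, logpipe [ 0 ], -1 ) ; }
      else if ( died == psup ) { psup = spawn ( sup, -1, logpipe [ 1 ] ) ; }
    }

    return 0 ;
  }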

again (as we are at it ;-):

in the last case:
when said "supervisor" is s6-svscan (or perpd for that matter) it would
be helpful for the process #1 implementor (me) if it could manage its own
output logger via a command line option (akin to daemontools-encore's
"svscan"), since that saves him from opening the pipe, comparing
terminated child PIDs with an additional (the logger's) PID, and managing
the additional emergency situations caused by the logger's failure himself
(especially since s6-svscan does a lot of additional stuff like catching
signals and running the corresponding scripts anyway).
:PP



ToyBox oneit

2019-05-02 Thread Jeff
> By this redefinition, a good init is one that doesn't allow systems to go
> vegetable, either by having something they restart, or totally freaking
> out and burning down the world if the one thing they started ever
> vanishes.
>
> sinit could be made proper by forking a thing and then
> issuing the reboot(2) syscall any time its child vanished.
> Annoyingly aggressive on the restarts, but proper.

maybe you should have a look at the tiny "oneit" utility that is
part of/included in ToyBox ( http://landley.net/toybox/ ):

$ toybox help oneit

usage: oneit [-p] [-c /dev/tty0] command [...]

Simple init program that runs a single supplied command line with a
controlling tty (so CTRL-C can kill it).

-c Which console device to use (/dev/console doesn't do CTRL-C, etc)
-p Power off instead of rebooting when command exits
-r  Restart child when it exits
-3 Write 32 bit PID of each exiting reparented process to fd 3 of child
(Blocking writes, child must read to avoid eventual deadlock.)

Spawns a single child process (because PID 1 has signals blocked)
in its own session, reaps zombies until the child exits, then
reboots the system (or powers off with -p, or restarts the child with -r).

Responds to SIGUSR1 by halting the system, SIGUSR2 by powering off,
and SIGTERM or SIGINT reboot.




Runit

2019-05-02 Thread Jeff
>>  If something kills runsvdir, then runit immediately enters
>>  stage 3, and reboots the system. This is an acceptable response
>>  to the scanner dying, but is not the same thing as supervising
>>  it. If runsvdir's death is accidental, the system goes through
>>  an unnecessary reboot.
>
> If the /etc/runit/2 process exits with code 111 or gets killed by a
> signal, the runit program is actually supposed to respawn it,
> according to its man page. I believe this counts as supervising at
> least one process, so it would put runit in the "correct init" camp :)
>
> There is code that checks the 'wstat' value returned by a
> wait_nohang(&wstat) call that reaps the /etc/runit/2 process, however,
> it is executed only if wait_exitcode(wstat) != 0. On my computer,
> wait_exitcode() returns 0 if its argument is the wstat of a process
> killed by a signal, so runit indeed spawns /etc/runit/3 instead of
> respawning /etc/runit/2 when, for example, I point a gun at runsvdir
> on purpose and use a kill -int command specifying its PID. Changing
> the condition to wait_crashed(wstat) || (wait_exitcode(wstat) != 0)
> makes things work as intended.

that is again one of several runit problems. among them:

- see above
- no setsid(2) for child procs by default in "runsv"
- having only runsv manage the log pipe.
- runit-init requires rw fs access without the slightest need
  (setting the +x bit of the /etc/runit/(stopit,reboot) files,
  which could instead reside on a tmpfs in /run and have
  symlinks pointing to them (that is done in Void Linux))
- problems with log files while bringing down the system.
  i never encountered that with daemontools-encore, perp(d)
  and s6.

so it is a quite dated project that clearly shows its age.
i would recommend against using it at all (except for its
"chpst" and "utmpset" utilities).




how to handle system shutdown ?

2019-05-02 Thread Jeff
> And of course you'd need a shutdown script that PID1
> can call when it gets signals to reboot or poweroff.

that is also an interesting point.

i personally added this to my init. it ran a script with the received
signal's name and number as parameters to let the user decide
what to do about it (i am also used to shutting my desktop down via
"kill -12 1 ; exit 0").

but one can do without it and call the shutdown script by hand,
which in the end does the reboot(2) call itself. that's perfectly
possible and the classical BSD way; process #1 then does not even
need to do the system shutdown itself.

but reacting to signals other than SIGCHLD is necessary on Linux
(and probably also on the BSDs on PC hardware) to handle the
signals sent by the kernel when the "3 finger salute" and other
recognized special key sequences are hit (Linux: SIG(INT,WINCH);
dunno what the BSDs use here).




what init systems do you use ?

2019-05-02 Thread Jeff
thanks for the interesting links.

> https://www.reddit.com/r/linux/comments/2dx7k3/s6_skarnetorg_small_secure_supervision_software/cjxc1hj/?context=3

nice exchange of arguments.

> Do not mistake causes for consequences. Things are not correct
> because s6 does them; s6 does things because they are correct.

well, i thought it had inherited that behaviour from daemontools' svscan.
now i understand that this was a totally wrong assumption. :PP

> Then you are free to use one of the many incorrect inits out there,
> including sinit, Rich Felker's init, dumb-init, and others. You are
> definitely not alone with your opinion.

i wrote such an init myself which did a bit more than the ones you mention.
it ran REALLY fast. the only thing that was slow was my usage of
a customized (older) version of OpenRC (since i have written some openrc
scripts of my own (adding perp "support" et al) and am currently too lazy
to write replacement scripts).

> However, you sound interested in process supervision

indeed. it's just a question of where to put the supervisor.

> if you subscribe to that idea, then you
> will understand why init must supervise at least 1 process.

ok, i understand your arguments and of course there is something
true about it.

> Maybe you've never bricked a device because init didn't respawn
> anything.

well, i bricked my desktop when doing init experiments.
but almost immediately after hosing the system it came to my
mind what exactly had gone wrong, and i fixed it on reboot.

i was forced to reboot from a rescue dvd only once so far ;-)
(when testing "ninit", which i don't recommend),
and it involved only mounting the root fs on disc rw and fixing
some symlinks in /sbin (init et al), so this was no real problem
as long as one has console access.

that's why i wrote (/sbin/)testinit, which forks, execs into the real init
to be tested in process #1, and sleeps a while in the child process,
after which it kill()s process #1 with given signals (a sketch follows
below). this usually works very well and is safe to do.
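
a minimal sketch of that testinit idea (the path of the init under
test, the delay and the test signal are placeholders):

  #include <signal.h>
  #include <unistd.h>

  int main ( void )
  {
    char * argv [ ] = { "/sbin/realinit", (char *) 0 } ;  /* init under test */
    char * envp [ ] = { (char *) 0 } ;

    if ( 0 == fork () ) {
      /* child: give the init under test some time, then poke it */
      (void) sleep ( 30 ) ;
      (void) kill ( 1, SIGTERM ) ;
      _exit ( 0 ) ;
    }

    /* parent is process #1: become the init under test */
    (void) execve ( argv [ 0 ], argv, envp ) ;
    _exit ( 111 ) ;
  }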

> I have. The "rather artificial and constructed argument"
> happened to me in real life, and it was a significant inconvenience.

oh no, i hope it was not a remote server ... :-/
always try things out on a box you have console access to
or in a vm.

BTW:

what init systems do this list's subscribers use ?
i use statically linked (musl) BusyBox init (and gettys)
+ mksh (https://www.mirbsd.org/mksh.htm) + s6 + OpenRC
(v0.34.11, outdated) as default init system.
i also ran perp but now run everything to be supervised
under s6, started via a little setup shell script directly from
/etc/inittab (most "one time tasks" are indeed directly run
from the inittab file instead of a shell script).




further claims

2019-04-29 Thread Jeff


At
http://skarnet.org/software/s6/why.html
one can find further interesting claims:

> The runit process, not the runsvdir process,
> runs as process 1. This lengthens the supervision chain.

haven't you claimed process #1 should supervise long running
child processes ? runit fulfils exactly this requirement by
supervising the supervisor.

this simplifies both (runit-)init (it only has to compare the PIDs
of terminated child processes with exactly 1 PID) and the
supervisor runsvdir (the latter can go about its usual business without
the requirement to do process #1 specific work such as reacting
to signals in a special way, running the 3 different init stages, etc.
one could also rightfully point out here that these are process #1
specific tasks and not a supervisor's duties per se).

this lengthens the supervision chain but also has the additional
advantage of a supervised supervisor. ;-)

maybe runsvdir was not made to run as process #1 and this was
just a hack its author came up with to replace (SysV) init entirely.
who knows ? but it works well (except that runit-init looks at
/etc/runit/reboot etc after receiving SIGCONT, which is no good
idea at all since it requires unnecessary read-write access to the
fs these files reside on. how about just reacting to signals instead:
say, use SIGTERM to poweroff, SIGHUP to reboot, SIGUSR1 to halt,
SIGUSR2 to reboot or poweroff, and make signal handling scripts
like /etc/runit/ctrl-alt-del just send one of those signals to
process #1 when SIG(INT,WINCH) were received ? that does not
require any read-write fs access and looks much simpler to me.)

"Artistic considerations":

> runit has only one supervisor, runsv, for both a daemon and its logger.
> The pipe is maintained by runsv. If the runsv process dies, the pipe
> disappears and logs are lost. So, runit does not offer as strong a
> guarantee as daemontools.

sure, if (s6-)svscan dies one is in deep shit as well, so what is the point
here ? runsv gets restarted by runsvdir but the pipe is gone (are pipes
really closed when the opening (parent) process exits without closing
them itself while subprocesses still use that very pipe ?)

> daemontools' svscan maintains an open pipe between a daemon and its logger,
> so even if the daemon, the logger, and both supervise processes die,
> the pipe is still the same so no logs are lost, ever, unless svscan itself 
> dies.

but:

> perp has only one process, perpd, acting both as a "daemon and logger
> supervisor" (like runsv) and as a "service directory scanner" (like runsvdir).
> It maintains the pipes between the daemons and their respective loggers.
> If perpd dies, everything is lost.

same for (s6-)svscan here (at least for the pipes).

> however, perpd is well-written and has virtually no risk of dying.

the same holds probably for (s6-)svscan, i guess.

> Since perpd cannot be run as process 1, 
> this is a possible SPOF for a perp installation

but from a design perspective it seems as reliable as s6-svscan ?
or not, since it uses a more integrated design/approach ?
that design simplifies communication, since tasks are not
implemented in other tools running as its (direct) subprocesses.

so all the fifos/pipes used for IPC are no longer necessary, except
one socket per perpd process for client connections; there is no need
for further communication with subprocesses (except via signals).



interesting claims

2019-04-29 Thread Jeff


i came across some interesting claims recently. on
http://skarnet.org/software/s6/
it reads

"suckless init is incorrect, because it has no supervision capabilities,
and thus, killing all processes but init can brick the machine."

a rather bold claim IMO !
where was the "correct" init behaviour specified ?
where can i learn how a "correct" init has to operate ?
or is it true since s6-svscan already provides such respawn
capabilities ? ;-)

there is actually NO need for a "correct" working init implementation
to provide respawn capabilities at all IMO.
this can easily be done in/by a subprocess, which has 2 advantages:

- it simplifies the init implementation

- process #1 is the default subprocess reaper on any unix
  implementation, and hence a lot of terminated zombie subprocesses
  get assigned to it, subprocesses that were not started by it.
  if it has respawn capabilities it has to find out whether any of these
  recently assigned but elsewhere terminated subprocesses is one of its
  own children to be respawned. if it has lots of services to respawn
  this means lots of unnecessary work that could just as well be done
  in/by a subprocess.

when do you kill a non-supervised process running with UID 0
"accidentally" ? when calling kill ( -1, SIGTERM ) ?
the kernel protects special/important processes from being killed
"accidentally" in this case, that's true.
but where do we usually see that ? in the shutdown stage, i guess.
and that's exactly where one wants to kill all processes with PID > 1
(sometimes excluding the calling process since it has to complete
more tasks). or when going into single user mode.

so this looks like a rather artificial and constructed argument for
the necessity of respawn functionality in an init implementation IMO.



ezmlm mail headers

2019-04-26 Thread Jeff


there is a "problem" when ezmlm meets gmail:

the ezmlm mailing list manager used for this list seems
(unlike other solutions like sympa and mailman) not to 
set a "List-Id" mail header field which the dumb gmail
"filter" facilities seem to require to recognize postings to
mailing lists as such.

the gmail "filtering" crap does not apply "filter" rules that
match other headers like "List-Post" that ezmlm sets properly.




Re: catch-all logger service

2019-04-26 Thread Jeff
26.04.2019, 20:51, "Laurent Bercot" :
> You need to be able to take "no" for an answer.

i can do that.
and that will probably be the answer i will get from perp's author
as well (dunno if he reads this list).

it is just important for me to know whether this functionality will be
added or not, since i have my own runit-init style process #1
implementation in the makings and do not want to change it when such
features are added to s6-svscan and perpd.

now i know for sure that my code has to do the pipe(2) call and
has to supervise the logging process as well
(which is a bit more code and one more child process to supervise).

but one could also use runit's "runsv" here, which in turn supervises
both: s6-svscan/perpd and the logger.
(same with perp's "rundeux"; that fits even better here since it does
not need any service dirs by itself)

but that is a bit awkward, since this can be done easily by process #1
itself, and hence it just adds an additional level of indirection where
it is not really useful/necessary.

and here is another advantage of the daemontools-encore approach:
when given the special logging service option, svscan knows it can
run very verbosely since it has an associated logger.
(i think daemontools-encore's svscan already operates this way)

this is also important in the case of perpd since its default operation
style is pretty verbose and thus needs a logger.




catch-all logger service

2019-04-26 Thread Jeff
26.04.2019, 11:44, "Laurent Bercot" :
> You need to have execline installed to run s6 anyway.

true. but adding the required code to s6-svscan as
daemontools-encore did would obsolete execline entirely, at least
for this purpose (which is a good thing IMO).
it should not be too much of an effort to add this functionality
to s6-svscan (and perp(d) for that matter).

> Yes, from what I can see in the code, it works when the logdir is
> in the scandir, and it seems to be the intended use. But it also
> *appears to work* when the logdir is not in the scandir, and that
> is not good.

definitely true. but this is unintended use, so don't do it;
it was not recommended in the docs in any way.
(btw: this could be clarified in the docs)
still a pretty artificial consideration.

> We're talking pid 1 here.

not in the case of daemontools-encore which you were referring to.

> It needs to be *airtight*.

indeed, especially excluding any read-write fs access requirements.

> Running a supervision tree requires a rw fs anyway, so that's not a
> problem at all.

yes true, but IMO process #1 should not require read-write access
to any fs for proper operation. for subprocesses this is not a problem,
though.




Re: s6 style "readiness notification" with perpd

2019-04-26 Thread Jeff
26.04.2019, 17:27, "Laurent Bercot" :
> It doesn't matter what the number is that the service sees. As long
> as perpd creates a separate pipe for every service (which is why it
> would count against the maximum number of services it can maintain),
> it can read the notifications from every service separately.

so there is a small advantage to using an intermediary supervise
child process instead of doing it in a more integrated way directly
in the perpd parent process.

of course one can run more instances of perpd on different scandirs
should the upper limit on the number of open fds ever be exceeded
(which is rarely the case, i guess).

> The notification-fd value used by s6-supervise is not relevant to
> s6-supervise, it's only relevant to the service. It's only used
> by s6-supervise at ./run spawning time, to tell what number the
> *child* should dup2() the notification pipe to before execing ./run,
> so the pipe is made available to the service on the fd it expects.

probably fd 3 in most cases ... 
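
(for the record, the daemon side of the s6 notification protocol is
trivial; a sketch, assuming the conventional fd 3 is already in place:
write a line to the fd once ready, then close it)

  #include <unistd.h>

  /* to be called by the daemon itself once it is ready to serve */
  static void notify_ready ( const int nfd )
  {
    (void) write ( nfd, "\n", 1 ) ;
    (void) close ( nfd ) ;
  }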

> The supervisor itself does not use fixed fd numbers. It would be
> the same with perpd.

ok, i see.
so the perpd parent process opens a pipe(2) for every such service
and does a dup2(2) after fork(2) in the child process that execs into
the service run script, somewhat along these lines:

...

pid_t child = 0 ;
int pfd [ 2 ] = { -1, -1 } ;

if ( 0 != pipe ( pfd ) ) { /* handle pipe(2) failure */ }

child = fork () ;

if ( 0 == child ) {
  // child process that should execve(2) into the daemon
  ...
  // make the pipe's write end available on the agreed-upon fd
  if ( requested_fd != pfd [ 1 ] ) {
    (void) dup2 ( pfd [ 1 ], requested_fd ) ;
    (void) close ( pfd [ 1 ] ) ;
  }
  (void) close ( pfd [ 0 ] ) ;
  ...
  // exec into service run script
  execve ( ... ) ;
  _exit ( 111 ) ;
} else if ( 0 < child ) {
  // parent process (perpd): close the unused write end, then
  // waitpid() for the child and read(2) from pfd [ 0 ]
  (void) close ( pfd [ 1 ] ) ;
} else {
  // fork(2) failed
}

...

dumb question anyway, my error.



Re: s6 style "readiness notification" with perpd

2019-04-26 Thread Jeff
26.04.2019, 11:43, "Laurent Bercot" :
> It would certainly be possible for perpd to do the same, as long
> as it stays under the max open file limit - which may reduce the
> maximum possible number of services, but not enough for it to be
> a serious limitation.

how is that ?
what if 2 services (non-interdependent, to be started concurrently)
specify/use the same fd (number) to write their readiness notification
message to ? how could perpd tell which of the 2 notified it ?




s6 style "readiness notification" with perpd

2019-04-25 Thread Jeff
does s6-supervise listen to the given "readiness notification" fd ?
seems so to me.

that is the reason s6 style "readiness notification" would be hard
to do directly with perpd since it is a more integrated solution
that does not use intermediary supervise processes.

right ?




s6-svscan catch-all logger service

2019-04-25 Thread Jeff
26.04.2019, 01:10, "Laurent Bercot" :
> The bad news is that the way daemontools-encore's svscan manages
> its catch-all logger is almost exactly the way I do it in my stage 1
> scripts, except it requires ad-hoc code in svscan.

but this "ad-hoc code" is in C and does not require any execline tricks;
for those to work you need execline installed too.
that way daemontools-encore's svscan can easily get started and
supervised directly by process #1 without any effort.

i think this feature does not require too much "ad-hoc" C code in
the svscan source. it is easy to do in C, while it becomes hard
to impossible to write a shell script that does it (ok, in execline
this seems to be possible, but then you need that package installed).
  
> Additionally, the way daemontools-encore does it, the logdir may
> or may not be in the scandir.

really ? i always used a service subdir of the scandir and it worked
very well.

> If it is not in the scandir, it will
> not be watched by svscan, and if both the supervise process and the
> logger die, nothing will ever read the logpipe again and when the
> kernel buffer fills up, your supervision tree will eventually freeze.

a pretty artificial counter example, since no one will use a service dir
outside the scandir. i do not know if this is even possible ...
in any case doing so seems very odd to me and was probably
not intended by the author.

> and performing a little FIFO trickery at svscan start time in order

see ? "a little trickery" is necessary here.

> to redirect its stdout and stderr to a FIFO (that the catch-all
> logger will read from). The differences in implementation is that
> the logpipe is a FIFO, not an anonymous pipe, and it's held open by
> the logger, not by svscan or supervise.

using a fifo here is IMO not the best solution when a simple pipe(2)
would suffice, since fifos need read-write access to the fs they reside on.

> You're saying that my implementation makes running s6-svscan under
> sysvinit complex because you need 2 lines in /etc/inittab.

that was just a general remark, not specific to SysV, busybox or
toybox init; it applies to every process #1 (and also to init stage 1
when using s6-svscan as stage 2).

> That is not true: you only need 1 inittab line, that runs a "mini-stage 1"
> script that performs the FIFO trick (as well as any other early
> preparation that you need) before executing s6-svscan. 

yes, that works, but it introduces an extra step of indirection, and
again this seems to require execline.
it becomes more difficult to do this directly from -say- inittab or its
actual equivalent for the given init system.

> You're also saying that this implementation makes stage 1 scripts
> difficult to write. That is true

indeed.

> I would also recommend you, or anyone interested in stage 1 script
> creation, to do this sooner rather than later, because a new version
> of s6-linux-init is going to be released in less than a week,

i see.

> and it will be significantly different, and more complex
> (because more featureful,
> with a focus on turnkey sysvinit compatibility

who needs such compatibility anyway ?
those who want/need it should run SysV init directly
and start s6 per init script/inittab entry.

> stage 1 will be a C program rather than an execline script

nice ! that sounds really interesting to me.
you have definitely surprised and teased me with this
announcement.

good luck.




special s6-svscan/perp(d) catch-all logger service option

2019-04-25 Thread Jeff


hello,

i am a new subscriber to this mailing list.

i saw that daemontools-encore svscan provides an option to
specify a special catch-all logging service for svscan and its child
supervise processes:

svscan [ directory ] [ log-service ]

If the 'log-service' option is specified and the named subdirectory
exists, svscan starts the service found there and redirects its output
through it. This service is started before any other (since it is the very
important catch-all logger for (among others) svscan's own output).

it would be very nice for s6(-svscan) and perp(d) to provide such
functionality too.

this would simplify starting them directly from init(tab) (or use as init
stage 2 in the case of s6) to a great extent, as they would do their output
redirections by themselves and also supervise this special catch-all
logging service directly, which would free init from this additional task.
init then only has to supervise (and restart) one process
(s6-svscan/perpd) instead of 2 (the additional catch-all logger), which
of course is much easier to achieve.

when using s6-svscan for init stage 2 this would also simplify the
stage 1 script greatly, since it could then just directly exec into
stage 2 (by using this option) without doing the output redirection
for s6-svscan by itself.

i also think that this can be achieved without too much effort,
and since it is an optional feature it would not break compatibility
with earlier versions; hence older scripts should still work
without requiring any changes.

kind regards.