RE: s6 init-stage1

2015-01-06 Thread James Powell
The problem with using sockets rather than named pipes is that each UNIX
socket requires more POSIX shared memory, increasing the system's baseline
resource requirements. Named pipes just use normal process memory, which
keeps system requirements lower. How Lennart failed to mention that in the
systemd presentation is insane.

From: post-sysv<mailto:boycottsyst...@openmailbox.org>
Sent: 1/6/2015 12:03 PM
To: Laurent Bercot<mailto:ska-supervis...@skarnet.org>
Cc: supervision@list.skarnet.org<mailto:supervision@list.skarnet.org>
Subject: Re: s6 init-stage1

On 01/06/2015 07:48 AM, Laurent Bercot wrote:
> Interesting. Thanks for the heads-up - I had heard of tsort, but didn't
> know exactly what it does.
>
>  However, I'd like a tool that knows what steps it can parallelize.
> A sequential output is great for function names in a piece of code,
> but for services, the point is to start as many as possible in
> parallel, and minimize the amount of synchronization points.
>
> For instance, given
> 1 2
> 3 4
> meaning 2 should happen after 1, and 4 should happen after 3,
> tsort gives
> 1
> 3
> 2
> 4
> but instead, I need something like
> 1 3
> 2 4
> because 1 and 3 can happen in parallel, and same for 2 and 4.
>
>  AFAICT, tsort cannot do that. (make might not be able to either,
> but since it's more complex, it's harder to tell.)
>

  About that. Actually, I'm not even certain if there exists a service
manager that *actually* starts processes in parallel. Usually what I've
noticed is that most of the time what is really meant is that services are
started asynchronously, or at best concurrently.

  Debian and other formerly sysvinit-based distributions had what was known
as a "Makefile-style concurrent boot". To the best of my knowledge, this was
done using a combination of LSB initscript headers through insserv, and a
program called startpar.

  Reading the source code of startpar, I was surprised to see that it does
its job through a primitive form of socket activation in the run() function,
where it allocates a so-called "preload" socket and determines exit status
by its availability for connection. Secondary routines include meddling with
ptys and file descriptors to curb interleaving and make sure the execution
state is clean and free of potentially blocking operations.

  Makes me wonder if Poettering ever read it, though his ostensible
inspiration was from launchd. That said, it does show that the systemd
supporters have overhyped the novelty of "socket activation" (inetd) even
more significantly than I had previously thought. Someone should make note
of this.

  In any event, I'm under the impression that most so-called parallel
service starters are really ones that start asynchronously in a clean
execution state, as true parallelism and even concurrency sound conceptually
quite difficult, particularly when you keep in mind that many boot processes
are primarily I/O-bound. systemd itself has a complex dependency system at
its backbone, with socket activation not being a mandatory thing from what
I've learned. It also blocks on occasion to fulfill start jobs, so evidently
it has synchronization methods that are contrary to its claims.

  If someone can clarify this issue or point to any concurrent/parallel
schemes for starting services at boot time that have been implemented, that
would be appreciated.


Execline: was s6 init-stage1

2015-01-06 Thread Steve Litt
On Tue, 06 Jan 2015 04:28:59 +0100
Laurent Bercot  wrote:


>   Far be it from me to discourage you from your noble quest ! But
> you could write it in sh and just use the 'redirfd' command from
> execline, which does the FIFO magic.

I read the documentation of execline, complete with the diagrams, and
didn't understand a word of it. I think a lot more documentation and a
lot of examples would help immensely.

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance



Re: s6 init-stage1

2015-01-06 Thread post-sysv

On 01/06/2015 07:48 AM, Laurent Bercot wrote:

> Interesting. Thanks for the heads-up - I had heard of tsort, but didn't
> know exactly what it does.
>
>  However, I'd like a tool that knows what steps it can parallelize.
> A sequential output is great for function names in a piece of code,
> but for services, the point is to start as many as possible in
> parallel, and minimize the amount of synchronization points.
>
> For instance, given
> 1 2
> 3 4
> meaning 2 should happen after 1, and 4 should happen after 3,
> tsort gives
> 1
> 3
> 2
> 4
> but instead, I need something like
> 1 3
> 2 4
> because 1 and 3 can happen in parallel, and same for 2 and 4.
>
>  AFAICT, tsort cannot do that. (make might not be able to either,
> but since it's more complex, it's harder to tell.)



 About that. Actually, I'm not even certain if there exists a service
manager that *actually* starts processes in parallel. Usually what I've
noticed is that most of the time what is really meant is that services are
started asynchronously, or at best concurrently.

 Debian and other formerly sysvinit-based distributions had what was known
as a "Makefile-style concurrent boot". To the best of my knowledge, this was
done using a combination of LSB initscript headers through insserv, and a
program called startpar.

 Reading the source code of startpar, I was surprised to see that it does
its job through a primitive form of socket activation in the run() function,
where it allocates a so-called "preload" socket and determines exit status
by its availability for connection. Secondary routines include meddling with
ptys and file descriptors to curb interleaving and make sure the execution
state is clean and free of potentially blocking operations.

 Makes me wonder if Poettering ever read it, though his ostensible
inspiration was from launchd. That said, it does show that the systemd
supporters have overhyped the novelty of "socket activation" (inetd) even
more significantly than I had previously thought. Someone should make note
of this.

 In any event, I'm under the impression that most so-called parallel
service starters are really ones that start asynchronously in a clean
execution state, as true parallelism and even concurrency sound conceptually
quite difficult, particularly when you keep in mind that many boot processes
are primarily I/O-bound. systemd itself has a complex dependency system at
its backbone, with socket activation not being a mandatory thing from what
I've learned. It also blocks on occasion to fulfill start jobs, so evidently
it has synchronization methods that are contrary to its claims.

 If someone can clarify this issue or point to any concurrent/parallel
schemes for starting services at boot time that have been implemented, that
would be appreciated.


Re: s6 init-stage1

2015-01-06 Thread Colin Booth
On Tue, Jan 6, 2015 at 6:27 AM, Avery Payne  wrote:
> On Tue, Jan 6, 2015 at 4:02 AM, Laurent Bercot 
> wrote:
>> But on servers and embedded systems, / should definitely be read-only.
>> Having it read-write makes it susceptible to filesystem corruption,
>> which kills the guarantee that your machine will boot to at least a
>> debuggable state. A read-only / saves you the hassle of having a
>> recovery system.
>>
>
> Interesting concept.

I use xfs. If I'm going to use a journaling file system, I might as
well use one that doesn't have filesystem corruption.

-- 
"If the doors of perception were cleansed every thing would appear to
man as it is, infinite. For man has closed himself up, till he sees
all things thru' narrow chinks of his cavern."
  --  William Blake


Re: s6 init-stage1

2015-01-06 Thread Colin Booth
On Tue, Jan 6, 2015 at 4:02 AM, Laurent Bercot
 wrote:
> On 06/01/2015 09:00, Colin Booth wrote:
>
>> 1. Depending on your initramfs and your on-disk layout you can skip
>> mounting proc and sys. I know this is the case with Debian, probably
>> true elsewhere as well.
>
>
>  It all depends on the assumptions that init-stage2 makes, but yes,
> now that you're mentioning it, mounting /proc and /sys may be
> delayed, as long as none of the very early services need them.
> Make sure the login process and interactive root shell do not need
> them either, because if init-stage2 fails very early, being able to
> log in will make debugging/recovery a lot easier.
>
In Debian's case, the initramfs had already mounted /proc and /sys, so
trying to mount them again was causing things to fail.
>
>> 2. If you aren't starting udev until init-stage2, you'll need to
>> manually mknod null and console devices before the "Reopen
>> stdin/stdout/stderr" comment.
>
>
>  That only applies to people who want a static /dev. Most people
> will run some flavour of udev, and will probably want to keep the
> devtmpfs mounted on /dev, in which case the kernel exports
> /dev/null and /dev/console itself. (Probably with the wrong rights,
> but they're functional enough to get by until udev runs.)
>
Hm, true. I guess that note is only if you are running with /dev as a
symlink to /mnt/tmpfs/dev since you get a tmpfs in that case. This is
what the init-stage1 script assumes. So, either make the nodes, run
udev as part of init-stage1, or use devtmpfs. I suggest the last :)
>
>> 3. You'll need to either symlink /tmp into your tmpfs, mount a tmpfs
>> on /tmp as part of init-stage1, or remount / to rw before s6-svscan is
>> loaded. Otherwise the catch-all logger won't be able to do its thing
>> as written. Same deal with /service, though that one is documented and
>> expected.
>
>
>  Actually, none of those 3 things is needed for /tmp. :)
>  What *is* needed is a writable-by-root-only directory, to store the
> information init needs:
>  - The scan directory, which must be rw
>  - rw places to store the supervise/ and event/ subdirectories of
> the service directories, or a copy of the service directories
> themselves
>  - a rw place for the catch-all logger to run
>
>  /tmp is not ideal for this, for several reasons. One of which is
> as soon as stage 2 begins and user stuff runs on the system, creating
> files in /tmp isn't absolutely secure anymore, because filenames can
> be predicted and DoSsed. Another reason is conceptual: the information
> we need to store is not exactly temporary, it's not the throwaway
> stuff you'd expect to see in /tmp - on the contrary, it's vital to the
> system. So it's very unsightly to put it in /tmp.
>
Makes total sense. In that case though, s6-svscan-log/run should
probably be updated in the examples so that it doesn't try to use /tmp
since any /tmp/uncaught-logs symlink will be unavailable if a tmpfs
does get mounted or something cleans up /tmp. In the first case you're
doing more work in init-stage1 than necessary, in the second you're
back to having a rw root (if even for a second).
>
>  That is why I'm saying that s6 needs a tmpfs, distinct from /tmp,
> made in stage 1. Having a "private" tmpfs allows init to store the
> scan directory, the copies of service directories, and the catch-all
> logger directory, without impacting the rest of the system.
>  Since that tmpfs is needed anyway, /tmp might as well be a symlink
> to a public (mode 1777) subdirectory of it: it makes /proc/mounts
> cleaner. But it's not a requirement, and /tmp may be mounted as a
> separate tmpfs at some point in stage 2.
>
>
>  If you are reckless, totally insensitive to gracefulness, and you
> absolutely cannot deal with creating a tmpfs just for the sake of s6,
> you may try to use a subdirectory of the devtmpfs in /dev as an
> early root-only read-write place.
>  You will now forget I suggested that. *flash*
>
That. Wow. That's amazingly bad.
>
>> 4. If you don't want to have your dev mount in /mnt/tmpfs/dev (mostly
>> to keep ps output non-ugly and to kind-of stick to the FHS)
>
>
>  Eh, the FHS doesn't say that /dev should be a real directory. It can
> be a symlink all right. I checked. :P
>  Most Linux people will use udev, though, and for them /dev will be a
> devtmpfs: a real directory, and a mountpoint.
>
Mostly it's to keep the output of ps non-mangled when you ssh in. A
tty of pts/XX doesn't mess up the column output, a tty of
/mnt/tmpfs/dev/pts/XX definitely does. That said, I'm a bit surprised
that the FHS doesn't care beyond needing the name present.
>
>  The order in which init-stage2 starts services and interleaves them
> with one-shot commands should mirror your dependency graph. This is
> where a dependency management system would come in handy; I plan to
> work on a program that takes a dependency graph as its input (format
> TBD) and outputs a suitable init-stage2 script.
>
It would. In my case though, I knew the se

Re: s6 init-stage1

2015-01-06 Thread Avery Payne
On Tue, Jan 6, 2015 at 4:02 AM, Laurent Bercot 
wrote:
>
>  I very much dislike having / read-write. In desktops or other systems
> where /etc is not really static, it is unfortunately unavoidable
> (unless symlinks to /var are made, for instance /etc/resolv.conf should
> be a symlink to /var/etc/resolv.conf or something, but you cannot store,
> for instance, /etc/passwd on /var...)
>

What if /etc were a mount overlay?  I don't know if other *nix systems
support the concept, but under Linux, mounting a file system onto an
existing directory simply "blocks" the original directory contents
"underneath", exposing only the file system "on top", and all writes go to
the "top" filesystem.  This would allow you to cook up a minimalist /etc
that could be left read-only, but when the system comes up, /etc is
remounted as read-write with a different filesystem to capture read-write
data.  Dismounting /etc would occur along with all the other dismounts at
the tail-end of shutdown.  The only issue I could see is /etc/passwd having
a password set for root, which would be needed to secure the console in the
event that the startup failed somehow and /etc isn't mounted yet. This
implies a possible de-sync between the read-only /etc/passwd and the
read-write /etc/passwd; the former is fixed in stone, the latter can change.

> But on servers and embedded systems, / should definitely be read-only.
> Having it read-write makes it susceptible to filesystem corruption,
> which kills the guarantee that your machine will boot to at least a
> debuggable state. A read-only / saves you the hassle of having a
> recovery system.
>

Interesting concept.


Re: s6 init-stage1

2015-01-06 Thread Laurent Bercot

On 06/01/2015 13:12, Peter Pentchev wrote:

> Even better: most modern systems have a tsort(1) utility for this kind of
> topological sorting; BSD-derived systems have had it for ages.


 Interesting. Thanks for the heads-up - I had heard of tsort, but didn't
know exactly what it does.

 However, I'd like a tool that knows what steps it can parallelize.
A sequential output is great for function names in a piece of code,
but for services, the point is to start as many as possible in
parallel, and minimize the amount of synchronization points.

For instance, given
1 2
3 4
meaning 2 should happen after 1, and 4 should happen after 3,
tsort gives
1
3
2
4
but instead, I need something like
1 3
2 4
because 1 and 3 can happen in parallel, and same for 2 and 4.

 AFAICT, tsort cannot do that. (make might not be able to either,
but since it's more complex, it's harder to tell.)
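
 To make the "1 3 / 2 4" output above concrete, here is a minimal Python
sketch of one way to get it: a layered variant of the usual topological
sort, where each output line holds the nodes whose prerequisites all sit on
earlier lines. The function name parallel_layers is made up for this
illustration, and the input is assumed to be tsort-style (before, after)
pairs. It is not a tool from this thread, only a reading of the requirement.

```python
from collections import defaultdict


def parallel_layers(pairs):
    """Group nodes into layers; a node only depends on nodes in earlier layers."""
    successors = defaultdict(set)
    pending = defaultdict(int)  # number of unmet prerequisites per node
    nodes = set()
    for before, after in pairs:
        nodes.update((before, after))
        if before == after:
            continue  # tsort convention: "x x" only declares that x exists
        if after not in successors[before]:
            successors[before].add(after)
            pending[after] += 1

    layer = sorted(n for n in nodes if pending[n] == 0)
    layers = []
    placed = 0
    while layer:
        layers.append(layer)
        placed += len(layer)
        ready = set()
        for node in layer:
            for nxt in successors[node]:
                pending[nxt] -= 1
                if pending[nxt] == 0:
                    ready.add(nxt)
        layer = sorted(ready)

    if placed != len(nodes):
        raise ValueError("dependency cycle detected")
    return layers


if __name__ == "__main__":
    for line in parallel_layers([("1", "2"), ("3", "4")]):
        print(" ".join(line))
    # prints:
    # 1 3
    # 2 4
```

 Note that layering puts a barrier between consecutive lines, so it is
coarser than strictly necessary: in the example, 2 only needs 1, not 3. A
scheme that minimizes synchronization points would track per-service
dependencies instead.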

--
 Laurent


Re: s6 init-stage1

2015-01-06 Thread Peter Pentchev
On Tue, Jan 06, 2015 at 01:02:46PM +0100, Laurent Bercot wrote:
> On 06/01/2015 09:00, Colin Booth wrote:
[snip]
> >5. I made a few more classes of services for init-stage2 to copy into
> >the service directory. Specifically for things that I wanted running
> >ASAP and were udev agnostic. Those were: syslogd (using s6-ipcserver
> >and ucspilogd), klogd, cron, and udev. Mostly that was because I
> >needed udev running (and supervised) before bringing up dbus, and I
> >wanted to make sure /dev/log had a reader before I started bringing
> >anything up that might not want to talk to stdout instead (openssh,
> >I'm looking at you).
> 
>  The order in which init-stage2 starts services and interleaves them
> with one-shot commands should mirror your dependency graph. This is
> where a dependency management system would come in handy; I plan to
> work on a program that takes a dependency graph as its input (format
> TBD) and outputs a suitable init-stage2 script.
> 
>  (Crazy idea brewing. Dependency graph management is a solved problem:
> it's exactly what "make" does. So my program could simply translate
> the service dependency graph into a Makefile, and make would
> output the script. This requires more thought.)

Even better: most modern systems have a tsort(1) utility for this kind of
topological sorting; BSD-derived systems have had it for ages.

G'luck,
Peter

-- 
Peter Pentchev  r...@ringlet.net r...@freebsd.org p.penc...@storpool.com
PGP key:http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115  C354 651E EFB0 2527 DF13



Re: s6 init-stage1

2015-01-06 Thread Laurent Bercot

On 06/01/2015 09:00, Colin Booth wrote:


> 1. Depending on your initramfs and your on-disk layout you can skip
> mounting proc and sys. I know this is the case with Debian, probably
> true elsewhere as well.


 It all depends on the assumptions that init-stage2 makes, but yes,
now that you're mentioning it, mounting /proc and /sys may be
delayed, as long as none of the very early services need them.
Make sure the login process and interactive root shell do not need
them either, because if init-stage2 fails very early, being able to
log in will make debugging/recovery a lot easier.



> 2. If you aren't starting udev until init-stage2, you'll need to
> manually mknod null and console devices before the "Reopen
> stdin/stdout/stderr" comment.


 That only applies to people who want a static /dev. Most people
will run some flavour of udev, and will probably want to keep the
devtmpfs mounted on /dev, in which case the kernel exports
/dev/null and /dev/console itself. (Probably with the wrong rights,
but they're functional enough to get by until udev runs.)



> 3. You'll need to either symlink /tmp into your tmpfs, mount a tmpfs
> on /tmp as part of init-stage1, or remount / to rw before s6-svscan is
> loaded. Otherwise the catch-all logger won't be able to do its thing
> as written. Same deal with /service, though that one is documented and
> expected.


 Actually, none of those 3 things is needed for /tmp. :)
 What *is* needed is a writable-by-root-only directory, to store the
information init needs:
 - The scan directory, which must be rw
 - rw places to store the supervise/ and event/ subdirectories of
the service directories, or a copy of the service directories
themselves
 - a rw place for the catch-all logger to run

 /tmp is not ideal for this, for several reasons. One of which is
as soon as stage 2 begins and user stuff runs on the system, creating
files in /tmp isn't absolutely secure anymore, because filenames can
be predicted and DoSsed. Another reason is conceptual: the information
we need to store is not exactly temporary, it's not the throwaway
stuff you'd expect to see in /tmp - on the contrary, it's vital to the
system. So it's very unsightly to put it in /tmp.

 I very much dislike having / read-write. In desktops or other systems
where /etc is not really static, it is unfortunately unavoidable
(unless symlinks to /var are made, for instance /etc/resolv.conf should
be a symlink to /var/etc/resolv.conf or something, but you cannot store,
for instance, /etc/passwd on /var...)
 But on servers and embedded systems, / should definitely be read-only.
Having it read-write makes it susceptible to filesystem corruption,
which kills the guarantee that your machine will boot to at least a
debuggable state. A read-only / saves you the hassle of having a
recovery system.
 So, it should be the admin's choice, and I do not want s6 to force
the admin to mount / rw.

 That is why I'm saying that s6 needs a tmpfs, distinct from /tmp,
made in stage 1. Having a "private" tmpfs allows init to store the
scan directory, the copies of service directories, and the catch-all
logger directory, without impacting the rest of the system.
 Since that tmpfs is needed anyway, /tmp might as well be a symlink
to a public (mode 1777) subdirectory of it: it makes /proc/mounts
cleaner. But it's not a requirement, and /tmp may be mounted as a
separate tmpfs at some point in stage 2.
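
 As an illustration of the layout just described, here is a minimal Python
sketch that creates the directories inside an already-mounted private tmpfs.
Mounting the tmpfs itself needs root and is not shown. The subdirectory
names "service", "uncaught-logs" and "tmp", and the function name
prepare_private_area, are assumptions made for this example, not names taken
from the s6 examples.

```python
import os


def prepare_private_area(root, public_tmp_link=None):
    """Create root-only and public subdirectories under the private tmpfs."""
    scandir = os.path.join(root, "service")        # scan directory, rw
    logdir = os.path.join(root, "uncaught-logs")   # catch-all logger directory
    public = os.path.join(root, "tmp")             # optional public area

    for path in (scandir, logdir):
        os.makedirs(path, mode=0o700, exist_ok=True)
        os.chmod(path, 0o700)    # enforce the mode regardless of the umask

    os.makedirs(public, exist_ok=True)
    os.chmod(public, 0o1777)     # world-writable with the sticky bit, like /tmp

    if public_tmp_link is not None:
        # e.g. public_tmp_link="/tmp" on a real system (hypothetical usage)
        os.symlink(public, public_tmp_link)

    return scandir, logdir, public
```

 The symlink step is optional, matching the point above that /tmp may just
as well be mounted as a separate tmpfs later in stage 2.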

 If you are reckless, totally insensitive to gracefulness, and you
absolutely cannot deal with creating a tmpfs just for the sake of s6,
you may try to use a subdirectory of the devtmpfs in /dev as an
early root-only read-write place.
 You will now forget I suggested that. *flash*



> 4. If you don't want to have your dev mount in /mnt/tmpfs/dev (mostly
> to keep ps output non-ugly and to kind-of stick to the FHS)


 Eh, the FHS doesn't say that /dev should be a real directory. It can
be a symlink all right. I checked. :P
 Most Linux people will use udev, though, and for them /dev will be a
devtmpfs: a real directory, and a mountpoint.



> 5. I made a few more classes of services for init-stage2 to copy into
> the service directory. Specifically for things that I wanted running
> ASAP and were udev agnostic. Those were: syslogd (using s6-ipcserver
> and ucspilogd), klogd, cron, and udev. Mostly that was because I
> needed udev running (and supervised) before bringing up dbus, and I
> wanted to make sure /dev/log had a reader before I started bringing
> anything up that might not want to talk to stdout instead (openssh,
> I'm looking at you).


 The order in which init-stage2 starts services and interleaves them
with one-shot commands should mirror your dependency graph. This is
where a dependency management system would come in handy; I plan to
work on a program that takes a dependency graph as its input (format
TBD) and outputs a suitable init-stage2 script.

 (Crazy idea brewing. Dependency graph management is a solved problem:
it's exactly what "make" does. So my program could simp

Re: s6 init-stage1

2015-01-06 Thread Colin Booth
On Mon, Jan 5, 2015 at 5:03 PM, James Powell  wrote:
> The initial init bootscript that I'm currently drafting is in execline using 
> the template provided by Laurent. I was going to take the advice on using 
> /bin/sh rather than /bin/execlineb but I recanted that decision due to the
> fact I wanted to use the FIFO handling execline provides.
>
> My question about stage 1 is as follows for a target system of a PC desktop:
>
> If I am reading things correctly, assumingly, init-stage1 using the template, 
> I only need to correct the paths and include any mounts of virtual kernel 
> file systems not listed as well as get cgroups ready, and stage any core 
> one-time services like copying core service scripts to the service scan 
> directory on the tmpfs, correct, before passing off to init-stage2 to load 
> drivers, start daemons, etc.?
>
Laurent's answers are all great. Here are a few other things that I ran
into when adapting s6-init to run on my laptop (distro kernel and a
desire to not trash my root directory too badly), mostly in the
gotchas category:

1. Depending on your initramfs and your on-disk layout you can skip
mounting proc and sys. I know this is the case with Debian, probably
true elsewhere as well.
2. If you aren't starting udev until init-stage2, you'll need to
manually mknod null and console devices before the "Reopen
stdin/stdout/stderr" comment.
3. You'll need to either symlink /tmp into your tmpfs, mount a tmpfs
on /tmp as part of init-stage1, or remount / to rw before s6-svscan is
loaded. Otherwise the catch-all logger won't be able to do its thing
as written. Same deal with /service, though that one is documented and
expected.
4. If you don't want to have your dev mount in /mnt/tmpfs/dev (mostly
to keep ps output non-ugly and to kind-of stick to the FHS) you'll
need to make sure to manually create /dev/pts after you initially
mount a tmpfs or devtmpfs into /dev. This needs to get done before
starting your hotplug manager. udev mounts a devpts there for you when
started, but if you're running mdev you'll need to mount it yourself.
5. I made a few more classes of services for init-stage2 to copy into
the service directory. Specifically for things that I wanted running
ASAP and were udev agnostic. Those were: syslogd (using s6-ipcserver
and ucspilogd), klogd, cron, and udev. Mostly that was because I
needed udev running (and supervised) before bringing up dbus, and I
wanted to make sure /dev/log had a reader before I started bringing
anything up that might not want to talk to stdout instead (openssh,
I'm looking at you).
6. Lastly, since this was an init replacement on a distro-based
system, I made a script called "oneshots" that init-stage2 runs and that
fires off all the fake daemons that get started when you bring up a
Debian system. This is things like checking if you're booting while on
batteries, clearing old sudo privileges, and setting the hostname.

The first four are all things that blew up in my face in one way or
another, usually as early-boot kernel panics but sometimes as just a
lot of junk logged to the console while I was trying to log in.
Everything between the fdclose line and reopening stdin is super
fragile, and since we've unmounted /dev, it's impossible to boot
half-way and then start a shell to find out what exactly went wrong.

Good luck. Barring some experiments back in the summer I never
switched any of my daily-use systems to s6-init. I have virtuals that
are s6 top-to-bottom, but that doesn't particularly count.

> Thanks,
> James
>

Cheers!

-- 
"If the doors of perception were cleansed every thing would appear to
man as it is, infinite. For man has closed himself up, till he sees
all things thru' narrow chinks of his cavern."
  --  William Blake


Re: s6 init-stage1

2015-01-05 Thread Laurent Bercot

On 06/01/2015 02:03, James Powell wrote:

> The initial init bootscript that I'm currently drafting is in
> execline using the template provided by Laurent. I was going to take
> the advice on using /bin/sh rather than /bin/execlineb but I recanted
> that decision due to the fact I wanted to use the FIFO handling
> execline provides.


 Far be it from me to discourage you from your noble quest ! But
you could write it in sh and just use the 'redirfd' command from
execline, which does the FIFO magic.
 That said, performing multiple chain loads in shell is a pain,
and you'd have to write something like
"... ; exec redirfd -wnb 1 fifo /bin/sh -c '...'" which quickly
becomes a quoting nightmare for the second part of the script. So
maybe learning execline is the right choice. ;)

 

> If I am reading things correctly, assumingly, init-stage1 using the
> template, I only need to correct the paths and include any mounts of
> virtual kernel file systems not listed as well as get cgroups ready,
> and stage any core one-time services like copying core service
> scripts to the service scan directory on the tmpfs, correct, before
> passing off to init-stage2 to load drivers, start daemons, etc.?


 That is correct - it's how I designed the example init-stage1 in
the s6 tarball.
 It is not the only way to proceed, but I find it the safest way,
even safer than runit, because it puts the supervision infrastructure
in place extremely early on.

 The idea is that unlike svscan and runsvdir (that poll their scan
directory every five seconds no matter what), s6-svscan can be told
to scan for services immediately (with the s6-svscanctl -a command).
So it's possible for a s6-based system to start s6-svscan on an
empty, or nearly empty, scan directory, to have it running as early
as possible; then to populate the scan directory with other
services, and trigger s6-svscan so those services start immediately.
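
 The "populate, then trigger" step could be sketched as follows in Python.
This is only an illustration: the helper names add_services and trigger_scan
are made up, the service directories passed in are hypothetical, and the
only real command used is "s6-svscanctl -a" on the scan directory, as
mentioned above. The notify parameter exists so the copy logic can be tried
out without s6 installed.

```python
import os
import shutil
import subprocess


def trigger_scan(scandir):
    """Ask the running s6-svscan to rescan its scan directory right away."""
    subprocess.run(["s6-svscanctl", "-a", scandir], check=True)


def add_services(source_dirs, scandir, notify=trigger_scan):
    """Copy service directories into the scan directory, then trigger one scan."""
    added = []
    for src in source_dirs:
        name = os.path.basename(os.path.normpath(src))
        # symlinks=True keeps symlinks inside the service directory as symlinks
        shutil.copytree(src, os.path.join(scandir, name), symlinks=True)
        added.append(name)
    if added:
        notify(scandir)  # a single trigger covers the whole batch
    return added
```

 Batching the copies before a single trigger keeps the number of rescans
low; calling the trigger after each copy would work too.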

 So, stage 1 just becomes:
 a) Do what is strictly necessary to prepare the scan directory for
s6-svscan
 b) Fork a background process that will block until s6-svscan is
guaranteed running
 c) exec into s6-svscan, which will run as process 1 until shutdown.

 a) involves virtual filesystems, maybe cgroups, and setting up
a few basic services in the scan directory: s6-svscan's own logger,
udev, and an early getty for debugging, for instance.
 b) can be done in different ways, but I've found it simple to just
open s6-svscan's logging FIFO for writing in the *normal* way. That
operation will block until there's a reader for the FIFO, and such
a reader will appear when s6-svscan's logger is up, which means that
s6-svscan is up and running and has started its own logging service.
At this point, the background process can unblock, and run what I
call init-stage2, which can mount filesystems, populate the scan
directory and call s6-svscanctl -a to start services, etc.
 c) needs a way to *not* block on the FIFO while there is no reader,
and that's why redirfd is handy here.
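
 The FIFO behaviour that b) and c) rely on can be tried out with a short
Python sketch. Opening a FIFO for writing in the normal way blocks until a
reader shows up, and a plain non-blocking open for writing fails with ENXIO
while there is no reader. One way around that is to open the FIFO for
reading first, then for writing, then close the reading end. Whether redirfd
does exactly this internally is an assumption here; the function names are
made up for the example.

```python
import errno
import fcntl
import os


def plain_nonblocking_open_fails(path):
    """Return True if a non-blocking write-open of a reader-less FIFO gives ENXIO."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        return exc.errno == errno.ENXIO
    os.close(fd)
    return False


def open_fifo_writer_without_reader(path):
    """Return a write fd on the FIFO at `path` without waiting for a reader."""
    # A non-blocking read-open of a FIFO returns immediately.
    reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        # A reader (ours) now exists, so the write-open succeeds.
        writer = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    finally:
        os.close(reader)
    # Switch the descriptor back to blocking mode for later writes.
    flags = fcntl.fcntl(writer, fcntl.F_GETFL)
    fcntl.fcntl(writer, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)
    return writer
```

 Writes on the returned descriptor will fail with EPIPE until a real reader
(in the scheme above, s6-svscan's logger) opens the FIFO, so nothing should
be written to it before that point.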

 Note that this scheme allows exactly every service on the system to
be supervised. Even udev. Even s6-svscan's own logger. I believe this
is the cleanest way to use a supervision infrastructure.

 There is no theoretical objection to using that scheme with runit or
daemontools too. There simply are two practical roadbumps:
 * The FIFO trick. This can be solved by using execline's redirfd even
in a daemontools/runit installation, or by adding a few lines of
special code to daemontools/runit.
 * The fact that runsvdir/svscan can wait for up to 5 seconds before
launching a requested service: there is no way to trigger a scan like
there is with s6. This makes it hard to populate a scan directory
sequentially, because the waiting times (before runsvdir picks up the
new service directory and starts the service) will add up, and the
boot process will be very slow.
 But as a proof of concept, it could be done.

--
 Laurent


s6 init-stage1

2015-01-05 Thread James Powell
The initial init bootscript that I'm currently drafting is in execline using
the template provided by Laurent. I was going to take the advice on using
/bin/sh rather than /bin/execlineb but I recanted that decision due to the fact
I wanted to use the FIFO handling execline provides.

My question about stage 1 is as follows for a target system of a PC desktop:

If I am reading things correctly, and assuming init-stage1 uses the template,
I only need to correct the paths, include any mounts of virtual kernel file
systems not listed, get cgroups ready, and stage any core one-time services
(like copying core service scripts to the service scan directory on the
tmpfs) before passing off to init-stage2 to load drivers, start daemons,
etc. Is that correct?

Thanks,
James