Re: Be prepared for the fall of systemd

2022-08-04 Thread Jan Bramkamp

On 04.08.22 11:09, Tanuj Bagaria wrote:

What do we as a community need to do
  to get S6 into a "corporate friendly" state?

What can I do to help?


Here are some ideas:
- easier access to the VCS (git, pijul, etc)


I would not (yet) consider pijul common and stable enough to count 
toward that goal. I would recommend we accept that git is the 
established default VCS for *nix software development for the 
foreseeable future.


Each skaware software project has:

  * Its own git repository at 
https://git.skarnet.org/cgi-bin/cgit.cgi/$project (with plaintext 
read-only access at git://git.skarnet.org/$project as well)


  * A GitHub mirror at https://github.com/skarnet/$project.

A list of the existing projects is hosted at https://skarnet.org/software/.


- Issue tracking system
The per-project GitHub mirror issue trackers aren't disabled and are 
used occasionally, but their use is discouraged [1] (at least for 
support). Unless someone has a better idea I would recommend using them 
at least for bug tracking. The biggest problem I expect here is a drain 
on Laurent Bercot's time, and the biggest help would be for someone to 
moderate and classify the reports to preserve developer time. A 
useful moderator would have to know their technical limitations (when to 
bump an issue to a developer for further analysis), engage with human 
users so the project feels "alive" for lack of a better word, help 
reporters improve their issues to the point where they become actionable, 
and tag and assign the issues correctly. Such a post would require 
dedication and perseverance in the face of repetitive, thankless work. It 
would neither require a deep understanding of the implementation, nor are 
most developers a good fit for it. It requires its own skillset.

- CI/CD build chain (being careful not to make it too painful to use)
Running post-commit s6 and s6-rc regression tests (which don't exist, to 
the best of my knowledge) on several platforms would be enough to cover 
most of it.
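
To make this concrete, here is a rough sketch of the kind of smoke test
I mean (no such test exists in the tree as far as I know; the throwaway
service is just a sleep):

#!/bin/sh -e
# Supervise a throwaway service and check that it can be brought up
# and back down again.
tmp=$(mktemp -d)
mkdir -p "$tmp/sleeper"
printf '#!/bin/sh\nexec sleep 10000\n' > "$tmp/sleeper/run"
chmod 0755 "$tmp/sleeper/run"

s6-supervise "$tmp/sleeper" &
pid=$!
sleep 1
s6-svstat "$tmp/sleeper"     # expect: up (pid ...)
s6-svc -dx "$tmp/sleeper"    # bring the service down, then let s6-supervise exit
wait "$pid"
rm -rf "$tmp"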

- "idiot proof" website


The website is idiot proof in the sense that idiots bounce off it without 
wasting anyone's time but their own. It also provides reference/man-page 
style documentation, with a few pages explaining individual concepts that 
could be collected to be easier to discover.



- quick start / getting started guide
- easier access (better?) Documentation


That's exactly what's missing in my opinion: introduction/tutorial 
style documentation to bring down the *very* steep learning curve. It 
should further explain how the concepts fit together, back-referencing 
how they've already been applied in the tutorials.


Enough mechanisms are in place in s6 and s6-rc to implement most (sane) 
policies. The big missing quality-of-life feature is a safe frontend 
making dynamic reconfiguration easier. This feature is a work in progress 
[2], and development can probably be accelerated a great deal by throwing 
enough money at Laurent Bercot, enabling him to dedicate more of his time 
to completing it [3].


[1]: https://github.com/skarnet/s6/issues/31#issuecomment-1079312762

[2]: https://skarnet.org/software/s6-frontend

[3]: https://skarnet.com/projects/service-manager.html



Re: s6-svscan shutdown notification

2022-02-25 Thread Jan Bramkamp

On 24.02.22 06:16, Jan-willem De Bleser wrote:


Not an option to be its parent since there's no persistent supervisor of a
jail's root process, but that script using .s6-svscan/finish should do
nicely. Thanks for the suggestion!
I use s6-rc on the FreeBSD jail host as well, managing jails with their 
own supervision trees as long-running services, which makes each jail's 
supervision tree a subtree of the host supervision tree. I prefer to let 
s6-rc handle state transitions instead of using the more limited state 
management support in the jail utility. I use one service bundle per 
jail, containing a oneshot to create/destroy a persistent jail and a 
longrun that depends on the oneshot and uses jexec to start the 
supervision subtree inside the jail.
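
To sketch what that looks like in the s6-rc source format (all names,
paths and jail parameters below are made-up examples, not my actual
configuration):

#!/bin/sh
# Create the three source definitions for a hypothetical jail "www".
cd /etc/s6-rc/source

mkdir -p www www-jail www-jail-svscan

# Bundle tying the two pieces together.
echo bundle > www/type
printf '%s\n' www-jail www-jail-svscan > www/contents

# Oneshot: create/destroy the persistent jail itself.
echo oneshot > www-jail/type
echo 'jail -c name=www path=/jails/www persist' > www-jail/up
echo 'jail -r www' > www-jail/down

# Longrun: run the jail's own supervision tree via jexec.
echo longrun > www-jail-svscan/type
echo www-jail > www-jail-svscan/dependencies
cat > www-jail-svscan/run <<'EOF'
#!/bin/sh
exec jexec www /usr/local/bin/s6-svscan /run/service
EOF
chmod 0755 www-jail-svscan/run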


Re: s6 as a systemd alternative

2017-06-30 Thread Jan Bramkamp
s6-svscan and s6-supervise are very simple (and elegant), and in a 
way they do less than runsvdir and runsv: they don't go around allocating 
resources at runtime.



On 30.06.2017 22:38, Steve Litt wrote:

On Fri, 30 Jun 2017 19:50:17 +
"Laurent Bercot"  wrote:


The runsv executable is pretty robust, so it's unlikely to die.

   Yadda yadda yadda. Most daemons are also unlikely to die, so
following your reasoning, I wonder why we're doing supervision in the
first place. Hint: we're doing supervision because we are not content
with "unlikely". We want "impossible".

You want impossible. I'm quite happy with unlikely. With my use
case, rebooting my computer doesn't ruin my whole day. If it *did* ruin
my whole day, my priorities would be changed and I'd switch to s6.




  As far
as somebody killing it accidentally or on purpose with the kill
command, that's a marginal case. But if it were *really* important to
protect against, fine, have one link dir per early longrun, and run
an individual runsvdir on each of those link directories.

   And you just increased the length of the chain while adding no
guarantee at all, because now someone can just kill that runsvdir
first and then go down the chain, like an assassin starting with the
bodyguards of the bodyguards of the important people. Or the assassin
might just use a bomb and blow up the whole house in one go: kill -9
-1.

   The main point of supervision is to provide an absolute guarantee
that some process tree will always be up, no matter what gets killed
in what order, and even if everything is killed at the same time.

To me, the preceding isn't the main point of supervision. Supervision
benefits I value more are:

* Run my daemon in foreground, so homegrown daemons have no need to
   self-background.
* Consistent and easy handling of log files.
* Under almost all circumstances, dead daemons get restarted.
* Simple config and troubleshooting, lots of test points.
* POSIX methodologies ensure I can easily do special stuff with it.
* Ability to base process dependency on whether the dependee is
   *really* doing its job.


You
can only achieve that guarantee by rooting your supervision tree in
process 1.

Yes.


   With runit, only the main runsvdir is supervised - and even then it
isn't really, because when it dies runit switches to stage 3 and
reboots the machine. Which is probably acceptable behaviour, but
still not supervision.

If we're going to get into definitions, then let me start by saying
what I want is daemontools that comes up automatically when the machine
is booted. Whether or not that's supervision isn't something I care
about.


And everything running outside of that main
runsvdir is just hanging up in the air - they can be easily killed
and will not return.

Well, if they kill the runsv that's true, but if they kill the
daemon, no. Either way, I'm willing to live with it.



   By adding supervisors to supervisors, you are making probabilistic
statements, and hoping that nobody will kill all the processes in the
wrong order. But hope is not a strategy. There is, however, a strategy
that works 100% of the time, and that is also more lightweight because
it doesn't require long supervisor chains: rooting the supervision
tree in process 1. That is what an s6-based init does, and it
provides real, strong supervision; and unlike with runit, the machine
is only rebooted when the admin explicitly decides so.

I completely understand your point. I just don't need that level of
indestructibility.


   If you're not convinced: *even systemd* does better than your
solution. systemd obviously has numerous other problems, but it does
the "root the supervision tree in process 1" thing right.

LOL, my whole point is I don't necessarily think "root the supervision
tree in process 1" is right, at least for my use case. I *enjoy* having
a tiny, do-almost-nothing PID1.

Like I said before, if losing control of the system during special
circumstances would ruin my whole day, I'd change my priorities and use
s6.


   I appreciate your enthusiasm for supervision suites. I would
appreciate it more if you didn't stop halfway from understanding
everything they bring, and if you didn't paint your unwillingness to
learn more as a technical argument, which it definitely is not, while
flippantly dismissing contributions from people who know what they
are talking about.

But I didn't flippantly dismiss anybody or any contributions. I
pointed  out that one can, and I'll use different verbiage now, respawn
daemons early in the boot, before some of the one-shots had started.

I'm not an enemy of s6. I'm not an enemy of anything you apply the word
"supervision" to. I think I understand your reasons for doing what you
do. It's just that with my current use case, I've traded some of s6's
process and boot security (you know what I mean) for a simpler PID1 and
a standalone daemon respawner.

If and when I get a use case requiring more 

Re: s6, listen(8), etc.

2016-09-01 Thread Jan Bramkamp

On 01/09/16 15:43, Roger Pate wrote:

On Thu, Sep 1, 2016 at 8:34 AM, Laurent Bercot
 wrote:

 OK, now let's have a look at LISTEN_FDS.


I also find these particular implementation details a poor choice.  I
was going to recommend a different environment convention, but then I
saw the pre-existing convention was copied exactly.

If I was Daniel, I'd create something better.  But I'm not sure
there's enough interest/need to warrant it.  (Daemons currently
written to expect LISTEN_FDS could have a chain-loaded conversion
program.)

Not that I'm particularly knowledgeable here; s6's fdholding seems
able to fill this niche already.


FD holding is a very general mechanism and requires a protocol between 
the FD-holding daemon and the client(s). A "./run" script can access the 
FD-holding daemon, but this isn't the purpose of systemd-style 
socket activation.
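
To illustrate the "chain-loaded conversion program" idea quoted above:
a run script could bind the socket itself and present it to a
LISTEN_FDS-aware daemon the way systemd would (descriptors start at 3,
LISTEN_FDS holds the count, LISTEN_PID the daemon's pid). This is only a
sketch; the socket path and daemon name are made up, and
s6-ipcserver-socketbinder, if I remember its interface correctly, leaves
the bound socket on stdin.

#!/bin/sh
# Bind the unix socket, move it from stdin to fd 3 and set the
# systemd-style environment before exec'ing into the daemon.
exec s6-ipcserver-socketbinder /run/mydaemon.sock sh -c '
  exec 3<&0 0</dev/null
  LISTEN_FDS=1 LISTEN_PID=$$ exec mydaemon
'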


Socket activation is a very old idea in Unix, commonly implemented by 
the inetd superserver. This works well for forking servers with low 
setup overhead. There are several problems with forking one process per 
connection.


 * For some protocols forking a process per request is too much overhead.

 * Some daemons perform expensive setup operations e.g. OpenSSH used to 
generate an ephemeral SSHv1 server key during startup.


A socket requires very few resources compared to a running process, 
especially a process offering a non-trivial service, yet only the bound 
socket is required for clients to initiate a connection. This tempted 
Apple to not just create and bind sockets in the superserver, but also 
accept the (first) connection inside the superserver before spawning 
the service process. The problem is that now the superserver has to pass 
both the bound socket and the accepted connected socket to the service 
process. This requires a protocol between the superserver and the 
service. Both Apple and systemd "strongly recommend" that applications 
link against their libraries, resulting in annoying vendor lock-in.


On a classic Unix server running a few services this is unnecessary 
complexity, but these days most unixoid systems are powered by a single 
Lithium battery cell. Launchd went too far too quickly and the original 
implementation requires Mach IPC. In a launchd world every service is 
always available and the processes implementing it are spawned on 
demand. There is even a transaction concept enabling launchd to reclaim 
resources from cooperating services unless they're inside a transaction. 
This design works well on a laptop or smartphone. It fails spectacularly 
in the face of "evil legacy software" which doesn't follow the "one true 
way".


The systemd APIs look like they try to follow a middle ground, but they 
end up making a mess of things. Have a look at 
https://ewontfix.com/15/ if you want to know more about systemd's design flaws.


Re: Linuxisms in s6

2016-08-26 Thread Jan Bramkamp

On 25/08/16 23:17, Adrian Chadd wrote:

On 25 August 2016 at 14:13, Warner Losh  wrote:

On Thu, Aug 25, 2016 at 3:08 PM, Adrian Chadd  wrote:

On 25 August 2016 at 12:48, Lars Engels  wrote:

On Thu, Aug 25, 2016 at 08:46:10AM -0700, Adrian Chadd wrote:

On 24 August 2016 at 21:53, Jonathan de Boyne Pollard
 wrote:

http://adrianchadd.blogspot.co.uk/2016/08/freebsd-on-tiny-system-whats-missing.html?showComment=1471236502051#c1305086913155850955
, Adrian Chadd:


We're using s6 at work, and it works out mostly ok. Mostly once you get
around the linuxisms, and the lack of sensible time code in it (its
calculations for daemon run duration is based on system time, not wall
clock, so if your box boots jan 1, 1970 then gets NTP, things are..
hilarious), and some of the arcane bits to get logging working right.


What are these Linuxisms in s6?  s6-linux-utils and s6-linux-init have
Linuxisms, obviously.  But what Linuxisms does s6 have?


We just had a bunch of fun trying to get it to build right, and the
uptime stuff really threw us.

It's fine though, I found that s6 may start growing an IPC mechanism
so we could possibly do a launchd/jobd style service later (ie to run
things upon event changes, like ifup, ifdown, ifcreate, ifdestroy,
arbitrary messages, etc) so I may try incorporating it again. :)



Can't this be done with devd?


Sure, but I'm looking for something more generic than just devd. Like,
notifications about events like "default route is up" can be done by
sniffing the rtsock, but notifications like "ntpdate has updated the
date, we can now do crypto services" doesn't happen there right now.


devd was never intended to be limited to just device events from the
kernel. It has grown beyond that, and could easily grow to cope with
routing events and other notifications. No need to reinvent everything
for that.


Right. I don't want to reinvent the wheel if it can be avoided.


Afaik devd is limited to handling events reported by the kernel on 
/dev/devctl. There is no way to inject arbitrary events from userspace 
into devd (no, ptrace hacks don't count).



But there are other things that want to produce and consume events.
eg, openvpn bringing up a VPN triggering possible ipfw rule changes.
Or openvpn coming down triggering other ipfw rule changes.


FreeBSD offers several IPC APIs, but none of them can implement reliable 
multicast, as this would require an unbounded journal in stable memory. 
For most use cases, reliable notification of the current state is enough. 
Instead of reliably multicasting each message to each recipient, just 
send each observer the latest state of each observed value, e.g. in your 
OpenVPN example the IPFW wrapper doesn't care how many times the tunnel 
flapped. The user just wants the right firewall configuration for his 
current network environment. He doesn't want to replay every change on 
the way.
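
To make the "latest state wins" idea concrete, here is a toy sketch
(all paths and rule files are made up, and a real implementation would
still need a cheap way to poke the observer):

#!/bin/sh
# Publishers atomically replace a per-value state file; the observer
# only ever acts on its newest contents, so intermediate flaps are
# coalesced away instead of being replayed one by one.
STATE=/var/run/state/vpn0

case "$1" in
publish)                # e.g. called from the OpenVPN up/down hooks
        printf '%s\n' "$2" > "$STATE.new"
        mv "$STATE.new" "$STATE"   # atomic: readers see old or new, never half
        ;;
apply)                  # e.g. called from the ipfw wrapper when poked
        case "$(cat "$STATE")" in
        up)   exec ipfw -q /etc/ipfw.vpn-up ;;
        down) exec ipfw -q /etc/ipfw.vpn-down ;;
        esac
        ;;
esac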


On macOS, notifyd offers this service. The optimizations in notifyd 
make it too large to just import its services into an init process, but 
having a simpler reliable notification mechanism available early would be useful.



What devd doesn't do is launchd / job control. That would be a whole
new kettle of fish for it, and one for which it may be ill suited. Though
viewed in the right way, it's all just a series of notifications: This service
is up, this is down, user wants to start this other one, etc, so maybe
it wouldn't so bad...


Well, ideally the jobd would sit on the message bus and take commands
to do things. Like dbus and udevd did in linux, before systemd
steamrolled over them. But then if I suggest we need a message bus
daemon up and going so arbitrary system pieces could talk to other
system pieces, I'll likely be shouted at.

But not by jkh. He'd likely be "YOURE ONLY JUST GETTING AROUND TO THIS
NOW?" and laugh a lot.

(jkh - please come to the next bafug so we can talk shop..)


Process spawning and supervision should be separate from the policy 
engine(s) as the process supervision graph should be a tree rooted in 
pid 1, but the user might want to run multiple rule/policy engines. An 
init process should just offer the required mechanisms and nothing more. 
Convenient policies can be implemented on top of those mechanisms.


For my own init system I'm still not sure if the init process should 
track services and their dependencies at all, or only processes, keeping 
the concept of services and dependencies in a separate service management 
process instead of the init process.


Re: Linuxisms in s6

2016-08-25 Thread Jan Bramkamp

On 25/08/16 06:53, Jonathan de Boyne Pollard wrote:

http://adrianchadd.blogspot.co.uk/2016/08/freebsd-on-tiny-system-whats-missing.html?showComment=1471236502051#c1305086913155850955
, Adrian Chadd:


We're using s6 at work, and it works out mostly ok. Mostly once you
get around the linuxisms, and the lack of sensible time code in it
(its calculations for daemon run duration is based on system time, not
wall clock, so if your box boots jan 1, 1970 then gets NTP, things
are.. hilarious), and some of the arcane bits to get logging working
right.


What are these Linuxisms in s6?  s6-linux-utils and s6-linux-init have
Linuxisms, obviously.  But what Linuxisms does s6 have?



The skalibs library used by s6 to calculate the deadlines should use 
clock_gettime(CLOCK_MONOTONIC) on FreeBSD and as such shouldn't be 
affected by changes to the wall clock.


I'm currently working on a FreeBSD-only potential init replacement as 
well, just without the mandatory per-service supervisor process. The new 
kqueue EVFILT_PROCDESC filter type in FreeBSD 11, combined with pdfork(), 
should make it really easy to deal with child processes in a single 
unified kevent loop. Forking services could still be handled by a 
supervisor using procctl(PROC_REAP_ACQUIRE).


At the moment I'm fighting with some corner cases in the file descriptor 
passing code and redesigning the API to work without access to a 
writable file system. My last API required a writable file system 
because FreeBSD doesn't support listen()ing on unbound unix domain 
seqpacket sockets and I don't want to require something like the Linux 
/run tmpfs. Instead my new API uses socketpair() to create a connected 
pair of anonymous unix domain sockets for each supervised process. Next 
I have to find out if fexecve() works at least for fd 0, 1 and 2 without 
a mounted fdescfs.


I want to implement the following features in a single process capable 
of running as PID 1:

- Track service dependencies (want, require, bind, conflict)
- Store, retrieve and close file descriptors.
- Spawn and supervise processes in a well defined environment.
- Reliable event notification with coalescing.
- Bootstrap the system with help from a default service.

With those features it should be able to wrap existing rc.d scripts 
without resorting to polling.


Re: s6-linux-init: SIGUSR1 and SIGUSR2

2016-08-22 Thread Jan Bramkamp

On 22/08/16 16:29, Martin "eto" Misuth wrote:

On Mon, 22 Aug 2016 07:26:12 -0700

Colin Booth  wrote:

My own $0.02 is that s6-svscan -S should ignore power state signals
(including SIGINT which it currently doesn't ignore).


I haven't really understood this thread, but I think I am starting to
understand.

Correct me, if I am wrong, but is this about signals, which generate various
cleanup paths to end normal machine operations aka "curuising" mode?

Because if so, there is problem I hit before, but thought it was PEBKAC on my
part.

Currently on FreeBSD, when s6-svscan runs as PID1, native "reboot" command makes
s6-svscan do proper cleanup (I guess it sends signal which is
correctly interpreted by s6-svscan as shutdown signal). While shutdown does
"nothing". I guess on that box I need to add -S switch and introduce signal
handling scripts?


FreeBSD has a different convention.

SIGHUP: Reload /etc/ttys. Since s6 doesn't start getty etc. from 
/etc/ttys, there is no need to handle this signal. I just log it to the 
default logger.


SIGINT: Reboot the system. I implemented it as `s6-rc -a -d change` 
followed by `s6-svscanctl -i /run/service`.


SIGQUIT: Not used by FreeBSD init. Just log and otherwise ignore the signal.

SIGUSR1: Halt the system. Use `s6-rc -a -d change` to stop all services 
and `s6-svscanctl -st /run/service` to halt.


SIGUSR2: Poweroff the system. Use `s6-rc -a -d change` to stop all 
services and `s6-svscanctl -pt /run/service` to power down.


It's also a good idea to kill all remaining processes, update the 
/entropy file and sync the disks from s6-svscan's .s6-svscan/finish script.
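
For illustration, the diverted-signal script for the poweroff case
looks roughly like this on such a setup (the halt and reboot scripts
differ only in the final s6-svscanctl flags):

#!/bin/sh
# /run/service/.s6-svscan/SIGUSR2 -- FreeBSD "poweroff" convention:
# bring all s6-rc managed services down, then tell s6-svscan to tear
# down the supervision tree and power off.
s6-rc -a -d change
exec s6-svscanctl -pt /run/service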


I can upload my /etc/service and /etc/s6-rc/source if you're interested.


Re: How to supervise an early process [root pivot]

2016-06-21 Thread Jan Bramkamp



On 21/06/16 16:24, Martin "eto" Misuth wrote:

On Tue, 21 Jun 2016 14:45:59 +0200
Laurent Bercot  wrote:

...
  With udevd, the workaround is to kill it after you have performed the
coldplug, and only restart it as part of your normal boot sequence once
you have pivot_rooted. It can be supervised at this point.



Thank you! Especially for mdev coldplug process description!

I asked, because it seems FreeBSD will be getting pivot_root like
capabilities soon. This makes it more similar to Linux in a way. And opens
some weekends for tinkering. It also introduces remote posibility of situation
like described actually happening there too.


FreeBSD 10.3 (the latest release as of writing) includes rerooting 
support. By passing the rerooting flag to the reboot system call, 
userland can tell the kernel to start the usual shutdown (kill all 
processes including init, unmount all filesystems including "/"), and 
after unmounting the root filesystem the kernel performs a "userland 
reboot" by mounting a new root filesystem and starting a new init process.


There are lots of use cases for this, e.g. configuring the in-kernel iSCSI 
initiator from a small netboot image and switching to an iSCSI LUN as the 
root file system. Another example is full-disk-encrypted systems without 
a trusted system console. In that case you can use a minimal unencrypted 
system to unlock the encrypted disks and reroot into your encrypted devices.


Use kenv vfs.root.mountfrom="<fstype>:<device>" to set the filesystem 
type and device path before you invoke "reboot -r".
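
For example, rerooting from a small netboot image into a UFS filesystem
on an already attached disk could look like this (the device name is
made up):

# Tell the kernel where the new root lives, then ask it to reroot.
kenv vfs.root.mountfrom="ufs:/dev/da0p2"
reboot -r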


Re: s6-svscan & SIGPWR

2016-03-14 Thread Jan Bramkamp

On 14/03/16 17:42, Jan Olszak wrote:

Hi!
We're running s6 in an lxc container.

lxc-stop sends SIGPWR to the init process (s6-svscan) to stop the
container, but SIGPWR isn't handled. It just gets discarded as if nothing
happened.

Is there a reason it works this way?

Thanks!
Jan



Probably because it masks all signals, whitelists the signals the authors 
thought about, and offers to proxy those to scripts. After all, s6-svscan 
is designed to be a good pid 1, and exiting because of a signal's default 
disposition would result in an instant kernel panic. The correct solution 
would be to teach s6-svscan about SIGPWR and proxy it to .s6-svscan/SIGPWR.


Re: What's the difference between s6 and s6-rc?

2016-02-25 Thread Jan Bramkamp



On 25/02/16 15:47, Steve Litt wrote:

Hi all,

Could somebody, in one or two sentences, tell me the difference between
s6 vs s6-init?


From the subject I infer that you're asking for the difference between s6 
and s6-rc, because there is no project or binary named s6-init.



I'm not looking for references to big old documents: I'm looking for
one or two sentences telling me the difference.


s6 is a collection of tools. Together these tools implement service 
supervision. One of those tools is s6-svscan. It scans a service 
directory and starts one s6-supervise process per service. It can also 
serve as pid 1 (aka init) of a unix system.


While s6 is a good service supervisor on its own, it lacks service 
management. s6 can only track running processes and offers no ordering 
between them. You can block in a ./run script until all your 
dependencies are fulfilled, or just exit and retry until it works. 
Writing this logic in every ./run script is a burden on the user, but 
manageable.
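
A ./run script doing that can be as small as this sketch (the service
and daemon names are placeholders; s6-svwait does the blocking, and a
non-zero exit simply makes s6-supervise try again later):

#!/bin/sh
# Wait up to 5 seconds for the database service to be up; if it isn't,
# exit and let s6-supervise restart us.
s6-svwait -u -t 5000 /run/service/postgres || exit 1
exec mydaemon --foreground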


On its own s6 can't track state not represented by a running process 
e.g. a mounted file system. This forces a system booting with just s6 to 
run a (possibly very small) script before s6-svscan can take over. 
Dependencies can only be modeled between running processes. This is good 
enough for small embedded devices or (most) systems run by a very 
experienced user willing to tinker with his boot process.


s6-rc builds on top of s6 to provide support for tracking and changing 
state. It does this in a surprisingly safe, clean and simple way. That in 
turn makes it hard to understand, because s6-rc is just an empty hull: 
you still have to write your own startup code fitting into its structure. 
The s6-linux-init repo contains an example of what it could look like.


Safely replacing /etc/s6-rc/compiled

2015-11-12 Thread Jan Bramkamp
Is there a way to update /etc/s6-rc/compiled without risking a broken 
system if the power fails midway through the update?
The best I can come up with is to use a symlink to a uniquely named 
compiled directory and a recompile wrapper along those lines:


#!/bin/sh -e
cd /etc/s6-rc
SUFFIX=$(date +%Y%m%d%H%M%S)                # any unique suffix will do
s6-rc-compile .compiled-$SUFFIX source
s6-rc-update "$@" "$PWD/.compiled-$SUFFIX"  # extra arguments become s6-rc-update options
ln -s .compiled-$SUFFIX .compiled-$SUFFIX.link
mv -h .compiled-$SUFFIX.link compiled       # rename over the old symlink without
                                            # following it (GNU mv wants -T instead of -h)
# TODO: cleanup old .compiled-* dirs
fsync compiled .                            # flush the new entries to disk


Re: s6: something like runit's ./check script

2015-09-08 Thread Jan Bramkamp



On 03/09/15 20:23, Laurent Bercot wrote:

On 03/09/2015 18:25, Buck Evan wrote:

An s6-checkhelper wrapper that implements exactly the above would make me
happy enough.


  Yes, that's envisionable. I'll think about it.


I pondered over the problem and came up with the following conclusions:
- Polling is ugly, but useful enough to implement anyways.
- Polling has no place in s6-supervise.
- It's infeasible to implement the s6 readiness protocol in every service.
- The runit ./check scripts polled by sv (the runit equivalent to 
s6-svc) scale badly for dependency trees with a high fan-out.


Bitching about the limitations of open source software is easy so here 
is my proposed solution:

- Have one polling daemon.
- A wrapper registers polled services with the polling daemon.
- The wrapper passes the service directory fd (its working directory) 
and the notification fd to the polling service over a unix domain socket 
(maybe with a polling frequency and scaling factor).
- The polling daemon invokes ./check until it exits successfully or the 
polled service is brought down (a minimal example of such a check script 
follows below).
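
For concreteness, the ./check script such a poller invokes can be
trivial, e.g. (address and port are placeholders):

#!/bin/sh
# ./check: exit 0 once something actually accepts connections on the
# service's port, non-zero otherwise.
exec nc -z 127.0.0.1 8080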


I'm willing to implement this, but I have no experience with the skalibs 
API and would probably make a mess of it that integrates poorly with 
the source code of the s6 suite. I would also need some testers for 
platforms other than FreeBSD.


This solution would offer polling for those accepting the trade-offs and 
keep the rest of the s6 suite clean.


Re: Hello world and couple of questions

2015-09-08 Thread Jan Bramkamp



On 07/09/15 16:36, Martin "eto" Misuth wrote:

Hi,

new s6 user here.

I started using s6 to do FreeBSD jail services management. It seems this helped
me to get rid of some socket timeout errors (?) in my jails setups.

In my deployment I am using "jail as vm" approach.

This works pretty well in my testing vm and on testing server.

Being very happy with the preliminary system I decided I want to replace init
on freebsd similar way runit does it.

However runit is fullblown init in stage 1 + and zombie reaper in stage 2,
while s6 uses execline script for stage 1.

After some smashing things together, I took original init.c from base FreeBSD
src tree and removed parts until it's just rudimentary initialization
function that execves into execlineb boot script.

My environment is ZFS only so, I just did really dirty hacks to have ZFS fses,
and some other late fses mounted and some other things.

This seems to work reliably both in vm and on real hw.


I too ran into the problem of incompatible signal number conventions 
between FreeBSD and Linux init processes. My solution was to #ifdef the 
signal handler registration in the s6 init code and write a few ports 
which install the skatools collection in base (PREFIX=/ instead of the 
normal PREFIX=/usr/local). These ports install all the binaries in /bin 
and /sbin, and because they need only the libc at runtime, execline 
scripts work as early init. As init has to mount all filesystems except 
/, it has to work with just the contents of the root fs. By convention 
ports install as much as possible under a $PREFIX of /usr/local. Unless 
you want to mix /package with hier(7), a proper execline port usable as 
an early init "interpreter" would require the port to install into the 
rootfs, which means /bin (an interpreter has no place in /sbin). Since 
most of the s6 suite is required for debugging any problems, the s6 port 
should install its binaries into the rootfs as well. That leaves the rest 
of the nice s6 tools: they should probably install into /usr/local like 
normal ports, as should the services, because the early init script 
should mount all the critical (local) filesystems or fail hard.



One problem I am having though, is "uncaught" logger dance, as unlike on linux,
stage 1 dies with "broken pipe" signal when it tries to write to logger fifo
(as it has no reader yet).
For now I got around that by redirecting stderr to logger only right before
supervisor boots up, however that means initial messages appear only on
console.


Setting up a default logger requires some trickery. The idea is to open 
a non-blocking pipe (because opening a blocking pipe requires a reader) 
and then toggle the opened pipe from non-blocking to blocking.
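
From memory (please double-check the redirfd documentation before
copying this), the early-init end of that trick looks roughly like the
following, calling execline's redirfd chainloader from sh; the fifo
path and the stage-2 script are placeholders:

#!/bin/sh
# Open the catch-all logger fifo for writing without blocking even
# though it has no reader yet, flip the fd back to blocking, point
# stderr at it too, then continue with the rest of early init.
exec redirfd -wnb 1 /run/service/default-log/fifo \
     fdmove -c 2 1 \
     /etc/rc.stage2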



Is there some way to buffer those from fifo until last resort logger
starts (using tools in s6 packages)?
Or can I somehow spawn logger before supervisor spins up and have it "descent"
later into supervision tree?


I have a working FreeBSD default logger service and early init execline 
script in a virtualbox VM.



Second problem is when I compile s6 and portable utils package with:
./configure --enable-static-libc
all s6 tools are correctly identified by 'file' command as fully static ELFs,
however s6-svstat for example fails to work correctly.


There is no reason to use a static libc on FreeBSD unless you plan to 
break your linker or libc. In all my years I never needed /rescue 
because my linker was broken.



It always reports given service as down.
Without static libc it all works as intended.
Is this a bug?


As a user I would consider this a bug. The question is whether it's a bug 
in the FreeBSD libc or in s6-svstat. But there are few good reasons to 
link static binaries on FreeBSD, and quite a few reasons against static 
linking, among them that updates to the FreeBSD world may patch the libc 
without rebuilding your static binaries.


Re: s6: something like runit's ./check script

2015-09-08 Thread Jan Bramkamp

On 08/09/15 13:24, Laurent Bercot wrote:

On 08/09/2015 10:49, Jan Bramkamp wrote:

- Have one polling daemon.
- A wrapper registers polled services with the polling daemon.


  No can do, sorry.

  You can't make a supervision infrastructure depend on daemons,
because daemons depend on a supervision infrastructure. The
polling daemon would have to be supervised; of course it would
not perform polling itself, but it would still need to be up
every time a polling service starts. That's way too much
bootstrap complexity and possibility of failure (with bad
consequences) for that kind of functionality.


Only daemons requiring external polling would depend on this daemon and 
it would support the s6 notification interface itself.



  Avoiding dependencies from s6 mechanisms to a daemon is the
reason for the fifodir stuff, for instance. If I could have a
daemon, I'd use one to pubsub notifications - it would be
cleaner. But no.


That's why I suggested a daemon only for services lacking support for 
the s6 readiness notification API.



  Also, what's wrong with a simple unsupervised background process
that dies when its job is done?



If the script fails for any reason the service is stuck and won't come 
up. Such simple scripts shouldn't fail but they might still run into 
resource limits or other (temporary) problems. I agree that a simple 
background script forked from ./run will work in >99% of the cases.
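
For reference, that simple approach fits in a few lines of ./run (a
sketch; it assumes a notification-fd file containing 3, and ./check as
well as the daemon name are placeholders):

#!/bin/sh
# Fork a background poller that runs ./check until it succeeds and then
# writes a newline to the notification fd, while the foreground process
# execs into the actual daemon.
(
  while ! ./check; do
    sleep 1
  done
  echo >&3          # tell s6-supervise the service is ready
) &
exec 3>&-           # the daemon itself doesn't need the notification fd
exec mydaemon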


I don't like this failure mode. A polling daemon could keep its state 
(except the file descriptors) in the file system and, after a restart, 
restart any polled services that were in transition at the time in order 
to reacquire the file descriptors.


Re: s6: something like runit's ./check script

2015-09-08 Thread Jan Bramkamp

On 08/09/15 14:05, Laurent Bercot wrote:

On 08/09/2015 13:51, Jan Bramkamp wrote:


If the script fails for any reason the service is stuck and won't
come up. Such simple scripts shouldn't fail but they might still run
into resource limits or other (temporary) problems.


  If the polling script fails, it will die. The run script will then
die. The supervisor will restart it. Eventually, when temporary problems
get fixed, the polling script will succeed, and the service will
make it.

  Don't overengineer stuff. Because we can easily manage daemons and
notifications and complex mechanisms doesn't mean we should. ;)


How would the ./run script, or more likely the daemon it exec()ed into, 
die from a failed child process? If it does so reliably, I agree that 
more elaborate constructs aren't worth the complexity.