s6-rc - odd warn logging and a best practices question
Hey all,

I've reconfigured one of my Debian systems to boot with s6-init/s6-rc, and I've been debugging a timing issue that I think was my own fault (my all-services bundle didn't contain my ersatz single-user bundle). That mucked up a bunch of timing, since half of the initialization stuff wasn't running until I tried to start syslogd.

Anyway, as part of the debugging I found some garbage in the verbose logging that might be a logging issue and might be something more serious. When running `s6-rc -v2 change $bundle', the s6rc-fdholder "pre addition of notification-fd" message has some garbage characters in it:

    s6-svc: warning: /run/s6/rc/scandir/s6rc-fdholderà¯'þpre addition of notification-fd
    s6-svc: warning: /run/s6/rc/scandir/s6rc-fdholder/notification-fdpost addition of notification-fd

The garbage changes each time, but it's always there. Also, I know it's aesthetic, but it'd be nice to have a space between the service name and the text "pre addition" or "post addition".

As for the best practices question: what's the right way to fake service notification for daemons that don't support it? My udev run script is the following, which, while I'm pretty sure it works, strikes me as not the best:

    #!/command/execlineb -P
    fdmove -c 2 1
    if { foreground { /etc/init.d/udev start } foreground { /bin/udevadm control --exit } }
    fdmove 1 3
    foreground { s6-echo }
    /lib/systemd/systemd-udevd

Firstly, that seems to be leaving me with a pipe to nowhere on fd 1 that never closes unless I re-fdmove fd 2 back onto fd 1 (not sure if that matters, mind you; it probably depends on whether the service chats over stdout at all). Secondly, that seems really hacky to me.
Now, I'm pretty sure that the cleanest method would be to break it up into two atomics:

    oneshot udev-init - runs `udev start' and `udevadm control --exit'
    longrun udev-svc  - normal run script handling the maintenance of systemd-udevd

The general question, though, is: what's the best way to handle readiness notification on services that run a prep script before starting the daemon proper? Assuming daemon availability is relatively instant, is foregrounding your initialization script and then moving the notification fd onto stdout right before sending a blank message the best method?

I'll do some more testing on the potential timing issues that I've (hopefully) fixed, but so far it's been an interesting experience. I'm hoping I can sort out the remaining issues without having to force ordering to the same level as Debian's rcX.d/S0Xservice scripts or resorting to check loops inside of run scripts.

Also, it'll be nice to have s6-rc-update, I've been rebooting... a lot. Once I've got my laptop booting correctly all the way into X with the wireless running and a few things like the sleep button working (`/usr/bin/pm-suspend' works; Fn+F4... not so much, until I run `/etc/init.d/acpid start' once, even if I turn it off afterwards. Why? Beats me), I'll post some comments, bundle up my init stuff, and see about making it available for folks who want to go full crazy.

Cheers!

--
If the doors of perception were cleansed every thing would appear to man as it is, infinite. For man has closed himself up, till he sees all things thru' narrow chinks of his cavern. -- William Blake
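For readers following along, the two-atomics split proposed in this message could be laid out roughly as below. This is a sketch, not something posted in the thread. It assumes the s6-rc-compile source format, where each service is a directory holding a `type' file, a oneshot has an `up' file containing an execline command line, and a longrun has a `run' script. The flat `dependencies' file is my understanding of the format in use at the time; I believe newer releases prefer a dependencies.d/ directory, so check the documentation for your version. SRC is a made-up variable naming a scratch directory, and the contents of `up' are a guess at what the oneshot would run.

```shell
#!/bin/sh
# Sketch only: lay out s6-rc source definitions for the two atomics.
# SRC is a hypothetical scratch directory; point it at whatever source
# tree you feed to s6-rc-compile.
SRC=${SRC:-$(mktemp -d)}

mkdir -p "$SRC/udev-init" "$SRC/udev-svc"

# Oneshot: "up" holds an execline command line, not a full script.
echo oneshot > "$SRC/udev-init/type"
cat > "$SRC/udev-init/up" <<'EOF'
foreground { /etc/init.d/udev start }
/bin/udevadm control --exit
EOF

# Longrun: an ordinary run script, ordered after the oneshot.
# No notification-fd file is written, so readiness is not being faked.
echo longrun > "$SRC/udev-svc/type"
echo udev-init > "$SRC/udev-svc/dependencies"
cat > "$SRC/udev-svc/run" <<'EOF'
#!/command/execlineb -P
fdmove -c 2 1
/lib/systemd/systemd-udevd
EOF
chmod +x "$SRC/udev-svc/run"

echo "source definitions written to $SRC"
```

With this layout the ordering between the heavyweight prep work and everything that needs udev comes from the dependency on udev-init rather than from a readiness message.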
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 1:16 PM, Colin Booth cathe...@gmail.com wrote:
> By the way, I've found a maybe-bug that, if real, is pretty severe. `s6-rc -d change all ; some stuff ; s6-rc -u change all' has caused my s6-init + s6-rc testbed system to remove the control pipe for my pid 1 s6-svscan. I need to make sure it wasn't something I did between things, and to make sure it wasn't mucked up handling in various scripts that I was running. I'm at work right now so I can't test it out, but sometime in the next day or so I should have the cycles to test it out.

Not a bug in s6-rc or s6 but in some Debian script somewhere. Some single-user script appears to re-mount all mount points, which has the net result of causing all file handles into tmpfs mounts to go stale. That's what's breaking s6-svscan. Once I isolate it, I'll see if I can avoid calling that script, and if I have to I'll see about moving its execution somewhere safe.

I am learning way more about the complexities of the distro boot cycle than I'd ever expected to this week.

Cheers!
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 2:35 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> I can't grep the word addition in my current git, either s6 or s6-rc. Are you sure it's not a message you wrote? Can you please give me the exact line you're running and the exact output you're getting? Thanks,

Ugh, it was something I'd hacked into s6-svc early on in the life of s6-rc to track down some issue I was having with something. I never committed it and sort of assumed that the next git pull I made would have complained and forced me to back it out. Apparently git merge got smarter about unstaged non-conflicts recently.

Mystery solved! I'll reply to the other stuff in the other mail fork.
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 1:57 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> Just don't have a notification-fd file. s6-rc will assume your daemon is ready as soon as the run script is started. It may spam you with a warning on high verbosity levels, but that's it. :)

Yeah, this is for the special case where you have a daemon that doesn't do readiness notification but also has a non-trivial amount of initialization work before it starts. For most things doing the below talked about oneshot/longrun split is best, but sometimes you need to run that initialization every time (data validators are the most obvious example).

> If your daemon doesn't support readiness notification, I'd generally advise not to pretend it does: even if daemon availability is fast, the scheduler can always screw you. So yes, if at all possible, having the init in a oneshot and the daemon in a longrun depending on the init oneshot is the best way to go, without declaring a notification-fd for the longrun. If it's not possible, foregrounding the init, then sending a blank notification message, then execing the daemon, is probably the least ugly way to proceed.

I was using the readiness signal to enforce the timing between udev's heavyweight system prep scripts and everything that depends on udev. Starting udev itself is trivial, and I'm pretty sure that udev doesn't need to be guaranteed to be running for other things to start; it just needs to be running for the preparation steps. Hence that "run the sysvinit udev script, immediately afterwards stop udev, start it again supervised" dance. Breaking it out into a pair of atomics is a lot more elegant. Why didn't I think of that until last night when I've been experimenting with this since Monday? Dunno.

> I'm surprised that systemd-udevd doesn't provide notification: that's one of the least bad reasons for daemons to integrate with systemd. Doesn't it use sd_notify() ?

It does provide notification, but only if you're running under systemd. At least according to the sd_notify() docs. I'll see about faking up the environment so sd_notify() is happy and report back.

>> Also, it'll be nice to have s6-rc-update, I've been rebooting... a lot.
>
> No need to reboot:
>
>     s6-rc -da change
>     for i in `ls -1 $live/servicedirs` ; do rm $scandir/$i ; done
>     s6-svscanctl -an $scandir
>     rm -rf `basedir $live`/`s6-linkname $live`
>     rm $live
>     s6-rc-init -l $live -c $newcompiled $scandir
>     s6-rc -u change $everythingbundle
>
> That's more or less what s6-rc-update will do, of course with optimizations to avoid restarting everything.

Actually, the more I think about it, the less s6-rc-update will help me avoid reboots in the short term since part of what I need to get back is a pristine post-boot environment.

> Power management on Linux laptops is high-level demonology, and mere mortals should not dabble in it, lest their souls be consumed. I had a friend who tried and came back shaking and drooling... it took him a long while to recover. Fortunately, there's *almost* no permanent damage to his mind.

HA! I'm pretty sure the failure is in some acpi policy handling glue code that isn't getting set right. The init.d/acpid script isn't terribly complicated; I simply need to capture the system state before and after the init script is run.

Cheers!
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 10:24 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> Oh, the protocol is complicated too. If I start to implement it, there's no stopping, and I'll be running behind systemd every time they add something to the protocol, which is exactly what I don't want to do.

Sure. And I bet that listening for any message on the socket isn't good enough since things might be chattery.

> You can enforce a non-race by synchronizing both processes, i.e. making the notification listener notify the notification sender that it is ready to receive a message. I'm not even joking. Notifiception is a thing with the wonderful systemd APIs.

NOW we're talking!

> I see. You could pull those out of the set of services managed by s6-rc and just run them sequentially at boot time, until s6-rc-update is out.

Yeah, but then you get into that question of what you do with oneshots that depend on longruns which are required for initialization... Like I said, it's a bit of a mess but isn't any more of a mess than someone who is doing early boot optimization in any other init. Once I've sorted out all the timing issues (and I think I'm close) it should be fine.

By the way, I've found a maybe-bug that, if real, is pretty severe. `s6-rc -d change all ; some stuff ; s6-rc -u change all' has caused my s6-init + s6-rc testbed system to remove the control pipe for my pid 1 s6-svscan. I need to make sure it wasn't something I did between things, and to make sure it wasn't mucked up handling in various scripts that I was running. I'm at work right now so I can't test it out, but sometime in the next day or so I should have the cycles to test it out.

Cheers!
Re: s6-rc - odd warn logging and a best practices question
On 20/08/2015 16:43, Colin Booth wrote:
> Yeah, this is for the special case where you have a daemon that doesn't do readiness notification but also has a non-trivial amount of initialization work before it starts. For most things doing the below talked about oneshot/longrun split is best, but sometimes you need to run that initialization every time (data validators are the most obvious example).

In that case, yes, `if { init } if { notification } daemon' is probably the best. It represents service readiness almost correctly, if "service" includes the initialization.

> It does provide notification, but only if you're running under systemd. At least according to the sd_notify() docs. I'll see about faking up the environment so sd_notify() is happy and report back.

systemd's notification API is a pain. It forces you to have a daemon listening on a Unix socket. So basically you'd have to have a notification receiver service, communicating with the supervisors - which eventually makes it a lot simpler to integrate everything into a single binary. This API was made to make systemd look like the only possible design for a service manager. That's political design to the utmost, and I hate that with a passion.

I have a wrapper to make things work the other way (i.e. using s6-like daemons under systemd), but a wrapper that would actually understand sd_notify() notifications would be much more painful to write.

> Actually, the more I think about it, the less s6-rc-update will help me avoid reboots in the short term since part of what I need to get back is a pristine post-boot environment.

What do you have in that post-boot environment that would be different from what you have after shutting down all your s6-rc services and wiping the live directory?

--
Laurent
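For readers following along, the init, then notification, then daemon sequence discussed in this message can be sketched in plain POSIX sh instead of execline. This is an illustration only, not code from the thread. The helper name start_with_fake_readiness is made up, and the sketch assumes the service directory has a notification-fd file containing 3, so that the supervisor is reading from fd 3. Readiness here only means "the init step finished and the daemon is about to be exec'ed", which is exactly the approximation being discussed.

```shell
#!/bin/sh
# Sketch only: fake readiness notification for a daemon that has none.
# Assumption: ./notification-fd contains "3", so fd 3 is the pipe that
# the supervisor reads. start_with_fake_readiness is a hypothetical name.

start_with_fake_readiness() {
    init_cmd=$1
    shift

    # 1. Run the initialization in the foreground; give up if it fails,
    #    so that no readiness message is sent for a broken setup.
    sh -c "$init_cmd" || return 1

    # 2. Pretend to be ready: write one newline to the notification fd,
    #    then close it so the daemon does not inherit it.
    echo >&3
    exec 3>&-

    # 3. Replace this shell with the daemon itself.
    exec "$@"
}

# In a real run script the last line might look like this:
#   start_with_fake_readiness '/etc/init.d/udev start' /lib/systemd/systemd-udevd
```

Closing fd 3 before the exec also avoids leaving the daemon holding a pipe that nothing will ever read again.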
Re: s6-rc - odd warn logging and a best practices question
On 20/08/2015 10:57, Laurent Bercot wrote:
>> s6-svc: warning: /run/s6/rc/scandir/s6rc-fdholder/notification-fdpost addition of notification-fd
>
> Looks like a missing/wrong string terminator. Thanks for the report, I'll look for it.

I can't grep the word addition in my current git, either s6 or s6-rc. Are you sure it's not a message you wrote? Can you please give me the exact line you're running and the exact output you're getting? Thanks,

--
Laurent
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 8:44 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> In that case, yes, if { init } if { notification } daemon is probably the best. It represents service readiness almost correctly, if service includes the initialization.

Cool. Not the most elegant but good to know I was on the right track.

>> It does provide notification, but only if you're running under systemd. At least according to the sd_notify() docs. I'll see about faking up the environment so sd_notify() is happy and report back.
>
> systemd's notification API is a pain. It forces you to have a daemon listening on a Unix socket. So basically you'd have to have a notification receiver service, communicating with the supervisors - which eventually makes it a lot simpler to integrate everything into a single binary. This API was made to make systemd look like the only possible design for a service manager. That's political design to the utmost, and I hate that with a passion.

I think only the socket part is fancy systemd-centric design, so presumably a stupid subscript that takes socket messages and emits s6-ftrig events could do the reverse of sdnotify_wrapper. I'm thinking something like s6-ipcserver-socketbinder execing into a background'ed puller to s6-ftrig-notify chain. The puller would be something like s6-ftrig-wait but for generic file descriptors instead of fifo dirs (this probably exists and if not should be reasonably easy), and s6-ftrig-notify would handle the actual readiness alarm. The API is definitely more complicated than the s6 notification one, but it doesn't seem insurmountable. My solution is a bit racy, though I'd hope a socket puller would start faster than a daemon, scheduler whims or no.

> What do you have in that post-boot environment that would be different from what you have after shutting down all your s6-rc services and wiping the live directory ?

Adjustments to modules, locale and hostname setting, re-seeding the random device. Basically everything that happens in the single-user boot stage on distro systems. For example, the udev init script does a lot of work that can't easily be un-done without a reboot.

Cheers!