Re: s6-rc - odd warn logging and a best practices question
On Fri, Aug 21, 2015 at 2:11 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> Wow. Is it a mount -o remount, or a umount followed by a mount ? If a -o remount has this effect on file handles, then it's probably worth reporting to the kernel guys, because it's insane. Even if the script does something nonsensical such as remounting everything read-only, which hardly makes any sense for a tmpfs, this is not normal behaviour: when I remount a partition in read-only mode, and there are still open descriptors for writing, the mount() call fails with EBUSY; it does not silently invalidate all the writing descriptors!

First reboot in a while so I spent some time tracking this down. It was caused by some really cute interactions between a few of the Debian single-user mode system prep scripts. checkfs-bootclean.sh is safe to run against tmpfs mounts right until you run bootmisc.sh, which removes the flag files that the clean_all function uses to identify a tmpfs. So that's been fixed.

> Last time I looked at a mainstream distro's boot cycle, i.e. almost 10 years ago, it was already unnecessarily complex and convoluted; and Debian was far from the worst. I doubt it has become simpler since.

It probably doesn't help that I'm working against the hardest target too: laptops. Thankfully, the only place where I really need to interact with the sysvinit stuff is in the collection of oneshots that are emulating the single-user portion of the bootcycle.

I did find a script in there that will halt an s6-init system if you run it. That was fun. It's the only place that I found that actually cares about what init you're running under. In the case where you have sysvinit but no initctl control pipe (such as can happen if you mount a new /run over the old one) it recreates that and then fires off SIGUSR1 at whatever happens to be init at the time.

The only things left to fix are some file permissions and mounts that the aforementioned script fixes up, and that ACPI sleep handler weirdness that I mentioned earlier. Plus, you know, not running a pre-alpha rc system ;)

> systemd will probably make scripting simpler, by moving a lot of the complexity into the C code. Which is obviously the worst possible solution.

Probably. I almost want to build out a systemd machine to see what the early boot land looks like. Depending on what the system prep stuff looks like it might be easier to gut. Like I said though, almost.

Cheers!

-- 
If the doors of perception were cleansed every thing would appear to man as it is, infinite. For man has closed himself up, till he sees all things thru' narrow chinks of his cavern. -- William Blake
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 1:16 PM, Colin Booth cathe...@gmail.com wrote:
> By the way, I've found a maybe-bug that, if real, is pretty severe. `s6-rc -d change all ; some stuff ; s6-rc -u change all' has caused my s6-init + s6-rc testbed system to remove the control pipe for my pid 1 s6-svscan. I need to make sure it wasn't something I did between things, and to make sure it wasn't mucked up handling in various scripts that I was running. I'm at work right now so I can't test it out, but sometime in the next day or so I should have the cycles to test it out.

Not a bug in s6-rc or s6 but in some Debian script somewhere. Some single-user script appears to re-mount all mount points, which has the net result of causing all file handles into tmpfs mounts to go stale. That's what's breaking s6-svscan. Once I isolate it, I'll see if I can avoid calling that script, and if I have to I'll see about moving its execution somewhere safe.

I am learning way more about the complexities of the distro boot cycle than I'd ever expected to this week.

Cheers!

-- 
If the doors of perception were cleansed every thing would appear to man as it is, infinite. For man has closed himself up, till he sees all things thru' narrow chinks of his cavern. -- William Blake
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 2:35 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> I can't grep the word "addition" in my current git, either s6 or s6-rc. Are you sure it's not a message you wrote? Can you please give me the exact line you're running and the exact output you're getting? Thanks,

Ugh, it was something I'd hacked in to s6-svc early on in the life of s6-rc to track down some issue I was having with something. I never committed it and sort of assumed that the next git pull I made would have complained and forced me to back it out. Apparently git merge got smarter about unstaged non-conflicts recently.

Mystery solved! I'll reply to the other stuff in the other mail fork.

-- 
If the doors of perception were cleansed every thing would appear to man as it is, infinite. For man has closed himself up, till he sees all things thru' narrow chinks of his cavern. -- William Blake
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 1:57 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> Just don't have a notification-fd file. s6-rc will assume your daemon is ready as soon as the run script is started. It may spam you with a warning on high verbosity levels, but that's it. :)

Yeah, this is for the special case where you have a daemon that doesn't do readiness notification but also has a non-trivial amount of initialization work before it starts. For most things doing the below talked about oneshot/longrun split is best, but sometimes you need to run that initialization every time (data validators are the most obvious example).

> If your daemon doesn't support readiness notification, I'd generally advise not to pretend it does: even if daemon availability is fast, the scheduler can always screw you. So yes, if at all possible, having the init in a oneshot and the daemon in a longrun depending on the init oneshot is the best way to go, without declaring a notification-fd for the longrun. If it's not possible, foregrounding the init, then sending a blank notification message, then execing the daemon, is probably the least ugly way to proceed.

I was using the readiness signal to enforce the timing between udev's heavyweight system prep scripts and everything that depends on udev. Starting udev itself is trivial, and I'm pretty sure that udev doesn't need to guaranteedly be running for other things to start, it just needs to be running for the preparation steps. Hence that "run the sysvinit udev script, immediately afterwards stop udev, start it again supervised" dance. Breaking out into a pair of atomics is a lot more elegant. Why didn't I think of that until last night when I've been experimenting with this since Monday? Dunno.

> I'm surprised that systemd-udevd doesn't provide notification: that's one of the least bad reasons for daemons to integrate with systemd. Doesn't it use sd_notify() ?

It does provide notification, but only if you're running under systemd. At least according to the sd_notify() docs. I'll see about faking up the environment so sd_notify() is happy and report back.

>> Also, it'll be nice to have s6-rc-update, I've been rebooting... a lot.
>
> No need to reboot:
>
>     s6-rc -da change
>     for i in `ls -1 $live/servicedirs` ; do rm $scandir/$i ; done
>     s6-svscanctl -an $scandir
>     rm -rf `basedir $live`/`s6-linkname $live`
>     rm $live
>     s6-rc-init -l $live -c $newcompiled $scandir
>     s6-rc -u change $everythingbundle
>
> That's more or less what s6-rc-update will do, of course with optimizations to avoid restarting everything.

Actually, the more I think about it, the less s6-rc-update will help me avoid reboots in the short term since part of what I need to get back is a pristine post-boot environment.

> Power management on Linux laptops is high-level demonology, and mere mortals should not dabble in it, lest their souls be consumed. I had a friend who tried and came back shaking and drooling... it took him a long while to recover. Fortunately, there's *almost* no permanent damage to his mind.

HA! I'm pretty sure the failure is in some acpi policy handling glue code that isn't getting set right. The init.d/acpid script isn't terribly complicated, I simply need to capture the system state before and after the init script is run.

Cheers!

-- 
If the doors of perception were cleansed every thing would appear to man as it is, infinite. For man has closed himself up, till he sees all things thru' narrow chinks of his cavern. -- William Blake
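For readers following along, the oneshot/longrun split recommended in the message above could be laid out roughly as follows. This is only a sketch: the service names myapp-init and myapp, the /usr/local/bin paths, and the make_split_source helper are all hypothetical, and the per-service "dependencies" file is assumed to match the s6-rc source format of the time (later releases use a dependencies.d directory instead, so check the s6-rc-compile documentation for the version in use).

```shell
#!/bin/sh
# Sketch only: "myapp-init", "myapp" and the /usr/local/bin paths are
# hypothetical placeholders, not anything from the thread.

make_split_source() {
    src=$1

    # Oneshot: performs the initialization once when brought up.
    mkdir -p "$src/myapp-init"
    echo oneshot > "$src/myapp-init/type"
    # "up" holds a single command line, which s6-rc treats as execline.
    echo '/usr/local/bin/myapp-init' > "$src/myapp-init/up"

    # Longrun: the supervised daemon, started only after myapp-init is up.
    mkdir -p "$src/myapp"
    echo longrun > "$src/myapp/type"
    # Assumption: flat "dependencies" file, one service name per line.
    echo myapp-init > "$src/myapp/dependencies"
    printf '#!/bin/sh\nexec /usr/local/bin/myapp\n' > "$src/myapp/run"
    chmod +x "$src/myapp/run"
    # Deliberately no notification-fd file: the longrun is treated as
    # ready as soon as its run script has been started.
}

# Build the layout only when a target directory is given on the command line.
if [ "$#" -gt 0 ]; then
    make_split_source "$1"
fi
```

The resulting directory would then be fed to s6-rc-compile. The point of the layout is that ordering comes from the dependency on the oneshot, not from a faked readiness notification.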
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 10:24 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> Oh, the protocol is complicated too. If I start to implement it, there's no stopping, and I'll be running behind systemd every time they add something to the protocol, which is exactly what I don't want to do.

Sure. And I bet that listening for any message on the socket isn't good enough since things might be chattery.

> You can enforce a non-race by synchronizing both processes, i.e. making the notification listener notify the notification sender that it is ready to receive a message. I'm not even joking. Notifiception is a thing with the wonderful systemd APIs.

NOW we're talking!

> I see. You could pull those out of the set of services managed by s6-rc and just run them sequentially at boot time, until s6-rc-update is out.

Yeah, but then you get into that question of what you do with oneshots that depend on longruns which are required for initialization... Like I said, it's a bit of a mess but isn't any more of a mess than someone who is doing early boot optimization in any other init. Once I've sorted out all the timing issues (and I think I'm close) it should be fine.

By the way, I've found a maybe-bug that, if real, is pretty severe. `s6-rc -d change all ; some stuff ; s6-rc -u change all' has caused my s6-init + s6-rc testbed system to remove the control pipe for my pid 1 s6-svscan. I need to make sure it wasn't something I did between things, and to make sure it wasn't mucked up handling in various scripts that I was running. I'm at work right now so I can't test it out, but sometime in the next day or so I should have the cycles to test it out.

Cheers!

-- 
If the doors of perception were cleansed every thing would appear to man as it is, infinite. For man has closed himself up, till he sees all things thru' narrow chinks of his cavern. -- William Blake
Re: s6-rc - odd warn logging and a best practices question
On 20/08/2015 16:43, Colin Booth wrote:
> Yeah, this is for the special case where you have a daemon that doesn't do readiness notification but also has a non-trivial amount of initialization work before it starts. For most things doing the below talked about oneshot/longrun split is best, but sometimes you need to run that initialization every time (data validators are the most obvious example).

In that case, yes,

    if { init } if { notification } daemon

is probably the best. It represents service readiness almost correctly, if "service" includes the initialization.

> It does provide notification, but only if you're running under systemd. At least according to the sd_notify() docs. I'll see about faking up the environment so sd_notify() is happy and report back.

systemd's notification API is a pain. It forces you to have a daemon listening on a Unix socket. So basically you'd have to have a notification receiver service, communicating with the supervisors - which eventually makes it a lot simpler to integrate everything into a single binary. This API was made to make systemd look like the only possible design for a service manager. That's political design to the utmost, and I hate that with a passion.

I have a wrapper to make things work the other way (i.e. using s6-like daemons under systemd), but a wrapper that would actually understand sd_notify() notifications would be much more painful to write.

> Actually, the more I think about it, the less s6-rc-update will help me avoid reboots in the short term since part of what I need to get back is a pristine post-boot environment.

What do you have in that post-boot environment that would be different from what you have after shutting down all your s6-rc services and wiping the live directory ?

-- 
Laurent
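The "if { init } if { notification } daemon" pattern from the message above is execline shorthand. A rough plain-sh equivalent of such a run script might look like the sketch below. It assumes the service directory's notification-fd file contains 3, and that s6 readiness is signalled by writing a newline to that descriptor. The init_step body and the run_service name are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes notification-fd contains "3". init_step is a
# hypothetical placeholder for whatever must run before every start.

init_step() {
    # e.g. validate data files; a non-zero return aborts the start
    :
}

notify_ready() {
    # The "blank notification message": a single newline on fd 3.
    echo >&3
}

run_service() {
    init_step || return 1   # foreground the initialization
    notify_ready            # then announce readiness
    exec 3>&-               # close the notification fd before the daemon runs
    exec "$@"               # finally become the daemon
}

# Example: ./run-sketch.sh /usr/local/bin/mydaemon --foreground
if [ "$#" -gt 0 ]; then
    run_service "$@"
fi
```

As noted in the thread, this reports readiness slightly early: the notification is sent when initialization finishes, not when the daemon itself can serve requests.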
Re: s6-rc - odd warn logging and a best practices question
On 20/08/2015 10:57, Laurent Bercot wrote:
>> s6-svc: warning: /run/s6/rc/scandir/s6rc-fdholder/notification-fdpost addition of notification-fd
>
> Looks like a missing/wrong string terminator. Thanks for the report, I'll look for it.

I can't grep the word "addition" in my current git, either s6 or s6-rc. Are you sure it's not a message you wrote? Can you please give me the exact line you're running and the exact output you're getting? Thanks,

-- 
Laurent
Re: s6-rc - odd warn logging and a best practices question
On Thu, Aug 20, 2015 at 8:44 AM, Laurent Bercot ska-skaw...@skarnet.org wrote:
> In that case, yes, if { init } if { notification } daemon is probably the best. It represents service readiness almost correctly, if "service" includes the initialization.

Cool. Not the most elegant but good to know I was on the right track.

>> It does provide notification, but only if you're running under systemd. At least according to the sd_notify() docs. I'll see about faking up the environment so sd_notify() is happy and report back.
>
> systemd's notification API is a pain. It forces you to have a daemon listening on a Unix socket. So basically you'd have to have a notification receiver service, communicating with the supervisors - which eventually makes it a lot simpler to integrate everything into a single binary. This API was made to make systemd look like the only possible design for a service manager. That's political design to the utmost, and I hate that with a passion.

I think only the socket part is fancy systemd-centric design, so presumably a stupid subscript that takes socket messages and emits s6-ftrig events could do the reverse of sdnotify_wrapper. I'm thinking something like s6-ipcserver-socketbinder execing into a background'ed puller to s6-ftrig-notify chain. The puller would be something like s6-ftrig-wait but for generic file descriptors instead of fifo dirs (this probably exists and if not should be reasonably easy), and s6-ftrig-notify would handle the actual readiness alarm.

The API is definitely more complicated than the s6 notification one, but it doesn't seem insurmountable. My solution is a bit racy, though I'd hope a socket puller would start faster than a daemon, scheduler whims or no.

> What do you have in that post-boot environment that would be different from what you have after shutting down all your s6-rc services and wiping the live directory ?

Adjustments to modules, locale and hostname setting, re-seeding the random device. Basically everything that happens in the single-user boot stage on distro systems. For example, the udev init script does a lot of work that can't easily be un-done without a reboot.

Cheers!

-- 
If the doors of perception were cleansed every thing would appear to man as it is, infinite. For man has closed himself up, till he sees all things thru' narrow chinks of his cavern. -- William Blake
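One small piece of the shim idea discussed above can be sketched in isolation: the filter that decides whether a received sd_notify() message really means "ready". sd_notify() payloads are newline-separated VARIABLE=value assignments, so a chattery daemon may send STATUS= lines long before READY=1. The sketch below assumes the message text arrives on stdin and that readiness is forwarded as a newline on fd 3. The socket-receiving half is deliberately left out, and forward_ready is a hypothetical name, not an existing s6 tool.

```shell
#!/bin/sh
# Sketch only: reads sd_notify-style "VAR=value" lines on stdin and
# forwards readiness as a single newline on fd 3. Receiving datagrams
# from the notification socket is out of scope here.

forward_ready() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            READY=1)
                echo >&3    # s6-style readiness: one newline
                return 0
                ;;
            *)
                # STATUS=, MAINPID=, etc. are ignored, not treated as ready
                ;;
        esac
    done
    return 1                # input ended without READY=1
}
```

For example, `printf 'STATUS=loading\nREADY=1\n' | forward_ready 3>ready.out` would leave a single newline in ready.out, while a stream containing only STATUS= lines would write nothing and return non-zero.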