On 06/01/2015 09:00, Colin Booth wrote:
1. Depending on your initramfs and your on-disk layout you can skip mounting proc and sys. I know this is the case with Debian, probably true elsewhere as well.
It all depends on the assumptions that init-stage2 makes, but yes, now that you mention it, mounting /proc and /sys may be delayed, as long as none of the very early services need them. Make sure the login process and interactive root shell do not need them either, because if init-stage2 fails very early, being able to log in will make debugging/recovery a lot easier.
2. If you aren't starting udev until init-stage2, you'll need to manually mknod null and console devices before the "Reopen stdin/stdout/stderr" comment.
That only applies to people who want a static /dev. Most people will run some flavour of udev, and will probably want to keep the devtmpfs mounted on /dev, in which case the kernel exports /dev/null and /dev/console itself. (Probably with the wrong rights, but they're functional enough to get by until udev runs.)
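To make the devtmpfs point concrete, here is a minimal sketch of a sanity check for the two device nodes stage 1 relies on. It assumes the conventional Linux device numbers (1:3 for /dev/null, 5:1 for /dev/console); the function names are made up for this illustration and are not part of s6.

```python
import os
import stat

# Conventional Linux device numbers (assumption: a Linux system).
EXPECTED = {
    "/dev/null": (1, 3),
    "/dev/console": (5, 1),
}

def is_char_device(path, major, minor):
    """True if path exists and is a character device with these numbers."""
    try:
        st = os.stat(path)
    except OSError:
        return False
    return (stat.S_ISCHR(st.st_mode)
            and os.major(st.st_rdev) == major
            and os.minor(st.st_rdev) == minor)

def missing_early_devices(expected=EXPECTED):
    """List the expected device nodes that are absent or have other numbers."""
    return [path for path, (major, minor) in expected.items()
            if not is_char_device(path, major, minor)]

if __name__ == "__main__":
    # An empty list means both nodes are usable without running mknod.
    print(missing_early_devices())
```

With a devtmpfs mounted on /dev the list should come back empty; with a static /dev, anything it reports is a node you would have to mknod yourself.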
3. You'll need to either symlink /tmp into your tmpfs, mount a tmpfs on /tmp as part of init-stage1, or remount / to rw before s6-svscan is loaded. Otherwise the catch-all logger won't be able to do its thing as written. Same deal with /service, though that one is documented and expected.
Actually, none of those 3 things is needed for /tmp. :) What *is* needed is a writable-by-root-only directory, to store the information init needs:

- The scan directory, which must be rw
- rw places to store the supervise/ and event/ subdirectories of the service directories, or a copy of the service directories themselves
- a rw place for the catch-all logger to run

/tmp is not ideal for this, for several reasons. One is that as soon as stage 2 begins and user stuff runs on the system, creating files in /tmp isn't absolutely secure anymore, because filenames can be predicted and DoSsed. Another reason is conceptual: the information we need to store is not exactly temporary, it's not the throwaway stuff you'd expect to see in /tmp - on the contrary, it's vital to the system. So it's very unsightly to put it in /tmp.

I very much dislike having / read-write. In desktops or other systems where /etc is not really static, it is unfortunately unavoidable (unless symlinks to /var are made, for instance /etc/resolv.conf should be a symlink to /var/etc/resolv.conf or something, but you cannot store, for instance, /etc/passwd on /var...) But on servers and embedded systems, / should definitely be read-only. Having it read-write makes it susceptible to filesystem corruption, which kills the guarantee that your machine will boot to at least a debuggable state. A read-only / saves you the hassle of having a recovery system. So, it should be the admin's choice, and I do not want s6 to force the admin to mount / rw.

That is why I'm saying that s6 needs a tmpfs, distinct from /tmp, made in stage 1. Having a "private" tmpfs allows init to store the scan directory, the copies of service directories, and the catch-all logger directory, without impacting the rest of the system. Since that tmpfs is needed anyway, /tmp might as well be a symlink to a public (mode 1777) subdirectory of it: it makes /proc/mounts cleaner.
But it's not a requirement, and /tmp may be mounted as a separate tmpfs at some point in stage 2. If you are reckless, totally insensitive to gracefulness, and you absolutely cannot deal with creating a tmpfs just for the sake of s6, you may try to use a subdirectory of the devtmpfs in /dev as an early root-only read-write place. You will now forget I suggested that. *flash*
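As a rough illustration of the private tmpfs layout described above, the sketch below creates the root-only areas and the optional public tmp subdirectory under a base directory that is assumed to be an already-mounted tmpfs. The subdirectory names are hypothetical, chosen for this example only; the mount itself and the /tmp symlink are left out.

```python
import os

# Hypothetical names; only the modes follow from the discussion above.
LAYOUT = {
    "service": 0o700,        # scan directory, must be rw for init
    "servicedirs": 0o700,    # copies of the service directories
    "uncaught-logs": 0o700,  # rw place for the catch-all logger
    "tmp": 0o1777,           # public, sticky; optional target for a /tmp symlink
}

def make_private_layout(base, layout=LAYOUT):
    """Create the stage 1 directories under base (assumed to be a tmpfs)."""
    # The base stays traversable so users can reach the public tmp,
    # but only its owner (root, in stage 1) may write to it.
    os.chmod(base, 0o755)
    for name, mode in layout.items():
        path = os.path.join(base, name)
        os.makedirs(path, exist_ok=True)
        # chmod explicitly: the mode given to mkdir is filtered by the umask.
        os.chmod(path, mode)
    return [os.path.join(base, name) for name in layout]
```

Keeping the three init-owned directories at mode 0700 while only tmp is world-writable is what lets a single tmpfs serve both purposes.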
4. If you don't want to have your dev mount in /mnt/tmpfs/dev (mostly to keep ps output non-ugly and to kind-of stick to the FHS)
Eh, the FHS doesn't say that /dev should be a real directory. It can be a symlink all right. I checked. :P Most Linux people will use udev, though, and for them /dev will be a devtmpfs: a real directory, and a mountpoint.
5. I made a few more classes of services for init-stage2 to copy into the service directory. Specifically for things that I wanted running ASAP and were udev agnostic. Those were: syslogd (using s6-ipcserver and ucspilogd), klogd, cron, and udev. Mostly that was because I needed udev running (and supervised) before bringing up dbus, and I wanted to make sure /dev/log had a reader before I started bringing anything up that might not want to talk to stdout instead (openssh, I'm looking at you).
The order in which init-stage2 starts services and interleaves them with one-shot commands should mirror your dependency graph. This is where a dependency management system would come in handy; I plan to work on a program that takes a dependency graph as its input (format TBD) and outputs a suitable init-stage2 script. (Crazy idea brewing. Dependency graph management is a solved problem: it's exactly what "make" does. So my program could simply translate the service dependency graph into a Makefile, and make would output the script. This requires more thought.)
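A minimal sketch of the "translate the graph into a Makefile" idea, under loud assumptions: the input format (a dict mapping each service to the services it needs first) is invented here since the real format is TBD, and each recipe only echoes a placeholder line instead of emitting real init-stage2 commands.

```python
def graph_to_makefile(deps):
    """Render {service: [services it needs first]} as Makefile text."""
    for name, needs in deps.items():
        for need in needs:
            if need not in deps:
                raise ValueError(f"{name} depends on unknown service {need}")
    names = sorted(deps)
    lines = [".PHONY: all " + " ".join(names), "all: " + " ".join(names), ""]
    for name in names:
        # Target line: the service, then its prerequisites.
        lines.append((name + ": " + " ".join(deps[name])).rstrip())
        # Recipe line (tab-indented): placeholder for the real start command.
        lines.append("\t@echo 'start " + name + "'")
        lines.append("")
    return "\n".join(lines)

if __name__ == "__main__":
    # Example edges taken from the discussion: udev before dbus,
    # syslogd before openssh.
    example = {
        "udev": [], "syslogd": [], "klogd": [], "cron": [],
        "dbus": ["udev"], "openssh": ["syslogd"],
    }
    print(graph_to_makefile(example))
```

Saving the output to a file and running `make -s -f` on it with the `all` target should print the placeholder lines with every prerequisite before the service that needs it, which is the ordering init-stage2 would have to follow.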
Everything between the fdclose line and reopening stdin is super fragile, and since we've unmounted /dev, it's impossible to boot half-way and then start a shell to find out what exactly went wrong.
I will definitely be working on an s6-init package to automate all this and make sure the fragile part is as brief as possible. The really risky stuff is replacing /dev/console under init's nose; for udev users, this won't even happen, so stage 1 will be practically safe.

Thanks for your comments!

--
Laurent
