s6-linux-init, alpine linux, and initramfs

2017-01-30 Thread Guillaume Perréal

Hello there,

Me again. After tweaking my Xsession for a starter, I am trying to build
a VM with s6-linux-init.


I am starting from Alpine Linux, because I am not into recompiling Linux
and all tools from scratch (well, not yet) and this distro already
provides binaries for all skarnet tools. Like many distros, it uses an
initramfs, because most of the drivers (including sd-mod, scsi and ext*)
are built as modules.


I have found how to customize the initramfs. Now I am facing a choice: 
what should I put in the initramfs?


1) put s6-linux-init phase 1 into the initramfs, use it as /init, then
use an embedded /etc/rc.init to load modules, mount / and exec into the
root's /etc/rc.init. The advantage would be a full s6 boot process. One
drawback is that I have to put all execline and s6 tools (except s6-rc)
in the initramfs. Another one is that phase 1 of s6-linux-init is not
very verbose and does not have any emergency fallback.


2) have a very small init script to load the modules, mount the
filesystems (/dev, /proc, /sys, /), and finally pivot-chroot into
s6-linux-init phase 1. This would be less elegant but it might be easier
to set up.


Any ideas on this?

I know the right way would be to recompile linux with the right modules 
to boot directly into s6-linux-init phase 1 from the root partition.


++

--

Guillaume.



Re: s6-linux-init, alpine linux, and initramfs

2017-01-30 Thread Laurent Bercot


> 2) have a very small init script to load the modules, mount the
> filesystems (/dev, /proc, /sys, /), and finally pivot-chroot into
> s6-linux-init phase 1. This would be less elegant but it might be
> easier to set up.


 This. If you need or want an initramfs, you need to comply with the
implicit initramfs contract: when you exec into /sbin/init, it must be
the only process running on your machine, just as if the kernel started
it; and any init system should be able to work, the initramfs should not
tie you to a specific init. So, spawning a supervision tree in the
initramfs is a no-no, because it breaks both aspects.

 You can see an initramfs as a mini-system that you set up to do what
needs to be done *and then tear down* before exec'ing into the real
system.

 So, do that: load your modules, find your rootfs, pivot-chroot into it,
and start your real system with the init of your choosing.

 Ideally, you'd even unmount /proc and /sys (which you likely need
during your initramfs execution) before entering /sbin/init. But
obviously that's not practical since your boot sequence will mount them
again very soon, so the separation between "pre-init" and "post-init"
can be a bit less strict. You can document that the state of your system
at init time is "pristine as if the kernel had directly started init,
except that /proc and /sys are already mounted", for instance, and
that's acceptable. (/dev is not even a question - you should have a
devtmpfs mounted at boot anyway, and mount --move it after your
pivot_root.)

 But "after initramfs, I already have a supervision system and just
need to run the rest of the boot sequence" is not acceptable - if only
because you then need to keep supervision executables in RAM, and a
component of your pivoted initramfs in your PATH.

 Just make initramfs as transparent as possible, it's a lot cleaner.

 Oh, by the way, pivot_root works with initrd, but not initramfs: see
https://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt
so you can use busybox/toybox's switch_root instead, or you can do the
switch_root by hand.
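
 For illustration, here is a minimal C sketch of what "switch_root by
hand" can look like, following the sequence described in the kernel
document linked above (chdir into the new root, move that mount onto /,
chroot, exec init). The function name switch_root_by_hand and its
parameters are made up for this example. It deliberately leaves out
deleting the initramfs contents to free RAM, moving /dev, /proc and /sys
into the new root, and reopening /dev/console, all of which a real
implementation would want.

```c
#define _DEFAULT_SOURCE   /* for chroot() with strict glibc feature macros */
#include <stddef.h>
#include <sys/mount.h>
#include <unistd.h>

/* Hypothetical helper: overmount / with newroot and exec init there.
   Must run as PID 1 from the initramfs, with newroot already mounted.
   Only returns on failure (-1, errno set by the failing call). */
int switch_root_by_hand (char const *newroot, char const *init, char *const argv[])
{
  if (chdir(newroot) < 0) return -1 ;                     /* cd /rootfs */
  if (mount(".", "/", NULL, MS_MOVE, NULL) < 0) return -1 ; /* mount --move . / */
  if (chroot(".") < 0) return -1 ;                        /* chroot . */
  if (chdir("/") < 0) return -1 ;
  execv(init, argv) ;                                     /* exec /sbin/init */
  return -1 ;                                             /* exec failed */
}
```

 A caller would typically do something like
switch_root_by_hand("/rootfs", "/sbin/init", argv) as the very last step
of /init, after every helper process has exited.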

 I have a skeleton /init here that only needs the following in your
initramfs:
 - empty /dev, /proc, /rootfs and /sys directories. An empty /dev is OK
if the kernel mounts devtmpfs at boot (which it should).
 - a /sbin directory with a "mdev" static binary inside. (busybox with
only mdev selected will still be 100ish kB, that's unfortunately
normal.)
 - a /command directory with static cd, execlineb, export, foreground,
if, redirfd, s6-echo and s6-mount binaries inside. Also "define" for the
skeleton but you'd replace it with something else for your rootfs
detection.
 - whatever else you need to do your job - you could add modutils to
your busybox build, for instance, if you want to load modules. You may
want a /etc/mdev.conf depending on the devices you're expecting to
detect.
 - also execline, s6-portable-utils and s6-linux-utils binaries
accessible in the /command directory of your real root filesystem.

 You can get it at http://pastebin.com/KZfdETy5
 My gzipped initramfs image made with that is about 104kB, for x86_64.

 HTH,

--
 Laurent



skalibs will soon change APIs

2017-01-30 Thread Laurent Bercot


 Hello,

 This is a heads-up for people who are using skalibs, or any of the
libraries in the skarnet.org projects (which are undergoing the same
transformation).

 Since the beginning, skalibs has used ints and unsigned ints just about
everywhere in its prototypes. This is legacy from DJB, and comes from a
time when POSIX compliance was dubious on a lot of systems. (It still
is, but divergences between POSIX and development environments are a
lot more subtle today than they were 15 or 20 years ago.)

 Use of integer types rather than POSIX types was the reasonable thing
to do back then; but for a few years, it has not been necessary and has
become a drawback and a concern rather than an advantage. For instance,
on x86_64, skalibs does not support IO on more than 2 GB at a time.
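
 To make the 2 GB figure concrete: on x86_64 an int is 32 bits wide, so
a function that reports a byte count through an int cannot report more
than INT_MAX (2^31 - 1) bytes, even though size_t is 64 bits there. A
small sketch, with a helper name (fits_in_int) invented for this
example:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if a byte count can be reported
   through an int return value, 0 if it would overflow. */
int fits_in_int (size_t n)
{
  return n <= (size_t)INT_MAX ;
}
```

 A 3 GB request ((size_t)3 << 30) fails this check, which is why
int-based prototypes cap a single IO operation.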

 This is going to change soon (probably towards the end of February, or
in March). skalibs prototypes will now use POSIX types - such as size_t
and ssize_t instead of unsigned int and int, struct iovec instead of
siovec_t, uint32_t instead of uint32, etc. (You can still keep using
uint16/uint32/uint64 for a few versions.)

 You can prepare your applications right now by changing your types
already. That's what all the applicative skarnet.org packages do in
their recent commits. For instance, instead of

 unsigned int max = MAX ;
 int r = fd_read(fd, buf, MAX) ;

 you'd write:

 size_t max = MAX ;
 ssize_t r = fd_read(fd, buf, MAX) ;

(Remember to include the headers that declare these types as needed.)

 so the limiting factor is skalibs and not your application, and your
application becomes 64-bit-ready when skalibs changes.
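
 As a self-contained illustration of the caller-side change, here is a
sketch that does not depend on skalibs at all. my_fd_read is a stand-in
written for this example (a plain read() retried on EINTR); it is not
the skalibs implementation of fd_read, and read_some is likewise a
made-up caller. The point is only that the application's own variables
already use size_t and ssize_t, declared by <stddef.h> and
<sys/types.h>.

```c
#include <errno.h>
#include <stddef.h>      /* size_t */
#include <sys/types.h>   /* ssize_t */
#include <unistd.h>      /* read() */

/* Stand-in for a read wrapper, NOT the skalibs code:
   calls read() and retries if interrupted by a signal. */
ssize_t my_fd_read (int fd, char *buf, size_t len)
{
  ssize_t r ;
  do r = read(fd, buf, len) ;
  while (r < 0 && errno == EINTR) ;
  return r ;
}

/* Application code written with POSIX types from the start:
   returns the number of bytes read, 0 on EOF, -1 on error. */
ssize_t read_some (int fd, char *buf, size_t max)
{
  ssize_t r = my_fd_read(fd, buf, max) ;
  return r ;
}
```

 With the old int-based prototypes the values are simply converted at
the call; once the library takes size_t and returns ssize_t, the same
application code needs no further change.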

 Of course, those changes are the easy part. The hard part is if you
have pointers. You can't replace "unsigned int *" with "size_t *" and
be compatible with the old skalibs. I advise you to mark those places
in your code with a comment, so you can quickly adapt to the new skalibs
API when it comes out.
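
 One possible way to handle the pointer case in the meantime, offered
as a suggestion rather than as what the skarnet.org packages do: keep a
temporary of the type the current API wants, copy it into your size_t,
and leave a greppable comment. In this sketch old_get_len is a
hypothetical function standing in for any old-style prototype that takes
an "unsigned int *", and the XXX-SKALIBS-API marker is an arbitrary tag.

```c
#include <stddef.h>

/* Hypothetical old-style API: stores a length through an unsigned int *. */
int old_get_len (unsigned int *len)
{
  *len = 42 ;
  return 1 ;
}

/* Application wrapper exposing size_t to the rest of the program. */
int get_len (size_t *out)
{
  /* XXX-SKALIBS-API: tmp becomes unnecessary once the prototype takes
     size_t *; pass "out" directly then. */
  unsigned int tmp ;
  if (!old_get_len(&tmp)) return 0 ;
  *out = tmp ;
  return 1 ;
}
```

 Searching for the marker later gives you the list of places to adapt
when the new API is released.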

 Thanks for your understanding.

--
 Laurent