On Sa, 03.05.25 14:50, Xogium (cont...@xogium.me) wrote:

> Hi,
>
> so for context, I want to isolate most services I plan on running inside
> containers, with each its own nginx, php, etc.
>
> My issue is with credentials. I would like the host to handle the renewal of
> tls certificate, and to have the credentials propagated via systemd-nspawn
> to the services that run within each container. I get the basic idea of how
> to implement this, but from what I'm reading, once the credentials are
> loaded, they are immutable for as long as the service runs -- in this case
> I'm assuming as long as the nspawn container itself runs.
Well, yes, the current protocol is pretty strict on that: we pick them
up and decrypt them when a service starts, release them when it stops,
and do not support reloading in between, at least as far as the
$CREDENTIALS_DIRECTORY API is concerned. This is nicely "atomic" in
behaviour: the service sees stable and immutable versions of the creds
for its entire lifetime.

> So how would I best handle renewal of the certificate? Would I have to
> restart each container via machinectl in order to reload this, thus causing
> very brief downtime on all of my services?

Well, yes, that's certainly one way.

> Is there a better way of doing what I'm trying to accomplish here? Nginx can
> access the certificate normally, but I would like to run it as a totally
> dynamic user combo. I also host other services that do not run as root first
> before dropping privileges, so they require access to the certificate
> another way. So I thought of systemd's credentials management to give access
> without compromising on security and isolation.

So, basically there are three APIs for credentials:

1. via the $CREDENTIALS_DIRECTORY API: supports encrypted and plaintext
   credentials, and inheritance down the tree.

2. via the "systemd-creds decrypt" cmdline tool: supports encrypted
   credentials, and can be invoked anytime.

3. With the upcoming v258 there's also a varlink interface for the
   latter, so that programs can ask for that.

Of course the 2nd/3rd way is entirely dynamic: clients can request creds
any time they like. But it also works a bit differently from the 1st
way, it's more "raw", i.e. you have to provide the encrypted blob
yourself; there's no mechanism that finds it for you automatically.

I understand that people are looking for a way to support a "reload"
mechanism in the $CREDENTIALS_DIRECTORY API.
I used to be very negative on that idea, because doing this with some
concept of "atomic" behaviour used to be impossible: if you have 5 creds
that belong together there was no way to ensure that if you read them
all you get either the 5 old or the 5 new creds, but not a mixture.

This however has changed a while back. With the new MOVE_MOUNT_BENEATH
logic in the kernel we can, for the first time, replace a mount
atomically. So with this we could make it so that a service sees either
the old set or the new set, simply by swapping out the tmpfs mount they
are stored on in the namespace of the service. This isn't perfect (since
if you open the creds files by path at the wrong times you might still
get a bad combination), but if clients are well behaved this shouldn't
matter (i.e., if a client first pins the $CREDENTIALS_DIRECTORY dir by
fd, and then opens the creds via openat() from there, we get a
consistent, atomic view of things).

Anyway, long story short: happy to review a patch for that.

Lennart

--
Lennart Poettering, Berlin
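
For illustration, the "well behaved client" pattern mentioned above (pin
the directory by fd, then open each credential relative to that fd) could
look roughly like the following. This is only a sketch: the helper name
read_credentials() and the credential names in the usage note are made up
for the example, and the reload mechanism it would guard against does not
exist yet.

```python
import os


def read_credentials(names, cred_dir=None):
    """Read several credentials through a single pinned directory fd.

    Hypothetical helper, not part of any systemd API. All files are
    opened relative to one directory fd, so they come from the same
    directory instance even if the path is later re-pointed elsewhere.
    """
    if cred_dir is None:
        # Set by the service manager for services that have credentials.
        cred_dir = os.environ["CREDENTIALS_DIRECTORY"]

    # Pin the directory once; this is the only lookup done by path.
    dir_fd = os.open(cred_dir, os.O_RDONLY | os.O_DIRECTORY | os.O_CLOEXEC)
    try:
        result = {}
        for name in names:
            # openat()-style open relative to the pinned directory.
            fd = os.open(name, os.O_RDONLY | os.O_CLOEXEC, dir_fd=dir_fd)
            with os.fdopen(fd, "rb") as f:
                result[name] = f.read()
        return result
    finally:
        os.close(dir_fd)
```

A service would call something like read_credentials(["tls.crt",
"tls.key"]) (names assumed here) once per configuration load, and call it
again to pick up a new set, rather than opening each file by full path at
different times.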