On Sa, 03.05.25 14:50, Xogium (cont...@xogium.me) wrote:

> Hi,
>
> so for context, I want to isolate most services I plan on running inside
> containers, with each its own nginx, php, etc.
>
> My issue is with credentials. I would like the host to handle the renewal of
> tls certificate, and to have the credentials propagated via systemd-nspawn
> to the services that run within each container. I get the basic idea of how
> to implement this, but from what I'm reading, once the credentials are
> loaded, they are immutable for as long as the service runs -- in this case
> I'm assuming as long as the nspawn container itself runs.

Well, yes, the current protocol is pretty strict on that: we pick them
up and decrypt them when a service starts, release them when it
stops, and do not support reloading in between, at least as far as the
$CREDENTIALS_DIRECTORY API is concerned.

This behaviour is nicely "atomic": the service sees stable and
immutable versions of the creds for its entire lifetime.
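
To illustrate the consumer side of that API, here is a minimal sketch
in Python, assuming the service was started with at least one
credential configured. The helper name read_credential() and the
credential name "tls.crt" are made up for this example; only the
$CREDENTIALS_DIRECTORY environment variable comes from systemd.

```python
import os
from pathlib import Path


def read_credential(name: str) -> bytes:
    """Return the contents of one credential passed to this service.

    `name` is a hypothetical credential name chosen for illustration.
    """
    # $CREDENTIALS_DIRECTORY is only set if the unit has credentials configured.
    cred_dir = os.environ.get("CREDENTIALS_DIRECTORY")
    if cred_dir is None:
        raise RuntimeError("CREDENTIALS_DIRECTORY is not set")
    # Each credential is a regular file named after the credential.
    return (Path(cred_dir) / name).read_bytes()
```

A service would call something like read_credential("tls.crt") once
at startup; the content does not change while the service runs.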

> So how would I best handle renewal of the certificate? Would I have to
> restart each container via machinectl in order to reload this, thus causing
> very brief downtime on all of my services?

Well, yes, that's certainly one way.

> Is there a better way of doing what I'm trying to accomplish here? Nginx can
> access the certificate normally, but I would like to run it as a totally
> dynamic user combo. I also host other services that do not run as root first
> before dropping privileges, so they require access to the certificate
> another way. So I thought of systemd's credentials management to give access
> without compromising on security and isolation.

So, basically there are three APIs for credentials:

1. via the $CREDENTIALS_DIRECTORY API: supports encrypted and
   plaintext credentials, and inheritance down the tree.

2. via the "systemd-creds decrypt" cmdline tool: supports encrypted
   credentials, and can be invoked at any time.

3. With the upcoming v258 there's also a varlink interface for the
   latter, so that programs can request decryption themselves.

Of course the 2nd/3rd way is entirely dynamic: clients can request
creds any time they like. But it also works a bit differently from the
1st way: it's more "raw", i.e. you have to provide the encrypted blob
yourself; there's no mechanism that finds it automatically for you.
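
As a rough sketch of the 2nd way, a program could shell out to
"systemd-creds decrypt" whenever it wants a fresh copy. The helper
names and the blob path below are hypothetical, and this assumes the
caller is allowed to decrypt (i.e. has access to the key the blob was
encrypted with). Passing "-" as output asks the tool to write the
plaintext to stdout.

```python
import subprocess
from typing import List, Optional


def decrypt_command(blob_path: str, name: Optional[str] = None) -> List[str]:
    """Build the argv for decrypting one encrypted credential blob."""
    cmd = ["systemd-creds", "decrypt"]
    if name is not None:
        # Overrides the credential name the blob is validated against.
        cmd.append(f"--name={name}")
    # "-" as output: write the decrypted data to stdout.
    cmd += [blob_path, "-"]
    return cmd


def decrypt_credential(blob_path: str, name: Optional[str] = None) -> bytes:
    """Run systemd-creds and return the plaintext; raises on failure."""
    result = subprocess.run(
        decrypt_command(blob_path, name), check=True, capture_output=True
    )
    return result.stdout
```

Note that the caller has to know where the blob lives, e.g.
decrypt_credential("/path/to/tls.key.cred") with a path of your own
choosing; nothing looks it up for you.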

I understand that people are looking for a way to support a
"reload" mechanism in the $CREDENTIALS_DIRECTORY API. I used to be
very negative on that idea, because doing this with some concept of
"atomic" behaviour used to be impossible: if you have 5 creds that
belong together, there was no way to ensure that when you read them
all you get either the 5 old or the 5 new creds, but not a
mixture. This, however, changed a while back. With the new
MOVE_MOUNT_BENEATH logic in the kernel we can, for the first time,
replace a mount atomically. So with this we could make it so that a
service sees either the old or the new set, simply by swapping out the
tmpfs mount they are stored on in the namespace of the service. This
isn't perfect (since if you open the creds files by path at the wrong
time you might still get a bad combination), but if clients are well
behaved this shouldn't matter (i.e., if a client first pins the
$CREDENTIALS_DIRECTORY dir by fd, and then opens the creds via
openat() from there, it gets a consistent, atomic view of things).
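
The "well behaved client" pattern could look roughly like this in
Python. This is only a sketch of the client side; the reload mechanism
itself does not exist yet, and the function name is made up. In
Python, os.open() with dir_fd= is what maps to openat().

```python
import os
from typing import Dict, Iterable


def read_credentials_consistently(names: Iterable[str]) -> Dict[str, bytes]:
    """Read several credentials relative to one pinned directory fd."""
    cred_dir = os.environ["CREDENTIALS_DIRECTORY"]
    # Pin the directory once. All later lookups go through this fd, so
    # they resolve inside the same directory even if the mount at that
    # path is swapped in the meantime.
    dir_fd = os.open(cred_dir, os.O_RDONLY | os.O_DIRECTORY)
    try:
        creds = {}
        for name in names:
            # Equivalent to openat(dir_fd, name, O_RDONLY).
            fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
            with os.fdopen(fd, "rb") as f:
                creds[name] = f.read()
        return creds
    finally:
        os.close(dir_fd)
```

Opening each file by its full path instead would resolve
$CREDENTIALS_DIRECTORY again for every file, which is exactly the case
where a mixture of old and new creds could show up.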

Anyway, long story short: happy to review a patch for that.

Lennart

--
Lennart Poettering, Berlin
