Re: [systemd-devel] Anonymous SYSTEMD_NOTIFY socket

2023-06-28 Thread Lennart Poettering
On Di, 27.06.23 13:36, Adrian Vovk (adrianv...@gmail.com) wrote:

> Hello!
>
> I'm working on passing sd_notify events from systemd-{pull,import} through
> sysupdate.
>
> All services that consume sd_notify events (systemd itself, importd,
> machined, homed, etc) act as daemons and own a directory in /run. Thus,
> they can open a notification socket at, say, /run/SERVICENAME/notify and
> set NOTIFY_SOCKET to that. Also, there's no cleanup involved: if the
> service goes away the file sticks around until the service is restarted.
>
> sysupdate, however, is the first instance of a worker process forking off
> another worker process. Thus, we cannot bind the notify socket to some
> stable name. Here are potential approaches I've explored to solve this:
>
> - Simply pass through the NOTIFY_SOCKET environment variable. That's not
> suitable because we want to export an overall progress value (smoothly from
> 0 to 100), but systemd-import is forked off multiple times so we'd instead
> export a progress value that bounces around from 0 to 100 and back to 0.
> Also progress messages would come from different PIDs for a single
> invocation of sysupdate
>
> - Create a temporary file and use that as the socket. Problem: What happens
> if systemd-sysupdate crashes and we don't get to clean up that file? Over
> time that potentially clutters up /tmp! Is this a concern?
>
> - Use socketpair to open an anonymous socket and pass it into the child.
> This one seems ideal on paper but it doesn't actually work. I can modify
> sd_notify to just use an open file descriptor instead of tying to open its
> own, and that does work except for some reason process credentials aren't
> sent over. Also using the socket pair method doesn't work all that well
> with CLOEXEC, though maybe we don't want to CLOEXEC (We'd only close the
> socket when sd_notify is called with unset_env=true)
>
> - Create an abstract socket, named after the PID of the parent sysupdate.
> i.e. @/run/sysupdate/PID/notify. I'm not super familiar with abstract
> sockets so I'm not sure of the downsides
>
> Which approach would you suggest?

Use an abstract socket for this, with a randomized name, to avoid DoS
scenarios (i.e. include a formatted value of random_u64() in the
socket address).

abstract namespace sockets are nice for things like this, but they are
inherently vulnerable to DoS attacks if you use a fixed name since the
namespace knows not access controls: everyone can grab any socket they
like.

Make sure to look at the source PID (i..e SCM_CREDENTIALS) before
using incoming data.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Anonymous SYSTEMD_NOTIFY socket

2023-06-28 Thread Mantas Mikulėnas
On Tue, Jun 27, 2023 at 8:36 PM Adrian Vovk  wrote:

> Hello!
>
> I'm working on passing sd_notify events from systemd-{pull,import} through
> sysupdate.
>
> All services that consume sd_notify events (systemd itself, importd,
> machined, homed, etc) act as daemons and own a directory in /run. Thus,
> they can open a notification socket at, say, /run/SERVICENAME/notify and
> set NOTIFY_SOCKET to that. Also, there's no cleanup involved: if the
> service goes away the file sticks around until the service is restarted.
>
> sysupdate, however, is the first instance of a worker process forking off
> another worker process. Thus, we cannot bind the notify socket to some
> stable name. Here are potential approaches I've explored to solve this:
>
> - Simply pass through the NOTIFY_SOCKET environment variable. That's not
> suitable because we want to export an overall progress value (smoothly from
> 0 to 100), but systemd-import is forked off multiple times so we'd instead
> export a progress value that bounces around from 0 to 100 and back to 0.
> Also progress messages would come from different PIDs for a single
> invocation of sysupdate
>
> - Create a temporary file and use that as the socket. Problem: What
> happens if systemd-sysupdate crashes and we don't get to clean up that
> file? Over time that potentially clutters up /tmp! Is this a concern?
>

I assume you meant named sockets (like one would usually find in /run), not
actually regular temporary files?

systemd-tmpfiles should correctly clean up broken sockets in /tmp, IIRC it
supports checking whether the socket is bound or stale (though maybe /run
is still a better place, even for temporary sockets).

Personally my concern would be the crash itself, not the lack of cleanup.
But if the sockets are kept in a single place, say, /run/sysupdate/notify/,
then the subsequent restart could clean out all of them?


>
> - Use socketpair to open an anonymous socket and pass it into the child.
> This one seems ideal on paper but it doesn't actually work. I can modify
> sd_notify to just use an open file descriptor instead of tying to open its
> own, and that does work except for some reason process credentials aren't
> sent over. Also using the socket pair method doesn't work all that well
> with CLOEXEC, though maybe we don't want to CLOEXEC (We'd only close the
> socket when sd_notify is called with unset_env=true)
>
> - Create an abstract socket, named after the PID of the parent sysupdate.
> i.e. @/run/sysupdate/PID/notify. I'm not super familiar with abstract
> sockets so I'm not sure of the downsides
>

Abstract sockets are tied to the network namespace, instead of the
filesystem (mount namespace). That's the main difference, as far as I know.

-- 
Mantas Mikulėnas


[systemd-devel] Anonymous SYSTEMD_NOTIFY socket

2023-06-27 Thread Adrian Vovk
Hello!

I'm working on passing sd_notify events from systemd-{pull,import} through
sysupdate.

All services that consume sd_notify events (systemd itself, importd,
machined, homed, etc) act as daemons and own a directory in /run. Thus,
they can open a notification socket at, say, /run/SERVICENAME/notify and
set NOTIFY_SOCKET to that. Also, there's no cleanup involved: if the
service goes away the file sticks around until the service is restarted.

sysupdate, however, is the first instance of a worker process forking off
another worker process. Thus, we cannot bind the notify socket to some
stable name. Here are potential approaches I've explored to solve this:

- Simply pass through the NOTIFY_SOCKET environment variable. That's not
suitable because we want to export an overall progress value (smoothly from
0 to 100), but systemd-import is forked off multiple times so we'd instead
export a progress value that bounces around from 0 to 100 and back to 0.
Also progress messages would come from different PIDs for a single
invocation of sysupdate

- Create a temporary file and use that as the socket. Problem: What happens
if systemd-sysupdate crashes and we don't get to clean up that file? Over
time that potentially clutters up /tmp! Is this a concern?

- Use socketpair to open an anonymous socket and pass it into the child.
This one seems ideal on paper but it doesn't actually work. I can modify
sd_notify to just use an open file descriptor instead of tying to open its
own, and that does work except for some reason process credentials aren't
sent over. Also using the socket pair method doesn't work all that well
with CLOEXEC, though maybe we don't want to CLOEXEC (We'd only close the
socket when sd_notify is called with unset_env=true)

- Create an abstract socket, named after the PID of the parent sysupdate.
i.e. @/run/sysupdate/PID/notify. I'm not super familiar with abstract
sockets so I'm not sure of the downsides

Which approach would you suggest?

Best,
Adrian