[systemd-devel] systemctl, unclear error msg/warning, "Refusing to accept PID outside of service control group..."

2022-10-27 Thread rop
When running systemctl status on an old application, we now see this
error/warning:

   "systemd[1]: Refusing to accept PID outside of service control group,
acquired through unsafe symlink chain"
I tried removing a symlink from the chain, but then got this instead:

   "New main PID x does not belong to service, and PID file is not
owned by root. Refusing."
And when I tried to examine the PID file, I found it had not even been created.

In spite of these errors/warnings, it seems the application is still
working...
So I am not sure when this error started appearing,
but I suspect it has started after some RHEL OS-upgrade in the last year(s).

Googling hasn't turned up much so far.

Where can I find an explanation of these messages?

What exactly is deemed "unsafe"?

And what to do about it?


Re: [systemd-devel] systemd-resolved/NetworkManager resolv.conf handling

2022-10-27 Thread Barry



> On 26 Oct 2022, at 20:17, Thomas HUMMEL  wrote:
> 
> Hello,
> 
> I'm not sure if this is a systemd-resolved or NetworkManager question but it 
> involves both (I know Thomas HALLER is a member of this list too)
> 
> on
> 
> Fedora release 36 (Thirty Six) using the following kernel and packages
> 
>5.19.16-200.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC
> 
>systemd-250.8-1.fc36.x86_64
>systemd-resolved-250.8-1.fc36.x86_64
>NetworkManager-1.38.4-1.fc36.x86_64
> 
> I'm using a proprietary vpn client which does not seem to work very well with 
> systemd-resolved. As a matter of fact it seems to create a manual NM profile 
> which does not include dns properties and it seems to (try to) set 
> /etc/resolv.conf aside (F5 vpn linux client f5fpc for the record)
> 
> Making it work is not the question here. I'm trying to understand how the 2 
> nameservers it configures may end up in /run/systemd/resolve/resolv.conf (and 
> global systemd-resolved config as shown by resolvectl status) ONLY when I 
> switch from a non systemd-resolved config then back to a systemd-resolved 
> config

Can you hook into the VPN client and intercept it doing the DNS changes?
I do that for the VPN client used at work to solve integration issues.
In my case it's a matter of setting the right options so that it uses a wrapper
around resolvectl that fixes things before calling the real resolvectl.

Barry

> 
> Here's exactly what I'm doing/experiencing:
> 
> Starting from
> 
> a) default NetworkManager config:
> 
> # grep -iE 'dns|rc\.manager' NetworkManager.conf
> # ls -l conf.d/
> total 0
> 
> b) systemd-resolved stub-resolv.conf mode:
> 
> # ls -l /etc/resolv.conf
> lrwxrwxrwx 1 root root 37 Oct 26 19:15 /etc/resolv.conf -> 
> /run/systemd/resolve/stub-resolv.conf
> 
> and with (not linked from /etc/resolv.conf) :
> 
> /run/systemd/resolve/resolv.conf having the following content:
> 
> nameserver 192.168.1.1
> nameserver 2a01:cb00:7e1:3300:aa6a:bbff:fe6e:190
> search home
> 
> matching my auto wireless NM profile
> 
> 1) I start the vpn client
> 
> obviously it does not work very well with systemd-resolved as I don't get 
> corresponding nameserver (10.33.1.2,10.33.1.3) anywhere and name resolution 
> does not work for corresponding zones
> 
> /run/systemd/resolve/resolv.conf content has not changed
> 
> 2) I stop the vpn client, and switch to the following setup
> 
> # rm /etc/resolv.conf
> rm: remove symbolic link '/etc/resolv.conf'? y
> 
> # cat <<EOF > /etc/NetworkManager/conf.d/foo.conf
> > [main]
> > dns=default
> > rc.manager=file
> > EOF
> 
> # reboot
> 
> -> after the reboot the /etc/resolv.conf link has been recreated: why?
> 
> (/run/systemd/resolve/resolv.conf hasn't changed, which seems normal to me)
> 
> 3) I remove it again and reboot
> 
> # rm /etc/resolv.conf
> rm: remove symbolic link '/etc/resolv.conf'? y
> 
> # reboot
> 
> -> this time /etc/resolv.conf is, as expected, a regular file whose content is 
> handled by NM:
> 
> $ ls -l /etc/resolv.conf
> -rw-r--r-- 1 root root 114 Oct 26 20:22 /etc/resolv.conf
> $ cat /etc/resolv.conf
> # Generated by NetworkManager
> search home
> nameserver 192.168.1.1
> nameserver 2a01:cb00:7e1:3300:aa6a:bbff:fe6e:190
> 
> 
> 4) I start the vpn client
> 
> it wrote to /etc/resolv.conf (which seems wrong to me but is out of scope 
> here)
> 
> $ cat /etc/resolv.conf
> #F5 Networks Inc. :File modified by VPN process
> search pasteur.fr home
> nameserver 10.33.1.2
> nameserver 10.33.1.3
> 
> the 2 nameservers it provided do not appear in 
> /run/systemd/resolve/resolv.conf
> 
> 6) I stop the vpn client, switch back to my original config, and reboot
> 
> # rm /etc/NetworkManager/conf.d/foo.conf
> rm: remove regular file '/etc/NetworkManager/conf.d/foo.conf'? y
> 
> # rm /etc/resolv.conf
> rm: remove regular file '/etc/resolv.conf'? y
> 
> # ln -s /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf
> 
> # reboot
> 
> -> everything looks as expected
> 
> 7) I start the vpn client
> 
> -> the nameservers it provides appear in /run/systemd/resolve/resolv.conf (and 
> resolution of the related zones works)
> 
> -> why ? Where does the info come from ?
> 
> nameserver 10.33.1.2
> nameserver 10.33.1.3
> nameserver 192.168.1.1
> # Too many DNS servers configured, the following entries may be ignored.
> nameserver 2a01:cb00:7e1:3300:aa6a:bbff:fe6e:190
> search pasteur.fr home
> 
> Can you help me figure out what's happening, or at least how the behavior 
> can seem to change after what seems to be a rollback to the initial state?
> 
> Thanks for your help
> 
> --
> Thomas HUMMEL
> 
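
The resolv.conf listings quoted above can be compared mechanically. The following is only a minimal sketch, not anything NetworkManager or systemd-resolved actually runs: it parses resolv.conf-style text and splits the nameservers at the three-entry limit (MAXNS) documented in resolv.conf(5), which is what the "Too many DNS servers configured" comment in step 7 refers to. The function name and the sample variable are made up for illustration.

```python
MAXNS = 3  # resolv.conf(5): at most MAXNS (currently 3) nameservers are used


def parse_resolv_conf(text):
    """Return (nameservers, search_domains) from resolv.conf-style text."""
    nameservers, search = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        keyword, *values = line.split()
        if keyword == "nameserver" and values:
            nameservers.append(values[0])
        elif keyword == "search":
            search = values  # only the last search line is honoured
    return nameservers, search


# Content reported in step 7 of the quoted message.
sample = """\
nameserver 10.33.1.2
nameserver 10.33.1.3
nameserver 192.168.1.1
# Too many DNS servers configured, the following entries may be ignored.
nameserver 2a01:cb00:7e1:3300:aa6a:bbff:fe6e:190
search pasteur.fr home
"""

servers, domains = parse_resolv_conf(sample)
used, ignored = servers[:MAXNS], servers[MAXNS:]
print("used:", used)
print("ignored:", ignored)
print("search:", domains)
```

Running the same function over /etc/resolv.conf and /run/systemd/resolve/resolv.conf after each step would show which file picked up the VPN's nameservers.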



Re: [systemd-devel] How to get a useful peer address when doing accept(3, ...) on a systemd supplied listening socket

2022-10-27 Thread Klaus Ebbe Grue
Hi Mantas


> I have a feeling it "changes" because you're trying to give the whole
> struct sockaddr to inet_ntop() instead of giving just the .sin6_addr field,
> so your program is trying to interpret the *port number*
> (i.e. the .sin6_port which precedes .sin6_addr) as part of the address...

That was exactly my mistake! Thanks.

At some point in my hopeless odyssey down a wrong track I tried giving the 
whole struct to inet_ntop() and forgot to change it back when I got everything 
else to work.

Thanks for your help and sorry for the inconvenience.

Cheers,
Klaus



Re: [systemd-devel] How to get a useful peer address when doing accept(3, ...) on a systemd supplied listening socket

2022-10-27 Thread Mantas Mikulėnas
On Thu, Oct 27, 2022 at 1:51 PM Klaus Ebbe Grue  wrote:

> Hi systemd-devel,
>
> Sorry to bug you with another user question.
>
> I have a socket activated daemon, call it mydaemon, and I have trouble
> finding out who connects to it.
>
>
> mydaemon.socket contains:
>
>
>   [Socket]
>   ListenStream=
>
> When I connect using IPv4 using
>
>   nc -4 localhost 
>
> then mydaemon does
>
>   sockaddr_in6 peer;
>   socklen_t peer_size=sizeof(peer);
>   accept(3,(struct sockaddr *)&peer,&peer_size)
>
> Afterwards, peer.sin6_family is AF_INET6 and peer.sin6_addr contains some
> gibberish like a00:e5ae::
>

If you specify nothing for the listen address, systemd will assume the IPv6
address [::] as the default, and will create an AF_INET6 socket bound to
[::]:.

Due to Linux's default "bind both families" magic, it will actually be
bound to both [::]: *and* 0.0.0.0:, so it will accept IPv4
connections – but you'll receive them in the form of AF_INET6 sockets, so
the peer address of your v4 client indeed has family AF_INET6 but contains
a "v6-mapped" IPv4 address such as [::ffff:10.0.229.174] aka
[::ffff:a00:e5ae].

The alternative would be to specify both ListenStream=[::]: and
ListenStream=0.0.0.0: (as well as BindIPv6Only=ipv6-only), which would
cause you to receive *two* socket FDs – one purely for IPv6 clients, the
other for IPv4 – that you'd have to put into poll() or some other loop for
accepting clients.
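
For illustration, here is a rough sketch of what that loop over several inherited sockets could look like. It assumes only the sd_listen_fds(3) conventions (descriptors start at 3, announced through the LISTEN_PID and LISTEN_FDS environment variables). The helper names and the handle callback are hypothetical, and this is not the code from the daemon under discussion.

```python
import os
import selectors
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per sd_listen_fds(3)


def inherited_fds(environ, pid):
    """Return the fd numbers passed by the service manager, or [] if none."""
    if environ.get("LISTEN_PID") != str(pid):
        return []  # the variables were meant for another process
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def serve(fds, handle):
    """Accept on every listening fd; handle(conn, peer) is a hypothetical callback."""
    sel = selectors.DefaultSelector()
    for fd in fds:
        # With fileno= given, family and type are detected from the descriptor.
        sel.register(socket.socket(fileno=fd), selectors.EVENT_READ)
    while True:
        for key, _events in sel.select():
            conn, peer = key.fileobj.accept()
            handle(conn, peer)


if __name__ == "__main__":
    fds = inherited_fds(os.environ, os.getpid())
    if fds:
        serve(fds, lambda conn, peer: (print(peer), conn.close()))
```

With a single dual-stack socket the same loop still works; it simply registers one descriptor.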

You can extract the IPv4 address by detecting the [::ffff:0:0/96] prefix
and stripping away the first 12 bytes. (There's also a magic option for
setsockopt() listed in ipv6(7), IPV6_ADDRFORM, that can convert such a
"v6-mapped" socket to a "real" AF_INET socket, but it's rarely needed.)
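
As a small sketch of the prefix check described above (written in Python for brevity, with a made-up function name): the 16 raw bytes of sin6_addr are compared against the 12-byte v4-mapped prefix, and the trailing 4 bytes are read as an IPv4 address when it matches.

```python
import ipaddress

# The 12-byte ::ffff:0:0/96 prefix: ten zero bytes followed by ff ff.
V4_MAPPED_PREFIX = bytes(10) + b"\xff\xff"


def printable_peer(addr16):
    """Turn the 16 raw bytes of sin6_addr into a printable peer address."""
    if addr16[:12] == V4_MAPPED_PREFIX:
        # Strip the first 12 bytes; the last 4 are the IPv4 address.
        return str(ipaddress.IPv4Address(addr16[12:]))
    return str(ipaddress.IPv6Address(addr16))


mapped = ipaddress.IPv6Address("::ffff:10.0.229.174").packed
native = ipaddress.IPv6Address("2001:db8::1").packed
print(printable_peer(mapped))  # 10.0.229.174
print(printable_peer(native))  # 2001:db8::1
```

Python's ipaddress module also exposes the same test as the IPv6Address.ipv4_mapped property. The manual version is shown because it mirrors what a C program would do with memcmp() on sin6_addr.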


>
> If I connect more than once, the gibberish changes from connection to
> connection.
>

I have a feeling it "changes" because you're trying to give the whole
struct sockaddr to inet_ntop() instead of giving just the .sin6_addr field,
so your program is trying to interpret the *port number* (i.e. the
.sin6_port which precedes .sin6_addr) as part of the address...

But please show your entire code, otherwise this is all just guessing.
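
To make the guess above concrete, here is a sketch that lays out a struct sockaddr_in6 by hand and formats the wrong 16 bytes. It assumes the Linux layout on a little-endian machine (family 10, then port, flowinfo, address, scope id); the ports and the loopback peer are made-up values. Under those assumptions the reported "a00:e5ae::" falls out as family 0x000a followed by port 0xe5ae (58798), and it changes per connection because the client's ephemeral port changes.

```python
import ipaddress
import struct

AF_INET6 = 10  # Linux value, hard-coded (an assumption) so the sketch runs anywhere


def sockaddr_in6(port, addr):
    """Pack a Linux struct sockaddr_in6 as it would sit in memory on x86."""
    return (struct.pack("<H", AF_INET6)            # sin6_family, host byte order
            + struct.pack("!H", port)              # sin6_port, network byte order
            + struct.pack("!I", 0)                 # sin6_flowinfo
            + ipaddress.IPv6Address(addr).packed   # sin6_addr, 16 bytes
            + struct.pack("<I", 0))                # sin6_scope_id


def misread(raw):
    """Format the first 16 bytes of the struct, as the suspected bug would."""
    return str(ipaddress.IPv6Address(raw[:16]))


def correct(raw):
    """Format only the sin6_addr field, which starts at offset 8."""
    ip = ipaddress.IPv6Address(raw[8:24])
    return str(ip.ipv4_mapped or ip)


first = sockaddr_in6(58798, "::ffff:127.0.0.1")
second = sockaddr_in6(58800, "::ffff:127.0.0.1")
print(misread(first))   # a00:e5ae::
print(misread(second))  # a00:e5b0::
print(correct(first))   # 127.0.0.1
```

The same reading would explain the "2.0.191.150" seen with the IPv4-only socket: family 2 followed by port 0xbf96.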

Here's a working example that I've just tested with
`systemd-socket-activate --listen=`:
https://gist.github.com/grawity/63369273742f23b596d764cb6d45feb7


>
> If mydaemon creates the listening socket, I can easily get the peer
> address.
>
> I suspect that when systemd creates the listening socket then
> accept(3,...) returns a socket which is connected to a local socket created
> by systemd.
>
> QUESTION: Is that suspicion correct?
>

No, it isn't.

>

-- 
Mantas Mikulėnas


Re: [systemd-devel] Is there a way to find out if Delegate=yes?

2022-10-27 Thread Arseny Maslennikov
On Thu, Oct 27, 2022 at 11:48:20AM +0300, Yuri Kanivetsky wrote:
> Arseny Maslennikov, for some reason I didn't receive your email.

It had successfully reached this mailing list by 2022-Oct-25, so that
means you're not subscribed to the list. Strangely enough,
the mail receiver rejects emails from non-subscribers, so you wouldn't
be able to reach out to the list at all.

On Thu, Oct 27, 2022 at 11:48:20AM +0300, Yuri Kanivetsky wrote:
> Anyways, indeed on the server with --user:
> 
> $ systemctl --user show -p Delegate 
> run-rcbb44fb2c7774453b18cda8fe03f0f26.scope
> Delegate=yes
> 
> But that's just part of the mystery. Locally, what can I do... I can
> try and query the scope to which my shell belongs:
> 
> $ systemctl --user show -p Delegate session-2.scope
> Delegate=no

As usual, since the logind machinery which creates this scope
on the request of pam_systemd.so does not set that property.

> Or the enclosing slice for the scope on the server (the local slice
> that matches the one on the server where the transient scope is
> created):
> 
> $ systemctl --user show -p Delegate app.slice
> Delegate=no

It wouldn't make sense the other way for any slice. Slices map to inner
cgroups, which distribute system resources, and not to leaf cgroups,
which can host processes. This means there's no one to delegate to.

The design of slices and scopes (and all the other units processes can
be a part of) largely inherits from the cgroupv2 design in the kernel.
A non-`/` cgroup can _either_ have child cgroups (and distribute
resources to them) _or_ have member processes. Here is an excerpt from
[1]https://www.kernel.org/doc/Documentation/admin-guide/cgroup-v2.rst:
>> Non-root cgroups can distribute domain resources to their children
>> only when they don't have any processes of their own.  In other words,
>> only domain cgroups which don't contain any processes can have domain
>> controllers enabled in their "cgroup.subtree_control" files.
That document is a valuable reference in general.
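
As a rough illustration of that rule, the sketch below classifies a cgroup v2 directory by looking at its cgroup.procs and cgroup.subtree_control files. The function name and the labels are invented for this example, and the demo runs against a fake directory so that it works without a real cgroup tree.

```python
from pathlib import Path
import tempfile


def cgroup_role(cgroup_dir):
    """Classify a cgroup v2 directory as 'inner', 'leaf', 'both' or 'empty'."""
    d = Path(cgroup_dir)
    pids = d.joinpath("cgroup.procs").read_text().split()
    controllers = d.joinpath("cgroup.subtree_control").read_text().split()
    if pids and controllers:
        # The quoted rule forbids this for domain controllers in non-root cgroups.
        return "both"
    if controllers:
        return "inner"  # distributes resources to child cgroups
    if pids:
        return "leaf"   # hosts processes
    return "empty"


# Demo on a fake directory standing in for something under /sys/fs/cgroup.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp)
    (fake / "cgroup.procs").write_text("1234\n")
    (fake / "cgroup.subtree_control").write_text("")
    leaf_role = cgroup_role(fake)
    (fake / "cgroup.procs").write_text("")
    (fake / "cgroup.subtree_control").write_text("cpu memory\n")
    inner_role = cgroup_role(fake)

print(leaf_role, inner_role)  # leaf inner
```

On a real system one would expect slices to come out as "inner" or "empty", and scopes and services as "leaf", unless the unit is delegated and manages its own subtree.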

> Somehow I don't need systemd-run for lxc-start and lxc-attach locally.
> Any ideas?

The topic is really dizzying, in fact.

I'll try to explain what I can. I suppose there's someone in the world
who has really hit the problems described below and is in a better
position to comment, or provide links to available resources where the
experience is documented for the perusal of the community.

(I have little experience with the lxc-* suite)
It looks like lxc-start(1) and lxc-attach(1) try to manage cgroups
themselves. If they work on processes in a systemd-managed cgroup (or
put new processes in one), the unit that maps to that cgroup should
have `Delegate=yes` for at least the following reasons:
— the permissions on file objects under /sys/fs/cgroup/ (e.g.
  controllers) are set appropriately;
— the unit manager puts its hands off the delegated cgroup, so there's a
  single entity managing the cgroup.
This really holds for any container manager foreign to systemd.
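
To check which unit a process actually sits in and whether that unit is delegated, something along these lines might help. It is only a sketch: the function names are made up, the sample cgroup path (including the UID 1000) is hypothetical, and the last path component is the unit name only while nothing has created sub-cgroups below the unit.

```python
import subprocess


def unit_from_cgroup(proc_cgroup_text):
    """Return the last path component of the cgroup v2 entry ("0::/...")."""
    for line in proc_cgroup_text.splitlines():
        hierarchy, _controllers, path = line.split(":", 2)
        if hierarchy == "0":
            return path.rstrip("/").rsplit("/", 1)[-1]
    return None  # no unified-hierarchy entry found


def delegate_setting(unit, user=True):
    """Ask the service manager for the unit's Delegate= property."""
    cmd = ["systemctl"] + (["--user"] if user else [])
    cmd += ["show", "-p", "Delegate", "--value", unit]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Hypothetical content of /proc/self/cgroup inside the transient scope.
sample = ("0::/user.slice/user-1000.slice/user@1000.service/app.slice/"
          "run-rcbb44fb2c7774453b18cda8fe03f0f26.scope\n")
print(unit_from_cgroup(sample))
```

In real use one would pass the contents of /proc/self/cgroup (or /proc/PID/cgroup) to unit_from_cgroup() and then call delegate_setting() on the result. The user flag matters here: a scope created with systemd-run --user is known only to the user manager, which matches the Delegate=yes result obtained with --user above.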

If this is not fulfilled, the result is undefined: the lxc-utils and
their payloads may work, not work, occasionally work or (the scariest
option) sometimes break.


In addition to [1], there's also a systemd-centric document on the topic:
[2]https://systemd.io/CGROUP_DELEGATION/
One of the topics it intends to cover is the semantics of the
`Delegate=` property on units.

It is also structured as more of a reference than a guide, but
(unfortunately) it often states how things should be done without
explaining why.
>> The single-writer rule: this means that each cgroup only has a single
>> writer, i.e. a single process managing it. It’s OK if different
>> cgroups have different processes managing them. However, only a
>> single process should own a specific cgroup, and when it does that
>> ownership is exclusive, and nothing else should manipulate it at the
>> same time. This rule ensures that various pieces of software don’t
>> step on each other’s toes constantly.
There are no examples of what exactly might go wrong (and if there were
any, your question would be answered).
Also, scopes themselves are a type of unit that does not originate from
systemd (e.g. they don't have a unit file) and that serves only as a tracking
measure for PIDs, so it's even harder to imagine a scenario where
putting foreign PIDs in one would break. It's definitely not possible to
make child cgroups of a scope, though.

As noted above, someone who dabbles a lot in the cgroup mechanics and/or
deals with the lxc-* project would be in a better position to comment
than me.

Cheers!

[1]https://www.kernel.org/doc/Documentation/admin-guide/cgroup-v2.rst
[2]https://systemd.io/CGROUP_DELEGATION/




[systemd-devel] How to get a useful peer address when doing accept(3, ...) on a systemd supplied listening socket

2022-10-27 Thread Klaus Ebbe Grue
Hi systemd-devel,

Sorry to bug you with another user question.

I have a socket activated daemon, call it mydaemon, and I have trouble finding 
out who connects to it.


mydaemon.socket contains:


  [Socket]
  ListenStream=

When I connect using IPv4 using

  nc -4 localhost 

then mydaemon does

  sockaddr_in6 peer;
  socklen_t peer_size=sizeof(peer);
  accept(3,(struct sockaddr *)&peer,&peer_size)


Afterwards, peer.sin6_family is AF_INET6 and peer.sin6_addr contains some 
gibberish like a00:e5ae::


If I connect more than once, the gibberish changes from connection to 
connection.


Something similar happens if I connect using IPv6.


If I change mydaemon.socket to


  [Socket]
  ListenStream=0.0.0.0:

Then peer.sin6_family becomes AF_INET as it should. But if peer is cast to 
struct sockaddr_in then peer.sin_addr still contains gibberish like 2.0.191.150 
(I expected something like 127.0.0.1 or 192.168.0.99).

When I connect from other machines, the peer address still is gibberish.

If mydaemon creates the listening socket, I can easily get the peer address.

I suspect that when systemd creates the listening socket then accept(3,...) 
returns a socket which is connected to a local socket created by systemd.

QUESTION: Is that suspicion correct? And if yes, is there a way to recover 
the address of the actually connecting peer?

Cheers,
Klaus



Re: [systemd-devel] Is there a way to find out if Delegate=yes?

2022-10-27 Thread Yuri Kanivetsky
Arseny Maslennikov, for some reason I didn't receive your email.

Anyways, indeed on the server with --user:

$ systemctl --user show -p Delegate run-rcbb44fb2c7774453b18cda8fe03f0f26.scope
Delegate=yes

But that's just part of the mystery. Locally, what can I do... I can
try and query the scope to which my shell belongs:

$ systemctl --user show -p Delegate session-2.scope
Delegate=no

Or the enclosing slice for the scope on the server (the local slice
that matches the one on the server where the transient scope is
created):

$ systemctl --user show -p Delegate app.slice
Delegate=no

Somehow I don't need systemd-run for lxc-start and lxc-attach locally.
Any ideas?

On Mon, Oct 24, 2022 at 6:07 PM Yuri Kanivetsky
 wrote:
>
> Hi,
>
> I'm experimenting with LXC containers:
>
> https://linuxcontainers.org/lxc/getting-started/
>
> And there's a command I don't fully understand:
>
> systemd-run --unit=my-unit --user --scope -p "Delegate=yes" --
> lxc-start my-container
>
> It runs lxc-start in a transient user scope with Delegate=yes, but:
>
> $ systemctl show -p Delegate run-scope
> Delegate=no
>
> That's on an Ubuntu server. Locally on Arch Linux I don't need
> systemd-run, lxc-start just works.
>
> How can I see the effect of systemd-run, and why is systemd-run not
> needed on Arch Linux?