Re: [systemd-devel] systemd's connections to /run/systemd/private ?

2019-07-10 Thread Zbigniew Jędrzejewski-Szmek
On Wed, Jul 10, 2019 at 09:51:36AM -0400, Brian Reichert wrote:
> On Wed, Jul 10, 2019 at 07:37:19AM +, Zbigniew Jędrzejewski-Szmek wrote:
> 
> > It's a bug report like any other. Writing a meaningful reply takes time
> > and effort. Lack of time is a much better explanation than ressentiments.
> 
> I wasn't expressing resentment; I apologize if it came off that way.
> 
> > Please always specify the systemd version in use. We're not all SLES
> > users, and even if we were, I assume that there might be different
> > package versions over time.
> 
> Quite reasonable:
> 
>   localhost:/var/tmp # cat /etc/os-release
>   NAME="SLES"
>   VERSION="12-SP3"
>   VERSION_ID="12.3"
>   PRETTY_NAME="SUSE Linux Enterprise Server 12 SP3"
>   ID="sles"
>   ANSI_COLOR="0;32"
>   CPE_NAME="cpe:/o:suse:sles:12:sp3"
> 
>   localhost:/var/tmp # rpm -q systemd
>   systemd-228-142.1.x86_64

That's ancient... 228 was released almost four years ago.

> > > When we first spin up a new SLES12 host with our custom services,
> > > the number of connections to /run/systemd/private numbers in the
> > > mere hundreds. 
> 
> > That sounds wrong already. Please figure out what those connections
> > are. I'm afraid that you might have to do some debugging on your
> > own, since this issue doesn't seem easily reproducible.
> 
> What tactics should I employ?  All of those file handles to
> /run/systemd/private are owned by PID 1, and 'ss' implies there are
> no peers.
> 
> 'strace' on PID 1 shows messages are flowing, but that doesn't reveal
> the logic about how the connections get created or culled, nor who
> initiated them.

strace -p1 -e recvmsg,close,accept4,getsockname,getsockopt,sendmsg -s999

yields the relevant info. In particular, the pid, uid, and gid of the
remote are shown. My approach would be to log this to some file,
then see which fds remain, and look each of them up in the log.
The recvmsg calls contain the serialized dbus calls, a bit messy but
understandable. E.g. 'systemctl show systemd-udevd' gives something
like this:

recvmsg(20, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\4\1\5\0\0\0\1\0\0\0\257\0\0\0\1\1o\08\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
recvmsg(20, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="/org/freedesktop/systemd1/unit/systemd_2dudevd_2eservice\0\0\0\0\0\0\0\0\3\1s\0\6\0\0\0GetAll\0\0\2\1s\0\37\0\0\0org.freedesktop.DBus.Properties\0\6\1s\0\30\0\0\0org.freedesktop.systemd1\0\0\0\0\0\0\0\0\10\1g\0\1s\0\0\0\0\0\0\0", ...
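
The log-and-look-up step described above could be sketched roughly as follows. This is only an illustration: it assumes strace was run with -o so that its output went to a file (the path in the comment is hypothetical), that the peer credentials appear on getsockopt(..., SO_PEERCRED, ...) lines in the form shown in the comment, and the function name peers_for_open_fds is made up for this example.

```shell
# peers_for_open_fds LOG FD...
# For each fd number given, print the most recent SO_PEERCRED lookup that
# strace logged for it. fd numbers get reused over time, so only the last
# match for a number is relevant to a connection that is still open.
# Assumed log line format (an assumption, check it against your own log):
#   getsockopt(20, SOL_SOCKET, SO_PEERCRED, {pid=..., uid=..., gid=...}, [12]) = 0
peers_for_open_fds() {
    log=$1
    shift
    for fd in "$@"; do
        grep -E "^getsockopt\($fd, SOL_SOCKET, SO_PEERCRED" "$log" | tail -n 1
    done
}

# Possible use (as root; log path is hypothetical):
#   strace -p1 -e recvmsg,close,accept4,getsockname,getsockopt,sendmsg \
#       -s999 -o /var/tmp/pid1.strace
#   peers_for_open_fds /var/tmp/pid1.strace 59 60 63
```

The pid printed for a lingering fd can then be matched against process accounting or other logs, since the peer itself has most likely exited by the time you look.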

HTH,
Zbyszek
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

Re: [systemd-devel] systemd's connections to /run/systemd/private ?

2019-07-10 Thread Brian Reichert
On Wed, Jul 10, 2019 at 07:37:19AM +, Zbigniew Jędrzejewski-Szmek wrote:

> It's a bug report like any other. Writing a meaningful reply takes time
> and effort. Lack of time is a much better explanation than ressentiments.

I wasn't expressing resentment; I apologize if it came off that way.

> Please always specify the systemd version in use. We're not all SLES
> users, and even if we were, I assume that there might be different
> package versions over time.

Quite reasonable:

  localhost:/var/tmp # cat /etc/os-release
  NAME="SLES"
  VERSION="12-SP3"
  VERSION_ID="12.3"
  PRETTY_NAME="SUSE Linux Enterprise Server 12 SP3"
  ID="sles"
  ANSI_COLOR="0;32"
  CPE_NAME="cpe:/o:suse:sles:12:sp3"

  localhost:/var/tmp # rpm -q systemd
  systemd-228-142.1.x86_64

> > When we first spin up a new SLES12 host with our custom services,
> > the number of connections to /run/systemd/private numbers in the
> > mere hundreds. 

> That sounds wrong already. Please figure out what those connections
> are. I'm afraid that you might have to do some debugging on your
> own, since this issue doesn't seem easily reproducible.

What tactics should I employ?  All of those file handles to
/run/systemd/private are owned by PID 1, and 'ss' implies there are
no peers.

'strace' on PID 1 shows messages are flowing, but that doesn't reveal
the logic about how the connections get created or culled, nor who
initiated them.

On a box with ~500 of these file handles, I can see that many of
them are hours or days old:

  localhost:/var/tmp # date
  Wed Jul 10 09:45:01 EDT 2019

  # new ones
  localhost:/var/tmp # lsof -nP /run/systemd/private |
      awk '/systemd/ { sub(/u/, "", $4); print $4 }' |
      ( cd /proc/1/fd; xargs ls -t --full-time ) | head -5
  lrwx------ 1 root root 64 2019-07-10 09:45:05.211722809 -0400 561 -> socket:[1183838]
  lrwx------ 1 root root 64 2019-07-10 09:40:02.611726025 -0400 559 -> socket:[1173429]
  lrwx------ 1 root root 64 2019-07-10 09:40:02.611726025 -0400 560 -> socket:[1176265]
  lrwx------ 1 root root 64 2019-07-10 09:33:10.687730403 -0400 100 -> socket:[113992]
  lrwx------ 1 root root 64 2019-07-10 09:33:10.687730403 -0400 101 -> socket:[115163]
  xargs: ls: terminated by signal 13

  # old ones
  localhost:/var/tmp # lsof -nP /run/systemd/private |
      awk '/systemd/ { sub(/u/, "", $4); print $4 }' |
      ( cd /proc/1/fd; xargs ls -t --full-time ) | tail -5
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 59 -> socket:[43097]
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 60 -> socket:[44029]
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 63 -> socket:[46234]
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 65 -> socket:[49252]
  lrwx------ 1 root root 64 2019-07-08 15:12:04.725350882 -0400 71 -> socket:[54064]
  
> > Is my guess about CONNECTIONS_MAX's relationship to /run/systemd/private
> > correct?
> 
> Yes. The number is hardcoded because it's expected to be "large
> enough". The connection count shouldn't be more than "a few" or maybe
> a dozen at any time.

Thanks for confirming that.

> Zbyszek

-- 
Brian Reichert  
BSD admin/developer at large

Re: [systemd-devel] systemd-timedated: Not possible to set time zone that is a symlink!

2019-07-10 Thread Lennart Poettering
On Fri, 05.07.19 21:41, Christopher Wong (christopher.w...@axis.com) wrote:

> Hi,
>
>
> systemd-timedated doesn't allow a tz-file under /usr/share/zoneinfo
> to be a symlink. Is that for security reasons?

Hmm, I don't think we care whether it is a symlink or not. Where does
your symlink point to though?

Note that we turn on a sandbox for systemd-timedated though, which
limits access to /usr and /etc basically... (and turns off mount
propagation for those dirs). Maybe that's tripping you up, because
your symlink's destination is on a mount established later on, or in /home?

> I am asking because our system mounts /usr/share/zoneinfo as
> read-only and for legacy reasons we need to support the user being
> able to change the TZ string in a tz-file. Installing a symlink that
> points to such a tz-file will allow us to use the systemd-timedated
> interface to set time zone. The changeable tz-file (located at
> /etc/...) can be altered by root and a specific service. Do you see
> any potential risk by doing so?

Consider turning off the sandboxing features, i.e. add a drop-in that
turns off ProtectSystem=, ProtectHome= and suchlike.
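
A drop-in along those lines might be created as sketched below. This is only an illustration under stated assumptions: the file name override.conf and the function name write_timedated_dropin are made up for the example, and only the two settings named above are switched off; the unit shipped on a given system may carry further sandboxing options ("and suchlike") that would need the same treatment.

```shell
# write_timedated_dropin UNIT_DIR
# Creates UNIT_DIR/systemd-timedated.service.d/override.conf, which turns
# off the two sandboxing settings named in the text. The file name
# "override.conf" is an arbitrary choice; any *.conf name in that directory
# is read as a drop-in.
write_timedated_dropin() {
    dropin_dir="$1/systemd-timedated.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
ProtectSystem=no
ProtectHome=no
EOF
}

# Possible use on a real system (as root):
#   write_timedated_dropin /etc/systemd/system
#   systemctl daemon-reload
#   systemctl restart systemd-timedated.service
```

Taking the target directory as a parameter makes it possible to inspect the generated file in a scratch directory before installing it under /etc/systemd/system.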

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] Nonstandard port for systemd-resolved DNS forwarded queries

2019-07-10 Thread Lennart Poettering
On Thu, 04.07.19 01:04, jimc (j...@jfcarter.net) wrote:

>
> What I would really like to see, which I'm going to implement in my
> kludge, is syntax in /etc/systemd/resolved.conf where you could say
> something like
> DNS=192.9.200.193#53 (192.9.200.194#4253 192.9.200.195#4253)
> meaning: Try the master, but if it's down use whichever slave is
> responding.  And speculatively retry the master occasionally, reverting
> to it when it's up again.  For me it's important to go through the
> master's dnsmasq on port 53, to get hosts which have names but which
> have non-fixed IPs from the DHCP pool or RFC 4862 addresses.  But of
> course the slaves have no dnsmasq (I wish that were possible).

This is currently not supported. Please file an RFE issue on github,
or even a PR implementing this. (That said, the ':' separator for
ports seems to be more standard, instead of '#')

Lennart

--
Lennart Poettering, Berlin

Re: [systemd-devel] OFFLIST Re: systemd's connections to /run/systemd/private ?

2019-07-10 Thread Zbigniew Jędrzejewski-Szmek
On Tue, Jul 09, 2019 at 11:05:50PM +0300, Mantas Mikulėnas wrote:
> On Tue, Jul 9, 2019 at 4:28 PM Brian Reichert wrote:
> 
> > On Tue, Jul 09, 2019 at 11:21:13AM +0100,
> > systemd-devel@lists.freedesktop.org wrote:
> > > Hi Brian
> > >
> > > I feel embarrassed at having recommended you to join the systemd-devel
> > > list :( I don't understand why nobody is responding to you, and I'm not
> > > qualified to help!
> >
> > I appreciate the private feedback.  I recognize this is an all-volunteer
> > ecosystem, but I'm not used to radio silence. :/
> >
> > > There is a bit of anti-SUSE feeling for some reason
> > > that I don't really understand, but Lennart in particular normally
> > > seems to be very helpful, as does Zbigniew.
> >
> 
> It seems that Lennart tends to process his mailing-list inbox only every
> couple of weeks. He's a bit more active on GitHub however.
> 
> The rest of us are probably either waiting for a dev to make a comment,
> and/or wondering why such massive numbers of `systemctl` are being run on
> your system in the first place.
> >
> > I'm new to this list, so haven't seen any anti-SLES sentiments as
> > of yet.  But, based on the original symptoms I reported, this occurs
> > on many distributions.

It's a bug report like any other. Writing a meaningful reply takes time
and effort. Lack of time is a much better explanation than ressentiments.

Zbyszek

Re: [systemd-devel] systemd's connections to /run/systemd/private ?

2019-07-10 Thread Zbigniew Jędrzejewski-Szmek
On Tue, Jul 02, 2019 at 09:57:44AM -0400, Brian Reichert wrote:
> At $JOB, on some of our SLES12 boxes, our logs are getting swamped
> with messages saying:
> 
>   "Too many concurrent connections, refusing"

Please always specify the systemd version in use. We're not all SLES
users, and even if we were, I assume that there might be different
package versions over time.

>   # ss -x | grep /run/systemd/private | wc -l
>   4015

/run/systemd/private is used by systemctl and other systemd utilities
when running as root. Those connections are expected to be short-lived.
Generally, on a normal machine "ss -x | grep /run/systemd/private | wc -l"
is expected to yield 0 or a very low number transiently.
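
To see whether the count stays near zero or keeps growing, it could be sampled periodically. The following is a minimal sketch, not part of the original advice: the function name count_private and the 60-second interval in the usage comment are arbitrary choices for illustration.

```shell
# count_private: read `ss -x` output on stdin and print the number of
# lines that mention /run/systemd/private.
count_private() {
    # grep -c exits non-zero when the count is 0, although it still prints
    # "0"; "|| true" keeps that case from looking like a failure.
    grep -c -F '/run/systemd/private' || true
}

# Possible use: log a timestamped sample once a minute.
#   while sleep 60; do
#       printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(ss -x | count_private)"
#   done
```

A count that only ever rises between samples would be consistent with connections not being closed, whereas short-lived spikes would point at bursts of systemctl invocations.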

> But, despite the almost 4k connections, 'ss' shows that there are
> no connected peers:
> 
>   # ss -x | grep /run/systemd/private | grep -v -e '* 0' | wc -l
>   0

Interesting. ss output is not documented at all from what I can see,
but indeed '* 0' seems to indicate that. It is possible that systemd
has a leak and is not closing the private bus connections properly.

> When we first spin up a new SLES12 host with our custom services,
> the number of connections to /run/systemd/private numbers in the
> mere hundreds.

That sounds wrong already. Please figure out what those connections
are. I'm afraid that you might have to do some debugging on your
own, since this issue doesn't seem easily reproducible.

(I installed systemd with CONNECTIONS_MAX set to 10, and I can easily
saturate the number of available connections with
  for i in {1..11}; do systemctl status '*' & sleep 0.5; kill -STOP $!;done
As soon as I allow the processes to continue or kill them, the connection
count goes down. They never show up with '* 0'.)
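
The reproduction above can be made a little easier to undo by keeping track of the stopped processes. The following is only a sketch: the function name start_stopped is made up, and like the one-liner it relies on a sleep that accepts fractional seconds.

```shell
# start_stopped COUNT CMD [ARGS...]
# Start COUNT copies of CMD in the background, give each one half a second
# to get going, then freeze it with SIGSTOP. Prints one PID per line so the
# processes can be resumed or killed later.
start_stopped() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        # Output is discarded so the caller can capture just the PIDs.
        "$@" >/dev/null 2>&1 &
        pid=$!
        sleep 0.5
        kill -STOP "$pid"
        echo "$pid"
        i=$((i + 1))
    done
}

# Possible use (as root, against a systemd built with a small CONNECTIONS_MAX):
#   pids=$(start_stopped 11 systemctl status '*')
#   ss -x | grep -c /run/systemd/private    # connections held by the stopped clients
#   kill -CONT $pids                        # let them finish; the count should drop
```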

> Is my guess about CONNECTIONS_MAX's relationship to /run/systemd/private
> correct?

Yes. The number is hardcoded because it's expected to be "large
enough". The connection count shouldn't be more than "a few" or maybe
a dozen at any time.

> I have a hypothesis that this may be some resource leak in systemd,
> but I've not found a way to test that.

Once you figure out what is creating the connection, it would be useful
to attach strace to pid 1 and see what is happening there.

Zbyszek

Re: [systemd-devel] OFFLIST Re: systemd's connections to /run/systemd/private ?

2019-07-10 Thread systemd
On Tue, 9 Jul 2019 15:29:24 -0400
Brian Reichert wrote:
> On Tue, Jul 09, 2019 at 06:20:02PM +0100,
> systemd-devel@lists.freedesktop.org wrote:
> > 
> > Posting private messages to a public list is generally considered
> > very RUDE.  
> 
> I agree, and I apologize.
> 
> The message I received, and replied to, did not come from a private
> email address; it apparently came from the mailing list software,
> and I did not realize that until I hit 'reply':
> 
>   Date: Tue, 9 Jul 2019 11:21:13 +0100
>   From: systemd-devel@lists.freedesktop.org
>   To: Brian Reichert
>   Subject: OFFLIST Re: [systemd-devel] systemd's connections to
>     /run/systemd/private ?

Ah, mea culpa. I apologize to you and to the list for the noise. I had
the wrong setting in my MUA, which is hopefully now fixed.