Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-14 Thread Harald Dunkel

On 8/13/20 11:03 AM, Lennart Poettering wrote:
> Is it possible the container and the host run in the very same cgroup
> hierarchy?
>
> If that's the case (and it looks like it): this is not
> supported. Please file a bug against LXC, it's very clearly broken.

FYI: https://github.com/lxc/lxc/issues/3520

Regards
Harri
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-14 Thread Lennart Poettering
On Fri, 14.08.20 06:42, Harald Dunkel (harald.dun...@aixigo.com) wrote:

> On 8/13/20 11:07 AM, Lennart Poettering wrote:
> >
> > No! It's a bug. Not in systemd, but LXC. But generating errors in such
> > a borked setup is *good*, not bad, and certainly nothing to hide.
> >
>
> Surely it's not a bug in systemd, but ignoring unreasonable data (maybe with
> a warning, if necessary) has a proud tradition in computing. Not to mention
> that systemd ignores duplicate PIDs in the same context as well, AFAICS.
>
> Ignoring PID == 0 wouldn't be unreasonable, regardless of whose bug this is.

Well, we have to agree to disagree. We won't hide errors like this one
in systemd. We

Also, why even? If we ignored this specific error, nothing would be
fixed for you, because the host systemd instance and the one in the
container would still conflict in their cgroup use all the time: they
would remove each other's cgroups at inconvenient times, and would not
be able to detect correctly when a service has stopped, or get notified
when it shuts down, because the cgroups they care about won't be empty
when they are supposed to be empty.
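
A minimal sketch of the "is this cgroup empty?" check, assuming the unified (cgroup v2) hierarchy, where each cgroup exposes a cgroup.events file with a "populated" key (that file appears in the directory listing posted in this thread). The helper names are hypothetical and this is not systemd's code; it only illustrates why a foreign process left in the cgroup keeps it from ever looking empty.

```python
def parse_cgroup_events(text):
    """Parse the 'key value' lines of a cgroup.events file into a dict."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            events[parts[0]] = parts[1]
    return events


def is_populated(text):
    """True if the cgroup (or any descendant) still contains processes."""
    return parse_cgroup_events(text).get("populated") == "1"


# A service whose own processes have all exited, but whose cgroup still
# holds a process from another manager, keeps reporting "populated 1".
print(is_populated("populated 1\nfrozen 0\n"))
print(is_populated("populated 0\nfrozen 0\n"))
```

In a real check the text would come from reading the cgroup.events file of the service's cgroup directory.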

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Harald Dunkel

On 8/13/20 11:07 AM, Lennart Poettering wrote:
> No! It's a bug. Not in systemd, but LXC. But generating errors in such
> a borked setup is *good*, not bad, and certainly nothing to hide.

Surely it's not a bug in systemd, but ignoring unreasonable data (maybe with
a warning, if necessary) has a proud tradition in computing. Not to mention
that systemd ignores duplicate PIDs in the same context as well, AFAICS.

Ignoring PID == 0 wouldn't be unreasonable, regardless of whose bug this is.


Just my $0.02. Regards
Harri


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Lennart Poettering
On Thu, 13.08.20 09:20, Michael Biebl (mbi...@gmail.com) wrote:

> > >>
> > >> kernel returns "0" as process number in this cgroup which results in EIO
> > >> returned by systemd.
> > >
> >
> > systemd should really clearly log this (invalid PID and in which
> > cgroup it was). Returning generic error message without any indication
> > what caused this error is not useful at all.
>
> I agree. Could you file a github issue for this?

Please file a bug against LXC instead. They need to set up the
environment right.

https://systemd.io/CONTAINER_INTERFACE

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Lennart Poettering
On Thu, 13.08.20 09:17, Harald Dunkel (harald.dun...@aixigo.com) wrote:

> On 8/13/20 9:05 AM, Andrei Borzenkov wrote:
> >
> > systemd should really clearly log this (invalid PID and in which
> > cgroup it was). Returning generic error message without any indication
> > what caused this error is not useful at all.
>
> Do you think it would be reasonable to silently ignore the PID = 0
> in cg_read_pid() and maybe others?

No! It's a bug. Not in systemd, but LXC. But generating errors in such
a borked setup is *good*, not bad, and certainly nothing to hide.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Lennart Poettering
On Thu, 13.08.20 10:05, Andrei Borzenkov (arvidj...@gmail.com) wrote:

> 13.08.2020 09:54, Harald Dunkel wrote:
> > On 8/12/20 2:16 PM, Andrei Borzenkov wrote:
> >> 12.08.2020 14:03, Harald Dunkel wrote:
> >>> See attachment. Hope this helps
> >>> Harri
> >>
> >>
> >>> 1 openat(AT_FDCWD,
> >>> "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs",
> >>> O_RDONLY|O_CLOEXEC) = 24
> >>> 1 read(24, "0\n1544456\n", 4096)    = 10
> >>
> >>
> >> kernel returns "0" as process number in this cgroup which results in EIO
> >> returned by systemd.
> >
>
> systemd should really clearly log this (invalid PID and in which
> cgroup it was). Returning generic error message without any indication
> what caused this error is not useful at all.

I think it's necessary that we log, but I am not sure we have to have
a clear error message ready for every kind of impossible error. I'd
just leave this as is. The setup that causes this is seriously broken,
and I am not convinced we need to handle that gracefully or help the
user with a clear message.

You cannot run multiple containers in the same cgroup hierarchy,
that's against the delegation model of cgroups. Containers need a
clearly delegated subtree of cgroup hierarchy, and running everything
in the same top-level tree is just entirely bogus. This is very
clearly documented.

It's an LXC bug, that's all. And yes, it causes weird error messages
in systemd, but that's because the setup is just so broken, and as
long as you do get *some* error messages I think we are good.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Lennart Poettering
On Thu, 13.08.20 08:54, Harald Dunkel (harald.dun...@aixigo.com) wrote:

> On 8/12/20 2:16 PM, Andrei Borzenkov wrote:
> > 12.08.2020 14:03, Harald Dunkel wrote:
> > > See attachment. Hope this helps
> > > Harri
> >
> >
> > > 1 openat(AT_FDCWD, 
> > > "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs", 
> > > O_RDONLY|O_CLOEXEC) = 24
> > > 1 read(24, "0\n1544456\n", 4096)= 10
> >
> >
> > kernel returns "0" as process number in this cgroup which results in EIO
> > returned by systemd.
>
> Indeed. This is kernel 4.19.132-1, as provided by Debian 10. Upgrading
> to kernel 5.6.14-2~bpo10+1 and lxc 4.0.4 doesn't help, same problem.
>
> And now it's getting weird: I found a few ghost services in some LXC
> containers with *just* zeros, e.g. the zabbix-agent:
>
> # cat /sys/fs/cgroup/unified/system.slice/zabbix-agent.service/cgroup.procs
> 0
> 0
> 0
> 0
> 0
> 0
>
> Zabbix-agent isn't even installed in the container. It's installed on
> the host system only.

Is it possible the container and the host run in the very same cgroup
hierarchy?

If that's the case (and it looks like it): this is not
supported. Please file a bug against LXC, it's very clearly broken.
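
To gauge how widespread the symptom is, one could scan a cgroup tree for cgroup.procs files that list PID 0. The following is a minimal sketch with a hypothetical helper name; the root path in the usage line is the one seen in this thread, and the assumption that every "0" entry marks a process that does not belong to this container is exactly that, an assumption.

```python
import os


def zero_pid_entries(root):
    """Return {path: count} for cgroup.procs files under root listing PID 0."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if "cgroup.procs" not in filenames:
            continue
        path = os.path.join(dirpath, "cgroup.procs")
        try:
            with open(path) as f:
                zeros = sum(1 for line in f if line.strip() == "0")
        except OSError:
            continue  # cgroup vanished or is unreadable; skip it
        if zeros:
            found[path] = zeros
    return found


if __name__ == "__main__":
    # Path taken from the thread; adjust for the system being inspected.
    for path, count in sorted(zero_pid_entries("/sys/fs/cgroup/unified").items()):
        print(count, path)
```

Run inside the container, any hit points at a cgroup that the container's systemd believes it owns but that contains processes it cannot address by PID.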

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Lennart Poettering
On Thu, 13.08.20 10:05, Andrei Borzenkov (arvidj...@gmail.com) wrote:

> >>> 1 openat(AT_FDCWD,
> >>> "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs",
> >>> O_RDONLY|O_CLOEXEC) = 24
> >>> 1 read(24, "0\n1544456\n", 4096)    = 10
> >>
> >>
> >> kernel returns "0" as process number in this cgroup which results in EIO
> >> returned by systemd.
> >
>
> systemd should really clearly log this (invalid PID and in which
> cgroup it was). Returning generic error message without any indication
> what caused this error is not useful at all.

Well, this is an impossible error. We generally trust the kernel to
return valid data. If it doesn't we are fucked, and I am not sure we
have to cater for all such cases. Without kernel behaving correctly we
cannot reasonably operate.

And there *is* logging about this: client side, i.e. the message that
this whole thread was started about.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Michael Biebl
On Thu, 13 Aug 2020 at 09:05, Andrei Borzenkov wrote:
>
> 13.08.2020 09:54, Harald Dunkel wrote:
> > On 8/12/20 2:16 PM, Andrei Borzenkov wrote:
> >> 12.08.2020 14:03, Harald Dunkel wrote:
> >>> See attachment. Hope this helps
> >>> Harri
> >>
> >>
> >>> 1 openat(AT_FDCWD,
> >>> "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs",
> >>> O_RDONLY|O_CLOEXEC) = 24
> >>> 1 read(24, "0\n1544456\n", 4096)= 10
> >>
> >>
> >> kernel returns "0" as process number in this cgroup which results in EIO
> >> returned by systemd.
> >
>
> systemd should really clearly log this (invalid PID and in which
> cgroup it was). Returning generic error message without any indication
> what caused this error is not useful at all.

I agree. Could you file a github issue for this?


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Harald Dunkel

On 8/13/20 9:05 AM, Andrei Borzenkov wrote:
> systemd should really clearly log this (invalid PID and in which
> cgroup it was). Returning generic error message without any indication
> what caused this error is not useful at all.

Do you think it would be reasonable to silently ignore the PID = 0
in cg_read_pid() and maybe others?


Regards
Harri


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Andrei Borzenkov
13.08.2020 09:54, Harald Dunkel wrote:
> On 8/12/20 2:16 PM, Andrei Borzenkov wrote:
>> 12.08.2020 14:03, Harald Dunkel wrote:
>>> See attachment. Hope this helps
>>> Harri
>>
>>
>>> 1 openat(AT_FDCWD,
>>> "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs",
>>> O_RDONLY|O_CLOEXEC) = 24
>>> 1 read(24, "0\n1544456\n", 4096)    = 10
>>
>>
>> kernel returns "0" as process number in this cgroup which results in EIO
>> returned by systemd.
> 

systemd should really clearly log this (invalid PID and in which
cgroup it was). Returning generic error message without any indication
what caused this error is not useful at all.


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-13 Thread Harald Dunkel

On 8/12/20 2:16 PM, Andrei Borzenkov wrote:
> 12.08.2020 14:03, Harald Dunkel wrote:
> > See attachment. Hope this helps
> > Harri
>
> > 1 openat(AT_FDCWD,
> > "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs",
> > O_RDONLY|O_CLOEXEC) = 24
> > 1 read(24, "0\n1544456\n", 4096) = 10
>
> kernel returns "0" as process number in this cgroup which results in EIO
> returned by systemd.


Indeed. This is kernel 4.19.132-1, as provided by Debian 10. Upgrading
to kernel 5.6.14-2~bpo10+1 and lxc 4.0.4 doesn't help, same problem.

And now it's getting weird: I found a few ghost services in some LXC
containers with *just* zeros, e.g. the zabbix-agent:

# cat /sys/fs/cgroup/unified/system.slice/zabbix-agent.service/cgroup.procs
0
0
0
0
0
0

Zabbix-agent isn't even installed in the container. It's installed on
the host system only.

I will check on the LXC mailing list. Maybe somebody is able to
reproduce this problem.


I highly appreciate your support on this
Harri


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-12 Thread Andrei Borzenkov
12.08.2020 14:03, Harald Dunkel wrote:
> See attachment. Hope this helps
> Harri


> 1 openat(AT_FDCWD, 
> "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs", 
> O_RDONLY|O_CLOEXEC) = 24
> 1 read(24, "0\n1544456\n", 4096)= 10


kernel returns "0" as process number in this cgroup which results in EIO
returned by systemd.


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-12 Thread Lennart Poettering
On Wed, 12.08.20 13:03, Harald Dunkel (harald.dun...@aixigo.com) wrote:

> 1 getuid()  = 0
> 1 kill(1544456, SIGHUP) = 0

So, first, systemd tries and succeeds to kill the main process of your
service with SIGHUP. So far so good.

> 1 openat(AT_FDCWD, 
> "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs", 
> O_RDONLY|O_CLOEXEC) = 24
> 1 fstat(24, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> 1 read(24, "0\n1544456\n", 4096)= 10
> 1 close(24) = 0

It then goes on, and tries to enumerate the other processes in the
cgroup of the service to kill them too, if their PID is not equal to
the main PID it already killed above. And this is where it gets weird:
the "cgroup.procs" file contains two PIDs. The second one is the main
PID again, as expected. But the first one is "0". That is not a valid
PID, obviously. This is either a kernel bug, or you joined a foreign
process in there from the outside? Could also be some LXC
weirdness.

systemd sees the PID "0", parses it, notices it is invalid, and
propagates EIO, since we read borked data.
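
The sequence described above (read cgroup.procs, parse each entry, reject anything that is not a valid PID) can be sketched roughly as follows. This is an illustration in Python with hypothetical helper names, not systemd's actual C code; the only behaviour taken from the thread is that a PID of "0" is treated as invalid and reported as EIO.

```python
import errno


def parse_cgroup_procs(text):
    """Parse the contents of a cgroup.procs file into a list of PIDs.

    Raises OSError(EIO) for any entry that is not a positive integer.
    """
    pids = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            pid = int(line)
        except ValueError:
            raise OSError(errno.EIO, "unparsable PID entry: %r" % line)
        if pid <= 0:
            raise OSError(errno.EIO, "invalid PID %d in cgroup.procs" % pid)
        pids.append(pid)
    return pids


def other_pids(text, main_pid):
    """PIDs in the cgroup other than the main PID that was already signalled."""
    return [pid for pid in parse_cgroup_procs(text) if pid != main_pid]


# The data from the strace output in this thread fails on its first entry.
try:
    other_pids("0\n1544456\n", 1544456)
except OSError as e:
    print(errno.errorcode[e.errno])
```

With the data from the trace, the first entry already fails the check, so the remaining entries are never looked at and the caller only sees the generic I/O error.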

I am not sure why LXC should insert random processes into random
subtrees of our cgroup tree. If it does that, this would really be a
bug...

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-12 Thread Harald Dunkel

On 8/12/20 1:03 PM, Harald Dunkel wrote:
> See attachment. Hope this helps
> Harri


PS:

# ls -al /sys/fs/cgroup/unified/system.slice/rsyslog.service
total 0
drwxr-xr-x  2 root root 0 Jun 20 17:40 .
drwxr-xr-x 53 root root 0 Aug 12 13:30 ..
-r--r--r--  1 root root 0 Aug 12 13:05 cgroup.controllers
-r--r--r--  1 root root 0 Aug 12 13:05 cgroup.events
-rw-r--r--  1 root root 0 Aug 12 13:05 cgroup.max.depth
-rw-r--r--  1 root root 0 Aug 12 13:05 cgroup.max.descendants
-rw-r--r--  1 root root 0 Aug 12 13:05 cgroup.procs
-r--r--r--  1 root root 0 Aug 12 13:05 cgroup.stat
-rw-r--r--  1 root root 0 Aug 12 13:05 cgroup.subtree_control
-rw-r--r--  1 root root 0 Aug 12 13:05 cgroup.threads
-rw-r--r--  1 root root 0 Aug 12 13:05 cgroup.type
-r--r--r--  1 root root 0 Aug 12 13:05 cpu.stat


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-12 Thread Harald Dunkel

See attachment. Hope this helps
Harri
1 epoll_wait(4, [{EPOLLIN, {u32=3589379376, u64=94720503158064}}], 36, -1) = 1
1 clock_gettime(CLOCK_BOOTTIME, {tv_sec=4562316, tv_nsec=425983895}) = 0
1 recvmsg(29, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="WATCHDOG=1\n", iov_len=4096}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS, cmsg_data={pid=75, uid=0, gid=0}}], msg_controllen=32, msg_flags=MSG_CMSG_CLOEXEC}, MSG_TRUNC|MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 11
1 openat(AT_FDCWD, "/proc/75/cgroup", O_RDONLY|O_CLOEXEC) = 23
1 fstat(23, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
1 read(23, "13:name=systemd:/system.slice/in"..., 1024) = 290
1 read(23, "", 1024) = 0
1 close(23) = 0
1 timerfd_settime(25, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=4562321, tv_nsec=67759}}, NULL) = 0
1 epoll_wait(4, [{EPOLLIN, {u32=3588589904, u64=94720502368592}}], 36, -1) = 1
1 clock_gettime(CLOCK_BOOTTIME, {tv_sec=4562316, tv_nsec=795336025}) = 0
1 accept4(17, NULL, NULL, SOCK_CLOEXEC|SOCK_NONBLOCK) = 23
1 getsockopt(23, SOL_SOCKET, SO_PEERCRED, {pid=1613131, uid=0, gid=0}, [12]) = 0
1 fcntl(23, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
1 fcntl(23, F_GETFD) = 0x1 (flags FD_CLOEXEC)
1 fstat(23, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
1 getsockopt(23, SOL_SOCKET, SO_RCVBUF, [212992], [4]) = 0
1 setsockopt(23, SOL_SOCKET, SO_RCVBUF, [8388608], 4) = 0
1 getsockopt(23, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
1 setsockopt(23, SOL_SOCKET, SO_SNDBUF, [8388608], 4) = 0
1 getsockopt(23, SOL_SOCKET, SO_PEERCRED, {pid=1613131, uid=0, gid=0}, [12]) = 0
1 getsockopt(23, SOL_SOCKET, SO_PEERSEC, 0x5625d5f8beb0, [64]) = -1 ENOPROTOOPT (Protocol not available)
1 getsockopt(23, SOL_SOCKET, SO_PEERGROUPS, [0], [256->4]) = 0
1 fstat(23, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
1 getsockopt(23, SOL_SOCKET, SO_ACCEPTCONN, [0], [4]) = 0
1 getsockname(23, {sa_family=AF_UNIX, sun_path="/run/systemd/private"}, [128->23]) = 0
1 recvmsg(23, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\0AUTH EXTERNAL 30\r\nNEGOTIATE_UNI"..., iov_len=256}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 45
1 epoll_ctl(4, EPOLL_CTL_ADD, 23, {0, {u32=3589061440, u64=94720502840128}}) = 0
1 epoll_ctl(4, EPOLL_CTL_MOD, 23, {EPOLLIN|EPOLLOUT, {u32=3589061440, u64=94720502840128}}) = 0
1 epoll_wait(4, [{EPOLLOUT, {u32=3589061440, u64=94720502840128}}], 39, -1) = 1
1 clock_gettime(CLOCK_BOOTTIME, {tv_sec=4562316, tv_nsec=801992152}) = 0
1 sendmsg(23, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="OK 7afb570fdfeb4f7584b0d7b5c98ae"..., iov_len=52}, {iov_base=NULL, iov_len=0}, {iov_base=NULL, iov_len=0}], msg_iovlen=3, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 52
1 epoll_ctl(4, EPOLL_CTL_MOD, 23, {EPOLLIN, {u32=3589061440, u64=94720502840128}}) = 0
1 epoll_wait(4, [{EPOLLIN, {u32=3589061440, u64=94720502840128}}], 39, -1) = 1
1 clock_gettime(CLOCK_BOOTTIME, {tv_sec=4562316, tv_nsec=802457504}) = 0
1 epoll_wait(4, [{EPOLLIN, {u32=3589061440, u64=94720502840128}}], 39, -1) = 1
1 clock_gettime(CLOCK_BOOTTIME, {tv_sec=4562316, tv_nsec=802669666}) = 0
1 recvmsg(23, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\1\4\1 \0\0\0\1\0\0\0\241\0\0\0\1\1o\0\31\0\0\0", iov_len=24}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 24
1 recvmsg(23, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="/org/freedesktop/systemd1\0\0\0\0\0\0\0"..., iov_len=192}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 192
1 getuid() = 0
1 kill(1544456, SIGHUP) = 0
1 openat(AT_FDCWD, "/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs", O_RDONLY|O_CLOEXEC) = 24
1 fstat(24, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
1 read(24, "0\n1544456\n", 4096) = 10
1 close(24) = 0
1 openat(AT_FDCWD, "/sys/fs/cgroup/unified/system.slice/rsyslog.service", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 24
1 fstat(24, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
1 getdents64(24, /* 12 entries */, 32768) = 432
1 getdents64(24, /* 0 entries */, 32768) = 0
1 close(24) = 0
1 sendmsg(23, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="l\3\1\1\27\0\0\0\1\0\0\0g\0\0\0\5\1u\0\1\0\0\0\4\1s\0\"\0\0\0"..., iov_len=120}, {iov_base="\22\0\0\0Input/output error\0", iov_len=23}], msg_iovlen=2, msg_controllen=0, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 143
1 epoll_wait(4, [{EPOLLIN|EPOLLHUP, {u32=3589061440, u64=94720502840128}}], 39, -1) = 1
1 clock_gettime(CLOCK_BOOTTIME, 

Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-12 Thread Andrei Borzenkov
12.08.2020 13:04, Harald Dunkel wrote:
> On 8/12/20 10:32 AM, Ulrich Windl wrote:
>>
>> As you found out the details already, maybe you could have added some
>> strace
>> output, especially after the kill() is returning...
>>
> 
> See attachment. Hope this helps

Not really. We already know that systemctl gets an error from systemd. You
need to strace systemd itself while it performs the requested operation.


Re: [systemd-devel] Antw: [EXT] I/O error on "systemctl kill -s HUP rsyslog.service"

2020-08-12 Thread Harald Dunkel

On 8/12/20 10:32 AM, Ulrich Windl wrote:
> As you found out the details already, maybe you could have added some strace
> output, especially after the kill() is returning...

See attachment. Hope this helps
Harri
44504 execve("/bin/systemctl", ["systemctl", "kill", "-s", "HUP", "rsyslog.service"], 0x7ffcae899d08 /* 25 vars */) = 0
44504 brk(NULL) = 0x55dbbdeff000
44504 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/tls/haswell/avx512_1/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/tls/haswell/avx512_1/x86_64", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/tls/haswell/avx512_1/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/tls/haswell/avx512_1", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/tls/haswell/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/tls/haswell/x86_64", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/tls/haswell/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/tls/haswell", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/tls/avx512_1/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/tls/avx512_1/x86_64", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/tls/avx512_1/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/tls/avx512_1", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/tls/x86_64", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/tls", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/haswell/avx512_1/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/haswell/avx512_1/x86_64", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/haswell/avx512_1/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/haswell/avx512_1", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/haswell/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/haswell/x86_64", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/haswell/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/haswell", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/avx512_1/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/avx512_1/x86_64", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/avx512_1/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/avx512_1", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd/x86_64", 0x7ffde718d280) = -1 ENOENT (No such file or directory)
44504 openat(AT_FDCWD, "/lib/systemd/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
44504 stat("/lib/systemd", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0
44504 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
44504 fstat(3, {st_mode=S_IFREG|0644, st_size=43597, ...}) = 0
44504 mmap(NULL, 43597, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fa75e41b000
44504 close(3) = 0
44504 mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa75e419000
44504 openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
44504 read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260A\2\0\0\0\0\0"..., 832) = 832
44504 fstat(3, {st_mode=S_IFREG|0755, st_size=1824496, ...}) = 0
44504 mmap(NULL, 1837056, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fa75e258000
44504 mprotect(0x7fa75e27a000, 1658880, PROT_NONE) = 0
44504 mmap(0x7fa75e27a000, 1343488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7fa75e27a000
44504 mmap(0x7fa75e3c2000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16a000) = 0x7fa75e3c2000
44504