Re: [systemd-devel] Is there a way to find out if Delegate=yes?

2022-10-30 Thread Yuri Kanivetsky
On Thu, Oct 27, 2022 at 1:40 PM Arseny Maslennikov  wrote:
> It had successfully reached this mailing list by 2022-Oct-25, so that
> means you're not subscribed to the list. Strangely enough,
> the mail receiver rejects emails from non-subscribers, so you wouldn't
> be able to reach out to the list at all.

I'm subscribed, and I did receive your second email. Probably some sort
of glitch. I just decided to let you know, just in case.

> I'll try to explain what I can. I suppose there's someone in the world
> who has really hit the problems described below and is in a better
> position to comment, or provide links to available resources where the
> experience is documented for the perusal of the community.

Thanks for your replies, things are a bit clearer at the moment. And
yeah, I'll probably ask the lxc guys as well. But let me add here what
I've learned so far. In case someone has anything to add.

Locally it works w/o systemd-run, although there's one warning when
running lxc-start (apparently non-fatal):

lxc-start c 20221030073216.345 WARN start -
../src/lxc/start.c:lxc_spawn:1832 - Operation not permitted - Failed
to allocate new network namespace id

On the server w/o systemd-run (these are probably also non-fatal):

lxc-start c 20221030114914.612 WARN apparmor -
lsm/apparmor.c:lsm_apparmor_ops_init:1275 - Per-container AppArmor
profiles are disabled because the mac_admin capability is missing
lxc-start c 20221030114914.626 WARN start - start.c:lxc_spawn:1835
- Operation not permitted - Failed to allocate new network namespace
id

But in the container's console I see:

Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
[!!] Failed to mount API filesystems.
Exiting PID 1...

From what you said it looks like Delegate=yes is not about
permissions, but about not stepping on someone else's toes. Yet on the
server from the console output it looks like it's about permissions.
However that might be a result of stepping on someone else's toes. I'm
not sure.

There's also a related issue. I tried to launch a container locally
from another user (useradd + su), and it failed:

lxc-start c 20221030074222.316 ERROR cgfsng -
../src/lxc/cgroups/cgfsng.c:unpriv_systemd_create_scope:1232 - Failed
to connect to user bus: No medium found
lxc-start c 20221030074222.326 WARN start -
../src/lxc/start.c:lxc_spawn:1832 - Operation not permitted - Failed
to allocate new network namespace id

The console output:

Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
[!!] Failed to mount API filesystems.
Exiting PID 1...

Which somewhat reminds me of what I saw on the server. But when I
tried it with systemd-run (under this other user), systemd-run failed:

Failed to connect to bus: No medium found

More detailed logs can be found here:

https://gist.github.com/x-yuri/a6d31154df07405de97217ba75c1ff0f

Regards,
Yuri


Re: [systemd-devel] Is there a way to find out if Delegate=yes?

2022-10-27 Thread Yuri Kanivetsky
Arseny Maslennikov, for some reason I didn't receive your email.

Anyways, indeed on the server with --user:

$ systemctl --user show -p Delegate run-rcbb44fb2c7774453b18cda8fe03f0f26.scope
Delegate=yes

But that's just part of the mystery. Locally, what can I do... I can
try to query the scope my shell belongs to:

$ systemctl --user show -p Delegate session-2.scope
Delegate=no

Or the enclosing slice for the scope on the server (the local slice
that matches the one on the server where the transient scope is
created):

$ systemctl --user show -p Delegate app.slice
Delegate=no

Somehow I don't need systemd-run for lxc-start and lxc-attach locally.
Any ideas?
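
One thing worth ruling out when a query like the ones above prints
Delegate=no: as far as I can tell, `systemctl show` also prints property
values for a unit name that isn't loaded in the manager being queried
(defaults, with LoadState=not-found). Asking for LoadState together with
Delegate helps tell "loaded and not delegated" apart from "no such unit
here". A minimal sketch, not a definitive check; the function name
delegate_status and its output wording are made up for illustration:

```shell
# Read "Key=Value" lines on stdin, as printed by
#   systemctl [--user] show -p LoadState -p Delegate UNIT
# and report the Delegate value only if the unit is actually loaded.
# (delegate_status is a hypothetical helper, not part of systemd.)
delegate_status() {
    load=
    delegate=
    while IFS='=' read -r key value; do
        case $key in
            LoadState) load=$value ;;
            Delegate)  delegate=$value ;;
        esac
    done
    if [ "$load" = "loaded" ]; then
        printf 'Delegate=%s\n' "$delegate"
    else
        printf 'unit not loaded (LoadState=%s)\n' "$load"
    fi
}
```

It could be used as:
systemctl --user show -p LoadState -p Delegate session-2.scope | delegate_status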

On Mon, Oct 24, 2022 at 6:07 PM Yuri Kanivetsky
 wrote:
>
> Hi,
>
> I'm experimenting with LXC containers:
>
> https://linuxcontainers.org/lxc/getting-started/
>
> And there's a command I don't fully understand:
>
> systemd-run --unit=my-unit --user --scope -p "Delegate=yes" -- lxc-start my-container
>
> It runs lxc-start in a transient user scope with Delegate=yes, but:
>
> $ systemctl show -p Delegate run-scope
> Delegate=no
>
> That's on an Ubuntu server. Locally on Arch Linux I don't need
> systemd-run, lxc-start just works.
>
> How can I see the effect of systemd-run, and why is systemd-run not
> needed on Arch Linux?


[systemd-devel] Is there a way to find out if Delegate=yes?

2022-10-24 Thread Yuri Kanivetsky
Hi,

I'm experimenting with LXC containers:

https://linuxcontainers.org/lxc/getting-started/

And there's a command I don't fully understand:

systemd-run --unit=my-unit --user --scope -p "Delegate=yes" -- lxc-start my-container

It runs lxc-start in a transient user scope with Delegate=yes, but:

$ systemctl show -p Delegate run-scope
Delegate=no

That's on an Ubuntu server. Locally on Arch Linux I don't need
systemd-run, lxc-start just works.

How can I see the effect of systemd-run, and why is systemd-run not
needed on Arch Linux?


Re: [systemd-devel] Can /usr/lib/systemd/user/sockets.target.wants be used to autoenable a socket by a vendor package?

2022-10-10 Thread Yuri Kanivetsky
After experimenting some more, I can see that if a unit has no
[Install] section and is enabled by a symlink in
/usr/lib/systemd/*/*.target.wants, then is-enabled reports "static",
and `systemctl enable` does nothing except explain the situation. Which
makes me think that one should either add an [Install] section or
enable the unit via /usr/lib/systemd/*/*.target.wants, but not both. If
both are done, is-enabled reports "disabled", and the unit can be
enabled a second time (which adds a second route to it). That's
probably not particularly bad, but it is confusing.
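
The cases described above differ in two observable things: whether the
unit file has an [Install] section, and whether a vendor symlink exists
in a *.target.wants directory. A minimal sketch for checking both; the
helper name describe_unit and its output format are made up for
illustration, and this is not how systemctl itself decides:

```shell
# Hypothetical helper: report the two facts that distinguish the cases.
#   $1 = path to the unit file
#   $2 = path to a vendor .wants directory (e.g. .../sockets.target.wants)
describe_unit() {
    unit_file=$1
    wants_dir=$2
    name=${unit_file##*/}

    has_install=no
    # -x: whole line must match, -F: treat "[Install]" as a fixed string.
    if grep -qxF '[Install]' "$unit_file"; then
        has_install=yes
    fi

    has_link=no
    # -L is true for a symlink even if its target is missing.
    if [ -L "$wants_dir/$name" ]; then
        has_link=yes
    fi

    printf 'install-section=%s vendor-wants-link=%s\n' "$has_install" "$has_link"
}
```

For example:
describe_unit /usr/lib/systemd/user/perl-server.socket /usr/lib/systemd/user/sockets.target.wants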

On Tue, Sep 20, 2022 at 11:23 AM Andrei Borzenkov  wrote:
>
> On Tue, Sep 20, 2022 at 10:42 AM Barry  wrote:
> >
> > Enabled does not mean that it will or will not run.
> > It means that it is wanted by the default target.
> >
>
> No. It means that it is wanted by whatever units are listed in
> [Install] section (actually, it is "enabled" even if only aliases are
> created, so more correct is - it is enabled if links mentioned in
> [Install] section are created).


Re: [systemd-devel] Can /usr/lib/systemd/user/sockets.target.wants be used to autoenable a socket by a vendor package?

2022-09-18 Thread Yuri Kanivetsky
> > $ ls -al 
> > /usr/lib/systemd/user/multi-user.target.wants/infinite-tsukuyomi.service
> > lrwxrwxrwx 1 root root 29 Sep 18 08:45
> > /usr/lib/systemd/user/multi-user.target.wants/infinite-tsukuyomi.service
> > -> ../infinite-tsukuyomi.service
> >
> > And rebooted the machine. The service didn't start. But starts
> > manually if I tell it to. Is there anything I'm missing here?
> >
>
> There is no multi-user.target for user systemd instances (nothing
> prevents you from creating one, but it does not exist by default).

Good point. When I symlink it into
/usr/lib/systemd/user/default.target.wants it starts on boot.

Also, I've created a simple perl server:

https://gist.github.com/x-yuri/45f53c16a99337ba0716a988290491bd

And if I put perl-server.socket and perl-server.service into
/usr/lib/systemd/user, and symlink perl-server.socket into
/usr/lib/systemd/user/sockets.target.wants, it autoactivates on boot.

The confusing thing though is:

$ systemctl --user is-enabled perl-server.socket
disabled

Also confusing is the fact that enable/preset/disable create/remove
symlinks in ~/.config/systemd/user/sockets.target.wants, which doesn't
happen with a service (e.g. infinite-tsukuyomi) when the service is in
/usr/lib/systemd/user.

Regards,
Yuri


Re: [systemd-devel] Can /usr/lib/systemd/user/sockets.target.wants be used to autoenable a socket by a vendor package?

2022-09-17 Thread Yuri Kanivetsky
> No, everything linked to a .wants/ directory immediately becomes a 
> Wants= dep of  and is therefore "enabled", it doesn't matter whether 
> that .wants/ is in /etc or /usr/lib or /run.

To confirm this, I created the following files:

$ cat /usr/lib/systemd/user/infinite-tsukuyomi.service
[Unit]
Description=Infinite Tsukuyomi

[Service]
ExecStart=/usr/bin/sleep infinity

$ ls -al 
/usr/lib/systemd/user/multi-user.target.wants/infinite-tsukuyomi.service
lrwxrwxrwx 1 root root 29 Sep 18 08:45
/usr/lib/systemd/user/multi-user.target.wants/infinite-tsukuyomi.service
-> ../infinite-tsukuyomi.service

And rebooted the machine. The service didn't start. But starts
manually if I tell it to. Is there anything I'm missing here?

Regards,
Yuri


[systemd-devel] Can /usr/lib/systemd/user/sockets.target.wants be used to autoenable a socket by a vendor package?

2022-09-17 Thread Yuri Kanivetsky
Hi,

I've noticed that an Arch Linux package (gnupg) seemingly
automatically enables a socket:

ln -s "../dirmngr.socket" "/usr/lib/systemd/user/sockets.target.wants/dirmngr.socket"

https://github.com/archlinux/svntogit-packages/commit/e7a6851881e2cfea37b76cfb16ba97af2fcc

Before the change they were symlinked to /etc/systemd/user/sockets.target.wants.

Later I was told that there's such a thing as preset units (an
undocumented feature?):

https://bbs.archlinux.org/viewtopic.php?pid=2057758#p2057758

The way I understood it, if I put dirmngr.socket at
/usr/lib/systemd/user/sockets.target.wants, it's like adding "enable
dirmngr.service" to the preset policy. In other words, it won't be
enabled by default, and won't be activated on boot unless I do
`systemctl --user preset dirmngr`.

Can you clarify this? Are there preset units? Is my understanding of
how they work correct?

Regards,
Yuri


Re: [systemd-devel] Are logs at /run/log/journal automerged?

2022-08-25 Thread Yuri Kanivetsky
Let me first reply to your answers. Then I'll provide more details.
And a couple of questions at the end.

> > I'm experiencing this on Digital Ocean. The machine id there changes
> > (which I think shouldn't happen) on the first boot (supposedly by
> > cloud-init).
>
> The machine ID may change during the initrd to host-fs
> transition. Otherwise that's not OK though.

Is the transition somehow logged to the journal?

> When the logs from /run/ are flushed to /var/ they are all merged
> together into one.

> By default journalctl will show logs associated with the current
> machine ID and those associated with the current boot ID. The latter
> should usually ensure that logs from the initrd phase are shown as
> well if it has a different machine ID.

That might explain what I experience under CentOS, but not under
Ubuntu. Under Ubuntu I need --merge to see the records from the old
journal.

> > In Ubuntu 22.04 droplets, where logs are stored at
> > /var/log/journal, that leads to journalctl outputting no records
> > (because the log for the new machine-id has not been created), unless
> > I pass --file or --merge. Also, the records continue to be added to
> > the old log (for the old machine id).
> >
> > In CentOS 9 droplets, where logs are stored at /run/log/journal,
> > journalctl outputs records from all 3 files:
> >
> > cb754b7b85bb42d1af6b48e7ca843674/system.journal
> > 61238251e3db916639eaa8cd54998712/system@6600bdad291b419c8a0b1fea2564c472-0001-0005e6d123825866.journal
> > 61238251e3db916639eaa8cd54998712/system.journal
> >
> > In this case records also are being added to the old log. But the new
> > log somehow contains the beginning of the log (starting with boot).
> >
> > Is my guess correct? Logs at /run/log/journal are automerged, logs at
> > /var/log/journal aren't.
>
> As mentioned above, when the logs are flushed from /run/ to /var/ in
> systemd-journal-flush.service they are merged into one new journal
> file, which is located in the machine ID subdir of the actual machine
> ID of the system.

Um, by "automerged" I meant that journalctl doesn't need --merge to
display records from several journals (corresponding to different
machine ids).

The details:

Ubuntu 22.04 (249-11)
---------------------

After creating a droplet (Digital Ocean) it starts with one machine id
(c3a57680c1d26ca313b9c7ec36a5beaa), but after this line in the
journal:

Aug 25 23:32:59 d1 cloud-init[1318]: Initializing machine ID from
D-Bus machine ID.

the machine id changes:

# cat /etc/machine-id
b05137603357003a36d3c69c630806ab

And journalctl starts saying:

No journal files were found.
-- No entries --

Also:

# ls -R /var/log/journal
/var/log/journal:
c3a57680c1d26ca313b9c7ec36a5beaa

/var/log/journal/c3a57680c1d26ca313b9c7ec36a5beaa:
system.journal

I can inspect the journal with one of these commands (the records
continue to be created in the old journal):

# journalctl --file /var/log/journal/c3a57680c1d26ca313b9c7ec36a5beaa/system.journal
# journalctl --merge

Alternatively I can do:

# systemctl restart systemd-journald
# journalctl

Then it creates a new journal, and journalctl displays the new journal
(only the records created after restart).

# ls -R /var/log/journal
/var/log/journal:
b05137603357003a36d3c69c630806ab
c3a57680c1d26ca313b9c7ec36a5beaa

/var/log/journal/b05137603357003a36d3c69c630806ab:
system.journal

/var/log/journal/c3a57680c1d26ca313b9c7ec36a5beaa:
system.journal

# journalctl --merge -u systemd-journald
Aug 25 23:31:32 ubuntu systemd-journald[333]: Journal started
Aug 25 23:31:32 ubuntu systemd-journald[333]: Runtime Journal
(/run/log/journal/c3a57680c1d26ca313b9c7ec36a5beaa) is 600.0K, max
4.6M, 4.0M free.
Aug 25 23:31:32 ubuntu systemd-journald[333]: Time spent on flushing
to /var/log/journal/c3a57680c1d26ca313b9c7ec36a5beaa is 33.441ms for
598 entries.
Aug 25 23:31:32 ubuntu systemd-journald[333]: System Journal
(/var/log/journal/c3a57680c1d26ca313b9c7ec36a5beaa) is 8.0M, max
200.3M, 192.3M free.
Aug 25 23:35:32 d1 systemd-journald[333]: Journal stopped
Aug 25 23:35:33 d1 systemd-journald[2098]: Journal started
Aug 25 23:35:33 d1 systemd-journald[2098]: System Journal
(/var/log/journal/b05137603357003a36d3c69c630806ab) is 8.0M, max
974.0M, 966.0M free.

CentOS 9 (systemd-250.4)
------------------------

The differences: a different starting machine id
(61238251e3db916639eaa8cd54998712) and a slightly different message:

Aug 25 20:20:16 dc cloud-init[872]: Initializing machine ID from VM UUID.

The journal is accessible even though the machine id changed. It's at
/run/log/journal (/var/log/journal doesn't exist):

# ls -R /run/log/journal
/run/log/journal:
48727b98ca35468eb885d68e67ab2fca
61238251e3db916639eaa8cd54998712

/run/log/journal/48727b98ca35468eb885d68e67ab2fca:
system.journal

/run/log/journal/61238251e3db916639eaa8cd54998712:
system@1b641591e26b40b9a2dc994b34c71f97-0001-0005e719dec62219.journal
system.journal

`journalctl` displays data from all 3 files.
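
Since journalctl by default selects logs by the current machine ID (and
the current boot ID), a quick way to spot journals left behind under an
old ID is to compare the directory names with /etc/machine-id. A
minimal sketch; the function name foreign_journal_dirs is made up for
illustration:

```shell
# List journal subdirectories under $1 whose name differs from the
# machine id given in $2 (normally the contents of /etc/machine-id).
# (foreign_journal_dirs is a hypothetical helper.)
foreign_journal_dirs() {
    root=$1
    current=$2
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        id=${dir%/}                    # drop the trailing slash
        id=${id##*/}                   # keep the last path component
        [ "$id" = "$current" ] || printf '%s\n' "$id"
    done
}
```

For example:
foreign_journal_dirs /var/log/journal "$(cat /etc/machine-id)"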

[systemd-devel] Are logs at /run/log/journal automerged?

2022-08-22 Thread Yuri Kanivetsky
Hi,

I'm experiencing this on Digital Ocean. The machine id there changes
(which I think shouldn't happen) on the first boot (supposedly by
cloud-init). In Ubuntu 22.04 droplets, where logs are stored at
/var/log/journal, that leads to journalctl outputting no records
(because the log for the new machine-id has not been created), unless
I pass --file or --merge. Also, the records continue to be added to
the old log (for the old machine id).

In CentOS 9 droplets, where logs are stored at /run/log/journal,
journalctl outputs records from all 3 files:

cb754b7b85bb42d1af6b48e7ca843674/system.journal
61238251e3db916639eaa8cd54998712/system@6600bdad291b419c8a0b1fea2564c472-0001-0005e6d123825866.journal
61238251e3db916639eaa8cd54998712/system.journal

In this case records also are being added to the old log. But the new
log somehow contains the beginning of the log (starting with boot).

Is my guess correct? Logs at /run/log/journal are automerged, logs at
/var/log/journal aren't.


Re: [systemd-devel] systemd tries to terminate a process that seems to have exited

2022-05-10 Thread Yuri Kanivetsky
ed behavior be explained like
this? With cgroupv1 "empty cgroup" notifications in containers don't
always reach systemd. As a result, if systemd doesn't receive an
"empty cgroup" notification, it thinks some processes are still
running (although there're none left), tries to kill them and
eventually times out. Does that sound correct?
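
To check which cgroup setup a given environment actually sees, the
filesystem type of /sys/fs/cgroup can be inspected. A minimal sketch,
assuming the usual mapping (cgroup2fs for the unified hierarchy, tmpfs
for legacy or hybrid); the function name cgroup_mode is made up for
illustration:

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup setup.
# The argument is what `stat -fc %T /sys/fs/cgroup` prints.
# (cgroup_mode is a hypothetical helper.)
cgroup_mode() {
    case $1 in
        cgroup2fs) echo "unified (cgroup v2)" ;;
        tmpfs)     echo "legacy or hybrid (cgroup v1)" ;;
        *)         echo "unknown ($1)" ;;
    esac
}
```

For example, inside the container:
cgroup_mode "$(stat -fc %T /sys/fs/cgroup)"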

On Tue, May 10, 2022 at 4:22 PM Lennart Poettering
 wrote:
>
> On Di, 10.05.22 08:44, Yuri Kanivetsky (yuri.kanivet...@gmail.com) wrote:
>
> > The one that produces the messages is 249.11 (that is running in a
> > docker container):
> >
> > https://packages.ubuntu.com/jammy/systemd
> >
> > The one running on the host is 215-17 (Debian 8).
>
> that's ancient... i figure this then also means you are stuck with
> cgroupv1. Which means cgroup empty notifications in containers
> typically don't work.
>
> Lennart
>
> --
> Lennart Poettering, Berlin


Re: [systemd-devel] systemd tries to terminate a process that seems to have exited

2022-05-09 Thread Yuri Kanivetsky
The one that produces the messages is 249.11 (that is running in a
docker container):

https://packages.ubuntu.com/jammy/systemd

The one running on the host is 215-17 (Debian 8).

> But it sounds like systemd issue in one specific version you are using.

On hosts with newer Debians the issue doesn't manifest itself (the
systemd version inside docker remains the same). I'm trying to figure
out what exactly is happening on Debian 8. If it's not a systemd issue,
the things that come to mind are: a) the process is waiting to release
some resources after exit() or return from main(), b) something PAM-
or dbus-related, c) some threads that don't let it exit. I'm not enough
of a C programmer to know whether those are possible (whether anything
can keep a process from terminating after exit() or return from
main()).
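
One way to see directly whether anything is still attached to the
service's cgroup is to read its cgroup.procs file. A minimal sketch;
the function name cgroup_members is made up, and the cgroup directory
has to be supplied by the caller (the path under /sys/fs/cgroup differs
between setups, so it is deliberately not hard-coded here):

```shell
# Print the PIDs still attached to a cgroup, or "empty" if there are none.
#   $1 = cgroup directory (one that contains a cgroup.procs file)
# The file is read rather than tested for size, because files on
# cgroupfs report a size of 0 regardless of their content.
# (cgroup_members is a hypothetical helper.)
cgroup_members() {
    pids=$(cat "$1/cgroup.procs") || return 1
    if [ -n "$pids" ]; then
        printf '%s\n' "$pids"
    else
        echo "empty"
    fi
}
```

The unit's cgroup path relative to the hierarchy root can be read with
`systemctl --user show -p ControlGroup gnome-keyring.service`.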

On Tue, May 10, 2022 at 8:09 AM Andrei Borzenkov  wrote:
>
> On 09.05.2022 23:43, Yuri Kanivetsky wrote:
> > Hi Andrei,
> >
> > Thanks for the suggestion. It becomes more verbose, but it still seems
> > like `systemd` fails to notice that `gnome-keyring` exited:
> >
>
> Probably
>
> ...
>
> >
> > The child exits:
> >
> > May 09 17:52:47 cb6d1c84f84e gnome-keyring-daemon[314]: -- main:
> > return 0, gkd-main.c:1210
> > May 09 17:52:47 cb6d1c84f84e gnome-keyring-d[314]: -- main: return
> > 0, gkd-main.c:1210
> > May 09 17:52:47 cb6d1c84f84e systemd[106]: Child 314
> > (gnome-keyring-d) died (code=exited, status=0/SUCCESS)
> > May 09 17:52:47 cb6d1c84f84e systemd[106]: gnome-keyring.service:
> > Child 314 belongs to gnome-keyring.service.
> > May 09 17:52:47 cb6d1c84f84e systemd[106]: Received SIGCHLD from
> > PID 314 (n/a).
>
> What I miss is "cgroup is empty" message. For comparison:
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: Received SIGCHLD from
> PID 73346 (sleep).
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: Child 73346 (sleep)
> died (code=exited, status=0/SUCCESS)
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: oneshot.service: Child
> 73346 belongs to oneshot.service.
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: oneshot.service:
> Control group is empty.
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: oneshot.service:
> Succeeded.
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: oneshot.service:
> Service will not restart (restart setting)
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: oneshot.service:
> Changed stop-sigterm -> dead
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: oneshot.service: Job
> 986 oneshot.service/start finished, result=done
>
> May 10 07:56:16 bor-Latitude-E5450 systemd[1593]: Finished test oneshot
> forking service.
>
>
> You never mentioned your systemd version so it is hard to say anything.
> But it sounds like systemd issue in one specific version you are using.
>
> >
> > The org.freedesktop.secrets service is activated:
> >
> > May 09 17:52:47 cb6d1c84f84e dbus-daemon[124]: [session uid=1000
> > pid=124] Activating service name='org.freedesktop.secrets' requested
> > by ':1.16' (uid=1000 pid=243 comm="/usr/libexec/xdg-desktop-portal ")
> > May 09 17:52:47 cb6d1c84f84e gnome-keyring-d[348]: -- main: 
> > gkd-main.c:1046
> > May 09 17:52:47 cb6d1c84f84e org.freedesktop.secrets[348]:
> > gnome-keyring-daemon: no process capabilities, insecure memory might
> > get used
> > May 09 17:52:47 cb6d1c84f84e gnome-keyring-daemon[348]: couldn't
> > access control socket: /run/user/1000/keyring/control: No such file or
> > directory
> > May 09 17:52:47 cb6d1c84f84e gnome-keyring-d[348]: couldn't access
> > control socket: /run/user/1000/keyring/control: No such file or
> > directory
> > May 09 17:52:47 cb6d1c84f84e dbus-daemon[124]: [session uid=1000
> > pid=124] Successfully activated service 'org.freedesktop.secrets'
> >
> > The gnome-keyring service times out:
> >
> > May 09 17:54:17 cb6d1c84f84e systemd[106]: gnome-keyring.service:
> > State 'stop-sigterm' timed out. Killing.
> > May 09 17:54:17 cb6d1c84f84e systemd[106]: gnome-keyring.service:
> > Failed with result 'timeout'.
> > May 09 17:54:17 cb6d1c84f84e systemd[106]: gnome-keyring.service:
> > Service will not restart (restart setting)
> > May 09 17:54:17 cb6d1c84f84e systemd[106]: gnome-keyring.service:
> > Changed stop-sigterm -> failed
> > May 09 17:54:17 cb6d1c84f84e systemd[106]: gnome-keyring.service:
> > Job 167 gnome-keyring.service/start finished, result=failed
> > May 09 17:54:17 cb6d1c8

Re: [systemd-devel] systemd tries to terminate a process that seems to have exited

2022-05-09 Thread Yuri Kanivetsky
fo here:

https://gist.github.com/x-yuri/b12e8178a621372a4aa62c60693af37b#file-b-journal-gnome-keyring-gist-md

Do you know any reason a process can remain alive after exit() or
return from main()? Any threads started by PAM or anything
dbus-related (wild guesses on my part)? Anything else I can check?

Regards,
Yuri

On Thu, May 5, 2022 at 8:19 AM Andrei Borzenkov  wrote:
>
> On 05.05.2022 04:41, Yuri Kanivetsky wrote:
> > Hi,
> >
> > This might be not a systemd issue. But the behavior is weird, and I'm not 
> > sure.
> >
> > I'm trying to run GNOME in a docker container. And gnome-keyring fails to 
> > start:
> >
> > https://gist.github.com/x-yuri/c3c715ea6355633de4546ae957a66410
> >
> > I added debug statements, and in the log I see:
> >
> > May 02 05:09:02 ab6aaba04124 systemd[109]: Starting Start
> > gnome-keyring for the Secrets Service, and PKCS #11...
> > May 02 05:09:02 ab6aaba04124 gnome-keyring-d[309]: -- main: 1046
> > May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]:
> > gnome-keyring-daemon: no process capabilities, insecure memory might
> > get used
> > May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]: --
> > fork_and_print_environment: fork(), parent, 653
> > May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: --
> > fork_and_print_environment: fork(), child, 684
> > May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: couldn't
> > access control socket: /run/user/1000/keyring/control: No such file or
> > directory
> > May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]: --
> > fork_and_print_environment: exit(0), 680
> > May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: -- main:
> > return 0, 1210
> > May 02 05:10:32 ab6aaba04124 systemd[109]: gnome-keyring.service:
> > State 'stop-sigterm' timed out. Killing.
> > May 02 05:10:32 ab6aaba04124 systemd[109]: gnome-keyring.service:
> > Failed with result 'timeout'.
> > May 02 05:10:32 ab6aaba04124 systemd[109]: Failed to start Start
> > gnome-keyring for the Secrets Service, and PKCS #11.
> >
> ...
> >
> > I can only reproduce it on Debian 8. Which kind of makes it
> > unimportant. But the behavior is so weird (either gnome-keyring is
> > blocked in/after exit(), or systemd tries to kill a process that
> > exited), that I can't help but think about what is really going on
> > there.
> >
>
>
> So run systemd user instance with debug level logging to see which
> process are still left.


[systemd-devel] systemd tries to terminate a process that seems to have exited

2022-05-04 Thread Yuri Kanivetsky
Hi,

This might not be a systemd issue. But the behavior is weird, and I'm not sure.

I'm trying to run GNOME in a docker container. And gnome-keyring fails to start:

https://gist.github.com/x-yuri/c3c715ea6355633de4546ae957a66410

I added debug statements, and in the log I see:

May 02 05:09:02 ab6aaba04124 systemd[109]: Starting Start
gnome-keyring for the Secrets Service, and PKCS #11...
May 02 05:09:02 ab6aaba04124 gnome-keyring-d[309]: -- main: 1046
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]:
gnome-keyring-daemon: no process capabilities, insecure memory might
get used
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]: --
fork_and_print_environment: fork(), parent, 653
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: --
fork_and_print_environment: fork(), child, 684
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: couldn't
access control socket: /run/user/1000/keyring/control: No such file or
directory
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]: --
fork_and_print_environment: exit(0), 680
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: -- main:
return 0, 1210
May 02 05:10:32 ab6aaba04124 systemd[109]: gnome-keyring.service:
State 'stop-sigterm' timed out. Killing.
May 02 05:10:32 ab6aaba04124 systemd[109]: gnome-keyring.service:
Failed with result 'timeout'.
May 02 05:10:32 ab6aaba04124 systemd[109]: Failed to start Start
gnome-keyring for the Secrets Service, and PKCS #11.

A longer version (w/ lines about a service activation):

May 02 05:09:02 ab6aaba04124 systemd[109]: Starting Start
gnome-keyring for the Secrets Service, and PKCS #11...
May 02 05:09:02 ab6aaba04124 gnome-keyring-d[309]: -- main: 1046
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]:
gnome-keyring-daemon: no process capabilities, insecure memory might
get used
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]: --
fork_and_print_environment: fork(), parent, 653
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: --
fork_and_print_environment: fork(), child, 684
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: couldn't
access control socket: /run/user/1000/keyring/control: No such file or
directory
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]: --
fork_and_print_environment: exit(0), 680
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: -- main:
return 0, 1210
May 02 05:09:02 ab6aaba04124 dbus-daemon[124]: [session uid=1000
pid=124] Activating service name='org.freedesktop.secrets' requested
by ':1.19' (uid=1000 pid=251 comm="/usr/libexec/xdg-desktop-portal ")
May 02 05:09:02 ab6aaba04124 gnome-keyring-d[347]: -- main: 1046
May 02 05:09:02 ab6aaba04124 org.freedesktop.secrets[347]:
gnome-keyring-daemon: no process capabilities, insecure memory might
get used
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[347]: couldn't
access control socket: /run/user/1000/keyring/control: No such file or
directory
May 02 05:09:02 ab6aaba04124 dbus-daemon[124]: [session uid=1000
pid=124] Successfully activated service 'org.freedesktop.secrets'
May 02 05:10:32 ab6aaba04124 systemd[109]: gnome-keyring.service:
State 'stop-sigterm' timed out. Killing.
May 02 05:10:32 ab6aaba04124 systemd[109]: gnome-keyring.service:
Failed with result 'timeout'.
May 02 05:10:32 ab6aaba04124 systemd[109]: Failed to start Start
gnome-keyring for the Secrets Service, and PKCS #11.

And an even longer version (with duplicate and intervening lines):

May 02 05:09:02 ab6aaba04124 systemd[109]: Starting Start
gnome-keyring for the Secrets Service, and PKCS #11...
May 02 05:09:02 ab6aaba04124 systemd[109]: Starting GNOME Remote Desktop...
May 02 05:09:02 ab6aaba04124 systemd[109]: Starting Monitor
Session leader for GNOME Session...
May 02 05:09:02 ab6aaba04124 systemd[109]: Starting Session Migration...
May 02 05:09:02 ab6aaba04124 systemd[109]: Starting Rewrite
dynamic launcher portal entries...
May 02 05:09:02 ab6aaba04124 systemd[109]: Finished Start
gnome-keyring as SSH agent.
May 02 05:09:02 ab6aaba04124 systemd[109]: Started OpenSSH Agent.
May 02 05:09:02 ab6aaba04124 systemd[109]: Started Monitor Session
leader for GNOME Session.
May 02 05:09:02 ab6aaba04124 gnome-keyring-d[309]: -- main: 1046
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]:
gnome-keyring-daemon: no process capabilities, insecure memory might
get used
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[309]: --
fork_and_print_environment: fork(), parent, 653
May 02 05:09:02 ab6aaba04124 gnome-keyring-d[309]: --
fork_and_print_environment: fork(), parent, 653
May 02 05:09:02 ab6aaba04124 systemd[109]: Finished Rewrite
dynamic launcher portal entries.
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: --
fork_and_print_environment: fork(), child, 684
May 02 05:09:02 ab6aaba04124 gnome-keyring-daemon[321]: couldn't
access control socket: /run/

[systemd-devel] A server says: "System is going down." But never does.

2017-11-28 Thread Yuri Kanivetsky
Hi,

This mailing list is the only place where I expect to get some
helpful feedback. But feel free to suggest other places. I'd like to
investigate the situation I have now, find out what went wrong, and
prevent it from happening again if possible. Your help is appreciated.

As the subject says, the server reports that it's going down when I ssh
to it as root. For a non-root user, it says the same and then closes
the connection.

In the journal I see a lot of this:


Nov 28 16:22:01 st2 systemd-journal[353]: Journal stopped
Nov 28 16:22:01 st2 systemd-journal[494]: Runtime journal is using
624.0M (max allowed 642.1M, trying to leave 963.1M free of 5.6G
available → current limit 642.1M).
Nov 28 16:22:01 st2 systemd-journal[494]: Runtime journal is using
624.0M (max allowed 642.1M, trying to leave 963.1M free of 5.6G
available → current limit 642.1M).
Nov 28 16:22:01 st2 systemd-journal[494]: Journal started
Nov 28 16:22:01 st2 systemd[1]: systemd-journald.service watchdog
timeout (limit 1min)!
Nov 28 16:22:01 st2 systemd-journald[353]: Received SIGTERM from PID 1
(systemd).
Nov 28 16:22:01 st2 systemd[1]: Unit systemd-journald.service entered
failed state.
Nov 28 16:22:01 st2 systemd[1]: systemd-journald.service has no
holdoff time, scheduling restart.
Nov 28 16:22:01 st2 systemd[1]: Stopping Journal Service...
Nov 28 16:22:01 st2 systemd[1]: Starting Journal Service...
Nov 28 16:22:01 st2 systemd[1]: Started Journal Service.
Nov 28 16:22:01 st2 systemd[1]: Starting Trigger Flushing of Journal
to Persistent Storage...
Nov 28 16:22:01 st2 systemd[1]: systemd-journal-flush.service: main
process exited, code=exited, status=1/FAILURE
Nov 28 16:22:01 st2 systemd[1]: Failed to start Trigger Flushing of
Journal to Persistent Storage.
Nov 28 16:22:01 st2 systemd[1]: Unit systemd-journal-flush.service
entered failed state.


Nov 28 16:22:52 st2 systemd[1]: systemd-timesyncd.service start
operation timed out. Terminating.
Nov 28 16:22:52 st2 systemd[1]: Failed to start Network Time Synchronization.
Nov 28 16:22:52 st2 systemd[1]: Unit systemd-timesyncd.service entered
failed state.
Nov 28 16:22:53 st2 systemd[1]: systemd-timesyncd.service has no
holdoff time, scheduling restart.
Nov 28 16:22:53 st2 systemd[1]: Stopping Network Time Synchronization...
Nov 28 16:22:53 st2 systemd[1]: Starting Network Time Synchronization...


Nov 28 16:23:02 st2 systemd-journal[494]: Journal stopped
Nov 28 16:23:02 st2 systemd-journal[632]: Runtime journal is using
624.0M (max allowed 642.1M, trying to leave 963.1M free of 5.6G
available → current limit 642.1M).
Nov 28 16:23:02 st2 systemd-journal[632]: Runtime journal is using
624.0M (max allowed 642.1M, trying to leave 963.1M free of 5.6G
available → current limit 642.1M).
Nov 28 16:23:02 st2 systemd-journal[632]: Journal started
Nov 28 16:23:02 st2 systemd[1]: systemd-journald.service watchdog
timeout (limit 1min)!
Nov 28 16:23:02 st2 systemd-journald[494]: Received SIGTERM from PID 1
(systemd).
Nov 28 16:23:02 st2 systemd[1]: Unit systemd-journald.service entered
failed state.
Nov 28 16:23:02 st2 systemd[1]: systemd-journald.service has no
holdoff time, scheduling restart.
Nov 28 16:23:02 st2 systemd[1]: Stopping Journal Service...
Nov 28 16:23:02 st2 systemd[1]: Starting Journal Service...
Nov 28 16:23:02 st2 systemd[1]: Started Journal Service.
Nov 28 16:23:02 st2 systemd[1]: Starting Trigger Flushing of Journal
to Persistent Storage...
Nov 28 16:23:02 st2 systemd[1]: systemd-journal-flush.service: main
process exited, code=exited, status=1/FAILURE
Nov 28 16:23:02 st2 systemd[1]: Failed to start Trigger Flushing of
Journal to Persistent Storage.
Nov 28 16:23:02 st2 systemd[1]: Unit systemd-journal-flush.service
entered failed state.


It repeats itself every minute.

systemctl doesn't work:


# systemctl
Failed to get D-Bus connection: Connection refused


I have 16 lxc containers running on the server:


# lxc-ls -f | grep RUNNING | wc -l
16


and 16 dbus-daemons (so presumably one dbus-daemon is missing):


# ps -ef | grep dbus
message+   845     1  0 Feb15 ?        00:09:56 /usr/bin/dbus-daemon
--system --address=systemd: --nofork --nopidfile --systemd-activation
systemd+  1615   579  0 Jun13 ?        00:00:00 /usr/bin/dbus-daemon
--system --address=systemd: --nofork --nopidfile --systemd-activation
root      1673 28602  0 16:26 pts/31   00:00:00 grep dbus
systemd+  3761  3461  0 Feb15 ?        00:00:00 /usr/bin/dbus-daemon
--system --address=systemd: --nofork --nopidfile --systemd-activation
systemd+  4635  3436  0 Feb15 ?        00:00:00 /usr/bin/dbus-daemon
--system --address=systemd: --nofork --nopidfile --systemd-activation
systemd+  4767  3527  0 Feb15 ?        00:00:00 /usr/bin/dbus-daemon
--system --address=systemd: --nofork --nopidfile --systemd-activation
systemd+  5344  3597  0 Feb15 ?        00:00:00 /usr/bin/dbus-daemon
--system --address=systemd: --nofork --nopidfile --systemd-activation
systemd+  5714  3664  0 Feb15 ?        00:00:00 /usr/bin/dbus-daemon
--system --address=systemd: 
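
A side note on the count above: a plain `ps -ef | grep dbus` listing also
includes the grep process itself, so counting by eye is error-prone. Below
is a minimal sketch of a stricter count. It assumes that the host and every
running container each run exactly one system dbus-daemon, and that the
input is unwrapped `ps -ef` output. The function name count_system_dbus is
made up for illustration.

```shell
# Count system bus daemons in `ps -ef` output read from stdin.
# Assumes the standard `ps -ef` layout, where field 8 is the command,
# and counts only dbus-daemon processes started with --system.
count_system_dbus() {
    awk '$8 ~ /(^|\/)dbus-daemon$/ && /--system/ { n++ } END { print n + 0 }'
}
```

Usage would be `ps -ef | count_system_dbus`, compared against
`lxc-ls -f | grep -c RUNNING`. Under the assumption above, the first number
should be the second plus one (for the host's own daemon).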

[systemd-devel] hostnamectl doesn't work in a lxc container

2017-02-10 Thread Yuri Kanivetsky
Hi,

Not sure this is the right place to ask, but it'd be great if you could
help me with this one, or at least tell me where to ask. I couldn't find
any systemd user mailing list, and the lxc mailing list has been silent
so far:

https://lists.linuxcontainers.org/pipermail/lxc-users/2017-February/012840.html

So, on one physical server, in an lxc container, I get this:

# hostnamectl --static
Could not get property: Connection timed out
# hostnamectl
# echo $?
1

Occasionally I get this error message:

Could not get property: Failed to activate service
'org.freedesktop.hostname1': timed out

And in the log I see this:

Feb 10 12:39:04 server1 dbus[79]: [system] Activating via systemd:
service name='org.freedesktop.hostname1'
unit='dbus-org.freedesktop.hostname1.service'
Feb 10 12:39:05 server1 systemd[1]: Starting Hostname Service...
Feb 10 12:39:05 server1 systemd[2935]: systemd-hostnamed.service:
Failed at step NETWORK spawning /lib/systemd/systemd-hostnamed:
Permission denied
Feb 10 12:39:05 server1 systemd[1]: systemd-hostnamed.service: Main
process exited, code=exited, status=225/NETWORK
Feb 10 12:39:05 server1 systemd[1]: Failed to start Hostname Service.
Feb 10 12:39:05 server1 systemd[1]: systemd-hostnamed.service: Unit
entered failed state.
Feb 10 12:39:05 server1 systemd[1]: systemd-hostnamed.service:
Failed with result 'exit-code'.
Feb 10 12:39:29 server1 dbus[79]: [system] Failed to activate
service 'org.freedesktop.hostname1': timed out

On the other server it works, though. Here's what I see in the log:

Feb 10 14:40:26 server2 dbus[25957]: [system] Activating via
systemd: service name='org.freedesktop.hostname1'
unit='dbus-org.freedesktop.hostname1.service'
Feb 10 14:40:26 server2 systemd[1]: Starting Hostname Service...
Feb 10 14:40:26 server2 systemd-hostnamed[18340]: Warning:
nss-myhostname is not installed. Changing the local hostname might
make it unresolveable. Please install nss-myhostname!
Feb 10 14:40:26 server2 dbus[25957]: [system] Successfully
activated service 'org.freedesktop.hostname1'
Feb 10 14:40:26 server2 systemd[1]: Started Hostname Service.

What's causing it? What can I check? How do I remedy this? Thanks in advance.

Regards,
Yuri
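
A note for anyone reading this thread later: the detail worth pulling out
of the failing log is the step name in "Failed at step NETWORK". According
to the exit-code table in systemd.exec(5), status 225/NETWORK means systemd
failed to set up network namespacing for the service (see PrivateNetwork=).
Below is a small sketch for extracting the failing step from journal
output. The function name failed_steps is hypothetical, and the input is
assumed to have one message per line (not wrapped as in this mail).

```shell
# Print the distinct step names from systemd "Failed at step ..." messages
# read from stdin, one per line.
failed_steps() {
    grep -o 'Failed at step [A-Z_]*' | awk '{ print $4 }' | sort -u
}
```

Usage would be `journalctl -b -u systemd-hostnamed.service | failed_steps`.
If it prints NETWORK, one thing to check is whether the unit sets
PrivateNetwork= (`systemctl cat systemd-hostnamed.service`) and whether the
container is allowed to create network namespaces. This is a pointer for
investigation, not a confirmed diagnosis of the problem above.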
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel