Re: [systemd-devel] Waiting udev jobs

2021-03-27 Thread Alan Perry


On 3/27/21 5:38 AM, Lennart Poettering wrote:

On Fr, 26.03.21 23:24, Alan Perry (al...@snowmoose.com) wrote:


I occasionally see a problem where systemd-analyze reports that boot
did not complete and it is suggested that I use systemctl list-jobs
to find out more. That shows a .device service job and some sub-jobs
(associated with udev rules) all waiting. They will wait for literal
days in this state. When I accessed the system, it wasn’t apparent
what the jobs were waiting on since all of the device symlinks and
such were there and working. The systemctl status of the .device
service was alive.

Any suggestions on what is going on and/or how to figure out what is
going on?

If you have followed my posts here previously, it should come as no
surprise that the device that I observed this happen with was one of
the emmc boot devices.

This is not enough information. Please provide "systemctl status" info
on the relevant units and jobs, please provide a dump of the output.



I don't have access to that info at the moment. IIRC ...

dev-disk-by\x2dpath-platform\x2d68cf1000.sdhci\x2dboot0.device and 
sys-devices-platform…mc0:0001-block-mmcblk0-mmcblk0boot0.device returned 
"Active: inactive (dead)" and not much else.


dev-mmcblk0boot0.device returned "Active: active (plugged)". There was 
more, but I don't remember what else.
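
When the device node exists but a .device unit stays inactive, udev's recorded properties for the device usually explain why. A sketch of what to check, using the device path from above:

```shell
# Ask udev what it has recorded for the device behind the stuck unit
udevadm info /dev/mmcblk0boot0

# systemd treats a device as "plugged" unless SYSTEMD_READY=0 is set on it,
# so look at the systemd-related properties explicitly
udevadm info --query=property /dev/mmcblk0boot0 | grep -i systemd
```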

And most importantly, always start with the systemd version number you
are using,

v247 plus patches

  and whether you have any weird udev rules or so, or just
plain upstream stuff.

plain, upstream rules.

I am trying to figure out what to look at when I have access to the 
system exhibiting the problem that I am trying to resolve.

alan

Lennart

--
Lennart Poettering, Berlin

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Waiting udev jobs

2021-03-27 Thread Alan Perry


> On Mar 27, 2021, at 05:46, Lennart Poettering  wrote:
> 
> On Fr, 26.03.21 23:24, Alan Perry (al...@snowmoose.com) wrote:
> 
>> I occasionally see a problem where systemd-analyze reports that boot
>> did not complete and it is suggested that I use systemctl list-jobs
>> to find out more. That shows a .device service job and some sub-jobs
>> (associated with udev rules) all waiting. They will wait for literal
>> days in this state. When I accessed the system, it wasn’t apparent
>> what the jobs were waiting on since all of the device symlinks and
>> such were there and working. The systemctl status of the .device
>> service was alive.
>> 
>> Any suggestions on what is going on and/or how to figure out what is
>> going on?
>> 
>> If you have followed my posts here previously, it should come as no
>> surprise that the device that I observed this happen with was one of
>> the emmc boot devices.
> 
> This is not enough information. Please provide "systemctl status" info
> on the relevant units and jobs, please provide a dump of the output.
> 
> And most importantly, always start with the systemd version number you
> are using, and whether you have any weird udev rules or so, or just
> plain upstream stuff.


While I am now looking at a specific problem, I am asking a general question. 
There might be future situations where I see a udev job waiting. Is there a 
general way to find out what it is waiting on? Does it depend on which 
systemd version is being run?
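
In the general case, a few commands usually narrow down what a waiting job is blocked on. A sketch (I believe `--after` for list-jobs is available in recent systemctl versions, including v247):

```shell
# Show all queued jobs; "waiting" jobs are blocked on some other job
systemctl list-jobs

# Newer systemctl can also print, per job, the jobs it is waiting for
systemctl list-jobs --after

# For a specific stuck device unit, inspect its ordering dependencies
systemctl list-dependencies --after dev-mmcblk0boot0.device

# And check whether the udev event queue itself has drained
udevadm settle --timeout=5 && echo "udev queue is empty"
```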

I can answer the questions you asked, just not right away; I don't have all of 
the answers off the top of my head.

alan



> 
> Lennart
> 
> --
> Lennart Poettering, Berlin


Re: [systemd-devel] Waiting udev jobs

2021-03-27 Thread Lennart Poettering
On Fr, 26.03.21 23:24, Alan Perry (al...@snowmoose.com) wrote:

> I occasionally see a problem where systemd-analyze reports that boot
> did not complete and it is suggested that I use systemctl list-jobs
> to find out more. That shows a .device service job and some sub-jobs
> (associated with udev rules) all waiting. They will wait for literal
> days in this state. When I accessed the system, it wasn’t apparent
> what the jobs were waiting on since all of the device symlinks and
> such were there and working. The systemctl status of the .device
> service was alive.
>
> Any suggestions on what is going on and/or how to figure out what is
> going on?
>
> If you have followed my posts here previously, it should come as no
> surprise that the device that I observed this happen with was one of
> the emmc boot devices.

This is not enough information. Please provide "systemctl status" info
on the relevant units and jobs, please provide a dump of the output.

And most importantly, always start with the systemd version number you
are using, and whether you have any weird udev rules or so, or just
plain upstream stuff.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Ordering of oneshot services and path units?

2021-03-27 Thread Andrei Borzenkov
On 27.03.2021 10:11, John Ioannidis wrote:
...
> 
> *workdir.path *
> 
> [Unit]
> Description=Trigger workdir.service when a job starts, creating a directory
> in /opt/circleci/workdir
> After=ccistated.service
> ConditionPathExists=/run/metadata/tags/resource_class
> [Path]
> PathChanged=/opt/circleci/workdir
> [Install]
> WantedBy=multi-user.target
> 
> 
...
> 
> Huh?!?!?! It's supposed to run after ccistated, and of course after mktags.
> Its start condition appears to have been checked anyway, before mktags got a
> chance to finish, so the ConditionPathExists file had not been created yet.
> 
> Do .path units not obey the same startup rules?
> 

By default all path units have a Before=paths.target dependency, which puts
them before basic.target, while all service units have a default
After=basic.target dependency. Together with your After=ccistated.service
this makes an ordering cycle, which systemd has to break by dropping jobs;
the results are unpredictable.

You will need DefaultDependencies=false in your path unit (and likely the
usual Conflicts=shutdown.target in addition, to make sure the unit is
stopped on shutdown).
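
Concretely, the path unit would become something like this (an untested sketch; with DefaultDependencies=false you usually also want Before=shutdown.target next to the Conflicts= so the stop job is ordered correctly):

```ini
[Unit]
Description=Trigger workdir.service when a job starts, creating a directory in /opt/circleci/workdir
After=ccistated.service
ConditionPathExists=/run/metadata/tags/resource_class
# Opt out of the implicit Before=paths.target that causes the cycle
DefaultDependencies=false
# Re-add the shutdown behavior that the default dependencies normally provide
Conflicts=shutdown.target
Before=shutdown.target

[Path]
PathChanged=/opt/circleci/workdir

[Install]
WantedBy=multi-user.target
```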


[systemd-devel] Ordering of oneshot services and path units?

2021-03-27 Thread John Ioannidis
tl;dr: a .path unit does not appear to be waiting for the After= unit to
run first.

I am still trying to understand why some services occasionally do not start
at boot time. It is a very intermittent behavior, but I caught another
instance. Everything is running in Google Compute Engine or Amazon EC2.

I have a one-shot service, *mktags.service*, a long-running job of type
*notify*, *ccistated.service*, and a path unit with its corresponding
service, *workdir.path* and *workdir.service*. Here are the relevant parts
of the unit files:


*mktags.service*

[Unit]
Description=Populate /run/metadata/tags
After=network.target
[Service]
ExecStart=/usr/local/sbin/mktags.py
Type=oneshot
[Install]
WantedBy=multi-user.target



*ccistated.service*

[Unit]
Description=State Machine Manager
After=mktags.service
ConditionPathExists=/run/metadata/tags/resource_class
[Service]
ExecStart=/usr/local/sbin/ccistated.py
Type=notify
NotifyAccess=all
[Install]
WantedBy=multi-user.target



*workdir.path *

[Unit]
Description=Trigger workdir.service when a job starts, creating a directory
in /opt/circleci/workdir
After=ccistated.service
ConditionPathExists=/run/metadata/tags/resource_class
[Path]
PathChanged=/opt/circleci/workdir
[Install]
WantedBy=multi-user.target


What *mktags* does is extract the metadata tags from the metadata service,
and populate /run/metadata/tags/.

$ systemctl status mktags.service
* mktags.service - Populate /run/metadata/tags
Loaded: loaded (/etc/systemd/system/mktags.service; enabled; vendor
preset: enabled)
Active: inactive (dead) since Sat 2021-03-27 05:34:00 UTC; 10min ago
   Process: 454 ExecStart=/usr/local/sbin/mktags.py (code=exited,
status=0/SUCCESS)
  Main PID: 454 (code=exited, status=0/SUCCESS)

Sure enough, it ran:

$ cat /run/metadata/tags/resource_class
waw

As did the next service:

$ systemctl status ccistated.service
* ccistated.service - State Machine Manager
Loaded: loaded (/etc/systemd/system/ccistated.service; enabled; vendor
preset: enabled)
Active: active (running) since Sat 2021-03-27 05:36:00 UTC; 8min ago
  Main PID: 1420 (python3)
 Tasks: 1 (limit: 9544)
Memory: 8.5M
CGroup: /system.slice/ccistated.service
`-1420 python3 /usr/local/sbin/ccistated.py

It's OK that it took a couple of minutes for ccistated to start up; it does
a few things before sending out the notify message.
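
(As an aside, the readiness handoff for a Type=notify unit is just a message on $NOTIFY_SOCKET. A hypothetical sketch of the tail end of such a startup script, assuming systemd-notify is installed:)

```shell
#!/bin/sh
# ... slow initialization happens here ...

# Tell systemd the service is ready; until this message arrives, units
# ordered After= this service keep waiting. NotifyAccess=all lets a
# child process such as systemd-notify deliver the message.
systemd-notify --ready --status="initialization complete"
```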

But look at workdir.path:

$ systemctl status workdir.path
* workdir.path - Trigger workdir.service when a job starts, creating a
directory in /opt/circleci/workdir
Loaded: loaded (/etc/systemd/system/workdir.path; enabled; vendor
preset: enabled)
Active: inactive (dead)
  Triggers: * workdir.service
 Condition: start condition failed at Sat 2021-03-27 05:33:59 UTC; 22min
ago
`- ConditionPathExists=/run/metadata/tags/resource_class was
not met

Huh?!?!?! It's supposed to run after ccistated, and of course after mktags.
Its start condition appears to have been checked anyway, before mktags got a
chance to finish, so the ConditionPathExists file had not been created yet.

Do .path units not obey the same startup rules?
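
One way to check whether systemd itself sees a problem with these units (a sketch, using the paths from above):

```shell
# Statically check the unit; ordering cycles and missing deps are reported
systemd-analyze verify /etc/systemd/system/workdir.path

# At boot, systemd logs when it has to break an ordering cycle
journalctl -b | grep -i 'ordering cycle'
```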