Timothy Redaelli via dev <[email protected]> writes:

> When ovsdb-server or ovs-vswitchd fails and auto-restarts
> (Restart=on-failure), it briefly passes through the failed/inactive
> state.  This causes a cascade: the umbrella service (which Requires
> both) sees the failure and stops, which in turn stops the other
> service via PartOf.  When the failed service comes back, the other
> does not automatically restart.
>
> RestartMode=direct (systemd v254+, PR systemd/systemd#27584) makes
> the service transition directly to the activating state during
> auto-restart, skipping the failed/inactive state.  Dependents never
> see the failure, so the cascade does not happen.
>
> On older systemd versions the directive is silently ignored with a
> harmless journal warning ("Unknown key name 'RestartMode'"), so
> this change is safe for all supported platforms.  Tested with
> containers:
>
>   systemd 252 (CentOS Stream 9, Debian 12): warning, ignored
>   systemd 255 (Ubuntu 24.04): recognized, clean
>   systemd 256 (CentOS Stream 10): recognized, clean
>   systemd 257 (Debian 13): recognized, clean
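
For reference, a minimal sketch of what the series appears to add to
each of the four unit files.  This is not the actual patch hunk: the
placement next to Restart=on-failure is an assumption based on the
cover letter, and the rest of each unit is omitted.

```ini
[Service]
# Already present per the cover letter: restart the daemon when it fails.
Restart=on-failure
# Proposed one-line addition (honored by systemd v254+, ignored with a
# warning on older versions according to the testing above).
RestartMode=direct
```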

I didn't check, but we should probably make sure that any systems where
we apply this also have:

https://github.com/goenkam/systemd/commit/7f85fc2c31f074badcf4d517a4f84a1fd72cf909

applied, right?  Otherwise, I think we can get some kind of dependency
restart loop when this is triggered.

But actually, I think this mode should only be used on Type=oneshot
services.  If ovsdb-server fails, RestartMode=direct shouldn't have
any effect.  I'm guessing based on this:

* i.e. unit_process_job -> job_finish_and_invalidate is never called,
* and the previous job might still be running (especially for
* Type=oneshot services).

That seems to imply that if a weird failure is propagated, we might
end up with too many instances of ovs-vswitchd/ovsdb-server running.

Perhaps I'm misunderstanding something.

> Timothy Redaelli (2):
>   rhel: Add RestartMode=direct to service units.
>   debian: Add RestartMode=direct to service units.
>
>  debian/openvswitch-switch.ovs-vswitchd.service      | 1 +
>  debian/openvswitch-switch.ovsdb-server.service      | 1 +
>  rhel/usr_lib_systemd_system_ovs-vswitchd.service.in | 1 +
>  rhel/usr_lib_systemd_system_ovsdb-server.service    | 1 +
>  4 files changed, 4 insertions(+)
