Timothy Redaelli <[email protected]> writes:

> On Wed, 17 Jun 2026 09:25:07 -0400
> Aaron Conole <[email protected]> wrote:
>
>> Timothy Redaelli via dev <[email protected]> writes:
>>
>> > When ovsdb-server or ovs-vswitchd fails and auto-restarts
>> > (Restart=on-failure), it briefly passes through the failed/inactive
>> > state. This causes a cascade: the umbrella service (which Requires
>> > both) sees the failure and stops, which in turn stops the other
>> > service via PartOf. When the failed service comes back, the other
>> > does not automatically restart.
>> >
>> > RestartMode=direct (systemd v254+, PR systemd/systemd#27584) makes
>> > the service transition directly to the activating state during
>> > auto-restart, skipping the failed/inactive state. Dependents never
>> > see the failure, so the cascade does not happen.
>> >
>> > On older systemd versions the directive is silently ignored with a
>> > harmless journal warning ("Unknown key name 'RestartMode'"), so
>> > this change is safe for all supported platforms. Tested with
>> > containers:
>> >
>> >   systemd 252 (CentOS Stream 9, Debian 12): warning, ignored
>> >   systemd 255 (Ubuntu 24.04): recognized, clean
>> >   systemd 256 (CentOS Stream 10): recognized, clean
>> >   systemd 257 (Debian 13): recognized, clean
>>
>> I didn't check, but we should probably make sure that any systems where
>> we apply this also have:
>>
>> https://github.com/goenkam/systemd/commit/7f85fc2c31f074badcf4d517a4f84a1fd72cf909
>>
>> applied, right? Otherwise, I think there's some kind of looped
>> dependency restarts when this is triggered.
>
> That commit (upstream 7a13937007, in v257+) fixes stop-job propagation
> to BindsTo= dependents during direct-mode restarts.
> OVS doesn't use BindsTo=: openvswitch.service uses Requires= on the
> sub-services, and the sub-services use PartOf=openvswitch.service.
>
> The cascade we're preventing happens because Requires= reacts to
> the sub-service entering the failed/inactive state.
> RestartMode=direct prevents that by skipping the state transition
> entirely, and that code path has been there since v254.

ACK

>> But actually, this mode should only be on Type=oneshot services I
>> think. If ovsdb-server experiences failure, the RestartMode=direct
>> shouldn't have any effect. I'm guessing based on this:
>>
>>  * i.e. unit_process_job -> job_finish_and_invalidate is never called,
>>  * and the previous job might still be running (especially for
>>  * Type=oneshot services).
>>
>> Which seems to imply that if there's a weird failure propagated, we
>> might end up with too many instances of vswitchd/db-server running.
>
> RestartMode=direct is not restricted to Type=oneshot; it works with
> any service type.
> The comment you quoted says "especially for Type=oneshot services"
> because those have long-running ExecStart= commands that might still be
> in progress when a restart is attempted.
>
> Our services are Type=forking with PIDFile=. This means the restart only
> triggers when the main process exits (that's what Restart=on-failure
> reacts to), so by the time service_enter_restart() runs, the old
> process is already gone.
> There's no window where two instances coexist.

Gotcha - for some reason I misread this and had some thought about how
the failures cascaded. It makes more sense now.

> Re-reading systemd service files made me think about migrating
> Type=forking to Type=notify to avoid useless forking + PID checking and
> to have proper readiness signaling (sd_notify), but I'll do that as a
> follow-up series (since RestartMode=direct will still be needed).

Sounds good.

>> Perhaps I'm misunderstanding something.
>>
>> > Timothy Redaelli (2):
>> >   rhel: Add RestartMode=direct to service units.
>> >   debian: Add RestartMode=direct to service units.
>> >
>> >  debian/openvswitch-switch.ovs-vswitchd.service      | 1 +
>> >  debian/openvswitch-switch.ovsdb-server.service      | 1 +
>> >  rhel/usr_lib_systemd_system_ovs-vswitchd.service.in | 1 +
>> >  rhel/usr_lib_systemd_system_ovsdb-server.service    | 1 +
>> >  4 files changed, 4 insertions(+)
>>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
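
For anyone skimming the thread, the unit relationships discussed above can
be sketched roughly as the abridged fragments below. This is only an
illustration assembled from the directives named in the thread, not a copy
of the packaged unit files: the unit names are inferred from the file names
in the diffstat, the PIDFile= path is a placeholder, and ExecStart= and all
other settings are left out.

```ini
# --- openvswitch.service (umbrella unit), abridged sketch ---
[Unit]
# Requires= is what reacts when a sub-service enters failed/inactive.
Requires=ovsdb-server.service
Requires=ovs-vswitchd.service

# --- ovs-vswitchd.service (ovsdb-server.service is analogous), abridged sketch ---
[Unit]
# Stopping or restarting the umbrella unit propagates to this unit.
PartOf=openvswitch.service

[Service]
Type=forking
# Placeholder path for illustration; not taken from the real unit file.
PIDFile=/run/openvswitch/ovs-vswitchd.pid
Restart=on-failure
# systemd v254+: auto-restart goes straight to activating, so the
# umbrella unit never sees failed/inactive. Older systemd logs an
# "Unknown key name" warning and ignores the line.
RestartMode=direct
```

With this layout the one-line addition in each sub-service is enough; the
umbrella unit itself should not need any change, since the point is that it
no longer observes the intermediate failed/inactive state.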
