On Mon, Aug 11, 2014 at 7:19 PM, Luis R. Rodriguez <mcg...@suse.com> wrote:
> On Mon, Aug 11, 2014 at 12:57 PM, Lennart Poettering
> <lenn...@poettering.net> wrote:
>> On Mon, 11.08.14 18:39, Luis R. Rodriguez (mcg...@suse.com) wrote:
>>
>>> > This looks really wrong. We shouldn't permit worker processes to be
>>> > blocked indefinitely without any timeout applied. Designing a worker
>>> > process system like that is simply wrong. It's one thing to allow
>>> > changing the specific timeout applied, it's a very different thing to
>>> > allow broken drivers to completely stall the worker process logic.
>>>
>>> OK what if we enable customizations then on the timeout by the built-in
>>> cmd type and we use a high multiplier for now for kmod ? A multiplier
>>> for kmod of 10 would set the kmod timeout to 5 minutes then, as we
>>> sweep up and clean drivers we can reduce this over time. This would also
>>> enable us to keep the default timeout for the other type of workers.
>>
>> Why this complexity?
>>
>> I mean, it sounds much simpler to simply increase the default timeout a
>> bit, if it turns out to be too low for the current cases...
>
> True, there's two things here and one of which this v2 patch didn't address:
>
> 1) It'd be good for defaults on systemd to work on most systems based
> on upstream kernels today, right now folks deploying systemd would
> need to modify the default timeout. Are we up to bump the default up
> considerably? If its high, would that be unfair for classes of workers
> we know shouldn't take that long, or wouldn't that allow folks
> developing new workers to take longer?
>
> 2) We want chatty logs to allow us to keep track of drivers that need
> attention. Ideally right now we should strive for 30 seconds init and
> work on asynching most work, we want to do this in a non fatal way.
> Overriding the timeout won't let us to keep track of buggy drivers
> that need love from systemd's perspective to stay within the 30 second
> bound time. We can have chatty logs from the kernel but using a
> timeout on the driver core seems a bit overkill specially if systemd
> is already keeping track of driver's init time, so it'd be better if
> we could collect offending drivers from systemd. I could have
> implemented support for this in this v2 patch by simply adding another
> check using the default timeout.

Hi Luis,

Just following up on this "old" thread: I have now bumped the default kill
timeout (for all workers) to 180 seconds, and added a warning which is
triggered after a third of the timeout.

Let me know if this covers what you need, or if there is anything else we
should do on the systemd side.

Cheers,

Tom
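
For illustration only, the behaviour described above (a 180 second kill timeout for all workers, plus a warning once a third of that has elapsed) might be sketched roughly as follows. This is not the actual udev code: the names `Action`, `warn_threshold` and `check_worker` are hypothetical, and treating the thresholds as inclusive (`>=`) is an assumption.

```python
from enum import Enum

# Default kill timeout in seconds, as stated in the message above.
DEFAULT_KILL_TIMEOUT_SEC = 180


class Action(Enum):
    """What the manager would do with a worker (hypothetical names)."""
    NONE = "none"
    WARN = "warn"
    KILL = "kill"


def warn_threshold(kill_timeout_sec):
    """The warning fires after a third of the kill timeout."""
    return kill_timeout_sec / 3


def check_worker(elapsed_sec,
                 kill_timeout_sec=DEFAULT_KILL_TIMEOUT_SEC,
                 already_warned=False):
    """Decide what to do with a worker that has run for elapsed_sec.

    Assumption: both thresholds are inclusive, and the warning is
    emitted only once per worker.
    """
    if elapsed_sec >= kill_timeout_sec:
        return Action.KILL
    if elapsed_sec >= warn_threshold(kill_timeout_sec) and not already_warned:
        return Action.WARN
    return Action.NONE
```

With the default of 180 seconds, a slow driver would be logged at the 60 second mark and its worker killed only at 180 seconds, so offenders can be collected from the logs without the event being fatal.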