On Mon, Aug 11, 2014 at 7:19 PM, Luis R. Rodriguez <mcg...@suse.com> wrote:
> On Mon, Aug 11, 2014 at 12:57 PM, Lennart Poettering
> <lenn...@poettering.net> wrote:
>> On Mon, 11.08.14 18:39, Luis R. Rodriguez (mcg...@suse.com) wrote:
>>
>>> > This looks really wrong. We shouldn't permit worker processes to be
>>> > blocked indefinitely without any timeout applied. Designing a worker
>>> > process system like that is simply wrong. It's one thing to allow
>>> > changing the specific timeout applied, it's a very different thing to
>>> > allow broken drivers to completely stall the worker process logic.
>>>
>>> OK, what if we then enable customizing the timeout per built-in
>>> cmd type and use a high multiplier for kmod for now? A multiplier
>>> of 10 for kmod would set the kmod timeout to 5 minutes. Then, as we
>>> sweep up and clean drivers, we can reduce this over time. This would also
>>> enable us to keep the default timeout for the other types of workers.
>>
>> Why this complexity?
>>
>> I mean, it sounds much simpler to simply increase the default timeout a
>> bit, if it turns out to be too low for the current cases...
>
> True, there are two things here, one of which this v2 patch didn't address:
>
> 1) It'd be good for defaults on systemd to work on most systems based
> on upstream kernels today. Right now folks deploying systemd would
> need to modify the default timeout. Are we up to bump the default up
> considerably? If it's high, would that be unfair for classes of workers
> we know shouldn't take that long, or wouldn't that allow folks
> developing new workers to take longer?
>
> 2) We want chatty logs to allow us to keep track of drivers that need
> attention. Ideally right now we should strive for 30 seconds init and
> work on making most work asynchronous; we want to do this in a non-fatal
> way. Overriding the timeout won't let us keep track of buggy drivers
> that need love from systemd's perspective to stay within the 30 second
> bound. We can have chatty logs from the kernel, but using a
> timeout on the driver core seems a bit overkill, especially if systemd
> is already keeping track of drivers' init time, so it'd be better if
> we could collect offending drivers from systemd. I could have
> implemented support for this in this v2 patch by simply adding another
> check using the default timeout.

Hi Luis,

Just following up on this "old" thread: I have now bumped the default
kill timeout (for all workers) to 180 seconds, and added a warning
which is triggered after a third of the timeout.
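
To make the behaviour concrete, here is a rough sketch of the decision logic in Python. It is not the actual udev code: the names (`KILL_TIMEOUT_SEC`, `WorkerAction`, `check_worker`) are hypothetical, and whether the comparisons are inclusive and the warning is logged only once per worker are assumptions made for the illustration. Only the 180 second default and the one-third warning threshold come from the change described above.

```python
from enum import Enum

# 180 s is the new default kill timeout mentioned in this mail.
# The constant name itself is hypothetical.
KILL_TIMEOUT_SEC = 180.0


class WorkerAction(Enum):
    NONE = "none"  # worker is within bounds, nothing to do
    WARN = "warn"  # log a warning, the worker keeps running
    KILL = "kill"  # the worker exceeded the kill timeout


def check_worker(elapsed_sec, timeout_sec=KILL_TIMEOUT_SEC, already_warned=False):
    """Decide what to do with a worker that has been busy for elapsed_sec.

    Assumption: thresholds are inclusive and the warning fires at most once.
    """
    if elapsed_sec >= timeout_sec:
        return WorkerAction.KILL
    # Warn once the worker has used a third of its timeout (60 s by default).
    if elapsed_sec >= timeout_sec / 3 and not already_warned:
        return WorkerAction.WARN
    return WorkerAction.NONE
```

With the default, a worker that has been busy for 60 seconds gets a warning in the log and one that reaches 180 seconds is killed, so slow drivers show up in the logs well before anything fatal happens.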

Let me know if this covers what you need, or if there is anything else
we should do on the systemd side.

Cheers,

Tom
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
