On Thu, 2018-01-11 at 05:44 +0100, Frederic Weisbecker wrote:
> On Wed, Jan 10, 2018 at 08:19:49PM -0800, Linus Torvalds wrote:
> > On Wed, Jan 10, 2018 at 7:22 PM, Frederic Weisbecker
> > <frede...@kernel.org> wrote:
> > >
> > > Makes sense, but I think you need to keep the TASK_RUNNING check.
> >
> > Yes, good point.
> >
> > > So perhaps it should be:
> > >
> > > -       return tsk && (tsk->state == TASK_RUNNING);
> > > +       return (tsk == current) && (tsk->state == TASK_RUNNING);
> >
> > Looks good to me - definitely worth trying.
> >
> > Maybe that weakens the thing so much that it doesn't actually help
> > the UDP packet storm case?
> >
> > And maybe it's not sufficient for the dvb issue.
> >
> > But I think it's worth at least testing. Maybe it makes neither side
> > entirely happy, but maybe it might be a good halfway point?
>
> Yes I believe Dmitry is facing a different problem where he would
> rather see ksoftirqd scheduled more often to handle the queue as a
> deferred batch instead of having it served one by one on the tails
> of IRQ storms.
> (Dmitry correct me if I misunderstood).

Quite so. What I see is that ksoftirqd is rarely (close to never)
scheduled in the case of a UDP packet storm. That's because the upcoming
IRQ arrives too late in __do_softirq(). So there is no wakeup on a UDP
storm here:

:	pending = local_softirq_pending();
:	if (pending & mask) {
:		if (time_before(jiffies, end) && !need_resched() &&
:		    --max_restart)
:			goto restart;
:
:		wakeup_softirqd();
:	}

(as there is no pending softirq yet at that point). The IRQ comes a bit
too late to schedule ksoftirqd, and as a result the next softirq is
again processed in the context of the interrupted task, not in ksoftirqd.
That results in CPU-time starvation for the process during an IRQ storm.

While I saw that with an out-of-tree driver, I believe that at some
frequencies (lower than a storm) one can observe the same with mainline
drivers. And I *think* that I've reproduced it on mainline with the
virtio driver and a packet size of 1500 in VMs (though I don't quite
like perf testing in VMs).

So, IOW, maybe there is a somewhat better way to *detect* that the CPU
time spent on serving softirqs is close to a storm and that userspace is
starting to starve? (And, as a result, launch ksoftirqd, or balance
between deferring the softirq and serving it right there.)

> But your patch still seems to make sense for the case you described:
> when ksoftirqd is voluntarily preempted off and the current IRQ could
> handle the queue.
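
To illustrate the missed wakeup described above, here is a minimal userspace sketch in C. It is not kernel code: run_storm(), struct stats and the arrive_before_check flag are hypothetical names invented for this illustration. The model assumes each IRQ raises exactly one softirq, ignores the time_before() and need_resched() conditions, and does not model ksoftirqd actually running after a wakeup. It only counts how many handler passes run inline and how often the restart budget would lead to wakeup_softirqd().

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RESTART 10	/* assumed restart budget, mirroring max_restart above */

/* Hypothetical counters for the model; not a kernel structure. */
struct stats {
	int inline_runs;	/* handler passes run in the interrupted task's context */
	int wakeups;		/* times the model would call wakeup_softirqd() */
};

/*
 * Simulate n_irqs interrupts, each raising one softirq.
 *
 * arrive_before_check == true:  the next IRQ raises its softirq while the
 *   current one is still being handled, so the pending check sees it.
 * arrive_before_check == false: the next IRQ lands just after the pending
 *   check, so the check always sees nothing pending.
 */
static struct stats run_storm(int n_irqs, bool arrive_before_check)
{
	struct stats st = { 0, 0 };
	int remaining = n_irqs;

	assert(n_irqs >= 0);

	while (remaining > 0) {
		int max_restart = MAX_RESTART;
		bool pending;

		remaining--;		/* IRQ fires; on exit we enter the softirq loop */
restart:
		st.inline_runs++;	/* handlers run on the interrupted task's time */

		/* The pending check: did another IRQ sneak in before it? */
		pending = arrive_before_check && remaining > 0;
		if (pending) {
			remaining--;	/* that softirq is now accounted for */
			if (--max_restart)
				goto restart;

			st.wakeups++;	/* budget exhausted: defer to ksoftirqd */
		}
	}

	return st;
}
```

Under these assumptions, run_storm(100, false) gives 100 inline passes and no wakeups at all, which is the starvation pattern: every softirq is served on the tail of its own IRQ in the task's context. With run_storm(100, true) the restart budget is exhausted every 11 interrupts, giving 91 inline passes and 9 wakeups.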