On Fri, Jan 12, 2018 at 06:35:54AM +0100, Frederic Weisbecker wrote:
> Some softirq vectors can be more CPU hungry than others. Especially
> networking may sometimes deal with packet storm and need more CPU than
> IRQ tail can offer without inducing scheduler latencies. In this case
> the current code defers to ksoftirqd that behaves nicer. Now this nice
> behaviour can be bad for other IRQ vectors that usually need quick
> processing.
>
> To solve this we only defer to threading the vectors that outreached the
> time limit on IRQ tail processing and leave the others inline on real
> Soft-IRQs service. This is achieved using workqueues with
> per-CPU/per-vector worklets.
>
> Note ksoftirqd is not removed as it is still needed for threaded IRQs
> mode.
>
> Suggested-by: Linus Torvalds <torva...@linux-foundation.org>
> Signed-off-by: Frederic Weisbecker <frede...@kernel.org>
> Cc: Dmitry Safonov <d...@arista.com>
> Cc: Eric Dumazet <eduma...@google.com>
> Cc: Linus Torvalds <torva...@linux-foundation.org>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Cc: Andrew Morton <a...@linux-foundation.org>
> Cc: David Miller <da...@davemloft.net>
> Cc: Hannes Frederic Sowa <han...@stressinduktion.org>
> Cc: Ingo Molnar <mi...@kernel.org>
> Cc: Levin Alexander <alexander.le...@verizon.com>
> Cc: Paolo Abeni <pab...@redhat.com>
> Cc: Paul E. McKenney <paul...@linux.vnet.ibm.com>
> Cc: Radu Rendec <rren...@arista.com>
> Cc: Rik van Riel <r...@redhat.com>
> Cc: Stanislaw Gruszka <sgrus...@redhat.com>
> Cc: Thomas Gleixner <t...@linutronix.de>
> Cc: Wanpeng Li <wanpeng...@hotmail.com>
> ---
>  kernel/softirq.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 87 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index fa267f7..0c817ec6 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -74,6 +74,13 @@ struct softirq_stat {
>
>  static DEFINE_PER_CPU(struct softirq_stat, softirq_stat_cpu);
>
> +struct vector_work {
> +	int vec;
> +	struct work_struct work;
> +};
> +
> +static DEFINE_PER_CPU(struct vector_work[NR_SOFTIRQS], vector_work_cpu);
> +
>  /*
>   * we cannot loop indefinitely here to avoid userspace starvation,
>   * but we also don't want to introduce a worst case 1/HZ latency
> @@ -251,6 +258,70 @@ static inline bool lockdep_softirq_start(void) { return false; }
>  static inline void lockdep_softirq_end(bool in_hardirq) { }
>  #endif
>
> +static void vector_work_func(struct work_struct *work)
> +{
> +	struct vector_work *vector_work;
> +	u32 pending;
> +	int vec;
> +
> +	vector_work = container_of(work, struct vector_work, work);
> +	vec = vector_work->vec;
> +
> +	local_irq_disable();
> +	pending = local_softirq_pending();
> +	account_irq_enter_time(current);
> +	__local_bh_disable_ip(_RET_IP_, SOFTIRQ_OFFSET);
> +	lockdep_softirq_enter();
> +	set_softirq_pending(pending & ~(1 << vec));
> +	local_irq_enable();
> +
> +	if (pending & (1 << vec)) {
Ah I see the problem. Say in do_softirq() we had VECTOR 1 and 2 pending,
and we had overrun only VECTOR 1, so VECTOR 1 is enqueued to the
workqueue. Right after that we go back to the restart loop in
do_softirq() in order to handle the pending VECTOR 2, but doing so we
erase the local_softirq_pending() state. So when the workqueue runs, it
no longer sees VECTOR 1 pending and we lose it.

So I need to remove the below condition and make the vector work
unconditionally execute the vector callback.

Now I can go to sleep...

> +		struct softirq_action *sa = &softirq_vec[vec];
> +
> +		kstat_incr_softirqs_this_cpu(vec);
> +		trace_softirq_entry(vec);
> +		sa->action(sa);
> +		trace_softirq_exit(vec);
> +	}