On Wed, Mar 26, 2025 at 10:09:22AM +0800, LongPing Wei wrote:
> Hi, Eric
>
> > If the CPU has been in softirq context for too long, then stop
> > choosing softirq context and instead fall back to the traditional
> > workqueue. This could help address objections about increased use of
> > softirq context.
>
> Could you share me some example about this?
>

The objection that keeps getting raised to doing more work in softirq
context is that the latency of real-time tasks, such as audio tasks, may
increase because softirqs have a higher priority than all tasks, causing
audio skipping, etc.

The logic in handle_softirqs() in kernel/softirq.c actually goes a ways
towards addressing this already: it processes softirqs for at most 2 ms
or until rescheduling of the interrupted task gets requested, before
deferring them to ksoftirqd which runs at normal task priority.

The gaps that I see are (a) 2 ms is longer than desired, and (b) the
limit does not apply to *individual* softirqs.

Looking at (b), observe that the block softirq (blk_done_softirq())
completes a list of I/O requests. If there are a lot of requests in that
list, it could theoretically take more than the 2 ms limit that
kernel/softirq.c is meant to enforce (which is already too long).

So the objection would be that, even with dm-verity choosing to do
in-line verification only for 4 KiB requests which take only a few
microseconds each, a lot of requests could still add up to cause a long
time to be spent in a single softirq context.

(At least in theory. AFAIK no one has confirmed that this is actually a
problem in practice with the 4 KiB limit applied. But this is the
objection that I keep hearing whenever anyone suggests doing more in
softirqs.)

But if dm-verity could detect this case happening and start deferring
verification work to task context, that should mostly address this
concern, IMO.

One idea which *might* get us most of the way there pretty easily would
be to do the in-line verification only when need_resched() returns
false.

- Eric
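
As a rough illustration of the need_resched() idea above, here is a
minimal userspace model of the decision, not actual dm-verity code. The
names should_verify_inline(), model_need_resched(), resched_requested,
and INLINE_VERIFY_MAX_BYTES are hypothetical and exist only for this
sketch; the kernel's real need_resched() is replaced by a flag that the
caller sets.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cutoff: only requests this small are verified in-line. */
#define INLINE_VERIFY_MAX_BYTES 4096u

/*
 * Stand-in for the kernel's need_resched(). In this userspace model it
 * just reports a flag that the caller sets by hand.
 */
static bool resched_requested;

static bool model_need_resched(void)
{
	return resched_requested;
}

/*
 * Hypothetical policy helper. Returns true if the request should be
 * verified in-line (i.e. in softirq context), or false if verification
 * should be deferred to task context (e.g. a workqueue).
 */
static bool should_verify_inline(size_t req_bytes)
{
	/* Larger requests always go to task context. */
	if (req_bytes > INLINE_VERIFY_MAX_BYTES)
		return false;

	/* A reschedule has been requested: stop adding softirq work. */
	if (model_need_resched())
		return false;

	return true;
}
```

In this model a 4 KiB request is verified in-line only while no
reschedule is pending; once the flag is set, every request falls back to
the deferred path regardless of its size.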