On Mon, Dec 16, 2024 at 1:11 PM Alexander Lobakin <[email protected]> wrote:
>
> From: Brian Vazquez <[email protected]>
> Date: Mon, 16 Dec 2024 16:27:34 +0000
>
> > From: Marco Leogrande <[email protected]>
> >
> > When a workqueue is created with `WQ_UNBOUND`, its work items are
> > served by special worker-pools, whose host workers are not bound to
> > any specific CPU. In the default configuration (i.e. when
> > `queue_delayed_work` and friends do not specify which CPU to run the
> > work item on), `WQ_UNBOUND` allows the work item to be executed on any
> > CPU in the same node of the CPU it was enqueued on. While this
> > solution potentially sacrifices locality, it avoids contention with
> > other processes that might dominate the CPU time of the processor the
> > work item was scheduled on.
> >
> > This is not just a theoretical problem: in a particular scenario
> > misconfigured process was hogging most of the time from CPU0, leaving
> > less than 0.5% of its CPU time to the kworker. The IDPF workqueues
> > that were using the kworker on CPU0 suffered large completion delays
> > as a result, causing performance degradation, timeouts and eventual
> > system crash.
>
> Wasn't this inspired by [0]?
>
> [0]
> https://lore.kernel.org/netdev/[email protected]

The root cause is exactly the same, so I do see the similarity, and I'm
not surprised that both were addressed with a similar patch. We hit
this problem some time ago, and our first attempt at this fix was in
August [0].

[0] https://lore.kernel.org/netdev/[email protected]/

>
> Thanks,
> Olek
