On Fri, 11 Apr 2025, Benjamin Marzinski wrote:

> When using a kthread to delay the IOs, dm-delay would continuously loop,
> checking if IOs were ready to submit. It had a cond_resched() call in
> the loop, but might still loop hundreds of millions of times waiting for
> an IO that was scheduled to be submitted 10s of ms in the future. With
> the change to make dm-delay over zoned devices always use kthreads
> regardless of the length of the delay, this wasted work only gets worse.
> 
> To solve this and still keep roughly the same precision for very short
> delays, dm-delay now calls fsleep() for 1/8th of the smallest non-zero
> delay it will place on IOs, or 1 ms, whichever is smaller. The reason
> that dm-delay doesn't just use the actual expiration time of the next
> delayed IO to calculate the sleep time is that delay_dtr() must wait
> for the kthread to finish before deleting the table. If a zoned device
> with a long delay queued an IO shortly before being suspended and
> removed, the IO would be flushed in delay_presuspend(), but removing
> the device would still have to wait for the remainder of the long delay.
> This time is now capped at 1 ms.
> 
> Fixes: 70bbeb29fab09 ("dm delay: for short delays, use kthread instead of timers and wq")
> Signed-off-by: Benjamin Marzinski <bmarz...@redhat.com>
> ---
> This patch is meant to apply on top of Damien Le Moal's "dm-delay:
> Prevent zoned write reordering on suspend" patch. If people think it's
> important to avoid either this much smaller amount of looping or the
> possible 1 ms delay on deleting a table, I can send a patch that uses
> usleep_range_state() and msleep_interruptible() to do an interruptible
> sleep with a duration based on the expiration time of the next delayed
> IO.

Hi

worker_sleep_ns should be worker_sleep_us, as the value is in 
microseconds.
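
To make the unit explicit, here is a minimal userspace sketch of the
interval described in the commit message: 1/8th of the smallest non-zero
delay, capped at 1 ms, expressed in microseconds. This is not the code from
the patch. The helper names (min_nonzero, calc_worker_sleep_us), the
MAX_SLEEP_US constant, and the assumption that the three delays are given in
milliseconds with 0 meaning "no delay" are all illustrative.

```c
#include <stdint.h>

/* Assumed cap from the commit message: 1 ms, in microseconds. */
#define MAX_SLEEP_US 1000u

/* Hypothetical helper: smaller of two values, treating 0 as "not set". */
static uint32_t min_nonzero(uint32_t a, uint32_t b)
{
	if (!a)
		return b;
	if (!b)
		return a;
	return a < b ? a : b;
}

/*
 * Hypothetical helper: delays are in milliseconds, the result is in
 * microseconds. Returns 0 when no delay is configured at all.
 */
static uint32_t calc_worker_sleep_us(uint32_t read_ms, uint32_t write_ms,
				     uint32_t flush_ms)
{
	uint32_t min_ms = min_nonzero(min_nonzero(read_ms, write_ms), flush_ms);
	uint64_t us;

	if (!min_ms)
		return 0;

	/* ms -> us, then take 1/8th; 64-bit to avoid overflow on large delays. */
	us = (uint64_t)min_ms * 1000 / 8;

	return us < MAX_SLEEP_US ? (uint32_t)us : MAX_SLEEP_US;
}
```

With these assumptions, a 1 ms delay gives a 125 us sleep, and any smallest
delay of 8 ms or more hits the 1000 us cap.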

fsleep in flush_worker_fn should be called unconditionally, to not consume 
100% CPU when suspending.

cond_resched() shouldn't be removed because fsleep may fall back to 
udelay.

The patch should increase the target version.

I fixed the patch so that it applies on top of Linus' tree and applied 
it to the linux-dm tree.

BTW, do we need to backport this to the stable kernels? I think not, but 
if you have a reason why we should backport it, please explain it.

Mikulas

