Hello, Joseph.

On Wed, Feb 07, 2018 at 04:40:02PM +0800, Joseph Qi wrote:
> writeback kworker
>   blkcg_bio_issue_check
>     rcu_read_lock
>     blkg_lookup
>     <<< *race window*
>     blk_throtl_bio
>       spin_lock_irq(q->queue_lock)
>       spin_unlock_irq(q->queue_lock)
>     rcu_read_unlock
>
> cgroup_rmdir
>   cgroup_destroy_locked
>     kill_css
>       css_killed_ref_fn
>         css_killed_work_fn
>           offline_css
>             blkcg_css_offline
>               spin_trylock(q->queue_lock)
>               blkg_destroy
>               spin_unlock(q->queue_lock)

Ah, right. Thanks for spotting the bug.

> Since rcu can only prevent blkg from being released while it is in use,
> the blkg->refcnt can drop to 0 during blkg_destroy, which schedules
> blkg release.
> A subsequent blkg_get in blk_throtl_bio will then trigger the WARNING,
> and the corresponding blkg_put will schedule blkg release again,
> which results in a double free.
> This race was introduced by commit ae1188963611 ("blkcg: consolidate blkg
> creation in blkcg_bio_issue_check()"). Before that commit, the code would
> look up first and then try to lookup/create again under queue_lock. So
> revive this logic to fix the race.

The change seems a bit drastic to me. Can't we do something like the
following instead?

  blk_throtl_bio()
  {
	... non throttled cases ...

	/* out-of-limit, queue to @tg */

	/*
	 * We can look up and retry but the race window is tiny here.
	 * Just letting it through should be good enough.
	 */
	if (!css_tryget(&blkcg->css))
		goto out;

	... actual queueing ...
	css_put(&blkcg->css);
	...
  }
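
To make the tryget semantics concrete, here is a rough userspace model of
the pattern. It is only a sketch: struct ref, ref_init(), ref_tryget(),
ref_put() and throttle_or_pass() are hypothetical names made up for
illustration, and a plain C11 atomic counter stands in for the kernel's
real css reference counting. The point is just that a "get" must fail once
the count has reached zero, so the release is never scheduled twice.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a css-style refcount; not the kernel type. */
struct ref {
	atomic_int count;
};

static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);	/* base reference held by the owner */
}

/* Take a reference only if the count has not already reached zero. */
static bool ref_tryget(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; true means this was the last one (release now). */
static bool ref_put(struct ref *r)
{
	return atomic_fetch_sub(&r->count, 1) == 1;
}

/* Models the out-of-limit path: queue only if we can still pin the owner. */
static bool throttle_or_pass(struct ref *r)
{
	if (!ref_tryget(r))
		return false;	/* owner is going away: let the bio through */

	/* ... actual queueing would happen here ... */

	ref_put(r);
	return true;
}
```

With the owner still alive, throttle_or_pass() takes and drops a temporary
reference and the count ends up where it started. Once the owner has dropped
the base reference, ref_tryget() fails and nothing is ever put again, so the
release runs exactly once.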

Thanks.

--
tejun