On Fri, Feb 23, 2018 at 09:56:54AM +0800, xuejiufei wrote:
> > On Thu, Feb 22, 2018 at 02:14:34PM +0800, Joseph Qi wrote:
> >> I still don't get how css_tryget can work here.
> >> The race happens when:
> >> 1) writeback kworker has found the blkg with rcu;
> >> 2) blkcg is being offlined and blkg_destroy() has already been called.
> >> Then, writeback kworker will take queue lock and access the blkg with
> >> refcount 0.
> > Yeah, then tryget would fail and it should go through the root.
> In this race, the blkg's refcount drops to zero and the blkg is destroyed.
> However, the css may still hold references, so css_tryget can succeed
> before the other holders put their references.
> So I don't get how css_tryget can fix this race. Or could we add
> another function, blkg_tryget?
IIRC, as long as the blkcg and the device are there, the blkgs aren't
going to be destroyed. So, if you hold a ref to the blkcg through
tryget, the blkg shouldn't go away.
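
To make the tryget semantics concrete, here is a rough userspace sketch
using C11 atomics. It is not kernel code: model_ref, model_tryget,
model_put and model_pick are made-up names for illustration, and it
only models "take a ref unless the count already hit zero, otherwise
fall back to the root". Whether pinning the blkcg this way really keeps
the blkg alive is the assumption being discussed above, not something
the sketch proves.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's css or blkg structures. */
struct model_ref {
	atomic_int count;	/* 0 means teardown has already started */
};

static void model_ref_init(struct model_ref *ref, int initial)
{
	atomic_init(&ref->count, initial);
}

/* Take a reference only if the count is still nonzero (tryget semantics). */
static bool model_tryget(struct model_ref *ref)
{
	int old = atomic_load(&ref->count);

	while (old != 0) {
		/* On failure, old is refreshed with the current count and we retry. */
		if (atomic_compare_exchange_weak(&ref->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool model_put(struct model_ref *ref)
{
	int old = atomic_fetch_sub(&ref->count, 1);

	assert(old > 0);
	return old == 1;
}

/*
 * Return child_id if the owner could be pinned, root_id otherwise.
 * When child_id is returned, the caller holds one extra reference on
 * owner and must drop it with model_put().
 */
static int model_pick(struct model_ref *owner, int child_id, int root_id)
{
	return model_tryget(owner) ? child_id : root_id;
}
```

The point of the compare-and-exchange loop is that a count which has
already reached zero can never be raised again, so a caller racing with
teardown sees the failure and takes the root path instead.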