On Fri, 13 Apr 2018 13:28:23 -0700 Khazhismel Kumykov <kha...@google.com> wrote:

> shrink_dcache_parent may spin waiting for a parallel shrink_dentry_list.
> In this case we may have 0 dentries to dispose, so we will never
> schedule out while waiting for the parallel shrink_dentry_list to
> complete.
> 
> Tested that this fixes syzbot reports of stalls in shrink_dcache_parent()

Well I guess the patch is OK as a stopgap, but things seem fairly
messed up in there.  shrink_dcache_parent() shouldn't be doing a
busywait, waiting for the concurrent shrink_dentry_list().

Either we should be waiting (sleeping) for the concurrent operation to
complete or we should just bail out of shrink_dcache_parent(), perhaps
with 

        if (list_empty(&data.dispose))
                break;

or similar.  Dunno.
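
To make the difference concrete, here is a small userspace model of the
retry loop.  It is a sketch under assumptions, not kernel code:
walk_model, walk_once() and shrink_model() are hypothetical names, the
concurrent shrinker is assumed to be stalled so its dentries never go
away, and max_passes exists only to keep the model finite.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct select_data: only two counts matter. */
struct walk_model {
    int found;      /* dentries on another shrink list + dentries we collected */
    int collected;  /* dentries moved to our own dispose list */
};

/*
 * One simulated tree walk over `unused` dentries we can take and `busy`
 * dentries that a concurrent shrinker already owns.
 */
static struct walk_model walk_once(int unused, int busy)
{
    struct walk_model w = { .found = unused + busy, .collected = unused };
    return w;
}

/* Returns the number of passes made, capped at max_passes. */
static int shrink_model(int unused, int busy, bool bail_if_empty, int max_passes)
{
    int passes = 0;

    while (passes < max_passes) {
        struct walk_model w = walk_once(unused, busy);

        passes++;
        assert(w.collected <= w.found);
        if (!w.found)
            break;              /* nothing left anywhere: done */
        if (bail_if_empty && w.collected == 0)
            break;              /* only someone else's work remains */
        unused -= w.collected;  /* models disposing of our own list */
    }
    return passes;
}
```

With 3 unused and 2 busy dentries, shrink_model(3, 2, false, 1000) uses
all 1000 passes, which is the busywait.  With bail_if_empty set it
returns after 2 passes.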


That block comment over `struct select_data' is not a good one.  "It
returns zero iff...".  *What* returns zero?  select_collect()?  No it
doesn't, it returns an `enum d_walk_ret'.  Perhaps the comment is
trying to refer to select_data.found.  And the real interpretation of
select_data.found is, umm, hard to describe.  "Counts the number of
dentries which are on a shrink list or which were moved to the dispose
list".  Why?  What's that all about?

This code needs a bit of thought, documentation and perhaps a redo,
I suspect.
