On Thu, Apr 04, 2019 at 01:52:55PM +0300, Amir Goldstein wrote:
> This looks like an old bug, pre-dating the "Fixes" commit, but the
> "Fixes" commit did not handle it properly.
> 
> The bug recently surfaced as a lockdep possible deadlock warning
> with commit d1d04ef8572b ("ovl: stack file ops").
> 
> When acct_on() replaces one acct file with another, it takes sb_writers
> lock on new file sb and calls acct_pin_kill(old) before releasing the
> sb_writers lock.
>
> If new file is on the same fs as old file, acct_pin_kill(old) fails to
> file_start_write_trylock() and skips writing the old file, because
> sb_writers (of new) is already taken by acct_on().
> 
> If new file is not on same fs as old file, this ordering violation
> creates an unneeded dependency between new sb_writers and old sb_writers,
> which may later be reported as possible deadlock.
> 
> This could result in an actual deadlock if acct file is replaced from
> an old file in overlayfs over "real fs" to a new file in "real fs".
> acct_on() takes freeze protection on "real fs" and tries to write to
> overlayfs file. overlayfs is not freeze protected so do_acct_process()
> can carry on with __kernel_write() to overlayfs file, which would
> try to take freeze protection on "real fs" and deadlock.

Huh?  sb_writers is taken when we *open* the new file.  Then we replace
its ->path.mnt with a clone and transfer the write count from the original
to new one.  And close the old file while we are at it.

From sb_writers POV mainline has
        sb_start_write(new_sb)  // in open
        sb_start_write(new_sb)  // mnt_want_write() on clone
        last write to old_sb, then sb_end_write(old_sb) // acct_pin_kill()
        sb_end_write(new_sb)    // mnt_drop_write(mnt)
and you flip the order of the last two lines.
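
To make the comparison concrete, here is a toy userspace model of the two
orderings. It is only a sketch: the toy_* names and the `patched` flag are
made up for illustration and are not kernel APIs, the old file is assumed
to still hold one write count on old_sb from the earlier acct_on(), and the
trylock/unlock pair around the last write to the old file is reduced to a
single logged "write" event.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a superblock's sb_writers count; not the kernel type. */
struct toy_sb {
	const char *name;
	int writers;
};

/* Event log, so the two orderings can be compared afterwards. */
struct toy_trace {
	char buf[256];
};

static void toy_log(struct toy_trace *t, const char *op,
		    const struct toy_sb *sb)
{
	size_t used = strlen(t->buf);

	snprintf(t->buf + used, sizeof(t->buf) - used, "%s(%s);", op, sb->name);
}

static void toy_start_write(struct toy_trace *t, struct toy_sb *sb)
{
	sb->writers++;
	toy_log(t, "start", sb);
}

static void toy_end_write(struct toy_trace *t, struct toy_sb *sb)
{
	assert(sb->writers > 0);	/* catch an unbalanced release */
	sb->writers--;
	toy_log(t, "end", sb);
}

/*
 * Replay the four steps listed above.  patched == 0 keeps the mainline
 * order; patched != 0 swaps the last two steps.
 */
void toy_acct_replace(struct toy_trace *t, struct toy_sb *old_sb,
		      struct toy_sb *new_sb, int patched)
{
	toy_start_write(t, new_sb);		/* in open */
	toy_start_write(t, new_sb);		/* mnt_want_write() on clone */

	if (!patched) {
		toy_log(t, "write", old_sb);	/* last write to old file */
		toy_end_write(t, old_sb);	/* acct_pin_kill() */
		toy_end_write(t, new_sb);	/* mnt_drop_write(mnt) */
	} else {
		toy_end_write(t, new_sb);	/* mnt_drop_write(mnt) */
		toy_log(t, "write", old_sb);	/* last write to old file */
		toy_end_write(t, old_sb);	/* acct_pin_kill() */
	}
}
```

Starting from old_sb.writers == 1 and new_sb.writers == 0, both variants end
with old_sb at 0 and new_sb at 1 (the count held by the newly opened file).
The only difference in this model is whether the last write to the old file
happens while the second count on new_sb is still held.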

Could you explain how exactly your patch helps with whatever
problem overlayfs has?
