On Wed, Aug 21, 2019 at 04:14:53PM +0200, David Sterba wrote:
> On Wed, Aug 21, 2019 at 03:24:46PM +0200, David Sterba wrote:
> > On Wed, Aug 21, 2019 at 03:20:21PM +0200, David Sterba wrote:
> > > On Tue, Aug 13, 2019 at 10:33:42AM -0700, Omar Sandoval wrote:
> > > > From: Omar Sandoval <osan...@fb.com>
> > > > 
> > > > This does some cleanups to the Btrfs workqueue code following my
> > > > previous fix [1]. Changed since v1 [2]:
> > > > 
> > > > - Removed errant Fixes: tag in patch 1
> > > > - Fixed a comment typo in patch 2
> > > > - Added NB: to comments in patch 2
> > > > - Added Nikolay and Filipe's reviewed-by tags
> > > > 
> > > > Thanks!
> > > > 
> > > > 1: 
> > > > https://lore.kernel.org/linux-btrfs/0bea516a54b26e4e1c42e6fe47548cb48cc4172b.1565112813.git.osan...@fb.com/
> > > > 2: 
> > > > https://lore.kernel.org/linux-btrfs/cover.1565680721.git.osan...@fb.com/
> > > > 
> > > > Omar Sandoval (2):
> > > >   Btrfs: get rid of unique workqueue helper functions
> > > >   Btrfs: get rid of pointless wtag variable in async-thread.c
> > > 
> > > The patches seem to cause crashes inside the worques, happend several
> > > times in random patches, sample stacktrace below. This blocks me from
> > > testing so I'll move the patches out of misc-next for now and add back
> > > once there's a fix.
> > 
> > Another possibility is that the cleanup patches make it more likely to
> > happen and the root cause is "Btrfs: fix workqueue deadlock on dependent
> > filesystems".
> 
> With just the deadlock fix on top<F2>, crashed in btrfs/011;
> 

It's because we're doing

wq = work->wq;

after we've done set_bit(WORK_DONE_BIT, &work->flags).  The work could be freed
immediately after this and thus doing work->wq could give you garbage.  Could
you try with this diff applied and verify it goes away?

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index 6b8ad4a1b568..10a04b99798a 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -252,9 +252,9 @@ static inline void thresh_exec_hook(struct 
__btrfs_workqueue *wq)
        }
 }
 
-static void run_ordered_work(struct btrfs_work *self)
+static void run_ordered_work(struct __btrfs_workqueue *wq,
+                            struct btrfs_work *self)
 {
-       struct __btrfs_workqueue *wq = self->wq;
        struct list_head *list = &wq->ordered_list;
        struct btrfs_work *work;
        spinlock_t *lock = &wq->list_lock;
@@ -356,7 +356,7 @@ static void normal_work_helper(struct btrfs_work *work)
        work->func(work);
        if (need_order) {
                set_bit(WORK_DONE_BIT, &work->flags);
-               run_ordered_work(work);
+               run_ordered_work(wq, work);
        }
        if (!need_order)
                trace_btrfs_all_work_done(wq->fs_info, wtag);

Reply via email to