Sadly, this doesn't help and seems to make the situation worse. Our automated tests were previously seeing about a 5% failure rate, and with this patch it's 20%. We still need to verify that they're all down to the same failure, but I thought it better to give some early feedback.
Surprised that the script didn't run for you, but the limit on dirty bytes was just to get it to occur more frequently. You may have success with larger values.

Mark

Message: 1
Date: Fri, 15 Mar 2019 21:58:12 +0100
From: Andreas Gruenbacher <[email protected]>
To: Ross Lagerwall <[email protected]>
Cc: [email protected], Sergey Dyasli <[email protected]>, Mark Syms <[email protected]>, [email protected]
Subject: Re: [Cluster-devel] [PATCH] gfs2: Prevent writeback in gfs2_file_write_iter
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8

Hi Ross,

On Thu, 14 Mar 2019 at 12:18, Ross Lagerwall <[email protected]> wrote:
> On 3/13/19 5:13 PM, Andreas Gruenbacher wrote:
> > Hi Edwin,
> >
> > On Wed, 6 Mar 2019 at 12:08, Edwin Török <[email protected]> wrote:
> >> Hello,
> >>
> >> I've been trying to debug a GFS2 deadlock that we see in our lab
> >> quite frequently with a 4.19 kernel. With 4.4 and older kernels we
> >> were not able to reproduce this.
> >> See below for lockdep dumps and stacktraces.
> >
> > thanks for the thorough bug report. Does the below fix work for
> > you?
> >
> Hi Andreas,
>
> I've tested the patch and it doesn't fix the issue. As far as I can see,
> current->backing_dev_info is not used by any of the code called from
> balance_dirty_pages_ratelimited() so I don't see how it could work.

Yes, I see now.
> I found a way of consistently reproducing the issue almost immediately
> (tested with the latest master commit):
>
> # cat a.py
> import os
>
> fd = os.open("f", os.O_CREAT|os.O_TRUNC|os.O_WRONLY)
>
> for i in range(1000):
>     os.mkdir("xxx" + str(i), 0777)
>
> buf = 'x' * 4096
>
> while True:
>     count = os.write(fd, buf)
>     if count <= 0:
>         break
>
> # cat b.py
> import os
> while True:
>     os.mkdir("x", 0777)
>     os.rmdir("x")
>
> # echo 8192 > /proc/sys/vm/dirty_bytes
> # cd /gfs2mnt
> # (mkdir tmp1; cd tmp1; python2 ~/a.py) &
> # (mkdir tmp2; cd tmp2; python2 ~/a.py) &
> # (mkdir tmp3; cd tmp3; python2 ~/b.py) &
>
> This should deadlock almost immediately. One of the processes will be
> waiting in balance_dirty_pages() and holding sd_log_flush_lock and
> several others will be waiting for sd_log_flush_lock.

This doesn't work for me: the python processes don't even start properly when dirty_bytes is set so low.

> I came up with the following patch which seems to resolve the issue by
> failing to write the inode if it can't take the lock, but it seems
> like a dirty workaround rather than a proper fix:
>
> [...]

Looking at ext4_dirty_inode, it seems that we should just be able to bail out of gfs2_write_inode and return 0 when PF_MEMALLOC is set in current->flags.

Also, we should probably add the current->flags checks from xfs_do_writepage to gfs2_writepage_common.

So what do you get with the below patch?
Thanks,
Andreas

---
 fs/gfs2/aops.c  | 7 +++++++
 fs/gfs2/super.c | 4 ++++
 2 files changed, 11 insertions(+)

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index 05dd78f..694ff91 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -102,6 +102,13 @@ static int gfs2_writepage_common(struct page *page,
 	pgoff_t end_index = i_size >> PAGE_SHIFT;
 	unsigned offset;
 
+	/* (see xfs_do_writepage) */
+	if (WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD)) ==
+			 PF_MEMALLOC))
+		goto redirty;
+	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC_NOFS))
+		goto redirty;
+
 	if (gfs2_assert_withdraw(sdp, gfs2_glock_is_held_excl(ip->i_gl)))
 		goto out;
 	if (current->journal_info)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index ca71163..540535c 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -756,6 +756,10 @@ static int gfs2_write_inode(struct inode *inode, struct writeback_control *wbc)
 	int ret = 0;
 	bool flush_all = (wbc->sync_mode == WB_SYNC_ALL || gfs2_is_jdata(ip));
 
+	/* (see ext4_dirty_inode) */
+	if (current->flags & PF_MEMALLOC)
+		return 0;
+
 	if (flush_all)
 		gfs2_log_flush(GFS2_SB(inode), ip->i_gl,
 			       GFS2_LOG_HEAD_FLUSH_NORMAL |
-- 
1.8.3.1

------------------------------

_______________________________________________
Cluster-devel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/cluster-devel

End of Cluster-devel Digest, Vol 154, Issue 8
*********************************************
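
For readers following the flag checks in the patch above, here is a minimal userspace sketch of the two predicates. It is not kernel code. The function names writepage_should_redirty and write_inode_should_skip are hypothetical, and the flag values are illustrative assumptions (the authoritative definitions are in the kernel's include/linux/sched.h and may differ between versions).

```python
# Illustrative flag values (assumption): check include/linux/sched.h
# in the kernel tree you are actually building against.
PF_MEMALLOC = 0x00000800       # task is allocating memory (reclaim context)
PF_KSWAPD = 0x00020000         # task is kswapd
PF_MEMALLOC_NOFS = 0x00040000  # allocations must not recurse into the fs


def writepage_should_redirty(flags):
    """Hypothetical model of the checks added to gfs2_writepage_common."""
    # PF_MEMALLOC set without PF_KSWAPD: reclaim entered from some
    # task other than kswapd, so the page is redirtied instead of written.
    if (flags & (PF_MEMALLOC | PF_KSWAPD)) == PF_MEMALLOC:
        return True
    # A no-fs allocation scope also redirties the page.
    if flags & PF_MEMALLOC_NOFS:
        return True
    return False


def write_inode_should_skip(flags):
    """Hypothetical model of the early return added to gfs2_write_inode."""
    return bool(flags & PF_MEMALLOC)


if __name__ == "__main__":
    for name, flags in [
        ("plain task", 0),
        ("PF_MEMALLOC only", PF_MEMALLOC),
        ("kswapd", PF_MEMALLOC | PF_KSWAPD),
        ("NOFS scope", PF_MEMALLOC_NOFS),
    ]:
        print(name, writepage_should_redirty(flags),
              write_inode_should_skip(flags))
```

Note the asymmetry this makes visible: in the sketch, kswapd (PF_MEMALLOC together with PF_KSWAPD) is still allowed through the writepage check, while the write_inode check bails out for any task with PF_MEMALLOC set.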
