Re: [Cluster-devel] [PATCH v7 5/5] gfs2: Fix iomap write page reclaim deadlock

2019-04-30 Thread Darrick J. Wong
On Tue, Apr 30, 2019 at 12:09:34AM +0200, Andreas Gruenbacher wrote: > Since commit 64bc06bb32ee ("gfs2: iomap buffered write support"), gfs2 is doing buffered writes by starting a transaction in iomap_begin, writing a range of pages, and ending that transaction in iomap_end. This approach

[Cluster-devel] [PATCH] gfs2: fix race between gfs2_freeze_func and unmount

2019-04-30 Thread Abhi Das
gfs2_unfreeze() doesn't wait for gfs2_freeze_func() to complete. If a umount is issued right after unfreeze, it could result in an inconsistent filesystem because some journal data (statfs update) wasn't written out. This patch causes gfs2_unfreeze() to wait for gfs2_freeze_func() to complete

Re: [Cluster-devel] [PATCH v7 4/5] iomap: Add a page_prepare callback

2019-04-30 Thread Darrick J. Wong
On Tue, Apr 30, 2019 at 12:09:33AM +0200, Andreas Gruenbacher wrote: > Move the page_done callback into a separate iomap_page_ops structure and add a page_prepare callback to be called before the next page is written to. In gfs2, we'll want to start a transaction in page_prepare and end it

Re: [Cluster-devel] [PATCH v7 5/5] gfs2: Fix iomap write page reclaim deadlock

2019-04-30 Thread Andreas Gruenbacher
On Tue, 30 Apr 2019 at 17:33, Darrick J. Wong wrote: > On Tue, Apr 30, 2019 at 12:09:34AM +0200, Andreas Gruenbacher wrote: > > Since commit 64bc06bb32ee ("gfs2: iomap buffered write support"), gfs2 is doing buffered writes by starting a transaction in iomap_begin, writing a range of

Re: [Cluster-devel] [PATCH v7 5/5] gfs2: Fix iomap write page reclaim deadlock

2019-04-30 Thread Darrick J. Wong
On Tue, Apr 30, 2019 at 05:39:28PM +0200, Andreas Gruenbacher wrote: > On Tue, 30 Apr 2019 at 17:33, Darrick J. Wong wrote: > > On Tue, Apr 30, 2019 at 12:09:34AM +0200, Andreas Gruenbacher wrote: > > > Since commit 64bc06bb32ee ("gfs2: iomap buffered write support"), gfs2 is doing

Re: [Cluster-devel] [PATCH v7 5/5] gfs2: Fix iomap write page reclaim deadlock

2019-04-30 Thread Andreas Grünbacher
On Tue, 30 Apr 2019 at 17:48, Darrick J. Wong wrote: > Ok, I'll take the first four patches through the iomap branch and cc you on the pull request. Ok great, thanks. Andreas

Re: [Cluster-devel] [PATCH v2] gfs2: fix race between gfs2_freeze_func and unmount

2019-04-30 Thread Andreas Gruenbacher
On Tue, 30 Apr 2019 at 23:54, Abhi Das wrote: > As part of the freeze operation, gfs2_freeze_func() is left blocking on a request to hold the sd_freeze_gl in SH. This glock is held in EX by the gfs2_freeze() code. > A subsequent call to gfs2_unfreeze() releases the EXclusively held

[Cluster-devel] [PATCH v2] gfs2: fix race between gfs2_freeze_func and unmount

2019-04-30 Thread Abhi Das
As part of the freeze operation, gfs2_freeze_func() is left blocking on a request to hold the sd_freeze_gl in SH. This glock is held in EX by the gfs2_freeze() code. A subsequent call to gfs2_unfreeze() releases the EXclusively held sd_freeze_gl, which allows gfs2_freeze_func() to acquire it in

Re: [Cluster-devel] [GFS2 PATCH v3 00/19] gfs2: misc recovery patch collection

2019-04-30 Thread Steven Whitehouse
Hi, On 01/05/2019 00:03, Bob Peterson wrote: Here is version 3 of the patch set I posted on 23 April. It is revised based on bugs I found testing with xfstests. This is a collection of patches I've been using to address the myriad of recovery problems I've found. I'm still finding them, so the

Re: [Cluster-devel] [GFS2 PATCH v3 09/19] gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn

2019-04-30 Thread Steven Whitehouse
Hi, On 01/05/2019 00:03, Bob Peterson wrote: This patch addresses various problems with gfs2/dlm recovery. For example, suppose a node with a bunch of gfs2 mounts suddenly reboots due to kernel panic, and dlm determines it should perform recovery. DLM does so from a pseudo-state machine

Re: [Cluster-devel] [PATCH v7 0/5] iomap and gfs2 fixes

2019-04-30 Thread Dave Chinner
On Mon, Apr 29, 2019 at 07:50:28PM -0700, Darrick J. Wong wrote: > On Tue, Apr 30, 2019 at 12:09:29AM +0200, Andreas Gruenbacher wrote: > > Here's another update of this patch queue, hopefully with all wrinkles ironed out now. > > Darrick, I think Linus would be unhappy seeing the first

Re: [Cluster-devel] [PATCH] gfs2: fix race between gfs2_freeze_func and unmount

2019-04-30 Thread Abhijith Das
NACK. Andreas mentioned that the description could be more descriptive and that we should be using clear_bit_unlock() instead of clear_bit(). I'll post a v2 shortly with these changes. Cheers! --Abhi On Tue, Apr 30, 2019 at 12:48 PM Abhi Das wrote: > gfs2_unfreeze() doesn't wait for

[Cluster-devel] [GFS2 PATCH v3 16/19] gfs2: simply gfs2_freeze by removing case

2019-04-30 Thread Bob Peterson
Function gfs2_freeze had a case statement that simply checked the error code, but the break statements just made the logic hard to read. This patch simplifies the logic in favor of a simple if. Signed-off-by: Bob Peterson --- fs/gfs2/super.c | 10 ++ 1 file changed, 2 insertions(+), 8

[Cluster-devel] [GFS2 PATCH v3 09/19] gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn

2019-04-30 Thread Bob Peterson
This patch addresses various problems with gfs2/dlm recovery. For example, suppose a node with a bunch of gfs2 mounts suddenly reboots due to kernel panic, and dlm determines it should perform recovery. DLM does so from a pseudo-state machine calling various callbacks into lock_dlm to perform a

[Cluster-devel] [GFS2 PATCH v3 15/19] gfs2: Force withdraw to replay journals and wait for it to finish

2019-04-30 Thread Bob Peterson
When a node withdraws from a file system, it often leaves its journal in an incomplete state. This is especially true when the withdraw is caused by io errors writing to the journal. Before this patch, a withdraw would try to write a "shutdown" record to the journal, tell dlm it's done with the

[Cluster-devel] [GFS2 PATCH v3 07/19] gfs2: Only complain the first time an io error occurs in quota or log

2019-04-30 Thread Bob Peterson
Before this patch, all io errors received by the quota daemon or the logd daemon would cause a complaint message to be issued, such as: gfs2: fsid=dm-13.0: Error 10 writing to journal, jid=0 This patch changes it so that the error message is only issued the first time the error is

[Cluster-devel] [GFS2 PATCH v3 19/19] gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty

2019-04-30 Thread Bob Peterson
Before this patch, if gfs2_ail_empty_gl saw there was nothing on the ail list, it would return and not flush the log. The problem is that there could still be a revoke for the rgrp sitting on the sd_log_le_revoke list that's been recently taken off the ail list. But that revoke still needs to be

[Cluster-devel] [GFS2 PATCH v3 11/19] gfs2: Allow some glocks to be used during withdraw

2019-04-30 Thread Bob Peterson
Before this patch, when a file system was withdrawn, all further attempts to enqueue or promote glocks were rejected and returned -EIO. This is only important for media-backed glocks like inode and rgrp glocks. All other glocks may be safely used because there is no potential for metadata

[Cluster-devel] [GFS2 PATCH v3 05/19] gfs2: Introduce concept of a pending withdraw

2019-04-30 Thread Bob Peterson
File system withdraws can be delayed when inconsistencies are discovered when we cannot withdraw immediately, for example, when critical spin_locks are held. But delaying the withdraw can cause gfs2 to ignore the error and keep running for a short period of time. For example, an rgrp glock may be

[Cluster-devel] [GFS2 PATCH v3 06/19] gfs2: log error reform

2019-04-30 Thread Bob Peterson
Before this patch, gfs2 kept track of journal io errors in two places: sd_log_error and the SDF_AIL1_IO_ERROR flag in sd_flags. This patch consolidates the two into sd_log_error so that it reflects the first error encountered writing to the journal. In future patches, we will take advantage of this

[Cluster-devel] [GFS2 PATCH v3 12/19] gfs2: Don't loop forever in gfs2_freeze if withdrawn

2019-04-30 Thread Bob Peterson
Before this patch, function gfs2_freeze would loop forever if the file system being frozen is withdrawn. That's because function gfs2_lock_fs_check_clean tries to enqueue the glock of the journal and the gfs2_glock returns -EIO because you can't enqueue a journaled glock after a withdraw.

[Cluster-devel] [GFS2 PATCH v3 17/19] gfs2: Add verbose option to check_journal_clean

2019-04-30 Thread Bob Peterson
Before this patch, function check_journal_clean would give messages related to journal recovery. That's fine for mount time, but when a node withdraws and forces replay that way, we don't want all those distracting and misleading messages. This patch adds a new parameter to make those messages

[Cluster-devel] [GFS2 PATCH v3 18/19] gfs2: Check for log write errors before telling dlm to unlock

2019-04-30 Thread Bob Peterson
Before this patch, function do_xmote just assumed all the writes submitted to the journal were finished and successful, and it called the go_unlock function to release the dlm lock. But if they're not, and a revoke failed to make its way to the journal, a journal replay on another node will cause

[Cluster-devel] [GFS2 PATCH v3 01/19] gfs2: kthread and remount improvements

2019-04-30 Thread Bob Peterson
Before this patch, gfs2 saved the pointers to the two daemon threads (logd and quotad) in the superblock, but they were never cleared, even if the threads were stopped (e.g. on remount -o ro). That meant that certain error conditions (like a withdrawn file system) could race. For example, xfstests

[Cluster-devel] [GFS2 PATCH v3 00/19] gfs2: misc recovery patch collection

2019-04-30 Thread Bob Peterson
Here is version 3 of the patch set I posted on 23 April. It is revised based on bugs I found testing with xfstests. This is a collection of patches I've been using to address the myriad of recovery problems I've found. I'm still finding them, so the battle is not won yet. I'm not convinced we

[Cluster-devel] [GFS2 PATCH v3 08/19] gfs2: Stop ail1 wait loop when withdrawn

2019-04-30 Thread Bob Peterson
Before this patch, function gfs2_log_flush could get into an infinite loop trying to clear out its ail1 list. If the file system was withdrawn (or pending withdraw) due to a problem with writing the ail1 list, it would never clear out the list, and therefore, would loop infinitely. This patch

[Cluster-devel] [GFS2 PATCH v3 03/19] gfs2: log which portion of the journal is replayed

2019-04-30 Thread Bob Peterson
When a journal is replayed, gfs2 logs a message similar to: jid=X: Replaying journal... This patch adds the tail and block number so that the range of the replayed block is also printed. These values will match the values shown if the journal is dumped with gfs2_edit -p journalX. The resulting

[Cluster-devel] [GFS2 PATCH v3 02/19] gfs2: eliminate tr_num_revoke_rm

2019-04-30 Thread Bob Peterson
For its journal processing, gfs2 kept track of the number of buffers added and removed on a per-transaction basis. These values are used to calculate space needed in the journal. But while these calculations make sense for the number of buffers, they make no sense for revokes. Revokes are managed

[Cluster-devel] [GFS2 PATCH v3 04/19] gfs2: Warn when a journal replay overwrites a rgrp with buffers

2019-04-30 Thread Bob Peterson
This patch adds some instrumentation in gfs2's journal replay that indicates when we're about to overwrite a rgrp for which we already have a valid buffer_head. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff

[Cluster-devel] [GFS2 PATCH v3 13/19] gfs2: Make secondary withdrawers wait for first withdrawer

2019-04-30 Thread Bob Peterson
Before this patch, if a process encountered an error and decided to withdraw while another process was already withdrawing, the secondary withdraw would be silently ignored, which set it free to proceed with its processing, unlock any locks, etc. That's correct behavior if the

Re: [Cluster-devel] [PATCH v7 1/5] iomap: Clean up __generic_write_end calling

2019-04-30 Thread Darrick J. Wong
On Tue, Apr 30, 2019 at 12:09:30AM +0200, Andreas Gruenbacher wrote: > From: Christoph Hellwig > Move the call to __generic_write_end into iomap_write_end instead of duplicating it in each of the three branches. This requires open coding the generic_write_end for the buffer_head case.

Re: [Cluster-devel] [PATCH v7 2/5] fs: Turn __generic_write_end into a void function

2019-04-30 Thread Darrick J. Wong
On Tue, Apr 30, 2019 at 12:09:31AM +0200, Andreas Gruenbacher wrote: > The VFS-internal __generic_write_end helper always returns the value of its @copied argument. This can be confusing, and it isn't very useful anyway, so turn __generic_write_end into a function returning void instead.

Re: [Cluster-devel] [PATCH v7 3/5] iomap: Fix use-after-free error in page_done callback

2019-04-30 Thread Darrick J. Wong
On Tue, Apr 30, 2019 at 12:09:32AM +0200, Andreas Gruenbacher wrote: > In iomap_write_end, we're not holding a page reference anymore when calling the page_done callback, but the callback needs that reference to access the page. To fix that, move the put_page call in __generic_write_end

Re: [Cluster-devel] [PATCH v7 2/5] fs: Turn __generic_write_end into a void function

2019-04-30 Thread Christoph Hellwig
On Tue, Apr 30, 2019 at 12:09:31AM +0200, Andreas Gruenbacher wrote: > The VFS-internal __generic_write_end helper always returns the value of its @copied argument. This can be confusing, and it isn't very useful anyway, so turn __generic_write_end into a function returning void instead.