[Cluster-devel] [PATCH 10/19] gfs2: Check for log write errors before telling dlm to unlock

2019-03-27 Thread Bob Peterson
Before this patch, function do_xmote just assumed all the writes submitted to the journal were finished and successful, and it called the go_unlock function to release the dlm lock. But if they're not, and a revoke failed to make its way to the journal, a journal replay on another node will cause

[Cluster-devel] [PATCH 07/19] gfs2: Don't write log headers after file system withdraw

2019-03-27 Thread Bob Peterson
Before this patch, when a node withdrew a gfs2 file system, it wrote a (clean) unmount log header. That's wrong. You don't want to write anything to the journal once you're withdrawn because that's acknowledging that the transaction is complete and the journal is in good shape, neither of which

[Cluster-devel] [PATCH 08/19] gfs2: Force withdraw to replay journals and wait for it to finish

2019-03-27 Thread Bob Peterson
When a node withdraws from a file system, it often leaves its journal in an incomplete state. This is especially true when the withdraw is caused by io errors writing to the journal. Before this patch, a withdraw would try to write a "shutdown" record to the journal, tell dlm it's done with the

[Cluster-devel] [PATCH 01/19] gfs2: log error reform

2019-03-27 Thread Bob Peterson
Before this patch, gfs2 kept track of journal io errors in two places (sd_log_error and the SDF_AIL1_IO_ERROR flag in sd_flags). This patch consolidates the two by eliminating the SDF_AIL1_IO_ERROR flag in favor of an atomic count of journal errors, sd_log_errors. When the first io error occurs and

[Cluster-devel] [PATCH 04/19] gfs2: move check_journal_clean to util.c for future use

2019-03-27 Thread Bob Peterson
Before this patch, function check_journal_clean was in ops_fstype.c. This patch moves it to util.c so we can make use of it elsewhere in a future patch. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 42 - fs/gfs2/util.c | 45

[Cluster-devel] [PATCH 11/19] gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty

2019-03-27 Thread Bob Peterson
Before this patch, if gfs2_ail_empty_gl saw there was nothing on the ail list, it would return and not flush the log. The problem is that there could still be a revoke for the rgrp sitting on the sd_log_le_revoke list that's been recently taken off the ail list. But that revoke still needs to be

[Cluster-devel] [PATCH 09/19] gfs2: Add verbose option to check_journal_clean

2019-03-27 Thread Bob Peterson
Before this patch, function check_journal_clean would give messages related to journal recovery. That's fine for mount time, but when a node withdraws and forces replay that way, we don't want all those distracting and misleading messages. This patch adds a new parameter to make those messages

[Cluster-devel] [PATCH 12/19] gfs2: If the journal isn't live ignore log flushes

2019-03-27 Thread Bob Peterson
This patch adds a check to function gfs2_log_flush: if the journal is no longer live, the flush is ignored. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 896165811063..cd2f54db7f8a 100644 ---

[Cluster-devel] [PATCH 19/19] gfs2: clean_journal was setting sd_log_flush_head replaying other journals

2019-03-27 Thread Bob Peterson
Function clean_journal was setting the value of sd_log_flush_head, but that's only a valid thing to do if it is replaying its own journal. If it's replaying another node's journal, that's completely wrong and will lead to multiple problems. Signed-off-by: Bob Peterson --- fs/gfs2/recovery.c | 6

[Cluster-devel] [PATCH 03/19] gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn

2019-03-27 Thread Bob Peterson
This patch addresses various problems with gfs2/dlm recovery. For example, suppose a node with a bunch of gfs2 mounts suddenly reboots due to kernel panic, and dlm determines it should perform recovery. DLM does so from a pseudo-state machine calling various callbacks into lock_dlm to perform a

[Cluster-devel] [PATCH 02/19] gfs2: Introduce concept of a pending withdraw

2019-03-27 Thread Bob Peterson
File system withdraws can be delayed when inconsistencies are discovered when we cannot withdraw immediately, for example, when critical spin_locks are held. But delaying the withdraw can cause gfs2 to ignore the error and keep running for a short period of time. For example, an rgrp glock may be

[Cluster-devel] [PATCH 00/19] gfs2: misc recovery patch collection

2019-03-27 Thread Bob Peterson
This is a collection of patches I've been using to address the myriad of recovery problems I've found. I'm still finding them, so the battle is not won yet. I'm not convinced we need all of these but I thought I'd send them anyway and get feedback. Previously I sent out a version of the patch

[Cluster-devel] [PATCH 06/19] gfs2: Make secondary withdrawers wait for first withdrawer

2019-03-27 Thread Bob Peterson
Before this patch, if a process encountered an error and decided to withdraw, if another process was already in the process of withdrawing, the secondary withdraw would be silently ignored, which set it free to proceed with its processing, unlock any locks, etc. That's correct behavior if the

[Cluster-devel] [PATCH 13/19] gfs2: Issue revokes more intelligently

2019-03-27 Thread Bob Peterson
Before this patch, function gfs2_write_revokes would call gfs2_ail1_empty, then traverse the sd_ail1_list looking for transactions that had bds which were no longer queued to a glock. And if it found some, it would try to issue revokes for them, up to a predetermined maximum. There were two

[Cluster-devel] [PATCH 18/19] gfs2: don't call go_unlock unless demote is close at hand

2019-03-27 Thread Bob Peterson
The go_unlock glops function is only used when unlocking rgrps. Before this patch, the glock code did: 1. spin_unlock(&gl->gl_lockref.lock); 2. go_unlock 3. spin_lock(&gl->gl_lockref.lock); The go_unlock function checked for the GLF_DEMOTE or DEMOTE_PENDING bits, and if set, called rgrp_go_unlock. But

[Cluster-devel] [PATCH 16/19] gfs2: Only remove revokes that we've submitted

2019-03-27 Thread Bob Peterson
Before this patch, function revoke_lo_after_commit would run the list of revokes (sd_log_num_revoke) and free them all. That assumes they've all been submitted. When revoke_lo_before_commit is called by the log flush, that's true. But then all the io is submitted, which could take some time. In the

[Cluster-devel] [PATCH 17/19] gfs2: eliminate tr_num_revoke_rm

2019-03-27 Thread Bob Peterson
For its journal processing, gfs2 kept track of the number of buffers added and removed on a per-transaction basis. These values are used to calculate space needed in the journal. But while these calculations make sense for the number of buffers, they make no sense for revokes. Revokes are managed

[Cluster-devel] [PATCH 14/19] gfs2: Warn when a journal replay overwrites a rgrp with buffers

2019-03-27 Thread Bob Peterson
This patch adds some instrumentation in gfs2's journal replay that indicates when we're about to overwrite a rgrp for which we already have a valid buffer_head. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff

[Cluster-devel] [PATCH v2] gfs2: Fix lru_count going negative

2019-03-27 Thread Ross Lagerwall
Under certain conditions, lru_count may drop below zero resulting in a large amount of log spam like this: vmscan: shrink_slab: gfs2_dump_glock+0x3b0/0x630 [gfs2] \ negative objects to delete nr=-1 This happens as follows: 1) A glock is moved from lru_list to the dispose list and lru_count

[Cluster-devel] [GFS2 PATCH V2] gfs2: clean_journal was setting sd_log_flush_head replaying other journals

2019-03-27 Thread Bob Peterson
Hi, Yesterday I posted a patch for this problem, but it was grossly inadequate. This patch, version 2, is another attempt to fix it. Many thanks to Ross Lagerwall for helping us identify, fix, and test the problem. Regards, Bob Peterson --- gfs2: clean_journal was setting sd_log_flush_head

Re: [Cluster-devel] [PATCH v3] gfs2: Convert gfs2 to fs_context

2019-03-27 Thread kbuild test robot
Hi Andrew, Thank you for the patch! Yet something to improve: [auto build test ERROR on gfs2/for-next] [also build test ERROR on v5.1-rc2 next-20190327] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux

Re: [Cluster-devel] gfs2 iomap dealock, IOMAP_F_UNBALANCED

2019-03-27 Thread Ross Lagerwall
On 3/22/19 12:21 AM, Andreas Gruenbacher wrote: On Fri, 22 Mar 2019 at 00:01, Andreas Gruenbacher wrote: On Thu, 21 Mar 2019 at 22:43, Dave Chinner wrote: The problem is calling balance_dirty_pages() inside the ->iomap_begin/->iomap_end calls and not that it is called by the iomap

Re: [Cluster-devel] Assertion failure: sdp->sd_log_blks_free <= sdp->sd_jdesc->jd_blocks

2019-03-27 Thread Bob Peterson
Hi Ross, - Original Message - > > I think the problem may actually be a regression with patch 588bff95c94ef. > > I still hit the assertion with this patch. gfs2_log_write_header() is > unconditionally called and it calls gfs2_log_bmap() which changes > sd_log_flush_head in the success

Re: [Cluster-devel] [PATCH v2] gfs2: Fix lru_count going negative

2019-03-27 Thread Andreas Gruenbacher
Hi Ross, On Wed, 27 Mar 2019 at 18:09, Ross Lagerwall wrote: > I've detached this from "gfs2: Fix occasional glock use-after-free" > since this can go in separately while that is still under discussion. > > Changed in v2: > * Move GLOF_LRU check into gfs2_glock_add_to_lru() for symmetry with >

[Cluster-devel] [RFC PATCH 00/68] VFS: Convert a bunch of filesystems to the new mount API

2019-03-27 Thread David Howells
Hi Al, Here's a set of patches that converts a bunch (but not yet all!) to the new mount API. To this end, it makes the following changes: (1) Provides a convenience member in struct fs_context that is OR'd into sb->s_iflags by sget_fc(). (2) Provides a convenience helper function,

[Cluster-devel] [RFC PATCH 68/68] gfs2: Convert gfs2 to fs_context

2019-03-27 Thread David Howells
From: Andrew Price Convert gfs2 and gfs2meta to fs_context. Removes the duplicated vfs code from gfs2_mount and instead uses the new vfs_get_block_super() before switching the ->root to the appropriate dentry. The mount option parsing has been converted to the new API and error reporting for

Re: [Cluster-devel] Assertion failure: sdp->sd_log_blks_free <= sdp->sd_jdesc->jd_blocks

2019-03-27 Thread Ross Lagerwall
On 3/25/19 3:29 PM, Bob Peterson wrote: - Original Message - I think I've found the cause of the assertion I was hitting. Recovery sets sd_log_flush_head but does not take locks which means a concurrent call to gfs2_log_flush() can result in sd_log_head being set to sd_log_flush_head. A

[Cluster-devel] [PATCH v3] gfs2: Convert gfs2 to fs_context

2019-03-27 Thread Andrew Price
Convert gfs2 and gfs2meta to fs_context. Removes the duplicated vfs code from gfs2_mount and instead uses the new vfs_get_block_super() before switching the ->root to the appropriate dentry. The mount option parsing has been converted to the new API and error reporting for invalid options has