[Cluster-devel] [GFS2 v4 PATCH 13/25] gfs2: Ignore dlm recovery requests if gfs2 is withdrawn

2019-05-15 Thread Bob Peterson
achine so that the functions are ignored and skipped if an io error has occurred or if the file system is withdrawn. That prevents the lvb bits from being updated, and therefore dlm and user space still see the need for recovery to take place. Signed-off-by: Bob Peterson --- fs/gfs2/lock_dlm.c | 18 +++

[Cluster-devel] [GFS2 v4 PATCH 16/25] gfs2: Don't loop forever in gfs2_freeze if withdrawn

2019-05-15 Thread Bob Peterson
. This patch moves the check for file system withdraw inside the loop so that the loop can end when withdraw occurs. Signed-off-by: Bob Peterson --- fs/gfs2/super.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 5ea8d45e989d

[Cluster-devel] [GFS2 v4 PATCH 07/25] gfs2: dump fsid when dumping glock problems

2019-05-15 Thread Bob Peterson
-error cases, such as dumping the glocks debugfs file, the fsid is not dumped in order to keep lock dumps and glocktop as clean as possible. For all error cases, such as GLOCK_BUG_ON, the file system id is now printed. This will make it easier to debug. Signed-off-by: Bob Peterson --- fs/gfs2

[Cluster-devel] [GFS2 v4 PATCH 00/25] gfs2: misc recovery patch collection

2019-05-15 Thread Bob Peterson
dependencies between patches, so many could be accepted or rejected individually. Bob Peterson (25): gfs2: kthread and remount improvements gfs2: eliminate tr_num_revoke_rm gfs2: log which portion of the journal is replayed gfs2: Warn when a journal replay overwrites a rgrp with buffers gfs2

[Cluster-devel] [GFS2 v4 PATCH 08/25] gfs2: replace more printk with calls to fs_info and friends

2019-05-15 Thread Bob Peterson
This patch replaces a few leftover printk errors with calls to fs_info and similar, so that the file system having the error is properly logged. Signed-off-by: Bob Peterson --- fs/gfs2/bmap.c | 2 +- fs/gfs2/glops.c | 3 ++- fs/gfs2/rgrp.c | 27 ++- fs/gfs2/super.c

[Cluster-devel] [GFS2 v4 PATCH 09/25] gfs2: Introduce concept of a pending withdraw

2019-05-15 Thread Bob Peterson
in the meantime. Signed-off-by: Bob Peterson --- fs/gfs2/aops.c | 4 ++-- fs/gfs2/file.c | 2 +- fs/gfs2/glock.c | 7 +++ fs/gfs2/glops.c | 2 +- fs/gfs2/incore.h | 1 + fs/gfs2/log.c| 20 fs/gfs2/meta_io.c| 6 +++--- fs/gfs2

[Cluster-devel] [GFS2 v4 PATCH 18/25] gfs2: Don't write log headers after file system withdraw

2019-05-15 Thread Bob Peterson
regardless of whether it's done by this node or another. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index a841cad187f5..8b6e19802a7f 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -693,

[Cluster-devel] [GFS2 v4 PATCH 05/25] gfs2: Change SDF_SHUTDOWN to SDF_WITHDRAWN

2019-05-15 Thread Bob Peterson
Before this patch, the superblock flag indicating when a file system is withdrawn was called SDF_SHUTDOWN. This patch simply renames it to the more obvious SDF_WITHDRAWN. Signed-off-by: Bob Peterson --- fs/gfs2/aops.c | 4 ++-- fs/gfs2/file.c | 2 +- fs/gfs2/glock.c | 6

[Cluster-devel] [GFS2 v4 PATCH 25/25] gfs2: Check for log write errors before telling dlm to unlock

2019-05-15 Thread Bob Peterson
to progress even under withdrawn conditions. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 38 +- fs/gfs2/glops.c | 4 +++- 2 files changed, 36 insertions(+), 6 deletions(-) diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index 7202793056a8..0d3cb4d9de52

[Cluster-devel] [GFS2 v4 PATCH 21/25] gfs2: Abort gfs2_freeze if io error is seen

2019-05-15 Thread Bob Peterson
, it dequeues the freeze glock, aborts the loop and returns the error. Also, there's no need to pass the freeze holder to function gfs2_lock_fs_check_clean since it's only called in one place and it's a well-known superblock pointer, so this simplifies that. Signed-off-by: Bob Peterson --- fs/gfs2/super.c

[Cluster-devel] [GFS2 v4 PATCH 19/25] gfs2: Force withdraw to replay journals and wait for it to finish

2019-05-15 Thread Bob Peterson
; glock is not granted in EX; the callback is only just used to indicate a withdraw has occurred. Note that all nodes in the cluster must wait for the recovering node to finish replaying the withdrawing node's journal before continuing. To this end, it checks that the journals are clean multiple

[Cluster-devel] [GFS2 v4 PATCH 20/25] gfs2: Add verbose option to check_journal_clean

2019-05-15 Thread Bob Peterson
optional. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 2 +- fs/gfs2/util.c | 23 --- fs/gfs2/util.h | 4 +++- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index 464d365bd3f5..c4031719fbaa

[Cluster-devel] [GFS2 v4 PATCH 11/25] gfs2: Only complain the first time an io error occurs in quota or log

2019-05-15 Thread Bob Peterson
: This change in function breaks check xfstests generic/441 and causes it to fail: io errors writing to the log should cause a file system to be withdrawn, and no further operations are tolerated. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 5 +++-- fs/gfs2/quota.c | 4 ++-- 2 files changed, 5

[Cluster-devel] [GFS2 v4 PATCH 14/25] gfs2: move check_journal_clean to util.c for future use

2019-05-15 Thread Bob Peterson
Before this patch function check_journal_clean was in ops_fstype.c. This patch moves it to util.c so we can make use of it elsewhere in a future patch. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 42 - fs/gfs2/util.c | 45

[Cluster-devel] [GFS2 v4 PATCH 06/25] gfs2: simplify gfs2_freeze by removing case

2019-05-15 Thread Bob Peterson
Function gfs2_freeze had a case statement that simply checked the error code, but the break statements just made the logic hard to read. This patch simplifies the logic in favor of a simple if. Signed-off-by: Bob Peterson --- fs/gfs2/super.c | 10 ++ 1 file changed, 2 insertions(+), 8

[Cluster-devel] [GFS2 v4 PATCH 03/25] gfs2: log which portion of the journal is replayed

2019-05-15 Thread Bob Peterson
output looks something like this: jid=1: Replaying journal...0x28b7 to 0x2beb This will allow us to better debug file system corruption problems. Signed-off-by: Bob Peterson --- fs/gfs2/recovery.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/gfs2/recovery.c b/fs/gfs2
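
The sample line in this snippet can be reproduced with an ordinary format string. The following is a minimal userspace sketch, not the code from the patch: the helper name `format_replay_msg` and its parameters are hypothetical, and it only assumes that the two hex values are the bounds of the replayed journal range.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): formats a replay message in
 * the shape quoted above, with the journal id followed by the start and
 * end of the replayed range in hex.
 */
static int format_replay_msg(char *buf, size_t len, unsigned int jid,
                             unsigned int start, unsigned int end)
{
    return snprintf(buf, len, "jid=%u: Replaying journal...0x%x to 0x%x",
                    jid, start, end);
}
```

Calling `format_replay_msg(buf, sizeof(buf), 1, 0x28b7, 0x2beb)` yields the line shown in the snippet.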

[Cluster-devel] [GFS2 v4 PATCH 10/25] gfs2: log error reform

2019-05-15 Thread Bob Peterson
for the ail1 list to empty. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 7 +++ fs/gfs2/log.c| 20 +++- fs/gfs2/quota.c | 2 +- 3 files changed, 19 insertions(+), 10 deletions(-) diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h index b261168be298..39cec5361ba5

[Cluster-devel] [GFS2 v4 PATCH 17/25] gfs2: Make secondary withdrawers wait for first withdrawer

2019-05-15 Thread Bob Peterson
granted to another node. This patch makes secondary withdrawers wait until the primary withdrawer is finished with its processing before proceeding. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 3 +++ fs/gfs2/util.c | 21 +++-- 2 files changed, 22 insertions(+), 2

[Cluster-devel] [GFS2 v4 PATCH 04/25] gfs2: Warn when a journal replay overwrites a rgrp with buffers

2019-05-15 Thread Bob Peterson
-off-by: Bob Peterson --- fs/gfs2/lops.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c index ce048a9e058d..41e06582772c 100644 --- a/fs/gfs2/lops.c +++ b/fs/gfs2/lops.c @@ -764,9 +764,27 @@ static int buf_lo_scan_elements

[Cluster-devel] [GFS2 v4 PATCH 01/25] gfs2: kthread and remount improvements

2019-05-15 Thread Bob Peterson
of the function. This removes that bypass in favor of just running the whole function, then returning the error. That way, unmounts and remounts won't hang forever. Signed-off-by: Bob Peterson --- fs/gfs2/super.c | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/fs

[Cluster-devel] [GFS2 v4 PATCH 02/25] gfs2: eliminate tr_num_revoke_rm

2019-05-15 Thread Bob Peterson
in their own list, linked from the superblock. So it's entirely unnecessary to keep separate per-transaction counts for revokes added and removed. A single count will do the same job. Therefore, this patch combines the transaction revokes into a single count. Signed-off-by: Bob Peterson --- fs

[Cluster-devel] [GFS2 v4 PATCH 24/25] gfs2: Prepare to withdraw as soon as an IO error occurs in log write

2019-05-15 Thread Bob Peterson
can be properly withdrawn and unmounted. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c index 5166e863a926..2f4ca2e3e416 100644 --- a/fs/gfs2/lops.c +++ b/fs/gfs2/lops.c @@ -215,6 +215,7 @@ static void

[Cluster-devel] [GFS2 v4 PATCH 23/25] gfs2: Issue revokes more intelligently

2019-05-15 Thread Bob Peterson
any. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 61 +++ 1 file changed, 22 insertions(+), 39 deletions(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 2764d612052d..0397aa446f63 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -189,11

[Cluster-devel] [GFS2 v4 PATCH 22/25] gfs2: Check if holding freeze glock when making fs ro

2019-05-15 Thread Bob Peterson
. Signed-off-by: Bob Peterson --- fs/gfs2/super.c | 27 --- 1 file changed, 16 insertions(+), 11 deletions(-) diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 1253fcf35910..d3f6e9a61c13 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -426,9 +426,13 @@ int

[Cluster-devel] [GFS2 v4 PATCH 15/25] gfs2: Allow some glocks to be used during withdraw

2019-05-15 Thread Bob Peterson
corruption. This patch allows some glocks to be used even after the file system is withdrawn. This is accomplished with a new glops flag, GLOF_JOURNALED, which tells us which inodes cannot be safely manipulated after withdraw. This facilitates future patches that enhance fs withdraw. Signed-off-by: Bob

[Cluster-devel] [GFS2 v4 PATCH 12/25] gfs2: Stop ail1 wait loop when withdrawn

2019-05-15 Thread Bob Peterson
changes function gfs2_log_flush so that it does while (!gfs2_withdraw(sdp)) rather than while (;;). Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 8dd07688728b..a841cad187f5 100644 --- a/fs/gfs2
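
The loop change described in this snippet can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `struct fs_state`, `ail1_step`, and `wait_for_ail1` are hypothetical stand-ins, not gfs2 functions, and the "I/O failure" is simulated by a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-filesystem state. */
struct fs_state {
    bool withdrawn;     /* set once the file system is withdrawn */
    int ail1_pending;   /* items still waiting on the ail1 list */
};

/*
 * One polling step. Returns true when the list is empty. In this sketch
 * a failed I/O marks the file system withdrawn instead of making progress.
 */
static bool ail1_step(struct fs_state *fs, bool io_ok)
{
    if (!io_ok) {
        fs->withdrawn = true;
        return false;
    }
    if (fs->ail1_pending > 0)
        fs->ail1_pending--;
    return fs->ail1_pending == 0;
}

/*
 * Wait for the list to drain. The loop condition checks the withdrawn
 * flag, so the wait ends even if the list can never empty. I/O is
 * simulated to start failing after fail_after successful steps.
 * Returns the number of iterations performed.
 */
static int wait_for_ail1(struct fs_state *fs, int fail_after)
{
    int iterations = 0;

    while (!fs->withdrawn) {
        bool io_ok = iterations < fail_after;

        iterations++;
        if (ail1_step(fs, io_ok))
            break;
    }
    return iterations;
}
```

With an unconditional loop, the second case (I/O failing before the list drains) would never terminate; with the flag in the loop condition it stops on the iteration after the failure.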

[Cluster-devel] [GFS2 PATCH v3 02/19] gfs2: eliminate tr_num_revoke_rm

2019-04-30 Thread Bob Peterson
in their own list, linked from the superblock. So it's entirely unnecessary to keep separate per-transaction counts for revokes added and removed. A single count will do the same job. Therefore, this patch combines the transaction revokes into a single count. Signed-off-by: Bob Peterson --- fs

[Cluster-devel] [GFS2 PATCH v3 14/19] gfs2: Don't write log headers after file system withdraw

2019-04-30 Thread Bob Peterson
regardless of whether it's done by this node or another. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 6169276aa9e6..771589b2e225 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -693,

[Cluster-devel] [GFS2 PATCH v3 04/19] gfs2: Warn when a journal replay overwrites a rgrp with buffers

2019-04-30 Thread Bob Peterson
This patch adds some instrumentation in gfs2's journal replay that indicates when we're about to overwrite a rgrp for which we already have a valid buffer_head. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff

[Cluster-devel] [GFS2 PATCH v3 10/19] gfs2: move check_journal_clean to util.c for future use

2019-04-30 Thread Bob Peterson
Before this patch function check_journal_clean was in ops_fstype.c. This patch moves it to util.c so we can make use of it elsewhere in a future patch. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 42 - fs/gfs2/util.c | 45

[Cluster-devel] [GFS2 PATCH v3 13/19] gfs2: Make secondary withdrawers wait for first withdrawer

2019-04-30 Thread Bob Peterson
granted to another node. This patch makes secondary withdrawers wait until the primary withdrawer is finished with its processing before proceeding. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 3 +++ fs/gfs2/util.c | 21 +++-- 2 files changed, 22 insertions(+), 2

[Cluster-devel] [GFS2 PATCH v3 01/19] gfs2: kthread and remount improvements

2019-04-30 Thread Bob Peterson
of the function. This removes that bypass in favor of just running the whole function, then returning the error. That way, unmounts and remounts won't hang forever. Signed-off-by: Bob Peterson --- fs/gfs2/super.c | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/fs

[Cluster-devel] [GFS2 PATCH v3 00/19] gfs2: misc recovery patch collection

2019-04-30 Thread Bob Peterson
clean to util.c for future use". So those four need to be a set. There aren't many other dependencies between patches, so the others could probably be taken or rejected individually. There are more patches I still need to perfect, but maybe a few of the safer ones can be pushed to for-next. Bob P

[Cluster-devel] [GFS2 PATCH v3 08/19] gfs2: Stop ail1 wait loop when withdrawn

2019-04-30 Thread Bob Peterson
changes function gfs2_log_flush so that it does while (!gfs2_withdraw(sdp)) rather than while (;;). Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 33ef2cb570e2..6169276aa9e6 100644 --- a/fs/gfs2

[Cluster-devel] [GFS2 PATCH v3 03/19] gfs2: log which portion of the journal is replayed

2019-04-30 Thread Bob Peterson
output looks something like this: jid=1: Replaying journal...0x28b7 to 0x2beb This will allow us to better debug file system corruption problems. Signed-off-by: Bob Peterson --- fs/gfs2/recovery.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/gfs2/recovery.c b/fs/gfs2

[Cluster-devel] [GFS2 PATCH v3 12/19] gfs2: Don't loop forever in gfs2_freeze if withdrawn

2019-04-30 Thread Bob Peterson
. This patch moves the check for file system withdraw inside the loop so that the loop can end when withdraw occurs. Signed-off-by: Bob Peterson --- fs/gfs2/super.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 378519beb6c3

[Cluster-devel] [GFS2 PATCH v3 17/19] gfs2: Add verbose option to check_journal_clean

2019-04-30 Thread Bob Peterson
optional. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 2 +- fs/gfs2/util.c | 23 --- fs/gfs2/util.h | 4 +++- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index c4222c1fa735..d8bca92f34a1

[Cluster-devel] [GFS2 PATCH v3 18/19] gfs2: Check for log write errors before telling dlm to unlock

2019-04-30 Thread Bob Peterson
to progress even under withdrawn conditions. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 44 +--- 1 file changed, 41 insertions(+), 3 deletions(-) diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index 3d1b9bdfd0de..f4129305a815 100644 --- a/fs/gfs2

[Cluster-devel] [GFS2 PATCH v3 19/19] gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty

2019-04-30 Thread Bob Peterson
-by: Bob Peterson --- fs/gfs2/glops.c | 26 +- fs/gfs2/log.c | 2 +- fs/gfs2/log.h | 1 + 3 files changed, 27 insertions(+), 2 deletions(-) diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c index ff50013c6aa9..1cd2a3c69d63 100644 --- a/fs/gfs2/glops.c +++ b/fs/gfs2/glops.c

[Cluster-devel] [GFS2 PATCH v3 11/19] gfs2: Allow some glocks to be used during withdraw

2019-04-30 Thread Bob Peterson
corruption. This patch allows some glocks to be used even after the file system is withdrawn. This is accomplished with a new glops flag, GLOF_JOURNALED, which tells us which inodes cannot be safely manipulated after withdraw. This facilitates future patches that enhance fs withdraw. Signed-off-by: Bob

[Cluster-devel] [GFS2 PATCH v3 05/19] gfs2: Introduce concept of a pending withdraw

2019-04-30 Thread Bob Peterson
in the meantime. Signed-off-by: Bob Peterson --- fs/gfs2/aops.c | 4 ++-- fs/gfs2/file.c | 2 +- fs/gfs2/glock.c | 7 +++ fs/gfs2/glops.c | 2 +- fs/gfs2/incore.h | 1 + fs/gfs2/log.c| 20 fs/gfs2/meta_io.c| 6 +++--- fs/gfs2

[Cluster-devel] [GFS2 PATCH v3 06/19] gfs2: log error reform

2019-04-30 Thread Bob Peterson
for the ail1 list to empty. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 5 ++--- fs/gfs2/log.c| 20 +++- fs/gfs2/quota.c | 2 +- 3 files changed, 18 insertions(+), 9 deletions(-) diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h index 003d9da937b4..e16ab4c98072

[Cluster-devel] [GFS2 PATCH v3 16/19] gfs2: simply gfs2_freeze by removing case

2019-04-30 Thread Bob Peterson
Function gfs2_freeze had a case statement that simply checked the error code, but the break statements just made the logic hard to read. This patch simplifies the logic in favor of a simple if. Signed-off-by: Bob Peterson --- fs/gfs2/super.c | 10 ++ 1 file changed, 2 insertions(+), 8

[Cluster-devel] [GFS2 PATCH v3 09/19] gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn

2019-04-30 Thread Bob Peterson
ly get put into the lvb generation numbers to be seen by all nodes. This patch adds checks to many of the callbacks used by dlm in its recovery state machine so that the functions are ignored and skipped if an io error has occurred or if the file system was withdrawn. Signed-off-by: Bob Peterson --

[Cluster-devel] [GFS2 PATCH v3 15/19] gfs2: Force withdraw to replay journals and wait for it to finish

2019-04-30 Thread Bob Peterson
; glock is not granted in EX; the callback is only just used to indicate a withdraw has occurred. Note that all nodes in the cluster must wait for the recovering node to finish replaying the withdrawing node's journal before continuing. To this end, it checks that the journals are clean multiple

[Cluster-devel] [GFS2 PATCH v3 07/19] gfs2: Only complain the first time an io error occurs in quota or log

2019-04-30 Thread Bob Peterson
: This change in function breaks check xfstests generic/441 and causes it to fail: io errors writing to the log should cause a file system to be withdrawn, and no further operations are tolerated. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 5 +++-- fs/gfs2/quota.c | 4 ++-- 2 files changed, 5

Re: [Cluster-devel] GFS2 rm can be very slow

2019-04-26 Thread Bob Peterson
a "hot" directory that's used in read mode by lots of processes across several nodes and we need it in rw mode to remove the dirent. I suppose a finely crafted systemtap script would help figure this all out. Also, what version of gfs2 is running slow? Regards, Bob Peterson

[Cluster-devel] [PATCH v5] gfs2: clean_journal improperly set sd_log_flush_head

2019-04-03 Thread Bob Peterson
block parameters by changing several unsigned int types to a consistent u32. Fixes: 588bff95c94e ("GFS2: Reduce code redundancy writing log headers") Signed-off-by: Bob Peterson --- fs/gfs2/bmap.c | 26 ++ fs/gfs2/bmap.h | 1 + fs/gfs2/incore.h | 2 +

[Cluster-devel] [PATCH v4] gfs2: clean_journal improperly set sd_log_flush_head

2019-04-03 Thread Bob Peterson
gfs2_extent_map(). Regards, Bob Peterson --- This patch fixes regressions in 588bff95c94efc05f9e1a0b19015c9408ed7c0ef. Due to that patch, function clean_journal was setting the value of sd_log_flush_head, but that's only valid if it is replaying the node's own journal. If it's replaying another

[Cluster-devel] [GFS2 PATCH v3] gfs2: clean_journal improperly set sd_log_flush_head

2019-03-28 Thread Bob Peterson
Hi, Andreas found some problems with the previous version. Here is version 3. Ross: Can you please test this one with your scenario? Thanks. Bob Peterson --- This patch fixes regressions in 588bff95c94efc05f9e1a0b19015c9408ed7c0ef. Due to that patch, function clean_journal was setting

[Cluster-devel] [GFS2 PATCH V2] gfs2: clean_journal was setting sd_log_flush_head replaying other journals

2019-03-27 Thread Bob Peterson
Hi, Yesterday I posted a patch for this problem, but it was grossly inadequate. This patch, version 2, is another attempt to fix it. Many thanks to Ross Lagerwall for helping us identify, fix, and test the problem. Regards, Bob Peterson --- gfs2: clean_journal was setting sd_log_flush_head

Re: [Cluster-devel] Assertion failure: sdp->sd_log_blks_free <= sdp->sd_jdesc->jd_blocks

2019-03-27 Thread Bob Peterson
new one when it's ready, but it may take a couple hours to get it ready. Regards, Bob Peterson Red Hat File Systems

[Cluster-devel] [PATCH 17/19] gfs2: eliminate tr_num_revoke_rm

2019-03-27 Thread Bob Peterson
in their own list, linked from the superblock. So it's entirely unnecessary to keep separate per-transaction counts for revokes added and removed. A single count will do the same job. Therefore, this patch combines the transaction revokes into a single count. Signed-off-by: Bob Peterson --- fs

[Cluster-devel] [PATCH 15/19] gfs2: log which portion of the journal is replayed

2019-03-27 Thread Bob Peterson
output looks something like this: jid=1: Replaying journal...0x28b7 to 0x2beb This will allow us to better debug file system corruption problems. Signed-off-by: Bob Peterson --- fs/gfs2/recovery.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/gfs2/recovery.c b/fs/gfs2

[Cluster-devel] [PATCH 14/19] gfs2: Warn when a journal replay overwrites a rgrp with buffers

2019-03-27 Thread Bob Peterson
This patch adds some instrumentation in gfs2's journal replay that indicates when we're about to overwrite a rgrp for which we already have a valid buffer_head. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff

[Cluster-devel] [PATCH 12/19] gfs2: If the journal isn't live ignore log flushes

2019-03-27 Thread Bob Peterson
This patch adds a check to function gfs2_log_flush: if the journal is no longer live, the flush is ignored. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 896165811063..cd2f54db7f8a 100644 --- a/fs/gfs2

[Cluster-devel] [PATCH 19/19] gfs2: clean_journal was setting sd_log_flush_head replaying other journals

2019-03-27 Thread Bob Peterson
Function clean_journal was setting the value of sd_log_flush_head, but that's only a valid thing to do if it is replaying its own journal. If it's replaying another node's journal, that's completely wrong and will lead to multiple problems. Signed-off-by: Bob Peterson --- fs/gfs2/recovery.c | 6

[Cluster-devel] [PATCH 18/19] gfs2: don't call go_unlock unless demote is close at hand

2019-03-27 Thread Bob Peterson
and we know for sure whether it needs demoting (for this go). That way, nobody can sneak in and change the flag during the function. For simplicity's sake, I also changed the go_unlock parameter to accept the glock rather than the holder. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 9 ++-

[Cluster-devel] [PATCH 16/19] gfs2: Only remove revokes that we've submitted

2019-03-27 Thread Bob Peterson
. In the meantime, another process might be able to add new revokes to the list, and those might be wiped out. Also, since both functions run the list run head to tail, function gfs2_add_revoke should add to the tail, not the head. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 1 + fs/gfs2/log.c

[Cluster-devel] [PATCH 04/19] gfs2: move check_journal_clean to util.c for future use

2019-03-27 Thread Bob Peterson
Before this patch function check_journal_clean was in ops_fstype.c. This patch moves it to util.c so we can make use of it elsewhere in a future patch. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 42 - fs/gfs2/util.c | 45

[Cluster-devel] [PATCH 11/19] gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty

2019-03-27 Thread Bob Peterson
-by: Bob Peterson --- fs/gfs2/glops.c | 26 +- fs/gfs2/log.c | 2 +- fs/gfs2/log.h | 1 + 3 files changed, 27 insertions(+), 2 deletions(-) diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c index fb88e1f92eff..9520ec62bcef 100644 --- a/fs/gfs2/glops.c +++ b/fs/gfs2/glops.c

[Cluster-devel] [PATCH 09/19] gfs2: Add verbose option to check_journal_clean

2019-03-27 Thread Bob Peterson
optional. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 2 +- fs/gfs2/util.c | 23 --- fs/gfs2/util.h | 4 +++- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index 2e9061eeec9c..a556760e798f

[Cluster-devel] [PATCH 10/19] gfs2: Check for log write errors before telling dlm to unlock

2019-03-27 Thread Bob Peterson
and dinode glocks and maintaining the integrity of the metadata. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 4 1 file changed, 4 insertions(+) diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index 4996ab06e721..72a7b19c3aef 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -566,8

[Cluster-devel] [PATCH 05/19] gfs2: Allow some glocks to be used during withdraw

2019-03-27 Thread Bob Peterson
corruption. This patch allows some glocks to be used even after the file system is withdrawn. This is accomplished with a new glops flag, GLOF_OK_AT_WITHDRAW. This facilitates future patches that enhance fs withdraw. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 4 +++- fs/gfs2/glops.c | 8

[Cluster-devel] [PATCH 07/19] gfs2: Don't write log headers after file system withdraw

2019-03-27 Thread Bob Peterson
regardless of whether it's done by this node or another. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index c79279ef03b8..62106decba29 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -679,

[Cluster-devel] [PATCH 08/19] gfs2: Force withdraw to replay journals and wait for it to finish

2019-03-27 Thread Bob Peterson
; glock is not granted in EX; the callback is only just used to indicate a withdraw has occurred. Note that all nodes in the cluster must wait for the recovering node to finish replaying the withdrawing node's journal before continuing. To this end, it checks that the journals are clean multiple

[Cluster-devel] [PATCH 01/19] gfs2: log error reform

2019-03-27 Thread Bob Peterson
reacting to io errors. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 4 ++-- fs/gfs2/log.c| 7 --- fs/gfs2/lops.c | 7 +-- fs/gfs2/ops_fstype.c | 1 + fs/gfs2/quota.c | 6 -- 5 files changed, 16 insertions(+), 9 deletions(-) diff --git a/fs/gfs2/incore.h b/fs

[Cluster-devel] [PATCH 13/19] gfs2: Issue revokes more intelligently

2019-03-27 Thread Bob Peterson
any. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 61 +++ 1 file changed, 22 insertions(+), 39 deletions(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index cd2f54db7f8a..e573bdbb9634 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -190,11

[Cluster-devel] [PATCH 06/19] gfs2: Make secondary withdrawers wait for first withdrawer

2019-03-27 Thread Bob Peterson
granted to another node. This patch makes secondary withdrawers wait until the primary withdrawer is finished with its processing before proceeding. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 3 +++ fs/gfs2/util.c | 21 +++-- 2 files changed, 22 insertions(+), 2

[Cluster-devel] [PATCH 03/19] gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn

2019-03-27 Thread Bob Peterson
ly get put into the lvb generation numbers to be seen by all nodes. This patch adds checks to many of the callbacks used by dlm in its recovery state machine so that the functions are ignored and skipped if an io error has occurred or if the file system was withdrawn. Signed-off-by: Bob Peterson --

[Cluster-devel] [PATCH 02/19] gfs2: Introduce concept of a pending withdraw

2019-03-27 Thread Bob Peterson
in the meantime. Signed-off-by: Bob Peterson --- fs/gfs2/aops.c | 4 ++-- fs/gfs2/file.c | 2 +- fs/gfs2/glock.c | 7 +++ fs/gfs2/glops.c | 2 +- fs/gfs2/incore.h | 1 + fs/gfs2/log.c| 20 fs/gfs2/meta_io.c| 6 +++--- fs/gfs2

[Cluster-devel] [PATCH 00/19] gfs2: misc recovery patch collection

2019-03-27 Thread Bob Peterson
tches, so the others could probably be taken or rejected individually. Bob Peterson (19): gfs2: log error reform gfs2: Introduce concept of a pending withdraw gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn gfs2: move check_journal_clean to util.c for future use gfs

Re: [Cluster-devel] [PATCH 1/2] gfs2: Fix occasional glock use-after-free

2019-03-26 Thread Bob Peterson
the evict needs to do the log flush to make sure the revoke is committed. But we've had issues with evict in the past, so we need to be careful about how we fix it. Andreas and I will look into the best way to fix it. Thanks again for your help. Regards, Bob Peterson Red Hat File Systems

[Cluster-devel] [GFS2 PATCH] gfs2: clean_journal was setting sd_log_flush_head replaying other journals

2019-03-25 Thread Bob Peterson
Hi, Function clean_journal was setting the value of sd_log_flush_head, but that's only a valid thing to do if it is replaying its own journal. If it's replaying another node's journal, that's completely wrong and will lead to multiple problems. Signed-off-by: Bob Peterson --- fs/gfs2

Re: [Cluster-devel] Assertion failure: sdp->sd_log_blks_free <= sdp->sd_jdesc->jd_blocks

2019-03-25 Thread Bob Peterson
_flush_head while replaying another node's journal, that will only lead to a problem like this. I'll try and whip up another patch and perhaps you can test it for me. FWIW, I've never seen this problem manifest on my recovery tests, but it still might be causing some of the weird problems I'm seeing. Regards, Bob Peterson Red Hat File Systems

Re: [Cluster-devel] Assertion failure: sdp->sd_log_blks_free <= sdp->sd_jdesc->jd_blocks

2019-03-19 Thread Bob Peterson
complains about the journal. Otherwise, it could be a side effect of one of the recovery issues I'm working on. Do you have other symptoms? Also, make sure multiple nodes aren't trying to use the same journal because of lock_nolock or something...I've made that mistake in the past. Regards, Bob Peterson Red Hat File Systems

[Cluster-devel] GFS2: Patches pulled, linux-gfs2.git for-next rebased

2019-03-11 Thread Bob Peterson
Hi, Linus has pulled the latest set of patches from the for-next branch of the linux-gfs2.git tree, so I rebased the tree from Linus's master. So the current tree is back to having no unmerged GFS2 patches. Regards, Bob Peterson Red Hat File Systems

[Cluster-devel] GFS2: Pull request (merge window)

2019-03-08 Thread Bob Peterson
Hi Linus, Please consider pulling the following changes for the GFS2 file system. Regards, Bob Peterson The following changes since commit 49a57857aeea06ca831043acbb0fa5e0f50602fd: Linux 5.0-rc3 (2019-01-21 13:14:44 +1300

[Cluster-devel] [PATCH 3/3] gfs2: Fix missed wakeups in find_insert_glock

2019-03-08 Thread Bob Peterson
waking up the wrong waitqueue, and the waiting tasks may be stuck forever. Fix that by using ht_parms.key_len instead of sizeof(struct lm_lockname) for the key length. Reported-by: Mark Syms Signed-off-by: Andreas Gruenbacher Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 2 +- 1 file
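
Why the key length matters here can be shown with a minimal userspace sketch. It is not the gfs2 code: `struct lock_name`, `KEY_LEN`, and the FNV-1a hash are assumptions chosen for illustration, standing in for the lock name structure, the hash table's key length, and whatever hash the kernel uses. The point is only that hashing the whole structure also hashes bytes that are not part of the key, so two lookups of the same key can land on different wait queues.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lock name: only number and type form the lookup key. */
struct lock_name {
    uint64_t number;
    uint32_t type;
    uint32_t owner_tag;   /* extra field, NOT part of the key */
};

/* Key length covers the fields up to and including type. */
#define KEY_LEN (offsetof(struct lock_name, type) + sizeof(uint32_t))

/* FNV-1a, used here only as a simple stand-in hash. */
static uint32_t fnv1a(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Hashes every byte of the structure, including the non-key field. */
static uint32_t hash_whole_struct(const struct lock_name *n)
{
    return fnv1a(n, sizeof(*n));
}

/* Hashes only the key bytes, matching what the lookup compares. */
static uint32_t hash_key_only(const struct lock_name *n)
{
    return fnv1a(n, KEY_LEN);
}
```

If the waiter picks its wait queue with one length and the waker with another, or the non-key bytes differ between the two, the wakeup goes to a queue nobody is sleeping on. Using the same key length on both sides avoids that.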

[Cluster-devel] [PATCH 1/3] gfs: no need to check return value of debugfs_create functions

2019-03-08 Thread Bob Peterson
to save a bit of space and make the code simpler. Cc: Bob Peterson Cc: Andreas Gruenbacher Cc: cluster-devel@redhat.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Andreas Gruenbacher --- fs/gfs2/glock.c | 70 ++-- fs/gfs2/glock.h | 4 +-- fs

[Cluster-devel] [PATCH 0/3] GFS2: Pre-pull patch posting (merge window)

2019-03-08 Thread Bob Peterson
Hi, We've only got three patches ready for this merge window: - Fix a hang related to missed wakeups for glocks from Andreas Gruenbacher. - Rework of how gfs2 manages its debugfs files from Greg K-H. - An incorrect assert when truncating or deleting files from Tim Smith. Regards, Bob

[Cluster-devel] [PATCH 2/3] gfs2: Fix an incorrect gfs2_assert()

2019-03-08 Thread Bob Peterson
, something that was previously only possible because of the difference in units. Signed-off-by: Tim Smith Signed-off-by: Bob Peterson --- fs/gfs2/inode.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h index 793808263c6d..18d4af7417fa 100644

[Cluster-devel] [PATCH 11/15] gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty

2019-02-27 Thread Bob Peterson
-by: Bob Peterson --- fs/gfs2/glops.c | 20 +++- fs/gfs2/log.c | 2 +- fs/gfs2/log.h | 1 + 3 files changed, 21 insertions(+), 2 deletions(-) diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c index 2cdd6351e017..fdcb5809995f 100644 --- a/fs/gfs2/glops.c +++ b/fs/gfs2/glops.c

[Cluster-devel] [PATCH 02/15] gfs2: Introduce concept of a pending withdraw

2019-02-27 Thread Bob Peterson
in the meantime. Signed-off-by: Bob Peterson --- fs/gfs2/aops.c | 4 ++-- fs/gfs2/file.c | 2 +- fs/gfs2/glock.c | 7 +++ fs/gfs2/glops.c | 2 +- fs/gfs2/incore.h | 1 + fs/gfs2/log.c| 20 fs/gfs2/meta_io.c| 6 +++--- fs/gfs2

[Cluster-devel] [PATCH 08/15] gfs2: Force withdraw to replay journals and wait for it to finish

2019-02-27 Thread Bob Peterson
; glock is not granted in EX; the callback is only just used to indicate a withdraw has occurred. Note that all nodes in the cluster must wait for the recovering node to finish replaying the withdrawing node's journal before continuing. To this end, it checks that the journals are clean multiple

[Cluster-devel] [PATCH 07/15] gfs2: Don't write log headers after file system withdraw

2019-02-27 Thread Bob Peterson
regardless of whether it's done by this node or another. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index bbcf232b3081..69077f92d703 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -679,

[Cluster-devel] [PATCH 09/15] gfs2: Add verbose option to check_journal_clean

2019-02-27 Thread Bob Peterson
optional. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 2 +- fs/gfs2/util.c | 23 --- fs/gfs2/util.h | 4 +++- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c index 205900cddfe4..50b336536136

[Cluster-devel] [PATCH 15/15] gfs2: log which portion of the journal is replayed

2019-02-27 Thread Bob Peterson
output looks something like this: jid=1: Replaying journal...0x28b7 to 0x2beb This will allow us to better debug file system corruption problems. Signed-off-by: Bob Peterson --- fs/gfs2/recovery.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/gfs2/recovery.c b/fs/gfs2
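For reference, a line in the quoted format can be produced with a single format string. This is a minimal userspace sketch, not the patch itself: format_replay_msg and replay_msg_selfcheck are hypothetical names, and the argument types are assumed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a replay message in the style quoted above. */
static int format_replay_msg(char *buf, size_t len, unsigned int jid,
                             unsigned int start, unsigned int end)
{
    return snprintf(buf, len, "jid=%u: Replaying journal...0x%x to 0x%x",
                    jid, start, end);
}

/* Self-check: rebuilds the quoted example line and compares it. */
static void replay_msg_selfcheck(void)
{
    char buf[64];

    format_replay_msg(buf, sizeof(buf), 1, 0x28b7, 0x2beb);
    assert(strcmp(buf, "jid=1: Replaying journal...0x28b7 to 0x2beb") == 0);
}
```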

[Cluster-devel] [PATCH 10/15] gfs2: Check for log write errors before telling dlm to unlock

2019-02-27 Thread Bob Peterson
and dinode glocks and maintaining the integrity of the metadata. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 4 1 file changed, 4 insertions(+) diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c index ba61bba46785..afb336b65abd 100644 --- a/fs/gfs2/glock.c +++ b/fs/gfs2/glock.c @@ -566,8

[Cluster-devel] [PATCH 14/15] gfs2: Warn when a journal replay overwrites a rgrp with buffers

2019-02-27 Thread Bob Peterson
This patch adds some instrumentation in gfs2's journal replay that indicates when we're about to overwrite a rgrp for which we already have a valid buffer_head. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff

[Cluster-devel] [PATCH 06/15] gfs2: Make secondary withdrawers wait for first withdrawer

2019-02-27 Thread Bob Peterson
granted to another node. This patch makes secondary withdrawers wait until the primary withdrawer is finished with its processing before proceeding. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 3 +++ fs/gfs2/util.c | 9 - 2 files changed, 11 insertions(+), 1 deletion(-) diff

[Cluster-devel] [PATCH 12/15] gfs2: If the journal isn't live ignore log flushes

2019-02-27 Thread Bob Peterson
This patch adds a check to function gfs2_log_flush: if the journal is no longer live, the flush is ignored. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 0def6343e618..8199b235790f 100644 --- a/fs/gfs2
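The shape of such a guard is an early return before any flush work is done. The sketch below is a userspace illustration only: struct journal_sketch, its fields, and log_flush_sketch are hypothetical and do not reflect how gfs2 tracks journal state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical journal state; field names are illustrative, not gfs2's. */
struct journal_sketch {
    bool live;            /* false once the journal has been shut down */
    unsigned int flushes; /* number of flushes actually performed */
};

/* Flush the log unless the journal is no longer live.
 * Returns true if a flush was performed, false if the request was ignored. */
static bool log_flush_sketch(struct journal_sketch *j)
{
    assert(j != NULL);
    if (!j->live)
        return false; /* journal not live: ignore the flush request */
    j->flushes++;
    return true;
}
```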

[Cluster-devel] [PATCH 13/15] gfs2: Issue revokes more intelligently

2019-02-27 Thread Bob Peterson
any. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 61 +++ 1 file changed, 22 insertions(+), 39 deletions(-) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 8199b235790f..042ebf701382 100644 --- a/fs/gfs2/log.c +++ b/fs/gfs2/log.c @@ -190,11

[Cluster-devel] [PATCH 01/15] gfs2: log error reform

2019-02-27 Thread Bob Peterson
reacting to io errors. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 4 ++-- fs/gfs2/log.c| 7 --- fs/gfs2/lops.c | 7 +-- fs/gfs2/ops_fstype.c | 1 + fs/gfs2/quota.c | 6 -- 5 files changed, 16 insertions(+), 9 deletions(-) diff --git a/fs/gfs2/incore.h b/fs

[Cluster-devel] [PATCH 03/15] gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn

2019-02-27 Thread Bob Peterson
ly get put into the lvb generation numbers to be seen by all nodes. This patch adds checks to many of the callbacks used by dlm in its recovery state machine so that the functions are ignored and skipped if an io error has occurred or if the file system was withdrawn. Signed-off-by: Bob Peterson --

[Cluster-devel] [PATCH 00/15] GFS2: Withdraw corruption patches [V2]

2019-02-27 Thread Bob Peterson
n problems I've been able to reliably recreate lately with multi-node multi-file system recovery tests. Bob Peterson (15): gfs2: log error reform gfs2: Introduce concept of a pending withdraw gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn gfs2: move check_journal_clean to u

[Cluster-devel] [PATCH 05/15] gfs2: Allow some glocks to be used during withdraw

2019-02-27 Thread Bob Peterson
corruption. This patch allows some glocks to be used even after the file system is withdrawn. This is accomplished with a new glops flag, GLOF_OK_AT_WITHDRAW. This facilitates future patches that enhance fs withdraw. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 4 +++- fs/gfs2/glops.c | 8

[Cluster-devel] [PATCH 04/15] gfs2: move check_journal_clean to util.c for future use

2019-02-27 Thread Bob Peterson
Before this patch function check_journal_clean was in ops_fstype.c. This patch moves it to util.c so we can make use of it elsewhere in a future patch. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 42 - fs/gfs2/util.c | 45

[Cluster-devel] [PATCH] Revert "gfs2: read journal in large chunks to locate the head"

2019-02-13 Thread Bob Peterson
This reverts commit 2a5f14f279f59143139bcd1606903f2f80a34241. This patch causes xfstests generic/311 to fail. Reverting this for now until we have a proper fix. Signed-off-by: Abhi Das Signed-off-by: Bob Peterson --- fs/gfs2/glops.c | 1 - fs/gfs2/log.c| 4 +- fs/gfs2/lops.c

[Cluster-devel] [GFS2 PATCH 5/9] gfs2: Keep transactions on ail1 list until after issuing revokes

2019-02-13 Thread Bob Peterson
the transactions to remain on the ail1 list until it can issue revokes for them. Then, if they have no more buffers, they're moved to the ail2 list after the revokes are added. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 30 ++ 1 file changed, 18 insertions(+), 12
