Re: [Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro

2023-04-24 Thread Bob Peterson
On 4/24/23 10:08 AM, Andreas Gruenbacher wrote: point the GFS2_LOG_HEAD_FLUSH_SHUTDOWN has been set by gfs2_make_fs_ro. Do you mean that at that point, the SDF_JOURNAL_LIVE flag has already been cleared? Ah, yes, you are correct. That was a think-o. Please adjust as appropriate. Regards,

Re: [Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro

2023-04-24 Thread Andreas Gruenbacher
On Fri, Apr 21, 2023 at 9:07 PM Bob Peterson wrote: > Before this patch function gfs2_make_fs_ro called gfs2_log_flush once to > finalize the log. However, if there's dirty metadata, log flushes tend > to sync the metadata and formulate revokes. Before this patch, those > revokes may not be

Re: [Cluster-devel] [GFS2 PATCH 1/4] gfs2: return errors from gfs2_ail_empty_gl

2023-04-24 Thread Andreas Gruenbacher
On Fri, Apr 21, 2023 at 9:07 PM Bob Peterson wrote: > Before this patch, function gfs2_ail_empty_gl did not return errors it > encountered from __gfs2_trans_begin. Those errors usually came from the > fact that the file system was made read-only, often due to unmount, > (but theoretically could

[Cluster-devel] [GFS2 PATCH 4/4] gfs2: gfs2_ail_empty_gl no log flush on error

2023-04-21 Thread Bob Peterson
Before this patch function gfs2_ail_empty_gl called gfs2_log_flush even in cases where it encountered an error. It should probably skip the log flush and leave the file system in an inconsistent state, letting a subsequent withdraw force the journal to be replayed to reestablish metadata

[Cluster-devel] [GFS2 PATCH 2/4] gfs2: Perform second log flush in gfs2_make_fs_ro

2023-04-21 Thread Bob Peterson
Before this patch function gfs2_make_fs_ro called gfs2_log_flush once to finalize the log. However, if there's dirty metadata, log flushes tend to sync the metadata and formulate revokes. Before this patch, those revokes may not be written out to the journal immediately, which meant unresolved

[Cluster-devel] [GFS2 PATCH 3/4] gfs2: Issue message when revokes cannot be written

2023-04-21 Thread Bob Peterson
Before this patch, function gfs2_ail_empty_gl would silently return an error to the caller. This would get silently set into sd_log_error which would cause a withdraw, but there was no indication why the file system was withdrawn. This patch adds a fs_err to log the appropriate error message.

[Cluster-devel] [GFS2 PATCH 1/4] gfs2: return errors from gfs2_ail_empty_gl

2023-04-21 Thread Bob Peterson
Before this patch, function gfs2_ail_empty_gl did not return errors it encountered from __gfs2_trans_begin. Those errors usually came from the fact that the file system was made read-only, often due to unmount, (but theoretically could be due to -o remount,ro) which prevented the transaction from

[Cluster-devel] [GFS2 PATCH 0/4] Fix revoke processing at unmount and ro

2023-04-21 Thread Bob Peterson
This series of patches fixes a set of corner cases regarding how revokes are handled during unmount and transitions to read-only. Return codes were dropped, errors were not reported, and revokes were not written properly. Bob Peterson (4): gfs2: return errors from gfs2_ail_empty_gl gfs2:

Re: [Cluster-devel] [GFS2 PATCH 2/3] gfs2: Retry on dlm -EBUSY (stop gap)

2022-02-02 Thread Andreas Gruenbacher
Hi Bob, On Mon, Jan 24, 2022 at 6:28 PM Bob Peterson wrote: > Sometimes when gfs2 cancels a glock request, dlm needs time to take the > request off its Conversion queue. During that time, we get -EBUSY from > dlm, which confuses the glock state machine. Ideally we want dlm to > not return -EBUSY

[Cluster-devel] [GFS2 PATCH 1/3] gfs2: cancel timed-out glock requests

2022-01-24 Thread Bob Peterson
From: Andreas Gruenbacher In the gfs2 evict code it tries to upgrade the iopen glock from SH to EX. If the attempt to upgrade times out, gfs2 needs to tell dlm to cancel the lock request or it can deadlock. We also need to wake up the process waiting for the lock when dlm sends its AST back to

[Cluster-devel] [GFS2 PATCH 0/3] Fix how gfs2 handles timed-out dlm requests

2022-01-24 Thread Bob Peterson
Recently, we introduced patches to time out lock requests that take too long, specifically for iopen glocks during ABBA deadlocks during evict. Before this patch set, gfs2 never canceled the failed requests that timed out, which can lead to deadlocks due to dlm keeping the requests on its

[Cluster-devel] [GFS2 PATCH 3/3] gfs2: Switch lock order of inode and iopen glock

2022-01-24 Thread Bob Peterson
From: Andreas Gruenbacher This patch tries to fix the continual ABBA deadlocks we keep having between the iopen and inode glocks. This switches the lock order in gfs2_inode_lookup and gfs2_create_inode so the iopen glock is always locked first. Signed-off-by: Andreas Gruenbacher Signed-off-by:

[Cluster-devel] [GFS2 PATCH 2/3] gfs2: Retry on dlm -EBUSY (stop gap)

2022-01-24 Thread Bob Peterson
Sometimes when gfs2 cancels a glock request, dlm needs time to take the request off its Conversion queue. During that time, we get -EBUSY from dlm, which confuses the glock state machine. Ideally we want dlm to not return -EBUSY but wait until the operation has completed. This is a stop-gap

Re: [Cluster-devel] [GFS2 PATCH v2 6/6] gfs2: introduce and use new glops go_lock_needed

2021-09-22 Thread Andreas Gruenbacher
On Wed, Sep 22, 2021 at 2:47 PM Bob Peterson wrote: > On 9/22/21 6:57 AM, Andreas Gruenbacher wrote: > > On Thu, Sep 16, 2021 at 9:11 PM Bob Peterson wrote: > >> Before this patch, when a glock was locked, the very first holder on the > >> queue would unlock the lockref and call the go_lock

Re: [Cluster-devel] [GFS2 PATCH v2 6/6] gfs2: introduce and use new glops go_lock_needed

2021-09-22 Thread Bob Peterson
On 9/22/21 6:57 AM, Andreas Gruenbacher wrote: On Thu, Sep 16, 2021 at 9:11 PM Bob Peterson wrote: Before this patch, when a glock was locked, the very first holder on the queue would unlock the lockref and call the go_lock glops function (if one exists), unless GL_SKIP was specified. When we

Re: [Cluster-devel] [GFS2 PATCH v2 6/6] gfs2: introduce and use new glops go_lock_needed

2021-09-22 Thread Andreas Gruenbacher
On Thu, Sep 16, 2021 at 9:11 PM Bob Peterson wrote: > Before this patch, when a glock was locked, the very first holder on the > queue would unlock the lockref and call the go_lock glops function (if > one exists), unless GL_SKIP was specified. When we introduced the new > node-scope concept, we

[Cluster-devel] [GFS2 PATCH v2 6/6] gfs2: introduce and use new glops go_lock_needed

2021-09-16 Thread Bob Peterson
Before this patch, when a glock was locked, the very first holder on the queue would unlock the lockref and call the go_lock glops function (if one exists), unless GL_SKIP was specified. When we introduced the new node-scope concept, we allowed multiple holders to lock glocks in EX mode and share

[Cluster-devel] [GFS2 PATCH v2 3/6] gfs2: move GL_SKIP check from glops to do_promote

2021-09-16 Thread Bob Peterson
Before this patch, each individual "go_lock" glock operation (glop) checked the GL_SKIP flag, and if set, would skip further processing. This patch changes the logic so the go_lock caller, function do_promote, checks the GL_SKIP flag before calling the go_lock op in the first place. This avoids

[Cluster-devel] [GFS2 PATCH v2 4/6] gfs2: Switch some BUG_ON to GLOCK_BUG_ON for debug

2021-09-16 Thread Bob Peterson
In rgrp.c there are several places where it does BUG_ON. This tells us the call stack but nothing more, which is not very helpful. This patch switches them to GLOCK_BUG_ON which also prints the glock, its holders, and many of the rgrp values, which will help us debug problems in the future.

[Cluster-devel] [GFS2 PATCH v2 5/6] gfs2: simplify do_promote and fix promote trace

2021-09-16 Thread Bob Peterson
Before this patch, the gfs2_promote kernel trace point would only record the "first" flag if the go_lock function was called. This patch simplifies do_promote by eliminating the redundant code in do_promote and fixes the trace point by adding a new gfs2_first_holder function. This will also

[Cluster-devel] [GFS2 PATCH v2 1/6] gfs2: remove redundant check in gfs2_rgrp_go_lock

2021-09-16 Thread Bob Peterson
Before this patch function gfs2_rgrp_go_lock checked if GL_SKIP and ar_rgrplvb were both true. However, GL_SKIP is only set for rgrps if ar_rgrplvb is true (see gfs2_inplace_reserve). This patch simply removes the redundant check. Signed-off-by: Bob Peterson --- fs/gfs2/rgrp.c | 3 +-- 1 file

[Cluster-devel] [GFS2 PATCH v2 2/6] gfs2: Add GL_SKIP holder flag to dump_holder

2021-09-16 Thread Bob Peterson
Somehow the GL_SKIP flag was missed when dumping glock holders. This patch adds it to function hflags2str. I added it at the end because I wanted Holder and Skip flags together to read "Hs" rather than "sH" to avoid confusion with "Shared" ("SH") holder state. Signed-off-by: Bob Peterson ---
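
The ordering point above ("Hs" rather than "sH") can be illustrated with a minimal userspace sketch. This is not the kernel's hflags2str; the DEMO_* bit values and the demo_hflags2str name are assumptions made up for illustration. It only shows that emitting the skip character after the holder character yields "Hs".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits for illustration only; not the kernel's values. */
#define DEMO_GL_EXACT    (1u << 0)
#define DEMO_HIF_HOLDER  (1u << 1)
#define DEMO_GL_SKIP     (1u << 2)

/*
 * Render holder flags as a short string. The skip flag is emitted last,
 * so a holder with both flags set reads "Hs" and cannot be mistaken for
 * the "SH" (shared) state name.
 */
static const char *demo_hflags2str(char *buf, size_t len, unsigned int flags)
{
	char *p = buf;

	assert(buf != NULL && len >= 4); /* room for three chars plus NUL */
	if (flags & DEMO_GL_EXACT)
		*p++ = 'E';
	if (flags & DEMO_HIF_HOLDER)
		*p++ = 'H';
	if (flags & DEMO_GL_SKIP)
		*p++ = 's';
	*p = '\0';
	return buf;
}
```
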

[Cluster-devel] [GFS2 PATCH v2 0/6] gfs2: fix bugs related to node_scope and go_lock

2021-09-16 Thread Bob Peterson
This set of patches contains a few clean-ups and a patch to fix a NULL Pointer dereference introduced by the new "node scope" patch 06e908cd9ead ("gfs2: Allow node-wide exclusive glock sharing"). Bob Peterson (6): gfs2: remove redundant check in gfs2_rgrp_go_lock gfs2: Add GL_SKIP holder flag

[Cluster-devel] [GFS2 PATCH 3/4] gfs2: move GL_SKIP check from glops to do_promote

2021-09-13 Thread Bob Peterson
Before this patch, each individual "go_lock" glock operation (glop) would check the GL_SKIP flag, and if set, would skip further processing. This patch changes the logic so the go_lock caller, function do_promote, checks the GL_SKIP flag before calling the go_lock op in the first place. There are

[Cluster-devel] [GFS2 PATCH 1/4] gfs2: remove redundant check in gfs2_rgrp_go_lock

2021-09-13 Thread Bob Peterson
Before this patch function gfs2_rgrp_go_lock checked if GL_SKIP and ar_rgrplvb were both true. However, GL_SKIP is only set for rgrps if ar_rgrplvb is true (see gfs2_inplace_reserve). This patch simply removes the redundant check. Signed-off-by: Bob Peterson --- fs/gfs2/rgrp.c | 3 +-- 1 file

[Cluster-devel] [GFS2 PATCH 2/4] gfs2: Add GL_SKIP holder flag to dump_holder

2021-09-13 Thread Bob Peterson
Somehow the GL_SKIP flag was missed when dumping glock holders. This patch adds it to function hflags2str. I added it at the end because I wanted Holder and Skip flags together to read "Hs" rather than "sH" to avoid confusion with "Shared" ("SH") holder state. Signed-off-by: Bob Peterson ---

[Cluster-devel] [GFS2 PATCH 4/4] gfs2: rework go_lock mechanism for node_scope race

2021-09-13 Thread Bob Peterson
Before this patch, glocks only performed their go_lock glop function when the first holder was queued. When we introduced the new "node scope" mechanism, we allowed multiple holders to hold a glock in EX at the same time, with local locking. But it introduced a new race: If the first holder

[Cluster-devel] [GFS2 PATCH 0/4] gfs2: fix bugs related to node_scope and go_lock

2021-09-13 Thread Bob Peterson
This set of patches contains a few clean-ups and a patch to fix a NULL Pointer dereference introduced by the new "node scope" patch 06e908cd9ead ("gfs2: Allow node-wide exclusive glock sharing"). Bob Peterson (4): gfs2: remove redundant check in gfs2_rgrp_go_lock gfs2: Add GL_SKIP holder flag

Re: [Cluster-devel] [GFS2 PATCH 1/3] gfs2: switch go_xmote_bh glop to pass gh not gl

2021-08-25 Thread Bob Peterson
On 8/24/21 5:27 PM, Andreas Gruenbacher wrote: On Tue, Aug 24, 2021 at 6:48 PM Bob Peterson wrote: On 8/24/21 11:12 AM, Andreas Gruenbacher wrote: On Tue, Aug 24, 2021 at 4:02 PM Bob Peterson wrote: Before this patch, the go_xmote_bh function was passed gl, the glock pointer. This patch

Re: [Cluster-devel] [GFS2 PATCH 1/3] gfs2: switch go_xmote_bh glop to pass gh not gl

2021-08-24 Thread Andreas Gruenbacher
On Tue, Aug 24, 2021 at 6:48 PM Bob Peterson wrote: > On 8/24/21 11:12 AM, Andreas Gruenbacher wrote: > > On Tue, Aug 24, 2021 at 4:02 PM Bob Peterson wrote: > >> Before this patch, the go_xmote_bh function was passed gl, the glock > >> pointer. This patch switches it to gh, the holder, which

Re: [Cluster-devel] [GFS2 PATCH 1/3] gfs2: switch go_xmote_bh glop to pass gh not gl

2021-08-24 Thread Bob Peterson
Hi, On 8/24/21 11:12 AM, Andreas Gruenbacher wrote: On Tue, Aug 24, 2021 at 4:02 PM Bob Peterson wrote: Before this patch, the go_xmote_bh function was passed gl, the glock pointer. This patch switches it to gh, the holder, which points to the gl. This facilitates improvements for the next

Re: [Cluster-devel] [GFS2 PATCH 1/3] gfs2: switch go_xmote_bh glop to pass gh not gl

2021-08-24 Thread Andreas Gruenbacher
On Tue, Aug 24, 2021 at 4:02 PM Bob Peterson wrote: > Before this patch, the go_xmote_bh function was passed gl, the glock > pointer. This patch switches it to gh, the holder, which points to the gl. > This facilitates improvements for the next patch. > > Signed-off-by: Bob Peterson > --- >

Re: [Cluster-devel] [GFS2 PATCH 2/3] gfs2: Fix broken freeze_go_xmote_bh

2021-08-24 Thread Andreas Gruenbacher
On Tue, Aug 24, 2021 at 4:02 PM Bob Peterson wrote: > The freeze glock was used in several places whenever a gfs2 file system > was frozen, thawed, mounted, unmounted, remounted, or withdrawn. It was > used to prevent those things from clashing with one another. > Function freeze_go_xmote_bh

[Cluster-devel] [GFS2 PATCH 0/3] gfs2: Fix freeze/thaw journal check problems

2021-08-24 Thread Bob Peterson
This patch set fixes some problems in which the freeze glock's glop functions were not working as expected. Bob Peterson (3): gfs2: switch go_xmote_bh glop to pass gh not gl gfs2: Fix broken freeze_go_xmote_bh gfs2: Eliminate go_xmote_bh in favor of go_lock fs/gfs2/glock.c | 12

[Cluster-devel] [GFS2 PATCH 2/3] gfs2: Fix broken freeze_go_xmote_bh

2021-08-24 Thread Bob Peterson
The freeze glock was used in several places whenever a gfs2 file system was frozen, thawed, mounted, unmounted, remounted, or withdrawn. It was used to prevent those things from clashing with one another. Function freeze_go_xmote_bh only checked if the journal was clean in cases where the journal

[Cluster-devel] [GFS2 PATCH 3/3] gfs2: Eliminate go_xmote_bh in favor of go_lock

2021-08-24 Thread Bob Peterson
Before this patch, the freeze glock was the only glock to use the go_xmote_bh glock op (glop). The go_xmote_bh glop is done when a glock is locked. But so is go_lock. This patch eliminates the glop altogether in favor of just using go_lock for the freeze glock. This is for better consistency,

[Cluster-devel] [GFS2 PATCH 1/3] gfs2: switch go_xmote_bh glop to pass gh not gl

2021-08-24 Thread Bob Peterson
Before this patch, the go_xmote_bh function was passed gl, the glock pointer. This patch switches it to gh, the holder, which points to the gl. This facilitates improvements for the next patch. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 4 ++-- fs/gfs2/glops.c | 5 +++--

Re: [Cluster-devel] [GFS2 PATCH 13/15] gfs2: ignore usr|grp|prjquota mount options

2021-08-02 Thread Andrew Price
On 28/07/2021 21:32, Bob Peterson wrote: On 7/28/21 1:28 PM, Andreas Gruenbacher wrote: On Tue, Jul 27, 2021 at 7:37 PM Bob Peterson wrote: Before this patch, gfs2 rejected mounts attempted with the usrquota, grpquota, or prjquota mount options. That caused numerous xfstests tests to fail.

Re: [Cluster-devel] [GFS2 PATCH 13/15] gfs2: ignore usr|grp|prjquota mount options

2021-07-28 Thread Andreas Gruenbacher
On Wed, Jul 28, 2021 at 10:33 PM Bob Peterson wrote: > On 7/28/21 1:28 PM, Andreas Gruenbacher wrote: > > On Tue, Jul 27, 2021 at 7:37 PM Bob Peterson wrote: > >> Before this patch, gfs2 rejected mounts attempted with the usrquota, > >> grpquota, or prjquota mount options. That caused numerous

Re: [Cluster-devel] [GFS2 PATCH 13/15] gfs2: ignore usr|grp|prjquota mount options

2021-07-28 Thread Bob Peterson
On 7/28/21 1:28 PM, Andreas Gruenbacher wrote: On Tue, Jul 27, 2021 at 7:37 PM Bob Peterson wrote: Before this patch, gfs2 rejected mounts attempted with the usrquota, grpquota, or prjquota mount options. That caused numerous xfstests tests to fail. This patch allows gfs2 to accept but ignore

Re: [Cluster-devel] [GFS2 PATCH 13/15] gfs2: ignore usr|grp|prjquota mount options

2021-07-28 Thread Andreas Gruenbacher
On Tue, Jul 27, 2021 at 7:37 PM Bob Peterson wrote: > Before this patch, gfs2 rejected mounts attempted with the usrquota, > grpquota, or prjquota mount options. That caused numerous xfstests tests > to fail. This patch allows gfs2 to accept but ignore those mount options > so the tests may be

Re: [Cluster-devel] [GFS2 PATCH 10/10] gfs2: replace sd_aspace with sd_inode

2021-07-28 Thread Jan Kara
On Wed 28-07-21 09:57:01, Steven Whitehouse wrote: > On Wed, 2021-07-28 at 08:50 +0200, Andreas Gruenbacher wrote: > > On Tue, Jul 13, 2021 at 9:34 PM Bob Peterson > > wrote: > > > On 7/13/21 1:26 PM, Steven Whitehouse wrote: > > > > > > Hi, > > > > > > On Tue, 2021-07-13 at 13:09 -0500, Bob

Re: [Cluster-devel] [GFS2 PATCH 10/10] gfs2: replace sd_aspace with sd_inode

2021-07-28 Thread Steven Whitehouse
Hi, On Wed, 2021-07-28 at 08:50 +0200, Andreas Gruenbacher wrote: > On Tue, Jul 13, 2021 at 9:34 PM Bob Peterson > wrote: > > On 7/13/21 1:26 PM, Steven Whitehouse wrote: > > > > Hi, > > > > On Tue, 2021-07-13 at 13:09 -0500, Bob Peterson wrote: > > > > Before this patch, gfs2 kept its own

Re: [Cluster-devel] [GFS2 PATCH 10/10] gfs2: replace sd_aspace with sd_inode

2021-07-28 Thread Andreas Gruenbacher
On Tue, Jul 13, 2021 at 9:34 PM Bob Peterson wrote: > On 7/13/21 1:26 PM, Steven Whitehouse wrote: > > Hi, > > On Tue, 2021-07-13 at 13:09 -0500, Bob Peterson wrote: > > Before this patch, gfs2 kept its own address space for rgrps, but > this > caused a lockdep problem because vfs assumes a 1:1

Re: [Cluster-devel] [GFS2 PATCH 09/15] gfs2: fix deadlock in gfs2_ail1_empty withdraw

2021-07-27 Thread Andreas Gruenbacher
Hi Bob, On Tue, Jul 27, 2021 at 7:37 PM Bob Peterson wrote: > Before this patch, function gfs2_ail1_empty could issue a file system > withdraw when IO errors were discovered. However, there are several > callers, including gfs2_flush_revokes() which holds the gfs2_log_lock > before calling

[Cluster-devel] [GFS2 PATCH 10/15] gfs2: replace sd_aspace with sd_inode

2021-07-27 Thread Bob Peterson
Before this patch, gfs2 kept its own address space for rgrps, but this caused a lockdep problem because vfs assumes a 1:1 relationship between address spaces and their inode. One problematic area is this: gfs2_unpin mark_buffer_dirty(bh); mapping = page_mapping(page);

[Cluster-devel] [GFS2 PATCH 07/15] gfs2: init system threads before freeze lock

2021-07-27 Thread Bob Peterson
Patch 96b1454f2e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") changed the gfs2 mount sequence so that it holds the freeze lock before calling gfs2_make_fs_rw. Before this patch, gfs2_make_fs_rw called init_threads to initialize the quotad and logd threads. That is a problem

[Cluster-devel] [GFS2 PATCH 14/15] fs: Move notify_change permission checks into may_setattr

2021-07-27 Thread Bob Peterson
From: Andreas Gruenbacher Move the permission checks in notify_change into a separate function to make them available to filesystems. When notify_change is called, the vfs performs those checks before calling into iop->setattr. However, a filesystem like gfs2 can only lock and revalidate the

[Cluster-devel] [GFS2 PATCH 08/15] gfs2: Don't release and reacquire local statfs bh

2021-07-27 Thread Bob Peterson
Before this patch, several functions in gfs2 related to the updating of the statfs file used a newly acquired/read buffer_head for the local statfs file. This is completely unnecessary, because other nodes should never update it. Recreating the buffer is a waste of time. This patch allows gfs2 to

[Cluster-devel] [GFS2 PATCH 11/15] gfs2: reduce redundant code in gfs2_trans_add_*

2021-07-27 Thread Bob Peterson
Before this patch, functions gfs2_trans_add_data and gfs2_trans_add_meta did similar checks to see if the buffer_head had an existing bd element, and if not, assigned one, temporarily dropping locks to allow for better simultaneous operations. These checks were identical except that the meta

[Cluster-devel] [GFS2 PATCH 13/15] gfs2: ignore usr|grp|prjquota mount options

2021-07-27 Thread Bob Peterson
Before this patch, gfs2 rejected mounts attempted with the usrquota, grpquota, or prjquota mount options. That caused numerous xfstests tests to fail. This patch allows gfs2 to accept but ignore those mount options so the tests may be run. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c |
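
The "accept but ignore" behaviour described above can be sketched roughly as follows. This is a standalone illustration, not the gfs2 option parser: the demo_check_opt function and the demo_opt_result enum are hypothetical names, and only the three option strings come from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_opt_result {
	DEMO_OPT_IGNORED,  /* recognized, but deliberately has no effect */
	DEMO_OPT_UNKNOWN   /* not recognized; a caller would reject the mount */
};

/* Classify one mount option string (hypothetical helper). */
static enum demo_opt_result demo_check_opt(const char *opt)
{
	static const char *const ignored[] = { "usrquota", "grpquota", "prjquota" };
	size_t i;

	assert(opt != NULL);
	for (i = 0; i < sizeof(ignored) / sizeof(ignored[0]); i++) {
		if (strcmp(opt, ignored[i]) == 0)
			return DEMO_OPT_IGNORED;
	}
	return DEMO_OPT_UNKNOWN;
}
```
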

[Cluster-devel] [GFS2 PATCH 15/15] gfs2: Switch to may_setattr in gfs2_setattr

2021-07-27 Thread Bob Peterson
From: Andreas Gruenbacher The permission check in gfs2_setattr is an old and outdated version of may_setattr(). Switch to the updated version. Fixes fstest generic/079. Signed-off-by: Andreas Gruenbacher Signed-off-by: Bob Peterson --- fs/gfs2/inode.c | 4 ++-- 1 file changed, 2

[Cluster-devel] [GFS2 PATCH 05/15] gfs2: trivial clean up of gfs2_ail_error

2021-07-27 Thread Bob Peterson
This patch does not change function. It adds variable sdp to clean up function gfs2_ail_error and make it more readable. Signed-off-by: Bob Peterson --- fs/gfs2/glops.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c index

[Cluster-devel] [GFS2 PATCH 04/15] gfs2: be more verbose replaying invalid rgrp blocks

2021-07-27 Thread Bob Peterson
This patch adds some crucial information when journal replay detects a replay of an obsolete rgrp block. For example, it wasn't printing the journal id or the generation number played. This just supplements what is logged in this unusual case. The function that actually complains about the

[Cluster-devel] [GFS2 PATCH 00/15] gfs2: misc. patch collection (V2)

2021-07-27 Thread Bob Peterson
This is version 2 of a set of misc. patches from my collection. As before, they can be added individually or as a set. Changes from V1: 1. I added a wrapper patch Andreas wrote. I'm not sure how serious he is about this one. 2. This set omits the patch "New log flush watchdog" due to Steve

[Cluster-devel] [GFS2 PATCH 09/15] gfs2: fix deadlock in gfs2_ail1_empty withdraw

2021-07-27 Thread Bob Peterson
Before this patch, function gfs2_ail1_empty could issue a file system withdraw when IO errors were discovered. However, there are several callers, including gfs2_flush_revokes() which holds the gfs2_log_lock before calling gfs2_ail1_empty. If gfs2_ail1_empty needed to withdraw it would leave the

[Cluster-devel] [GFS2 PATCH 12/15] gfs2: Make recovery error more readable

2021-07-27 Thread Bob Peterson
Before this patch, withdraws could cause an error that looked like: Journal recovery skipped for 0 until next mount. This patch changes it to a more readable: Journal recovery skipped for jid 0 until next mount. Signed-off-by: Bob Peterson --- fs/gfs2/util.c | 2 +- 1 file changed, 1
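
As a rough userspace illustration of the wording change above, the sketch below formats the new message into a buffer. The demo_format_skip_msg helper is hypothetical; the kernel logs the message through its own facilities, and only the message text is taken from the patch description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: builds the message in buf instead of logging it.
 * Labelling the number as "jid" makes it clear that it is a journal id.
 */
static int demo_format_skip_msg(char *buf, size_t len, unsigned int jid)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			"Journal recovery skipped for jid %u until next mount.",
			jid);
}
```
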

[Cluster-devel] [GFS2 PATCH 02/15] gfs2: Fix glock recursion in freeze_go_xmote_bh

2021-07-27 Thread Bob Peterson
We must not call gfs2_consist (which does a file system withdraw) from the freeze glock's freeze_go_xmote_bh function because the withdraw will try to use the freeze glock, thus causing a glock recursion error. This patch changes freeze_go_xmote_bh to call function gfs2_assert_withdraw_delayed

[Cluster-devel] [GFS2 PATCH 01/15] gfs2: Add wrapper for iomap_file_buffered_write

2021-07-27 Thread Bob Peterson
From: Andreas Gruenbacher Add a wrapper around iomap_file_buffered_write. We'll add code for when the operation needs to be retried here later. Signed-off-by: Andreas Gruenbacher --- fs/gfs2/file.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git

[Cluster-devel] [GFS2 PATCH 06/15] gfs2: tiny cleanup in gfs2_log_reserve

2021-07-27 Thread Bob Peterson
Function gfs2_log_reserve was setting revoke_blks to 0. There's no need because it calculates it shortly thereafter. This patch removes the unnecessary set. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/gfs2/log.c

[Cluster-devel] [GFS2 PATCH 03/15] gfs2: Eliminate go_xmote_bh in favor of go_lock

2021-07-27 Thread Bob Peterson
Before this patch, the freeze glock was the only glock to use the go_xmote_bh glock op (glop). The go_xmote_bh glop is done when a glock is locked. But so is go_lock. This patch eliminates the glop altogether in favor of just using go_lock for the freeze glock. This is for better consistency,

Re: [Cluster-devel] [GFS2 PATCH 08/10] gfs2: New log flush watchdog

2021-07-14 Thread Steven Whitehouse
Hi, On Tue, 2021-07-13 at 15:03 -0500, Bob Peterson wrote: > On 7/13/21 1:41 PM, Steven Whitehouse wrote: > > Hi, > > > > On Tue, 2021-07-13 at 13:09 -0500, Bob Peterson wrote: > > > This patch adds a new watchdog whose sole purpose is to complain > > > when > > > gfs2_log_flush operations are

Re: [Cluster-devel] [GFS2 PATCH 08/10] gfs2: New log flush watchdog

2021-07-13 Thread Steven Whitehouse
Hi, On Tue, 2021-07-13 at 13:09 -0500, Bob Peterson wrote: > This patch adds a new watchdog whose sole purpose is to complain when > gfs2_log_flush operations are taking too long. > This one is a bit confusing. It says that it is to check if the log flush is taking too long, but it appears to

Re: [Cluster-devel] [GFS2 PATCH 10/10] gfs2: replace sd_aspace with sd_inode

2021-07-13 Thread Steven Whitehouse
Hi, On Tue, 2021-07-13 at 13:09 -0500, Bob Peterson wrote: > Before this patch, gfs2 kept its own address space for rgrps, but > this > caused a lockdep problem because vfs assumes a 1:1 relationship > between > address spaces and their inode. One problematic area is this: > I don't think that

[Cluster-devel] [GFS2 PATCH 02/10] gfs2: Eliminate go_xmote_bh in favor of go_lock

2021-07-13 Thread Bob Peterson
Before this patch, the freeze glock was the only glock to use the go_xmote_bh glock op (glop). The go_xmote_bh glop is done when a glock is locked. But so is go_lock. This patch eliminates the glop altogether in favor of just using go_lock for the freeze glock. This is for better consistency,

[Cluster-devel] [GFS2 PATCH 06/10] gfs2: init system threads before freeze lock

2021-07-13 Thread Bob Peterson
Patch 96b1454f2e ("gfs2: move freeze glock outside the make_fs_rw and _ro functions") changed the gfs2 mount sequence so that it holds the freeze lock before calling gfs2_make_fs_rw. Before this patch, gfs2_make_fs_rw called init_threads to initialize the quotad and logd threads. That is a problem

[Cluster-devel] [GFS2 PATCH 10/10] gfs2: replace sd_aspace with sd_inode

2021-07-13 Thread Bob Peterson
Before this patch, gfs2 kept its own address space for rgrps, but this caused a lockdep problem because vfs assumes a 1:1 relationship between address spaces and their inode. One problematic area is this: gfs2_unpin mark_buffer_dirty(bh); mapping = page_mapping(page);

[Cluster-devel] [GFS2 PATCH 03/10] gfs2: be more verbose replaying invalid rgrp blocks

2021-07-13 Thread Bob Peterson
This patch adds some crucial information when journal replay detects a replay of an obsolete rgrp block. For example, it wasn't printing the journal id or the generation number played. This just supplements what is logged in this unusual case. The function that actually complains about the

[Cluster-devel] [GFS2 PATCH 09/10] gfs2: fix deadlock in gfs2_ail1_empty withdraw

2021-07-13 Thread Bob Peterson
Before this patch, function gfs2_ail1_empty could issue a file system withdraw when IO errors were discovered. However, there are several callers, including gfs2_flush_revokes() which holds the gfs2_log_lock before calling gfs2_ail1_empty. If gfs2_ail1_empty needed to withdraw it would leave the

[Cluster-devel] [GFS2 PATCH 00/10] gfs2: misc. patch collection

2021-07-13 Thread Bob Peterson
This is a set of 10 patches from my collection. They can be added individually or as a set. Bob Peterson (10): gfs2: Fix glock recursion in freeze_go_xmote_bh gfs2: Eliminate go_xmote_bh in favor of go_lock gfs2: be more verbose replaying invalid rgrp blocks gfs2: trivial clean up of

[Cluster-devel] [GFS2 PATCH 07/10] gfs2: Don't release and reacquire local statfs bh

2021-07-13 Thread Bob Peterson
Before this patch, several functions in gfs2 related to the updating of the statfs file used a newly acquired/read buffer_head for the local statfs file. This is completely unnecessary, because other nodes should never update it. Recreating the buffer is a waste of time. This patch allows gfs2 to

[Cluster-devel] [GFS2 PATCH 08/10] gfs2: New log flush watchdog

2021-07-13 Thread Bob Peterson
This patch adds a new watchdog whose sole purpose is to complain when gfs2_log_flush operations are taking too long. Signed-off-by: Bob Peterson --- fs/gfs2/incore.h | 6 ++ fs/gfs2/log.c| 47 fs/gfs2/log.h| 1 +

[Cluster-devel] [GFS2 PATCH 01/10] gfs2: Fix glock recursion in freeze_go_xmote_bh

2021-07-13 Thread Bob Peterson
We must not call gfs2_consist (which does a file system withdraw) from the freeze glock's freeze_go_xmote_bh function because the withdraw will try to use the freeze glock, thus causing a glock recursion error. This patch changes freeze_go_xmote_bh to call function gfs2_assert_withdraw_delayed

[Cluster-devel] [GFS2 PATCH 05/10] gfs2: tiny cleanup in gfs2_log_reserve

2021-07-13 Thread Bob Peterson
Function gfs2_log_reserve was setting revoke_blks to 0. There's no need because it calculates it shortly thereafter. This patch removes the unnecessary set. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/gfs2/log.c

[Cluster-devel] [GFS2 PATCH 04/10] gfs2: trivial clean up of gfs2_ail_error

2021-07-13 Thread Bob Peterson
This patch does not change function. It adds variable sdp to clean up function gfs2_ail_error and make it more readable. Signed-off-by: Bob Peterson --- fs/gfs2/glops.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c index

[Cluster-devel] [gfs2 PATCH] gfs2: Eliminate go_xmote_bh in favor of go_lock

2021-06-01 Thread Bob Peterson
Before this patch, the freeze glock was the only glock to use the go_xmote_bh glock op (glop). The go_xmote_bh glop is done when a glock is locked. But so is go_lock. This patch eliminates the glop altogether in favor of just using go_lock for the freeze glock. This is for better consistency,

[Cluster-devel] [gfs2 PATCH] gfs2: Fix glock recursion in freeze_go_xmote_bh

2021-06-01 Thread Bob Peterson
We must not call gfs2_consist (which does a file system withdraw) from the freeze glock's freeze_go_xmote_bh function because the withdraw will try to use the freeze glock, thus causing a glock recursion error. This patch changes freeze_go_xmote_bh to call function gfs2_assert_withdraw_delayed

[Cluster-devel] [GFS2 PATCH] gfs2: Clean up revokes on normal withdraws

2021-05-19 Thread Bob Peterson
Before this patch, the system ail lists were cleaned up if the logd process withdrew, but on other withdraws, they were not cleaned up. This included the cleaning up of the revokes as well. This patch reorganizes things a bit so that all withdraws (not just logd) clean up the ail lists, including

[Cluster-devel] [GFS2 PATCH] gfs2: Fix I_NEW check in gfs2_dinode_in

2021-05-19 Thread Bob Peterson
Patch 4a378d8a0d96 added a new check for I_NEW inodes, but unfortunately it used the wrong variable, i_flags. This caused GFS2 to withdraw when gfs2_lookup_by_inum needed to refresh an I_NEW inode. This patch switches to use the correct variable, i_state. Fixes: 4a378d8a0d96 ("gfs2: be careful
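
The i_flags versus i_state mix-up described above can be sketched in userspace as follows. The demo_inode struct, the DEMO_I_NEW bit value, and demo_inode_is_new are hypothetical stand-ins, not the kernel's definitions; the sketch only shows that the same bit tested in the wrong field gives a different answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_I_NEW (1u << 3) /* hypothetical state bit for illustration */

struct demo_inode {
	unsigned int i_flags;   /* attribute flags; the "new" bit does not live here */
	unsigned long i_state;  /* lifecycle state bits, where the "new" bit lives */
};

/* Correct check: test the state field, not the flags field. */
static bool demo_inode_is_new(const struct demo_inode *inode)
{
	assert(inode != NULL);
	return (inode->i_state & DEMO_I_NEW) != 0;
}
```
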

[Cluster-devel] [gfs2 PATCH] gfs2: fix a deadlock on withdraw-during mount

2021-05-18 Thread Bob Peterson
Before this patch, gfs2 would deadlock because of the following sequence during mount: mount gfs2_fill_super gfs2_make_fs_rw <--- Detects IO error with glock kthread_stop(sdp->sd_quotad_process); <--- Blocked waiting for quotad to finish logd Detects IO error and

[Cluster-devel] [gfs2 patch] gfs2: fix scheduling while atomic bug in glocks

2021-05-18 Thread Bob Peterson
Before this patch, in the unlikely event that gfs2_glock_dq encountered a withdraw, it would do a wait_on_bit to wait for its journal to be recovered, but it never released the glock's spin_lock, which caused a scheduling-while-atomic error. This patch unlocks the lockref spin_lock before waiting

Re: [Cluster-devel] [gfs2 PATCH] gfs2: allocate pages for clone bitmaps

2021-04-12 Thread Andreas Gruenbacher
On Sat, Apr 10, 2021 at 3:49 PM Bob Peterson wrote: > Resource group (rgrp) bitmaps have in-core-only "clone" bitmaps that > ensure freed fs space from deletes are not reused until the transaction > is complete. Before this patch, these clone bitmaps were allocated with > kmalloc, but with the

Re: [Cluster-devel] [GFS2 PATCH v2] gfs2: fast dealloc for exhash directories

2021-04-06 Thread Andreas Gruenbacher
On Mon, Mar 22, 2021 at 3:15 PM Bob Peterson wrote: > Before this patch, whenever a directory was deleted, it called function > __gfs2_dir_exhash_dealloc to deallocate the directory's leaf blocks. > But __gfs2_dir_exhash_dealloc never knew if any given leaf block had > leaf continuation aka

[Cluster-devel] [GFS2 PATCH] gfs2: don't create empty buffers for NO_CREATE

2021-03-25 Thread Bob Peterson
Before this patch, function gfs2_getbuf would create empty buffers when it was given the NO_CREATE directive from gfs2_journal_wipe. This is a waste of time: the buffer_head is only used by gfs2_remove_from_journal to determine if the buffer is pinned (which it won't be if it's newly created) and

[Cluster-devel] [GFS2 PATCH] gfs2: report "already frozen/thawed" errors

2021-03-25 Thread Bob Peterson
Before this patch, gfs2's freeze function failed to report an error when the target file system was already frozen as it should (and as generic vfs function freeze_super does). Similarly, gfs2's thaw function failed to report an error when trying to thaw a file system that is not frozen, as vfs

[Cluster-devel] [GFS2 PATCH] gfs2: Fix dir.c function parameter descriptions

2021-03-22 Thread Bob Peterson
This patch simply fixes a bunch of function parameter comments in dir.c that were reported by the kernel test robot. Reported-by: kernel test robot Signed-off-by: Bob Peterson --- fs/gfs2/dir.c | 39 ++- 1 file changed, 22 insertions(+), 17 deletions(-)

[Cluster-devel] [GFS2 PATCH v2] gfs2: fast dealloc for exhash directories

2021-03-22 Thread Bob Peterson
Before this patch, whenever a directory was deleted, it called function __gfs2_dir_exhash_dealloc to deallocate the directory's leaf blocks. But __gfs2_dir_exhash_dealloc never knew if any given leaf block had leaf continuation aka "next" blocks, so it read every single leaf block in, only to

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Add new sysfs file for gfs2 status

2021-03-19 Thread Bob Peterson
- Original Message - > On 19/03/2021 12:06, Bob Peterson wrote: > > This patch adds a new file: /sys/fs/gfs2/*/status which will report > > the status of the file system. Catting this file dumps the current > > status of the file system according to various superblock variables. > > For

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Add new sysfs file for gfs2 status

2021-03-19 Thread Andrew Price
On 19/03/2021 12:06, Bob Peterson wrote: This patch adds a new file: /sys/fs/gfs2/*/status which will report the status of the file system. Catting this file dumps the current status of the file system according to various superblock variables. For example: Journal Checked: 1 Journal

Re: [Cluster-devel] [GFS2 PATCH] gfs2: fast dealloc for exhash directories

2021-03-19 Thread Bob Peterson
- Original Message - > Before this patch, whenever a directory was deleted, it called function > __gfs2_dir_exhash_dealloc to deallocate the directory's leaf blocks. > But __gfs2_dir_exhash_dealloc never knew if any given leaf block had > leaf continuation aka "next" blocks, so it read

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Add new sysfs file for gfs2 status

2021-03-19 Thread Bob Peterson
Hi, - Original Message - > On 19/03/2021 12:06, Bob Peterson wrote: > > This patch adds a new file: /sys/fs/gfs2/*/status which will report > > the status of the file system. Catting this file dumps the current > > status of the file system according to various superblock variables. > >

[Cluster-devel] [GFS2 PATCH] gfs2: fast dealloc for exhash directories

2021-03-19 Thread Bob Peterson
Before this patch, whenever a directory was deleted, it called function __gfs2_dir_exhash_dealloc to deallocate the directory's leaf blocks. But __gfs2_dir_exhash_dealloc never knew if any given leaf block had leaf continuation aka "next" blocks, so it read every single leaf block in, only to

Re: [Cluster-devel] [GFS2 PATCH] gfs2: Add new sysfs file for gfs2 status

2021-03-19 Thread Steven Whitehouse
On 19/03/2021 12:06, Bob Peterson wrote: This patch adds a new file: /sys/fs/gfs2/*/status which will report the status of the file system. Catting this file dumps the current status of the file system according to various superblock variables. For example: Journal Checked: 1 Journal

[Cluster-devel] [GFS2 PATCH] gfs2: Add new sysfs file for gfs2 status

2021-03-19 Thread Bob Peterson
This patch adds a new file: /sys/fs/gfs2/*/status which will report the status of the file system. Catting this file dumps the current status of the file system according to various superblock variables. For example: Journal Checked: 1 Journal Live: 1 Journal ID:

[Cluster-devel] [GFS2 PATCH] gfs2: Eliminate gh parameter from go_xmote_bh func

2021-03-19 Thread Bob Peterson
The only glock that uses go_xmote_bh glops function is the freeze glock which uses freeze_go_xmote_bh. It does not use its gh parameter, so this patch eliminates the unneeded parameter. Signed-off-by: Bob Peterson --- fs/gfs2/glock.c | 2 +- fs/gfs2/glops.c | 2 +- fs/gfs2/incore.h | 2 +- 3

[Cluster-devel] [GFS2 PATCH] gfs2: Bypass log flush if the journal is not live

2021-03-12 Thread Bob Peterson
Patch fe3e397668775 ("gfs2: Rework the log space allocation logic") changed the log flush logic such that it reserves a set of journal blocks for cases in which log_flush has no active transaction but still needs blocks for revokes. However, these blocks were still requested even if the journal is

[Cluster-devel] [GFS2 PATCH] gfs2: bypass signal_our_withdraw if no journal

2021-03-12 Thread Bob Peterson
Before this patch, function signal_our_withdraw referenced the journal inode immediately. But corrupt file systems may have some invalid journals, in which case our attempt to read it in will withdraw and the resulting signal_our_withdraw would dereference the NULL value. This patch adds a check

[Cluster-devel] [GFS2 PATCH] gfs2: fix use-after-free in trans_drain

2021-03-03 Thread Bob Peterson
This patch adds code to function trans_drain to remove drained bd elements from the ail lists, if queued, before freeing the bd. If we don't remove the bd from the ail, function ail_drain will try to reference the bd after it has been freed by trans_drain. Thanks to Andy Price for his analysis of

[Cluster-devel] [GFS2 PATCH] gfs2: In gfs2_ail1_start_one unplug the IO when needed

2021-02-18 Thread Bob Peterson
This patch adds a check in gfs2_ail1_start_one to see if our current IO needs to be unplugged and replugged. If we don't unplug it once in a while, our IO can get so clogged up that none of our pages for ail1 items can be written because they're all in PageWriteback but they can't be written

Re: [Cluster-devel] [gfs2 PATCH] gfs2: Don't skip dlm unlock if glock has an lvb

2021-02-08 Thread Steven Whitehouse
Hi, Longer term we should review whether this is really the correct fix. It seems a bit strange that we have to do something different according to whether there is an LVB or not. We are gradually increasing LVB use over time too. So should we fix the DLM so that either it can cope with locks
