Re: [Cluster-devel] GFS2, lockdep and deadlocks on 4.19

2019-02-27 Thread Andreas Gruenbacher
Edwin, On Wed, 27 Feb 2019 at 17:11, Edwin Török wrote: > I was trying to use lockdep to track down a GFS2 deadlock in kernel 4.19.19. > I was not able to reproduce the deadlock with CONFIG_PROVE_LOCKING, however I > got quite a lot of warnings about GFS2 code calling functions without holding

Re: [Cluster-devel] iomap_dio_rw: lockdep_assert_held(&inode->i_rwsem)

2019-02-27 Thread Andreas Gruenbacher
Hi Christoph, I've tried to get your feedback on the lockdep_assert_held in iomap_dio_rw in January and didn't hear back from you. Could you please have a look? On Tue, 22 Jan 2019 at 22:26, Andreas Gruenbacher wrote: > > Hi Christoph, > > there's an assertion that the inode rwsem is taken in

[Cluster-devel] [PATCH 01/15] gfs2: log error reform

2019-02-27 Thread Bob Peterson
Before this patch, gfs2 kept track of journal io errors in two places (sd_log_error and the SDF_AIL1_IO_ERROR flag in sd_flags). This patch consolidates the two by eliminating the SDF_AIL1_IO_ERROR flag in favor of an atomic count of journal errors, sd_log_errors. When the first io error occurs and
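
The idea of folding a flag and an error field into one atomic counter can be sketched in userspace C11. This is only an illustration, not the patch itself: `struct sketch_sbd`, `sketch_log_error` and `sketch_log_has_errors` are hypothetical names, and only the counter (standing in for sd_log_errors) is modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the gfs2 superblock; only the counter is modeled. */
struct sketch_sbd {
    atomic_int log_errors; /* plays the role of sd_log_errors */
};

/*
 * Record one journal I/O error.
 * Returns true only for the caller that recorded the first error, so a
 * single caller can react to the 0 -> 1 transition.
 */
static bool sketch_log_error(struct sketch_sbd *sdp)
{
    assert(sdp != NULL);
    /* atomic_fetch_add returns the previous value: 0 means "first error". */
    return atomic_fetch_add(&sdp->log_errors, 1) == 0;
}

/* One test replaces checking both a flag and an error field. */
static bool sketch_log_has_errors(struct sketch_sbd *sdp)
{
    assert(sdp != NULL);
    return atomic_load(&sdp->log_errors) != 0;
}
```

With a single counter there is one source of truth: "has the journal seen an error?" is a nonzero test, and the return value of the increment identifies the first error without a separate flag.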

[Cluster-devel] [PATCH 03/15] gfs2: Ignore recovery attempts if gfs2 has io error or is withdrawn

2019-02-27 Thread Bob Peterson
This patch addresses various problems with gfs2/dlm recovery. For example, suppose a node with a bunch of gfs2 mounts suddenly reboots due to kernel panic, and dlm determines it should perform recovery. DLM does so from a pseudo-state machine calling various callbacks into lock_dlm to perform a

[Cluster-devel] [PATCH 00/15] GFS2: Withdraw corruption patches [V2]

2019-02-27 Thread Bob Peterson
Hi, This is a revision to the patch set I sent on 13 February 2019. These won't make this merge window, obviously, because that's almost upon us. This version fixes some glaring mistakes and problems of the first set. As before, this may not be the final version, but I wanted to put it out for

[Cluster-devel] [PATCH 05/15] gfs2: Allow some glocks to be used during withdraw

2019-02-27 Thread Bob Peterson
Before this patch, when a file system was withdrawn, all further attempts to enqueue or promote glocks were rejected and returned -EIO. This is only important for media-backed glocks like inode and rgrp glocks. All other glocks may be safely used because there is no potential for metadata
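
A rough userspace sketch of the distinction drawn above, based only on this truncated description: media-backed glocks (inode, rgrp) are refused with -EIO once the file system is withdrawn, while other glocks stay usable. The enum, struct and function names are hypothetical and do not come from the gfs2 source.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical glock categories; real gfs2 has more types than this. */
enum sketch_glock_type {
    SKETCH_GLOCK_INODE, /* media-backed */
    SKETCH_GLOCK_RGRP,  /* media-backed */
    SKETCH_GLOCK_OTHER  /* anything not backed by on-disk metadata */
};

struct sketch_glock {
    enum sketch_glock_type type;
};

/* Media-backed glocks are the ones whose use could touch metadata. */
static bool sketch_media_backed(const struct sketch_glock *gl)
{
    assert(gl != NULL);
    return gl->type == SKETCH_GLOCK_INODE || gl->type == SKETCH_GLOCK_RGRP;
}

/*
 * Decide whether an enqueue may proceed.
 * Returns 0 if allowed, -EIO if the file system is withdrawn and the
 * glock is media-backed.
 */
static int sketch_glock_enqueue(const struct sketch_glock *gl, bool withdrawn)
{
    if (withdrawn && sketch_media_backed(gl))
        return -EIO;
    return 0;
}
```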

[Cluster-devel] [PATCH 15/15] gfs2: log which portion of the journal is replayed

2019-02-27 Thread Bob Peterson
When a journal is replayed, gfs2 logs a message similar to: jid=X: Replaying journal... This patch adds the tail and block number so that the range of the replayed block is also printed. These values will match the values shown if the journal is dumped with gfs2_edit -p journalX. The resulting

[Cluster-devel] [PATCH 06/15] gfs2: Make secondary withdrawers wait for first withdrawer

2019-02-27 Thread Bob Peterson
Before this patch, if a process encountered an error and decided to withdraw while another process was already withdrawing, the secondary withdraw would be silently ignored, which set it free to proceed with its processing, unlock any locks, etc. That's correct behavior if the

[Cluster-devel] [PATCH 10/15] gfs2: Check for log write errors before telling dlm to unlock

2019-02-27 Thread Bob Peterson
Before this patch, function do_xmote just assumed all the writes submitted to the journal were finished and successful, and it called the go_unlock function to release the dlm lock. But if they're not, and a revoke failed to make its way to the journal, a journal replay on another node will cause

[Cluster-devel] [PATCH 14/15] gfs2: Warn when a journal replay overwrites a rgrp with buffers

2019-02-27 Thread Bob Peterson
This patch adds some instrumentation in gfs2's journal replay that indicates when we're about to overwrite a rgrp for which we already have a valid buffer_head. Signed-off-by: Bob Peterson --- fs/gfs2/lops.c | 22 -- 1 file changed, 20 insertions(+), 2 deletions(-) diff

[Cluster-devel] [PATCH 12/15] gfs2: If the journal isn't live ignore log flushes

2019-02-27 Thread Bob Peterson
This patch adds a check to function gfs2_log_flush: if the journal is no longer live, the flush is ignored. Signed-off-by: Bob Peterson --- fs/gfs2/log.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c index 0def6343e618..8199b235790f 100644 ---
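
The shape of such a guard can be shown with a small userspace sketch. It is not the three-line change from the patch (the diff is cut off above): `struct sketch_journal` and `sketch_log_flush` are hypothetical names, and the `live` field is an assumed stand-in for however gfs2 tracks journal liveness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical journal state; only what the guard needs is modeled. */
struct sketch_journal {
    bool live;            /* assumed stand-in for "journal is live" */
    unsigned int flushes; /* counts flushes that actually ran */
};

/*
 * Flush the log unless the journal is no longer live.
 * Returns true if a flush was performed, false if it was ignored.
 */
static bool sketch_log_flush(struct sketch_journal *jd)
{
    assert(jd != NULL);
    if (!jd->live)
        return false; /* early exit: nothing is written to a dead journal */
    jd->flushes++;    /* placeholder for the real flush work */
    return true;
}
```

Placing the check at the top of the function means every caller gets the same behavior without having to test liveness itself.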

[Cluster-devel] [PATCH 13/15] gfs2: Issue revokes more intelligently

2019-02-27 Thread Bob Peterson
Before this patch, function gfs2_write_revokes would call gfs2_ail1_empty, then traverse the sd_ail1_list looking for transactions that had bds which were no longer queued to a glock. And if it found some, it would try to issue revokes for them, up to a predetermined maximum. There were two

[Cluster-devel] [PATCH 04/15] gfs2: move check_journal_clean to util.c for future use

2019-02-27 Thread Bob Peterson
Before this patch, function check_journal_clean was in ops_fstype.c. This patch moves it to util.c so we can make use of it elsewhere in a future patch. Signed-off-by: Bob Peterson --- fs/gfs2/ops_fstype.c | 42 - fs/gfs2/util.c | 45

[Cluster-devel] [PATCH 09/15] gfs2: Add verbose option to check_journal_clean

2019-02-27 Thread Bob Peterson
Before this patch, function check_journal_clean would give messages related to journal recovery. That's fine for mount time, but when a node withdraws and forces replay that way, we don't want all those distracting and misleading messages. This patch adds a new parameter to make those messages

[Cluster-devel] [PATCH 11/15] gfs2: Do log_flush in gfs2_ail_empty_gl even if ail list is empty

2019-02-27 Thread Bob Peterson
Before this patch, if gfs2_ail_empty_gl saw there was nothing on the ail list, it would return and not flush the log. The problem is that there could still be a revoke for the rgrp sitting on the sd_log_le_revoke list that's been recently taken off the ail list. But that revoke still needs to be

[Cluster-devel] [PATCH 02/15] gfs2: Introduce concept of a pending withdraw

2019-02-27 Thread Bob Peterson
File system withdraws can be delayed when inconsistencies are discovered at a point where we cannot withdraw immediately, for example, when critical spin_locks are held. But delaying the withdraw can cause gfs2 to ignore the error and keep running for a short period of time. For example, an rgrp glock may be

[Cluster-devel] [PATCH 08/15] gfs2: Force withdraw to replay journals and wait for it to finish

2019-02-27 Thread Bob Peterson
When a node withdraws from a file system, it often leaves its journal in an incomplete state. This is especially true when the withdraw is caused by io errors writing to the journal. Before this patch, a withdraw would try to write a "shutdown" record to the journal, tell dlm it's done with the

[Cluster-devel] [PATCH 07/15] gfs2: Don't write log headers after file system withdraw

2019-02-27 Thread Bob Peterson
Before this patch, when a node withdrew a gfs2 file system, it wrote a (clean) unmount log header. That's wrong. You don't want to write anything to the journal once you're withdrawn because that's acknowledging that the transaction is complete and the journal is in good shape, neither of which

Re: [Cluster-devel] [PATCH V15 14/18] block: enable multipage bvecs

2019-02-27 Thread Ming Lei
On Wed, Feb 27, 2019 at 08:47:09PM +, Jon Hunter wrote: > > On 21/02/2019 08:42, Marek Szyprowski wrote: > > Dear All, > > > > On 2019-02-15 12:13, Ming Lei wrote: > >> This patch pulls the trigger for multi-page bvecs. > >> > >> Reviewed-by: Omar Sandoval > >> Signed-off-by: Ming Lei > >