Re: [Cluster-devel] [PATCH v2.1 19/30] iomap: switch iomap_bmap to use iomap_iter

2021-08-11 Thread Christoph Hellwig
On Wed, Aug 11, 2021 at 12:18:00PM -0700, Darrick J. Wong wrote:
> + while ((ret = iomap_iter(&iter, ops)) > 0) {
> + if (iter.iomap.type == IOMAP_MAPPED)
> + bno = iomap_sector(&iter.iomap, iter.pos) >> blkshift;
> + /* leave iter.processed unset to abort loop */
> + }

This looks ok, thanks.



Re: [Cluster-devel] [PATCH 11/30] iomap: add the new iomap_iter model

2021-08-11 Thread Christoph Hellwig
On Wed, Aug 11, 2021 at 12:17:08PM -0700, Darrick J. Wong wrote:
> > iter.c is also my preference, but in the end I don't care too much.
> 
> Ok.  My plan for this is to change this patch to add the new iter code
> to apply.c, and change patch 24 to remove iomap_apply.  I'll add a patch
> on the end to rename apply.c to iter.c, which will avoid breaking the
> history.

What history?  There is no shared code, so no shared history and.

> 
> I'll send the updated patches as replies to this series to avoid
> spamming the list, since I also have a patchset of bugfixes to send out
> and don't want to overwhelm everyone.

Just as a clear statement:  I think this dance is obsfucation and doesn't
help in any way.  But if that's what it takes..



Re: [Cluster-devel] Why does dlm_lock function fails when downconvert a dlm lock?

2021-08-11 Thread Alexander Aring
Hi,

On Wed, Aug 11, 2021 at 6:41 AM Gang He  wrote:
>
> Hello List,
>
> I am using kernel 5.13.4 (some old version kernels have the same problem).
> When node A acquired a dlm (EX) lock, node B tried to get the dlm lock, node 
> A got a BAST message,
> then node A downcoverted the dlm lock to NL, dlm_lock function failed with 
> the error -16.
> The function failure did not always happen, but in some case, I could 
> encounter this failure.
> Why does dlm_lock function fails when downconvert a dlm lock? there are any 
> documents for describe these error cases?
> If the code ignores dlm_lock return error from node A, node B will not get 
> the dlm lock permanently.
> How should we handle such situation? call dlm_lock function to downconvert 
> the dlm lock again?

What is your dlm user? Is it kernel (e.g. gfs2/ocfs2/md) or user (libdlm)?

I believe you are running into case [0]. Can you provide the
corresponding log_debug() message? It's necessary to insert
"log_debug=1" in your dlm.conf and it will be reported on KERN_DEBUG
in your kernel log then.

Thanks.

- Alex

[0] https://elixir.bootlin.com/linux/v5.14-rc5/source/fs/dlm/lock.c#L2886



[Cluster-devel] [PATCH 31/30] iomap: move iomap iteration code to iter.c

2021-08-11 Thread Darrick J. Wong
From: Darrick J. Wong 

Now that we've moved iomap to the iterator model, rename this file to be
in sync with the functions contained inside of it.

Signed-off-by: Darrick J. Wong 
---
 fs/iomap/Makefile |2 +-
 fs/iomap/iter.c   |0 
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename fs/iomap/{apply.c => iter.c} (100%)

diff --git a/fs/iomap/Makefile b/fs/iomap/Makefile
index e46f936dde81..bb64215ae256 100644
--- a/fs/iomap/Makefile
+++ b/fs/iomap/Makefile
@@ -26,9 +26,9 @@ ccflags-y += -I $(srctree)/$(src) # needed for 
trace events
 obj-$(CONFIG_FS_IOMAP) += iomap.o
 
 iomap-y+= trace.o \
-  apply.o \
   buffered-io.o \
   direct-io.o \
   fiemap.o \
+  iter.o \
   seek.o
 iomap-$(CONFIG_SWAP)   += swapfile.o
diff --git a/fs/iomap/apply.c b/fs/iomap/iter.c
similarity index 100%
rename from fs/iomap/apply.c
rename to fs/iomap/iter.c



[Cluster-devel] [PATCH v2.1 24/30] iomap: remove iomap_apply

2021-08-11 Thread Darrick J. Wong
From: Christoph Hellwig 

iomap_apply is unused now, so remove it.

Signed-off-by: Christoph Hellwig 
[djwong: rebase this patch to preserve git history of iomap loop control]
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong 
---
 fs/iomap/apply.c  |   91 -
 fs/iomap/trace.h  |   40 --
 include/linux/iomap.h |   10 -
 3 files changed, 141 deletions(-)

diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c
index e82647aef7ea..a1c7592d2ade 100644
--- a/fs/iomap/apply.c
+++ b/fs/iomap/apply.c
@@ -3,101 +3,10 @@
  * Copyright (C) 2010 Red Hat, Inc.
  * Copyright (c) 2016-2021 Christoph Hellwig.
  */
-#include 
-#include 
 #include 
 #include 
 #include "trace.h"
 
-/*
- * Execute a iomap write on a segment of the mapping that spans a
- * contiguous range of pages that have identical block mapping state.
- *
- * This avoids the need to map pages individually, do individual allocations
- * for each page and most importantly avoid the need for filesystem specific
- * locking per page. Instead, all the operations are amortised over the entire
- * range of pages. It is assumed that the filesystems will lock whatever
- * resources they require in the iomap_begin call, and release them in the
- * iomap_end call.
- */
-loff_t
-iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
-   const struct iomap_ops *ops, void *data, iomap_actor_t actor)
-{
-   struct iomap iomap = { .type = IOMAP_HOLE };
-   struct iomap srcmap = { .type = IOMAP_HOLE };
-   loff_t written = 0, ret;
-   u64 end;
-
-   trace_iomap_apply(inode, pos, length, flags, ops, actor, _RET_IP_);
-
-   /*
-* Need to map a range from start position for length bytes. This can
-* span multiple pages - it is only guaranteed to return a range of a
-* single type of pages (e.g. all into a hole, all mapped or all
-* unwritten). Failure at this point has nothing to undo.
-*
-* If allocation is required for this range, reserve the space now so
-* that the allocation is guaranteed to succeed later on. Once we copy
-* the data into the page cache pages, then we cannot fail otherwise we
-* expose transient stale data. If the reserve fails, we can safely
-* back out at this point as there is nothing to undo.
-*/
-   ret = ops->iomap_begin(inode, pos, length, flags, &iomap, &srcmap);
-   if (ret)
-   return ret;
-   if (WARN_ON(iomap.offset > pos)) {
-   written = -EIO;
-   goto out;
-   }
-   if (WARN_ON(iomap.length == 0)) {
-   written = -EIO;
-   goto out;
-   }
-
-   trace_iomap_apply_dstmap(inode, &iomap);
-   if (srcmap.type != IOMAP_HOLE)
-   trace_iomap_apply_srcmap(inode, &srcmap);
-
-   /*
-* Cut down the length to the one actually provided by the filesystem,
-* as it might not be able to give us the whole size that we requested.
-*/
-   end = iomap.offset + iomap.length;
-   if (srcmap.type != IOMAP_HOLE)
-   end = min(end, srcmap.offset + srcmap.length);
-   if (pos + length > end)
-   length = end - pos;
-
-   /*
-* Now that we have guaranteed that the space allocation will succeed,
-* we can do the copy-in page by page without having to worry about
-* failures exposing transient data.
-*
-* To support COW operations, we read in data for partially blocks from
-* the srcmap if the file system filled it in.  In that case we the
-* length needs to be limited to the earlier of the ends of the iomaps.
-* If the file system did not provide a srcmap we pass in the normal
-* iomap into the actors so that they don't need to have special
-* handling for the two cases.
-*/
-   written = actor(inode, pos, length, data, &iomap,
-   srcmap.type != IOMAP_HOLE ? &srcmap : &iomap);
-
-out:
-   /*
-* Now the data has been copied, commit the range we've copied.  This
-* should not fail unless the filesystem has had a fatal error.
-*/
-   if (ops->iomap_end) {
-   ret = ops->iomap_end(inode, pos, length,
-written > 0 ? written : 0,
-flags, &iomap);
-   }
-
-   return written ? written : ret;
-}
-
 static inline int iomap_iter_advance(struct iomap_iter *iter)
 {
/* handle the previous iteration (if any) */
diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index 1012d7af6b68..f1519f9a1403 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -138,49 +138,9 @@ DECLARE_EVENT_CLASS(iomap_class,
 DEFINE_EVENT(iomap_class, name,\
TP_PROTO(struct inode *inode, struct iomap *iomap), \
TP_ARGS(inode, iomap))
-DEFINE_IOMAP_EV

[Cluster-devel] [PATCH v2.1 11/30] iomap: add the new iomap_iter model

2021-08-11 Thread Darrick J. Wong
From: Christoph Hellwig 

The iomap_iter struct provides a convenient way to package up and
maintain all the arguments to the various mapping and operation
functions.  It is operated on using the iomap_iter() function that
is called in loop until the whole range has been processed.  Compared
to the existing iomap_apply() function this avoid an indirect call
for each iteration.

For now iomap_iter() calls back into the existing ->iomap_begin and
->iomap_end methods, but in the future this could be further optimized
to avoid indirect calls entirely.

Based on an earlier patch from Matthew Wilcox .

Signed-off-by: Christoph Hellwig 
[djwong: add to apply.c to preserve git history of iomap loop control]
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong 
---
 fs/iomap/apply.c  |   74 -
 fs/iomap/trace.h  |   37 -
 include/linux/iomap.h |   56 +
 3 files changed, 165 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c
index 26ab6563181f..e82647aef7ea 100644
--- a/fs/iomap/apply.c
+++ b/fs/iomap/apply.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
  * Copyright (C) 2010 Red Hat, Inc.
- * Copyright (c) 2016-2018 Christoph Hellwig.
+ * Copyright (c) 2016-2021 Christoph Hellwig.
  */
 #include 
 #include 
@@ -97,3 +97,75 @@ iomap_apply(struct inode *inode, loff_t pos, loff_t length, 
unsigned flags,
 
return written ? written : ret;
 }
+
+static inline int iomap_iter_advance(struct iomap_iter *iter)
+{
+   /* handle the previous iteration (if any) */
+   if (iter->iomap.length) {
+   if (iter->processed <= 0)
+   return iter->processed;
+   if (WARN_ON_ONCE(iter->processed > iomap_length(iter)))
+   return -EIO;
+   iter->pos += iter->processed;
+   iter->len -= iter->processed;
+   if (!iter->len)
+   return 0;
+   }
+
+   /* clear the state for the next iteration */
+   iter->processed = 0;
+   memset(&iter->iomap, 0, sizeof(iter->iomap));
+   memset(&iter->srcmap, 0, sizeof(iter->srcmap));
+   return 1;
+}
+
+static inline void iomap_iter_done(struct iomap_iter *iter)
+{
+   WARN_ON_ONCE(iter->iomap.offset > iter->pos);
+   WARN_ON_ONCE(iter->iomap.length == 0);
+   WARN_ON_ONCE(iter->iomap.offset + iter->iomap.length <= iter->pos);
+
+   trace_iomap_iter_dstmap(iter->inode, &iter->iomap);
+   if (iter->srcmap.type != IOMAP_HOLE)
+   trace_iomap_iter_srcmap(iter->inode, &iter->srcmap);
+}
+
+/**
+ * iomap_iter - iterate over a ranges in a file
+ * @iter: iteration structue
+ * @ops: iomap ops provided by the file system
+ *
+ * Iterate over filesystem-provided space mappings for the provided file range.
+ *
+ * This function handles cleanup of resources acquired for iteration when the
+ * filesystem indicates there are no more space mappings, which means that this
+ * function must be called in a loop that continues as long it returns a
+ * positive value.  If 0 or a negative value is returned, the caller must not
+ * return to the loop body.  Within a loop body, there are two ways to break 
out
+ * of the loop body:  leave @iter.processed unchanged, or set it to a negative
+ * errno.
+ */
+int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
+{
+   int ret;
+
+   if (iter->iomap.length && ops->iomap_end) {
+   ret = ops->iomap_end(iter->inode, iter->pos, iomap_length(iter),
+   iter->processed > 0 ? iter->processed : 0,
+   iter->flags, &iter->iomap);
+   if (ret < 0 && !iter->processed)
+   return ret;
+   }
+
+   trace_iomap_iter(iter, ops, _RET_IP_);
+   ret = iomap_iter_advance(iter);
+   if (ret <= 0)
+   return ret;
+
+   ret = ops->iomap_begin(iter->inode, iter->pos, iter->len, iter->flags,
+  &iter->iomap, &iter->srcmap);
+   if (ret < 0)
+   return ret;
+   iomap_iter_done(iter);
+   return 1;
+}
diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index e9cd5cc0d6ba..1012d7af6b68 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 /*
- * Copyright (c) 2009-2019 Christoph Hellwig
+ * Copyright (c) 2009-2021 Christoph Hellwig
  *
  * NOTE: none of these tracepoints shall be considered a stable kernel ABI
  * as they can change at any time.
@@ -140,6 +140,8 @@ DEFINE_EVENT(iomap_class, name, \
TP_ARGS(inode, iomap))
 DEFINE_IOMAP_EVENT(iomap_apply_dstmap);
 DEFINE_IOMAP_EVENT(iomap_apply_srcmap);
+DEFINE_IOMAP_EVENT(iomap_iter_dstmap);
+DEFINE_IOMAP_EVENT(iomap_iter_srcmap);
 
 TRACE_EVENT(iomap_apply,
TP_PROTO(struct inode *inode, loff_t pos, loff_t length,
@@ -179,6 +181,39 

Re: [Cluster-devel] [PATCH 11/30] iomap: add the new iomap_iter model

2021-08-11 Thread Darrick J. Wong
On Wed, Aug 11, 2021 at 07:38:56AM +0200, Christoph Hellwig wrote:
> On Tue, Aug 10, 2021 at 05:31:18PM -0700, Darrick J. Wong wrote:
> > > +static inline void iomap_iter_done(struct iomap_iter *iter)
> > 
> > I wonder why this is a separate function, since it only has debugging
> > warnings and tracepoints?
> 
> The reason for these two sub-helpers was to force me to structure the
> code so that Matthews original idea of replacing ->iomap_begin and
> ->iomap_end with a single next callback so that iomap_iter could
> be inlined into callers and the indirect calls could be elided is
> still possible.  This would only be useful for a few specific
> methods (probably dax and direct I/O) where we care so much, but it
> seemed like a nice idea conceptually so I would not want to break it.
> 
> OTOH we could just remove this function for now and do that once needed.



> > Modulo the question about iomap_iter_done, I guess this looks all right
> > to me.  As far as apply.c vs. core.c, I'm not wildly passionate about
> > either naming choice (I would have called it iter.c) but ... fmeh.
> 
> iter.c is also my preference, but in the end I don't care too much.

Ok.  My plan for this is to change this patch to add the new iter code
to apply.c, and change patch 24 to remove iomap_apply.  I'll add a patch
on the end to rename apply.c to iter.c, which will avoid breaking the
history.

I'll send the updated patches as replies to this series to avoid
spamming the list, since I also have a patchset of bugfixes to send out
and don't want to overwhelm everyone.

--D



[Cluster-devel] [PATCH v2.1 19/30] iomap: switch iomap_bmap to use iomap_iter

2021-08-11 Thread Darrick J. Wong
From: Christoph Hellwig 

Rewrite the ->bmap implementation based on iomap_iter.

Signed-off-by: Christoph Hellwig 
[djwong: restructure the loop to make its behavior a little clearer]
Reviewed-by: Darrick J. Wong 
Signed-off-by: Darrick J. Wong 
---
 fs/iomap/fiemap.c |   31 +--
 1 file changed, 13 insertions(+), 18 deletions(-)

diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c
index acad09a8c188..66cf267c68ae 100644
--- a/fs/iomap/fiemap.c
+++ b/fs/iomap/fiemap.c
@@ -92,37 +92,32 @@ int iomap_fiemap(struct inode *inode, struct 
fiemap_extent_info *fi,
 }
 EXPORT_SYMBOL_GPL(iomap_fiemap);
 
-static loff_t
-iomap_bmap_actor(struct inode *inode, loff_t pos, loff_t length,
-   void *data, struct iomap *iomap, struct iomap *srcmap)
-{
-   sector_t *bno = data, addr;
-
-   if (iomap->type == IOMAP_MAPPED) {
-   addr = (pos - iomap->offset + iomap->addr) >> inode->i_blkbits;
-   *bno = addr;
-   }
-   return 0;
-}
-
 /* legacy ->bmap interface.  0 is the error return (!) */
 sector_t
 iomap_bmap(struct address_space *mapping, sector_t bno,
const struct iomap_ops *ops)
 {
-   struct inode *inode = mapping->host;
-   loff_t pos = bno << inode->i_blkbits;
-   unsigned blocksize = i_blocksize(inode);
+   struct iomap_iter iter = {
+   .inode  = mapping->host,
+   .pos= (loff_t)bno << mapping->host->i_blkbits,
+   .len= i_blocksize(mapping->host),
+   .flags  = IOMAP_REPORT,
+   };
+   const unsigned int blkshift = mapping->host->i_blkbits - SECTOR_SHIFT;
int ret;
 
if (filemap_write_and_wait(mapping))
return 0;
 
bno = 0;
-   ret = iomap_apply(inode, pos, blocksize, 0, ops, &bno,
- iomap_bmap_actor);
+   while ((ret = iomap_iter(&iter, ops)) > 0) {
+   if (iter.iomap.type == IOMAP_MAPPED)
+   bno = iomap_sector(&iter.iomap, iter.pos) >> blkshift;
+   /* leave iter.processed unset to abort loop */
+   }
if (ret)
return 0;
+
return bno;
 }
 EXPORT_SYMBOL_GPL(iomap_bmap);



[Cluster-devel] Why does dlm_lock function fails when downconvert a dlm lock?

2021-08-11 Thread Gang He
Hello List,

I am using kernel 5.13.4 (some old version kernels have the same problem).
When node A acquired a dlm (EX) lock, node B tried to get the dlm lock, node A 
got a BAST message,
then node A downcoverted the dlm lock to NL, dlm_lock function failed with the 
error -16.
The function failure did not always happen, but in some case, I could encounter 
this failure. 
Why does dlm_lock function fails when downconvert a dlm lock? there are any 
documents for describe these error cases?
If the code ignores dlm_lock return error from node A, node B will not get the 
dlm lock permanently.
How should we handle such situation? call dlm_lock function to downconvert the 
dlm lock again?

Thanks
Gang