Re: [Cluster-devel] [PATCH 2/2] gfs2: Add support for IOMAP_ZERO

2019-07-08 Thread Andreas Gruenbacher
On Mon, 8 Jul 2019 at 19:06, Christoph Hellwig  wrote:
> Don't we actually need the write lock for zeroing as well?

Hmm. For iomap_zero_range, we don't need the
iov_iter_fault_in_readable / iov_iter_copy_from_user_atomic logic from
iomap_write_actor, so we take the write lock outside of
iomap_zero_range. I just now noticed that xfs behaves differently
there; that's an ugly difference.

>  Also
> I think the alloc helper would be nice to keep, and a little use
> of switch might clean things up.  How about something like this
> instead?
>
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index f42718dd292f..8f5a25f507c3 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -1027,17 +1027,8 @@ static int gfs2_iomap_begin_write(struct inode *inode, 
> loff_t pos,
> bool unstuff, alloc_required;
> int ret;
>
> -   ret = gfs2_write_lock(inode);
> -   if (ret)
> -   return ret;
> -
> unstuff = gfs2_is_stuffed(ip) &&
>   pos + length > gfs2_max_stuffed_size(ip);
> -
> -   ret = gfs2_iomap_get(inode, pos, length, flags, iomap, mp);
> -   if (ret)
> -   goto out_unlock;
> -
> alloc_required = unstuff || iomap->type == IOMAP_HOLE;
>
> if (alloc_required || gfs2_is_jdata(ip))
> @@ -1051,7 +1042,7 @@ static int gfs2_iomap_begin_write(struct inode *inode, 
> loff_t pos,
>
> ret = gfs2_quota_lock_check(ip, );
> if (ret)
> -   goto out_unlock;
> +   return ret;
>
> ret = gfs2_inplace_reserve(ip, );
> if (ret)
> @@ -1118,11 +1109,23 @@ static int gfs2_iomap_begin_write(struct inode 
> *inode, loff_t pos,
>  out_qunlock:
> if (alloc_required)
> gfs2_quota_unlock(ip);
> -out_unlock:
> -   gfs2_write_unlock(inode);
> return ret;
>  }
>
> +static inline bool gfs2_iomap_need_write_lock(unsigned flags)
> +{
> +   switch (flags & (IOMAP_WRITE | IOMAP_ZERO)) {
> +   case IOMAP_WRITE:
> +   if (flags & IOMAP_DIRECT)
> +   return false;
> +   return true;
> +   case IOMAP_ZERO:
> +   return true;
> +   default:
> +   return false;
> +   }
> +}
> +
>  static int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
> unsigned flags, struct iomap *iomap)
>  {
> @@ -1133,20 +1136,48 @@ static int gfs2_iomap_begin(struct inode *inode, 
> loff_t pos, loff_t length,
> iomap->flags |= IOMAP_F_BUFFER_HEAD;
>
> trace_gfs2_iomap_start(ip, pos, length, flags);
> -   if ((flags & IOMAP_WRITE) && !(flags & IOMAP_DIRECT)) {
> -   ret = gfs2_iomap_begin_write(inode, pos, length, flags, 
> iomap, );
> -   } else {
> -   ret = gfs2_iomap_get(inode, pos, length, flags, iomap, );
>
> -   /*
> -* Silently fall back to buffered I/O for stuffed files or if
> -* we've hot a hole (see gfs2_file_direct_write).
> -*/
> -   if ((flags & IOMAP_WRITE) && (flags & IOMAP_DIRECT) &&
> -   iomap->type != IOMAP_MAPPED)
> -   ret = -ENOTBLK;
> +   if (gfs2_iomap_need_write_lock(flags)) {
> +   ret = gfs2_write_lock(inode);
> +   if (ret)
> +   goto out;
> }
> +
> +   ret = gfs2_iomap_get(inode, pos, length, flags, iomap, );
> +   if (ret)
> +   goto out_unlock;
> +
> +   switch (flags & (IOMAP_WRITE | IOMAP_ZERO)) {
> +   case 0:
> +   goto out_unlock;
> +   case IOMAP_WRITE:
> +   if (flags & IOMAP_DIRECT) {
> +   /*
> +* Silently fall back to buffered I/O for stuffed 
> files
> +* or if we've got a hole (see 
> gfs2_file_direct_write).
> +*/
> +   if (iomap->type != IOMAP_MAPPED)
> +   ret = -ENOTBLK;
> +   goto out_unlock;
> +   }
> +   break;
> +   case IOMAP_ZERO:
> +   if (iomap->type == IOMAP_HOLE)
> +   goto out_unlock;
> +   break;
> +   default:
> +   WARN_ON_ONCE(1);
> +   ret = -EIO;
> +   goto out_unlock;
> +   }
> +
> +   ret = gfs2_iomap_begin_write(inode, pos, length, flags, iomap, );
> +
> +out_unlock:
> +   if (ret && gfs2_iomap_need_write_lock(flags))
> +   gfs2_write_unlock(inode);
> release_metapath();
> +out:
> trace_gfs2_iomap_end(ip, iomap, ret);
> return ret;
>  }
> @@ -1157,8 +1188,21 @@ static int gfs2_iomap_end(struct inode *inode, loff_t 
> pos, loff_t length,
> struct gfs2_inode *ip = GFS2_I(inode);
> struct gfs2_sbd *sdp = GFS2_SB(inode);
>
> -   if ((flags & (IOMAP_WRITE | 

Re: [Cluster-devel] RFC: use the iomap writepage path in gfs2

2019-07-08 Thread Andreas Gruenbacher
On Mon, 8 Jul 2019 at 18:04, Christoph Hellwig  wrote:
> On Thu, Jul 04, 2019 at 12:35:41AM +0200, Andreas Gruenbacher wrote:
> > Patch "gfs2: implement gfs2_block_zero_range using iomap_zero_range"
> > isn't quite ready: the gfs2 iomap operations don't handle IOMAP_ZERO
> > correctly so far, and that needs to be fixed first.
>
> What is the issue with IOMAP_ZERO on gfs2?  Zeroing never does block
> allocations except when on COW extents, which gfs2 doesn't support,
> so there shouldn't really be any need for additional handling.

We still want to set iomap->page_ops for journalled data files on gfs2.

Also, if we go through the existing gfs2_iomap_begin_write /
__gfs2_iomap_begin logic for iomap_zero_range, it will work for
stuffed files as well, and so we can replace stuffed_zero_range with
iomap_zero_range.

> > Some of the tests assume that the filesystem supports unwritten
> > extents, trusted xattrs, the usrquota / grpquota / prjquota mount
> > options. There shouldn't be a huge number of failing tests beyond
> > that, but I know things aren't perfect.
>
> In general xfstests is supposed to have tests for that and not run
> the tests if not supported.  In most cases this is automatic, but
> in case a feature can't be autodetect we have a few manual overrides.

Yes, that needs a bit of work. Let's see.

Thanks,
Andreas



Re: [Cluster-devel] [PATCH 2/2] gfs2: Add support for IOMAP_ZERO

2019-07-08 Thread Christoph Hellwig
Don't we actually need the write lock for zeroing as well?  Also
I think the alloc helper would be nice to keep, and a little use
of switch might clean things up.  How about something like this
instead?

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index f42718dd292f..8f5a25f507c3 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -1027,17 +1027,8 @@ static int gfs2_iomap_begin_write(struct inode *inode, 
loff_t pos,
bool unstuff, alloc_required;
int ret;
 
-   ret = gfs2_write_lock(inode);
-   if (ret)
-   return ret;
-
unstuff = gfs2_is_stuffed(ip) &&
  pos + length > gfs2_max_stuffed_size(ip);
-
-   ret = gfs2_iomap_get(inode, pos, length, flags, iomap, mp);
-   if (ret)
-   goto out_unlock;
-
alloc_required = unstuff || iomap->type == IOMAP_HOLE;
 
if (alloc_required || gfs2_is_jdata(ip))
@@ -1051,7 +1042,7 @@ static int gfs2_iomap_begin_write(struct inode *inode, 
loff_t pos,
 
ret = gfs2_quota_lock_check(ip, );
if (ret)
-   goto out_unlock;
+   return ret;
 
ret = gfs2_inplace_reserve(ip, );
if (ret)
@@ -1118,11 +1109,23 @@ static int gfs2_iomap_begin_write(struct inode *inode, 
loff_t pos,
 out_qunlock:
if (alloc_required)
gfs2_quota_unlock(ip);
-out_unlock:
-   gfs2_write_unlock(inode);
return ret;
 }
 
+static inline bool gfs2_iomap_need_write_lock(unsigned flags)
+{
+   switch (flags & (IOMAP_WRITE | IOMAP_ZERO)) {
+   case IOMAP_WRITE:
+   if (flags & IOMAP_DIRECT)
+   return false;
+   return true;
+   case IOMAP_ZERO:
+   return true;
+   default:
+   return false;
+   }
+}
+
 static int gfs2_iomap_begin(struct inode *inode, loff_t pos, loff_t length,
unsigned flags, struct iomap *iomap)
 {
@@ -1133,20 +1136,48 @@ static int gfs2_iomap_begin(struct inode *inode, loff_t 
pos, loff_t length,
iomap->flags |= IOMAP_F_BUFFER_HEAD;
 
trace_gfs2_iomap_start(ip, pos, length, flags);
-   if ((flags & IOMAP_WRITE) && !(flags & IOMAP_DIRECT)) {
-   ret = gfs2_iomap_begin_write(inode, pos, length, flags, iomap, 
);
-   } else {
-   ret = gfs2_iomap_get(inode, pos, length, flags, iomap, );
 
-   /*
-* Silently fall back to buffered I/O for stuffed files or if
-* we've hot a hole (see gfs2_file_direct_write).
-*/
-   if ((flags & IOMAP_WRITE) && (flags & IOMAP_DIRECT) &&
-   iomap->type != IOMAP_MAPPED)
-   ret = -ENOTBLK;
+   if (gfs2_iomap_need_write_lock(flags)) {
+   ret = gfs2_write_lock(inode);
+   if (ret)
+   goto out;
}
+
+   ret = gfs2_iomap_get(inode, pos, length, flags, iomap, );
+   if (ret)
+   goto out_unlock;
+
+   switch (flags & (IOMAP_WRITE | IOMAP_ZERO)) {
+   case 0:
+   goto out_unlock;
+   case IOMAP_WRITE:
+   if (flags & IOMAP_DIRECT) {
+   /*
+* Silently fall back to buffered I/O for stuffed files
+* or if we've got a hole (see gfs2_file_direct_write).
+*/
+   if (iomap->type != IOMAP_MAPPED)
+   ret = -ENOTBLK;
+   goto out_unlock;
+   }
+   break;
+   case IOMAP_ZERO:
+   if (iomap->type == IOMAP_HOLE)
+   goto out_unlock;
+   break;
+   default:
+   WARN_ON_ONCE(1);
+   ret = -EIO;
+   goto out_unlock;
+   }
+
+   ret = gfs2_iomap_begin_write(inode, pos, length, flags, iomap, );
+
+out_unlock:
+   if (ret && gfs2_iomap_need_write_lock(flags))
+   gfs2_write_unlock(inode);
release_metapath();
+out:
trace_gfs2_iomap_end(ip, iomap, ret);
return ret;
 }
@@ -1157,8 +1188,21 @@ static int gfs2_iomap_end(struct inode *inode, loff_t 
pos, loff_t length,
struct gfs2_inode *ip = GFS2_I(inode);
struct gfs2_sbd *sdp = GFS2_SB(inode);
 
-   if ((flags & (IOMAP_WRITE | IOMAP_DIRECT)) != IOMAP_WRITE)
-   goto out;
+   switch (flags & (IOMAP_WRITE | IOMAP_ZERO)) {
+   case 0:
+   return 0;
+   case IOMAP_WRITE:
+   if (flags & IOMAP_DIRECT)
+   return 0;
+   break;
+   case IOMAP_ZERO:
+   if (iomap->type == IOMAP_HOLE)
+   return 0;
+   break;
+   default:
+   WARN_ON_ONCE(1);
+   return -EIO;
+   }
 
if (!gfs2_is_stuffed(ip))
gfs2_ordered_add_inode(ip);
@@ -1183,8 

Re: [Cluster-devel] RFC: use the iomap writepage path in gfs2

2019-07-08 Thread Christoph Hellwig
On Mon, Jul 08, 2019 at 10:01:03AM +1000, Dave Chinner wrote:
> Ok, this doesn't look too bad from the iomap perspective, though it
> does raise more questions. :)
> 
> gfs2 now has two iopaths, right? One that uses bufferheads for
> journalled data, and the other that uses iomap? That seems like it's
> only a partial conversion - what needs to be done to iomap and gfs2
> to support the journalled data path so there's a single data IO
> path?

gfs2 always had to very different writeback I/O paths, including a copy
and pasted versiom of write_cache_pages for journaled data, they just
diverge a little bit more now. In the longer run I'd also like to add
journaled data support to iomap for use with XFS, and then also switch
gfs2 to it. 



Re: [Cluster-devel] RFC: use the iomap writepage path in gfs2

2019-07-08 Thread Christoph Hellwig
On Thu, Jul 04, 2019 at 12:35:41AM +0200, Andreas Gruenbacher wrote:
> Patch "gfs2: implement gfs2_block_zero_range using iomap_zero_range"
> isn't quite ready: the gfs2 iomap operations don't handle IOMAP_ZERO
> correctly so far, and that needs to be fixed first.

What is the issue with IOMAP_ZERO on gfs2?  Zeroing never does block
allocations except when on COW extents, which gfs2 doesn't support,
so there shouldn't really be any need for additional handling.

> Some of the tests assume that the filesystem supports unwritten
> extents, trusted xattrs, the usrquota / grpquota / prjquota mount
> options. There shouldn't be a huge number of failing tests beyond
> that, but I know things aren't perfect.

In general xfstests is supposed to have tests for that and not run
the tests if not supported.  In most cases this is automatic, but
in case a feature can't be autodetect we have a few manual overrides.