Re: [Cluster-devel] [GFS2 PATCH 06/12] gfs2: Create transaction for inodes with i_nlink != 0

2020-08-27 Thread Bob Peterson
- Original Message -
> On Fri, Aug 21, 2020 at 7:33 PM Bob Peterson  wrote:
> > Before this patch, function gfs2_evict_inode would check if i_nlink
> > was non-zero, and if so, go to label out. The problem is, the evicted
> > file may still have outstanding pages that need invalidating, but
> > the call to truncate_inode_pages_final at label out doesn't start a
> > transaction. It needs a transaction in order to write revokes for any
> > pages it has to invalidate.
> 
> This is only true for jdata inodes though, right? If so, I'd rather
> just create transactions in the jdata case.

The truncate_inode_pages_final() for i_data is only for jdata, which
includes directories for their hash tables. However, for regular files,
evict's call to gfs2_glock_put_eventually() has the potential to be the
last put for the inode's glock (in a race), which might still have pages
attached (metamapping). I firmly believe this is our "nrpages" bug I've been
chasing, but I haven't proven it yet because it's very hard to recreate.

Afaik, some of these unresolved metadata pages may still need revokes, and
we still need a transaction to do that, even if the dinode still has links.

The "nrpages" problem always seems to involve the system quotas file,
probably because it's jdata, but imagine a directory with a large hash
table, which is modified, then is quickly evicted (without being deleted).

It wasn't that long ago I was working on a patch to take glock reference
even sooner than we did for f4e2f5e1a527ce58fc9f85145b03704779a3123e.
I titled the patch "grab glock reference as early as possible in transactions
but it was never pushed anywhere because it added a new atomic to the
glock. It may be an alternative solution to the problem. My comments on
that patch were:

   Before this patch, an additional glock reference was taken when
   the bufdata element, bd, was revoked. That's not early enough
   because the caller who created the bd (via trans_add_meta) may
   have already come and gone with the bd still not revoked (but
   in the ail).
   
   This patch takes the glock reference earlier in the process, when
   the first bd element is allocated for a glock. It queues the glock
   reference to be put when the last bd element for the glock is freed.
   
   To this end, a new atomic glock field, gl_bd_count, keeps count.

Regards,

Bob Peterson



Re: [Cluster-devel] [GFS2 PATCH 06/12] gfs2: Create transaction for inodes with i_nlink != 0

2020-08-27 Thread Steven Whitehouse

Hi,

On 27/08/2020 07:00, Andreas Gruenbacher wrote:

On Fri, Aug 21, 2020 at 7:33 PM Bob Peterson  wrote:

Before this patch, function gfs2_evict_inode would check if i_nlink
was non-zero, and if so, go to label out. The problem is, the evicted
file may still have outstanding pages that need invalidating, but
the call to truncate_inode_pages_final at label out doesn't start a
transaction. It needs a transaction in order to write revokes for any
pages it has to invalidate.

This is only true for jdata inodes though, right? If so, I'd rather
just create transactions in the jdata case.


Yes, and also if the inode is being deallocated, then we might be able 
to skip that step. We'll no doubt have to retain it in case this is just 
an unlink and there are still openers somewhere,


Steve.



This patch removes the early check for i_nlink in gfs2_evict_inode.
Not much further down in the code, there's another check for i_nlink
that skips to out_truncate. That one is proper because the calls
to truncate_inode_pages after out_truncate use a proper transaction,
so the page invalidates and subsequent revokes may be done properly.

Signed-off-by: Bob Peterson 
---
  fs/gfs2/super.c | 21 +
  1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 80ac446f0110..1f3dee740431 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1344,7 +1344,7 @@ static void gfs2_evict_inode(struct inode *inode)
 return;
 }

-   if (inode->i_nlink || sb_rdonly(sb))
+   if (sb_rdonly(sb))
 goto out;
 if (test_bit(GIF_ALLOC_FAILED, &ip->i_flags)) {
@@ -1370,15 +1370,19 @@ static void gfs2_evict_inode(struct inode *inode)
 }

 if (gfs2_inode_already_deleted(ip->i_gl, ip->i_no_formal_ino))
-   goto out_truncate;
+   goto out_flush;
 error = gfs2_check_blk_type(sdp, ip->i_no_addr, GFS2_BLKST_UNLINKED);
-   if (error)
-   goto out_truncate;
+   if (error) {
+   error = 0;
+   goto out_flush;
+   }

 if (test_bit(GIF_INVALID, &ip->i_flags)) {
 error = gfs2_inode_refresh(ip);
-   if (error)
-   goto out_truncate;
+   if (error) {
+   error = 0;
+   goto out_flush;
+   }
 }

 /*
@@ -1392,7 +1396,7 @@ static void gfs2_evict_inode(struct inode *inode)
 test_bit(HIF_HOLDER, &ip->i_iopen_gh.gh_iflags)) {
 if (!gfs2_upgrade_iopen_glock(inode)) {
 gfs2_holder_uninit(&ip->i_iopen_gh);
-   goto out_truncate;
+   goto out_flush;
 }
 }

@@ -1424,7 +1428,7 @@ static void gfs2_evict_inode(struct inode *inode)
 gfs2_inode_remember_delete(ip->i_gl, ip->i_no_formal_ino);
 goto out_unlock;

-out_truncate:
+out_flush:
 gfs2_log_flush(sdp, ip->i_gl, GFS2_LOG_HEAD_FLUSH_NORMAL |
GFS2_LFC_EVICT_INODE);
 metamapping = gfs2_glock2aspace(ip->i_gl);
@@ -1435,6 +1439,7 @@ static void gfs2_evict_inode(struct inode *inode)
 write_inode_now(inode, 1);
 gfs2_ail_flush(ip->i_gl, 0);

+out_truncate:
 nr_revokes = inode->i_mapping->nrpages + metamapping->nrpages;
 if (!nr_revokes)
 goto out_unlock;
--
2.26.2


Thanks,
Andreas





Re: [Cluster-devel] [GFS2 PATCH 06/12] gfs2: Create transaction for inodes with i_nlink != 0

2020-08-26 Thread Andreas Gruenbacher
On Fri, Aug 21, 2020 at 7:33 PM Bob Peterson  wrote:
> Before this patch, function gfs2_evict_inode would check if i_nlink
> was non-zero, and if so, go to label out. The problem is, the evicted
> file may still have outstanding pages that need invalidating, but
> the call to truncate_inode_pages_final at label out doesn't start a
> transaction. It needs a transaction in order to write revokes for any
> pages it has to invalidate.

This is only true for jdata inodes though, right? If so, I'd rather
just create transactions in the jdata case.

> This patch removes the early check for i_nlink in gfs2_evict_inode.
> Not much further down in the code, there's another check for i_nlink
> that skips to out_truncate. That one is proper because the calls
> to truncate_inode_pages after out_truncate use a proper transaction,
> so the page invalidates and subsequent revokes may be done properly.
>
> Signed-off-by: Bob Peterson 
> ---
>  fs/gfs2/super.c | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 80ac446f0110..1f3dee740431 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1344,7 +1344,7 @@ static void gfs2_evict_inode(struct inode *inode)
> return;
> }
>
> -   if (inode->i_nlink || sb_rdonly(sb))
> +   if (sb_rdonly(sb))
> goto out;
> if (test_bit(GIF_ALLOC_FAILED, &ip->i_flags)) {
> @@ -1370,15 +1370,19 @@ static void gfs2_evict_inode(struct inode *inode)
> }
>
> if (gfs2_inode_already_deleted(ip->i_gl, ip->i_no_formal_ino))
> -   goto out_truncate;
> +   goto out_flush;
> error = gfs2_check_blk_type(sdp, ip->i_no_addr, GFS2_BLKST_UNLINKED);
> -   if (error)
> -   goto out_truncate;
> +   if (error) {
> +   error = 0;
> +   goto out_flush;
> +   }
>
> if (test_bit(GIF_INVALID, &ip->i_flags)) {
> error = gfs2_inode_refresh(ip);
> -   if (error)
> -   goto out_truncate;
> +   if (error) {
> +   error = 0;
> +   goto out_flush;
> +   }
> }
>
> /*
> @@ -1392,7 +1396,7 @@ static void gfs2_evict_inode(struct inode *inode)
> test_bit(HIF_HOLDER, &ip->i_iopen_gh.gh_iflags)) {
> if (!gfs2_upgrade_iopen_glock(inode)) {
> gfs2_holder_uninit(&ip->i_iopen_gh);
> -   goto out_truncate;
> +   goto out_flush;
> }
> }
>
> @@ -1424,7 +1428,7 @@ static void gfs2_evict_inode(struct inode *inode)
> gfs2_inode_remember_delete(ip->i_gl, ip->i_no_formal_ino);
> goto out_unlock;
>
> -out_truncate:
> +out_flush:
> gfs2_log_flush(sdp, ip->i_gl, GFS2_LOG_HEAD_FLUSH_NORMAL |
>GFS2_LFC_EVICT_INODE);
> metamapping = gfs2_glock2aspace(ip->i_gl);
> @@ -1435,6 +1439,7 @@ static void gfs2_evict_inode(struct inode *inode)
> write_inode_now(inode, 1);
> gfs2_ail_flush(ip->i_gl, 0);
>
> +out_truncate:
> nr_revokes = inode->i_mapping->nrpages + metamapping->nrpages;
> if (!nr_revokes)
> goto out_unlock;
> --
> 2.26.2
>

Thanks,
Andreas



[Cluster-devel] [GFS2 PATCH 06/12] gfs2: Create transaction for inodes with i_nlink != 0

2020-08-21 Thread Bob Peterson
Before this patch, function gfs2_evict_inode would check if i_nlink
was non-zero, and if so, go to label out. The problem is, the evicted
file may still have outstanding pages that need invalidating, but
the call to truncate_inode_pages_final at label out doesn't start a
transaction. It needs a transaction in order to write revokes for any
pages it has to invalidate.

This patch removes the early check for i_nlink in gfs2_evict_inode.
Not much further down in the code, there's another check for i_nlink
that skips to out_truncate. That one is proper because the calls
to truncate_inode_pages after out_truncate use a proper transaction,
so the page invalidates and subsequent revokes may be done properly.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/super.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 80ac446f0110..1f3dee740431 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1344,7 +1344,7 @@ static void gfs2_evict_inode(struct inode *inode)
return;
}
 
-   if (inode->i_nlink || sb_rdonly(sb))
+   if (sb_rdonly(sb))
goto out;
 
if (test_bit(GIF_ALLOC_FAILED, &ip->i_flags)) {
@@ -1370,15 +1370,19 @@ static void gfs2_evict_inode(struct inode *inode)
}
 
if (gfs2_inode_already_deleted(ip->i_gl, ip->i_no_formal_ino))
-   goto out_truncate;
+   goto out_flush;
error = gfs2_check_blk_type(sdp, ip->i_no_addr, GFS2_BLKST_UNLINKED);
-   if (error)
-   goto out_truncate;
+   if (error) {
+   error = 0;
+   goto out_flush;
+   }
 
if (test_bit(GIF_INVALID, &ip->i_flags)) {
error = gfs2_inode_refresh(ip);
-   if (error)
-   goto out_truncate;
+   if (error) {
+   error = 0;
+   goto out_flush;
+   }
}
 
/*
@@ -1392,7 +1396,7 @@ static void gfs2_evict_inode(struct inode *inode)
test_bit(HIF_HOLDER, &ip->i_iopen_gh.gh_iflags)) {
if (!gfs2_upgrade_iopen_glock(inode)) {
gfs2_holder_uninit(&ip->i_iopen_gh);
-   goto out_truncate;
+   goto out_flush;
}
}
 
@@ -1424,7 +1428,7 @@ static void gfs2_evict_inode(struct inode *inode)
gfs2_inode_remember_delete(ip->i_gl, ip->i_no_formal_ino);
goto out_unlock;
 
-out_truncate:
+out_flush:
gfs2_log_flush(sdp, ip->i_gl, GFS2_LOG_HEAD_FLUSH_NORMAL |
   GFS2_LFC_EVICT_INODE);
metamapping = gfs2_glock2aspace(ip->i_gl);
@@ -1435,6 +1439,7 @@ static void gfs2_evict_inode(struct inode *inode)
write_inode_now(inode, 1);
gfs2_ail_flush(ip->i_gl, 0);
 
+out_truncate:
nr_revokes = inode->i_mapping->nrpages + metamapping->nrpages;
if (!nr_revokes)
goto out_unlock;
-- 
2.26.2