Re: [Cluster-devel] [GFS2 PATCH 06/12] gfs2: Create transaction for inodes with i_nlink != 0
----- Original Message -----
> On Fri, Aug 21, 2020 at 7:33 PM Bob Peterson wrote:
> > Before this patch, function gfs2_evict_inode would check if i_nlink
> > was non-zero, and if so, go to label out. The problem is, the evicted
> > file may still have outstanding pages that need invalidating, but
> > the call to truncate_inode_pages_final at label out doesn't start a
> > transaction. It needs a transaction in order to write revokes for any
> > pages it has to invalidate.
>
> This is only true for jdata inodes though, right? If so, I'd rather
> just create transactions in the jdata case.

The truncate_inode_pages_final() for i_data is only for jdata, which
includes directories for their hash tables. However, for regular files,
evict's call to gfs2_glock_put_eventually() has the potential to be the
last put for the inode's glock (in a race), which might still have pages
attached (metamapping). I firmly believe this is our "nrpages" bug I've
been chasing, but I haven't proven it yet because it's very hard to
recreate.

Afaik, some of these unresolved metadata pages may still need revokes,
and we still need a transaction to do that, even if the dinode still has
links.

The "nrpages" problem always seems to involve the system quotas file,
probably because it's jdata, but imagine a directory with a large hash
table, which is modified, then is quickly evicted (without being deleted).

It wasn't that long ago I was working on a patch to take a glock
reference even sooner than we did for
f4e2f5e1a527ce58fc9f85145b03704779a3123e. I titled the patch "grab glock
reference as early as possible in transactions" but it was never pushed
anywhere because it added a new atomic to the glock. It may be an
alternative solution to the problem. My comments on that patch were:

    Before this patch, an additional glock reference was taken when the
    bufdata element, bd, was revoked. That's not early enough because the
    caller who created the bd (via trans_add_meta) may have already come
    and gone with the bd still not revoked (but in the ail).

    This patch takes the glock reference earlier in the process, when the
    first bd element is allocated for a glock. It queues the glock
    reference to be put when the last bd element for the glock is freed.
    To this end, a new atomic glock field, gl_bd_count, keeps count.

Regards,

Bob Peterson
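The reference scheme in those patch comments can be sketched in user space roughly as follows. This is only a minimal model under stated assumptions, not the unpushed patch: struct model_glock, model_glock_init(), model_bd_alloc() and model_bd_free() are hypothetical names, only gl_bd_count comes from the message above, and the model drops the reference inline where the comments say the real patch queues the put.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical user-space stand-in for a glock. Only the two counters
 * discussed in the message are modelled; nothing here is kernel code.
 */
struct model_glock {
	atomic_int gl_ref;      /* models the glock reference count */
	atomic_int gl_bd_count; /* bufdata elements currently attached */
};

/* Start with one reference, standing in for the inode's own hold. */
static void model_glock_init(struct model_glock *gl)
{
	atomic_init(&gl->gl_ref, 1);
	atomic_init(&gl->gl_bd_count, 0);
}

/* A bd is allocated for the glock: the first one takes a reference. */
static void model_bd_alloc(struct model_glock *gl)
{
	if (atomic_fetch_add(&gl->gl_bd_count, 1) == 0)
		atomic_fetch_add(&gl->gl_ref, 1);
}

/*
 * A bd is freed: the last one drops the reference taken above.
 * Returns true if this call released that reference.
 */
static bool model_bd_free(struct model_glock *gl)
{
	if (atomic_fetch_sub(&gl->gl_bd_count, 1) == 1) {
		atomic_fetch_sub(&gl->gl_ref, 1);
		return true;
	}
	return false;
}
```

In this model the glock stays pinned from the first model_bd_alloc() until the matching last model_bd_free(), regardless of whether the caller that created the bd is still around. It does not address what happens when the final reference is dropped concurrently with a new allocation.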
Re: [Cluster-devel] [GFS2 PATCH 06/12] gfs2: Create transaction for inodes with i_nlink != 0
Hi,

On 27/08/2020 07:00, Andreas Gruenbacher wrote:
> On Fri, Aug 21, 2020 at 7:33 PM Bob Peterson wrote:
> > Before this patch, function gfs2_evict_inode would check if i_nlink
> > was non-zero, and if so, go to label out. The problem is, the evicted
> > file may still have outstanding pages that need invalidating, but
> > the call to truncate_inode_pages_final at label out doesn't start a
> > transaction. It needs a transaction in order to write revokes for any
> > pages it has to invalidate.
>
> This is only true for jdata inodes though, right? If so, I'd rather
> just create transactions in the jdata case.

Yes, and also if the inode is being deallocated, then we might be able to
skip that step. We'll no doubt have to retain it in case this is just an
unlink and there are still openers somewhere,

Steve.

> > This patch removes the early check for i_nlink in gfs2_evict_inode.
> > Not much further down in the code, there's another check for i_nlink
> > that skips to out_truncate. That one is proper because the calls
> > to truncate_inode_pages after out_truncate use a proper transaction,
> > so the page invalidates and subsequent revokes may be done properly.
> >
> > Signed-off-by: Bob Peterson
> > ---
> >  fs/gfs2/super.c | 21 +++++++++++++--------
> >  1 file changed, 13 insertions(+), 8 deletions(-)
> >
> > diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> > index 80ac446f0110..1f3dee740431 100644
> > --- a/fs/gfs2/super.c
> > +++ b/fs/gfs2/super.c
> > @@ -1344,7 +1344,7 @@ static void gfs2_evict_inode(struct inode *inode)
> >  		return;
> >  	}
> >
> > -	if (inode->i_nlink || sb_rdonly(sb))
> > +	if (sb_rdonly(sb))
> >  		goto out;
> >
> >  	if (test_bit(GIF_ALLOC_FAILED, &ip->i_flags)) {
> > @@ -1370,15 +1370,19 @@ static void gfs2_evict_inode(struct inode *inode)
> >  	}
> >
> >  	if (gfs2_inode_already_deleted(ip->i_gl, ip->i_no_formal_ino))
> > -		goto out_truncate;
> > +		goto out_flush;
> >  	error = gfs2_check_blk_type(sdp, ip->i_no_addr, GFS2_BLKST_UNLINKED);
> > -	if (error)
> > -		goto out_truncate;
> > +	if (error) {
> > +		error = 0;
> > +		goto out_flush;
> > +	}
> >
> >  	if (test_bit(GIF_INVALID, &ip->i_flags)) {
> >  		error = gfs2_inode_refresh(ip);
> > -		if (error)
> > -			goto out_truncate;
> > +		if (error) {
> > +			error = 0;
> > +			goto out_flush;
> > +		}
> >  	}
> >
> >  	/*
> > @@ -1392,7 +1396,7 @@ static void gfs2_evict_inode(struct inode *inode)
> >  	    test_bit(HIF_HOLDER, &ip->i_iopen_gh.gh_iflags)) {
> >  		if (!gfs2_upgrade_iopen_glock(inode)) {
> >  			gfs2_holder_uninit(&ip->i_iopen_gh);
> > -			goto out_truncate;
> > +			goto out_flush;
> >  		}
> >  	}
> >
> > @@ -1424,7 +1428,7 @@ static void gfs2_evict_inode(struct inode *inode)
> >  	gfs2_inode_remember_delete(ip->i_gl, ip->i_no_formal_ino);
> >  	goto out_unlock;
> >
> > -out_truncate:
> > +out_flush:
> >  	gfs2_log_flush(sdp, ip->i_gl, GFS2_LOG_HEAD_FLUSH_NORMAL |
> >  		       GFS2_LFC_EVICT_INODE);
> >  	metamapping = gfs2_glock2aspace(ip->i_gl);
> > @@ -1435,6 +1439,7 @@ static void gfs2_evict_inode(struct inode *inode)
> >  	write_inode_now(inode, 1);
> >  	gfs2_ail_flush(ip->i_gl, 0);
> >
> > +out_truncate:
> >  	nr_revokes = inode->i_mapping->nrpages + metamapping->nrpages;
> >  	if (!nr_revokes)
> >  		goto out_unlock;
> > --
> > 2.26.2
>
> Thanks,
> Andreas
Re: [Cluster-devel] [GFS2 PATCH 06/12] gfs2: Create transaction for inodes with i_nlink != 0
On Fri, Aug 21, 2020 at 7:33 PM Bob Peterson wrote:
> Before this patch, function gfs2_evict_inode would check if i_nlink
> was non-zero, and if so, go to label out. The problem is, the evicted
> file may still have outstanding pages that need invalidating, but
> the call to truncate_inode_pages_final at label out doesn't start a
> transaction. It needs a transaction in order to write revokes for any
> pages it has to invalidate.

This is only true for jdata inodes though, right? If so, I'd rather
just create transactions in the jdata case.

> This patch removes the early check for i_nlink in gfs2_evict_inode.
> Not much further down in the code, there's another check for i_nlink
> that skips to out_truncate. That one is proper because the calls
> to truncate_inode_pages after out_truncate use a proper transaction,
> so the page invalidates and subsequent revokes may be done properly.
>
> Signed-off-by: Bob Peterson
> ---
>  fs/gfs2/super.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 80ac446f0110..1f3dee740431 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1344,7 +1344,7 @@ static void gfs2_evict_inode(struct inode *inode)
>  		return;
>  	}
>
> -	if (inode->i_nlink || sb_rdonly(sb))
> +	if (sb_rdonly(sb))
>  		goto out;
>
>  	if (test_bit(GIF_ALLOC_FAILED, &ip->i_flags)) {
> @@ -1370,15 +1370,19 @@ static void gfs2_evict_inode(struct inode *inode)
>  	}
>
>  	if (gfs2_inode_already_deleted(ip->i_gl, ip->i_no_formal_ino))
> -		goto out_truncate;
> +		goto out_flush;
>  	error = gfs2_check_blk_type(sdp, ip->i_no_addr, GFS2_BLKST_UNLINKED);
> -	if (error)
> -		goto out_truncate;
> +	if (error) {
> +		error = 0;
> +		goto out_flush;
> +	}
>
>  	if (test_bit(GIF_INVALID, &ip->i_flags)) {
>  		error = gfs2_inode_refresh(ip);
> -		if (error)
> -			goto out_truncate;
> +		if (error) {
> +			error = 0;
> +			goto out_flush;
> +		}
>  	}
>
>  	/*
> @@ -1392,7 +1396,7 @@ static void gfs2_evict_inode(struct inode *inode)
>  	    test_bit(HIF_HOLDER, &ip->i_iopen_gh.gh_iflags)) {
>  		if (!gfs2_upgrade_iopen_glock(inode)) {
>  			gfs2_holder_uninit(&ip->i_iopen_gh);
> -			goto out_truncate;
> +			goto out_flush;
>  		}
>  	}
>
> @@ -1424,7 +1428,7 @@ static void gfs2_evict_inode(struct inode *inode)
>  	gfs2_inode_remember_delete(ip->i_gl, ip->i_no_formal_ino);
>  	goto out_unlock;
>
> -out_truncate:
> +out_flush:
>  	gfs2_log_flush(sdp, ip->i_gl, GFS2_LOG_HEAD_FLUSH_NORMAL |
>  		       GFS2_LFC_EVICT_INODE);
>  	metamapping = gfs2_glock2aspace(ip->i_gl);
> @@ -1435,6 +1439,7 @@ static void gfs2_evict_inode(struct inode *inode)
>  	write_inode_now(inode, 1);
>  	gfs2_ail_flush(ip->i_gl, 0);
>
> +out_truncate:
>  	nr_revokes = inode->i_mapping->nrpages + metamapping->nrpages;
>  	if (!nr_revokes)
>  		goto out_unlock;
> --
> 2.26.2
>

Thanks,

Andreas
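The difference between the jdata-only suggestion and the patch as posted can be illustrated with two small predicates. This is a hedged sketch, not kernel code: struct evict_state and both function names are hypothetical, and reading "just create transactions in the jdata case" as "jdata and at least one page left" is an assumption made for the example.

```c
#include <stdbool.h>

/* Hypothetical summary of an inode at eviction time; not a kernel type. */
struct evict_state {
	bool is_jdata;            /* journaled data: directories, quota file */
	unsigned long data_pages; /* models inode->i_mapping->nrpages */
	unsigned long meta_pages; /* models metamapping->nrpages */
};

/* Assumed reading of the suggestion: a transaction only for jdata inodes. */
static bool trans_jdata_only(const struct evict_state *s)
{
	return s->is_jdata && (s->data_pages + s->meta_pages) != 0;
}

/* The patch as posted: any page still attached may need a revoke. */
static bool trans_any_pages(const struct evict_state *s)
{
	return (s->data_pages + s->meta_pages) != 0;
}
```

The two predicates disagree only for a non-jdata inode whose metamapping still holds pages, which is the regular-file case raised in the reply at the top of this thread.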
[Cluster-devel] [GFS2 PATCH 06/12] gfs2: Create transaction for inodes with i_nlink != 0
Before this patch, function gfs2_evict_inode would check if i_nlink
was non-zero, and if so, go to label out. The problem is, the evicted
file may still have outstanding pages that need invalidating, but
the call to truncate_inode_pages_final at label out doesn't start a
transaction. It needs a transaction in order to write revokes for any
pages it has to invalidate.

This patch removes the early check for i_nlink in gfs2_evict_inode.
Not much further down in the code, there's another check for i_nlink
that skips to out_truncate. That one is proper because the calls
to truncate_inode_pages after out_truncate use a proper transaction,
so the page invalidates and subsequent revokes may be done properly.

Signed-off-by: Bob Peterson
---
 fs/gfs2/super.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 80ac446f0110..1f3dee740431 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1344,7 +1344,7 @@ static void gfs2_evict_inode(struct inode *inode)
 		return;
 	}
 
-	if (inode->i_nlink || sb_rdonly(sb))
+	if (sb_rdonly(sb))
 		goto out;
 
 	if (test_bit(GIF_ALLOC_FAILED, &ip->i_flags)) {
@@ -1370,15 +1370,19 @@ static void gfs2_evict_inode(struct inode *inode)
 	}
 
 	if (gfs2_inode_already_deleted(ip->i_gl, ip->i_no_formal_ino))
-		goto out_truncate;
+		goto out_flush;
 	error = gfs2_check_blk_type(sdp, ip->i_no_addr, GFS2_BLKST_UNLINKED);
-	if (error)
-		goto out_truncate;
+	if (error) {
+		error = 0;
+		goto out_flush;
+	}
 
 	if (test_bit(GIF_INVALID, &ip->i_flags)) {
 		error = gfs2_inode_refresh(ip);
-		if (error)
-			goto out_truncate;
+		if (error) {
+			error = 0;
+			goto out_flush;
+		}
 	}
 
 	/*
@@ -1392,7 +1396,7 @@ static void gfs2_evict_inode(struct inode *inode)
 	    test_bit(HIF_HOLDER, &ip->i_iopen_gh.gh_iflags)) {
 		if (!gfs2_upgrade_iopen_glock(inode)) {
 			gfs2_holder_uninit(&ip->i_iopen_gh);
-			goto out_truncate;
+			goto out_flush;
 		}
 	}
 
@@ -1424,7 +1428,7 @@ static void gfs2_evict_inode(struct inode *inode)
 	gfs2_inode_remember_delete(ip->i_gl, ip->i_no_formal_ino);
 	goto out_unlock;
 
-out_truncate:
+out_flush:
 	gfs2_log_flush(sdp, ip->i_gl, GFS2_LOG_HEAD_FLUSH_NORMAL |
 		       GFS2_LFC_EVICT_INODE);
 	metamapping = gfs2_glock2aspace(ip->i_gl);
@@ -1435,6 +1439,7 @@ static void gfs2_evict_inode(struct inode *inode)
 	write_inode_now(inode, 1);
 	gfs2_ail_flush(ip->i_gl, 0);
 
+out_truncate:
 	nr_revokes = inode->i_mapping->nrpages + metamapping->nrpages;
 	if (!nr_revokes)
 		goto out_unlock;
-- 
2.26.2
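The label split in the last two hunks can be modelled in user space as below. This is a simplified sketch, not the kernel function: enum evict_entry, struct evict_trace and model_evict_tail() are hypothetical names, the flush and writeback calls are reduced to a flag, and the transaction is modelled only as a flag set when nr_revokes is non-zero (the commit message states that the truncate calls after out_truncate run inside a transaction).

```c
#include <stdbool.h>

/* Which label the caller jumps to; hypothetical, for illustration only. */
enum evict_entry { ENTER_OUT_FLUSH, ENTER_OUT_TRUNCATE };

/* Records what the modelled tail of the function did. */
struct evict_trace {
	bool flushed;             /* log flush, writeback and AIL flush ran */
	bool started_trans;       /* a transaction was needed for revokes */
	unsigned long nr_revokes; /* pages that may each need a revoke */
};

static struct evict_trace model_evict_tail(enum evict_entry entry,
					   unsigned long data_pages,
					   unsigned long meta_pages)
{
	struct evict_trace t = { false, false, 0 };

	if (entry == ENTER_OUT_TRUNCATE)
		goto out_truncate;

	/* out_flush: stands in for gfs2_log_flush() through gfs2_ail_flush() */
	t.flushed = true;

out_truncate:
	/* Same sizing as the patch: data pages plus metamapping pages. */
	t.nr_revokes = data_pages + meta_pages;
	if (!t.nr_revokes)
		goto out_unlock;
	t.started_trans = true;

out_unlock:
	return t;
}
```

Entering at out_flush falls through into out_truncate, so both entry points end up sizing the transaction the same way; entering at out_truncate only skips the flush step.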