Re: [Cluster-devel] [GFS2 PATCH] gfs2: fix trans slab error when withdraw occurs inside log_flush

2020-06-09 Thread Andreas Gruenbacher
On Tue, Jun 9, 2020 at 4:15 PM Bob Peterson  wrote:
> - Original Message -
> > I'm not sure quite what the aim is here... are you sure that it is ok to
> > move something to the AIL list if there has been a withdrawal? If the
> > log flush has not completed correctly then we should not be moving
> > anything to the AIL lists I think,
> >
> > Steve.
>
> Yes, I'm sure it's okay in this case. We only add it temporarily to the
> ail1 list so that function ail_drain() finds and removes it like the rest.
>
> I only coded it this way because Andreas didn't like my somewhat longer
> previous implementation, which follows.

Note that this previous version fails to free the gfs2_bufdata objects
attached to the transaction if the transaction isn't on one of the ail
lists.

Andreas



Re: [Cluster-devel] [GFS2 PATCH] gfs2: fix trans slab error when withdraw occurs inside log_flush

2020-06-09 Thread Bob Peterson
- Original Message -
> I'm not sure quite what the aim is here... are you sure that it is ok to
> move something to the AIL list if there has been a withdrawal? If the
> log flush has not completed correctly then we should not be moving
> anything to the AIL lists I think,
> 
> Steve.

Yes, I'm sure it's okay in this case. We only add it temporarily to the
ail1 list so that function ail_drain() finds and removes it like the rest.

I only coded it this way because Andreas didn't like my somewhat longer
previous implementation, which follows.

Regards,

Bob

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 3e4734431783..0c45851b88d5 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -845,8 +845,15 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 
flags)
 /**
  * ail_drain - drain the ail lists after a withdraw
  * @sdp: Pointer to GFS2 superblock
+ * @flush_tr: If non-null, transaction that was being flushed
+ *
+ * We're draining the ail lists after a withdraw, during a log flush op.
+ * If the log flush was targeting a specific transaction, it will never
+ * go through "after" lops processing, so we need to free it here.
+ * If we find it on one of the ail lists, we free it as we dequeue it.
+ * If we don't find it, we free it at the end.
  */
-static void ail_drain(struct gfs2_sbd *sdp)
+static void ail_drain(struct gfs2_sbd *sdp, struct gfs2_trans *flush_tr)
 {
struct gfs2_trans *tr;
 
@@ -865,6 +872,8 @@ static void ail_drain(struct gfs2_sbd *sdp)
gfs2_ail_empty_tr(sdp, tr, >tr_ail2_list);
list_del(>tr_list);
gfs2_trans_free(sdp, tr);
+   if (tr == flush_tr)
+   flush_tr = NULL;
}
while (!list_empty(>sd_ail2_list)) {
tr = list_first_entry(>sd_ail2_list, struct gfs2_trans,
@@ -872,7 +881,11 @@ static void ail_drain(struct gfs2_sbd *sdp)
gfs2_ail_empty_tr(sdp, tr, >tr_ail2_list);
list_del(>tr_list);
gfs2_trans_free(sdp, tr);
+   if (tr == flush_tr)
+   flush_tr = NULL;
}
+   if (flush_tr)
+   gfs2_trans_free(sdp, flush_tr);
spin_unlock(>sd_ail_lock);
 }
 
@@ -1002,7 +1015,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct 
gfs2_glock *gl, u32 flags)
 
 out:
if (gfs2_withdrawn(sdp)) {
-   ail_drain(sdp); /* frees all transactions */
+   ail_drain(sdp, tr); /* frees all transactions */
tr = NULL;
}
 



Re: [Cluster-devel] [GFS2 PATCH] gfs2: fix trans slab error when withdraw occurs inside log_flush

2020-06-09 Thread Steven Whitehouse

Hi,

On 09/06/2020 14:55, Bob Peterson wrote:

Hi,

Log flush operations (gfs2_log_flush()) can target a specific transaction.
But if the function encounters errors (e.g. io errors) and withdraws,
the transaction was only freed it if was queued to one of the ail lists.
If the withdraw occurred before the transaction was queued to the ail1
list, function ail_drain never freed it. The result was:

BUG gfs2_trans: Objects remaining in gfs2_trans on __kmem_cache_shutdown()

This patch makes log_flush() add the targeted transaction to the ail1
list so that function ail_drain() will find and free it properly.

Signed-off-by: Bob Peterson 
---
  fs/gfs2/log.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 3e4734431783..2b05415bbc13 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1002,6 +1002,16 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct 
gfs2_glock *gl, u32 flags)
  
  out:

if (gfs2_withdrawn(sdp)) {
+   /**
+* If the tr_list is empty, we're withdrawing during a log
+* flush that targets a transaction, but the transaction was
+* never queued onto any of the ail lists. Here we add it to
+* ail1 just so that ail_drain() will find and free it.
+*/
+   spin_lock(>sd_ail_lock);
+   if (tr && list_empty(>tr_list))
+   list_add(>tr_list, >sd_ail1_list);
+   spin_unlock(>sd_ail_lock);
ail_drain(sdp); /* frees all transactions */
tr = NULL;
}

I'm not sure quite what the aim is here... are you sure that it is ok to 
move something to the AIL list if there has been a withdrawal? If the 
log flush has not completed correctly then we should not be moving 
anything to the AIL lists I think,


Steve.




[Cluster-devel] [GFS2 PATCH] gfs2: fix trans slab error when withdraw occurs inside log_flush

2020-06-09 Thread Bob Peterson
Hi,

Log flush operations (gfs2_log_flush()) can target a specific transaction.
But if the function encounters errors (e.g. io errors) and withdraws,
the transaction was only freed it if was queued to one of the ail lists.
If the withdraw occurred before the transaction was queued to the ail1
list, function ail_drain never freed it. The result was:

BUG gfs2_trans: Objects remaining in gfs2_trans on __kmem_cache_shutdown()

This patch makes log_flush() add the targeted transaction to the ail1
list so that function ail_drain() will find and free it properly.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/log.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 3e4734431783..2b05415bbc13 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -1002,6 +1002,16 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct 
gfs2_glock *gl, u32 flags)
 
 out:
if (gfs2_withdrawn(sdp)) {
+   /**
+* If the tr_list is empty, we're withdrawing during a log
+* flush that targets a transaction, but the transaction was
+* never queued onto any of the ail lists. Here we add it to
+* ail1 just so that ail_drain() will find and free it.
+*/
+   spin_lock(>sd_ail_lock);
+   if (tr && list_empty(>tr_list))
+   list_add(>tr_list, >sd_ail1_list);
+   spin_unlock(>sd_ail_lock);
ail_drain(sdp); /* frees all transactions */
tr = NULL;
}