Re: Problem in log_do_checkpoint()?

Jan Kara Mon, 11 Apr 2005 04:40:04 -0700

  Hello,

> On Mon, 2005-04-04 at 10:04, Jan Kara wrote:
> 
> >   In log_do_checkpoint() we go through the t_checkpoint_list of a
> > transaction and call __flush_buffer() on each buffer. Suppose there is
> > just one buffer on the list and it is dirty. __flush_buffer() sees it and
> > puts it to an array of buffers for flushing. Then the loop finishes,
> > retry=0, drop_count=0, batch_count=1. So __flush_batch() is called - we
> > drop all locks and sleep. While we are sleeping somebody else comes and
> > makes the buffer dirty again (OK, that is not probable, but I think it
> > could be possible). 
> 
> Yes, there's no reason why not at that point.
> 
> > Now we wake up and call __cleanup_transaction().
> > It's not able to do anything and returns 0.
> 
> I think the _right_ answer here is to have two separate checkpoint
> lists: the current one, plus one for which the checkpoint write has
> already been submitted.  That way, we can wait for IO completion on
> submitted writes without (a) getting conned into doing multiple rewrites
> if there's somebody else dirtying the buffer; or (b) getting confused
> about how much progress we're making.  Buffers on the pre-write list get
> written; buffers on the post-write list get waited for; and both count
> as progress (eliminating the false assert-failure when we failed to
> detect progress).
  Yes, this seems to be a better long-term solution. A hotfix (retrying
after __flush_batch()) is attached if somebody is interested - it should
be safe and is lightly tested.


> The prevention of multiple writes in this case should also improve
> performance a little.
> 
> That ought to be pretty straightforward, I think.  The existing cases
> where we remove buffers from a checkpoint shouldn't have to care about
> which list_head we're removing from; those cases already handle buffers
> in both states.  It's only when doing the flush/wait that we have to
> distinguish the two.
  Yes, AFAICS the changes should remain local to the checkpointing code
(plus __unlink_buffer()). Should I write the patch or will you?

                                                                Honza
-- 
Jan Kara <[EMAIL PROTECTED]>
SuSE CR Labs

Fix possible false assertion failure in JBD checkpointing code.

Signed-off-by: Jan Kara <[EMAIL PROTECTED]>

diff -rupX /home/jack/.kerndiffexclude linux-2.6.12-rc2/fs/jbd/checkpoint.c 
linux-2.6.12-rc2-checkpoint/fs/jbd/checkpoint.c
--- linux-2.6.12-rc2/fs/jbd/checkpoint.c        2005-03-03 18:58:29.000000000 
+0100
+++ linux-2.6.12-rc2-checkpoint/fs/jbd/checkpoint.c     2005-04-05 
13:26:42.000000000 +0200
@@ -339,8 +339,10 @@ int log_do_checkpoint(journal_t *journal
                        }
                } while (jh != last_jh && !retry);
 
-               if (batch_count)
+               if (batch_count) {
                        __flush_batch(journal, bhs, &batch_count);
+                       retry = 1;
+               }
 
                /*
                 * If someone cleaned up this transaction while we slept, we're

Re: Problem in log_do_checkpoint()?

Reply via email to