Commit:     28ae094c625a9b719c01cf5ec45b8640e6911f53
Parent:     2dafe1c4d69345539735cca64250f2d4657bd057
Author:     Neil Brown <[EMAIL PROTECTED]>
AuthorDate: Fri Feb 8 04:22:13 2008 -0800
Committer:  Linus Torvalds <[EMAIL PROTECTED]>
CommitDate: Fri Feb 8 09:22:44 2008 -0800

    ext3 can fail badly when device stops accepting BIO_RW_BARRIER requests

    Some devices - notably dm and md - can change how they respond to
    BIO_RW_BARRIER requests.  They might start out accepting such requests
    but, after reconfiguration, find that they no longer can.

    ext3 (and other filesystems) deal with this by always testing whether
    BIO_RW_BARRIER requests fail with EOPNOTSUPP, and retrying the write
    requests without the barrier (probably after waiting for any pending writes
    to complete).

    However, there is a bug in how ext3 handles this.

    When ext3 (jbd actually) decides to submit a BIO_RW_BARRIER request, it
    sets the buffer_ordered flag on the buffer head.  If the request completes
    successfully, the flag STAYS SET.

    Other code might then write the same buffer_head after the device has been
    reconfigured to not accept barriers.  This write will then fail, but the
    "other code" is not ready to handle EOPNOTSUPP errors and the error will be
    treated as fatal.

    This can be seen without having to reconfigure a device at exactly the
    wrong time by putting:

                if (buffer_ordered(bh))
                        printk("OH DEAR, an ordered buffer\n");

    in the while loop in "commit phase 5" of journal_commit_transaction.

    If it ever prints the "OH DEAR ..." message (as it does sometimes for
    me), then that request could (in different circumstances) have failed
    with EOPNOTSUPP, but that isn't tested for.

    My proposed fix is to clear the buffer_ordered flag after it has been
    used, as in the following patch.

    Signed-off-by: Neil Brown <[EMAIL PROTECTED]>
    Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
    Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
 fs/jbd/commit.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c
index 31853eb..8e08efc 100644
--- a/fs/jbd/commit.c
+++ b/fs/jbd/commit.c
@@ -131,6 +131,8 @@ static int journal_write_commit_record(journal_t *journal,
                barrier_done = 1;
        ret = sync_dirty_buffer(bh);
+       if (barrier_done)
+               clear_buffer_ordered(bh);
        /* is it possible for another commit to fail at roughly
         * the same time as this one?  If so, we don't want to
         * trust the barrier flag in the super, but instead want
@@ -148,7 +150,6 @@ static int journal_write_commit_record(journal_t *journal,
                /* And try again, without the barrier */
-               clear_buffer_ordered(bh);
                ret = sync_dirty_buffer(bh);