In btree_flush_write(), iterating all cached btree nodes and adding them
into the ordered heap c->flush_btree takes quite a long time. To protect
the ordered heap c->flush_btree, the spin lock c->journal.lock is held
for the whole iteration and heap ordering. When journal space is fully
occupied, btree_flush_write() might be called frequently; if the cached
btree node iteration takes too much time, the kernel will complain that
normal journal kworkers are blocked for too long. Of course write
performance drops at this moment.

This patch introduces a new spin lock member in struct journal, named
flush_write_lock. This lock is only used in btree_flush_write() and
protects the ordered heap c->flush_btree during the whole cached btree
node iteration. Then there won't be lock contention on c->journal.lock.

After this fix, when journal space is fully occupied, the journal
kworker blocking timeout warning is very rarely observed.

Signed-off-by: Coly Li <[email protected]>
---
 drivers/md/bcache/journal.c | 5 +++--
 drivers/md/bcache/journal.h | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 8536e76fcac9..6e38470f6924 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -519,7 +519,7 @@ static void btree_flush_write(struct cache_set *c)
        atomic_long_inc(&c->flush_write);
 
 retry:
-       spin_lock(&c->journal.lock);
+       spin_lock(&c->journal.flush_write_lock);
        if (heap_empty(&c->flush_btree)) {
                for_each_cached_btree(b, c, i)
                        if (btree_current_write(b)->journal) {
@@ -540,7 +540,7 @@ static void btree_flush_write(struct cache_set *c)
 
        b = NULL;
        heap_pop(&c->flush_btree, b, journal_min_cmp);
-       spin_unlock(&c->journal.lock);
+       spin_unlock(&c->journal.flush_write_lock);
 
        if (b) {
                mutex_lock(&b->write_lock);
@@ -1099,6 +1099,7 @@ int bch_journal_alloc(struct cache_set *c)
        struct journal *j = &c->journal;
 
        spin_lock_init(&j->lock);
+       spin_lock_init(&j->flush_write_lock);
        INIT_DELAYED_WORK(&j->work, journal_write_work);
 
        c->journal_delay_ms = 100;
diff --git a/drivers/md/bcache/journal.h b/drivers/md/bcache/journal.h
index a8be14c6f6d9..d8ad99f6191b 100644
--- a/drivers/md/bcache/journal.h
+++ b/drivers/md/bcache/journal.h
@@ -103,6 +103,7 @@ struct journal_write {
 /* Embedded in struct cache_set */
 struct journal {
        spinlock_t              lock;
+       spinlock_t              flush_write_lock;
        /* used when waiting because the journal was full */
        struct closure_waitlist wait;
        struct closure          io;
-- 
2.16.4
