Before rbio_orig_end_io() gets to free the rbio, the rbio may be merged
with more bios from other rbios so that rbio->bio_list becomes non-empty;
in that case, these newly merged bios are never ended properly.

Once unlock_stripe() is done, rbio->bio_list will not be updated any
more and we can call bio_endio() on all queued bios.

This should only happen in error-out cases; the normal paths of recovery
and full stripe write have already set RBIO_RMW_LOCKED_BIT to disable
merging before doing IO, so rbio_orig_end_io() called from them does not
hit the above issue.

Reported-by: Jérôme Carretero <cj...@zougloub.eu>
Signed-off-by: Liu Bo <bo.li....@oracle.com>
---
v2: - Remove the use of spin_lock as lockdep reports a possible deadlock
      in interrupt context, although it would never actually happen
      because we've taken care of it by saving irq flags in all places.
    - Update the commit log and code comments to explain the new idea.
    - This has been tested against btrfs/011 50 times.
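
For context, the merge gating mentioned in the commit log is done when a
new rbio is queued in lock_stripe_add(); a condensed sketch of the
rbio_can_merge() check (simplified here, the upstream code carries a few
more conditions) looks roughly like this:

  /*
   * Condensed sketch of rbio_can_merge(): once a path has set
   * RBIO_RMW_LOCKED_BIT, no further bios can be merged into the rbio,
   * which is why recovery and full stripe writes are not affected by
   * the race described above.
   */
  static int rbio_can_merge(struct btrfs_raid_bio *last,
                            struct btrfs_raid_bio *cur)
  {
          /* an rbio that is already doing IO refuses any merging */
          if (test_bit(RBIO_RMW_LOCKED_BIT, &last->flags) ||
              test_bit(RBIO_RMW_LOCKED_BIT, &cur->flags))
                  return 0;

          /* cached rbios are reused by stealing, not by merging */
          if (test_bit(RBIO_CACHE_BIT, &last->flags) ||
              test_bit(RBIO_CACHE_BIT, &cur->flags))
                  return 0;

          /* only the same stripe with the same operation can merge */
          if (last->bbio->raid_map[0] != cur->bbio->raid_map[0])
                  return 0;
          if (last->operation != cur->operation)
                  return 0;

          return 1;
  }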

 fs/btrfs/raid56.c | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 7747323..b2b426d 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -864,10 +864,17 @@ static void __free_raid_bio(struct btrfs_raid_bio *rbio)
        kfree(rbio);
 }
 
-static void free_raid_bio(struct btrfs_raid_bio *rbio)
+static void rbio_endio_bio_list(struct bio *cur, blk_status_t err)
 {
-       unlock_stripe(rbio);
-       __free_raid_bio(rbio);
+       struct bio *next;
+
+       while (cur) {
+               next = cur->bi_next;
+               cur->bi_next = NULL;
+               cur->bi_status = err;
+               bio_endio(cur);
+               cur = next;
+       }
 }
 
 /*
@@ -877,20 +884,26 @@ static void free_raid_bio(struct btrfs_raid_bio *rbio)
 static void rbio_orig_end_io(struct btrfs_raid_bio *rbio, blk_status_t err)
 {
        struct bio *cur = bio_list_get(&rbio->bio_list);
-       struct bio *next;
+       struct bio *extra;
 
        if (rbio->generic_bio_cnt)
                btrfs_bio_counter_sub(rbio->fs_info, rbio->generic_bio_cnt);
 
-       free_raid_bio(rbio);
+       /*
+        * At this moment, rbio->bio_list is empty; however, since rbio does
+        * not always have RBIO_RMW_LOCKED_BIT set and is still linked on the
+        * hash list, it may get merged with other rbios so that
+        * rbio->bio_list becomes non-empty.
+        * Once unlock_stripe() is done, rbio->bio_list will not be updated
+        * any more and we can call bio_endio() on all queued bios.
+        */
+       unlock_stripe(rbio);
+       extra = bio_list_get(&rbio->bio_list);
+       __free_raid_bio(rbio);
 
-       while (cur) {
-               next = cur->bi_next;
-               cur->bi_next = NULL;
-               cur->bi_status = err;
-               bio_endio(cur);
-               cur = next;
-       }
+       rbio_endio_bio_list(cur, err);
+       if (extra)
+               rbio_endio_bio_list(extra, err);
 }
 
 /*
-- 
2.9.4
