On 7/8/20 2:20 AM, Damien Le Moal wrote:
Triggerring reclaim only based on the percentage of unmapped cache
zones can fail to detect cases where reclaim is needed, e.g. if the
target has only 2 or 3 cache zones and only one unmapped cache zone,
the percentage of free cache zone is higher than
DMZ_RECLAIM_LOW_UNMAP_ZONES (30%) and reclaim does not trigger.
This problem, combined with the fact that dmz_schedule_reclaim() is
called from dmz_handle_bio() without the map lock held leads to a race
between zone allocation and dmz_should_reclaim() result. Depending on
the workload applied, this race can lead to the write path forever
waiting for a free zone without reclaim being triggerred.

Fix this by moving dmz_schedule_reclaim() inside dmz_alloc_zone()
under the map lock, checking the need for zone reclaim whenever a new
data or buffer zone needs to be allocated.

Also fix dmz_reclaim_percentage() to always return 0 if the number of
unmapped cache (or random) zone is less than or equal to 1.

Suggested-by: Shin'ichiro Kawasaki <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
---
  drivers/md/dm-zoned-metadata.c |  9 ++++++++-
  drivers/md/dm-zoned-reclaim.c  |  2 ++
  drivers/md/dm-zoned-target.c   | 10 +---------
  3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/md/dm-zoned-metadata.c b/drivers/md/dm-zoned-metadata.c
index 5cf6f5f552e0..b298fefb022e 100644
--- a/drivers/md/dm-zoned-metadata.c
+++ b/drivers/md/dm-zoned-metadata.c
@@ -2217,8 +2217,15 @@ struct dm_zone *dmz_alloc_zone(struct dmz_metadata *zmd, 
unsigned int dev_idx,
  {
        struct list_head *list;
        struct dm_zone *zone;
-       int i = 0;
+       int i;
+
+       /* Schedule reclaim to ensure free zones are available */
+       if (!(flags & DMZ_ALLOC_RECLAIM)) {
+               for (i = 0; i < zmd->nr_devs; i++)
+                       dmz_schedule_reclaim(zmd->dev[i].reclaim);
+       }
+ i = 0;
  again:
        if (flags & DMZ_ALLOC_CACHE)
                list = &zmd->unmap_cache_list;
diff --git a/drivers/md/dm-zoned-reclaim.c b/drivers/md/dm-zoned-reclaim.c
index dd1eebf6e50f..9c6e264465bc 100644
--- a/drivers/md/dm-zoned-reclaim.c
+++ b/drivers/md/dm-zoned-reclaim.c
@@ -456,6 +456,8 @@ static unsigned int dmz_reclaim_percentage(struct 
dmz_reclaim *zrc)
                nr_zones = dmz_nr_rnd_zones(zmd, zrc->dev_idx);
                nr_unmap = dmz_nr_unmap_rnd_zones(zmd, zrc->dev_idx);
        }
+       if (nr_unmap <= 1)
+               return 0;
        return nr_unmap * 100 / nr_zones;
  }
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index cf915009c306..42aa5139df7c 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -400,15 +400,7 @@ static void dmz_handle_bio(struct dmz_target *dmz, struct 
dm_chunk_work *cw,
                dm_per_bio_data(bio, sizeof(struct dmz_bioctx));
        struct dmz_metadata *zmd = dmz->metadata;
        struct dm_zone *zone;
-       int i, ret;
-
-       /*
-        * Write may trigger a zone allocation. So make sure the
-        * allocation can succeed.
-        */
-       if (bio_op(bio) == REQ_OP_WRITE)
-               for (i = 0; i < dmz->nr_ddevs; i++)
-                       dmz_schedule_reclaim(dmz->dev[i].reclaim);
+       int ret;
dmz_lock_metadata(zmd);
I seem to have run into this during my testing, too, but then as I'd arguably had programming errors at that time I didn't manage to recreate it. Thanks for tracking it down.

Reviewed-by: Hannes Reinecke <[email protected]>

Cheers,

Hannes
--
Dr. Hannes Reinecke            Teamlead Storage & Networking
[email protected]                               +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer


--
dm-devel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/dm-devel

Reply via email to