In passthrough mode, the policy invalidate_mapping operation is called
simultaneously from multiple workers, thus it should be protected by a
lock. Otherwise, we might end up with data races on the allocated blocks
counter, or even use-after-free issues with internal data structures
when doing concurrent writes.
Note that the existing FIXME in smq_invalidate_mapping() doesn't affect
passthrough mode since migration tasks don't exist there, but would need
attention if supporting fast device shrinking via suspend/resume without
target reloading.
Reproduce steps:
1. Create a cache device consisting of 1024 cache entries
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
2. Populate the cache, and record the number of cached blocks
fio --name=populate --filename=/dev/mapper/cache --rw=randwrite --bs=4k \
--size=64m --direct=1
nr_cached=$(dmsetup status cache | awk '{split($7, a, "/"); print a[1]}')
3. Reload the cache into passthrough mode
dmsetup suspend cache
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 passthrough smq 0"
dmsetup resume cache
4. Write to the passthrough cache. By setting multiple jobs with I/O
size equal to the cache block size, cache blocks are invalidated
concurrently from different workers.
fio --filename=/dev/mapper/cache --name=test --rw=randwrite --bs=64k \
--direct=1 --numjobs=2 --randrepeat=0 --size=64m
5. Check if demoted matches cached block count. These numbers should
match but may differ due to the data race.
nr_demoted=$(dmsetup status cache | awk '{print $12}')
echo "$nr_cached, $nr_demoted"
Fixes: b29d4986d0da ("dm cache: significant rework to leverage
dm-bio-prison-v2")
Signed-off-by: Ming-Hung Tsai <[email protected]>
---
drivers/md/dm-cache-policy-smq.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/md/dm-cache-policy-smq.c b/drivers/md/dm-cache-policy-smq.c
index 7e1e8cc0e33a..76a35cce8502 100644
--- a/drivers/md/dm-cache-policy-smq.c
+++ b/drivers/md/dm-cache-policy-smq.c
@@ -1589,14 +1589,18 @@ static int smq_invalidate_mapping(struct
dm_cache_policy *p, dm_cblock_t cblock)
{
struct smq_policy *mq = to_smq_policy(p);
struct entry *e = get_entry(&mq->cache_alloc, from_cblock(cblock));
+ unsigned long flags;
if (!e->allocated)
return -ENODATA;
+ spin_lock_irqsave(&mq->lock, flags);
// FIXME: what if this block has pending background work?
del_queue(mq, e);
h_remove(&mq->table, e);
free_entry(&mq->cache_alloc, e);
+ spin_unlock_irqrestore(&mq->lock, flags);
+
return 0;
}
--
2.49.0