When failing to acquire the root_lock in dm_cache_metadata_abort because
the block_manager is read-only, the temporary block_manager created
outside the root_lock is not properly released, causing a memory leak.
Reproduce steps:
This can be reproduced by reloading a new table while the metadata
is read-only. While the second call to dm_cache_metadata_abort is
caused by lack of support for table preload in dm-cache, mentioned
in commit 9b1cc9f251af ("dm cache: share cache-metadata object across
inactive and active DM tables"), it exposes the memory leak in
dm_cache_metadata_abort when the function is called multiple times.
Specifically, dm-cache fails to sync the new cache object's mode during
preresume, creating the reproducer condition.
This issue could also occur through concurrent metadata_operation_failed
calls due to races in cache mode updates, but the table preload scenario
below provides a reliable reproducer.
1. Create a cache device with some faulty trailing metadata blocks
dmsetup create cmeta <<EOF
0 200 linear /dev/sdc 0
200 7992 error
EOF
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 131072 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 1 writethrough smq 0"
2. Suspend and resume the cache to start a new metadata transaction and
trigger metadata io errors on the next metadata commit.
dmsetup suspend cache
dmsetup resume cache
3. Write to the cache device to update metadata
fio --filename=/dev/mapper/cache --name test --rw=randwrite --bs=4k \
--randrepeat=0 --direct=1 --size 64k
4. Preload the same table
dmsetup reload cache --table "$(dmsetup table cache)"
5. Resume the new table. This triggers the memory leak.
dmsetup suspend cache
dmsetup resume cache
kmemleak logs:
<snip>
unreferenced object 0xffff8880080c2010 (size 16):
comm "dmsetup", pid 132, jiffies 4294982580
hex dump (first 16 bytes):
00 38 b9 07 80 88 ff ff 6a 6b 6b 6b 6b 6b 6b a5 ...
backtrace (crc 3118f31c):
kmemleak_alloc+0x28/0x40
__kmalloc_cache_noprof+0x3d9/0x510
dm_block_manager_create+0x51/0x140
dm_cache_metadata_abort+0x85/0x320
metadata_operation_failed+0x103/0x1e0
cache_preresume+0xacd/0xe70
dm_table_resume_targets+0xd3/0x320
__dm_resume+0x1b/0xf0
dm_resume+0x127/0x170
<snip>
Fixes: 352b837a5541 ("dm cache: Fix ABBA deadlock between shrink_slab and
dm_cache_metadata_abort")
Signed-off-by: Ming-Hung Tsai <[email protected]>
---
v2: Remove redundant cmd->fail_io check after WRITE_LOCK_OR_GOTO since
cmd_write_lock() already checks fail_io atomically under root_lock.
Also mention the races in cache mode updates as alternative trigger.
drivers/md/dm-cache-metadata.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/md/dm-cache-metadata.c b/drivers/md/dm-cache-metadata.c
index 24a6eea5579f..acb335cee4d3 100644
--- a/drivers/md/dm-cache-metadata.c
+++ b/drivers/md/dm-cache-metadata.c
@@ -1023,6 +1023,12 @@ static bool cmd_write_lock(struct dm_cache_metadata *cmd)
return; \
} while (0)
+#define WRITE_LOCK_OR_GOTO(cmd, label) \
+ do { \
+ if (!cmd_write_lock((cmd))) \
+ goto label; \
+ } while (0)
+
#define WRITE_UNLOCK(cmd) \
up_write(&(cmd)->root_lock)
@@ -1780,11 +1786,8 @@ int dm_cache_metadata_abort(struct dm_cache_metadata
*cmd)
new_bm = dm_block_manager_create(cmd->bdev,
DM_CACHE_METADATA_BLOCK_SIZE << SECTOR_SHIFT,
CACHE_MAX_CONCURRENT_LOCKS);
- WRITE_LOCK(cmd);
- if (cmd->fail_io) {
- WRITE_UNLOCK(cmd);
- goto out;
- }
+ /* cmd_write_lock() already checks fail_io with cmd->root_lock held */
+ WRITE_LOCK_OR_GOTO(cmd, out);
__destroy_persistent_data_objects(cmd, false);
old_bm = cmd->bm;
--
2.49.0