On 7/25/2025 12:48 AM, Mikulas Patocka wrote:
> On Tue, 15 Jul 2025, Dongsheng Yang wrote:
>> Hi Mikulas,
>> This is V4 for dm-pcache, please help to review.
> Hi
>
> The mempool operations seem ok.
> Here are some comments:
>
> * backing_req_cache and backing_bvec_cache should be static
Thanx
> * get_n_vecs: what if "data" is not page-aligned? is that possible?
Yes, it's possible, so I am using DIV_ROUND_UP() to calculate n_vecs.
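For reference, a minimal sketch of the calculation (the actual signature in the driver may differ; the data/len parameter names are my assumption):

        /* number of bio_vecs needed to cover a possibly unaligned buffer:
         * offset_in_page() accounts for "data" not starting on a page
         * boundary, so the partial first and last pages are counted too */
        static unsigned int get_n_vecs(void *data, size_t len)
        {
                return DIV_ROUND_UP(offset_in_page(data) + len, PAGE_SIZE);
        }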
> * cache_dev_get_empty_segment_id: change set_bit to __set_bit because it
> doesn't have to be atomic - you already hold the lock
Okay, will change.
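A sketch of the change under the lock that cache_dev_get_empty_segment_id already holds (the seg_lock and seg_id names here are placeholders, not necessarily the driver's):

        spin_lock(&cache_dev->seg_lock);
        /* __set_bit is non-atomic, which is safe because the spinlock
         * already serializes all modifications of seg_bitmap */
        __set_bit(seg_id, cache_dev->seg_bitmap);
        spin_unlock(&cache_dev->seg_lock);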
> * you usually hold seg_map_lock around modifications of seg_map bitmap,
> but not in kset_replay and cache_replay - that seems like a bug. When
> you hold the spinlock always, you can change set_bit/clear_bit to
> __set_bit and __clear_bit
Actually, that's intended: there is no other accessor of seg_map during the replay stage, so I didn't hold the lock there, which is faster. But I am glad to add the lock. On the one hand, it makes the usage consistent and lets us switch to __set_bit; on the other hand, if we later add support for parallel replay, we'll need the lock to handle the race conditions during this stage. See the sketch below.
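A minimal sketch of the replay-path change (the surrounding kset_replay/cache_replay code is elided; seg_id stands in for the segment index taken from the on-media kset):

        spin_lock(&cache->seg_map_lock);
        /* now consistent with every other seg_map accessor, and safe
         * if replay is ever parallelized */
        __set_bit(seg_id, cache->seg_map);
        spin_unlock(&cache->seg_map_lock);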
> * cache.c: cache->seg_map = bitmap_zalloc(cache_dev->seg_num, GFP_KERNEL);
> * cache_dev.c: cache_dev->seg_bitmap = bitmap_zalloc(cache_dev->seg_num, GFP_KERNEL);
> how big can seg_num be? allocating more than 32768 bytes with kmalloc
> is unreliable (kvmalloc should be used in this case).
seg_num can be as large as:

        #define PCACHE_CACHE_SEGS_MAX (1024 * 1024) /* maximum cache size for each device is 16T */

so the bitmap can reach 128 KiB, well beyond the 32768-byte limit. I will switch it to kvcalloc():

        cache_dev->seg_bitmap = kvcalloc(BITS_TO_LONGS(cache_dev->seg_num),
                                         sizeof(unsigned long), GFP_KERNEL);
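One related note (my addition, not from the patch): once the bitmap comes from the kvmalloc family, the teardown path must release it with kvfree() rather than bitmap_free():

        kvfree(cache_dev->seg_bitmap);
        cache_dev->seg_bitmap = NULL;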
> * cache->ksets = kcalloc(cache->n_ksets, PCACHE_KSET_SIZE, GFP_KERNEL);
n_ksets is equal to the number of online CPUs, so this allocation could exceed 32768 bytes; I'll switch it to kvcalloc() as well.
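The change itself is a one-line swap (with the matching free becoming kvfree()):

        cache->ksets = kvcalloc(cache->n_ksets, PCACHE_KSET_SIZE, GFP_KERNEL);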
> * kset_onmedia = kzalloc(PCACHE_KSET_ONMEDIA_SIZE_MAX, GFP_KERNEL);
> - could these allocations be larger than 32768 bytes?
The size is fixed, currently 5144 bytes, and even if we extend it later it shouldn't exceed 32768 bytes for the time being, so kzalloc() remains fine here.
> * There's cpu_to_le32 when you set sb->seg_num: sb->seg_num = cpu_to_le32(nr_segs),
> but not le32_to_cpu when you read it: ret = cache_dev_init(cache_dev, sb.seg_num);
> - it seems that this will not work on big-endian machines.
Correct, it needs le32_to_cpu(); I missed it here. Will fix.
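The corrected read side, derived from the quoted call above:

        ret = cache_dev_init(cache_dev, le32_to_cpu(sb.seg_num));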
> Just a suggestion for performance improvement: when you hold a spinlock and
> you need to allocate some memory, you must drop the spinlock, allocate it
> with GFP_NOIO, reacquire the spinlock and recheck the state. You can
> improve this logic by allocating with mempool_alloc(GFP_NOWAIT) while you
> hold the spinlock (GFP_NOWAIT will not sleep, so it's allowed), and
> fall back to dropping the spinlock only if the GFP_NOWAIT allocation fails.
Sounds like a good suggestion, I will try it in V5.
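A minimal sketch of the pattern I understand you are suggesting (req_pool and cache->lock are placeholder names):

        /* opportunistic, non-sleeping attempt while the lock is held */
        req = mempool_alloc(req_pool, GFP_NOWAIT);
        if (!req) {
                /* slow path: drop the lock so the allocation may sleep */
                spin_unlock(&cache->lock);
                req = mempool_alloc(req_pool, GFP_NOIO);
                spin_lock(&cache->lock);
                /* the state may have changed while the lock was dropped;
                 * recheck it before proceeding */
        }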
> Thanx
> Mikulas