On 7/25/2025 12:48 AM, Mikulas Patocka wrote:

On Tue, 15 Jul 2025, Dongsheng Yang wrote:

Hi Mikulas,
        This is V4 of dm-pcache; please help to review it.
Hi

The mempool operations seem ok.

Here are some comments:

* backing_req_cache and backing_bvec_cache should be static
Thanx
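
For reference, a minimal sketch of the fix (the exact types are whatever
the patch declares; kmem_cache pointers are assumed here):

    /* Limit the caches to this translation unit; nothing outside
     * this file needs to see them.
     */
    static struct kmem_cache *backing_req_cache;
    static struct kmem_cache *backing_bvec_cache;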

* get_n_vecs: what if "data" is not page-aligned? is that possible?
Yes, it's possible, so I am using DIV_ROUND_UP() to calculate n_vecs.
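
To make the unaligned case explicit, a sketch (assuming get_n_vecs()
takes the raw pointer and byte count; the real signature may differ):
an unaligned start can push the buffer across one extra page boundary,
so the in-page offset has to be included in the round-up:

    /* Sketch: number of bio_vecs needed to cover [data, data + len).
     * offset_in_page(data) accounts for a non-page-aligned start,
     * which can make the buffer straddle one extra page.
     */
    static unsigned int get_n_vecs(void *data, u32 len)
    {
            return DIV_ROUND_UP(offset_in_page(data) + len, PAGE_SIZE);
    }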

* cache_dev_get_empty_segment_id: change set_bit to __set_bit because it
   doesn't have to be atomic - you already hold the lock
Okay, will change it to __set_bit().

* you usually hold seg_map_lock around modifications of seg_map bitmap,
   but not in kset_replay and cache_replay - that seems like a bug. When
   you hold the spinlock always, you can change set_bit/clear_bit to
   __set_bit and __clear_bit

Actually, that's intended: there is no other accessor during the replay
stage, so I don't hold the lock there, which is faster.

But I am glad to add the lock here. On the one hand, it makes the locking
consistent and lets us switch to __set_bit(); on the other hand, if we
later add support for parallel replay, we will need the lock to handle
races during this stage. A sketch of the pattern follows.
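
For both this and the previous item, the pattern would look roughly like
this (a sketch; the lock and bitmap names are taken from the review, the
surrounding code is assumed):

    spin_lock(&cache->seg_map_lock);
    /* The non-atomic variant is sufficient here: the bitmap is always
     * modified under seg_map_lock, so no concurrent access is possible.
     */
    __set_bit(seg_id, cache->seg_map);
    spin_unlock(&cache->seg_map_lock);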



* cache.c:     cache->seg_map = bitmap_zalloc(cache_dev->seg_num, GFP_KERNEL);
* cache_dev.c: cache_dev->seg_bitmap = bitmap_zalloc(cache_dev->seg_num, GFP_KERNEL);
   how big can seg_num be? allocating more than 32768 bytes with kmalloc
   is unreliable (kvmalloc should be used in this case).

seg_num can be as large as PCACHE_CACHE_SEGS_MAX:

#define PCACHE_CACHE_SEGS_MAX              (1024 * 1024)   /* maximum cache size for each device is 16T */

so the bitmap can reach (1024 * 1024) / 8 = 131072 bytes (128 KiB), well
past the 32768-byte limit. I will switch the allocation to:

kvcalloc(BITS_TO_LONGS(cache_dev->seg_num), sizeof(unsigned long), GFP_KERNEL);
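
One detail worth noting (a sketch, assuming the usual alloc/free pairing
in the driver): the free path must switch too, since kvfree() is required
for memory that may be vmalloc-backed:

    cache_dev->seg_bitmap = kvcalloc(BITS_TO_LONGS(cache_dev->seg_num),
                                     sizeof(unsigned long), GFP_KERNEL);
    if (!cache_dev->seg_bitmap)
            return -ENOMEM;
    ...
    /* kvfree() handles both kmalloc- and vmalloc-backed memory. */
    kvfree(cache_dev->seg_bitmap);

The same change applies to cache->seg_map in cache.c.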


* cache->ksets = kcalloc(cache->n_ksets, PCACHE_KSET_SIZE, GFP_KERNEL);
n_ksets is equal to the number of online CPUs, so this allocation could
exceed 32768 bytes; I'll switch it to kvcalloc().

* kset_onmedia = kzalloc(PCACHE_KSET_ONMEDIA_SIZE_MAX, GFP_KERNEL);
   - could these allocations be larger than 32768 bytes?

The size is fixed (currently 5144 bytes), and even if we extend it later,
it should not exceed 32768 bytes for the time being.


* There's cpu_to_le32 when you set sb->seg_num:
     sb->seg_num = cpu_to_le32(nr_segs);
   but no le32_to_cpu when you read it:
     ret = cache_dev_init(cache_dev, sb.seg_num);
   it seems that this will not work on big-endian machines.


Correct, it needs le32_to_cpu(); I missed that here.
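
The read path should mirror the write path, i.e. (using the fields from
the quoted code):

    /* sb->seg_num is stored little-endian on media, so convert on
     * read as well, or big-endian hosts see a byte-swapped value.
     */
    ret = cache_dev_init(cache_dev, le32_to_cpu(sb.seg_num));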


Just a suggestion for a performance improvement: when you hold a spinlock
and you need to allocate some memory, you must drop the spinlock, allocate
with GFP_NOIO, reacquire the spinlock and recheck the state. You can
improve this logic by allocating with mempool_alloc(GFP_NOWAIT) while you
hold the spinlock (GFP_NOWAIT will not sleep, so it's allowed), and fall
back to dropping the spinlock only if the GFP_NOWAIT allocation fails.


Sounds like a good suggestion; I will try it in V5.
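
For the record, a minimal sketch of the suggested pattern (the pool,
lock, and req names are placeholders, not the driver's actual code):

    spin_lock(&cache->lock);
    req = mempool_alloc(pool, GFP_NOWAIT);        /* never sleeps */
    if (!req) {
            spin_unlock(&cache->lock);
            req = mempool_alloc(pool, GFP_NOIO);  /* may sleep, cannot fail */
            spin_lock(&cache->lock);
            /* The state may have changed while the lock was dropped,
             * so it must be rechecked before req is used.
             */
    }
    ...
    spin_unlock(&cache->lock);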


Thanx


Mikulas


