Do you have BMC performance/thermal profiles set to performance? Do you have TuneD installed with the latency-performance profile active to prevent deep C-states?
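A quick way to verify, assuming tuned-adm and cpupower are available on the node (just a sketch; BIOS/BMC profile names and idle drivers vary by platform):

    # show the active TuneD profile, and switch it if needed
    tuned-adm active
    tuned-adm profile latency-performance

    # confirm which C-states the kernel is still allowed to enter
    cpupower idle-info
    cat /sys/module/intel_idle/parameters/max_cstate   # Intel hosts with intel_idle only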
As people start deploying Crimson I'm curious how much these affect the experience.

> On Aug 11, 2025, at 11:44 PM, Ki-taek Lee <ktlee4...@gmail.com> wrote:
>
> Hello Ceph community,
>
> I am evaluating Crimson OSD + Seastore performance for a potential
> deployment in a distributed storage environment. With BlueStore, I have
> been able to achieve satisfactory 4K random read/write IOPS in my FIO
> tests.
>
> However, when testing Crimson OSD + Seastore, I observed that 4K random
> read/write IOPS do not scale as expected as the number of SSDs/OSDs
> increases. Performance plateaus beyond a certain point or is much lower
> than expected. (See the attached test results.)
>
> Test environment:
> - Cluster: 8 clients, 1 OSD node
> - Hardware: 40-core CPUs, 377 GiB DRAM
> - Image SHA (quay.io): e0543089a9e9cae97999761059eaccdf6bb22e9e
> - Configuration parameters:
>     osd_memory_target = 34359738368
>     crimson_osd_scheduler_concurrency = 0
>     seastore_max_concurrent_transactions = 16
>     crimson_osd_obc_lru_size = 8192
>     seastore_cache_lru_size = 16G
>     seastore_obj_data_write_amplification = 4
>     seastore_journal_batch_capacity = 1024
>     seastore_journal_batch_flush_size = 256M
>     seastore_journal_iodepth_limit = 16
>     seastore_journal_batch_preferred_fullness = 0.8
>     seastore_segment_size = 128M
>     seastore_device_size = 512G
>     seastore_block_create = true
>     seastore_default_object_metadata_reservation = 1073741824
>     rbd_cache = false
>     rbd_cache_writethrough_until_flush = true
>     rbd_op_threads = 16
>
> Replication policy:
> - 4096 PGs, no replication (only 1 copy)
>
> Test results:
>
> 1 SSD test (varying number of allocated CPUs; alien threads on CPUs 26-29 and 36-39):
>   num CPU | 4k randread (IOPS) | 4k randwrite (IOPS) | Allocated CPU set
>         2 |             126772 |               14830 | 0-1
>         4 |             107860 |               16451 | 0-3
>         6 |             113741 |               17019 | 0-5
>         8 |             132060 |               16099 | 0-7
>
> SSD scaling test (2 CPUs per SSD):
> OSD CPU mapping: OSD.0 (0-1), OSD.1 (10-11), OSD.2 (2-3), OSD.3 (12-13),
> ..., OSD.15 (34-35); alien threads on CPUs 26-29 and 36-39
>   num SSD | 4k randread (IOPS) | 4k randwrite (IOPS)
>         4 |             861273 |               22360
>         8 |            1022793 |               22786
>        12 |            1019161 |               21211
>        16 |             927570 |               20502
>
> SSD scaling test (1 CPU per SSD):
> OSD CPU mapping: OSD.0 (0), OSD.1 (10), OSD.2 (2), OSD.3 (12), ...,
> OSD.15 (24); alien CPUs: 1, 11, 3, 13, ..., 15, 25
>   num SSD | 4k randread (IOPS) | 4k randwrite (IOPS)
>         4 |             936685 |               13730
>         8 |            1048204 |               18259
>        12 |             922727 |               23078
>        16 |             987838 |               30792
>
> Questions:
> 1. Since Seastore is still under active development, are there any known
>    unresolved performance issues that could explain this scaling behavior?
> 2. Are there recommended tuning parameters for improving small-block read
>    scalability in multi-SSD configurations?
> 3. Regarding alien threads, are there best practices for CPU pinning or
>    NUMA-aware placement that have shown measurable improvements?
> 4. Any additional guidance for maximizing IOPS with Crimson OSD + Seastore
>    would be greatly appreciated.
>
> My goal is to be ready to switch from BlueStore to Crimson + Seastore once
> it becomes stable and shows reasonable performance compared to BlueStore,
> so I'd like to understand the current limitations and tuning opportunities.
>
> Thank you,
> Ki-taek Lee
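Regarding question 3 (alien-thread placement): one thing worth trying is to keep the Seastar reactors, the alien threads, and the NVMe device on the same NUMA node, and to avoid having reactors and alien threads share cores. A minimal sketch, assuming the crimson_seastar_cpu_cores and crimson_alien_thread_cpu_cores options exist in your build (verify the names with 'ceph config help <option>'; nvme0 and the core numbers below are only examples):

    # check NUMA topology and which node the NVMe controller sits on
    numactl --hardware
    cat /sys/class/nvme/nvme0/device/numa_node

    # per-OSD pinning example: reactors on cores 0-1, alien threads on 26-27,
    # all chosen from the device's NUMA node
    ceph config set osd.0 crimson_seastar_cpu_cores 0-1
    ceph config set osd.0 crimson_alien_thread_cpu_cores 26-27

Treat the option names and core layout as assumptions to adapt to your topology rather than a known-good recipe.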