Hi Ki-taek,

The low write performance is a known issue, and as far as I know Sam is actively working on it. I believe there are some significant changes to the write path coming, but for the moment it's expected that Seastore is slower than BlueStore for small random writes.

Out of curiosity, would you mind sharing what benchmark you are using and how you are invoking it?
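If you can, please include the full command line or job file, e.g. something along the lines of the following (the ioengine, iodepth, and numjobs values here are just placeholders, not a recommendation):

    fio --name=randwrite-4k --ioengine=rbd --clientname=admin \
        --pool=rbd --rbdname=fio-test --rw=randwrite --bs=4k \
        --iodepth=32 --numjobs=8 --direct=1 --time_based \
        --runtime=300 --group_reporting

Details like whether the clients go through librbd or krbd, and the per-client iodepth and numjobs, matter a lot for small random IO.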


Thanks,

Mark


On 8/11/25 23:42, Ki-taek Lee wrote:
Hello Ceph community,

I am evaluating Crimson OSD + Seastore performance for potential deployment
in a distributed storage environment.
With BlueStore, I have been able to achieve satisfactory performance in my FIO
tests for 4K random read/write IOPS.

However, when testing Crimson OSD + Seastore, I observed that 4K random
read/write IOPS do not scale as expected when increasing the number of
SSDs/OSDs. The performance plateaus beyond a certain point or is much lower
than expected. (See attached test results.)

Test Environment:
- Cluster: 8 clients, 1 OSD per SSD
- Hardware: 40-core CPUs, 377 GiB DRAM
- Image SHA (quay.io): e0543089a9e9cae97999761059eaccdf6bb22e9e
- Configuration parameters:
     osd_memory_target = 34359738368
     crimson_osd_scheduler_concurrency = 0
     seastore_max_concurrent_transactions = 16
     crimson_osd_obc_lru_size = 8192
     seastore_cache_lru_size = 16G
     seastore_obj_data_write_amplification = 4
     seastore_journal_batch_capacity = 1024
     seastore_journal_batch_flush_size = 256M
     seastore_journal_iodepth_limit = 16
     seastore_journal_batch_preferred_fullness = 0.8
     seastore_segment_size = 128M
     seastore_device_size = 512G
     seastore_block_create = true
     seastore_default_object_metadata_reservation = 1073741824
     rbd_cache = false
     rbd_cache_writethrough_until_flush = true
     rbd_op_threads = 16
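
(For reference, the runtime equivalent of the settings above would be something
along the lines of

     ceph config set osd seastore_cache_lru_size 16G
     ceph config set osd seastore_journal_batch_flush_size 256M
     ceph config set osd crimson_osd_obc_lru_size 8192

and so on for the rest of the list; the rbd_* options are client-side librbd
settings.)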

Replication policy:
- 4096 PGs, no replication (only 1 copy)
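
For reference, a size-1 pool like this can be created along these lines
("rbd-test" is just a placeholder name):

     ceph config set mon mon_allow_pool_size_one true
     ceph osd pool create rbd-test 4096 4096 replicated
     ceph osd pool set rbd-test size 1 --yes-i-really-mean-it
     ceph osd pool set rbd-test min_size 1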

Test Results:

1 SSD test (varying number of allocated CPUs; alien thread CPUs: 26-29, 36-39):
num CPU | 4k randread (IOPS) | 4k randwrite (IOPS) | Allocated CPU sets
2       | 126772             | 14830               | 0-1
4       | 107860             | 16451               | 0-3
6       | 113741             | 17019               | 0-5
8       | 132060             | 16099               | 0-7

SSD scaling test (2 CPUs per SSD):
OSD CPU mapping: OSD.0 (0-1), OSD.1 (10-11), OSD.2 (2-3), OSD.3 (12-13),
..., OSD.15 (34-35), Alien threads (26-29, 36-39)
num SSD | 4k randread (IOPS) | 4k randwrite (IOPS)
4       | 861273             | 22360
8       | 1022793            | 22786
12      | 1019161            | 21211
16      | 927570             | 20502

SSD scaling test (1 CPU per SSD):
OSD CPU mapping: OSD.0 (0), OSD.1 (10), OSD.2 (2), OSD.3 (12), ..., OSD.15
(24), Alien CPUs: 1, 11, 3, 13, ..., 15, 25
num SSD | 4k randread (IOPS) | 4k randwrite (IOPS)
4       | 936685             | 13730
8       | 1048204            | 18259
12      | 922727             | 23078
16      | 987838             | 30792

Questions:
1. Since Seastore is still under active development, are there any known
unresolved performance issues that could explain this scaling behavior?
2. Are there recommended tuning parameters for improving small-block read
scalability in multi-SSD configurations?
3. Regarding alien threads, are there best practices for CPU pinning or
NUMA-aware placement that have shown measurable improvements?
4. Any additional guidance for maximizing IOPS with Crimson OSD + Seastore
would be greatly appreciated.

My goal is to be ready to switch from BlueStore to Crimson + Seastore once it
becomes stable and delivers performance comparable to BlueStore, so I'd like to
understand the current limitations and tuning opportunities.

Thank you,
Ki-taek Lee

--
Best Regards,
Mark Nelson
Head of Research and Development

Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
