Hello Ceph community,

I am evaluating Crimson OSD + Seastore performance for potential deployment
in a distributed storage environment.
With BlueStore, I have been able to achieve satisfactory 4K random
read/write IOPS in my FIO tests.

However, when testing Crimson OSD + Seastore, I observed that 4K random
read/write IOPS do not scale as expected when increasing the number of
SSDs/OSDs: the IOPS either plateau beyond a certain point or remain much
lower than expected (see the test results below).
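
In case it helps interpret the numbers: each client runs a plain 4K random
job through fio's rbd engine, roughly of the shape below. The client, pool
and image names and the iodepth are placeholders here, not my exact values.

    # 4K random read against one RBD image; for writes, change --rw=randwrite
    fio --name=4k-randread --ioengine=rbd --clientname=admin \
        --pool=rbd --rbdname=fio_test --rw=randread --bs=4k \
        --iodepth=32 --direct=1 --time_based --runtime=300 \
        --group_reporting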

Test Environment:
- Cluster: 8 clients, 1 OSD node
- Hardware: 40-core CPUs, 377 GiB DRAM
- Image SHA (quay.io): e0543089a9e9cae97999761059eaccdf6bb22e9e
- Configuration parameters:
    osd_memory_target = 34359738368
    crimson_osd_scheduler_concurrency = 0
    seastore_max_concurrent_transactions = 16
    crimson_osd_obc_lru_size = 8192
    seastore_cache_lru_size = 16G
    seastore_obj_data_write_amplification = 4
    seastore_journal_batch_capacity = 1024
    seastore_journal_batch_flush_size = 256M
    seastore_journal_iodepth_limit = 16
    seastore_journal_batch_preferred_fullness = 0.8
    seastore_segment_size = 128M
    seastore_device_size = 512G
    seastore_block_create = true
    seastore_default_object_metadata_reservation = 1073741824
    rbd_cache = false
    rbd_cache_writethrough_until_flush = true
    rbd_op_threads = 16

Replication policy:
- 4096 PGs, no replication (only 1 copy)
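
The pool was created roughly as follows (pool name is just an example;
depending on the release, mon_allow_pool_size_one may also need to be
enabled before size 1 is accepted):

    # 4096 PGs, single copy -- test setup only, no redundancy
    ceph osd pool create rbd 4096 4096 replicated
    ceph osd pool set rbd size 1 --yes-i-really-mean-it
    ceph osd pool set rbd min_size 1
    ceph osd pool application enable rbd rbd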

Test Results:

1 SSD test (varying number of allocated CPUs, alien threads = 26-29, 36-39):
num CPUs | 4K randread (IOPS) | 4K randwrite (IOPS) | allocated CPU set
2        | 126772             | 14830               | 0-1
4        | 107860             | 16451               | 0-3
6        | 113741             | 17019               | 0-5
8        | 132060             | 16099               | 0-7

SSD scaling test (2 CPUs per SSD):
OSD CPU mapping: OSD.0 (0-1), OSD.1 (10-11), OSD.2 (2-3), OSD.3 (12-13),
..., OSD.15 (34-35), Alien threads (26-29, 36-39)
num SSDs | 4K randread (IOPS) | 4K randwrite (IOPS)
4        | 861273             | 22360
8        | 1022793            | 22786
12       | 1019161            | 21211
16       | 927570             | 20502

SSD scaling test (1 CPU per SSD):
OSD CPU mapping: OSD.0 (0), OSD.1 (10), OSD.2 (2), OSD.3 (12), ..., OSD.15
(24), Alien CPUs: 1, 11, 3, 13, ..., 15, 25
num SSDs | 4K randread (IOPS) | 4K randwrite (IOPS)
4        | 936685             | 13730
8        | 1048204            | 18259
12       | 922727             | 23078
16       | 987838             | 30792
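
For completeness, the CPU pinning above was done with the
crimson_seastar_cpu_cores and crimson_alien_thread_cpu_cores options,
e.g. for the first OSDs of the 2-CPUs-per-SSD run (shown as ceph config
set commands; the same keys can also live in ceph.conf). Please correct
me if Seastore deployments are expected to use a different pinning
mechanism.

    # Seastar reactor pinning (pattern continues for the remaining OSDs)
    ceph config set osd.0 crimson_seastar_cpu_cores 0-1
    ceph config set osd.1 crimson_seastar_cpu_cores 10-11
    ceph config set osd.2 crimson_seastar_cpu_cores 2-3
    # alien threads share one CPU set across all OSDs
    ceph config set osd crimson_alien_thread_cpu_cores 26-29,36-39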

Questions:
1. Since Seastore is still under active development, are there any known
unresolved performance issues that could explain this scaling behavior?
2. Are there recommended tuning parameters for improving small-block read
scalability in multi-SSD configurations?
3. Regarding alien threads, are there best practices for CPU pinning or
NUMA-aware placement that have shown measurable improvements?
4. Any additional guidance for maximizing IOPS with Crimson OSD + Seastore
would be greatly appreciated.

My goal is to be ready to switch from BlueStore to Crimson + Seastore once
it is stable and performs comparably to BlueStore, so I’d like to understand
the current limitations and tuning opportunities.

Thank you,
Ki-taek Lee
