Hello from Detroit MI:

I have been doing some limited benchmarking of a Squid cluster. The arrangement of the cluster:
Server   Function
c01      MGR, MON
c02      MGR, MON
o01      OSD
o02      OSD
o03      OSD
o04      OSD

Each OSD host has 2 x NVMe disks for Ceph, 370 GB each.

The backing network is as follows (the ceph.conf network split is sketched just after the list):
ens18        1 Gbit, public/mon network (192.168.0.0/23), standard MTU (1500)
ens19        2.5 Gbit, cluster network (10.0.0.0/24), jumbo MTU (9000)
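
To be explicit about the split, the relevant part of ceph.conf looks roughly like this (paraphrased, not a verbatim copy of my config):

    [global]
    public_network  = 192.168.0.0/23   # ens18, 1 Gbit, MTU 1500
    cluster_network = 10.0.0.0/24      # ens19, 2.5 Gbit, MTU 9000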

Behind all this is a small Proxmox cluster.  All of the Ceph machines are running on a single node.  All of the Ceph drives are exposed on an NFS-mounted NVMe drive.  All Ceph OSD drives are mounted with no cache and a single controller per drive.  The network bridges are all set to either MTU 9000 or MTU 1500 as appropriate.
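
To rule out an MTU mismatch somewhere in the bridge chain, jumbo frames can be verified end to end with something like the following (10.0.0.3 is only a stand-in for o01's ens19 address):

    ip link show ens19                  # should report mtu 9000
    ping -M do -s 8972 -c 3 10.0.0.3    # 8972 payload + 28 header bytes = 9000; must not fragment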

iPerf3 shows 2.46 Gbit/sec between servers c01 and o01 on the ens19 network.  The firewall is off all the way around.  The OS is CentOS 10.  SELinux is disabled.  No network tuning has been done (no increased send/receive buffer sizes, queue lengths, etc.).
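
The iPerf3 run was along these lines (again, 10.0.0.3 stands in for o01's ens19 address):

    # on o01
    iperf3 -s
    # on c01
    iperf3 -c 10.0.0.3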

The concern, given all this: rados bench can't exceed 110 MB/s in any test.  In fact, if I didn't know better I would swear the traffic is either being throttled or is somehow routing through a 1 Gbit network.  The numbers coming back from rados bench look like saturation at gigabit speed and show no evidence of a 2.5 Gbit link.  Monitoring at both the Ceph and Proxmox consoles confirms the same.  Cluster traffic is confirmed to be going out ens19 (verified via tcpdump).
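
For what it's worth, the back-of-envelope numbers behind "looks like gigabit":

    110 MB/s x 8    = 880 Mbit/s   -> essentially a saturated 1 GbE link
    2.46 Gbit/s / 8 = ~307 MB/s    -> what I would hope to approach over ens19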

Typical command line used for rados bench: rados bench -p s3block 20 write
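
For anyone wanting to reproduce, the full write/read/cleanup cycle would look something like this (--no-cleanup keeps the benchmark objects so the read tests have something to read; 4 MB objects and 16 concurrent ops are the rados bench defaults):

    rados bench -p s3block 20 write --no-cleanup
    rados bench -p s3block 20 seq
    rados bench -p s3block 20 rand
    rados -p s3block cleanup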

What the heck am I doing wrong here?

Ron Gage
