Hello from Detroit MI:
I have been doing some limited benchmarking of a Squid cluster. The
cluster is laid out as follows:
Server   Function
c01      MGR, MON
c02      MGR, MON
o01      OSD
o02      OSD
o03      OSD
o04      OSD
Each OSD node has 2 x NVMe disks for Ceph, 370 GB each.
The backing network is as follows:
ens18: 1 Gbit, mon-ip network (192.168.0.0/23), standard MTU (1500)
ens19: 2.5 Gbit, cluster network (10.0.0.0/24), jumbo MTU (9000)
Behind all this is a small Proxmox cluster. All Ceph machines are
running on a single node. All Ceph drives are exposed from an
NFS-mounted NVMe drive. All Ceph OSD drives are attached with no cache
and a single controller per drive. Networking bridges are all set to
either MTU 9000 or MTU 1500 as appropriate.
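For what it's worth, jumbo frames on the ens19 path can be
sanity-checked with a don't-fragment ping; the 10.0.0.x address below
is a placeholder for an actual cluster-network IP:

  # 8972 = 9000 MTU minus 20 bytes IP header and 8 bytes ICMP header;
  # -M do sets the don't-fragment bit so oversized frames fail loudly
  ping -M do -s 8972 -c 3 10.0.0.12
  # confirm the MTU actually applied on the interface
  ip link show ens19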
iperf3 shows 2.46 Gbit/sec between servers c01 and o01 on the ens19
network. Firewalls are off all the way around. The OS is CentOS 10, and
SELinux is disabled. No network tuning has been done (increased
send/receive buffer sizes, queue lengths, etc.).
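For reference, the iperf3 figure came from a run of roughly this form
(exact flags from memory; the 10.0.0.x address is a placeholder for
o01's cluster-network IP):

  # on o01
  iperf3 -s
  # on c01, pointed at o01's cluster-network address, 30-second run
  iperf3 -c 10.0.0.11 -t 30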
The concern, given all this: rados bench can't exceed 110 MB/s in any
test. In fact, if I didn't know better I would swear the traffic is
either being throttled or is somehow routing through a 1 Gbit network.
The numbers coming back from rados bench look like gigabit saturation
and show no evidence of being on a 2.5 Gbit network. Monitoring from
both the Ceph and Proxmox consoles confirms the same. Cluster traffic
is confirmed to be going out ens19 (verified via tcpdump).
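The tcpdump check was roughly the following, run on an OSD host while a
bench was in flight (OSD messengers default to ports 6800-7300):

  # watch the cluster interface for OSD-to-OSD traffic during the test
  tcpdump -i ens19 -nn 'net 10.0.0.0/24 and portrange 6800-7300'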
Typical command line used for rados bench: rados bench -p s3block 20 write
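For context, that command uses the rados bench defaults of 16
concurrent ops and 4 MB objects. Written out explicitly, with the
optional read pass and cleanup the tool supports, the equivalent is:

  # same as the bare write test above: 16 concurrent ops, 4 MB objects,
  # but keeping the objects so a read test can follow
  rados bench -p s3block 20 write -t 16 -b 4194304 --no-cleanup
  # sequential read back of those objects
  rados bench -p s3block 20 seq
  # remove the benchmark objects when done
  rados -p s3block cleanup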
What the heck am I doing wrong here?
Ron Gage
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io