Hi all,

I've been seeing odd performance behavior when running FIO with the RBD
engine directly against an RBD volume with numjobs > 1. For a 4KB random
write test at QD 32 and numjobs=1, I get about 40K IOPS, but when I increase
numjobs to 4, it plummets to 2800 IOPS. I ran the exact same test in a VM
using FIO with libaio against a block device (volume) attached through
QEMU/RBD and got ~35K-40K IOPS in both cases. In all runs the CPU was not
fully utilized and there were no signs of a hardware bottleneck. I have not
disabled any RBD features, and most of the Ceph parameters are at their
defaults (besides auth, debug, pool size, etc.).
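
For reference, the rbd-engine job looks roughly like the following (the
pool/image names, client name, and runtime below are placeholders, not my
exact values):

[global]
# fio rbd engine talks to the cluster directly via librbd
ioengine=rbd
clientname=admin
# hypothetical pool/image used for the test
pool=rbd
rbdname=fio-test
rw=randwrite
bs=4k
iodepth=32
direct=1
time_based=1
runtime=120
group_reporting=1

[rbd-randwrite]
# numjobs=1 gives ~40K IOPS, numjobs=4 drops to ~2800 IOPS
numjobs=4

The in-VM run is essentially the same job with ioengine=libaio and
filename= pointing at the virtio block device backed by the RBD volume.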

The Ceph cluster runs on 6 all-NVMe nodes (22 cores, 376GB mem each) with
Luminous 12.2.1 on Ubuntu 16.04, and the clients running the FIO jobs/VM are
on a similar HW/SW spec. The VM has 16 vCPUs and 64GB mem; its root disk is
stored locally, while the persistent disk is an RBD volume served by the Ceph
cluster.

If anyone has seen this issue or has any suggestions, please let me know.

Thanks,
Orlando