Hi Philip,

I'm not sure if we're talking about the same thing, but I was also confused when I didn't see 100% OSD drive utilization during my first RBD write benchmark. Since then I've been collecting all my confusion here: https://yourcmc.ru/wiki/Ceph_performance :)

100% RBD utilization means that, at every moment, something is waiting for some I/O operation on that device to complete.

This "something" (client software) can't produce more I/O operations while it's waiting for previous ones to complete, that's why it can't saturate your OSDs and your network.

OSDs can't send more write requests to the drives while they're still busy calculating object state on the CPU or doing network I/O. That's why the OSDs can't saturate the drives.

Simply put: Ceph is slow. Partly because of the network round trips (you have 3 of them: client -> iSCSI -> primary OSD -> secondary OSDs), and partly because it's just slow.
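
To put rough numbers on it (purely assumed, not measured on your cluster): if each hop costs ~0.2 ms on the wire and each daemon in the chain adds a few hundred microseconds of processing, a single replicated 4K write easily ends up at 1.5-2 ms, i.e. at most ~500-700 IOPS for any client that waits for each write to finish before issuing the next, no matter how fast the underlying drives are.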

Of course it's not TERRIBLY slow, so software that can send I/O requests in batches (i.e. that uses async I/O) feels fine. But software that sends I/Os one by one (because of transactional requirements, or just stupidity, like Oracle) runs very slowly.
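
A toy sketch of that difference in Python (this is not Ceph code; the 1 ms sleep is an assumption standing in for one RBD write round-trip):

    import time
    from concurrent.futures import ThreadPoolExecutor

    LATENCY_S = 0.001   # assumed per-write latency, stand-in for one RBD round-trip
    N_OPS     = 1000

    def fake_write(_):
        time.sleep(LATENCY_S)   # pretend this is one synchronous write

    # One I/O at a time ("Oracle-style": wait for each write before the next one)
    t0 = time.time()
    for i in range(N_OPS):
        fake_write(i)
    print("serial:  ~%d IOPS" % (N_OPS / (time.time() - t0)))

    # 16 I/Os in flight at once (what batched / async submission looks like)
    t0 = time.time()
    with ThreadPoolExecutor(max_workers=16) as pool:
        list(pool.map(fake_write, range(N_OPS)))
    print("batched: ~%d IOPS" % (N_OPS / (time.time() - t0)))

The per-op latency is the same in both cases; the only thing that changes is how many requests are allowed to be in flight at once, and that alone decides whether your OSDs and network ever get saturated.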

Also..

"It seems like your RBD can't flush it's I/O fast enough"
implies that there is some particular measure of "fast enough", that
is a tunable value somewhere.
If my network cards arent blocked, and my OSDs arent blocked...
then doesnt that mean that I can and should "turn that knob" up?
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
