On 12/16/25 3:37 PM, Anthony D'Atri via ceph-users wrote:
I'm trying to debug low performance on an NVMe-based cluster.
Which specific drives?

INTEL SSDPF2KE032T1

Per specs it should give about 300k IOPS.

I have 24 NVMe drives in 4 servers, plenty of CPU, the cluster is perfectly balanced, no 
scrubbing or replication at the moment, and 1024 PGs for 24 OSDs / 17 TB.

I expect to see reasonable performance (~250k IOPS total, 50% r/w, a few hundred 
volumes capped by IOPS, pre-warmed). I see about half of that.

I looked at drive utilization; it's about 70% (per atop).
That's meaningless for SSDs.

Yes, I looked deeper, and I found that most of the time there are 0 or 1, very rarely 2, in-flight operations, and all of them complete (from the NVMe's point of view) within 50-70µs. At the same time (under constant load from fio) I see about 10-20 op_wip.
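In case anyone wants to reproduce the device-side part: the in-flight counts can be sampled straight from sysfs while fio runs. A rough sketch below; the device names are placeholders for the actual OSD data disks. The op_wip figure comes from the OSD admin socket perf dump.

    # Poll /sys/block/<dev>/inflight (reads and writes currently queued to the
    # device) at ~1 ms intervals and histogram the total queue depth.
    import time
    from collections import Counter

    DEVICES = ["nvme0n1", "nvme1n1"]        # placeholders, adjust to the OSD devices
    SAMPLES, INTERVAL = 10_000, 0.001       # ~10+ seconds of 1 ms samples

    hist = {dev: Counter() for dev in DEVICES}
    for _ in range(SAMPLES):
        for dev in DEVICES:
            with open(f"/sys/block/{dev}/inflight") as f:
                reads, writes = map(int, f.read().split())
            hist[dev][reads + writes] += 1
        time.sleep(INTERVAL)

    for dev, counts in hist.items():
        total = sum(counts.values())
        dist = ", ".join(f"{d}: {n / total:.1%}" for d, n in sorted(counts.items()))
        print(f"{dev}: queue depth distribution {dist}")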

So my current theory goes like this: Ceph accepts about 20 requests, processes them slowly, and then writes to the device (which is super fast), so the device sits idle most of the time. The OSD daemons are not CPU/memory constrained (~40 cores idle, plenty of memory), so it's just a disparity between Ceph OSD speed and backend speed.
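Back-of-the-envelope with Little's law, using the numbers above (treating op_wip as the per-OSD concurrency and "about half" of 250k as the achieved rate are my assumptions):

    # Little's law (L = lambda * W): with ~20 ops in flight per OSD and the
    # observed throughput, almost all of an op's lifetime is spent inside the
    # OSD rather than at the drive.
    achieved_iops = 250_000 / 2       # "about half" of the expected 250k
    osds          = 24
    iops_per_osd  = achieved_iops / osds      # ~5.2k ops/s per daemon
    op_wip        = 20                        # in-flight ops per OSD under load
    residence_s   = op_wip / iops_per_osd     # avg time an op spends in the OSD
    device_time_s = 60e-6                     # ~50-70 us measured at the NVMe

    print(f"per-OSD throughput     ~ {iops_per_osd:,.0f} ops/s")
    print(f"time inside the OSD    ~ {residence_s * 1e3:.1f} ms per op")
    print(f"of which at the device ~ {device_time_s * 1e6:.0f} us "
          f"({device_time_s / residence_s:.1%})")

If that's roughly right, each drive is busy only a percent or two of the time, which matches the 0-1 in-flight picture.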


I remember that a few years ago (on lesser hardware) I did a benchmark of Ceph using brd (block RAM disk), and Ceph was able to churn out up to 10k IOPS per daemon. Nowadays it can do up to 20k, I think.

Actually, it's a good question: what is the maximum IOPS a single OSD daemon can deliver with perfectly fast underlying storage and a negligible network?
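For scale, some rough arithmetic (the last part assumes 3x replication and 50% writes, which is not what's running right now):

    # What the target means per daemon: 250k client IOPS over 24 OSDs with no
    # replication is ~10.4k ops/s per OSD, i.e. roughly 96 us of OSD time per op.
    target_iops     = 250_000
    osds            = 24
    per_osd_needed  = target_iops / osds
    per_op_budget_s = 1 / per_osd_needed

    print(f"needed per OSD (no replication): {per_osd_needed:,.0f} ops/s")
    print(f"per-op OSD time budget:          {per_op_budget_s * 1e6:.0f} us")

    # If 3x replication comes back, each client write becomes three OSD-level
    # writes; at 50% writes the backend load roughly doubles.
    write_frac  = 0.5
    backend_ops = target_iops * ((1 - write_frac) + 3 * write_frac)
    print(f"with 3x replication: {backend_ops:,.0f} backend ops/s "
          f"(~{backend_ops / osds:,.0f} per OSD)")

So a ~20k per-daemon ceiling leaves some headroom without replication, but would be right at the limit once 3x replication is back.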

but I've noticed that the in-flight count for the drives is basically around 1. That 
means that at any given time only one request is being processed. This is a match 
for the OSD count / 3 / latency formula, and with one in-flight operation the NVMe delivers about 
10% of its spec (Intel, DC grade).
Have you updated to the latest firmware with SST?

Yep. Certified and the most up to date firmware.
