> On Dec 22, 2025, at 6:47 AM, George Shuklin via ceph-users 
> <[email protected]> wrote:
> 
> On 12/16/25 3:37 PM, Anthony D'Atri via ceph-users wrote:
>>> I'm trying to debug low performance on an NVMe-based cluster.
>> Which specific drives?
> 
> INTEL SSDPF2KE032T1

P5620, excellent drive.  

We see people on the list using random client-class drives, so which drives are 
in use is one of the first questions to ask, because so many people don't 
understand the differences. I work with a community member who has his CephFS 
metadata pool on *Sabrent* client drives.  Poster child for "flirting with 
disaster".


> 
> Ceph accepts about 20 requests, processes them slowly, and then writes to the 
> devices (super fast), so the device sits idle most of the time. OSD daemons are 
> not CPU/memory constrained (~40 cores idle, plenty of memory), so it's just a 
> disparity between Ceph OSD speed and backend speed.

For sure there are serializations in the PG and OSD code, which is one reason 
why a healthy PG ratio matters. Remember that NVMe SSDs can process lots of 
operations in parallel, too.  Have you discussed this with Mark Nelson?
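
If you want to sanity-check the ratio, something like the sketch below will 
print PGs per OSD. Assumptions on my part: the `ceph` CLI is on the box and 
your keyring allows `ceph osd df`. Rule of thumb on this list has generally 
been on the order of 100-200 PGs per OSD.

#!/usr/bin/env python3
# Rough sketch: report PGs per OSD so you can eyeball the PG ratio.
# Assumes the `ceph` CLI is available and the keyring permits `ceph osd df`.
import json
import subprocess

out = subprocess.run(
    ["ceph", "osd", "df", "--format", "json"],
    check=True, capture_output=True, text=True,
).stdout
nodes = json.loads(out)["nodes"]

for node in sorted(nodes, key=lambda x: x["pgs"]):
    print(f"osd.{node['id']:<4} class={node.get('device_class', '?'):<6} "
          f"pgs={node['pgs']}")

pgs = [node["pgs"] for node in nodes]
if pgs:
    print(f"min={min(pgs)} max={max(pgs)} avg={sum(pgs)/len(pgs):.1f}")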

> I remember, a few years ago (on lesser hardware), I did a benchmark for Ceph 
> using brd (block RAM disk), and Ceph was able to churn up to 10k IOPS per 
> daemon. Nowadays it can do up to 20k, I think.

Mark would know for sure. When doing such tests, be sure that your pool is 
size=1 so that you aren't constrained by replication ops. In practice, network 
latency and replication are often significant factors in speed. 
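
For the curious, a throwaway setup along these lines lets you compare queue 
depth 1 against a deeper queue. Sketch only: the pool name "qd-test" is made 
up, and since Octopus you need mon_allow_pool_size_one plus 
--yes-i-really-mean-it to set size=1.

#!/usr/bin/env python3
# Sketch: create a throwaway size=1 pool and compare rados bench at queue
# depth 1 vs 64.  Pool name "qd-test" is made up; destroy it afterwards.
import subprocess

def sh(*cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

sh("ceph", "config", "set", "mon", "mon_allow_pool_size_one", "true")
sh("ceph", "osd", "pool", "create", "qd-test", "128")
sh("ceph", "osd", "pool", "set", "qd-test", "size", "1",
   "--yes-i-really-mean-it")

for qd in ("1", "64"):
    # 4 KiB writes for 30 seconds at the given concurrency
    sh("rados", "bench", "-p", "qd-test", "30", "write", "-b", "4096", "-t", qd)

# Cleanup (also needs mon_allow_pool_delete=true on the mons).
sh("ceph", "osd", "pool", "rm", "qd-test", "qd-test",
   "--yes-i-really-really-mean-it")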

Also, is the subject system free from swap? Do you have a static 
osd_memory_target, or do you have the autotuner enabled?
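
A quick way to check all three at once, as a sketch; assumes you run it on the 
OSD host with the `ceph` CLI available:

#!/usr/bin/env python3
# Sketch: check for active swap and show the OSD memory-target settings.
import subprocess

# Any line beyond the header in /proc/swaps means swap is active.
with open("/proc/swaps") as f:
    swaps = f.readlines()[1:]
print("swap devices:", [line.split()[0] for line in swaps] or "none")

for opt in ("osd_memory_target", "osd_memory_target_autotune"):
    val = subprocess.run(
        ["ceph", "config", "get", "osd", opt],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    print(f"{opt} = {val}")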


> Actually, it's a good question: what is the maximum IOPS a single OSD daemon 
> can deliver with perfectly fast underlying storage and negligible network?

Mark territory.
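
If someone wants to poke at this empirically, a quick probe with the librados 
Python bindings shows the QD=1 ceiling directly, since at queue depth 1, IOPS 
is just 1/latency. Sketch only: assumes python3-rados, a default 
/etc/ceph/ceph.conf, and an existing test pool, here called "qd-test". Point 
it at a size=1 pool if you want to take replication out of the picture.

#!/usr/bin/env python3
# Sketch: measure synchronous (QD=1) 4 KiB write latency via librados and
# report the implied single-stream IOPS ceiling (IOPS = 1 / latency at QD=1).
# Assumes python3-rados, /etc/ceph/ceph.conf, and an existing pool "qd-test".
import time
import rados

N = 2000
payload = b"\0" * 4096

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("qd-test")
try:
    t0 = time.monotonic()
    for i in range(N):
        ioctx.write_full(f"probe-{i}", payload)  # one op in flight at a time
    elapsed = time.monotonic() - t0
    print(f"{N} writes in {elapsed:.2f}s -> {N/elapsed:.0f} IOPS, "
          f"{elapsed/N*1e3:.2f} ms/op")
finally:
    ioctx.close()
    cluster.shutdown()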

> 
>>> but I've noticed that in-flight for the drives is basically around 1. That 
>>> means that at a given time only one request is being processed. This matches 
>>> the OSD count / 3 / latency formula, and with one in-flight request the NVMe 
>>> is showing about 10% of its spec (Intel, DC grade).
>> Have you updated to the latest firmware with SST?
> 
> Yep. Certified and the most up to date firmware.

Good, with the latest SST? Or if you got the drives from Dell, DSU? 

I worked for the Solidigm product team for a year; some of the firmware updates 
are very important.


_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
