On 8/2/2025 4:07 PM, Anthony D'Atri wrote:
Here is the actual performance of the NFS mounted drive:
[root@o01 ~]# dd if=/dev/sdc of=/dev/null bs=4k status=progress
Null writes aren’t a good test, as they may be optimized away by any layer in
the stack. I suggest repeating with /dev/urandom.
I'm confused - why would I use urandom for testing raw read performance
on a local drive at the OSD level?
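(For what it's worth, the usual concern with a dd read test is the page cache
rather than the data pattern; one way to be sure the reads actually hit the
device is direct I/O. A sketch, with the fio job parameters chosen arbitrarily
here:)

# dd with O_DIRECT so reads bypass the page cache; larger block size just for bandwidth
dd if=/dev/sdc of=/dev/null bs=4M iflag=direct status=progress

# or a short fio random-read job against the same device
fio --name=rawread --filename=/dev/sdc --rw=randread --bs=4k \
    --direct=1 --ioengine=libaio --iodepth=32 --runtime=30 --time_based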
The NVMe is a Samsung 990 Plus. Not exactly enterprise grade, but it should do
fairly well. It is also fairly new, having picked it up yesterday. It's not
going to be 100 Mbit, for sure.
Have you applied the most recent firmware? Issues have been reported in the
past. Note that in a power loss situation you may corrupt or lose data due to
the apparent lack of PLP.
With updated firmware this is still a 0.33 DWPD drive, fwiw.
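(If it helps, the installed firmware revision can be read without any vendor
tooling; the device path below is a placeholder for whatever the NAS enumerates:)

smartctl -a /dev/nvme0 | grep -i firmware
nvme id-ctrl /dev/nvme0 | grep -w fr     # nvme-cli; "fr" is the firmware revision field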
No on the recent firmware. Again, the raw read data already posted and
the ceph tell bench data all support the idea that I'm getting over 2 Gb/s
on this drive at the endpoint. Also, the DWPD rating is not germane to
this subject at all. Yes, it is low for a 4 TB device, but this is a test
bench, not an enterprise SAN.
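(For reference, and assuming the default arguments were used, the OSD bench
invocation looks roughly like this; osd.0 is a placeholder id:)

ceph tell osd.0 bench                       # defaults: write 1 GiB total in 4 MiB blocks
ceph tell osd.0 bench 1073741824 4194304    # same thing spelled out: total bytes, block size

Note that osd bench exercises the OSD write path, BlueStore through the NFS
file on the USB NVMe, so its numbers aren't directly comparable to a raw
local read with dd.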
The NVMe connection path:
NVMe -> USB-C interface -> NAS server (Ubuntu 24.04) -> 2.5 Gbit Ethernet (NFS)
-> Proxmox vmbr1
Why the deep stack? Why not have the OSD drives in cluster nodes with Ceph
deployed converged? That would be a lot less complicated.
Yes, it would be a lot less complicated if I had the physical space to
put it, plus exporting it via NFS makes future sharing to the rest of this
cluster possible.
* Are all of the OSDs exported from the same NAS? SPoF
This is a test lab.
* The USB layer introduces latency
* Since you’re exporting via NFS, I assume that the USB M.2 drives have a local
filesystem built on the NAS node, and large files created to export? The
filesystem layer introduces additional latency.
* The NFS layer introduces latency
* Proxmox drive / network emulation introduces additional latency
* If your cluster network is virtualized through Proxmox SDN, that’s additional
latency. Remember that every write is farmed out to other OSDs and has to go
through all those layers.
With all these factors, honestly you're getting better perf than I would have
expected.
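(One way to see where in that stack the time goes, as a rough sketch with
placeholder paths and arbitrary job parameters, is to run the same small-block
fio job at each layer and compare completion latencies:)

# on the NAS, against the local filesystem on the USB NVMe
fio --name=lat --filename=/mnt/usb-nvme/lattest --rw=randwrite --bs=4k \
    --iodepth=1 --direct=1 --size=256M --runtime=30 --time_based --ioengine=libaio

# repeat against a file on the NFS mount from a Proxmox node, then inside a VM;
# the difference in latency between runs is roughly what each layer adds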
I am getting exactly what I was expecting - near/at saturation levels on
my 2.5G ethernet network.
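(If it's useful to confirm that, a quick iperf3 run between the NAS and a
Proxmox node shows the raw TCP ceiling of the link; the hostname below is a
placeholder. 2.5GBASE-T tops out at roughly 2.35 Gbit/s of TCP throughput in
practice.)

iperf3 -s                      # on the NAS
iperf3 -c nas.lab.local -t 30  # from a Proxmox node or VM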
I am going to move forward with the (probably mistaken) premise that the
bench function is the problem here. On to other testing: S3 performance.
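(In case it's useful: a crude but dependency-light S3 check is just timing a
large object through RGW with s3cmd; bucket and file names below are
placeholders. Dedicated tools such as MinIO's warp or hsbench will give
proper mixed-workload numbers.)

dd if=/dev/urandom of=/tmp/s3test.bin bs=4M count=256          # ~1 GiB test object
time s3cmd put /tmp/s3test.bin s3://testbucket/s3test.bin
time s3cmd get s3://testbucket/s3test.bin /tmp/s3test.copy --force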
Ron Gage
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io