Hi,

A maybe even more controversial take: have you considered refurbished? Why not go "wide" in your cluster design, with more but less powerful nodes? Ceph is software-defined, not hardware-defined. As long as you make "sane" hardware choices, Ceph will run on it just fine.
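To put rough numbers on the "wide" argument, here is a minimal back-of-the-envelope sketch. It assumes identical nodes, data spread evenly across them, and recovery that rebalances evenly onto all surviving nodes. It deliberately ignores CRUSH failure-domain constraints (rooms, hosts), which can restrict where recovered data may land. The function names and the 60% starting fill level are illustrative only.

```python
def share_lost_per_node(nodes: int) -> float:
    """Fraction of the cluster (capacity and OSDs) offline when one node fails.

    Assumes all nodes are identical.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return 1.0 / nodes


def fill_after_failure(nodes: int, fill_before: float) -> float:
    """Average fill level of the survivors after the lost node's data is recovered.

    Assumes recovered data is spread evenly over the remaining nodes.
    """
    if nodes < 2:
        raise ValueError("need at least 2 nodes to recover onto survivors")
    return fill_before * nodes / (nodes - 1)


for n in (6, 10):
    print(
        f"{n} nodes: {share_lost_per_node(n):.1%} offline per node failure, "
        f"60% full becomes {fill_after_failure(n, 0.60):.1%} after recovery"
    )
# 6 nodes: 16.7% offline per node failure, 60% full becomes 72.0% after recovery
# 10 nodes: 10.0% offline per node failure, 60% full becomes 66.7% after recovery
```

Under these assumptions, each node in the wider cluster carries a smaller share, so a single failure moves less data and leaves the survivors with more headroom.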
You'll likely be able to match the performance of a 6-node "new" cluster with more (but less powerful) nodes. It'll consume more electricity, but 1 node dying in a 6-node cluster has a higher impact than 1 node dying in a 10-node cluster.

We stepped into Ceph on a mix of recently decommissioned hardware plus refurbished hardware to complete the cluster. I designed it with "refurbished" in mind. For example, the cluster switches are 4x redundant. They were €45 apiece and I needed 8, so why not? 🙂

One important aspect of buying refurbished, IMHO, is a reliable seller, so you know you're not on your own in case of hardware failure. So far we've been lucky with our seller. No shenanigans: if something's broken (which rarely happens, at least not more often than with newly bought hardware), we get a replacement sent to us, no questions asked.

Wannes

________________________________
From: GLE, Vivien <[email protected]>
Sent: Tuesday, November 18, 2025 13:47
To: [email protected] <[email protected]>
Subject: [ceph-users] New Cluster Ceph

Hi,

We plan to buy hardware for a new Ceph cluster and would like some validation of what we chose.

6 nodes DELL 6715 + 6 Powervault MD2412 enclosures

For each node =>
1 CPU AMD EPYC 9475F 3.65 GHz, 48C/96T, 256M cache (400 W)
RAM 16 GB x 16 = 256 GB
4x 100GB NVIDIA MELLANOX 100GB
HBA465e (external, 22.5 GB/s)
2 NVMe mixed 6.4 TB DBWAL
6 NVMe mixed 6.4 TB
5 NVMe read 15.36 TB

For each enclosure =>
12 HDD 20 TB

HDD will be a replica 3 pool with the DB/WAL.
NVMe mixed and read will be 2 different pools (we will test replica 3 and EC to see which performance/storage efficiency satisfies us the most).

This cluster will mostly be used to store block (Proxmox VMs/Kubernetes) and S3.

The CRUSH map will be like so: 3 rooms with 2 nodes per room. So the failure domain can be rooms for replica 3 and, if we use EC 4+2, hosts.

We saw that the thread requirements for NVMe OSDs are very high. Is this CPU good enough to carry them?
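For reference, the figures in this proposal can be tallied as below. This is only a sketch under stated assumptions: one OSD per data drive, no OSDs on the two DB/WAL NVMe drives, and usable capacity estimated purely from the replication or erasure-coding ratio (no allowance for full ratios, metadata, or rebalancing headroom). The threads-per-OSD figure is a plain ratio, not a sizing rule, and MON/MGR daemons on the same nodes would also need CPU. All names in the code are illustrative.

```python
# Figures taken from the proposal; everything else is an assumption noted inline.
NODES = 6
CPU_THREADS_PER_NODE = 96  # 1x EPYC 9475F, 48C/96T

# (data drives per node, TB per drive) for each device class
DEVICE_CLASSES = {
    "hdd": (12, 20.0),        # one MD2412 enclosure per node
    "nvme_mixed": (6, 6.4),
    "nvme_read": (5, 15.36),
}
# The 2 extra mixed-use NVMe drives hold DB/WAL and are assumed to carry no OSDs.


def usable_fraction_replica(size: int) -> float:
    """Usable share of raw capacity for a replicated pool with `size` copies."""
    return 1.0 / size


def usable_fraction_ec(k: int, m: int) -> float:
    """Usable share of raw capacity for an erasure-coded pool with k data + m coding chunks."""
    return k / (k + m)


def raw_tb(drives_per_node: int, tb_per_drive: float, nodes: int = NODES) -> float:
    """Raw capacity of one device class across the whole cluster, in TB."""
    return drives_per_node * tb_per_drive * nodes


# Assumes exactly one OSD per data drive.
osds_per_node = sum(count for count, _ in DEVICE_CLASSES.values())
print(
    f"OSDs per node: {osds_per_node}, "
    f"CPU threads per OSD: {CPU_THREADS_PER_NODE / osds_per_node:.1f}"
)

for name, (count, size_tb) in DEVICE_CLASSES.items():
    raw = raw_tb(count, size_tb)
    print(
        f"{name}: raw {raw:.1f} TB, "
        f"replica 3 {raw * usable_fraction_replica(3):.1f} TB, "
        f"EC 4+2 {raw * usable_fraction_ec(4, 2):.1f} TB"
    )
```

With these inputs the tally comes to 23 OSDs per node, and EC 4+2 yields twice the usable capacity of replica 3 for the same raw space (2/3 versus 1/3).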
MON and MGR will be spread across the cluster, and RGW will run on a virtual machine.

Thanks for your answer!

Vivien

_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
