> I am new to this thread and would like to get some suggestions to build a new external Ceph cluster
Why external? Many Proxmox deployments are converged. Is this an existing
Proxmox cluster that currently does not use shared storage?

> which will be the backend for Proxmox VMs
>
> I am planning to start with 5 nodes (3 mon & 2 OSD)

This is not the best plan. If your data is not disposable you will want to
maintain the default 3 copies, which you cannot safely do on 2 OSD nodes.
When deploying a very small cluster, solve first for the number of nodes:
you need at least 3 OSD nodes, and 4 has advantages. So in your case, go
converged: put OSDs on all 5 nodes, and add the mon/mgr/etc ceph orch
labels to all 5 so that when a node is down a replacement daemon can be
spun up (see the orchestrator sketch at the end of this message). This
would also let you deploy 5 mon instances instead of 3, which is
advantageous in that you can ride out 2 failures without disruption.

> and I am expecting to start with ~60+ TB usable space.

That would mean (3 * 60) / 0.85 = 211.76, call it 212 TB of raw capacity;
let's see how that matches your numbers below.

> Estimated storage specs calculator:
>
> RAM: 8GB/OSD daemon, 16GB OS, 4GB for mon & mgr, 16GB for MDS

I would allot more than 4GB for mon/mgr (see the memory note at the end of
this message).

> CPU: 2 cores/OSD, 2 cores for OS, 2 cores per service

Cores or hyperthreads? Either way these numbers are low.

> *Dell R7625, 5 nodes to start with*

Dramatic overkill for a mon/mgr/MDS node.

> - RAM: 128G (plan to increase later as needed)

I suggest 32GB DIMMs to maximize the potential for future expansion.

> - CPU: 2x AMD EPYC 9224 2.50GHz, 24C/48T, 64M Cache (200W) DDR5-4800

96 threads total per server.

> - Chassis configuration: 24x 2.5" NVMe

You'll be tempted to fill those slots; each OSD past, say, 12 will decrease
performance because the OSDs have to share the vcores/threads. With the
above CPU choice I would go with the R7615 to save rack space, or bump up
the CPU. The 9224 is the default choice on Dell's configurator, but there
are lots of others available; the 9454, for example, would give you enough
cores to more comfortably service an eventual 24 OSDs. Alternately,
consider the R7615 with, say, the 9654P. The P CPUs can't be used in a
dual-socket motherboard, so they're usually a bit cheaper for the same
specs. With EPYC CPUs you can get better performance by disabling the
IOMMU on the kernel command line via the GRUB defaults (sketch at the end
of this message).

> - 2x 1.92TB Data Center NVMe Read Intensive AG Drive U.2 Gen4 with
>   carrier (OS disk, I need extra space)

Okay, so that will limit you to 22 OSDs with the 24-bay chassis. You could
provision a BOSS-N1 for M.2 boot, though.

> - 5x 7.68TB Data Center NVMe Read Intensive AG Drive U.2 Gen4 with
>   carrier
> - 24Gbps 512e 2.5in Hot-Plug 1DWPD, AG Drive

I think you have a copy/paste error there; the second line above sounds
like a SAS SSD. From what you wrote, this would mean a total of 10x 7.68TB
OSD drives (5 per node across your 2 planned OSD nodes). With 3x
replication and the default headroom ratios that is 76.8 TB raw * 0.85 / 3,
about 22 TB of usable space, which is only about 20 TiB.

> - 2x Nvidia ConnectX-6 Lx Dual Port 10/25GbE SFP28, No Crypto, PCIe Low
>   Profile

I suggest bonding them and not defining the optional replication (cluster)
network (bond sketch at the end of this message). Some people will use one
port for public and the other for replication, but for multiple reasons
that wouldn't be ideal.

> - 1G for IPMI
>
> Please help me finalize these specs.
>
> Thanks
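To make the orchestrator suggestion concrete, here is a minimal cephadm
sketch; node1..node5 are placeholder hostnames, adjust to your own:

    # Label every host so mon/mgr (and MDS, if you later want CephFS)
    # daemons can be scheduled anywhere.
    for h in node1 node2 node3 node4 node5; do
        ceph orch host label add "$h" mon
        ceph orch host label add "$h" mgr
        ceph orch host label add "$h" mds
    done

    # One mon per labeled host, so 5 mons once all 5 hosts are labeled.
    ceph orch apply mon --placement="label:mon"
    # Or, if you prefer to pin the count rather than the label:
    # ceph orch apply mon 5

The exact placement syntax varies a little between releases, so check the
orchestrator docs for the version you deploy.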
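On the RAM line: 8 GB/OSD is roughly what people plan for BlueStore, whose
per-OSD cache target (osd_memory_target) defaults to 4 GiB, and the
daemon's RSS runs above that target. If you do provision 8 GB per OSD you
can raise the target to use it. A sketch, values in bytes; treat the
numbers as assumptions to tune, not recommendations:

    # Raise the per-OSD cache target from the 4 GiB default to 8 GiB.
    ceph config set osd osd_memory_target 8589934592

    # If you deploy CephFS, the MDS cache is sized separately; keep it
    # well below the 16 GB you budgeted for the MDS, e.g. 8 GiB.
    ceph config set mds mds_cache_memory_limit 8589934592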
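On the IOMMU point, a sketch of what that looks like on a Debian/Proxmox
style node booting via GRUB, assuming you do not need the IOMMU for PCI
passthrough on these storage nodes:

    # /etc/default/grub
    GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=off"
    # (alternative: iommu=pt keeps the IOMMU enabled but bypasses
    #  per-I/O translation for host-owned devices)

    # regenerate the config and reboot
    update-grub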
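And for the ConnectX-6 Lx ports, an LACP bond sketch in
/etc/network/interfaces form (ifupdown2, as on Proxmox/Debian); the
interface names and the 192.0.2.0/24 addressing are placeholders, and the
switch side needs a matching LAG/MLAG:

    auto bond0
    iface bond0 inet static
        address 192.0.2.11/24
        bond-slaves enp65s0f0np0 enp65s0f1np1
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4

    # Define only the public network; with no cluster_network set,
    # replication traffic simply shares the same bonded interface.
    ceph config set global public_network 192.0.2.0/24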