> 
> I am new to this thread would like to get some suggestions to build new
> external ceph  cluster

Why external?  Many Proxmox deployments are converged.  Is this an existing 
Proxmox cluster that currently does not use shared storage?


> which will backend for proxmox VM's
> 
> I am planning to start with 5 Nodes(3 Mon & 2 OSD)

This is not the best plan.

If your data is not disposable you will want to maintain the default 3 copies, 
which you cannot safely do on 2 OSD nodes.
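
For reference, the replica count is a per-pool setting you can check and pin; 
a minimal sketch, assuming a hypothetical pool named "vm-pool":

    # show the current replica settings
    ceph osd pool get vm-pool size
    ceph osd pool get vm-pool min_size

    # keep the defaults: 3 copies, 2 required to keep serving I/O
    ceph osd pool set vm-pool size 3
    ceph osd pool set vm-pool min_size 2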

When deploying a very small cluster, solve first for the number of nodes.  You 
need at least 3 OSD nodes; 4 has advantages.

So in your case, go converged: OSDs on all 5 nodes, and add the mon/mgr/etc. 
ceph orch labels to all 5 so that when a node is down, a replacement can be 
spun up.

This would also let you deploy 5 mon instances instead of 3, which is 
advantageous in that you can ride out 2 failures without disruption.
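
If you end up deploying with cephadm, that is roughly the following (hostnames 
node1..node5 are placeholders; repeat the label commands for each host):

    # tag the hosts that are allowed to run mon/mgr
    ceph orch host label add node1 mon
    ceph orch host label add node1 mgr

    # place a mon on every host carrying the mon label (5 here), mgrs likewise
    ceph orch apply mon --placement="label:mon"
    ceph orch apply mgr --placement="label:mgr"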

> and I am expecting to start with ~60+ TB usable space.

That would mean (3 * 60) / .85 = 211.765 ~ 212 TB of raw capacity; let’s see 
how that matches your numbers below.

> estimated Storage Specs Calculator:
> 
> RAM: 8GB/OSD Daemon, 16GB OS, 4GB for Mon & MGR, 16GB for MDS

I would allot more than 4GB for mon/mgr.
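
Related: the 8GB/OSD figure lines up with raising osd_memory_target from its 
4 GiB default; a sketch, assuming you have confirmed the hosts have the 
headroom:

    # value is in bytes; ~8 GiB shown
    ceph config set osd osd_memory_target 8589934592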

> cpu: 2 core/osd, 2 core for os, 2 core per services

Cores or hyperthreads?  Either way, these numbers are low.

> *Dell R7625 5 Node to start with *

Dramatic overkill for a mon/mgr/MDS node.

> - RAM: 128G (Plan to increase later as needed)

I suggest 32GB DIMMs to maximize potential for future expansion.

> - CPU: 2x AMD EPYC 9224 2.50GHz, 24C/48T, 64M Cache (200W) DDR5-4800

96 threads total per server.  

> - Chassis Configuration 24x2.5 NVME

You’ll be tempted to fill those slots; each OSD past, say, 12 will decrease 
performance due to having to share the vcores/threads.
With the above CPU choice I would go with the R7615 to save rack space, or bump 
up the CPU. The 9224 is the default choice on Dell’s configurator but there are 
lots of others available. The 9454 for example would give you enough cores to 
more comfortably service an eventual 24 OSDs.
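
Rough arithmetic, assuming the common guideline of ~4-6 threads per NVMe OSD: 
24 OSDs would want 96-144 threads, which already eats the 2x 9224's 96 threads 
before the OS, mons, or VMs get anything.  2x 9454, or a single 9654P, gives 
you 192 threads.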

Alternatively, consider the R7615 with, say, the 9654P. The P CPUs can’t be used 
in a dual-socket motherboard, so they’re usually a bit cheaper for the same 
specs.

With EPYC CPUs you can get better performance by disabling IOMMU on the kernel 
command line via GRUB defaults.
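
On a Debian-based install that is a GRUB defaults edit along these lines 
(parameter choice is a judgement call: amd_iommu=off disables it outright, 
iommu=pt is the lighter-touch option, and note that turning IOMMU off rules 
out PCIe passthrough to VMs on a converged node):

    # /etc/default/grub
    GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=off"

    # then regenerate the config and reboot
    update-grub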


> - 2x1.92TB Data Center NVMe Read Intensive AG Drive U2 Gen4 with carrier (
> OS Disk, I need extra space)

Okay, so that will limit you to 22 OSDs with the 24-bay chassis.  You could 
provision a BOSS-N1 for M.2 boot instead, though, and keep all 24 bays for OSDs.

> - 5x 7.68TB Data Center NVMe Read Intensive AG Drive U2 Gen4 with Carrier
> 24Gbps 512e 2.5in Hot-Plug 1DWPD , AG Drive

I think you have a copy/paste error there.  The second line above sounds like a 
SAS SSD.

So from what you wrote above, this would mean a total of 10x 7.68TB OSD 
drives.  With 3x replication and the default headroom ratios these will give 
you about 22 TB of usable space, which is just 20 TiB.
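
Working that out: 10 x 7.68 TB = 76.8 TB raw, / 3 replicas = 25.6 TB, 
x ~0.85 headroom = ~21.8 TB, i.e. ~19.8 TiB.  To hit your 60+ TB usable target 
against the ~212 TB raw figure above you’d be looking at roughly 28 of these 
drives, e.g. on the order of 6 per node across all 5 nodes.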

> - 2x Nvidia ConnectX-6 Lx Dual Port 10/25GbE SFP28, No Crypto, PCIe Low
> Profile

I suggest bonding them and not configuring the optional separate replication 
(cluster) network.  Some people will use one port for public traffic and the 
other for replication, but for multiple reasons that wouldn’t be ideal.
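
For illustration, an LACP bond on the Proxmox side might look like this 
(ifupdown2 syntax; interface names and the address are placeholders, and the 
switch ports need a matching LAG):

    auto bond0
    iface bond0 inet static
        address 192.0.2.11/24
        bond-slaves enp1s0f0 enp1s0f1
        bond-mode 802.3ad
        bond-xmit-hash-policy layer3+4
        bond-miimon 100

On a converged node you’d typically hang vmbr0 off bond0 instead of addressing 
the bond directly.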

> 
> - 1G for IPMI
> 
> Please help me finalize these specs.
> 
> Thanks
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
