--- Begin Message ---
Hi all,

We're in the process of designing and procuring a new PVE HCI cluster.

We will serve about 6,000 tenants, each running two Docker containers: one with PostgreSQL and the other with a Python web application.

Containers will be hosted in VMs, 250 tenants per VM. Each VM will have 512 GB of RAM and 16-32 vcores, with 4 VMs per server.
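
As a quick sanity check of the numbers above, here is a minimal sketch of the sizing arithmetic. The constants come straight from this message; the function name `size_cluster` is only illustrative, and the result ignores any spare capacity for failover or for the host itself.

```python
import math

# Figures taken from the message above.
TENANTS = 6000
CONTAINERS_PER_TENANT = 2   # PostgreSQL + Python web application
TENANTS_PER_VM = 250
VMS_PER_SERVER = 4
VM_RAM_GB = 512


def size_cluster(tenants=TENANTS):
    """Return VM/server counts and per-server load for the given tenant count."""
    vms = math.ceil(tenants / TENANTS_PER_VM)
    servers = math.ceil(vms / VMS_PER_SERVER)
    return {
        "vms": vms,
        "servers": servers,
        "containers_per_server": (
            TENANTS_PER_VM * CONTAINERS_PER_TENANT * VMS_PER_SERVER
        ),
        "vm_ram_per_server_gb": VM_RAM_GB * VMS_PER_SERVER,
    }


if __name__ == "__main__":
    print(size_cluster())
```

With the figures above this gives 24 VMs on 6 servers, 2,000 containers per server, and 2,048 GB of RAM assigned to VMs on each server.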

Based on current data, we expect to be memory constrained; CPU, networking and disk (Ceph) are not our main concerns.

We're looking to optimize our cost per GB of RAM, so our current plan is to deploy 6 servers, each with 2.3 TB of RAM, 2 EPYC CPU sockets with 48 cores each, 4x25 Gbps networking and 3x 7.68 TB disks for Ceph. The more expensive alternative would be to deploy 12 single-socket EPYC servers with 1.15 TB of RAM each (+30% acquisition cost and about +40% running costs).

I have been reading reports about NUMA performance issues on the Proxmox mailing lists and elsewhere, memory bandwidth issues for example.

Based on what I understood, and seeing that we'll have more than 2,000 not very demanding containers in each server, I think those NUMA issues shouldn't be a problem in our use case.
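
To make that reasoning a bit more concrete, here is a rough sketch that checks whether the planned VMs would fit inside a single NUMA node. It assumes one NUMA node per socket with memory split evenly between the two sockets (so about 1,150 GB and 48 cores per node); the real layout depends on BIOS settings and DIMM population, and memory used by the host and Ceph is not modelled. The class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class NumaNode:
    cores: int
    ram_gb: float


@dataclass
class VM:
    vcores: int
    ram_gb: float


def fits_in_node(vm, node, vms_per_node=1):
    """Check that `vms_per_node` identical VMs fit in one node's RAM and that
    a single VM's vcores do not exceed the node's physical cores.

    vcores are only checked per VM; CPU oversubscription across VMs is allowed.
    """
    ram_ok = vm.ram_gb * vms_per_node <= node.ram_gb
    cpu_ok = vm.vcores <= node.cores
    return ram_ok and cpu_ok


if __name__ == "__main__":
    # Assumption: 2.3 TB split evenly over 2 sockets, one NUMA node per socket.
    node = NumaNode(cores=48, ram_gb=2300 / 2)
    vm = VM(vcores=32, ram_gb=512)  # upper end of the 16-32 vcore range
    print(fits_in_node(vm, node, vms_per_node=2))
```

Under those assumptions, two 512 GB VMs per socket (4 per server) fit within one node's memory, so each VM could in principle stay local to a single node, while a third VM on the same node would not fit.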

I'd be very grateful for any suggestions or comments about this NUMA issue and our cluster design.

Cheers

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/



--- End Message ---
_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
