Thanks, all, for the responses.

The CPU will be an Intel Xeon 6740E @ 2.4 GHz, 96 cores, no HT, with DDR5 at 6400 MT/s.

We currently manage an all-NVMe Ceph cluster for RGW with 30TB NVMe drives 
(Solidigm); it works like a champ, but the price per TB per drive has increased 
over the last few months.

The other cluster we manage has several Dell R740XD nodes with 12 HDDs, 2 NVMe 
drives, and 2 SSDs for the OS. That works quite well, but we're looking for 
something denser.

Maybe we should move to the Dell R760XD2 with 24 drives, 2 NVMe drives, and 2 SSDs for the OS.

We got this "60/90-bay node" idea from IBM's hardware recommendations for large 
clusters (2PB+), where they recommend the Dell DSS7000.

The main target for this cluster is 4-5PB of S3-only storage, with the main 
customers using it for backup storage.
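
For rough sizing only, here is a quick back-of-the-envelope sketch; the EC profile, 
fill target and node sizes below are assumptions for illustration, nothing is decided yet:

    # Back-of-the-envelope sizing for ~5 PB usable of S3 on 22 TB HDDs.
    # Assumptions (not decided): EC 8+3 bucket data pool, ~85% maximum fill.
    usable_pb = 5.0
    ec_k, ec_m = 8, 3
    fill_target = 0.85
    hdd_tb = 22

    raw_pb = usable_pb * (ec_k + ec_m) / ec_k / fill_target
    hdds = raw_pb * 1000 / hdd_tb
    print(f"raw capacity: ~{raw_pb:.1f} PB -> ~{hdds:.0f} x {hdd_tb} TB HDDs")
    print(f"~{hdds / 90:.1f} x 90-bay nodes or ~{hdds / 24:.1f} x 24-bay nodes")

With only four or five 90-bay chassis, each node would hold 20-25% of the cluster, 
which is worth keeping in mind against the blast-radius advice below.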

Power density per rack, at least for us, is not a problem: our DCs can accommodate 
7-15 kW per rack, and a single rack can hold about 1000-1200 kg.

We know (I know too) that the best Ceph cluster for RGW is always an 
all-flash/SSD/NVMe one, since small writes/deletes don't penalize performance, 
but the price per TB hasn't dropped, even for the 100TB NVMe/SSD drives 
(~$10K per drive).

Just a note: our current bandwidth usage for RGW is about 10-14 Gbps at peak.

Thanks all for the great notes and for taking the time to respond to my question.

Regards,
Manuel



-----Original Message-----
From: Anthony D'Atri <anthony.da...@gmail.com>
Sent: Friday, August 8, 2025 2:43
To: dar...@soothill.com
CC: ceph-users@ceph.io; Mark Nelson <mark.nel...@clyso.com>
Subject: [ceph-users] Re: 60/90 bays + 6 NVME Supermicro


> So my question is: what is the use case? S3 is not a use case; it's a protocol
> to access the storage.

Indeed.  Factors include:
* reads vs. writes; many RGW deployments are very read-heavy
* distribution of object sizes; smaller objects (per-second checkpoints) are 
much more metadata-intensive than larger objects (pirated TV shows dubbed into 
Portuguese).

> How many nodes are you looking to deploy?

With slow media, I suggest that a single chassis not hold more than 10% of the 
cluster's capacity, for blast-radius reasons.
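
As a rough illustration of that guideline against the raw-capacity sketch above 
(the ~8 PB figure is itself an assumption):

    # Illustrative only: what "<= 10% of cluster capacity per chassis" implies
    # for ~8 PB raw on 22 TB HDDs (the raw figure is an assumption from above).
    raw_pb = 8.1
    hdd_tb = 22
    max_node_pb = 0.10 * raw_pb
    max_hdds_per_node = max_node_pb * 1000 / hdd_tb
    print(f"<= {max_node_pb:.2f} PB raw per node, i.e. <= ~{max_hdds_per_node:.0f} x 22 TB HDDs")
    # ~37 HDDs per node: a 24-bay chassis fits well under that; a 90-bay chassis does not.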

> What model of Kioxia drive are you looking to use? A brief look suggested all
> the 30TB drives were read-intensive.

Be very sure that those are TLC, 4KB-IU drives, as I believe they have made 
30TB QLC drives as well, which most likely would not be ideal for this purpose, 
though they would be fine for large-object bucket data pools.

> If you go with the 90-drive chassis then you are looking at a ratio of 1:22 for
> NVMe to HDD

Indeed, conventional wisdom has been at most a 10:1 HDD:NVMe ratio, though with 
modern PCIe 4+ servers and drives, a somewhat higher ratio is probably tolerable.  
But those NVMe devices would also want to host OSDs for the meta, log, etc. pools.
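
For the proposed 90 x 22TB HDD + 4 x 30TB NVMe box, the rough math looks like this 
(illustrative only, not a recommendation):

    # Rough DB/WAL math for the proposed 90 x 22 TB HDD + 4 x 30 TB NVMe chassis.
    hdds, nvmes, nvme_tb = 90, 4, 30
    hdds_per_nvme = hdds / nvmes                    # the ratio discussed above
    db_per_osd_gb = nvme_tb * 1000 / hdds_per_nvme  # if the NVMe holds only DB/WAL
    print(f"{hdds_per_nvme:.1f} HDD OSDs per NVMe, ~{db_per_osd_gb:.0f} GB of DB/WAL each")
    # ~22.5 OSDs per NVMe and ~1.3 TB of DB/WAL per OSD, leaving no NVMe space for
    # separate index/meta/log pool OSDs unless the devices are partitioned further.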

> which could be a big issue, made worse by the use of read-intensive drives
> that may only do 80K write IOPS. Back to the first question.

Remember that mixed-use drives are usually the same hardware with more 
overprovisioning (and profit margin).

> Last one is where are you going to put the S3 bucket metadata?

The index pool is mainly omaps (so far), which can live on hybrid OSDs with SSD 
offload.
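
For a very rough feel of index-pool size, assuming a ballpark of ~200 bytes of omap 
per object (an assumption; real usage varies with versioning, multipart uploads, 
ACLs, and resharding):

    # Very rough RGW index-pool sizing. The ~200 B/object omap figure is an
    # assumption (a common ballpark); real usage varies with versioning, multipart, ACLs.
    objects = 1_000_000_000          # illustrative: 1 billion objects
    omap_bytes_per_obj = 200
    replicas = 3                     # replicated index pool
    index_tb = objects * omap_bytes_per_obj * replicas / 1e12
    print(f"~{index_tb:.1f} TB of index omap cluster-wide for {objects:,} objects")
    # ~0.6 TB: small relative to the NVMe capacity, hence flash or hybrid OSDs suffice.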

> Comes back to the first question: what is the use case?
> 
> Like Mark, one of the configurations I use a lot is the 2U24. Whilst the
> 90-drive chassis looks good on paper, it typically needs a deeper rack, is very
> heavy, and also consumes a lot of power. You are now at close to 2PB per node,
> which is a massive amount of data to rebuild should a node be lost. Unless
> you want just a very deep archive and a cluster of 30+PB, then IMHO
> they don't make sense.

Agreed.
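
To put a hedged number on the rebuild concern (the recovery rate below is purely 
an assumption; real throughput depends on EC profile, client load, and tuning):

    # Illustrative node-rebuild time: ~2 PB per 90-bay node (figure from the thread),
    # recovered at an assumed sustained aggregate rate of 10 GB/s.
    node_pb = 2.0
    recovery_gb_per_s = 10.0          # assumption, not a measurement
    seconds = node_pb * 1e6 / recovery_gb_per_s
    print(f"~{seconds / 86400:.1f} days to backfill one lost node")  # ~2.3 days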

> 
> 
>> On 7 Aug 2025, at 17:54, Mark Nelson <mark.nel...@clyso.com> wrote:
>> 
>> Personally, I prefer the 24 HDD + 4 NVMe 2U chassis unless extreme density and 
>> absolute lowest cost are the highest priorities.  The denser options can 
>> work, but they definitely have trade-offs and are only really appropriate if you 
>> are deploying a very large cluster.
>> 
>> 
>> Mark
>> 
>> 
>> On 8/7/25 09:48, Manuel Rios - EDH wrote:
>>> 
>>> Hi Ceph,
>>> 
>>> Has anyone deployed Ceph on Supermicro SuperStorage nodes with 60/90 HDDs + 
>>> 4 NVMe for WAL?
>>> 
>>> The newest models support 144 cores in a single socket and several TB of RAM 
>>> without issues.
>>> 
>>> But as far as we understood from the technical notes, they use SAS expanders 
>>> to connect all the disks, from 2 to 4 SAS expanders for the whole chassis.
>>> 
>>> We're looking at the following configuration:
>>> 
>>> Intel Xeon, 96 cores @ 2.4 GHz
>>> 1TB RAM
>>> 2 SSDs for OS
>>> 4x 30TB Kioxia NVMe
>>> 90x 22TB HGST HDDs
>>> 4x25 Gbps or 2x100 Gbps
>>> 
>>> Main use: RGW.
>>> 
>>> Regards
>>> 
>>> 
>>> MANUEL RIOS FERNANDEZ
>>> CEO, EasyDataHost
>>> Phone: 677677179
>>> Web: www.easydatahost.com
>>> Email: mrios...@easydatahost.com
>>> 
>>> 
>> 
>> -- 
>> Best Regards,
>> Mark Nelson
>> Head of Research and Development
>> 
>> Clyso GmbH
>> p: +49 89 21552391 12 | a: Minnesota, USA
>> w: https://clyso.com | e: mark.nel...@clyso.com
>> 
>> We are hiring: https://www.clyso.com/jobs/
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
