From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Igor 
Mendelev
Sent: 10 December 2017 17:37
To: n...@fisk.me.uk; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] what's the maximum number of OSDs per OSD server?

 

The expected number of nodes for the initial setup is 10-15, with 1,500-2,000 OSDs in total.

 

Networking is planned to be 2x 100GbE or 2x dual-port 50GbE NICs in x16 slots (per OSD 
node).

 

JBODs are to be connected via 3-4 x8 SAS3 HBAs (each with four 4x SAS3 ports).
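Rough back-of-the-envelope numbers for that HBA layout (a sketch only - the PCIe 3.0 x8 and 12Gb/s SAS lane throughput figures below are assumptions, not measurements):

    # Back-of-the-envelope bandwidth check for 4 x8 SAS3 HBAs (assumed figures).
    PCIE3_X8_GB_S = 7.9            # ~usable GB/s per PCIe 3.0 x8 slot
    SAS3_LANE_GB_S = 1.2           # ~usable GB/s per 12Gb/s SAS3 lane
    LANES_PER_HBA = 4 * 4          # four 4x-wide external ports per HBA

    hbas = 4
    sas_side = hbas * LANES_PER_HBA * SAS3_LANE_GB_S    # ~77 GB/s of SAS links
    pcie_side = hbas * PCIE3_X8_GB_S                    # ~32 GB/s into the host
    print(f"SAS side ~{sas_side:.0f} GB/s, PCIe side ~{pcie_side:.0f} GB/s")
    # PCIe is the tighter limit, but it still exceeds 2x100GbE (~25 GB/s), so
    # the HBAs shouldn't cap sequential throughput before the network does.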

 

The choice of hardware is driven by (non-trivial) per-server software licensing costs, so small (12-24 HDD) nodes are certainly not optimal regardless of CPU cost (which is estimated to be below 10% of the total cost of the setup I'm currently considering).

 

EC (4+2 or 8+3, etc. - TBD), rather than 3x replication, is planned for most of the storage space.
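For comparison, the raw-to-usable overhead of those profiles versus 3x replication is simple k/(k+m) arithmetic (a sketch only - it ignores CRUSH imbalance and full-ratio headroom, and the raw capacity figure is illustrative):

    def usable_fraction(k, m):
        """Fraction of raw capacity that ends up usable with a k+m EC profile."""
        return k / (k + m)

    raw_tb = 15_000  # illustrative raw capacity in TB, not a real figure
    for label, frac in [("EC 4+2", usable_fraction(4, 2)),
                        ("EC 8+3", usable_fraction(8, 3)),
                        ("3x replication", 1 / 3)]:
        print(f"{label:15s} ~{frac:.0%} usable -> "
              f"~{raw_tb * frac:,.0f} TB of {raw_tb:,} TB raw")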

 

The main applications are expected to be archiving and sequential access to large (multi-GB) files/objects.

 

Nick, which physical limitations are you referring to?

 

Thanks.

 

 

Hi Igor,

 

I guess I meant physical annoyances rather than limitations. Being able to pull out a 1U or 2U node is always much less of a chore than dealing with several U of SAS-interconnected JBODs.

 

If you have a licensing reason for larger nodes, then that is a very valid argument for them. Is this license cost related in some way to Ceph (I thought Red Hat's was capacity-based), or is it some sort of co-located software? Just make sure you size the nodes such that, if one has to be taken offline for any reason, you are happy with the resulting state of the cluster, including the peering when ~200 OSDs suddenly go offline/online.
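As a rough illustration of that last point (hypothetical node and OSD counts taken from the figures earlier in the thread, not output from any Ceph tool):

    # What one dense node going down looks like, using the thread's numbers.
    nodes = 10
    osds_per_node = 180                  # ~1,800 OSDs spread over 10 nodes
    total_osds = nodes * osds_per_node

    print(f"One node down = {osds_per_node} OSDs = "
          f"{osds_per_node / total_osds:.0%} of the cluster")

    # With EC 4+2 and a host failure domain, each PG spans 6 distinct hosts,
    # so on a 10-node cluster roughly 6/10 of all PGs include any given node
    # and will repeer (and possibly backfill) when that node bounces.
    ec_width = 4 + 2
    print(f"~{min(1.0, ec_width / nodes):.0%} of PGs expected to repeer")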

 

Nick

 

 

On Sun, Dec 10, 2017 at 11:17 AM, Nick Fisk <n...@fisk.me.uk> wrote:

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Igor Mendelev
Sent: 10 December 2017 15:39
To: ceph-users@lists.ceph.com
Subject: [ceph-users] what's the maximum number of OSDs per OSD server?

 

Given that servers with 64 CPU cores (128 threads @ 2.7GHz), up to 2TB of RAM, and 12TB HDDs are easily available and somewhat reasonably priced, I wonder what the maximum number of OSDs per OSD server is (when using 10TB or 12TB HDDs), and how much RAM such a server really requires when its total storage capacity is on the order of 1,000+ TB. Is it still 1GB of RAM per TB of HDD, or could it be less during normal operations (extended with NVMe SSD swap space for extra headroom during recovery)?

 

Are there any known scalability limits in Ceph Luminous (12.2.2 with BlueStore) and/or Linux that would make such a high-capacity OSD server scale poorly (using sequential IO speed per HDD as the metric)?

 

Thanks.

 

How many total OSDs will you have? If you are planning on having thousands, then dense nodes might make sense. Otherwise you are leaving yourself with a small number of very large nodes, which will likely shoot you in the foot further down the line. Also don't forget that, unless this is purely for archiving, you will likely need to scale the networking up per node; 2x10G won't cut it when you have 10-20+ disks per node.
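To put rough numbers on that (a sketch assuming ~180MB/s of sequential throughput per 7.2k HDD, which is optimistic):

    hdd_seq_mb_s = 180                     # assumed streaming rate per HDD
    for disks in (12, 24, 60, 90):
        gbit = disks * hdd_seq_mb_s * 8 / 1000
        print(f"{disks:3d} disks -> ~{gbit:4.0f} Gbit/s of raw disk bandwidth")
    # 2x10G (~20 Gbit/s) is saturated by roughly a dozen disks per node;
    # 2x100G or 2x dual-50G keeps up even with 90-bay JBODs, before counting
    # EC/replication traffic on the cluster network.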

 

With BlueStore, you are probably looking at around 2-3GB of RAM per OSD, so budget say 4GB per OSD to be on the safe side.
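For a dense node that works out as follows (same per-OSD figures as above, so an estimate rather than a hard limit):

    osds_per_node = 90
    for gb_per_osd in (2, 3, 4):
        print(f"{gb_per_osd} GB/OSD x {osds_per_node} OSDs = "
              f"{gb_per_osd * osds_per_node} GB of RAM per node")
    # Versus the old 1GB-per-TB rule of thumb: 90 x 12TB = 1,080 GB,
    # which is why sizing per OSD rather than per TB matters on BlueStore.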

7.2k HDDs will likely only use a small proportion of a CPU core due to their limited IO potential. I would imagine that even with 90-bay JBODs, you will run into physical limitations before you hit CPU ones.
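The CPU side of the same estimate, assuming a fraction of a core per spinning OSD (the exact fraction depends on the EC profile and workload, so these are placeholder values):

    osds_per_node = 90
    threads_available = 128                # 64 cores / 128 threads per the OP
    for cores_per_osd in (0.25, 0.5, 1.0):
        needed = osds_per_node * cores_per_osd
        print(f"{cores_per_osd} core/OSD -> "
              f"{needed:.0f} of {threads_available} threads")
    # Even at a full core per OSD the box isn't CPU-bound, which is why the
    # practical limits tend to be cabling, airflow, and serviceability instead.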

 

Without knowing your exact requirements, I would suggest that a larger number of smaller nodes might be a better idea. If you choose your hardware right, you can often get the cost down to comparable levels by not going with top-of-the-range kit, i.e. Xeon E3s or Ds vs. dual-socket E5s.

 

