For the MDS, the active daemon doesn't hold state that needs to be replayed
to a standby; that information lives in the cluster itself.  Your setup
would be 1 active and 100 standbys.  If the active went down, one of the
standbys would be promoted and read that state back from the cluster.
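
Roughly, the failover model looks like this (a hypothetical Python sketch
just to illustrate the idea; the names are made up and the real promotion
logic lives in the Mons, not in application code like this):

# 1 active, N standbys: standbys hold no unique state, so any one can take over.
def promote_on_failure(active, standbys, cluster):
    if not active.is_healthy():              # hypothetical health check
        new_active = standbys.pop()          # any standby will do
        new_active.replay_journal(cluster)   # journal is read back from the cluster
        return new_active
    return active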

With Mons, it's interesting because of the quorum mechanics.  4 mons is
worse than 3 mons because an even count invites a split where 2 of them
think something is right and the other 2 think it's wrong; neither half has
a majority and you have no tie-breaking vote.  Odd numbers are always best,
and it sounds like your proposal would regularly land on an even number of
Mons.  I haven't heard of a deployment with more than 5 mons.  I would
imagine there are some with 7 mons out there, but it's not worth the
hardware expense in 99.999% of cases.
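
The arithmetic is easy to sanity-check (plain Python, nothing Ceph-specific):

# Majority quorum: how many Mons must agree, and how many you can lose.
def quorum_size(n_mons):
    return n_mons // 2 + 1

def failures_tolerated(n_mons):
    return n_mons - quorum_size(n_mons)

for n in (3, 4, 5, 7):
    print(n, quorum_size(n), failures_tolerated(n))
# 3 mons -> quorum of 2, tolerates 1 down
# 4 mons -> quorum of 3, still tolerates only 1 down
# 5 mons -> quorum of 3, tolerates 2 down
# 7 mons -> quorum of 4, tolerates 3 down

The fourth Mon buys you nothing except one more voter that has to
acknowledge every map change.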

I'm assuming your question comes from a place of wanting 1 configuration to
rule them all and not have multiple types of nodes in your Ceph deployment
scripts.  Just put in the time and do it right: have MDS servers, have
Mons, have OSD nodes, etc.  Once you reach scale, your Mons are going to
need their resources, your OSDs are going to need theirs, your RGWs will be
using more bandwidth, ad infinitum.  That isn't to mention all of the RAM
that the services will need during any recovery (assume 3x memory
requirements for most Ceph services when recovering).
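
To put a rough number on that last point, here's a trivial sizing sketch
using the 3x-during-recovery rule of thumb above (the steady-state figures
are placeholders I made up for illustration, not official requirements):

# Rough recovery-headroom estimate for one box, given the daemons it runs.
STEADY_STATE_GB = {"osd": 4, "mon": 2, "mds": 8, "rgw": 2}   # illustrative only

def recovery_headroom_gb(daemons, multiplier=3):
    """daemons: dict mapping daemon type -> count on the box."""
    return sum(STEADY_STATE_GB[d] * n * multiplier for d, n in daemons.items())

# A do-everything box with 12 OSDs plus mon/mds/rgw:
print(recovery_headroom_gb({"osd": 12, "mon": 1, "mds": 1, "rgw": 1}))  # 180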

Hyper-converged clusters are not recommended for production deployments.
Several people use them, but generally for smaller clusters.  By the time
you reach dozens or hundreds of servers, you will only cause yourself
headaches by becoming the special snowflake in the community.  Every time
you have a problem, the first place to look will be resource contention
between your Ceph daemons.


Back to some of your direct questions.  I haven't tested any of this, so
these are educated guesses.  A possible complication of having hundreds of
Mons is that they all have to agree on every new map, causing a LOT more
communication between your Mons, which could easily become a bottleneck for
map updates (snapshot creation/deletion, OSDs going up/down, scrubs
happening, anything that changes a map).  When an MDS fails, I don't know
how the selection of a new active MDS would go among 100 standbys.  That
could either go very quickly or take quite a bit longer depending on the
logic behind the choice.  Hundreds of RGW servers behind an LB (I'm
assuming) would negate most of the caching happening on the RGW servers,
since repeat accesses to the same object are unlikely to reach the same
RGW.
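
On that last point, a toy simulation shows why per-RGW caching falls apart
behind a random or round-robin LB (the cache model and numbers are made up
purely for illustration):

import random

def simulated_hit_rate(n_rgw, n_requests=10_000, n_objects=1_000):
    caches = [set() for _ in range(n_rgw)]   # toy per-RGW object cache
    hits = 0
    for _ in range(n_requests):
        obj = random.randrange(n_objects)    # clients re-read a shared set of objects
        rgw = random.randrange(n_rgw)        # random LB: any RGW may serve the request
        if obj in caches[rgw]:
            hits += 1
        else:
            caches[rgw].add(obj)
    return hits / n_requests

for n in (1, 10, 100):
    print(n, round(simulated_hit_rate(n), 2))
# The hit rate drops sharply as the number of RGWs grows, because repeat
# reads of the same object rarely land on a node that has already cached it.

Hash-based or sticky routing on the LB would help with that, at the cost of
a less even load distribution.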

On Thu, May 25, 2017 at 10:40 AM Wes Dillingham <wes_dilling...@harvard.edu>
wrote:

> How much testing has there been / what are the implications of having a
> large number of Monitor and Metadata daemons running in a cluster?
>
> Thus far I have deployed all of our Ceph clusters as a single service
> type per physical machine, but I am interested in a use case where we deploy
> dozens/hundreds? of boxes each of which would be a mon,mds,mgr,osd,rgw all
> in one and all a single cluster. I do realize it is somewhat trivial (with
> config mgmt and all) to dedicate a couple of lean boxes as MDS's and MONs
> and only expand at the OSD level but I'm still curious.
>
> The use case I have in mind is backup targets where pools span the entire
> cluster, and I am looking to streamline the process for possible
> rack-and-stack situations where boxes can just be added in place, booted
> up, and auto-join the cluster as a mon/mds/mgr/osd/rgw.
>
> So does anyone run clusters with dozens of MONs and/or MDSs, or know of
> any testing with very high numbers of each? At the MDS level I would just
> be looking for 1 Active, 1 Standby-replay and X standby until multiple
> active MDSs are production ready. Thanks!
>
> --
> Respectfully,
>
> Wes Dillingham
> wes_dilling...@harvard.edu
> Research Computing | Infrastructure Engineer
> Harvard University | 38 Oxford Street, Cambridge, Ma 02138 | Room 102
>