On 29/03/18 09:25, ST Wong (ITSC) wrote:
> Hi all,
> 
> We put 8 (4+4) OSD and 5 (2+3) MON servers in server rooms in 2 buildings for 
> redundancy.  The buildings are connected through direct connection.
> 
> The servers in each building have alternate uplinks.  What will happen if 
> the link between the buildings is broken (application servers in each 
> server room will continue to write to OSDs in the same room)?
> 
> Thanks a lot.

The 3 mons in your second building will be able to remain quorate (3 is a 
majority of 5) and keep running the cluster. The other 2 mons will refuse to do 
anything, since they can't reach enough other monitors to form quorum. PGs 
that have at least min_size replicas in the 3-mon building will continue to 
serve I/O. PGs with fewer than min_size copies available there will block I/O 
until either the link comes back, or the missing OSDs are marked out (manually 
or automatically) and recovery has had time to bring those PGs back up to 
enough replicas on the working side. As far as anything in the 2-mon building 
is concerned, Ceph will be entirely nonfunctional. When the link comes back up, 
recovery will propagate any changes made on the working side.
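
To make the two conditions above concrete, here is a rough Python sketch of the 
decision each side of the partition faces. It is only a model of the reasoning, 
not Ceph code: the function names are made up for illustration, and min_size = 2 
is an assumed pool setting, since the original question does not give one.

```python
def has_quorum(reachable_mons: int, total_mons: int) -> bool:
    """Monitors form quorum only with a strict majority of ALL monitors."""
    return reachable_mons > total_mons // 2


def pg_serves_io(reachable_replicas: int, min_size: int) -> bool:
    """A PG accepts I/O only while at least min_size replicas are reachable."""
    return reachable_replicas >= min_size


def side_status(mons_on_side: int, total_mons: int,
                replicas_on_side: int, min_size: int) -> str:
    """Describe what one PG can do on one side of a partition."""
    if not has_quorum(mons_on_side, total_mons):
        return "no quorum: nothing on this side works"
    if pg_serves_io(replicas_on_side, min_size):
        return "quorate: PG serves I/O"
    return "quorate: PG blocked until link returns or recovery completes"


TOTAL_MONS = 5   # 2 + 3, as in the question
MIN_SIZE = 2     # assumed pool setting, not stated in the thread

# 3-mon building, PG with 2 replicas there
print(side_status(3, TOTAL_MONS, 2, MIN_SIZE))
# 3-mon building, PG with only 1 replica there
print(side_status(3, TOTAL_MONS, 1, MIN_SIZE))
# 2-mon building, regardless of how many replicas it holds
print(side_status(2, TOTAL_MONS, 2, MIN_SIZE))
```

Note that the 2-mon building is down even for a PG with plenty of local 
replicas: the quorum check comes first.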

Ceph is designed to avoid split-brain scenarios in order to protect data 
consistency, but the consequence is that if your cluster does get partitioned, 
a lot of it may stop working. You can design CRUSH rules to help mitigate the 
impact on the working part (for instance, by making sure that every PG places 
enough copies of itself on the 3-mon side to continue serving I/O if the other 
building is lost), but you will never have a situation where the cluster is 
split in two, both sides continue operating, and they then join back up.
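
As a small illustration of both points, the Python sketch below first checks 
that no two-way split of the monitors can leave both halves quorate, then tests 
a few replica layouts against the placement goal described above. It is a toy 
model rather than anything Ceph runs; the helper names, the pool size of 4 and 
min_size of 2 are all assumptions chosen for the example.

```python
def has_quorum(reachable_mons: int, total_mons: int) -> bool:
    """Quorum needs a strict majority of all monitors."""
    return reachable_mons > total_mons // 2


def both_sides_quorate(total_mons: int) -> bool:
    """True if some two-way split leaves BOTH sides with quorum."""
    return any(
        has_quorum(side_a, total_mons)
        and has_quorum(total_mons - side_a, total_mons)
        for side_a in range(total_mons + 1)
    )


def survives_building_loss(copies_on_quorate_side: int, min_size: int) -> bool:
    """A PG keeps serving I/O if the quorate side holds >= min_size copies."""
    return copies_on_quorate_side >= min_size


# A strict majority can exist on at most one side, whatever the mon count.
for total in range(1, 10):
    assert not both_sides_quorate(total)

MIN_SIZE = 2  # assumed pool setting
# Hypothetical layouts for a size-4 pool:
# (copies in the 2-mon building, copies in the 3-mon building)
layouts = [(2, 2), (3, 1), (1, 3)]
for in_2mon, in_3mon in layouts:
    ok = survives_building_loss(in_3mon, MIN_SIZE)
    print(f"{in_2mon}+{in_3mon}: {'keeps serving I/O' if ok else 'blocks'}")
```

The layout check only looks at the 3-mon building because that is the only 
side that can remain quorate when the link between the buildings is cut.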

Rich


_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
