- **Milestone**: 5.1.FC --> 5.2.FC


---

**[tickets:#439] Enhanced cluster management using quorum**

**Status:** accepted
**Milestone:** 5.2.FC
**Labels:** #79 #1170 
**Created:** Fri May 31, 2013 11:15 AM UTC by Mathi Naickan
**Last Updated:** Wed Jun 22, 2016 10:02 AM UTC
**Owner:** Mathi Naickan


The goal of this ticket is to address the requirements below. It should be 
read in conjunction with tickets #79 (spare SCs) and #1170 (multiple 
standbys).

Deploying large OpenSAF clusters in the cloud presents the following 
challenges:
- Multiple nodes failing simultaneously (either in a cattle-class deployment, 
or when a host machine goes down and in turn brings down its guest VM nodes)
- Reliance on third-party or less reliable hardware, networks and hosts
- Dynamically changing cluster membership due to scale-out and scale-in 
operations
- Multiple (or all) nodes can now become system controller (SC) nodes, which 
increases the probability of split brain and cluster partitioning

These requirements are being addressed in a phased manner.
(1) As a first step, spare SCs (https://sourceforge.net/p/opensaf/tickets/79/) 
were implemented in 5.0, along with the headless cluster feature (multiple 
tickets).

(2) As a second step (this ticket, in 5.2), implement enhanced OpenSAF cluster 
management such that there is always consensus among the cluster nodes on:
- the current cluster members
- the current active SC (leader election)
- the order in which member nodes join and leave the cluster


(3) As a last step, implement multiple standbys 
(https://sourceforge.net/p/opensaf/tickets/1170/) in 5.3.

This ticket addresses bullet (2) above.
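
To illustrate why agreeing on the order of joins and leaves also yields agreement on the current members, here is a minimal sketch. It is not taken from OpenSAF: the `MembershipEvent` type, its fields, and the node names are hypothetical. It only shows that any node replaying the same agreed log arrives at the same member set.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class MembershipEvent:
    """One agreed entry in a membership log (hypothetical structure)."""
    index: int  # position in the agreed order
    kind: str   # "join" or "leave"
    node: str   # node name, e.g. "SC-1" (illustrative only)


def apply_log(events: Iterable[MembershipEvent]) -> Set[str]:
    """Replay events in log order and return the resulting member set."""
    members: Set[str] = set()
    for ev in sorted(events, key=lambda e: e.index):
        if ev.kind == "join":
            members.add(ev.node)
        elif ev.kind == "leave":
            members.discard(ev.node)
        else:
            raise ValueError(f"unknown event kind: {ev.kind!r}")
    return members


# Example log; the entries are made up for illustration.
LOG = [
    MembershipEvent(1, "join", "SC-1"),
    MembershipEvent(2, "join", "SC-2"),
    MembershipEvent(3, "join", "PL-3"),
    MembershipEvent(4, "leave", "SC-2"),
]

if __name__ == "__main__":
    print(sorted(apply_log(LOG)))
```

Because the result depends only on the log indices, two nodes that receive the same entries in a different arrival order still compute the same membership.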

Requirements:

* As a part of this ticket, RAFT (see https://raft.github.io/) shall be used as 
the mechanism for 
(a) achieving consensus among a set of the cluster nodes (including membership 
changes)
(b) quorum-based leader election
(c) split-brain avoidance
The following deployment scenarios shall be supported when using RAFT:
- classic 2 SC OpenSAF cluster (or)
- all nodes are SCs (2N + the rest are all spares) (or)
- 2N + spare SCs (2N + a smaller subset are spares) (or)
- N-WAY (one active, the rest are all hot standbys) - 5.2
Note: A mix of hot standbys and spares should also be possible.


* RAFT shall be added as a new OpenSAF service.

* OpenSAF shall either implement RAFT itself or re-use an existing RAFT 
implementation such as etcd.

* A new topology service (TS) *may* be added, which shall use the topology 
information (from TIPC) and MDS (in case of TCP) to determine cluster 
membership - https://sourceforge.net/p/opensaf/tickets/1892/.

* CLM shall be the single layer that interfaces with the underlying RAFT and TS.

* All interactions with RAFT and TS shall go through a normalised cluster 
services adaptation interface, the OpenSAF cluster services library (CS). The 
CS library thereby enables OpenSAF to work with different implementations of 
RAFT. A plugin will be provided for a given implementation of RAFT.

* CS and TS shall be added as libraries of the OpenSAF CLM service. 
(In the code structure, these shall be part of ....services/saf/clm/libcs and 
....services/saf/clm/libts.
The name of the library shall be libOsafClusterServices.so.)

* OpenSAF should work whether RAFT is enabled or disabled on a system, and 
should remain backward compatible with previous OpenSAF releases.
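
Requirements (b) and (c) above rest on the majority rule that Raft uses for elections: a leader needs votes from more than half of the voting nodes, so two disjoint partitions can never both elect one. The sketch below only works through that arithmetic. The function names are hypothetical, and nothing here describes how OpenSAF will count voters or treat spares.

```python
def quorum_size(voters: int) -> int:
    """Smallest number of nodes that forms a strict majority of `voters`."""
    if voters < 1:
        raise ValueError("a cluster needs at least one voter")
    return voters // 2 + 1


def has_quorum(reachable: int, voters: int) -> bool:
    """True if a partition that sees `reachable` voters may elect a leader."""
    return reachable >= quorum_size(voters)


def tolerated_failures(voters: int) -> int:
    """How many voters can be lost while a majority still remains."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

For a 5-voter cluster split 3/2, only the 3-node side satisfies `has_quorum`, which is what prevents split brain. With exactly 2 voters the majority is 2, so a strict-majority election tolerates no failures; how the classic 2 SC scenario is to be handled is not specified in this ticket.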

The CS library shall provide a normalized set of APIs (and callback interfaces) 
such that OpenSAF can interact with different implementations of RAFT. 

This ticket will implement the CS library and the associated plugin for a given 
implementation of RAFT.

The CS library API definitions will follow soon. 
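
Since the actual CS API is not yet defined, the following is purely an illustrative sketch of the plugin idea, written in Python for brevity. Every name in it (`ConsensusPlugin`, `InMemoryPlugin`, `ClusterServices` and their methods) is hypothetical and is not the OpenSAF or etcd API. It shows a caller talking only to a normalised facade while a swappable plugin supplies membership, the current leader, and a leader-change callback.

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Optional

# Callback invoked with the new leader's name (or None if there is no leader).
LeaderCallback = Callable[[Optional[str]], None]


class ConsensusPlugin(ABC):
    """Hypothetical contract that each RAFT implementation would fulfil."""

    @abstractmethod
    def members(self) -> List[str]: ...

    @abstractmethod
    def leader(self) -> Optional[str]: ...

    @abstractmethod
    def on_leader_change(self, cb: LeaderCallback) -> None: ...


class InMemoryPlugin(ConsensusPlugin):
    """Stand-in plugin for illustration; a real one would wrap a RAFT backend."""

    def __init__(self, members: List[str]) -> None:
        self._members = list(members)
        self._leader: Optional[str] = None
        self._callbacks: List[LeaderCallback] = []

    def members(self) -> List[str]:
        return list(self._members)

    def leader(self) -> Optional[str]:
        return self._leader

    def on_leader_change(self, cb: LeaderCallback) -> None:
        self._callbacks.append(cb)

    def set_leader(self, node: Optional[str]) -> None:
        """Simulate an election result and notify subscribers."""
        self._leader = node
        for cb in self._callbacks:
            cb(node)


class ClusterServices:
    """Hypothetical normalised facade; callers never touch the plugin directly."""

    def __init__(self, plugin: ConsensusPlugin) -> None:
        self._plugin = plugin

    def current_members(self) -> List[str]:
        return self._plugin.members()

    def active_controller(self) -> Optional[str]:
        return self._plugin.leader()

    def subscribe_leader(self, cb: LeaderCallback) -> None:
        self._plugin.on_leader_change(cb)


if __name__ == "__main__":
    plugin = InMemoryPlugin(["SC-1", "SC-2"])
    cs = ClusterServices(plugin)
    cs.subscribe_leader(lambda node: print("leader is now", node))
    plugin.set_leader("SC-1")
```

Swapping in a different RAFT backend would then mean providing another `ConsensusPlugin` subclass, with no change to code that uses `ClusterServices`.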

