[devel] [PATCH 1 of 1] IMM: Update the README file to document 2PBE [622]

Anders Bjornerstedt Wed, 20 Nov 2013 06:22:22 -0800

 osaf/services/saf/immsv/README |  167 +++++++++++++++++++++++++++++++++++++++++
 1 files changed, 167 insertions(+), 0 deletions(-)



diff --git a/osaf/services/saf/immsv/README b/osaf/services/saf/immsv/README
--- a/osaf/services/saf/immsv/README
+++ b/osaf/services/saf/immsv/README
@@ -1604,6 +1604,173 @@ Bit 2 controls OpenSAF4.1 protocols allo
 Bit 3 controls OpenSAF4.3 protocols allowed or not.
 
 
+2PBE Allow IMM PBE to be configured without shared file system (4.4)
+===================================================================
+https://sourceforge.net/p/opensaf/tickets/21/
+
+The 2PBE enhancement allows the IMM to have PBE configured so that it
+does not rely on a shared filesystem, such as DRBD.
+
+Executing IMM without PBE configured or enabled (0PBE) of course also
+makes the IMM not rely on any shared filesystem, but you then do not get
+automatic incremental persistence. Deployments that rarely update the
+configuration data and rarely alter the "administrative state" on AMF data,
+should still consider the option of running without PBE. Persistence is
+then achieved by performing an explicit dump after each CCB, after each
+admin-op to change administrative state, or at least after a set of such
+persistent changes have been completed. This is the simplest configuration
+with least overhead and lowest resource consumption. 
+
+Regular 1PBE uses one Persistent Back End process started and controlled
+by the IMMND coordinator. 
+
+With 2PBE a PBE process is started at *both* SCs. In 2PBE, the PBE started
+by the coordinator is called the *primary* PBE and operates in much the same
+way as the regular 1PBE process, except it also synchronizes with the PBE at
+the other SC. The PBE started by the non-coord but SC resident IMMND is called
+the *slave* PBE. 
+
+The primary PBE and the slave PBE each write to what should normally be a
+local sqlite file. The sqlite file has the same basic name as used by regular
+1PBE (IMMSV_PBE_FILE in immnd.conf), except that a suffix is appended that
+consists of the processor's node-id, as defined by MDS. This suffix *allows*
+2PBE to be executed with the PBE files actually residing on a shared file system.
+That would not be a good solution for deployment, but it may simplify some test
+frameworks, which can then use the same file system configuration for testing
+both 1PBE and 2PBE.
+
+Configuring 2PBE
+---------------- 
+To configure for 2PBE, uncomment the following parameter in immd.conf
+(note immd.conf, not immnd.conf):
+
+  #export IMMSV_2PBE_PEER_SC_MAX_WAIT=30
+
+This has to be defined on both SCs so that both the active and the standby
+IMMD become aware that 2PBE should be run. The value of this parameter is the
+number of seconds the active IMMD must wait for both SC IMMNDs to complete
+"preloading" (explained below) before the active IMMD can choose one of these
+IMMNDs to become IMMND coordinator. 
+
+Note that it is not normal, i.e. not expected, for any given cluster to switch
+between 1PBE and 2PBE. The decision to use 1PBE or 2PBE should be based on what
+is available and intended to be used as storage for the imm service. Typically
+that does not change during the service lifetime for a cluster. Thus the choice
+of 1PBE or 2PBE (or 0PBE) is normally done at cluster installation time. 
+
+The cardinal example for a deployment that should use 2PBE (or 0PBE) is a
+downsized embedded system that does not have any shared filesystem available.
+
+A 2PBE system can disable PBE and enable PBE in the same way that a 1PBE system
+can, using the administrative operation for this;
+see the regular section on PBE above.
+
+Cluster-start & IMM loading from PBE-files
+------------------------------------------
+The active IMMD will order each SC resident IMMND to execute a "preload", probing
+the SC local filesystem for the file state that *would* be loaded to the cluster
+if that SC IMMND was chosen as coord. The two SC IMMNDs send the preload stats to
+the active IMMD.
+
+The active IMMD will wait for the IMMNDs at *both* SCs to complete the preload
+task and then determine which SC has the apparently latest file state. The IMMND
+at that SC will then be chosen as IMMND coord. Should the timeout be reached,
+then the active IMMD declares the only available SC IMMND as coordinator. This
+should be avoided.
+
+Actual loading then proceeds in the same way as for regular 1PBE. The 
+IMMSV_2PBE_PEER_SC_MAX_WAIT is by default 30 seconds. This value should
+be high enough to make it extremely unlikely that the active IMMD is forced to
+choose coord/loader when only a single SC IMMND has joined. If that happens,
+then the risk is that the cluster restart will be done *not* using the latest
+persistent imm state, effectively rewinding the imm state. Normally the two
+PBE files should be identical and the choice of coord/loader then does not
+matter. But if they are not identical, due to one SC having been down for some
+time before the cluster restart, then the choice of the SC to load from does
+matter. [Note: the same type of problem will happen with regular 1PBE based
+on a shared filesystem (DRBD) if one SC fails to come up in time to join the
+(DRBD) sync protocol. The corresponding DRBD timeout is on the order of 20
+seconds. Even if the other SC later joins, it will be too late because by that
+time the loading has probably been completed. Even if loading is still in
+progress, DRBD can not correct/mutate the PBE file while it is being read by
+the sqlite-lib for loading.]
+
+Normal processing with 2PBE
+---------------------------
+When loading has completed, two PBEs will be started: the primary PBE at the
+SC with the IMMND coord and the slave PBE at the other SC.
+
+In the same way as for 1PBE, the primary PBE is the transaction coordinator
+for CCB commits, PRT operations and class-create/deletes. Specific for 2PBE,
+the slave PBE is in essence a class applier for all configuration classes,
+recording the same data as the primary PBE, but on a file at the other SC.
+
+The primary PBE is thus the entity that decides the outcome of a CCB that is
+in the critical state. [The critical state is defined by the commit request
+having been sent to the (primary) PBE]. If the primary PBE acks the commit, the
+CCB commits in imm-ram. Finally, all appliers that were tracking the CCB get
+the commit/apply callback, including the slave PBE.
+
+With 2PBE, *both* PBEs must be available for the imm to be persistent-writable.
+If one or both PBEs are unavailable (or unresponsive) then persistent writes
+(CCBs, PRT operations, class changes) will fail.
+
+In 2PBE, a restarted PBE (primary or slave) will more often need to regenerate
+its sqlite file (from imm-ram). On the other hand, regeneration of the sqlite
+file should be faster in 2PBE than in regular 1PBE because the file is typically
+placed on a local file system.
+
+OneSafe2PBE
+-----------
+If an SC is taken down by order from the operator, i.e. a controlled shutdown,
+then the operator can also (directly or indirectly) request that persistent
+writes be allowed despite only one PBE being available in a 2PBE system. This is
+also typically needed for an uncontrolled and unexpected departure of an SC if
+that SC does not immediately bounce back up. A repair is then apparently needed
+and the system *must* be allowed to function with only one PBE, even though it
+only writes to one local filesystem using one PBE.
+
+The 1safe2PBE mechanism allows a 2PBE OpenSAF cluster to open up for persistent
+writes using only one of the two PBEs - temporarily. This is only intended to be
+used as a temporary state when one SC is long term unavailable. As soon as the
+other SC returns, then the IMM will automatically re-enter normal 2-safe2PBE and
+will reject persistent writes and attempts to enter 1safe2PBE until the slave PBE
+has synced (regenerated its sqlite file) and rejoined the cluster.
+
+The 1safe2PBE state is entered by the administrative operation:
+
+  immadm -o 1 -a safImmService -p opensafImmNostdFlags:SA_UINT32_T:8 \
+     opensafImm=opensafImm,safApp=safImmService
+
+It is exited either automatically by a rejoined SC or by an explicit administrative
+operation:
+
+  immadm -o 2 -a safImmService -p opensafImmNostdFlags:SA_UINT32_T:8 \
+     opensafImm=opensafImm,safApp=safImmService
+
+Note the explicit setting of admin-owner-name using "-a safImmService". This has
+to be used for these admin-operations because the imm service needs admin-ownership
+over the object "opensafImm=opensafImm,safApp=safImmService" in order for 2PBE to
+work properly.
+
+Hence the fourth bit in the opensafImmNostdFlags bitvector of the OpenSAF service
+object is used to toggle oneSafe2PBE on/off. Toggling this bit on older systems,
+or on systems that do not have 2PBE configured, will have no effect. Toggling this bit
+on (on a 2PBE system) is only accepted by the IMM service when there is only one SC
+available.
+
+We recommend that any deployment of OpenSAF that intends to allow usage of 2PBE
+invoke the toggling on of this bit in the wrapper function for performing a planned
+stop of an SC. Note however that the operation will only succeed when the SC has gone
+down, i.e. there is only one SC available. Similarly, if there is any alarm generated
+when an SC has gone down and not come back up quickly enough (a node repair needed
+alarm), then we suggest that the alarm trigger the invocation of the admin op to
+toggle this flag on.
+
+For "normal", but unplanned, processor restarts, we recommend that this flag not be
+toggled on. This means that for such processor restarts, persistent writes will not be
+allowed until both SCs are available again.
+
 ----------------------------------------
 DEPENDENCIES
 ============
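
Not part of the patch: the OneSafe2PBE section recommends toggling the flag
from the wrapper used for a planned SC stop, or from an alarm handler. A
minimal sketch of such a helper follows, assuming a POSIX shell. The function
names (onesafe2pbe_cmd, onesafe2pbe) and the DRY_RUN variable are hypothetical
and do not exist in OpenSAF; only the two immadm command lines are taken from
the README text above.

```shell
#!/bin/sh
# Sketch only: wraps the two immadm invocations quoted in the README text.
# Function names and DRY_RUN are hypothetical, not part of OpenSAF.

IMM_OBJECT="opensafImm=opensafImm,safApp=safImmService"
ONESAFE_2PBE_FLAG=8   # value 8 = fourth bit of opensafImmNostdFlags

# Print the immadm command line that switches 1safe2PBE "on" or "off".
onesafe2pbe_cmd() {
    case "$1" in
        on)  op=1 ;;   # the README enters 1safe2PBE with admin-op 1
        off) op=2 ;;   # the README exits 1safe2PBE with admin-op 2
        *)   echo "usage: onesafe2pbe_cmd on|off" >&2; return 2 ;;
    esac
    echo "immadm -o $op -a safImmService" \
         "-p opensafImmNostdFlags:SA_UINT32_T:$ONESAFE_2PBE_FLAG" \
         "$IMM_OBJECT"
}

# Execute the command, or only print it when DRY_RUN=1.
onesafe2pbe() {
    cmd=$(onesafe2pbe_cmd "$1") || return $?
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        $cmd   # unquoted on purpose: split into immadm and its arguments
    fi
}
```

A planned-stop wrapper might call "onesafe2pbe on" once the SC has actually
gone down, since the README states the operation is only accepted when a
single SC is available. Running with DRY_RUN=1 prints the command instead of
executing it, which may help when testing the wrapper off-cluster.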
