Hi AndersBj, Reviewed the README, looks good. Ack.
/Neel. On Wednesday 20 November 2013 07:51 PM, Anders Bjornerstedt wrote: > osaf/services/saf/immsv/README | 167 > +++++++++++++++++++++++++++++++++++++++++ > 1 files changed, 167 insertions(+), 0 deletions(-) > > > diff --git a/osaf/services/saf/immsv/README b/osaf/services/saf/immsv/README > --- a/osaf/services/saf/immsv/README > +++ b/osaf/services/saf/immsv/README > @@ -1604,6 +1604,173 @@ Bit 2 controls OpenSAF4.1 protocols allo > Bit 3 controls OpenSAF4.3 protocols allowed or not. > > > +2PBE Allow IMM PBE to be configured without shared file system (4.4) > +=================================================================== > +https://sourceforge.net/p/opensaf/tickets/21/ > + > +The 2PBE enhancement allows the IMM to have PBE configred so that it > +does not rely on a shared filesystem, such as DRBD. > + > +Executing IMM without PBE configured or enabled (0PBE) of course also > +makes the IMM not rely on any shared filesystem, but you then do not get > +automatic incremental persistence. Deployments that rarely update the > +configuration data and rarely alter the "administrative state" on AMF data, > +should still consider the option of running without PBE. Persistence is > +then acheived by performing an explicit dump after each CCB, after each > +admin-op to change administrative state. or at least after a set of such > +persistent changes have been completed. This is the simplest configuration > +with least overhead and lowest resource consumption. > + > +Regular 1PBE uses one Persistent Back End process started and controlled > +by the IMMND coordinator. > + > +With 2PBE a PBE process is started at *both* SCs. In 2PBE, the PBE started > +by the coordinator is called the *primaey* PBE and operates in much the same > +way as the regular 1PBE process, except it also synchronizes with the PBE at > +the other SC. The PBE started by the non-coord but SC resident IMMND is > called > +the *slave* PBE. > + > +The primary PBE and the slave PBE each write to what should normally be a > +local sqlite file. The sqlite file has the same basic name as used by regular > +1PBE (IMMSV_PBE_FILE in immnd.conf), except that a suffix is appended that > +consists of the processors node-id, as defined by MDS. This suffix *allows* > +2PBE to be execued with the PBE files actually residing on a shared file > system. > +That would not be a good solution for deployment, but it may simplify some > test > +framweworks, that can then use the same file system configuration for testing > +both 1PBE and 2PBE. > + > +Configuring 2PBE > +---------------- > +To configure for 2PBE comment in the following parameter in immd.conf > +(note immd.conf, not immnd.conf): > + > + #export IMMSV_2PBE_PEER_SC_MAX_WAIT=30 > + > +This has to be defined on both SCs so that both the active and the standby > +IMMD become aware that 2PBE should be run. The value of this parameter is the > +number of seconds the active IMMD must wait for both SC IMMNDs to complete > +"preloading" (explaine below) before the active IMMD can choose one of these > +IMMNDs to become IMMND coordinator. > + > +Note that it is not normal, i.e. not expected, for any given cluster to > switch > +between 1PBE and 2PBE. The decision to use 1PBE or 2PBE should be based on > what > +is available and intended to be used as storage for the imm service. > Typically > +that does not change during the service lifetime for a cluster. Thus the > choice > +of 1PBE or 2PBE (or 0PBE) is normally done at cluster installation time. > + > +THe cardinal example for a deployment that should use 2PBE (or 0PBE) is a > down > +sized embedded system, that does not have any shared filesystem available. > + > +A 2PBE system can disable PBE and enable PBE in the same way that a 1PBE > system > +can. This is done in the same way, using the administrative operation for > this, > +see the regular section on PBE above. > + > +Cluster-start & IMM loading from PBE-files > +------------------------------------------ > +The active IMMD will order each SC resident IMMND to execute a "preload" > probing > +the SC local filesystem for the file state that *would* be loaded to the > cluster > +if that SC IMMND was chosen as coord. The two SC IMMNDs send the preload > stats to > +the active IMMD. > + > +The active IMMD will wait for the IMMNDs at *both* SCs to complete the > preload > +task and then determine which SC has the apparently latest file state. The > IMMND > +at that SC will then be chosen as IMMND coord. Should the timeout be reached, > +then the active IMMD declares the only avaialble SC IMMND as coordinator. > This > +should be avoided. > + > +Actual loading then proceeds in the same way as for regular 1PBE. The > +IMMSV_2PBE_PEER_SC_MAX_WAIT is by default 30 seconds. This value should > +be high enough to make it extremely unlikely that the active IMMD is forced > +choose coord/loader when only a single SC IMMND has joined. If that happens, > +then the risk is that the cluster restart will be done *not* using the latest > +persistent imm state, effectively rewinding the imm state. Normally the two > +PBE files should be identical and the choice of coord/loader then does not > +mater. But if hey are not identical, due to one SC having been down for some > +time before the cluster restart, then the choice of the SC to load from does > +matter. [Note: the same type of problem will happen with regular 1PBE based > +on a shared filesystem (DRBD) if one SC fails to come up in time to join the > +(DRBD) sync protocol. The corresponding DRBD timeout is on the order of 20 > +seconds. Even if the other SC later joins, it will be too late because by > that > +time the loading has probably been completed. Even if loading is still in > +progress, DRBD can not correct/mutate the PBE file while it is being read by > +the sqlite-lib for loading.] > + > +Normal processing with 2PBE > +--------------------------- > +When loading has completed, two PBEs will be started. The primary PBE at the > +SC with the IMMND coord and the slave PBE at the other SC. > + > +In the same way as for 1PBE, the primary PBE, is the transaction coordinator > +for CCB commits, PRT operations and class-create/deletes. Specific for 2PBE, > +the slave PBE is in essence a class applier for all configuration classes, > +recording the same data as the primary PBE, but on a file at the other SC. > + > +The primary PBE is thus the entity that decides the outcomme of a CCB that is > +in the crtitical state. [The critical state is defined by the commit request > +having been sent to the (primary) PBE]. IF the pirmary PBE acks the commit, > the > +CCB commits in imm-ram. Finally, all appliers that where tracking the CCB get > +the commit/apply callback, including the slave PBE. > + > +With 2PBE, *both* PBEs must be available for the imm to be > persistent-writable. > +If one or both PBEs are unavailable (or unresponsive) then persistent writes > +(CCBs, PRT operations, class changes) will fail. > + > +In 2PBE, a restarted PBE (primary or slave) will more often need to > regenerate > +its sqlite file (from imm-ram). On the other hand, regeneration of the sqlite > +file should be faster in 2PBE than in regular 1PBE because the file is > typically > +placed on a local file system. > + > +OneSafe2PBE > +----------- > +If an SC is taken down by order from the operator, i.e. a controlled > shutdown, > +then the operator can also (directly or indirectly) request that persistent > +writes be allowed despite only one PBE being availble in a 2PBE system. This > is > +also typically needed for an uncontrolled and unexpected departure of an SC > if > +that SC does not immediately bounce back up. A repair is then apparently > needed > +and the system *must* be allowed to function with only one PBE, despite that > it > +only writes to one local filesystem using one PBE. > + > +The 1safe2PBE mechanism allows a 2PBE OpenSAF cluster to open up for > persistent > +writes using only one of the two PBEs - temporarily. This is only intended > to be > +used as an temporary state when one SC is long term unavailable. As soon as > the > +other SC returns, then the IMM will automatically re-enter normal 2-safe2PBE > and > +will reject persistent writes and attempts to enter 1safe2PBE until the > slave PBE > +has synced (regenerated its sqlite file) and rejoined the cluster. > + > +The 1safe2PBE state is entered by the administrative opeation: > + > + immadm -o 1 -a safImmService -p opensafImmNostdFlags:SA_UINT32_T:8 \ > + opensafImm=opensafImm,safApp=safImmService > + > +It is exited either automatically by a rejoined SC or by an explicit > administrative > +opertion: > + > + immadm -o 2 -a safImmService -p opensafImmNostdFlags:SA_UINT32_T:8 \ > + opensafImm=opensafImm,safApp=safImmService > + > +Note the explicit setting of admin-owner-name using "-a safImmService". This > is has > +to be used for these admin-operations because the imm service needs > admin-ownership > +over the object "opensafImm=opensafImm,safApp=safImmService" in order for > 2PBE to > +work properly. > + > +Hence the fourth bit in the opensafImmNostdFlags bitvector of the OpenSAF > service > +object, is used to toggle on/off oneSafe2PBE. Toggling this bit, on older > systems, > +or systems that do not have 2PBE configured, will have no effect. Toggling > this bit > +on (on a 2PBE system) is only accepted by the IMM service when there is only > one SC > +available. > + > +We reccommend that any deployment of OpenSAF that intends to allow usage of > 2PBE, > +invoke the toggling on of this bit in the wrapper function for performing a > planned > +stop of an SC. Note however that the operation will only succeed when the SC > has gone > +down, i.e. thre is only one SC available. Similarly, if there is any alarm > generated > +when an SC has gone down and not come back up quickly enough (a node repair > needed > +alarm), then we suggest that the alarm trigger the invocation of the admin > op to > +toggle this flag on. > + > +For "normal", but unplanned, processor restarts, we recommend that this flag > not be > +toggled on. This means that for such processor restarts, persistent writes > will not be > +allowed untill both SCs are available again. > + > ---------------------------------------- > DEPENDENCIES > ============ ------------------------------------------------------------------------------ Shape the Mobile Experience: Free Subscription Software experts and developers: Be at the forefront of tech innovation. Intel(R) Software Adrenaline delivers strategic insight and game-changing conversations that shape the rapidly evolving mobile landscape. Sign up now. http://pubads.g.doubleclick.net/gampad/clk?id=63431311&iu=/4140/ostg.clktrk _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel