Just a question to understand more. Why is a feature like 2PBE touching "imma",
the IMM client library?

It doesn't make sense to me.

Thanks,
Hans

> -----Original Message-----
> From: Anders Bjornerstedt [mailto:[email protected]]
> Sent: den 5 november 2013 17:21
> To: [email protected]
> Cc: [email protected]
> Subject: [devel] [PATCH 0 of 6] Review Request for 2PBE - updated patch stack
> 
> Summary: IMM: 2PBE new version of the patch stack [#21]
> Review request for Trac Ticket(s): 21
> Peer Reviewer(s): Neel
> Pull request to:
> Affected branch(es): default (4.4)
> Development branch:
> 
> --------------------------------
> Impacted area       Impact y/n
> --------------------------------
>  Docs                    n
>  Build system            n
>  RPM/packaging           n
>  Configuration files     n
>  Startup scripts         n
>  SAF services            y
>  OpenSAF services        n
>  Core libraries          n
>  Samples                 n
>  Tests                   n
>  Other                   n
> 
> 
> Comments (indicate scope for each "y" above):
> ---------------------------------------------
> The patch stack for the 2PBE enhancement has been updated. This is just a new
> review request for enhancement #21.
> 
> The first patch (loading) has been adjusted to apply cleanly on top of 
> changeset:
> 
> 
>     changeset:   4588:393ca121ca7c
>     user:        Anders Bjornerstedt <[email protected]>
>     date:        Mon Nov 04 18:29:25 2013 +0100
>     summary:     IMM: IMMD file verification made upgrade safe [#596]
> 
> The second and third patches are unchanged. Three additional fix patches
> follow on top of that.
> 
> changeset 0c554fd3174b67eac5c599af6d6d2cc97b126e51
> Author:       Anders Bjornerstedt <[email protected]>
> Date: Mon, 28 Oct 2013 10:25:27 +0100
> 
>       IMM: 2PBE patch-1 (loading) [#21]
> 
>       This patch contains the 2PBE loading mechanism, needed to support 2PBE.
>       The IMMDs will detect 2PBE loading by the IMMSV_2PBE_PEER_SC_MAX_WAIT
>       environment variable being set in the immd.conf file. The active IMMD
>       will order each SC IMMND to execute a "preload", probing the SC local
>       filesystem for the file state that would be loaded to the cluster if
>       that IMMND was chosen as coord. The IMMND sends these stats to the
>       active IMMD.
> 
>       The IMMD will wait for the IMMNDs at *both* SCs to complete this task
>       and then determine which SC has the apparently latest file state. The
>       IMMND at that SC will then be chosen as IMMND coord. Actual loading
>       then proceeds in the same way as for regular 1PBE. The
>       IMMSV_2PBE_PEER_SC_MAX_WAIT is by default 30 seconds. This value should
>       be high enough to make it very unlikely that the active IMMD is forced
>       to choose a loader when only a single SC IMMND has joined. If that
>       happens, then the risk is that the cluster restart will be done *not*
>       using the latest persistent imm state, effectively rewinding the imm
>       state. (Note that the same thing will happen with regular 1PBE based
>       on a shared filesystem (DRBD) if one SC fails to come up in time to
>       join the DRBD sync protocol. The corresponding DRBD timeout is on the
>       order of 20 seconds.)
> 
>       When loading has completed, additional 2PBE functionality will start
>       two PBEs, one at each SC. That functionality is delivered in subsequent
>       patches.
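
As a rough illustration of the loading arbitration described above, here is a minimal Python sketch. It is not taken from the patch: the `PreloadStats` fields (`epoch`, `max_commit_time`) and the function name `choose_coord` are hypothetical, and the real IMMD may compare different file properties.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreloadStats:
    """Hypothetical summary of the file state one SC would load."""
    node: str
    epoch: int            # assumption: a higher epoch means newer state
    max_commit_time: int  # assumption: used only as a tie-breaker


def choose_coord(sc1: Optional[PreloadStats],
                 sc2: Optional[PreloadStats],
                 waited_s: int,
                 max_wait_s: int = 30) -> Optional[str]:
    """Return the node to use as IMMND coord, or None to keep waiting."""
    received = [s for s in (sc1, sc2) if s is not None]
    if len(received) == 2:
        # Both SCs reported: pick the apparently latest file state.
        newest = max(received, key=lambda s: (s.epoch, s.max_commit_time))
        return newest.node
    if len(received) == 1 and waited_s >= max_wait_s:
        # Forced choice after the timeout: this may rewind the imm state.
        return received[0].node
    return None
```

With both sets of stats present the newer state wins regardless of the timer; the timeout only matters when a single SC has reported.
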
> 
> changeset 0f3cc59f1eb8031034ac41485cd75438ec17c4b1
> Author:       Anders Bjornerstedt <[email protected]>
> Date: Fri, 11 Oct 2013 01:17:44 +0200
> 
>       IMM: 2PBE patch-2 (dumping) [#21]
> 
>       This patch contains the 2PBE dumping mechanism, needed to support 2PBE.
>       A PBE process is started by the IMMND at each SC, not just the IMMND
>       coord.
> 
>       The PBE colocated with the IMMND coord, called the primary PBE, is
>       still the coordinator for transaction commits (for CCBs, PRT operations
>       and class-create/deletes). The primary PBE (sometimes called the A-side
>       PBE) works very much in the same way as the regular single PBE does in
>       1PBE. The PBE colocated with the SC resident but non-coord IMMND is
>       called the slave PBE (sometimes called the B-side PBE).
> 
>       With 2PBE, *both* PBEs must be available for the imm to be
>       persistent-writable. If one or both PBEs are unavailable (or
>       unresponsive) then persistent writes (ccbs and PRT operations) will
>       fail.
> 
>       In 2PBE, a restarted PBE will more often need to regenerate the sqlite
>       file. On the other hand, regeneration of the sqlite file should be
>       faster in 2PBE than in regular 1PBE because the file is typically
>       placed on a local file system.
> 
>       A subsequent patch will provide a mechanism for allowing 1-safe-2PBE.
>       This will allow the imm to open up for persistent writes when only one
>       of the two PBEs is available. This will only be allowed when and during
>       the absence of an SC. As soon as the other SC rejoins, the IMM has to
>       re-enter the non-persistent-writable state.
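
The 2-safe rule in this patch can be sketched as a commit that succeeds only when both PBEs acknowledge it. This is a simplified Python model under assumed names (`Pbe`, `commit_persistent`); it says nothing about how the real primary PBE talks to the slave PBE.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pbe:
    """Hypothetical stand-in for one PBE process and its sqlite file."""
    name: str
    available: bool = True
    committed: List[str] = field(default_factory=list)


def commit_persistent(change: str, primary: Pbe, slave: Pbe) -> bool:
    """Apply a persistent write only if *both* PBEs are available."""
    if not (primary.available and slave.available):
        return False  # the write is rejected, nothing is stored
    # Assumption: the primary coordinates and the slave mirrors the commit.
    primary.committed.append(change)
    slave.committed.append(change)
    return True
```
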
> 
> changeset d99a312f527c8fb701149ff840dce1bffe416d75
> Author:       Anders Bjornerstedt <[email protected]>
> Date: Fri, 11 Oct 2013 02:08:51 +0200
> 
>       IMM: 2PBE patch-3 (1safe2pbe) [#21]
> 
>       This patch contains the 2PBE 1safe2PBE mechanism. This mechanism allows
>       an OpenSAF cluster to open up for persistent writes using only one of
>       the two PBEs - temporarily.
> 
>       This is only intended to be used as an emergency action when one SC is
>       long term unavailable, e.g. hardware problems. As soon as the other SC
>       returns, the IMM has to re-enter normal 2-safe 2PBE and reject
>       persistent writes until the slave PBE has synced (regenerated its
>       sqlite file) and rejoined the cluster.
> 
>       The 1safe2PBE state is entered by the administrative operation:
> 
>        immadm -o 1 -p opensafImmNostdFlags:SA_UINT32_T:8 \
>       opensafImm=opensafImm,safApp=safImmService
> 
>       It is exited either automatically by a rejoined SC or by an explicit
>       administrative operation:
> 
>        immadm -o 2 -p opensafImmNostdFlags:SA_UINT32_T:8 \
>       opensafImm=opensafImm,safApp=safImmService
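
The two immadm commands above differ only in the operation id (1 versus 2) and pass the same value 8. A plausible reading, which is an assumption here and not stated in the patch text, is that operation 1 sets the given flag bits and operation 2 clears them. The Python sketch below models that reading together with the writability rule; all function names are hypothetical.

```python
ONE_SAFE_2PBE = 8  # the value passed as opensafImmNostdFlags above


def apply_admin_op(flags: int, op_id: int, mask: int) -> int:
    """Assumed semantics: op 1 sets the bits in mask, op 2 clears them."""
    if op_id == 1:
        return flags | mask
    if op_id == 2:
        return flags & ~mask
    raise ValueError("unsupported operation id: %d" % op_id)


def on_peer_sc_rejoined(flags: int) -> int:
    """A rejoining SC ends the 1-safe state automatically."""
    return flags & ~ONE_SAFE_2PBE


def persistent_writable(flags: int, primary_ok: bool, slave_ok: bool,
                        peer_sc_present: bool) -> bool:
    """2-safe by default; 1-safe only while flagged and the peer SC is away."""
    if primary_ok and slave_ok:
        return True
    one_safe = bool(flags & ONE_SAFE_2PBE)
    return one_safe and not peer_sc_present and (primary_ok or slave_ok)
```

In this model a cluster with one SC down stays read-only for persistent data until the operator sets the flag, and drops back to rejecting persistent writes as soon as the peer SC returns and until its PBE is available again.
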
> 
> changeset d4b720966e5e0769130163a21c5e9642d9e14864
> Author:       Anders Bjornerstedt <[email protected]>
> Date: Tue, 05 Nov 2013 16:41:05 +0100
> 
>       IMM: 2PBE patch-4 (Fix for si-swap problem) [#21]
> 
>       With the 2PBE patches applied and 2PBE configured, if si_swap:
> 
>        immadm -o 7 safSi=SC-2N,safApp=OpenSAF
> 
>       is attempted twice, then the second attempt will cause the new active
>       SC to reboot. This was caused by the cb->is_loading variable being
>       initialized to true at both the active and the standby SC, when it
>       should only have been set to true at the active. When it is also true
>       at the standby, it is not set to false when loading completes. When the
>       standby becomes active, it will not mbcp fevs messages to the new
>       standby, even though loading was done a long time ago. The next si swap
>       causes the new new active to start sending fevs messages way below the
>       expected fevs number. This causes the new new standby to crash.
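
The failure mode can be modelled in a few lines of Python. This is a simplified sketch, not the IMMD code: the class `ImmdState`, its `fixed` switch and the method names are hypothetical, and only the role of the `is_loading` flag is taken from the description above.

```python
class ImmdState:
    """Toy model of the IMMD control block flag discussed above."""

    def __init__(self, is_active: bool, fixed: bool = True):
        self.is_active = is_active
        # Before the fix the flag started as True on both SCs.
        self.is_loading = is_active if fixed else True

    def loading_completed(self) -> None:
        # Per the description, only the active clears the flag.
        if self.is_active:
            self.is_loading = False

    def become_active(self) -> None:
        self.is_active = True

    def checkpoints_fevs(self) -> bool:
        """An active IMMD checkpoints fevs only when loading is over."""
        return self.is_active and not self.is_loading


def standby_after_swap(fixed: bool) -> bool:
    standby = ImmdState(is_active=False, fixed=fixed)
    standby.loading_completed()   # has no effect on a standby
    standby.become_active()       # first si-swap
    return standby.checkpoints_fevs()
```

In this model `standby_after_swap(fixed=False)` yields an active IMMD that never checkpoints fevs, which matches the stale fevs number seen after the second swap.
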
> 
> changeset ec4d0f6e1b90ab7dae6d2ab923aafe03f90c3f71
> Author:       Anders Bjornerstedt <[email protected]>
> Date: Tue, 05 Nov 2013 16:52:49 +0100
> 
>       IMM: 2PBE patch-5 (Fix loading retry problem) [#21]
> 
>       If, after loading arbitration is done, loading is started but fails
>       (for example due to a corrupt sqlite file), then the IMMNDs are
>       restarted but not the IMMDs. The already collected loading arbitration
>       info in the active IMMD is not cleared, and in the next loading attempt
>       the loading arbitration will only wait for the stats from one of the
>       SCs, then find it already has stats from both SCs, but actually one of
>       them will be stats from the previous load (!). This can result in
>       incorrect arbitration. The file for which loading fails will have been
>       moved to imm.db.xxxxx.failed and thus not used for arbitration. The
>       fallback file is typically much older, but that may be masked by the
>       old preload stats for that SC. The preload stats need to be cleared
>       here when and if loading is restarted.
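
A small Python sketch of why the stale stats matter. The class and method names are hypothetical and the single `epoch` number stands in for whatever the real preload stats contain.

```python
class LoadingArbitration:
    """Toy model of the preload stats kept by the active IMMD."""

    def __init__(self, expected_nodes=("SC-1", "SC-2")):
        self.expected = set(expected_nodes)
        self.preload_stats = {}  # node name -> epoch of its file state

    def record(self, node: str, epoch: int) -> None:
        self.preload_stats[node] = epoch

    def ready(self) -> bool:
        """True once stats from every expected SC are present."""
        return self.expected.issubset(self.preload_stats)

    def pick_coord(self) -> str:
        # sorted() makes the choice deterministic when epochs are equal
        return max(sorted(self.preload_stats), key=self.preload_stats.get)

    def restart_loading(self) -> None:
        """The fix: forget all stats when loading is restarted."""
        self.preload_stats.clear()
```

Without the call to `restart_loading()`, a single fresh report would make `ready()` return true while the other entry still describes the file that has since been moved aside.
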
> 
> changeset 03064f5a3812a865e1c59abf59e1b51ace9d397e
> Author:       Anders Bjornerstedt <[email protected]>
> Date: Tue, 05 Nov 2013 17:00:20 +0100
> 
>       IMM: 2PBE patch-6 (PBE slave can re-attach when empty CCBs exist) [#21]
> 
>       When a 2PBE system is up and running with ccbs being generated, if one
>       SC is rebooted, then after the SC has synced imm-ram, the slave pbe
>       typically has trouble being allowed to generate its imm.db.xxxxx file.
>       It keeps getting rejected due to active ccbs. There really should not
>       be any active ccbs allowed here, because the sync of the returned SC
>       would only have started when there are no active ccbs, and once sync is
>       finished the imm should still not be persistent writable. The only
>       problem here is that empty CCBs are allowed to be created. Thus the
>       condition for allowing the slave to generate its imm.db.xxxxx file
>       needs to be relaxed to allow empty CCBs.
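
The relaxed condition amounts to "no active CCB contains any operation". A minimal Python sketch, with a hypothetical `Ccb` type that is not the ImmModel data structure:

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class Ccb:
    """Hypothetical active CCB with the operations added to it so far."""
    ccb_id: int
    operations: List[str] = field(default_factory=list)

    def is_empty(self) -> bool:
        return not self.operations


def slave_may_generate_file(active_ccbs: Iterable[Ccb]) -> bool:
    """Old rule: no active CCBs at all. Relaxed rule: only empty ones."""
    return all(ccb.is_empty() for ccb in active_ccbs)
```
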
> 
> 
> Complete diffstat:
> ------------------
>  osaf/libs/agents/saf/imma/imma_db.c              |     4 +-
>  osaf/libs/agents/saf/imma/imma_oi_api.c          |     4 +-
>  osaf/libs/agents/saf/imma/imma_proc.c            |    34 +-
>  osaf/libs/common/immsv/immpbe_dump.cc            |   148 +++-
>  osaf/libs/common/immsv/immsv_evt.c               |    55 +-
>  osaf/libs/common/immsv/include/immpbe_dump.hh    |    10 +-
>  osaf/libs/common/immsv/include/immsv_api.h       |    21 +-
>  osaf/libs/common/immsv/include/immsv_evt.h       |    18 +-
>  osaf/libs/common/immsv/include/immsv_evt_model.h |     4 +
>  osaf/services/saf/immsv/immd/immd_amf.c          |     5 +-
>  osaf/services/saf/immsv/immd/immd_cb.h           |     8 +-
>  osaf/services/saf/immsv/immd/immd_db.c           |     5 +
>  osaf/services/saf/immsv/immd/immd_evt.c          |    82 ++-
>  osaf/services/saf/immsv/immd/immd_main.c         |    51 +-
>  osaf/services/saf/immsv/immd/immd_mbcsv.c        |     2 +-
>  osaf/services/saf/immsv/immd/immd_proc.c         |   241 ++++++-
>  osaf/services/saf/immsv/immd/immd_proc.h         |     3 +-
>  osaf/services/saf/immsv/immd/immd_sbevt.c        |    47 +-
>  osaf/services/saf/immsv/immloadd/imm_loader.cc   |   323 +++++++-
>  osaf/services/saf/immsv/immloadd/imm_loader.hh   |     8 +-
>  osaf/services/saf/immsv/immloadd/imm_pbe_load.cc |   224 +++++-
>  osaf/services/saf/immsv/immnd/ImmModel.cc        |   583 ++++++++++++---
>  osaf/services/saf/immsv/immnd/ImmModel.hh        |    26 +-
>  osaf/services/saf/immsv/immnd/ImmSearchOp.cc     |     5 -
>  osaf/services/saf/immsv/immnd/immnd_cb.h         |    10 +-
>  osaf/services/saf/immsv/immnd/immnd_evt.c        |   582 ++++++++++++++--
>  osaf/services/saf/immsv/immnd/immnd_init.h       |    19 +-
>  osaf/services/saf/immsv/immnd/immnd_main.c       |     3 +-
>  osaf/services/saf/immsv/immnd/immnd_proc.c       |   319 +++++++-
>  osaf/services/saf/immsv/immpbed/immpbe.cc        |    77 +-
>  osaf/services/saf/immsv/immpbed/immpbe.hh        |     4 +
>  osaf/services/saf/immsv/immpbed/immpbe_daemon.cc |  2021 ++++++++++++++++++++++++++++++++++++++++++--------------
>  32 files changed, 3969 insertions(+), 977 deletions(-)
> 
> 
> Testing Commands:
> -----------------
> 2PBE is enabled by uncommenting the environment variable in immd.conf:
> 
>      export IMMSV_2PBE_PEER_SC_MAX_WAIT=30
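
Since 2PBE is detected by this variable being set, a reader of the configuration has to treat "unset" and "set to a number" as two different cases. A minimal Python sketch of that distinction; the helper name `peer_sc_max_wait` is hypothetical and this is not how the IMMD itself parses its environment.

```python
import os
from typing import Mapping, Optional


def peer_sc_max_wait(environ: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the wait in seconds, or None when 2PBE is not enabled."""
    raw = environ.get("IMMSV_2PBE_PEER_SC_MAX_WAIT")
    if raw is None:
        return None  # variable left commented out: regular 1PBE
    return int(raw)
```
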
> 
> 
> 
> Testing, Expected Results:
> --------------------------
> 2PBE should work, incrementally dumping all persistent data changes to
> sqlite files at both SC-1 and SC-2.
> 
> 
> Conditions of Submission:
> -------------------------
> Ack from Neel
> 
> 
> Arch      Built     Started    Linux distro
> -------------------------------------------
> mips        n          n
> mips64      n          n
> x86         n          n
> x86_64      n          n
> powerpc     n          n
> powerpc64   n          n
> 
> 
> Reviewer Checklist:
> -------------------
> [Submitters: make sure that your review doesn't trigger any checkmarks!]
> 
> 
> Your checkin has not passed review because (see checked entries):
> 
> ___ Your RR template is generally incomplete; it has too many blank entries
>     that need proper data filled in.
> 
> ___ You have failed to nominate the proper persons for review and push.
> 
> ___ Your patches do not have proper short+long header
> 
> ___ You have grammar/spelling in your header that is unacceptable.
> 
> ___ You have exceeded a sensible line length in your headers/comments/text.
> 
> ___ You have failed to put in a proper Trac Ticket # into your commits.
> 
> ___ You have incorrectly put/left internal data in your comments/files
>     (i.e. internal bug tracking tool IDs, product names etc)
> 
> ___ You have not given any evidence of testing beyond basic build tests.
>     Demonstrate some level of runtime or other sanity testing.
> 
> ___ You have ^M present in some of your files. These have to be removed.
> 
> ___ You have needlessly changed whitespace or added whitespace crimes
>     like trailing spaces, or spaces before tabs.
> 
> ___ You have mixed real technical changes with whitespace and other
>     cosmetic code cleanup changes. These have to be separate commits.
> 
> ___ You need to refactor your submission into logical chunks; there is
>     too much content into a single commit.
> 
> ___ You have extraneous garbage in your review (merge commits etc)
> 
> ___ You have giant attachments which should never have been sent;
>     Instead you should place your content in a public tree to be pulled.
> 
> ___ You have too many commits attached to an e-mail; resend as threaded
>     commits, or place in a public tree for a pull.
> 
> ___ You have resent this content multiple times without a clear indication
>     of what has changed between each re-send.
> 
> ___ You have failed to adequately and individually address all of the
>     comments and change requests that were proposed in the initial review.
> 
> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
> 
> ___ Your computer has a badly configured date and time, confusing
>     the threaded patch review.
> 
> ___ Your changes affect IPC mechanism, and you don't present any results
>     for in-service upgradability test.
> 
> ___ Your changes affect user manual and documentation, your patch series
>     do not contain the patch that updates the Doxygen manual.
> 
> 
> ------------------------------------------------------------------------------
> November Webinars for C, C++, Fortran Developers
> Accelerate application performance with scalable programming models. Explore
> techniques for threading, error checking, porting, and tuning. Get the most
> from the latest Intel processors and coprocessors. See abstracts and register
> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk
> _______________________________________________
> Opensaf-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
