Hi AndersBj,

Reviewed the first version of 2PBE patch. following are the comments:

1. while applying the patch is failed with one rejection:

1 out of 13 hunks FAILED -- saving rejects to file 
osaf/services/saf/immsv/immnd/immnd_evt.c.rej


The #355 LOG_NO anf the if condition is changed in the #355. The 
following needs to be removed from the patch:

@@ -3301,7 +3304,7 @@ static SaAisErrorT immnd_fevs_local_chec
         immnd_evt_destroy(&frwrd_evt, SA_FALSE, __LINE__);

         if ((error != SA_AIS_OK) && (error != SA_AIS_ERR_NO_BINDINGS)) {
-               LOG_NO("Precheck of fevs message failed ERROR:%u", error);
+               LOG_NO("Precheck of fevs message type %u failed 
ERROR:%u", frwrd_evt.info.immnd.type, error);
         }

         TRACE_LEAVE();
2.   Typo,  ctgrl would have be ctrl:

+                  node accpet message for each SC. Subsequent may have 
ctgrl->canBeCoord

3. General comments:

     a. with this patch both controllers must have any one of the files 
like imm.db/immdb.nodeID/imm.xml for 2PBE configuration.

     b. If all the parameters are same why is the remote IMMND is 
choosen as co-ordinator?

4. SC-1 has imm.db and SC-2 has imm.xml. Both the nodes 2PBE is configured.

  case 1:
    a. SC-1(slot-3) is LOADING_SERVER and SC-2(slot-4) is LOADING_CLIENT

    b. SC-2 went for preload state.

    c. SC-2 after sending parameters to IMMD called AdminownerFinalize. 
Adminownerfinalize got timeout

    Jun 13 17:51:54 Slot-4 osafimmloadd: WA Failed on 
saImmOmAdminOwnerFinalize (5)

    Jun 13 17:51:52 Slot-3 osafimmnd[29960]: NO ERR_BAD_HANDLE: Admin 
owner 2 does not exist

    d. The adminownerFinalize event is discarded because the SC-2 IMMND 
is not in maccepted state.


    Jun 13 17:51:44.419650 osafimmnd [29784:immnd_evt.c:8029] T2 
DISCARDING msg no:6
    Jun 13 17:51:44.419670 osafimmnd [29784:immnd_evt.c:8035] TR ABT 
cb->highestProcessed:5
    Jun 13 17:51:44.419690 osafimmnd [29784:immnd_evt.c:8037] TR ABT 
cb->highestProcessed INCREMENTED TO:6

case 2:

    If SC-1 is started first and SC-2 is started next

     a. starting of SC-1 immediatly crestes adminowner at that time sc-2 
is not there.

     b. when AdminOwnerfinalize is called for the preload in SC-1, at 
the SC-2 will give error as bad_handle.

     Jun 13 18:40:44 Slot-4 osafimmd[1655]: NO IMMND coord at 2010f
     Jun 13 18:40:44 Slot-4 osafimmnd[1671]: NO ERR_BAD_HANDLE: Admin 
owner 1 does not exist


    e. ERR_BAD_HANDLE is because, in the when the node comes up, 
immediatly admin

5. SC-1 has imm.db and SC-2 has imm.db with 2PBE configured.

    In this case also the message is getting discarded, but the print is 
not there to syslog because in this case the SC-2 reads from imm.db.

6. In the 2PBE case the checking of immnd cofiguration file must be 
bypassed. Having different imm.xml, imm.db and directory names are valid 
in 2PBE.


   /Neel.


On Thursday 30 May 2013 06:26 PM, Anders Bjornerstedt wrote:
> Summary: IMM: Prototype for 2PBE loading (test patch) [#21]
> Review request for Trac Ticket(s): 21
> Peer Reviewer(s): Neel, Zoran
> Pull request to:
> Affected branch(es): none
> Development branch:
>
> --------------------------------
> Impacted area       Impact y/n
> --------------------------------
>   Docs                    n
>   Build system            n
>   RPM/packaging           n
>   Configuration files     n
>   Startup scripts         n
>   SAF services            y
>   OpenSAF services        n
>   Core libraries          n
>   Samples                 n
>   Tests                   n
>   Other                   n
>
>
> Comments (indicate scope for each "y" above):
> ---------------------------------------------
>
> changeset 617845e0e935da9e5968e4b85673902aedd06642
> Author:       Anders Bjornerstedt <[email protected]>
> Date: Thu, 30 May 2013 14:08:10 +0200
>
>       IMM: Prototype for 2PBE loading (test patch) [#21]
>
>       This is a testpatch containing a first version of the 2PBE loading
>       mechanism. This patch goes on top of the patch for [#17] the file
>       configuration consistency checking. See separate review request mail for
>       instructions. This patch will not be pushed. The intent with this patch 
> is
>       to allow testing and obtaining feedback on the 2PBE loading.
>
> This is the "separate review request mail".
> To test 2PBE loading do as folows:
>
> 1) Ensure that the patch for [#17] and this patch apply successfully on the 
> default branch.
> 2) Ensure that opensaf builds with configure for --enable-imm-pbe.
> 3) Verify that opensaf starts and loads without pbe configured in immnd.conf.
> 4) Configure immnd.conf for regular PBE.
> 5) Verify that opensaf starts and works normally for regular PBE. Stop 
> opensaf.
> 6) Configure immd.conf (not immnd.conf) for 2PBE by commenting in the
>   'export IMMSV_PEER_SC_MAX_WAIT=30' environment variable.
> 7) Start opensaf.
>
> You should now see loading arbitration being done by the active IMMD.
> This consists of ordering a "preload" at both SC1 and SC2 concurrently.
> That preload will probe the filesystem; determine which file to be loaded
> from locally; scan that file (xml or sqlite) for the parameters used in
> loading arbitration and fially send the paramters to the IMMD via an
> admin-operation.
> When the IMMD has received the loading arbitration parameters from both
> SCs, it will compare them and decide which SC to perform the loading.
> The IMMD then elects the IMMND coord to be at the SC that "won" the
> arbitration contest. Loading is then done in the normal way.
>
> After loading has completed, a PBE will be started only at the coord
> IMMND. Thus this patch does not provide the logic for full 2PBE and
> in particular not the 2 part of the PBE dumping.
> So after loading, the system works almost as normal 1PBE, with one
> difference. The file name used by the immpbedd process is not just
> the  regular "imm.db" file. The file name used is the configued
> file name (imm.db) with the processor id appended as a suffix.
> Thus from SC1 you will see 'imm.db.2010f' and from SC2 you will see 
> 'imm.db.2020f'.
> Of course if you run this on a shared file system then you may see both
> files. One reason for appending the processor id to the pbe file is to allow
> testing of 2PBE on systems currently prepared only for 1PBE. Thus 2PBE
> should work also when the PBE files are placed on a shared file system.
> It would of course be silly to have such a configuration in production.
> But for testing it may be convenient.
>
> One danger has to be pointed out with testing of 2PBE loading when not
> having the 2PBE dumping in place, is that you have to be very carefull
> with failover/switchover of SC or failover of immnd-coord.
> When the immnd-coord moves from one SC ot the other in 1PBE, it tries to
> avoid regenerating the imm.db file and instead re-attaches to the pre
> existing imm.db file. With this incomplete 2PBE setup however, the PBE
> will indeed failover, but if it re-attaches it will re-attach to the
> imm.db.suffix for that matches the SC it is executing on.
> The imm.db.suffix file it re-attaches to may be completely out of sync
> since any CCBS or other persistent writes done prior to the failover would
> only have reached the other imm.db.suffix file.
> So *if* you are going to perform any test that implies a failover of immnd
> coord, then you must remover or rename the imm.db.suffix file at SC where
> the new immnd coord (and immpbedd) will start. This will force the immpbedd
> to regenerate the file.
>
> When testing loading you may also experiment with updating the epoch or
> ccb-id in the sqlite file or the xml file.
>
> The file chosen for pre-loading and loading is in order of precedence:
> 1) imm.db.<suffix>
> 2) imm.db
> 3) imm.xml
>
> If it cant find opne file or fails to open it, it tries the next.
> But if it succeeds in opening it and preloading starts but fails,
> then there is no attempt to look for the next file. Instead such
> a failure is regarded as a corrupt file setup.
>
> The loading arbitration done at the IMMD is based on:
> 1) epoch
> 2) regular  ccb-id.
> 3) weak-ccb-id (a fake ccb-id for PRT updates and class updates).
> 4) regular commit-time.
> 5) weak commit-time.
>
> The first level with a difference will decide.
> If there is no difference (would be the normal case in a well functioning 
> system),
> then it arbitrarily chooses the coord at the standby SC (offloads the active 
> SC
> a tiny bit).
>
> Please do not hesitate to ask questions.
> The loading paramters may be extended a bit later.
>
>
> Complete diffstat:
> ------------------
>   osaf/libs/agents/saf/imma/imma_proc.c            |   33 ++++++-
>   osaf/libs/common/immsv/immsv_evt.c               |   55 ++++++++++++-
>   osaf/libs/common/immsv/include/immsv_api.h       |    2 +-
>   osaf/libs/common/immsv/include/immsv_evt.h       |   13 +++-
>   osaf/services/saf/immsv/immd/immd_amf.c          |    5 +-
>   osaf/services/saf/immsv/immd/immd_cb.h           |    6 +-
>   osaf/services/saf/immsv/immd/immd_db.c           |    2 +
>   osaf/services/saf/immsv/immd/immd_evt.c          |   82 +++++++++++++++++++-
>   osaf/services/saf/immsv/immd/immd_main.c         |   40 ++++++++++-
>   osaf/services/saf/immsv/immd/immd_proc.c         |  229 
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>   osaf/services/saf/immsv/immd/immd_proc.h         |    3 +-
>   osaf/services/saf/immsv/immd/immd_sbevt.c        |   23 +++++-
>   osaf/services/saf/immsv/immloadd/imm_loader.cc   |  260 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++------------
>   osaf/services/saf/immsv/immloadd/imm_loader.hh   |    6 +-
>   osaf/services/saf/immsv/immloadd/imm_pbe_load.cc |  190 
> ++++++++++++++++++++++++++++++++++++++++++++++--
>   osaf/services/saf/immsv/immnd/ImmModel.cc        |   53 ++++++++++++-
>   osaf/services/saf/immsv/immnd/immnd_cb.h         |    2 +
>   osaf/services/saf/immsv/immnd/immnd_evt.c        |  121 
> ++++++++++++++++++++++++++++++-
>   osaf/services/saf/immsv/immnd/immnd_main.c       |    2 +-
>   osaf/services/saf/immsv/immnd/immnd_proc.c       |   91 
> +++++++++++++++++++---
>   osaf/services/saf/immsv/immpbed/immpbe_daemon.cc |   19 +++-
>   21 files changed, 1116 insertions(+), 121 deletions(-)
>
>
> Testing Commands:
> -----------------
>   <<LIST THE COMMAND LINE TOOLS/STEPS TO TEST YOUR CHANGES>>
>
>
> Testing, Expected Results:
> --------------------------
>   <<PASTE COMMAND OUTPUTS / TEST RESULTS>>
>
>
> Conditions of Submission:
> -------------------------
>   <<HOW MANY DAYS BEFORE PUSHING, CONSENSUS ETC>>
>
>
> Arch      Built     Started    Linux distro
> -------------------------------------------
> mips        n          n
> mips64      n          n
> x86         n          n
> x86_64      n          n
> powerpc     n          n
> powerpc64   n          n
>
>
> Reviewer Checklist:
> -------------------
> [Submitters: make sure that your review doesn't trigger any checkmarks!]
>
>
> Your checkin has not passed review because (see checked entries):
>
> ___ Your RR template is generally incomplete; it has too many blank entries
>      that need proper data filled in.
>
> ___ You have failed to nominate the proper persons for review and push.
>
> ___ Your patches do not have proper short+long header
>
> ___ You have grammar/spelling in your header that is unacceptable.
>
> ___ You have exceeded a sensible line length in your headers/comments/text.
>
> ___ You have failed to put in a proper Trac Ticket # into your commits.
>
> ___ You have incorrectly put/left internal data in your comments/files
>      (i.e. internal bug tracking tool IDs, product names etc)
>
> ___ You have not given any evidence of testing beyond basic build tests.
>      Demonstrate some level of runtime or other sanity testing.
>
> ___ You have ^M present in some of your files. These have to be removed.
>
> ___ You have needlessly changed whitespace or added whitespace crimes
>      like trailing spaces, or spaces before tabs.
>
> ___ You have mixed real technical changes with whitespace and other
>      cosmetic code cleanup changes. These have to be separate commits.
>
> ___ You need to refactor your submission into logical chunks; there is
>      too much content into a single commit.
>
> ___ You have extraneous garbage in your review (merge commits etc)
>
> ___ You have giant attachments which should never have been sent;
>      Instead you should place your content in a public tree to be pulled.
>
> ___ You have too many commits attached to an e-mail; resend as threaded
>      commits, or place in a public tree for a pull.
>
> ___ You have resent this content multiple times without a clear indication
>      of what has changed between each re-send.
>
> ___ You have failed to adequately and individually address all of the
>      comments and change requests that were proposed in the initial review.
>
> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
>
> ___ Your computer have a badly configured date and time; confusing the
>      the threaded patch review.
>
> ___ Your changes affect IPC mechanism, and you don't present any results
>      for in-service upgradability test.
>
> ___ Your changes affect user manual and documentation, your patch series
>      do not contain the patch that updates the Doxygen manual.
>


------------------------------------------------------------------------------
This SF.net email is sponsored by Windows:

Build for Windows Store.

http://p.sf.net/sfu/windows-dev2dev
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to