Hi Hans

Yes, they could be declared const member functions, as they generally don't change anything in the object. The changes are actually in the KV store.

But I guess we could potentially mislead callers about the intentions of the functions though.

What do you think?


On 13/04/18 16:16, Hans Nordebäck wrote:

On 04/12/2018 04:15 PM, Gary Lee wrote:

On 12/04/18 23:34, Anders Widell wrote:
Ack with comments:

* There is no need to use "const" when passing function arguments by value. E.g. the argument "const uint64_t cluster_size" should be "uint64_t cluster_size".

[GL] Sure, but it doesn't do any harm, and would stop accidental assignments (that would be lost anyway).
[HansN] perhaps these functions should be const member functions? E.g. SaAisErrorT PromoteThisNode(bool graceful_takeover, uint64_t cluster_size) const;

* You assume that all nodes in the cluster have synchronized clocks (probably using NTP). Would it be possible to use an expiration time for the etcd key instead of writing a time stamp in the value, so that etcd automatically deletes the takeover request when it expires? That way we would not require synchronized clocks.

[GL] Good idea. I did question why I hadn't use TTL/lease once I had finished the ticket. :-) Will see what I can do!

Anders Widell

On 04/11/2018 09:35 AM, Gary Lee wrote:
Summary: split-brain: select active SC from largest network partition V3 [#2795]
Review request for Ticket(s): 2795
Peer Reviewer(s): Anders, Ravi, Hans
Affected branch(es): develop
Development branch: ticket-2795
Base revision: 1c302a300e449e8a8527671fbd6c7f4e2b41e95d
Personal repository: git://git.code.sf.net/u/userid-2226215/review

Impacted area       Impact y/n
  Docs                    n
  Build system            n
  RPM/packaging           n
  Configuration files     n
  Startup scripts         n
  SAF services            n
  OpenSAF services        y
  Core libraries          y
  Samples                 n
  Tests                   n
  Other                   n

Comments (indicate scope for each "y" above):

*** Changes from V2: ***

fmd: made cluster_size atomic
fmd: wait 3 seconds before promoting to active, to allow topology events to be processed first
osaf: add check for existing takeover request, before trying to lock
etcdv3 plugin: reliablity improvements

revision c7bc78656d5de11f6147727bd8612274fb6e438f
Author:    Gary Lee <gary....@dektech.com.au>
Date:    Wed, 11 Apr 2018 17:16:46 +1000

rded: adapt to new Consensus API [#2795]

- add 3 new internal message:


- subscribe to AMFND service up events to keep track of the number
   of cluster members

- listen for takeover requests in KV store

revision 4899e5d0f5abdff8f15eca8ad17d3b13b6a00393
Author:    Gary Lee <gary....@dektech.com.au>
Date:    Wed, 11 Apr 2018 17:16:18 +1000

fmd: adapt to new Consensus API [#2795]

revision 812a315af21df06b2f9fdcc3d8fd5b7bbad3e550
Author:    Gary Lee <gary....@dektech.com.au>
Date:    Wed, 11 Apr 2018 17:15:41 +1000

amfd: adapt to new Consensus API [#2795]

revision b8a37c1b8965826e5faffbfebc44a84bdb6433a1
Author:    Gary Lee <gary....@dektech.com.au>
Date:    Wed, 11 Apr 2018 17:14:39 +1000

osaf: add lock takeover request fuction [#2795]

- add create and set (if previous value matches) functions to KeyValue class - add Consensus::MonitorTakeoverRequest() function for use by RDE to answer takeover requests - add Consensus::CreateTakeoverRequest() - before a SC is promoted to active, it will    create a takeover request in the KV store. An existing SC can reject the lock takeover

revision 955be872ba5887b1b521eac9f7732dd3f6afc593
Author:    Gary Lee <gary....@dektech.com.au>
Date:    Wed, 11 Apr 2018 17:13:45 +1000

osaf: extend API to include a create key and an enhanced set key function [#2795]

- add create_key function (fails if key already exists)
- add setkey_match_prev function (set value if previous value matches)
- add missing quotes
- add etcd3.plugin

Added Files:

Complete diffstat:
  src/amf/amfd/role.cc                     |   2 +-
  src/fm/fmd/fm_cb.h                       |   2 +-
  src/fm/fmd/fm_main.cc                    |  26 +-
  src/fm/fmd/fm_mds.cc                     |   2 +
  src/fm/fmd/fm_rda.cc                     |  27 +-
  src/osaf/consensus/consensus.cc          | 435 ++++++++++++++++++++++++++-----
  src/osaf/consensus/consensus.h           |  55 +++-
  src/osaf/consensus/key_value.cc          | 105 +++++---
  src/osaf/consensus/key_value.h           |  19 +-
  src/osaf/consensus/plugins/etcd.plugin   |  86 +++++-
  src/osaf/consensus/plugins/etcd3.plugin  | 366 ++++++++++++++++++++++++++
  src/osaf/consensus/plugins/sample.plugin |  67 ++++-
  src/rde/rded/rde_cb.h                    |  12 +-
  src/rde/rded/rde_main.cc                 |  75 ++++--
  src/rde/rded/rde_mds.cc                  |  39 ++-
  src/rde/rded/rde_rda.cc                  |   2 +-
  src/rde/rded/role.cc                     |  46 +++-
  src/rde/rded/role.h                      |   2 +-
  18 files changed, 1180 insertions(+), 188 deletions(-)

Testing Commands:
1) SI swap of safSi=SC-2N,safApp=OpenSAF
2) Isolate standby cluster (eg. use iptables to block port 6700 on a TCP system)
3) Isolate active cluster

Testing, Expected Results:
1) No error
2) Standby will fail to be promoted as active as the takeover request is rejected
3) Standby will be promoted

Conditions of Submission:
Ack from any reviewer

Arch      Built     Started    Linux distro
mips        n          n
mips64      n          n
x86         n          n
x86_64      y          y
powerpc     n          n
powerpc64   n          n

Reviewer Checklist:
[Submitters: make sure that your review doesn't trigger any checkmarks!]

Your checkin has not passed review because (see checked entries):

___ Your RR template is generally incomplete; it has too many blank entries
     that need proper data filled in.

___ You have failed to nominate the proper persons for review and push.

___ Your patches do not have proper short+long header

___ You have grammar/spelling in your header that is unacceptable.

___ You have exceeded a sensible line length in your headers/comments/text.

___ You have failed to put in a proper Trac Ticket # into your commits.

___ You have incorrectly put/left internal data in your comments/files
     (i.e. internal bug tracking tool IDs, product names etc)

___ You have not given any evidence of testing beyond basic build tests.
     Demonstrate some level of runtime or other sanity testing.

___ You have ^M present in some of your files. These have to be removed.

___ You have needlessly changed whitespace or added whitespace crimes
     like trailing spaces, or spaces before tabs.

___ You have mixed real technical changes with whitespace and other
     cosmetic code cleanup changes. These have to be separate commits.

___ You need to refactor your submission into logical chunks; there is
     too much content into a single commit.

___ You have extraneous garbage in your review (merge commits etc)

___ You have giant attachments which should never have been sent;
     Instead you should place your content in a public tree to be pulled.

___ You have too many commits attached to an e-mail; resend as threaded
     commits, or place in a public tree for a pull.

___ You have resent this content multiple times without a clear indication
     of what has changed between each re-send.

___ You have failed to adequately and individually address all of the
     comments and change requests that were proposed in the initial review.

___ You have a misconfigured ~/.gitconfig file (i.e. user.name, user.email etc)

___ Your computer have a badly configured date and time; confusing the
     the threaded patch review.

___ Your changes affect IPC mechanism, and you don't present any results
     for in-service upgradability test.

___ Your changes affect user manual and documentation, your patch series
     do not contain the patch that updates the Doxygen manual.

Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Opensaf-devel mailing list

Reply via email to