Re: [devel] [PATCH 0/4] Review Request for amfd: improve controller failover behavior V2 [#3029]
Hi Gary ACK (not tested) Regards Canh -Original Message- From: Gary Lee Sent: Tuesday, July 9, 2019 1:21 PM To: canh.v.tru...@dektech.com.au; minh.c...@dektech.com.au; hans.nordeb...@ericsson.com Cc: opensaf-devel@lists.sourceforge.net; Gary Lee Subject: [PATCH 0/4] Review Request for amfd: improve controller failover behavior V2 [#3029] Summary: amfd: improve controller failover behavior [#3029] Review request for Ticket(s): 3029 Peer Reviewer(s): Canh, Minh, Hans Pull request to: *** LIST THE PERSON WITH PUSH ACCESS HERE *** Affected branch(es): develop Development branch: ticket-3029 Base revision: 71852f322b42437f074bfa4c618c021798357143 Personal repository: git://git.code.sf.net/u/userid-2226215/review Impacted area Impact y/n Docsn Build systemn RPM/packaging n Configuration files n Startup scripts n SAF servicesy OpenSAF servicesy Core libraries y Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 4feee2b631afa3393ae9e53fd6575c3768861dca Author: Gary Lee Date: Tue, 9 Jul 2019 14:38:49 +1000 osaf: make wait time configurable [#3029] If FMS_TAKEOVER_PRIORITISE_PARTITION_SIZE is enabled, make the time that we wait for MDS node events configurable. revision 2c419ba5fffb85272f0d15118b561bcfc1de4814 Author: Gary Lee Date: Tue, 9 Jul 2019 14:38:49 +1000 amfd: improve controller failover behavior [#3029] If consensus service is enabled, only perform node failover after peer controller has self-fenced (after 2 * FMS_TAKEOVER_REQUEST_VALID_TIME seconds). This also means if node failover delay is set to a large value, we do not unnecesarily wait too long before failing over assignments previously assigned to the peer controller. Remove unused fmd_conf_file variable. Change some LOG_ER calls to LOG_WA. revision 7c4fff483477082ca66a26f921a50b3bc1240538 Author: Gary Lee Date: Tue, 9 Jul 2019 14:38:49 +1000 fmd: add active promotion supervision timer [#3029] Add supervision timer so controller will reboot if it cannot obtain consensus lock within the allocation period (2* FMS_TAKEOVER_REQUEST_VALID_TIME). The peer controller can then safely perform a node failover after this period of time. revision 8b596a228402ff99b26906138daf920c23e965e7 Author: Gary Lee Date: Tue, 9 Jul 2019 14:38:49 +1000 osaf: add function to return takeover request expiry time [#3029] Complete diffstat: -- src/amf/amfd/cb.h | 1 - src/amf/amfd/clm.cc| 4 +- src/amf/amfd/main.cc | 1 - src/amf/amfd/ndfsm.cc | 8 ++-- src/amf/amfd/ndproc.cc | 19 src/amf/amfd/node_state.cc | 23 +- src/amf/amfd/node_state_machine.cc | 19 src/amf/amfd/node_state_machine.h | 2 + src/amf/amfd/proc.h| 1 + src/fm/fmd/fm_cb.h | 2 + src/fm/fmd/fm_main.cc | 14 +- src/fm/fmd/fm_rda.cc | 89 ++ src/fm/fmd/fmd.conf| 5 +++ src/osaf/consensus/consensus.cc| 13 ++ src/osaf/consensus/consensus.h | 4 ++ src/rde/rded/role.cc | 4 +- 16 files changed, 160 insertions(+), 49 deletions(-) Testing Commands: - 1) Ensure a 2N application is active on standby controller, and standy on the active controller 2) Isolate active & standby controller Testing, Expected Results: -- amfd should failover 2N application only after 2 * FMS_TAKEOVER_REQUEST_VALID_TIME seconds Conditions of Submission: - ack from any reviewer Arch Built StartedLinux distro --- mipsn n mips64 n n x86 n n x86_64 y y powerpc n n powerpc64 n n Reviewer Checklist: --- [Submitters: make sure that your review doesn't trigger any checkmarks!] Your checkin has not passed review because (see checked entries): ___ Your RR template is generally incomplete; it has too many blank entries that need proper data filled in. ___ You have failed to nominate the proper persons for review and push. ___ Your patches do not have proper short+long header ___ You have grammar/spelling in your header that is unacceptable. ___ You have exceeded a sensible line length in your headers/comments/text. ___ You have failed to put in a proper Trac Ticket # into your commits. ___ You have incorrectly put/left internal data in your comments/files (i.e. internal bug tracking tool IDs, product names etc) ___ You have not given any evidence of testing beyond basic build tests.
[devel] [PATCH 2/4] fmd: add active promotion supervision timer [#3029]
Add supervision timer so controller will reboot if it cannot obtain consensus lock within the allocation period (2* FMS_TAKEOVER_REQUEST_VALID_TIME). The peer controller can then safely perform a node failover after this period of time. --- src/fm/fmd/fm_cb.h| 2 ++ src/fm/fmd/fm_main.cc | 14 - src/fm/fmd/fm_rda.cc | 87 ++- 3 files changed, 74 insertions(+), 29 deletions(-) diff --git a/src/fm/fmd/fm_cb.h b/src/fm/fmd/fm_cb.h index 6eb0d54..b5ea5ae 100644 --- a/src/fm/fmd/fm_cb.h +++ b/src/fm/fmd/fm_cb.h @@ -39,6 +39,7 @@ typedef enum { FM_TMR_TYPE_MIN, FM_TMR_PROMOTE_ACTIVE, FM_TMR_ACTIVATION_SUPERVISION, + FM_TMR_CONSENSUS_SERVICE_SUPERVISION, FM_TMR_TYPE_MAX } FM_TMR_TYPE; @@ -83,6 +84,7 @@ struct FM_CB { /* Timers */ FM_TMR promote_active_tmr{}; FM_TMR activation_supervision_tmr{}; + FM_TMR consensus_service_supervision_tmr{}; /* Time in terms of one hundredth of seconds (500 for 5 secs.) */ uint32_t active_promote_tmr_val{}; diff --git a/src/fm/fmd/fm_main.cc b/src/fm/fmd/fm_main.cc index 2eb3c16..4a843cc 100644 --- a/src/fm/fmd/fm_main.cc +++ b/src/fm/fmd/fm_main.cc @@ -59,7 +59,8 @@ static uint32_t fm_get_args(FM_CB *); static uint32_t fms_fms_exchange_node_info(FM_CB *); static uint32_t fms_fms_inform_terminating(FM_CB *fm_cb); static uint32_t fm_nid_notify(uint32_t); -static uint32_t fm_tmr_start(FM_TMR *, SaTimeT); +uint32_t fm_tmr_start(FM_TMR *, SaTimeT); +void fm_tmr_stop(FM_TMR *tmr); static SaAisErrorT get_peer_clm_node_name(NODE_ID); static SaAisErrorT fm_clm_init(); static void fm_mbx_msg_handler(FM_CB *, FM_EVT *); @@ -449,6 +450,8 @@ static uint32_t fm_get_args(FM_CB *fm_cb) { /* Set timer variables */ fm_cb->promote_active_tmr.type = FM_TMR_PROMOTE_ACTIVE; fm_cb->activation_supervision_tmr.type = FM_TMR_ACTIVATION_SUPERVISION; + fm_cb->consensus_service_supervision_tmr.type = +FM_TMR_CONSENSUS_SERVICE_SUPERVISION; char *node_isolation_timeout = getenv("FMS_NODE_ISOLATION_TIMEOUT"); if (node_isolation_timeout != NULL) { @@ -704,6 +707,11 @@ static void fm_mbx_msg_handler(FM_CB *fm_cb, FM_EVT *fm_mbx_evt) { "Activation timer supervision " "expired: no ACTIVE assignment received " "within the time limit"); + } else if (fm_mbx_evt->info.fm_tmr->type == + FM_TMR_CONSENSUS_SERVICE_SUPERVISION) { +opensaf_quick_reboot("Consensus service supervision " + "expired: controller was not promoted " + "within the time limit"); } break; @@ -728,6 +736,10 @@ static void fm_evt_proc_rda_callback(FM_CB *cb, FM_EVT *evt) { uint32_t rc = NCSCC_RC_SUCCESS; TRACE_ENTER2("%d", (int)evt->info.rda_info.role); + if (evt->info.rda_info.role == PCS_RDA_ACTIVE) { +LOG_NO("Controller promoted. Stop supervision timer"); +fm_tmr_stop(_cb->consensus_service_supervision_tmr); + } if (evt->info.rda_info.role != PCS_RDA_ACTIVE && cb->activation_supervision_tmr.status == FM_TMR_RUNNING) { fm_tmr_stop(>activation_supervision_tmr); diff --git a/src/fm/fmd/fm_rda.cc b/src/fm/fmd/fm_rda.cc index d3063ba..c072cb0 100644 --- a/src/fm/fmd/fm_rda.cc +++ b/src/fm/fmd/fm_rda.cc @@ -23,6 +23,8 @@ #include "osaf/consensus/consensus.h" #include "rde/agent/rda_papi.h" +extern uint32_t fm_tmr_start(FM_TMR *tmr, SaTimeT period); +extern void fm_tmr_stop(FM_TMR *tmr); extern void rda_cb(uint32_t cb_hdl, PCS_RDA_CB_INFO *cb_info, PCSRDA_RETURN_CODE error_code); / @@ -64,6 +66,47 @@ done: return rc; } +void promote_node(FM_CB *fm_cb) { + TRACE_ENTER(); + + Consensus consensus_service; + if (consensus_service.PrioritisePartitionSize() == true) { +// Allow topology events to be processed first. The MDS thread may +// be processing MDS down events and updating cluster_size concurrently. +// We need cluster_size to be as accurate as possible, without waiting +// too long for node down events. +std::this_thread::sleep_for(std::chrono::seconds(2)); + } + + uint32_t rc; + rc = consensus_service.PromoteThisNode(true, fm_cb->cluster_size); + if (rc != SA_AIS_OK && rc != SA_AIS_ERR_EXIST) { +LOG_ER("Unable to set active controller in consensus service"); +opensaf_quick_reboot("Unable to set active controller " + "in consensus service"); + } else if (rc == SA_AIS_ERR_EXIST) { +// @todo if we don't reboot, we don't seem to recover from this. Can we +// improve? +LOG_ER( +"A controller is already active. We were separated from the " +"cluster?"); +opensaf_quick_reboot("A controller is already active. We were separated " + "from the cluster?"); + } + + PCS_RDA_REQ rda_req; + + /* set the RDA role to active */ +
[devel] [PATCH 4/4] osaf: make wait time configurable [#3029]
If FMS_TAKEOVER_PRIORITISE_PARTITION_SIZE is enabled, make the time that we wait for MDS node events configurable. --- src/fm/fmd/fm_rda.cc| 4 +++- src/fm/fmd/fmd.conf | 5 + src/osaf/consensus/consensus.cc | 9 + src/osaf/consensus/consensus.h | 2 ++ src/rde/rded/role.cc| 4 +++- 5 files changed, 22 insertions(+), 2 deletions(-) diff --git a/src/fm/fmd/fm_rda.cc b/src/fm/fmd/fm_rda.cc index c072cb0..fca417f 100644 --- a/src/fm/fmd/fm_rda.cc +++ b/src/fm/fmd/fm_rda.cc @@ -75,7 +75,9 @@ void promote_node(FM_CB *fm_cb) { // be processing MDS down events and updating cluster_size concurrently. // We need cluster_size to be as accurate as possible, without waiting // too long for node down events. -std::this_thread::sleep_for(std::chrono::seconds(2)); +std::this_thread::sleep_for( + std::chrono::seconds( +consensus_service.PrioritisePartitionSizeWaitTime())); } uint32_t rc; diff --git a/src/fm/fmd/fmd.conf b/src/fm/fmd/fmd.conf index 209e484..4dbf53a 100644 --- a/src/fm/fmd/fmd.conf +++ b/src/fm/fmd/fmd.conf @@ -36,6 +36,11 @@ export FMS_TAKEOVER_REQUEST_VALID_TIME=20 # Default is 1 #export FMS_TAKEOVER_PRIORITISE_PARTITION_SIZE=1 +# If FMS_TAKEOVER_PRIORITISE_PARTITION_SIZE is set to 1, wait until +# this number of seconds for MDS events before making a decision +# on partition size. Default is 4 seconds +#export FMS_TAKEOVER_PRIORITISE_PARTITION_SIZE_MDS_WAIT_TIME=4 + # Default behaviour is not to allow promotion of this node to Active # unless a lock can be obtained, if split brain prevention is enabled. # Uncomment the next line to allow promotion of this node at cluster startup, diff --git a/src/osaf/consensus/consensus.cc b/src/osaf/consensus/consensus.cc index 814885e..0e37fa3 100644 --- a/src/osaf/consensus/consensus.cc +++ b/src/osaf/consensus/consensus.cc @@ -207,6 +207,10 @@ bool Consensus::PrioritisePartitionSize() const { return prioritise_partition_size_; } +uint32_t Consensus::PrioritisePartitionSizeWaitTime() const { + return prioritise_partition_size_mds_wait_time_; +} + uint32_t Consensus::TakeoverValidTime() const { return takeover_valid_time_; } @@ -253,6 +257,8 @@ void Consensus::ProcessEnvironmentSettings() { uint32_t use_remote_fencing = base::GetEnv("FMS_USE_REMOTE_FENCING", 0); uint32_t prioritise_partition_size = base::GetEnv("FMS_TAKEOVER_PRIORITISE_PARTITION_SIZE", 1); + uint32_t prioritise_partition_size_mds_wait_time = +base::GetEnv("FMS_TAKEOVER_PRIORITISE_PARTITION_SIZE_MDS_WAIT_TIME", 4); uint32_t relaxed_node_promotion = base::GetEnv("FMS_RELAXED_NODE_PROMOTION", 0); config_file_ = base::GetEnv("FMS_CONF_FILE", ""); @@ -281,6 +287,9 @@ void Consensus::ProcessEnvironmentSettings() { if (use_consensus_ == true && relaxed_node_promotion == 1) { relaxed_node_promotion_ = true; } + + prioritise_partition_size_mds_wait_time_ = +prioritise_partition_size_mds_wait_time; } bool Consensus::ReloadConfiguration() { diff --git a/src/osaf/consensus/consensus.h b/src/osaf/consensus/consensus.h index 1fabf90..1aba561 100644 --- a/src/osaf/consensus/consensus.h +++ b/src/osaf/consensus/consensus.h @@ -61,6 +61,7 @@ class Consensus { bool IsRelaxedNodePromotionEnabled() const; bool PrioritisePartitionSize() const; + uint32_t PrioritisePartitionSizeWaitTime() const; uint32_t TakeoverValidTime() const; @@ -100,6 +101,7 @@ class Consensus { bool use_consensus_{false}; bool use_remote_fencing_{false}; bool prioritise_partition_size_{true}; + uint32_t prioritise_partition_size_mds_wait_time_{4}; bool relaxed_node_promotion_{false}; uint32_t takeover_valid_time_{20}; uint32_t max_takeover_retry_{0}; diff --git a/src/rde/rded/role.cc b/src/rde/rded/role.cc index b8c8157..b890117 100644 --- a/src/rde/rded/role.cc +++ b/src/rde/rded/role.cc @@ -83,7 +83,9 @@ void Role::MonitorCallback(const std::string& key, const std::string& new_value, consensus_service.PrioritisePartitionSize() == true) { // don't send this to the main thread straight away, as it will // need some time to process topology changes. - std::this_thread::sleep_for(std::chrono::seconds(4)); + std::this_thread::sleep_for( +std::chrono::seconds( + consensus_service.PrioritisePartitionSizeWaitTime())); } } else { msg->type = RDE_MSG_NEW_ACTIVE_CALLBACK; -- 2.7.4 ___ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel
[devel] [PATCH 0/4] Review Request for amfd: improve controller failover behavior V2 [#3029]
Summary: amfd: improve controller failover behavior [#3029] Review request for Ticket(s): 3029 Peer Reviewer(s): Canh, Minh, Hans Pull request to: *** LIST THE PERSON WITH PUSH ACCESS HERE *** Affected branch(es): develop Development branch: ticket-3029 Base revision: 71852f322b42437f074bfa4c618c021798357143 Personal repository: git://git.code.sf.net/u/userid-2226215/review Impacted area Impact y/n Docsn Build systemn RPM/packaging n Configuration files n Startup scripts n SAF servicesy OpenSAF servicesy Core libraries y Samples n Tests n Other n Comments (indicate scope for each "y" above): - revision 4feee2b631afa3393ae9e53fd6575c3768861dca Author: Gary Lee Date: Tue, 9 Jul 2019 14:38:49 +1000 osaf: make wait time configurable [#3029] If FMS_TAKEOVER_PRIORITISE_PARTITION_SIZE is enabled, make the time that we wait for MDS node events configurable. revision 2c419ba5fffb85272f0d15118b561bcfc1de4814 Author: Gary Lee Date: Tue, 9 Jul 2019 14:38:49 +1000 amfd: improve controller failover behavior [#3029] If consensus service is enabled, only perform node failover after peer controller has self-fenced (after 2 * FMS_TAKEOVER_REQUEST_VALID_TIME seconds). This also means if node failover delay is set to a large value, we do not unnecesarily wait too long before failing over assignments previously assigned to the peer controller. Remove unused fmd_conf_file variable. Change some LOG_ER calls to LOG_WA. revision 7c4fff483477082ca66a26f921a50b3bc1240538 Author: Gary Lee Date: Tue, 9 Jul 2019 14:38:49 +1000 fmd: add active promotion supervision timer [#3029] Add supervision timer so controller will reboot if it cannot obtain consensus lock within the allocation period (2* FMS_TAKEOVER_REQUEST_VALID_TIME). The peer controller can then safely perform a node failover after this period of time. revision 8b596a228402ff99b26906138daf920c23e965e7 Author: Gary Lee Date: Tue, 9 Jul 2019 14:38:49 +1000 osaf: add function to return takeover request expiry time [#3029] Complete diffstat: -- src/amf/amfd/cb.h | 1 - src/amf/amfd/clm.cc| 4 +- src/amf/amfd/main.cc | 1 - src/amf/amfd/ndfsm.cc | 8 ++-- src/amf/amfd/ndproc.cc | 19 src/amf/amfd/node_state.cc | 23 +- src/amf/amfd/node_state_machine.cc | 19 src/amf/amfd/node_state_machine.h | 2 + src/amf/amfd/proc.h| 1 + src/fm/fmd/fm_cb.h | 2 + src/fm/fmd/fm_main.cc | 14 +- src/fm/fmd/fm_rda.cc | 89 ++ src/fm/fmd/fmd.conf| 5 +++ src/osaf/consensus/consensus.cc| 13 ++ src/osaf/consensus/consensus.h | 4 ++ src/rde/rded/role.cc | 4 +- 16 files changed, 160 insertions(+), 49 deletions(-) Testing Commands: - 1) Ensure a 2N application is active on standby controller, and standy on the active controller 2) Isolate active & standby controller Testing, Expected Results: -- amfd should failover 2N application only after 2 * FMS_TAKEOVER_REQUEST_VALID_TIME seconds Conditions of Submission: - ack from any reviewer Arch Built StartedLinux distro --- mipsn n mips64 n n x86 n n x86_64 y y powerpc n n powerpc64 n n Reviewer Checklist: --- [Submitters: make sure that your review doesn't trigger any checkmarks!] Your checkin has not passed review because (see checked entries): ___ Your RR template is generally incomplete; it has too many blank entries that need proper data filled in. ___ You have failed to nominate the proper persons for review and push. ___ Your patches do not have proper short+long header ___ You have grammar/spelling in your header that is unacceptable. ___ You have exceeded a sensible line length in your headers/comments/text. ___ You have failed to put in a proper Trac Ticket # into your commits. ___ You have incorrectly put/left internal data in your comments/files (i.e. internal bug tracking tool IDs, product names etc) ___ You have not given any evidence of testing beyond basic build tests. Demonstrate some level of runtime or other sanity testing. ___ You have ^M present in some of your files. These have to be removed. ___ You have needlessly changed whitespace or added whitespace crimes like trailing spaces, or spaces before tabs. ___ You have mixed real technical changes with whitespace and other cosmetic code cleanup changes. These