Re: [devel] [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]

2018-07-13 Thread Minh Hon Chau

Hi Thuan,

ack from me.

Thanks

Minh


On 12/07/18 20:22, thuan.tran wrote:

After AMFD sends a reboot order because a node is “out of sync window”, AMFD
may receive the CLM track callback while the node is not yet an AMF member, and
it deletes the node from node_id_db. The AMFND down handler then does nothing
because it cannot find the node. When the node comes back up, AMFD keeps using
the old msg_id counter when sending to AMFND, which causes a message ID
mismatch in AMFND, and AMFND then orders a reboot of its own node.

Solution: in the AMFND down handler, if the node is not found in node_id_db,
search for it in node_name_db. If it is found, proceed as for a normal AMFND
down event.
---
  src/amf/amfd/ndfsm.cc | 10 ++++++++++
  1 file changed, 10 insertions(+)

diff --git a/src/amf/amfd/ndfsm.cc b/src/amf/amfd/ndfsm.cc
index 9d54df13d..598c57c47 100644
--- a/src/amf/amfd/ndfsm.cc
+++ b/src/amf/amfd/ndfsm.cc
@@ -775,6 +775,16 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
   nds_mds_ver_db.erase(evt->info.node_id);
   amfnd_svc_db->erase(evt->info.node_id);
 
+  if (node == nullptr) {
+    for (const auto &value : *node_name_db) {
+      AVD_AVND *avnd = value.second;
+      if (avnd->node_info.nodeId == evt->info.node_id) {
+        node = avnd;
+        break;
+      }
+    }
+  }
+
   if (node != nullptr) {
     // Do nothing if the local node goes down. Most likely due to system
     // shutdown. If node director goes down due to a bug, the AMF watchdog will
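
The fallback described in the commit message can be sketched outside of amfd.
This is a minimal, self-contained illustration, not the OpenSAF code: Node,
NodeIdDb, NodeNameDb and find_node are hypothetical stand-ins for AVD_AVND,
node_id_db, node_name_db and the lookup done in avd_mds_avnd_down_evh(), and
plain std::map is assumed for both indexes.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for AVD_AVND; only the fields used here.
struct Node {
  std::string name;
  uint32_t node_id;
};

// Assumed index types: one keyed by node id, one keyed by node name.
using NodeIdDb = std::map<uint32_t, Node *>;
using NodeNameDb = std::map<std::string, Node *>;

// Look the node up by id first. If the id index no longer holds it,
// fall back to a linear scan of the name index and compare ids.
Node *find_node(const NodeIdDb &by_id, const NodeNameDb &by_name,
                uint32_t node_id) {
  auto it = by_id.find(node_id);
  if (it != by_id.end()) return it->second;
  for (const auto &value : by_name) {
    Node *candidate = value.second;
    if (candidate->node_id == node_id) return candidate;
  }
  return nullptr;  // unknown in both indexes
}
```

The scan of the name index is linear, which is presumably acceptable because it
only runs on the rare path where the id index has already dropped the node.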



--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel


Re: [devel] [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]

2018-07-12 Thread nagendra
Hi Thuan,
 
Ack.
 
Thanks,
Nagendra, 91-9866424860
www.hasolutions.in
https://www.linkedin.com/company/hasolutions/
High Availability Solutions Pvt. Ltd.
- OpenSAF support and services
 
 
 
 
 
 
 
- Original Message -
Subject: [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]
From: "thuan.tran" 
Date: 7/12/18 3:52 pm
To: nagen...@hasolutions.in, minh.c...@dektech.com.au, 
hans.nordeb...@ericsson.com, gary@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net, "thuan.tran" 


After AMFD sends a reboot order because a node is “out of sync window”, AMFD
 may receive the CLM track callback while the node is not yet an AMF member, and
 it deletes the node from node_id_db. The AMFND down handler then does nothing
 because it cannot find the node. When the node comes back up, AMFD keeps using
 the old msg_id counter when sending to AMFND, which causes a message ID
 mismatch in AMFND, and AMFND then orders a reboot of its own node.
 
 Solution: in the AMFND down handler, if the node is not found in node_id_db,
 search for it in node_name_db. If it is found, proceed as for a normal AMFND
 down event.
 ---
 src/amf/amfd/ndfsm.cc | 10 ++++++++++
 1 file changed, 10 insertions(+)
 
 diff --git a/src/amf/amfd/ndfsm.cc b/src/amf/amfd/ndfsm.cc
 index 9d54df13d..598c57c47 100644
 --- a/src/amf/amfd/ndfsm.cc
 +++ b/src/amf/amfd/ndfsm.cc
 @@ -775,6 +775,16 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
 nds_mds_ver_db.erase(evt->info.node_id);
 amfnd_svc_db->erase(evt->info.node_id);
 
 + if (node == nullptr) {
 + for (const auto &value : *node_name_db) {
 + AVD_AVND *avnd = value.second;
 + if (avnd->node_info.nodeId == evt->info.node_id) {
 + node = avnd;
 + break;
 + }
 + }
 + }
 +
 if (node != nullptr) {
 // Do nothing if the local node goes down. Most likely due to system
 // shutdown. If node director goes down due to a bug, the AMF watchdog will
 -- 
 2.18.0


Re: [devel] [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]

2018-07-12 Thread Tran Thuan
Hi Minh, Nagu,

OK, I am sending out a new version (V5) as you suggested.

Best Regards,
Thuan

-Original Message-
From: Minh Hon Chau  
Sent: Thursday, July 12, 2018 1:36 PM
To: thuan.tran ; nagen...@hasolutions.in; 
hans.nordeb...@ericsson.com; gary@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net
Subject: Re: [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]

Hi Thuan,

I think what Nagu suggested is good enough to fix the issue in this ticket.

Regarding the @synced_headless flag you added: as I read your patch, if the 
active amfd has not yet created the synced assignments and the MDS down event 
arrives, amfd does not call avd_node_failover(). That could cause a problem, 
because the send/receive counters may have changed during the sync window and 
need to be reset. In general, avd_node_failover() handles and checks the states 
of the node, its SUs, SIs and so on, and in most cases where a node goes down 
this function should be called.

This may (or may not) be a problem that you have found. If you think it is, I 
suggest creating another ticket, since it looks quite separate and the 
scenario is significant.

Thanks

Minh


On 11/07/18 17:37, thuan.tran wrote:
> After AMFD sends a reboot order because a node is “out of sync window”, AMFD
> may receive the CLM track callback while the node is not yet an AMF member,
> and it deletes the node. The later AMFND MDS down event then does nothing
> because the node cannot be found. When the node comes back up, AMFD keeps
> using the old msg_id counter when sending to AMFND, which causes a message ID
> mismatch in AMFND, and AMFND then orders a reboot of its own node.
>
> Also, if AMFND has already synced its info to the active AMFD after headless,
> node failover actions need to be considered for this AMFND down.
>
> Use a synced_headless flag on the node, set to true when SUSIs are recreated.
> In the AMFND down handler, search for the node_id in node_name_db. If the node
> is found, decide whether to do node failover based on the synced_headless flag.
> ---
>   src/amf/amfd/ndfsm.cc | 21 ++++++++++++++++++++-
>   src/amf/amfd/node.cc  |  1 +
>   src/amf/amfd/node.h   |  1 +
>   src/amf/amfd/siass.cc |  1 +
>   4 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/src/amf/amfd/ndfsm.cc b/src/amf/amfd/ndfsm.cc
> index 9d54df13d..6323d3a73 100644
> --- a/src/amf/amfd/ndfsm.cc
> +++ b/src/amf/amfd/ndfsm.cc
> @@ -767,6 +767,7 @@ void avd_mds_avnd_up_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
>
>   **/
>   
>   void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
> +  bool node_failover = true;
> AVD_AVND *node = avd_node_find_nodeid(evt->info.node_id);
>   
> TRACE_ENTER2("%x, %p", evt->info.node_id, node);
> @@ -775,6 +776,20 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
> nds_mds_ver_db.erase(evt->info.node_id);
> amfnd_svc_db->erase(evt->info.node_id);
>   
> +  if (node == nullptr) {
> +for (const auto &value : *node_name_db) {
> +  AVD_AVND *avnd = value.second;
> +  if (avnd->node_info.nodeId == evt->info.node_id) {
> +node_failover = false;
> +node = avnd;
> +if (node->synced_headless) {
> +  node_failover = true;
> +}
> +break;
> +  }
> +}
> +  }
> +
> if (node != nullptr) {
>   // Do nothing if the local node goes down. Most likely due to system
>   // shutdown. If node director goes down due to a bug, the AMF watchdog will
> @@ -784,7 +799,9 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
>   }
>   
>   if (avd_cb->avail_state_avd == SA_AMF_HA_ACTIVE) {
> -  avd_node_failover(node);
> +  if (node_failover) {
> +avd_node_failover(node);
> +  }
> // Update standby out of sync if standby sc goes down
> if (avd_cb->node_id_avd_other == node->node_info.nodeId) {
>   cb->stby_sync_state = AVD_STBY_OUT_OF_SYNC;
> @@ -802,6 +819,7 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
> node->recvr_fail_sw = false;
> node->node_info.initialViewNumber = 0;
> node->node_info.member = SA_FALSE;
> +  node->synced_headless = false;
>   }
> }
>   
> @@ -1122,6 +1140,7 @@ void avd_node_mark_absent(AVD_AVND *node) {
>   
> node->node_info.initialViewNumber = 0;
> node->node_info.member = SA_FALSE;
> +  node->synced_headless = false;
>   
> /* Increment node failfast counter */
> avd_cb->nodes_exit_cnt++;
> diff --git a/src/amf/amfd/node.cc b/src/amf/amfd/node.cc
> index 0ffcfb782..f421e68de 100644
> --- a/src/amf/amfd/node.cc
> +++ b/src/amf/amfd/node.cc
> @@ -94,6 +94,7 @@ void AVD_AVND::initialize() {
> node_name = {};
> node_info = {};
> node_info.member = SA_FALSE;
> +  synced_headless = false;
> adest = {};
> saAmfNodeClmNode = {};
> saAmfNodeCapacity = {};
> diff --git a/src/amf/amfd/node.h b/src/amf/amfd/node.h
> index e64bf8c93..02b15bca8 100644
> --- a/src/amf/amfd/node.h
> +++ 

Re: [devel] [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]

2018-07-12 Thread Minh Hon Chau

Hi Thuan,

I think what Nagu suggested is good enough to fix the issue in this 
ticket.


Regarding the @synced_headless flag you added: as I read your patch, if 
the active amfd has not yet created the synced assignments and the MDS 
down event arrives, amfd does not call avd_node_failover(). That could 
cause a problem, because the send/receive counters may have changed 
during the sync window and need to be reset. In general, 
avd_node_failover() handles and checks the states of the node, its SUs, 
SIs and so on, and in most cases where a node goes down this function 
should be called.
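
To illustrate why the counters matter, here is a minimal sketch of a
sequence-checked channel. It assumes, for illustration only, that the receiver
accepts a message only when its id is exactly one more than the last id it
accepted. DirectorView, NodeDirector, next_send, accept and reset are
hypothetical names, not amfd or amfnd APIs.

```cpp
#include <cstdint>

// Hypothetical director-side bookkeeping for one node director.
struct DirectorView {
  uint32_t snd_msg_id = 0;  // id of the last message sent to the node
  uint32_t rcv_msg_id = 0;  // id of the last message received from it

  uint32_t next_send() { return ++snd_msg_id; }

  // What node-down handling is expected to do: start counting from zero.
  void reset() {
    snd_msg_id = 0;
    rcv_msg_id = 0;
  }
};

// Hypothetical node-director side. A freshly booted instance starts at 0.
struct NodeDirector {
  uint32_t rcv_msg_id = 0;

  // Accept only the next id in sequence (assumed rule for this sketch).
  bool accept(uint32_t msg_id) {
    if (msg_id != rcv_msg_id + 1) return false;  // id mismatch
    rcv_msg_id = msg_id;
    return true;
  }
};
```

If the node reboots while the director keeps its old snd_msg_id, the first
message after the reboot carries an id the fresh receiver does not expect.
Calling reset() as part of node-down handling avoids that in this sketch.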


This may (or may not) be a problem that you have found. If you think it 
is, I suggest creating another ticket, since it looks quite separate and 
the scenario is significant.


Thanks

Minh


On 11/07/18 17:37, thuan.tran wrote:

After AMFD sends a reboot order because a node is “out of sync window”, AMFD
may receive the CLM track callback while the node is not yet an AMF member, and
it deletes the node. The later AMFND MDS down event then does nothing because
the node cannot be found. When the node comes back up, AMFD keeps using the old
msg_id counter when sending to AMFND, which causes a message ID mismatch in
AMFND, and AMFND then orders a reboot of its own node.

Also, if AMFND has already synced its info to the active AMFD after headless,
node failover actions need to be considered for this AMFND down.

Use a synced_headless flag on the node, set to true when SUSIs are recreated.
In the AMFND down handler, search for the node_id in node_name_db. If the node
is found, decide whether to do node failover based on the synced_headless flag.
---
  src/amf/amfd/ndfsm.cc | 21 ++++++++++++++++++++-
  src/amf/amfd/node.cc  |  1 +
  src/amf/amfd/node.h   |  1 +
  src/amf/amfd/siass.cc |  1 +
  4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/amf/amfd/ndfsm.cc b/src/amf/amfd/ndfsm.cc
index 9d54df13d..6323d3a73 100644
--- a/src/amf/amfd/ndfsm.cc
+++ b/src/amf/amfd/ndfsm.cc
@@ -767,6 +767,7 @@ void avd_mds_avnd_up_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
   **/
  
  void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
+  bool node_failover = true;
AVD_AVND *node = avd_node_find_nodeid(evt->info.node_id);
  
TRACE_ENTER2("%x, %p", evt->info.node_id, node);
@@ -775,6 +776,20 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
nds_mds_ver_db.erase(evt->info.node_id);
amfnd_svc_db->erase(evt->info.node_id);
  
+  if (node == nullptr) {
+for (const auto &value : *node_name_db) {
+  AVD_AVND *avnd = value.second;
+  if (avnd->node_info.nodeId == evt->info.node_id) {
+node_failover = false;
+node = avnd;
+if (node->synced_headless) {
+  node_failover = true;
+}
+break;
+  }
+}
+  }
+
if (node != nullptr) {
  // Do nothing if the local node goes down. Most likely due to system
  // shutdown. If node director goes down due to a bug, the AMF watchdog will
@@ -784,7 +799,9 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
  }
  
  if (avd_cb->avail_state_avd == SA_AMF_HA_ACTIVE) {
-  avd_node_failover(node);
+  if (node_failover) {
+avd_node_failover(node);
+  }
// Update standby out of sync if standby sc goes down
if (avd_cb->node_id_avd_other == node->node_info.nodeId) {
  cb->stby_sync_state = AVD_STBY_OUT_OF_SYNC;
@@ -802,6 +819,7 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
node->recvr_fail_sw = false;
node->node_info.initialViewNumber = 0;
node->node_info.member = SA_FALSE;
+  node->synced_headless = false;
  }
}
  
@@ -1122,6 +1140,7 @@ void avd_node_mark_absent(AVD_AVND *node) {
  
node->node_info.initialViewNumber = 0;
node->node_info.member = SA_FALSE;
+  node->synced_headless = false;
  
/* Increment node failfast counter */
avd_cb->nodes_exit_cnt++;
diff --git a/src/amf/amfd/node.cc b/src/amf/amfd/node.cc
index 0ffcfb782..f421e68de 100644
--- a/src/amf/amfd/node.cc
+++ b/src/amf/amfd/node.cc
@@ -94,6 +94,7 @@ void AVD_AVND::initialize() {
node_name = {};
node_info = {};
node_info.member = SA_FALSE;
+  synced_headless = false;
adest = {};
saAmfNodeClmNode = {};
saAmfNodeCapacity = {};
diff --git a/src/amf/amfd/node.h b/src/amf/amfd/node.h
index e64bf8c93..02b15bca8 100644
--- a/src/amf/amfd/node.h
+++ b/src/amf/amfd/node.h
@@ -145,6 +145,7 @@ class AVD_AVND {
uint16_t node_up_msg_count;/* to count of node_up msg that director had
  received from this node */
bool reboot;
+  bool synced_headless;
bool is_campaign_set_for_all_sus() const;
// Member functions.
void node_sus_termstate_set(bool term_state) const;
diff --git a/src/amf/amfd/siass.cc b/src/amf/amfd/siass.cc
index 267c55c07..f23c5510e 100644
--- a/src/amf/amfd/siass.cc
+++ b/src/amf/amfd/siass.cc
@@ -1136,6 +1136,7 @@ SaAisErrorT 

Re: [devel] [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]

2018-07-11 Thread nagendra
Hi Thuan,
Nice work. Ack from me (I am not sure about the headless use case).
 
Thanks,
Nagendra, 91-9866424860
www.hasolutions.in
https://www.linkedin.com/company/hasolutions/
High Availability Solutions Pvt. Ltd.
- OpenSAF support and services
 
 
 
 
 
 
 
- Original Message -
Subject: [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]
From: "thuan.tran" 
Date: 7/11/18 1:07 pm
To: nagen...@hasolutions.in, minh.c...@dektech.com.au, 
hans.nordeb...@ericsson.com, gary@dektech.com.au
Cc: opensaf-devel@lists.sourceforge.net, "thuan.tran" 


After AMFD sends a reboot order because a node is “out of sync window”, AMFD
 may receive the CLM track callback while the node is not yet an AMF member, and
 it deletes the node. The later AMFND MDS down event then does nothing because
 the node cannot be found. When the node comes back up, AMFD keeps using the old
 msg_id counter when sending to AMFND, which causes a message ID mismatch in
 AMFND, and AMFND then orders a reboot of its own node.
 
 Also, if AMFND has already synced its info to the active AMFD after headless,
 node failover actions need to be considered for this AMFND down.
 
 Use a synced_headless flag on the node, set to true when SUSIs are recreated.
 In the AMFND down handler, search for the node_id in node_name_db. If the node
 is found, decide whether to do node failover based on the synced_headless flag.
 ---
 src/amf/amfd/ndfsm.cc | 21 ++++++++++++++++++++-
 src/amf/amfd/node.cc | 1 +
 src/amf/amfd/node.h | 1 +
 src/amf/amfd/siass.cc | 1 +
 4 files changed, 23 insertions(+), 1 deletion(-)
 
 diff --git a/src/amf/amfd/ndfsm.cc b/src/amf/amfd/ndfsm.cc
 index 9d54df13d..6323d3a73 100644
 --- a/src/amf/amfd/ndfsm.cc
 +++ b/src/amf/amfd/ndfsm.cc
 @@ -767,6 +767,7 @@ void avd_mds_avnd_up_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
 **/
 
 void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
 + bool node_failover = true;
 AVD_AVND *node = avd_node_find_nodeid(evt->info.node_id);
 
 TRACE_ENTER2("%x, %p", evt->info.node_id, node);
 @@ -775,6 +776,20 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
 nds_mds_ver_db.erase(evt->info.node_id);
 amfnd_svc_db->erase(evt->info.node_id);
 
 + if (node == nullptr) {
 + for (const auto &value : *node_name_db) {
 + AVD_AVND *avnd = value.second;
 + if (avnd->node_info.nodeId == evt->info.node_id) {
 + node_failover = false;
 + node = avnd;
 + if (node->synced_headless) {
 + node_failover = true;
 + }
 + break;
 + }
 + }
 + }
 +
 if (node != nullptr) {
 // Do nothing if the local node goes down. Most likely due to system
 // shutdown. If node director goes down due to a bug, the AMF watchdog will
 @@ -784,7 +799,9 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
 }
 
 if (avd_cb->avail_state_avd == SA_AMF_HA_ACTIVE) {
 - avd_node_failover(node);
 + if (node_failover) {
 + avd_node_failover(node);
 + }
 // Update standby out of sync if standby sc goes down
 if (avd_cb->node_id_avd_other == node->node_info.nodeId) {
 cb->stby_sync_state = AVD_STBY_OUT_OF_SYNC;
 @@ -802,6 +819,7 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
 node->recvr_fail_sw = false;
 node->node_info.initialViewNumber = 0;
 node->node_info.member = SA_FALSE;
 + node->synced_headless = false;
 }
 }
 
 @@ -1122,6 +1140,7 @@ void avd_node_mark_absent(AVD_AVND *node) {
 
 node->node_info.initialViewNumber = 0;
 node->node_info.member = SA_FALSE;
 + node->synced_headless = false;
 
 /* Increment node failfast counter */
 avd_cb->nodes_exit_cnt++;
 diff --git a/src/amf/amfd/node.cc b/src/amf/amfd/node.cc
 index 0ffcfb782..f421e68de 100644
 --- a/src/amf/amfd/node.cc
 +++ b/src/amf/amfd/node.cc
 @@ -94,6 +94,7 @@ void AVD_AVND::initialize() {
 node_name = {};
 node_info = {};
 node_info.member = SA_FALSE;
 + synced_headless = false;
 adest = {};
 saAmfNodeClmNode = {};
 saAmfNodeCapacity = {};
 diff --git a/src/amf/amfd/node.h b/src/amf/amfd/node.h
 index e64bf8c93..02b15bca8 100644
 --- a/src/amf/amfd/node.h
 +++ b/src/amf/amfd/node.h
 @@ -145,6 +145,7 @@ class AVD_AVND {
 uint16_t node_up_msg_count; /* to count of node_up msg that director had
 received from this node */
 bool reboot;
 + bool synced_headless;
 bool is_campaign_set_for_all_sus() const;
 // Member functions.
 void node_sus_termstate_set(bool term_state) const;
 diff --git a/src/amf/amfd/siass.cc b/src/amf/amfd/siass.cc
 index 267c55c07..f23c5510e 100644
 --- a/src/amf/amfd/siass.cc
 +++ b/src/amf/amfd/siass.cc
 @@ -1136,6 +1136,7 @@ SaAisErrorT avd_susi_recreate(AVSV_N2D_ND_SISU_STATE_MSG_INFO *info) {
 return SA_AIS_ERR_NOT_EXIST;
 }
 
 + node->synced_headless = true;
 for (su_state = info->su_list; su_state != nullptr;
 su_state = su_state->next) {
 AVD_SU *su = su_db->find(Amf::to_string(&su_state->safSU));
 -- 
 2.18.0

[devel] [PATCH 1/1] amf: change the way amfd handle amfnd down [#2891]

2018-07-11 Thread thuan.tran
After AMFD sends a reboot order because a node is “out of sync window”, AMFD
may receive the CLM track callback while the node is not yet an AMF member, and
it deletes the node. The later AMFND MDS down event then does nothing because
the node cannot be found. When the node comes back up, AMFD keeps using the old
msg_id counter when sending to AMFND, which causes a message ID mismatch in
AMFND, and AMFND then orders a reboot of its own node.

Also, if AMFND has already synced its info to the active AMFD after headless,
node failover actions need to be considered for this AMFND down.

Use a synced_headless flag on the node, set to true when SUSIs are recreated.
In the AMFND down handler, search for the node_id in node_name_db. If the node
is found, decide whether to do node failover based on the synced_headless flag.
---
 src/amf/amfd/ndfsm.cc | 21 ++++++++++++++++++++-
 src/amf/amfd/node.cc  |  1 +
 src/amf/amfd/node.h   |  1 +
 src/amf/amfd/siass.cc |  1 +
 4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/src/amf/amfd/ndfsm.cc b/src/amf/amfd/ndfsm.cc
index 9d54df13d..6323d3a73 100644
--- a/src/amf/amfd/ndfsm.cc
+++ b/src/amf/amfd/ndfsm.cc
@@ -767,6 +767,7 @@ void avd_mds_avnd_up_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
  **/
 
 void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
+  bool node_failover = true;
   AVD_AVND *node = avd_node_find_nodeid(evt->info.node_id);
 
   TRACE_ENTER2("%x, %p", evt->info.node_id, node);
@@ -775,6 +776,20 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
   nds_mds_ver_db.erase(evt->info.node_id);
   amfnd_svc_db->erase(evt->info.node_id);
 
+  if (node == nullptr) {
+for (const auto &value : *node_name_db) {
+  AVD_AVND *avnd = value.second;
+  if (avnd->node_info.nodeId == evt->info.node_id) {
+node_failover = false;
+node = avnd;
+if (node->synced_headless) {
+  node_failover = true;
+}
+break;
+  }
+}
+  }
+
   if (node != nullptr) {
 // Do nothing if the local node goes down. Most likely due to system
 // shutdown. If node director goes down due to a bug, the AMF watchdog will
@@ -784,7 +799,9 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
 }
 
 if (avd_cb->avail_state_avd == SA_AMF_HA_ACTIVE) {
-  avd_node_failover(node);
+  if (node_failover) {
+avd_node_failover(node);
+  }
   // Update standby out of sync if standby sc goes down
   if (avd_cb->node_id_avd_other == node->node_info.nodeId) {
 cb->stby_sync_state = AVD_STBY_OUT_OF_SYNC;
@@ -802,6 +819,7 @@ void avd_mds_avnd_down_evh(AVD_CL_CB *cb, AVD_EVT *evt) {
   node->recvr_fail_sw = false;
   node->node_info.initialViewNumber = 0;
   node->node_info.member = SA_FALSE;
+  node->synced_headless = false;
 }
   }
 
@@ -1122,6 +1140,7 @@ void avd_node_mark_absent(AVD_AVND *node) {
 
   node->node_info.initialViewNumber = 0;
   node->node_info.member = SA_FALSE;
+  node->synced_headless = false;
 
   /* Increment node failfast counter */
   avd_cb->nodes_exit_cnt++;
diff --git a/src/amf/amfd/node.cc b/src/amf/amfd/node.cc
index 0ffcfb782..f421e68de 100644
--- a/src/amf/amfd/node.cc
+++ b/src/amf/amfd/node.cc
@@ -94,6 +94,7 @@ void AVD_AVND::initialize() {
   node_name = {};
   node_info = {};
   node_info.member = SA_FALSE;
+  synced_headless = false;
   adest = {};
   saAmfNodeClmNode = {};
   saAmfNodeCapacity = {};
diff --git a/src/amf/amfd/node.h b/src/amf/amfd/node.h
index e64bf8c93..02b15bca8 100644
--- a/src/amf/amfd/node.h
+++ b/src/amf/amfd/node.h
@@ -145,6 +145,7 @@ class AVD_AVND {
   uint16_t node_up_msg_count;/* to count of node_up msg that director had
 received from this node */
   bool reboot;
+  bool synced_headless;
   bool is_campaign_set_for_all_sus() const;
   // Member functions.
   void node_sus_termstate_set(bool term_state) const;
diff --git a/src/amf/amfd/siass.cc b/src/amf/amfd/siass.cc
index 267c55c07..f23c5510e 100644
--- a/src/amf/amfd/siass.cc
+++ b/src/amf/amfd/siass.cc
@@ -1136,6 +1136,7 @@ SaAisErrorT avd_susi_recreate(AVSV_N2D_ND_SISU_STATE_MSG_INFO *info) {
 return SA_AIS_ERR_NOT_EXIST;
   }
 
+  node->synced_headless = true;
   for (su_state = info->su_list; su_state != nullptr;
su_state = su_state->next) {
 AVD_SU *su = su_db->find(Amf::to_string(&su_state->safSU));
-- 
2.18.0
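
The failover decision made by this version of the patch can be reduced to a
small predicate. This is a minimal sketch for discussion, not amfd code:
NodeState and should_failover are hypothetical names, and only the
synced_headless field is taken from the patch.

```cpp
// Hypothetical reduction of the node_failover logic in the patch above.
struct NodeState {
  // Set once SUSI state has been recreated after headless.
  bool synced_headless = false;
};

// found_in_id_db: the node was still present in node_id_db.
// When the node was only recovered from node_name_db, fail over only if
// it had already synced its assignments after headless.
bool should_failover(bool found_in_id_db, const NodeState &node) {
  if (found_in_id_db) return true;
  return node.synced_headless;
}
```

Note that the middle case (not in node_id_db, not yet synced) skips
avd_node_failover() entirely in this version.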

