Hi Thang,

Ack (review + test). In the syslog below, the assignments of the sponsor and dependent SIs on the locked SC are removed, and the other SC creates new active assignments.

Minor comment: in sg_2n_fsm:node_fail_su_oper(), starting from line 3153, the code is now mostly the same for both the standby and active branches :)

3152:  } else {
3154:    /* the SU is not the same as the SU in the list */
3153:    if (avd_su_state_determine(su) == SA_AMF_HA_STANDBY) {

           // same as the active branch below

         } /* if(avd_su_state_determine(su) == SA_AMF_HA_STANDBY) */
         else if (avd_su_state_determine(su) == SA_AMF_HA_ACTIVE) {

         }
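
To illustrate the comment, here is a minimal sketch of how two identical branches could be folded into one check. HaState, Su, handle_failed_su() and node_fail_other_su() are stand-in names made up for this example, not the real amfd types or functions, and the shared body is only a placeholder:

```cpp
#include <string>
#include <vector>

// Stand-in types for illustration only; not the real amfd definitions.
enum class HaState { Active, Standby, Quiesced };

struct Su {
  std::string name;
  HaState state;
};

// Hypothetical shared handler holding the work both branches would do.
// In this sketch it only records which SU was processed.
void handle_failed_su(const Su &su, std::vector<std::string> &log) {
  log.push_back("cleanup " + su.name);
}

// Instead of repeating the same body under STANDBY and ACTIVE,
// test both states in a single condition and call the shared handler.
bool node_fail_other_su(const Su &su, std::vector<std::string> &log) {
  if (su.state == HaState::Standby || su.state == HaState::Active) {
    handle_failed_su(su, log);
    return true;
  }
  return false;  // other states are left untouched in this sketch
}
```

Whether this is worth doing depends on whether the two branches really stay identical after the patch; if they diverge again later, keeping them separate may be clearer.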

Thanks

Minh

------syslog----------

2019-06-20 19:01:44.975 SC-1 osafamfnd[331]: NO Assigning 'safSi=ma_si,safApp=ma_app' QUIESCED to 'safSu=ma_su_1,safSg=ma_sg,safApp=ma_app'
2019-06-20 19:01:44.976 SC-1 amf_demo[533]: CSI Set - HAState Quiesced for all assigned CSIs
2019-06-20 19:01:44.977 SC-1 osafamfnd[331]: NO Assigning 'safSi=ma_si_new,safApp=ma_app_new' QUIESCED to 'safSu=ma_su_3_new,safSg=ma_sg_new,safApp=ma_app_new'
2019-06-20 19:01:44.977 SC-1 amf_demo_ori[599]: CSI Set - HAState Quiesced for all assigned CSIs
2019-06-20 19:01:44.977 SC-1 osafamfnd[331]: NO Assigned 'safSi=ma_si_new,safApp=ma_app_new' QUIESCED to 'safSu=ma_su_3_new,safSg=ma_sg_new,safApp=ma_app_new'
2019-06-20 19:01:51.978 SC-1 osafamfnd[331]: NO Assigned 'safSi=ma_si,safApp=ma_app' QUIESCED to 'safSu=ma_su_1,safSg=ma_sg,safApp=ma_app'
2019-06-20 19:01:52.895 SC-1 osafdtmd[169]: NO Lost contact with 'SC-2'

2019-06-20 19:01:52.903 SC-1 osafclmd[306]: NO Node 131599 went down. Not sending track callback for agents on that node
2019-06-20 19:01:52.903 SC-1 osafamfd[316]: NO Node 'SC-2' left the cluster

2019-06-20 19:01:52.937 SC-1 osaffmd[201]: NO Current role: ACTIVE
2019-06-20 19:01:52.938 SC-1 osaffmd[201]: Rebooting OpenSAF NodeId = 131599 EE Name = , Reason: Received Node Down for peer controller, OwnNodeId = 131343, SupervisionTime = 60
2019-06-20 19:01:52.957 SC-1 opensaf_reboot: Rebooting remote node in the absence of PLM is outside the scope of OpenSAF
2019-06-20 19:01:52.958 SC-1 osafamfnd[331]: NO Removing 'safSi=ma_si,safApp=ma_app' from 'safSu=ma_su_1,safSg=ma_sg,safApp=ma_app'
2019-06-20 19:01:52.958 SC-1 amf_demo[533]: CSI Remove for all CSIs
2019-06-20 19:01:52.959 SC-1 amf_demo[533]: state: 3, mode: 1, code: 1
2019-06-20 19:01:52.959 SC-1 osafamfnd[331]: NO Removing 'safSi=ma_si_new,safApp=ma_app_new' from 'safSu=ma_su_3_new,safSg=ma_sg_new,safApp=ma_app_new'
2019-06-20 19:01:52.959 SC-1 amf_demo_ori[599]: CSI Remove for all CSIs
2019-06-20 19:01:52.959 SC-1 osafamfnd[331]: NO Removed 'safSi=ma_si,safApp=ma_app' from 'safSu=ma_su_1,safSg=ma_sg,safApp=ma_app'
2019-06-20 19:01:52.960 SC-1 osafamfnd[331]: NO Removed 'safSi=ma_si_new,safApp=ma_app_new' from 'safSu=ma_su_3_new,safSg=ma_sg_new,safApp=ma_app_new'
2019-06-20 19:01:52.962 SC-1 osafamfd[316]: NO Assigning due to dep 'safSi=ma_si,safApp=ma_app'
2019-06-20 19:01:52.964 SC-1 osafamfd[316]: NO Tolerance timer started, sponsor si:'safSi=ma_si,safApp=ma_app', dependent si:safSi=ma_si_new,safApp=ma_app_new
2019-06-20 19:01:54.352 SC-1 osafdtmd[169]: NO Established contact with 'SC-2'

2019-06-20 19:01:56.368 SC-2 osafamfnd[257]: NO Assigning 'safSi=ma_si,safApp=ma_app' ACTIVE to 'safSu=ma_su_2,safSg=ma_sg,safApp=ma_app'
2019-06-20 19:01:56.369 SC-2 amf_demo[444]: CSI Set - add 'safCsi=ma_csi,safSi=ma_si,safApp=ma_app' HAState Active
2019-06-20 19:02:04.372 SC-2 osafamfnd[257]: NO Assigned 'safSi=ma_si,safApp=ma_app' ACTIVE to 'safSu=ma_su_2,safSg=ma_sg,safApp=ma_app'
2019-06-20 19:02:04.390 SC-2 osafamfnd[257]: NO Assigning 'safSi=ma_si_new,safApp=ma_app_new' ACTIVE to 'safSu=ma_su_4_new,safSg=ma_sg_new,safApp=ma_app_new'
2019-06-20 19:02:04.391 SC-2 amf_demo_ori[440]: CSI Set - add 'safCsi=ma_csi_new,safSi=ma_si_new,safApp=ma_app_new' HAState Active
2019-06-20 19:02:04.391 SC-2 osafamfnd[257]: NO Assigned 'safSi=ma_si_new,safApp=ma_app_new' ACTIVE to 'safSu=ma_su_4_new,safSg=ma_sg_new,safApp=ma_app_new'

On 12/6/19 12:01 pm, thang.d.nguyen wrote:
When a node lock is invoked on a node with active assignments, the
dependent SI follows the sponsor SI and moves to QUIESCED. There is a
case where the active assignment for the sponsor SI is in progress on
the remaining SC node and that remaining node goes down. The SUSI is
then removed only for the sponsor SI.
The fix is to also remove the SUSI of the dependent SI.
---
  src/amf/amfd/sg_2n_fsm.cc | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/src/amf/amfd/sg_2n_fsm.cc b/src/amf/amfd/sg_2n_fsm.cc
index 91ffc63..776696c 100644
--- a/src/amf/amfd/sg_2n_fsm.cc
+++ b/src/amf/amfd/sg_2n_fsm.cc
@@ -3175,6 +3175,9 @@ void SG_2N::node_fail_su_oper(AVD_SU *su) {
          }
         su->sg_of_su->set_fsm_state(AVD_SG_FSM_SG_REALIGN);
+      } else {
+        avd_sg_su_si_del_snd(cb, su_oper_list.front());
+        su->sg_of_su->set_fsm_state(AVD_SG_FSM_SG_REALIGN);
        }
       AVD_SU *su_at_head = su_oper_list.front();

_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
