Hi Praveen,

Any comments?
On second thought, I don't think this is related to #2416.
The IMM sync call did time out, but it took only milliseconds, so it most 
likely did not add much latency to the NoRed OpenSAF SU of PL-3 getting its 
assignment after the nodegroup command was issued. PL-3 came up late because 
of the timing of starting node PL-3.

May 22 14:34:03.580469 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1897] >> 
avd_saImmOiRtObjectCreate_sync: SaAmfSIAssignment safSi=ABC,safApp=ABC
May 22 14:34:03.580474 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:0136] >> isImmServiceReady 
May 22 14:34:03.580478 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:0151] << isImmServiceReady: 1:
May 22 14:34:03.580483 osafamfd 
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:2905] >> 
rt_object_create_common 
May 22 14:34:03.580497 osafamfd 
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3026] TR attr:safSISU 
May 22 14:34:03.580502 osafamfd 
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3026] TR 
attr:saAmfSISUHAState 
May 22 14:34:03.580505 osafamfd 
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3026] TR 
attr:saAmfSISUHAReadinessState 
May 22 14:34:03.580509 osafamfd 
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3026] TR 
attr:osafAmfSISUFsmState 
May 22 14:34:03.580632 osafamfd 
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3207] << 
rt_object_create_common 
May 22 14:34:03.580654 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1909] WA saImmOiRtObjectCreate_2 
of className:'SaAmfSIAssignment', parentName:'safSi=ABC,safApp=ABC', failed 
with 6
May 22 14:34:03.580658 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1930] >> 
avd_saImmOiRtObjectCreate: SaAmfSIAssignment safSi=ABC,ABC
May 22 14:34:03.580665 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1945] << 
avd_saImmOiRtObjectCreate 
May 22 14:34:03.580668 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1918] << 
avd_saImmOiRtObjectCreate_sync


Thanks,
Minh


---

**[tickets:#2466] AMF: NodeGroup Admin UNLOCK timeout during cluster start up**

**Status:** unassigned
**Milestone:** 5.17.06
**Created:** Tue May 23, 2017 01:19 AM UTC by Minh Hon Chau
**Last Updated:** Tue May 23, 2017 03:02 AM UTC
**Owner:** nobody


While the cluster is coming up, if a nodegroup admin UNLOCK operation is 
issued (by SMF in this case), the operation can time out because the 
su_cnt_admin_oper of one of the PLs remains 1 forever.

Sequence in detail:
- The cluster has 4 nodes; start the cluster.
- When 3 nodes (SC-1, SC-2, PL-4) have joined the cluster, the nodegroup admin UNLOCK is issued.
~~~
May 22 14:33:46.665539 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'SC-1' joined 
the cluster
May 22 14:33:48.115919 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'SC-2' joined 
the cluster
May 22 14:34:00.442633 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'PL-4' joined 
the cluster
~~~

  The NoRed OpenSAF SU of PL-4 gets assigned:

~~~
May 22 14:34:00.637324 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1171] >> 
avd_su_si_assign_evh: id:30, node:2040f, act:2, 
'safSu=19781416d5,safSg=NoRed,safApp=OpenSAF', 'safSi=NoRed3,safApp=OpenSAF', 
ha:1, err:1, single:0
~~~

   The nodegroup admin UNLOCK is issued:

~~~
 May 22 14:34:02.989761 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/nodegroup.cc:1100] >> ng_admin_op_cb: 
'safAmfNodeGroup=smfLockAdmNg2,safAmfCluster=myAmfCluster', inv:'115964117001', 
op:'1'
 ~~~
 
- When the NoRed OpenSAF SU of PL-3 becomes ENABLED, its assignment starts:

~~~
 May 22 14:34:10.096324 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0725] >> 
avd_su_oper_state_evh: id:29, node:2030f, 
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' state:1
 May 22 14:34:10.097537 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sg_nored_fsm.cc:0305] >> su_insvc: 
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF', 0
 May 22 14:34:10.097549 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0111] >> avd_new_assgn_susi: 
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' 'safSi=a6b0d555f4,safApp=OpenSAF' 
state=1
May 22 14:34:10.097552 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/siass.cc:0440] >> avd_susi_create: 
safSu=PL-3,safSg=NoRed,safApp=OpenSAF safSi=a6b0d555f4,safApp=OpenSAF state=1
~~~

 The su_cnt_admin_oper counter of PL-3 is incremented for the NoRed OpenSAF SU:
 
~~~
May 22 14:34:10.098839 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/util.cc:0978] << avd_snd_susi_msg 
May 22 14:34:10.098841 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0268] TR 
node:'safAmfNode=PL-3,safAmfCluster=myAmfCluster', su_cnt_admin_oper:1
~~~

- The NoRed OpenSAF SU then gets assigned:

~~~
May 22 14:34:10.105283 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1171] >> 
avd_su_si_assign_evh: id:30, node:2030f, act:2, 
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF', 'safSi=a6b0d555f4,safApp=OpenSAF', 
ha:1, err:1, single:0
~~~

  but su_cnt_admin_oper is not decremented:

~~~
May 22 14:34:10.108143 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sg_nored_fsm.cc:0000] << susi_success
May 22 14:34:10.108148 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1579] TR Node_state: 2 adest: 
2010f203defc2 node not ready for assignments
May 22 14:34:10.108153 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1579] TR Node_state: 2 adest: 
2020fc2b319b5 node not ready for assignments
May 22 14:34:10.108157 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0621] >> 
avd_nd_ncs_su_assigned 
May 22 14:34:10.108162 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/node.cc:0461] >> avd_node_state_set: 
'safAmfNode=PL-3,safAmfCluster=myAmfCluster' NCS_INIT => PRESENT
~~~

  In the end, su_cnt_admin_oper remains 1.
  
  When an application SU gets assigned, by contrast, the counter is always decremented:
~~~
May 22 14:34:10.444624 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sg_2n_fsm.cc:2648] << susi_success: rc:1
May 22 14:34:10.444629 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1681] TR 
node:'safAmfNode=PL-3,safAmfCluster=myAmfCluster', su_cnt_admin_oper:2
May 22 14:34:10.444632 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0358] >> 
process_su_si_response_for_ng: 
'safSu=PL-3,safSg=2N,safApp=ERIC-sv.SVScsvStreamer'
May 22 14:34:10.444640 osafamfd 
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0457] << 
process_su_si_response_for_ng 
~~~
There is a check in avd_su_si_assign_evh() that appears to exclude OpenSAF SUs 
when the counter is decremented:
...
      /* else admin oper still not complete */
    } else if ((su->sg_of_su->sg_ncs_spec == false) &&
               ((su->su_on_node->admin_ng != nullptr) ||
                (su->sg_of_su->ng_using_saAmfSGAdminState == true))) {
      AVD_AMF_NG *ng = su->su_on_node->admin_ng;
      // Got response from AMFND for assignments decrement su_cnt_admin_oper.
 ...
 
 In avd_new_assgn_susi(), whether this counter is incremented depends only on 
admin_ng (that is, a nodegroup admin operation has been issued), with no check 
for whether the SU is an OpenSAF SU:
 ...
     if (avd_snd_susi_msg(cb, su, susi, AVSV_SUSI_ACT_ASGN, false, nullptr) ==
        NCSCC_RC_SUCCESS) {
      AVD_AVND *node = su->su_on_node;
      if ((node->admin_node_pend_cbk.invocation != 0) ||
          ((node->admin_ng != nullptr) &&
           (node->admin_ng->admin_ng_pend_cbk.invocation != 0))) {
        node->su_cnt_admin_oper++;
        TRACE("node:'%s', su_cnt_admin_oper:%u", node->name.c_str(),
              node->su_cnt_admin_oper);
        if (node->admin_ng != nullptr) {
          node->admin_ng->node_oper_list.insert(node->name);
          TRACE("node_oper_list size:%u", node->admin_ng->oper_list_size());
        }
 ...
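
The mismatch between the two excerpts above can be reproduced with a small standalone model. This is only a sketch: `Node`, `Su`, `ng_op_pending` and the three functions are hypothetical, simplified stand-ins, not the amfd types or code; only `sg_ncs_spec` and `su_cnt_admin_oper` are names taken from the excerpts. The `guard_increment` switch shows one possible way to make the two sides symmetric and is an assumption, not a proposed or confirmed fix.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-ins for the amfd node and SU objects.
struct Node {
  std::string name;
  bool ng_op_pending;          // stands in for "admin_ng set with a pending invocation"
  uint32_t su_cnt_admin_oper;  // counter discussed in this ticket
};

struct Su {
  std::string name;
  bool sg_ncs_spec;  // true for OpenSAF (middleware) SUs
  Node* node;
};

// Increment side, modelled on the avd_new_assgn_susi() excerpt.
// With guard_increment == false there is no sg_ncs_spec check, as reported.
void on_assignment_sent(Su& su, bool guard_increment) {
  assert(su.node != nullptr);
  if (guard_increment && su.sg_ncs_spec) return;  // hypothetical symmetric guard
  if (su.node->ng_op_pending) ++su.node->su_cnt_admin_oper;
}

// Decrement side, modelled on the avd_su_si_assign_evh() excerpt:
// OpenSAF SUs (sg_ncs_spec == true) never reach the decrement.
void on_assignment_response(Su& su) {
  assert(su.node != nullptr);
  if (!su.sg_ncs_spec && su.node->ng_op_pending &&
      su.node->su_cnt_admin_oper > 0) {
    --su.node->su_cnt_admin_oper;
  }
}

// Replays the reported order of events on PL-3 and returns the leftover count.
uint32_t replay_reported_sequence(bool guard_increment) {
  Node pl3{"PL-3", true, 0};  // nodegroup UNLOCK already issued
  Su nored{"safSu=PL-3,safSg=NoRed,safApp=OpenSAF", true, &pl3};
  Su app{"application SU (placeholder name)", false, &pl3};

  on_assignment_sent(nored, guard_increment);  // counted unless guarded
  on_assignment_response(nored);               // never decremented
  on_assignment_sent(app, guard_increment);    // counted
  on_assignment_response(app);                 // decremented
  return pl3.su_cnt_admin_oper;
}
```

In this model `replay_reported_sequence(false)` leaves the counter at 1, matching the stuck value described above, while `replay_reported_sequence(true)` returns it to 0.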
 
 Please NOTE that this problem started to happen after #2416, which tries to 
update the IMM sync call. That change could be why the NoRed OpenSAF SU of 
PL-3 gets assigned late, so that the nodegroup admin operation is issued before 
the NoRed OpenSAF SU of PL-3 is assigned.

This scenario makes the upgrade fail at the UNLOCK step.

