Hi Praveen,
Any comments?
On second thought, I don't think this relates to #2416.
The IMM sync call did time out, but it took only milliseconds, so it most
likely did not add much latency to the NoRed OpenSAF SU of PL-3 getting its
assignment after the ng command was issued. PL-3 came up late because of the
timing of starting node PL-3.
May 22 14:34:03.580469 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1897] >>
avd_saImmOiRtObjectCreate_sync: SaAmfSIAssignment safSi=ABC,safApp=ABC
May 22 14:34:03.580474 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:0136] >> isImmServiceReady
May 22 14:34:03.580478 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:0151] << isImmServiceReady: 1:
May 22 14:34:03.580483 osafamfd
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:2905] >>
rt_object_create_common
May 22 14:34:03.580497 osafamfd
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3026] TR attr:safSISU
May 22 14:34:03.580502 osafamfd
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3026] TR
attr:saAmfSISUHAState
May 22 14:34:03.580505 osafamfd
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3026] TR
attr:saAmfSISUHAReadinessState
May 22 14:34:03.580509 osafamfd
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3026] TR
attr:osafAmfSISUFsmState
May 22 14:34:03.580632 osafamfd
[11068:11068:../../opensaf/src/imm/agent/imma_oi_api.cc:3207] <<
rt_object_create_common
May 22 14:34:03.580654 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1909] WA saImmOiRtObjectCreate_2
of className:'SaAmfSIAssignment', parentName:'safSi=ABC,safApp=ABC', failed
with 6
May 22 14:34:03.580658 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1930] >>
avd_saImmOiRtObjectCreate: SaAmfSIAssignment safSi=ABC,ABC
May 22 14:34:03.580665 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1945] <<
avd_saImmOiRtObjectCreate
May 22 14:34:03.580668 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/imm.cc:1918] <<
avd_saImmOiRtObjectCreate_sync
Thanks,
Minh
---
** [tickets:#2466] AMF: NodeGroup Admin UNLOCK timeout during cluster start up**
**Status:** unassigned
**Milestone:** 5.17.06
**Created:** Tue May 23, 2017 01:19 AM UTC by Minh Hon Chau
**Last Updated:** Tue May 23, 2017 03:02 AM UTC
**Owner:** nobody
When the cluster is coming up, if a nodegroup admin op UNLOCK is issued (by SMF
in this case), the nodegroup admin op can time out because the
su_cnt_admin_oper of one of the PLs remains 1 forever.
Sequence in detail:
- The cluster has 4 nodes; start the cluster.
- When 3 nodes (SC-1, SC-2, PL-4) have joined the cluster, the nodegroup admin
unlock is issued.
~~~
May 22 14:33:46.665539 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'SC-1' joined
the cluster
May 22 14:33:48.115919 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'SC-2' joined
the cluster
May 22 14:34:00.442633 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'PL-4' joined
the cluster
~~~
The NoRed OpenSAF SU of PL-4 gets assigned:
~~~
May 22 14:34:00.637324 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1171] >>
avd_su_si_assign_evh: id:30, node:2040f, act:2,
'safSu=19781416d5,safSg=NoRed,safApp=OpenSAF', 'safSi=NoRed3,safApp=OpenSAF',
ha:1, err:1, single:0
~~~
The nodegroup admin unlock is issued:
~~~
May 22 14:34:02.989761 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/nodegroup.cc:1100] >> ng_admin_op_cb:
'safAmfNodeGroup=smfLockAdmNg2,safAmfCluster=myAmfCluster', inv:'115964117001',
op:'1'
~~~
- When the NoRed OpenSAF SU of PL-3 becomes ENABLED, its assignment starts:
~~~
May 22 14:34:10.096324 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0725] >>
avd_su_oper_state_evh: id:29, node:2030f,
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' state:1
May 22 14:34:10.097537 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sg_nored_fsm.cc:0305] >> su_insvc:
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF', 0
May 22 14:34:10.097549 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0111] >> avd_new_assgn_susi:
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' 'safSi=a6b0d555f4,safApp=OpenSAF'
state=1
May 22 14:34:10.097552 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/siass.cc:0440] >> avd_susi_create:
safSu=PL-3,safSg=NoRed,safApp=OpenSAF safSi=a6b0d555f4,safApp=OpenSAF state=1
~~~
The su_cnt_admin_oper counter is increased for the NoRed OpenSAF SU:
~~~
May 22 14:34:10.098839 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/util.cc:0978] << avd_snd_susi_msg
May 22 14:34:10.098841 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0268] TR
node:'safAmfNode=PL-3,safAmfCluster=myAmfCluster', su_cnt_admin_oper:1
~~~
- When the NoRed OpenSAF SU gets assigned:
~~~
May 22 14:34:10.105283 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1171] >>
avd_su_si_assign_evh: id:30, node:2030f, act:2,
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF', 'safSi=a6b0d555f4,safApp=OpenSAF',
ha:1, err:1, single:0
~~~
but su_cnt_admin_oper is not decreased:
~~~
May 22 14:34:10.108143 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sg_nored_fsm.cc:0000] << susi_success
May 22 14:34:10.108148 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1579] TR Node_state: 2 adest:
2010f203defc2 node not ready for assignments
May 22 14:34:10.108153 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1579] TR Node_state: 2 adest:
2020fc2b319b5 node not ready for assignments
May 22 14:34:10.108157 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0621] >>
avd_nd_ncs_su_assigned
May 22 14:34:10.108162 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/node.cc:0461] >> avd_node_state_set:
'safAmfNode=PL-3,safAmfCluster=myAmfCluster' NCS_INIT => PRESENT
~~~
At the end, su_cnt_admin_oper still remains 1.
When an application SU gets assigned, the counter is always decreased:
~~~
May 22 14:34:10.444624 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sg_2n_fsm.cc:2648] << susi_success: rc:1
May 22 14:34:10.444629 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1681] TR
node:'safAmfNode=PL-3,safAmfCluster=myAmfCluster', su_cnt_admin_oper:2
May 22 14:34:10.444632 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0358] >>
process_su_si_response_for_ng:
'safSu=PL-3,safSg=2N,safApp=ERIC-sv.SVScsvStreamer'
May 22 14:34:10.444640 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0457] <<
process_su_si_response_for_ng
~~~
There is a check in avd_su_si_assign_evh() that seems to exclude OpenSAF SUs
when the counter is decreased:
...
    /* else admin oper still not complete */
  } else if ((su->sg_of_su->sg_ncs_spec == false) &&
             ((su->su_on_node->admin_ng != nullptr) ||
              (su->sg_of_su->ng_using_saAmfSGAdminState == true))) {
    AVD_AMF_NG *ng = su->su_on_node->admin_ng;
    // Got response from AMFND for assignments decrement su_cnt_admin_oper.
...
In avd_new_assgn_susi(), the counter is increased based only on @admin_ng
(which means a nodegroup admin op was issued), regardless of whether the SU is
an OpenSAF SU:
...
  if (avd_snd_susi_msg(cb, su, susi, AVSV_SUSI_ACT_ASGN, false, nullptr) ==
      NCSCC_RC_SUCCESS) {
    AVD_AVND *node = su->su_on_node;
    if ((node->admin_node_pend_cbk.invocation != 0) ||
        ((node->admin_ng != nullptr) &&
         (node->admin_ng->admin_ng_pend_cbk.invocation != 0))) {
      node->su_cnt_admin_oper++;
      TRACE("node:'%s', su_cnt_admin_oper:%u", node->name.c_str(),
            node->su_cnt_admin_oper);
      if (node->admin_ng != nullptr) {
        node->admin_ng->node_oper_list.insert(node->name);
        TRACE("node_oper_list size:%u", node->admin_ng->oper_list_size());
      }
...
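To make the mismatch between the two snippets above easier to see, here is a
minimal standalone sketch. It is not the AMFD code: Node, Su,
on_assignment_sent(), on_assignment_response() and replay_pl3_sequence() are
hypothetical stand-ins, and the conditions are simplified to the parts quoted
above. It only models the observation that the increment ignores sg_ncs_spec
while the decrement is skipped for OpenSAF (NCS) SUs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins; these are NOT the real AMFD types.
struct Node {
  bool ng_admin_op_pending = false;  // models admin_ng with a pending callback
  std::uint32_t su_cnt_admin_oper = 0;
};

struct Su {
  bool sg_ncs_spec;  // true for an OpenSAF middleware (NCS) SU
  Node *node;
};

// Models the increment in avd_new_assgn_susi(): sg_ncs_spec is not checked.
void on_assignment_sent(Su &su) {
  if (su.node->ng_admin_op_pending) ++su.node->su_cnt_admin_oper;
}

// Models the decrement branch in avd_su_si_assign_evh(): skipped for NCS SUs.
void on_assignment_response(Su &su) {
  if (!su.sg_ncs_spec && su.node->ng_admin_op_pending &&
      su.node->su_cnt_admin_oper > 0) {
    --su.node->su_cnt_admin_oper;
  }
}

// Replays the PL-3 sequence from the traces and returns the final counter.
std::uint32_t replay_pl3_sequence() {
  Node pl3;
  pl3.ng_admin_op_pending = true;  // nodegroup UNLOCK issued before PL-3 is up

  Su nored{true, &pl3};  // NoRed OpenSAF SU
  Su app{false, &pl3};   // application SU (2N in the traces)

  on_assignment_sent(nored);      // counter: 0 -> 1
  on_assignment_response(nored);  // NCS SU: decrement skipped, stays 1
  on_assignment_sent(app);        // counter: 1 -> 2
  on_assignment_response(app);    // counter: 2 -> 1
  return pl3.su_cnt_admin_oper;
}
```

In this model replay_pl3_sequence() ends with the counter at 1, which matches
the traces: su_cnt_admin_oper:1 after the NoRed assignment is sent,
su_cnt_admin_oper:2 when the application SU responds, and a leftover of 1 that
keeps the nodegroup admin op from completing.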
Please NOTE that this problem started to happen since #2416, which tries to
update the IMM sync call. This could be the reason that the NoRed OpenSAF SU of
PL-3 gets assigned late, so that the nodegroup admin op is issued before the
NoRed OpenSAF SU of PL-3 is assigned.
This scenario makes the upgrade fail at the UNLOCK cluster step.