Hi Minh,
I will go through it today.
Thanks
Praveen
---
**[tickets:#2466] AMF: NodeGroup Admin UNLOCK timeout during cluster start up**
**Status:** unassigned
**Milestone:** 5.17.06
**Created:** Tue May 23, 2017 01:19 AM UTC by Minh Hon Chau
**Last Updated:** Tue May 23, 2017 05:13 AM UTC
**Owner:** nobody
While the cluster is coming up, if a nodegroup admin UNLOCK operation is issued
(by SMF in this case), the operation can time out because the
su_cnt_admin_oper of one of the PLs remains 1 forever.
Sequence in detail:
- A cluster has 4 nodes; start the cluster.
- When 3 nodes (SC-1, SC-2, PL-4) have joined the cluster, the nodegroup admin UNLOCK is issued.
~~~
May 22 14:33:46.665539 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'SC-1' joined
the cluster
May 22 14:33:48.115919 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'SC-2' joined
the cluster
May 22 14:34:00.442633 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0526] NO Node 'PL-4' joined
the cluster
~~~
The NoRed OpenSAF SU of PL-4 gets assigned:
~~~
May 22 14:34:00.637324 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1171] >>
avd_su_si_assign_evh: id:30, node:2040f, act:2,
'safSu=19781416d5,safSg=NoRed,safApp=OpenSAF', 'safSi=NoRed3,safApp=OpenSAF',
ha:1, err:1, single:0
~~~
The nodegroup admin UNLOCK is issued:
~~~
May 22 14:34:02.989761 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/nodegroup.cc:1100] >> ng_admin_op_cb:
'safAmfNodeGroup=smfLockAdmNg2,safAmfCluster=myAmfCluster', inv:'115964117001',
op:'1'
~~~
- When the NoRed OpenSAF SU of PL-3 becomes ENABLED, its assignment starts:
~~~
May 22 14:34:10.096324 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0725] >>
avd_su_oper_state_evh: id:29, node:2030f,
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' state:1
May 22 14:34:10.097537 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sg_nored_fsm.cc:0305] >> su_insvc:
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF', 0
May 22 14:34:10.097549 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0111] >> avd_new_assgn_susi:
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' 'safSi=a6b0d555f4,safApp=OpenSAF'
state=1
May 22 14:34:10.097552 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/siass.cc:0440] >> avd_susi_create:
safSu=PL-3,safSg=NoRed,safApp=OpenSAF safSi=a6b0d555f4,safApp=OpenSAF state=1
~~~
The su_cnt_admin_oper of node PL-3 is incremented for the NoRed OpenSAF SU:
~~~
May 22 14:34:10.098839 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/util.cc:0978] << avd_snd_susi_msg
May 22 14:34:10.098841 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0268] TR
node:'safAmfNode=PL-3,safAmfCluster=myAmfCluster', su_cnt_admin_oper:1
~~~
- When the NoRed OpenSAF SU gets assigned:
~~~
May 22 14:34:10.105283 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1171] >>
avd_su_si_assign_evh: id:30, node:2030f, act:2,
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF', 'safSi=a6b0d555f4,safApp=OpenSAF',
ha:1, err:1, single:0
~~~
but su_cnt_admin_oper is not decremented:
~~~
May 22 14:34:10.108143 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sg_nored_fsm.cc:0000] << susi_success
May 22 14:34:10.108148 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1579] TR Node_state: 2 adest:
2010f203defc2 node not ready for assignments
May 22 14:34:10.108153 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1579] TR Node_state: 2 adest:
2020fc2b319b5 node not ready for assignments
May 22 14:34:10.108157 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/ndfsm.cc:0621] >>
avd_nd_ncs_su_assigned
May 22 14:34:10.108162 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/node.cc:0461] >> avd_node_state_set:
'safAmfNode=PL-3,safAmfCluster=myAmfCluster' NCS_INIT => PRESENT
~~~
In the end, su_cnt_admin_oper remains 1.
When an application SU gets assigned, by contrast, the counter is always decremented:
~~~
May 22 14:34:10.444624 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sg_2n_fsm.cc:2648] << susi_success: rc:1
May 22 14:34:10.444629 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:1681] TR
node:'safAmfNode=PL-3,safAmfCluster=myAmfCluster', su_cnt_admin_oper:2
May 22 14:34:10.444632 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0358] >>
process_su_si_response_for_ng:
'safSu=PL-3,safSg=2N,safApp=ERIC-sv.SVScsvStreamer'
May 22 14:34:10.444640 osafamfd
[11068:11068:../../opensaf/src/amf/amfd/sgproc.cc:0457] <<
process_su_si_response_for_ng
~~~
There is a check in avd_su_si_assign_evh() that seems to exclude OpenSAF SUs
(sg_ncs_spec == true) when the counter is decremented:
...
    /* else admin oper still not complete */
  } else if ((su->sg_of_su->sg_ncs_spec == false) &&
             ((su->su_on_node->admin_ng != nullptr) ||
              (su->sg_of_su->ng_using_saAmfSGAdminState == true))) {
    AVD_AMF_NG *ng = su->su_on_node->admin_ng;
    // Got response from AMFND for assignments decrement su_cnt_admin_oper.
...
In avd_new_assgn_susi(), however, the counter is incremented based only on
admin_ng (i.e. a nodegroup admin operation is pending), without checking
whether the SU is an OpenSAF SU:
...
  if (avd_snd_susi_msg(cb, su, susi, AVSV_SUSI_ACT_ASGN, false, nullptr) ==
      NCSCC_RC_SUCCESS) {
    AVD_AVND *node = su->su_on_node;
    if ((node->admin_node_pend_cbk.invocation != 0) ||
        ((node->admin_ng != nullptr) &&
         (node->admin_ng->admin_ng_pend_cbk.invocation != 0))) {
      node->su_cnt_admin_oper++;
      TRACE("node:'%s', su_cnt_admin_oper:%u", node->name.c_str(),
            node->su_cnt_admin_oper);
      if (node->admin_ng != nullptr) {
        node->admin_ng->node_oper_list.insert(node->name);
        TRACE("node_oper_list size:%u", node->admin_ng->oper_list_size());
      }
...
This scenario makes the upgrade fail at the nodegroup UNLOCK step.