When using PLM an AMF node mapped to a CLM node mapped to a PLM EE, can get stuck in locked state when rebooting, or going through a PLM EE lock/unlock.
When amfd receives a START step from CLM tracking it attempts to gracefully shutdown the AMF node using AMF admin operations lock/lock-in. When PLM is involved this doesn't always work correctly because PLM is also shutting down the node by calling "opensafd stop". There is a race condition between PLM using "opensafd stop", and amfd using the admin operations to bring down the node, so that sometimes the AMF node gets stuck in locked state. If the rootCauseEntity in the CLM tracking is a PLM entity then don't do anything, as "opensafd stop" is already being called. --- src/amf/amfd/clm.cc | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/src/amf/amfd/clm.cc b/src/amf/amfd/clm.cc index 2bcea2db0..7f675d8e9 100644 --- a/src/amf/amfd/clm.cc +++ b/src/amf/amfd/clm.cc @@ -274,6 +274,27 @@ static void clm_track_cb( TRACE_3("Already got callback for start of this change."); continue; } + + if (strncmp(osaf_extended_name_borrow(rootCauseEntity), + "safEE=", + sizeof("safEE=") - 1) == 0 || + strncmp(osaf_extended_name_borrow(rootCauseEntity), + "safHE=", + sizeof("safHE=") - 1) == 0) { + // PLM will take care of calling opensafd stop + TRACE("rootCause: %s from PLM operation so skipping %u", + osaf_extended_name_borrow(rootCauseEntity), + notifItem->clusterNode.nodeId); + + SaAisErrorT rc(saClmResponse_4(avd_cb->clmHandle, + invocation, + SA_CLM_CALLBACK_RESPONSE_OK)); + if (rc != SA_AIS_OK) + LOG_ER("saClmResponse_4 failed: %i", rc); + + continue; + } + /* invocation to be used by pending clm response */ node->clm_pend_inv = invocation; clm_node_exit_start(node, notifItem->clusterChange); @@ -304,7 +325,9 @@ static void clm_track_cb( osaf_extended_name_borrow(rootCauseEntity), notifItem->clusterNode.nodeId); if (strncmp(osaf_extended_name_borrow(rootCauseEntity), - "safEE=", 6) == 0) { + "safEE=", 6) == 0 || + strncmp(osaf_extended_name_borrow(rootCauseEntity), + "safHE=", 6) == 0) { /* This callback is because of operation on PLM, so we need to mark the node absent, because PLCD will anyway call opensafd stop.*/ AVD_AVND *node = -- 2.13.6 ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel