Hi Mathi, I noticed that #1781 has moved the CLM init before amfd get assigned role. Would you be happy if I change amfd to make it does CLM init when it actually gets active/standby role?
Thanks, Minh On 25/07/16 11:25, minh chau wrote: > I mean "... no objection ..." > > Thanks, > Minh > > On 25/07/16 10:53, minh chau wrote: >> Hi, >> >> I have tried to reproduce the problem in #1781, when amfd in >> non-member node gets assigned role, amfd initializes CLM >> successfully. Only sfmd, cpkt got UNAVAILABLE from >> saClmClusterNodeGet(). >> Ander's explanation sounds reasonably that amfd should initialize CLM >> after amfd get assigned role. If there's no comment on the patch and >> no object to Anders' suggestion till Wednesday this week, I would >> like to float V2 patch that incorporates Anders' suggestion. >> >> Thanks, >> Minh >> On 22/07/16 22:31, Anders Widell wrote: >>> Yes, that's what I meant when I said that if we really do wish to >>> assign the STANDBY role to the directors running on a CLM locked >>> node, then we need to make sure CLM can differentiate between >>> middleware- and non-middleware clients. Middleware clients should >>> then not be affected by the CLM node lock. >>> >>> regards, >>> Anders Widell >>> >>> On 07/22/2016 02:16 PM, praveen malviya wrote: >>>> Such a configured non-member node will be notified only for node >>>> local changes in track callback and not cluster wide changes. This >>>> is one reason clm lock of active controller node is not allowed by >>>> CLM and AMF also rejects this in validation step. >>>> >>>> >>>> Thanks, >>>> Praveen >>>> >>>> >>>> On 22-Jul-16 5:11 PM, Anders Widell wrote: >>>>> Ok good, so then we there is no problem calling >>>>> saClmInitialize_4() and >>>>> saClmSelectionObjectGet() on a locked node. I checked >>>>> saClmClusterTrack_4() and it should also be safe according to the >>>>> spec. >>>>> >>>>> I would prefer if you don't initialize the CLM and NTF handles on >>>>> spare >>>>> nodes, until they actually get a STANDBY or ACTIVE assignment. One >>>>> reason is performance - initializing a handle presumably consumes >>>>> resources on both the spare controller node as well as in the CLM/NTF >>>>> server running on the active node. We (currently) don't need the >>>>> handles >>>>> as long as we are running as spares, so this is a waste of resources. >>>>> But more importantly, I think it is safer to defer the initialization >>>>> until the handles are needed. Let's suppose the spare starts up and >>>>> keeps running for a very long time as a spare. Then - much later - >>>>> you >>>>> become STANDBY or ACTIVE. Now you want to start using those >>>>> handles that >>>>> you initialized way back in history. Who knows if the handles are >>>>> still >>>>> working? A handle is essentially connection to a server running on >>>>> another node. A lot of things may have happened since we >>>>> initialized it; >>>>> controller switch-overs, fail-overs, software upgrades, and headless >>>>> situations. Yes the handles ought to work - and if not, they don't >>>>> they >>>>> ought to return BAD_HANDLE so that we can re-initialize them. But >>>>> there >>>>> is also a small possibility that a bug causes the handle to simply >>>>> not >>>>> work. A "fresh", newly initialized handle would be safer than a >>>>> one-year >>>>> old handle. >>>>> >>>>> Do you think it would work if you defer creating these background >>>>> threads until we get an ACTIVE/STANDBY assignment? >>>>> >>>>> regards, >>>>> Anders Widell >>>>> >>>>> On 07/22/2016 12:20 PM, minh chau wrote: >>>>>> Hi, >>>>>> >>>>>> I think Praveen's comment on #1812 was worrying about amfd hanging >>>>>> when init with CLM, this patch does not change position of CLM >>>>>> initialization and also it's done in thread so it will be ok? >>>>>> Regarding Anders' comment: I did quick test, lock clm on standby >>>>>> controller and reboot it, when it comes up it initializes with CLM >>>>>> successfully, so it seems we won't get ERR_UNAVAILABLE on configured >>>>>> non-member node >>>>>> Thought that handling ERR_UNAVAILABLE should be removed in CLM >>>>>> init in >>>>>> the scope of this ticket, but would it be useful in case that amfd >>>>>> re-init CLM up on receiving BAD_HANDLE? >>>>>> >>>>>> Thanks, >>>>>> Minh >>>>>> On 20/07/16 22:06, Anders Widell wrote: >>>>>>> Regarding ticket [#1781], I think that one requires some more >>>>>>> thought. First of all, do we want to assign the STANDBY role to the >>>>>>> OpenSAF directors running on a CLM locked node? If we do want a CLM >>>>>>> locked node to become standby, then CLM ought to provide service to >>>>>>> middleware clients running on a locked node! It must differentiate >>>>>>> between middleware- and non-middleware clients. >>>>>>> >>>>>>> By the way, saClmInitialize_4() and saClmSelectionObjectGet() >>>>>>> should >>>>>>> not return ERR_UNAVAILABLE on configured non-member nodes - they >>>>>>> shall only return ERR_UNAVAILABLE on unconfigured nodes. >>>>>>> >>>>>>> regards, >>>>>>> Anders Widell >>>>>>> >>>>>>> On 07/20/2016 12:55 PM, praveen malviya wrote: >>>>>>>> Hi Minh, >>>>>>>> >>>>>>>> For the ticket #1812 I had given one comment. >>>>>>>> It was: >>>>>>>> "In the fix of ticekt #1781, it has been suggested for spare >>>>>>>> controllers >>>>>>>> to init with CLM before become AMF role aware. Here also same >>>>>>>> problem >>>>>>>> will come. If spare is running on CLM locked node then it >>>>>>>> will never >>>>>>>> come out of avd_clm_init() as ERR_UNAVAILBLE is handled there for >>>>>>>> reinit. Also admin will not be able to unlock it until one of the >>>>>>>> controller joins. In this particular case AMFD must exit >>>>>>>> instead of >>>>>>>> indefinitely calling the CLM init API. >>>>>>>> Also this fix will cause re-init will CLM in non-headless case >>>>>>>> also. I >>>>>>>> think in non-headless case it will be good to initialize with >>>>>>>> CLM in a >>>>>>>> separate thread" >>>>>>>> >>>>>>>> I think it is valid here also or is it handled. >>>>>>>> Thanks, >>>>>>>> Praveen >>>>>>>> On 11-Jul-16 1:01 PM, Minh Hon Chau wrote: >>>>>>>>> osaf/services/saf/amf/amfd/clm.cc | 103 >>>>>>>>> ++++++++++++++++++++++++++---- >>>>>>>>> osaf/services/saf/amf/amfd/include/cb.h | 7 +- >>>>>>>>> osaf/services/saf/amf/amfd/include/clm.h | 6 +- >>>>>>>>> osaf/services/saf/amf/amfd/include/ntf.h | 3 + >>>>>>>>> osaf/services/saf/amf/amfd/main.cc | 20 ++++- >>>>>>>>> osaf/services/saf/amf/amfd/ntf.cc | 85 >>>>>>>>> +++++++++++++++++++++++++ >>>>>>>>> osaf/services/saf/amf/amfd/role.cc | 15 +--- >>>>>>>>> 7 files changed, 204 insertions(+), 35 deletions(-) >>>>>>>>> >>>>>>>>> >>>>>>>>> In new controller reallocation scenario with roaming sc >>>>>>>>> feature, if >>>>>>>>> immnd >>>>>>>>> dies in the node becoming active, the circular dependencies among >>>>>>>>> Opensaf >>>>>>>>> services appear, which leads eventually to a reboot. >>>>>>>>> >>>>>>>>> The dependencies are: >>>>>>>>> .clmd can not use IMM services since immnd dies >>>>>>>>> .immnd needs restarted by amfnd >>>>>>>>> .amfnd is hanging since amfnd is calling CLM services >>>>>>>>> .amfd is also hanging since amfd is calling CLM and NTF services >>>>>>>>> .ntfd is hanging due to logd's dependencies on IMM >>>>>>>>> >>>>>>>>> The problem could be solved if: >>>>>>>>> . amfd initializes NTF, CLM handle in thread in initialization >>>>>>>>> phase >>>>>>>>> . amfnd initializes CLM in thread if amfnd receives clm bad >>>>>>>>> handle >>>>>>>>> >>>>>>>>> Since amfnd has already initialized CLM in thread up on receiving >>>>>>>>> clm bad >>>>>>>>> handle. This patch does initialze CLM, NTF in thread at amfd >>>>>>>>> side. >>>>>>>>> Also, >>>>>>>>> threading initialization in this patch can be refactored later by >>>>>>>>> utilizing >>>>>>>>> the support of #1609 >>>>>>>>> >>>>>>>>> diff --git a/osaf/services/saf/amf/amfd/clm.cc >>>>>>>>> b/osaf/services/saf/amf/amfd/clm.cc >>>>>>>>> --- a/osaf/services/saf/amf/amfd/clm.cc >>>>>>>>> +++ b/osaf/services/saf/amf/amfd/clm.cc >>>>>>>>> @@ -386,14 +386,26 @@ static const SaClmCallbacksT_4 clm_callb >>>>>>>>> /*.saClmClusterTrackCallback =*/ clm_track_cb >>>>>>>>> }; >>>>>>>>> >>>>>>>>> -SaAisErrorT avd_clm_init(void) >>>>>>>>> +SaAisErrorT avd_clm_init(AVD_CL_CB* cb) >>>>>>>>> { >>>>>>>>> - SaAisErrorT error = SA_AIS_OK; >>>>>>>>> + SaAisErrorT error = SA_AIS_OK; >>>>>>>>> + SaClmHandleT clm_handle = 0; >>>>>>>>> + SaSelectionObjectT sel_obj = 0; >>>>>>>>> >>>>>>>>> + cb->clmHandle = 0; >>>>>>>>> + cb->clm_sel_obj = 0; >>>>>>>>> TRACE_ENTER(); >>>>>>>>> + /* >>>>>>>>> + * TODO: This CLM initialization thread can be re-factored >>>>>>>>> + * after having osaf dedicated thread, so that all APIs >>>>>>>>> calls >>>>>>>>> + * to external service can be automatically retried with >>>>>>>>> result >>>>>>>>> + * code (TRY_AGAIN, TIMEOUT, UNAVAILABLE), or reinitialized >>>>>>>>> within >>>>>>>>> + * BAD_HANDLE. Also, duplicated codes in initialization >>>>>>>>> thread >>>>>>>>> + * will be moved to osaf dedicated thread >>>>>>>>> + */ >>>>>>>>> for (;;) { >>>>>>>>> SaVersionT Version = { 'B', 4, 1 }; >>>>>>>>> - error = saClmInitialize_4(&avd_cb->clmHandle, >>>>>>>>> &clm_callbacks, &Version); >>>>>>>>> + error = saClmInitialize_4(&clm_handle, &clm_callbacks, >>>>>>>>> &Version); >>>>>>>>> if (error == SA_AIS_ERR_TRY_AGAIN || >>>>>>>>> error == SA_AIS_ERR_TIMEOUT || >>>>>>>>> error == SA_AIS_ERR_UNAVAILABLE) { >>>>>>>>> @@ -404,15 +416,21 @@ SaAisErrorT avd_clm_init(void) >>>>>>>>> osaf_nanosleep(&kHundredMilliseconds); >>>>>>>>> continue; >>>>>>>>> } >>>>>>>>> - if (error == SA_AIS_OK) break; >>>>>>>>> - LOG_ER("Failed to Initialize with CLM: %u", error); >>>>>>>>> + if (error == SA_AIS_OK) { >>>>>>>>> + break; >>>>>>>>> + }else { >>>>>>>>> + LOG_ER("Failed to Initialize with CLM: %u", error); >>>>>>>>> + goto done; >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + cb->clmHandle = clm_handle; >>>>>>>>> + error = saClmSelectionObjectGet(cb->clmHandle, &sel_obj); >>>>>>>>> + if (error != SA_AIS_OK) { >>>>>>>>> + LOG_ER("Failed to get selection object from CLM %u", >>>>>>>>> error); >>>>>>>>> + cb->clmHandle = 0; >>>>>>>>> goto done; >>>>>>>>> } >>>>>>>>> - error = saClmSelectionObjectGet(avd_cb->clmHandle, >>>>>>>>> &avd_cb->clm_sel_obj); >>>>>>>>> - if (SA_AIS_OK != error) { >>>>>>>>> - LOG_ER("Failed to get selection object from CLM %u", >>>>>>>>> error); >>>>>>>>> - goto done; >>>>>>>>> - } >>>>>>>>> + cb->clm_sel_obj = sel_obj; >>>>>>>>> >>>>>>>>> TRACE("Successfully initialized CLM"); >>>>>>>>> >>>>>>>>> @@ -428,10 +446,15 @@ SaAisErrorT avd_clm_track_start(void) >>>>>>>>> >>>>>>>>> TRACE_ENTER(); >>>>>>>>> error = saClmClusterTrack_4(avd_cb->clmHandle, trackFlags, >>>>>>>>> nullptr); >>>>>>>>> - if (SA_AIS_OK != error) >>>>>>>>> - LOG_ER("Failed to start cluster tracking %u", >>>>>>>>> error); >>>>>>>>> - >>>>>>>>> - TRACE_LEAVE(); >>>>>>>>> + if (error != SA_AIS_OK) { >>>>>>>>> + if (error == SA_AIS_ERR_TRY_AGAIN || error == >>>>>>>>> SA_AIS_ERR_TIMEOUT || >>>>>>>>> + error == SA_AIS_ERR_UNAVAILABLE) { >>>>>>>>> + LOG_WA("Failed to start cluster tracking %u", >>>>>>>>> error); >>>>>>>>> + } else { >>>>>>>>> + LOG_ER("Failed to start cluster tracking %u", >>>>>>>>> error); >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + TRACE_LEAVE(); >>>>>>>>> return error; >>>>>>>>> } >>>>>>>>> >>>>>>>>> @@ -468,3 +491,55 @@ void clm_node_terminate(AVD_AVND *node) >>>>>>>>> else >>>>>>>>> TRACE("Waiting for the pending SU presence state >>>>>>>>> updates"); >>>>>>>>> } >>>>>>>>> + >>>>>>>>> +static void* avd_clm_init_thread(void* arg) >>>>>>>>> +{ >>>>>>>>> + TRACE_ENTER(); >>>>>>>>> + AVD_CL_CB* cb = static_cast<AVD_CL_CB*>(arg); >>>>>>>>> + SaAisErrorT error = SA_AIS_OK; >>>>>>>>> + >>>>>>>>> + if (avd_clm_init(cb) != SA_AIS_OK) { >>>>>>>>> + LOG_ER("avd_clm_init FAILED"); >>>>>>>>> + goto done; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + if (cb->avail_state_avd == SA_AMF_HA_ACTIVE) { >>>>>>>>> + for (;;) { >>>>>>>>> + error = avd_clm_track_start(); >>>>>>>>> + if (error == SA_AIS_ERR_TRY_AGAIN || >>>>>>>>> + error == SA_AIS_ERR_TIMEOUT || >>>>>>>>> + error == SA_AIS_ERR_UNAVAILABLE) { >>>>>>>>> + osaf_nanosleep(&kHundredMilliseconds); >>>>>>>>> + continue; >>>>>>>>> + } >>>>>>>>> + if (error == SA_AIS_OK) { >>>>>>>>> + break; >>>>>>>>> + } else { >>>>>>>>> + LOG_ER("avd_clm_track_start FAILED, error: %u", >>>>>>>>> error); >>>>>>>>> + goto done; >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> +done: >>>>>>>>> + TRACE_LEAVE(); >>>>>>>>> + return nullptr; >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> +SaAisErrorT avd_start_clm_init_bg(void) >>>>>>>>> +{ >>>>>>>>> + pthread_t thread; >>>>>>>>> + pthread_attr_t attr; >>>>>>>>> + pthread_attr_init(&attr); >>>>>>>>> + pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); >>>>>>>>> + >>>>>>>>> + if (pthread_create(&thread, &attr, avd_clm_init_thread, >>>>>>>>> avd_cb) != 0) { >>>>>>>>> + LOG_ER("pthread_create FAILED: %s", strerror(errno)); >>>>>>>>> + exit(EXIT_FAILURE); >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + pthread_attr_destroy(&attr); >>>>>>>>> + return SA_AIS_OK; >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> + >>>>>>>>> diff --git a/osaf/services/saf/amf/amfd/include/cb.h >>>>>>>>> b/osaf/services/saf/amf/amfd/include/cb.h >>>>>>>>> --- a/osaf/services/saf/amf/amfd/include/cb.h >>>>>>>>> +++ b/osaf/services/saf/amf/amfd/include/cb.h >>>>>>>>> @@ -47,6 +47,7 @@ >>>>>>>>> >>>>>>>>> #include <list> >>>>>>>>> #include <queue> >>>>>>>>> +#include <atomic> >>>>>>>>> >>>>>>>>> class AVD_SI; >>>>>>>>> class AVD_AVND; >>>>>>>>> @@ -193,7 +194,7 @@ typedef struct cl_cb_tag { >>>>>>>>> since the cluster boot time */ >>>>>>>>> >>>>>>>>> /********** NTF related params ***********/ >>>>>>>>> - SaNtfHandleT ntfHandle; >>>>>>>>> + std::atomic<SaNtfHandleT> ntfHandle; >>>>>>>>> >>>>>>>>> /********Peer AvD related*********************/ >>>>>>>>> AVD_EXT_COMP_INFO ext_comp_info; >>>>>>>>> @@ -207,8 +208,8 @@ typedef struct cl_cb_tag { >>>>>>>>> bool is_implementer; >>>>>>>>> >>>>>>>>> /* Clm stuff */ >>>>>>>>> - SaClmHandleT clmHandle; >>>>>>>>> - SaSelectionObjectT clm_sel_obj; >>>>>>>>> + std::atomic<SaClmHandleT> clmHandle; >>>>>>>>> + std::atomic<SaSelectionObjectT> clm_sel_obj; >>>>>>>>> >>>>>>>>> bool fully_initialized; >>>>>>>>> bool swap_switch; /* true - In middle of role switch. */ >>>>>>>>> diff --git a/osaf/services/saf/amf/amfd/include/clm.h >>>>>>>>> b/osaf/services/saf/amf/amfd/include/clm.h >>>>>>>>> --- a/osaf/services/saf/amf/amfd/include/clm.h >>>>>>>>> +++ b/osaf/services/saf/amf/amfd/include/clm.h >>>>>>>>> @@ -21,10 +21,14 @@ >>>>>>>>> #ifndef _AVD_CLM_H >>>>>>>>> #define _AVD_CLM_H >>>>>>>>> >>>>>>>>> -extern SaAisErrorT avd_clm_init(void); >>>>>>>>> +struct cl_cb_tag; >>>>>>>>> + >>>>>>>>> + >>>>>>>>> +extern SaAisErrorT avd_clm_init(struct cl_cb_tag*); >>>>>>>>> extern SaAisErrorT avd_clm_track_start(void); >>>>>>>>> extern SaAisErrorT avd_clm_track_stop(void); >>>>>>>>> extern void clm_node_terminate(AVD_AVND *node); >>>>>>>>> +extern SaAisErrorT avd_start_clm_init_bg(void); >>>>>>>>> >>>>>>>>> #endif >>>>>>>>> >>>>>>>>> diff --git a/osaf/services/saf/amf/amfd/include/ntf.h >>>>>>>>> b/osaf/services/saf/amf/amfd/include/ntf.h >>>>>>>>> --- a/osaf/services/saf/amf/amfd/include/ntf.h >>>>>>>>> +++ b/osaf/services/saf/amf/amfd/include/ntf.h >>>>>>>>> @@ -105,4 +105,7 @@ void avd_alarm_clear(const SaNameT *name >>>>>>>>> >>>>>>>>> void avd_send_error_report_ntf(const SaNameT *name, >>>>>>>>> SaAmfRecommendedRecoveryT recovery); >>>>>>>>> >>>>>>>>> +extern SaAisErrorT avd_ntf_init(struct cl_cb_tag*); >>>>>>>>> +extern SaAisErrorT avd_start_ntf_init_bg(void); >>>>>>>>> + >>>>>>>>> #endif >>>>>>>>> diff --git a/osaf/services/saf/amf/amfd/main.cc >>>>>>>>> b/osaf/services/saf/amf/amfd/main.cc >>>>>>>>> --- a/osaf/services/saf/amf/amfd/main.cc >>>>>>>>> +++ b/osaf/services/saf/amf/amfd/main.cc >>>>>>>>> @@ -578,12 +578,21 @@ static uint32_t initialize(void) >>>>>>>>> >>>>>>>>> // CLM init is independent of this SC's role. Init with CLM >>>>>>>>> early. >>>>>>>>> >>>>>>>>> - if (avd_clm_init() != SA_AIS_OK) { >>>>>>>>> + cb->avail_state_avd = role; >>>>>>>>> + >>>>>>>>> + if (avd_start_clm_init_bg() != SA_AIS_OK) { >>>>>>>>> LOG_EM("avd_clm_init FAILED"); >>>>>>>>> rc = NCSCC_RC_FAILURE; >>>>>>>>> goto done; >>>>>>>>> } >>>>>>>>> >>>>>>>>> + // Initialize NTF handle in thread >>>>>>>>> + if (avd_start_ntf_init_bg() != SA_AIS_OK) { >>>>>>>>> + LOG_EM("avd_start_ntf_init_bg FAILED"); >>>>>>>>> + goto done; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + >>>>>>>>> if ((rc = initialize_for_assignment(cb, role)) >>>>>>>>> != NCSCC_RC_SUCCESS) { >>>>>>>>> LOG_ER("initialize_for_assignment FAILED %u", >>>>>>>>> (unsigned) >>>>>>>>> rc); >>>>>>>>> @@ -633,11 +642,14 @@ static void main_loop(void) >>>>>>>>> while (1) { >>>>>>>>> fds[FD_MBCSV].fd = cb->mbcsv_sel_obj; >>>>>>>>> fds[FD_MBCSV].events = POLLIN; >>>>>>>>> - fds[FD_CLM].fd = cb->clm_sel_obj; >>>>>>>>> - fds[FD_CLM].events = POLLIN; >>>>>>>>> fds[FD_IMM].fd = cb->imm_sel_obj; // IMM fd must be >>>>>>>>> last >>>>>>>>> in array >>>>>>>>> fds[FD_IMM].events = POLLIN; >>>>>>>>> - >>>>>>>>> + >>>>>>>>> + if (cb->clmHandle != 0) { >>>>>>>>> + fds[FD_CLM].fd = cb->clm_sel_obj; >>>>>>>>> + fds[FD_CLM].events = POLLIN; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> if (cb->immOiHandle != 0) { >>>>>>>>> fds[FD_IMM].fd = cb->imm_sel_obj; >>>>>>>>> fds[FD_IMM].events = POLLIN; >>>>>>>>> diff --git a/osaf/services/saf/amf/amfd/ntf.cc >>>>>>>>> b/osaf/services/saf/amf/amfd/ntf.cc >>>>>>>>> --- a/osaf/services/saf/amf/amfd/ntf.cc >>>>>>>>> +++ b/osaf/services/saf/amf/amfd/ntf.cc >>>>>>>>> @@ -25,6 +25,7 @@ >>>>>>>>> #include <logtrace.h> >>>>>>>>> #include <util.h> >>>>>>>>> #include <ntf.h> >>>>>>>>> +#include "osaf_time.h" >>>>>>>>> >>>>>>>>> /***************************************************************************** >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Name : avd_send_comp_inst_failed_alarm >>>>>>>>> @@ -572,6 +573,12 @@ uint32_t sendAlarmNotificationAvd(AVD_CL >>>>>>>>> return status; >>>>>>>>> } >>>>>>>>> >>>>>>>>> + if (avd_cb->ntfHandle == 0) { >>>>>>>>> + LOG_ER("NTF handle has not been initialized, alarm >>>>>>>>> notification " >>>>>>>>> + "for (%s) will be lost", ntf_object.value); >>>>>>>>> + return status; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> if (type != 0) { >>>>>>>>> add_info_items = 1; >>>>>>>>> allocation_size = SA_NTF_ALLOC_SYSTEM_LIMIT; >>>>>>>>> @@ -660,6 +667,13 @@ uint32_t sendStateChangeNotificationAvd( >>>>>>>>> LOG_WA("State change notification lost for '%s'", >>>>>>>>> ntf_object.value); >>>>>>>>> return status; >>>>>>>>> } >>>>>>>>> + >>>>>>>>> + if (avd_cb->ntfHandle == 0) { >>>>>>>>> + LOG_WA("NTF handle has not been initialized, state >>>>>>>>> change >>>>>>>>> notification " >>>>>>>>> + "for (%s) will be lost", ntf_object.value); >>>>>>>>> + return status; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> if (additional_info_is_present == true) { >>>>>>>>> add_info_items = 1; >>>>>>>>> allocation_size = SA_NTF_ALLOC_SYSTEM_LIMIT; >>>>>>>>> @@ -770,4 +784,75 @@ void avd_send_error_report_ntf(const SaN >>>>>>>>> TRACE_LEAVE(); >>>>>>>>> } >>>>>>>>> >>>>>>>>> +SaAisErrorT avd_ntf_init(AVD_CL_CB* cb) >>>>>>>>> +{ >>>>>>>>> + SaAisErrorT error = SA_AIS_OK; >>>>>>>>> + SaNtfHandleT ntf_handle; >>>>>>>>> + TRACE_ENTER(); >>>>>>>>> >>>>>>>>> + // reset handle >>>>>>>>> + cb->ntfHandle = 0; >>>>>>>>> + >>>>>>>>> + /* >>>>>>>>> + * TODO: to be re-factored as CLM initialization thread >>>>>>>>> + */ >>>>>>>>> + for (;;) { >>>>>>>>> + SaVersionT ntfVersion = { 'A', 0x01, 0x01 }; >>>>>>>>> + >>>>>>>>> + error = saNtfInitialize(&ntf_handle, NULL, &ntfVersion); >>>>>>>>> + if (error == SA_AIS_ERR_TRY_AGAIN || >>>>>>>>> + error == SA_AIS_ERR_TIMEOUT || >>>>>>>>> + error == SA_AIS_ERR_UNAVAILABLE) { >>>>>>>>> + if (error != SA_AIS_ERR_TRY_AGAIN) { >>>>>>>>> + LOG_WA("saNtfInitialize returned %u", >>>>>>>>> + (unsigned) error); >>>>>>>>> + } >>>>>>>>> + osaf_nanosleep(&kHundredMilliseconds); >>>>>>>>> + continue; >>>>>>>>> + } >>>>>>>>> + if (error == SA_AIS_OK) { >>>>>>>>> + break; >>>>>>>>> + } else { >>>>>>>>> + LOG_ER("Failed to Initialize with NTF: %u", error); >>>>>>>>> + goto done; >>>>>>>>> + } >>>>>>>>> + } >>>>>>>>> + cb->ntfHandle = ntf_handle; >>>>>>>>> + TRACE("Successfully initialized NTF"); >>>>>>>>> + >>>>>>>>> +done: >>>>>>>>> + TRACE_LEAVE(); >>>>>>>>> + return error; >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> +static void* avd_ntf_init_thread(void* arg) >>>>>>>>> +{ >>>>>>>>> + TRACE_ENTER(); >>>>>>>>> + AVD_CL_CB* cb = static_cast<AVD_CL_CB*>(arg); >>>>>>>>> + >>>>>>>>> + if (avd_ntf_init(cb) != SA_AIS_OK) { >>>>>>>>> + LOG_ER("avd_clm_init FAILED"); >>>>>>>>> + goto done; >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> +done: >>>>>>>>> + TRACE_LEAVE(); >>>>>>>>> + return nullptr; >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> +SaAisErrorT avd_start_ntf_init_bg(void) >>>>>>>>> +{ >>>>>>>>> + pthread_t thread; >>>>>>>>> + pthread_attr_t attr; >>>>>>>>> + pthread_attr_init(&attr); >>>>>>>>> + pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); >>>>>>>>> + >>>>>>>>> + if (pthread_create(&thread, &attr, avd_ntf_init_thread, >>>>>>>>> avd_cb) != 0) { >>>>>>>>> + LOG_ER("pthread_create FAILED: %s", strerror(errno)); >>>>>>>>> + exit(EXIT_FAILURE); >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + pthread_attr_destroy(&attr); >>>>>>>>> + >>>>>>>>> + return SA_AIS_OK; >>>>>>>>> +} >>>>>>>>> diff --git a/osaf/services/saf/amf/amfd/role.cc >>>>>>>>> b/osaf/services/saf/amf/amfd/role.cc >>>>>>>>> --- a/osaf/services/saf/amf/amfd/role.cc >>>>>>>>> +++ b/osaf/services/saf/amf/amfd/role.cc >>>>>>>>> @@ -174,9 +174,8 @@ void avd_role_change_evh(AVD_CL_CB *cb, >>>>>>>>> uint32_t initialize_for_assignment(cl_cb_tag* cb, SaAmfHAStateT >>>>>>>>> ha_state) >>>>>>>>> { >>>>>>>>> TRACE_ENTER2("ha_state = %d", static_cast<int>(ha_state)); >>>>>>>>> - SaVersionT ntfVersion = {'A', 0x01, 0x01}; >>>>>>>>> uint32_t rc = NCSCC_RC_SUCCESS; >>>>>>>>> - SaAisErrorT error; >>>>>>>>> + >>>>>>>>> if (cb->fully_initialized) goto done; >>>>>>>>> cb->avail_state_avd = ha_state; >>>>>>>>> if (ha_state == SA_AMF_HA_QUIESCED) { >>>>>>>>> @@ -199,12 +198,7 @@ uint32_t initialize_for_assignment(cl_cb >>>>>>>>> rc = NCSCC_RC_FAILURE; >>>>>>>>> goto done; >>>>>>>>> } >>>>>>>>> - if ((error = saNtfInitialize(&cb->ntfHandle, nullptr, >>>>>>>>> &ntfVersion)) != >>>>>>>>> - SA_AIS_OK) { >>>>>>>>> - LOG_ER("saNtfInitialize Failed (%u)", error); >>>>>>>>> - rc = NCSCC_RC_FAILURE; >>>>>>>>> - goto done; >>>>>>>>> - } >>>>>>>>> + >>>>>>>>> if ((rc = avd_mds_set_vdest_role(cb, ha_state)) != >>>>>>>>> NCSCC_RC_SUCCESS) { >>>>>>>>> LOG_ER("avd_mds_set_vdest_role FAILED"); >>>>>>>>> goto done; >>>>>>>>> @@ -273,11 +267,6 @@ uint32_t avd_active_role_initialization( >>>>>>>>> >>>>>>>>> avd_imm_update_runtime_attrs(); >>>>>>>>> >>>>>>>>> - if (avd_clm_track_start() != SA_AIS_OK) { >>>>>>>>> - LOG_ER("avd_clm_track_start FAILED"); >>>>>>>>> - goto done; >>>>>>>>> - } >>>>>>>>> - >>>>>>>>> status = NCSCC_RC_SUCCESS; >>>>>>>>> done: >>>>>>>>> TRACE_LEAVE(); >>>>>>>>> >>>>>>>> ------------------------------------------------------------------------------ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> What NetFlow Analyzer can do for you? Monitors network >>>>>>>> bandwidth and >>>>>>>> traffic >>>>>>>> patterns at an interface-level. Reveals which users, apps, and >>>>>>>> protocols are >>>>>>>> consuming the most bandwidth. Provides multi-vendor support for >>>>>>>> NetFlow, >>>>>>>> J-Flow, sFlow and other flows. Make informed decisions using >>>>>>>> capacity planning >>>>>>>> reports.http://sdm.link/zohodev2dev >>>>>>>> _______________________________________________ >>>>>>>> Opensaf-devel mailing list >>>>>>>> Opensaf-devel@lists.sourceforge.net >>>>>>>> https://lists.sourceforge.net/lists/listinfo/opensaf-devel >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>> >>> >> > ------------------------------------------------------------------------------ What NetFlow Analyzer can do for you? Monitors network bandwidth and traffic patterns at an interface-level. Reveals which users, apps, and protocols are consuming the most bandwidth. Provides multi-vendor support for NetFlow, J-Flow, sFlow and other flows. Make informed decisions using capacity planning reports.http://sdm.link/zohodev2dev _______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel