Hi Neel, Where did the failure occur when you reexecuted the campaign? In my test it succeeds.
Can you send the logs? Alex On 10/07/2016 08:49 AM, Neelakanta Reddy wrote: > ------------------------------------------------------------------------ > NOTICE: This email was received from an EXTERNAL sender > ------------------------------------------------------------------------ > > Hi Alex, > > I tested the patch, > when the same scenario is followed in the defect, the campaign is moving > to SUSPENDED_BY_ERROR_DETECTED. > > when the other controller comes up(which is rebooted in the middle of > si-swap), > > a) execute the campaign again, the campaign is moving to EXECUTION_FAILED. > b) rollback the campaign, the campaign us moving to ROLLBACK_COMPLETED. > > The case (a), above has to be looked into, why the campaign is moving to > EXECUTION_FAILED STATE. > The campaign, has to move to EXECUTION_COMPLETED state. > > Reviewing the code, let you know if there are any further comments. > > Regards, > Neel. > > > On 2016/10/06 01:50 AM, Alex Jones wrote: >> osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc | 69 > +++++++++++++++++--- >> osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.hh | 6 + >> 2 files changed, 64 insertions(+), 11 deletions(-) >> >> >> Sep 27 00:34:14 q50-s1 osafsmfd[6667]: NO SA_AMF_ADMIN_SI_SWAP [rc=1] > successfully initiated >> Sep 27 00:34:15 q50-s1 osafimmnd[6571]: NO ERR_BAD_OPERATION: Mismatch > on administrative owner '' != 'SMFSERVICE' >> Sep 27 00:34:17 q50-s1 osafsmfd[6667]: NO Fail to invoke admin > operation, rc=SA_AIS_ERR_BAD_OPERATION (20). > dn=[safSi=SC-2N,safApp=OpenSAF], opId=[7] >> Sep 27 00:34:17 q50-s1 osafsmfd[6667]: NO Admin op > SA_AMF_ADMIN_SI_SWAP fail [rc = 20] >> Sep 27 00:34:17 q50-s1 osafsmfd[6667]: NO CAMP: Procedure > safSmfProc=RollingUpgrade returned FAILED >> Sep 27 00:36:14 q50-s1 osafsmfd[6667]: NO Campaign thread does not > disappear within 120 seconds after SA_AMF_ADMIN_SI_SWAP, the operation > was assumed failed. >> Sep 27 00:36:14 q50-s1 kernel: [14934029.531187] osafsmfd[32024]: > segfault at 4 ip 00000000004425b6 sp 00007f67f7ffe1c0 error 4 in > osafsmfd[400000+9a000] >> Sep 27 00:36:14 q50-s1 osafamfnd[6649]: NO > 'safComp=SMF,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to > 'avaDown' : Recovery is 'nodeFailfast' >> Sep 27 00:36:14 q50-s1 osafamfnd[6649]: ER > safComp=SMF,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown > Recovery is:nodeFailfast >> >> There are a few problems here. One is that the SmfSwapThread is > pointing to a >> deleted procedure when the original active controller is reassigned > active. The >> second problem is that a new SmfSwapThread is created when the > original active >> controller is reassigned active, so now there are two running. The > first thread >> tries to use its proc pointer (which has been deleted when the > original active >> goes to quiesced) and causes the segfault. >> >> The proposed solution is a little different from that proposed in the > ticket >> description. This solution proposes to use the existence of the > SmfSwapThread as >> a test. When the original active controller is reassigned active > because the >> si-swap failed, it will still remove the RestartIndicator as it does > now. But, >> if the SmfSwapThread is still running, it will not create a new one, > but update >> it with the recreated procedure pointer, and let it handle the si-swap > timeout. >> Then it will report the error. I believe this solution is backwards > compatible >> because no IMM changes are made like the ones proposed in the ticket. >> >> diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc > b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc >> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc >> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.cc >> @@ -482,12 +482,18 @@ SmfUpgradeProcedure::switchOver() >> osafassert(0); >> } >> >> - TRACE("SmfUpgradeProcedure::switchOver: Create the restart indicator"); >> - > SmfCampaignThread::instance()->campaign()->getUpgradeCampaign()->createSmfRestartIndicator(); >> - >> - SmfSwapThread *swapThread = new SmfSwapThread(this); >> - TRACE("SmfUpgradeProcedure::switchOver, Starting SI_SWAP thread"); >> - swapThread->start(); >> + if (!SmfSwapThread::running()) { >> + TRACE("SmfUpgradeProcedure::switchOver: Create the restart indicator"); >> + > SmfCampaignThread::instance()->campaign()->getUpgradeCampaign()->createSmfRestartIndicator(); >> + >> + SmfSwapThread *swapThread = new SmfSwapThread(this); >> + TRACE("SmfUpgradeProcedure::switchOver, Starting SI_SWAP thread"); >> + swapThread->start(); >> + } else { >> + TRACE("SmfUpgradeProcedure::switchOver, SI_SWAP thread already > running"); >> + SmfSwapThread::setProc(this); >> + >> + } >> >> TRACE_LEAVE(); >> } >> @@ -4156,6 +4162,31 @@ SmfUpgradeProcedure::resetProcCounter() >> /* Static methods */ >> /*====================================================================*/ >> >> +SmfSwapThread *SmfSwapThread::me(0); >> +std::mutex SmfSwapThread::m_mutex; >> + >> +/** >> + * SmfSmfSwapThread::running >> + * Is the thread currently running? >> + */ >> +bool >> +SmfSwapThread::running(void) >> +{ >> + std::lock_guard<std::mutex> guard(m_mutex); >> + return me ? true : false; >> +} >> + >> +/** >> + * SmfSmfSwapThread::setProc >> + * Set the procedure pointer to the newly created procedure >> + */ >> +void >> +SmfSwapThread::setProc(SmfUpgradeProcedure *newProc) >> +{ >> + std::lock_guard<std::mutex> guard(m_mutex); >> + me->m_proc = newProc; >> +} >> + >> /** >> * SmfSmfSwapThread::main >> * static main for the thread >> @@ -4181,6 +4212,8 @@ SmfSwapThread::SmfSwapThread(SmfUpgradeP >> m_proc(i_proc) >> { >> sem_init(&m_semaphore, 0, 0); >> + std::lock_guard<std::mutex> guard(m_mutex); >> + me = this; >> } >> >> /** >> @@ -4188,6 +4221,8 @@ SmfSwapThread::SmfSwapThread(SmfUpgradeP >> */ >> SmfSwapThread::~SmfSwapThread() >> { >> + std::lock_guard<std::mutex> guard(m_mutex); >> + me = 0; >> } >> >> /** >> @@ -4309,13 +4344,25 @@ SmfSwapThread::main(void) >> >> exit_error: >> if (SmfCampaignThread::instance() != NULL) { >> - SmfProcStateExecFailed::instance()->changeState(m_proc, > SmfProcStateExecFailed::instance()); >> - } >> - >> - if (SmfCampaignThread::instance() != NULL) { >> + std::lock_guard<std::mutex> guard(m_mutex); >> + >> + SmfProcStateExecuting::instance()->changeState(m_proc, > SmfProcStateStepUndone::instance()); >> + >> + // find the failed upgrade step and set it to Undone >> + std::vector<SmfUpgradeStep *>& upgradeSteps(m_proc->getProcSteps()); >> + for (std::vector<SmfUpgradeStep *>::iterator > it(upgradeSteps.begin()); it != upgradeSteps.end(); ++it) { >> + if ((*it)->getSwitchOver()) { >> + (*it)->changeState(SmfStepStateUndone::instance()); >> + break; >> + } >> + } >> + >> + std::string error("si-swap of middleware failed"); >> + SmfCampaignThread::instance()->campaign()->setError(error); >> + >> CAMPAIGN_EVT *evt = new CAMPAIGN_EVT(); >> evt->type = CAMPAIGN_EVT_PROCEDURE_RC; >> - evt->event.procResult.rc = SMF_PROC_FAILED; >> + evt->event.procResult.rc = SMF_PROC_STEPUNDONE; >> evt->event.procResult.procedure = m_proc; >> SmfCampaignThread::instance()->send(evt); >> } >> diff --git a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.hh > b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.hh >> --- a/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.hh >> +++ b/osaf/services/saf/smfsv/smfd/SmfUpgradeProcedure.hh >> @@ -29,6 +29,7 @@ >> #include <vector> >> #include <list> >> #include <map> >> +#include <mutex> >> >> #include <saSmf.h> >> #include <saImmOi.h> >> @@ -791,7 +792,12 @@ class SmfSwapThread { >> ~SmfSwapThread(); >> int start(void); >> >> + static bool running(void); >> + static void setProc(SmfUpgradeProcedure *); >> + >> private: >> + static SmfSwapThread *me; >> + static std::mutex m_mutex; >> >> void main(void); >> int init(void); >>
signature.asc
Description: OpenPGP digital signature
------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________ Opensaf-devel mailing list Opensaf-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/opensaf-devel