Hi Alex,

got it.
Ack from me.

Thanks,
Neel.

On 2017/04/08 03:25 AM, Alex Jones wrote:
>
> Hi Neel,
>
>   Replies to your comments:
>
> (1)I don’t see this. With this patch can you get the system into a 
> state where the suMaintenanceCampaign is set in AMF, the component 
> crashes right before the SMF commit clears the suMaintenanceCampaign, 
> and thecmgnError is not set? If so, that’s a bug that we should 
> address. The SMF-AIS spec states that the suMaintenanceCampaign 
> attribute is cleared when the campaign is committed. Therefore it is 
> entirely possible that an SU could become disabled right before the 
> attribute is cleared, and the campaign still transitions to COMMITTED. 
> But, cmpnError should still be set in this case. So, as I read the 
> spec the operator needs to check this after committing, and may need 
> to repair an SU after committing a campaign, if this corner case occurs.
>
> (2)Well, the SMF-AIS spec says that asyncFailures are considered in 
> EVERY state while the suMaintenanceCampaign attribute is set. A 
> suspend is only done in the states you mention, but cmpnError should 
> be set in all states. This is already handled by the campaign state 
> machine. Look at SmfCampState.cc. The ::asyncFailure virtual function 
> only suspends the campaign in the states that you mention. If an 
> asyncFailure occurs in any other state, the base class virtual 
> function is called which does nothing.
>
> Alex
>
> *From:*Neelakanta Reddy [mailto:reddy.neelaka...@oracle.com]
> *Sent:* Friday, April 7, 2017 5:05 AM
> *To:* Alex Jones <alex.jo...@genband.com>; lennart.l...@ericsson.com; 
> rafael.odza...@ericsson.com
> *Cc:* opensaf-devel@lists.sourceforge.net
> *Subject:* Re: [PATCH 1 of 1] smfd: fix race condition when detecting 
> async failures [#2413]
>
> ------------------------------------------------------------------------
>
> NOTICE: This email was received from an EXTERNAL sender
>
> ------------------------------------------------------------------------
>
>
> Hi Alex,
>
> Reviewed and tested the patch.
> The patch avoids, the crash in smfd.
>
> I have the following Questions on the implementation(which was
> overlooked at the time of #2145 review):
>
> 1. If the component is killed when the campaign is in
> SA_SMF_CMPG_EXECUTION_COMPLETED state(as in the defect) and
> the amfnd does not re-start the component because,
> saAmfSUMaintenanceCampaign is set. The saSmfCmpgError indication is not
> given in SMF, because the campaign is in the process of committing.
>
> who does this justify Asynchronous Failures of AMF Entities?
>
> 2.
> according to SMF AIS, the async failures has to be considered Executing,
> Suspending, or RollingBack only.
> Then how do we prevent in other states? And the AMFND still will not
> re-start(correct) because of saAmfSUMaintenanceCampaign is set.
>
> Thanks,
> Neel.
>
> On 2017/04/06 11:56 PM, Alex Jones wrote:
> > src/smf/smfd/SmfCampaignThread.cc | 16 ++++++++++------
> > 1 files changed, 10 insertions(+), 6 deletions(-)
> >
> >
> > smfd core dumps during commit of campaign.
> >
> > If an AMF SU under maintenance fails right as the campaign commit is 
> done, there
> > is a race condition present. Before SMF clears the suMaintenaceCampaign
> > attribute of the SU, if the SU fails, it will send a notification. 
> Meanwhile,
> > the commit deletes upgrade campaign pointer inside smfd. If the 
> deletion of the
> > campaign pointer inside smfd occurs before it receives the NTF event 
> a crash
> > will occur because the campaign pointer is gone.
> >
> > Solution is to always process NTF events before processing the 
> termination of
> > the campaign. The campaign termination code sets "m_running" to 
> false, and
> > deletes the pointer. This should always be last in the poll loop so 
> that the
> > loop will exit immediately without processing any NTF events (or 
> other future
> > events.)
> >
> > diff --git a/src/smf/smfd/SmfCampaignThread.cc 
> b/src/smf/smfd/SmfCampaignThread.cc
> > --- a/src/smf/smfd/SmfCampaignThread.cc
> > +++ b/src/smf/smfd/SmfCampaignThread.cc
> > @@ -907,12 +907,10 @@ int SmfCampaignThread::handleEvents(void
> > break;
> > }
> >
> > - /* Process the Mail box events */
> > - if (fds[0].revents & POLLIN) {
> > - /* dispatch MBX events */
> > - processEvt();
> > - }
> > -
> > + /*
> > + * Handle NTF events first because processEvt may delete and 
> terminate the
> > + * campaign thread.
> > + */
> > if (fds[1].revents & POLLIN) {
> > // dispatch NTF events
> > rc = saNtfDispatch(m_ntfHandle, SA_DISPATCH_ALL);
> > @@ -922,6 +920,12 @@ int SmfCampaignThread::handleEvents(void
> > }
> > }
> >
> > + /* Process the Mail box events */
> > + if (fds[0].revents & POLLIN) {
> > + /* dispatch MBX events */
> > + processEvt();
> > + }
> > +
> > m_campaign->updateElapsedTime();
> > }
> > TRACE_LEAVE();
> >
>

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to