Can't IMMSv subscribe for IMMND dests?
Thanks,
Mathi.
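
To make the question concrete, here is a rough sketch of what such a subscription would buy. It is not IMMSv code and does not use the MDS API: the table, the up/down handlers and reply_dest() are all hypothetical names, and the 64-bit dest is only a stand-in for an adest. The idea is that an "up" event from a restarted IMMND overwrites the stored destination, so a later admin op reply is not sent to the old one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PEERS 8

/* Hypothetical record of where the IMMND on a given node can be reached. */
struct peer_entry {
    uint32_t node_id; /* node identifier, e.g. 0x2020f */
    uint64_t dest;    /* stand-in for an MDS destination (adest) */
    int in_use;
};

struct peer_table {
    struct peer_entry entries[MAX_PEERS];
};

/* Handle an "up" event for a peer. A restarted peer keeps its node id but
 * gets a new destination, so an existing entry is overwritten.
 * Returns 0 on success, -1 if the table is full. */
int on_peer_up(struct peer_table *t, uint32_t node_id, uint64_t dest)
{
    struct peer_entry *free_slot = NULL;

    assert(t != NULL);
    for (size_t i = 0; i < MAX_PEERS; i++) {
        struct peer_entry *e = &t->entries[i];
        if (e->in_use && e->node_id == node_id) {
            e->dest = dest; /* replace the stale destination */
            return 0;
        }
        if (!e->in_use && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot == NULL)
        return -1;
    free_slot->node_id = node_id;
    free_slot->dest = dest;
    free_slot->in_use = 1;
    return 0;
}

/* Handle a "down" event: forget the peer so no reply targets a dead dest. */
void on_peer_down(struct peer_table *t, uint32_t node_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_PEERS; i++) {
        if (t->entries[i].in_use && t->entries[i].node_id == node_id)
            t->entries[i].in_use = 0;
    }
}

/* Destination to use for a reply to node_id, or 0 if the peer is unknown. */
uint64_t reply_dest(const struct peer_table *t, uint32_t node_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_PEERS; i++) {
        if (t->entries[i].in_use && t->entries[i].node_id == node_id)
            return t->entries[i].dest;
    }
    return 0;
}
```

With something like this, the reply path would call reply_dest() at send time instead of using a destination captured when the admin operation was first received.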

> -----Original Message-----
> From: Anders Björnerstedt [mailto:[email protected]]
> Sent: Thursday, July 18, 2013 5:30 PM
> To: Neelakanta Reddy; praveen malviya
> Cc: [email protected]
> Subject: Re: [devel] Early patch for #501 for review/testing (was Re: #501
> amf: No node directors register to AMF within time after "#7 cleanup instead
> of terminate used at component restart")
> 
> Sounds like you would have needed the "continuationId" parameter of
> saImmOmAdminOperationInvoke().
> Unfortunately, this A.2.1 feature is not yet implemented in IMMSv.
> 
> https://sourceforge.net/p/opensaf/tickets/51/
> 
> 
> /AndersBj
> 
> 
> -----Original Message-----
> From: Neelakanta Reddy [mailto:[email protected]]
> Sent: den 18 juli 2013 13:41
> To: praveen malviya
> Cc: [email protected]
> Subject: Re: [devel] Early patch for #501 for review/testing (was Re: #501
> amf: No node directors register to AMF within time after "#7 cleanup instead
> of terminate used at component restart")
> 
> Hi Mathi/Praveen,
>
> I misunderstood the flow of the admin operation for the component.
>
> After analyzing the logs, this is why the reply cannot be sent:
>
> The admin operation to terminate IMMND is invoked at the standby. The
> implementer is the active amfd.
>
> The active amfd sends the admin operation result to the local active IMMND.
> The active IMMND then tries to forward the result to the standby IMMND,
> where the admin operation was invoked. However, the MDS adest stored in the
> active IMMND is the adest of the old (pre-restart) standby IMMND.
>
> Because of this, the following error message appears at the active
> controller:
> 
> ER Problem in sending to peer IMMND over MDS. Discarding admin op reply.
> 
> 
> Thanks,
> Neel.
> On Thursday 18 July 2013 04:52 PM, praveen malviya wrote:
> > Hi,
> > For an admin restart of any component, AMFD sends an admin operation
> > message to the corresponding AMFND.
> > AMFND restarts the component. While the operation is in progress, the
> > presence state of the component transitions from INSTANTIATED to
> > RESTARTING and then from RESTARTING back to INSTANTIATED.
> > AMFND reports each presence state change to AMFD, but AMFD responds to
> > IMM with the completion of the operation only when the component's
> > presence state becomes INSTANTIATED.
> >
> > Thanks,
> > Praveen.
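
The rule Praveen describes can be sketched as a tiny state tracker. This is only an illustration under my own assumptions, not AMFD code: the two-value enum is a simplification of the real presence states, and restart_op, restart_begin() and restart_on_presence() are hypothetical names. The seen_restarting flag is also my assumption, added so that an INSTANTIATED report arriving before the restart has actually begun does not complete the operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified presence states; only the two named in the thread. */
enum presence_state { PRES_INSTANTIATED, PRES_RESTARTING };

/* Hypothetical bookkeeping for one outstanding admin restart. */
struct restart_op {
    bool pending;         /* admin op result not yet sent */
    bool seen_restarting; /* component has entered RESTARTING */
};

/* Start tracking a new admin restart. */
void restart_begin(struct restart_op *op)
{
    assert(op != NULL);
    op->pending = true;
    op->seen_restarting = false;
}

/* Feed one presence state update from the node director.
 * Returns true exactly once: when the component is INSTANTIATED again
 * after having been RESTARTING, i.e. when the result should be sent. */
bool restart_on_presence(struct restart_op *op, enum presence_state s)
{
    assert(op != NULL);
    if (!op->pending)
        return false;
    if (s == PRES_RESTARTING) {
        op->seen_restarting = true;
        return false;
    }
    /* s == PRES_INSTANTIATED */
    if (op->seen_restarting) {
        op->pending = false;
        return true;
    }
    return false;
}
```

The caller would send the admin operation result to IMM only when restart_on_presence() returns true.
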
> > On 17-Jul-13 7:09 PM, Neelakanta Reddy wrote:
> >> Hi Mathi,
> >>
> >> After giving the terminate message to the local amfnd, amfd immediately
> >> sends the admin operation result.
> >>
> >> The amfnd sends the message to the IMMND, and the IMMND processes it in
> >> immnd_amf_comp_terminate_callback, which terminates the IMMND.
> >> The admin operation result also arrives at the local IMMND. Since the
> >> terminate callback is executed first, the IMMND does not get the
> >> chance to process the admin operation result.
> >>
> >> The admin operation initiated to terminate the IMMND therefore
> >> eventually ends in TIMEOUT.
> >>
> >> Thanks,
> >> Neel.
> >>
> >>
> >> On Wednesday 17 July 2013 01:22 PM, Mathivanan Naickan Palanivelu wrote:
> >>> Hi,
> >>>
> >>> The attached patch works for this ticket. (Note: the AMF terminate
> >>> callback has to be corrected for the directors as well; I will do that
> >>> in a separate patch.)
> >>>
> >>> Please note that when running this test for IMM, the immadm or
> >>> amf-adm commands do not return to the command prompt, even though
> >>> the command had functionally succeeded, i.e. IMM was successfully
> >>> restarted.
> >>>
> >>> I suspect that either AMF is not returning the admin-op result to IMM,
> >>> or the result is being discarded by IMM.
> >>>
> >>> Neel/Nagendra, could you please confirm whether the issue (the response
> >>> to the admin op) is with IMM or AMF?
> >>>
> >>> See snapshot below:
> >>>
> >>> Jul 17 13:08:33 SC-2 osafamfnd[8169]: NO Admin restart requested for
> >>> 'safComp=IMMND,safSu=SC-2,safSg=NoRed,safApp=OpenSAF'
> >>>
> >>> Jul 17 13:08:33 SC-2 osafimmnd[8457]: NO Received AMF component
> >>> terminate callback, exiting
> >>>
> >>> Jul 17 13:08:33 SC-2 osafamfd[8159]: NO Re-initializing with IMM
> >>>
> >>> Jul 17 13:08:33 SC-2 osafimmnd[8530]: Started
> >>>
> >>> Jul 17 13:08:34 SC-2 osafimmnd[8530]: NO SERVER STATE:
> >>> IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
> >>>
> >>> Jul 17 13:08:34 SC-2 osafimmnd[8530]: NO SERVER STATE:
> >>> IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
> >>>
> >>> Jul 17 13:08:34 SC-2 osafimmnd[8530]: NO SERVER STATE:
> >>> IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
> >>>
> >>> Jul 17 13:08:34 SC-2 osafimmnd[8530]: NO NODE STATE->
> >>> IMM_NODE_ISOLATED
> >>>
> >>> Jul 17 13:08:35 SC-2 osafimmd[8101]: NO Ruling epoch noted as:10 on
> >>> IMMD standby
> >>>
> >>> Jul 17 13:08:35 SC-2 osafimmd[8101]: NO IMMND coord at 2010f
> >>>
> >>> Jul 17 13:08:35 SC-2 osafimmnd[8530]: NO NODE STATE->
> >>> IMM_NODE_W_AVAILABLE
> >>>
> >>> Jul 17 13:08:35 SC-2 osafimmnd[8530]: NO SERVER STATE:
> >>> IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmnd[8530]: NO NODE STATE->
> >>> IMM_NODE_FULLY_AVAILABLE 2171
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmnd[8530]: NO RepositoryInitModeT is
> >>> SA_IMM_INIT_FROM_FILE
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmnd[8530]: NO Epoch set to 10 in ImmModel
> >>>
> >>> Jul 17 13:08:36 SC-2 immadm: IN Received PROC_STALE_CLIENTS
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmd[8101]: NO SBY: New Epoch for IMMND
> >>> process at node 2010f old epoch: 9  new epoch:10
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmd[8101]: NO IMMND coord at 2010f
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmd[8101]: NO SBY: New Epoch for IMMND
> >>> process at node 2020f old epoch: 0  new epoch:10
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmnd[8530]: NO Implementer connected: 33
> >>> (MsgQueueService131599) <283, 2020f>
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmnd[8530]: NO SERVER STATE:
> >>> IMM_SERVER_SYNC_CLIENT --> IMM SERVER READY
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmnd[8530]: NO Implementer (applier)
> >>> connected: 34 (@safLogService) <511, 2020f>
> >>>
> >>> Jul 17 13:08:36 SC-2 osafimmnd[8530]: NO Implementer (applier)
> >>> connected: 35 (@safAmfService2020f) <512, 2020f>
> >>>
> >>> Jul 17 13:08:37 SC-2 osafamfd[8159]: NO Finished re-initializing
> >>> with IMM
> >>>
> >>> Thanks,
> >>>
> >>> Mathi.
> >>>
> >>> From: Mathi Naickan [mailto:[email protected]]
> >>> Sent: Tuesday, July 16, 2013 12:36 PM
> >>> To: [opensaf:tickets]
> >>> Subject: [opensaf:tickets] Re: #501 amf: No node directors
> >>> register to AMF within time after "#7 cleanup instead of terminate
> >>> used at component restart"
> >>>
> >>> I checked the NDs. I think we should remove these (legacy) sleeps.
> >>>
> >>> Also, the exits should be styled like the daemon_exit() calls.
> >>>
> >>> We also need to test such exits from the terminate callback for the
> >>> directors as well, and consider special classes like NTF, where we
> >>> ought to call the likes of stop_ntfimcn().
> >>>
> >>> Will get back on this.
> >>>
> >>> Thanks,
> >>>
> >>> Mathi.
> >>>
> >>> From: Praveen [mailto:[email protected]]
> >>> Sent: Monday, July 15, 2013 9:35 AM
> >>> To: [opensaf:tickets]
> >>> Subject: [opensaf:tickets] Re: #501 amf: No node directors register
> >>> to AMF within time after "#7 cleanup instead of terminate used at
> >>> component restart"
> >>>
> >>> Can sleep(1) be added before giving response to AMF?
> >>>
> >>> Thanks
> >>> Praveen
> >>> On 15-Jul-13 8:10 AM, Nagendra Kumar wrote:
> >>>
> >>> There is no problem with AMF: AMF runs the instantiate script
> >>> for all the services (cpnd, glnd, mqnd, smfnd).
> >>> The problem lies in these services, because each of them sleeps for 1
> >>> second after giving the AMF response in its terminate callback.
> >>> Ex:
> >>> cpnd_amf_comp_terminate_callback
> >>>
> >>> saAmfResponse(cb->amf_hdl, invocation, saErr);
> >>> ncshm_give_hdl(gl_cpnd_cb_hdl);
> >>> sleep(1);
> >>> LOG_NO("Received AMF component terminate callback, exiting");
> >>> exit(0);
> >>>
> >>> When AMF executes the instantiate script, the process is still up
> >>> and running (because of the 1-second sleep), so 'start_daemon -p
> >>> $pidfile $binary $args' has no effect and the process (e.g. cpnd)
> >>> doesn't start.
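
The race Nagendra describes can be illustrated with a small sketch. This is not the start_daemon implementation: it assumes, for illustration only, that the launcher decides whether to start by probing the pid from the pidfile with kill(pid, 0), which is one common way to test for a live process. process_alive() and should_start() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* True if a process with this pid currently exists. Signal 0 performs the
 * existence and permission checks without delivering a signal.
 * The caller must pass a positive pid (0 and negative values would address
 * process groups instead of a single process). */
bool process_alive(pid_t pid)
{
    assert(pid > 0);
    if (kill(pid, 0) == 0)
        return true;
    return errno == EPERM; /* exists, but we are not allowed to signal it */
}

/* Assumed launcher policy: start the daemon only if the process named in
 * the pidfile is gone. */
bool should_start(pid_t pid_from_pidfile)
{
    return !process_alive(pid_from_pidfile);
}
```

A daemon that has already answered AMF but is still inside sleep(1) is alive, so should_start() returns false for its pid and the instantiate attempt does nothing. Calling should_start(getpid()) from any running process shows the same effect.
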
> >>>
> >>> I tested with the sleep removed, and everything worked as expected.
> >>>
> >>> So the other services should find out why the 1-second sleep was
> >>> introduced and how to get rid of it.
> >>>
> >>> [tickets:#501] http://sourceforge.net/p/opensaf/tickets/501/
> >>> amf: No node directors register to AMF within time after "#7 cleanup
> >>> instead of terminate used at component restart"
> >>>
> >>> Status: unassigned
> >>> Created: Thu Jul 11, 2013 07:47 AM UTC by Ingvar Bergström
> >>> Last Updated: Thu Jul 11, 2013 07:47 AM UTC
> >>> Owner: nobody
> >>>
> >>> After the introduction of the patches solving "#7 cleanup instead of
> >>> terminate used at component restart", no node directors register to
> >>> AMF within time, according to the messages log.
> >>> I have tried SMFND, CPND, GLND and MQND.
> >>>
> >>> It seems however that the main routines of the node director daemons
> >>> are not started until 10 seconds after the terminate callback (after
> >>> the registration timeout).
> >>>
> >>> It is very easy to see the fault by entering command "amf-adm
> >>> restart safComp=xxxND,safSu=SC-1,safSg=NoRed,safApp=OpenSAF"
> >>>

_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
