- **Component**: unknown --> amf
- **Version**: --> 4.3 GA
---
**[tickets:#498] OpenSAF startup issue on a clm locked node**
**Status:** unassigned
**Created:** Wed Jul 10, 2013 09:28 AM UTC by Sirisha Alla
**Last Updated:** Wed Jul 10, 2013 09:28 AM UTC
**Owner:** nobody
The issue is seen with changeset 4325 on a 4-node cluster of SLES VMs. IMM PBE
is enabled and loaded with 25k objects.
The CLM node safNode=PL-3,safCluster=myClmCluster is locked. After the CLM node
lock, either opensafd is restarted or PL-3 is rebooted. The following is
observed:
1) OpenSAF fails to come up after the reboot/restart of OpenSAF.
Jul 9 17:23:04 SLES-64BIT-SLOT3 kernel: [ 1402.648451] TIPC: Established link
<1.1.3:eth0-1.1.1:eth0> on network plane A
Jul 9 17:23:04 SLES-64BIT-SLOT3 kernel: [ 1402.648481] TIPC: Established link
<1.1.3:eth0-1.1.4:eth1> on network plane A
Jul 9 17:23:04 SLES-64BIT-SLOT3 kernel: [ 1402.648511] TIPC: Established link
<1.1.3:eth0-1.1.2:eth1> on network plane A
Jul 9 17:23:04 SLES-64BIT-SLOT3 osafimmnd[3331]: Started
Jul 9 17:23:04 SLES-64BIT-SLOT3 osafimmnd[3331]: NO Persistent Back-End
capability configured, Pbe file:imm.db
.....
Jul 9 17:23:24 SLES-64BIT-SLOT3 osafclmna[3344]: Started
Jul 9 17:23:25 SLES-64BIT-SLOT3 osafclmna[3344]: NO
safNode=PL-3,safCluster=myClmCluster Joined cluster, nodeid=2030f
Jul 9 17:23:25 SLES-64BIT-SLOT3 osafamfnd[3353]: Started
Jul 9 17:23:25 SLES-64BIT-SLOT3 osafimmnd[3331]: NO Implementer connected: 44
(MsgQueueService132111) <0, 2040f>
Jul 9 17:39:55 SLES-64BIT-SLOT3 opensafd[3305]: ER Timed-out for response from
AMFND
Jul 9 17:39:55 SLES-64BIT-SLOT3 opensafd[3305]: ER
Jul 9 17:39:55 SLES-64BIT-SLOT3 opensafd[3305]: ER Going for recovery
Jul 9 17:39:55 SLES-64BIT-SLOT3 osafamfnd[3353]: NO Shutdown initiated
Jul 9 17:39:55 SLES-64BIT-SLOT3 osafamfnd[3353]: NO Terminating all AMF
components
Jul 9 17:39:55 SLES-64BIT-SLOT3 osafamfnd[3353]: NO No component to terminate,
exiting
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387709] TIPC: Disabling bearer
<eth:eth0>
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387715] TIPC: Lost link
<1.1.3:eth0-1.1.1:eth0> on network plane A
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387718] TIPC: Lost contact with
<1.1.1>
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387735] TIPC: Lost link
<1.1.3:eth0-1.1.4:eth1> on network plane A
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387737] TIPC: Lost contact with
<1.1.4>
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387742] TIPC: Lost link
<1.1.3:eth0-1.1.2:eth1> on network plane A
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387743] TIPC: Lost contact with
<1.1.2>
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387799] TIPC: Left network mode
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387837] NET: Unregistered
protocol family 30
Jul 9 17:39:55 SLES-64BIT-SLOT3 kernel: [ 2413.387842] TIPC: Deactivated
Jul 9 17:39:55 SLES-64BIT-SLOT3 opensafd: Starting OpenSAF failed
Here AMFND timed out while coming up. The following is observed in the amfnd
traces on PL-3:
Jul 9 17:23:25.052461 osafamfnd [3353:avnd_clm.c:0208] TR Node has left the
cluster 'safNode=PL-3,safCluster=myClmCluster', avnd_cb->first_time_up
1,notifItem->clusterNode.nodeId 131855, avnd_cb->node_info.nodeId 131855
Jul 9 17:23:25.052469 osafamfnd [3353:avnd_clm.c:0251] << clm_track_cb
Jul 9 17:23:25.052475 osafamfnd [3353:clma_util.c:0468] << clma_hdl_cbk_rec_prc
Jul 9 17:23:25.052480 osafamfnd [3353:clma_util.c:0653] >> clma_msg_destroy
Jul 9 17:23:25.052486 osafamfnd [3353:clma_util.c:0677] << clma_msg_destroy
Jul 9 17:23:25.052495 osafamfnd [3353:clma_util.c:0543] <<
clma_hdl_cbk_dispatch_all
Jul 9 17:23:25.052502 osafamfnd [3353:clma_util.c:0622] <<
clma_hdl_cbk_dispatch
Jul 9 17:23:25.052507 osafamfnd [3353:clma_api.c:0774] << saClmDispatch
Jul 9 17:39:55.144868 osafamfnd [3353:avnd_proc.c:0257] >> avnd_evt_process
Jul 9 17:39:55.144928 osafamfnd [3353:avnd_proc.c:0272] TR Evt type:51
Jul 9 17:39:55.144949 osafamfnd [3353:avnd_term.c:0108] >>
avnd_evt_last_step_term_evh
Jul 9 17:39:55.145000 osafamfnd [3353:avnd_term.c:0112] NO Shutdown initiated
Jul 9 17:39:55.145016 osafamfnd [3353:avnd_term.c:0063] >> avnd_last_step_clean
Jul 9 17:39:55.145037 osafamfnd [3353:avnd_term.c:0065] NO Terminating all AMF
components
Jul 9 17:39:55.145061 osafamfnd [3353:avnd_term.c:0088] NO No component to
terminate, exiting
2) opensafd does not log whether OpenSAF startup succeeded or failed
(missing logging?).
Jul 10 12:46:30 SLES-64BIT-SLOT3 kernel: [71208.445001] TIPC: Established link
<1.1.3:eth0-1.1.4:eth1> on network plane A
Jul 10 12:46:31 SLES-64BIT-SLOT3 kernel: [71209.204837] TIPC: Established link
<1.1.3:eth0-1.1.1:eth0> on network plane A
Jul 10 12:46:31 SLES-64BIT-SLOT3 kernel: [71209.204889] TIPC: Established link
<1.1.3:eth0-1.1.2:eth1> on network plane A
Jul 10 12:46:31 SLES-64BIT-SLOT3 osafimmnd[17486]: NO SERVER STATE:
IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Jul 10 12:46:31 SLES-64BIT-SLOT3 osafimmnd[17486]: NO SERVER STATE:
IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Jul 10 12:46:31 SLES-64BIT-SLOT3 osafimmnd[17486]: NO SERVER STATE:
IMM_SERVER_LOADING_PENDING --> IMM_SERVER_SYNC_PENDING
Jul 10 12:46:31 SLES-64BIT-SLOT3 osafimmnd[17486]: NO NODE STATE->
IMM_NODE_ISOLATED
Jul 10 12:46:32 SLES-64BIT-SLOT3 osafimmnd[17486]: NO NODE STATE->
IMM_NODE_W_AVAILABLE
Jul 10 12:46:32 SLES-64BIT-SLOT3 osafimmnd[17486]: NO SERVER STATE:
IMM_SERVER_SYNC_PENDING --> IMM_SERVER_SYNC_CLIENT
Jul 10 12:46:45 SLES-64BIT-SLOT3 osafimmnd[17486]: NO NODE STATE->
IMM_NODE_FULLY_AVAILABLE 2144
Jul 10 12:46:45 SLES-64BIT-SLOT3 osafimmnd[17486]: NO RepositoryInitModeT is
SA_IMM_KEEP_REPOSITORY
Jul 10 12:46:45 SLES-64BIT-SLOT3 osafimmnd[17486]: NO Epoch set to 84 in
ImmModel
Jul 10 12:46:45 SLES-64BIT-SLOT3 osafimmnd[17486]: NO SERVER STATE:
IMM_SERVER_SYNC_CLIENT --> IMM SERVER READY
Jul 10 12:46:45 SLES-64BIT-SLOT3 osafclmna[17499]: Started
Jul 10 12:46:45 SLES-64BIT-SLOT3 osafclmna[17499]: NO
safNode=PL-3,safCluster=myClmCluster Joined cluster, nodeid=2030f
Jul 10 12:46:45 SLES-64BIT-SLOT3 osafamfnd[17508]: Started
Jul 10 12:46:49 SLES-64BIT-SLOT3 osafimmnd[17486]: NO Implementer connected: 52
(MsgQueueService131855) <0, 2010f>
Jul 10 12:46:49 SLES-64BIT-SLOT3 osafimmnd[17486]: NO Implementer disconnected
52 <0, 2010f> (MsgQueueService131855)
After this message, neither an "Opensaf successfully started" nor an "OpenSAF
startup failed" message appears in the syslog.
After some time (approx. 1 hr 40 min), when the CLM unlock is done, the
middleware SU components are brought up successfully.
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafamfnd[17508]: NO
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' Presence State UNINSTANTIATED =>
INSTANTIATING
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafmsgnd[20002]: Started
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafimmnd[17486]: NO Implementer connected: 53
(MsgQueueService131855) <43, 2030f>
Jul 10 14:23:13 SLES-64BIT-SLOT3 osaflcknd[20018]: Started
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafsmfnd[20027]: Started
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafckptnd[20036]: Started
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafamfwd[20045]: Started
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafamfnd[17508]: NO
'safSu=PL-3,safSg=NoRed,safApp=OpenSAF' Presence State INSTANTIATING =>
INSTANTIATED
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafamfnd[17508]: NO Assigning
'safSi=NoRed4,safApp=OpenSAF' ACTIVE to 'safSu=PL-3,safSg=NoRed,safApp=OpenSAF'
Jul 10 14:23:13 SLES-64BIT-SLOT3 osafamfnd[17508]: NO Assigned
'safSi=NoRed4,safApp=OpenSAF' ACTIVE to 'safSu=PL-3,safSg=NoRed,safApp=OpenSAF'
While scenario 1 is seen quite frequently, scenario 2 is difficult to
reproduce; I have observed it once in 20 tries. I think scenario 2 should be
the expected behavior, with opensafd indicating that "opensaf started
successfully".
Attached syslog and amfnd traces on PL-3.
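
To help tell the two scenarios apart in a syslog, here is a minimal triage sketch. It is not part of OpenSAF. It assumes the syslog line layout shown in the excerpts above, takes the year as a parameter because syslog timestamps omit it, and uses only the two failure strings that appear in the scenario 1 log. The names `startup_outcomes`, `parse` and `FAILURE_MARKERS` are hypothetical. For each "osafamfnd ... Started" line it reports either the delay until opensafd logged a failure, or `None` when no outcome was logged (scenario 2).

```python
import re
from datetime import datetime

# Layout assumed from the excerpts above: "Jul 9 17:23:25 HOST proc[pid]: msg".
# The "[pid]" part is optional (see "opensafd: Starting OpenSAF failed").
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2}) \S+ ([^\s:\[]+)(?:\[\d+\])?: (.*)$"
)

# Failure strings copied from the scenario 1 syslog; no success string is
# listed because none appears verbatim in the logs of this ticket.
FAILURE_MARKERS = ("Timed-out for response from", "Starting OpenSAF failed")


def parse(line, year):
    """Return (timestamp, process, message), or None for non-matching lines
    such as the wrapped continuation lines in the excerpts."""
    m = LINE_RE.match(line.rstrip())
    if m is None:
        return None
    mon, day, clock, proc, msg = m.groups()
    ts = datetime.strptime(f"{year} {mon} {day} {clock}", "%Y %b %d %H:%M:%S")
    return ts, proc, msg


def startup_outcomes(lines, year=2013):
    """For each amfnd start, return (start_time, outcome, seconds_elapsed).

    outcome is "failed" if opensafd later logged a failure marker, else None
    (no success or failure was logged for that start).
    """
    results = []
    pending = None  # timestamp of the amfnd start still awaiting an outcome
    for line in lines:
        rec = parse(line, year)
        if rec is None:
            continue
        ts, proc, msg = rec
        if proc == "osafamfnd" and msg == "Started":
            if pending is not None:
                results.append((pending, None, None))
            pending = ts
        elif proc == "opensafd" and pending is not None:
            if any(marker in msg for marker in FAILURE_MARKERS):
                results.append((pending, "failed", (ts - pending).total_seconds()))
                pending = None
    if pending is not None:
        results.append((pending, None, None))
    return results
```

Feeding it the lines of the attached syslog (for example `startup_outcomes(open(path))`, with `path` pointing at the file) would list one entry per amfnd start. In the scenario 1 excerpt, the gap between the amfnd start at 17:23:25 and the opensafd timeout at 17:39:55 is 990 seconds.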