- **assigned_to**: A V Mahesh (AVM) --> nobody
- **Blocker**: --> False
---
**[tickets:#457] Dtm: standby joins as active after restart in a 70 node setup**
**Status:** unassigned
**Milestone:** future
**Created:** Fri Jun 14, 2013 06:48 AM UTC by Neelakanta Reddy
**Last Updated:** Wed Jul 15, 2015 02:21 PM UTC
**Owner:** nobody
**Attachments:**
- [messages_SC1](https://sourceforge.net/p/opensaf/tickets/457/attachment/messages_SC1) (65.5 kB; application/octet-stream)
- [messages_SC2](https://sourceforge.net/p/opensaf/tickets/457/attachment/messages_SC2) (208.0 kB; application/octet-stream)
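Analysis of the attached logs shows the following sequence.
Initially slot1 is active and slot2 is standby. The clocks on the two nodes
differ by roughly 5 h 36 min (slot2 is ahead), so timestamps are not directly
comparable across nodes.
1. IMMND is killed on slot2.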
Jun 11 21:29:46 SLES-64BIT-SLOT2 osafamfnd[3750]: NO
'safComp=IMMND,safSu=SC-2,safSg=NoRed,safApp=OpenSAF' faulted due to 'avaDown'
: Recovery is 'componentRestart'
2. The active IMMD detects the loss of the slot2 IMMND, and the node is discarded.
Jun 11 15:54:02 SLES-64BIT-SLOT1 osafimmnd[3746]: NO Global discard node
received for nodeId:2020f pid:3668
3. The new IMMND on slot2 requests a sync.
Jun 11 21:29:46 SLES-64BIT-SLOT2 osafimmnd[7315]: Started
Jun 11 15:54:03 SLES-64BIT-SLOT1 osafimmd[3736]: NO Node 2020f request sync
sync-pid:7315 epoch:0
4. IMMD is killed on slot2, and slot2 reboots (recovery is node failfast).
Jun 11 21:29:49 SLES-64BIT-SLOT2 osafamfnd[3750]: ER
safComp=IMMD,safSu=SC-2,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery
is:nodeFailfast
Jun 11 21:29:49 SLES-64BIT-SLOT2 osafamfnd[3750]: Rebooting OpenSAF NodeId =
131599 EE Name = , Reason: Component faulted: recovery is node failfast
Jun 11 21:29:49 SLES-64BIT-SLOT2 opensaf_reboot: Rebooting local node
5. After coming back up, slot2 takes the active role, although slot1 is still active.
Jun 11 21:30:22 SLES-64BIT-SLOT2 osafrded[2095]: NO Peer not available =>
Active role
Jun 11 21:30:23 SLES-64BIT-SLOT2 osaffmd[2108]: Started
Jun 11 21:30:23 SLES-64BIT-SLOT2 osafimmd[2117]: Started
Jun 11 21:30:23 SLES-64BIT-SLOT2 osafimmnd[2127]: Started
6. Having taken the active role, slot2 starts loading.
Jun 11 21:30:23 SLES-64BIT-SLOT2 osafimmnd[2127]: NO This IMMND is now the NEW
Coord
7. Shortly afterwards, a connection is established between slot2 and the active node, slot1.
Jun 11 21:30:23 SLES-64BIT-SLOT2 osafdtmd[2077]: NO Established contact with
'SC-1
Jun 11 15:54:39 SLES-64BIT-SLOT1 osafdtmd[3696]: NO Established contact with
'SC-2'
8. Once the nodes are connected, the loading event reaches the active IMMD on
slot1. The IMMND up event was never received there, because the connection
between the two nodes had not yet been established when the IMMND came up.
Jun 11 15:54:42 SLES-64BIT-SLOT1 osafimmd[3736]: WA Wrong PID 0 != 2127
9. AMFD tries to reconnect to IMM, because IMMND returns bad_handle when AMFD
issues another request on a handle whose previous synchronous call has not yet
completed. The re-initialization then fails in saImmOiImplementerSet and AMFD
exits.
Jun 11 15:54:49 SLES-64BIT-SLOT1 osafamfd[3815]: NO Re-initializing with IMM
Jun 11 15:54:49 SLES-64BIT-SLOT1 osafimmnd[3746]: WA IMMND - Client Node Get
Failed for cli_hdl 85899477263
Jun 11 15:54:49 SLES-64BIT-SLOT1 osafamfd[3815]: ER saImmOiImplementerSet
failed 14
Jun 11 15:54:49 SLES-64BIT-SLOT1 osafamfd[3815]: ER exiting since
avd_imm_impl_set failed
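
The timestamps in the steps above come from two nodes whose clocks disagree, which makes the sequence hard to follow. As a rough aid, the sketch below estimates the offset from the two "Established contact" lines in step 7, assuming both were logged at about the same real instant, and maps slot1 timestamps onto the slot2 clock. The helper names (`parse_ts`, `to_slot2_clock`) and the year 2013 are assumptions for illustration, since syslog lines carry no year.

```python
from datetime import datetime

# Log lines quoted in steps 7 and 8 of the ticket, with line wrapping undone.
SLOT2_CONTACT = ("Jun 11 21:30:23 SLES-64BIT-SLOT2 osafdtmd[2077]: "
                 "NO Established contact with 'SC-1")
SLOT1_CONTACT = ("Jun 11 15:54:39 SLES-64BIT-SLOT1 osafdtmd[3696]: "
                 "NO Established contact with 'SC-2'")
SLOT1_WRONG_PID = ("Jun 11 15:54:42 SLES-64BIT-SLOT1 osafimmd[3736]: "
                   "WA Wrong PID 0 != 2127")


def parse_ts(line, year=2013):
    """Parse the 15-character 'Mon DD HH:MM:SS' syslog prefix.

    The year is an assumption; it is not present in the log lines.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


# Assumption: both nodes logged "Established contact" at about the same
# real instant, so the difference approximates the clock offset.
offset = parse_ts(SLOT2_CONTACT) - parse_ts(SLOT1_CONTACT)


def to_slot2_clock(slot1_line):
    """Return the timestamp of a slot1 log line expressed on the slot2 clock."""
    return parse_ts(slot1_line) + offset


print("estimated offset:", offset)
print("Wrong PID warning on slot2 clock:",
      to_slot2_clock(SLOT1_WRONG_PID).strftime("%H:%M:%S"))
```

With this estimate, the slot1 "Wrong PID" warning from step 8 falls at about 21:30:26 on the slot2 clock, a few seconds after slot2 logged contact with SC-1. The result is only as accurate as the one-second syslog resolution and the simultaneity assumption allow.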
Conclusion:
MDS on slot2 connected to slot1 only after IMMND on slot2 had already
initiated loading. Because the peer was not visible at startup, slot2 took the
active role.
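
The ordering described in the conclusion can be checked mechanically in a node's syslog. The following is a minimal sketch, not an OpenSAF tool: it assumes the lines appear in the file in the order they were logged, and it only looks for the two message texts quoted in this ticket. The function names are hypothetical.

```python
import re

# Message texts as quoted in steps 5 and 7 of the ticket.
ACTIVE_RE = re.compile(r"osafrded\[\d+\]: NO Peer not available => Active role")
CONTACT_RE = re.compile(r"osafdtmd\[\d+\]: NO Established contact with")


def first_index(lines, pattern):
    """Return the index of the first line matching pattern, or None."""
    for i, line in enumerate(lines):
        if pattern.search(line):
            return i
    return None


def took_active_before_contact(lines):
    """True if rded chose the active role before dtmd logged peer contact.

    Relies on file order rather than timestamps, because syslog timestamps
    have only one-second resolution.
    """
    active = first_index(lines, ACTIVE_RE)
    contact = first_index(lines, CONTACT_RE)
    return active is not None and contact is not None and active < contact


# slot2 lines from the ticket, with line wrapping undone.
SLOT2_LINES = [
    "Jun 11 21:30:22 SLES-64BIT-SLOT2 osafrded[2095]: NO Peer not available => Active role",
    "Jun 11 21:30:23 SLES-64BIT-SLOT2 osafimmnd[2127]: NO This IMMND is now the NEW Coord",
    "Jun 11 21:30:23 SLES-64BIT-SLOT2 osafdtmd[2077]: NO Established contact with 'SC-1",
]

print(took_active_before_contact(SLOT2_LINES))
```

A node that is genuinely first to start would log the active-role line and never log contact with an existing active peer, so the check returns False in that case; it flags only the sequence seen here.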