- **status**: review --> fixed
- **Comment**:
default 5.0
changeset: 7344:7ecbd2a7b7a7 [7ecbd2]
tag: tip
user: Hung Nguyen <[email protected]>
date: Tue Mar 22 17:22:35 2016 +0700
summary: imm: Wait for veterans when IMMD starts [#1698]
---
** [tickets:#1698] imm: IMMD process the intro msg from newly-joined IMMND
before veterans**
**Status:** fixed
**Milestone:** 5.0.FC
**Created:** Wed Mar 09, 2016 10:40 AM UTC by Hung Nguyen
**Last Updated:** Thu Mar 10, 2016 08:04 AM UTC
**Owner:** Hung Nguyen
**Attachments:**
-
[syslog.tgz](https://sourceforge.net/p/opensaf/tickets/1698/attachment/syslog.tgz)
(10.8 kB; application/gzip)
While the cluster is headless, a new IMMND joins it.
When IMMD comes back, all IMMNDs (newly joined and veteran) send intro
messages (ND2D_INTRO) to the active IMMD.
If the intro message from the newly joined IMMND reaches IMMD before those
from the veterans, IMMD orders the newly joined IMMND to load (LOADING_CLIENT)
instead of sync (SYNC_CLIENT).
~~~~
Mar 9 17:04:39 SC-1 osafimmd[1029]: NO Extended intro from node 2040f
Mar 9 17:04:39 SC-1 osafimmd[1029]: NO Payload node 2040f introduced before first SC, can not yet verify File/Directory base matches SC.
~~~~
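
The ordering dependence described above can be modelled with a few lines of C. This is only a sketch under stated assumptions, not the IMMD code: `immd_model_t`, `intro_order_t` and `immd_model_on_intro()` are hypothetical names, and the model reduces IMMD's decision to a single flag recording whether any veteran has introduced itself yet.

```c
#include <stdbool.h>

/* Hypothetical model: the order IMMD gives in reply to an ND2D_INTRO. */
typedef enum {
    ORDER_KEEP_STATE,      /* veteran: already has the IMM data */
    ORDER_LOADING_CLIENT,  /* newcomer told to load (the faulty case) */
    ORDER_SYNC_CLIENT      /* newcomer told to sync from a veteran */
} intro_order_t;

typedef struct {
    bool veteran_seen;     /* true once any veteran IMMND has introduced itself */
} immd_model_t;

/* Decide the order purely from what IMMD has seen so far (assumed logic). */
static intro_order_t immd_model_on_intro(immd_model_t *d, bool is_veteran)
{
    if (is_veteran) {
        d->veteran_seen = true;
        return ORDER_KEEP_STATE;
    }
    /* Without a veteran on record, a restarted IMMD cannot tell that
     * the cluster already holds IMM data, so it orders a load. */
    return d->veteran_seen ? ORDER_SYNC_CLIENT : ORDER_LOADING_CLIENT;
}
```

Feeding the model a newcomer first yields `ORDER_LOADING_CLIENT`; feeding it a veteran first and then the newcomer yields `ORDER_SYNC_CLIENT`, which is the order the newcomer should have received.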
When IMMD then processes the intro message from a veteran, it sets that IMMND
as coordinator.
When the new IMMND subsequently receives the sync start message
(D2ND_SYNC_START), it crashes (abort).
~~~~
Mar 9 17:04:39 SC-1 osafimmd[1029]: NO Sc Absence Allowed is configured (1800) => IMMND coord at payload node:2030f dest566313288150651
Mar 9 17:04:39 SC-1 osafimmd[1029]: NO Node 2010f request sync sync-pid:1039 epoch:0
Mar 9 17:04:40 SC-1 osafimmd[1029]: NO Successfully announced sync. New ruling epoch:3
Mar 9 17:04:39 PL-4 osafimmnd[400]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Mar 9 17:04:39 PL-4 osafimmnd[400]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_LOADING_CLIENT
Mar 9 17:04:40 PL-4 osafimmnd[400]: WA Imm at this node has epoch 0, appears to be a stragler in wrong state 5
~~~~
-----
-----
Newly joined and veteran IMMNDs send their intro messages from different
places:
- veteran IMMND: on receiving the NCSMDS_UP event from IMMD
- newly-joined IMMND: in immnd_proc_server()
So a veteran normally sends its intro message sooner than a newly joined
IMMND (veterans send right after receiving the NCSMDS_UP event).
That is why this problem is hard to reproduce.
I had to add a 100 ms sleep in the veterans to reproduce it.
~~~~
@@ -10191,6 +10191,7 @@ static uint32_t immnd_evt_proc_mds_evt(I
 	} else if ((evt->info.mds_info.change == NCSMDS_UP) &&
 		   (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMD)) {
 		LOG_NO("IMMD service is UP ... ScAbsenseAllowed?:%u introduced?:%u",
 			cb->mScAbsenceAllowed, cb->mIntroduced);
+		if(cb->mIntroduced == 2) usleep(100000);
 		if((cb->mIntroduced==2) && (immnd_introduceMe(cb) != NCSCC_RC_SUCCESS)) {
 			LOG_WA("IMMND re-introduceMe after IMMD restart failed, will retry");
 		}
~~~~
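
The changeset summary at the top of this message reads "Wait for veterans when IMMD starts". The patch itself is not shown here, so the following is only a sketch of one way such a wait could work, not the committed fix. All names (`immd_wait_model_t`, `wait_model_accept_intro()`, the grace deadline) are hypothetical, and what happens to a deferred intro (queueing or a resend by the IMMND) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state kept by a freshly started IMMD. */
typedef struct {
    bool     veteran_seen;  /* a veteran IMMND has already introduced itself */
    uint64_t deadline_ms;   /* assumed end of the grace period for veterans */
} immd_wait_model_t;

/*
 * Returns true if the intro may be processed now, false if it should be
 * deferred. Veterans are always accepted. Newcomers are held back until
 * either a veteran has been seen or the grace period has expired, so a
 * fast newcomer can no longer be ordered to load ahead of the veterans.
 */
static bool wait_model_accept_intro(immd_wait_model_t *d, bool is_veteran,
                                    uint64_t now_ms)
{
    if (is_veteran) {
        d->veteran_seen = true;
        return true;
    }
    return d->veteran_seen || now_ms >= d->deadline_ms;
}
```

The grace deadline keeps a genuinely fresh cluster (no veterans at all) from waiting forever: once it passes, newcomers are accepted as before.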