- **status**: review --> fixed
- **Comment**:

default 5.0

changeset:   7344:7ecbd2a7b7a7 [7ecbd2]
tag:         tip
user:        Hung Nguyen <[email protected]>
date:        Tue Mar 22 17:22:35 2016 +0700
summary:     imm: Wait for veterans when IMMD starts [#1698]




---

**[tickets:#1698] imm: IMMD process the intro msg from newly-joined IMMND before veterans**

**Status:** fixed
**Milestone:** 5.0.FC
**Created:** Wed Mar 09, 2016 10:40 AM UTC by Hung Nguyen
**Last Updated:** Thu Mar 10, 2016 08:04 AM UTC
**Owner:** Hung Nguyen
**Attachments:**

- [syslog.tgz](https://sourceforge.net/p/opensaf/tickets/1698/attachment/syslog.tgz) (10.8 kB; application/gzip)


While the cluster is headless, a new IMMND joins the cluster.
When IMMD comes back, all IMMNDs (newly joined and veteran) send intro
messages (ND2D_INTRO) to the active IMMD.

If the intro message from the newly joined IMMND reaches IMMD before those
from the veterans, IMMD orders the newly joined IMMND to load (LOADING_CLIENT)
instead of to sync (SYNC_CLIENT).


~~~~
Mar  9 17:04:39 SC-1 osafimmd[1029]: NO Extended intro from node 2040f
Mar  9 17:04:39 SC-1 osafimmd[1029]: NO Payload node 2040f introduced before first SC, can not yet verify File/Directory base matches SC.
~~~~

When IMMD then processes the intro message from a veteran, it sets that IMMND
as coordinator.
When the new IMMND subsequently receives the sync start message
(D2ND_SYNC_START), it crashes (abort).


~~~~
Mar  9 17:04:39 SC-1 osafimmd[1029]: NO Sc Absence Allowed is configured (1800) => IMMND coord at payload node:2030f dest566313288150651
Mar  9 17:04:39 SC-1 osafimmd[1029]: NO Node 2010f request sync sync-pid:1039 epoch:0
Mar  9 17:04:40 SC-1 osafimmd[1029]: NO Successfully announced sync. New ruling epoch:3

Mar  9 17:04:39 PL-4 osafimmnd[400]: NO SERVER STATE: IMM_SERVER_CLUSTER_WAITING --> IMM_SERVER_LOADING_PENDING
Mar  9 17:04:39 PL-4 osafimmnd[400]: NO SERVER STATE: IMM_SERVER_LOADING_PENDING --> IMM_SERVER_LOADING_CLIENT
Mar  9 17:04:40 PL-4 osafimmnd[400]: WA Imm at this node has epoch 0, appears to be a stragler in wrong state 5
~~~~
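
The ordering dependence described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not OpenSAF code: `toy_immd_t`, `toy_handle_intro()` and `toy_newcomer_order()` are hypothetical names, and the real IMMD decision involves more state (epochs, coordinator election) than the single flag used here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not OpenSAF code. */
typedef enum { TOY_ORDER_NONE, TOY_ORDER_LOAD, TOY_ORDER_SYNC } toy_order_t;

typedef struct {
    bool veteran_seen; /* true once any veteran IMMND has introduced itself */
} toy_immd_t;

/* Decide what the toy IMMD tells an IMMND that has just sent its intro. */
static toy_order_t toy_handle_intro(toy_immd_t *immd, bool is_veteran)
{
    assert(immd != NULL);
    if (is_veteran) {
        immd->veteran_seen = true; /* a veteran already holds IMM data */
        return TOY_ORDER_NONE;
    }
    /* A newcomer can only be told to sync if a veteran is already known. */
    return immd->veteran_seen ? TOY_ORDER_SYNC : TOY_ORDER_LOAD;
}

/* Feed one veteran and one newcomer in the given arrival order and
 * return the order that the newcomer receives. */
static toy_order_t toy_newcomer_order(bool newcomer_first)
{
    toy_immd_t immd = { .veteran_seen = false };
    toy_order_t order;

    if (newcomer_first) {
        order = toy_handle_intro(&immd, false);
        (void)toy_handle_intro(&immd, true);
    } else {
        (void)toy_handle_intro(&immd, true);
        order = toy_handle_intro(&immd, false);
    }
    return order;
}
```

In this model, `toy_newcomer_order(false)` (veteran first) returns `TOY_ORDER_SYNC`, while `toy_newcomer_order(true)` (newcomer first) returns `TOY_ORDER_LOAD`, which corresponds to the faulty case reported in this ticket.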




-----



The newly joined IMMND and the veteran IMMNDs send their intro messages from
different places:

- veteran IMMND: on receiving the NCSMDS_UP event from IMMD
- newly joined IMMND: in immnd_proc_server()

So a veteran normally sends its intro message sooner than the newly joined
IMMND (veterans send right after receiving the NCSMDS_UP event), which is why
this problem is hard to reproduce.

To reproduce it, I had to add a short sleep on the veterans:

~~~~
@@ -10191,6 +10191,7 @@ static uint32_t immnd_evt_proc_mds_evt(I
        } else if ((evt->info.mds_info.change == NCSMDS_UP) && (evt->info.mds_info.svc_id == NCSMDS_SVC_ID_IMMD)) {
                LOG_NO("IMMD service is UP ... ScAbsenseAllowed?:%u introduced?:%u",
                           cb->mScAbsenceAllowed, cb->mIntroduced);
+               if(cb->mIntroduced == 2) usleep(100000);
                if((cb->mIntroduced==2) && (immnd_introduceMe(cb) != NCSCC_RC_SUCCESS)) {
                        LOG_WA("IMMND re-introduceMe after IMMD restart failed, will retry");
                }
~~~~


