After headless operation, when an SC is starting, I can see that IMMD has
code that waits 3 seconds to allow IMMND MDS attachments to get processed:
 
============================================================================
if (cb->mScAbsenceAllowed && cb->ha_state == SA_AMF_HA_ACTIVE) {
        /* If this IMMD has active role, wait for veteran payloads.
         * Give up after 3 seconds if there's no veteran payloads. */
        LOG_NO("Waiting 3 seconds to allow IMMND MDS attachments to get processed.");

        if (osaf_poll_one_fd(m_GET_FD_FROM_SEL_OBJ(immd_cb->veteran_sync_sel), 3000) != 1) {
                TRACE("osaf_poll_one_fd on veteran_sync_sel failed or timed out");
        } else {
                LOG_NO("Received intro message from veteran payload, stop waiting");
        }

        m_NCS_LOCK(&immd_cb->veteran_sync_lock, NCS_LOCK_WRITE);
        m_NCS_SEL_OBJ_DESTROY(&immd_cb->veteran_sync_sel);
        m_NCS_UNLOCK(&immd_cb->veteran_sync_lock, NCS_LOCK_WRITE);
}
/===========================================================================
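
For readers not familiar with this wait pattern, here is a minimal sketch of a bounded wait on a single file descriptor using plain POSIX poll(). It is not the OpenSAF implementation of osaf_poll_one_fd; the helper name wait_readable is made up for illustration. It only shows the same idea: block until the fd signals or the timeout (3000 ms in the IMMD code) expires.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper, not part of OpenSAF.
 * Waits until fd has an event or timeout_ms elapses.
 * Returns 1 if the fd has an event, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    assert(timeout_ms >= 0);
    do {
        /* Simplification: an interrupted call restarts with the full timeout. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);

    return rc;
}
```

With a pipe, wait_readable(read_end, 3000) returns 0 after about 3 seconds if nothing is written, and returns 1 as soon as a byte is written to the other end. The calling thread is blocked for the whole wait.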


Ideally this IMMD wait should be a timer instead of a poll, like the AMFD node_sync_tmr:

============================================================================

if (rc_node_up == sync_nd_size) {
        if (cb->node_sync_tmr.is_active) {
                avd_stop_tmr(cb, &cb->node_sync_tmr);
                TRACE("stop NodeSync timer");
        }
        cb->all_nodes_synced = true;
        LOG_NO("Received node_up_msg from all nodes");
} else {
        if (avnd->node_up_msg_count == 1 &&
                (act_nd || n2d_msg->msg_info.n2d_node_up.leds_set)) {

                // start (or restart) timer if this is the first message
                // from amfnd-active-SC or amfnd-green-leds-PL
                cb->node_sync_tmr.type = AVD_TMR_NODE_SYNC;
                avd_start_tmr(cb, &(cb->node_sync_tmr), AVSV_DEF_NODE_SYNC_PERIOD);

                TRACE("Received node_up_msg from node:%s. Start/Restart "
                                " NodeSync timer waiting for remaining (%d) node(s)",
                                osaf_extended_name_borrow(&n2d_msg->msg_info.n2d_node_up.node_name),
                                sync_nd_size - rc_node_up);
                goto done;
        }
        if (cb->node_sync_tmr.is_active == true) {
                if (n2d_msg->msg_info.n2d_node_up.leds_set == false) {
                        TRACE("NodeSync timer is active, ignore this node_up msg (nodeid:%x)",
                                n2d_msg->msg_info.n2d_node_up.node_id);
                        goto done;
                }
        }
}
\===========================================================================    
          


---

**[tickets:#1955] imm: Fail to detect veteran node when NCSMDS_UP event comes late**

**Status:** accepted
**Milestone:** 5.0.1
**Created:** Wed Aug 17, 2016 10:44 AM UTC by Hung Nguyen
**Last Updated:** Wed Aug 24, 2016 04:45 AM UTC
**Owner:** Hung Nguyen
**Attachments:**

- [syslog.7z](https://sourceforge.net/p/opensaf/tickets/1955/attachment/syslog.7z) (93.4 kB; application/octet-stream)


Sometimes the NCSMDS_UP event for a node arrives after messages from that node.
In this case, IMMD received the IMMD_EVT_ND2D_INTRO message before the
NCSMDS_UP event.
IMMD failed to process the intro message because the node info had not yet been
added to cb->immnd_tree.
~~~
Aug 12 08:13:53 SC-1 osafimmd[11184]: WA Node not found 566314186398634
Aug 12 08:13:53 SC-1 osafimmd[11184]: WA Error returned from processing message err:2 msg-type:2
Aug 12 08:13:53 SC-1 osafimmnd[11199]: NO SERVER STATE: IMM_SERVER_ANONYMOUS --> IMM_SERVER_CLUSTER_WAITING
Aug 12 08:13:53 SC-1 osafimmd[11184]: NO New IMMND process is on ACTIVE Controller at 2010f
Aug 12 08:13:53 SC-1 osafimmd[11184]: NO Extended intro from node 2010f
Aug 12 08:13:53 SC-1 osafimmd[11184]: NO First SC IMMND (OpenSAF 4.4 or later) attached 2010f
Aug 12 08:13:53 SC-1 osafimmd[11184]: NO Attached Nodes:2 Accepted nodes:1 KnownVeteran:0 doReply:1
Aug 12 08:13:53 SC-1 osafimmd[11184]: NO First IMMND on SC found at 2010f this IMMD at 2010f. Cluster is loading, *not* 2PBE => designating that IMMND as coordinator
Aug 12 08:13:53 SC-1 osafimmnd[11199]: NO This IMMND is now the NEW Coord
~~~
As a result, the IMMND on SC-1 was elected as coordinator instead of the veteran.


MDS messages arrive on the 'Dsock' socket and MDS events arrive on 'BSRsock'.
Since MDS uses two different sockets, I don't think we can fix this problem in
MDS.
IMM has to handle this case somehow.
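
One possible way to handle it, shown only as a sketch and not as the fix adopted for this ticket: keep an intro message that arrives for an unknown node, and replay it when the UP event for that node shows up. Everything below (struct intro_state, on_intro, on_mds_up, the fixed-size arrays) is hypothetical and much simpler than the real IMMD control block and cb->immnd_tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 8

/* Hypothetical, simplified state; not the real IMMD control block. */
struct intro_state {
    uint64_t known[MAX_NODES];   /* nodes for which the UP event was seen */
    size_t known_cnt;
    uint64_t pending[MAX_NODES]; /* intro messages that arrived too early */
    size_t pending_cnt;
    size_t processed_cnt;        /* intro messages actually processed */
};

static bool is_known(const struct intro_state *s, uint64_t node)
{
    for (size_t i = 0; i < s->known_cnt; i++)
        if (s->known[i] == node)
            return true;
    return false;
}

/* Intro message received.
 * Returns true if it was processed now, false if it was deferred
 * (or dropped because the pending list is full). */
bool on_intro(struct intro_state *s, uint64_t node)
{
    assert(s != NULL);
    if (is_known(s, node)) {
        s->processed_cnt++;
        return true;
    }
    if (s->pending_cnt < MAX_NODES)
        s->pending[s->pending_cnt++] = node;
    return false;
}

/* UP event received: register the node, then replay deferred intros for it. */
void on_mds_up(struct intro_state *s, uint64_t node)
{
    assert(s != NULL);
    if (!is_known(s, node)) {
        if (s->known_cnt == MAX_NODES)
            return; /* no room; leave any deferred intro pending */
        s->known[s->known_cnt++] = node;
    }

    size_t kept = 0;
    for (size_t i = 0; i < s->pending_cnt; i++) {
        if (s->pending[i] == node)
            s->processed_cnt++;               /* replay the deferred intro */
        else
            s->pending[kept++] = s->pending[i]; /* keep intros for other nodes */
    }
    s->pending_cnt = kept;
}
```

In this sketch, calling on_intro() for a node before on_mds_up() defers the message instead of failing with "Node not found", and the later on_mds_up() call processes it. A real change would also need to decide how long deferred messages are kept and how this interacts with the 3 second veteran wait.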



