---

** [tickets:#2061] smfd faulted on Active controller due to csiSetcallbackTimeout 
during si-swap operation**

**Status:** unassigned
**Milestone:** 4.7.2
**Created:** Thu Sep 22, 2016 09:26 AM UTC by Ritu Raj
**Last Updated:** Thu Sep 22, 2016 09:26 AM UTC
**Owner:** nobody
**Attachments:**

- 
[SC-1.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2061/attachment/SC-1.tar.bz2)
 (178.0 kB; application/x-bzip)
- 
[SC-2.tar.bz2](https://sourceforge.net/p/opensaf/tickets/2061/attachment/SC-2.tar.bz2)
 (206.3 kB; application/x-bzip)


# Environment details
OS : Suse 64bit
Changeset : 7997 ( 5.1.FC)
Setup : 3 nodes (2 controllers and 1 payload, with headless feature disabled & 
1PBE with 100K objects)

# Summary
smfd faulted on the Active controller (the one that was active before the swap) 
due to csiSetcallbackTimeout during the si-swap operation.

# Steps followed & Observed behaviour
1. Initiated an si-swap operation from the Active controller while simultaneously 
killing osafsmfnd on the STANDBY controller and osafckptnd on the payload (PL-3).
2. Observed that smfd faulted on the Active controller during the role change.

From the traces, it is observed that in the file 
"osaf/services/saf/smfsv/smfd/smfd_smfnd.c" there is no TRY_AGAIN handling 
for the API calls below:

/* Find Clm info about the node */
        rc = saClmInitialize(&clmHandle, NULL, &clmVersion);
        if (rc != SA_AIS_OK) {
                LOG_ER("saClmInitialize failed, rc=%s", saf_error(rc));
                if (newNode) free(smfnd);
                pthread_mutex_unlock(&smfnd_list_lock);
                return NCSCC_RC_FAILURE;
        }

        /* Get Clm info about the node */
        SaClmClusterNodeT clmInfo;
        rc = saClmClusterNodeGet(clmHandle, i_node_id,
                                 10000000000LL, &clmInfo);
        if (rc != SA_AIS_OK) {
                LOG_ER("saClmClusterNodeGet failed, rc=%s", saf_error(rc));
                if (newNode) free(smfnd);
                rc = saClmFinalize(clmHandle);
                if (rc != SA_AIS_OK) {
                        LOG_ER("saClmFinalize failed, rc=%s", saf_error(rc));
                }
                pthread_mutex_unlock(&smfnd_list_lock);
                return NCSCC_RC_FAILURE;
        }
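
As an illustration of the TRY_AGAIN handling that the snippet above lacks, here is a minimal, self-contained sketch of a bounded retry loop. It is not the smfd code and not a proposed patch: the `demo_*` names, the stand-in return codes, and the stub `demo_flaky_initialize` are all hypothetical, chosen so the example compiles without the OpenSAF headers. In real code the callback would wrap `saClmInitialize` or `saClmClusterNodeGet`, and the codes would come from `saAis.h`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Stand-ins for the AIS return codes (assumption: real code uses saAis.h). */
typedef enum {
    DEMO_AIS_OK = 1,
    DEMO_AIS_ERR_LIBRARY = 2,
    DEMO_AIS_ERR_TRY_AGAIN = 6
} demo_ais_rc_t;

/* Hypothetical callback type wrapping one AIS call. */
typedef demo_ais_rc_t (*demo_ais_call_t)(void *ctx);

/*
 * Call `call` until it returns something other than TRY_AGAIN or until
 * `max_attempts` calls have been made, sleeping `delay_ms` between calls.
 * Returns the result of the last call.
 */
static demo_ais_rc_t demo_retry_try_again(demo_ais_call_t call, void *ctx,
                                          unsigned max_attempts, long delay_ms)
{
    assert(call != NULL && max_attempts > 0 && delay_ms >= 0);

    demo_ais_rc_t rc = DEMO_AIS_ERR_TRY_AGAIN;
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        rc = call(ctx);
        if (rc != DEMO_AIS_ERR_TRY_AGAIN)
            break;                      /* success or a non-retryable error */
        if (attempt + 1 < max_attempts) {
            struct timespec ts;
            ts.tv_sec = delay_ms / 1000;
            ts.tv_nsec = (delay_ms % 1000) * 1000000L;
            nanosleep(&ts, NULL);       /* back off before the next attempt */
        }
    }
    return rc;
}

/* Hypothetical stub: answers TRY_AGAIN `busy_calls` times, then OK. */
struct demo_flaky {
    unsigned busy_calls;
    unsigned calls_made;
};

static demo_ais_rc_t demo_flaky_initialize(void *ctx)
{
    struct demo_flaky *f = ctx;
    f->calls_made++;
    return (f->calls_made <= f->busy_calls) ? DEMO_AIS_ERR_TRY_AGAIN
                                            : DEMO_AIS_OK;
}
```

The retry is deliberately bounded. In the snippet above the call is made while `smfnd_list_lock` is held, so an unbounded wait could itself delay other smfd processing; the attempt count and delay would need to be chosen with the AMF callback timeout in mind.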


**Syslog:**
Sep 22 14:15:05 fos1 osafimmnd[6164]: NO Implementer disconnected 17 <0, 
2030f(down)> (MsgQueueService131855)
Sep 22 14:15:08 fos1 osafamfnd[6253]: NO 
'safComp=SMF,safSu=SC-1,safSg=2N,safApp=OpenSAF' faulted due to 
'csiSetcallbackTimeout' : Recovery is 'nodeFailfast'
Sep 22 14:15:08 fos1 osafamfnd[6253]: ER 
safComp=SMF,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due 
to:csiSetcallbackTimeout Recovery is:nodeFailfast
Sep 22 14:15:08 fos1 osafamfnd[6253]: Rebooting OpenSAF NodeId = 131343 EE Name 
= , Reason: Component faulted: recovery is node failfast, OwnNodeId = 131343, 
SupervisionTime = 60
Sep 22 14:15:08 fos1 opensaf_reboot: Rebooting local node; timeout=60
Sep 22 14:15:09 fos1 osafsmfd[6272]: ER saClmInitialize failed, 
rc=SA_AIS_ERR_TRY_AGAIN (6)
Sep 22 14:15:09 fos1 osafsmfd[6272]: WA proc_mds_info: SMFND UP failed


**Notes:**
1. Syslogs of both controllers are attached
2. smfd traces are attached


---
