---

** [tickets:#2510] amfd: payloads rebooted when recovering from SC absence **

**Status:** accepted
**Milestone:** 5.17.08
**Created:** Fri Jun 23, 2017 01:34 AM UTC by Gary Lee
**Last Updated:** Fri Jun 23, 2017 01:34 AM UTC
**Owner:** Gary Lee


When recovering from SC absence, all PLs are sometimes rebooted because node 
ups are not being processed by amfd.

Normally, a node up from each PL should be received by AMFD every 1s during the 
sync window, e.g. "NO Received node_up from 2040f: msg_id 1".

When the problem occurs, the node up messages are not logged by AMFD for 
periods of up to 15-20 seconds. After applying the following patch, it can be 
seen that the node ups are in fact received by AMFD's MDS thread, but do not 
reach the main thread.

```
     default:
       evt->rcv_evt = static_cast<AVD_EVT_TYPE>(
           (rcv_msg->msg_type - AVSV_N2D_NODE_UP_MSG) + AVD_EVT_NODE_UP_MSG);
       break;
   }

   osafassert((AVD_EVT_INVALID < evt->rcv_evt) && (evt->rcv_evt < AVD_EVT_MAX));

   evt->info.avnd_msg = rcv_msg;
+  if (evt->rcv_evt == AVD_EVT_NODE_UP_MSG) {
+    LOG_NO("MDS Thread: Received node_up from %x: msg_id %u",
+           evt->info.avnd_msg->msg_info.n2d_node_up.node_id,
+           evt->info.avnd_msg->msg_info.n2d_node_up.msg_id);
+  }

   if (m_NCS_IPC_SEND(&cb->avd_mbx, evt, NCS_IPC_PRIORITY_HIGH) !=
       NCSCC_RC_SUCCESS) {
     LOG_ER("%s: ncs_ipc_send failed", __FUNCTION__);
     avsv_dnd_msg_free(rcv_msg);
     evt->info.avnd_msg = nullptr;
     delete evt;
     TRACE_LEAVE();
     return NCSCC_RC_FAILURE;
   }
```

```
var/log/opensaf/osafamfd:Jun 20 17:20:32.498510 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:32.499659 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:32.516707 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:33.599087 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:33.599784 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:33.618200 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:34.699435 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:34.700674 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:34.720039 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:35.799851 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:35.801838 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:35.821058 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:36.900386 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:36.902192 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:36.922864 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:38.000570 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:38.002417 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:38.023179 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:39.101071 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:39.103714 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:39.124949 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:40.201609 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:40.204156 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:40.244475 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:41.301996 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:41.304880 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:41.345560 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:42.402541 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:42.405582 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:42.445846 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:43.502786 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:43.505630 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:43.546256 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:44.603419 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:44.607302 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:44.647594 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:45.703701 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2020f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:45.707718 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2030f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:45.748243 osafamfd 
[10514:10517:src/amf/amfd/ndmsg.cc:0376] NO MDS Thread: Received node_up from 
2040f: msg_id 1
var/log/opensaf/osafamfd:Jun 20 17:20:46.054654 osafamfd 
[10514:10514:src/amf/amfd/ndfsm.cc:0296] >> avd_node_up_evh: from 2030f, 
safAmfNode=PL-3,safAmfCluster=myAmfCluster
var/log/opensaf/osafamfd:Jun 20 17:20:46.054675 osafamfd 
[10514:10514:src/amf/amfd/ndfsm.cc:0254] NO Received node_up from 2030f: msg_id 
1
var/log/opensaf/osafamfd:Jun 20 17:20:46.054682 osafamfd 
[10514:10514:src/amf/amfd/util.cc:0203] >> avd_snd_node_up_msg 
var/log/opensaf/osafamfd:Jun 20 17:20:46.054695 osafamfd 
[10514:10514:src/amf/amfd/util.cc:0236] << avd_snd_node_up_msg 
var/log/opensaf/osafamfd:Jun 20 17:20:46.054734 osafamfd 
[10514:10514:src/amf/amfd/ndfsm.cc:0439] WA Sending node reboot order to 
node:safAmfNode=PL-3,safAmfCluster=myAmfCluster, due to late node_up_msg after 
cluster startup timeout
var/log/opensaf/osafamfd:Jun 20 17:20:46.054749 osafamfd 
[10514:10514:src/amf/amfd/ndfsm.cc:0534] << avd_node_up_evh 
var/log/opensaf/osafamfd:Jun 20 17:20:46.055356 osafamfd 
[10514:10514:src/amf/amfd/ndfsm.cc:0296] >> avd_node_up_evh: from 2040f, 
safAmfNode=PL-4,safAmfCluster=myAmfCluster
var/log/opensaf/osafamfd:Jun 20 17:20:46.055378 osafamfd 
[10514:10514:src/amf/amfd/ndfsm.cc:0254] NO Received node_up from 2040f: msg_id 
1
var/log/opensaf/osafamfd:Jun 20 17:20:46.055387 osafamfd 
[10514:10514:src/amf/amfd/util.cc:0203] >> avd_snd_node_up_msg 
var/log/opensaf/osafamfd:Jun 20 17:20:46.055402 osafamfd 
[10514:10514:src/amf/amfd/util.cc:0236] << avd_snd_node_up_msg 
var/log/opensaf/osafamfd:Jun 20 17:20:46.055428 osafamfd 
[10514:10514:src/amf/amfd/ndfsm.cc:0439] WA Sending node reboot order to 
node:safAmfNode=PL-4,safAmfCluster=myAmfCluster, due to late node_up_msg after 
cluster startup timeout
```

