---
**[tickets:#3182] mds: immd fail to send broadcast message to all immnd**
**Status:** assigned
**Milestone:** 5.20.05
**Created:** Mon Apr 27, 2020 02:42 AM UTC by Thuan Tran
**Last Updated:** Mon Apr 27, 2020 02:42 AM UTC
**Owner:** Thuan Tran
IMMD fails to send a broadcast message to all IMMNDs: the MDS multicast send returns "Resource temporarily unavailable" (EAGAIN), as the log below shows.
~~~
<143>1 2020-04-20T08:31:04.375963+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178678"] 29596:imm/immd/immd_evt.c:289 >> immd_evt_proc_fevs_req
<143>1 2020-04-20T08:31:04.37597+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178679"] 29596:imm/immd/immd_evt.c:332 T5 immd_evt_proc_fevs_req
send_count:156403 size:146
<143>1 2020-04-20T08:31:04.376482+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178706"] 29596:imm/immd/immd_mds.c:773 >> immd_mds_bcast_send
<143>1 2020-04-20T08:31:04.376488+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178707"] 29596:imm/common/immsv_evt.c:6539 T8 Sending:
IMMND_EVT_D2ND_GLOB_FEVS_REQ_2 to 0
<139>1 2020-04-20T08:31:04.426888+02:00 SC-1 osafimmd 29596 mds.log [meta
sequenceId="482"] MDTM: Failed to send Multicast message err :Resource
temporarily unavailable
<139>1 2020-04-20T08:31:04.426905+02:00 SC-1 osafimmd 29596 mds.log [meta
sequenceId="483"] MDTM: Failed to send Multicast message Data lenght=222 From
svc_id = IMMD(24) to svc_id = IMMND(25) err :Resource temporarily unavailable
<139>1 2020-04-20T08:31:04.426911+02:00 SC-1 osafimmd 29596 mds.log [meta
sequenceId="484"] MDTM:Continue while(1) status = mds_mcm_send_msg_enc =
NCSCC_RC_FAILURE
...
<139>1 2020-04-20T08:31:05.034968+02:00 SC-1 osafimmd 29596 mds.log [meta
sequenceId="518"] MDTM: Failed to send Multicast message err :Resource
temporarily unavailable
<139>1 2020-04-20T08:31:05.034985+02:00 SC-1 osafimmd 29596 mds.log [meta
sequenceId="519"] MDTM: Failed to send Multicast message Data lenght=105 From
svc_id = IMMD(24) to svc_id = IMMND(25) err :Resource temporarily unavailable
<139>1 2020-04-20T08:31:05.03499+02:00 SC-1 osafimmd 29596 mds.log [meta
sequenceId="520"] MDTM:Continue while(1) status = mds_mcm_send_msg_enc =
NCSCC_RC_FAILURE
<143>1 2020-04-20T08:31:04.983299+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178708"] 29596:imm/immd/immd_mds.c:793 << immd_mds_bcast_send
<143>1 2020-04-20T08:31:04.983311+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178709"] 29596:imm/immd/immd_evt.c:413 << immd_evt_proc_fevs_req
<143>1 2020-04-20T08:31:04.983518+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178742"] 29596:imm/immd/immd_evt.c:289 >> immd_evt_proc_fevs_req
<143>1 2020-04-20T08:31:04.983526+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178743"] 29596:imm/immd/immd_evt.c:332 T5 immd_evt_proc_fevs_req
send_count:156404 size:29
<143>1 2020-04-20T08:31:04.984464+02:00 SC-1 osafimmd 29596 osafimmd [meta
sequenceId="178770"] 29596:imm/immd/immd_mds.c:773 >> immd_mds_bcast_send
~~~
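The mds.log entries above show the multicast send being abandoned as soon as the kernel reports EAGAIN ("Resource temporarily unavailable"), so the fragment carrying FEVS message 156403 never reaches the IMMNDs. The sketch below illustrates the general retry-on-EAGAIN pattern for a congested non-blocking datagram socket; it is not the actual mds_mcm_send_msg_enc()/MDTM code, and retry_multicast_send(), the retry count, and the 10 ms poll timeout are illustrative assumptions only.
~~~
/*
 * Hedged sketch, not the OpenSAF MDS implementation: retry a non-blocking
 * datagram send when the kernel reports EAGAIN instead of dropping the
 * fragment. retry_multicast_send() and its backoff policy are assumptions.
 */
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

bool retry_multicast_send(int sd, const void *buf, size_t len,
                          const struct sockaddr *dest, socklen_t dlen)
{
	for (int attempt = 0; attempt < 100; ++attempt) {
		if (sendto(sd, buf, len, 0, dest, dlen) == (ssize_t)len)
			return true;   /* whole datagram accepted by the kernel */
		if (errno != EAGAIN && errno != EWOULDBLOCK)
			return false;  /* a real send error, give up */
		/* Send buffer congested: wait briefly for it to drain. */
		struct pollfd pfd = { .fd = sd, .events = POLLOUT };
		(void)poll(&pfd, 1, 10 /* ms */);
	}
	return false;                  /* still congested after all retries */
}
~~~
Dropping the fragment instead of retrying (or reporting the failure back so the message can be resent) is what breaks the FEVS total order for every IMMND in the cluster.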
This causes the IMMNDs to exit unexpectedly with an OUT OF ORDER error, which in turn leads to a node failfast; a simplified sketch of that ordering check follows the log below.
~~~
2020-04-20T08:31:05.058+02:00 SC-1 osafimmnd[29612]: ER MESSAGE:156404 OUT OF
ORDER my highest processed:156402 - exiting
2020-04-20T08:31:05.073+02:00 SC-1 osafimmd[29596]: WA IMMND coordinator at
2010f apparently crashed => electing new coord
2020-04-20T08:31:05.073+02:00 SC-1 osafimmd[29596]: NO New coord elected,
resides at 2020f
2020-04-20T08:31:05.078+02:00 SC-1 osaffmd[29579]: NO IMMND down on: 2020f
2020-04-20T08:31:05.082+02:00 SC-1 osafimmd[29596]: WA IMMND coordinator at
2020f apparently crashed => electing new coord
2020-04-20T08:31:05.082+02:00 SC-1 osafimmd[29596]: NO Coord elected at
payload:2050f
2020-04-20T08:31:05.082+02:00 SC-1 osafimmd[29596]: NO New coord elected,
resides at 2050f
2020-04-20T08:31:05.083+02:00 SC-1 osafimmd[29596]: WA IMMD - MDS Send Failed
2020-04-20T08:31:05.084+02:00 SC-1 osafimmd[29596]: ER Failed to send MDS
message designating new IMMND coord, exiting
2020-04-20T08:31:05.085+02:00 SC-1 osafamfnd[29808]: ER
safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery
is:nodeFailfast
2020-04-20T08:31:05.086+02:00 SC-1 osafamfnd[29808]: Rebooting OpenSAF NodeId =
131343 EE Name = , Reason: Component faulted: recovery is node failfast,
OwnNodeId = 131343, SupervisionTime = 60
~~~
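For context, the exit above comes from the FEVS gap check in the IMMND: every node must apply the broadcast stream in exactly the order the IMMD assigned, so once message 156403 is lost, the arrival of 156404 is fatal. The sketch below only illustrates that check; the names, the global variable, and the immediate exit are simplifications (the real immnd also buffers messages that arrive slightly ahead), not the actual imm/immnd code.
~~~
/* Hedged, simplified sketch of an OUT OF ORDER check; not OpenSAF code. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Highest FEVS message number already applied; value taken from the log. */
static uint64_t highest_processed = 156402;

void handle_fevs(uint64_t msg_no)
{
	if (msg_no != highest_processed + 1) {
		fprintf(stderr,
			"ER MESSAGE:%llu OUT OF ORDER my highest processed:%llu - exiting\n",
			(unsigned long long)msg_no,
			(unsigned long long)highest_processed);
		exit(EXIT_FAILURE); /* AMF then escalates, here to node failfast */
	}
	highest_processed = msg_no; /* in order: apply and advance */
}

int main(void)
{
	/* 156403 was the dropped broadcast, so 156404 arrives next ... */
	handle_fevs(156404); /* ... and triggers the fatal OUT OF ORDER path */
	return 0;
}
~~~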