- **status**: review --> fixed
- **Comment**:

commit a44c0e418d8b4e5c1709a8a49565a88ce7158eab (HEAD -> develop, origin/develop)
Author: thuan.tran <thuan.t...@dektech.com.au>
Date:   Wed Apr 29 11:56:59 2020 +0700

    mds: forever retry EAGAIN in mds_mcast_sendto() [#3182]

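As a rough illustration of what "forever retry EAGAIN" means for a datagram send, here is a minimal sketch. It is not the code from the commit: `sendto_retry_eagain` is a hypothetical helper, and the 10 ms `poll()` wait between attempts is an assumption chosen only to avoid a busy loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not the OpenSAF implementation): keep calling
 * sendto() while it fails with EAGAIN/EWOULDBLOCK, i.e. "Resource
 * temporarily unavailable". Any other error is returned to the caller. */
static ssize_t sendto_retry_eagain(int fd, const void *buf, size_t len,
                                   const struct sockaddr *addr,
                                   socklen_t addrlen)
{
    assert(buf != NULL);

    for (;;) {
        ssize_t sent = sendto(fd, buf, len, 0, addr, addrlen);
        if (sent >= 0)
            return sent; /* the kernel accepted the datagram */

        if (errno == EINTR)
            continue; /* interrupted by a signal: try again at once */

        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1; /* hard error: errno is left for the caller */

        /* Send buffer is full. Wait until the socket is writable, or at
         * most 10 ms (assumed value), and then retry without limit. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT, .revents = 0 };
        (void)poll(&pfd, 1, 10);
    }
}
```

The trade-off in this sketch is that the calling thread is blocked for as long as the socket stays congested, in exchange for never silently dropping a message that receivers expect in sequence.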




---

**[tickets:#3182] mds: immd fail to send broadcast message to all immnd**

**Status:** fixed
**Milestone:** 5.20.05
**Created:** Mon Apr 27, 2020 02:42 AM UTC by Thuan Tran
**Last Updated:** Mon Apr 27, 2020 07:15 AM UTC
**Owner:** Thuan Tran


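IMMD fails to send a broadcast message to all IMMNDs: the multicast send returns EAGAIN ("Resource temporarily unavailable") and the message is not delivered.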
~~~
<143>1 2020-04-20T08:31:04.375963+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178678"] 29596:imm/immd/immd_evt.c:289 >> immd_evt_proc_fevs_req 
<143>1 2020-04-20T08:31:04.37597+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178679"] 29596:imm/immd/immd_evt.c:332 T5 immd_evt_proc_fevs_req 
send_count:156403 size:146
<143>1 2020-04-20T08:31:04.376482+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178706"] 29596:imm/immd/immd_mds.c:773 >> immd_mds_bcast_send 
<143>1 2020-04-20T08:31:04.376488+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178707"] 29596:imm/common/immsv_evt.c:6539 T8 Sending:  
IMMND_EVT_D2ND_GLOB_FEVS_REQ_2 to 0

<139>1 2020-04-20T08:31:04.426888+02:00 SC-1 osafimmd 29596 mds.log [meta 
sequenceId="482"] MDTM: Failed to send Multicast message err :Resource 
temporarily unavailable
<139>1 2020-04-20T08:31:04.426905+02:00 SC-1 osafimmd 29596 mds.log [meta 
sequenceId="483"] MDTM: Failed to send Multicast message Data lenght=222 From 
svc_id = IMMD(24) to svc_id = IMMND(25) err :Resource temporarily unavailable
<139>1 2020-04-20T08:31:04.426911+02:00 SC-1 osafimmd 29596 mds.log [meta 
sequenceId="484"] MDTM:Continue while(1) status = mds_mcm_send_msg_enc = 
NCSCC_RC_FAILURE
...
<139>1 2020-04-20T08:31:05.034968+02:00 SC-1 osafimmd 29596 mds.log [meta 
sequenceId="518"] MDTM: Failed to send Multicast message err :Resource 
temporarily unavailable
<139>1 2020-04-20T08:31:05.034985+02:00 SC-1 osafimmd 29596 mds.log [meta 
sequenceId="519"] MDTM: Failed to send Multicast message Data lenght=105 From 
svc_id = IMMD(24) to svc_id = IMMND(25) err :Resource temporarily unavailable
<139>1 2020-04-20T08:31:05.03499+02:00 SC-1 osafimmd 29596 mds.log [meta 
sequenceId="520"] MDTM:Continue while(1) status = mds_mcm_send_msg_enc = 
NCSCC_RC_FAILURE

<143>1 2020-04-20T08:31:04.983299+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178708"] 29596:imm/immd/immd_mds.c:793 << immd_mds_bcast_send 
<143>1 2020-04-20T08:31:04.983311+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178709"] 29596:imm/immd/immd_evt.c:413 << immd_evt_proc_fevs_req 
<143>1 2020-04-20T08:31:04.983518+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178742"] 29596:imm/immd/immd_evt.c:289 >> immd_evt_proc_fevs_req 
<143>1 2020-04-20T08:31:04.983526+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178743"] 29596:imm/immd/immd_evt.c:332 T5 immd_evt_proc_fevs_req 
send_count:156404 size:29
<143>1 2020-04-20T08:31:04.984464+02:00 SC-1 osafimmd 29596 osafimmd [meta 
sequenceId="178770"] 29596:imm/immd/immd_mds.c:773 >> immd_mds_bcast_send 
~~~
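This causes the IMMNDs to exit unexpectedly with an OUT OF ORDER error: the broadcast of message 156403 apparently never arrived, so message 156404 was received while the highest processed message was still 156402.
As a consequence, IMMD also exits and the node goes down with failfast.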
~~~
2020-04-20T08:31:05.058+02:00 SC-1 osafimmnd[29612]: ER MESSAGE:156404 OUT OF 
ORDER my highest processed:156402 - exiting
2020-04-20T08:31:05.073+02:00 SC-1 osafimmd[29596]: WA IMMND coordinator at 
2010f apparently crashed => electing new coord
2020-04-20T08:31:05.073+02:00 SC-1 osafimmd[29596]: NO New coord elected, 
resides at 2020f
2020-04-20T08:31:05.078+02:00 SC-1 osaffmd[29579]: NO IMMND down on: 2020f
2020-04-20T08:31:05.082+02:00 SC-1 osafimmd[29596]: WA IMMND coordinator at 
2020f apparently crashed => electing new coord
2020-04-20T08:31:05.082+02:00 SC-1 osafimmd[29596]: NO Coord elected at 
payload:2050f
2020-04-20T08:31:05.082+02:00 SC-1 osafimmd[29596]: NO New coord elected, 
resides at 2050f
2020-04-20T08:31:05.083+02:00 SC-1 osafimmd[29596]: WA IMMD - MDS Send Failed
2020-04-20T08:31:05.084+02:00 SC-1 osafimmd[29596]: ER Failed to send MDS 
message designating new IMMND coord, exiting
2020-04-20T08:31:05.085+02:00 SC-1 osafamfnd[29808]: ER 
safComp=IMMD,safSu=SC-1,safSg=2N,safApp=OpenSAF Faulted due to:avaDown Recovery 
is:nodeFailfast
2020-04-20T08:31:05.086+02:00 SC-1 osafamfnd[29808]: Rebooting OpenSAF NodeId = 
131343 EE Name = , Reason: Component faulted: recovery is node failfast, 
OwnNodeId = 131343, SupervisionTime = 60
~~~
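The OUT OF ORDER exit above follows from a strict sequence check on the receiving side. The sketch below is only an illustration of that idea, not the IMMND code: `fevs_tracker` and `fevs_accept` are hypothetical names, and treating every number other than "highest processed + 1" as out of order is an assumption based on the log line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver-side state: the last message number processed. */
struct fevs_tracker {
    uint64_t highest_processed;
};

/* Accept msg_no only if it is exactly the next number in sequence.
 * Returns false on a gap, which corresponds to the situation logged as
 * "MESSAGE:156404 OUT OF ORDER my highest processed:156402". */
static bool fevs_accept(struct fevs_tracker *t, uint64_t msg_no)
{
    assert(t != NULL);

    if (msg_no != t->highest_processed + 1)
        return false; /* gap detected: state is left unchanged */

    t->highest_processed = msg_no;
    return true;
}
```

With `highest_processed` at 156402, a call with 156404 is rejected, while 156403 followed by 156404 is accepted. One lost broadcast is therefore enough to make every receiver reject the next message.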

